Discourse on the Move: Using corpus analysis to describe discourse structure (Studies in Corpus Linguistics)


Pages 304 Page size 475 x 709 pts Year 2009


Discourse on the Move

Studies in Corpus Linguistics (SCL) SCL focuses on the use of corpora throughout language study, the development of a quantitative approach to linguistics, the design and use of new tools for processing language texts, and the theoretical implications of a data-rich discipline.

General Editor
Elena Tognini-Bonelli, The Tuscan Word Center / The University of Siena

Consulting Editor
Wolfgang Teubert, University of Birmingham

Advisory Board
Michael Barlow, University of Auckland
Douglas Biber, Northern Arizona University
Marina Bondi, University of Modena and Reggio Emilia
Christopher S. Butler, University of Wales, Swansea
Sylviane Granger, University of Louvain
M.A.K. Halliday, University of Sydney
Susan Hunston, University of Birmingham
Stig Johansson, Oslo University
Graeme Kennedy, Victoria University of Wellington
Geoffrey N. Leech, University of Lancaster
Anna Mauranen, University of Helsinki
Ute Römer, University of Hannover
Michaela Mahlberg, University of Liverpool
Jan Svartvik, University of Lund
John M. Swales, University of Michigan
Yang Huizhong, Jiao Tong University, Shanghai

Volume 28 Discourse on the Move. Using corpus analysis to describe discourse structure Douglas Biber, Ulla Connor and Thomas A. Upton

Discourse on the Move Using corpus analysis to describe discourse structure

Douglas Biber Northern Arizona University

Ulla Connor Thomas A. Upton Indiana University – Indianapolis

John Benjamins Publishing Company Amsterdam / Philadelphia



The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data
Biber, Douglas.
Discourse on the move : using corpus analysis to describe discourse structure / Douglas Biber, Ulla Connor, Thomas A. Upton.
p. cm. (Studies in Corpus Linguistics, ISSN 1388-0373 ; v. 28)
Includes bibliographical references and index.
1. Discourse analysis--Data processing. I. Connor, Ulla, 1948- II. Upton, Thomas A. (Thomas Albin) III. Title.
P302.3.B53

ISBN 978 90 272 2302 9 (Hb; alk. paper)

© 2007 – John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands
John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA

Table of contents

Preface

Chapter 1. Discourse analysis and corpus linguistics
  1 Discourse and discourse analysis
    1.1 Discourse studies of language use
    1.2 Discourse studies of linguistic structure ‘beyond the sentence’
    1.3 Discourse studies of social practices and ideological assumptions associated with communication
    1.4 “Register” and “genre” perspectives on discourse
    1.5 Identifying structural units in discourse
  2 Corpus-based investigation of discourse structure
  3 Top-down versus bottom-up corpus-based approaches to discourse analysis
    3.1 Examples of top-down analyses of discourse
    3.2 Example of bottom-up approach
  4 Creating a specialized corpus for discourse analysis
  5 Overview of the book

Part 1. Top-down analyses of discourse organization

Chapter 2. Introduction to move analysis (with Budsaba Kanoksilapatham)
  1 Background
  2 Swales’ move analysis of research articles
  3 Move analysis of research articles applied across genres
    3.1 Description and examples
    3.2 Summary of previous research on move analysis
  4 Overview of the methods for move analysis
    4.1 General steps of a move analysis
    4.2 Inter-rater reliability
  5 Using a corpus-based approach to move analysis
    5.1 Corpus-based move analysis
    5.2 General advantages of corpus-based approaches to discourse analysis
    5.3 Specific advantages of a corpus-based perspective for move analysis
      5.3.1 Identifying linguistic features of moves
      5.3.2 Move frequencies and lengths
      5.3.3 Mapping move use and locations
      5.3.4 Genre prototypes
  6 Summary

Chapter 3. Identifying and analyzing rhetorical moves in philanthropic discourse
  1 Background
  2 A specialized corpus of fundraising texts
  3 Determining and analyzing discourse moves: Direct mail letters
    3.1 Previous analysis of direct mail letters
    3.2 A move analysis of fundraising letters: Background and methodology
      3.2.1 Move types
      3.2.2 Structural elements
    3.3 Analysis
    3.4 Results
    3.5 Discussion
    3.6 Letter prototypes
  4 Linguistic analysis of moves: Tracking the use of stance structures
    4.1 Identifying grammatical stance devices
    4.2 Interpreting the use of grammatical stance devices used in moves
  5 Final thoughts

Chapter 4. Rhetorical moves in biochemistry research articles (by Budsaba Kanoksilapatham)
  1 Background
  2 Description of the corpus
  3 Determining the move categories in the genre of biochemistry research articles
    3.1 The introduction section
    3.2 The methods section
    3.3 The results section
    3.4 The discussion section
  4 Coding moves in the corpus of biochemistry research articles
  5 Distribution of move types within texts from the biochemistry corpus
  6 Linguistic characteristics of rhetorical moves in biochemistry research articles
  7 Linguistic variation among move categories in biochemistry research articles
  8 Multi-dimensional variation among move types within the same section

Chapter 5. Rhetorical appeals in fundraising (with Molly Anthony & Kostyantyn Gladkov)
  1 Elements of persuasion
  2 Determining and analyzing rhetorical appeals
    2.1 Rational appeals (Logos)
    2.2 Credibility appeals (Ethos)
    2.3 Affective appeals (Pathos)
  3 Analysis, segmentation, and classification
    3.1 Results and discussion
  4 Linguistic description of appeals
    4.1 Wordlists
    4.2 Keywords
  5 Appeals and discourse structure of letters
  6 Conclusion

Part 2. Bottom-up analyses of discourse organization

Chapter 6. Introduction to the identification and analysis of vocabulary-based discourse units (with Eniko Csomay, James K. Jones, & Casey Keck)
  1 Conceptual introduction to VBDUs
  2 Automatic identification of VBDUs in texts
  3 Perceptual correlates of VBDUs
  4 Using VBDUs to analyze the discourse structure of texts
  5 Going one step further: Identifying generalizable VBDU ‘types’

Chapter 7. Vocabulary-based discourse units in biology research articles (with James K. Jones)
  1 Constructing the corpus of VBDUs
  2 Analyzing the linguistic characteristics of VBDUs: Multi-dimensional analysis
  3 Comparing the multi-dimensional characteristics of research article sections
  4 The multi-dimensional profile of VBDUs within a research article: Tracking the movement of discourse
  5 Identifying and interpreting the multi-dimensional text types of biology research articles
  6 Using VBDU text types to describe the discourse organizational patterns of biology research articles
  7 Starting and ending research article sections
    7.1 Describing the typical discourse organizations of introductions
    7.2 Describing the typical discourse organizations of methods sections
    7.3 Describing the typical discourse organizations of discussion sections
  8 Preferred text type sequences across research article section boundaries
  9 Comparing the preferred discourse styles of research journals
  10 Conclusion

Chapter 8. Vocabulary-based discourse units in university class sessions (by Eniko Csomay)
  1 From constructing a corpus of VBDUs to identifying VBDU text-types
    1.1 Constructing a corpus of VBDUs
    1.2 Analyzing the linguistic characteristics of VBDUs applying MD analytical techniques
    1.3 VBDUs and dimension scores: the multi-dimensional profile of the first three VBDUs of a business management class
  2 Dimension scores and VBDU text-types
    2.1 Interpreting the clusters as VBDU types based on their linguistic characteristics
      2.1.1 Cluster 1: Personalized framing
      2.1.2 Cluster 2: Informational monologue
      2.1.3 Cluster 3: Contextual interactive
      2.1.4 Cluster 4: Unmarked
  3 From VBDU text-types to discourse structure
    3.1 Functional interpretation of VBDU types
    3.2 Text as sequences of VBDU types
  4 Summary and conclusion

Chapter 9. Conclusion: Comparing the analytical approaches
  1 Overview
  2 Comparing the top-down and bottom-up descriptions of biology research articles
    2.1 Discourse units in biology research articles
    2.2 The dimensions of linguistic variation in biology research articles
    2.3 The functional and linguistic characteristics of the discourse types (move types vs VBDU types) in biology research articles
    2.4 Description of the typical discourse organization of biology research articles
  3 Summary and prospects for future research

Appendix 1. A brief introduction to multi-dimensional analysis
  A.1 Conceptual introduction to the multi-dimensional approach to variation
  A.2 Overview of methodology in the multi-dimensional approach

Appendix 2. Grammatical and lexico-grammatical features included in the multi-dimensional analyses





Preface

The idea for this book evolved slowly, emerging from research taking place at several institutions applying different approaches to a single research problem: can discourse structure and organization be investigated from a corpus perspective?
At Northern Arizona University (NAU), research on this topic began in a PhD seminar in 1999. Inspired by the research of Youmans (1991; 1994) on the ‘Vocabulary Management Profile’, students in that seminar explored ways in which the discourse structure of a text can be discovered automatically by tracking the text-internal use of vocabulary and other linguistic features. This initial effort resulted in a PhD dissertation by Csomay (2002), followed by several other research studies undertaken at NAU that employed the ‘TextTiling’ methods originally developed by Hearst (1997).
Over the same period, researchers at Indiana University Purdue University Indianapolis (IUPUI) and Georgetown University were exploring a completely different approach to this same research problem: applying the framework of rhetorical move analysis, developed by Swales (1981; 1990) for the detailed analysis of texts, to analyze the general rhetorical and linguistic patterns of discourse structure in a corpus. At IUPUI, this research effort focused primarily on philanthropic discourse, especially grant proposals and fundraising letters. And at Georgetown University, this research culminated in 2003 with the completion of a PhD dissertation by Kanoksilapatham (2003) on the discourse structure of biochemistry research articles.
The actual idea for the present book came about as colleagues from these different institutions would get together at conferences and discuss their different approaches to the study of discourse structure and organization from a corpus perspective. We realized that there had been very little previous research done on this topic, and that by combining and comparing our approaches, we could provide a relatively comprehensive overview of this emerging subfield.
Because the book grew out of relatively independent research efforts, each author has had different primary responsibilities. At the same time, we have been eager to structure the book as a coherent treatment of this subject: an authored book rather than an edited collection of articles. Thus, the three book authors share equal responsibility for revising and editing all chapters, and ultimately the content of all chapters. But on the other hand, each chapter has different primary authors, including several co-authors in addition to the three book authors for Chapters 1–3, 5–7, and 9. Two chapters are invited, single-authored contributions – Chapter 4 by Kanoksilapatham and Chapter 8 by Csomay. The primary authors for each chapter are as follows:

Chapter 1: Biber, Connor, Upton
Chapter 2: Connor, Upton, Kanoksilapatham
Chapter 3: Upton, Connor
Chapter 4: Kanoksilapatham
Chapter 5: Connor, Anthony, Gladkov, Upton
Chapter 6: Biber, Csomay, Jones, Keck
Chapter 7: Biber, Jones
Chapter 8: Csomay
Chapter 9: Biber, Connor, Upton

We would like to thank the numerous colleagues who have made useful suggestions and criticisms over the years in relation to the various research projects that come together in the present book. We also owe a special thanks to Eric Friginal, Bethany Gray, Jack Grieve, Mark Johnson, Erkan Karabacak, YouJin Kim, Poonpon Kornwipa, Jingjing Qin, Angkana Tongpoon, and Faith Young – the students of ENG 707 (Seminar on Discourse) at Northern Arizona University in the fall of 2006, who read the entire book manuscript and made numerous useful comments and suggestions (including the title for our book, suggested by Jack Grieve).

Chapter 1

Discourse analysis and corpus linguistics

1 Discourse and discourse analysis

The study of discourse has become a major focus of research in many disciplines of the humanities, social sciences, and information sciences. Because this area of study can be approached from so many different perspectives, the terms ‘discourse’ and ‘discourse analysis’ have come to be used in widely divergent ways. Several introductory treatments survey the range of definitions given to the term ‘discourse’ (e.g., Jaworski & Coupland, 1999, pp. 1–7; Schiffrin, 1994, pp. 23–43). Schiffrin, Tannen, and Hamilton (2001) in their introduction to The Handbook of Discourse Analysis (p. 1), group previous definitions of ‘discourse analysis’ into three general categories: 1) the study of language use; 2) the study of linguistic structure ‘beyond the sentence’; and 3) the study of social practices and ideological assumptions that are associated with language and/or communication. The object of study for these three approaches to discourse is increasingly removed from the research goals of traditional structural linguistics. The study of language use focuses on traditional linguistic constructs, such as phrase structures and clause structures, but addresses the problem of why languages have structural variants with nearly equivalent meanings (e.g., particle movement, as in pick up the book versus pick the book up). By considering factors that are not strictly structural, linguists are able to predict when one or another variant is likely to be used. For example, the length of the direct object noun phrase is an important factor predicting the likelihood of particle movement. Aspects of the discourse context are often important for understanding linguistic variation, especially for linguistic constructions that involve word order variation (such as passives, extraposition, clefts, inversions, existential there, etc.). For example, writers will choose passive voice rather than active voice depending on the topical relevance of the ‘patient’ noun phrase. 
The study of linguistic structure ‘beyond the sentence’ focuses on a larger object of study: extended sequences of utterances or sentences, and how those ‘texts’ are constructed and organized in systematic ways. Although studies of this type are removed from the traditional concerns of structural linguistics (which focuses
mostly on phrasal and clause syntax), the two share a primary focus on linguistic form and how language structures are used for communication. In contrast, the third approach to discourse is socio-cultural in orientation, and generally not concerned with the description of particular texts or the analysis of language structure and use. Socio-cultural approaches to discourse sometimes focus on the actions of participants in particular communication events, and at other times focus on the general characteristics of speech/discourse communities in relation to issues such as power and gender. Although the socio-cultural approaches are obviously important for understanding the broader role of texts in culture, they typically are not concerned with understanding the linguistic forms used in those texts. Corpus linguistic studies are generally considered to be a type of discourse analysis because they describe the use of linguistic forms in context. For example, words are described in terms of their typical collocates: the words that normally occur in the discourse context. Grammatical variation is also described in terms of the words and other grammatical structures that occur in the context. As such, corpus linguistic research has fallen squarely under the first approach to discourse: the study of language use. However, it has been much less common to study discourse organization from a corpus perspective. In fact, these two subfields have research goals and methods that might be considered incompatible: The study of discourse organization – linguistic structure ‘beyond the sentence’ – is usually based on detailed analysis of a single text, resulting in a qualitative linguistic description of the textual organization. In contrast, corpus studies are based on analysis of all texts in a corpus, utilizing quantitative measures to identify the typical distributional patterns that occur across texts. 
In fact, individual ‘texts’ often have no status whatsoever in corpus investigations. Instead, what we find are comparisons of the distributional patterns in one sub-corpus to the patterns in a second sub-corpus. For example, Scott and Tribble (2006) describe how we can compare the ‘keywords’ of the spoken versus written sub-corpora from the British National Corpus. Nesselhauf (2005, Chapter 3) describes the ‘deviant collocations’ in a corpus of learner English essays. And Römer (2005, Chapter 4) documents the variants and distributional patterns of progressive verb phrases in the spoken sub-corpora from the British National Corpus. These studies are typical of corpus-based research on discourse: they describe the typical patterns of language use, considering the systematic ways in which aspects of the lexico-grammatical context tend to occur together with different linguistic variants; but such corpus-based studies usually tell us nothing about the discourse structure of particular texts.
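The kind of sub-corpus comparison described above can be illustrated with a small sketch. It ranks the words that are unusually frequent in one sub-corpus relative to another, using a log-likelihood score. The choice of statistic, the function names, and the toy token lists are assumptions made for illustration; the source does not say which keyness measure the cited studies use.

```python
import math
from collections import Counter


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood (G2) score for a word seen a times in sub-corpus A
    (total_a tokens) and b times in sub-corpus B (total_b tokens)."""
    combined = a + b
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def keywords(tokens_a, tokens_b, top_n=5):
    """Return the top_n words that are relatively more frequent in A than in B,
    ranked by log-likelihood (ties broken alphabetically)."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    total_a, total_b = len(tokens_a), len(tokens_b)
    scored = []
    for word, a in freq_a.items():
        b = freq_b.get(word, 0)
        # Keep only words over-represented in A (positive keywords).
        if a / total_a > b / total_b:
            scored.append((word, log_likelihood(a, b, total_a, total_b)))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]
```

Calling `keywords(spoken_tokens, written_tokens)` on two tokenized sub-corpora returns word and score pairs. Note that the result characterizes a sub-corpus as a whole and says nothing about the structure of any individual text, which is the limitation the passage points out.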

We thus see this interface as one of the current challenges of corpus linguistics: Is it possible to merge the analytical goals and methods of corpus linguistics with those of discourse analysis that focuses on the structural organization of texts? Can a corpus be analyzed to identify the general patterns of discourse organization that are used to construct texts, and can individual texts be analyzed in terms of the general patterns that result from corpus analysis? These are the central issues that we take up in the present book.

1.1 Discourse studies of language use

The first major approach to discourse identified above – the study of language use – has been carried out from several different perspectives, including research in pragmatics, speech act theory, functional linguistics, variationist studies, and register studies. These subfields all investigate how words and linguistic structures are used in discourse contexts to express a range of meanings. Many of these approaches focus on the study of linguistic variation, showing how linguistic choice is systematic and principled when considered in the larger discourse context. There have been numerous studies of grammar and discourse over the last two decades, as researchers have come to realize that the description of grammatical function is as important as structural analysis. By studying linguistic variation in naturally occurring discourse, researchers have been able to identify systematic differences in the functional use of each variant. An early study of this type is Prince (1978), who compares the discourse functions of WH-clefts and it-clefts. Thompson and Schiffrin have carried out numerous studies in this research tradition; Thompson on detached participial clauses (1983), adverbial purpose clauses (1985), omission of the complementizer that (S. Thompson & Mulac, 1991a, 1991b), relative clauses (Fox & Thompson, 1990); and Schiffrin on verb tense (1981), causal sequences (1985b), and discourse markers (1985a, 1987). Other more recent studies of this type include Ward (1990) on VP preposing, Collins (1995) on dative alternation, and Myhill (1995; 1997) on modal verbs. Most corpus-based research is discourse analytic in this sense, investigating systematic patterns of language use across discourse contexts, generalized over all the texts in a corpus (see, e.g., Biber, Conrad, & Reppen, 1998; McEnery, Xiao, & Tono, 2006). 
The advantages of a corpus approach for the study of discourse, lexis, and grammatical variation include the emphasis on the representativeness of the text sample, and the computational tools for investigating distributional patterns across discourse contexts. The recent edited volumes by Connor and Upton (2004b), Meyer and Leistyna (2003), Lindquist and Mair (2004), and Sampson and McCarthy (2004) provide good introductions to work of this type. There are also a number of book-length treatments reporting corpus-based investigations of grammar and discourse: for example, Aijmer (2002) on discourse particles, Collins (1991) on clefts, Granger (1983) on passives, Mair (1990) on infinitival complement clauses, Meyer (1992) on apposition, Römer (2005) on progressive verbs, Tottie (1991) on negation, and several books on nominal structures (e.g., de Haan, 1989; Geisler, 1995; Johansson, 1995; Varantola, 1984). The Longman Grammar of Spoken and Written English (1999) applies corpus-based analysis to a more comprehensive grammatical description of English, showing how any grammatical feature can be described for both structural characteristics and discourse patterns of use.
The recent book by Partington (2003) is interesting here in that it combines corpus-based study with an analysis of pragmatics, to investigate the discourse features of White House briefings. A corpus of 48 briefings (250,000 words of running texts) was subjected to computerized concordance and ‘keyword’ analysis. However, the computational analyses were guided by detailed qualitative analysis: “a summer reading the corpus briefings and making notes” (p. 12). This allowed Partington to check on oddities of computerized collocation analysis, highlighting odd language usage that computerized analysis might not have revealed.
A more specialized corpus-based approach to the study of language use is multi-dimensional (MD) analysis. Unlike most corpus-based research, MD studies investigate language use in individual texts. This approach describes how linguistic features co-occur in each text, resulting in more general patterns of linguistic co-occurrence that hold across all texts of a corpus. The approach can thus be used to show how patterns of linguistic features vary across individual texts, or across registers and genres. MD analysis is used in several chapters in the present book, and so it is introduced more fully in Appendix One.

1.2 Discourse studies of linguistic structure ‘beyond the sentence’

The second major approach to discourse analysis identified above – the study of linguistic structure ‘beyond the sentence’ – is the primary focus of the present book. Previous research on discourse-level structures has been undertaken from linguistic, cognitive, and computational perspectives. Linguistic Perspectives: Linguistic analyses of discourse structure have focused on lexico-grammatical features that indicate the organization of discourse (see, e.g., the papers in Coulthard, 1994). Focusing on units beyond the sentence-level (e.g., paragraphs in written discourse and episodes in oral discourse), these researchers investigate linguistic devices that signal the underlying discourse structure. Much research of this type has described the discourse functions of particular words and phrases, referred to as ‘discourse markers’, ‘connectives’, ‘discourse particles’ (Schiffrin, 1994), ‘lexical phrases’ (Hansen, 1994; Nattinger & DeCarrico,
1992), or ‘cue phrases’ (Passonneau & Litman, 1996). Other studies discuss the linguistic devices used to mark information structure, topical development, or ‘rhetorical’ structures in discourse (e.g., Mann, Matthiessen, & Thompson, 1992; Mann & Thompson, 1988; Prince, 1981). Finally, some studies track the use of linguistic devices across a text. For example, discourse ‘maps’ are used to track verb tense and voice patterns across the sections of research articles (Biber et al., 1998, Chapter 5), while other studies track referential expressions used in anaphoric chains throughout a text (e.g., Biber, 1992; Fox, 1987; Givón, 1983). A related area of research is the study of textual ‘cohesion’: the use of lexical and grammatical devices as the ‘glue’ of a text, holding the text together as discourse rather than an accidental sequence of sentences (see, e.g., Halliday, 1989; Halliday & Hasan, 1976; Hoey, 1991; Phillips, 1985; Tyler, 1995). Linguistic devices used to establish cohesion include anaphoric pronouns, linking adverbials, and the use of lexical repetition and synonymy to establish topical cohesion. Similarly, Tannen (1989) found that repetitions in conversation “operate as a kind of theme-setting” at the beginning of a topical unit and “at the end, forming a kind of coda” (p. 69). Cognitive perspectives: Cognitive investigations of discourse structure study the factors that make a text ‘coherent’. Text coherence refers to the linking of ideas within a text to create meaning for readers. Analyses of textual coherence typically identify the propositions expressed in a text, the logical relations among those propositions, and how listeners/readers are able to construct the overall textual meaning in terms of those propositional relations. In contrast to the study of cohesion, which refers to surface-level patterns, coherence entails the study of larger discourse relationships. 
Many of these studies describe texts in terms of the coherence relations expressed by clause-level propositions (Bateman & Rondhuis, 1997; Dahlgren, 1996; Hobbs, 1979; Sanders, 1997; Sanders & Noordman, 2000). Related studies also consider other factors that influence coherence, including differences between subject versus presentational matter (Mann & Thompson, 1988), text structural patterns such as problem-solution (Connor, 1987) and given-new (theme-rheme) structures (Cooper, 1988), and the semantic and pragmatic relations between units (Polanyi, 1985, 1988; Sanders, 1997). Several researchers have developed analytical frameworks for the study of coherence relations (e.g., Grosz & Sidner, 1986; McNamara & Kintsch, 1996; Tomlin, Forrest, Ming Pu, & Hee Kim, 1997; Van Dijk, 1981, 1997; Van Dijk & Kintsch, 1983). The ongoing flow of information is also central to coherence (Grabe & Kaplan, 1996). Studies have approached information flow from various perspectives, including representations of the flow of thought (Chafe, 1994, 1997) or short-term memory (Tomlin et al., 1997).
Computational perspectives: Computational studies of discourse organization have attempted to model discourse organization for the purposes of information retrieval and natural language processing. Most computational studies of discourse structure have focused on written texts. For example, Morris and Hirst (1991) developed a lexical algorithm to find chains of related terms, which can be used to describe the structure of texts, applying Grosz and Sidner’s (1986) attentional/intentional model. Marcu (2000) explores the feasibility of automatic rhetorical parsing, applying Mann and Thompson’s (1988) Rhetorical Structure Theory. One important study for the purposes of the present book is Youmans (1991; 1994), who developed the Vocabulary Management Profile (VMP), a computational method to track the introduction of new vocabulary into a text. Youmans shows that VMPs are quite sensitive indicators of the episodic structure of written literary texts, suggesting that the VMP graph “provides a direct visual analogue for constituent structure” (p. 113). Youmans compared the results of the VMP to the paragraph boundaries of literary texts and found 80 percent agreement. Fewer computational studies have focused on the discourse structure of spoken discourse. One of the best known of these, Passonneau and Litman (1996; 1997), attempts to automatically segment spoken texts (spontaneous, narrative monologues) into discourse units, based on the use of referential noun phrases, cue words, and pauses. This study further compares the results of the automatic segmentation to perceptually-identified discourse units.
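The idea of tracking the introduction of new vocabulary, as in the Vocabulary Management Profile described above, can be illustrated with a minimal sketch. It assumes a fixed-width moving window and reports, for each window position, the proportion of tokens that are first occurrences in the text. The function name, the window size, and the lowercasing step are illustrative assumptions, not details taken from Youmans' procedure.

```python
def vocabulary_management_profile(tokens, window=35):
    """For each window of `window` consecutive tokens, return the share of
    tokens that appear in the text for the first time. Peaks suggest stretches
    that introduce new vocabulary; valleys suggest repetition of earlier words."""
    seen = set()
    is_new = []
    for token in tokens:
        key = token.lower()
        # 1 marks the first occurrence of a word type, 0 marks a repetition.
        is_new.append(0 if key in seen else 1)
        seen.add(key)

    if len(tokens) < window:
        return []

    # Slide the window one token at a time, updating the running count.
    count = sum(is_new[:window])
    profile = [count / window]
    for i in range(window, len(tokens)):
        count += is_new[i] - is_new[i - window]
        profile.append(count / window)
    return profile
```

Plotting the returned values against token position gives a curve whose rises could then be compared with paragraph or section boundaries, which is the kind of comparison the passage reports.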

1.3 Discourse studies of social practices and ideological assumptions associated with communication

Finally, the third approach to ‘discourse’ – the study of communicative social practices and ideological assumptions – focuses on the social construction of discourse rather than the linguistic description of particular texts. For example, proponents of the New Rhetoric (e.g., Bazerman, 1988, 1994; Berkenkotter & Huckin, 1995; Miller, 1984) have argued for the importance of understanding the knowledge of social context surrounding texts for helping writers select rhetorical strategies that work in a given situation. The focus here is to look not only at the products (texts) but also the processes surrounding the production and consumption of texts, asking “Why are specific discourse-genres written and used by the specialist communities the way they are?” (Bhatia, 1993a, p. 11).
In an attempt to understand the broader social contexts of the discourse, several recent corpus-based studies have added analyses of interviews and focus group discussions with actual writers and readers of the texts or other academic specialists. For example, Hyland (2000) goes beyond the textual approach to discourse analysis of academic articles by adding focus groups, unstructured interviews, and discourse-based interviews with subject specialists from those disciplines, although the interviewees were not the writers of the articles in Hyland’s corpus. The focus groups and the first part of the one-to-one interviews used a semi-structured format and encouraged the informants to speak generally about communication and publication practices in their fields. The second stage used a discourse-based interview which involved detailed discussions about particular pieces of writing. The informants responded as members of the particular discourse community as they interpreted meanings, reconstructed writer motivations, and evaluated rhetorical effectiveness. They were also encouraged to discuss specific points in their own work by referring to a paper they had written.
In another corpus study, Hyland (2004b) analyzed a corpus of 240 dissertations by L2 writers at Hong Kong universities, together with interviews with 24 students. The interviews helped in understanding the use of the analyzed metadiscourse markers – transitions, frame markers, endophoric markers, evidentials, code glosses, hedges, boosters, attitude markers, engagement markers, and self-mentions. Such qualitative analyses can shed light on disciplinary differences as well as differences between MA and PhD level writers even if the interviewees are not the actual writers.
Unlike many qualitative studies of texts and writing, in which the researcher observes, interviews, and works with the actual writer or writers (see, e.g., Bazerman & Prior, 2004), corpus studies tend to rely on anonymous writers who are members of the particular discourse community. In many cases, corpora are constructed from published resources, rather than being collected from writers personally, making it nearly impossible to obtain information about the writers and the circumstances of writing.
However, as in the Hyland studies cited above, it is possible to combine corpus-based analysis with the careful observation of individual writers. For example, Connor & Mauranen (1999) undertook a large-scale corpus analysis of rhetorical moves in grant proposal writing in the sciences and humanities. This study was later complemented by detailed interviews with five scholars in these disciplines (Connor, 2000). These scholars were not the writers of the proposals in the large corpus. However, as specialist informants they were able to comment on the appropriateness of the move definitions and the identification of move boundaries in a small corpus of their own proposals.

1.4 “Register” and “genre” perspectives on discourse

The terms ‘register’ and ‘genre’ have been central to previous investigations of discourse. Both terms have been used to refer to varieties associated with particular situations of use and particular communicative purposes. Many studies simply adopt one of these terms and disregard the other. In some cases, these authors
might be assuming a theoretical distinction between the two terms, but that distinction is usually not explicitly noted. For example, studies like Bhatia (2002), Samraj (2002), Bunton (2002), Love (2002), and Swales (2004) exclusively use the term ‘genre’. In contrast, studies like Ure (1982), Ferguson (1983), Hymes (1984), Heath and Langman (1994), Bruthiaux (1994; 1996), Conrad (2001), and Biber et al. (1999) exclusively use the term ‘register’. A few studies attempt to define a theoretical distinction between the constructs underlying these two terms. For example, Ventola (1984) and Martin (1985) refer to register and genre as different ‘semiotic planes’: genre is the ‘content-plane’ of register, and register is the ‘expression-plane’ of genre; register is in turn the ‘content-plane’ of language. Lee (2001) surveys the use of these terms, providing one of the most comprehensive discussions of how they have been used in previous research (as well as terms like text type and style). When research studies have attempted to distinguish between register and genre (such as Couture, 1986; Ferguson, 1994; Martin, 1985; Swales, 1990; Ventola, 1984), the distinction has been applied at two different levels of analysis: 1) to the object of study; 2) to the characteristics of language and culture that are investigated. Thus, the term register (when it is distinguished from genre) has been used to refer to a general kind of language associated with a domain of use, such as a ‘legal register’, ‘scientific register’, or ‘bureaucratic register’. Register studies have usually focused on lexico-grammatical features, showing how the use of particular words and grammatical features varies systematically in accord with the situation of use (factors such as interactivity, personal involvement, mode, production circumstances, and communicative purpose).
As such, the term register has been associated with the first general approach to discourse identified in Section 1 above – the study of language use. In contrast, the term genre has been used to refer to a culturally recognized ‘message type’ with a conventional internal structure, such as an affidavit, a biology research article, or a business memo. Genre studies have usually focused on the conventional discourse structure of texts or the expected socio-cultural actions of a discourse community. For example, genres are “how things get done, when language is used to accomplish them” (Martin, 1985, p. 250), and “frames for social action” (Bazerman, 1997b, p. 19). As such, the term genre is often associated with the second general approach to discourse identified in Section 1 above – the study of linguistic structure ‘beyond the sentence’. In his previous work on linguistic variation, Biber has disregarded theoretical distinctions between the terms register and genre, preferring the term genre in earlier studies (e.g., Biber 1986, 1988) and the term register in later research (Biber,
1995, 2006b). In both cases, these were used simply as a general cover term to refer to situationally-defined varieties described for their characteristic lexico-grammatical features, with no implied theoretical distinction between register and genre. However, in the present book we are focused especially on the internal structure and organization of texts from a specific variety (e.g., fundraising letters or biology research articles), a perspective typically associated with the analysis of a genre rather than register. For this reason, we adopt the term genre throughout the book to refer to the linguistic variety being analyzed.

1.5 Identifying structural units in discourse

One specific research emphasis for discourse studies of structure ‘beyond the sentence’ has been the attempt to segment a text into higher-level structural units. These studies are foundational to the goals of the present book, because the ‘units of analysis’ in corpus-based studies of discourse structure must be well-defined discourse units: the segments of discourse that provide the building blocks of texts. In studies of written texts, discourse units have generally been identified based on visual as well as textual clues (see, e.g., Hunston, 1994). The smallest unit of analysis has usually been the proposition, followed by the t-unit or sentence, the paragraph, and finally the chapter or the whole text (Meyer, 1985). Such units are identified by written para-linguistic devices (such as sentence punctuation and paragraph indenting), rather than analysis of textual content or function. Other studies have considered the initiation of new topics within a text. Investigating written fiction, Youmans (1991, p. 774) claimed that syntactic function words “do not denote new topics”, whereas content words do. Similarly, Fox (1987) found that, in expository writing, full noun phrases are more likely than pronouns to indicate the start of a new topic. In spoken discourse (especially conversation) it has proved especially difficult to determine what constitutes a new topic, resulting in a reliance on qualitative or impressionistic findings. As Tannen (1984, p. 38) notes, the boundaries of the shifting topics in conversation are not always clearly and readily identifiable, and the initiation of new topics is often unclear (see also Tannen, 1984, 1989; Van Dijk, 1997). Some research has suggested that prosodic and linguistic cues can be used to determine topical boundaries in oral discourse. 
For example, pauses, hesitations, false starts, change in pitch, discourse particles, preposed adverbials, summary statements, and evaluative comments have all been proposed as linguistic markers that signal a discourse shift in theme or topic (e.g., Brown & Yule, 1983; Gee, 1986; Korolija & Linell, 1996; Polanyi, 1985; Stubbs, 1983; Tannen, 1987; Van Dijk, 1981).

In general, these studies have focused on linguistic devices that signal the transition from one topic to the next, but they have not attempted to rigorously segment complete texts into well-defined discourse units. However, this is exactly the task that must be accomplished for corpus-based analyses of discourse structure: we need comprehensive identification of the structural discourse units within all texts in the corpus. Two general approaches to text segmentation have been employed in previous corpus-based research: top-down and bottom-up methods of segmentation. The following section discusses these two approaches in more detail.

2 Corpus-based investigation of discourse structure

As summarized in the sections above, research on the linguistic characteristics of texts and discourse has been carried out from two major perspectives: one focusing on the distribution and functions of surface linguistic features – corpus studies of language use in discourse (which typically disregards the existence of individual texts) – and the second focusing on the internal organization of texts – discourse studies of linguistic structure ‘beyond the sentence’ in particular texts. Discourse studies of language use have usually been quantitative, and in more recent years, they have been carried out on large text corpora using the techniques of corpus linguistics; these studies often compare the linguistic characteristics of discourse from different spoken and written registers. Studies of the second type have usually been qualitative and based on detailed analysis of a small number of texts; these studies usually focus on the internal structure of a few texts from a single genre, such as scientific research articles. Römer (2005) is a good example of the first approach. This study describes the use of progressive verb phrases in spoken English, based on analysis of the British National Corpus and the Bank of English. Rather than focusing on the organization of any particular text, the study focuses on the overall patterns of distribution and use, considering factors such as the tendency of progressives to occur with different tenses and aspects; occurrence with different subject types or object types; occurrence with different adverbials; and the tendency to occur with specific verbs and verb classes. In contrast, the chapters in Mann and Thompson (1992) are good examples of the second approach. This book is based on analysis of a single fundraising letter, showing how the discourse structure and organization of that single text can be analyzed from different perspectives.
Surprisingly, few studies have attempted to combine these two research perspectives. On the one hand, most corpus-based studies have focused on the quantitative distribution of lexical and grammatical features, generally disregarding the language used in particular texts and higher-level discourse structures or other
aspects of discourse organization. On the other hand, most qualitative discourse analyses have focused on the analysis of discourse patterns in a few texts from a single genre, but they have not provided tools for empirical analyses that can be applied on a large scale across a number of texts or genres. As a result, we know little at present about the general patterns of discourse organization across a large representative sample of texts from a genre. One of the major methodological problems to be solved by any corpus-based analysis of discourse structure is deciding on a unit of analysis. That is, the first step in an analysis of discourse structure is to identify the internal discourse segments of a text, corresponding to distinct propositions, topics, or communicative functions; these discourse segments become the basic units of the subsequent discourse analysis. For a corpus study of discourse structure, all texts in the corpus must first be analyzed for their component discourse units. However, such analyses were not even possible based on early text corpora, because they were composed of ‘text-files’ rather than complete ‘texts’. For example, ‘text files’ in the Brown, LOB, and London-Lund Corpora were defined by length – 2,000 words long in the case of Brown and LOB, and 5,000 words long in the case of London-Lund. In some cases, a single text file combines multiple ‘texts’, while in other cases a ‘text’ is truncated in a text file when the word limit is reached. This characteristic of early corpora might help to explain why most previous corpus studies have not considered individual texts at all. Rather, the analysis has reported general patterns for the corpus as a whole, or it has compared overall results for various sub-corpora (e.g., the overall frequency of progressive verbs in a conversational sub-corpus compared to the frequency in a sub-corpus of academic writing). 
More recently, corpora such as the BNC and T2K-SWAL have been designed to include complete texts, such as complete chapters from a book or complete research articles. It is thus possible, in theory, to analyze the internal discourse structure of each text in the corpus, and to then discover general patterns of discourse organization that hold across all texts in the corpus. To achieve this goal, corpus texts must first be segmented into well-defined discourse units, and then those units can be used to identify the general ways in which the discourse of corpus texts is organized. In the following section, we introduce the two major corpus-based approaches that can be applied to these research goals.
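
The text-file problem described above for early corpora can be made concrete with a toy model. The sketch below is only an illustration: the function name `make_text_file`, the sample texts, and the six-word limit are assumptions chosen for readability, and it does not reproduce the sampling procedure actually used for the Brown, LOB, or London-Lund corpora.

```python
def make_text_file(texts, limit):
    """Fill one fixed-length 'text file' from a list of (name, words) texts.

    Returns the words placed in the file and, for each contributing text,
    the number of its words that were kept.
    """
    file_words = []
    kept = {}
    for name, words in texts:
        room = limit - len(file_words)
        if room <= 0:
            break  # the word limit has been reached
        taken = words[:room]  # a text is cut off if it does not fit
        file_words.extend(taken)
        kept[name] = len(taken)
    return file_words, kept


# Hypothetical short texts; real text files held 2,000 or 5,000 words.
texts = [
    ("letter_a", ["a"] * 3),
    ("letter_b", ["b"] * 5),
    ("letter_c", ["c"] * 4),
]
file_words, kept = make_text_file(texts, limit=6)

# Texts that entered the file but lost part of their content.
truncated = [
    name for name, words in texts
    if name in kept and kept[name] < len(words)
]
```

In this toy case a single file combines two texts and truncates the second, so neither the beginning-to-end structure of `letter_b` nor the boundary between the two texts can be recovered from the file alone.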

3 Top-down versus bottom-up corpus-based approaches to discourse analysis

To achieve generalizable corpus-based descriptions of discourse structure, seven major analytical steps are required:

– Determining the types of discourse units – the functional/communicative distinctions that discourse units can serve in these texts (‘Communicative/Functional Categories’)
– Segmenting all texts in the corpus into well-defined discourse units (‘Segmentation’)
– Identifying and labeling the type (or category) of each discourse unit in each text of the corpus (‘Classification’)
– Analyzing the linguistic characteristics of each discourse unit in each text of the corpus (‘Linguistic analysis of each unit’)
– Describing the typical linguistic characteristics of each discourse unit type, by comparing all discourse units of a given type across the texts of the corpus (‘Linguistic description of discourse categories’)
– Describing the discourse structures of particular texts as sequences of discourse units, in terms of the general type or category of each of those units (‘Text structure’)
– Describing general patterns of discourse organization that hold across all texts of the corpus (‘Discourse organizational tendencies’)

These seven steps can be achieved through either a top-down research approach or a bottom-up research approach. The two approaches differ primarily in the order of analytical steps. In a top-down approach, the analytical framework is developed at the outset: the discourse unit types are determined before beginning the corpus analysis, and the entire analysis is then carried out in those terms. In a bottom-up approach, the corpus analysis comes first, and the discourse unit types emerge from the corpus patterns. Tables 1.1 and 1.2 summarize the major differences between these two analytical approaches.

Table 1.1  Top-down corpus-based analyses of discourse organization

Required step in the analysis | Realization in this approach
1. Communicative/Functional Categories | Develop the analytical framework: determine the set of possible functional types of discourse units, that is, the major communicative functions that discourse units can serve in the corpus
2. Segmentation | Segment each text into discourse units (applying the analytical framework from Step 1)
3. Classification | Identify the functional type of each discourse unit in each text of the corpus (applying the analytical framework from Step 1)
4. Linguistic analysis of each unit | Analyze the lexical/grammatical characteristics of each discourse unit in each text of the corpus
5. Linguistic description of discourse categories | Describe the typical linguistic characteristics of each functional category, based on analysis of all discourse units of a particular functional type in the corpus
6. Text structure | Analyze complete texts as sequences of discourse units shifting among the different functional types
7. Discourse organizational tendencies | Describe the general patterns of discourse organization across all texts in the corpus

In the top-down approach, the first step is to develop the analytical framework, determining the set of possible discourse unit types based on an a priori determination of the major communicative functions that discourse units can serve in these texts. That framework is then applied to the analysis of all texts in a corpus. Thus, when texts are segmented into discourse units, it is done by identifying a stretch of discourse of a particular type; that is, that serves a particular communicative function. In contrast, in the bottom-up approach, the first step is to automatically segment all texts in the corpus into discourse units (based on linguistic criteria). Those discourse units are then analyzed for many other linguistic features, and grouped into clusters of discourse units that are linguistically similar. Only then – after the discourse units have already been grouped linguistically – are those groupings interpreted as discourse unit types, by determining their typical functions in texts.


Table 1.2  Bottom-up corpus-based analyses of discourse organization

Required step in the analysis | Realization in this approach
1. Segmentation | Segment each text in the corpus into discourse units, based on shifts in vocabulary or other linguistic features
2. Linguistic analysis of each unit | Analyze the full range of lexical/grammatical characteristics of each discourse unit in each text of the corpus
3. Classification | Identify the set of discourse unit types that emerge from the corpus analysis, based on linguistic criteria; that is, group all discourse units in the corpus into linguistically-defined categories or ‘types’
4. Linguistic description of discourse categories | Describe the typical linguistic characteristics of each discourse category, based on analysis of all discourse units of a particular type in the corpus
5. Communicative/functional categories | Describe the functional bases of each discourse category, based on post-hoc analysis of the discourse units identified as belonging to a particular type
6. Text structure | Analyze complete texts as sequences of discourse units shifting among the different functional types
7. Discourse organizational tendencies | Describe the general patterns of discourse organization across all texts in the corpus

3.1 Examples of top-down analyses of discourse

Several top-level discourse structure theories were advanced by text linguists in the 1980s and 1990s. Theories of superstructures were developed for different types of texts such as exposition, argumentation, and narration. These superstructures of texts were called macrostructures by Van Dijk (1980), problem-solution patterns by Hoey (1983; 1986), superstructures of arguments by Tirkkonen-Condit (1985), and story grammars by Mandler and Johnson (1977). Story grammar analysis had its start in the work of Labov and Waletzky (1967), who proposed the following structure for analyzing oral narratives: orientation (the major characters are introduced and a setting is established); complication (a series of events unfold, and a crisis develops); resolution (the crisis is solved); and coda (the final stage, in which the writer may express an attitude toward the story or give her perspective on its significance). Although developed for oral texts originally, the story grammar analysis became a popular tool in written discourse analysis. Martin and Rothery (1986) used it effectively as a research and teaching method for school writing in Australia.

There are other approaches to the analysis of text structure that could be classified as being top-down in nature. Mann and Thompson (1992) in their book, Discourse Description: Diverse Linguistic Analyses of a Fund-raising Text, showcase seven different methods for looking at the text organization of a single fundraising letter. One, described by Callow and Callow (1992), is somewhat like the appeals analysis described below, except that the focus is on identifying the kinds of intended meanings (rather than appeals) that reflect the writer’s purposes. These different meaning purposes (e.g., informative, expressive, and conative [expressing desires and intentions]) can be used to analyze the meaning-based structure of the text. In their chapter of the book, Mann, Matthiessen, & Thompson (1992) use Rhetorical Structure Theory (RST) to analyze the “relational structure” of a text. At its most basic level, RST identifies coherence in a text – that is, how different parts of a text relate to each other, or more specifically how one part of a text supports, elaborates, provides background for, offers contrast to, justifies, etc., another part of the text. By looking at these relationships, the rhetorical structure of the texts in a corpus could also be mapped out (see also Fox, 1987, Chapters 4–5). Connor (1996) pointed out that the above kinds of analyses provided a new development in written discourse analysis. Researchers became keenly aware that different textual modes (e.g., narration, exposition, argumentation) used different discourse structures. Unlike the study of cohesion, for example, the analysis of superstructures was specific to a text type. The increased interest in specific genres has further stimulated research on discourse structures of texts. ‘Move analysis’ (Swales, 1981, 1990) is an example of such a specific genre analysis.
Move analysis was developed as a top-down approach to analyze the discourse structure of texts from a genre; the text is described as a sequence of ‘moves’, where each move represents a stretch of text serving a particular communicative function. The analysis begins with the development of an analytical framework, identifying and describing the move types that can occur in this genre: these are the functional/ communicative distinctions that moves can serve in the target genre. Subsequently, selected texts are segmented into moves, noting the move type of each move. The overall discourse structure of a text can be described in relation to the sequence of move types. For example, a research article might begin with a move that identifies the topic and reviews previous research, followed by a move that identifies a gap in previous research, followed by a move that outlines the goals of the present study, summarizes the major findings, and outlines the organization of the paper. Until recently, top-down approaches (including move analysis) have not been applied to an entire corpus of texts, because it is highly labor-intensive to apply a top-down analytical framework to a large corpus of texts. However, this investment of labor pays off by enabling generalizable analyses of discourse structure
across a representative sample of texts from a genre. For example, once a corpus of texts has been coded for moves, we can easily analyze the typical linguistic (lexical and grammatical) characteristics of each move type. It is then possible to identify the sequences of move types that are typical for a genre, and against that background, it is also possible to identify particular texts that use more innovative sequences of move types. In summary, corpus-based move analyses illustrate the top-down approach: the functional analytical framework is developed first; that framework is then applied to segment texts into discourse units (moves); and finally the moves and functional move types are analyzed to describe their linguistic characteristics. Chapters 3–4 in the present book illustrate this general approach to discourse structure. Rhetorical appeals analysis is another top-down approach (see Chapter 5). Instead of describing texts according to their communicative functions (‘moves’), rhetorical appeals analysis divides texts into sections using the three basic means of Aristotelian persuasion: ethos, pathos, and logos. Similar to move analysis, this approach begins with the development of an analytical framework, identifying and defining the appeal types. The texts in a corpus are then analyzed by applying this analytical framework: segmenting texts into appeals, noting the appeal type of each appeal. In practice, most previous discourse analyses have been top-down. However, there have been few previous top-down studies of discourse applied to an entire corpus of texts, in large part because the analyses are so labor-intensive. In the present book, we illustrate two particular top-down approaches to discourse: move analysis (Chapters 3–4) and rhetorical appeals analysis (Chapter 5).

3.2 Example of bottom-up approach

In contrast to the long research tradition applying top-down analyses of discourse, the bottom-up approach was only recently developed, specifically for corpus-based analyses of discourse structure. This approach has not been previously practiced by discourse analysts because it requires advanced computational techniques and does not make sense for the analysis of an individual text. That is, a discourse analyst traditionally begins by considering the communicative-functional context of a text, and relies on those considerations to identify the components of the text, and how a text is organized in those terms. In contrast, the bottom-up approach was developed to address the methodological problem of how discourse patterns could be analyzed in a large corpus, with hundreds or thousands of texts. In theory, top-down analyses can also be applied to large text corpora, but in practice, such analyses are limited by the human resources that are available for manually coding discourse units in texts. The bottom-up
approach has no such limitations, because it incorporates automatic computational techniques which can be easily applied to the analysis of hundreds of texts. ‘Vocabulary-Based Discourse Unit’ (VBDU) analysis is the specific bottom-up approach illustrated in the present book. (See Chapter 6 for a detailed description.) The first step is to automatically segment texts into discourse units – the VBDUs. This is done using computational techniques, based on vocabulary repetition. At this stage, we know nothing about the underlying types of discourse units or the communicative functions served by these types. Then, in the second step, we undertake comprehensive linguistic descriptions of each VBDU (again utilizing automatic computational techniques). These linguistic descriptions are used to group VBDUs into categories, so that all the VBDUs in a grouping are similar linguistically. At that point, functional considerations become important, because the linguistic groupings of VBDUs are interpreted as functional VBDU-types. That is, each ‘type’ represents a grouping of VBDUs that are similar in their lexicogrammatical characteristics, and those groupings are interpreted to identify their typical discourse meanings and functions. Finally, the overall discourse organization of texts is described as sequences of VBDUs, noting the functional discourse type of each VBDU. One major difference between the two approaches is the role of the functional versus linguistic analyses. In the top-down approach, the functional framework is primary. Thus, the first step in the analysis is to determine the possible discourse unit types (e.g., move types) and provide an operational definition for each one. This functional framework is then used to segment texts into discourse units. Linguistic analysis is secondary in a top-down approach, serving an interpretive role to investigate the extent to which functionally-defined discourse units also have systematic linguistic characteristics. 
In contrast, the linguistic description is primary in the bottom-up approach. Texts are automatically segmented into VBDUs based on vocabulary patterns, and then VBDUs are grouped into categories based on the use of a wide range of lexico-grammatical features. Functional analysis is secondary in VBDU analysis, serving an interpretive role to investigate the extent to which linguistically-defined discourse unit categories also have systematic functional characteristics.
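
The first bottom-up step, segmentation based on vocabulary repetition, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the VBDU procedure described in Chapter 6: the window size, the similarity measure (Jaccard overlap of word types), the threshold, and all function names are hypothetical choices made for the example. The idea is to compare the vocabulary just before and just after each candidate position and to place a boundary where the overlap drops to a local minimum.

```python
import re


def tokenize(text):
    """Lower-case the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def overlap(left, right):
    """Jaccard overlap between the word types of two windows."""
    a, b = set(left), set(right)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_boundaries(words, window=4, threshold=0.1):
    """Return word indices at which a new discourse unit is assumed to start."""
    scores = {
        i: overlap(words[i - window:i], words[i:i + window])
        for i in range(window, len(words) - window + 1)
    }
    boundaries = []
    for i, score in scores.items():
        left = scores.get(i - 1, 1.0)
        right = scores.get(i + 1, 1.0)
        # A boundary is a low-overlap point that is also a local minimum.
        if score <= threshold and score <= left and score < right:
            boundaries.append(i)
    return boundaries


def split_units(words, boundaries):
    """Cut the word list into units at the given boundary indices."""
    starts = [0] + boundaries
    ends = boundaries + [len(words)]
    return [words[s:e] for s, e in zip(starts, ends)]


# Hypothetical toy text with an abrupt vocabulary shift in the middle.
text = ("cats purr cats sleep cats purr cats sleep "
        "funds help funds give funds help funds give")
words = tokenize(text)
boundaries = find_boundaries(words)
units = split_units(words, boundaries)
```

At this stage the units carry no functional labels. In the approach described above, labels would come only later, after the units have been described linguistically and grouped into categories.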

4 Creating a specialized corpus for discourse analysis

One of the central methodological issues for corpus-based research is to ensure that the corpus chosen for analysis actually represents the discourse domain being studied and is thus suitable for the research questions being investigated (see Biber 1993, 2004). This is of course no different than any other quantitative research in
the social sciences, where there is always concern that the ‘sample’ being studied actually represents the larger target ‘population’ (one of the potential threats to external ‘validity’). Corpus-based studies of discourse structure are potentially problematic in this regard for two related reasons:

1. Corpora are often designed for general use rather than a specific study. As a result, the population being represented can be relatively general, such as newspaper language, or even an entire language.
2. Researchers sometimes choose to use a corpus just because it is publicly available, with little consideration of whether that corpus actually represents the target population being investigated.

However, these problems can be readily addressed. Most corpora have been designed with relatively well-specified sub-corpora that represent particular text categories, such as academic research articles, newspaper editorials, or face-to-face conversation. When corpus studies have been based on particular sub-corpora, the findings have been much more interpretable. In addition, many recent corpora have been designed for more particular research purposes. For example, the T2K-SWAL Corpus – a relatively general corpus – was designed to represent the range of spoken and written genres used in American universities (including sub-corpora for office hours, study groups, textbooks, course syllabi, etc.; see Biber 2006b). The ICIC Fundraising Corpus is somewhat more specialized, designed to represent American fundraising discourse, including sub-corpora for genres like direct mail letters and grant proposals (see Connor & Upton, 2004a, 2004b; Upton, 2002; Upton & Connor, 2001). In general, more specialized corpora are more appropriate for the study of discourse structure. The corpora used in the present book are all relatively specialized, but they differ in the extent to which they represent a narrowly-defined genre.
At one extreme, the study reported in Chapter 4 of the present book is based on a highly restricted corpus of research articles published in biochemistry academic journals. Prior research was carried out to identify the five most prestigious academic journals in this discipline, and then research articles were collected over a 12-month period from those journals. The study reported in Chapter 7 is based on a corpus of research articles published in biology academic journals, but it deliberately includes a range of sub-disciplines in the sample. The study in Chapter 3 is based on analysis of the direct mail letters included in the ICIC Fundraising Corpus; these include letters from a wide variety of non-profit organizations across a wide variety of non-profit fields (e.g., health and human services, education). Finally, the corpus used in Chapter 8 is probably the least specialized, consisting of transcripts from university-level classroom teaching sessions collected across several different academic disciplines. However, all corpora used here are relatively specialized, restricted to particular genres. Such corpora are required for corpus-based studies of discourse structure: each text has its own discourse organization, and it is reasonable to hypothesize that all texts from a genre will tend to share similar patterns of discourse organization. Our goals in the present book are relatively straightforward: we hope to analyze corpora that represent particular genres, to describe the patterns of discourse organization in those genres and to investigate empirically the variation in discourse patterns across texts within a single genre.

5 Overview of the book

The book is organized into two parts, corresponding to the two major corpus-based approaches to discourse organization introduced in Section 3 above. Part I of the book focuses on ‘Top-down analyses of discourse organization’. Chapter 2 introduces top-down analysis in greater detail, describing the analytical procedures required for these analyses, with a special focus on genre-based ‘move’ analysis and the methodological issues that arise during the application of this approach to the analysis of a corpus of texts. Part I of the book then presents three case studies illustrating the top-down approach. The first case study (Chapter 3) describes how fundraising letters are structured in terms of rhetorical moves, focusing on the linguistic expression of ‘stance’ in the different move types. The second case study (Chapter 4) describes the typical discourse organizations of biochemistry research articles, again using move analysis as the primary analytical framework. Rather than focusing on a restricted set of linguistic features, this second case study undertakes a multi-dimensional analysis (see Appendix One) to describe the typical linguistic characteristics of move types in this genre with respect to a wide range of lexical and grammatical features. Finally, the last chapter in Part I of the book introduces a second top-down approach to discourse structure: ‘appeals’ analysis. This approach is applied to the same corpus of fundraising letters as in Chapter 3, allowing a direct comparison of these two analytical approaches. Part II of the book – ‘Bottom-up analyses of discourse organization’ – then deals primarily with Vocabulary-Based Discourse Unit (VBDU) analysis. Chapter 6 introduces this analytical framework in detail, describing both the analytical procedures and experimental research that explores the extent to which the automatically-identified VBDUs correspond to discourse units recognized on a perceptual basis by human raters.
Two case studies based on this approach are then presented: Chapter 7 presents a bottom-up analysis of a corpus of biology research articles, describing how texts from this genre are structured as sequences of VBDUs; Chapter 8 presents a similar analysis of VBDUs in university classroom teaching sessions. Finally, the concluding chapter (9) provides a synopsis of findings, a more theoretical discussion of the strengths and weaknesses of each approach, and a discussion of future prospects for investigations of this type.

Part 1

Top-down analyses of discourse organization

chapter 2

Introduction to move analysis
with Budsaba Kanoksilapatham

In Chapter 1, we introduced two different approaches for using corpora to analyze discourse organization: top-down and bottom-up corpus-based analyses. This chapter focuses primarily on one type of top-down approach: move analysis. We give a detailed description of move analysis – including what it is, what this type of analysis tells you, examples of studies using move analysis, steps for conducting a move analysis, and special considerations for and advantages of using a corpus-based approach. As noted in the previous chapter, there are many top-down approaches to discourse analysis, like the appeals analysis described in Chapter 5; move analysis, however, is the approach that has been most frequently used to date in corpus-based studies. Chapters 3 and 4 provide specific examples of these kinds of studies. The intent of the present chapter is to introduce the goals and methods of corpus-based move analysis (as one common type of top-down discourse analysis), in order to show how generalizable corpus-based descriptions of discourse organizational patterns can be achieved using a top-down approach.



Genre analysis using rhetorical moves was originally developed by Swales (1981) to describe the rhetorical organizational patterns of research articles. Its goal is to describe the communicative purposes of a text by categorizing the various discourse units within the text according to their communicative purposes or rhetorical moves. A move thus refers to a section of a text that performs a specific communicative function. Each move not only has its own purpose but also contributes to the overall communicative purposes of the genre. In Swales’ words, these purposes together constitute the rationale for the genre, which in turn “shapes the schematic structure of the discourse and influences and constrains choice of content and style,” with texts in a genre exhibiting “various patterns of similarity in terms of structure, style, content and intended audience” (1990, p. 58).



Genre analysis was developed in the 1970s and 1980s as part of the wider growth of discourse analyses focusing on the organization of discourse. Bhatia (2004) documents how structural concerns, for example Hoey’s (1983) problem-solution structure analysis, directed the analyst’s attention away from studying lexico-grammatical features of texts (e.g., passives and nominalizations, use of tenses, coherence). Researchers involved in the analysis of text as genre further related discourse structures to the communicative functions of texts, resulting in the current approach of doing genre analysis using rhetorical moves.

In genre analysis, the purposes of the genre are recognized by the expert members of the discourse community, less so by the novice members, and probably not by the nonmembers. These purposes shape the rationale, and the rationale helps develop the constraining conventions. According to Swales (1990), these conventions are constantly changing but still exert influence. As we will see in later chapters, discourse communities are powerful in shaping the conventions of the genre. Research papers in scholarly disciplines are good examples of such discourse communities where novice writers are indoctrinated into the paper-writing genre in their graduate studies and young publishing lives. There are genres, however, which are not shaped by such strong discourse community rationales. Take fundraising letters as an example. It is fair to say that both writers and readers recognize a fundraising letter as such. However, since readers and potential donors do not typically write them, conventions may not be so strictly adhered to. In fact, deviance from conventions may seem fresh to the reader who may receive hundreds of them a year but does not need to worry about writing any.
In move analysis, the general organizational patterns of texts are typically described as consisting of a series of moves, with moves being functional units in a text which together fulfill the overall communicative purpose of the genre (Connor, Davis, & De Rycker, 1995). Moves can vary in length, but normally contain at least one proposition (Connor & Mauranen, 1999). Some move types occur more frequently than others in a genre and can be described as conventional, whereas other moves that occur less frequently can be described as optional. Moves may contain multiple elements that together, or in some combination, realize the move. These elements are referred to as ‘steps’ by Swales (1990) or ‘strategies’ by Bhatia (1993a). The steps of a move primarily function to achieve the purpose of the move to which they belong (see, e.g., Crookes, 1986; Dudley-Evans, 1994a; Hopkins & Dudley-Evans, 1988; Swales, 1981, 1984, 1990). In short, moves represent semantic and functional units of texts that have specific communicative purposes; in addition, as the following sections show, moves generally have distinct linguistic boundaries that can be objectively analyzed.


2 Swales’ move analysis of research articles

Swales (1981) developed the discourse approach of move analysis within the more general field of English for Specific Purposes (ESP). This approach has been revised and extended by several scholars, including Swales (1990). The original aim of Swales’ work on move analysis was to address the needs of advanced non-native English speakers (NNSs) learning to read and write research articles, as well as to help NNS professionals who want to publish their articles in English. His analysis of 48 introduction sections in research articles from a range of disciplines (physics, medicine, and social sciences), written in English, led Swales to propose a series of moves – i.e., specific communicative functions performed by specific sections of the introductions – that defined the rhetorical structure of research article introductions. A closer examination of Swales’ move structure, or framework, for these introductions helps elucidate the interaction between moves and steps in performing communicative functions in scientific texts. Swales’ three-move schema for article introductions, collectively known as the Create a Research Space (CARS) model, is presented in Table 2.1. The model shows the preferred sequences of move types and steps, which are largely predictable in research article introductions.

Table 2.1  CARS model for research article introductions, adapted from Swales (1990, p. 141)

Move 1:  Establishing a territory
    Step 1     Claiming centrality and/or
    Step 2     Making topic generalization(s) and/or
    Step 3     Reviewing items of previous research

Move 2:  Establishing a niche
    Step 1A    Counter-claiming or
    Step 1B    Indicating a gap or
    Step 1C    Question raising or
    Step 1D    Continuing a tradition

Move 3:  Occupying the niche
    Step 1A    Outlining purposes or
    Step 1B    Announcing present research
    Step 2     Announcing principal findings
    Step 3     Indicating RA structure

Swales’ model includes three basic move types in research article introductions. Move 1 – Establishing a territory – introduces the general topic of research. Move 2 – Establishing a niche – identifies the more specific areas of research that require further investigation. And Move 3 – Occupying the niche – introduces the current research study in the context of the previous research described in Moves 1 and 2.

Move 1 can have a maximum of three steps (Step 1, Step 2, and Step 3). In Move 1, Step 1, Claiming centrality, the author can make a centrality claim by claiming interest or importance, by referring to the classic, favorite, or central perspective, or by claiming that there are many investigators in the area. This step is usually, but not always, at the beginning of the introduction. To illustrate Move 1, Step 1, Swales (1990) presents the following examples:

– The study of…has become an important aspect of…
– A central issue in…is the validity of…
(Swales, 1990, p. 144)

Move 1, Step 2, Making topic generalizations, represents a neutral kind of general statement. It usually takes the form of either statements about knowledge or practice, or statements about phenomena. Usually, this step seeks to establish territory by emphasizing the frequency and complexity of the data. Some examples of Move 1, Step 2 are:

– The aetiology and pathology …is well known.
– A standard procedure for assessing has been …
– There are many situations where…
(Swales, 1990, p. 146)

The last step of this move, Step 3, Reviewing items of previous research, is where the author reviews selected relevant groups of previous research. Here, the author specifies the important findings of those studies and situates his/her own current research study. Examples of Move 1, Step 3 are:

– X was found by Sang et al. (1972) to be impaired.
– Chomskyan grammarians have recently…
(Swales, 1990, p. 150)

In establishing territory, then, the author convinces the readers about the importance of the area of study by making strong claims with reference to previously published research, which can be done in three ways, as indicated by the three step options.
Move 2 of the CARS model, Establishing a niche for about-to-be presented research, is considered a key move in research article introductions because it connects Move 1 to Move 3, by articulating the need for the research that is being presented. Move 2 is manifested in one of four ways: Step 1A, Counter claiming; Step 1B, Indicating a gap; Step 1C, Question raising; and Step 1D, Continuing a tradition. The four options for realizing Move 2 are represented by the following examples, taken from Swales (1990, p. 154):

Step 1A, Counter Claiming          Emphasis has been on…, with scant attention given to…
Step 1B, Indicating a Gap          The first group...cannot treat and is limited to…
Step 1C, Question Raising          Both suffer from the dependency on…
Step 1D, Continuing a Tradition    A question remains whether…

The final move type that Swales proposed for research article introductions is Move 3, Occupying the niche. As noted earlier, Move 1 reports on the centrality of the research topic or generalizations about previous research. Move 2 expresses the authors’ own opinions about the need for the current research (with reference to the past literature). Importantly, Move 3 is distinct from the other two moves in the Introduction in that the authors assume a more active role in the research conducted, rather than just referring to previous studies or asserting the need for this one. In fact, Move 3 is the only place in the research article introduction where the authors express and enjoy their own accomplishment, pride, and commitment (Swales, 1990). Move 3 introduces new research by first either Stating research purpose(s) (Step 1A) or Describing the main features of the research (Step 1B), then by Announcing the principal findings (Step 2), and then finally by Indicating the research article structure (Step 3). Examples illustrating the steps of Move 3, taken from Swales (1990, p. 160), are:

Step 1A, Outlining Purpose                        The aim of the present paper is to give…
Step 1B, Announcing Present Research              This study was designed to evaluate…
Step 2, Announcing Principal Findings             The paper utilizes the notion of…
Step 3, Indicating Research Article Structure     This paper is structured as follows…

Swales’ CARS model for academic research articles has been widely studied and validated since it was first published in 1990. The model has been shown to have a recursive nature – what Swales has called “recycling” (1990, p. 140) – with moves or steps occurring more than once as well as with varied realizations in research writing across contexts. For example, Bunton (2002) has shown that the genre of Ph.D. thesis introductions, while having the same general CARS structure proposed by Swales, has some alternate ways for realizing the three basic moves. One example of this is in Move 1, Establishing a Territory; Bunton proposes that a new step ‘Defining terms’ plays an important part in fulfilling the function of helping to establish the territory to be covered in Ph.D. thesis introductions, while this is not the case for research article introductions. Indeed, subsequent research on the introduction section of research articles in other disciplines (see discussion below) has helped us recognize how different disciplines manipulate a common genre – in this case, research articles – to meet their own communicative needs. Our understanding of one small section of academic research articles – Introductions – has evolved from a “one size fits all” perspective to a more subtle, discipline-specific understanding of the rhetorical purposes and expectations of research articles.

Swales (2004), in response to this subsequent research, modified his model to better reflect the variability in how the three move types are realized in different sub-genres of research article introductions. His revised model, shown in Table 2.2, has a broader description of the communicative purposes of Move 1 and Move 2; it also reflects – particularly in Move 3 – the variation that occurs in introductions in different research fields, and recognizes the possibility of cyclical patterns of occurrence of the move types (described further below) within the introduction section.

Table 2.2  Swales’ revised model for research article Introductions (2004, pp. 230, 232)

Move 1:  Establishing a territory (citations required) via
    Topic generalizations of increasing specificity

Move 2:  Establishing a niche (citations possible) via:
    Step 1A:   Indicating a gap, or
    Step 1B:   Adding to what is known
    Step 2:    Presenting positive justification (optional)

Move 3:  Presenting the present work via:
    Step 1:    Announcing present research descriptively and/or purposively (obligatory)
    Step 2:    Presenting research questions or hypotheses* (optional)
    Step 3:    Definitional clarifications* (optional)
    Step 4:    Summarizing methods* (optional)
    Step 5:    Announcing principal outcomes (optional)**
    Step 6:    Stating the value of the present research (optional)**
    Step 7:    Outlining the structure of the paper (optional)**

* Steps 2–4 are less fixed in their order of occurrence than the others.
** Steps 5–7 are probable in some fields, but unlikely in others.


The key point here is that while related genres will certainly share common move types, each will have its own unique structural characteristics that reflect the specific communicative functions of that genre.

3 Move analysis of research articles applied across genres

3.1 Description and examples

While move analysis was originally developed as a tool to teach non-native speakers the rhetorical structures of research articles, Swales’ framework has been successfully extended to other areas of English for Specific Purposes (ESP) instruction, including English for Business and Technology (Bhatia, 1993a, 1997a) and English for Professional Communication (Flowerdew, 1993). Swales’ framework of move analysis has stimulated substantial research on the rhetorical structures of academic and professional texts. In academic writing, it has been applied to academic disciplines including biochemistry (Kanoksilapatham, 2005; D. Thompson, 1993), biology (Samraj, 2002), computer science (Posteguillo, 1999), and medicine (Nwogu, 1997; Williams, 1999), as well as to a variety of academic genres, including university lectures (S. Thompson, 1994), master of science dissertations (Hopkins & Dudley-Evans, 1988), and textbooks (Nwogu, 1991). Within the genre of scientific research articles – the original focus of move analysis – a number of move-based studies have focused on specific sections of research articles. For example, Crookes (1986) compared Introduction sections of research articles across a variety of fields; Wood (1982) described the moves of Methods sections in chemistry articles; D. Thompson (1993) and Williams (1999) focused on the moves of Results sections in biochemistry and medical research articles respectively; and Peng (1987) looked at the moves used in the Discussion section of chemical engineering research articles. Posteguillo (1999) – computer science – and Nwogu (1997) – medicine – both went a step further and explored the use of moves across multiple sections within the genres they investigated, and Kanoksilapatham (2005) has investigated the move structure of complete biochemistry research articles.
A more detailed description of how move analysis was used to describe the structure, and linguistic features, of entire biochemistry research articles is provided by Kanoksilapatham in Chapter 4 of this book. More recently, professional discourse has also been examined through the lens of move analysis, including legal discourse (Bhatia, 1993b), philanthropic discourse – focusing on direct mail letters (Upton, 2002; Upton & Connor, 2001) and grant proposals (Connor, 2000; Connor & Mauranen, 1999; Connor & Upton, 2004a) – and movie reviews (Pang, 2002).



A brief description of a move analysis done on a corpus of job application letters (Connor, Precht, & Upton, 2002) provides an interesting illustration of how different genres can have quite different move types. The letters in this study were from the Indianapolis Business Learner Corpus (IBLC), which included job application letters written by business students at U.S., Belgian, and Finnish universities between 1990 and 1998. The 99 letters in the corpus were generated by students (all business and/or English majors) as part of a common class assignment. Applying Swales’ approach to analyze the genre of job application letters, the following move types were identified:

Move 1:  Identify the source of information. (Explain how and where you learned of the position.)
    “I recently received word from Blockbuster Recruiting about a management position available at your company.”

Move 2:  Apply for the position. (State desire for consideration.)
    “I am very interested in a temporary job working as a European business student intern in the U.S.A.”

Move 3:  Provide arguments for the job application.
    Step 1:  Implicit arguments based on neutral evidence or information about background and experience. In providing supporting information or arguments, the writers simply list their background experience.
        “I received my Associates Degree in General Studies in May 1993. Previously I have received a degree in Office Management from Indiana Business College and I have obtained the Certified Professional Secretary (CPS) certification.”
    Step 2:  Arguments based on what would be good for the hiring company. In this step, the writer argues explicitly that their experience or education will benefit the company that hires them.
        “My intercultural training will be an asset to your international negotiations team.”
    Step 3:  Arguments based on what would be good for the applicant. In this step, the writer argues how the position would in fact be beneficial to him/herself.
        “The opportunity to study abroad the globalised business environment would help me gain the knowledge and experience to grow in the changing business world of today.”

Move 4:  Indicate desire for an interview or a desire for further contact.
    “I hope I got you interested so that I will be selected for an interview.”
    “I’m always prepared to participate in an interview.”

Move 5:  Express pleasantries or appreciation at the end of the letter.
    “Thank you in advance for your consideration.”
    “Thank you for your time in reviewing this material.”

Move 6:  Offer to provide more information.
    “I will be happy to provide you with any additional information that you may need.”

Move 7:  Reference attached resume.
    “I have enclosed my resume...”
    “A resume is enclosed.”

The most obvious difference between the move structure of research article introductions and the move structure of letters of application is that the former has only three major move types and the latter has seven. This is all the more interesting to note because research article introductions (with only three major moves) are typically much longer than letters of application. Three other important points are illustrated by comparing these two move structures. The first is the fact that moves are identified by the communicative purpose that the writer is seeking to accomplish, whether that be done in one sentence or five paragraphs. Consequently, moves can be quite variable in length. The second point is that some genres have a fairly simple move structure, with only three or four basic communicative functions, while other genres may have a fairly complex move structure, with many different communicative functions. The third point is that while some moves may be realized through two or more different ‘steps,’ other moves may only be expressed in one general functional-semantic way (e.g., Swales’ Move 1 has three steps, while Connor et al.’s Move 1 has no step options). There are two additional characteristics of moves that should be noted. The first is that some move types in a genre may be more common (or obligatory), while other moves may be optional. Lewin, Fine, and Young (2001) and Bhatia (1993a) are among those that underscore this characteristic of moves. Bhatia prefers the term ‘strategy’ as opposed to ‘step’, to reflect the variability among elements within a move: move elements may or may not regularly appear, and they can be used in different sequential order. In Chapter 3 of this volume, for example, we describe the variable move structure of direct mail letters; some of the move types in this genre are clearly optional, and there is a fairly free ordering of the moves within a given text. 
Similarly, Kwan (2006) shows that the third move (Occupying the niche) is optional in the literature review of Ph.D. theses of applied linguistics. In addition, it is possible that some move types will recur in a cyclical fashion within a section of text (Swales, 2004). Typically, the cyclical reoccurrence of a move within a section of text has been dealt with by considering each appearance of a particular move as a separate occurrence. For example, if a text starts with, say, Move Type 1, continues with Move Type 2, and then returns to Move Type 1, Move Type 1 would be counted as having occurred twice. The studies in Chapters 3 and 4 both used this approach to identify and count moves. More rarely, moves can be interrupted by – or have inserted into them – another move type (Upton, 2002). While this is rather unusual, there can be clear instances where one communicative functional unit (move type) of a text interrupts, often as an aside or a tangential comment, another very different communicative functional unit of text. The study described in Chapter 3 provides an example of this. These cyclical and embedded patterns of move types tend to occur mainly in genres that are less constrained and allow more variability than those that are more prescribed.

3.2 Summary of previous research on move analysis

To highlight key points introduced above, move analysis proposes that genres are composed of definable and, to a great extent, predictable functional components – that is, ‘moves’ of certain types. For example, article introductions typically have three rhetorical move types – establishing territory, establishing a niche, and occupying the niche. Letters of application have seven distinguishable move types as described above. According to Bhatia (1993a), the move structuring of a genre is the property of the genre itself, not something that the reader constructs. This structure is controlled by the communicative purpose(s) of the text, and is the underlying reason that one genre varies from another. The moves of a genre are considered such an inherent part of the genre that they can be used as the building blocks for teaching novice writers how to successfully write texts in that genre (Dudley-Evans, 1995), which, as already noted, was Swales’ initial motivation for exploring the structure of research article introductions.

4 Overview of the methods for move analysis

4.1 General steps of a move analysis

Kwan (2006) provides a useful introduction to the functional-semantic methods used for identifying discourse moves. A functional approach to text analysis calls for cognitive judgement, rather than a reliance on linguistic criteria, to identify the intention of a text and the textual boundaries (see also Bhatia, 1993a; Paltridge, 1994). This approach is in line with the theoretical definition of a move; that is, that each move has a local purpose but also contributes to the overall rhetorical purpose of the text.


It is important to note that there are no strict “rules” for doing a move analysis, nor does every researcher necessarily do each of the steps described below. The intent here is simply to describe common procedures in doing a move analysis. First, in order to identify the move categories for a genre, it is important to get a ‘big-picture’ understanding of the overall rhetorical purpose of the texts in the genre. The second step is then to look at the function of each text segment and evaluate what its local purpose is. This is the most difficult step. Move categories need to be distinctive. Multiple readings of, and reflections on, the texts are needed before clear categories emerge. The third step is to look for any common functional and/or semantic themes represented by the various text segments that have been identified, especially those that are in relative proximity to each other or often occur in approximately the same location in various texts representing the genre. These functional-semantic themes can then be grouped together, reflecting the various steps (or strategies) of a broader move type, with each move having its own functional-semantic contribution to the overall rhetorical purpose of the text. Swales proposed the first CARS move, Establishing a Territory, as it was clear that research article introductions almost always began with a section that functioned to provide a context for the study being introduced, whether this was done by claiming the centrality of the study (Step 1), and/or by making generalizations about the topic being studied (Step 2), and/or by reviewing items of previous research on the topic (Step 3). Not all research article introductions have all the steps, but most have at least one of them, serving the function of establishing the ‘territory’ for the study to follow. When a researcher is ready to segment a particular text into moves, it is best to begin first with a pilot coding, ideally with at least two coders.
Because coders are seeking to understand the functional-semantic purposes of text segments, coding must be done by hand. Initial analyses are then discussed and fine-tuned until there is agreement on the functional and semantic purposes that are being realized by the text segments, resulting in a protocol of move and step features for the genre, with clearly defined purposes and examples. For a corpus-based move analysis, this coding protocol is then applied to the full set of texts. Inter-rater reliability should be checked to confirm that there is agreement on what the move types are and how they are realized by text segments (see Section 4.2 below). At this point, it may be necessary to resolve any discrepancies through further discussion and analysis, and then re-code problematic texts. It is also not uncommon that additional steps or even move types will be discovered during the analysis of the full set of texts.

As noted earlier, some move structures can prove more complex than the three-move structure of the CARS model. For example, Bhatia (1998) has noted that fundraising discourse “offers a large variety of creative options” (p. 100; see also Chapter 3). In other words, some genres, especially dynamic and persuasion-oriented ones like fundraising letters, may have obligatory, typical, and optional move elements, and move types may not necessarily occur in a fixed order. Nevertheless, a move structure for a genre can still be identified by working through the general process outlined above. Table 2.3 summarizes the typical move analysis process as it is done in a corpus-based approach.

Table 2.3  General steps often used to conduct a corpus-based move analysis

Step 1:   Determine rhetorical purposes of the genre.
Step 2:   Determine rhetorical function of each text segment in its local context; identify the possible move types of the genre.
Step 3:   Group functional and/or semantic themes that are either in relative proximity to each other or often occur in similar locations in representative texts. These reflect the specific steps that can be used to realize a broader move.
Step 4:   Conduct pilot-coding to test and fine-tune definitions of move purposes.
Step 5:   Develop coding protocol with clear definitions and examples of move types and steps.
Step 6:   Code full set of texts, with inter-rater reliability check to confirm that there is clear understanding of move definitions and how moves/steps are realized in texts.
Step 7:   Add any additional steps and/or moves that are revealed in the full analysis.
Step 8:   Revise coding protocol to resolve any discrepancies revealed by the inter-rater reliability check or by newly ‘discovered’ moves/steps, and re-code problematic areas.
Step 9:   Conduct linguistic analysis of move features and/or other corpus-facilitated analyses.
Step 10:  Describe corpus of texts in terms of typical and alternate move structures and linguistic characteristics.

The ten steps outlined in Table 2.3 correspond to the general analytical steps for top-down analyses listed in Table 1.1 (in Chapter 1). For example, the analytical step “Communicative/Functional Categories” in Table 1.1 corresponds to Steps 1–5 in Table 2.3. The steps “Segmentation” and “Classification” from Table 1.1 in practice occur concurrently in Steps 6–8. The steps “Linguistic analysis of each unit” and “Linguistic analysis of discourse categories” from Table 1.1 are reflected in Step 9, and the final step is the same in both Table 1.1 and Table 2.3. While the process described here is not the only way to do a corpus-based move analysis, in the end, the move structure should represent the “rhetorical movement” (Swales, 1990, p. 140) of the functional-semantic purposes of the text segments that make up the genre, and all texts in the corpus must be coded for these distinctions.
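Once a corpus has been hand-coded, the descriptive summary in Step 10 can be tallied automatically. The following Python sketch is only an illustration under stated assumptions: the three coded letters, the labels M1 to M7 (borrowed from the job application example above), and the function names are all hypothetical, and the representation of a text as an ordered list of move labels is one possible choice rather than the procedure used in the studies reported in this book. It follows the convention that each reappearance of a move type counts as a separate occurrence.

```python
from collections import Counter

# Hypothetical hand-coded corpus: each text is the ordered list of move-type
# labels assigned to its segments by the coders (not data from any study).
coded_texts = {
    "letter_01": ["M1", "M2", "M3", "M4", "M5"],
    "letter_02": ["M2", "M3", "M2", "M4"],  # M2 recurs cyclically
    "letter_03": ["M1", "M2", "M3", "M7", "M5"],
}


def move_occurrences(coded):
    """Count every occurrence of each move type; a recurrence counts again."""
    counts = Counter()
    for moves in coded.values():
        counts.update(moves)
    return counts


def text_coverage(coded):
    """Proportion of texts in which each move type appears at least once."""
    present = Counter()
    for moves in coded.values():
        present.update(set(moves))
    return {move: n / len(coded) for move, n in present.items()}


def sequence_frequencies(coded):
    """Frequency of each complete move sequence across the corpus."""
    return Counter(tuple(moves) for moves in coded.values())


occurrences = move_occurrences(coded_texts)
coverage = text_coverage(coded_texts)
sequences = sequence_frequencies(coded_texts)

print(occurrences["M2"])  # 4 (twice in letter_02, once in each other letter)
print(coverage["M2"])     # 1.0 (present in every letter)
```

Coverage values of this kind are one way to separate move types that appear in most texts from those that appear in only a few; where the line between conventional and optional moves is drawn remains an analytical decision for the researcher.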



4.2 Inter-rater reliability

For top-down approaches to discourse analysis, the first methodological steps in the analysis involve human judgements to identify and code the discourse components of a text. This kind of analysis requires a detailed coding rubric, which explicitly defines the discourse components (e.g., the move types and steps). A minimal evaluation of this rubric is to determine whether raters can achieve high inter-rater reliability when they apply the coding scheme. That is, do different raters understand the coding definitions in the same way, with the result that they all identify the same discourse components in a text and all agree on the classification of those text segments as move types?

The simplest method of reporting inter-rater reliability is percent agreement. This statistic merely reflects the number of agreements per total number of coding decisions; it does not account for chance agreement among raters. A more common statistic for determining inter-rater reliability is Cohen’s kappa (κ). Cohen’s kappa is a chance-corrected measure of inter-rater reliability that assumes two or more raters, n cases, and m mutually exclusive and exhaustive nominal categories (Capozzoli, McSweeney, & Sinha, 1999).

Training is generally done to achieve better and more consistent inter-rater reliability, but more importantly, training encourages evaluators to examine the definitions in the coding rubric, and to arrive at a more explicit description of what each coding category represents. Inter-rater reliability should not be confused with objectivity or validity; it is rather just a measure of consistency and agreement. As noted by Raymond (1982), the degree to which inter-rater reliability is desirable varies with what is being evaluated: “It would be possible to achieve near perfect inter-rater reliability by simply counting the number of words produced…; but no one would seriously accept this as a measure of quality [of writing]. Because the quality of writing resides not entirely in the text, but in the interactions among the text, its author, and its individual readers, we should not only expect but actually demand a reasonable amount of variation among raters,” with an inter-rater reliability of .80 being acceptable (p. 401).

Much the same can be said about identifying move boundaries and coding move types. Moves, by definition, perform communicative functions within a text, but raters can differ in their understanding of the purpose of a specific text or portion of a text. Nevertheless, the process of identifying and discussing discrepancies increases inter-rater reliability among researchers and results in a more usable and consistently interpreted move framework for a genre.
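The difference between percent agreement and a chance-corrected statistic can be made concrete with a small sketch. The code below is illustrative only: it covers the two-rater case, it assumes both raters have coded the same pre-segmented units, and the move-type labels and codings are invented rather than taken from any corpus discussed here.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of coding decisions on which the two raters chose the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of segments")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement for two raters and nominal categories."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Expected chance agreement: product of the raters' marginal proportions,
    # summed over every category either rater used.
    p_expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n) for c in set(counts_a) | set(counts_b)
    )
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical move-type codes assigned by two raters to ten text segments.
rater_1 = ["M1", "M2", "M2", "M3", "M3", "M4", "M3", "M5", "M2", "M3"]
rater_2 = ["M1", "M2", "M3", "M3", "M3", "M4", "M3", "M5", "M2", "M2"]

print(f"percent agreement: {percent_agreement(rater_1, rater_2):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(rater_1, rater_2):.2f}")
```

In this invented example the raters agree on eight of ten segments, yet kappa comes out lower than the raw agreement rate because some of that agreement would be expected by chance given how often each rater uses each category.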


Discourse on the Move

5 Using a corpus-based approach to move analysis

5.1 Corpus-based move analysis

Much of the previous discussion has focused primarily on describing and discussing the theory behind and the process of doing a move analysis. Discourse analysis in general, and move analysis in particular, has typically been a qualitative approach to analyzing discourse, with studies focusing on only a few texts. This is well illustrated by the collection edited by Mann and Thompson (1992), which includes twelve different analytical approaches to analyzing the discourse of one single letter. In contrast, a corpus-based approach requires analysis of a well-designed ‘representative’ collection of texts of a particular genre. These texts are encoded electronically, allowing for more complex and generalizable research findings, revealing linguistic patterns and frequency information that would otherwise be too labor intensive to uncover by hand (Baker, 2006, p. 2). That is not to say that a corpus-based approach is simply a quantitative approach. Corpus-based discourse analysis depends on both quantitative and qualitative techniques. Even with a corpus-based approach, the moves and move types in each text must first be identified and tagged individually by the researchers making qualitative judgments about the communicative purposes of the different parts of a text. And even once quantitative data are run, the results must still be interpreted functionally. As has been noted previously, “Association patterns represent quantitative relations, measuring the extent to which features and variants are associated with contextual factors. However, functional [qualitative] interpretation is also an essential step in any corpus-based analysis” (Biber et al., 1998, p. 4). 
To summarize, a corpus-based approach to move analysis differs from the ‘traditional’ approach in the following ways:
a) analyses are done on a relatively large representative collection of texts from a particular genre;
b) all texts are electronically encoded to allow for computerized counts and calculations using different programs and software packages;
c) once the coding rubric for move types is developed, all texts in the corpus are coded to identify the moves and code the move types;
d) analysis of the linguistic characteristics of specific move types can be easily done in order to provide details about how different communicative purposes are realized linguistically; and
e) in addition to conducting the traditional move analysis, quantitative counts permit the discussion of general trends, relative frequency of particular move types, and prototypical and alternate patterns of move type usage (this is discussed further below).


5.2 General advantages of corpus-based approaches to discourse analysis

There are several advantages to using a corpus-based approach to top-down analyses of discourse (including move analysis and appeals analysis). Baker (2006) in his book, Using Corpora in Discourse Analysis, outlines four advantages of using corpora to analyze discourse. First, a corpus-based approach helps reduce researcher bias. All researchers approach their research from a particular worldview; often we are aware of our biases and take account of them, but often we are not. As Baker notes, “by using a corpus, we at least are able to place a number of restrictions on our cognitive biases” (p. 12); overall patterns and trends are more likely to show through when we are looking at dozens of texts rather than just one or two ‘selected’ texts. In short, corpus-based approaches help put the focus of discourse analysis on interpretation of the data – not the data itself – by reducing the opportunity for manipulation (conscious or unconscious) of the texts selected for analysis. The second advantage of corpus-based discourse analysis identified by Baker (2006) addresses what he calls “the incremental effect of discourse” (p. 13). The primary purpose of discourse analysis is to understand how language is used, often in quite subtle ways. A single text on its own is insignificant; however, corpus analysis allows us to see patterns of words, phrases, structures and/or discourses that permeate, often contrary to ‘common-sense,’ our language. A corpus also allows researchers to see patterns that exist but that might otherwise be missed when analyzing a small sample of texts because they are not overwhelmingly frequent. The third advantage Baker (2006) gives for using a corpus-based approach to discourse analysis is that it is much easier to identify counter-examples – “resistant discourse” – on the one hand, and much harder to mistake them for “hegemonic” – or “dominant” – discourse on the other hand (p. 14). 
For example, results of a corpus-based move analysis are much more likely to represent the move and linguistic structures that are in fact typical for the genre as a whole, and much less likely to be skewed by the random selection and analysis of only a handful of texts that may turn out to not be representative of the genre as a whole. Lastly, Baker (2006) suggests that a significant advantage to a corpus-based approach is that it is easily combined with other methodologies to reinforce and strengthen the overall analysis, what is often called “triangulation.” For example, the approach presented in Chapters 3–4 of the present book combines move analysis with analysis of the linguistic characteristics of the move types to describe how different communicative purposes are linguistically realized. While these four advantages are relevant for all approaches to discourse analysis, a corpus-based perspective offers distinct advantages to move analysis in particular, which are described in the next section.



5.3 Specific advantages of a corpus-based perspective for move analysis

5.3.1 Identifying linguistic features of moves

While one could do a move analysis of a single text, it only becomes possible to describe the typical linguistic characteristics of move types through a corpus-based approach. Before computerized analysis, there were attempts to summarize the occurrence of linguistic features in genre moves. For example, Swales (1990, pp. 131–132) summarized the findings of 40 published studies which described the use of linguistic features in the four major sections of research articles. He concluded that five linguistic features – ‘that’ verb complements, present tense, past tense, passive voice, and author’s comments or hedging – co-occur in particular patterns to convey particular rhetorical functions. The patterns observed, based on the five linguistic features, provide evidence for a two-way distinction between Introduction/Discussion and Methods/Results sections. The Introduction and Discussion sections have the functions, respectively, of providing the background of the current study and interpretation of the results. The features frequently found to be associated with these functions are ‘that’ complements, present tense, and author’s comments. The Methods/Results sections, respectively, provide information regarding experimental procedures and present findings of the current study. Associated with these functions are a high use of past tense and a variable use of passive voice verb forms. The studies cited by Swales usually analyzed selected linguistic features by hand, looking for patterns and differences. With computers, much more interesting and comprehensive linguistic analyses can be undertaken. Analyses which take into account only individual linguistic features will reveal very little about the co-occurrence of linguistic features and how features interact with each other in a move to perform a particular communicative purpose. 
It would be more informative and useful to study the distribution and co-occurrence of many features of language at once, rather than considering the distribution and function of individual features singly. Computer-driven, corpus-based approaches allow us to do this. Chapters 3, 4, & 5 in this volume provide examples of how various linguistic structures work together in unique combinations to help realize the rhetorical purposes of the different moves identified for each genre. It needs to be remembered that move types, and their component steps, are identified by the functional and semantic purposes that they have. Nevertheless, because different moves have different functional and semantic purposes, it seems reasonable to expect that move purposes will be realized through variations in linguistic features. This is, in fact, what Swales observed in his early analysis of research articles: “The evidence suggests a differential distribution of linguistic and rhetorical features across the four standard sections of the research article” (1990, p. 136). Consequently, as noted in Chapter 1 of this volume, once texts have been segmented into moves, it is possible to analyze the linguistic characteristics of each move to determine the typical linguistic characteristics of the different move types. This type of analysis has not generally been done in “traditional” move analysis studies, and it can be argued that the lack of a description of the typical lexico-grammatical characteristics of these discourse units (i.e., move types) is a significant shortcoming of the non-corpus-based approach.

5.3.2 Move frequencies and lengths

Another advantage of the corpus-based approach to move analysis is that it allows description of the typical distributional and structural characteristics of each move type. That is, once moves in a corpus have been coded, a variety of descriptive counts can be made. The most obvious of these are the overall frequency of occurrence of each move type in the corpus, and the average length in words of each move type. Statistics like these allow us to make a clear determination as to whether a particular move type is obligatory, expected, or merely optional. For example, in the study described in Chapter 3, the third move type can be considered obligatory, as it occurred in over 97% of the texts, while the first move type is clearly optional, as it occurred in only about 15% of the texts. If it were not for the corpus-based approach used to analyze this genre, this optional move might not have even been identified, because it occurs so infrequently – or if it had been identified, its importance in the genre might have been overstated. Similarly, it is interesting to note that the third move is, on average, 48 words in length, while the second move, which occurs in 93% of the texts, is three times longer at 150 words in length. 
By identifying this rather large difference in length between these two obligatory move types, the corpus-based approach invites additional follow-up questions to explore what the source of this difference might be.

5.3.3 Mapping move use and locations

A computer can be used not only to count the presence of each move type for each text but also to keep track of their positions relative to each other (e.g., first, second, third), what other move types each most commonly co-occurs with, how frequently a move is embedded in another move, and how frequently a move occurs in the body of the text as opposed to, say, a P.S. The ability to make these sorts of observations permits us to extend our analysis in several ways. For example, it is possible to look at the relationship that different move types have with each other. Again, looking ahead to the study described in Chapter 3, the text position of two of the moves that are identified for the fundraising letter genre turns out to be quite predictable: although Move 1 and Move 7 are optional moves, when they are present in a direct mail letter, Move 1 occurs as the initial move in the letter 97% (34/35) of the time and Move 7 occurs as the final move before the complimentary close 100% (33/33) of the time. The positions of Move 2 and Move 3 are also highly predictable. If one ignores the presence of Move 1, Move 2 occurs as the initial move in the direct mail letter 74% (180/242) of the time. And Move 2, regardless of its position in the letter, is immediately followed by Move 3 87% (316/362) of the time.

5.3.4 Genre prototypes

With statistics on move frequencies and lengths, as well as descriptions of where in the genre a move type tends to occur and how one move type typically relates to another, a key advantage of a corpus-based approach can be realized: the ability to develop genre prototypes. Prototypes are particularly valuable in educational and training contexts to help novices learn to understand and produce a genre that is new to them. In the study described in Chapter 3, for example, three different prototypes of the genre are provided. The first includes only the obligatory moves, the second adds the expected moves, and the third is based on all moves (including the optional ones). In these prototypes, not only can the different move types be included, but typical and alternate locations of moves relative to other moves in the text can be described. In addition, if linguistic analysis or other follow-up analyses of the individual moves were done, the prototypes can represent these features as well. Prototypes such as these are also very useful in understanding better the genre variation that occurs between different disciplines. For example, Kanoksilapatham in Chapter 4 shows that the moves in the introduction sections of biochemistry research articles vary somewhat from the CARS model that Swales has proposed (introduced earlier in this chapter).
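The kinds of counts described in Sections 5.3.2 and 5.3.3 can be sketched in a few lines once each text has been coded. The example below is a minimal illustration under stated assumptions: each letter is represented as an ordered list of (move type, length in words) pairs, the four letters and all their values are invented, and the function names are hypothetical rather than part of any tool used in the studies reported in this book. Embedded moves are not modeled.

```python
from collections import Counter, defaultdict

# Hypothetical coded corpus: each letter is an ordered list of
# (move type, length in words) pairs. All values are invented.
corpus = {
    "letter_01": [(1, 20), (2, 140), (3, 50), (7, 10)],
    "letter_02": [(2, 160), (3, 45), (4, 30), (3, 40)],
    "letter_03": [(2, 150), (3, 55)],
    "letter_04": [(3, 40), (2, 130), (3, 60), (5, 15)],
}


def move_profile(corpus):
    """Per move type: share of texts containing it and mean length in words."""
    texts_with_move = Counter()
    lengths = defaultdict(list)
    for moves in corpus.values():
        for move_type in {m for m, _ in moves}:
            texts_with_move[move_type] += 1
        for move_type, n_words in moves:
            lengths[move_type].append(n_words)
    return {
        m: {
            "share_of_texts": texts_with_move[m] / len(corpus),
            "mean_words": sum(lengths[m]) / len(lengths[m]),
        }
        for m in sorted(lengths)
    }


def initial_share(corpus, move_type, ignore=()):
    """Of the texts containing move_type, the share in which it comes first
    (after dropping any move types listed in `ignore`)."""
    first, present = 0, 0
    for moves in corpus.values():
        sequence = [m for m, _ in moves if m not in ignore]
        if move_type in sequence:
            present += 1
            first += sequence[0] == move_type
    return first / present if present else 0.0


def followed_by_share(corpus, move_type, next_type):
    """Share of move_type occurrences immediately followed by next_type."""
    total, followed = 0, 0
    for moves in corpus.values():
        sequence = [m for m, _ in moves]
        for i, m in enumerate(sequence):
            if m == move_type:
                total += 1
                followed += i + 1 < len(sequence) and sequence[i + 1] == next_type
    return followed / total if total else 0.0


profile = move_profile(corpus)
for move_type, stats in profile.items():
    print(move_type, stats)
print("Move 2 initial (ignoring Move 1):", initial_share(corpus, 2, ignore=(1,)))
print("Move 2 followed by Move 3:", followed_by_share(corpus, 2, 3))
```

Note that the two positional measures use different denominators, mirroring the figures reported above: the initial-position share is computed per text, whereas the followed-by share is computed per occurrence of the move.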

6 Summary

This chapter has introduced the top-down approach used most often by applied linguists for the analysis of discourse structure: move analysis. While discourse analysis has often been concerned with sentence-level features in writing or general modes of writing such as narration, description, and comparison and contrast, move analysis has given researchers and practitioners useful text-focused tools. We first discussed the theoretical and empirical underpinnings of traditional move analysis. We then presented a description of corpus-based move analysis, with steps that followed the guidelines proposed in Chapter 1 for top-down analyses of discourse structure. The chapter concluded with a discussion of the added advantages of a corpus-based approach to move analysis. These include the ease of identifying the linguistic characteristics of the moves, their frequencies and lengths, and the mapping of their use and location in the overall discourse structure of texts. Chapters 3 and 4 put this model into practice, presenting corpus-based move analyses of fundraising letters (Chapter 3) and biochemistry research articles (Chapter 4).

Chapter 3

Identifying and analyzing rhetorical moves in philanthropic discourse

In Chapter 2, we described the general approach that can be used to identify and analyze the moves of a genre. In this chapter, we will describe and expand on a study¹ that uses a move analysis to show the rhetorical structure of direct mail letters, a type of philanthropic discourse. This study illustrates how move analysis is done and provides a model that can be used for the study of other genres, especially genres from professional (e.g., business, legal, medical) contexts. Furthermore, a review of the corpus that was used in this study and its characteristics will provide a useful example of a specialized corpus that is essential to conducting a move analysis of a specific discourse genre.



Philanthropic discourse – fundraising texts like direct mail letters or magazine advertisements – seeks to persuade, inform, request, catch one’s eye, wrench one’s heart, and twist one’s arm – all in a tidy attractive package. The weight upon these texts is, in fact, enormous. Nonprofit organizations depend to a larger or smaller extent on fund-raising texts for operating expenses or for funding to accomplish capital goals. And yet, the various genres of philanthropic discourse have not been closely studied. Indeed, Bhatia (1998) claims that the discourse of fundraising represents one of the most dynamic forms of language use. “For a relatively limited number of communicative functions, this discourse form offers a large variety of creative options, some rarely used before. It is a category of genre that offers an interesting and challenging profile of linguistic realizations to achieve a limited set of generic objectives” (Bhatia, 1998, p. 100).

1. This chapter draws on material previously published in the following two articles: (1) Upton, T. A. (2002). Understanding direct mail letters as a genre. International Journal of Corpus Linguistics 7(1), 65–85; and (2) Connor, U. & Upton, T. A. (2003). Linguistic dimensions of direct mail letters. In C. Meyer & P. Leistyna (Eds.), Corpus Analysis: Language Structure and Language Use (pp. 71–86). Amsterdam: Rodopi Publishers.


The dynamic nature of philanthropic discourse is due to the fact that it is designed to be quite persuasive. In short, its primary purpose is to persuade people to contribute to worthy causes or to underwrite philanthropic programs (Connor, 2000). Because of its persuasive purposes, fundraising has a great deal in common with promotional materials such as sales letters and job applications, in which the purpose is to sell something: in sales letters, a service or product; in letters of application, a person’s abilities; in fundraising, a worthy cause (Bhatia, 1993a; Connor & Wagner, 1998). Recent studies of philanthropic discourse, specifically fund-raising texts, have for the most part employed a qualitative approach, analyzing characteristics such as communicative functions (Bhatia, 1997b; Connor, 1997), rhetorical patterns (Abelen, Redeker, & Thompson, 1993; Crismore, 1997; Lauer, 1997), social contexts (Bazerman, 1997a; Myers, 1997), metaphors (McCagg, 1997), and cultural differences (Connor & Wagner, 1998; Graves, 1997). Although these studies have contributed to our understanding of the language of fundraising, their qualitative nature left us without an empirical baseline for comparing the general features of fundraising texts with those of other common texts. Of particular interest are the types of rhetorical moves that are used to define the different genres of philanthropic discourse. What was missing was a corpus-based study of fundraising texts to develop such a baseline. The Indiana Center for Intercultural Communication (ICIC), with funding from and in cooperation with the Indiana University Center on Philanthropy, undertook a concerted effort to carefully study the language of fundraising by collecting a large corpus of fundraising material and then studying, among other things, the rhetorical moves in these genres. 
The focus of the present chapter is on the direct mail letters used by non-profit agencies to introduce readers to or remind them about what the agency does, the clientele/services they are involved with, and/or the needs that they have that the reader is being asked to assist with – usually financially. Specifically, this study will first investigate the discourse structure typical of the letters in the corpus, using move analysis, and then provide a linguistic description of the grammatical stance features that each move most commonly draws on to accomplish its particular function in the genre.

2 A specialized corpus of fundraising texts

The fundraising letters analyzed in this study are part of the ICIC Fundraising Corpus, which includes over 900 fundraising documents from 236 organizations and totals nearly 2 million words. The documents in the corpus include direct mail letters, newsletters, case statements, grant proposals, and annual reports. Table 3.1 shows the total number of organizations, items and words for each text type in the corpus.

Table 3.1  ICIC fundraising corpus document types

Type of Text              Org. n   Item n    Word n
Direct Mail Letter           108      316   191,540
Invitation, Newsletter       172      445   922,212
Case Statement                12       13   121,780
Grant Proposal                27       69   156,021
Annual Report                 51       84   523,770

Note: Org. n = the number of organizations represented in this type. Item n = the number of items of this type in the corpus. Word n = the number of words in the documents of this type in the corpus.
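As a quick consistency check, the column totals and the mean document length per text type can be derived directly from Table 3.1. The sketch below assumes that the figures pair up with the text types in the order in which they are listed; the variable names are illustrative. Organization counts are not summed, since one organization can presumably contribute documents of more than one type.

```python
# Figures copied from Table 3.1: (organizations, items, words) per text type.
table_3_1 = {
    "Direct Mail Letter": (108, 316, 191_540),
    "Invitation, Newsletter": (172, 445, 922_212),
    "Case Statement": (12, 13, 121_780),
    "Grant Proposal": (27, 69, 156_021),
    "Annual Report": (51, 84, 523_770),
}

total_items = sum(items for _, items, _ in table_3_1.values())
total_words = sum(words for _, _, words in table_3_1.values())
print(f"total items: {total_items}, total words: {total_words:,}")

# Mean document length per text type (words per item).
for text_type, (_, items, words) in table_3_1.items():
    print(f"{text_type:<24}{words / items:>10.0f}")
```

The totals are in line with the description of the corpus as containing over 900 documents and nearly 2 million words.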

The present study focuses on the genre of direct mail letters, and thus uses only that component of the ICIC corpus. Letters were collected from five major types of organization; Table 3.2 shows the number of organizations, number of letters, and words broken down by these organization categories.

Table 3.2  ICIC fundraising corpus – direct mail letters by organization type

Type of Organization      Org. n   Item n    Word n
Health/Human Services         33       91    54,187
Environmental                 10       13     8,126
Community Development
Arts and Culture

The ICIC Fundraising Corpus was designed to represent a specific type of discourse – fundraising texts – and to represent specific genres within that domain. The sub-corpus for the genre of direct mail letters was further designed to represent the range of variation found for this genre. To prevent any skewing of the corpus towards the writing of any one organization or non-profit field, effort was made to collect letters from a wide variety of non-profit organizations across a wide variety of non-profit fields (e.g., health and human services, education).


3 Determining and analyzing discourse moves: Direct mail letters

3.1 Previous analysis of direct mail letters

Direct mail letters for nonprofit fundraising have the general purpose of selling a product: a good cause. It has been noted (Connor & Upton, 2003) that a whole industry has developed around direct mail letters in nonprofits, as “experts” offer their advice for fundraisers in books and newsletters. It is fair to say, though, that the advice given in many of these materials often comes from the knowledge base of mass marketing rather than a careful analysis of the language actually used. Frequently, a great deal of emphasis is put on the physical appearance of the letter, while an examination of language use, for the most part, does not appear to be an important consideration. For example, even though donor segmentation is frequently recommended, little concrete advice is given about how to appeal to specific audiences.

Linguists’ interest in the direct mail letter is relatively new. As far as we are aware, there have only been three research studies published by linguists that focus on the fundraising direct mail letter. The edited book by Mann and Thompson (1992) showcased the merits of particular linguistic/rhetorical analyses (such as Rhetorical Structure Theory and topical structure analysis); however, the purpose of their volume was not necessarily to advance knowledge about the fundraising letter as a text type. Abelen, Redeker, and Thompson (1993) offered more valuable linguistic/rhetorical information about direct mail fundraising letters, but their focus was a cross-cultural comparison of fundraising letters written by Dutch and American writers in one type of non-profit (based on analysis of only 8 letters). The third article, by Upton (2002), is most relevant here and will be described in more detail below.

3.2 A move analysis of fundraising letters: Background and methodology

3.2.1 Move types

Upton (2002) conducted a study using the ICIC Fundraising Corpus (ICIC-FC) with the goal of providing a better, and more definitive, understanding of the discourse structure that underlies the persuasive aspect of direct mail letters. This study drew on the work done by Bhatia (1998), who did a preliminary move analysis on a small set of direct mail letters. Using a comprehensive, rigorous, and sustained analysis of data, a research team at ICIC identified a seven-move structure.

Move Type 1, Get Attention: The communicative, functional purpose of this first move type was to get and focus the reader’s attention at the start of the letter. This move type could be realized through one of two steps. Step 1 is to start with a quotation or story of some sort or a shocking or unexpected statement. Step 2 is to start by offering some type of general pleasantries. Examples from letters in the corpus of Move Type 1, as expressed through one or both of its two steps, are given in Table 3.3.

Table 3.3  Examples of Move Type 1, Steps 1 & 2, from corpus

Move Type 1: Get Attention
Optional Steps:

Step 1: Pleasantries
– 1996 is off to a fast start!
– What a Summer! And we’re just getting started!

Step 2: Quotation, story or shocking/unexpected statement
– I learned about gardening when I was very young from my parents. They always had a garden and now so do I. The garden that I have now is very different from the garden that my parents grew. Dad would start planting about the fifteenth of April. He had two acres to plow so he used a mule and a plow. My garden now is very different from my dad’s garden…
– “Philanthropy is the rent we pay for the joy and privilege we have for our space on this earth.” Jerold Panas.
– Cecilia desperately searched for medical care for her unborn child. She would have a better chance of getting help and delivering a healthy baby if she lived in Sweden. But Cecilia lives in central Indiana. Cecilia might even be your neighbor.

Move Type 2, Introduce the Cause and/or Establish Credentials: This move type serves two general functions. It focuses on establishing the credentials of the organization by highlighting what the organization does and the contribution it can make, and/or it serves to introduce the cause/need that the organization seeks to address. For many non-profit organizations, their primary or even sole purpose is to address a particular need; they talk about who they are and what they do in the context of what the cause is. Consequently, these two functions are considered part of one move type: introduce the cause and/or establish credentials of organization. This move type could be expressed by any one or more of the following four steps: 1) indicating a general problem or need, 2) highlighting a specific problem or need, 3) highlighting the successes of past organization efforts, and 4) outlining the mission of the organization. Examples from letters in the corpus of Move Type 2, as expressed through its four steps, are given in Table 3.4.


Table 3.4  Examples of Move Type 2, Steps 1–4, from corpus

Move Type 2: Introduce the cause and/or establish credentials of organization
Optional Steps:

Step 1: Indicate general problem/need
– One of the biggest challenges you face may be to find qualified, educated people to fill positions in your company…Indy Reads is working to change that!

Step 2: Highlight specific problem/need
– This summer, more than 300 children ages 4 through 14 will attend the YWCA of Indianapolis’ “Everyone belongs” Summer Day Camp…As you can imagine, a summer like this is expensive to provide. And more than 30% of the kids we serve cannot afford the camp fee.

Step 3: Highlight the successes of past organization efforts
– My name is Joe Cooper. Last year I was so proud to be named student of the year that I thought my chest was going to burst when I was on stage. I learned first hand what GILL is all about, giving to others unselfishly.

Step 4: Outline the mission of the organization
– Young women are growing up in an ever-changing society. As a contributor to the Council in past appeals I know that you are aware of our mission--to prepare girls with ethical values, character, a desire to succeed and a commitment to their community.

Move Type 3, Solicit Response: In the pilot study, it was observed that many letters not only requested support but also sought some other type of response, such as volunteering to help or contacting the organization for further information. Consequently, this move type was labeled “solicit response,” which was realized by one of two steps – or both. Step 1, soliciting financial support, has three options: Step 1A, state benefit of support to the need/problem; Step 1B, ask directly for pledge/donation; and Step 1C, remind of past support to encourage future support. Step 2, soliciting other response, requests a response from the reader other than financial, such as volunteering to help. Examples from letters in the corpus of Move Type 3, as expressed through its two steps, are given in Table 3.5.

Table 3.5  Examples of Move Type 3, Steps 1 & 2, from corpus

Move Type 3: Solicit Response
Optional Steps:

Step 1: Solicit financial support

Step 1A: State benefit of support to the need/problem
– You can help more than 200,000 people with just one gift…Your one gift to United Way of Central Indiana supports 82 human service agencies... Only if you contribute this year can these agencies continue to provide programs and services that: Strengthen Families…; Invest In Our Children…; Serve The Elderly And Disabled…; Help People Become Self-Sufficient…; Promote Health And Well-Being…
– And that’s why I’m writing to you today. I urge you to continue to make a difference in the lives of individuals like Cecilia and her son. You can literally help save a life.

Step 1B: Ask directly for pledge/donation
– Please send your gift today.
– Please send the largest contribution you can comfortably make.

Step 1C: Remind of past support to encourage future support
– Last year your memorial gift of $5 for hospice care in March gave VNSF, Inc. the ability to address the needs of patients I described above. I am asking that you consider supporting our efforts once again this year with a similar gift.
– You have helped make Goodwill’s work possible with your previous support.

Step 2: Solicit other response
– Every year we seek companies, organizations and individuals to sponsor one or more of our families… If you are interested and would like more information, please contact… We would like to have families matched with sponsors non later than…
– I’d be glad to respond to any questions you might have about our work. You may call me at...

Move Type 4, Offer Incentives: In Move Type 4, the writer offers an incentive, or indicates some other benefit of giving. In our analysis, we found that this move type could be realized in one of two ways, either by Step 1, which is the offer of a “tangible” (e.g., a mug, a matching donation) incentive, or by Step 2, the noting of an “intangible” (e.g., a good feeling) incentive for giving. Examples from letters in the corpus of Move Type 4, as expressed through its two steps, are given in Table 3.6.


Table 3.6  Examples of Move Type 4, Steps 1 & 2, from corpus

Move Type 4: Offer Incentives
Optional Steps:

Step 1: Offer of Tangible Incentive
– We’ll send you our newsletters, invitations and membership cards.
– As an Indiana resident, your Federal tax-deductible contribution also qualifies for a special Indiana State Income Tax credit of 50%.
– Your membership fee assures your receiving notices of exhibition openings, lectures, discounts for Saturday School and the Pre-College Workshop, and invitations to the Janus Ball, artists’ dinners and other Friends only events.

Step 2: Offer of Intangible Incentive
– When your gift helps an outstanding student become an outstanding teacher, you will know that you, too, have touched the future.
– I am sure you will feel good about giving.
– If you enjoy reading the stories…there is an excellent chance that you will enjoy membership in the Indiana Historical Society.

Move Type 5, Reference Insert: Move Type 5 is a simple, straightforward structure that is used to draw attention to material beyond the letter itself that was included in the mailing, such as a brochure, a pledge form, or a return envelope. Two examples of Move Type 5 from the corpus are:

(1) I have enclosed a return envelope for your convenience, as well as an overview of the services we provide. (2) I have enclosed a brochure which tells you more about the Chancellor’s Circle and which includes a reply card. I have also enclosed a reply envelope for your convenience.

Analysis of the direct mail letters in the corpus made it clear that Move Type 4 Offer Incentives and Move Type 5 Reference Insert were often embedded in other move types. Take, for example, the following sentence: “Please fill out the enclosed card to send in your tax-deductible contribution to help support the boys and girls at Camp X” (emphasis added). The primary function of this sentence is to solicit a financial response, Move Type 3, but there are two other functions it seeks to accomplish: offering an incentive for contributing (“tax-deductible”), which is Move Type 4, and bringing attention to the enclosure (“the enclosed card”), which is Move Type 5. It was decided to view this sentence and others like it as containing three move types: the primary move of soliciting support and the embedded moves of referencing insert and offering incentive. Consequently, the two moves referencing insert and offering incentive were seen as being capable of either standing alone or being embedded in other moves. A longer example of how these two move types can be embedded in a longer move type, often Move Type 3, is the following, with “tags” included to mark where move types start and stop:

(3) …Let me assure you that we would appreciate receiving one million dollars from you. But let me also assure you that we would appreciate equally well any contribution you are able to make. Whatever you can contribute, you will be helping to support a geology student at (university).

Your tax-deductible contribution may be sent in the enclosed postage-paid envelope with the attached return card. As an Indiana resident, your gift qualifies for a special tax credit of 50% (up to a maximum of $100 for an individual or $200 for a joint return). For your convenience, I am enclosing a copy of Form CC-40, which should be filed with your Indiana State Income Tax. Please give today.
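One way to picture the embedding described above is as a small nested data structure. The following is a minimal sketch, not the authors' tagging scheme: the `MoveSpan` class and the `count_moves` helper are hypothetical names introduced for illustration, and counting embedded moves toward the totals is an assumption rather than a documented coding decision. It encodes the “Camp X” sentence as a primary Move Type 3 with Move Types 4 and 5 embedded in it.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class MoveSpan:
    """One coded stretch of text (hypothetical structure, for illustration)."""
    move_type: int                      # 1-7, one of the seven move types
    text: str
    embedded: List["MoveSpan"] = field(default_factory=list)


def count_moves(spans: List[MoveSpan]) -> Counter:
    """Count every move occurrence, including moves embedded in other moves.

    Assumption: embedded moves count as occurrences in their own right.
    """
    counts: Counter = Counter()
    for span in spans:
        counts[span.move_type] += 1
        counts.update(count_moves(span.embedded))
    return counts


# The example sentence: primary Move Type 3 with two embedded moves.
sentence = MoveSpan(
    move_type=3,
    text=("Please fill out the enclosed card to send in your tax-deductible "
          "contribution to help support the boys and girls at Camp X"),
    embedded=[
        MoveSpan(move_type=5, text="the enclosed card"),
        MoveSpan(move_type=4, text="tax-deductible"),
    ],
)

counts = count_moves([sentence])
print(dict(counts))
```

Keeping embedded moves as children of the primary move, rather than as separate top-level spans, preserves the distinction between a move that stands alone and one that occurs inside another move.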

Move Type 6, Express Gratitude: This move type, which is used to express thanks, is realized by one or both of two steps. Step 1 offers thanks for past financial or other support, and Step 2 offers thanks for current as well as future financial (or other) support. Examples from letters in the corpus of Move Type 6, as expressed through its two steps, are given in Table 3.7.

Table 3.7  Examples of Move Type 6, Steps 1 & 2, from corpus

Move Type 6 Express Gratitude
Optional Steps:

Step 1 Thanks for Past Financial or Other Support
– Thank you for your past gift to the Girl Scout Capital Campaign.
– I want to thank you for your past support of the Visiting Nurse Service Foundation, Inc.

Step 2 Thanks for Current & Future Financial or Other Support
– Your support is greatly needed and greatly appreciated.
– Their appreciation and enthusiasm for what they are doing will go a long way to thank you for your encouragement and support.
– Thank you again for sharing our hope for a future without cancer.




Move Type 7, Conclude with Pleasantries: While not occurring as frequently as the other move types, one final move type, conclude with pleasantries, comes at the end of the letters and its communicative function is to bring the letter to a pleasant close. Examples of Move Type 7 include the following:

(4) May you be blessed, today and always. (5) I hope you have a nice day. (6) Happy Holidays!

The complete move structure for direct mail letters is given in Table 3.8.

Table 3.8  Move structure of non-profit direct mail fundraising letters

Move Type 1: Get attention
Move Type 2: Introduce the cause and/or establish credentials of org.
    Step 1 General problem/need indicated, and/or
    Step 2 Specific problem/need highlighted, and/or
    Step 3 Successes of past organization efforts highlighted, and/or
    Step 4 Goals of future organization efforts outlined
Move Type 3: Solicit response
    Step 1 Solicit financial support
        Step 1A State benefit of support to the need/problem, and/or
        Step 1B Ask directly for pledge/donation, and/or
        Step 1C Remind of past support to encourage future support, and/or
    Step 2 Solicit other response
Move Type 4: Offer incentives
    Step 1 Offer of Tangible Incentive, and/or
    Step 2 Offer of Intangible Incentive
Move Type 5: Reference insert
Move Type 6: Express gratitude
    Step 1 Thanks for Past Financial or Other Support, and/or
    Step 2 Thanks for Current & Future Financial or Other Support
Move Type 7: Conclude with pleasantries

3.2.2 Structural elements

All of the letters in the direct mail corpus include text that strikes the reader as somehow ‘different’ from the text in the body of the letter. Things like the date, address information, and even the signature and the signature footer have a very different function in the direct mail letter than the communicative functions served by the move types described above. Their functions, while important and in many respects required, are more structural in nature than communicative. These features of the direct mail letters are called structural elements. According to Crossley (2007), discussing the related genre of cover letters, “It appears that while structural elements are important to the framing of a cover letter, their individual meaning is not so dependent upon the writer’s intention as much as upon their inclusion by the writer. Structural elements are for the most part standardized patterns that rarely differ from one writer to another” (p. 7). In many respects, move types are to structural elements as lexical words are to function words. Describing the latter relationship, Biber et al. (1999) see lexical words as “the main building blocks of texts,” while function words are the “mortar which binds the text together” (p. 55); on a larger, genre level, move types can be seen as the main building blocks of the direct-mail letter while the structural elements provide the (boilerplate) scaffolding around which the letter is built. The structural elements that are frequently found in direct mail letters were examined to see what role they might play in the persuasive appeal of these letters. Table 3.9 below describes the seven basic structural elements that can appear in direct mail letters. As noted above, these elements are clearly something different from the seven move types outlined in Table 3.8. They do not have clear or major communicative functions, and they are for the most part very constrained (e.g., the date or writer’s name) or highly formulaic in nature (e.g., the salutation, the complimentary close).

While the study of these elements is tangential to the goals of discourse analysis, many instructional materials designed to train writers specifically address and stress the importance of using these various elements to make direct mail letters more persuasive (e.g., Cone, 1987; Lewis, 1997). Consequently, as practitioners view these structural elements as an important part of the direct mail letter, and they are intended to have an impact on the reader, they seemed worth examining; structural elements are included here as they are represented in virtually all direct mail letters and in fact can be viewed as markers that are used to help identify this text type.



Table 3.9  Direct mail “Structural Elements”

Element A: Date line
The date when the letter was written/sent is given.
– January 10, 1998

Element B: Address information
The address of the addressee is given. This provides a level of formality to the letter.
– Joy Us Donor
  123 Boulevard Road
  Here, There 45678

Element C: Salutation
This is the opening greeting of the letter and is followed either by no punctuation, a comma, or a colon.
– Dear Joy Donor,

Element D: Complimentary Close
This is the word or phrase that draws the letter to a close and is followed by either a comma or no punctuation.
– Sincerely yours,
– On behalf of our clients,

Element E: Signature
This is the author’s penned signature.

Element F: Signature footer
This provides the printed name of the letter signer and/or the title of the signer.
– Nahn Prophet
  President

Element G: Footnote information
This is information located after everything else in the letter and indicates that there is other information the reader should be aware of.
– enclosure
– cc



Using the rubrics given in Table 3.8 outlining the rhetorical moves of the direct mail letters and in Table 3.9 outlining the structural elements, two raters hand-coded the rhetorical moves and structural elements in all 242 letters in the corpus. As noted in Section 3.1 of Chapter 2, individual moves often reappeared throughout a letter, and each appearance was counted as a distinct occurrence; as a result a single move type could occur multiple times. Inter-rater reliability was calculated at 84%, with all discrepancies reconciled through discussion. The vast majority of discrepancies that occurred between the two raters resulted from initial disagreement as to where one move ended and the next started, not as to the presence of a particular move. This inter-rater reliability is quite good, since, as Bhatia notes, there are sometimes “cases which will pose problems and escape identification or clear discrimination, however fine a net one may use. After all, we are dealing with the rationale underlying linguistic behavior rather than its surface form” (Bhatia, 1993a, p. 93). Once all of the moves were agreed upon and marked, each letter was then tagged to indicate the start and stop of each move in each text. The sequence of each move type and structural element for each text was also noted. This allowed for the tracking of the total frequency of each move type in the corpus, their relative locations in each letter (e.g., first, second, third), what other move types a move most commonly occurred with, how frequently a move was embedded in another move, and how frequently a move type occurred in the body of the text as opposed to in a P.S.

3.4


Move Type Frequencies and Lengths: Table 3.10 provides summary information about the moves in this corpus of 242 direct mail letters, including the frequency of each move type, the number of letters that contained each move type, and the average number of words per move type. Not surprisingly, the most common move type in all of these letters was Move Type 3 Solicit Response, which occurs 546 times. This represents 39% of all the moves occurring in this corpus, showing up at the average rate of 2.3 times per letter. Table 3.10  Move totals, percentages and rates of occurrence

                            Move 1   Move 2   Move 3   Move 4   Move 5   Move 6   Move 7
Total Number                    35      362      546      113      153      148       33
% of total moves              2.5%    26.0%    39.3%     8.1%    11.0%    10.7%     2.4%
Letters w/ ≥1 occurrence        35      226      236       85      127      124       31
% of total letters             15%      93%      97%      35%      52%      51%      13%
Words/move Avg.                         150       48                 9       10       10


In fact, of the 242 letters, only six letters did not have at least one Move Type 3 occurring at some point in the letter, with Move Type 3 represented in 97% (236/242) of the letters. The second most common move was Move Type 2 Introduce the cause and/or establish credentials of the organization, which occurred 362 times. At the rate of 1.5 times per letter, this move represents 26% of all the moves in this corpus. Move Type 2, like Move Type 3, also clearly seems to be a ‘required’ move (that is, one that almost every letter uses) in this genre as it occurs in 93% of the letters. Move Type 4 (Offer Incentive) at 8.1% of the total moves, Move Type 5 (Reference Insert) at 11.0%, and Move Type 6 (Express Gratitude) at 10.7% occurred at relatively similar rates of frequency across the 242 letters. While apparently optional move types within this genre, each occurred fairly regularly in these letters: Move Type 4 was represented at least once in 35% of the letters, Move Type 5 occurred in 52% of the letters, and Move Type 6 occurred in 51% of the 242 letters. Move Type 1 (Get attention) and Move Type 7 (Conclude with pleasantries) were clearly ‘icing-on-the-cake’ moves that writers of this genre could draw upon when desired but did not do so very frequently. Move Type 1 represented only 2.5% of the moves in this corpus and occurred in only 15% of the letters. Similarly, Move Type 7 represented 2.4% of the moves in this corpus and occurred in only 13% of the letters. It is further possible to compare the lengths of each of these move types. Move Type 2 is by far the longest move in this genre, averaging 150 words per occurrence. Move Type 3, the second longest move, is only one-third the length, at 48 words per occurrence. Move Types 5, 6 and 7 are the shortest, with Move Type 5 averaging 9 words per occurrence, and Move Types 6 and 7 averaging 10 words per occurrence. 
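The percentages and per-letter rates quoted above follow directly from the raw counts in Table 3.10. The short sketch below shows the arithmetic; the variable and function names are illustrative assumptions, and only the counts themselves come from the table.

```python
# Number of occurrences of each move type, as reported in Table 3.10.
move_counts = {1: 35, 2: 362, 3: 546, 4: 113, 5: 153, 6: 148, 7: 33}
N_LETTERS = 242  # size of the direct mail corpus

total_moves = sum(move_counts.values())


def share_of_moves(move_type: int) -> float:
    """Percentage of all coded moves accounted for by one move type."""
    return round(100 * move_counts[move_type] / total_moves, 1)


def rate_per_letter(move_type: int) -> float:
    """Average number of occurrences of a move type per letter."""
    return round(move_counts[move_type] / N_LETTERS, 1)


for move_type in (3, 2):
    print(move_type, share_of_moves(move_type), rate_per_letter(move_type))
```

For Move Type 3 this gives the 39.3% share and the rate of 2.3 per letter cited in the text, and for Move Type 2 the 26% share and the rate of 1.5 per letter.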
Structural Elements: Table 3.11 shows the relative frequency of each of the structural elements of the direct mail letters in this corpus.

Table 3.11  Percentage of letters with each structural element

Structural Elements                   Percent of Letters
Element A: Date Line                  77%
Element B: Address Information        51%
Element C: Salutation                 88%
Element D: Complimentary Close        90%
Element E: Signature                  89%
Element F: Signature Footer           87%
Element G: Footnote Information       7%


The vast majority of the letters in this corpus contained four structural elements, an opening salutation (88%), a complimentary close (90%), a signature (89%), and a typed signature footer (87%). The date line (77%) and address information (51%) were more optional, while footnote information is included relatively infrequently (7%).

3.5


Based on the results of the genre analysis of the 242 direct mail letters in this corpus, a couple of observations can be made about how moves are used within the genre. First of all, some of these moves are nearly obligatory in the genre, while others seem to be merely optional. Secondly, it seems clear that the juxtaposition of the moves relative to each other shows meaningful patterns. Move Type 2 (Introduce the cause and/or establish credentials of organization) and Move Type 3 (Solicit response) are the most important moves in this genre. The preeminence of these two moves can be seen by the fact that not only do they occur in nearly every direct mail letter in the corpus, but they generally occur more than once, they usually occur as the first and second moves in the letter, they are by far the longest of the moves, and they almost always occur in juxtaposition to each other. That Move Types 2 and 3 are the most prominent – in frequency, size, and position in the letter – is not surprising. At its most basic level, the purpose of the direct mail letter is to tell the readers what the organization is and/or what the need is, and to request funds to help the cause. These functions are accomplished in these two moves. In contrast, the other five moves serve as optional tools that individual writers in this genre can incorporate in various ways to tailor the effect of the letter on the reader. For example, Move Types 4 (Offer Incentives) and 5 (Reference Insert) clearly play a secondary role in the direct mail letter as they tend to be quite short in length and often embedded in another move, usually Move Type 3 (Solicit Response). Nevertheless, their role appears to be an important one in that they are included in a sizeable percentage of the letters (Move Type 4 in 35%; Move Type 5 in 52%).
Essentially, it seems their function is to serve as a reminder: In the case of Move Type 4, the readers most often are reminded either that contributions to non-profit organizations are tax-deductible, or that they will “feel good” about the contribution that they make. With Move Type 5, the function of this move is simply to remind the readers to look at other material that has been included with the letter. Move Type 6 (Express Gratitude), occurring in 51% of the letters, also plays an important role of informing the readers how much the organization appreciates their support. Nevertheless, this role is noticeably a secondary one when the frequency, number of occurrences and length of this move are considered in relation to Move Types 2 (Introduce the cause and/or establish credentials of organization) and 3 (Solicit response). Move Types 1 (Get attention) and 7 (Conclude with pleasantries) are clearly optional moves, with both of them occurring in no more than 15% of the letters. Similar observations can be made about the structural elements that are included; clearly there are some that are considered obligatory, such as the salutation (Element C) and complimentary close (Element D), and others that are more optional, such as address information (Element B). The facts that most of these structural elements occur in most direct mail letters, and that practitioners themselves view these as essential components of the direct mail letter (e.g., Cone 1987) suggest that more careful analysis of these may be warranted in future studies. Indeed, it could be argued that at least some of these elements should be viewed as moves in themselves, as they are functional units of text serving a specific purpose that adds to the persuasive nature of the letters. Textual choices within these structural elements, for example how to phrase the salutation, are actually quite significant and can be viewed as something beyond a standardized template.

3.6 Letter prototypes

One strength of this type of corpus analysis is that it allows us to develop prototypes of the genre. Three such prototypes suggest themselves from these data. The first prototype might be one that represents the most basic form of the direct mail letter, using the moves and structural elements which occur in at least 85% of the letters in the corpus. These include Move Types 2 (introduce the cause and/or establish credentials of organization) and 3 (solicit response), and Structural Elements C (salutation), D (complimentary close), E (signature), and F (signature footer). An example of such a letter is provided in Table 3.12. A second prototype might include all the moves and the structural elements that occurred in over 50% of the letters in this corpus. These include Move Types 2, 3, 5 (reference insert) and 6 (express gratitude) as well as Structural Elements A (date line), B (address information), C, D, E and F. An example of such a letter is provided in Table 3.13.
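The selection of features for the first two prototypes amounts to filtering the percentages in Tables 3.10 and 3.11 by a threshold. The sketch below illustrates that filtering; the `prototype` function and the dictionary names are hypothetical, and the input values are the rounded percentages reported in the tables rather than exact proportions.

```python
# Percent of the 242 letters containing each move type (Table 3.10)
# and each structural element (Table 3.11), as rounded in the tables.
move_pct = {
    "Move Type 1": 15, "Move Type 2": 93, "Move Type 3": 97, "Move Type 4": 35,
    "Move Type 5": 52, "Move Type 6": 51, "Move Type 7": 13,
}
element_pct = {
    "Element A": 77, "Element B": 51, "Element C": 88, "Element D": 90,
    "Element E": 89, "Element F": 87, "Element G": 7,
}


def prototype(threshold: float, strict: bool = False) -> list:
    """Return the features found in at least (or, if strict, more than)
    `threshold` percent of the letters."""
    features = {**move_pct, **element_pct}
    if strict:
        return [name for name, pct in features.items() if pct > threshold]
    return [name for name, pct in features.items() if pct >= threshold]


basic = prototype(85)                 # first prototype: at least 85%
fuller = prototype(50, strict=True)   # second prototype: over 50%
print(basic)
print(fuller)
```

With these inputs the first call returns Move Types 2 and 3 plus Elements C through F, and the second adds Move Types 5 and 6 and Elements A and B, matching the two feature sets listed above.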


Table 3.12  Prototype direct mail letter representing move types and structural elements which occurred in ≥ 85% of the corpus

Structural Element C
Mr./Mrs. Smith

Move Type 2
Now more than ever, inner city girls need your support to help their dreams become a reality. Each generation of girls faces new challenges: new technology, new moral issues, new opportunities. Inner City Girls experience a wide range of real life skills – first aid, resume writing, and managing money. They also reap benefits that are difficult to measure, including enhanced self-esteem, greater confidence in their abilities, and the strength and conviction to take the lead and excel in their endeavors. We start early. As a preventative, informal education program, Inner City Girls helps girls relate to others, develop values, contribute to their society, and develop their own potential. This results in reduced risk of teen pregnancy, suicide, truancy, substance abuse and so many other crises.

Move Type 3
Your gift to the 1997 Inner City Girls Annual Campaign helps to ensure that girls will continue to receive the benefits that Inner City Girls offers. Today’s girls will be tomorrow’s leaders – and they are counting on you.

Structural Element D
Sincerely,

Structural Element E
(Signature)

Structural Element F
Sally Mentor
President
1997 Inner City Girls Annual Campaign

Table 3.13  Prototype direct mail letter representing move types and structural elements which occurred in ≥ 50% of the corpus

Structural Element A
October 26, 2000

Structural Element B
Sam Q. Doe
123 Street Dr.
Somewhere, IN 46202

Structural Element C
Dear Sam,

Move Type 2
For many of the children and seniors that Help Your Neighbor cares for, the Holiday season can be a troubling time. Nearly every day HYN receives a call about a patient or family in need of home care who has limited financial resources. Calls for help from families that need the crisis services HYN provides for their children ring throughout the season. This is not the ringing that you and I traditionally picture during the holiday season.

Move Type 3
But there is something that you can do to help. With your gift of sharing, you are:
*providing needed home care services to the most needy
*giving emergency respite to families of children at risk for neglect or abuse
*helping establish a “Golden Touch” program to provide companionship and homemaker services to homebound seniors.

Move Type 2
Help Your Neighbor has been a part of this community for over 85 years. Serving the needy has been an important part of our mission. Over the last ten years, HYN has delivered over $1 million worth of free services to the citizens of Somewhere.

Move Type 3
But we cannot do it alone. We need your help. A gift of sharing can bring comfort and hope to those most in need during this holiday season.

Move Type 5
Please use the enclosed envelope to make a contribution to help us ease the suffering and indeed ring in a most joyous holiday season.

Move Type 6
I thank you for your generous support.

Structural Element D
Sincerely,

Structural Element E
(Signature)

Structural Element F
Bob L. Brown
President & CEO

A third prototype might simply show what a direct mail letter would look like if it used each of the possible move types and structural elements that define this genre; Table 3.14 provides an example of such a letter. It should be pointed out, however, that most real-world direct-mail letters do not use all seven possible rhetorical move types and, in fact, only one letter in this corpus did. Table 3.14  Prototype direct mail letter representing all possible move types and structural elements Structural Element A

October 26, 2000

Move Type 1

“Do all the good you can, by all the means you can, in all the ways you can, in all the places you can, at all the times you can, to all the people you can, as long as ever you can.” John Wesley Sam Q. Doe 123 Street Dr. Somewhere, IN 46202 Dear Sam,

Structural Element B Structural Element C

Move Type 2

Move Type 3 Move Type 2 Move Type 3 Move Type 5 Move Type 4 Move Type 6 Move Type 7 Structural Element D Structural Element E Structural Element F Move Type 3PS Move Type 6PS Structural Element G

Ebenhazer cares for at-risk children and families. We do this through a wide range of programs including community-based, therapeutic foster care, group homes and our treatment center. Many of the children are victims of abuse or live in unstable homes. This Christmas season we are asking you to take a few minutes to consider making a contribution to Ebenhazer to help the 1,500 children and families that we care for. Many of the children have no homes; no memories of joy from past holidays. Others are from families that are struggling to provide a healthy, happy environment but don’t have the resources to make it possible. Your contribution will make a difference in a child’s life. It may help a family stay together. It can certainly make happy holiday memories. A gift to Ebenhazer means the children in our care will have presents to open. A gift means a family will have a holiday meal, cooking utensils to prepare the meal and dishes to serve it on. Your gift will go beyond the holiday season. It can help purchase clothing, school supplies, books and educational tools throughout the year. Please use the enclosed donation card and return envelope and mail your tax-deductible donation to Ebenhazer today. Thank you in advance for your gift. We wish you and your family a new year full of joy and love. Sincerely, (Signature) Mary Smith Director P.S. Let our families and children know you want them to have the same kind of memories of the holidays you will have. Please give generously. Thank you for thinking of Ebenhazer this Christmas season. Enclosures

4 Linguistic analysis of moves: Tracking the use of stance structures

As introduced in Chapter 1, the goal of this book is to move beyond simply segmenting texts into well-defined discourse units (in this case, moves); the goal is also to analyze the linguistic characteristics of each individual discourse unit and each discourse unit type (i.e., the move types), to determine the typical linguistic characteristics of the units. Although they are defined in functional terms, moves are constructed from linguistic devices, including word choice, phrase types, and grammatical features (e.g., tense, aspect, voice). Many of these linguistic devices are used to express stance: ‘personal feelings, attitudes, value judgments, or assessments’ (Biber et al., 1999, p. 966). Linguistic features used for these functions are especially important in direct-mail letters. There have been numerous studies of the linguistic mechanisms used by speakers and writers to convey their personal feelings and assessments, carried out under several different labels, including ‘evaluation’ (Hunston, 1994; Hunston & Thompson, 2000), ‘intensity’ (Labov, 1984), ‘affect’ (Ochs, 1989), ‘evidentiality’ (Chafe, 1986; Chafe & Nichols, 1986), ‘hedging’ (Holmes, 1988; Hyland, 1996a, 1996b), ‘persuasion’ (Hyland, 2004a), and ‘stance’ (Barton, 1993; Beach & Anson, 1992; Biber, 2004, 2006a, 2006b; Biber & Finegan, 1988, 1989; Biber et al., 1999, Chapter 12; Conrad & Biber, 2000; Hyland, 1999b; Precht, 2000). In the present case, we adopt the framework of stance devices developed in Biber et al. (1999) and Biber (2006a,b) to analyze the ways in which move types in direct-mail letters differ linguistically. Because non-profit direct mail letters are overtly persuasive in nature, there is little question that stance plays an important role in this genre. We are interested in looking at how the use of stance structures (as opposed to other expressions of stance, like word choice) varies from move to move. We believe that identifying stance structures could be important in untangling the language structures used in direct-mail letters and provide a better way of describing the function of the different moves in the genre.

4.1 Identifying grammatical stance devices

According to Biber et al. (1999), the five most common grammatical devices used to express stance are: 1) stance adverbials, 2) stance complement clauses (specifically “that” and “to” clauses), 3) modals, 4) premodifying stance adverbs (e.g., ‘I’m so happy for you.’), and 5) stance nouns followed by prepositional phrases. While Biber (2006a; 2006b) has previously analyzed the use of grammatical stance devices in specific registers (comparing spoken and written academic registers), this study seeks to compare and contrast the use of these stance devices across the move types within a single genre. Each move was automatically ‘tagged’ using a grammatical tagger. While the tagging program, developed by Biber, identifies a wide variety of linguistic features (see Appendix Two), we focused here only on those grammatical devices that express stance. These features are given in Table 3A at the end of the chapter. The rate of occurrence for each stance feature within each move type was calculated. In the following discussion, we focus on only the stance features that occurred at least 3 times per 1,000 words.

4.2 Interpreting the use of grammatical stance devices used in moves

As expected, since each of the moves has very different rhetorical functions within this persuasion-motivated genre, the seven different move types all use different combinations of grammatical stance devices. Table 3.15 provides a breakdown of the results by move, showing those stance devices that occurred at a rate of ≥ 3 per 1,000 words.

Table 3.15  Common grammatical stance devices by move type

Move Type / Stance Structure Occurring ≥ 3 times/1000 words                          Rate/1000 words

Move 1: Get attention
    Stance Adverbials of Certainty                                                    7.9
    Modals of possibility/permission/ability                                          7.0
    Modals of prediction/volition                                                    13.1

Move 2: Introduce cause/establish credentials
    (none)

Move 3: Solicit response
    Modals of possibility/permission/ability                                         12.1
    Modals of prediction/volition                                                    14.4
    To-complement clauses controlled by (all) stance verbs                            7.8

Move 4: Offer incentives
    Modals of possibility/permission/ability                                          7.2
    Modals of prediction/volition                                                    19.7

Move 5: Reference insert
    Modals of necessity/obligation                                                    3.4

Move 6: Express gratitude
    Modals of prediction/volition                                                    11.2
    To-complement clauses controlled by desire/intention/decision stance verbs        4.4
    To-complement clauses controlled by all stance verbs                              5.6
    Pre-modifying stance adverbs                                                      3.0

Move 7: Conclude w/ pleasantries
    Stance Adverbials of Certainty                                                    5.6
    To-complement clauses controlled by desire/intention/decision stance verbs        8.5
    Pre-modifying stance adverbs                                                      7.2
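The rates in Table 3.15 are normalized counts: occurrences of a feature per 1,000 words of text in a given move type, with only features at or above 3 per 1,000 words reported. The sketch below illustrates that normalization and cut-off. The raw counts and word total are hypothetical values invented for the example (the chapter reports only the resulting rates), and the function name is an assumption.

```python
MIN_RATE = 3.0  # reporting threshold used in the chapter: 3 per 1,000 words


def rate_per_thousand(feature_count: int, total_words: int) -> float:
    """Normalize a raw feature count to a rate per 1,000 words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 1000 * feature_count / total_words


# Hypothetical raw counts for one move type (not figures from the study).
hypothetical_counts = {
    "modals of prediction/volition": 72,
    "stance adverbials of certainty": 9,
}
hypothetical_total_words = 5000

rates = {
    feature: rate_per_thousand(count, hypothetical_total_words)
    for feature, count in hypothetical_counts.items()
}

# Keep only the features frequent enough to be reported.
common = {feature: rate for feature, rate in rates.items() if rate >= MIN_RATE}
print(common)
```

Normalizing to a common base lets move types of very different lengths be compared directly, which matters here because Move Type 2 averages 150 words per occurrence while Move Type 5 averages 9.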



Table 3.15 provides the basis for interpreting how the different moves in this genre tend to use the different grammatical structures of stance in order to accomplish their rhetorical purpose. The purpose of Move Type 1 (Get Attention) is to engage the reader and get him/her interested in the cause/need being promoted. The move typically contains a quotation, story, or strong general pleasantries. The fairly strong reliance on modals of possibility/ability and modals of prediction has the purpose of ‘empowering’ the reader and trying to show that the reader can make a difference. This can be seen in the following examples. Modals of possibility/ability (italics added to show usage):

(7) “You might hear some ugly talk this summer.” (8) “YOU can be the one to open the door.”

Modals of prediction (italics added to show usage): (9) “The urgency you feel to make changes is just the extent that change will be made.” (10) “Until he extends the circle of his compassion to all living things, man will not himself find peace.”

Stance adverbials of certainty, the other stance structure frequently used in Move Type 1, contribute to getting the reader’s attention by underscoring the need. Stance adverbials of certainty (italics added to show usage): (11) “Please send a million dollars so we can really support geological activities here at IUPUI in perpetuity.” (12) (quoting Margaret Mead) “Never doubt that a small group of thoughtful, committed citizens can change the world, indeed it’s the only thing that ever has.”

Move Type 2, Introducing the cause and establishing credentials, did not include any especially frequent use of specific stance structures. Looking at the letters in the corpus more carefully, it appears that this move is written in a more “matter of fact” manner. Unlike the other moves, the emphasis in this move is on content and facts – what the organizations do and what the needs are – rather than emphasizing personal feelings, attitudes, value judgments, or assessments. For example: (13) “The number of companies reporting a shortage of skilled workers almost doubled from 1995 to 1998; from 27 percent to more than 47 percent. Did you know that about 20  percent of America’s workers have low basic skills and 75 percent of unemployed adults have reading or writing difficulties?

Chapter 3.  Identifying and analyzing rhetorical moves in philanthropic discourse 

Indy Reads is working to change that!”
(14) “In 1985 a group of courageous pioneering women established the YWCA of Indianapolis to meet the needs of women, and in 1998 the tradition continues. The YWCA of Indianapolis still focuses on, supports, and gives empowerment to women and their families. Empowerment refers to meeting the needs of girls and women so that they can freely exercise the power to determine and direct their lives.”

Move Type 3, Solicit response, can incorporate one of two steps, either soliciting financial support or soliciting other response (a non-financial contribution from readers, such as volunteering to help). The stance structures most commonly used in Move Type 3 are modals of possibility and ability, modals of prediction and volition, and to-complement clauses controlled by stance verbs. A closer look at the letters themselves shows that Move Type 3 frequently uses modals of possibility and ability in order to state the benefit of support for the reader. The modal can, indicating ability, is by far the most common (occurring 188 times); the modal may is the next most common (occurring 45 times), indicating possibility:

(15) “You can help people reach their dreams of reading and learning by making a contribution to Indy Reads.”
(16) “It may help a family stay together.”

Modals of prediction and volition, on the other hand, were typically used to ask directly for a pledge or donation:

(17) “Will you help them change?”
(18) “We hope you will become a partner of Indy Reads”

To-complement clauses controlled by stance verbs most frequently appear at the end of the move and play a role in making clear what it is the organization wants the reader to do in response to the letter.

(19) “We are hopeful that you will agree to help.”
(20) “If you have any questions or concerns at any time, please do not hesitate to call me.”
(21) “When you are contacted by your Campus Campaign volunteer, we hope you’ll choose to become one of the many partners in the community of IUPUI.”

Move Type 4, Offer incentives, makes frequent use of modals of possibility, permission, and ability, but modals of prediction and volition are used at an extremely high rate of nearly 20 times per 1,000 words. These structures parallel those used in Move Types 1 and 3, but support very different rhetorical purposes. Modals of possibility/permission/ability typically are tied to offers of tangible incentives, as illustrated by examples (22) and (23).

(22) “I hope we can include your name among the list of inaugural members of the 1994 Black Cane Society.”
(23) “Based on each individual tax situation, your gift may be tax deductible”

Modals of prediction and volition are also used to support reciprocal offers, including offers of tangible incentives (24, 25) and offers of intangible incentives (26).

(24) “Corporate contributors will be acknowledged in our newsletter, annual report and on the Indy Reads webpage.”
(25) “However, these tax credits are only available for a limited time, so we ask that you act soon if you would like to use them.”
(26) “I am sure you will feel good about giving.”

Move Type 5, Reference insert, uses only one grammatical stance structure consistently, but this structure, modals of necessity and obligation, is not used regularly by any other move. Looking at this structure in context, it is clearly used to direct readers’ attention to materials included with the letter.

(27) “For your convenience, I am enclosing a copy of Form CC-40, which should be filed with your Indiana State Income Tax.”

What is most interesting about the regular use of this particular stance structure in this move is that it is very directive, explicitly telling the reader what s/he must, should, or ought to do. This is a rather surprising structure to see in a letter such as this whose whole purpose is to persuade a reader to make a financial (usually) contribution; telling someone they have to do something (when they really don’t) is usually not a successful persuasion tactic. Nevertheless, within Move 5, this stance structure does not come across as inappropriate, primarily because it is not part of the solicitation itself but points the reader to steps that will benefit him/herself, rather than the agency.

Move Type 6, Express gratitude, commonly uses modals of prediction and volition (28 and 29), as well as to-complement clauses controlled by stance verbs (30 and 31) to thank readers in advance for potential donations.

(28) “Your check will be greatly appreciated.”
(29) “I would like to thank you for your commitment to dental hygiene education.”
(30) “I want to thank you for your help.”
(31) “I want to express my gratitude to those of you who have already pledged or contributed in 1991.”

The level of appreciation for the gift is frequently signaled in this move through the use of pre-modifying stance adverbs, as illustrated by the following two examples.

(32) “Thank you so much for your help.”
(33) “I can only hope that you know how appreciative we at the Indianapolis Zoo are of your philanthropy.”

Move Type 7, Conclude with pleasantries, is the only move other than Move Type 1 to use stance adverbials of certainty. In this move, this structure always occurs in rather formulaic expressions, as shown by the following examples.

(34) “May you be blessed, today and always, as you so generously share your blessing.”
(35) “I am always happy to hear from you about your accomplishments.”

Move Type 7 also has the highest rate of to-complement clauses controlled by desire/intention/decision stance verbs, but like the adverbials of certainty, these all, without exception, occur in short, formulaic structures that tend to end the letter.

(36) “I hope to see you there.”
(37) “I hope to hear from you soon.”

Lastly, just as Move Type 6 uses pre-modifying stance adverbials for emphasis, Move Type 7 uses this structure in the same way. In fact, the only pre-modifying adverb that is commonly used in this move is the adverb so before an adjective: “so great,” “so closely,” “so generously.”

In sum, all the move types in the fund raising letters, with the exception of Move Type 2, frequently use one or more grammatical stance devices, and the combinations of grammatical structures used are distinctive, with no two moves using the same set of structures. Our results suggest that different moves use somewhat different stance structures, which supports the need to teach different strategies for different moves.

Overall, however, the results were unexpected in showing a rather limited use of stance structures. Modals of possibility/ability and prediction were used along with to-complement clauses. Missing, however, were many stance features that are typically considered part of persuasive discourse, e.g., modals of obligation, stance adverbials, and premodifying stance adverbs. The lack of variety in the stance use suggests a discourse that treads carefully, does not take strong positions, and does not put strong demands on the reader. A previous study using Biber’s multi-dimensional features (Connor & Upton, 2003) had suggested that fundraising letters as a genre are similar to academic prose, a finding that was unexpected. The current study further supports this general finding: both genres use a limited range of grammatical stance features, restricted primarily to modal verbs and to-complement clauses (compare the findings here to those reported in Biber 2006a,b for academic prose). Thus, despite their apparent differences in communicative purpose, we see here that these two genres are surprisingly similar in the kinds of stance expressed and the particular linguistic devices used for these functions.

5 Final thoughts

One goal of this chapter was to outline a general approach that can be used to identify and analyze the moves of a genre in a corpus of texts, and to provide a specific and detailed example of how this type of analysis can be done. As noted in Chapter 2, a move analysis seeks to identify the components (moves) of a genre by the communicative purposes they serve. These communicative purposes must be identified within the context of the genre as well as the social context in which the genre resides (e.g., fundraising direct mail relationships). The question that we are seeking to answer with this type of analysis is, “What are the rhetorical structures that address the specific purposes of the genre, and if these vary, how so?” Because we are seeking to understand why a genre is structured the way it is, and because it is important to account for the socio-cultural, institutional, and organizational influences on a specific genre, there is naturally a “subjective” element to this sort of analysis. However, despite its subjective nature, certain guidelines can be followed that enable an empirical analysis of moves in a corpus of texts (see also Chapter 2). First, extreme care must be taken to collect “good” data. In the present case, the corpus of fundraising discourse was well planned – involving the input of both fundraising practitioners and linguists – and carefully documented, and was large enough to provide reliable results. Then, a series of pilot studies was run with a research team, first to develop a working set of genre-specific move types with distinct definitions, and then to confirm the inter-rater reliability of using these moves to analyze the individual texts in the corpus.
Once the move types were clearly defined and all of the documents in the data set coded (and checked by multiple raters), the next step was to look for patterns in how and when the different moves were used in order to help explain the specific role of each move more broadly within the genre. The goal was to have a full understanding of the communicative purposes and functions that different parts of the genre have and how they work together to accomplish the overall communicative aim of the genre. Nevertheless, although a move analysis uses communicative function as the starting point for understanding the rhetorical purposes of a genre, the expectation is that these distinct functions are realized through the use of distinct and consistent linguistic features. Consequently, it should be possible to see variation in linguistic patterns from one move to the next.
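The inter-rater check mentioned above can be illustrated with a small sketch. The chapter does not say which agreement statistic was used; Cohen's kappa is assumed here only because it is a common choice for two coders assigning categorical labels. The function name and the sample codes (move type numbers 1 to 7 assigned to eight text segments) are hypothetical.

```python
from collections import Counter

def cohen_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders' move labels.

    Assumption: each list holds one move-type label per text segment,
    in the same segment order for both coders.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(codes_a)
    # Proportion of segments on which the two coders agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's label distribution.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes for eight segments of one letter.
coder_1 = [1, 1, 3, 3, 6, 7, 2, 3]
coder_2 = [1, 1, 3, 4, 6, 7, 2, 3]
kappa = cohen_kappa(coder_1, coder_2)
```

A value near 1 indicates that the move definitions are being applied consistently; low values would suggest that the working definitions need to be revised before the full corpus is coded.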


The second and more important goal of this chapter for the purposes of this book was to show the contribution that a corpus-based approach could make in the analysis of discourse structure (e.g., move structure). By analyzing generic moves in a fairly large specialized corpus of direct mail letters collected from multiple non-profit organizations of various types (e.g., environmental, education), we are able to generalize the findings and develop representative prototypes that can be used for exemplification and training. It then becomes possible to authoritatively compare and contrast the discourse structures of different types of texts in order to gain a clearer understanding of how each uniquely accomplishes its communicative purposes. In addition, using a corpus of texts to analyze discourse structures makes it much easier to identify alternate ways (steps) for accomplishing common functions (moves); such variations can be easily missed or misinterpreted when looking at individual texts. In the same vein, a corpus-based analysis makes it easy to identify which moves (and steps) are more common, even required, and which are optional or idiosyncratic and can be used at the discretion of the writer without the reader feeling the text is non-standard or inappropriate. In a more detailed analysis, a corpus-based approach will even permit generalizations about where different discourse structures occur within a typical text, and where they occur relative to other structures (i.e., before, after, or within). A corpus-based analysis also allows for the detailed analysis of the linguistic characteristics of the discourse units that have been identified. In the discourse analysis done in this chapter, the corpus-based approach allowed us to make detailed observations about the specific grammatical stance devices that each of the different move types used to accomplish their unique functions in the genre. 
Further analysis would likely reveal other linguistic differences among these move types.

Table 3A  Grammatical devices used to express stance

1. Stance adverb(ial)s (See Biber et al., 1999, pp. 557–558, 853–874)
Expressing Certainty: actually, always, certainly, definitely, indeed, inevitably, in fact, never, of course, obviously, really, undoubtedly, without doubt, no doubt
Expressing Likelihood: apparently, evidently, kind of, most cases, most instances, perhaps, possibly, predictably, probably, roughly, sort of, maybe
Expressing Attitude: amazingly, astonishingly, conveniently, curiously, disturbingly, hopefully, even worse, fortunately, importantly, ironically, regrettably, rightly, sadly, sensibly, surprisingly, unbelievably, unfortunately, wisely
Expressing Style: accordingly, according to, confidentially, figuratively, frankly, generally, honestly, mainly, strictly, technically, truthfully, typically, reportedly, primarily, usually

2. Complement clauses controlled by stance verbs, adjectives, or nouns

2.1 Stance verb + that-clause. (See Biber et al., 1999, pp. 661–670)
Verbs Expressing Certainty: acknowledge, affirm, ascertain, calculate, certify, check, conclude, confirm, decide, deem, demonstrate, determine, discover, find, know, learn, mean, meant, meaning, note, notice, observe, prove, realize, recall, recognize, recollect, record, remember, see, show, signify, submit, testify, understand
Verbs Expressing Likelihood: appear, assume, believe, bet, conceive, consider, deduce, detect, doubt, estimate, figure, gather, guess, hypothesize, imagine, indicate, intend, perceive, postulate, predict, presuppose, presume, reckon, seem, sense, speculate, suppose, suspect, think, wager
Verbs Expressing Attitude: accept, admit, agree, anticipate, boast, complain, concede, cry, dream, ensure, expect, fancy, fear, feel, forget, foresee, guarantee, hope, mind, prefer, pretend, reflect, require, resolve, trust, wish, worry
Verbs Expressing Speech Act (and other communication verbs): add, announce, advise, answer, argue, allege, ask, assert, assure, charge, claim, confide, confess, contend, convey, convince, declare, demand, deny, emphasize, explain, express, forewarn, grant, hear, hint, hold, imply, inform, insist, maintain, mention, mutter, notify, order, persuade, petition, phone, pray, proclaim, promise, propose, protest, reassure, recommend, remark, reply, report, respond, reveal, say, shout, state, stress, suggest, swear, sworn, teach, telephone, tell, urge, vow, warn, whisper, wire, write

2.2 Stance verb + to-clause. (See Biber et al., 1999, pp. 693–715)
Verbs Expressing Probability (likelihood): appear, happen, seem, tend
Verbs Expressing Cognition/perception: assume, believe, consider, estimate, expect, felt, find, forget, hear, imagine, judge, know, learn, presume, pretend, remember, see, suppose, take, trust, understand, watch
Verbs Expressing Desire/Intention/Decision: aim, agree, bear, care, choose, consent, dare, decide, design, desire, dread, hate, hesitate, hope, intend, like, look, love, long, mean, need, plan, prefer, prepare, refuse, regret, resolve, schedule, stand, threaten, volunteer, wait, want, wish
Verbs Expressing Causation/Modality/Effort: afford, allow, appoint, arrange, assist, attempt, authorize, bother, cause, counsel, compel, defy, deserve, drive, elect, enable, encourage, endeavor, entitle, fail, forbid, force, get, help, inspire, instruct, lead, leave, manage, oblige, order, permit, persuade, prompt, require, raise, seek, strive, struggle, summon, tempt, try, venture
Verbs Expressing Speech Act (and other communication verbs): ask, advise, beg, beseech, call, claim, challenge, command, convince, decline, heard, invite, offer, pray, promise, prove, remind, report, request, say, said, show, teach, tell, urge, warn

2.3 Stance adjective + that-clause. (See Biber et al., 1999, pp. 671–674; many of these occur with extraposed constructions)
Adjectives Expressing Certainty: accepted, apparent, certain, clear, confident, convinced, correct, evident, false, impossible, inevitable, obvious, positive, proved, plain, right, sure, true, well-known
Adjectives Expressing Likelihood: doubtful, likely, possible, probable, unlikely
Adjectives Expressing Attitude/Emotion: adamant, afraid, alarmed, amazed, amused, angry, annoyed, astonished, aware, careful, concerned, curious, depressed, disappointed, dissatisfied, distressed, disturbed, encouraged, frightened, glad, grateful, happy, hopeful, hurt, irritated, mad, pleased, reassured, relieved, sad, satisfied, shocked, surprised, thankful, unaware, uncomfortable, unhappy, unlucky, upset, worried
Adjectives Expressing Evaluation: acceptable, advisable, amazing, annoying, anomalous, appropriate, awful, conceivable, critical, crucial, desirable, dreadful, embarrassing, essential, extraordinary, fitting, fortunate, funny, good, great, horrible, imperative, incidental, inconceivable, incredible, indisputable, interesting, ironic, lucky, natural, neat, necessary, nice, notable, noteworthy, noticeable, obligatory, odd, okay, paradoxical, peculiar, preferable, ridiculous, sensible, shocking, silly, sorry, strange, stupid, sufficient, surprising, tragic, typical, unacceptable, unaware, uncomfortable, understandable, unfair, unfortunate, unthinkable, untypical, unusual, upsetting, vital, wonderful

2.4 Stance adjective + to-clause. (See Biber et al., 1999, pp. 716–721; many of these occur with extraposed constructions)
Adjectives Expressing Certainty/Likelihood: apt, certain, due, guaranteed, liable, likely, prone, unlikely, sure
Adjectives Expressing Attitude/Emotion: afraid, amazed, angry, annoyed, ashamed, astonished, concerned, content, curious, delighted, disappointed, disgusted, embarrassed, free, furious, glad, grateful, happy, impatient, indignant, nervous, perturbed, pleased, proud, puzzled, relieved, sorry, surprised, worried
Adjectives Expressing Evaluation: awkward, appropriate, bad, best, better, brave, careless, convenient, crazy, criminal, cumbersome, desirable, dreadful, essential, expensive, foolhardy, fruitless, good, important, improper, inappropriate, interesting, logical, lucky, mad, necessary, nice, reasonable, right, safe, sick, silly, smart, stupid, surprising, useful, useless, unreasonable, unseemly, unwise, vital, wise, wonderful, worse, wrong
Adjectives Expressing Ability/Willingness: able, anxious, bound, careful, competent, determined, disposed, doomed, eager, eligible, fit, greedy, hesitant, inclined, insufficient, keen, loath, obliged, prepared, quick, ready, reluctant, set, slow, sufficient, unable, unwilling, welcome, willing
Adjectives Expressing Ease or Difficulty: difficult, easier, easy, hard, impossible, pleasant, possible, tough, unpleasant

2.5 Stance noun + that-clause. (See Biber et al., 1999, pp. 648–651)
Nouns Expressing Certainty: assertion, conclusion, conviction, discovery, doubt, fact, knowledge, observation, principle, realization, result, statement
Nouns Expressing Likelihood: assumption, belief, claim, contention, expectation, feeling, hypothesis, idea, implication, impression, indication, notion, opinion, possibility, presumption, probability, rumor, sign, suggestion, suspicion, thesis
Nouns Expressing Attitude/Perspective: grounds, hope, reason, view, thought
Nouns Expressing Communication: comment, news, proposal, proposition, remark, report, requirement

2.6 Stance noun + to-clause. (See Biber et al., 1999, pp. 652–653)
agreement, authority, commitment, confidence, decision, desire, determination, duty, failure, inclination, intention, obligation, opportunity, plan, potential, promise, proposal, readiness, reluctance, responsibility, right, scheme, temptation, tendency, threat, wish, willingness

3. Modal and semi-modal verbs (See Biber et al., 1999, pp. 483ff.)
Modals Expressing Possibility/Permission/Ability: can, could, may, might
Modals Expressing Necessity/Obligation: must, should, (had) better, have to, got to, ought to
Modals Expressing Prediction/Volition: will, would, shall, be going to

4. Premodifying stance adverb (stance adverb + adjective or noun phrase)
Most common premodifying adverbs (See Biber et al., 1999, pp. 544ff.):
Adverbials + adjectives (‘It was perfectly quiet.’): awfully, completely, extremely, how, perfectly, quite, really, slightly, so, totally, very
Adverbials + nouns (‘It is almost time;’ ‘It was quite a surprise.’): about, almost, completely, quite, really

5. Stance noun + prepositional phrase (‘of + NP’ or ‘for + NP’) (See 2.5 and 2.6 for list of stance nouns used.)
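As an illustration of how the modal classes in section 3 of Table 3A could be turned into rates per 1,000 words (the normalization used when comparing moves), here is a minimal sketch. It is not the authors' tagging procedure: it matches surface strings only, with no part-of-speech tagging, so "going to" and "better" are rough approximations, inflected forms such as "has to" and contractions such as "you'll" are missed, and the function and variable names are hypothetical.

```python
import re

# Modal classes taken from section 3 of Table 3A; semi-modals are
# approximated by literal strings ("be going to" as "going to").
MODAL_CLASSES = {
    "possibility/permission/ability": ["can", "could", "may", "might"],
    "necessity/obligation": ["must", "should", "had better", "have to", "got to", "ought to"],
    "prediction/volition": ["will", "would", "shall", "going to"],
}

def modal_rates(text, per=1000):
    """Return occurrences of each modal class per `per` running words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n_words = len(tokens)
    if n_words == 0:
        return {label: 0.0 for label in MODAL_CLASSES}
    joined = " ".join(tokens)
    rates = {}
    for label, forms in MODAL_CLASSES.items():
        hits = sum(len(re.findall(r"\b" + re.escape(form) + r"\b", joined))
                   for form in forms)
        rates[label] = hits * per / n_words
    return rates

# Three short Move Type 3 examples quoted in this chapter, treated as one text.
move_3_text = ("You can help. It may help a family stay together. "
               "Will you help them change?")
rates = modal_rates(move_3_text)
```

Normalizing by the length of each move, rather than reporting raw counts, is what makes short moves (such as closing pleasantries) comparable with longer ones.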

Chapter 4

Rhetorical moves in biochemistry research articles
by Budsaba Kanoksilapatham

The study described in this chapter provides another example of the powerful descriptive nature of a corpus-based, top-down approach to discourse analysis.¹ Unlike previous move-based studies of research articles, this is the first study to undertake a comprehensive coding of all the moves in a fairly large corpus that represents all four sections – introduction, methods, results, discussion – followed by an analysis of the linguistic structures that make up those moves. In keeping with the steps introduced in Table 1.1, the first steps in the study (after the compilation of a representative corpus) were to identify the rhetorical move types used in biochemistry research articles, segment the texts into moves, and then identify the specific move type each represents. Then, following steps 4–5 described in Table 1.1, multidimensional analysis (see Appendix 1) was used to identify the linguistic characteristics of each rhetorical move, and to analyze the typical linguistic characteristics of each move type. Finally, the typical discourse organization of research article sections is analyzed in terms of these move types. The integration of move analysis and multidimensional analysis provides us with a comprehensive communicative and linguistic description of the discourse of biochemistry research articles, underscoring the value of a corpus-based approach.



1. The material presented in this chapter is based upon dissertation research supported by the National Science Foundation, USA, under Grant No. 0213948 and a TOEFL Grant for Doctoral Research. Part of this material has been previously published in English for Specific Purposes, 24, 3, 269–292.

Previous move-based studies of scientific research articles have provided valuable insights regarding the rhetorical moves conventionally employed in each of the four internal sections (introduction, methods, results, discussion; see Chapter 2). Discipline-specific variations are also discernible (e.g., Anthony, 1999; Brett, 1994; Chu, 1996; Dubois, 1997; Naczi, Reznicek, & Ford, 1998; Swales & Luebs, 2002; D. Thompson, 1993), suggesting that the rhetorical organization of research articles is constrained by conventions of the academic disciplines and by the expectations of the relevant discourse communities.

However, the findings generated by these studies must be treated with caution. First, many of these studies do not analyze a representative corpus of the discipline they studied. Sampling problems include experts’ subjective recommendations of the journals analyzed (e.g., Nwogu, 1997; Posteguillo, 1999), reflecting individual preferences rather than the actual academic prestige of the journals. Other sampling problems include lack of specified criteria for article selection (e.g., Swales & Najjar, 1987; Williams, 1999), mixture of different genres (such as clinical reports and experimental articles) in the same corpus (e.g., Williams, 1999), and non-compatibility of journals (e.g., specialized journals and interdisciplinary journals in the same corpus in Berkenkotter & Huckin, 1995). In addition, most previous studies have focused on a single section of articles, rather than the overall organization of research articles across all four sections (introduction, methods, results, discussion). The unsystematic and subjective selection of research articles investigated, and the mixed and unrepresentative nature of the corpus, preclude valid generalizations about the rhetorical organization of the target genre. Perhaps more importantly, previous research has been limited by its exclusive focus on rhetorical moves, with little or no attention given to the lexico-grammatical characteristics of moves. For this reason, we know little at present about the typical linguistic characteristics of the different move types that comprise research articles.

The study presented in this chapter is unique in several ways.
First, it analyzes the discourse structure of all four sections (introduction, methods, results, discussion) in scientific research articles. Prior to this study, Nwogu’s 1997 study was the only one that described moves in all four sections of research articles, based on analysis of 15 medical articles that had been recommended by medical practitioners. While a useful initial analysis, that study was still quite restricted in scope. In the present study, based on Swales’ (1990; 2004) framework,² 60 biochemistry research articles (which were systematically collected as part of a representative corpus) were first analyzed for move structure. This qualitative approach was then complemented by quantitative analysis of specific linguistic characteristics of each move type.

Some previous grammatical-rhetorical studies have described the functions of individual linguistic features in research articles (e.g., Salager-Meyer, 1997, on hedging; D. Thompson & Ye, 1991, on reporting verbs). In general, though, these studies do not document linguistic differences across research article sections, and no study to date has attempted to describe systematic linguistic differences among the move types within research article sections. In contrast, the present study undertakes a detailed linguistic description of each move type. This description incorporates analysis of 41 distinct linguistic features, a large set of features made possible by corpus-based techniques (including multidimensional analysis). Combining the strengths of both qualitative and quantitative corpus analysis tools, this study illustrates a novel and successful application of multidimensional analysis for top-down discourse analysis: to systematically identify the linguistic features associated with each move type (representing different communicative purposes) and to provide a more comprehensive description of rhetorical organization in research articles than has been previously feasible.

2. Swales’ original 1990 model was revised in 2004 (230, 232); the revised model consists of three moves. Move 1: Establishing a topic is realized by topic generalizations of increasing specificity. Move 2: Preparing for the present study (citations possible) is realized as Step 1A: Indicating a gap or Step 1B: Adding to what is known, and Step 2: Presenting positive justification. Move 3: Presenting the present work is realized by up to seven steps: Step 1: Announcing present research descriptively and/or purposively, Step 2: Presenting research questions or hypotheses, Step 3: Definitional clarifications, Step 4: Summarizing methods, Step 5: Announcing principal outcomes, Step 6: Stating the value of the present research, Step 7: Outlining the structure of the paper.

2 Description of the corpus

The term “biochemistry” was first introduced in 1903, but this field has mushroomed so that it is now represented by 261 specialized journals published worldwide (“Journal Citation Reports,” 2004). As a result, it is no easy task to build a representative corpus of research articles from this academic discipline, ensuring that the articles contained in the corpus truly represent the range of research articles in biochemistry. To control for possible differences among national varieties of English and across time, only journals published in the United States in the year 2000 were considered. The corpus was further restricted to research articles from the five most prestigious scientific journals in biochemistry (determined by their “impact factors”³): Cell (C), Molecular Cell (MC), Molecular and Cellular Biology (MCB), Journal of Biological Chemistry (JBC), and Molecular Biology of the Cell (MBC). From these five journals, 60 articles (12 from each journal) were randomly selected, evenly distributed over all the issues of each journal for the year 2000. These articles all have four distinct sections (Introduction, Methods, Results, and Discussion). The total corpus size is about 320,000 running words.

3. The impact factor is the average number of times articles that are published in a specific journal in the two previous years were cited in a particular year. This figure is useful in evaluating a journal’s relative importance, especially when a comparison is made to other journals in the same field.
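The sampling design described above (12 randomly chosen articles from each of five journals, spread evenly over the year's issues) can be sketched as follows. This is only an illustration under stated assumptions: the catalog structure, the function name, the round-robin strategy for spreading picks across issues, and the number of issues and articles per journal are all hypothetical, since the chapter does not describe the selection mechanics in that detail.

```python
import random

JOURNALS = ["C", "MC", "MCB", "JBC", "MBC"]  # abbreviations used in the chapter

def sample_articles(catalog, per_journal=12, seed=0):
    """Pick `per_journal` articles per journal, spread evenly over issues.

    catalog: {journal: {issue_id: [article_id, ...]}} (hypothetical structure).
    """
    rng = random.Random(seed)
    sample = {}
    for journal, issues in catalog.items():
        issue_ids = sorted(issues)
        rng.shuffle(issue_ids)  # random order in which issues are visited
        # Shuffled copy of each issue's articles, so pops are random draws.
        pools = {i: rng.sample(issues[i], len(issues[i])) for i in issue_ids}
        chosen = []
        # Round-robin over issues so no issue contributes a second article
        # before every issue has contributed one.
        while len(chosen) < per_journal and any(pools.values()):
            for i in issue_ids:
                if pools[i] and len(chosen) < per_journal:
                    chosen.append(pools[i].pop())
        sample[journal] = chosen
    return sample

# Hypothetical catalog: 12 issues per journal, 10 research articles per issue.
catalog = {
    j: {issue: [f"{j}-{issue}-{k}" for k in range(10)] for issue in range(1, 13)}
    for j in JOURNALS
}
corpus_sample = sample_articles(catalog)
```

Fixing the random seed makes the selection reproducible, which helps when the corpus composition has to be documented.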



3 Determining the move categories in the genre of biochemistry research articles

The first step in the analysis here was to identify the move types that can occur in each section of biochemistry research articles. This task was made easier because I was able to build on the numerous previous studies that have identified move types in research articles from different academic disciplines: Anthony (1999), Chu (1996), Crookes (1986), Samraj (2002) on the move types in Introductions; Swales & Luebs (2002), Wood (1982) on the move types in methodology sections; D. Thompson (1993), Williams (1999) on the move types in Results sections; and Dubois (1997) on the move types in Discussion sections. Considering the findings from these studies, together with my own detailed analyses of biochemistry research articles, I identified 15 move types that can occur in these texts. Several of these move types can consist of multiple sub-parts, referred to as ‘steps’. Table 4.1 summarizes the overall framework.

Table 4.1  Model of move structure in biochemistry research articles

INTRODUCTION
Move 1: Establishing a topic
Move 2: Preparing for the present study: Indicating a gap/raising a question
Move 3: Introducing the present study
    Step 1: Stating purpose(s)
    Step 2: Describing procedures
    Step 3: Presenting findings

METHODS
Move 4: Describing materials
    Step 1: Listing materials
    Step 2: Detailing the source of the materials
    Step 3: Providing the background of the materials
Move 5: Describing experimental procedures
    Step 1: Documenting established procedures
    Step 2: Detailing procedures
    Step 3: Providing the background of the procedures
Move 6: Detailing equipment
Move 7: Describing statistical procedures

RESULTS
Move 8: Restating methodological issues
    Step 1: Describing aims and purposes
    Step 2: Stating research questions
    Step 3: Making hypotheses
    Step 4: Listing procedures or methodological techniques
Move 9: Justifying methodological issues
Move 10: Announcing results
    Step 1: Reporting results
    Step 2: Substantiating results
    Step 3: Invalidating results
Move 11: Commenting results
    Step 1: Explaining results
    Step 2: Generalizing/interpreting results
    Step 3: Evaluating results
    Step 4: Stating limitations
    Step 5: Summarizing

DISCUSSION
Move 12: Contextualizing the study
    Step 1: Describing established knowledge
    Step 2: Generalizing, claiming, deducing previous knowledge
Move 13: Consolidating results
    Step 1: Restating methodology (purposes, research questions, hypotheses, and procedures)
    Step 2: Stating selected findings
    Step 3: Referring to previous literature
    Step 4: Explaining differences in findings
    Step 5: Making overt claims/generalizations
    Step 6: Exemplifying
Move 14: Stating limitations of the study
Move 15: Suggesting further research
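For readers who want to apply the framework in Table 4.1 computationally, the move inventory can be held in a small lookup structure. The following is only a minimal sketch: the move numbers and labels are taken from Table 4.1, but the names MOVES_BY_SECTION and is_valid_code are hypothetical and do not correspond to any tool used in the study. Steps are omitted here on the assumption that only move-level codes are being recorded.

```python
# Move inventory from Table 4.1, keyed by section (steps omitted).
# MOVES_BY_SECTION and is_valid_code are illustrative names, not part of the study.
MOVES_BY_SECTION = {
    "Introduction": {
        1: "Establishing a topic",
        2: "Preparing for the present study",
        3: "Introducing the present study",
    },
    "Methods": {
        4: "Describing materials",
        5: "Describing experimental procedures",
        6: "Detailing equipment",
        7: "Describing statistical procedures",
    },
    "Results": {
        8: "Restating methodological issues",
        9: "Justifying methodological issues",
        10: "Announcing results",
        11: "Commenting results",
    },
    "Discussion": {
        12: "Contextualizing the study",
        13: "Consolidating results",
        14: "Stating limitations of the study",
        15: "Suggesting further research",
    },
}


def is_valid_code(section, move):
    """Return True if the move number belongs to the given section in Table 4.1."""
    return move in MOVES_BY_SECTION.get(section, {})


if __name__ == "__main__":
    total = sum(len(moves) for moves in MOVES_BY_SECTION.values())
    print(total, is_valid_code("Methods", 5), is_valid_code("Methods", 12))
```

A check of this kind can flag coding slips, such as a Discussion move assigned to a segment of a Methods section, before any counts are made.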

The following sections describe the individual move types in each section and their constituent steps.

3.1 The introduction section

Move 1: Establishing a topic assures that the topic is worth investigating and the field is well established. Move 1 also reports previous research deemed relevant to the topic being discussed. Move 1 usually begins the Introduction section, consisting of topical statements of increasing specificity: General Move 1 statement:

(1) Cell-cell adhesion is critical for tissues and organs. [C9]

Specific Move 1 statement:

(2) These modifications promote plasma membrane association and facilitate high-affinity protein-protein interactions (REFERENCE). [MBC3]

Move 2: Preparing for the present study focuses on weaknesses in the existing literature and/or unaddressed research questions. Move 2 in biochemistry establishes a niche in previous research by the step of either indicating a gap or raising a question, as shown in (3–4).

(3) Although these and other important roles of U2 snRNP are well known, the critical issue of … has not yet been determined. [MC5]

(4) …, but it is not known whether they associate specifically with AJs. [C1]



Move 3: Introducing the present study is realized by three steps in this genre. Step 1: Stating purpose(s) explicitly announces the purpose(s) of the study:

(5) It was undertaken to examine in detail … and to try to understand …. [MCB3]

Step 2: Describing procedures focuses on the principal features of the study:

(6) We therefore investigated AJ formation in primary keratinocytes …. [C1]

Step 3: Presenting findings announces the major findings of the study:

(7) Our results show that U2snRNP is … associated with the E complex …. [MC5]


The methods section

The methods section has four move types. Move 4: Describing materials covers a wide range of materials used in biochemistry experiments, from natural substances, human/animal organs or tissues, to chemicals. Move 4 can be realized as three variations. Step 1: Listing materials explicitly itemizes materials or substances used:

(8) Bacterial strains used in this study … are listed in Table 3. [C8]

Step 2: Detailing the source of materials identifies how these items are obtained, such as by purchase, as a gift, etc.:

(9) COS-7 cells were obtained from S.Brandt …. [MCB4]

Step 3: Providing the background of the materials includes the description, properties, or characteristics of the materials: (10) All strains have GAL upstream activating sequence-regulated PGK1pG and MFA2pG genes, … (REFERENCE). [MCB11]

Move 5: Describing experimental procedures has three variations or steps. Step 1: Documenting established procedures recounts established experimental processes commonly known to biochemistry researchers: (11) Chromatin binding assays were performed as previously described (REFERENCE). [MC4]


Step 2: Detailing procedures provides a detailed description of less typical procedures so that subsequent studies can replicate them: (12) To obtain polyclonal antibodies …, mice and rabbits were immunized …. [MBC9]

Step 3: Providing the background of the procedures justifies the choice of technique or procedure: (13) Complete details of all constructions will be provided upon request. [JBC10]

Move 6: Detailing equipment (14) and Move 7: Describing statistical procedures (15) both occur infrequently in this genre:

(14) Images were recorded through a Hamamatsu C-2400 New vicon camera using a 10× objective and brightfield optics. Video images were digitized at a rate of 6 frames/min as described above. [MBC8]

(15) The data were fitted to the Michaelis-Menten Equation 1 by using a non-linear least squares approach and the kinetic constants ± S.E. [JBC7]


The results section

The results section also has four move types: Move 8: Restating methodological issues focuses on how the data of the study have been produced. This move is realized by one or more of four steps. Step 1: Describing aims and purposes: (16) To examine the kinetics …, we first plated … keratinocytes …. [C1]

Step 2: Stating research questions: (17) To determine whether these GTPases participate in the phagocytosis of P. aeruginosa, we expressed …. [JBC1]

Step 3: Making hypotheses: (18) Mondo A and Mlx heterodimerize are predicted … to bind CACGTG E-box sequences. [MCB12]

Step 4: Listing procedures or methodological techniques: (19) (To determine whether …,) P19 cytoplasmic extracts were incubated …. Retention of MondoA Mlx heterodimers on the DNA beads was determined by Western blotting. [MCB12]



Move 9: Justifying procedures or methodology reveals what determines the scientists’ decision to opt for particular experimental methods, procedures, or techniques. This move can be expressed by referring to previous research. (20) (DKO4 cells were used), in which mutant Ras had been detected homologous recombination (REFERENCE) and a conditionally active Raf allele (EGFPRaf-1: ER) was stably expressed in these cells (REFERENCE). [C10]

Move 10: Announcing results is a crucial move of the Results section and is realized by three steps. The first step reports major findings, whereas the second step persuades the respective discourse community to consider the finding as a part of consensual knowledge. The third step highlights the novelty produced by the study that might be worth further investigation. Step 1: Reporting results: (21) Data is shown for Pse1–ECFP/Nic96–EYFP and Pse1–ECFP/Nup188–EYFP (Figure 3). [MC1]

Step 2: Substantiating results: (22) Similar results were obtained…. [MC1]

Step 3: Invalidating results: (23) (Full length VASP-GFP localized to adhesion zippers … (Figures 6A-6D). This was true in the majority of transfected cells ….) In contrast, TD-GFP interfered with formation of adhesion (Figures 6E-6H). [C1]

Move 11: Commenting on the results is one place where scientists not only report but also comment on the results. Excerpts (24–28) illustrate the five steps of Move 11: Step 1: Explaining results: (24) We presume that the localization of GFP-tagged Ste18p is representative of native Ste18p because the wild-type fusion protein rescues mating in a ste18 strain. [MBC3]

Step 2: Generalizing/interpreting results: (25) These results suggest that proteolysis of c-Myc is proteasome dependent. [MCB4]

Step 3: Evaluating results: (26) The strong exacerbation of the phenotype of fun12 (1–915).. and the lack of any effect in tif34 … support our conclusion that eIF5B and eIF1A functionally interact during translation initiation. Moreover, the toxicity … is consistent with the model that release of eIF1A and eIF5B from 80S initiation complexes is coupled. [MCB10]

Step 4: Stating limitations: (27) The molecular mechanisms … are unknown. It is therefore difficult to propose an explicit … model to explain why telomeres become longer …. [MC10]

Step 5: Summarizing: (28) Together, these results demonstrate that reg A- cells are capable of assessing the direction of a spatial gradient of cAMP …. [MBC8]


The discussion section

The discussion section also comprises four possible move types. Move 12: Contextualizing the study has two distinct steps. Step 1: Describing established knowledge cites or reports related previous research or established knowledge of the topic that is crucial to understanding what is being presented: (29) Conventional kinesin has long been suspected of being a vesicle motor. Initially, this stemmed from its discovery in axoplasm (REFERENCE), which is rich in Golgi-derived transport vesicles, and its co-localization with vesicles in cultured cells (REFERENCE). [MBC8]

Step 2: Generalizing, claiming, deducing previous knowledge describes how the findings relate to the results of previous research: (30) The observation that BAD is inactivated by phosphorylation at Ser-155 has important implications for the understanding of the regulation of Bcl-2 family members. [MC7]

Move 13: Consolidating results highlights the strengths of the study and defends its importance. This move is realized through six steps: Step 1: Restating methodology: (31) In this study, we exploited primary culture to examine the impact that elevated K16 protein level has on a number of basic properties of skin keratinocytes. [MBC10]



Step 2: Stating selected findings: (32) We show that the essential Gpi11 and Gpi13 proteins are involved in late stages in the formation of the yeast GPIs, and we identify and characterize three new candidates GPI precursors. [MBC5]

Step 3: Referring to previous literature for comparison: (33) The experiments presented here confirm the previously reported data (REFERENCE), showing that …. [JBC4]

Step 4: Explaining differences in findings: (34) The advantages the Ku-X4-LIV complex confers upon ligation in vitro can therefore explain why these factors … are required for cellular end joining: ligation is fast and efficient, even at low enzyme … and in the presence of … unbroken DNA. [MBC5]

Step 5: Making overt claims/generalizations: (35) (Simply changing the CaaX motif … to a form recognized by Ftase significantly improved mGBP1 modification.) This result also indicates that the CaaX motif … is not likely to be buried within the structure of the protein, …. [MBC7]

Step 6: Exemplifying: (36) (Within the G88R RNase A variants, cytotoxicity correlates well with conformational stability (Fig.2).) For example, A4/G88/V118C Rnase has the highest Tm value of the five enzymes and is the most potent cytotoxin. [JBC6]

Move 14: Stating limitations of the present study makes explicit the scientists’ views of the limitations of the study with respect to the methodology, the findings, and/or the claims made based on the findings: (37) Additionally, some interactions may be too transient for detection by FRET. [MC1] (38) Our data do not enable us to rule out a requirement for additional, non-PMA-activated pathways in the activation of … splicing in primary T cells. [MBC1]

Move 15: Suggesting further research allows the scientists to offer recommendations for the course of future research by pinpointing particular research questions to be addressed or improvements in research methodology:


(39) Further analysis of the molecular basis of motor axon guidance in the limb may help to define two interrelated issues in the patterning of neuronal projections. …. [C7]

4 Coding moves in the corpus of biochemistry research articles

As mentioned in Chapter 2, the subjective nature of move identification presents a methodological challenge for corpus-based research, which requires a systematic identification and coding of all moves in the corpus (e.g., Crookes, 1986; Dudley-Evans, 1994a; Paltridge, 1994). Because of this subjectivity, two individuals analyzing the same text type may differ in ascribing move boundaries or in identifying the move type of each move (as in the studies by Nwogu, 1997, and Williams, 1999, on the Results section in medical research articles). Therefore, it was necessary to assess inter-coder reliability of move assignment for the present project, ensuring that move demarcation could be conducted consistently by different individuals, and that the framework for determining move type could be applied reliably.

In the present case, I evaluated the reliability of my own coding in comparison to the coding of an expert in the field of biochemistry: a PhD student at an American university who is also a faculty member in the School of Pharmacy at Silpakorn University in Thailand. Although the expert coder is not a native speaker of English, he clearly possesses extensive experience and expertise in reading academic research articles in the field of biochemistry. A two-hour training session for each section was conducted to explain the purpose of the task and to acquaint the coder with the use of the analytical framework (described in Section 2 and Table 4.1). Texts were segmented into moves, and the move type of each move was determined. Only one rhetorical move was ascribed to a segment of a text. Texts were not coded for steps. The list of steps constituting each move was used to facilitate the coder’s decision in ascribing moves; however, the step distinctions played no role in the subsequent analyses.

In the second stage of training, both raters coded four randomly selected texts representing the four conventional sections. We then went through each text to identify any coding disagreements. Differences in coding led to discussion and clarification of the criteria for coding assignments. Finally, the raters each independently coded 15 research articles (three articles from each of the five journals). Based on the independent coding by the author and the expert coder, inter-coder reliability was measured by the agreement rate (percentage agreement) and the kappa value. (The percentage agreement rate does not take into account chance agreement between two coders, whereas the kappa value does (Orwin, 1994).)
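The contrast between the two reliability measures can be made concrete with a short computation. The sketch below is illustrative only: it assumes that the two coders' decisions have already been aligned so that each text unit carries exactly one move-type label from each coder (the chapter does not specify how units were aligned), and the function names and the sample labels are invented for the example. It uses the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each coder's label frequencies.

```python
from collections import Counter


def percentage_agreement(coder_a, coder_b):
    """Percentage of units to which both coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_observed = percentage_agreement(coder_a, coder_b) / 100.0
    n = len(coder_a)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability that both coders pick the same label
    # if each labels at random according to their own label frequencies.
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used one and the same label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


if __name__ == "__main__":
    # Invented move-type labels for ten units of an Introduction (Moves 1-3).
    author = [1, 1, 2, 2, 3, 3, 3, 1, 2, 3]
    expert = [1, 1, 2, 3, 3, 3, 3, 1, 2, 2]
    print(percentage_agreement(author, expert))
    print(round(cohens_kappa(author, expert), 3))
```

In this invented sample the coders agree on eight of ten units, yet kappa is noticeably lower than the raw agreement rate because part of that agreement would be expected by chance with only three categories in play.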



Table 4.2  Summarized results of inter-coder reliability analysis

Section         Kappa    Agreement rate (%)
Introduction    .93      97.58
Methods         .81      96.35
Results         .88      93.02
Discussion      .88      93.02

Table 4.2 shows high overall inter-coder reliability as measured by both agreement rates and kappa values (see note 4). Moves in the Introduction section were more consistently and reliably identified than those in the other sections. In contrast, the Methods section displayed more divergence in move identification. However, there seemed to be no systematic pattern regarding divergences in move coding. The findings suggest the psychological reality of a move as a discourse unit that can be empirically investigated further.

5 Distribution of move types within texts from the biochemistry corpus

One major goal of move analysis is to identify the primary communicative function – the move type – of each statement in a text. Thus, when Introductions in biochemistry research articles are described as being composed of three move types, this means that every statement in the Introduction can be attributed to one of these three types. However, it is not the case that move types necessarily occur sequentially in a text. For example, an Introduction will not necessarily be composed of sentences belonging to Move Type 1 (Establishing a topic), followed by Move Type 2 (Preparing for the present study), followed by Move Type 3 (Introducing the present study). Rather, these three move types, and their associated communicative functions, can be interspersed throughout the Introduction. A move type represents a particular communicative function, and a text often switches from one move type (communicative function) to another and then back again to the first. Each of these text segments is coded as a separate move, resulting in the possibility of multiple moves representing a single move type. The following text excerpt illustrates how the language of an Introduction can be attributed to different moves:

4. According to Fleiss (as cited in Orwin, 1994), the interpretation of Cohen’s kappa is summarized as follows: k