Evolution in Health and Disease, second edition


Evolution in Health and Disease


Evolution in Health and Disease
Second Edition

Edited by

Stephen C. Stearns and Jacob C. Koella



Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford and New York, and in Auckland, Cape Town, Dar es Salaam, Hong Kong, Karachi, Kuala Lumpur, Madrid, Melbourne, Mexico City, Nairobi, New Delhi, Shanghai, Taipei, and Toronto, with offices in Argentina, Austria, Brazil, Chile, Czech Republic, France, Greece, Guatemala, Hungary, Italy, Japan, Poland, Portugal, Singapore, South Korea, Switzerland, Thailand, Turkey, Ukraine, and Vietnam.

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

Published in the United States by Oxford University Press Inc., New York

© Oxford University Press 2008. The moral rights of the authors have been asserted. Database right Oxford University Press (maker). This edition 2008; first edition 1999.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above. You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer.

British Library Cataloguing in Publication Data: data available.

Library of Congress Cataloging in Publication Data:
Evolution in health and disease / edited by Stephen C. Stearns and Jacob C. Koella.—2nd ed.
p.; cm. Includes bibliographical references and index.
ISBN 978–0–19–920745–9 (hardback: alk. paper)
ISBN 978–0–19–920746–6 (pbk.: alk. paper)
1. Medical genetics. 2. Human evolution. 3. Disease—Causes and theories of causation.
I. Stearns, S. C. (Stephen C.), 1946– II. Koella, Jacob C.
[DNLM: 1. Evolution, Molecular. 2. Disease. 3. Health. QU 475 E95 2008]
RB155.E96 2008 616'.042—dc22 2007033610

Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain on acid-free paper by Antony Rowe, Chippenham, Wiltshire

ISBN 978–0–19–920745–9 (Hbk.)
ISBN 978–0–19–920746–6 (Pbk.)

Preface to the Second Edition

This book surveys the ways in which evolutionary thought illuminates medical science. It is intended for a broad audience that includes medical students, graduate students in the biological sciences, medical and biological educators, medical and biological researchers, medical practitioners, and the interested public. Readers with backgrounds in biology or medicine should feel at home; those without such backgrounds may at times feel challenged. We have asked the authors to reduce jargon and introduce technical terms with clear definitions; often we have succeeded, but not always. Evolutionary thought illuminates medical issues from a fresh angle; it does not replace other approaches. It complements; it does not compete. The content of this edition is almost entirely new. Only in Chapters 1, 6, and 12 will you find passages that appeared in the first edition, and even those three chapters are almost completely rewritten. The contributors are also mostly new. Only 11 of the 61 scientists who contributed to the first edition are represented here. Close to 60% of the cited references were published after the first edition. This does not, however, mean that the topics covered have changed radically. Many of the important issues discussed in the first edition remain, here viewed with fresh perspective.

As in the first edition the book comes in five parts:

1. Introduction
2. The history and variation of human genes
3. Natural selection and evolutionary conflicts
4. Pathogens: resistance, virulence, variation, and emergence
5. Non-infectious and degenerative disease, including diseases associated with aging, cancer, nutrition, lifestyle, and metabolism

The chapter ending each part was commissioned as a survey of important issues not covered in chapters that precede it. By using this technique we hope to have achieved fairly complete coverage of at least the most important issues. Bringing evolutionary thought into medical research and practice helps to explain how many medical issues arose in the first place. It can also help to save many lives and to reduce much suffering. For those reasons we hope that the ideas presented here find a broad and sympathetic audience open to fresh approaches that do not sacrifice scientific rigor.

Stephen C. Stearns, New Haven, CT, USA
Jacob C. Koella, Ascot, UK
May 2007




List of Contributors


Part I Introduction


1 Introducing evolutionary thinking for medicine Stephen C. Stearns, Randolph M. Nesse, and David Haig


Introduction
Mismatched to modernity
Adaptation takes time: lactose tolerance
Birth control and cancer risk
Early-life events with late-life consequences
Parasite load and autoimmune disease
Infection
Resistance
Virulence
Emerging diseases
Reproduction
Evolved conflicts between mother and offspring
Evolved conflicts between mother and father
Spontaneous abortions and complementary immune genes
Populations have histories
Evolutionary technologies
Phylogenetic reconstructions
Attenuated live vaccines
The nature of evolutionary explanations
Microevolution, macroevolution, and development
Mechanistic and evolutionary explanations
Natural selection
How selection works
Fitness is relative reproductive success
Natural selection has several components: individual, sexual, and kin selection
Traits do not evolve for the good of the species
Random events and neutral variation: how neutral evolution works
Trade-offs
Macroevolution
Relationships and fossils reveal history
Constraints: eyes and tubes




Conclusion
Health, fitness, and the pursuit of happiness
Human diversity
Implications for medical practice, research, and education
What doctors need to know about evolution and why

Part II The history and variation of human genes


2 Global spatial patterns of infectious diseases and human evolution Jean-François Guégan, Franck Prugnolle, and Frédéric Thomas


Introduction
Geographical aspects of human diseases
Latitude and the species diversity of human pathogens
Longitude and the species diversity of human pathogens
Latitude and the nested pattern of human pathogens
Latitude and the geographical range of human pathogens
Geographical area and the species diversity of human pathogens
Historical patterns of the distribution of disease
Pathogen distribution and human genetic evolution
Pathogen distribution and human genetic evolution: the case of sickle cell disease
Variations in pathogen diversity and human genetic evolution: the HLA genes
Infectious diseases and human life-history traits
Human fertility and the species diversity of human pathogens
Human birthweight and the species diversity of human pathogens
Human behavior and culture, and the species diversity of human pathogens
Summary
Acknowledgments

3 Medically relevant variation in the human genome Diddahally R. Govindaraju and Lynn B. Jorde

Introduction
Molecular markers
Microsatellites
Single nucleotide polymorphisms (SNPs)
Haplotypes
Determination of haplotypes
Linkage disequilibrium, recombination and haplotype blocks
Linkage disequilibrium
Recombination and recombination hotspots
The structured genome—haplotype blocks
TagSNPs
The HapMap project
Background
Findings
Structural variation
Inference of evolutionary processes
Natural selection


Genetic drift
Admixture
Causal SNPs and the magnitude of their effects
Summary
Acknowledgments

4 Health consequences of ecogenetic variation Michael Bamshad and Arno G. Motulsky

Introduction
Genetic basis of variation in drug metabolism and response
Genetic basis of monogenic drug reactions
Genetic basis of complex pharmacogenetic traits
Genetic basis of chemosensory perception and food preferences
Bitter taste sensitivity
Sweet and umami taste sensitivity
Lactase persistence
The structure of human populations
Correspondence between race and population structure
Race as a proxy for genetic ancestry
Conclusions
Acknowledgments

5 Human genetic variation of medical significance Kenneth K. Kidd and Judith R. Kidd

Introduction
The pattern of human genetic variation
The amount and nature of human genetic variation
The human expansion out of Africa
The impact of genetic variants—or lack of it
The role of selection
The impact of population bottlenecks on genetic patterns
Disease can cause bottlenecks
Migration out of Africa
Complex disease and evolution
Genetic influences on alcoholism
Variation in ethanol metabolism and alcoholism
Variation in taste perception and alcohol dependence
Summary
Acknowledgments

Part III Natural selection and evolutionary conflicts


6 Intimate relations: Evolutionary conflicts of pregnancy and childhood David Haig


Introduction
Parental justice



Internal conflicts
Credibility problems
Pregnancy termination
Menstruation
Selective abortion
Gestation length
Infanticide
Maternal circulation
Preeclampsia
Growth
Fat
Brains and bodies
Intergenerational conflicts
Summary
Acknowledgments

7 How hormones mediate trade-offs in human health and disease Richard G. Bribiescas and Peter T. Ellison

Introduction: Hormones, life history, evolution, and health
Hormones and trade-offs
Hormones, population variation, and phenotypic plasticity
Hormones and trade-offs in males
Androgens and fetal development
Childhood quiescence
Adolescent development, morbidity, and mortality
What are the benefits of testosterone in adult males?
Testosterone and somatic investment
Testosterone and immune function
Fatherhood and paternal investment
The aging male
Hormones and female reproductive trade-offs
Constraints on female reproductive success
Birthweight and infant survival
Parturition
Lactation and birth spacing
The resumption of ovarian cycling
Waiting time to conception
The timing of conception and human reproductive seasonality
Age and female fecundity
Contemporary medical implications
Metabolic syndrome
Cancer
Hormonal supplementation
Hormonal caveats
Summary


8 Functional significance of MHC variation in mate choice, reproductive outcome, and disease risk Dagan A. Loisel, Susan C. Alberts, and Carole Ober

Introduction
Genes of the major histocompatibility complex
Form and function of MHC molecules
Evolution of MHC genes
Pathogen-mediated selection on MHC genes
Sexual selection on MHC genes
MHC-mediated mate choice in non-human vertebrates
Role of the MHC in human mate choice
Evolutionary implications of MHC-mediated mate choice
MHC-linked olfactory cues
Influence of MHC peptide-binding region on odor
MHC peptide ligands as olfactory cues
Detection of MHC-mediated odors
Peptide binding as an integrating principle in MHC evolution
MHC and reproductive outcome
MHC sharing and reproduction in outbred human populations
MHC sharing and reproductive outcome in an unselected population
MHC sharing, reproduction, and diversity
HLA-G in reproduction, immune regulation, and disease
HLA-G in reproductive, autoimmune, and inflammatory pathologies
Evolution of HLA-G
The cost of protection: non-adaptive consequences of MHC diversity
Conclusions
Summary
Acknowledgments

9 Perspectives on human health and disease from evolutionary and behavioral ecology Beverly I. Strassmann and Ruth Mace

Introduction
Phenotypic plasticity
Kin selection theory
Step-parents
Adoption
Life-history theory
Trade-offs
Offspring number versus quality
Parental effort versus longevity
Menopause and the post-reproductive life span
Parental investment theory
Infanticide
Sex ratios



Sexual selection theory
Higher mortality of males than females
Sexual jealousy and genital cutting
Summary
Acknowledgments

Part IV Pathogens: resistance, virulence, variation, and emergence


10 The ecology and evolution of antibiotic-resistant bacteria Carl T. Bergstrom and Michael Feldgarden


Introduction
History of clinical antibiotic resistance
Genetic mechanisms
Point mutations
Homologous recombination
Heterologous recombination
Natural ecology
Soil ecology
Agricultural use
Hospital transmission
Population genetics
Linked genes
Compensatory mutation
Applying evolution: approaches for the future
Predicting resistance evolution
Narrow spectrum antibiotics
Bacteriocins
Quorum sensing disruptors
Ecological modeling
Antibiotic cycling
Conclusions
Summary
Acknowledgments

11 Pathogen evolution in a vaccinated world Andrew F. Read and Margaret J. Mackinnon

Introduction
Vaccines have consequences for pathogen evolution
Hepatitis B
Pertussis
Pneumococcal disease
Diphtheria
Malaria
Avian influenza
Marek's disease
Infectious bursal disease (IBD)
Thus, vaccines are not evolution-proof


Why has vaccination worked despite evolution?
Not all infectious diseases are alike
Is it too soon to be confident?
Pathogen adaptation in vaccinated populations
Epitope evolution
Virulence adaptation
Other possible vaccine-adapted phenotypes
The health consequences of vaccine-adapted pathogens
Predicting evolution
Watching evolution
Coda
Summary
Acknowledgments

12 The evolution and expression of virulence Dieter Ebert and James J. Bull

Introduction
Outline of this chapter
Defining virulence
Artificial virulence evolution and live vaccines
The three phases of the evolution of infectious diseases
Phase 1: Accidental infections
Phase 2: Evolution of virulence soon after successful invasion
Phase 3: The evolution of optimal virulence
Mechanisms of virulence remain to be considered
Variation of hosts impacts the expression and evolution of virulence
Virulence has a direct benefit for the parasite
Can we manage the evolution of virulence?
Summary
Acknowledgments

13 Evolutionary origins of diversity in human viruses Paul M. Sharp, Elizabeth Bailes, and Louise V. Wain

Introduction
Origins of human viruses
Origins of diversity within human viruses
Herpesviruses
AIDS viruses
Influenza A viruses
Dengue viruses
Comparisons among viruses
Summary

14 The population structure of pathogenic bacteria Daniel Dykhuizen and Awdhesh Kalia

Introduction



Population structure
Clonality versus panmixia
Population structure and disease type
Population structure and clonality
Population structure and genetic variation
Effective population size
Effective population size determined by infection dynamics
Helicobacter pylori
Geographic variation
Infection dynamics
Infection inoculum
Mutation rate
Recombination in H. pylori
Selective sweeps
Role of diversifying selection in maintaining nucleotide diversity
Expectation of high genetic variability in H. pylori
Streptococcus pyogenes
GAS epidemiology
Clonal expansion
Recombination
Why is there clonal expansion?
Interspecies horizontal gene transfer
Expectation of moderate genetic variability in GAS
Salmonella typhi
The last common ancestor of S. typhi
The origin of S. typhi
Carrier numbers determine Ne
Further considerations
Mutation
Inoculation size
Recombination
Selective sweeps
Diversifying selection
Species introgression and high diversity
Summary
Acknowledgments

15 Whole-genome analysis of pathogen evolution Julian Parkhill

Introduction
Long-term evolution of pathogens
Horizontal exchange of genes
Mechanisms of gene exchange
Core and accessory genomes
Pathogenicity islands
Plasmids
Bacteriophage


Homologous recombination
Short-term evolution of pathogens
Yersinia pestis
Bordetella pertussis
Stochastic variation/hypermutability
Phase variation
Simple sequence repeats
DNA inversion
Genomic discovery of phase variable genes
Identification of rapid variation by genomic sampling
Campylobacter jejuni—simple sequence repeats
Bacteroides fragilis—DNA inversion
Phase-variable restriction modification systems
Summary
Acknowledgments

16 Emergence of new infectious diseases Mark Woolhouse and Rustom Antia

Introduction
Which diseases emerge?
Diversity of pathogens
Characteristics of emerging pathogens
Disease emergence as a biological process
The pathogen pyramid
Role of ecology
Role of evolution
Examples of emerging infectious diseases
HIV/AIDS
Influenza
SARS
Ebola
Monkeypox
Practical implications of disease emergence
Predicting pathogen emergence
Public health response
Summary

17 Evolution of parasites Jacob C. Koella and Paul Turner

Virulence and transmission in public health and evolution
The evolution of virulence in control programs
General considerations
Vaccines
Drug treatment
The problem of virulence
The problem of the trade-off



Beyond the trade-off model Coevolution Emerging diseases A molecular and an experimental approach to the evolution of parasites Summary Acknowledgments

234 234 234 235 236 237

Part V Noninfectious and degenerative disease


18 Evolutionary biology as a foundation for studying aging and aging-related disease Martin Ackermann and Scott D. Pletcher


Introduction
Defining and measuring aging
The canonical evolutionary models of aging
Evolutionary genetics of aging
Predictions of the evolutionary models
Molecular mechanisms of aging
Dietary restriction
Conserved pathways that influence the rate of aging
Merging molecular mechanisms with evolutionary theory
Adaptive responses to environmental signals
Going beyond traditional evolutionary models of aging
New directions
Concluding remarks
Summary
Acknowledgments

19 Evolution, developmental plasticity, and metabolic disease Christopher W. Kuzawa, Peter D. Gluckman, Mark A. Hanson, and Alan S. Beedle

Introduction: diseases of excess or deficiency?
The developmental origins of health and disease (DOHaD) paradigm
Origin
Evidence from epidemiology
Experimental evidence
Epigenetic mechanisms
An integrated response to developmental cues
Reduced body size and lean mass
Muscle becomes insulin-resistant
Fat deposition is enhanced in highly labile visceral depots
Stress responses and reactivity are accentuated
A developmental and evolutionary synthesis
Anticipating the future from maternal cues: predictive developmental plasticity
Medical and public health implications
Policy implications
Summary
Acknowledgments


20 Lifestyle, diet, and disease: comparative perspectives on the determinants of chronic health risks William R. Leonard

Introduction
Evolutionary energetics
Influence of lifestyle change on daily energy expenditure
How changes in lifestyle influence energy intake and diet composition
Energy intake
Macronutrient composition
Health consequences of energy and nutritional imbalances
Obesity
Chronic metabolic disorders
Summary
Acknowledgments

21 Cancer: evolutionary origins of vulnerability Mel Greaves

Introduction: a risky species?
Evolutionary basis of vulnerability to cancer
The proximate mechanisms
Some evolutionary ground rules
Lack of perfection in evolutionary engineering
Evolutionary adaptation has 'no eyes to the future'
Skin cancer
Breast and prostate cancer
Childhood leukemia
The inevitability of natural selection
The only evolutionary currency is reproductive success
Implications
Summary
Acknowledgments

22 Cancer as a microevolutionary process Natalia L. Komarova and Dominik Wodarz

The concept of somatic evolution as a way of thinking about cancer
Cancer: a disease of the DNA
Cancer initiation and chromosomal instability
Multistage carcinogenesis of colon cancer
Genetic instability
Quantitative model of chromosome loss and genetic instability
Optimal (for cancer) rate of chromosome loss
What is the reason for CIN?
Chronic myeloid leukemia (CML) and resistance against small molecule inhibitors
Some facts about CML treatment
The computational strategy



Emergence of resistant cells
Cancer turnover and the emergence of resistance
Combination therapy: strategies to prevent resistance
Conclusions

23 The evolutionary context of human aging and degenerative disease Steven N. Austad and Caleb E. Finch


Introduction
Aging as a by-product of selection for reproductive performance
Genes and aging
Apolipoprotein E
Growth hormone/insulin/insulin-like growth factor 1
Summary
Acknowledgments






List of Contributors

Martin Ackermann, ETH Zürich, Institute of Integrative Biology, Universitätsstrasse 16, ETH Zentrum, CHN J12.1, CH-8092 Zürich, Switzerland. [email protected]
Susan C. Alberts, Department of Biology, Duke University, Box 90338, Durham, NC 27708 USA. [email protected]
Rustom Antia, Emory University, Department of Biology, 1510 Clifton Road, Atlanta, GA 30322 USA. [email protected]
Steven N. Austad, University of Texas Health Science Center, STCBM Building, Room 3.100, Department of Cellular & Structural Biology, Barshop Center for Longevity and Aging Studies, 15355 Lambda Drive, San Antonio, TX 78245 USA. [email protected]
Elizabeth Bailes, Institute of Genetics, University of Nottingham, Queens Medical Centre, Nottingham NG7 2UK, UK.
Michael Bamshad, Department of Pediatrics, University of Washington School of Medicine, Box 356320, 1959 NE Pacific Street, HSB RR349, Seattle, WA 98195 USA. [email protected]
Alan S. Beedle, The Liggins Institute, The Faculty of Medical and Health Sciences, The University of Auckland, Private Bag 92019, Auckland, New Zealand. [email protected]
Carl T. Bergstrom, Department of Biology, University of Washington, Box 351800, Seattle, WA 98195–1800 USA. [email protected]
Richard G. Bribiescas, Department of Anthropology, Yale University, P.O. Box 208277, New Haven, CT 06520–8277 USA. [email protected]
James J. Bull, The University of Texas at Austin, School of Biological Sciences, ESB 2, 1 University Station, A6500, Austin, TX 78712 USA. [email protected]
Daniel E. Dykhuizen, Department of Biology, University of Louisville, Louisville, KY 40292 USA. [email protected]
Dieter Ebert, Zoology Institute, University of Basel, Vesalgasse 1, CH-4051 Basel, Switzerland. [email protected]
Peter T. Ellison, Department of Anthropology, Peabody Museum, Harvard University, Cambridge, MA 02138 USA. [email protected]
Michael Feldgarden, Alliance for the Prudent Use of Antibiotics, 75 Kneeland St., 2nd floor, Boston, MA 02111 USA. [email protected]
Caleb E. Finch, Leonard Davis School of Gerontology, University of Southern California, 1975 Zonal Avenue, KAM-110, Los Angeles, California 90089–9023 USA. [email protected]
Peter D. Gluckman, The Liggins Institute, The Faculty of Medical and Health Sciences, The University of Auckland, Private Bag 92019, Auckland, New Zealand. [email protected]
Diddahally R. Govindaraju, Department of Neurology, Boston University School of Medicine, 715 Albany Street E-306, Boston, MA 02118–2526, USA. [email protected]
Mel Greaves, Section of Haemato-Oncology, Institute of Cancer Research, Sutton, SM2 5NG UK. [email protected]
Jean-François Guégan, Génétique & Evolution des Maladies Infectieuses, UMR-2724 CNRS-IRD, Centre IRD de Montpellier, 911 avenue Agropolis, BP 64501, F-34394 Montpellier cedex 5, France. [email protected]
David Haig, Department of Organismic and Evolutionary Biology, Harvard University, 26 Oxford Street, Cambridge, MA 02138 USA. [email protected]
Mark A. Hanson, Center for Developmental Origins of Health and Disease, Institute of Developmental Sciences, University of Southampton, Mailpoint 887, Southampton General Hospital, SO16 6YD UK. [email protected]
Lynn B. Jorde, Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, 15 North 2030 East, Room 2100, Salt Lake City, UT 84112–5330 USA. [email protected]
Awdhesh Kalia, Department of Biology, University of Louisville, Louisville, KY 40292 USA. [email protected]
Judith R. Kidd, Department of Genetics, Yale University School of Medicine, 333 Cedar Street, P.O. Box 208005, New Haven, CT 06520–8005 USA. [email protected]
Kenneth K. Kidd, Department of Genetics, Yale University School of Medicine, 333 Cedar Street, P.O. Box 208005, New Haven, CT 06520–8005 USA. [email protected]
Jacob C. Koella, Department of Biological Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY UK. [email protected]
Natalia L. Komarova, Department of Mathematics, University of California-Irvine, Irvine, CA 92697 USA. [email protected]
Christopher W. Kuzawa, Department of Anthropology, Northwestern University, 1810 Hinman Avenue, Evanston, IL 60208 USA. [email protected]
William R. Leonard, Department of Anthropology, Northwestern University, 1810 Hinman Avenue, Evanston, IL 60208 USA. [email protected]
Dagan A. Loisel, Department of Biology, Duke University, Box 90338, Durham, NC 27708 USA. [email protected]
Ruth Mace, Department of Anthropology, University College London, London, WC1E 6BT UK. [email protected]
Margaret J. Mackinnon, Department of Pathology, Cambridge University, Tennis Court Road, Cambridge CB2 1QP UK, and Wellcome Trust/Kenya Medical Research Institute Collaborative Programme, Kilifi, Kenya. [email protected]
Arno G. Motulsky, University of Washington, Departments of Medicine (Medical Genetics) and Genome Sciences, Box 355065, 1705 NE Pacific St., Seattle, WA 98195–5065 USA. [email protected]
Randolph M. Nesse, The University of Michigan, 426 Thompson St., Room 5261, Ann Arbor, MI 48104 USA. [email protected]
Carole Ober, Department of Human Genetics, 920 E. 58th St., CLSC 507, Chicago, IL 60637 USA. [email protected]
Julian Parkhill, The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA UK. [email protected]
Scott D. Pletcher, Huffington Center on Aging and Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, N803, Houston, TX 77030 USA. [email protected]
Franck Prugnolle, Génétique & Evolution des Maladies Infectieuses, UMR-2724 CNRS-IRD, Centre IRD de Montpellier, 911 avenue Agropolis, BP 64501, F-34394 Montpellier cedex 5, France. [email protected]
Andrew F. Read, Institutes of Evolution, Immunology and Infection Research, School of Biological Sciences, Ashworth Laboratories, The King's Buildings, University of Edinburgh, West Mains Road, Edinburgh, EH9 3JT Scotland, UK. [email protected]
Paul M. Sharp, Institute of Evolutionary Biology, University of Edinburgh, Ashworth Laboratories, Kings Buildings, Edinburgh, EH9 3JT UK. [email protected]
Stephen C. Stearns, Department of Ecology and Evolutionary Biology, Yale University, Box 208106, New Haven, CT 06520–8106 USA. [email protected]
Beverly I. Strassmann, Institute for Social Research & Department of Anthropology, 101 West Hall, 1085 S. University Avenue, Ann Arbor, MI 48109–1107 USA. [email protected]
Frédéric Thomas, Génétique & Evolution des Maladies Infectieuses, UMR-2724 CNRS-IRD, Centre IRD de Montpellier, 911 avenue Agropolis, BP 64501, F-34394 Montpellier cedex 5, France. [email protected]
Paul E. Turner, Department of Ecology and Evolutionary Biology, Yale University, Box 208106, New Haven, CT 06520–8106 USA. [email protected]
Louise V. Wain, Institute of Genetics, University of Nottingham, Queens Medical Centre, Nottingham NG7 2UK, UK.
Dominik Wodarz, Department of Ecology and Evolution, University of California-Irvine, Irvine, CA 92697 USA. [email protected]
Mark Woolhouse, Centre for Infectious Diseases, University of Edinburgh, Ashworth Laboratories, Kings Buildings, West Mains Rd, Edinburgh EH9 3JT UK. [email protected]



Introducing evolutionary thinking for medicine Stephen C. Stearns, Randolph M. Nesse, and David Haig

Introduction

Should doctors and medical researchers think about evolution? Does it bring useful insights? Would doctors and researchers who learned a substantial amount about evolution be more effective than a control group that learned only the usual rudiments? Would providing such education improve health enough to justify the cost? Positive answers to these questions would have profound implications for medical education, research funding, and the future of human health. To address them, we start with examples of significant evolutionary insights into serious medical issues. We then describe the principles of evolutionary biology that produce these insights. We conclude with a summary of what doctors should know about evolution.

At the outset we acknowledge that much medical practice proceeds just fine with little need for a theoretical foundation. Medicine is a profession that offers practical help. Surgeons need to know how the organism is constructed, how it works, and what procedures work best; knowledge about how and why it evolved does not help in performing an operation. For internists, pediatricians, epidemiologists, and geneticists, evolution is more often of practical concern. Evolutionary thinking provides insight and saves lives when one is prescribing antibiotics, managing virulent diseases, administering vaccinations, advising couples who have difficulty conceiving and carrying offspring to term, treating the diabetes and high blood pressure of pregnancy, treating cancer, understanding the origins of the current epidemics of obesity, diabetes, and autoimmune diseases, and answering patients' questions about aging.

Evolution is not an alternative to existing medical training and research. It is a useful basic science that poses new medical questions, contributing to research while also improving practice. We now present some significant evolutionary insights into medical issues. The first is that our evolved state is often mismatched to our modern environment because that environment is changing more rapidly than we can adapt to it.

Mismatched to modernity

Adaptation takes time: lactose tolerance

That it takes time for a population to adapt to environmental change is illustrated by the absorption of milk sugar, lactose, by adults (Simoons 1978; Durham 1991; Mace et al. 2003). Like other mammals, human females provide their children with the enzymes needed to digest lactose in their milk. A minority of us now has the ability to digest fresh milk into adulthood, including populations in Europe, western India, and sub-Saharan Africa. The ancestral human condition was the inability to digest fresh milk after being weaned; the new, recently evolved condition is the ability to do so. How long would it take that ability to evolve? The ability to digest fresh milk after weaning behaves as a single dominant autosomal gene, and dominant genes increase in frequency under selection more rapidly than do recessive genes.



Individuals without lactase who drink milk suffer from flatulence, intestinal cramps, diarrhea, nausea, and vomiting. A mutation for lactose tolerance had an advantage for herding peoples who could use milk from their animals. Selection for lactase activity could have been particularly strong during serious famines. If the ability to absorb lactose conferred a selective advantage of 5%, how long would it take to increase from a frequency of 1% to a frequency of 90%? The answer is about 325 generations, or roughly 8000 years (Crow and Kimura 1970). If adults have drunk milk for only 8000 years, then it must have conferred substantial benefits for selection to increase it so quickly to its current high frequency in northern Europe. Even for a gene under strong selection—and a 5% advantage is strong selection—time is a constraint. The lactose example suggests that it is quite plausible that we are mismatched to modernity.
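The figure of about 325 generations can be checked with a simple deterministic recursion for a dominant advantageous allele (a textbook population-genetic model, not a calculation from this chapter; the 25-year generation time is an assumption):

```python
# Deterministic spread of a dominant advantageous allele A.
# Fitnesses: AA and Aa have 1 + s, aa has 1 (lactase persistence is dominant).
def generations_to_spread(p=0.01, target=0.90, s=0.05):
    gens = 0
    while p < target:
        q = 1.0 - p
        mean_fitness = (1 + s) * (1 - q * q) + q * q  # carriers have freq 1 - q^2
        p = p * (1 + s) / mean_fitness                # p^2 + pq = p, so p' = p(1+s)/w
        gens += 1
    return gens

gens = generations_to_spread()
print(gens, "generations, about", gens * 25, "years")  # roughly 300 to 330 generations
```

The deterministic model ignores drift and assumes constant selection, so the answer is an order-of-magnitude check rather than a precise prediction; it lands close to the cited 325 generations.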

Birth control and cancer risk

Women in cultures without contraception and with normal birth intervals of two and a half years because of breastfeeding have about 100 menses per lifetime; in postindustrial cultures women have up to 400 cycles per lifetime (Strassmann 1997). Women who are nearly perennially cycling experience increased cell divisions, which put them at risk for breast cancer (Strassmann 1999). In the 1990s, breast cancer rates, for example, were 20–30 per 100,000 for females of all ages in Colombia, Costa Rica, and Ecuador, and 100–150 per 100,000 for females of all ages in the USA and Western Europe (International Agency for Research on Cancer, http://www-dep.iarc.fr)—just about five times higher. Women who experience first birth at a young age and who spend most of their reproductive years pregnant or in lactational amenorrhea (a time when the ovaries shut down during breastfeeding) have demonstrably lower breast cancer rates. Although we do not recommend a return to this reproductive pattern, it is clear that Western women are experiencing too much endogenous hormone exposure and that this exposure comes from women’s own ovaries rather than from external environmental sources. Contraceptives need not induce a monthly period. Hopefully a solution can be found that gives women
the right level of estrogen for maintaining bone strength and avoiding osteoporosis while avoiding the risks of cancer. The first step, however, is to recognize that there is nothing biologically normal about the regular monthly period. Too many menses are harmful because they increase cancer risk, but merely suppressing them without appropriate adjustments in hormone exposure to protect against osteoporosis might not, on average, help.

Early-life events with late-life consequences

Low-birthweight infants are at higher risk of becoming obese and developing diabetes, high blood pressure, and atherosclerosis later in life. Early nutritional stress is a signal whose evolved response sets the individual on a special developmental course with a physiology effective for conserving energy but ill-prepared for abundant food (Barker et al. 2002). Obesity rates have risen threefold or more since 1980 in many countries, both industrialized and developing, with the rate of increase often faster in developing countries. While agencies like the WHO ascribe the worldwide obesity epidemic solely to increased food consumption and decreased physical activity (http://www.who.int/dietphysicalactivity/publications/facts/obesity), the mismatch between early- and late-life nutritional status also contributes, rendering those born in poverty and growing into plenty especially vulnerable.

Parasite load and autoimmune disease

In the environment in which we evolved, we were frequently exposed to severe, persistent infections; most people carried parasitic worms most of the time. Worms, which inhabit their hosts for many years, evolved to down-regulate host immune responses to enhance their survival and persistence in the host. In so doing they reduced our susceptibility to autoimmune diseases by reducing the overall production of antibodies, a small percentage of which leak through our surveillance systems to react with self. Our environment is now so antiseptic that few have worms and few adults die from infection, but many have autoimmune diseases that are becoming much more common now that
children rarely have parasites. Some doctors are successfully treating autoimmune disease by injecting preparations of the coats of parasitic worms, activating an inhibitory arm of the immune system suppressed in modern populations (Michaeli et al. 1972). Gabonese schoolchildren with schistosomiasis have fewer allergic reactions to dust mites, and Ethiopian, Brazilian, Venezuelan, and Gambian adults have less asthma when infected with nematodes (Wilson and Maizels 2004). This idea helps to explain the current epidemics of asthma, type I diabetes, and even leukemia (Greaves 2000; Wilson and Maizels 2004). It may take hundreds of generations for evolution to bring the screening mechanisms of our immune systems, located in the thymus and bone marrow, into equilibrium with the cleanliness of modern environments.

Infection

Resistance

Most doctors and many patients recognize antibiotic resistance as an example of rapid evolution. When it evolves at all, antibiotic resistance evolves much faster than we can evolve defenses. Much work remains to understand why some bacteria remain susceptible, such as Streptococcus to penicillin, while others escape a new antibiotic in just a few years. Part of the answer is that bacteria and viruses do not always have to wait for mutations; many receive resistance genes from other pathogens (Lederberg 1998). Another part of the answer is that most antibiotics, created by selection during millions of years of competition between bacteria, are weapons against which some bacteria have already evolved effective responses (D’Costa et al. 2006). The same principles that govern the evolution of antibiotic resistance apply also to cancer chemotherapy, where resistant cell lines displace others. Triple chemotherapy for cancer is effective for the same reasons that triple antibiotic therapy is now routine for tuberculosis.

Virulence

Virulence—the ability of a pathogen to cause morbidity and mortality—is also shaped dynamically
by natural selection. It increases when infection spreads easily—by mosquitoes, fleas, lice, hands, or needles—and when pathogens compete with other pathogen strains within a host. Peaceful coexistence with the host occurs only when it benefits both parties. If the illness or death of the host increases the chances that the pathogen will be transmitted, the pathogen will evolve greater virulence. Genes that influence virulence do not need to arise by mutation; the viruses that integrate into bacterial genomes transmit virulence genes among bacteria. These include the toxin genes of cholera, botulism, diphtheria, and scarlet fever (Waldor 1998). Plasmids, small circular genomes that inhabit bacterial cytoplasm and can induce their hosts to conjugate (have bacterial sex), also transmit virulence genes among bacteria. Thus much of the information that a bacterium needs to become more virulent evolved long ago, now exists in pre-packaged modules, and is mobile.

Emerging diseases

New diseases that emerge from other species can persist and spread in humans only if they evolve changes that allow them to enter, survive, reproduce in, and be transmitted from the new host. Without these evolutionary steps, SARS and avian flu would not be threats: to evaluate such threats, we need to understand their evolution. For some diseases, including AIDS, introduction into human hosts, by whatever route, starts the process moving. The implications for organ transplantation from other species are obvious and serious.

Reproduction

Evolved conflicts between mother and offspring

The mother is equally interested in the success of each of her offspring, for she shares exactly half her genes with each of them. The fetus, however, has evolutionary interests that differ from its mother’s with respect to its siblings, because it ‘shares’ all of its genes with itself but only some of its genes with its siblings. Thus there is a conflict between the genes in the mother and the genes in the fetus over how much the mother invests in the fetus
(Trivers 1974; Burt and Trivers 2006), and the fetus is equipped with placental morphology and endocrine function to manipulate the physiological state of the mother to its benefit. By-products of this evolutionary conflict include increased maternal blood pressure (pre-eclampsia) and diabetes (Haig 1993).

Evolved conflicts between mother and father

The paths to reproductive success of fathers and mothers differ fundamentally. The reproductive success of a mother depends on the number of children she bears in her lifetime. The reproductive success of a father depends on the number of times he mates successfully per lifetime. Starkly put, he can father a child on this female, then go off and father another on a different female, leaving her to raise his child. This asymmetry in reproductive opportunities is ancient, predating the origin of humans by hundreds of millions of years, and we may have inherited its consequences from ancestor species. Because of this asymmetry, genes from the father have been selected to manipulate the mother to provide more nutrition to the current fetus than she has been selected to give, while genes from the mother counter this manipulation to reserve resources for her survival and her future offspring, which she may have by other males (Haig 1992). Such manipulations are possible because of a process called germ-line imprinting that inactivates some genes during early fetal development when they come through the father, and other genes when they come through the mother. Genetic imprinting may also explain the genetic component of several serious diseases, including autism and schizophrenia. It is also a major impediment to cloning.

Spontaneous abortions and complementary immune genes

Early spontaneous abortions are especially common in women whose fetuses are immunologically deficient because their parents share the same versions of one or more major histocompatibility complex (MHC) genes. The immune systems of such fetuses cannot produce the recombinant antibody diversity needed to counter rapidly evolving pathogens and, if carried to term, they would be poor at resisting
infection as infants. Remarkably, the female reproductive tract can identify and discard such fetuses at a very early stage (Ober 1992) when they have not yet cost the mother much time or energy, freeing her to try again, perhaps with a different mate. Repeated spontaneous abortions are both emotionally and evolutionarily costly, and avoiding them would be advantageous. Intriguingly, humans tend to choose mates whose MHC alleles differ from their own (Wedekind et al. 1995; Ober et al. 1997), using mechanisms not yet fully understood. The existence of this process suggests two things about the ancestral environment in which it was selected. We then lived in small, inbred groups where the probability of encountering a mate with the same MHC alleles was significant. And infectious disease then accounted for a significant portion of infant and child mortality, as it still does in much of the world.

Populations have histories

Human populations have diverged genetically since we emerged from Africa about 100,000 years ago, and nearly every human individual has a unique genome and has had a unique developmental history of environmental interactions. As we colonized the planet, each branch of our family tree encountered different pathogens and different diets, and those pathogens and diets left their traces on our innate abilities to resist disease and metabolize drugs. As a result genetic diseases vary among populations of different geographical origin and ethnicity. Doctors practicing in South Africa, in Quebec, or on Pitcairn Island need to be aware of the high incidences of certain genetic diseases frequent in those populations but not in others because each of them was founded by a small group of people in which those genetic defects just happened to be relatively frequent. Not all genetic diseases found at unusually high frequency in specific ethnic groups are the result of such founder events. Some confer disease resistance when present as heterozygotes, such as sickle-cell anemia and glucose-6-phosphate dehydrogenase (G6PD) deficiency, which confer resistance to malaria. In other cases such connections are suspected but not yet well established: Tay-Sachs
disease, carried by up to 11% of Ashkenazi Jews, is thought to confer resistance to tuberculosis; cystic fibrosis is thought to confer resistance to cholera; phenylketonuria to fungal toxins implicated in spontaneous abortions. Genetic susceptibility to risk factors associated with circulatory disease also varies geographically. For example, people whose ethnic origin is closer to the equator are at higher risk of suffering from high blood pressure (Young et al. 2005), and susceptibility to smoking, cholesterol, and obesity is influenced by interactions among at least five genes each of which exists in several variants. Certain combinations of these variants are associated with much greater susceptibility; others with much less. This is crucial practical information for cardiac prevention.

Evolutionary technologies

Evolutionary biology has also produced technologies with medical applications. Two are particularly important: the new methods of inferring relationships and history using phylogenetic reconstruction, and the production of live attenuated vaccines through serial transfer.

Phylogenetic reconstructions

The phylogenetic methods developed to reconstruct relationships among species, and thus the history of life, have been used on RNA sequences recovered from HIV infections: they identified the Florida dentist who infected his patients (Crandall 1995) and the sailor who introduced AIDS to Sweden, and they also showed that routine dental care does not transmit HIV (Jaffe et al. 1994). The same methods reveal that smallpox exists in three major lineages, one from West Africa, one from South America, and one from Asia. If smallpox is ever used as a biological weapon, knowing the strain will be crucial to developing the correct vaccine.
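The core idea behind distance-based phylogenetic reconstruction can be sketched in a few lines: sequences with fewer differences are inferred to share a more recent common ancestor. The sequences and labels below are invented for illustration; real analyses use methods such as neighbor-joining or maximum likelihood, but they start from the same kind of pairwise-distance information.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy aligned sequences (hypothetical strains, not real data).
strains = {
    "patient_1": "ACGTACGTAC",
    "patient_2": "ACGTACGTAA",  # 1 difference from patient_1
    "outgroup":  "TGCTACCTAC",  # 4 differences from patient_1
}

# Pairwise distance matrix.
dists = {
    (i, j): hamming(strains[i], strains[j])
    for i, j in combinations(strains, 2)
}

# The closest pair is the first candidate for joining into a clade.
closest = min(dists, key=dists.get)
print(closest)  # ('patient_1', 'patient_2')
```

In a real investigation, such as the Florida dentist case, the inference is the same in spirit: viral sequences from epidemiologically linked individuals cluster together relative to sequences from the wider population.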

Attenuated live vaccines

Serial transfer is used to produce attenuated live vaccines, which are evolved by passing human pathogens through several generations of culture
on tissues from other species. As they evolve to specialize genetically on the new host, they lose most of their virulence in humans. Every time this procedure succeeds—as it has for the oral polio and typhoid vaccines—it demonstrates the evolutionary principle that a jack of all trades is a master of none. We now discuss the other basic evolutionary principles that inform the examples presented above.

The nature of evolutionary explanations

Microevolution, macroevolution, and development

To understand the current state of any population, we must consider the interactions of both micro- and macroevolutionary processes. Microevolution refers to changes in traits and gene frequencies resulting from selection and drift in each generation; its causes operate at the level of populations. Macroevolution refers to the broad patterns and deep time perceived in comparisons among species and with fossil evidence; it is revealed in comparisons at the level of the phylogenetic lineage, at and above the species level. Micro- and macroevolution explain why populations and species are the way they are, but they do not explain individuals. Understanding individuals requires adding consideration of development. In the process of development, genes and environments interact to produce the organism at all stages of its life cycle. Microevolution has shaped developmental reactions to the environment across the entire trajectory from conception to death. Those reactions also carry the macroevolutionary traces of phylogenetic history. Thus, every trait in every organism arises from two interactions. One is between relatively rapid microevolutionary changes and relatively slow macroevolutionary trends and constraints in the population and lineage. The other is between genes and environments during the development of each individual. As a consequence:

• Every evolutionary change in traits involves changes in genes that influence development—for all traits develop.

• All traits arise from interactions between genes and environment; it is an elementary mistake to say
that a trait is ‘environmental’ or ‘genetic,’ the product of ‘nature’ or ‘nurture,’ for all traits are products of both. However, it is perfectly sensible to estimate what proportion of variation in a given population is attributable to genetic differences, to environmental differences, and to their interactions.

• An organism’s traits form a mosaic: some ancient, some new, some static, others rapidly evolving.

Doctors do not treat genes; they treat traits influenced by genes expressed in whole organisms, such as infection, inflammation, blood pressure and chemistry, and anxiety. To do this well for many, if not all, traits, they need to understand genetic evolution, trait evolution, and development.

Mechanistic and evolutionary explanations

Most medical research has been limited to questions about the mechanisms of the body. The evolutionary perspective asks questions about why those mechanisms are the way they are. The distinction between ‘proximate’ or mechanistic and ‘ultimate’ or evolutionary explanations was emphasized by Tinbergen (1963) and Mayr (2004) but remains unfamiliar in the medical sciences. Both types of explanations are necessary, neither substitutes for the other, and they inform each other. In humans, the presence of some mechanisms and not others is the result of our ancestry and relationships. Like all other vertebrates, humans counter infection with an adaptive immune system and have an inside-out eye whose vessels and nerves run between the light and the receptors. Like all mammals, humans have internal fertilization, pregnancy, and lactation, and females store fat before and during pregnancy. Like all primates, humans provide extended offspring care. Like all hominids we have late maturation, a long life, and a relatively low reproductive rate. Among hominids we stand out for our relatively short interbirth intervals and a significant period of post-reproductive survival in females. Like birds and mammals, but unlike fish and trees, humans have determinate growth: we stop growing at maturation. After maturation, energy is devoted to reproductive competition and caring for offspring as well as storing calories and resisting
disease. Ancient neuroendocrine mechanisms mediate the allocations among these essential functions as well as the transition from the juvenile to the adult state. Those mechanisms have been shaped by selection to adjust allocations to the current situation. Not all such adjustments need be adaptive. For example, one seems to switch the neuroendocrine system to a premature state when nutrition is scant, a finding that helps us understand anorexia nervosa. And while seeking calories and storing them as fat was once useful in most environments, today it shortens lives (Neel et al. 1998). Thus, an evolved system of proximate mechanisms interacts with environments to shape phenotypes and behavior. Individuals whose proximate mechanisms improve reproductive success pass on more of their genes to future generations. Others are selected against.

Natural selection

How selection works

Selection operates to change a trait whenever three conditions are satisfied: the trait varies among individuals, that variation affects how many successful offspring an individual has, and genes influence at least some of the variation in the trait. When all three hold, the reproduction of the successful individuals changes the frequency of the genes and traits in the next generation. As this process continues over generations, the changes accumulate and can be measured in changes in the genetic composition of the population. The evidence for natural selection is overwhelming. Selection is not a theory. It is a principle that must hold when certain conditions are present: variation in traits, variation in reproductive success, correlation of trait variation with reproductive success, and inheritance of trait variation. If objects in any population vary in ways that influence which ones persist, then the population will change over time. It has to. Consider the water glasses in an inexpensive furnished apartment that has been repeatedly rented. They can be explained by selection. Some collection of glasses came into the apartment. The
fragile ones broke. The attractive ones left when renters departed. The nonfunctional ones with odd shapes were thrown out. What is left is what you find—a collection of sturdy, ugly, functional glasses. Selection can equally well account for why your coin jar is now mostly full of pennies, why the vegetables at the grocery store on a Sunday evening are mostly damaged, and why some television shows persist and spawn imitators, while others are long gone. Natural selection is just the special kind of selection that occurs when the objects are individuals in a population whose variations are caused partly by genes and whose contributions to future generations are influenced by how many of their offspring survive to reproduce in turn.
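The three conditions can be made concrete in a toy simulation (entirely illustrative; the trait, the fitness rule, and the population size are invented): heritable variation in a trait that affects reproductive success shifts the population mean across generations.

```python
import random

random.seed(1)

def generation(trait_values):
    """One round of reproduction with selection and inheritance."""
    offspring = []
    for t in trait_values:
        # Condition 2: higher trait value means more surviving offspring.
        n_kids = 2 if t > 0.5 else 1
        for _ in range(n_kids):
            # Condition 3: offspring inherit the parent's value, with noise.
            offspring.append(t + random.gauss(0, 0.05))
    # Keep the population size constant (density regulation).
    return random.sample(offspring, min(len(offspring), 200))

pop = [random.random() for _ in range(200)]  # Condition 1: variation
for _ in range(30):
    pop = generation(pop)

print(sum(pop) / len(pop))  # the mean trait value rises well above the initial 0.5
```

Remove any one condition, for instance by making offspring number independent of the trait, and the mean no longer moves in any systematic direction.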

Fitness is relative reproductive success

The basic insight of population genetics is simple and powerful—the evolutionary process can be reduced to the analysis of the factors that increase or decrease the number of copies of a gene in a population from one generation to the next. It is a superb starting point. However, gene frequency change is insufficient to explain phenotype evolution. To understand some particular aspect of an organism’s design for reproduction and survival, such as age at first reproduction, requires analysis of how the organism’s genes produce traits that interact with environments in contributing to survival and reproduction. Natural selection improves reproduction, but the route to reproduction requires allocating effort among finding food, avoiding predators and parasites, fighting, attracting mates, and caring for offspring. The variants that selection sorts do not necessarily include the optimal type: they simply consist of the variation that can be produced by the current population, as it exists. Those that persist performed better than the others, but there is no reason to think that their performance was the best possible.

Natural selection has several components: individual, sexual, and kin selection

The analysis of reproductive success begins with the factors determining the number of surviving and reproducing offspring produced by a single
individual over its lifetime. This is the most general component of reproductive success, individual fitness: a shorthand way of referring to long-term reproductive success.

In sexually reproducing organisms, reproductive success depends substantially on mating success. This component of natural selection is called sexual selection. Sexual selection shapes traits that improve mating success even if they decrease individual health or survival. For example, the male peacock’s tail improves his reproductive success by making him attractive to females but reduces his chances for survival by making it harder for him to fly. Human males have shorter lives than females; at sexual maturity in most modern cultures, mortality rates for men are three times higher than those for women (Kruger and Nesse 2004). Sexual selection can involve the two sexes in a complex interaction with fascinating properties. Females choose mates for a variety of reasons, and their preferences shape male behavior and morphology. The process stops when the costs and benefits of mating success balance. At that point, survival has often been compromised by investment in reproduction.

Organisms living with relatives experience a third kind of selection. At one level, what matters to evolution is only the relative number of copies of genes that exist in the population in the next generation. Whether those genes are contributed directly, by an individual, or indirectly, by its relatives, is of no consequence. The closer the relationship, the more genes are shared. An individual can increase the frequency of its genes if it acts in ways that increase the reproductive success of its kin whenever the benefits to the kin’s reproductive success, weighted by its degree of relationship, exceed the costs to the individual’s reproductive success (Hamilton 1964).
This process, called kin selection, has helped us understand the evolution of apparently self-sacrificial, cooperative, altruistic, and nepotistic behavior. It also explains why organisms are more likely to help close relatives than distant ones; full sibs, and parents and offspring, share half their genes, but first cousins share only one-eighth. The empirical success of kin selection has convinced evolutionary biologists that their focus on genes is correct (Williams 1966; Dawkins 1976; Dawkins 1982; Williams 1992).
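Hamilton’s (1964) condition can be written compactly: helping kin is favored when r × b > c, where r is the degree of relatedness, b the benefit to the recipient’s reproductive success, and c the cost to the actor’s. The relatedness values below follow the text (full sibs one-half, first cousins one-eighth); the benefit and cost numbers are invented for illustration.

```python
# Hamilton's rule: altruism toward kin is favored when r * b > c.
RELATEDNESS = {
    "identical_twin": 1.0,
    "full_sib": 0.5,        # also parent-offspring
    "half_sib": 0.25,
    "first_cousin": 0.125,  # one-eighth, as in the text
}

def helping_favored(relative, benefit, cost):
    return RELATEDNESS[relative] * benefit > cost

# Paying a cost of 1 offspring to give a relative 3 extra offspring:
print(helping_favored("full_sib", benefit=3, cost=1))      # True  (0.5 * 3 > 1)
print(helping_favored("first_cousin", benefit=3, cost=1))  # False (0.125 * 3 < 1)
```

The same act of help can thus be favored toward a sibling but disfavored toward a cousin, which is why nepotism tracks relatedness.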



The gene-centered view also explains why senescence is a property of the soma (an individual body), not of the germ line. Evolution ‘cares’ about the germ line—the genes—whereas doctors treat the soma, which is, from the point of view of evolution, disposable. The consequence has been all the degenerative diseases associated with aging, which are becoming the bulk of medical care. Surely we should want to understand their evolutionary origins.

Traits do not evolve for the good of the species

Before the 1960s, one often heard that some adaptation had evolved for the good of the species, helping it to avoid extinction. For instance, lemmings were said to jump into fiords to commit suicide when food was scarce so the species could survive. As a general explanation, this is incorrect. The vast majority of traits evolve only if they improve the reproductive success of individuals and their kin; if they benefit the species as well, they do so only as a by-product of their benefits to the genes of individuals. Selection acting on species requires the standard conditions to be effective: variation among species in reproductive success (in this case determined by relative rates of extinction and speciation), variation in traits correlated with reproductive success, and heritability of those traits. Genes that benefit the species at the expense of individuals will rapidly disappear, for selection on individuals is much stronger than selection on groups and species. Individuals have much shorter generation times than species, and in the time that it takes for new species to form and go extinct, a process spanning many thousands of individual generations, hundreds of millions of the individuals that form those species will have lived and died. For that reason, selection has much greater opportunity to sort among individuals than it does to sort among species, and species selection simply cannot shape adaptations (Maynard Smith 1964; Williams 1966).

Random events and neutral variation: how neutral evolution works

Some changes in the genetic composition of populations occur through neutral evolution—fluctuations
in the frequencies of genes whose alleles do not correlate with reproductive success. This kind of evolution is called ‘neutral’ because the variation is neutral with respect to selection; no variant has any systematic advantage over any other. It is also called drift to reflect the lack of direction of neutral genes drifting through the population over many generations. Drift produces random change in both large and small populations, but it works more rapidly and over a broader range of conditions in small populations. Two processes introduce randomness into evolution: mutations and meiosis. Two other processes accentuate it: founder events and lack of correlation of genetic effects with reproductive success. Mutations are random with respect to their effects on fitness; many are neutral or deleterious, some give an advantage. Whether the costs or benefits of a particular mutation will result in a systematic change in gene frequency depends on the number of times those effects are tested in organisms. If they are only tested a few times, then the randomness of meiosis may dominate the effects of the gene on reproductive success. The randomness of meiosis is like a coin flip. It consists of the 50% chance that each copy of a chromosome has of getting into a particular gamete. Since only some gametes succeed in forming a zygote, developing, and reproducing, the random effects of meiosis are particularly important in small populations. This can be seen in the limiting case of a population of two individuals, one male and one female, who produce just one offspring. Consider a new mutation sitting on a chromosome in the female. It has just a 50:50 chance of getting into the egg. Thus even if a new mutation gives a huge advantage, if the bearer has only one offspring, there is a 50% probability that the mutation will be lost. Most genes have effects that are not perfectly correlated with reproductive success.
To the degree that they are not, those genes are subject to some influence of drift. Even advantageous genes sometimes end up in organisms that produce no children. It is therefore only in small populations that drift can overcome the effects of strong selection. As population size and number of offspring increase, so do the number of chances that genes
have of making their way into the next generation and having their effects on reproductive success register, and the effects of drift diminish. Founder events are another source of randomness in evolution. They occur when new populations are founded by a small and unrepresentative sample of the ancestral population. They are important in understanding why certain genetic diseases are unusually frequent in the descendants of the Dutch who colonized South Africa, of the French who colonized Quebec, and of the Bounty mutineers who settled on Pitcairn Island. The element of randomness introduced into evolution by founder events is precisely that of sampling error. Even in large populations drift acts on the neutral genes whose effects are not at all correlated with reproductive success. Completely neutral genes drift through both small and large populations like molecules in Brownian motion; the rate at which they are fixed determines the ticking of the molecular clocks that record the divergence times of species in DNA sequences. Thus drift does not only happen in small populations. Both random effects and selection have had important effects on populations, but we do not yet know what proportion of genetic variation each accounts for. In humans, the amount of variation is large: about 30% of human genes coding for structural proteins have more than one allele. In many proteins only certain amino acids are critical to their function; substitutions at other positions may be selectively neutral or close to it. On the other hand, the fact that no selective function is known for most human polymorphisms does not mean that selection has not been important: absence of evidence is not evidence of absence. Modern civilization has changed our activity patterns and our diet, and has eliminated or reduced many pathogens that were selective agents in the past. Furthermore, many of the body’s mechanisms are useful only in special circumstances.
Shivering is useful only in cold situations and certain immune responses are useful mainly against worms that are no longer a threat. In short, the hunt for the adaptive significance of each gene, and of genetic variation, is just getting underway. That drift is real and sometimes potent should not stop us from considering possible


functions. ‘The neutral hypothesis, when applied to the study of human polymorphisms, might even have a counterproductive effect if it discourages the search for sources of natural selection’ (Vogel and Motulsky 1997).
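The sampling-error character of drift described above can be made concrete with a minimal Wright–Fisher simulation: each generation, the 2N gene copies of a diploid population are drawn at random from the previous generation. This is a sketch; the population sizes, starting frequency, generation counts, and replicate numbers below are illustrative, not values from the text.

```python
import random

def wright_fisher(n_diploid, p0, generations, rng):
    """Track a neutral allele's frequency under binomial resampling
    of 2N gene copies each generation (Wright-Fisher model)."""
    two_n = 2 * n_diploid
    p = p0
    for _ in range(generations):
        copies = sum(1 for _ in range(two_n) if rng.random() < p)
        p = copies / two_n
        if p in (0.0, 1.0):  # allele lost or fixed: drift has finished its work
            break
    return p

def fixation_fraction(n_diploid, p0, generations, replicates, seed):
    """Fraction of replicate populations in which the allele was
    lost or fixed within the given number of generations."""
    rng = random.Random(seed)
    done = sum(
        1 for _ in range(replicates)
        if wright_fisher(n_diploid, p0, generations, rng) in (0.0, 1.0)
    )
    return done / replicates

# Drift quickly fixes or loses a neutral allele in a tiny population,
# but barely moves it in a large one over the same number of generations.
small = fixation_fraction(n_diploid=5, p0=0.5, generations=60, replicates=100, seed=1)
large = fixation_fraction(n_diploid=500, p0=0.5, generations=60, replicates=50, seed=1)
```

Run with the same starting frequency, the small population is absorbed (lost or fixed) in most replicates while the large one almost never is, which is the sense in which drift is potent in small populations yet still operates, slowly, in large ones.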

Trade-offs

One of the most useful generalizations evolution offers to medicine is a vision of the body as a bundle of trade-offs. No trait is perfect. Every trait could be better, but making it better would make something else worse. Our vision could be as acute as that of an eagle, but the price would be a decreased capacity to detect color, depth, and movement in a wide field of vision. If the bones in our wrists were thicker they would not break so readily, but we would not be able to rotate our wrists in the wonderful motion that makes throwing efficient. If the stomach made less acid we would be less prone to ulcers, but more prone to GI infections. Every trait requires analysis of the trade-offs that limit its perfection.

This kind of thinking is especially important as we gain more and more ability to alter our bodies. For instance, it seems like a good idea to need less sleep, but natural selection has been adjusting the length of sleep for millions of years. If we think we can take drugs to cram more into 24 hours, we had better think twice. How much testosterone is optimal? Increased testosterone levels in human males may increase strength and competitiveness, but they also decrease ability to resist pathogens and parasites (Chapter 7). How many menstrual cycles per lifetime are optimal? More cycles mean more reproductive opportunities, but they increase cancer risk. These effects of testosterone and menstruation exemplify the central trade-off shaping life span and aging: the trade-off between reproduction and survival.

Every trait must be analyzed in terms of the costs and benefits of the trade-offs in which it is involved. They limit how much fitness can be improved because every improvement in one trait will compromise some other. And those compromises can emerge as unpleasant, costly surprises when interventions are made in ignorance of the trade-offs they manipulate.



Macroevolution

Relationships and fossils reveal history

The type of explanation provided by macroevolution is essentially historical: things are now the way they are because they had a particular evolutionary history. Explaining the human pelvis, for example, begins with figuring out both how its shape changed over evolutionary time, and why it changed. To understand that history, evolutionary biologists use two methods, paleontology—the study of fossils—and the comparative method—comparisons of living species. Often they are used together. For traits that do not fossilize, the comparative method is the only way to reconstruct the history. The first step in the comparative method is always to locate the species on the Tree of Life, to identify its relationships with other species. Those relationships are now often more precisely understood thanks to a great deal of research that has been strikingly improved by better logic and the availability of cheap DNA sequences. Many relationships are being revised because of those developments. Given the location of the species in the evolutionary tree, one can map variations in the trait onto the historical sequence of species to determine when the trait arose and how it changed in different lineages. Ancestral states can then be inferred by using several methods to search for correlated changes among traits over the portion of time, space, and biodiversity represented by the phylogeny (e.g., Felsenstein 1985; Pagel 1994). The evolutionary histories of menopause and the pelvis exemplify the power of the method; the appendix illustrates the challenges (Fisher 2000).

Constraints: eyes and tubes

Organisms are not soft clay from which natural selection can sculpt arbitrary forms. Natural selection can only modify the variation currently present in the population, and that variation is constrained by history, development, physiology, and the laws of physics and chemistry. Natural selection cannot anticipate future problems, nor can it redesign existing mechanisms and structures from

the ground up. You cannot change the basic design of a car while the car is being driven. We illustrate constraint with two examples.

The first concerns the vertebrate eye, often cited for its astonishing precision and complexity. It contains, however, a basic flaw (Goldsmith 1990). The nerves and blood vessels of vertebrate eyes lie between the photosensitive cells and the light source, a design that no engineer would recommend, for it obscures the passage of light into the photosensitive cells. Hundreds of millions of years ago, vertebrate ancestors had simple, cup-shaped eyes that detected only the direction of light and dark, not images. These simple eyes developed as an out-pocketing of the brain, and the position of the light-sensitive tissue layers happened to be beneath the layers that contained nerves and blood vessels. Once such a developmental sequence evolved, it could not be changed without intermediate forms that would be almost useless. Thus, natural selection cannot start from scratch to make the vertebrate eye more ‘rationally designed.’ The proof that the eye’s substandard design is not necessary is found in the octopus eye, which has no blind spot because the vessels and nerves run on the outside of the eyeball, penetrating only where they are needed.

The second example concerns the length and location of the tubes connecting the testicles to the penis in mammals (Williams 1992). In the adult ancestors of primates and their relatives, and in present-day primate embryos, the testes lie in the body cavity, near the kidneys, like the ovaries in the adult female. For reasons still unknown, the sperm of many mammals develop better at temperatures lower than those in the body core. This selective force moved the testes out of the high-temperature body core into the lower-temperature periphery and eventually into the scrotum (in some species they drop into the scrotum only during the breeding season).
This evolutionary progression in adults is replayed in the development of the testes. As they move from the body cavity towards the scrotum, the vas deferens does not take anything like the most direct route. Instead, it wraps around the ureters like a person watering the lawn who gets the hose caught on a tree. If it were not


for the constraints of history and development, the vas deferens would be much shorter and perhaps function better. Many other examples of suboptimal design are described in William Paley’s book, Natural Theology, where they are explained as results of the Creator’s intent to puzzle scientists (Paley 1970 [1802]).

Conclusion

Health, fitness, and the pursuit of happiness

Shorter interbirth intervals are associated with increased childhood mortality. Nevertheless, Hobcraft et al. (1983) observed: ‘For what it is worth, we note that any family trying to achieve maximal numbers of surviving children at any cost would, in the light of these results, continue to bear children at the most rapid rate possible. The dramatic excess mortality is not enough to negate the extra births. However, it is hard to recommend a pattern with such disastrous human consequences.’

This quotation illustrates two important distinctions. First, maximizing the fitness of a parent need not maximize the fitness of individual offspring. Second, health and fitness are not synonyms when fitness is understood in its genetic sense. Where there is a conflict between the self-defined interests of human individuals and the interests of their genes, medicine should serve the former. However, what individuals will choose for themselves does not bear any simple relation to health or fitness. Our choices sometimes promote health over fitness and sometimes fitness over health. When a woman chooses to be pregnant, she takes an action that enhances her fitness but has risks for her health. When she uses contraception, her choice may be good for her health but reduce her fitness.

Our evolved natures should be treated with respect, but not with deference. We did not evolve to be happy: rather we evolved to be happy, sad, miserable, angry, anxious, and depressed, as the mood takes us. We evolved to love and to hate, and to care and be callous. Our emotions are the carrots and sticks that our genes use to persuade us to achieve their ends. But their ends need not be our ends. Goodness and happiness may be goals attainable only by hoodwinking our genes.


Human diversity

Medicine and evolutionary biology have different approaches to variation. Medicine tends to be normative: some states (health) are better than other states (disease). Evolutionary biology is similarly concerned with the causes and consequences of variation, but particular states are not intrinsically more valuable or desirable than others. Differential reproduction is a consequence of interest but not a measure of value. Despite a common misconception, evolutionary biology is concerned with environmental as well as genetic sources of variation. Evolutionary biologists are fascinated by whether plastic human responses to different environments enhance genetic fitness and whether these responses have an evolved component (see Chapter 19). But whether a particular response is adaptive or non-adaptive (in the evolutionary sense) says nothing about the desirability of the response. The idea that some variation is ‘normal’ and some ‘abnormal’ has no place within evolutionary theory.

Critics often object to the application of evolutionary theory to our own species because they fear that the theory has normative implications, or will be perceived as having such implications. However, normative questions are not the province of evolutionary biology. If it were convincingly shown that some men have a genetic predisposition to homosexuality, then the discovery would raise interesting evolutionary questions but there would be no reason to treat sexual orientation as a medical problem, just as few people would now see left-handedness as a problem needing correction. On the other hand, if it could be shown that variation in growth between human populations is an adaptive response to different levels of nutrition the response would be of evolutionary interest but its existence would not absolve us of asking why some people should have more food than others.
Evolutionary biology is not going to provide easy answers to medical dilemmas, nor provide a simple guide for intervention, but a dialogue between evolutionary biology and medicine should nevertheless be of benefit to both disciplines. Most immediately, the vast database of medicine provides unparalleled opportunities to test evolutionary theory and suggest new avenues of evolutionary research.



We hope that evolutionary biology will be able to repay some of this debt by providing medicine with new hypotheses for answering old questions.

Implications for medical practice, research, and education

Clinicians can profit from viewing infection from the pathogen’s point of view and being able to anticipate the evolutionary responses of pathogens to treatments with antibiotics and vaccines. The coevolution of pathogens with our bodies, our behaviors, our interventions, and our drug industries is ongoing, incessant, and inescapable (Chapters 10–17). The evolutionary view helps clinicians dealing with reproductive medicine, cancer, and autoimmune disease to understand how our bodies are mismatched to modernity and how far biological adaptation lags behind cultural change. The diseases of civilization include significant proportions of cancers, allergies, asthma, obesity, diabetes, and cardiovascular disease (Chapters 8, 9, 19–23).

For medical researchers evolution provides a continuing supply of a key limiting resource: new questions posed from a different point of view leading to alternative explanations that suggest new lines of research on tough problems. We recommend considering graduate research programs that bridge medical school departments with departments doing basic research in evolutionary biology.

For medical education, the engagement with evolution does not necessarily imply any new courses or any fundamental restructuring of the premedical or medical school curricula. Both are already packed with useful information that would be a mistake to discard. Instead, we suggest fitting evolutionary material into roughly 10% of that subset of courses where such material is relevant and clearly beneficial.

What doctors need to know about evolution and why

1. How natural selection works—By this we mean not just memorizing ‘variation, inheritance, and differential reproductive success’ but being able to describe, with examples, how natural selection explains why organisms are the way they are. The body is not a machine designed from first principles by an omniscient engineer. Evolution has assembled it by tinkering with the variants available, every step of the way.

2. Trade-offs and constraints are ubiquitous—Because selection has pushed the design of organisms to limits determined by trade-offs and constraints, improving one thing often makes something else worse. Because some trade-offs are not obvious, unpleasant surprises are possible. Because constraints are real, the optimal has often not been attained.

3. The distinction between proximate and evolutionary explanations and how they combine to explain traits—Those who do not understand this distinction will waste time on futile arguments and will not grasp the importance of evolutionary explanations. For instance, those who think that type I diabetes is caused only by genes and autoimmune reactions have often not considered why those genes persist and why the autoimmune reactions evolved as they have.

4. The distinction between micro- and macroevolution—Some think that evolution is only about anthropological studies of bones and primates and confuse that with studies of changes in gene frequencies.

5. The distinction between evolution and natural selection—Evolution is more than just natural selection. It includes genetic drift, gene flow, founder events, speciation, and all of their consequences.

6. Group selection is weak—Many who do not know that this is a problem offer explanations for traits such as aging that are inconsistent with evolutionary mechanisms. The correct explanation of aging follows as an example of explanations based on individual selection.

7. Aging is a by-product of selection operating on the whole life cycle, from birth to maturity to death—Selection pressures drop with age and disappear in post-reproductive individuals, and up to a point more fitness can be gained by investing in reproduction than in maintenance that would improve survival. Therefore all organisms must evolve senescence. By understanding why we age, we can better appreciate the consequences of treating the symptoms of aging and attempting to prolong life (Chapters 18, 23).

8. Each human individual has had a slightly different evolutionary history, and each has a different genetic makeup—This leads to important differences in the way that different human individuals react to drugs and to diseases (Chapters 2, 3, 4, and 5).

9. Microorganisms and cancer cells rapidly evolve resistance to drugs—This has important implications for drug design and the management of treatment (Chapters 10, 21, and 22).

10. Evolutionary theory tells us why virulence evolves to a certain level and no further and what measures could be taken to reduce it—Changes in our lifestyle, in treatment, and in public health measures such as vaccination all cause virulence to evolve, for better or for worse (Chapters 11, 12, 16, and 17).

11. The evolutionary analysis of genetic conflicts tells us why both the placenta and the ovary make high concentrations of reproductive hormones during pregnancy and why some fetal proteins are derived only from the father’s genes while others are derived only from the mother’s (Chapter 6).

12. Selection is everywhere in everyday life, including what drugs physicians use, which patients keep coming for treatment, and which insurance companies stay in business—Understanding selection in general is the foundation for understanding natural selection. Doctors need to understand this to help explain evolution to their patients.



The history and variation of human genes



Global spatial patterns of infectious diseases and human evolution

Jean-François Guégan, Franck Prugnolle, and Frédéric Thomas

Introduction

The range of diseases to which humans have been exposed has changed considerably from early human populations nearly four million years ago through Neolithic humans c.10,000–8,000 years ago to modern humans living today in megalopolises (Armelagos et al. 1996). Over this long period humans have constantly created new ways of living and eating, thus generating new pathways for diseases to invade and spread into communities. For most of their evolutionary history, humans lived in small, sparsely settled communities with very low population sizes and densities. Although such human communities were too small to support endemic pathogens that were constantly present, they were regularly infected by zoonoses through insect bites (e.g., sleeping sickness), by preparation and consumption of contaminated flesh, from wounds inflicted by animals (e.g., tetanus), and by direct contacts with animal reservoirs (e.g., avian tuberculosis and leptospirosis (Armelagos et al. 1996)). Moreover, the range of the earliest hominids was probably restricted to the tropical savannah, which would have limited the number of pathogen species. As they moved into temperate zones, hominids escaped from some of the tropical diseases that had plagued their ancestors and acquired new pathogens. When, about 10,000 years ago, the agricultural revolution produced larger, less mobile human populations, infectious diseases such as influenza, measles, mumps, and smallpox increased (Armelagos et al. 1996). The domestication of animals also attracted

more potential vectors and led to greater exposure to zoonotic diseases (Polgar 1964). Over this long period, aspects of human behavior, physiology, and genetics evolved in response to these diseases (Armelagos and Dewey 1970; Dronamraju 2004).

We consider in this chapter three main questions about the global distribution of infectious diseases and their impact, in particular the impact of their diversity, on human evolution. First, what are the global geographical patterns of the distribution of pathogen species, and how can we explain them? Second, how does parasitic diversity influence the evolution of genetic diversity and the distribution of alleles at particular genes in humans? Third, how has geographical variation in the distribution and diversity of infectious disease shaped the distribution of life-history traits observed in current human populations? We argue that while the emergence of new diseases has been a recurrent pattern since the origin of hominids, with the new emerging pathogens we now face an important epidemiological transition that potentially influences human adaptation and survival. In particular, global trade and transcontinental economic exchange and transport will considerably alter the occurrence and distribution of human infectious diseases and thus the selection they exert on humans.

Geographical aspects of human diseases

Latitude affects the diversity and distribution of many free-living organisms, but little is known about large-scale patterns of the distribution of human or animal pathogens (Finlay 2002). One



reason large-scale patterns of human diseases have so rarely been studied is that their geographical distribution has probably changed substantially over human history, with a major transition during the late twentieth century. In particular, it is generally thought that infection chains and intercontinental transfers of microbes would homogenize their spatial distribution, so that no geographical patterns could be detected (see Haggett 1994; Finlay 2002). However, recent studies of human microbial pathogens identify several macroscale distribution patterns of human diseases.

Latitude and the species diversity of human pathogens

Disease species diversity is higher in the tropics than in temperate areas (Fig. 2.1a) (Guernier et al. 2004). This pattern is stronger in the northern hemisphere, where human populations are concentrated, than in the southern hemisphere. Most important parasites of humans occur in tropical and subtropical countries, and some of these species, mainly zoonotic and vector-borne diseases, are restricted to those regions because of the restricted geographical distribution of their hosts (Woolhouse and Gowtage-Sequeria 2005). Humans in temperate regions suffer from only a small subset of the diseases affecting humans in the tropics (Guernier et al. 2004).

But does this distribution of modern diseases reflect the environment of early humans? There are indeed several reasons to expect a similar or even stronger pattern for early human populations. First, the natural history explorations at the beginning of the sixteenth century and more recent large-scale dispersal due to intercontinental trade and transport increased the geographical ranges of diseases (McMichael 2004) and thereby weakened spatial patterns. Second, the geographic variation of temperature and rainfall affects disease ranges, in particular those of vector- and reservoir-borne diseases (Guernier et al. 2004). If, as we would expect, the latitudinal increase in winter severity decreases the survival of pathogens or their vectors, disease diversity would decrease as we move away from the equator. In addition, the amount of precipitation during a growing season decreases as we move away from the equator, which should affect the range of diseases and vectors that are sensitive to moisture. Finally, species richness of the parasites of non-human hosts also decreases with increasing latitude, in particular for metazoan parasites of marine fish (Rohde 1999), of primates (Nunn et al. 2005), and for parasites of some plant hosts, e.g., soybean (Yang and Feng 2001). Even if this rule does not apply to all microbial organisms

Figure 2.1 At a global scale, three macroscopic patterns for human diseases emerge: (a) the relationship between human disease species diversity and latitude; (b) the relationship between latitude and disease species composition, where disease species comprising smaller assemblages at high latitudes constitute a subset of the species in richer tropical areas (nested pattern); numbers indicate different parasite species; (c) the relationship between the latitudinal geographic range of human diseases and the latitudinal centroids of diseases (Rapoport’s rule).


(see Finlay 2002), the similarity of the effect of latitude in these examples is striking.

Longitude and the species diversity of human pathogens

In contrast to the latitudinal gradient of species richness, the tendency for richness to vary with longitude has been largely ignored (Gaston and Blackburn 2000). Yet, the diversity of human diseases is generally highest in continental Africa and is lower in both South America and southeast Asia (unpublished data). This spatial pattern probably reflects in part the long-distance dispersal of diseases by humans during their history of expansion and migration, as in the example of fungal pathogens that followed the American migration of humans from the north to the south (Fisher et al. 2001). Inevitably, increasing global travel, trade, and migration (Wilson 1995) will weaken this spatial trend.

Latitude and the nested pattern of human pathogens

As in animal and plant communities, human pathogens are distributed in a nested species structure (Guernier et al. 2004): some species are widely distributed and occur in many local communities, whereas others have more restricted distributions and occur only in a subset of the richest local communities (Fig. 2.1b). Together with the latitudinal gradient mentioned above, this means that some parasites occur only in tropical regions, others occur everywhere, but very few (e.g., Lyme disease) occur only in temperate areas (see Guernier et al. 2004). Pathogens that occur in tropical and temperate zones are generally directly transmitted viruses, bacteria, and fungi, which are internal to the host and therefore little affected by environmental variability. This category of disease agents represents around 36% of the 332 pathogen species described by Smith et al. (2007); their dispersal is primarily driven by contagion (Guernier et al. 2004). In contrast, pathogens with external stages (helminth worms, vector-transmitted pathogens, and reservoir-borne diseases) are more strongly influenced by environmental conditions; the ranges and environmental


requirements of their hosts restrict the range of many of these pathogens to tropical regions. According to Woolhouse and Gowtage-Sequeria (2005), 58% of 1,407 recognized species of human pathogens are zoonotic and thus constrained by the animal host’s spatial range. Because many of these animal hosts are tropical, most actual and potential human pathogens are endemic to tropical zones.
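Perfect nestedness of this kind has a simple operational meaning: once local disease assemblages are sorted by richness, each poorer assemblage should be a subset of every richer one. A toy check in Python (the disease names and assemblages below are invented for illustration, not data from the chapter):

```python
def perfectly_nested(assemblages):
    """True if the assemblages form a perfect nested series: once sorted
    by richness, every poorer set is a subset of the next richer one
    (and, by transitivity, of all richer ones)."""
    ordered = sorted(assemblages, key=len)
    return all(a <= b for a, b in zip(ordered, ordered[1:]))

# Hypothetical assemblages following the tropical-to-temperate gradient:
tropical    = {"malaria", "dengue", "influenza", "tuberculosis"}
subtropical = {"dengue", "influenza", "tuberculosis"}
temperate   = {"influenza", "tuberculosis"}
nested = perfectly_nested([temperate, tropical, subtropical])

# A disease found only at high latitude (the text's Lyme disease example)
# breaks the nested structure:
temperate_lyme = {"influenza", "tuberculosis", "lyme"}
broken = perfectly_nested([temperate_lyme, tropical, subtropical])
```

Real nestedness analyses use statistical indices rather than this all-or-nothing test, but the subset logic is the same.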

Latitude and the geographical range of human pathogens

According to Rapoport’s rule, species whose geographical ranges are centered at higher latitudes tend to be distributed over a larger latitudinal range (Gaston and Blackburn 2000). This rule is valid for some human pathogens, for mean latitude and disease range (Fig. 2.1c) are positively correlated for five of the six pathogen categories considered, namely protozoa, fungi, bacteria, helminths, and vector-transmitted viruses (Guernier and Guégan, submitted). The exceptions to the rule are directly transmitted viruses. Despite previous doubts about the existence of Rapoport’s rule for the southern hemisphere (see Rohde 1999) and its generality as a common pattern in macroecology (Gaston and Blackburn 2000), the spatial trend for the five groups of human pathogens is also supported in the southern hemisphere. Moreover, this pattern also occurs in the tropics, although several previous studies suggested that it is limited to the Palearctic and Nearctic above latitudes of 40–50°N (Chown et al. 2004). Thus human diseases centered at higher latitudes have wider geographical ranges than is generally observed for diseases endemic to the intertropical belt.

Geographical area and the species diversity of human pathogens

Perhaps more than any other ecological pattern, the species–area relationship has influenced the development of ecology. Smith and collaborators (2007) identified three distinct categories of species–area relationships for human diseases (Fig. 2.2). First, directly transmitted diseases, such as measles and pertussis, do not show a significant species–area relationship (Fig. 2.2). In other words,



Figure 2.2 Species richness and surface area (double-logarithmic transformations) for three categories of human pathogens: contagious diseases (black squares), zoonotic diseases (open triangles), and multihost diseases (black circles). Each point represents a pathogen species. Modified from Smith et al. (2007).

the species diversity within a local community does not differ statistically from that observed at the largest scale (Fig. 2.2), and communities from adjacent sites are not more similar to each other than they are to those from more distant sites. This suggests that such parasites disperse rapidly over large distances, thus maintaining their global distribution. Second, multihost diseases such as trypanosomiasis, with human hosts and non-human reservoirs or vectors, have a positive species–area curve (Fig. 2.2). For these diseases, increasing the size of the sampling area increases the number of pathogen species, which suggests that they are to some extent endemic to certain localities, as discussed earlier. Third, reservoir host diseases, such as monkeypox virus or Ebola virus, which require an animal to spread, follow a positive species–area curve (Fig. 2.2). For this category of diseases, for which transmission from person to person is impossible or extremely rare and humans are unusual, accidental hosts and not part of the normal life cycle, pathogen communities from distant localities are composed of distinct pathogen species, for many zoonotic parasites are restricted to locally distributed reservoir species, such as tropical rodents. Therefore it is likely that with increasing sampling effort more zoonotic pathogen microbes will be discovered in the future.
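The three categories above differ in the slope z of the log–log species–area relationship S = c·Aᶻ: roughly zero for directly transmitted diseases, positive for multihost and reservoir-borne ones. A sketch of how that slope is estimated, with made-up richness values (not data from Smith et al. 2007):

```python
import math

def power_law_slope(areas, richness):
    """Least-squares slope z of log(richness) on log(area),
    i.e., the exponent in S = c * A**z."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in richness]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

areas = [10.0, 100.0, 1000.0, 10000.0]

# Zoonotic-like pattern: richness grows as a power of area (z = 0.25 here).
zoonotic_S = [3.0 * a ** 0.25 for a in areas]

# Contagious-like pattern: the same species pool everywhere, so z = 0.
contagious_S = [12.0, 12.0, 12.0, 12.0]

z_zoonotic = power_law_slope(areas, zoonotic_S)
z_contagious = power_law_slope(areas, contagious_S)
```

A flat slope says that sampling a larger area adds no new pathogen species, the signature of well-mixed, directly transmitted diseases; a positive slope says that new localities contribute new, locally endemic species.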

In summary, geographical barriers rarely restrict the large-scale dispersal of directly transmitted pathogens. The local diversity of these pathogens—to which human populations are exposed—is large and probably a significant proportion of global diversity (Fig. 2.2). At the other extreme, the ranges of multihost reservoir diseases and zoonotic diseases do not differ markedly from the ranges of their macroscopic hosts, and they show some well-known biogeographical patterns: species richness and species composition gradients with latitude and longitude, range size with latitude, species richness–area curves. Within these two groups of pathogens, many novel, endemic pathogen species probably exist with spatial distributions largely driven by that of their reservoir species and with high species diversity in the tropics. New emerging diseases will likely originate from animal reservoirs, especially those in the tropics. International exotic animal trade may be an excellent pathway for disseminating such emerging diseases into ‘microbe-free’ regions around the globe (Di Giulio and Eckburg 2004).

Historical patterns of the distribution of disease

It is unlikely that early modern humans faced the pathogen species diversity that we know today, for many diseases—e.g., coccidioidomycosis (Fisher


et al. 2001), smallpox (Oldstone 1998), the plague (Scott and Duncan 2001), leprosy (Monot et al. 2005), and many zoonotic diseases (Oldstone 1998)—emerged only a few thousand years ago. Leprosy, for instance, originated in Eastern Africa or the Near East and was introduced into the Americas within the past 500 years (Monot et al. 2005). Similarly, the fungal disease Coccidioides immitis probably appeared in South America within the past 10,000 years via human migrations (Fisher et al. 2001). Although current macroscopic patterns of disease distribution and occurrence cannot precisely mirror the situation of early human populations, the broad macroecological patterns of diseases discussed earlier are likely to have been similar.

To summarize this section: large-scale human–pathogen interactions show two general spatial trends: (a) globally distributed pathogens selected throughout history as strains adapted to human populations, and (b) endemic pathogens, primarily zoonoses, whose species diversity is highest in the tropics.

Pathogen distribution and human genetic evolution

Did differences in levels of exposure to certain pathogens or groups of pathogens differentially affect the genetic evolution of human populations? The answer is yes! Of several good examples, we focus here on two. The first concerns the gene that codes for the β-globin found in hemoglobin. Some mutants of this gene have been maintained at high frequencies in certain human populations—despite their obvious deleterious effects—because they confer resistance against particular pathogens. The second concerns a group of genes with immune function, the HLA (human leucocyte antigen) genes, also known as the major histocompatibility complex (MHC). Human populations exposed to a higher diversity of disease agents display higher genetic diversity at HLA genes than is expected under a simple neutral model. While genetic drift and demographic history have also been important in shaping their patterns of diversity, we argue that selection exerted by local pathogen communities has influenced the local evolution of these human genes.


Pathogen distribution and human genetic evolution: the case of sickle cell disease

Hemoglobin and sickle cell disease
Sickle cell disease is caused by a change in the hemoglobin protein (Pauling et al. 1949). Individuals with two copies of the Hb S variant of the β-globin (homozygous Hb SS) develop stiff, distorted red blood cells that have difficulty passing through the body's blood capillaries. Tissues with reduced blood flow become damaged. Eventually, the disorder causes anemia, joint pain, a swollen spleen, and often severe infections that lead to death. Homozygous individuals have a short life expectancy and rarely reproduce. Heterozygotes for this variant (individuals that carry both the 'normal' variant Hb A and the 'abnormal' one, Hb S) produce both sickle-shaped red cells and normal ones but rarely develop any symptoms (Ashley-Koch et al. 2000). Because persons homozygous for the sickle cell gene very rarely reproduce, the sickle cell allele (Hb S) should decline in every generation within populations and should therefore be observed only at very low frequencies, if at all. This, however, is not the case everywhere.

Sickle cell trait distribution
High frequencies of more than 20% of the sickle cell trait are found in populations across a broad belt of tropical Africa (Allison 1954a,b) (Fig. 2.3). Elevated frequencies are also found in Greece, Turkey, and India (Singer 1953). Intermediate frequencies are found in, for example, Sicily, Algeria, Tunisia, Yemen, Palestine, and Kuwait. The sickle cell gene is thus found in a large and nearly continuous region of the Old World (and in populations that have recently emigrated from there), whereas the trait is almost completely absent from northern Europe, Australia, and North America (Singer 1953).
Two main hypotheses have been proposed to explain the sickle cell allele's observed high frequencies within certain populations despite its highly deleterious effects (Neel 1951): either the sickle cell allele frequently arises by recurrent mutation within populations, or individuals heterozygous for the sickle cell allele (Hb AS) have a selective advantage (i.e., overdominance) over both the 'normal' homozygotes (Hb AA) and the sickle cell ones (Hb SS). Overdominance would enable




Figure 2.3 Maps showing the relation between (a) the geographic presence of malaria before 1920, (b) the frequencies (%) of the Hb S allele, and (c) the frequencies (%) of the G6PD deficiency in males in Africa, southern Europe, and west Asia. For G6PD, only the frequencies higher than 7% are reported. In many regions where malaria is prevalent but not the Hb S or the G6PD deficiency, other mutant hemoglobins may be found. Data mapping compilation from different sources by one of the authors (FP).

the deleterious allele to be maintained at a stable polymorphism. For the first hypothesis, the mutation rate would have to be very high and confined to certain human populations. Vandepitte et al. (1955) demonstrated that the mutation rate in hemoglobin was not high enough to maintain the observed frequencies of the sickle cell allele within populations. Therefore selection in favour of heterozygous individuals seems the best explanation. Why, then, had the gene become common in some parts of the world but not in others? Why did human heterozygotes have an advantage only in certain communities?

Malaria and the sickle cell trait: the advantage of heterozygotes
Allison et al. (1952) proposed that malaria could be the selective agent behind this process by noting that the geographical distribution of the gene for hemoglobin S and the distribution of malaria in Africa virtually overlapped (see Fig. 2.3). And indeed, Allison (1954a,b) later demonstrated that the prevalence and intensity of the infectious disease were lower in Hb AS heterozygous individuals than in Hb AA homozygous individuals. Hb AS children are more likely to survive than Hb AA children in highly malaria-endemic areas (Aidoo et al. 2002).

Mechanisms of resistance: an intimate association between malaria and red cells
Several factors are likely to contribute in varying degrees to the partial resistance of sickle

cell heterozygotes to malaria. Resistance can be mediated by the reduced ability of parasites to grow and multiply in Hb AS red cells (Friedman 1978) or by their early removal from circulation (Luzzatto et al. 1970). Thus, parasite-infected Hb AS erythrocytes sickle more than non-parasitized Hb AS cells, which may lead to the parasites' intracellular death (Friedman et al. 1979) or their removal by the immune system (Luzzatto et al. 1970). Although the latter may be largely the result of innate immunity, recent data suggest that acquired immunity may also be involved (Williams 2006). The contributions of these processes to protection against malaria in vivo are still largely undetermined.

Malaria and other red cell polymorphisms
The Hb S variant is not the only polymorphism of red cell proteins that has been selected for protection against malaria (Table 2.1). As shown in Fig. 2.3, the distribution of these variants is similar to the geographical distribution of malaria. However, human double heterozygotes for some of these variants, such as Hb S and β-thalassemia, or Hb S and Hb C, also suffer from a type of sickle cell disease (as do homozygotes) that reduces their fitness, so that these variant alleles tend to be mutually exclusive in human populations (Allison 1964). By comparison, other combinations of variants, for which there is no negative interaction between mutants, can be found at high frequencies (as is the case for G6PD deficiency and the Hb S variant; Fig. 2.3).
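The logic of heterozygote advantage can be made concrete with the standard one-locus selection model. The sketch below (in Python, with purely illustrative selection coefficients, not values estimated from field data) iterates selection on genotype fitnesses w(Hb AA) = 1 − s, w(Hb AS) = 1, w(Hb SS) = 1 − t, and compares the result with the analytical equilibrium frequency s/(s + t) of the Hb S allele:

```python
# Standard one-locus overdominance model; s and t are hypothetical
# selection coefficients against Hb AA (malaria) and Hb SS (sickle cell disease).

def next_gen_freq(q, s, t):
    """Frequency of the Hb S allele after one generation of selection."""
    p = 1.0 - q
    w_aa, w_as, w_ss = 1.0 - s, 1.0, 1.0 - t
    w_bar = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / w_bar

def equilibrium_freq(s, t):
    """Analytical stable equilibrium frequency of Hb S under overdominance."""
    return s / (s + t)

s, t = 0.15, 0.90   # illustrative values only
q = 0.01            # start the allele at a low frequency
for _ in range(500):
    q = next_gen_freq(q, s, t)

print(round(q, 3), round(equilibrium_freq(s, t), 3))  # both converge to ~0.143
```

Even strong selection against Hb SS homozygotes (t = 0.9) cannot eliminate the allele: the heterozygote advantage holds it at an intermediate frequency, which is the qualitative pattern seen in Fig. 2.3.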



Table 2.1 Examples of red cell genes involved in malaria resistance, whose worldwide polymorphism may have been partly determined by the presence or absence of malaria.

Hemoglobins
  Hb S: β-globin (hemoglobin component). Protects against severe malaria. Main distribution: Africa, Middle East, India, Mediterranean.
  Hb C: β-globin (hemoglobin component). Protects against severe malaria. Main distribution: Africa.
  Hb E: β-globin (hemoglobin component). Reduces parasite invasion. Main distribution: Southeast Asia.
  β-thalassemia: β-globin (hemoglobin component). Protects against severe malaria. Main distribution: Africa, Mediterranean, India, Southeast Asia.
  α-thalassemia: α-globin (hemoglobin component). Protects against severe malaria. Main distribution: Africa, Mediterranean, India, Southeast Asia, Melanesia.

Red cell enzymes
  G6PD deficiency: glucose-6-phosphate dehydrogenase (protects against oxidative stress). Protects against severe malaria. Main distribution: Africa, Mediterranean, India, Southeast Asia.

Red cell membrane
  Duffy negativity: Duffy antigen (chemokine receptor). Protects against Plasmodium vivax (a). Main distribution: Africa.

(a) Plasmodium vivax is one of the agents of human malaria. The others are Plasmodium falciparum, Plasmodium malariae, and Plasmodium ovale. The deadliest is P. falciparum.

Malaria and human gene evolution
Malaria has been a major determinant in the evolution of several human genes, especially those involved in the constitution of red blood cells. It has in fact been suggested that malaria was (and still is) one of the most powerful forces of selection operating on humans. Despite the widespread use of drugs, malaria is still responsible for between 1.5 and 2.7 million deaths each year, primarily of children under the age of five years (Phillips 2001), and as such still has a major impact on human fitness in many populations.

Variations in pathogen diversity and human genetic evolution: the HLA genes
HLA is a complex of genes with a major role in the recognition and presentation of non-self (antigens) to the effector cells of the immune system (T cells) (Zinkernagel and Doherty 1974). Class I genes (A, B, and C), which are expressed in almost all cells, are involved in the recognition of intracellular non-self (e.g., viruses); class II genes (DP, DQ, and DR), which are expressed only in antigen-presenting cells, are mainly involved in the recognition of extracellular pathogens (or non-self). HLA genes are among the most polymorphic genes, both in

the human genome and in the genomes of other vertebrates. For instance, more than 350 alleles are known for the Class I HLA B gene alone (Robinson et al. 2001). Several pieces of evidence suggest that this extreme polymorphism is, at least in part, maintained by balancing selection (Meyer and Thomson 2001). Thus, within human populations, the number of HLA alleles is far higher than the number expected under neutrality. Furthermore, when alleles do not differ in their selective effect (Potts and Wakeland 1993), they are generally more evenly distributed within populations than expected under a purely neutral model of evolution (Hedrick and Thomson 1983), and heterozygote excesses are observed more often than predicted by Hardy-Weinberg expectations (see Markow et al. 1993). Several hypotheses have been proposed to explain selection operating on HLA genes within populations, including MHC-dependent mate choice (Penn et al. 2002), spontaneous abortion (Thomas et al. 1985), and the selection imposed by the various species or strains of pathogens infecting human populations (Klein and O'hUigin 1994). This latter kind of selection, generally called 'pathogen-driven balancing selection,' is expected to operate when different alleles are selected



because of their ability to provide higher resistance against certain species of pathogens or certain strains, and is supported by several pieces of evidence. Thus, certain alleles confer more resistance against certain pathogens (e.g., against malaria or HIV), and individuals heterozygous for HLA genes are more resistant to some infectious diseases than homozygous individuals (overdominant selection) (see Penn 2002). There is therefore little doubt that the evolution of HLA in humans is linked to pathogens. But can this link explain why HLA diversity varies among human populations worldwide? In other words, might pathogen species richness and composition have influenced the local evolution of these immune genes in modern humans?

HLA genetic diversity and pathogen-driven selection
Across 61 populations (Fig. 2.4), there is a strong positive correlation between HLA class I diversity and pathogen species diversity, especially for genes A and B after accounting for the effect of human demography on HLA diversity (for details see Prugnolle et al. 2005). Note that, as HLA class I genes (A, B, C) are mainly involved in the presentation and recognition of intracellular pathogens, the analysis considered only intracellular disease agents (viruses, obligate and facultative intracellular bacteria, and protozoans with at least one




Figure 2.4 Partial residuals (after taking into account the effect of demography) of HsHLA* (= log[Hs/(1-Hs)]) against virus species diversity for the HLA B gene. For details refer to Prugnolle et al. (2005). Hs is the HLA genetic diversity. Dot size is inversely proportional to the number of human populations coming from the same region, to avoid statistical bias due to over-representation of some human communities. Modified from Prugnolle et al. (2005).

intracellular stage). The relationship is stronger for HLA B than for HLA A, suggesting that the HLA B gene might be under stronger balancing selection than the other Class I genes. This finding is in good agreement with other genetic and immunological studies, which have shown a stronger involvement of HLA B than of HLA A in the recognition of non-self (Kiepiela et al. 2004). Subdividing the intracellular pathogens into viruses, bacteria, and protozoans shows that HLA class I diversity is mainly correlated with virus species richness, suggesting that virus diversity, which is higher in the tropics, might exert stronger selective pressure on immune genes than any other category of pathogens. Pathogens are not distributed evenly in space (see above). They form an ecologically heterogeneous landscape in which spatially separated human populations have been subjected to different selection regimes, leading populations to adapt to their local parasitic conditions. Today, the traces of these different evolutionary histories may be found in the genomes of human populations.
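The diversity statistic underlying Fig. 2.4 is expected heterozygosity, Hs = 1 − Σ p_i², plotted after a logit transform. A minimal sketch, using hypothetical HLA-B allele frequency profiles (not data from Prugnolle et al. 2005), illustrates why evenly distributed alleles, the signature of balancing selection, yield higher Hs:

```python
import math

def expected_heterozygosity(freqs):
    """Hs = 1 - sum(p_i^2): probability that two randomly drawn alleles differ."""
    assert abs(sum(freqs) - 1.0) < 1e-9   # frequencies must sum to 1
    return 1.0 - sum(p * p for p in freqs)

def logit(hs):
    """The Hs* = log[Hs/(1 - Hs)] transform plotted in Fig. 2.4."""
    return math.log(hs / (1.0 - hs))

# Hypothetical HLA-B allele frequency profiles for two populations:
even = [0.2] * 5                         # five evenly distributed alleles
skewed = [0.8, 0.05, 0.05, 0.05, 0.05]   # one dominant allele

print(round(expected_heterozygosity(even), 3))    # 0.8
print(round(expected_heterozygosity(skewed), 3))  # 0.35
```

With the same number of alleles, the even profile gives much higher heterozygosity, which is why heterozygote excess and evenness of allele frequencies are both read as evidence of balancing selection.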

Infectious diseases and human life-history traits
Human populations differ in life-history traits such as survival, fertility, age at first reproduction, and age at menarche (Thomas et al. 2001; Barrett et al. 2002). Social scientists and demographers have traditionally assumed that socioeconomic variables—such as development, modernization, culture, and family planning programs—predominate in determining these differences. Variation in human life-history traits, however, might also have evolutionary explanations that rely on differences in characteristics of the environment, including biotic interactions (Stearns 1992). For instance, in many plant and animal species, parasites play an important role in the evolution of host life-history traits (see, e.g., Krist and Lively 1998; Fredensborg and Poulin 2006). Parasites use resources that the host could otherwise use for its own growth, maintenance, or reproduction. Direct costs of this exploitation lead to variation in life-history traits among individuals and populations. Alternatively, changes in host life-history traits may be an adaptive response to


parasitism. One solution developed by many animal species against parasites is the adjustment of life-history traits to compensate for their negative effects on fitness. By analogy, we here suggest that parasitic and infectious diseases have also affected human life-history traits.

Human fertility and the species diversity of human pathogens
Guégan et al. (2001) performed a comparative analysis of 150 countries to explore the relationship between the diversity of infectious disease agent species and human fertility. The prediction was that humans in countries with a high abundance and diversity of virulent parasites should compensate for the high offspring mortality by increasing their reproductive investment. In agreement with this prediction, human fertility was positively related to the diversity of disease types encountered by local human communities. The correlative nature of this study prohibits any conclusions about the causal mechanisms relating diseases and fertility. One important finding was that a set of co-occurring diseases, rather than a unique infectious disease, is the key to understanding the link between parasitism and human life-history traits.
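Comparative analyses of this kind typically rest on rank correlations across countries. The following sketch implements Spearman's rank correlation from scratch (tie-free case, for simplicity) on invented country-level data; it illustrates the method, and is not a reanalysis of Guégan et al. (2001):

```python
def _ranks(values):
    """Ranks starting at 1 (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: rho = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    rx, ry = _ranks(x), _ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Invented country-level data: pathogen species diversity vs. total fertility rate.
diversity = [12, 25, 33, 41, 58, 64]
fertility = [1.8, 2.1, 2.0, 3.4, 4.9, 5.2]
print(round(spearman_rho(diversity, fertility), 2))  # 0.94
```

A rank-based statistic is the natural choice here because country-level disease and fertility data are unlikely to be normally distributed, and only the monotone association is of interest.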

Human birthweight and the species diversity of human pathogens
Human populations differ in birthweight (Vangen et al. 2002). Many variables influence prenatal growth and birthweight in humans, e.g., maternal energy supply, maternal stature, physical work, stress, temperature, disease status, smoking status, gestation length, and altitude (see Koupilova et al. 2000; Wells 2002). Thomas et al. (2004) present a theoretical model suggesting that a significant part of the variability in human birthweight results from adaptive responses to whichever risk of fitness reduction predominates locally (this idea has not yet been formally tested). In stable, well-resourced, low-parasite environments (i.e., most modern industrialized countries), somatic (i.e., non-communicable) diseases are likely to be an important source of fitness variation among individuals. Because


infants with a low birthweight are at a higher risk of expressing chronic diseases later in life (e.g., cardiovascular diseases, diabetes, certain cancers, impairment of hearing and vision; cf. Chapter 19), selection in these environments is expected to favour individuals producing larger children. Even if some of these somatic diseases occur late in life (i.e., after reproduction), they are likely to be detrimental to an individual's fitness, for they reduce its capacity to deliver grandparental care. In countries where the risks of parasitic infections are high (i.e., numerous developing countries), women are, other things being equal, also expected to deliver infants with a high birthweight. Indeed, infants with low birthweight generally have an increased vulnerability to infectious diseases because of impaired immune function (e.g., Moore et al. 1999). Given that offspring mortality (due to infections), more than fertility, is likely to be the primary determinant of fitness variation between reproducing females, mothers in parasite-rich environments will have a particular reproductive interest in producing larger, more resistant, children. The study by Thomas et al. (2004) also predicts that once a threshold in infection risk is reached, birthweights significantly increase with the number of diseases present. Finally, in environments where adverse environmental conditions—famine, drought, or accidents—are frequent, selective pressures for producing large offspring are likely to be relaxed because the negative impacts of environmental factors on individual fitness are largely independent of birthweight. Instead, the fitness costs incurred by the mothers when producing large children (e.g., reduced survival, lower probability of subsequent reproduction; cf. Bereczkei et al. 2000) are less well compensated by reproductive advantages, so that natural selection should favour individuals producing smaller babies.

Human behavior and culture, and the species diversity of human pathogens
Can infectious diseases also alter human culture? One example is the protozoan Toxoplasma gondii (see Lafferty 2006), which lives in the nervous system. Cats are the final hosts of T. gondii, and



rodents are its normal intermediate hosts, but the parasite also develops well in humans. The parasite induces behavioral alterations in rodents that lead to an increased risk of predation by cats (Berdoy et al. 2000). In humans, Toxoplasma infections result in slight personality changes, for example guilt proneness, a form of neuroticism, and reduced psychomotor performance (Havlicek et al. 2001). Because cats do not normally prey on humans, these behavioral changes are of no apparent value to the parasite. They could be manifestations of mechanisms evolved in the past to manipulate the normal rodent hosts, or they may be mere coincidental pathology. Whatever the cause of such changes, Lafferty’s results suggest that Toxoplasma could affect specific elements of human culture. He found that countries with high Toxoplasma prevalence have a higher aggregate neuroticism score, and Western nations with high prevalence



also score higher in the ‘neurotic’ cultural dimensions of uncertainty avoidance and of masculine sex roles (Fig. 2.5). Many infectious agents may play a role in some neuropsychiatric disorders (McSweegan 1998), and a recent study has pointed out that nearly half (49%) of all emerging viruses today are characterized by encephalitis or serious neurological clinical symptoms in humans (Olival and Daszak 2005), highlighting the importance of neurotropic disease agents in medicine and social culture. Recent investigations have illuminated the molecular mechanisms that enable neurotropic viruses to alter brain function and lead to neurobehavioural disorders (Volmer et al. 2006). Although parasitic and infectious diseases have had a major impact on human population demography around the world, relatively few attempts have been made to investigate how disease-causing agents have affected human biology. The few studies above suggest that their influence might be substantial. The importance of parasites as a determinant of human life histories as compared to other factors remains to be assessed.






Figure 2.5 Association between (a) the cultural dimension of uncertainty avoidance and the prevalence of Toxoplasma gondii in Western nations, and (b) the cultural dimension of masculine sex roles and the prevalence of T. gondii in Western nations. From Lafferty (2006).

1. Human infectious diseases are not distributed at random: contagious diseases are everywhere; zoonotic pathogens are more locally concentrated in the tropics.
2. Therefore, human communities are not all equally exposed to disease; populations in the tropics have suffered, and are still suffering, from a greater diversity of pathogens.
3. Pathogens have exerted strong selective pressures on modern humans, who in turn have evolved resistant genotypes. Results of this evolution may be observed in the genomes of current human populations.
4. Because pathogens are not distributed homogeneously, human populations have been subjected to qualitatively and quantitatively different selective pressures. Therefore, different human populations may have followed different evolutionary pathways.
5. An allele that confers resistance against a pathogen may reduce fitness in the absence of the pathogen. The evolution of an allele conferring resistance against a pathogen is often the result of a complex balance between costs and benefits.
6. The life-history traits of early humans (like those of many animals) were shaped by interactions with parasites, but to what extent those of modern humans result from selection by disease is a matter of debate. Better comparative statistics on life-history traits in humans (in addition to traits usually surveyed by anthropologists) are needed to explore this important issue.
7. Given the current epidemiological transition into which modern societies have entered (reduced parasite load), analyses of the connections between life-history traits and disease biology can also help us to understand evolutionary responses in fertility, sexual dimorphism, and life span (cf. Chapter 7).
8. These considerations stimulate important questions about the role of parasites in our evolution: Which kinds of pathogens are most likely to spread in human populations in the future (cf. Chapter 16)? To what extent will the homogenization of zoonotic diseases interfere with human adaptation and evolution? If pathogen pressure maintains much human polymorphism, what will be the effects of disease control and eradication on our own evolution?

Acknowledgments The authors thank the Institut de Recherche pour le Développement and the Centre National de la Recherche Scientifique for financial support. The authors are also grateful to Professors Steve Stearns and Jacob Koella for their judicious comments on an early draft and for inviting us to write this chapter. Finally, thanks are due to Marc Choisy for improving our English.



Medically relevant variation in the human genome Diddahally R. Govindaraju and Lynn B. Jorde

Introduction
Nearly every human disease has a genetic component. Some diseases, such as cystic fibrosis or Huntington disease, are caused mostly by an alteration of a single gene. These conditions are each relatively rare and collectively are seen in about 1% of individuals in populations (Jorde et al. 2006). Other conditions, such as heart disease, diabetes, common cancers, and psychiatric disease, are much more common and are caused by the interaction of multiple genetic alterations in an individual. They are also strongly influenced by non-genetic factors such as diet, sedentary lifestyle, and exposure to toxic substances such as tobacco smoke. The past two decades have seen notable success in defining the genes that cause single-gene disorders. A key feature in this success has been the development of thousands of genetic 'markers.' These short DNA sequences vary from individual to individual, and their locations in the human genome are known. Importantly, they can be assayed easily in the laboratory. They thus constitute a series of recognizable signposts on the genetic map. Investigators can search the genome for specific markers co-inherited with a disease from parent to offspring among family members, using information on the transmission of sections of intact chromosomes (genetic linkage). In this way, the locations of more than 2,000 disease-causing genes have been identified (OMIM—Online Mendelian Inheritance in Man 2006). Because of their complex causation, common diseases represent a greater challenge for the geneticist. Instead of pinpointing a single genetic alteration

in an affected individual, it may be necessary to identify several alterations, in addition to non-genetic predisposing factors. Accordingly, progress in identifying the genetic contributors to common diseases has been relatively slow, but the enormous public health burden of these diseases has motivated considerable research. Several factors are contributing to nascent success in this endeavor: large collections of affected individuals and families, efficient computational methods and machinery, and identification of millions of new markers throughout the genome. Most of this review will focus on these novel classes of genetic markers and their applications. We will define and discuss three marker types.
1. Short tandem repeats (STRs; also termed microsatellites) consist of short DNA sequences, typically 2 to 5 base pairs (bp) in length, that are repeated in tandem at a specific location on a chromosome (e.g., CACACACA). The number of these repeats at a specific location can vary from one individual to another, and this variation can be assessed by standard laboratory methods. STRs have been highly useful, both in disease and in forensic applications.
2. Single nucleotide polymorphisms (SNPs) are single base-pair variants; they number in the millions and are easily typed by automated laboratory analytical systems.
3. Copy number variants (CNVs) are larger DNA sequences (typically 1,000 to 1 million bp in length) whose copy number varies among individuals.
The human genome contains approximately five million relatively common SNPs spaced across the



genome at average intervals of one every six hundred base pairs. Because sections of chromosomes are transmitted together across generations, SNPs located close to one another are highly correlated (i.e., if one knows which variant of a SNP an individual has at a given location, the SNP variant at a nearby location can be predicted accurately without actually typing it). It would thus be wasteful to assay every common SNP in a series of cases. A recent large-scale collaborative effort, the 'HapMap Project,' has identified the patterns of SNP correlation throughout the genome. These patterns, as well as the evolutionary factors responsible for them, are a further focus of this review. In the final sections, we will discuss briefly the use of these markers to infer evolutionary processes and the existence of causal SNPs.
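The correlation between nearby SNPs is usually quantified with the linkage disequilibrium statistics D and r². A small sketch, using invented phased haplotypes for two biallelic sites (not HapMap data), shows the computation:

```python
from collections import Counter

def ld_stats(haplotypes):
    """D and r^2 for two biallelic SNPs from phased two-site haplotypes.

    Each haplotype is a 2-character string such as 'AT': allele A at the
    first SNP and allele T at the second SNP on the same chromosome.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == 'A') / n   # freq of A at site 1
    p_t = sum(1 for h in haplotypes if h[1] == 'T') / n   # freq of T at site 2
    p_at = counts['AT'] / n                               # freq of the A-T haplotype
    d = p_at - p_a * p_t
    r2 = d * d / (p_a * (1 - p_a) * p_t * (1 - p_t))
    return d, r2

# Invented sample: the A-T and G-C combinations travel together most of the time.
haps = ['AT'] * 45 + ['GC'] * 45 + ['AC'] * 5 + ['GT'] * 5
d, r2 = ld_stats(haps)
print(round(d, 3), round(r2, 3))  # 0.2 0.64
```

An r² near 1 means one SNP predicts the other almost perfectly, which is exactly why typing one 'tag' SNP per correlated group suffices.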

Molecular markers

Microsatellites
In the long stretch of the DNA molecule, certain nucleotides tend to occur together in pairs, triplets, and sets of four to six, and they are repeated dozens of times along the chromosomes in the genome. Depending upon the number of repeated nucleotides, they are called di-, tri-, tetra-, or pentanucleotide repeats. For example, nucleotide repeats such as CA or CAG may be repeated several times in a stretch of chromosome (Fig. 3.1b) and are designated as (CA)n or (CAG)n, with n being the number of repeats. The name 'microsatellite' is derived from the observation that, during DNA isolation using ultracentrifugation, satellite bands appeared next to the main DNA band. Consequently, DNA concentrated in these bands was called microsatellites or short tandem repeats. These repeats are highly variable among chromosomes, and each locus may have up to 10 alleles. Because of their variability and abundance, microsatellites have been employed extensively for discovering genes underlying Mendelian disorders, for evolutionary studies (Jorde et al. 1997; Zhivotovsky et al. 2003), and in molecular diagnostics (Straub et al. 1993). Most often, microsatellites are considered neutral, with no phenotypic consequence. Hence they are used in gene discovery studies primarily using

linkage analysis. Briefly, this approach involves three steps: first, identifying a series of STRs at various regions of the human genome; second, tracking the co-segregation of these markers with disease in families; and third, identifying a segment of chromosome harboring a causal gene on the basis of a 'LOD (logarithm of odds) score' (for details see Terwilliger and Ott 1994). In general, for a Mendelian disorder, a LOD score of > 3.0 is considered to be evidence for linkage. A number of microsatellites are known to be involved in human diseases. In these diseases, a low number of repeats is harmless to carrier individuals, but more than a certain number of repeats can lead to a disease phenotype. For example, in Huntington disease, healthy individuals have 6–35 CAG repeats in the huntingtin gene, but affected ones may have 36–200 repeats. Similarly, in Fragile X syndrome, healthy individuals have up to 50 CGG repeats in the FMR1 gene, while affected ones have more than 200 (Cummings and Zoghbi 2000) (some individuals with 50–200 repeats are affected in middle age by milder disease symptoms, such as tremor and ataxia in males and premature menopause in females). Although the use of STRs in combination with linkage analysis has proved very useful in discovering the genetic basis of simply inherited Mendelian disorders, linkage approaches have limitations in mapping genes underlying complex traits and common diseases. Therefore, newer approaches, such as linkage disequilibrium (LD) and association studies (see below), which incorporate information from dozens or hundreds of generations of past recombination and can use unrelated individuals, were proposed (Jorde 1995). Risch and Merikangas (1996) also conjectured that using a large number of markers (up to 1 million) to saturate the genome, then comparing alleles carried by diseased and healthy individuals using association studies, could help to discover the genetic basis of complex traits.
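For the simplest case of phase-known meioses, the LOD score compares the likelihood of a candidate recombination fraction θ with that of free recombination (θ = 0.5). The sketch below uses invented pedigree counts, not data from any cited study; a maximum over a grid of θ values locates the maximum-likelihood estimate:

```python
import math

def lod_score(n_recomb, n_nonrecomb, theta):
    """LOD for phase-known meioses: log10[L(theta) / L(0.5)].

    Under linkage, each meiosis is recombinant with probability theta;
    under free recombination (theta = 0.5), both outcomes are equally likely.
    """
    n = n_recomb + n_nonrecomb
    log_l_theta = n_recomb * math.log10(theta) + n_nonrecomb * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null

# Hypothetical pedigree data: 2 recombinants among 20 informative meioses.
best_score, best_theta = max(
    (lod_score(2, 18, k / 100), k / 100) for k in range(1, 50)
)
print(round(best_score, 2), best_theta)  # 3.2 0.1
```

With these counts the LOD peaks at about 3.2 for θ = 0.1, which by the conventional threshold of 3.0 mentioned above would count as evidence for linkage.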
Association studies have an advantage over linkage studies: they may or may not require family information, and they assume that the markers may be embedded in the causal gene or lie close to it. The development of these analytical approaches approximately coincided with the discovery of a


Figure 3.1 Organization of genetic variation in the human genome. (a) An idealized gene. (b) Single nucleotide polymorphism (SNP). (c) Haplotype. (d) Relationships among LD, haplotype blocks, and recombination hot spots in three HapMap populations: (left) LD (haplotype blocks) and (right) haplotypes regularly extended beyond the hot spots. Spikes at the bottom of both figures correspond to recombination rates expressed in cM/Mb (modified from (left) Altshuler et al. 2005 and (right) McVean et al. 2005, with permission). (e) TagSNPs.




class of molecular markers more abundant than STRs—single nucleotide polymorphisms.

Single nucleotide polymorphisms (SNPs)

When many strands of DNA from a given region, derived from a large number of individuals, are aligned and compared, some individual nucleotide sites may differ from each other. Such differences at individual nucleotide sites (i.e., A, T, C, and G) in the DNA molecule are designated SNPs. When, in the late 1980s, SNPs were found to be more abundant in the human genome than any other markers previously described, including microsatellites, they were commercially exploited. These efforts, largely supported by pharmaceutical companies, led to the organization of the SNP consortium (http://www.hapmap.org), whose goal was to understand the genetic basis and treatment of human diseases. In general, SNPs exist as two alleles, and by convention a 'polymorphism' is defined as a site at which the less common or minor allele has a frequency (minor allele frequency, or MAF) of 0.01 or more. SNPs are distributed throughout the genome, and their distribution varies both among different portions of genes (Figs. 3.1a, b) (Chakravarti 1999) and among populations. Polymorphisms in introns are more variable than those in exons. Exonic polymorphisms are further classified into non-synonymous (amino-acid altering) and synonymous (silent) polymorphisms. Missense mutations in exons lead to amino-acid changes and are generally important in single-gene (Mendelian) disorders, while nonsense mutations create stop codons that terminate the translation of mRNA into protein. In contrast, synonymous polymorphisms do not result in any detectable amino-acid changes (but see Kimchi-Sarfaty et al. 2007). SNPs in regulatory regions do not affect amino-acid variation in expressed proteins, but they often affect levels of gene expression.
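The MAF convention can be computed directly from genotype data; a minimal sketch, with an invented sample:

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """MAF from diploid genotypes written as two-letter strings ('CT')."""
    counts = Counter(allele for g in genotypes for allele in g)
    if len(counts) < 2:        # monomorphic site: no minor allele
        return 0.0
    return min(counts.values()) / sum(counts.values())

# Hypothetical sample of ten individuals typed at one biallelic site:
# 15 C alleles and 5 T alleles, so MAF = 0.25 and the site counts as a
# polymorphism under the MAF >= 0.01 convention.
sample = ["CC", "CT", "CC", "TT", "CT", "CC", "CC", "CT", "CC", "CC"]
maf = minor_allele_frequency(sample)
```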
Because of their abundance, SNPs are frequently used in genetic and evolutionary studies, including discovery of candidate genes, association between markers and causal alleles, inferring demographic histories and migration patterns, and distinguishing between evolutionary forces such as selection and neutrality (Nachman 2001). Any newly

discovered and validated SNP is usually deposited in the National Center for Biotechnology Information (NCBI) dbSNP (database SNP) resource, and by January 2007 approximately 28 million SNPs had been listed. Among these, more than 6 million have been validated and 4 million fall within genes (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_summary.cgi). Because SNPs represent mutations, they arise constantly in individuals and populations, and therefore many remain undiscovered.

Haplotypes

Often two or more SNPs occur close together on a chromosome. Some of these combinations are transmitted together from parents to offspring, behaving as single alleles in the gametes because they are undisturbed by crossing over. These SNP combinations are called haplotypes, a term coined by Ceppellini et al. (1967) that signifies a linear combination of markers on either haploid member of a pair of diploid chromosomes. Haplotype variation among individuals from one or more populations consists of nucleotide differences at one or more sites within each haplotype among a set of chromosomes (Fig. 3.1c). In principle, given n SNPs, 2^n haplotypes could form, but in the absence of recombination, recurrent mutation, and gene conversion, only n + 1 haplotypes are expected. Organizing SNPs into sets of haplotypes has several advantages: haplotypes provide greater statistical power than individual polymorphisms; they increase the number of degrees of freedom available for analysis; and they reduce the sample size needed to detect significance in association studies (Clark 2004). These properties of haplotypes have been exploited to study many complex disorders. For example, Sing et al. (1987) used evolutionary approaches to study the relationship between the apolipoprotein E (APOE) polymorphism and triglyceride levels (see also Chapter 23). Their approach consists of four steps: 1. Derive haplotypes from pedigrees or through statistical estimation. 2. Classify the haplotypes by degree of relationship using phylogenetic methods.


3. Compare patient phenotypes—in this case triglyceride concentrations—across haplotypes with analysis of variance approaches. 4. Identify the causal variant (mutation) as the haplotype whose phenotype (here triglyceride level) differs significantly from the phenotypes of other haplotypes. Other approaches for isolating haplotypes that harbor disease variants have since been developed (Fallin et al. 2001). To be used in gene discovery and evolutionary studies, haplotypes must be identified, and it is more difficult to derive haplotypes than SNPs from diploid individuals, for reasons discussed next.
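Steps 3 and 4 amount to a one-way analysis of variance of phenotype across haplotype classes; a bare-bones sketch with entirely hypothetical triglyceride values and haplotype labels:

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA of phenotype across groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Entirely hypothetical triglyceride levels (mg/dl) by carrier haplotype.
by_haplotype = {
    "hap1": [95, 102, 99, 101],
    "hap2": [98, 105, 100, 97],
    "hap3": [160, 155, 158, 162],   # candidate causal haplotype
}
f_stat = one_way_anova_f(list(by_haplotype.values()))
# A large F flags hap3 as the haplotype whose phenotype differs.
```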

Determination of haplotypes

Assume that a series of nucleotide sites in a diploid genotype can exist in three states: (a) homozygous at all of the sites, (b) heterozygous at only one site, and (c) heterozygous at two sites. A homozygous genotype will produce two identical haplotypes; a genotype heterozygous at one site will produce two different haplotypes. In the first two cases, the lineage of haplotypes can be traced to the parental genotypes with some certainty. In the third case, however, although it is possible to infer the likely haplotypes, one is less certain about their lineage because of crossing-over. The situation gets more complicated when investigators confront regions of chromosomes with multiple heterozygous sites and lack information on the allelic states of parents and grandparents. To overcome these difficulties, both molecular and statistical approaches for identifying haplotypes have been proposed. One problem is that of isolating a single DNA sequence (haplotype) from a diploid pair in a sufficient number of copies to perform a molecular analysis. This can be done in several ways: by diluting the sample to the point where only one DNA molecule is expected, then amplifying it (Ruano et al. 1990); by amplifying specific short DNA sequences (approximately 10 kilobases (kb)) using PCR primers designed to recognize specific alleles (Michalatos-Beloin et al. 1996); or by forming mouse–human hybrid cells that retain only one human


chromosome (Patil et al. 2001). Combinations and extensions of these approaches have also proven effective (Ding and Cantor 2003; Zhang et al. 2006). Statistical methods for strengthening the inferences that can be made from the haplotype data generated by these molecular techniques have also been developed. They are based on the idea that at some point in the past there existed an ancestor that was homozygous at all sites. The observed variation is then assumed to have arisen in a series of single steps from that ancestor (Clark 1990). The statistical inference of those steps has become increasingly powerful and sophisticated (Excoffier and Slatkin 1995; Marchini et al. 2006).
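The ambiguity that these methods address grows quickly with heterozygosity: a genotype that is heterozygous at k sites is consistent with 2^(k-1) distinct haplotype pairs. A small enumeration sketch (the genotype is invented):

```python
from itertools import product

def possible_haplotype_pairs(genotype):
    """Enumerate haplotype pairs consistent with an unphased diploid
    genotype, given as a list of per-site allele pairs, e.g. ('C', 'T').
    A genotype heterozygous at k sites admits 2**(k-1) resolutions
    (a single one when k <= 1)."""
    het = [i for i, (a, b) in enumerate(genotype) if a != b]
    if not het:
        hap = "".join(a for a, _ in genotype)
        return [(hap, hap)]
    pairs = set()
    # Fix the phase of the first heterozygous site so that mirror-image
    # pairs are not counted twice.
    for choices in product((0, 1), repeat=len(het) - 1):
        phase = dict(zip(het[1:], choices))
        phase[het[0]] = 0
        h1 = "".join(g[phase.get(i, 0)] for i, g in enumerate(genotype))
        h2 = "".join(g[1 - phase.get(i, 0)] for i, g in enumerate(genotype))
        pairs.add((h1, h2))
    return sorted(pairs)

# Heterozygous at two sites: phase is ambiguous, with two possible pairs.
geno = [("A", "A"), ("C", "T"), ("G", "G"), ("A", "G")]
pairs = possible_haplotype_pairs(geno)   # 2**(2-1) = 2 resolutions
```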

Linkage disequilibrium, recombination, and haplotype blocks

Two genes (loci) located on the same chromosome are physically linked, but their linkage can be altered by recombination during meiosis. Consider two genes, A and B, each of which exists in two allelic states: A and a, B and b. If these genes were not linked, and instead assorted into gametes independently, then the four possible gametes (AB, Ab, aB, and ab) would give rise to nine classes of offspring genotypes (AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, and aabb) whose frequencies would simply be the products of the gametic frequencies. However, because they are linked, the gametes are not distributed independently, and the degree to which their frequencies deviate from those expected under independent assortment is called linkage disequilibrium. Linkage disequilibrium has been studied extensively from both genetic (Kimura 1956; Lewontin and Kojima 1960) and evolutionary (Felsenstein 1974; Maynard Smith 1977) perspectives, and Dobzhansky (1970) claimed that particular linked combinations of alleles are maintained in populations because they form coadapted complexes.

Linkage disequilibrium

In general, neighboring loci show strong linkage disequilibrium, and distant loci show weak linkage disequilibrium. When a new mutation first occurs anywhere in the genome, it is in complete linkage disequilibrium with its neighboring nucleotide sites



and marker alleles (Fig. 3.1c). Over time, recombination events between them erode the linkage disequilibrium and lead to linkage equilibrium between linked alleles. Such events occur more frequently between distantly than between tightly linked genes. Thus, linkage disequilibrium decays over time as a function of the frequency of recombination, which is greater for polymorphisms located farther apart on chromosomes. This evolutionary principle is extremely important in finding disease genes. For example, if a marker is in linkage disequilibrium with some disease-causing allele, the strength of linkage disequilibrium between the marker and the unknown variants in the genomic region can be used to predict and eventually isolate the causal allele. This approach is called linkage disequilibrium or association mapping. Of the several measures of linkage disequilibrium that have been proposed (Devlin and Risch 1995), D' (Lewontin 1964) and r2 (Hill and Weir 1994) are the most popular. The latter measures the correlation between alleles and has better sampling properties, particularly when sample sizes are small. With either measure, the strength of linkage disequilibrium varies between complete (1.0) and no (0.0) association between loci. Linkage disequilibrium varies among chromosomal regions and depends on both demographic (e.g., population size, admixture) and genetic (e.g., recombination and mutation) factors. Populations that have experienced bottlenecks (e.g., Icelandic, Finnish, and French Canadian) appear to have longer genomic regions under linkage disequilibrium than outbred populations such as North American Caucasians (Chapter 5). From a global perspective, the out-of-Africa model predicts less linkage disequilibrium in African populations than in European and Asian populations (Jorde 2000), because the African populations have had more time for recombination to occur.
In addition to time, factors that influence linkage disequilibrium include genetic drift, selection, mutation, gene conversion, recombination, age of alleles, admixture, and hitchhiking (Ardlie et al. 2002).
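Both measures can be computed from gamete and allele frequencies. In the sketch below, a new mutation B has arisen on an A-bearing chromosome, so D' is complete even though r2 stays small because the allele frequencies differ; all frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """D, D', and r2 for two biallelic loci, from the frequency of the
    AB gamete (p_ab) and the allele frequencies p_a (of A) and p_b (of B)."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = 0.0 if d_max == 0 else abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# A brand-new mutation B (frequency 0.05) carried only on A chromosomes:
# complete LD (D' = 1.0) but low r2, since the allele frequencies differ.
d, d_prime, r2 = ld_measures(p_ab=0.05, p_a=0.60, p_b=0.05)
```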

Recombination and recombination hotspots

The recombination of genes occurs through two mechanisms: the independent assortment of

non-homologous chromosomes into gametes, which shuffles genes located on different chromosomes, and the crossing over of segments of homologous chromosomes when they pair during meiosis, which shuffles genes located on the same chromosome. Both mechanisms reduce the linkage disequilibrium between alleles created by selection or other evolutionary forces and increase genetic variation in populations. Hence, recombination is an essential feature of the genome and may be selected for any of a variety of reasons (Maynard Smith 1977). The physical distance between markers is simply the number of nucleotides between them; it increases linearly along the chromosome and is the measure used in physical maps of chromosomes. In contrast, the genetic distances between markers in genetic maps are measured with crossover data and estimated in centimorgans (named after Thomas Hunt Morgan), which express crossover frequencies as percentages (1 cM = 1% crossover). One centimorgan corresponds roughly to 1 megabase (Mb; 1000 kb) of DNA, but because recombination is not uniform along chromosomes, genetic and physical maps do not correspond directly. Morton and colleagues (Zhang et al. 2002) merged the properties of genetic and physical maps and developed linkage disequilibrium maps, in which the distance between markers is represented in linkage disequilibrium units (LDUs). These linkage disequilibrium maps are characterized by plateaus and steps: the plateaus represent haplotypes, and the steps coincide with regions of recombination. Using sperm typing, Jeffreys et al. (2001) detected recombination events concentrated in short DNA segments spanning only 1–2 kb and concluded that recombination is not uniform along chromosomes but is concentrated in certain regions called 'recombination hot spots.' A hotspot is defined as a small region of 1–2 kb in which the recombination rate is at least ten times higher than in surrounding regions. When McVean et al. (2004) studied fine-scale recombination using computational approaches, their results closely agreed with the recombination hotspots identified by methods such as sperm analysis. Myers et al. (2006) extended high-resolution mapping to the entire human genome and reported 30,000–50,000 recombination hotspots, corresponding to one hot spot for every


50–100 kb of DNA. Some of the hotspots were associated with retrovirus-like retroposons in which two sequence motifs, CCTCCCT and CCACGTGG, were enriched. However, most hotspots lacked these motifs, implying that recombination hotspots in the human genome have multiple causes (Myers et al. 2005).
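The ten-fold definition above suggests a simple way to flag hotspots in a track of local recombination rates; the cM/Mb values below are invented:

```python
def flag_hotspots(rates, fold=10.0):
    """Indices of map intervals whose recombination rate (cM/Mb) is at
    least `fold` times the background, taken here as the median rate."""
    ordered = sorted(rates)
    n = len(ordered)
    median = (ordered[n // 2] if n % 2 else
              (ordered[n // 2 - 1] + ordered[n // 2]) / 2)
    return [i for i, r in enumerate(rates) if r >= fold * median]

# Hypothetical cM/Mb values for consecutive 2-kb intervals; one spike.
rates = [0.4, 0.5, 0.6, 0.5, 12.0, 0.5, 0.4, 0.6, 0.5]
hotspots = flag_hotspots(rates)
```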

The structured genome—haplotype blocks

When recombination occurs, it breaks the chromosome into smaller blocks. The discovery by cytogenetics and evolutionary genetics that chromosomes are divided into blocks exchanged by recombination (Dobzhansky 1970), confirmed with DNA sequence polymorphisms, has led to the current view of the human genome as a series of islands of strong linkage disequilibrium punctuated by recombination hot spots (Goldstein 2001). The complementary relationship between linkage disequilibrium and recombination suggests the presence of linkage disequilibrium blocks, also called haplotype blocks (Fig. 3.1d). This concept is useful in clinical studies: if two or more populations share the same haplotype block, clinical samples from these populations may be pooled and analyzed as one population in association studies (Cardon and Abecasis 2003). Several studies suggest that haplotype blocks are regular features of the human genome. For example, Jeffreys et al. (2001), Patil et al. (2001), and Gabriel et al. (2002) found that chromosomes (haplotypes) are divided into blocks extending from 5 to 100 kb. While these studies show the ubiquity of blocks in the human genome, the appropriate definition of blocks remains controversial (Stumpf and Goldstein 2003; Nothnagel and Rohde 2005). Many approaches for defining linkage disequilibrium blocks have been proposed (e.g., Daly et al. 2001; Zhang et al. 2002). In a comprehensive analysis of the distribution of haplotype blocks in the human genome, Hinds et al. (2005) reported nearly 90,000, 100,000, and 236,000 blocks, with average sizes of 25.2, 20.7, and 8.8 kb, in Han Chinese, European-American, and African-American populations, respectively. Some methods agree on the boundaries between blocks (Gabriel et al. 2002; Evans and Cardon 2005; Ribas et al. 2006), while others show marked variation (Zhang et al. 2002; Liu et al. 2004). Further assays of additional SNPs in a wider


array of human populations are needed to establish the extent to which haplotype blocks can be reliably defined and generalized.
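A toy version of block definition, using a greedy pairwise-r2 rule rather than the published confidence-interval or dynamic-programming methods; the r2 matrix is invented:

```python
def partition_blocks(r2, threshold=0.8):
    """Greedy left-to-right partition of SNP indices into blocks: a SNP
    joins the current block only if its pairwise r2 with every SNP
    already in the block meets the threshold. `r2` is a symmetric
    matrix of pairwise r2 values."""
    n = len(r2)
    blocks, current = [], [0]
    for j in range(1, n):
        if all(r2[i][j] >= threshold for i in current):
            current.append(j)
        else:
            blocks.append(current)
            current = [j]
    blocks.append(current)
    return blocks

# Hypothetical 5-SNP matrix: SNPs 0-2 tightly associated, then a
# recombination hot spot, then SNPs 3-4.
R2 = [
    [1.00, 0.90, 0.95, 0.10, 0.05],
    [0.90, 1.00, 0.85, 0.10, 0.10],
    [0.95, 0.85, 1.00, 0.05, 0.10],
    [0.10, 0.10, 0.05, 1.00, 0.90],
    [0.05, 0.10, 0.10, 0.90, 1.00],
]
blocks = partition_blocks(R2)   # two blocks separated by the hot spot
```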

TagSNPs

While each SNP within a haplotype is correlated with its neighboring SNPs to a degree that varies with distance along the chromosome, some are so tightly correlated that they can be treated as reliably identifying the same chromosomal region. Johnson et al. (2001) examined the strength of linkage disequilibrium between SNPs within nine genes and found that some pairs of SNPs were in complete linkage disequilibrium with each other. They therefore reasoned that one such SNP could be used to predict the presence of the other, and referred to these SNPs as 'haplotype tag SNPs' (htSNPs): markers that capture the haplotype of a gene or a region of LD (Fig. 3.1e). Using this approach, they were able to select only 2 out of 6 SNPs from one gene, and 5 out of 22 SNPs from another, without any loss of information. This approach is advantageous because a small number of tagSNPs can capture most of the information within a block, reducing the cost of genotyping with little loss of information. These findings stimulated the exploration of various approaches to identify tagSNPs, three of which appear to be superior (Chi et al. 2006). The identification of tagSNPs involves three steps: (a) defining haplotype blocks, (b) determining the strength of LD between SNP pairs using a threshold (usually r2 = 0.8: Carlson et al. 2004), and (c) selecting a few informative SNPs that can explain at least 90% of the information within the haplotype block. Because the boundaries of haplotype blocks can be unreliable, it may be preferable to ignore the blocks and choose the tagSNPs using a 'model-free' approach (Halldorsson et al. 2004).
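Step (c) can be approximated with a greedy selection that repeatedly picks the SNP tagging the most still-untagged SNPs at the chosen r2 threshold. This is an illustrative stand-in for the published algorithms, and the matrix is invented:

```python
def select_tag_snps(r2, threshold=0.8):
    """Greedy tagSNP choice: repeatedly pick the SNP that tags
    (r2 >= threshold) the most still-untagged SNPs. `r2` is a
    symmetric matrix with 1.0 on the diagonal."""
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        best = max(sorted(untagged),
                   key=lambda s: sum(r2[s][t] >= threshold for t in untagged))
        tags.append(best)
        untagged = {t for t in untagged if r2[best][t] < threshold}
    return sorted(tags)

# Invented matrix: SNPs 1, 2, and 4 are mutually in strong LD, so any
# one of them can stand in for the other two.
R2 = [
    [1.00, 0.30, 0.20, 0.30, 0.20],
    [0.30, 1.00, 0.90, 0.40, 0.85],
    [0.20, 0.90, 1.00, 0.30, 0.95],
    [0.30, 0.40, 0.30, 1.00, 0.30],
    [0.20, 0.85, 0.95, 0.30, 1.00],
]
tags = select_tag_snps(R2)   # three tags suffice for five SNPs
```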

The HapMap project

Background

Reich et al. (2001) studied the distribution of common SNPs, each with a minor allele frequency of > 0.05, and their linkage disequilibrium relationships in 19 large (> 160 kb) regions in the Utah Mormon, Centre d'Etude du Polymorphisme



Humain (CEPH), Swedish, and Yoruban samples. They found similar haplotype patterns in the Utah and the Swedish populations and suggested that information from the CEPH sample could be used to predict the overall linkage disequilibrium patterns in Europeans. They also reported that the shorter 'Yoruban haplotypes are generally contained within the longer Utah haplotypes.' Encouraged by these results, Reich and Lander (2001) resurrected the common disease/common variant (CD/CV) hypothesis of Lander (1996), which states that most common diseases such as hypertension may have relatively simple allelic spectra and that 'the causes of disease could be found by association studies using common gene variants.' Gabriel et al. (2002) extended this work, analyzing haplotype distributions in five major populations and finding that only a few haplotypes were needed to explain the diversity within each population. They noted that because the sample size (n) required for an association study grows as the reciprocal of r2 when a marker is typed in place of the causal variant (n/r2: Ardlie et al. 2002), 300,000 to 1 million SNPs may be sufficient to define common haplotypes in any global population, and that these haplotypes could in turn be used in association studies. Motivated by this observation, Gabriel et al. (2002) argued that it was important to develop a resource for linkage disequilibrium mapping that would include all common variants found in the human genome in order to understand the genetic basis of complex diseases. If such common haplotypes in all major ethnic groups could be discovered, then it should be possible to use them to discover disease genes. This insight propelled the launch of the HapMap project (http://www.hapmap.org), the most ambitious genetic study since the Human Genome Project.
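The n/r2 relation is worth making concrete; a one-line sketch with hypothetical numbers:

```python
def required_sample_size(n_direct, r2):
    """Sample size needed when the typed marker is in LD (r2) with the
    causal variant rather than being the causal variant itself."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must lie in (0, 1]")
    return n_direct / r2

# If 1000 individuals would suffice to test the causal SNP directly,
# a tag with r2 = 0.8 to that SNP needs 1000 / 0.8 = 1250 individuals.
n_needed = required_sample_size(1000, 0.8)
```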

Findings

Phase 1 of the HapMap project aimed to 'create a resource that would accelerate the identification of genetic factors that influence medical traits' and was completed in 2005. It consisted of two parts: first, to genotype at least one common SNP with a minor allele frequency of ≥ 0.05 every 5 kb across the genome in three major populations; and second, to analyze in detail all the variants

in ten regions, each 500 kb in length (total 5 Mb), selected from the ENCODE (ENCyclopedia Of DNA Elements) project (http://www.genome.gov). The HapMap study included 269 DNA samples representing three major ethnic groups: 90 Yoruba Africans (30 parent–offspring trios) from Ibadan, Nigeria (YRI); 90 individuals from 30 trios of the CEPH collection of the Utah population (CEU); and 45 Han Chinese from Beijing and 44 Japanese from Tokyo. As appreciable allele frequency differences between the Chinese and Japanese populations were not detected, they were combined into one sample (CHB + JPT). These samples were genotyped with 1,007,329 SNPs. Of the 17,944 SNPs discovered in the ENCODE regions (one per 279 bp), 46% had a minor allele frequency of ≤ 0.05. The differentiation among the HapMap populations (Fst) was 0.12, which is similar to that found in other studies of major world populations (Jorde 2000); Fst was lower (0.07) when the African population was omitted. The HapMap project also revealed the block-like structure of the genome, with each block consisting of 30–70 SNPs. The average number of common haplotypes per block ranged from 4.0 in CHB + JPT to 5.6 in YRI. Although some haplotypes terminated at recombination hotspots, this was not a general rule: some haplotypes extended more than 100 kb near centromeres and in certain regions of the X chromosome. Because many SNPs were in strong linkage disequilibrium with one another, it was concluded that one SNP every 2 kb in Yorubans, and one every 5 kb in CEU and CHB + JPT, could be used in association studies without a significant loss of information. In general, 260,000 to 474,000 tagSNPs may be required to capture all common SNPs in the Phase 1 data set. The HapMap is enjoying notable success in at least three areas: transferability of the HapMap SNPs to other populations, use of these SNPs in association studies, and detection of natural selection on specific genes.
Similarities between the Utah HapMap tagSNPs and those of several Caucasian populations (Montpetit et al. 2006) have been reported, as have similarities between CHB + JPT tagSNPs and those of the Korean population (Yoo et al. 2006). Transfer of tagSNPs across related populations with differences in demographic history, which can


affect LD patterns, can sometimes be problematic (De La Vega et al. 2005). Recent studies suggest that the problems are relative: HapMap tags could be used in Japanese and Caucasian populations with a greater degree of confidence than in African-American samples (Conrad et al. 2006; de Bakker et al. 2006). The key factor in the portability of tagSNPs across populations appears to be the level of linkage disequilibrium in the population to be tagged. Populations with high linkage disequilibrium, such as the European populations, tend to 'tag well,' as opposed to populations with low levels of linkage disequilibrium (e.g., African populations), which 'tag poorly' (Need and Goldstein 2006). The HapMap project is already helping us to identify disease variants for some complex disorders. For example, causal variants (SNPs) underlying macular degeneration (Klein et al. 2005), HDL cholesterol levels (Hinds et al. 2004), and obesity (Herbert et al. 2006) have been discovered using


the HapMap SNPs (Table 3.1). Signatures of natural selection have been detected for several genes involved in longevity, fertility, and disease resistance (Wang et al. 2006). Questions remain, however, about the general applicability of the HapMap results across all human populations (Sawyer et al. 2005), and the use of relatively high-frequency SNPs can sometimes affect linkage disequilibrium patterns in unexpected ways (Clark et al. 2005b). Because of these and other concerns (Terwilliger and Hiekkalinna 2006), the HapMap data should be regarded as a highly useful first approximation to human haplotype diversity but should continue to undergo careful scrutiny. The project is being improved by the addition of more populations and, in the now-completed Phase 2 of the project, an additional six million SNPs. Ultimately, as genotyping technology advances, the cost per genotype will shrink, making the use of tagSNPs from the HapMap less necessary (Need and Goldstein 2006).

Table 3.1 Some examples of associated/causal polymorphisms with various phenotypes

Phenotype | Chromosome location | Causal variant | Reference
Asthma | 20p13 | rs511898 C>T; rs574174 C>T | Blakey et al. (2005)
Asthma | 7p14.3 | rs522363 C>T | Laitinen et al. (2004)
Blood pressure | Xq13.1 | 1057 C>T | Koschinsky et al. (2001)
Brain size | 8p23 | 37995 C>G | Evans et al. (2005)
Crohn's disease | 16q21 | 3020insC | Ogura et al. (2001)
Crohn's disease | 5q31.1 | 1672 C>T; −207 G>C | Peltekova et al. (2004)
Diabetic retinopathy | 7q36 | −786 T>C | Taverna et al. (2005)
Earwax | 7p14.3 | rs17822931 G>A | Yoshiura et al. (2006)
Lactase persistence | 2q21 | −13910 C>T | Enattah et al. (2002)
Language | 7q31 | 2445 bp deletion in intron 2 | Lai et al. (2001)
Obesity | 2q14.1 | rs522363 C>T | Herbert et al. (2006)
Rheumatoid arthritis | 1p13.3 | rs2476601 C>T | Begovich et al. (2004)
Rheumatoid arthritis | 1q21–23 | −169 C>T | Kochi et al. (2005)
Systemic lupus erythematosus (SLE) | 1q21 | M308A>G | Lee et al. (2006)
Type I diabetes | | rs1990760 A>G | Smyth et al. (2006)
Type II diabetes (a. protective) | 3p25 | C1431T | Altshuler et al. (2000)
Type II diabetes (b. association) | 2q37.3 | rs12255372 G>T | Horikawa et al. (2000)
Type II diabetes (b. association) | 10q25.2 | rs7903146 C>T | Grant et al. (2006)



Structural variation

Whereas the finding that haplotype (linkage disequilibrium) blocks divide the genome into segments defined by their lack of internal recombination is only a few years old, an even larger genomic structure has been known for about a century: chromosomes. The human genome has two types of chromosomal polymorphisms: numerical and structural. These describe variations in the number of entire chromosomes (aneuploidy, polyploidy, etc.) and within chromosomes (duplication, deletion, translocation, and inversion), respectively (Dobzhansky 1970). These classical structural chromosomal polymorphisms involve changes to several megabases of DNA. Variation in segments of DNA between 500 bp and 5 Mb in length, known as copy number polymorphisms (CNPs: Feuk et al. 2006; Sharp et al. 2006), involves regions smaller than classical chromosomal duplications, deletions, translocations, and inversions but much larger than those containing microsatellites. So far, 3654 CNPs have been discovered using Mendelian transmission errors in pedigreed families, deviations from Hardy–Weinberg equilibrium, comparative genomic analysis using BAC constructs, and reduced signal intensities in microarrays (Wong et al. 2007). Recent reports indicate that at least 73 genes involved in inherited and sporadic diseases, susceptibility to disease, and complex and benign traits contain CNPs (Wong et al. 2007).

Inference of evolutionary processes

Natural Selection

Molecular markers have been extensively used to document the action of evolutionary forces such as selection, drift, mutation, migration, and inbreeding. Following pioneering work on Drosophila (Kreitman and Aguadé 1986), departures of allelic frequency distributions from neutral expectations have been used to test for the effects of natural selection (Nielsen 2005). The underlying principle in these studies is that alleles under positive selection increase in frequency, while negative or purifying selection reduces allele frequencies. With this premise, signatures of selection have been documented in the lactase gene in

relation to the domestication of cattle, in G6PD in relation to malaria, in TNFSF5 in response to infectious diseases, and in FOXP2 in the development of human speech (Cavalli-Sforza and Feldman 2003). Recently, it has been suggested that approximately 1800 genes are under Darwinian selection (Wang et al. 2006). This list includes many genes that had previously been shown to be under selective constraints and that belong to host–pathogen, reproductive, and immune pathways (Bustamante et al. 2005). Polymorphisms in hypertension-related genes such as GNB3 and AGT show broad clinal variation in relation to latitude (Young et al. 2005), suggesting a role for selection in relation to ecological factors, as frequently demonstrated in the evolutionary genetics of various organisms using classical markers (Wallace 1980). In an instructive study, Tishkoff et al. (2006) examined polymorphisms in the lactase (LCT) gene in three African populations. A mutation in the LCT gene (C/T −13910) that started to spread about 9000 years ago allows adult Europeans to digest milk. Tishkoff and co-workers found that three other mutations in the same gene arose independently in Tanzanians, Kenyans, and Sudanese, allowing them to digest milk as adults, and reasoned that these mutations must have started to spread about 7000 years ago, when cattle were domesticated in this region. This provides an interesting example of convergent gene–culture coevolution, in which parallel changes occur in unrelated populations in relation to cultural changes. Another signature of selection may be found in series of SNPs that extend in linkage disequilibrium for many kilobases. Sabeti et al. (2002) measured the extent of homozygosity at all SNPs in haplotypes of G6PD and TNFSF5, genes implicated in resistance to the malaria parasite, Plasmodium falciparum. Their results indicated that haplotypes extend 400–500 kb from a defined core. Volkman et al.
(2006) extended this approach to document the extent of genetic variation among 54 isolates of the parasite itself, in which they also found linkage disequilibrium extending over 64 kb in at least one region. Similarly, Tishkoff et al. (2006) reported haplotype homozygosity extending to 2.0 Mb among six African populations, in response to recent strong natural selection.
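In the spirit of Sabeti et al.'s extended haplotype homozygosity (EHH) statistic, a simplified version measures the fraction of carrier pairs that remain identical as the interval around the core widens; the haplotypes below are invented:

```python
from itertools import combinations

def ehh(carriers, start, end):
    """EHH for haplotypes carrying a core allele: the fraction of
    carrier pairs that are identical at every site from `start` to
    `end` (inclusive). It decays from 1.0 as the interval widens past
    old recombination events."""
    pairs = list(combinations(carriers, 2))
    if not pairs:
        return 1.0
    same = sum(a[start:end + 1] == b[start:end + 1] for a, b in pairs)
    return same / len(pairs)

# Hypothetical haplotypes carrying a selected core allele at site 3.
carriers = ["GGTAACC", "GGTAACC", "GGTAACC", "CGTAACC", "GGTAATC"]
near = ehh(carriers, start=2, end=4)   # identical close to the core
far = ehh(carriers, start=0, end=6)    # homozygosity decays farther out
```

Unusually long stretches of high EHH around a common core allele are the signature that Sabeti et al. interpreted as recent positive selection.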


As mentioned earlier, studies focused on detecting signatures of selection using molecular data assume that fluctuations in allele frequencies reflect natural selection. However, genetic drift also affects the same alleles that are subject to natural selection, so disentangling the effects of natural selection and drift on the basis of molecular data alone is difficult. Doing so may require carefully designed studies that track the inheritance of marked genotypes from parent to offspring and directly measure allele frequency changes in response to selection. Note also that natural selection is, by definition, the differential survival and reproduction of phenotypes; the inheritance pattern of specific alleles must therefore be studied in relation to phenotypes in order to document natural selection. Until then, conclusions reached solely by using molecular data as indices of natural selection must be viewed as tentative.

Genetic Drift Reductions in population size amplify the impact of the normal random fluctuations in gene frequencies termed genetic drift. Population bottlenecks that lead to genetic drift can dramatically raise or lower the frequencies of alleles and haplotypes in populations. One of the most quoted examples is migration of anatomically modern humans out of Africa some 50–100,000 years ago, which resulted in a bottleneck in the size of non-African populations (Cann et al. 1987; Reich et al. 2001). Genetic drift in small populations can produce results similar to those of inbreeding, which reduces the level of heterozygosity. The combined effect of inbreeding and drift could leave extended tracts of homozygosity in the genome as historical residues even in outbred populations, and some such effects may still be retained in human populations. Signatures both of inbreeding and of natural selection have been detected using the HapMap data (Gibson et al. 2006).
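Drift of the kind described here can be sketched with binomial resampling of gene copies each generation (a Wright–Fisher sketch; the population sizes and random seed are arbitrary):

```python
import random

def wright_fisher(p0, pop_size, generations, seed=1):
    """Pure-drift trajectory of an allele frequency: each generation,
    2N gene copies are binomially resampled from the previous one."""
    rng = random.Random(seed)
    p, trajectory = p0, [p0]
    for _ in range(generations):
        copies = sum(rng.random() < p for _ in range(2 * pop_size))
        p = copies / (2 * pop_size)
        trajectory.append(p)
    return trajectory

# A bottleneck of 25 diploids versus a population of 2500: the small
# population wanders much farther from the starting frequency of 0.5.
small = wright_fisher(0.5, 25, 100)
large = wright_fisher(0.5, 2500, 100)
```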

Admixture

Human populations are not isolated. Geographical migration and interethnic mating occur frequently, leading to substantial admixture


among populations. For instance, the level of European admixture in African-Americans of New Orleans was estimated to be 40% (Parra et al. 1998). Even low levels of admixture can potentially affect association studies, resulting in spurious associations (Devlin and Roeder 1999). The extent of admixture has traditionally been inferred with a predefined set of populations from which the admixed population is assumed to have been derived. While admixture poses problems for association studies, it may be useful for gene mapping. A technique called 'admixture mapping' (Chakraborty and Weiss 1988) exploits the differences in allele frequencies between parental populations and the linkage disequilibrium in the resulting intermixed populations. Alleles specific to ethnic groups, or 'private polymorphisms' (Neel and Thomson 1978), have been used to estimate the admixture proportions of individuals. These studies will help to characterize individuals on the basis of genetics rather than of their geographical origins and may lead to more accurate, genetically based targeting of medications (Chapter 4, but see Mountain and Risch 2004).
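A classical single-locus admixture estimate treats the admixed allele frequency as a mixture of two parental frequencies; the allele frequencies below are hypothetical, chosen to return the 40% figure quoted above:

```python
def admixture_proportion(p_admixed, p_parent1, p_parent2):
    """Estimate of the fraction m of ancestry from parental population 1
    at one locus, from: p_admixed = m * p_parent1 + (1 - m) * p_parent2."""
    if p_parent1 == p_parent2:
        raise ValueError("parental populations must differ at this locus")
    return (p_admixed - p_parent2) / (p_parent1 - p_parent2)

# Invented marker frequencies: parental population 1 at 0.45, parental
# population 2 at 0.05, admixed sample at 0.21 -> m = 0.40.
m = admixture_proportion(0.21, 0.45, 0.05)
```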

Causal SNPs and the magnitude of their effects Linkage analysis, linkage disequilibrium analysis, the availability of the complete human DNA sequence, SNPs, and tagSNPs have all contributed to the discovery of genes that underlie complex diseases. Examples in which specific alleles have significant effects on disease predisposition (see Table 3.1) include APOE and Alzheimer’s disease; PPARγ and type II diabetes; ADAM33, GPRA, and asthma; CARD15 (NOD2) and Crohn’s disease. The effect size, measured in odds ratios (OR), of many of these ‘causal SNPs’ is low (Todd 2006). In other cases, such as familial low-HDL cholesterol, multiple rare SNPs are involved in disease predisposition, and no single SNP plays a major role (Cohen et al. 2004). Furthermore, more recent work indicates that a majority of the rare (and low-frequency) mutations may have deleterious effects (Kryukov et al. 2007), indicating that the high-frequency SNPs used in the HapMap project to predict the genetic
basis of many common diseases might miss the effects of low-frequency alleles on disease phenotypes, as predicted by Pritchard (2001). Thus, the general validity of the ‘common disease–common variant’ model remains unclear, and there are evolutionary reasons to doubt that single variants will explain a large proportion of the variation in most common genetic diseases. Furthermore, the protein products of common disease genes are likely to interact with one another and with the environment in complex ways (Hamona et al. 2006), a prediction made long ago by Sewall Wright (1982, and references therein). Precise mapping of genotype–phenotype relationships for many complex diseases remains a daunting task (Clark et al. 2005a). As our knowledge of the genome and its variants increases, so will our ability to meet these challenges.
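As a concrete note on the effect sizes discussed above, an odds ratio is computed directly from a 2×2 case–control table of carrier counts. The counts below are hypothetical, chosen only to show the arithmetic.

```python
def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Odds of carrying the risk allele among cases divided by the
    odds of carrying it among controls."""
    return ((case_carriers / case_noncarriers) /
            (control_carriers / control_noncarriers))

# Hypothetical counts: 300 of 1,000 cases carry the risk allele,
# versus 200 of 1,000 controls.
or_estimate = odds_ratio(300, 700, 200, 800)  # ~1.71, a modest effect
```

An OR near 1.7, as in this invented example, is typical of the low effect sizes reported for many causal SNPs, which is why very large samples are needed to detect them.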

Summary

1. Whereas geneticists three decades ago could only work with a few dozen polymorphic markers, we now deal with the challenge of deciding how best to use several million markers. The availability of millions of polymorphisms has revolutionized our ability to employ techniques such as linkage disequilibrium and haplotype analysis to discover genes contributing to human disorders.

2. Many of the recent findings based on SNPs and other newly developed markers agree well with earlier studies using classical markers (e.g., the extent and distribution of variation among populations). Other findings, such as the existence of defined recombination hotspots, are completely new.

3. Several large collaborative projects, including the HapMap, have shown that linkage disequilibrium (LD) patterns vary among continental populations, with the ‘older’ African populations showing substantially less LD than others, as expected. Recently founded isolates, with large regions of LD, offer a potentially useful resource for rapid gene discovery. The HapMap is not (and was never intended to be) a comprehensive survey of human haplotype diversity.

4. Comparisons with HapMap data show that major haplotypes are similar enough within continental populations (e.g., Europeans) to allow a high degree of transferability of tagSNPs from one population to another, an important advantage for association studies. Indeed, these resources have already facilitated the discovery of several genes that underlie common disease susceptibility.

5. It has also been shown that several interesting new genes and haplotypes have been the targets of natural selection, and that the recombination hotspots that punctuate the human genome have evolved recently and are not shared with our closest evolutionary relative, the chimpanzee.

6. Caveats remain: ascertainment biases, simplifying assumptions about LD between polymorphisms and causal alleles, and complex genotype–phenotype and genotype–environment relationships complicate the use and interpretation of these data. Nonetheless, these new advances are poised to make great contributions to studies of both evolutionary biology and human health, and to forge stronger links between them.

Acknowledgments We thank Steve Stearns and Jacob Koella for providing us with the opportunity to contribute this article, and apologize to a number of authors for not citing many primary references due to space limitations. This work was supported in part by NIH grants GM59290 and HL070048 and by NSF grant BCS-0218370.


Health consequences of ecogenetic variation Michael Bamshad and Arno G. Motulsky

Introduction When asked about the impact of genetics on health, most people cite major advances made in recent decades toward identifying the basis of well-defined Mendelian disorders (e.g., sickle cell disease, cystic fibrosis) and the genetic factors influencing risk of complex diseases such as coronary artery disease and asthma. A much larger fraction of genetic variation is likely to influence how each person interacts with environmental factors that affect their health. The field that deals with the relationship between environmental factors and genetic variation is called ecological genetics or ecogenetics. Ecogenetics is divided into several disciplines that focus on the interaction between genetic variation and different facets of the environment. The most established of these disciplines is pharmacogenetics, in which research has concentrated largely on how variability in a single gene influences differences in how people process and respond to drugs (Motulsky 1957; Vogel 1959; Weinshilboum and Wang 2006). Identifying genetic variants that influence drug-related traits is important because such variants might influence whether a drug is effective or predispose an individual to an adverse drug effect or idiosyncratic reaction. Of course, whether such variants influence a person’s health depends on drug exposure. A genetic variant that influences whether a person can metabolize codeine to morphine (see below) has no impact on the person’s health if he or she is never exposed to codeine. On the other hand, every person must process thousands of
different chemical compounds consumed each day in the form of food (e.g., a cup of coffee contains several thousand chemical compounds). Indeed, it has long been known that the type and amount of food that a person eats (i.e., their diet) influence health, and more recently that genetic variability influences the effects of diet on health and, to some extent, food preferences. The study of how genetic variation influences these traits is called nutrigenetics (Motulsky 1987; Ordovas and Corella 2004). Genetic variation that influences interactions with the environment is particularly likely to have been a target of natural selection (Bamshad and Wooding 2003; Sabeti et al. 2006). While the overall effects of selection on ecogenetic variation are difficult to predict, ancestral humans clearly adapted to widely varied environments as they populated various parts of the world. One consequence is that local positive selection for beneficial genetic variants—those variants that make it more likely for their carriers to reproduce and thus become more frequent over time—played an important role in shaping patterns of ecogenetic variation. In turn, finding variants that have been subject to local positive selection can provide insights about which genes and environmental factors influence an individual’s health. Therefore, understanding how evolutionary forces and demographic processes have shaped the distribution of genetic variation among humans provides both a perspective for interpreting patterns of ecogenetic variation and the basis for experimental strategies for identifying medically relevant ecogenetic variation.




Genetic basis of variation in drug metabolism and response The potential significance of variation in response to drugs used in medicine has long attracted practical and theoretical attention. Early in the twentieth century, Sir Archibald Garrod identified several rare genetic conditions as inborn errors of metabolism (e.g., cystinuria, pentosuria, and alkaptonuria) and postulated that they were caused by the use of alternative chemical pathways. He generalized these findings and suggested that idiosyncrasies to food and to drugs have a similar chemical basis (‘what is one man’s meat is another man’s poison’) (Garrod 1902). The insight that biochemical variability was caused partly by genetic differences among people led to the emergence of pharmacogenetics in the mid-1950s, and the capacity to assess the action of many genes simultaneously led to the appearance of pharmacogenomics around 1990 (Eichelbaum et al. 2006).

Genetic basis of monogenic drug reactions In some cases, adverse drug reactions occur in individuals with variants of a single identified gene. These genes include those that code for glucose-6-phosphate dehydrogenase (G6PD), N-acetyltransferase 2 (NAT2), butyrylcholinesterase (BCHE), thiopurine methyl transferase (TPMT), and the large family of genes in the cytochrome P450 oxidase (CYP) drug-metabolizing system. An unexpected adverse drug reaction manifesting as excessive red cell destruction following administration of various oxidant drugs was shown in the 1950s to be caused by G6PD deficiency (Carson et al. 1956), an X-linked recessive trait common in populations of African and Mediterranean ancestry. Its relatively high prevalence in these populations was suggested to be caused by its protective effect against mortality from falciparum malaria (Motulsky 1960) in female heterozygotes, who carry both normal and enzyme-deficient red cells (Ruwende et al. 1995). Recently, this malaria hypothesis has been supported by the discovery that mutations that cause G6PD deficiency are found on haplotypes bearing regions of extended
linkage disequilibrium (LD) consistent with the effects of recent positive selection (Tishkoff et al. 2001; Sabeti et al. 2002; Saunders et al. 2005). There are several alleles of the gene that encodes NAT2, an enzyme used to acetylate, and thereby inactivate, the commonly used anti-tubercular agent isoniazid, along with several other drugs. Individuals who are homozygous for such alleles are known as ‘slow inactivators.’ The prevalence of slow inactivators is around 50% in populations of European and African origin but is lower among East Asians (Lin et al. 1993). Local positive selection appears to have favored haplotypes bearing slow-inactivator polymorphisms in Western/Central Eurasians, although the adaptive advantage of this trait is unclear (Patin et al. 2006). Alleles of the gene that encodes BCHE cause reduced enzyme activity (Lando et al. 2003). Homozygotes or compound heterozygotes for such polymorphisms exhibit a diminished ability to inactivate the drug suxamethonium, a muscle relaxant widely used during anesthesia. Use of suxamethonium in individuals with reduced BCHE activity results in prolonged respiratory failure that requires mechanical ventilation for up to several hours. However, homozygosity in the absence of exposure to suxamethonium causes no known disadvantage. Thiopurine methyl transferase is an enzyme that inactivates thiopurine drugs (e.g., mercaptopurine, azathioprine), which are frequently used to treat acute lymphatic leukemia and to prevent graft rejection of organ transplants. A common mutation of the TPMT gene reduces enzyme activity (Wang and Weinshilboum 2006). Individuals homozygous for this mutation can experience life-threatening bone marrow suppression upon exposure to thiopurine drugs. The presence of such variants can be assessed by genotyping and/or enzyme assay; such assays are frequently done before clinical use of thiopurines.
This practice is the exception: for most pharmacogenetic traits, screening has not achieved general clinical acceptance. This state of affairs is largely explained by the absence of clinical studies with sufficient numbers of patients to demonstrate the clinical utility of such testing. The cytochrome P450 oxidase drug-metabolizing system is a generic term for a large collection
of evolutionarily related genes that encode various isozymes found in humans, other vertebrates, invertebrates, fungi, plants, and some prokaryotes (Ingelman-Sundberg 2004). The CYP system participates in the elimination pathways of 20–25% of all drugs. One of the best-characterized CYP enzymes is CYP2D6, the activity of which is required for the metabolic breakdown of many different drugs, including several antidepressants. Normal CYP2D6 activity is also required for the metabolism of codeine, a widely used analgesic, into its active component, morphine. More than 80 CYP2D6 alleles have been identified, and ~7% of Europeans carry alleles that reduce CYP2D6 activity, leading to adverse drug reactions as a result of increased drug levels. On the other hand, CYP2D6 copy number also varies among individuals. People with multiple copies of CYP2D6 exhibit increased CYP2D6 activity, which reduces the effectiveness of drugs metabolized by this enzyme (Eichelbaum et al. 2006).

Genetic basis of complex pharmacogenetic traits The best-known pharmacogenetic traits, like those above, segregate in Mendelian patterns. They form the minority, for most pharmacogenetic traits are complex. Indeed, the distribution of effects among individuals in early in vivo studies of drug metabolism was frequently bell-shaped and consistent with multifactorial inheritance, in contrast to the multimodal distribution often observed for Mendelian traits. Understanding the genetic basis of complex traits has proven challenging, although the availability of new genetic tools and an increasing number of well-characterized cases have made the identification of genetic factors underlying complex pharmacogenetic traits more tractable. With increasing frequency, multiple genetic variants in drug metabolism and response are found to contribute to variation in dose requirements and drug effects. To this end, simultaneous analysis of several different gene variants affecting drug metabolism and drug effects has now become a common experimental approach. Such a pharmacogenomic approach is leading to the discovery of previously unappreciated pathways that affect drug metabolism and drug action.


Two common variants of CYP2C9 (CYP2C9*2 and CYP2C9*3), another P450 gene, influence the metabolism of warfarin, an anti-coagulant drug. The frequency of these alleles varies between 6 and 12% in populations of European origin, but each is found at a substantially lower frequency in Africans and East Asians. Warfarin is widely used to prevent thrombosis, but because of variation in dose requirements, hemorrhagic complications from warfarin therapy are common. Therefore, an individual’s level of anticoagulation needs to be checked regularly so that warfarin can be given at a dose that is effective while avoiding excessive bleeding. Individuals with at least one copy of either CYP2C9*2 or CYP2C9*3 require less warfarin for effective anticoagulation than the general population. Consistent with this observation, hemorrhagic complications are, on standard dosing, more frequent in individuals who carry the CYP2C9*2 or the CYP2C9*3 alleles. Thus, CYP2C9 variants influence both warfarin metabolism and adverse outcomes associated with warfarin (Kamali 2006). CYP2C9*2 and CYP2C9*3 explain only 6–18% of the genetic variance in individual warfarin requirements. A larger fraction is explained by non-coding single nucleotide polymorphisms (SNPs) in the gene VKORC1, which encodes the enzyme vitamin K epoxide reductase complex 1 (Rieder et al. 2005). These SNPs define five different VKORC1 haplotypes, A–E. Haplotype A is associated with a low-dose warfarin group, while haplotype B is associated with a high-dose warfarin group. These haplotypes account for 15–30% of the total variance in warfarin dose requirements. Together, the CYP2C9 and VKORC1 polymorphisms explain almost half of the variance in warfarin dosage. Additional genetic and environmental factors that influence warfarin metabolism remain to be found.

Genetic basis of chemosensory perception and food preferences Food preferences are determined by multiple types of sensory input such as taste, texture, and smell, the perception of which depends on stimulation of ‘receptor’ proteins located on specialized cells on the tongue and the mucosal surfaces of the mouth, nose, and pharynx. In humans, five different
modalities of taste can be detected: sweet, salt, sour, bitter, and umami (savory or the taste of monosodium glutamate). There may also be a mechanism for tasting fats (Mattes 2005). Presumably, the ability to perceive these different tastes evolved because it was important for ensuring an adequate intake of nutrients while avoiding substances that were toxic. Indeed, it is easy to imagine that consumption of food items that were nutritious (e.g., having a high caloric value, rich in protein, or containing essential nutrients) was desirable because it increased the likelihood of survival and reproduction. It was also advantageous to be able to taste compounds that indicated a plant was poisonous or that a food item had become rancid. Such compounds produced by plants or the bacteria and fungi growing on foodstuff frequently taste bitter. Bitter is typically considered a bad taste, although in low concentrations bitter-tasting compounds are sometimes found desirable (e.g., coffee, beer).

Bitter taste sensitivity Variation in taste sensitivity to bitter compounds is arguably the best characterized of all the taste phenotypes in humans. This is due, in part, to the discovery in the 1930s that the ability to taste the bitter compound phenylthiocarbamide (PTC) varied among individuals and that much of this variation was explained by a single locus (Snyder 1931; Blakeslee 1932; Fox 1932). The ability to taste PTC appeared to segregate as an autosomal dominant trait; non-taster status is recessive. Given the ease with which taste sensitivity to PTC could be tested, it became one of the best known Mendelian traits in humans, but it took more than 70 years to identify the locus for PTC taste sensitivity and the specific genetic variants responsible for tasters and non-tasters. Variation in PTC taste sensitivity is due to alleles in taste receptor type 2 member 38, or TAS2R38, one of a family of 25 different functional bitter taste receptor genes discovered recently (Kim et al. 2005). The TAS2Rs are relatively simple genes consisting of a single exon that encodes small proteins of about 400 amino acids. TAS2R38 haplotypes account for ~50–80% of the variance in PTC taste sensitivity, and most of this variation is explained by amino-acid

variation at positions 49 (encoding proline or alanine), 262 (encoding alanine or valine), and 296 (encoding valine or isoleucine), which give rise to two common isoforms, denoted PAV and AVI (Kim et al. 2003). In vitro stimulation of cells expressing TAS2R38-PAV with PTC elevates cytosolic Ca2+ concentrations, while stimulation of cells expressing TAS2R38-AVI does not. Variation in PTC taste sensitivity has long interested anthropologists and evolutionary biologists. R. A. Fisher and his colleagues E. B. Ford and J. Huxley investigated the variability of PTC taste sensitivity in great apes and performed what is arguably the first test of natural selection on a human gene. Fisher et al. (1939) reported that chimpanzees, like humans, varied in their ability to taste PTC and argued that balancing natural selection, a form of natural selection in which two or more different alleles are maintained at frequencies higher than expected by chance, must therefore have maintained variation at the PTC locus from a time before the human–chimpanzee divergence. Since the discovery of PTC taste sensitivity, the frequencies of the presumed PTC taster and non-taster alleles have been studied in hundreds of populations worldwide (Fisher et al. 1939; Guo and Reed 2001). These studies showed that the frequency of the non-taster allele varies around a mean of ~50%, and this observation was confirmed by resequencing TAS2R38 in a large sample of chromosomes from Africans, Asians, and Europeans (Wooding et al. 2004). Moreover, the two divergent TAS2R38 haplotypes are maintained at similar frequencies in populations from different geographical regions, and, under reasonable assumptions about human population history, the distribution of allele frequencies at TAS2R38 differs significantly from that predicted under neutrality (Wooding et al. 2004). Collectively, these results suggest that TAS2R38 has been subject to balancing natural selection in humans.
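One standard population-genetic model of balancing selection, heterozygote advantage, predicts a stable intermediate allele frequency. The sketch below is a textbook calculation, not an analysis from the chapter, and the selection coefficients are hypothetical.

```python
def equilibrium_frequency(s1, s2):
    """Heterozygote advantage with relative fitnesses 1 - s1 (A1A1),
    1 (A1A2), and 1 - s2 (A2A2): allele A1 equilibrates at
    s2 / (s1 + s2), where both alleles are stably maintained."""
    return s2 / (s1 + s2)

# With equal (hypothetical) fitness costs to the two homozygote
# classes, both alleles are held near 50% -- the pattern reported
# for the non-taster allele frequency above.
p_eq = equilibrium_frequency(0.02, 0.02)
```

Asymmetric costs shift the equilibrium away from 0.5, so an observed mean near 50% is consistent with roughly symmetric selection against the two homozygotes, if heterozygote advantage is indeed the mechanism.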
The mechanism by which balancing selection has maintained highly divergent TAS2R38 haplotypes in humans is unclear. One possibility is that the fitness of individuals who are TAS2R38 heterozygotes is higher than that of homozygotes because they can perceive a wider repertoire of bitter compounds. This hypothesis is consistent with the observation that TAS2R38-AVI appears
to be functional. Indeed, two studies have reported that fruits of the plant Antidesma bunius taste bitter to PTC non-tasters, raising the possibility that this plant contains a heretofore unknown ligand for the receptor encoded by TAS2R38-AVI (Henkin and Gillis 1977; Tharp et al. 2005). One prediction of Fisher et al.’s (1939) hypothesis that balancing selection has maintained the taster and non-taster alleles since the human–chimpanzee divergence is that these alleles should be shared by both species. With the characterization of TAS2R38, this prediction could be tested directly. Wooding et al. (2006) showed that PTC taste sensitivity in chimpanzees is also controlled by two common alleles of TAS2R38; however, neither of these alleles is shared with humans. Instead, a mutation of the initiation codon results in the use of an alternative downstream start codon and production of a truncated receptor variant that fails to respond to PTC in vitro. Association testing of PTC sensitivity in a cohort of captive chimpanzees confirmed that the chimpanzee TAS2R38 genotype accurately predicts taster status in vivo. Therefore, while Fisher et al.’s (1939) observations were accurate, their explanation was wrong. Humans and chimpanzees share variable taste sensitivity to bitter compounds mediated by PTC receptor variants, but the molecular basis of this response has arisen twice, independently. Shortly after the discovery that taste sensitivity to PTC varied among humans, Williams (1931) suggested that variation in bitter taste sensitivity might explain differences in food preferences among individuals. In the ensuing decades, numerous studies examined the relationship between bitter taste sensitivity, food preferences, and health-related traits (Bartoshuk and Beauchamp 1994; Tepper 1998). Particular attention focused on foods such as cruciferous and green leafy vegetables that are bitter to taste and rich in compounds with antioxidant or anti-carcinogenic properties.
The importance of such studies is underscored by the repeated observation that taste, as opposed to nutritional or health value, is the key determinant of food selection (Glanz et al. 1998). Despite years of work, the association between bitter taste, food preferences, and health has been suggestive but not compelling, partly because both bitter taste and
food preferences are complex traits. The identification of the TAS2R genes and the ligands to which they bind has facilitated direct testing of the association between TAS2R genotypes and sensitivity to bitter-tasting foods and drugs (e.g., tobacco), although results to date should be considered preliminary (Cannon et al. 2005; Timpson et al. 2005). There is considerable genetic variation in the genes that encode the 24 other TAS2R receptors. Resequencing studies showing that the diversity of TAS2R genes is higher than expected suggest that other TAS2R genes have also been targets of natural selection (Kim et al. 2005). One such target is TAS2R16, which encodes the receptor that mediates responses to salicin, amygdalin, and many β-glucopyranosides. A TAS2R16 haplotype bearing a mutation that results in a K172N substitution encodes a receptor that exhibits increased sensitivity to bitter β-glucopyranosides (Soranzo et al. 2005). This haplotype appears to have been a target of recent positive selection, as its frequency is much higher in many populations than expected under neutrality. Interestingly, the haplotype that encodes the K172-bearing receptor has been associated with an increased risk for alcohol dependence (Hinrichs et al. 2006). This haplotype is rare in European Americans, in whom the minor allele frequency (MAF) is only 0.6%, but nearly half of all African Americans carry at least one copy.

Sweet and umami taste sensitivity Over the past several years, receptor proteins for sweet, sour, umami, bitter, and salt taste sensitivity have been identified in a wide range of species, including humans. Sensitivity to sweet and umami is controlled by a pair of G-protein coupled receptors (GPCRs), the components of which are encoded by three genes: taste receptor type 1 member 1 (TAS1R1), TAS1R2, and TAS1R3 (Nelson et al. 2001, 2002). The proteins encoded by TAS1R2 and TAS1R3 combine to form the sweet receptor, while the umami receptor is a heterodimer of the proteins encoded by TAS1R1 and TAS1R3. In mice, several TAS1R3 variants have been associated with variation in sweet preference, and in vitro functional studies have confirmed that a single nonsynonymous substitution of a threonine for an isoleucine explains part
of the variance in sweet preference (Nie et al. 2005). Moreover, homozygous deletion of both TAS1R3 and TAS1R2 results in mice that are insensitive to sweet (Zhao et al. 2003). Collectively, these studies indicate that variation in genes that encode taste receptors can influence inter-individual variation in taste preferences. Similar polymorphisms that influence umami sensitivity have not yet been found. Likewise, while preferences for sweet and umami have been found to vary among people, the genetic and molecular basis of this variation remains unknown (Lugaz et al. 2002).

Lactase persistence Throughout the world, most adults cannot drink milk without experiencing symptoms of cramping, bloating, and diarrhea because they cannot digest lactose, the main carbohydrate present in milk. The inability to digest lactose (i.e., lactose intolerance) is due to reduced activity of the enzyme lactase-phlorizin hydrolase (LPH) in the epithelium of the small bowel. LPH is active in infants, but its activity diminishes after weaning, so that most adults are ‘lactase non-persistent.’ In some people, LPH activity persists into adulthood (i.e., lactase persistence). Adults with lactase persistence can digest lactose without a problem and can therefore consume milk products without developing symptoms of lactose intolerance. Lactase persistence is common in many populations of Northern Europe (e.g., >90% in Swedes and Danes) and present in a handful of populations in sub-Saharan Africa (e.g., ~90% in the Tutsi), whereas lactase non-persistence is more typical in most other parts of the world (Swallow 2003). Most mammals lose the capacity to metabolize lactose after weaning from breast milk. Accordingly, lactase non-persistence appears to be the ancestral state in humans, and mutations causing lactase persistence arose subsequent to the origin of anatomically modern humans. Given the high frequency of lactase persistence in populations in which dairy products have been a staple food source, it has long been suspected that LCT has been the target of recent positive selection coincident with the domestication of cattle (Hollox et al. 2001; Bersaglieri et al. 2004; Myles et al. 2005).

Adult expression of LPH segregates as an autosomal dominant trait and is controlled by cis-acting elements upstream of LCT, the gene that encodes LPH (Wang et al. 1995). In European populations, adult LPH expression is regulated by a C/T SNP at position -13910, located ~14 kb upstream in intron 9 of a gene named minichromosome maintenance 6 (MCM6) (Enattah et al. 2002). In sub-Saharan African populations in which lactase persistence is common, the T-13910 variant is found at low frequency in groups from West Africa (e.g., the Fulani) and is absent in populations from East Africa. This observation suggested that the molecular basis of lactase persistence in Africans is different from that in Europeans (Mulcare et al. 2004). This prediction was confirmed with the discovery of three novel SNPs (i.e., G/C-14010, T/G-13915, and C/G-13907) located upstream of LCT in sub-Saharan African populations, each of which appears to increase transcription of LCT in vitro (Tishkoff et al. 2007). The pattern of DNA sequence variation observed in the genomic region surrounding LCT suggests that it has been a target of recent positive selection. Specifically, a region of linkage disequilibrium greater than 1 Mb has been observed on chromosomes bearing the T-13910 variant in Europeans (Bersaglieri et al. 2004). This unusually long region of LD (given the high frequency of the T-13910 variant) suggests that positive selection has increased the frequency of this variant more rapidly than expected. The presence of such a pattern is a characteristic signature of recent positive selection (Sabeti et al. 2002; Bamshad and Wooding 2003). It suggests that the T-13910 variant originated between ~2,000 and ~20,000 years ago, consistent with the inception of dairy farming in Europe as inferred from archaeological evidence.
In sub-Saharan Africans, the three SNPs that upregulate adult expression of LCT and are associated with lactase persistence occur on three different haplotypes, each common (i.e., frequency > 5%) in different populations. The most frequent of these haplotypes bears the C-14010 SNP in a region of extended LD that spans more than 2 Mb and has also been a target of recent positive selection (Tishkoff et al. 2007). This C-14010 SNP is estimated to have originated between ~1,200 and ~23,000 years
ago, consistent with archaeological data suggesting that pastoralism spread into sub-Saharan Africa only ~3,000–4,000 years ago. For C-14010 to have reached the frequency observed in Africans in such a short time, the selection coefficient must have been between 0.01 and 0.04, which is fairly strong and similar in magnitude to the effect of malaria on the HBB variant that causes sickle cell disease. Thus, recent positive selection has rapidly increased the frequency of several different LCT variants independently in populations from at least two parts of the world. In each case, the selective force appears to have been local adaptation driven by the higher fitness afforded by the ability to consume dairy products.
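The relationship between a selection coefficient in the quoted 0.01–0.04 range and the speed of such a sweep can be checked with a deterministic recurrence. This is a back-of-envelope sketch: the starting frequency, the particular value of s, and the generation time below are illustrative assumptions, not figures from the chapter.

```python
def generations_to_frequency(p0, p_target, s):
    """Deterministic genic selection: each generation the favored
    allele's odds p/(1-p) are multiplied exactly by (1 + s)."""
    p, gens = p0, 0
    while p < p_target:
        p = p * (1 + s) / (1 + p * s)
        gens += 1
    return gens

# An allele starting at a hypothetical p = 0.001 with s = 0.03 reaches
# p = 0.5 in roughly 230 generations -- on the order of 6,000-7,000
# years at 25-30 years per generation, broadly compatible with the
# post-pastoralism timescale discussed above.
gens = generations_to_frequency(0.001, 0.5, 0.03)
```

Rerunning with s = 0.01 roughly triples the required time, which is why the lower end of the quoted range sits in tension with the youngest archaeological dates.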

The structure of human populations Studies of the health consequences of ecogenetic and pharmacogenetic variation often draw comparisons among populations, particularly among populations defined by race—using its historical meaning as a descriptor of Africans, Asians, Europeans, Native Americans, and Pacific Islanders. Moreover, it is commonplace to use membership in a population to predict the likelihood that an individual carries a particular genetic variant that influences susceptibility to disease or drug response. While differences in disease prevalence are often due to environmental factors, some genetic factors that influence health-related traits such as hypertension, diabetes, and atherosclerosis, and their complications, vary among racial groups (reviewed in Bamshad 2005). One important issue is the difference between race and ancestry. Ancestry refers to objective genetic relationships between individuals and among populations, whereas race has always been a somewhat arbitrary definition of population boundaries. For example, while an individual might have ancestors from Africa, Europe, and North America, he or she might still be categorized as an African American if a substantial number of his or her genes are of African origin. Therefore, race captures some biological information about ancestry but is not equivalent to ancestry. The extent to which race helps us to predict genetic differences that influence health partly depends on how well


‘traditional’ classifications of race correspond with genetic inferences of individual ancestry.

Correspondence between race and population structure
Two randomly chosen humans differ at ~1 in 1,000 nucleotide pairs, or, on average, at ~3 million of the genome's 3 billion nucleotides (i.e., they are 99.9% identical). However, different sites vary between different pairs of individuals. In all, it is estimated that there are ~10 million variable sites with SNPs at a frequency of ≥1% (Kruglyak and Nickerson 2001); millions more SNPs are rarer still. While some of these SNPs contribute to phenotypic variation, most SNP variation is said to be 'neutral,' or without functional consequence. The distribution of this neutral variation reflects human demographic history, including the organization of humans into subpopulations (i.e., population structure). Studies using a broad range of genetic markers have confirmed that ~10–15% of the total genetic variation in humans is explained by differences between sub-Saharan Africans, Northern Europeans, and East Asians. This means that, on average, individuals chosen at random from each of these populations will be slightly more different from one another than individuals from the same population. Yet, while the fraction of overall genetic variation distributed between groups is relatively small, allele frequencies at different loci are highly correlated, so that a modest number of genotypes (i.e., several hundred per person) can, with a high degree of accuracy, allocate anonymous individuals to groups that correspond to ancestry from different geographic regions. Thus, geographical ancestry can be used to make reasonably accurate predictions of genetic ancestry. Populations from neighboring geographical regions typically share more recent common ancestors, and therefore their allele frequencies are more highly correlated—a pattern commonly seen as a cline of allele frequencies.
Because of such clines, individuals sampled continuously across some large geographic regions (e.g., the Middle East, Central Asia) are difficult to allocate into genetic groups that are inclusive of all individuals from these regions. Correspondence with geography is also less apparent for populations (e.g., Hispanics, South Asians) that are historically admixed.

Race as a proxy for genetic ancestry
In the United States, several studies have reported that classification of individuals by self-identified racial group is highly correlated with inferences based on explicit genetic data (Tang et al. 2005). Despite this observation, race is a crude predictor of the genetic ancestry contributed by populations from different geographic regions of the world (i.e., ancestry proportions). For example, the fraction of variation that an African American shares with West Africans varies considerably because African Americans have admixed to variable degrees with groups originating from other geographic regions, mainly Europe. Thus, while the West African contribution to an African American's ancestry averages about 80%, its range is wide (i.e., ~20–100%) (Shriver et al. 2003). The genetic composition of self-identified European Americans also varies, with ~30% of European Americans estimated to have less than 90% European ancestry. Similarly, Hispanics from different regions of the United States are variably admixed with different populations (e.g., more African admixture in Hispanics living in the southeast versus more admixture with Native Americans in the southwest) (Choudhry et al. 2006). Accordingly, membership in a genetically inferred group does not mean that all members of the group necessarily have a similar genetic composition. Knowing the proportion of an individual's ancestry that originated in different populations, and to what degree a group is divided into genetic subpopulations, can be useful for identifying genetic and environmental factors—by reducing false negative associations and uncovering true associations—that underlie common diseases for which risk varies among populations. To this end, several hundred loci that are particularly informative for estimating ancestry proportions in African, European, Asian, Hispanic, and Native Americans have been identified.
In contrast to the situation in the United States, the geographical origins of individuals with African or Asian ancestry living in other parts of the world (e.g., Europe, South America) are more heterogeneous. This is important because sub-Saharan Africans and Asians are clearly divided into multiple genetic subpopulations. Indeed, some populations from East Africa and West Africa are, on average, more different genetically than populations from Northern Europe and East Asia. Worldwide, notions of race capture only a modest amount of information about geographical ancestry and therefore population structure, and capture even less, in general, than ancestry inferences from explicit genetic data.

Conclusions
1. Ecogenetics aims to understand how genetic variation interacts with environmental factors to influence human phenotypic variation.
2. A priority is to identify variants that predict drug response and adverse drug reactions, as well as taste perception and nutritional effects.
3. Such traits are often caused by multiple variants that work together in the same physiological pathway or interact in different pathways.
4. Identifying these variants is now easier using new technologies that assay genome-wide patterns of variation and new statistical methods that test for interactions between genetic variants and environmental factors.
5. It is already clear that the frequencies and effects of many risk alleles influencing health co-vary more closely with ancestry than with race.
6. While race captures substantial information about ancestry in U.S. populations, overall it is a poor predictor of individual ancestry proportions. Instead, ancestry proportions should be inferred from geographical ancestry or, better yet, explicit genetic information.
7. Change in the clinical setting to reflect this reality will not come quickly or easily. It will require understanding the circumstances in which ancestry, rather than race, is a better predictor of disease risk or treatment outcome; developing convenient and cost-effective ways to assess ancestry (e.g., an office-based assay of ancestry-informative markers tested on DNA from saliva); and teaching clinicians how to interpret ancestry information appropriately.

Acknowledgments
A.G.M. is partially supported by N.I.E.H.S. Grant P30ES07033 (Center for Ecogenetics).


Human genetic variation of medical significance
Kenneth K. Kidd and Judith R. Kidd

Introduction
In this chapter we put human genetic variation into a global context and discuss several ways in which that global view of genetic variation relates to health. There is enough genetic variation that every independently conceived human is genetically unique. That uniqueness results from the alternative forms (alleles) that occur at varying sites (polymorphisms) in the basic human DNA pattern and the huge number of possible combinations of those alleles. The amount of DNA sequence variation can now be estimated, and while a very large number of varying positions exist, they constitute a small fraction of the genome. Most of this variation appears to be normal in that the various forms are present at common frequencies. However, common normal variation is not necessarily irrelevant to health, as examples in this chapter will illustrate. The variation does not just exist among individuals; populations from different parts of the world have different frequencies of those different alleles. Such frequency differences among populations are a result of evolution. While the causes of genetic differences among populations are many and sometimes difficult to determine, that population variation can be very important in understanding the different frequencies of medically relevant conditions across the globe. We begin by discussing how random factors during the history of modern humans established an overall global pattern of genetic variation, with individual genes showing a distribution around that overall pattern. This normal variation provides a baseline from which evidence of recent selection must be distinguished. Because chance events result in different patterns of frequency variation for independent genes, demonstrating that a specific pattern is the result of selection and not neutral evolution is not easy. We use an in-depth analysis of some of the genes involved in susceptibility to alcoholism to illustrate some of the important issues that arise in trying to understand the causes of allele frequency variation among populations and the relationship(s) of that variation to health. This example illustrates how both evidence of natural selection and population variation can affect our understanding of the genetic components of a complex disorder.

The pattern of human genetic variation
Modern humans reflect several recent evolutionary processes: a long period of evolution in Africa; the divergence of populations as humans spread around the world, becoming partially reproductively isolated; and the counteracting process of gene flow, whose impact has varied inversely with distance. Thus, populations in different geographic regions have both shared and unique evolutionary histories. Because modern humans are a young species, and populations derived from the expansion out of Africa are even younger, only strong selective forces are likely to have had a significant and detectable microevolutionary effect.



The amount and nature of human genetic variation
In a basic sense evolution is the change in allele frequencies, and genetic contributions to health and disease among modern humans often reflect genetic variation within the species. At the DNA level we are disconcertingly similar to our chimpanzee cousins, diverging in nucleotide sequence by about 1.2% (Chimpanzee Sequencing and Analysis Consortium 2005; Kehrer-Sawatski and Cooper 2007). Human-to-human variation is much less; the average difference among individuals at the DNA level is estimated to be less than 0.10% (equivalent to about 0.5–1.0% of our bases being variable with allele frequency more than 1%) (Goldstein and Cavalleri 2005; International HapMap Consortium 2005). Much more variation occurs, but only very rarely (at a heterozygosity of less than 1% in the world). While we will be emphasizing variation, it is important to keep it in perspective: all humans are more than 99.9% genetically identical. With such a trivial amount of difference at the DNA level between people, how can it be that each of us is genetically unique? The answer, of course, lies in the overwhelming number of bases contained in the genome—3+ billion. One-tenth of 1% of 3 × 10^9 still leaves over 3 million bases that are commonly variable in humans! Several million varying sites have already been confirmed to exist (dbSNP at http://www.ncbi.nlm.nih.gov/projects/SNP/). Most of these differences probably have no phenotypic effects—that is, no effects on appearance, disease resistance, response to drugs or food, development, how our DNA folds, response to environments, or anything else. In fact, most of these DNA differences must be essentially evolutionarily neutral, but 'most' of a very large number still leaves many polymorphisms that do affect phenotype and that are, have been, or may become subject to selection.
However, even some variation expressed in phenotypes may be selectively neutral because the differences do not affect reproductive success. DNA variation occurs as single-base differences, insertions and deletions (long, short, or intermediate, and tandem or dispersed), rearrangements, and others not so easily described (see Chapter 3). The

variants may be in non-coding, coding, or regulatory regions, or they may affect splicing or stability of mRNA. They may have been subject to natural selection (positive, negative, or balancing), and that selection may have been geographically restricted or global, and even if global, of different strength in different geographical regions or having taken place at different times. For all these differences the mutation rates to new forms and the chances of independent recurrence of a form vary with type of polymorphism. Any given difference may be common or rare and may exist in a single person, in a single population, in populations originating in a restricted geographic region, or in all or most populations around the world. Thus, we have vast amounts of genetic variation and it is not uniformly distributed around the world. Obviously, selection can change the frequencies, but selection is not the only factor. Pure chance can also play a role (cf. Chapter 1).

The human expansion out of Africa
Modern humans have only recently expanded to occupy almost all the world, reaching the Americas only roughly 20,000 years ago (the exact times are quite controversial) and Western Europe only about 40,000 years ago. Prior to 80,000 to 100,000 years ago there were no modern humans outside of Africa. This relatively recent and rapid expansion has left a significant imprint on our genome, with major geographical differences and biomedical and health implications. The most reasonable model for the expansion involves populations already in northeast Africa (e.g., present-day Ethiopia, Somalia) crossing into southwest Asia. Groups then repeatedly moved beyond the existing frontiers, with the population left behind growing in size (Liu et al. 2006). The consequence for genetic variation of such an ongoing process is accumulation of sampling error: random genetic drift. Today we see the consequences—the largest amount of genetic variation in Africa, much less variation outside of Africa, with gradual decreases in variation from southwest Asia into Europe, from western to eastern Asia and into the Pacific, and finally from somewhere in Siberia into the Americas. Such a pattern of expansion is consistent with archeological data but is most strongly supported by genetic data. Average heterozygosity is higher in African populations than non-African populations, but this is difficult to demonstrate with all genetic polymorphisms because of a seemingly pervasive European ascertainment bias (many polymorphisms were originally discovered because they had high heterozygosity in Europeans), especially for single nucleotide polymorphisms (SNPs). From a purely statistical perspective, heterozygosity declines more slowly than the number of alleles (Crow and Kimura 1970). Nevertheless, a definite decline in heterozygosity with increasing distance from Africa is apparent for short tandem repeat polymorphisms (STRPs) and haplotypes (Bowcock et al. 1994; Calafell et al. 1998; Zhivotovsky et al. 2003; Tishkoff and Kidd 2004). Linkage disequilibrium in indigenous populations also increases as overall variation decreases in the 'distance from Africa' pattern (Tishkoff and Kidd 2004; Conrad et al. 2006). One fact about all this variation stands out: the frequencies of these differences in our DNA almost always differ in different populations (see examples below). Whether a new mutation is not transmitted to future generations and dies out, rises in frequency, or completely replaces the original form of the DNA is generally unpredictable except in terms of probability, irrespective of any selection that may be operating—the stochastic factors described in Chapter 1 are always present. The causes of these allele frequency differences are probably as varied as the polymorphisms themselves. The specific causes include (though are not limited to) the demography of the population in which the variant originated, the history of migrations and population relationships, mutation rate, natural selection, and, perhaps most important, random genetic drift, the stochastic sampling of alleles from one generation to the next.
A tree representation of population relationships based on variation at a large number of presumably neutral loci is given in Fig. 5.1. To the degree that many assumptions hold, this tree can be seen as a reflection of the relationship of populations in Africa, expansion from East Africa (Chagga, Maasai, Ethiopians) into southwestern Asia, then branching into Europe proper, as well as continuing across Eurasia with one lineage diverging north and into the Americas and a separate lineage diverging into the Pacific, Southeast Asia, and East Asia. Modern genomic methods being used to detect signatures of natural selection should be considered against this background of the global pattern of human genetic variation.

The impact of genetic variants—or lack of it
In the context of extensive common variation it is usually not possible to characterize any specific allele as abnormal unless it is one of the rare alleles actually 'causing' a serious disorder. Alleles that increase susceptibility to medical conditions are not inherently abnormal—they exist also among normal, healthy individuals, and they may have a beneficial effect in other circumstances. A common allele that produces an enzyme with altered activity, for example, is normal in that wide ranges of activity levels can be fully compatible with a healthy individual. In the context of polymorphism there is a tendency in the medical literature to label the more common allele the 'wild type' and the other alleles as variants or mutants. However, what is more common in one part of the world may be less common in other parts of the world. The distinction also implies 'normal' versus 'abnormal,' a biologically erroneous distinction. Evolution teaches us that variation among individuals is normal. The effects, if any, of genetic variation are numerous: a variant may contribute to disease, malformations, or reproductive failure; protect against particular diseases or disorders; cause differential drug responses; produce different morphological features; have no or unknown effects; or have context-dependent effects (cf. Chapter 4). The effect(s) may be due predominantly to variation at a single polymorphic site (as in Tay-Sachs disease in Ashkenazi Jews); at two or more sites in the same gene (e.g., PKU and most other 'Mendelian' diseases) (see Kidd et al. 2000; Kidd and Kidd 2003); or at two or more genes independently having the 'same' effect (e.g., disorders of heterogeneous etiology such as hereditary deafness; reviewed in Petersen and Willems 2006). They may also be due to variation at polymorphic sites in two or more


[Figure 5.1 appears here: a tree of population relationships with labeled tips for the populations analyzed; symbols on the branches give bootstrap values based on 1,000 replicates (100%, >95%, 90–95%, and 85–90%).]
Figure 5.1 A tree representation of population similarities. This analysis is based on allele frequency similarities/differences at 149 loci. The branch lengths can be considered to represent the amount of evolution (random genetic drift) and the branching pattern can be considered to represent the net divergence among populations. Thus, high amounts of gene flow will result in small branches, whereas reproductive isolation for long periods will generate long branches. Since the amount of random genetic drift is inversely related to population size, very small isolated populations have long branches, as seen for the Karitiana and Surui, two small tribes in Amazonia. While there are clear clusters in this tree, note that sampling of populations is also geographically clustered (Kidd et al. 2004) with virtually no sampled populations between the eastern end of the Mediterranean and far East Asia.

genes operating jointly, or interactively with neither alone being sufficient (e.g., skin color; Myles et al. 2007). Whatever the physical characteristics, causes, or effects, the allele frequencies, geographic distributions, or genetic organization of this variation, the outstanding fact remains that polymorphisms are present evidence of our evolutionary history as a species, including both our demographic history and the history of selective forces that have operated. As emphasized in Chapter 1, selection can only take place where and when both the genetic variation and the selective agent are present, and when demography keeps blind chance (or, more formally, random genetic drift) at a sufficiently low level.

The role of selection
Most selection is for conservation of function (and hence DNA sequence), rather than selection for a new variant. Such selection, called purifying selection, is demonstrable for most protein-coding segments of genes in that they show less DNA sequence variation than the introns and intergenic sequences. Nevertheless, it is also clear that positive (directional) and balancing selection do occur. Examples have been found in which the selective agent, gene, and effect have all been identified (e.g., malaria, HBB, and sickle cell anemia—balancing selection; nutritional requirement, lactase, and adult lactose tolerance—positive selection). Both examples point to the fact that selection may be operating in some parts of the world but not others. It should also be noted that selection on specific alleles could be operating in the same direction globally or under different influences in different parts of the world. The selective advantage or disadvantage of alleles at one locus can also depend on the genotype at another locus (epistasis). How do we recognize when and where selection may be taking place? One clue (albeit a weak one) to historical intraspecies positive selection is the existence of large allele frequency variation among populations from different parts of the world, that is, polymorphisms with high values of the standardized variance statistic, Fst (Akey et al. 2002; Bamshad et al. 2002; Bamshad and Wooding 2003; Akey et al. 2004; Bersaglieri et al. 2004). This approach focuses on selection that has operated in a subset of the world's populations to change the allele frequency in only one region. Likewise, extremely low global Fst (a measure of population differentiation) across a genomic region might indicate the presence of balancing selection.
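For a single diallelic locus, Fst can be illustrated with a minimal calculation based on the classic definition Fst = (H_T − H_S)/H_T, where H_T is the expected heterozygosity of the pooled population and H_S the average within-population heterozygosity. The sketch below uses made-up allele frequencies, not data from this chapter:

```python
# Fst = (H_T - H_S) / H_T for one diallelic locus, where H = 2p(1-p)
# is the expected heterozygosity at allele frequency p.
def fst(freqs):
    p_bar = sum(freqs) / len(freqs)           # pooled (average) frequency
    h_t = 2 * p_bar * (1 - p_bar)             # total expected heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-pop
    return (h_t - h_s) / h_t

print(fst([0.50, 0.52]))  # nearly identical populations: Fst near 0
print(fst([0.10, 0.90]))  # strongly diverged populations: Fst = 0.64
```

Alleles with similar frequencies everywhere give Fst near zero; the more the frequencies diverge among populations, the more of the total heterozygosity is explained by between-population differences.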
While the logic is clear, the problem is determining when an allele shows a large frequency difference due to selection and when it is just in the upper (or lower) tail of the chance distribution of variation around the world. Figure 5.2 illustrates this problem with the distribution of Fst values for 382 diallelic polymorphisms (SNPs) across 42 populations from around the world. None of these 382 polymorphisms has been implicated as a suspected site of natural selection, nor were these SNPs chosen for this distribution based on Fst. Therefore, we regard Fig. 5.2 as a reference Fst distribution, representing the chance distribution of variation around the world.

The impact of population bottlenecks on genetic patterns
As pointed out by Biswas and Akey (2006), the statistical significance for several of the tests of within-species selection is affected by historical demography. Many such tests consider deviation from the equilibrium neutral model. One can easily argue that the very recent major demographic differences among human populations make the assumptions of the neutral model and equilibrium very tenuous. Both the wide variation in global Fst values and their skewed distribution (Fig. 5.2) argue against a simple model, as do the quite diverse patterns of allele frequency variation among populations seen for loci with similar Fst values. The history of recent human expansion with some apparently major founder events (out of Africa, into the Americas) and seemingly multiple smaller but cumulative events (e.g., across Eurasia) has introduced a considerable stochastic element into allele frequency patterns and patterns of linkage disequilibrium. While selection may have affected some loci, it is often possible to find multiple unlinked loci showing a similar pattern, which suggests that the historical demography may be a more likely explanation than a similar distribution of selective forces. For example, although its Fst is high, the global pattern at the G37995C SNP at microcephalin (Evans et al. 2005) is not unique and differs little in qualitative pattern from the SORCS3 SNP in Fig. 5.3. Thus, the Fst and global distribution of allele frequencies can only be considered suggestive. Similar cautionary comments can be made about the conclusion of 'ongoing adaptive evolution' at the ASPM locus (Mekel-Bobrov et al. 2005; see Currat et al. 2006). Sabeti et al. (2006) review methods and pitfalls of identifying and statistically evaluating candidate




[Figure 5.2 appears here: histogram of Fst values for 382 markers in 42 populations; x-axis, upper bound of Fst interval (0.025 increments); y-axis, count of markers.]
Figure 5.2 The Fst distribution of presumably neutral SNPs. The populations are the same as in Figs. 5.1 and 5.3. Note that the distribution is quite wide: some SNPs show little variation around the world while others show considerable variation. At the extremes of this distribution we expect to see variants subject to balancing selection (very low Fst) and variants subject to positive selection in one region of the world (very high Fst). However, polymorphisms can be at either extreme by chance alone, as seems likely for the polymorphisms included in this analysis.

genes for natural selection. Their recurrent theme is the difficulty of distinguishing the effects of population demographic history (bottlenecks, expansions, isolation; random genetic drift) from positive natural selection. The confounding of the effects of natural selection and random genetic drift is easy to demonstrate with simulations showing that if selection is weak, then random factors (founder effects, genetic drift) in a small population or accumulating at the front of an advance can alter allele frequencies much more than selection, even causing loss of the favored allele. (To see for yourself go to http://krunch.med.yale.edu/popgen and try it.) Cystic fibrosis (CF) in northwest Europe is likely one example of a chance increase in a deleterious allele. One deleterious allele is quite frequent in this region although the homozygote was effectively lethal prior to modern medical treatment. Many have speculated that positive selection on the heterozygotes would have overcome the clear negative selection on the homozygote. An alternative explanation is that random genetic drift resulted in the high frequency, since most CF alleles exist in healthy heterozygotes. With many DNA polymorphisms that are seemingly neutral we do see a cline from southeast into northwest Europe, a pattern of past population expansion. Is that proof that positive selection did not increase that particular allele? No, but it illustrates the difficulty of demonstrating positive selection against the background of random variation across an overall geographic pattern.
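That demonstration is easy to reproduce offline. The sketch below is our own illustration (not the simulator at the Yale site mentioned above); the population size, selection coefficient, starting frequency, and replicate count are arbitrary assumptions chosen so that drift dominates weak selection.

```python
import random

# Wright-Fisher model with genic selection: each generation, the 2N gene
# copies of the next generation are drawn independently with success
# probability p(1 + s) / (1 + p*s), i.e., selection followed by drift.
def wright_fisher(p0, s, n, rng, max_gens=10_000):
    p = p0
    for _ in range(max_gens):
        if p <= 0.0 or p >= 1.0:   # allele lost or fixed
            break
        p_sel = p * (1 + s) / (1 + p * s)
        p = sum(rng.random() < p_sel for _ in range(2 * n)) / (2 * n)
    return p

rng = random.Random(2008)
runs = [wright_fisher(p0=0.05, s=0.02, n=50, rng=rng) for _ in range(500)]
lost = sum(r == 0.0 for r in runs) / len(runs)
print(f"favored allele lost in {lost:.0%} of replicates")
```

With these numbers, diffusion theory predicts fixation of the favored allele in only ~18% of replicates, i.e., loss in roughly four runs out of five despite its selective advantage.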

Disease can cause bottlenecks
The previous discussion and those in other chapters have focused on random genetic drift as an alternative to selection driven by health problems



[Figure 5.3 appears here: allele frequencies (0–1.0) plotted for ~40 populations ordered from Africa through Europe and Asia to the Americas, one line per SNP. Legend: ADH1B rs1229984 Arg47His, Fst = 0.478; ALDH2 rs671 Exon12, Fst = 0.249; CCR5 rs333 32-bp indel, Fst = 0.077; MCM6 rs182549 HinP1I (−13.9 kb, LCT), Fst = 0.432; SLC45A2 rs16891982, Fst = 0.722; SORCS3 rs7914674, Fst = 0.525; IGSF4 rs10488710, Fst = 0.024.]

Figure 5.3 Allele frequency distributions among populations. The frequency of one allele at each SNP was chosen to graph. The ALDH2 and ADH1B SNPs are discussed in the text. The SLC45A2 SNP has a very large Fst and is thought to be related to skin color in Europe. The MCM6 SNP is thought to be the regulatory variant responsible for adult persistence of lactase in Europeans. The CCR5 deletion confers resistance to the AIDS virus. Two presumably neutral SNPs from the tails of the distribution in Fig. 5.2 have also been graphed. The IGSF4 polymorphism, rs10488710, represents a marker with very low Fst; the SORCS3 polymorphism, rs7914674, represents a marker with very high Fst but with a global distribution very different from that of the ADH1B, ALDH2, and SLC45A2 polymorphisms thought to have undergone region-specific selection. Illustrations of gene frequency variation among most of these same populations for other markers with high and low Fst can be found in Kidd et al. (2006). Thus, we conclude that global pattern is, per se, a very poor basis for implicating selection. However, that pattern is not irrelevant to disease, since alleles relevant to health and disease can vary considerably among populations by chance alone.

and disease. However, there is another, perhaps very important, way in which disease—most likely in the absence of selection—may drive evolution: when disease strikes a naïve population, suddenly causing a high death rate, it can reduce that population to such a small size that random genetic drift becomes very important. For generations after a severe genetic bottleneck caused by disease (i.e., a constriction in population size), allele frequencies across the entire genome will differ considerably from those in the rest of our species. It is particularly interesting that such a catastrophic population crash affects the entire genome, not, as with natural selection, just the region around a selected gene. The number of generations that the population remains small determines how strong the effect of the bottleneck will be: rapid recovery minimizes the effects, while slow recovery maximizes them. When the bottleneck is severe, survival of just a few heterozygotes for an allele that is highly deleterious when homozygous can increase its frequency to several percent in that population. Examples are likely to include the Native American and Polynesian exposure to the diseases of the Old World upon and after initial European contact.
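The claim about recovery speed can be quantified with the standard expectation that a population of size N loses a fraction 1/(2N) of its heterozygosity each generation. The population sizes and durations below are purely illustrative assumptions, not values from the chapter:

```python
# Expected fraction of heterozygosity retained after t generations at
# constant size N: H_t / H_0 = (1 - 1/(2N)) ** t
def retained(n, t):
    return (1 - 1 / (2 * n)) ** t

print(round(retained(n=50, t=2), 3))    # prints 0.98: brief bottleneck, little loss
print(round(retained(n=50, t=100), 3))  # prints 0.366: prolonged small size, heavy loss
```

The same bottleneck size thus has a mild genomic effect if the population rebounds within a few generations, but erases most variation if it stays small for many.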

Migration out of Africa
The bottleneck that occurred in the expansion out of Africa was the most significant evolutionary event in the history of modern Homo sapiens. While it established a common genetic framework for all non-African populations, much of the genetic variation from which this sample was taken still exists in Africa. Indeed, the majority of all genetic variation still exists in Africa. While the major effect of the severe bottleneck was considerable loss of variation in all non-African populations, there are clear examples of a disease susceptibility being established by an 'allele' exceedingly rare in Africa becoming common in all non-African populations. One such example is myotonic dystrophy, which is caused by a trinucleotide expansion affecting expression of the myotonic dystrophy gene (DM). As shown by Tishkoff et al. (1998), the large normal alleles with high repeat numbers at the trinucleotide short tandem repeat polymorphism (STRP) at the DM gene do not occur in sub-Saharan populations but do exist in Northeast Africa at low frequencies and at much higher frequencies in all non-African populations. These large normal alleles are prone to mutate to even more repeats; the resulting abnormally large alleles are not associated with any disorder but are genetically unstable and expand by mutation until they are sufficiently long to cause myotonic dystrophy. Thus, myotonic dystrophy occurs in all non-African populations as a 'rare' genetic disease, but is essentially absent in native populations of sub-Saharan Africa.

Complex disease and evolution

Research directed toward understanding complex traits, disorders, and diseases is a major biomedical priority. An evolutionary perspective on how genetic variation causes individuals to differ in risk can be very illuminating. For example, we no longer consider adult lactose intolerance to be a genetic ‘disorder.’ We now recognize that persistence of lactase activity into adulthood (i.e., adult lactose tolerance) evolved recently in a few populations; people in most of the world have the normal trait, lactose intolerance in adulthood or lactase non-persistence (for a review see Swallow 2003). It is now clear that the persistence of lactase production is evolutionarily complex in that different alleles are involved in different populations (see Chapter 4). Essentially all genetic diseases are ‘complex’ in one way or another. For example, phenylalanine hydroxylase deficiency (phenylketonuria, PKU) was long considered a simple metabolic disorder inherited in Mendelian fashion and caused by a mutation disrupting production of the enzyme phenylalanine hydroxylase (PAH). Years of study have revealed that many different mutations in the PAH gene can cause PKU (see http://www.pahdb.mcgill.ca for a compendium of mutations) and that different mutations cause different symptoms and expected outcomes. Many (more likely most) ‘simple Mendelian’ diseases exhibit the same complexity in molecular etiology. For example, what were thought to be three different Mendelian diseases—multiple endocrine neoplasia types 2A and 2B and papillary thyroid carcinoma—are different sets of mutations at the RET gene (Santoro et al. 1995; Eng 2000). Other mutations at the same gene increase susceptibility to Hirschsprung disease (Pasini et al. 1995; Parisi and Kapur 2000), which exemplifies a whole category of diseases, disorders, and traits that are far more complex in etiology than a single locus and that tend to affect far more people.
The non-Mendelian or complex diseases are characterized by a large environmental component, the suspicion that more than one gene may be involved in the expression even in the simplest cases, and the strong likelihood that there is
etiological heterogeneity in the disease—that is, ‘the disease’ is really several diseases with different complex causes leading to similar clinical manifestations. These diseases continue to have a very strong impact on public health and remain major challenges for geneticists today. They include schizophrenia and alcoholism, as well as many diseases associated with aging, such as cardiac disease, Alzheimer’s disease, and essential hypertension (cf. Chapter 23).

Genetic influences on alcoholism

Alcoholism is a particularly good example of a complex disease (Erickson 1992). It is a major public health problem with well-documented genetic components (e.g., Goodwin et al. 1974; Kendler et al. 1994) that interact with many social and behavioral components. The genetic components are not all clearly understood, but several are being identified. Though there are numerous hypotheses about what functions and even which genes might be involved in alcohol dependence (American Psychiatric Association 1987), some genes clearly involved in the risk of becoming alcoholic have been identified, and their alleles differ in impact. That an allele at one locus appears to have been selected in some populations emphasizes the importance of the evolutionary perspective. Interestingly, none of these
genes are involved in addiction, per se. Instead, they are involved in regulating how much alcohol an individual is likely to drink, primarily by making it less likely that individuals with certain genotypes will ever drink much: those that drink little are unlikely to become addicted. Such genetic variants regulating behavior are part of the genetic architecture underlying risk of alcoholism in both individuals and families. What are these genes? Three are involved in ethanol metabolism and two others appear to be involved in how one perceives the taste of ethanol. These five genes and their variants that are related to ethanol consumption through metabolism and taste are listed in Table 5.1 along with links into a database—ALFRED—that contains information on their allele frequencies in different populations. Because several variants have quite different frequencies in different parts of the world, their relative importance in the risk of alcoholism varies geographically.

Variation in ethanol metabolism and alcoholism

Genes encoding enzymes that metabolize ethanol are ubiquitous in animals because of the pervasive presence of ethanol in diets. Clear evidence of selection to maintain function is shown by their molecular similarities

Table 5.1 Polymorphisms associated with alcoholism

Polymorphism identifier in dbSNP   Amino-acid variation   Site UID for frequency data in ALFRED
rs1229984                          Arg47His               SI000229N
rs2066702                          Arg369Cys              SI000230F
rs1693482                          Arg271Gln              SI000735P
rs698                              Val349Ile              SI000228M
rs35719513                         Pro351Thr              SI000736Q
rs671                              Glu487Lys              SI000734O
rs713598                           Pro49Ala               SI000882S
rs1726866                          Ala262Val              SI001020D
rs10246939                         Val296Ile              SI000883T
haplotype                          3-site haplotype       SI001087Q
rs846664                           Lys172Asp              SI004073O

To be confirmed by additional studies.

All of these show allele frequency variation around the world; much of the population allele frequency data are in ALFRED (http://alfred.med.yale.edu) under the UID given in the table.



among species (Oota et al. 2007). Enzymes in the alcohol dehydrogenase family are primarily responsible for converting ethanol into acetaldehyde, a toxic substance. Fortunately, acetaldehyde is normally rapidly converted to acetate by aldehyde dehydrogenase. Acetate is a perfectly harmless molecule that enters into other aspects of metabolism. By the 1980s (Goedde et al. 1983; Goedde and Agarwal 1987) researchers had shown that a variant form of acetaldehyde dehydrogenase did not efficiently convert acetaldehyde to acetate; even the heterozygote with one normal and one abnormal allele had significantly reduced levels of enzyme activity. Thus, individuals heterozygous for the ALDH2*2 allele (also called ALDH2*487Lys), and even more so the homozygotes, break down acetaldehyde very slowly. Before deliberate fermentation, levels of ethanol in the diet were low, and such metabolic variation probably had little significance. However, the high transient levels of acetaldehyde that follow consumption of larger amounts of alcohol result in physiological reactions that are often very unpleasant. Along with the examples in Chapter 4, this illustrates how genetic variation can have context-dependent consequences with culture and technology (in this case brewing) altering the context. As can be seen in Fig. 5.3, this variant form of ALDH2 is found only in individuals whose ancestry comes from Eastern Asia (Oota et al. 2004). It is considered the primary factor responsible for the flushing reaction (the mildest of the physiological reactions to high acetaldehyde) seen in many individuals of East Asian ancestry when they consume even small amounts of alcohol (Harada et al. 1981). When larger amounts are consumed, such individuals may have other adverse effects, and the homozygotes can become quite ill from even small amounts of alcohol. 
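The bottleneck at aldehyde dehydrogenase, which lets acetaldehyde transiently accumulate, can be made concrete with a toy two-step Michaelis–Menten sketch; all rate constants, concentrations, and units here are hypothetical, chosen only for illustration.

```python
def peak_acetaldehyde(v_adh, v_aldh, ethanol0=50.0, km=1.0,
                      dt=0.01, t_end=200.0):
    """Euler-integrate ethanol -> acetaldehyde -> acetate and return the
    peak acetaldehyde level. v_adh and v_aldh are the Vmax of the two steps."""
    ethanol, acetaldehyde, peak = ethanol0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        r1 = v_adh * ethanol / (km + ethanol)             # ADH step
        r2 = v_aldh * acetaldehyde / (km + acetaldehyde)  # ALDH step
        ethanol += -r1 * dt
        acetaldehyde += (r1 - r2) * dt
        peak = max(peak, acetaldehyde)
    return peak

normal = peak_acetaldehyde(v_adh=1.0, v_aldh=2.0)    # full ALDH2 activity
reduced = peak_acetaldehyde(v_adh=1.0, v_aldh=0.5)   # reduced, as in ALDH2*2 carriers
print(f"peak acetaldehyde: normal {normal:.2f}, reduced-ALDH {reduced:.2f}")
```

In this toy model, lowering ALDH Vmax (as in ALDH2*487Lys carriers) sends the transient acetaldehyde peak far above the normal case, consistent with the unpleasant reactions described in the text; raising ADH Vmax has a qualitatively similar effect.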
Thus, this variant of a gene involved in ethanol metabolism strongly influences the amount of ethanol that many people of East Asian ancestry are likely to drink. Alcoholism may have such a low frequency in East Asia because the frequency of this variant is quite high, up to 30% in some East Asian populations. It is interesting to speculate that this large number of individuals with a genetic aversion to alcohol has influenced the larger culture away from consumption of large
quantities of alcohol. The ALDH2*487Lys allele is unambiguously a genetic factor in the risk of becoming alcoholic. Homozygous individuals are prevented from consuming enough ethanol to become addicted, and heterozygotes tend to drink less than homozygotes for the full activity allele. Because ethanol occurs in a normal diet in the form of overripe fruit and naturally fermented juices, we may ask: Why is this variant frequent when it blocks an important metabolic pathway? Why is this variant frequent only in East Asia? These are interesting questions not fully resolved, as discussed below. Having discussed the blockage of the breakdown of acetaldehyde, we now turn to variation in the enzymes that convert ethanol to acetaldehyde. Three separate genes that arose by gene duplication early in primate evolution (Oota et al. 2007) produce the primary liver enzymes for conversion of ethanol to acetaldehyde. These three alcohol dehydrogenase genes—ADH1A, ADH1B, and ADH1C—exist in the proximal long arm of chromosome 4 in a tandem array of 75 kb in the middle of a cluster of similar genes expressed in other tissues or other life stages, or that produce enzymes primarily targeted at alcohols more complex than ethanol. One of these genes, ADH1B, is probably the most active and produces the enzyme responsible for most conversion of ethanol to acetaldehyde in the liver. The adjacent ADH1C gene (about 15 kb away from ADH1B) also seems important in adult alcohol metabolism. Both ADH1B and ADH1C have variants that result in different enzyme activities, and those variants have quite different global distributions. ADH1B contains a particularly interesting variant that alters an amino acid from Arginine (Arg) at the 47th amino acid to Histidine (His). This Arg47His polymorphism exists at moderate frequency in the Middle East and North Africa, at very low frequency in Europe, and at very high frequency in East Asia (Fig. 5.3). 
It does not appear to exist either in Native American populations or in sub-Saharan Africans (Osier et al. 2002). (Obviously, with human movements as large, easy, and long range as they have become over the past five hundred years, these patterns refer to general regions and to individuals whose ancestors derive from those geographic regions. Thus, the
variants at both ADH1B and ALDH2 can be quite frequent in individuals of East Asian ancestry living in Europe or the United States.) Many studies have reported a highly significant association between the Arg47His polymorphism and alcoholism: individuals having the ADH1B*47His allele are significantly less likely to become alcoholic (Li 2000). In addition to the ADH1B*47His allele, meta-analysis supports the conclusion that ADH1C*349Val is also significantly less frequent in alcoholics, not just in East Asians, where it is in strong linkage disequilibrium with the ADH1B*47His allele (Osier et al. 1999), but also in populations of European origin where the ADH1B*47His allele is rare to absent (Whitfield et al. 1998). The hypothesized relationship to alcoholism is that individuals with the ADH1B*47His and/or ADH1C*349Val allele(s) (both coding for enzymes with higher Vmax) are more prone to adverse reactions from consumption of ethanol, probably as a result of higher levels of acetaldehyde, which reduces the chances of such individuals becoming alcoholic because of reduced consumption (Thomasson et al. 1994). A complicating factor in many studies of genetic variation and complex diseases is the strong linkage disequilibrium (LD) that can exist around a marker being studied. A positive association can be the result of alleles at the marker being studied or of another variant, possibly unknown, in linkage disequilibrium with it. Just as recent human evolution resulted in allele frequency variation among populations, the amounts and patterns of LD also vary among populations (Sawyer et al. 2005; Conrad et al. 2006; Gu et al. 2007). In the case of alcohol metabolism, very strong LD exists between ADH1B and ADH1C in most populations. Consequently, in East Asia the two loci have the alleles with high Vmax in coupling, i.e., present on the same chromosomes. Thus, it is very difficult to separate the effects of the different loci.
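Linkage disequilibrium between two loci can be quantified directly from haplotype frequencies. A minimal sketch follows; the haplotype counts are invented purely to illustrate alleles in coupling, and are not real ADH1B/ADH1C data.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Compute D, D', and r^2 from counts of the four two-locus haplotypes.

    A/a and B/b are the alleles at the two loci; n_AB counts chromosomes
    carrying A at the first locus and B at the second.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = n_AB / n - p_A * p_B  # excess of the coupling haplotype
    if d > 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r2

# Hypothetical sample of 100 chromosomes with the two high-Vmax alleles in coupling
d, d_prime, r2 = ld_stats(n_AB=70, n_Ab=5, n_aB=5, n_ab=20)
print(f"D = {d:.4f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

With coupling this strong, the allele present at either locus largely predicts the allele at the other, which is why a population association at ADH1C can be driven entirely by the linked ADH1B variant.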
One method for such resolution was proposed by Valdes and Thomson (1997) and applied by Osier et al. (1999) to haplotype data involving ADH1B and ADH1C. By comparing alcoholics to controls in three different Taiwanese populations they demonstrated that the protective effect against alcoholism could be assigned entirely to the ADH1B*47His allele (then
designated ADH2*2); the equally strong population association of the ADH1C polymorphism was attributable to the LD. In Europeans, in contrast, the ADH1C variant has a clearly detectable association with protection against alcoholism (Whitfield et al. 1998). The phenotype of more rapid metabolism of ethanol to acetaldehyde is evolutionarily interesting because it has three different genetic causes, each with a very different geographic distribution. As noted (cf. also Fig. 5.3), the ADH1B*47His allele has high frequency in Southwest Asia and North Africa and in East Asia, is rare in Europe, and is absent in native sub-Saharan African and American populations. In contrast, another allele for a rapid metabolizing enzyme, ADH1B*369Cys, is frequent only in populations of sub-Saharan Africa and those populations derived from them, such as African Americans. The rapid metabolizing form of ADH1C is common only in Europe. Finally, though the activity of the enzyme it codes for is not yet known, there is a Native American-specific allele of ADH1C. The existence of two variants—ADH1B*47His and ALDH2*487Lys—that both function to increase transient acetaldehyde levels, and that are both at high frequency only in East Asia, has long been thought to be the result of natural selection, but what caused them to become frequent remains unclear (Goldman and Enoch 1990). That both have very high Fst supports, but does not demonstrate, region-specific selection (Osier et al. 2002; Oota et al. 2004). Han et al. (2007) took a different approach, the extended or long-range haplotype test developed by Sabeti et al. (2002). By showing that the LD on chromosomes with the ADH1B*47His allele extended much farther than on chromosomes with the other allele, Han et al. (2007) provided strong genomic evidence that selection has favored that allele in some East Asian populations. However, while Han et al. (2007) discuss several hypotheses for the selective force, there is no direct evidence for any of them.
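High Fst means an allele's frequency differs sharply among populations. A crude estimator can be sketched as follows; the frequencies are hypothetical, merely shaped like the East Asia-restricted pattern in Fig. 5.3, and no sample-size correction is applied.

```python
def fst(freqs):
    """Crude Fst: variance of subpopulation allele frequencies divided by
    p_bar * (1 - p_bar), where p_bar is the mean frequency across
    equal-sized subpopulations (no sample-size correction)."""
    p_bar = sum(freqs) / len(freqs)
    var = sum((p - p_bar) ** 2 for p in freqs) / len(freqs)
    return var / (p_bar * (1 - p_bar))

# A region-restricted allele versus one at similar frequency everywhere
regional = fst([0.25, 0.0, 0.0, 0.0])    # high in one region only
uniform = fst([0.05, 0.04, 0.06, 0.05])  # similar everywhere
print(f"region-restricted Fst = {regional:.3f}, uniform Fst = {uniform:.4f}")
```

An allele at 25% in one region and absent elsewhere yields an Fst well above the genome-wide human average, the kind of outlier that suggests, but by itself does not prove, region-specific selection.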

Variation in taste perception and alcohol dependence

It is clear that normal metabolic variants could cause problems when exposed to higher levels of
ethanol in the diet than had previously existed and result in lower consumption of alcohol and hence decreased risk of becoming alcohol dependent. It is not so obvious that normal variation in the ability to taste bitter substances would also affect alcohol consumption and dependence. Yet such seems to be the case. Duffy et al. (2004a) showed that genotypes for the two common alleles at TAS2R38, the bitter taste receptor gene responsible for the phenylthiocarbamide (PTC) taste polymorphism (see Chapter 4), were associated with ethanol intake. The two common alleles are actually haplotypes yielding proteins differing at three amino acids (Kim et al. 2003). While some individuals of each genotype drank little or no alcohol, some individuals of the homozygous non-taster genotype (AVI/AVI) drank more than any heterozygotes (PAV/AVI) and even more than any of the taster homozygotes (PAV/PAV). The association was significant and implied that taster homozygotes would be less likely to become alcoholic because they would not drink much. Perception tests (Duffy et al. 2004b) indicated that homozygous tasters perceived ethanol as bitter and ‘burning,’ as opposed to the sweet sensation reported by homozygous non-tasters, providing one explanation for the different drinking behaviors. Subsequent studies of TAS2R38 could not show an association, in European American families with alcoholism, with either alcoholism or the maximal number of drinks consumed in a 24-hour period (Wang et al. 2007). However, in African American women there was a significant association of genotype with consumption. The two studies used different measures and ascertained subjects differently; thus no clear replication of the Duffy et al. (2004a) finding has yet been done.
While the role of TAS2R38 in alcoholism is not resolved, taste still seems likely to be relevant because other members of the bitter taste receptor gene family are also polymorphic, and in one recent study alleles at TAS2R16 showed linkage to risk of alcohol dependence (Hinrichs et al. 2006). Neither TAS2R38 nor TAS2R16 shows significant evidence of region-specific selection, but both show allele frequency variation around the world. Exactly how that variation influences the differing rates of alcoholism remains to be shown.

This detailed discussion of alcoholism illustrates many ways in which evolution has shaped the genetic underpinnings of differing risks of complex disorders. Many polymorphisms alter the development of an individual and the myriad biochemical processes constituting life. Almost all are normal but can affect health in subtle ways. Whether any variant became common because of selection or random drift is an interesting research question, but from a health perspective the relevant point is that these variants exist. In different individuals, in different populations, and in different environments, these variants can affect the risk of common disorders.

Summary

1. Extensive amounts of normal genetic variation occur in humans.
2. Every independently conceived individual is genetically unique.
3. The distribution of genetic variation in populations is the product of a human evolutionary history that has included selection and random genetic drift influenced by migration, demography, and isolation.
4. The major features of genetic variation in modern humans are the large amount of variation in Africa and the loss of variation in the expansion out of Africa.
5. Against the background of allele frequency variation around the world, it is difficult to identify which loci have been subject to natural selection varying among geographic regions.
6. That normal genetic variation can affect health in surprising and complex ways is illustrated by how metabolism and taste perception are related to alcoholism.

Acknowledgments

The research presented here was supported in part by USPHS Grants AA009379 (alcohol-related genes) and GM57672 (population relationships). We thank the many hundreds of individuals who volunteered to give blood samples for studies such as this and the many scientists who have helped assemble the samples from diverse populations.


Natural selection and evolutionary conflicts



Intimate relations: Evolutionary conflicts of pregnancy and childhood

David Haig

Introduction

Haldane (1932a) warned against a fallacy that ‘has been responsible for a good deal of the poisonous nonsense which has been written on ethics in Darwin’s name.’ This was the common assumption that natural selection always makes an organism fitter in its struggle with the environment. Haldane recognized that competition within a species can favor characters that confer a competitive advantage but that are otherwise maladaptive. The competitive advantage disappears once competitors also possess the character, but the fitness costs remain. His examples of natural selection with potentially maladaptive outcomes included prenatal competition among the members of a litter for limited maternal care. Sibling rivalry is also present when offspring are produced one at a time, because parental expenditure of time and resources on one offspring means that less time and resources are available for future offspring (or for older offspring still receiving parental care). For this reason, Trivers (1974) argued that an evolutionary conflict exists between parents and offspring, with offspring selected to acquire more parental investment than parents are selected to supply. Haig (1993, 1996a) applied Trivers’ theory to interactions between human mothers and fetuses. He proposed that evolutionary conflicts between maternal and fetal genes could help to explain why pregnancy is often costly to a mother’s health. For many purposes, the best model for thinking about the prenatal relation between mother and fetus is the postnatal relation between the same genetic individuals. Mother and child have many

interests in common, but parenthood is rarely without conflict. When conflicts do arise, sometimes the mother gets her way, sometimes the child gets its way, and sometimes a compromise is negotiated. Some relationships proceed smoothly with little overt conflict; other relationships are tempestuous; and others are calm for the most part but with occasional storms. This chapter focuses on genetic conflicts during pregnancy: ‘Parental justice’ reviews the kinds of conflicts that can occur among the genes of mother and offspring; ‘Internal conflicts’ reviews conflicts that can occur within the genomes of mothers and offspring; ‘Credibility problems’ discusses how conflicting interests impede the exchange of information between mother and fetus; ‘Pregnancy termination’ considers the function of early pregnancy losses, the evolution of menstruation, and the control of parturition; ‘Maternal circulation’ presents a simple model of conflict over the distribution of maternal blood during pregnancy; ‘Preeclampsia’ applies this model to understanding the causes of a major disease of pregnancy; and ‘Growth’ considers ways in which conflicts between generations may have influenced the evolution of human postnatal growth.

Parental justice

How can there be evolutionary conflict between parents and offspring if an offspring’s genes come from its parents? Here is an attempt to answer this question that emphasizes the information available to genes. Consider a family of full-sibs. All of the genes of the offspring are sampled from genes of
the parents, but genes in the parents usually have no way to tell which offspring have inherited their copies. Parental genes therefore make decisions about the allocation of resources to individual offspring behind a ‘veil of ignorance’ that ensures each gene has the same probability of being present in each offspring. When no gene knows in which offspring it is present, each gene’s interests are best served by choosing the allocation that maximizes the expected number of surviving offspring. This allocation of resources is ‘just’ in the sense that no parental gene favors offspring with its own copies. As a consequence, natural selection maximizes the combined fitness of offspring considered as a group. The veil of ignorance is partially lifted once genes find themselves in offspring. A gene is definitely present in the offspring in which it is expressed and can take actions that benefit this offspring at the expense of its siblings. Thus, if the genes of offspring, rather than the genes of parents, control the allocation of parental investment they are subject to an evolutionary ‘tragedy of the commons.’ All genes will be worse off if all pursue their own interest, but unilateral restraint will be exploited. As a consequence, genes expressed in offspring have been selected to obtain levels of parental investment higher than is favored by genes expressed in parents.
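Trivers' argument can be made concrete with a toy model. Suppose parental investment m gives the focal offspring a benefit b(m) = 2√m while imposing a cost c(m) = m on the parent's other offspring. A gene expressed in the parent weights benefit and cost equally, whereas a gene expressed in the offspring discounts the cost by its relatedness to full sibs (1/2). The benefit function and numbers are invented purely for illustration.

```python
import numpy as np

def optimal_investment(cost_weight, m=np.linspace(0.0, 10.0, 1_000_001)):
    """Investment maximizing 2*sqrt(m) - cost_weight * m on a fine grid.

    cost_weight is how heavily costs to the parent's other offspring are
    valued: 1 for a gene expressed in the parent, 1/2 (the relatedness of
    full sibs) for a gene expressed in the focal offspring.
    """
    return m[np.argmax(2.0 * np.sqrt(m) - cost_weight * m)]

parent_opt = optimal_investment(1.0)     # parental gene: full weight on sibs
offspring_opt = optimal_investment(0.5)  # offspring gene: sibs discounted by r = 1/2
print(f"parent optimum ~{parent_opt:.2f}, offspring optimum ~{offspring_opt:.2f}")
```

The offspring's optimum (m = 4 here, versus the parent's m = 1) exceeds the parental one precisely because the offspring devalues costs to siblings by the one-in-two chance of sharing the gene; that gap is the quantitative core of the conflict.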

Internal conflicts

For the most part, this chapter considers conflicts between genes expressed in mothers and genes expressed in offspring, where the genes of each individual are assumed to act in concert and express unitary interests. However, conflicts are possible within the genomes of mothers and within the genomes of offspring when genes possess information about their distribution among a mother’s offspring. The alleles at a maternal locus can be classified as inherited or non-inherited with respect to a focal offspring. Inherited and non-inherited alleles have strongly opposed interests with respect to the focal offspring. Inherited alleles are definitely present in this offspring, but have only one chance in two of being present in each of the mother’s
other offspring. Therefore, such alleles would benefit from preferential allocation of maternal resources to the focal offspring. By contrast, noninherited alleles are absent from the focal offspring and gain no benefit from its survival. Such alleles would benefit from the focal offspring’s early demise if resources were thereby redirected to other offspring who have some chance of receiving copies of the non-inherited alleles. From the perspective of non-inherited alleles, any investment in the focal offspring is wasted and to be avoided, if possible. Maternal genes able to ‘peek’ behind the meiotic veil of ignorance and learn which offspring carry their copies would therefore be selected to favor offspring with their copies over offspring without. Maternal genes ‘discriminate’ among offspring on the basis of offspring genotype whenever there is an interaction in effects on fetal fitness between alleles at a locus expressed in the mother and alleles at a locus expressed in the fetus (intergenerational epistasis). Such interactions become nepotistic if the two loci are closely linked, because then natural selection generates linkage disequilibrium, favoring haplotypes that preferentially direct resources to fetuses carrying a copy of the haplotype (or away from offspring that do not inherit the haplotype). By this process, a maternal haplotype could come to ‘recognize’ its replicas in offspring and to direct benefits to these replicas (Haig 1996b). Two factors probably mitigate the effects of conflict between inherited and non-inherited alleles in mothers. First, intergenerational epistasis between closely linked loci may be rare: most maternal genes make decisions about the allocation of resources to individual offspring behind the meiotic veil of ignorance. Second, the maternal genome may police itself. 
When conspiratorial epistasis is present in one chromosome segment, most genes in the maternal genome will segregate independently of the rogue segment and will be selected to suppress its effects. Genes expressed in a focal offspring evolve to discount costs and benefits to a parent’s other offspring relative to costs and benefits to the focal offspring. The appropriate discount rate is determined by the probability that a gene in the focal offspring is also present in these other offspring. Whenever
females have offspring by more than one father, physiological costs imposed on a mother by an offspring will usually have a greater impact on the mother’s other offspring than on the father’s other offspring. Therefore, conflict can arise within the genomes of offspring if genes are privy to information about their parental origin. Most genes lack such information and operate behind an epigenetic veil of ignorance. However, some genes retain this information and have expression patterns that are conditional on their parental origin. Such ‘imprinted’ genes should promote increased demands on mothers when they are paternally derived and reduced demands when they are maternally derived (Haig 2004).

Credibility problems

Communication between parties with conflicting interests is more difficult than communication between parties with identical interests. Messages cannot always be trusted if sender and receiver have conflicting interests, and untrustworthy messages are often best ignored (Maynard Smith and Harper 2003). These problems are exacerbated in maternal–fetal relations because there may be ambiguity about whether maternal or fetal cells are sending a particular message: a ligand detected by a receptor does not have a return address. One consequence is that maternal physiology during pregnancy will lack many of the checks, balances, and feedback controls present in the nonpregnant state. Some complications of pregnancy that endanger both mother and fetus may be the result of the inability of a physiologically threatened mother to credibly communicate her dire state to a demanding fetus. Messages between a mother and fetus can be credible in some circumstances, but these messages are likely to be simpler, and much less detailed, than the messages that can be conveyed between the genetically identical cells of a single body. Placental hormones may provide mothers with general information about offspring size and vigor—for this purpose one hormone is as good as another—but their production does not show the kind of temporal fluctuation that could communicate moment-to-moment variation in fetal need (Chard 1993).

Communication is problematic in pregnancy, not only between maternal and fetal cells but also between cells of the same genotype, because a cell may be uncertain about the genotype (fetal or maternal) of other cells it encounters and about the origin and trustworthiness of incoming signals. Thus, endocrine communication within the mother’s body is compromised by potential disinformation broadcast into the maternal circulation by the placenta. The fetus’s internal lines of communication are relatively secure because maternal hormones do not have equivalent access to the fetal circulation (Haig 1996a, c).

Pregnancy termination

Obstetricians have long recognized a continuum from shedding of the menstrual decidua, through early miscarriages and premature deliveries, to shedding of the decidua with the placenta at a term birth (Tyler Smith 1856). This section examines this continuum, as well as the evolutionary forces associated with the abandonment of infants immediately after birth.

Menstruation

Shedding of the lining of the uterus with visible loss of blood is rare among mammals, perhaps limited to anthropoid primates, a few bats, and elephant shrews. All of these species have invasive forms of placentation. Finn (1987) proposed that menstruation is a consequence of the initiation of the decidual changes of the endometrium independently of whether an embryo implants, where these changes function as a maternal defense against excessive invasion by trophoblast. ‘The advantage would seem to be in providing protection for the uterus in anticipation of the presence of a blastocyst rather than in response to it’ (Finn 1996). Evidence of early pregnancy losses at the time of menstruation suggests that a significant proportion of menses in our evolutionary past may have been associated with the loss of an implanted embryo. Clarke (1994) therefore proposed that menstruation functions as ‘a way of eliminating defective embryos before a pregnancy has proceeded very far.’ Menstruation could also function
to eliminate normal embryos when social or ecological conditions are unsuitable for pregnancy. In species with invasive implantation, elimination of an embryo requires simultaneous shedding of the attached endometrium. But why should this shedding be general rather than localized? One consequence of generalized shedding is that it obviates ‘stealth strategies’ in which embryos hide in the endometrium without signaling their presence. Sloughing the endometrium is an effective means of eliminating a single embryo but would be an indiscriminate form of embryo selection if it resulted in the loss of an entire litter. Menstruating species usually produce singletons. (Elephant shrews are an exception and produce twins: one from each uterine horn.) Many litter-bearing eutherians are able to reabsorb some members of a litter while leaving other embryos intact (Morton et al. 1982), perhaps serving a function analogous to selective embryo abortion.

Selective abortion

Many organisms initiate more offspring than complete development. The overproduction of offspring may be adaptive if it allows the selective elimination of offspring of low quality (Stearns 1987). In this view, parental fitness is enhanced by the death of a subset of offspring before major commitment of resources, if the average quality of survivors is higher than the average quality of non-survivors. The lower the cost of abortion relative to the cost of raising an offspring until independence, the more selective a parent can afford to be (Haig 1990). Such models can be rendered more sophisticated by allowing parental selectivity to vary in response to inputs of information about how favorable current circumstances are for reproduction (Wasser and Barash 1983). This section will discuss the extent of pregnancy loss in humans and consider whether some of these losses may be adaptive. The lining of the human uterus is shed two weeks after ovulation unless an implanting embryo produces sufficient chorionic gonadotropin (hCG) to block regression of the corpus luteum. The proportion of fertilized ova that fail to block menstruation is controversial, but it is nonetheless clear that the highest rate of attrition at any stage of the human
life cycle occurs in the first few weeks after conception (Holman and Wood 2001). Most of these losses occur without a woman being aware she is pregnant. Wilcox et al. (1999) used sensitive assays of hCG in maternal urine to detect early implantation. Fully 25% of conceptions detected by this method were lost within six weeks of the last menstrual period. Of the remainder, 11% miscarried. Such assays miss an unknown number of conceptions that fail to implant or whose hCG levels do not reach the critical value on the days sampled.

Somewhere between 10 and 20% of embryos that successfully block menstruation subsequently miscarry in the first trimester. The majority of these miscarriages have some form of chromosomal abnormality (Fritz et al. 2001). For this reason, the idea that most clinical miscarriages reflect the adaptive functioning of an evolved mechanism of quality control is now widely accepted. By contrast, very little is known about the genetic properties of earlier losses. A simple interpretation would be that embryos lost in the first few weeks are even more severely compromised than later losses. Clearly, some early losses must have severe abnormalities, but it is an untested assumption that all early losses are of this kind. The cost of a one-month delay in reproduction is much less than the cost of raising an offspring of low expected fitness. Therefore, maternal physiology may have evolved to select among early embryos on the basis of small differences that need only be weakly correlated with subsequent vigor. Clinical miscarriages would then be 'mistakes' that got through this initial screen but did not pass the second interview. Under this hypothesis, chromosomal abnormalities would be more common in late losses than early losses.

The metaphor of offspring as examination candidates (or job applicants) and the mother as examiner is useful for thinking about the action of natural selection in systems of selective abortion (Haig 1987).
An examiner’s aim is to design tests that provide accurate information as cheaply as possible. The examinee wishes to pass. If the examinee is indeed of high quality, she and the examiner have a common interest in conveying this information accurately, but, if the examinee is of poor quality, her best interest may be to dissemble.


How then can a mother obtain useful information about embryo quality? A requirement for the final product of a synthetic pathway tests all the steps of the pathway. Thus, the ability of a small embryo to make large quantities of hCG demonstrates the efficient functioning of the embryo's machinery of transcription, translation, and glycosylation. If mothers required more than some threshold level of hCG to block menstruation, then this requirement would constitute a crude screen of embryo quality. If the threshold were adjustable, mothers could raise the required level of hCG under ecological conditions less favorable for reproduction.

Any testing procedure selects for individuals that are good at passing tests, but there is rarely a perfect correlation between test scores and competence in the task for which the test is designed. (A test would be abandoned if there were no correlation.) Therefore, early pregnancy losses can favor genes with negative effects later in life solely because those genes are better than average at avoiding early losses, or because they can bias the testing procedure to handicap offspring that do not inherit their copies. The first possibility would be associated with higher than average fecundity (because the distorter causes embryos that would otherwise have failed, to pass) and the second with lower than average fecundity (because the distorter causes embryos that would otherwise have passed, to fail). Both possibilities would result in segregation distortion in the successful pregnancies of heterozygous mothers. Thus, prenatal advantages could explain how some alleles with deleterious postnatal effects are maintained at high frequency by natural selection, and segregation distortion (defined to include biased early pregnancy losses) could explain the persistence of common medical conditions that are associated with either reduced or increased interbirth intervals.
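The screening logic described here, in which even an imperfect test raises the average quality of survivors, can be illustrated with a toy simulation. Everything below (the distributions, the noise level, the threshold) is an assumption chosen for illustration, not a model taken from the chapter:

```python
import random

random.seed(2008)  # fixed seed so the toy example is reproducible

# Each embryo has a latent quality; its hCG-like test score is only a
# noisy, imperfectly correlated signal of that quality.
embryos = []
for _ in range(10_000):
    quality = random.gauss(0.0, 1.0)
    score = quality + random.gauss(0.0, 1.0)
    embryos.append((quality, score))

threshold = 0.0  # a hypothetical maternal hCG requirement

kept = [q for q, s in embryos if s >= threshold]
lost = [q for q, s in embryos if s < threshold]

def mean(xs):
    return sum(xs) / len(xs)

# Even a crude, noisy screen is selective: survivors of the threshold
# test are of higher average quality than the embryos that fail it.
print(mean(kept) > mean(lost))  # → True
```

Raising `threshold` discards more embryos but further raises the average quality of those kept, which is the sense in which an adjustable threshold would let mothers be more selective when conditions are unfavorable.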
Haig (1993) proposed that chorionic gonadotropins evolved as a fetal attempt to evade maternal mechanisms of pregnancy termination. This view is compatible with hCG also conveying useful information to mothers about embryo quality. The two perspectives address different questions. The first addresses the origin of chorionic gonadotropins: why embryos first produced a hormone. The second addresses the maintenance of the system:


why mothers continue to respond to the hormone. Credible signals of quality must be difficult to fake because embryos sometimes have an incentive to deceive mothers about their true quality. The credibility requirement is reflected in the high levels of hCG produced. If the purpose of hCG were solely to signal the presence of an embryo, the signal could be sent cheaply, because signals of presence are always credible (‘I speak therefore I am’).

Gestation length

The normal duration of human pregnancy is considered to be nine months (40 weeks, conventionally calculated from the start of the last menses), but infants born up to three months earlier had some chance of survival before the modern medical era. In nineteenth-century Britain, an abortion was defined as expulsion of the contents of the uterus during the first six months of gestation, because the fetus was considered non-viable prior to 28 weeks gestation but viable thereafter (Tyler Smith 1856). Conversely, fetuses can survive in the uterus for considerable periods beyond normal term, including an anencephalic fetus that was still alive after 389 days gestation (Higgins 1954).

An offspring's contribution to its mother's reproductive value presumably declines for gestation lengths longer or shorter than some optimal duration. If the cost to the mother's residual reproductive value is an increasing function of gestation length, then the length of gestation favored by genes expressed in fetuses will exceed that favored by genes expressed in mothers (Fig. 6.1). Who then controls the decision when to end a particular pregnancy? Mothers probably have effective control during early gestation, but fetal genes probably acquire greater influence as the placenta grows larger. At later stages, the human fetus appears to control when it comes out. This is suggested by cases of prolonged gestation associated with anencephalic fetuses (Anderson et al. 1969) and by cases of twins delivered weeks apart (Abrams 1957). If human fetuses control the time of delivery, then a fetus should remain in the womb for as long as intrauterine existence is more desirable than life on the outside. Therefore, one would expect longer gestations when intrauterine conditions are

relatively favorable but rapid delivery if the uterus becomes unsafe (once extrauterine survival is possible). Intrauterine infection is a major cause of premature delivery (Fields et al. 1996) and infants small for gestational age are more likely to be delivered early than appropriately grown infants (Gardosi 2005).

A human fetus is usually the sole occupant of its mother's uterus. In this privileged location, the fetus has a low risk of death (once it has survived the first trimester) and first call on its mother's nutrient reserves via a placental supply line that has substantial reserve capacity, at least until the final stages of pregnancy. Therefore, if the fetus controls when it leaves the uterus, there seems no particular reason why it should grow at the fastest rate possible if there are advantages of taking its time, either in terms of the quality of construction of its body or in reducing the demands on its mother's health. By contrast, in species that produce litters, competition within the litter will favor rapid development (Haldane 1932b). Among eutherian mammals, gestation length is negatively correlated with litter size, after controlling for maternal weight and total weight of the litter (Read and Harvey 1989). Primate pregnancies are characterized by low daily costs of gestation and slow fetal growth (Dufour and Sauther 2002).

Figure 6.1 [Plot of reproductive value against gestation length; asterisks mark the optima max(B − C) and max(B − rC) on the curves B and rC.] A mother's reproductive value is increased by a benefit (B) that is a hump-shaped function of gestation length. Her residual reproductive value is decreased by a cost (C) that is an increasing function of gestation length. Genes expressed in mothers are selected to maximize (B − C), whereas genes expressed in offspring are selected to maximize (B − rC), where r is a measure of the degree to which genes of offspring discount costs to a mother's residual reproductive value. Fetal genes favor longer gestations than maternal genes, although the difference is slight if the 'hump' of the benefit function is narrow and the rise in the cost function is gentle.
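The divergence of optima sketched in Fig. 6.1 can be checked numerically. The sketch below uses toy functional forms (a Gaussian benefit and a linear cost, chosen purely for illustration): for any discount r < 1, the gestation length maximizing B − rC falls later than the one maximizing B − C:

```python
import math

def B(t):  # hump-shaped benefit of gestation length (toy form)
    return math.exp(-2.0 * (t - 1.0) ** 2)

def C(t):  # increasing cost to the mother's residual reproductive value
    return 0.5 * t

r = 0.5    # offspring's discount on costs to the mother (0 < r < 1)

grid = [i / 1000.0 for i in range(2001)]  # gestation lengths in [0, 2]
maternal_optimum = max(grid, key=lambda t: B(t) - C(t))
offspring_optimum = max(grid, key=lambda t: B(t) - r * C(t))

# Fetal genes favor a longer gestation than maternal genes.
print(offspring_optimum > maternal_optimum)  # → True
```

Making the benefit 'hump' narrower or the cost slope gentler pulls the two optima together, in line with the figure caption's closing remark.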

There is a dramatic shift in power at the time of birth. Suddenly, the baby ventures out from the relative security of the womb into a world in which it is completely dependent on being accepted and cared for by another individual, usually its mother. During the course of human evolution, many infants were probably abandoned shortly after birth. Maternal psychology is expected to have evolved to ensure that decisions to accept or reject an infant were adaptive on average. Mothers presumably abandoned infants for a variety of reasons. Some reasons would have been intrinsic to the infant, such as low perceived quality with low chance of survival. Other reasons would have been extrinsic to the infant, such as unfavorable ecological conditions, lack of paternal or other support, and the existence of older children competing for maternal care. Offspring would have been under intense selection to express whatever characters would reduce their chances of rejection. Hrdy (1999, Chapters 19–21) has written eloquently about how discriminating maternal solicitude may have shaped the 'adorable' qualities of babies. In a subsequent section, the possibility that baby fat is the postnatal equivalent of hCG, carrying the message 'this is a baby worth keeping,' will be considered.

Anthropological, archeological, and historical evidence all suggest that killing or exposure of newborn infants was common during our ancestry (Langer 1974; Williamson 1978; Smith and Kahila 1992). Current levels of infanticide are historically low, probably because of advances in contraception, the hospitalization of births, the provision of prenatal and postnatal social welfare, and reduced social disapproval of out-of-wedlock births. Nevertheless, studies of modern infanticides can provide clues about the contexts in which mothers killed or abandoned infants in the past. Homicides of newborn infants, unlike those of older children, are usually committed by mothers (Overpeck et al. 1998; Herman-Giddens et al. 2003), with the relative risk of infanticide particularly high for teenage mothers who have given birth to a previous child (Overpeck et al. 1998). Infants killed on the first day of life in the United States have a high frequency of prematurity (Overpeck et al. 1999). Such data, by themselves, cannot disentangle whether mothers possess a psychological bias to abandon premature babies or whether mothers who abandon infants are more likely to have preterm deliveries.

Maternal circulation

A simple model of the maternal circulation during pregnancy is presented in Fig. 6.2. Cardiac output from the left side of the heart is shared between two subcirculations arranged in parallel: the uteroplacental subcirculation (resistance Rp) represents all maternal blood diverted through the intervillous space of the placenta; the non-placental subcirculation (resistance Rm) represents the systemic blood supply to maternal tissues. The fetal share of maternal cardiac output is determined by the ratio of the non-placental resistance to the sum of the resistances, namely

Fetal share = Rm / (Rm + Rp)

Thus, an increase in the resistance of the non-placental subcirculation will cause increased flow through the uteroplacental subcirculation, and vice versa. The uterine share of cardiac output is probably less than 1% in non-pregnant women (Kliman 2000), but this proportion rises progressively during pregnancy as fetal demand escalates. The uterus is estimated to receive 9% of maternal cardiac output


Figure 6.2 A simple model of the maternal circulation during pregnancy. Maternal cardiac output is shared between two subcirculations arranged in parallel: the placenta (Rp) and maternal tissues (Rm). The share of cardiac output available to the fetus is determined by the relative resistances of the non-placental circulation (Rm) and the uteroplacental circulation (Rp).


at 20 weeks gestation, 12% at 24 weeks, 14% at 28 weeks, 15% at 32 weeks, and 16% at 36–38 weeks (Konje et al. 2001). Pregnancy must therefore be associated with a progressive rise in the ratio of Rm to Rp. The theory of parent–offspring conflict predicts that this ratio will be physiologically contested, with placental factors predicted to increase Rm and decrease Rp, and maternal factors predicted to decrease Rm and increase Rp.

Conflict over Rp is largely played out during the first half of pregnancy. The arterioles that supply the endometrium lengthen during the first weeks after ovulation and become highly convoluted as they outpace the growth in thickness of the endometrium (Daron 1936; Maas et al. 2001). The increase in length and tortuosity of these 'spiral arteries' would, by itself, increase Rp. Moreover, their rapid growth greatly increases the size of the 'target' for modification by the initially tiny embryo, potentially slowing the early invasion of trophoblast and allowing the mother to prepare defenses in depth. The peri-implantation period is accompanied by decidualization of the endometrial stroma. This process is initiated around spiral arteries but then spreads to the rest of the endometrium (Frank and Kaufmann 2000). Decidualization involves the deposition of a tough pericellular matrix (Wewer et al. 1985) and has been conjectured to create a barrier that limits the extent of trophoblast invasion (Finn 1987; Haig 1993). During implantation, the spiral arteries are remodeled into large-diameter, low-resistance vessels. Vascular dilation occurs in the absence of trophoblast, but the full expression of the 'physiological changes,' including the loss of muscular and elastic elements from the vessel walls, depends on the presence of trophoblast (Kam et al. 1999; Kaufmann et al. 2003).
As a consequence of these changes, the principal mechanism of controlling regional blood flow (constriction of resistance vessels) is rendered non-functional in the uteroplacental subcirculation. Rp becomes a more-or-less fixed quantity, and increases in Rm will cause increases in uteroplacental share. Subsequent conflict over the fetal share of maternal systemic circulation is therefore predicted to focus on control of Rm (the non-placental resistance). If Rm is increased without a change in Rp, then cardiac output and



the arteriovenous pressure difference must rise to maintain non-placental blood flow to maternal tissues. Uteroplacental blood flow would then increase in proportion to the increase in arterial pressure (Yuan et al. 2005; Haig 2006). Conflict over Rm is predicted to intensify as pregnancy progresses, because the nutritional requirements of the fetus steadily increase and because growth of the placenta confers greater power on the fetus to influence maternal physiology.

Maternal peripheral resistance and arterial pressure decrease early in pregnancy (Duvekot et al. 1993). The conflict hypothesis interprets these changes as a maternal adaptation to reduce the fetal share of cardiac output. Maternal blood pressure reaches its nadir in the second trimester before a progressive rise toward term (Redman 1989). The conflict hypothesis interprets the increase in maternal blood pressure during the third trimester as a fetal adaptation to direct extra maternal blood to the placenta. The hypothesis predicts that the factors responsible for the early decline in peripheral resistance should be of maternal origin, whereas the factors responsible for the late-pregnancy increase should be of placental origin.
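The parallel-resistance model reduces to a line of arithmetic, and it can also be run backwards: given the cardiac-output shares quoted from Konje et al. (2001), one can compute the Rm/Rp ratio they imply. A minimal sketch (the resistance values in the first check are arbitrary illustrative units):

```python
def fetal_share(r_m, r_p):
    # Two subcirculations in parallel: flow divides inversely with
    # resistance, so the uteroplacental share is Rm / (Rm + Rp).
    return r_m / (r_m + r_p)

def implied_ratio(share):
    # Invert the model: the Rm/Rp ratio implied by an observed share.
    return share / (1.0 - share)

# Raising Rm with Rp held fixed increases the fetal share, as the
# conflict hypothesis predicts for placental factors.
assert fetal_share(2.0, 10.0) > fetal_share(1.0, 10.0)

# The shares at 20 weeks (9%) and near term (16%) imply that the
# ratio Rm/Rp roughly doubles over the second half of pregnancy.
print(round(implied_ratio(0.09), 3))  # → 0.099
print(round(implied_ratio(0.16), 3))  # → 0.19
```

This back-of-envelope inversion is consistent with the text's statement that pregnancy must be associated with a progressive rise in the ratio of Rm to Rp.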

Preeclampsia

A subset of women with gestational hypertension develop preeclampsia, a mysterious condition that is a major cause of pregnancy-related morbidity and mortality. Preeclampsia is clinically defined as the combination of pregnancy-induced hypertension and proteinuria. The appearance of protein in maternal urine indicates that the glomeruli of the mother's kidneys have become leaky to serum proteins. Lesions of the glomerular endothelium are pathognomonic for preeclampsia, but endothelial dysfunction affects all maternal vascular beds (Roberts et al. 1989). Preeclampsia is associated with necrosis and hemorrhage in multiple maternal tissues, best explained by reduced perfusion secondary to vasospasm (Roberts 2004).

In preeclampsia, the placenta releases factors that cause maternal endothelial dysfunction. Consequently, delivery of the placenta (and baby) is the most effective means of reducing maternal risk. This creates a clinical dilemma when preeclampsia

develops before term. The longer that gestation is prolonged, the less the risks to the infant (from prematurity), but the greater the risks to the mother (from preeclampsia) (Oláh et al. 1993; Roberts 2004; Ganzevoort et al. 2006).

Why should the placenta damage maternal endothelia? The usual assumption has been that preeclampsia is a side effect of physiological processes gone wrong and is maladaptive for both mother and fetus. Page (1939, 1967) and Haig (1993, 2006) have championed an alternative interpretation: preeclampsia is maladaptive for mothers but adaptive for fetuses. The function of the placental factors responsible for maternal endothelial dysfunction is to increase non-placental resistance (Rm) and thereby increase maternal blood flow to the placenta. This is envisaged as a high-risk strategy activated in a minority of pregnancies in which there is an inadequate supply of one or more nutrients via the placenta.

In many pregnancies affected by preeclampsia, conditions present before the onset of disease compromise the supply of nutrients to the fetus (Haig 2006). Thus, preeclampsia is more frequent when few spiral arteries are remodeled by trophoblast (Brosens et al. 2002), when two fetuses compete for nutrients in twin pregnancies (Sibai et al. 2000), and when pregnancy occurs at high altitude (Zamudio et al. 1995). Nevertheless, preeclampsia does not affect all pregnancies in which there is fetal growth restriction. Ness and Sibai (2006) have proposed that preeclampsia results when fetal growth restraint is combined with a maternal predisposition to endothelial dysfunction.

The hypothesis that preeclampsia is caused by a fetal adaptation has a strong and a weak version. In the strong version, vasospasm of maternal vessels enhanced offspring survival (before the era of modern medical care) by increasing the flow of maternal blood to the placenta.
In the weak version, preeclampsia is the occasional non-adaptive outcome of a fetal adaptation that operates in non-preeclamptic pregnancies to cause mild maternal vasoconstriction.

Two important changes of perspective may come from viewing preeclampsia as a fetal adaptation rather than as a malfunction of 'normal' physiological processes. First, adaptations may be complex.


If preeclampsia is caused by the release of placental factors whose function is to increase nutrient supply, then this adaptive fetal response may involve multiple factors that target multiple maternal systems. Second, mothers are expected to have evolved counter-measures to limit fetal manipulation of maternal physiology and to limit the costs to maternal health. Therefore, fetal actions need to be clearly distinguished from maternal responses. Both may be hyperactivated in preeclampsia: the former are predicted to exacerbate symptoms, whereas the latter are predicted to ameliorate symptoms. An evolutionary perspective does not immediately suggest better methods of treatment, but may encourage new ways of thinking about the proximate causes of the disease.

Growth

The total physiological cost of pregnancy is probably larger for human mothers than for our closest living relatives. Allometric comparisons among primates reveal that human pregnancies are prolonged and human neonates are large relative to maternal body size (Leutenegger 1974). Maternal body weight is similar in humans and chimpanzees, but human gestation is about five weeks longer (Gavan 1953), with human fetuses growing faster than chimpanzee fetuses after 27 weeks (Schultz 1940). As a result, birthweights average about 3.5 kg for humans (Arbuckle et al. 1992) and 2.0 kg for chimpanzees (Smith et al. 1975). The physiological cost of human pregnancy is exacerbated by the exceptional adiposity of human neonates. The next section reviews evidence that our babies are unusually fat and then considers evolutionary hypotheses to explain this peculiar human feature.

Fat

One of the principal activities of human fetuses during the final weeks of pregnancy is the deposition of large amounts of fat. Fat accounts for more than half of fetal caloric accretion from 27 weeks until term and about 90% of daily accretion in the final stages of pregnancy (Smith et al. 1975). Calculated values vary somewhat. The 'reference fetus' of Ziegler et al. (1976) adds 5.4 g of fat per day


from 36 to 40 weeks gestation, whereas Widdowson (1980) calculated that fetuses deposit 8.8 g of fat per day over the same four-week period. The reference fetus is only 0.1% lipid by weight at 24 weeks but 11.2% lipid at 40 weeks (Ziegler et al. 1976). The cadavers of six human neonates varied from 11 to 28% fat (average 16%: Widdowson 1950), and a 'reference boy' and 'reference girl' are born with 13.7 and 14.9% fat, respectively (Fomon et al. 1982). The fat percentage of human infants continues to increase after birth, reaching 25% at 6 months before declining (Garn et al. 1956; Fomon et al. 1982; de Bruin et al. 1996).

Most fat is laid down after the fetus is sufficiently mature to have a chance of survival if it is delivered early. Survival in the event of premature delivery appears to be the fetus's first priority, but, after this priority is met, a major proportion of the fetus's nutrient intake is stockpiled as fat rather than lean tissue. As might be expected, fat is the most variable component of birthweight. Fat averaged 14% of birthweight but accounted for 46% of the variance in one study (babies of middle-class American mothers: Catalano et al. 1992) and accounted for 70% of the reduction in birthweight of babies whose mothers maintained intense physical exercise throughout pregnancy (Clapp and Capeless 1990). Although premature infants have reduced fat, small-for-gestational-age infants have relatively high percent body fat (Hediger et al. 1998).

Comparative data are limited but suggest human babies are fatter than neonates of most other mammals (for a review see Kuzawa 1998). For example, Widdowson (1950) reported 1–2% fat in newborn pigs, cats, rabbits, mice, and rats. In her study, only neonatal guinea pigs, with 10% fat, approached the 16% fat of newborn humans. Anecdotal remarks suggest that nonhuman primates are lean at birth (e.g., Schultz 1969, p. 152), but the only quantitative data are measures of neonatal fat in squirrel monkeys (3% fat at birth: Russo et al. 1980), cebus monkeys (5% fat at birth: Ausman et al. 1982), and baboons (3% fat at birth: Lewis et al. 1983). The exceptional adiposity of human infants is not merely an artifact of affluent diets in the developed world. Babies of poor Indian mothers, like those of Western mothers, are fat by mammalian standards (11% fat at birth: Apte and Iyengar 1972).
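For scale, the two daily accretion estimates cited above imply noticeably different totals over the final four weeks of gestation; the arithmetic is quick to check:

```python
days = 28  # the 36-40 week window

# Ziegler et al. (1976) 'reference fetus' vs Widdowson (1980)
for grams_per_day in (5.4, 8.8):
    print(round(grams_per_day * days))  # → 151, then 246 (grams of fat)
```

The two estimates differ by nearly 100 g of fat deposited over the same window, underscoring the text's caution that calculated values vary somewhat.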



Why should human babies be so much fatter than neonates of other mammals? The accumulation of fat before birth suggests that fat reserves had a special importance in the early postnatal period. The subcutaneous fat of humans has been suggested to substitute for the thermal insulation provided by hair in other primates (e.g., Garn et al. 1956; Pawlowski 1998). However, Pond (1968, 1997) and Kuzawa (1998) have questioned the strength of evidence supporting this popular hypothesis. For example, Inuit children are leaner, not fatter, than children from warmer climates (Johnston et al. 1982), and there is no correlation between fat levels and the ability of Philadelphia newborns to maintain high rectal temperatures (Johnston et al. 1985).

Distinctive features of human life cycles are commonly ascribed to our large brains. Baby fat is no exception. The brain is a lipid-rich tissue enriched in long-chain polyunsaturated fatty acids such as docosahexaenoic acid (DHA). Fetal adipose tissue is enriched in DHA and, at birth, contains 16 times more DHA than is present in the brain. Therefore, adipose tissue is proposed to provide a critical store of essential structural lipids for the support of brain development during the first months of postnatal life (Haggarty 2002). The brain is also a metabolically active organ that has little potential to reduce its energy expenditure during starvation. The brain of a human infant accounts for almost 60% of total oxygen consumption (compared to 20% for an adult human brain and 3.5% for the brain of a newborn lamb: Kuzawa 1998). Therefore, human infants are proposed to have larger fat reserves than other mammals because they have larger brains (relative to body size), and hence higher maintenance costs (Kuzawa 1998; Wells 2006). Of course, these proposals are not mutually exclusive. Baby fat could provide both a store of structural lipids for supporting brain growth and a store of energy for maintaining brain function.
Human infants have well-developed subcutaneous fat but little visceral fat (Knittle 1978). Fat is preferentially deposited where it can be seen. Hrdy (1999, Chapter 21) has suggested that the risk of infanticide may have selected for fat babies. The more fat stored by an infant, the greater the mother's expected return from continued investment in the infant, whereas fat already invested cannot be recouped if the infant is abandoned.

Thus, the existence of a maternal bias in favor of fatter babies may have intensified selection for prenatal accumulation of fat and for its conspicuous display by infants. Newborn humans are more likely than newborn apes to compete for maternal care with older offspring, because humans have greater overlap in the period of parental care for successive offspring (see below). However, no data have been found to address the question whether infanticide was a stronger selective force in humans than in other great apes.

Brains and bodies

The allometric relationship between human brain and body size closely follows the pattern of other primates during fetal development (Martin 1990). Thus, human brain size at birth is not exceptional when compared to neonatal body size. Among primates, newborn gibbons and orangutans have proportionally larger heads than newborn humans (Schultz 1926), and newborn squirrel monkeys, for example, have almost twice the relative cranial capacity of human neonates (Leutenegger 1974). However, human babies do have large heads relative to the size of their mothers because they are large relative to maternal size (Leutenegger 1974). Our brain's prenatal growth is not unusual, but the human brain uniquely maintains fetal growth rates for the first year of postnatal life (Martin 1990). Brain growth then steadily declines, with little increase in weight after 6 years (Laird 1967; Cabana et al. 1993).

The bodies (and brains) of most mammals exhibit a simple decelerating growth curve with an asymptotic approach to final size. The human body displays a similar growth curve during its first few years, more or less coincident with the period of brain growth. Then there is a mid-growth spurt at about seven years, followed by a period of protracted slow growth until puberty, when linear growth accelerates before a final asymptotic approach to adult height (Bogin 1999). The idiosyncratic growth of our bodies, with its decelerations and accelerations, contrasts with the conventional decelerating growth of our brains. As a consequence, the human child 'spends a considerable part of his growth period with a more or less full size brain, but with a body much smaller than that of the adults around him' (Laird 1967).


Human postnatal growth is substantially slower than that of chimpanzees, in contrast to our more rapid growth in the final stages of pregnancy (Schultz 1936). If a child's growth falters because of illness or malnutrition, growth accelerates upon recovery and then decelerates once it has regained the trajectory present before disease (Tanner 1973). 'Catch-up growth' and the adolescent growth spurt both suggest that children have the potential to grow faster and that nutrient intake is not directly limiting growth. Slow growth appears to have been an adaptive product of natural selection.

If one assumes that the timing of puberty was adaptive in ancestral environments, one can infer that each extra year of experience prior to puberty made a greater expected contribution to a child's fitness than an extra year of reproduction. The fact that children grow below their evolutionary potential, and that puberty coincides with a dramatic acceleration of growth, suggests that children gained advantages from remaining small while they were non-reproductive. Three advantages can be conjectured: small size may have reduced maintenance costs; small size may have minimized dangers of being misperceived as an adult; and small size may have aided in extracting extra parental investment.

Recent historical increases in adult height, and decreases in the age at puberty, at first sight challenge the conclusion that the slow growth of childhood is not limited by the supply of nutrients. The average age at menarche in Europe has decreased by 3–4 years since the mid-nineteenth century, concomitant with a dramatic increase in average height. These secular trends are generally ascribed to improved levels of nutrition (Ong et al. 2006). The distinction between ultimate (evolutionary) and proximate (physiological) explanations may provide a way to reconcile these apparently contradictory interpretations.
One could posit the evolution of physiological mechanisms that responded to unusually good times by promoting more rapid growth and earlier reproduction, while natural selection continued to eliminate genetic variants that caused even faster growth under the same conditions. Intriguingly, children from India and South America who are adopted into Danish families have accelerated growth and early maturation. Such girls have a 10- to 20-fold increased


risk of entering puberty before they are 8 years old (Teilmann et al. 2006).

Intergenerational conflicts

Human offspring mature more slowly than offspring of other great apes. However, ethnographic data suggest that human offspring received some supplemental foods during the first year of life and ceased suckling between two and three years of age, much earlier than occurs in other great apes (Kennedy 2005). Delayed maturation and early weaning thus appear to be derived features of the human life cycle. This section interprets the first as an adaptation of offspring to extract extra maternal care and the second as an adaptation of mothers to reduce interbirth intervals.

Eshel and Feldman (1991) have shown that an action that handicaps an offspring relative to its siblings can be favored by natural selection if parents respond to the handicap by providing extra care. By this means, adaptive maternal responses to variation in infant need could be exploited by offspring to extract extra care. Models of this kind may explain how hominid infants were able to shift much of the cost of neural development and learning onto other individuals, particularly mothers. Young children gain experience about the world in relative safety by exploiting the sensory and motor systems of their mothers to keep out of trouble. A hominid infant that delayed brain maturation would be helpless relative to earlier-maturing siblings but, if its mother responded with increased attention and prolonged care, then the slow developer would gain the competitive advantage of a more highly developed adult brain. Delayed maturation would reduce maternal fitness if the enhanced prospects of slower-maturing infants failed to compensate for reduced family size. Early weaning may have been a maternal counter-adaptation to the increased period of juvenile dependency, because weaning allows mothers to conceive, gestate, and suckle a new offspring.
The combination of early weaning with prolonged dependency means that juvenile humans often grew up with (and shared maternal care with) younger and older siblings. Human weanlings remain dependent on supplemental foods supplied by older individuals. Mothers would have gained little from early
weaning unless they could have reduced the cost of looking after older offspring while suckling. It is generally believed that human mothers were able to care for more than one offspring at a time because other members of their social group assisted with child-care. Potential helpers included fathers, current sexual partners, grandmothers, and older children, although the relative contributions of these alternative caregivers are a subject of debate (Hrdy 2005). Maternal costs are also reduced when older offspring contribute more to their own upkeep (Zeller 1987).

A genetic disorder of imprinted loci provides clues about evolutionary conflicts associated with early weaning in humans. Prader-Willi syndrome (PWS) is caused by non-expression of genes normally expressed only when inherited from fathers. Appetite in this syndrome switches from profound anorexia at an age when infants would have been exclusively breast-fed to insatiability at an age when mothers would have introduced supplemental foods. This curious switch in appetite suggests that paternally derived genes benefited from enhanced intake of milk but not of supplemental foods (Haig and Wharton 2003). Early weaning appears to have enhanced maternal fitness, by reducing intervals between births, but to have reduced offspring fitness, possibly by increased mortality from disease.

Summary

1. Maternal fitness and offspring fitness are not synonymous. Some adaptations that have evolved to benefit mothers may be costly to offspring, and vice versa.
2. Maternal-fetal coordination is limited by natural selection acting at cross-purposes on maternal and fetal genes. The physiology of gestation is therefore expected to lack the fine-tuned homeostatic mechanisms that operate in the non-pregnant state.
3. Communication between mothers and fetuses is compromised by evolutionary incentives to send misleading signals.
4. Maternal physiology and psychology are predicted to have evolved mechanisms for testing offspring and terminating investment in offspring of low perceived quality. Offspring will have evolved features that reduce their chances of failing these tests.
5. Fetal genes are predicted to manipulate maternal physiology to increase the flow of maternal blood through the intervillous space of the placenta. Preeclampsia may be an expression of a fetal adaptation to increase blood flow to the placenta by increasing maternal systemic resistance.
6. Human babies are unusually fat. Adipose deposits may have provided a store of structural lipids and energy for growth and maintenance of the infant brain. Baby fat may also serve the function of advertising infant health to mothers.
7. The typical duration of lactation in our evolutionary past was suboptimal for offspring fitness.
This chapter has benefited from the helpful comments of Jessica Girard and Jacob Koella.


How hormones mediate trade-offs in human health and disease

Richard G. Bribiescas and Peter T. Ellison

Introduction: Hormones, life history, evolution, and health

Health is commonly perceived as an idealized goal, one that involves optimal bodily function. However, this goal has been elusive, even with profound advances in modern medicine. What are the obstacles? Why haven't we achieved strategies that allow people to enjoy consistently good health? Inevitably, discussion then turns to what it means to be 'healthy.' At that point, evolutionary biology offers an insight: concepts of health must incorporate the physiological constraints and ranges of plasticity well documented by the biological community.

Those constraints include the idea of trade-offs: the benefits of some function often can only be achieved by incurring costs in other parts of the body, concurrently or at a later time. For example, increased circulating glucose benefits the basic metabolic function of important tissues such as the brain, which is entirely dependent on glucose. However, a consistent increase in glucose may also lead to greater insulinemia and a risk of diabetes. An increased awareness of trade-offs, as well as the regulatory role of the endocrine system, is therefore central to our understanding of health and well-being.

Evolution by natural selection works through differential reproductive success. Individuals with higher reproductive rates and better survival of themselves and their offspring pass more genes to future generations. By managing the flow, distribution, and rate of consumption of finite supplies of glucose and fat among competing physiological
needs, hormones implement trade-offs between investment in growth, reproduction, and survival. That their effects are dramatic is documented by long-established practices that manipulate life-history trade-offs. For example, castration of domestic animals reduces reproduction and increases growth. By reducing the effects of testosterone on reproductive effort and thereby diverting somatic resources to growth and maintenance, farmers increase fat and protein deposition in their animals, resulting in greater amounts of high-quality meat (Huxsoll et al. 1998). Such cases support the suggestion of evolutionary biologists interested in endocrinology and medicine that hormone-driven trade-offs underlie much of the variation in what we consider to be health.

In this chapter, we discuss the effects of hormones on the relationships among competing physiological functions involved in the trade-offs among the key life-history events that structure human life histories. The scope and complexity of endocrine function force us to restrict our discussion to metabolic investments in reproduction, survivorship, and growth, although other life-history traits are also important.

Hormones and trade-offs

In life-history theory, a trade-off refers to the allocation of a limited resource such as time or energy among competing functions (Stearns 1992). These allocation processes result in inverse associations between investments in:

• Survivorship and reproduction;
• Growth and reproduction; and
• Present and future reproduction.

Table 7.1 summarizes some major hormone/life-history trade-off interactions. Survival is commonly affected by investment in immune function, while reproductive investment is expressed through processes involved with mating effort, such as sexually dimorphic muscle tissue in males and, in females, gestation and lactation. Growth involves increases in somatic mass prior to sexual maturation. The trade-off between present and future reproduction is most clearly expressed in females, where investment in present reproduction, either through pregnancy or lactation, suppresses ovulation and the possibility of future reproductive bouts.

The role of hormones in mediating these functions is clearly revealed by manipulating hormones and noting the effects on life-history variables such as survivorship. Such experiments are obviously difficult to conduct in humans, but there are several examples from animal models. The administration of testosterone in birds and lizards results in greater investment in procuring mates while compromising survival by depleting somatic reserves (Reed
et al. 2006). Another costly somatic function is the maintenance of an activated immune system. One would predict that the adaptive response to a challenge such as infection would be to lower hormones that promote the diversion of energetic resources away from an immune response. During infection, the adaptive down-regulation of hormones involved in promoting reproductive effort, such as testosterone, would be expected in order to augment survivorship (Muehlenbein and Bribiescas 2005). A test of this hypothesis indeed revealed the predicted decline in testosterone in association with experimental viral infection in rhesus macaques (Muehlenbein et al. 2006) (Fig. 7.1).

An additional example from human females involves lactational amenorrhea. As long as a mother is nursing, she is less likely to begin cycling again because of the suppressive effects of prolactin on the hypothalamus (McNeilly et al. 1994). Note, however, that the duration of lactational amenorrhea varies depending on maternal energetic state, independent of suckling frequency and intensity (Valeggia and Ellison 2004). This interaction introduces the importance of phenotypic plasticity and reaction norms for hormone-managed trade-offs.
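The inverse associations described above can be made concrete with a standard life-history allocation sketch (an illustrative formalism of ours, not taken from this chapter). Suppose an individual divides a fixed energy budget E between reproductive effort E_r and somatic maintenance E − E_r, with fecundity m(E_r) increasing in E_r and survival s(E − E_r) decreasing in it. Writing fitness as the product of survival and fecundity, the optimal allocation balances the marginal gains:

```latex
% Illustrative life-history allocation model (not from the chapter text):
% fitness = survival x fecundity under a fixed energy budget E.
\begin{aligned}
w(E_r) &= s(E - E_r)\, m(E_r), \qquad 0 \le E_r \le E,\\
\frac{dw}{dE_r} &= -\,s'(E - E_r)\, m(E_r) + s(E - E_r)\, m'(E_r) = 0\\
&\;\Longrightarrow\; \frac{m'(E_r)}{m(E_r)} = \frac{s'(E - E_r)}{s(E - E_r)}.
\end{aligned}
```

On this reading, hormones such as testosterone or prolactin act as the physiological levers that shift E_r up or down as environmental conditions change.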

Table 7.1 A generalized summary of the physiological effects of key hormones, under conditions of energetic limitations, and their reported trade-off effects on life-history variables. (a) Men: each hormone's effects on reproductive effort (muscle tissue) and on survivorship (immune function, adipose tissue). (b) Women: each hormone's effects on present reproductive effort (pregnancy maintenance; ovarian function and endometrial maintenance; lactation) and on future reproductive effort (hypothalamic/ovulatory suppression; possible autoimmune disorders). [The table body, including the hormone names and the direction of each effect, did not survive text extraction and is not reproduced here.]

Figure 7.1 Testosterone exhibited a significant predicted decline in rhesus macaques (n = 6) exposed to Venezuelan equine encephalitis virus (mean exposure time 72 ± 8.8 SE hours) (19.5 vs 13.4 ng/mL, Wilcoxon signed rank, p = 0.05, 1-tailed). Data derived from Muehlenbein et al. (2006).

Figure 7.2 The decline in average prolactin levels in a nursing mother's blood with time postpartum can be advanced or retarded by variables that affect the relative metabolic load that milk production represents. For example, more frequent or intense nursing or lower maternal nutritional status both result in a slower postpartum decline in prolactin and a later resumption of menstrual cycling (indicated by the arrows) when prolactin levels fall below the threshold associated with amenorrhea (Lunn et al. 1984; Ellison and Valeggia 2003).

Hormones, population variation, and phenotypic plasticity

Because hormones regulate the transcription of genes as well as influencing other mechanisms that induce phenotypic variation, they are important players in reaction norms and phenotypic plasticity, key processes in trade-off management. Briefly, a reaction norm is the phenotypic expression of a single genotype across a range of environments. Hormonal expression is often modified by such factors as environmental circumstances, nutritional status, and energetic availability. For example, Fig. 7.2 illustrates how the reproductive suppressive effect of prolactin, a protein hormone transcribed by a single gene, is modified in response to a woman's energetic condition. The importance of reaction norms is often underappreciated in clinical research. Understanding the implications of reaction norms allows hypothesis development and investigative conclusions to go beyond the standard 'nature' versus 'nurture' false dichotomy.

Additionally, polymorphisms, epigenetic effects, and hormone receptor priming contribute to hormone level variation between populations. Consequently, differences in hormone expression in response to lifetime environmental conditions are important sources of phenotypic variation that have implications for health between diverse populations. Variation in estrogens, testosterone, and metabolic hormones illustrates the effects of hormones on life-history trade-offs in response to environmental cues. The timing and mechanisms that underlie hormonal function plasticity are not fully elucidated, but compelling data suggest that gestation and puberty are sensitive periods. It is also important to note that the degree of plasticity itself may involve trade-offs, since the development and maintenance of phenotypic malleability incur tangible costs, such as metabolic investment in the production and maintenance of tissues and sensory mechanisms that detect changes in the environment.

The implications for health in a diverse population are considerable. Within the United States, the growing number of recent immigrants presents a challenge for distinguishing common global variation from differences that might be construed to be pathological. For example, prescribed dosages of oral contraceptives should consider population variation in the rate of estrogen clearance. Lack of awareness of such variation may result in inappropriate dosages leading to less effective protection against pregnancy or an increased risk of breast
and ovarian cancer (Bentley 1994). Moreover, the development and etiology of disorders that pose a disproportionate risk to certain communities or ethnicities also merit greater awareness. Extremely high rates of obesity and diabetes among Pima Indians in the United States and other ethnic communities illustrate the significant social, economic, and human costs.

A potential reason for these disproportionate rates of illness may be inaccurate assessment and priming of the endocrine system. Leptin, a hormone secreted primarily by fat cells that serves as a reporter of adiposity for the hypothalamus, is extremely low among Ache Amerindians of Paraguay compared to Americans. Consequently, Native Americans may exhibit leptin hypersensitivity, perhaps predisposing them to obesity and diabetes under conditions of excess caloric intake. The high rates of obesity and diabetes seen in Amerindians may thus illustrate an example of an inaccurate assessment of environmental conditions as well as trade-offs involving survivorship, with an inordinate amount of resources being sequestered in adipocytes.

Hormones and trade-offs in males

Androgens and fetal development

Various hormones and hormone-like substances are essential for the fetal development of a human male. Briefly, testis-determining factor (TDF) is transcribed under the direction of the Y chromosome. As its name implies, TDF initiates and promotes the development of the testes. Other factors shape the internal male reproductive structures: Müllerian inhibiting substance (MIS) triggers regression of the Müllerian ducts, while testosterone supports Wolffian derivatives such as the vas deferens. Growth and differentiation of the external male genitalia, the penis and scrotum, are initiated by dihydrotestosterone (DHT) and testosterone. These same androgens are also candidate agents for subtle differentiation of the male brain.

Disruptions or deficiencies of any of these hormones during development result in serious conditions that compromise sexual development. For example, a mutation of the gene responsible for transcribing 5-alpha reductase, an enzyme necessary for
DHT production, results in ambiguous genitalia and long-term challenges to sexual identity and reproduction. Unmistakably, androgens are crucial for male development, but their production is not without cost.

Male fetuses are aborted at a higher rate than females (Byrne and Warburton 1987), have lower leptin levels (a reflection of adiposity) (Gomez et al. 1999), and have higher mortality and prematurity rates than girls (Ingemarsson 2003), all likely related to testosterone production in utero. In passing, maternal testosterone levels during pregnancy are also connected to lower birthweight in males and females, probably due to energetic diversion away from the placenta and fetus (Carlsen et al. 2006). Higher testosterone in male fetuses is also hypothesized to underlie lower leptin levels in amniotic fluid compared to females (an indicator of lower adiposity), perhaps contributing to differences in infant mortality and morbidity (Schubring et al. 1999). Testosterone and DHT also impose metabolic costs and energy demands on the mother: women pregnant with male fetuses exhibited a 10% greater caloric intake than those carrying females (Tamimi et al. 2003).

Maintaining Fisherian genetic equilibrium would therefore require that more males than females be born, effecting the small but consistent male-biased sex ratios evident in numerous populations. Greater male infant mortality is therefore somewhat of a conundrum. While decreasing the levels of androgens in circulation in utero might augment male infant survivorship, the potential effects on sexual development would likely be undesirable.

Childhood quiescence

The period between birth and pubertal maturation is marked by general hormonal quiescence. While a significant surge in testosterone and other reproductive hormones is evident immediately after birth, the negative feedback mechanism of the hypothalamic–pituitary–gonadal axis is extremely sensitive to androgens, thereby inhibiting the accumulation of circulating steroids. During this life-history period, boys (and girls) grow at a slow but constant rate until the adolescent growth spurt, when energy is redirected from general to sexually dimorphic growth and development.


The importance of maintaining this slow pattern of growth and keeping reproductive hormones in check is revealed by conditions such as precocious puberty, the premature desensitization of the hypothalamus and the onset of sexual maturation as early as age three or four. This condition reveals not only the benefits of slow, steady growth but also the costs of mis-timing pubertal maturation and a poor transition between general growth and sexual maturation. Boys with precocious puberty undergo a brief, intense growth spurt that terminates in the cessation of long bone growth, fusion of the epiphyseal growth plates, and abnormally short stature (Carel et al. 2004). Untreated, boys with precocious puberty also endure significant psychosocial problems as well as behavioral disorders (Kakarla and Bradshaw 2003). During normal development, the attenuation of testosterone during this life-history stage illustrates the value of investment in overall somatic growth over reproductive effort.

Adolescent development, morbidity, and mortality

During adolescence, testosterone levels and overall gonadal function escalate in response to decreases in hypothalamic sensitivity to circulating androgens and greater gonadotropin production (LH and FSH). The resulting development of secondary sexual characteristics such as facial and pubic hair, deepening of the voice, penile growth, and amplified skeletal muscularity is central to male attractiveness, libido, and the ability to father offspring. Testosterone and associated hormones are therefore necessary for the onset and maintenance of a heterosexual male's ability to procure mates and secure fertility.

However, adolescence is also associated with an acute rise in male mortality, primarily due to greater risky behavior (Owens 2002). Especially compelling is the universality of the male adolescent mortality spike, which is documented in numerous societies as well as in forager populations. The causal connection between testosterone and greater mortality in human males is circumstantial, although it is well documented in other vertebrates. Pubertal increases in testosterone
are worthy of consideration, since testosterone is highest during the early 20s in industrialized populations (Harman et al. 2001). In addition, adolescent increases in testosterone are linked with greater anti-social and aggressive behavior (Dabbs 1996). Moreover, male adolescent mortality has climbed over the past hundred years, likely in response to greater mechanization (e.g., automobile accidents) in industrialized countries.

Kruger and Nesse (2004) argue that greater human male mortality is the product of evolved psychological mechanisms that underlie sexual selection. That is, the conscious or unconscious perception that high-risk behaviors may augment attractiveness and competitiveness underlies male adolescent mortality and is a reflection of sexual selection in humans. It must be noted, however, that associations between testosterone and risky behavior are likely to be more complex than a simple dose-dependent relationship. For example, among Ache foragers of Paraguay, the male adolescent mortality spike is significant, yet there are no age-related differences in testosterone (Hill and Hurtado 1996; Ellison et al. 2002).

What are the benefits of testosterone in adult males?

The primary fitness benefits of testosterone are multifaceted, including support for optimal spermatogenesis, the development and maintenance of secondary sexual characteristics that augment male competitiveness and attractiveness, and libido. Other associations include competitive ability and possible relationships with social dominance.

While testosterone variation within clinical ranges of normality is not associated with differences in fertility, testosterone is necessary for optimal spermatogenesis. Similarly, testosterone variation within clinically normal ranges is not associated with differences in libido; however, testosterone does exert a permissive effect that maintains sexual motivation (Buena et al. 1993). Testosterone is additionally linked with male physical traits that serve as attractiveness and quality cues for females (Perrett et al. 1994). Roney and colleagues (2006) reported that female assessments of masculinity and short-term attractiveness are correlated with testosterone. Although higher testosterone
concentrations are connected with greater fitness in other vertebrates, no such correlations have been reported in humans. However, testosterone was modestly associated with number of lifetime sexual partners in university men (Bogaert and Fisher 1995).

With regard to dominance and competition, testosterone rises in anticipation of competitive interactions in humans, with testosterone remaining high in winners (Booth et al. 1989). Among wild chimpanzees, testosterone is positively related to social dominance rank (Muller and Wrangham 2004).

Testosterone and somatic investment

Despite good nutrition and modern medical support, contemporary sedentary lifestyles and aging contribute to declines in male somatic integrity. Waning muscle and greater fat deposition are evident in many populations. In addition to calls for more exercise and caloric restraint, hormonal supplementation has grown in popularity. Indeed, testosterone has a clear enhancing effect on muscle anabolism and overall metabolism in human males. Dose-dependent testosterone augmentation of skeletal muscle and strength is readily evident in response to supraphysiological and normal-physiological doses in younger and older men (Bhasin et al. 2005). Moreover, testosterone aids in the catabolism of adipose tissue and may serve to improve mental states of well-being. In total, testosterone supplementation has been proposed as a useful and valuable hormonal vehicle for enhancing health and well-being in older and younger men.

However, such benefits are evident only in men who are not subject to energetic constraints. Most men around the world do not eat ad libitum and often endure periods of moderate to severe energetic stress. In many species, even after eliminating the effects of high-risk behavior, male somatic function is inherently compromised in the presence of testosterone. The effects of testosterone include higher metabolic rates in males than in females (Arciero et al. 1993). In birds, testosterone supplementation increases basal metabolic rate and is associated with greater mortality (Buchanan et al. 2001). Similar effects are evident in
humans: testosterone supplementation within the normal physiological range increases weight, lean body mass, and basal metabolic rate, and decreases adiposity (Welle et al. 1992) (Fig. 7.3). Whether testosterone supplementation affects rates of mortality in humans is unclear. Nonetheless, it has been argued that greater metabolic costs in males resulting from the anabolic and metabolic effects of testosterone on muscle tissue, as well as its catabolic effects on adiposity, may reflect investment in reproductive effort at the expense of survivorship in human males, similar to what is observed in other organisms (Bribiescas 2001a).

Interestingly, the effects of testosterone on metabolic rate extend to behavioral studies. Although testosterone is associated with pre-competitive rises, with winners maintaining high levels, the physiological significance may extend beyond mood and affect. Tsai and Sapolsky (1996) suggest that short-term changes in testosterone, as are common in competitive interactions, may involve the mobilization of energy to increase the likelihood of success. They have shown that metabolic rates in muscle cells exposed to testosterone rise 20% within one minute. Moreover, metabolic rates stay significantly higher compared to controls for several hours.

Testosterone and immune function

Despite the clear reproductive and somatic enhancing effects of testosterone, a clear cost emerges: immunocompetence. The relationship between reproductive effort and survivorship, as reflected by immune function, is an important trade-off in male mammals. Since males do not gestate or lactate, thereby investing less in offspring than females, much of the variance in male reproductive success hinges on mating opportunities. Therefore, males often sacrifice investment in survivorship when reproductive payoffs are favorable. In addition, humans are long-lived and iteroparous (having multiple reproductive events), allowing them to allocate differential investment in reproductive effort and immune function over multiple bouts of reproduction.

From this premise, it would be predicted that when males are investing heavily in reproductive effort, either through greater time


Figure 7.3 Effects of testosterone supplementation on weight, lean body mass, fat, and basal metabolic rate in healthy men (n = 4). Data from Welle et al. (1992).

and energy spent in mating behavior or in tissue that aids in competition or attractiveness, immune function should decline. Men given testosterone supplements would therefore be expected to suffer more frequent or severe bouts of infection. However, testosterone supplementation is mostly available in well-nourished populations where infectious disease rates are low and energetic availability is high.

Testosterone supplementation of HIV-positive men as a treatment against wasting has yielded mixed results. Although testosterone augmentation of HIV-positive men significantly increased fat catabolism, CD4 levels were lower in treated men than in placebo-treated men, although the difference was not significant (Bhasin et al. 2006). A more suitable test would be to increase testosterone levels in energetically stressed men and observe whether infection is exacerbated, an obviously unethical option. An indirect assessment of testosterone/immune function trade-offs in men would be to observe predicted hormonal responses to treatment for infection. These studies are
confounded by the medical treatment itself, but predicted declines in testosterone are evident. Spratt (1993) reported that testosterone was lowest among the most severely ill hospitalized men. In addition, men infected with the malarial parasite Plasmodium vivax exhibited testosterone increases in response to anti-malarial treatment (Muehlenbein et al. 2005) (Fig. 7.4).

While skeletal muscle can be viewed as investment in reproductive effort, adipose tissue is important for nourishment during lean periods and is becoming more central to our awareness of the costs of maintaining a robust immune system and rallying an immune response in the presence of infection (La Cava and Matarese 2004). Therefore, in the absence of pathological obesity, investment in fat tissue should reflect investment in survivorship. Estradiol in men may be a hormone that promotes survivorship through its promotion of fat deposition and suppression of testosterone. In addition, estradiol directly enhances immune function (Cutolo et al. 2006), as well as




Figure 7.4 Testosterone levels were lower on the day of diagnosis compared to the fourth sample during anti-malarial treatment among Honduran men diagnosed with Plasmodium vivax infection (p = 0.02), and lower compared to healthy controls (p = 0.004) (reprinted from Muehlenbein et al. 2005).

possibly via the promotion of greater adiposity and the immuno-enhancing effects of leptin (Chan et al. 2006). It is therefore of great interest that, even as testosterone enhances muscle anabolism and metabolism, it promotes fat catabolism, in essence liquidating somatic capital while raising somatic energy costs. As leptin is a useful biomarker of somatic investment in survivorship, both as a reflection of available adipose stores and as an immunostimulant, it is noteworthy that testosterone and leptin are often inversely related (Luukkaa et al. 1998).

In summary, increased testosterone generally leads to lower immune function, either directly or via trade-offs with other hormones.

Fatherhood and paternal investment

Paternal investment is a central question in human male life histories. With only minute metabolic investment in offspring and the constant possibility of paternity uncertainty, the decision for a male to invest in offspring, often at the cost of mating effort, is both important and often poorly understood. Certainly there are mitigating factors, such as the evolution of the loss of overt ovulation signals and mate guarding. However, hormones are central mediating agents in a male's decision to provide paternal and mate care.

Organisms cannot be in two places at once. Decisions are made that determine when an organism will forage, rest, care for offspring, or seek mating opportunities. Consequently, trade-offs are inherent to time investment in divergent needs and behaviors. Behavioral effects of hormones are also well known, although the actual mechanisms of influence remain clouded. If testosterone is associated with somatic investment in reproductive effort in the form of mate attraction, seeking, and competition, lower testosterone might be expected in association with investment in paternal behavior. Gray and colleagues have reported that testosterone is indeed lower within the context of paternal behavior and mate bonding. The findings are the more compelling in that they are evident within a broad spectrum of cultural settings (Gray et al. 2006).

But does this relationship have an impact on survivorship? Perhaps. Jasienska and colleagues (2006) reported that, among a rural Polish population, paternal longevity increased by about 74 weeks per daughter, while the total number of offspring had a negative effect on maternal life span. Exposure to infants induces acute changes in male endocrinology, such as increases in oxytocin and declines in testosterone. Moreover, marriage and involvement in long-term monogamous relationships are also related to lower testosterone levels. The evolutionary rationale behind these changes remains unclear, although it may be that lower testosterone levels decrease both the motivation and physical costs of mating effort in situations where paternity certainty is high or when investment in offspring may increase mating opportunities.
The aging male

While inverse associations between endogenous testosterone levels and longevity are not evident in humans or other species, the detrimental effects of testosterone supplementation on longevity are well documented in various organisms (Reed et al.


2006). Life span of castrated males of some species tends be longer than intact individuals, although evidence from humans is somewhat inconclusive. Nieschlag et al. (1993) reported no life span difference between intact and castrati, but their statistical power was limited. An assessment of mentally handicapped individuals with greater sample sizes revealed lower lifetime mortality for castrated than for intact males (Hamilton and Mestler 1969). Overall, the potential detrimental effects of testicular function on morbidity are likely to be reflected by more subtle aspects of male physiology than gross measurements of longevity. Male aging involves changes in numerous aspects of reproductive function and general somatic condition. While somatic and reproductive function exhibits a general pattern of degradation, men do not undergo the same abrupt decline in fertility observed during menopause. Nonetheless, declines in male fertility are evident (de la Rochebrochard et al. 2006). However, only recently has human male reproductive senescence been examined from an evolutionary or life-history perspective. What is clear is that male fertility is compromised and that various aspects of male endocrinology may reflect a decline in the ability to regulate energetic resources (Bribiescas 2006). For example, hypothalamic sensitivity to caloric stress is lower in older men. That is, older men fail to exhibit predicted responses in LH secretion and pulsatility in response to acute fasts and GnRH administration (Bergendahl et al. 1998). Most information on male aging emerges from clinical studies of populations characterized by relatively unrestricted food availability and sedentary lifestyles. Populations such as those in the USA exhibit declines in testosterone, most prominently after the age of 40 (Harman et al. 2001). However, other populations do not exhibit these same declines. 
For example, a meta-analysis of four non-Western populations indicated a much more attenuated decline or none at all (Ellison et al. 2002). Changes in estradiol levels are not associated with aging; however, obesity in older men can result in higher estradiol levels because of the greater capacity of fat cells to transform (aromatize) testosterone to estradiol. Hypothalamic function, however, does not appear to be as malleable. Serum LH and FSH levels among Ache men rise steadily with age, most likely revealing a decrease in target receptor sensitivity within the Leydig and Sertoli cells of the testes (Bribiescas 2005). Metabolic hormones such as leptin do not change with age, perhaps because of the narrow range of adiposity variation among the Ache (Bribiescas 2001b). In addition to infectious disease, degenerative conditions associated with male biology are also part and parcel of long-term trade-offs across different life periods. The relationship between testosterone, reproductive effort, and cancer is rooted in the concept of antagonistic pleiotropy. Testosterone, while only indirectly related to spermatogenesis and fertility, is central to male reproductive effort, in terms of both somatic investment and behavior. It is equally accepted that testosterone is involved in the etiology of prostate cancer, usually later in life. However, the mechanistic relationship between testosterone and prostate cancer development remains under considerable investigation. There is no evidence to suggest that individual testosterone levels are indicative of greater prostate cancer risk later in life; indeed, some have questioned the causative assumption that underlies the perceived link between testosterone and prostate cancer. However, lifetime exposure to testosterone may increase the risk of prostate cancer. An especially cogent issue is the exceptionally high rate of prostate cancer among African American men. While specific genes have been associated with greater prostate cancer risk among African Americans (Amundadottir et al. 2006), African American men also tend to exhibit higher total testosterone levels than non-African American men (Winters et al. 2001). Also, native African men do not exhibit similar risks (Odedina et al. 2006).

Hormones and female reproductive trade-offs

Constraints on female reproductive success

Because of the demands of internal gestation and lactation, the direct metabolic investment of human females in the production of each offspring is a primary constraint on female reproductive success. Offspring are normally produced one at a time, spaced several years apart. Statistically, the major determinants of variance in female reproductive success in a pre-modern context are infant survival and interbirth intervals. Both of these are determined by maternal investment of metabolic energy. Birthweight, a direct reflection of maternal metabolic investment, is the primary determinant of infant survival (Villar et al. 1992), while the relative metabolic demand of lactation (determined by infant demand and maternal metabolic resources) and female fecundity during the waiting period to the next conception (also a function of maternal energetics) are the primary determinants of birth spacing (McNeilly 2001). Each of these will be considered in turn.

Birthweight and infant survival

Birth represents a dramatic transition in human life history and a metabolic crisis for the newborn. Nutritional resources, formerly provided across the placenta at little cost to the fetus, must now be absorbed through an active gut. Gas exchange must be accomplished through the infant's own lungs. Thermal regulation is no longer passively assumed but must be actively pursued. An infant's capacity to meet these and other metabolic demands in the first days and weeks of life is largely determined by its metabolic reserves at birth. Infant mortality rates climb steeply as birthweight falls below 2500 g (Habicht et al. 1973). At the end of gestation fat accumulation contributes increasingly to fetal weight gain, and both the rate of fat accumulation and the ultimate accumulation attained are functions of the mother's metabolic state (Villar et al. 1992). As such, they are strongly determined by hormonal regulation of maternal metabolism. Early in pregnancy a mother's metabolism favors fat accumulation, even to the point of down-regulating basal metabolism to reduce maintenance costs if energy availability is low (Poppitt et al. 1994). As fetal growth becomes a significant category of metabolic demand in mid-pregnancy, and because maternal blood glucose is the primary source of fetal nutrition, maternal metabolism shifts in the direction of maintaining high blood glucose levels to sustain a steep concentration gradient across the placenta. Maternal cortisol levels become elevated and peripheral tissues become insulin resistant under the influence of placental lactogen. Late in pregnancy, maternal physiology shifts again to resemble a state of starvation. Fat reserves, including those accumulated early in pregnancy, are mobilized as the nutritional demands of fetal brain growth accelerate (Homko et al. 1999). The trade-offs during pregnancy are essentially between maternal maintenance and fetal growth, but the manifestations of the trade-offs shift during the course of gestation. The energy requirements of fetal growth become a high priority, high enough to shift energy allocation away from maternal basal metabolism in favor of energy storage early, and in favor of heavy mobilization of stored reserves late. From mid-gestation on, pregnancy becomes a 'diabetogenic state' for the mother as her blood glucose levels are pushed dangerously high (from her perspective) to help meet the needs of fetal growth (Carrington and Messick 1963). David Haig (1993) has pointed out that the fitness optima for mother and fetus do not coincide during this period, setting the stage for maternal–fetal conflict, a conflict waged primarily on the battlefield of hormonal regulation of energy metabolism (see Chapter 6).

Parturition

The mechanisms that result in human parturition appear to be set in train by a metabolic crisis that builds in intensity as the metabolic needs of the fetus, driven particularly by brain growth late in pregnancy, begin to outstrip the mother's capacity to meet them across the placenta (Ellison 2003). Rather than proceeding until the fetus reaches some preferred state of maturity, parturition occurs earlier in undernourished mothers and later in overnourished mothers (Kline et al. 1989). The conditions that ordinarily initiate parturition begin with activation of the fetal hypothalamic–pituitary–adrenal (HPA) axis in response to low fetal blood glucose levels. Elevated fetal cortisol mobilizes fetal fat reserves, a strategy that may help support brain metabolism in the short term but that can reduce infant survival probabilities after birth. The placenta responds to elevated fetal cortisol by producing increasing amounts of corticotropin-releasing hormone (CRH) (Smith et al. 2002). This hormone is ordinarily released by the hypothalamus to stimulate ACTH production by the pituitary, which in turn stimulates cortisol production by the adrenal. Hypothalamic CRH is regulated by negative feedback from cortisol. However, placental CRH is regulated by positive feedback from fetal cortisol, resulting in a rapid, exponential increase in cortisol in the fetus. At the same time, placental CRH stimulates the production of arachidonic acid from placental fatty acid reserves. Arachidonic acid is a crucial requirement of fetal brain growth, but production of arachidonic acid is also the rate-limiting step for prostaglandin production in the placenta (Fig. 7.5). Prostaglandins, in turn, stimulate the myometrial contractions that begin to drive labor and delivery. Thus an elegant set of hormonal mechanisms links the appearance of a metabolic crisis in the fetus to both mechanisms aimed at short-term relief of the crisis and mechanisms that initiate parturition (Smith et al. 2002). Circumstances that delay the metabolic crisis for the fetus can result in postmaturity, a condition that often accompanies uncontrolled maternal diabetes in pregnancy, for example. Conditions that dramatically delay the crisis, such as fetal anencephaly (failure of the development of the cerebral cortex), can result in a pregnancy that continues for over a year in the absence of medical intervention (Higgins 1954). On the other hand, circumstances that advance the crisis, such as the acute nutritional deprivation faced by many pregnant women during the Dutch Hunger Winter of 1944–5, can result in short gestation lengths and premature delivery (Kline et al. 1989).

Lactation and birth spacing

In the absence of modern contraception, the duration of postpartum amenorrhea (also termed lactational amenorrhea) is the most important determinant of human birth spacing. The convergence of demographic and physiological studies in the 1970s and 1980s showed that the primary determinant, in turn, of postpartum amenorrhea was the behavioral intensity of nursing (Ellison 1995). Individuals who did not breastfeed at all, either through choice or because of infant death, were observed to resume menstrual cycling within a few months on average, compared with durations of amenorrhea of six months to several years among those who did breastfeed. Comparisons among populations suggested that where breastfeeding occurred with high frequency through the day and weaning occurred late, the duration of postpartum amenorrhea was much longer than in populations where nursing occurred at broad intervals and weaning occurred early. However, efforts to link individual nursing patterns to individual durations of postpartum amenorrhea within populations often failed.

Figure 7.5 A diagrammatic representation of the hormonal dynamics involved in human parturition. Hormones and other molecules secreted by the fetus are represented by heavy lines; those secreted by the placenta are represented by dashed lines. As the metabolic demands of fetal growth, especially of the brain, begin to outstrip the mother's ability to meet them, the fetal hypothalamic–pituitary–adrenal axis is activated to help release glucose and fatty acids from fetal fat reserves. Fetal cortisol has a negative feedback effect on further release of corticotropin-releasing hormone (CRH) by the fetal hypothalamus. However, fetal cortisol stimulates (rather than suppresses) release of CRH by the placenta, which in turn stimulates further release of ACTH by the fetal pituitary and cortisol by the fetal adrenal. Rising levels of placental CRH also stimulate the production of arachidonic acid (AA) from placental reserves. AA is an essential fatty acid required to support fetal brain growth. However, the production of AA is also the rate-limiting step in the production of prostaglandins by the placenta, which stimulate contractions of the myometrium, initiating labor.



Research in the Gambia suggested that maternal nutritional status also played an important role in the duration of postpartum amenorrhea (Lunn et al. 1984). Targeted supplementation of maternal diets, either during lactation alone or during gestation and lactation, had negligible effects on milk production or the energy content of maternal milk, but lowered levels of prolactin (the pituitary hormone responsible for promoting the anabolic activity of milk production), hastened the return of menstrual cycling, and shortened the interval to the next conception. Subsequent work among the Amele of New Guinea and the Toba of Argentina has shown that well-nourished individuals who nurse with high frequency resume menstrual cycling as rapidly as women in Western societies who nurse much less frequently (Worthman et al. 1993; Valeggia and Ellison 2004). At the same time, studies of Western women indicate that the introduction and amount of supplementary food in the infant's diet have a much greater impact on the resumption of menstrual cycling by the mother than nursing patterns (Tay et al. 1996). The accumulating weight of evidence now indicates that the duration of postpartum amenorrhea is a function of the relative metabolic load of lactation. The absolute metabolic load may be determined by infant demand, but the relative load is a function of maternal capacity as well (Ellison and Valeggia 2003).

The resumption of ovarian cycling

The trade-offs for the mother involved in postpartum amenorrhea are primarily those between investment in current and future offspring. Trade-offs also occur between maternal maintenance and investment in offspring, but there is good evidence that those are heavily weighted in favor of investment in current offspring. The Gambian studies demonstrated that undernourished women will down-regulate their own basal metabolism in order to buffer milk production, even during periods of considerable hardship (Prentice et al. 1983). Studies of American women indicate that milk volume and composition are unaffected by increased energy expenditure in the form of exercise (Dewey et al. 1994) and that nursing mothers have attenuated HPA axis responses to energetic stresses such as treadmill exercise, suggesting that their physiology resists signals that would otherwise result in additional energy allocation to maintenance categories (Altemus et al. 1995). The resumption of ovarian cycling postpartum represents a shift in maternal energy allocation toward increasing the probability of a new conception. In Toba women in Argentina, resumption of cycling in individual women is closely coordinated with changing insulin levels, which in turn reflect changes in metabolic energy balance (Valeggia and Ellison 2004). A brief period of insulin resistance, marked by insulin levels above a woman's long-term average, tends to occur in the few months before the resumption of ovarian cycling, accompanied by rising levels of ovarian estrogens (Ellison and Valeggia 2003). Insulin has been shown to be a potent stimulant of ovarian steroid production that synergizes with pituitary gonadotropins (Willis et al. 1996). Estrogens, on the other hand, potentiate the response of adipose cells to insulin (Rosenbaum and Leibel 1999). Thus, increasing energy availability in a low-estrogen milieu leads to higher than normal insulin levels. Elevated insulin may help to stimulate ovarian steroid production, causing estrogen levels to rise. Rising estrogen stimulates the adipose response to insulin, bringing insulin back into the normal range. This elegant coupling of hormones regulating energy metabolism and ovarian function serves to 'jump start' ovarian function as maternal energy availability rises above the demands of milk production.

Waiting time to conception

Pregnancy and lactation are periods of heavy metabolic demand on a woman's physiology, times when she is essentially 'metabolizing for two' (Ellison 2001). In contrast, the period of fecund cycling that separates intense metabolic investment in successive reproductive bouts is a period of reduced demand and potential recuperation. 'Maternal depletion' is a syndrome of progressive deterioration in female nutritional status associated with closely spaced births, with negative consequences for female survival and future fertility (Winkvist et al. 1992). Hence minimizing the fecund 'waiting time' to each successive conception does not necessarily maximize lifetime reproductive success in human females. The overall pace of reproduction can be governed both by variance in the length of postpartum amenorrhea and by variance in fecundity during the waiting time to the next conception. The duration of the latter period may be particularly important for maintaining long-term energy balance over an individual's reproductive lifetime. Waiting time to conception is of course partly determined by exposure to intercourse and, in the modern context, by manipulation of the probability of conception given intercourse through the use of contraception. In societies without widespread use of modern contraception, behavioral regulation of exposure to intercourse may play an important role in regulating waiting times to conception. Physiological variation in female fecundity, however, also plays an important role. Studies of numerous populations have now demonstrated that periods of restricted energy availability, whether because of low energy intake or high energy expenditure, are associated with reduced frequency of ovulation and low profiles of ovarian steroids when ovulation does occur (Ellison 2001). Low ovarian steroid profiles, in turn, have been linked to a reduced probability of successful conception (Lipson and Ellison 1996). While the correlation of female fecundity with acute variation in energy availability is reasonably well established, some disagreement persists over the impact of longer-term, chronic energy shortage. Vitzthum et al. (2004) have argued that female fecundity should not be affected by chronically low energy availability, since there is no expectation that deferring conception will result in better energetic conditions. They note that successful conceptions occur in rural Bolivian women even though their ovarian steroid profiles are low by Western standards, and they interpret this fact as supporting their position.
Ellison (1990) and Lipson (2001), among others, have argued that the optimal pace of reproduction should be slower in chronically energy-limited environments in order to preserve long-term energy balance. Low steroid profiles, they point out, do not preclude conception, but only reduce its probability. Thus populations like that in rural Bolivia, faced with chronically low energy availability, would not be expected to have Western-level steroid profiles when they conceive. Rather, they would be expected to take longer to conceive, given the same exposure to unprotected intercourse.

The timing of conception and human reproductive seasonality

The sensitivity of female reproductive success to metabolic energy availability has resulted in mechanisms that synchronize the timing of conception with favorable energetic conditions. Conception is more likely when women are in positive energy balance and not constrained by high energy expenditure (Ellison 2003). This is a pattern that humans share with both chimpanzees and orangutans (Knott 2001; Emery Thompson 2005). It is also a pattern quite different from that observed in shorter-lived organisms living in highly predictable seasonal environments. In such organisms reproduction is usually synchronized with environmental energy availability to match periods of peak demand with periods of peak availability (Bronson 1989). Often the period of peak demand is the immediate postnatal period, so that reproductive seasonality is really 'birth' or 'hatching' seasonality. In humans, and perhaps other great apes, reproductive seasonality seems to be more a matter of 'conception' seasonality (Ellison et al. 2005). The reasons for this are likely threefold. First, the immediate post-conception period, before the demands of fetal growth become too great, may be a crucial period of maternal fat accumulation that will critically determine the mother's ability to meet fetal demand late in pregnancy. Hence conceiving during a period of negative energy balance or constrained energy availability may result in a low-birthweight or early-term offspring with a consequently lower probability of survival. Second, the period of high postnatal metabolic investment by the mother is quite drawn out in humans, with no restricted period of 'peak demand' to serve as a focal point for synchronization with environmental energy availability. In fact, when environmental energy availability is low, the period of high metabolic demand on the mother, represented by intense lactation, tends to be more protracted. Third, formative human environments may have been unpredictable in terms of energy availability, without good cues that could be used to synchronize reproductive cycles with energy availability nine months or more in the future. Indeed, it may only be the advent of agriculture that introduced such predictable seasonality of energy availability into human subsistence ecology. Thus the calendric birth seasonality most readily observable among human societies that rely on subsistence agriculture may be a consequence of a reproductive biology sensitive to conditions at conception transposed into an environment where such conditions are synchronized with agricultural cycles.

Age and female fecundity

Virtually all indices of female fecundity, including menstrual regularity, frequency of ovulation, and ovarian steroid profiles in ovulatory cycles, follow trajectories that rise over the first decade after menarche, remain fairly stable from the mid-twenties to the late thirties, and then decline over the last decade of reproductive life until the cessation of ovarian cycling at menopause (Ellison 1996). The shape of this trajectory tends to be sharper in its rise and fall for any given individual than for the cross-sectional average of a population as a whole. The latter trajectory tends to be broader due to the variance in the timing of the rise and fall of individual patterns. Nevertheless, the three-phase pattern of female fecundity by age is recognizable across a broad range of human societies with different ecologies, cultures, and genetic backgrounds, and hence likely represents a common feature of human reproductive biology. Two features of this pattern attract the attention of evolutionary biologists: the relatively slow climb to full fecundity, and the ineluctable decline to zero fecundity at menopause. Originally misidentified as a period of 'adolescent sterility' (Montagu 1946), the gradual increase in fecundity following menarche can extend over five or more years for individuals (Apter et al. 1978). There is some evidence that those who experience menarche late may be on a slower trajectory of reproductive maturation in general, a pattern that includes a longer, slower rise to peak fecundity after menarche (Vihko and Apter 1984). If female fecundity primarily influences the waiting time to conception, rising fecundity represents a decrease in the ratio of the period of 'metabolizing for one' to the period of 'metabolizing for two' and hence an overall increase in reproductive effort. A general prediction of life-history theory is that reproductive effort should increase as reproductive value (a weighted expectation of future offspring) decreases. Younger females have a greater expectation of future offspring than older females simply because more of their reproductive career is in front of them. Thus any reduction in survival probability as a consequence of a trade-off with current reproduction carries a greater fitness cost for younger than for older females. To put it another way, survival probability weighs more heavily in the fitness calculations for younger than for older females. So we should not be surprised to uncover less than peak reproductive effort early in a woman's reproductive career. We might also expect that the rate at which she should approach peak reproductive effort would vary with the overall mortality risk she faces and the magnitude of the trade-off between reproductive effort and survival. Where the cost in survivorship associated with an increase in reproductive effort is low, reproductive effort should increase more steeply, as it apparently does for human females in favorable environments. The decline in fecundity with age late in the reproductive career and its eventual cessation at menopause present different theoretical difficulties. But this pattern also appears to be the result of quite different physiological mechanisms. Menopause itself is not a consequence of changes in hormonal feedback sensitivities, but rather a simple consequence of follicular depletion.
Follicular supply in virtually all birds and mammals is fixed early in development, and follicular attrition begins equally early and proceeds at a more or less constant rate throughout life. Hence any female mammal that lives long enough will eventually exhaust her follicular supply, and without follicles to produce steroids she will cease to experience reproductive cycles. This aspect of human reproductive physiology is not evolutionarily novel. The reasons for this particular strategy of gamete production probably lie in the selective pressures on oocyte quality generated by the drastic constraints on lifetime embryo production imposed by internal gestation and avian clutch size (Ellison 2001). Why menopause persists at a relatively young age in a species in which significant numbers of individuals live well beyond the age of menopause is a different question. Hawkes (2004) has proposed that post-reproductive females contribute more to their own inclusive fitness by investing in their grand-offspring than they would by investing in additional offspring of their own. Hill and Hurtado (1991), among others, have argued that such an inclusive fitness effect would be too weak, given realistic assumptions about formative human demography. An alternative possibility is that the early developmental stage at which mammalian follicular supply is determined, a developmental position that may contribute to the protection of oocyte quality from teratogenic effects, may also make it very refractory to selection (Ellison 2001). A brief flurry of excitement generated by reports suggesting that ovarian follicular supply could be 'restocked' in mammals late in life has since subsided in the face of contrary evidence (Johnson et al. 2004).

Contemporary medical implications

Metabolic syndrome

Because reproductive and metabolic hormones are intimately involved in regulating life-history trade-offs and managing energy allocation decisions, it should come as no surprise that dysregulation of these systems often leads to co-morbidity of body composition, glucose regulation, and reproduction. Many of these linkages are present in what has become known as 'metabolic syndrome.' In women, obesity and poor glucose regulation are linked to elevated androgen levels and ovarian pathologies such as polycystic ovarian disease. In men, age-related decline in testosterone can be linked to decreases in lean body mass and increases in fat mass, increasing insulin resistance, and increased risk of cardiovascular disease.


Cancer

The same linkages between the hormonal regulation of reproduction and metabolism produce a positive correlation between cumulative exposure to reproductive hormones and the risk of cancer, especially but not exclusively reproductive cancer. Tumors respond to many of the same signals of metabolic state as healthy tissues, so that both cellular proliferation and trophic growth of cancerous cells are stimulated by positive energy balance, both short and long term (Jasienska and Thune 2001). Positive correlations between physical size and cancer risk, which have long been noted, probably reflect these linkages (Ellison 1999). Because reproductive tract tissues are particularly responsive to signals of reproductive state carried by gonadal steroids, high levels of cumulative exposure to androgens (in men) and both estrogens and progestagens (in women) have been associated with elevated risks of prostate, testicular, breast, ovarian, uterine, and cervical cancer. The most effective current treatments work by removing, blocking, or disrupting those signals.

Hormonal supplementation

Androgen supplementation has become an increasingly popular treatment for conditions and health complaints associated with aging. Inadvertently, clinicians are manipulating an agent associated with greater reproductive effort without fully appreciating the potential effects on immune function. Indeed, central motivations for androgen supplementation are decreased libido, increased adiposity, and muscle atrophy, all reflections of well-described trade-offs associated with reproductive effort and survivorship (Bribiescas 2001a). Given the broad range of non-pathological variation in testosterone levels both between and within populations, very little attention has been paid to the effects of testosterone supplementation in various populations. Bentley (1994) has argued, for example, that differences in endogenous ovarian hormone physiology in different populations should be taken into consideration in the administration of oral contraceptives to healthy women. Differences in clearance rates suggest that identical dosages may have very different effects on circulating levels and may result in unwanted health effects. Similar investigations have not been conducted in men, although there are hints of analogous effects. Wang and colleagues (2004a) found no difference in testosterone clearance rates between white and Asian men; however, all of the white subjects were born and raised in the United States, while three of the Asian men were born outside of the United States. An investigation by Santer et al. (1998) compared Chinese men living in the United States (Pennsylvania) and in Beijing, China, and found significant differences in testosterone production rates, free testosterone, and sex hormone binding globulin (SHBG) levels. In light of the quest for a male contraceptive and the growth of testosterone supplementation on the global market, it will be interesting to observe the long-term effects of testosterone supplementation on immune function, morbidity, and mortality in supplemented men. One would predict that men under chronic energy limitation would exhibit a greater incidence of morbidity and mortality in response to testosterone supplementation. Increased risk of prostate cancer in men receiving testosterone supplementation is a considerable concern among urologists. Recent longitudinal studies suggest that testosterone supplementation is associated with increased prostate cancer risk (Parsons et al. 2005). Older men, who tend to be the most common recipients of testosterone supplementation, are also at the highest risk of prostate cancer and hyperplasia. Testosterone supplementation may therefore carry a special risk for older men, although there is some disagreement about the relative risk.

Hormonal caveats

While hormones are useful markers and mechanisms of life-history trade-offs, it is important to note that inverse relationships between adaptive traits are sometimes not apparent when they would be expected. This is not due to an inherent flaw in the assumption of allocation, but may result from an inaccurate assessment of the relationship between the traits in question. Trade-offs between life-history demands such as reproduction and
survivorship are often evident; however, significant overlap may arise between competing physiological functions. For example, lipolysis and fat catabolism tend to be positively associated with increases in testosterone (Wang et al. 2004b), but greater adiposity can also be associated with higher testosterone, as is the case among well-nourished men compared to undernourished populations (Bribiescas 2001a). The phylogenetic history of hormones is quite ancient. Steroid hormones in particular are common to all vertebrates, invertebrates, and plants and are therefore a common trade-off currency. The effect of hormones on trade-offs will therefore be shaped in part by phylogenetic influences. For example, the physiology of a small mammal, because of its relatively higher metabolic demand, is likely to be more sensitive to changes in a metabolic hormone than that of a larger animal. The differential effects on reproduction of adiposity-reflecting hormones in small versus large mammals support this premise. For example, leptin, a lipostatic hormone, has a much more profound effect on mouse fertility than on human fertility. Similarly, short-term fasting in male rhesus macaques almost completely attenuates gonadotropin production (Cameron and Nosbisch 1991), while much longer periods of fasting are necessary to induce even modest hormone reductions in men (Klibanski et al. 1981). The implications for clinical research are enormous, for body size and differential life histories are seldom taken into consideration when developing animal models of disease.

Summary

1. Male hormones such as testosterone are intricately involved in regulating somatic investment and energy allocation between needs reflective of reproductive effort and survivorship. This trade-off is made evident by changes in reproductive hormones in response to immunological challenges.

2. Female hormones adjust energy allocation between investment in ovarian function, somatic investment, and present offspring (lactation). This likely reflects differential investment and trade-offs between present and future reproduction.


3. Metabolic hormones respond to environmental cues to sequester or liberate energetic resources such as glucose and fat. Mismatches between environmental conditions and the expression of metabolic hormones are likely to underlie the non-random incidence of obesity and diabetes in a number of ethnic populations.


4. Lifetime variation in and exposure to endogenous reproductive hormones are related to antagonistic pleiotropy, indicative of a trade-off between early benefits for reproduction and later costs to survivorship. This is likely to be expressed as population differences in the incidence of reproductive tumors, such as breast and prostate cancer.



Functional significance of MHC variation in mate choice, reproductive outcome, and disease risk Dagan A. Loisel, Susan C. Alberts, and Carole Ober

Introduction

The nervous and immune systems both serve essential sensory functions in vertebrates. Whereas the nervous system surveys the sensory landscape of the physical world, the immune system responds to an enormous diversity of self and non-self (i.e., bacterial, viral, fungal) biological stimuli. The two systems are also intimately connected by a common biochemical language (Blalock 1994). A network of shared ligands (e.g., neurotransmitters, hormones, and cytokines) and receptors enables molecular crosstalk between the two systems, facilitating intersystem coordination and intrasystem regulation (Blalock 1994; Boulanger et al. 2001). In addition, the activity of one system is crucial to the normal development and function of the other. Immune signaling plays a critical role in normal central nervous system development and function, synaptic remodeling and plasticity, and learning and behavior (Boulanger and Shatz 2004; Ziv et al. 2006). Likewise, neural structures and functions contribute to immune homeostasis and host defense (Downing and Miyan 2000). From these three concepts—the overlap of sensory function, bidirectional flow of information, and developmental interdependence—emerges the idea of an integrated neural-immune circuit (Blalock 1994). The existence of an integrated neural-immune system has profound implications for our understanding of the function and evolution of the human body’s most extraordinary genetic system: the immune genes of the major histocompatibility
complex (MHC). MHC genes are highly genetically diverse, among the most diverse in the human genome, and this diversity influences disease susceptibility and resistance. There is little doubt that the ongoing evolutionary battle against pathogens has shaped the evolution of these genes. In contrast, the suggestion that MHC genes are involved in vertebrate mate choice and reproduction, specifically in the context of sexual selection, has been met with much skepticism despite burgeoning evidence in its support. The newly emerging idea of neural-immune integration addresses issues at the heart of this skepticism because it provides a mechanism for the production and detection of MHC-based olfactory cues that would be an essential basis of MHC-based sexual selection. In this chapter, we review the empirical evidence and evolutionary theory underlying current ideas about the importance of natural and sexual selection in the evolution of MHC genes, and we discuss the possible adaptive and non-adaptive consequences of this selective scenario. Only by examining MHC biology from an evolutionary perspective can we truly appreciate the far-ranging non-immune functions of these unique genes, including their underappreciated role in vertebrate olfactory communication, mate choice, and reproduction.

Genes of the major histocompatibility complex

The major histocompatibility complex is a gene-dense region occurring in all jawed vertebrates
that contains several dozen genes involved in adaptive or innate immunity (Fig. 8.1a). These immune genes strongly influence disease resistance, tissue graft acceptance, and fetal tolerance during pregnancy (Ober and van der Ven 1997; Klein and Sato 2000; Lechler and Warrens 2000). In addition to their immune functions, some MHC genes play an important role in nervous system development and function (reviewed in Boulanger et al. 2001); MHC class I molecules, for example, are essential for normal structural and synaptic remodeling in the developing and mature central nervous system. Although details about their role in the nervous system are just emerging, it is clear that MHC function extends beyond immunity.

[Figure 8.1a: bar chart of the number of human alleles (y-axis, 0–700) at the classical class I and class II MHC loci.]
Form and function of MHC molecules

The classical MHC genes encode transmembrane glycoproteins that bind short peptides from degraded self and non-self (i.e., pathogen-derived) proteins and present them on the cell surface to circulating T cells (Figs. 8.1b,c). The range of peptides bound by an MHC molecule is determined by the composition of amino acids comprising its peptide-binding groove. Since all the peptide-presenting MHC genes differ in their peptide-binding groove sequence, each gene binds a different range of peptides. T-cell recognition of a non-self peptide bound by an MHC molecule initiates a complex cascade of immune responses designed to limit the spread or replication of pathogens: cytotoxic T cells proliferate and destroy infected cells, macrophages secrete complement protein to kill phagocytized pathogens, and B cells are activated to produce pathogen-specific antibodies (Lechler and Warrens 2000). Thus, peptide presentation by MHC genes is critical to the development and activation of immune surveillance and response. Disruption of normal MHC function usually results in profound immunopathology; the consequences range from severe immunodeficiency and death to autoimmune diseases and tumor growth.
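The logic of peptide presentation described above—each MHC variant binds a distinct range of peptides, so carrying two different variants widens the range presented—can be sketched as a toy model. The allele names, binding "motifs," and peptides below are invented for illustration; they are not real HLA data.

```python
# Toy model: an MHC allele is represented by the set of two-letter
# peptide "motifs" its binding groove can accommodate (hypothetical).
ALLELE_MOTIFS = {
    "HLA-X*01": {"KV", "RL", "AY"},
    "HLA-X*02": {"GF", "RL", "QW"},
}

def presented(genotype, pathogen_peptides):
    """Peptides from a pathogen that at least one of the individual's
    MHC molecules can bind and present to T cells."""
    motifs = set().union(*(ALLELE_MOTIFS[a] for a in genotype))
    return {p for p in pathogen_peptides if p[:2] in motifs}

peptides = {"KVDDA", "GFNNT", "MMHHC"}  # hypothetical pathogen peptides
het = presented(("HLA-X*01", "HLA-X*02"), peptides)
hom = presented(("HLA-X*01", "HLA-X*01"), peptides)
# The heterozygote presents every peptide either homozygote can,
# the logic behind heterozygote-advantage models of MHC diversity.
```

Here the heterozygote presents both "KVDDA" and "GFNNT", while the HLA-X*01 homozygote presents only "KVDDA"; this is the sense in which heterozygosity broadens immune surveillance.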

Evolution of MHC genes
Figure 8.1 Structure and diversity of human MHC genes. (a) The MHC region encompasses roughly 4 million bases on chromosome 6p and contains > 200 genes, approximately 40% of which are involved in immunity. Among these genes, the classical class I genes (shown as open ovals) and class II genes (shown as open rectangles) show extraordinary levels of diversity in humans. Nonclassical class I and class II genes (shown as filled ovals and filled rectangles, respectively) are generally less diverse. (b) The exon structure of a class II gene. (c) The structure of a class II molecule. An alpha and beta chain dimerize to form the functional molecule, which, together with the bound peptide, is expressed at the surface of immune cells.

Perhaps the most striking property of the MHC is its extraordinary genetic diversity. The classical MHC genes, referred to as human leukocyte antigens or HLA genes, are among the most polymorphic in the human genome, with hundreds of alleles at some loci (Fig. 8.1a) (Garrigan and Hedrick 2003). Nucleotide diversity in the MHC region can reach levels approximately two orders of magnitude greater than the genome average (Garrigan and Hedrick 2003). This variation is not evenly distributed across MHC loci, however. Both polymorphisms and non-synonymous (i.e., amino acid altering) changes occur in excess in the codons involved in peptide binding, suggesting that selection favors variation at these sites (Hughes and Yeager 1998). In addition, several other features of MHC diversity are consistent with the action of selection: too many MHC alleles are observed in most populations,
[Figure 8.2 schematic: over evolutionary time, pathogen-driven selection, reproductive selection (acting through MHC-based fetal tolerance), sexual selection (acting through MHC-mediated mate choice), and non-selective forces (drift, gene flow, etc.) shape MHC influences on survival and reproduction. Consequent characteristics of peptide-presenting MHC genes: high genetic diversity; uniform allele frequencies in populations; extensive diversification among alleles; deficiency of homozygotes; survival of ancient polymorphisms; high linkage disequilibrium.]

Figure 8.2 Selective forces driving MHC evolution. The standing diversity of MHC genes is subject to a number of selective and non-selective forces. In humans, for example, the remarkable molecular and population genetic features of MHC genes (e.g., high diversity, uniform allele frequencies) are attributed to the action of these evolutionary forces.

allele frequencies are too uniform, alleles differ at too many (often > 50) sites, and polymorphisms have persisted for too long to be consistent with neutral evolution (Fig. 8.2) (Apanius et al. 1997; Meyer and Thomson 2001; Garrigan and Hedrick 2003). The consensus interpretation of these results is that selection favoring the induction and maintenance of MHC diversity—i.e., balancing selection—has greatly influenced MHC gene evolution. But what are the ultimate drivers of this selection? Because MHC genes play a fundamental role in immunity and contribute to disease risk, it has long been suspected that pathogen-mediated selection maintains MHC diversity (references in Apanius et al. 1997). However, pathogens are but one of several potential drivers of MHC diversity that are consistent with the observed data (Fig. 8.2). There is considerable evidence that non-pathogen-mediated mechanisms, such as sexual selection, autoimmunity, and reproductive selection, also shape the pattern of MHC
evolution (Apanius et al. 1997; Meyer and Thomson 2001). In the following sections, we describe the pathogen-mediated and non-pathogen-mediated models of selection, the predictions of these models, and the empirical evidence that supports them.

Pathogen-mediated selection on MHC genes

Pathogen-mediated selection operates when MHC gene variants (i.e., alleles) differ in their ability to protect against infectious organisms. Ample evidence for this phenomenon exists in humans, in that specific human MHC alleles are associated with susceptibility and resistance to a number of infectious diseases, including HIV, malaria, tuberculosis, hepatitis, leishmaniasis, and leprosy (reviewed in Lechler and Warrens 2000; Shiina et al. 2004). Similar correlations between MHC alleles and host resistance to bacterial, viral, fungal, and
parasitic infection were observed in experimental infection studies of laboratory (e.g., mice and rats) and captive-raised (e.g., chickens, cows, and fish) animals (reviewed in Apanius et al. 1997; Penn 2002; Bernatchez and Landry 2003; Sommer 2005b). Finally, recent field studies have demonstrated that MHC variation influences pathogen resistance in wild populations of fish, sheep, snakes, and mice (reviewed in Sommer 2005b). These results from wild populations are particularly informative, as they illustrate the potential for MHC genes to impact fitness in the natural world. In addition to the allele-specific effects described above, MHC-heterozygosity effects on disease have been observed in humans, laboratory mice, and captive-raised fish (reviewed in Penn 2002; Sommer 2005b). Specifically, MHC heterozygotes were more resistant to infection or more efficient in recovering from infection than homozygotes; this effect was most pronounced in studies involving serial or multiple-pathogen infections, or infections caused by quickly evolving viruses like HIV (Penn 2002; Sommer 2005b). MHC heterozygotes may be more resistant to infection than homozygotes due to either heterozygote advantage (i.e., heterozygotes are more resistant than the average of the two homozygotes) or heterozygote superiority (i.e., heterozygotes are more resistant than either homozygote) (Penn 2002; McClelland et al. 2003). Three theoretical models of pathogen-driven balancing selection have been developed to explain how MHC diversity is maintained (reviewed in Meyer and Thomson 2001; Hedrick 2002). First, the frequency-dependent selection model posits a cyclical coevolutionary arms race in which the selective value of an allele is inversely proportional to its frequency in the population. New or rare alleles have a selective advantage because few pathogens have adapted to them; this advantage declines as the rare alleles increase in frequency and pathogens evolve resistance. 
The resulting oscillations in allele frequencies prevent alleles from becoming fixed or eliminated, thereby maintaining MHC variation. Second, in the fluctuating selection model, fitness values change as a function of pathogen frequency or intensity, not as a function of allele frequency as is observed in the
frequency-dependent model (Meyer and Thomson 2001). Fluctuations in the temporal or spatial patterns of pathogens result in selection favoring different MHC alleles at different times, irrespective of their frequency in the population. The changing selective landscape implicit in this model could maintain the high allelic and nucleotide diversity characteristic of peptide-presenting MHC genes (Hedrick 2002). Finally, heterozygote superiority could maintain MHC polymorphism if heterozygotes are more resistant to pathogens, and thus have higher fitness, than both of the homozygotes (Hughes and Yeager 1998; Penn 2002). The three models are not mutually exclusive, and it is likely that their effects combine to maintain the extreme diversity of MHC genes.
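As an illustration only, the frequency-dependent model described above can be caricatured with a few lines of deterministic replicator dynamics, in which an allele's fitness declines with its own frequency. The selection coefficient and starting frequencies below are arbitrary, not taken from any MHC study.

```python
def next_generation(freqs, s=0.5):
    """One generation of negative frequency-dependent selection:
    each allele's fitness falls as it becomes more common."""
    fitness = [1.0 - s * p for p in freqs]
    mean_fitness = sum(p * w for p, w in zip(freqs, fitness))
    return [p * w / mean_fitness for p, w in zip(freqs, fitness)]

# Start with one common allele and two rare ones.
freqs = [0.90, 0.06, 0.04]
for _ in range(200):
    freqs = next_generation(freqs)
# No allele is fixed or lost; frequencies equalize near 1/3 each --
# the qualitative outcome that maintains polymorphism.
```

The rare alleles rise and the common allele falls until all three settle near equal frequency, the signature by which balancing selection prevents fixation or loss. (Real dynamics also involve pathogen coevolution and drift, which this sketch omits.)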

Sexual selection on MHC genes

A prediction of the pathogen-mediated mechanisms of MHC evolution is that individuals may increase their fitness by employing MHC-dependent mating preferences that increase the production of offspring with enhanced disease resistance (Penn and Potts 1999). MHC-dependent mating preferences could increase the immunological resistance of offspring by any of the following, non-mutually exclusive processes:

1. producing MHC heterozygotes with enhanced resistance to multiple pathogens;
2. increasing the disparity between the offspring and parental MHC genotypes (in order to protect against pathogens adapted to the parental genotype);
3. facilitating the avoidance of genetic incompatibility at MHC loci or of genome-wide inbreeding; and
4. producing offspring with optimal levels of MHC diversity (Penn 2002; Sommer 2005b).

At the level of individual choice, individuals may potentially maximize the fitness of their offspring by choosing mates with particular MHC characteristics. For example, individuals may choose mates based on whether they carry ‘good genes,’ which confer additive benefits on offspring and may be indicated by condition-dependent indicator traits. If mate choice is based on good genes,

Figure 8.3 Models of mate choice. Both examples show female choice. (a) Mate choice for ‘good’ genes. In this model, the A allele signals good genes. Mating with the AA males is preferred by all females (solid line), regardless of their own genotype, because it maximizes the number of offspring inheriting the good gene. (b) Mate choice for ‘compatible’ genes. In this model, the A and B alleles represent compatible genes. Females prefer males who are most different from themselves. The heterozygous AB female will not prefer any one male genotype because all matings will result in half her offspring being heterozygous (dotted line). (Adapted from Neff and Pitcher 2005).

then all individuals in a population will have similar preferences (Fig. 8.3a). Alternatively, individuals may choose mates whose genes are most ‘compatible’ with their own, so as to provide offspring with the most adaptive gene combinations (Neff and Pitcher 2005). Mate choice based on compatible genes will result in incongruent mate choice patterns within a population (Fig. 8.3b). MHC genes have been implicated as a target of mate choice for both good genes and compatible genes in non-human animals and in humans; the evidence for both is reviewed below.

MHC-mediated mate choice in non-human vertebrates

The first evidence for MHC-mediated mating preferences indicated that male and female laboratory mice preferred MHC-dissimilar mates (reviewed in Alberts and Ober 1993; Penn and Potts 1999). Subsequent laboratory mouse studies reported conflicting results, an inconsistency likely due to the methodological shortcomings of those studies. More compelling evidence for MHC-mediated mate choice in mice came from studies of wild-derived house mice living in large, semi-natural enclosures (Potts et al. 1991; Potts et al. 1992). Here, a significant population-level deficiency of MHC-homozygous offspring was observed and shown to be due to female choice for MHC-dissimilar males
(Potts et al. 1991). Subsequent experiments demonstrated that cross-fostering (rearing female mouse pups in MHC-dissimilar families) reversed MHC-disassortative mating preferences, suggesting an important role for post-natal imprinting in the development of MHC-mediated mate choice (Penn and Potts 1998b). Work on natural populations of non-model vertebrates has detected MHC-dependent mate choice in species of fish, birds, lizards, and rodents (Paterson and Pemberton 1997; reviewed in Bernatchez and Landry 2003; but see also Ekblom et al. 2004; Westerdahl 2004; Piertney and Oliver 2006). These studies highlight the complexity and context-dependent nature of MHC-based mating preferences. In some studies, particular MHC alleles appear to be targets of mate choice for good genes, while in others individuals appear to choose mates based on MHC similarity (or dissimilarity) to themselves, suggesting compatible genes that enhance offspring quality via interactions between maternal and paternal MHC genotypes (Fig. 8.3). Evidence consistent with mate choice for good genes came from a study of ring-necked pheasants. In these birds, MHC genotype was associated with variation in both male spur length, a condition-dependent ornament preferred by females, and adult male annual survival (von Schantz et al. 1997). This effect was not due to heterozygosity per se, as spur length did not differ between MHC heterozygotes and homozygotes. Because MHC genotype affected spur length and females generally preferred long-spurred males, these results are generally consistent with the MHC as good genes model (von Schantz et al. 1997). Several other studies found evidence for mate choice based on MHC as compatible genes, and these results suggest that the mating preferences can take at least three forms.
First, as in the early laboratory mouse experiments, there is evidence for mate choice for MHC-dissimilar mates in studies of free-living yearling female savannah sparrows, wild-caught Atlantic salmon, and, possibly, free-ranging sand lizards (reviewed in Bernatchez and Landry 2003; Piertney and Oliver 2006). Second, data consistent with mate choice for MHC-similar mates were observed in a wild population of
house sparrows (Bonneaud et al. 2006) and in free-ranging Malagasy giant jumping rats (Sommer 2005a). Although these results are contrary to current thinking about MHC mate choice patterns, it has been hypothesized that a preference for MHC-similar mates may emerge when selection favors local adaptations or co-adapted gene complexes (Bonneaud et al. 2006). Mate choice for MHC-similar mates would not contribute to the maintenance of MHC diversity. Third, individuals in some species appear to choose mates that are optimally diverse, i.e., neither too similar nor too dissimilar to themselves at MHC loci. For example, in odor preference trials of wild-caught three-spined sticklebacks, females chose males with levels of MHC diversity that would facilitate the production of offspring with an intermediate number of MHC sequence variants (Aeschlimann et al. 2003). Subsequent infection experiments showed that sticklebacks possessing an intermediate level of MHC diversity were most resistant to parasite infection (Wegner et al. 2003). Finally, individuals may simultaneously integrate information about both good genes and compatible genes into their mate choice decisions (Colegrave et al. 2002; Mays and Hill 2004; Neff and Pitcher 2005). For example, in a laboratory setting, female mice evaluated and utilized both male scent marking rate (as a good genes indicator) and MHC similarity (as a compatible genes indicator) in their mate choice decisions (Roberts and Gosling 2003). Female choice was also shown to be phenotypically plastic. When variation in scent marking was small, MHC similarity was a significant predictor of female preference, but when variation was large, MHC similarity was not a significant predictor (Roberts and Gosling 2003). This suggests that females may be able to optimize fitness by using different choice criteria in different social, environmental, and genomic contexts (Colegrave et al. 2002; Mays and Hill 2004; Neff and Pitcher 2005).
Mate choice decisions may therefore involve trade-offs between good and compatible genes, potentially complicating the detection of MHC-mediated mate choice in nature. Thus, the failure to detect MHC-mediated mate choice when it is occurring and discrepancies among studies that detect an effect may simply reflect natural
differences in the relative importance of good versus compatible genes.

Role of the MHC in human mate choice

Evidence that MHC genes (i.e., HLA genes) influence human mating preferences comes from both odor preference experiments and population-based studies. In the former, researchers tested for associations between odor preferences and MHC similarity. In a well-known example, college students were asked to rate the pleasantness of odors from t-shirts worn by other MHC-similar and -dissimilar unrelated individuals (Wedekind et al. 1995; Wedekind and Furi 1997). In general, women rated the odors of MHC-dissimilar men as being ‘more pleasant’ than those of MHC-similar men, although this preference was reversed for women on oral contraceptives (Wedekind et al. 1995; Wedekind and Furi 1997). Men also tended to rate MHC-dissimilar women as more pleasant, although the results were not statistically significant (Wedekind and Furi 1997). In addition, in these same university students, there was evidence of an interaction between MHC genotype and the preference for particular perfume scents (Milinski and Wedekind 2001). These results suggest that humans exhibit MHC-based odor preferences. However, these odor preferences must reflect behavioral decisions with fitness consequences in order to be relevant to the evolution of MHC genes. Odor preference studies have also provided insight into the mechanisms of preference development. Jacob and colleagues (2002) showed that women can discriminate between men’s odor samples as a function of the number of shared MHC alleles. Overall, women most preferred odors from men with whom they shared an intermediate number of MHC alleles (mean of 2.3 matches to the woman out of a range of 0 to 6 matches), while the least preferred odors were from men with significantly fewer matches (mean of 1.5) (Jacob et al. 2002). Strikingly, the women’s preferences were based solely on matches to the alleles each woman inherited from her father.
This remarkable finding suggests that odor preference development in humans may be sensitive to paternally inherited MHC alleles rather than just behavioral imprinting on familial
MHC-associated odors, as it appears to be in mice (Jacob et al. 2002; Potts 2002). To directly assess the role of MHC genes in mate choice, researchers have also analyzed actual patterns of mate choice, for example by comparing the observed level of MHC allele sharing between couples to that predicted under random mating. Retrospective studies of outbred, ethnically diverse human populations have consistently failed to find evidence for MHC-mediated mate choice (Alberts and Ober 1993; Meyer and Thomson 2001). This is unsurprising, given that most of these studies involved large, outbred, and poorly defined populations, and were confounded by ethnic, racial, or cultural mating preferences. Two studies examined mating preferences in more isolated, well-defined human populations. Hedrick and Black (1997) looked for MHC effects on mating in isolated South Amerindian tribes that showed population-level deficits of MHC homozygotes (suggestive of selection favoring heterozygotes). They found no evidence for MHC-dependent mating preferences in their comparison of observed and expected sharing of a two-locus MHC haplotype in mating pairs. However, given the technological (i.e., low resolution of only two loci) and methodological (i.e., small sample size and no control for inbreeding or population stratification) limitations of the study, only extremely strong selection for MHC-mediated mate choice would have been detectable (Penn and Potts 1999). The second human population study involved a moderately inbred, ethnically homogeneous religious isolate living on communal farms in the United States and Canada (Ober et al. 1997, 1999). To determine whether mate choice was random with respect to MHC genotype in 411 married couples, Ober and colleagues (1997, 1999) compared the observed number of couples in which the husband and wife matched at a 16-locus MHC haplotype to the number of couples expected to match given the population and mating structure.
They found that significantly fewer couples matched at the 16-locus haplotype than expected; this result was confirmed in an independent analysis that was robust to population stratification and inbreeding (Genin et al. 2000). In addition, in couples that did share MHC haplotypes, the matched haplotype
was inherited from the mother significantly less often than expected, though this was largely due to a single haplotype (Ober et al. 1997, 1999). This deficiency of maternally inherited matched haplotypes suggests a role for imprinting in mate choice. The results of Ober and colleagues’ mate choice studies suggest that MHC (or closely linked) genes can have a strong effect on mating preferences, consistent with a pattern of disassortative mating for MHC haplotypes. This preference for MHC-dissimilar mates is also consistent with, and may be partly responsible for, the population-level deficiency of homozygotes for MHC haplotypes observed in this population (Robertson et al. 1999). The ability to detect a strong MHC effect on mate choice was likely enhanced by specific features of the population, such as the relatively low number of segregating MHC haplotypes and homogeneity of important social and cultural factors (e.g., ethnicity, religion, education, and income) (Ober et al. 1997).
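The observed-versus-expected comparison at the heart of these couple studies can be sketched as a simple permutation test. The data and function names below are invented for illustration; the published analyses additionally modeled the population's known mating structure, stratification, and inbreeding rather than assuming simple random re-pairing.

```python
import random

def count_matches(husbands, wives):
    """Number of couples sharing at least one MHC haplotype."""
    return sum(1 for h, w in zip(husbands, wives) if set(h) & set(w))

def permutation_test(husbands, wives, n_perm=10_000, seed=1):
    """Compare observed haplotype sharing to the null of random pairing:
    repeatedly re-pair husbands with wives at random and ask how often
    sharing is as low as (or lower than) observed."""
    rng = random.Random(seed)
    observed = count_matches(husbands, wives)
    shuffled = list(husbands)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_matches(shuffled, wives) <= observed:
            as_extreme += 1
    return observed, as_extreme / n_perm

# Toy data: each spouse carries two haplotypes (labeled by letters);
# in these hypothetical couples no pair shares a haplotype.
husbands = [("A", "B"), ("C", "D"), ("E", "F"), ("A", "C"), ("B", "E")]
wives    = [("C", "E"), ("A", "F"), ("B", "D"), ("D", "F"), ("A", "D")]
obs, p = permutation_test(husbands, wives)
```

A small `p` would indicate fewer shared haplotypes among real couples than random pairing predicts, the pattern interpreted as disassortative mating; with only five toy couples the test has little power, which is exactly the sample-size limitation noted for the Hedrick and Black study.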

Evolutionary implications of MHC-mediated mate choice

Evidence for MHC-mediated mate choice has been observed in several vertebrate taxa (e.g., mammals, fish, birds, and reptiles), suggesting that this phenomenon may be relatively widespread in nature (Slev et al. 2006). The evolution of MHC-mediated mating preferences is favored under the models of pathogen-mediated selection proposed to drive MHC gene evolution, and, like pathogen-mediated selection, is capable of contributing to the maintenance of MHC diversity. The occurrence of MHC-mediated mate choice will affect the long-term evolutionary trajectory of the MHC genes, as well as the population genetic characteristics, i.e., the MHC allele frequencies, that influence the relative risk of disease in contemporary populations. Since MHC-mediated mate choice is essentially a mechanism to preferentially produce offspring with the genetic qualities favored by selection, the evolution of strong MHC-mediated mating preferences in a population should reduce the incidence of infectious, genetic, and/or reproductive diseases. Therefore, poor mate choice decisions may have dire consequences. In the three-spined stickleback fish, where females prefer mates that best complement
their own MHC diversity (Aeschlimann et al. 2003), a suboptimal mate choice decision results in offspring more susceptible to parasite infection (Wegner et al. 2003). In the human population that showed a deficiency of couples that matched at a 16-locus MHC haplotype (Ober et al. 1999), pregnancy loss rate was significantly higher in couples with matching MHC haplotypes (discussed later in this chapter: Ober et al. 1998). Although many unanswered questions remain, MHC-mediated mating preferences can potentially influence host immunity and disease risk, and contribute to the extraordinary evolution of the MHC.

MHC-linked olfactory cues

For individuals to make mate choice decisions based on MHC characteristics, they must be able to determine the MHC genotype of their potential mates and, under a compatibility paradigm, of themselves. Following the first reports of MHC-mediated mate choice, it was hypothesized that MHC genes might influence body odors. Research on olfactory communication in rodents has clearly demonstrated the validity of that hypothesis (reviewed in Penn and Potts 1998a; Yamazaki et al. 1998). Behavioral experiments have repeatedly shown that mice and rats (both trained and untrained) can distinguish among odors derived from MHC-disparate, but otherwise genetically identical, individuals (Penn and Potts 1998a). Mice, for example, were able to differentiate between odor samples from mice that differed across the entire MHC region, differed at a single MHC gene, or differed by only a few peptide-binding groove residues of a single MHC gene (Penn and Potts 1998a; Yamazaki et al. 1998; Penn 2002). This ability to detect MHC-associated odor differences extends across species boundaries, as rats can distinguish among odors from MHC-dissimilar mice and humans, and humans can recognize differences in the odor of MHC-dissimilar mice (references in Penn and Potts 1998a).

Influence of MHC peptide-binding region on odor

Considering that MHC alleles are defined by variation in the peptide-binding region, that this variation influences pathogen resistance and mate choice, and that animals can discriminate via olfaction among individuals differing only in this region, it follows that the peptide-binding properties of MHC molecules influence the odor profile of an individual. Four hypotheses to explain how the peptide-binding groove of MHC molecules could influence odor have been proposed (reviewed in Penn and Potts 1998a; Yamazaki et al. 1998; Penn 2002). First, MHC molecules or fragments thereof may act as odorants; second, the unique array of peptides bound by an individual’s MHC molecules may function as odorants; third, partially degraded MHC molecules may assume a new function as carriers for circulating volatile odorants; and fourth, MHC molecules may shape an individual’s specific population of commensal microflora, which then produce odorants. Because the proposed mechanisms are not mutually exclusive, they may be operating simultaneously to influence an individual’s odor profile (Penn and Potts 1998a; Penn 2002). Recently, however, several independent lines of evidence have converged in support of the second hypothesis: the peptide ligands bound and displayed by MHC molecules also function as chemical signals of individuality (Boehm and Zufall 2006).

MHC peptide ligands as olfactory cues

Although the mechanistic details of how MHC peptide ligands serve as olfactory cues are still emerging, the working hypothesis is that MHC/peptide complexes are proteolytically shed from the cell surface, partially degraded, and then dissolved in bodily fluids, such as urine, serum, saliva, and sweat. The degradation of the MHC/peptide complexes releases the peptide ligands, which are then free to interact with other types of receptors, e.g., those expressed in olfactory sensory neurons (Boehm and Zufall 2006). Because the range of MHC-bound peptide ligands reflects the structural diversity of the peptide-binding groove of the MHC molecules, this hypothesis directly links an individual’s MHC genotype with their body odor phenotype. The ability of MHC peptide ligands to act as olfactory cues of individual identity has recently been tested in three distinct biological contexts. First, in a behavioral assay of olfactory assessment, male mice spent significantly more time investigating female urine supplemented with MHC class I peptide ligands specific for a different mouse strain than they spent investigating female urine supplemented with peptides specific for their own strain (Spehr et al. 2006). Even among the many olfactory cues present in mouse urine, the signal produced by the addition of peptide ligands was sufficient to alter odor preferences. Second, Leinders-Zufall and colleagues (2004) used the bioassay of pregnancy failure to demonstrate that MHC class I peptide ligands act as strain-specific chemosensory signals of MHC genetic identity. Specifically, the probability of pregnancy loss was greatly increased when female mice were exposed to a familiar male urine sample supplemented with unfamiliar peptide ligands. The third example of MHC peptide ligands influencing behavior was observed in a set of odor preference experiments involving three-spined stickleback fish (Milinski et al. 2005). Gravid female sticklebacks prefer the tank water of males possessing levels of MHC diversity that optimally complement their own diversity, i.e., result in offspring with intermediate levels of diversity. The addition of synthetic MHC class I- and II-specific peptide ligands increased the attractiveness of water from males with suboptimal diversity and decreased the attractiveness of water from males with optimal (or supraoptimal) diversity, presumably by signaling that those males were more diverse than they really were (Milinski et al. 2005; Boehm and Zufall 2006). Thus, in both mice and sticklebacks, exposure to MHC peptide ligands significantly modifies behaviors associated with social recognition, mate choice, and reproduction.

Detection of MHC-mediated odors

The small, nonvolatile MHC class I-specific peptide ligands that modify MHC-mediated social behaviors are detected by specialized sensory neurons in both the main olfactory epithelium and the vomeronasal organ in mice (Leinders-Zufall et al. 2004; Spehr et al. 2006). In both of these olfactory organs, peptide ligands were detected in an allele-specific manner; i.e., structurally distinct peptide ligands induced different sensory neuron activation patterns. In vivo testing showed that class I peptide ligand-specific activation in the main olfactory epithelium affected odor familiarity, while activation in the vomeronasal organ was sufficient to induce pregnancy failure (Leinders-Zufall et al. 2004; Spehr et al. 2006). The identity of the receptors responsible for recognition of the peptide ligands is unknown. Several candidates for such receptors have been proposed, such as MHC-linked olfactory receptor genes and the V2R pheromone receptors. However, this area requires further study (Boehm and Zufall 2006).

Peptide binding as an integrating principle in MHC evolution

The discovery of a chemosensory function for MHC peptide ligands has important implications for immunity, behavior, and MHC evolution. First, it provides insight into the molecular mechanisms by which an individual can detect olfactory cues of MHC composition in others, can process that information to determine quality or compatibility, and can subsequently bias its behavioral response to maximize its fitness. Second, it serves as a vivid example of neural-immune integration, as the structure of MHC peptide ligands conveys valuable information to both the T-cell receptors monitoring the internal immunological environment and the olfactory sensory neurons surveying the chemosensory world. Finally, it reinforces the idea that MHC genes are subject to a complex mixture of selective forces and constraints. Because the binding of peptide ligands is important to both immune surveillance and chemosensory recognition, the peptide-binding properties of MHC molecules will be influenced by both natural (i.e., pathogen-mediated) and sexual selection (Slev et al. 2006). Determining whether synergistic or antagonistic consequences result from this arrangement will be a challenge for future research.

MHC and reproductive outcome

The ability of a female to influence the MHC characteristics of her offspring does not end once a mate is chosen. In the period between mating and birth, females have an opportunity to exert post-copulatory selection (also called cryptic female choice) to block or abort the production of offspring that would have decreased fitness due to diminished disease resistance, genetic incompatibility, or inbreeding depression (Alberts and Ober 1993; Apanius et al. 1997). From an evolutionary perspective, the elimination of these ‘less fit’ offspring would be adaptive, not pathological (Apanius et al. 1997). In non-human animals, there is abundant evidence showing that females use post-copulatory mechanisms to increase the production of offspring with higher genetic quality (reviewed in Tregenza and Wedell 2000; Neff and Pitcher 2005). In humans, the evidence is less convincing and more controversial (Alberts and Ober 1993; Choudhury and Knapp 2001). However, several studies strongly suggest that the MHC compatibility between mother and fetus influences pregnancy outcome: one study found an excess of heterozygotes in newborn males (Dorak et al. 2002), and a number of studies have reported associations between parental MHC sharing and reproductive success (reviewed below).

MHC sharing and reproduction in outbred human populations

Dozens of retrospective studies have examined the relationship between MHC sharing and recurrent spontaneous abortion (RSA) in outbred couples (reviewed in Ober and van der Ven 1997; Choudhury and Knapp 2001; Beydoun and Saftlas 2005). Approximately half of these studies reported increased MHC sharing in couples with RSA compared to control couples. However, among those studies showing a positive association, there was little consistency with respect to the specific MHC genes or regions thought to be responsible. In addition, a meta-analysis of the impact of MHC sharing at particular loci on RSA revealed significant heterogeneity in the odds ratio estimates among studies (Beydoun and Saftlas 2005). Similar discrepancies and inconsistencies were observed in studies of MHC sharing and other reproductive disorders, such as preeclampsia (Saftlas et al. 2005) (see Chapter 6 for a different explanation for preeclampsia). Moreover, parental MHC sharing has been shown to significantly influence pregnancy success after assisted reproductive technology (ART) in couples experiencing unexplained infertility (reviewed in Ober and van der Ven 1997; Choudhury and Knapp 2001). MHC sharing was significantly higher in couples that failed to achieve a successful pregnancy after ART, suggesting that maternal–fetal histoincompatibility improved implantation success and the probability of a successful pregnancy among these couples. Overall, however, the relationship between MHC sharing in outbred couples and reproductive pathology, such as implantation failure, recurrent spontaneous abortion, and preeclampsia, remains unresolved.

MHC sharing and reproductive outcome in an unselected population

The most convincing evidence that MHC sharing influences reproductive success comes from a series of population-based studies of the Hutterites (Ober et al. 1983, 1992, 1998). The ethnically homogeneous Hutterite population is characterized by a limited number of independent MHC haplotypes, a high natural fertility rate, large family sizes, and a relatively uniform environment (Ober 1999). The communal lifestyle of the Hutterites minimizes non-genetic influences on fertility: for example, birth control use is limited, diet is relatively uniform, smoking is prohibited, and alcohol consumption is moderate (Ober 1999). Because of these features, the Hutterite population is well suited to studies examining genetic influences on fertility and reproduction. Motivated by the hypothesis that maternal–fetal histocompatibility influenced reproductive outcome, early studies of Hutterite couples revealed a trend towards longer time intervals from marriage to each birth in couples that shared alleles at the MHC genes HLA-A, HLA-B, or HLA-DR (Ober et al. 1983; Ober and van der Ven 1997). The median completed family size of couples that shared HLA-DR alleles (6.5 children) was significantly smaller than that of couples that did not share HLA-DR alleles (9.0 children) (Ober and van der Ven 1997). To investigate the cause of the longer birth intervals and smaller completed family size, a follow-up prospective study compared the length of time from the resumption of menses after childbirth to the next positive pregnancy test in HLA-DR similar and dissimilar couples (Ober et al. 1992). The results showed that couples that shared HLA-DR alleles took more than 2.5 times longer to achieve pregnancy. In addition, pregnancy loss rates were significantly increased among couples that shared alleles at an MHC class I locus, HLA-B. The association between increased pregnancy loss rate and MHC sharing in the Hutterites was remarkable. To confirm the result, more than 100 subsequent pregnancies were added to the data set and high-resolution genotyping was extended to 16 loci across the greater MHC region (Ober et al. 1998). Once again, sharing of HLA-B alleles was a significant predictor of pregnancy loss, as was the sharing of alleles at two additional neighboring loci, HLA-C and C4. However, pregnancy loss rates were highest among couples that shared the entire 16-locus extended MHC haplotype. Although it was not possible to determine whether the primary risk factor for pregnancy loss was associated with the HLA-B locus per se, with an HLA-B-linked locus, or with the extended haplotype, the results clearly demonstrate strong effects of sharing MHC alleles on reproduction in the highly fertile Hutterite population (Ober et al. 1998).

MHC sharing, reproduction, and diversity

In utero selection for MHC-disparate fetuses could contribute to the enormous diversity of MHC alleles observed in human populations (Hedrick and Thomson 1988), particularly in homogeneous populations in which opportunities for MHC sharing among mates regularly arise. Such selection would minimize MHC homozygosity among children of couples that share MHC alleles, thereby increasing the fitness of offspring while at the same time maintaining genetic diversity in the population.

HLA-G in reproduction, immune regulation, and disease

Fetal survival during mammalian reproduction required the evolution of immunological mechanisms that balanced tolerance of the genetically foreign fetus with maintenance of adequate immune defenses against pathogens. The highly specific patterns of MHC gene expression in fetal cells at the maternal–fetal interface may represent one of those adaptations. In humans, expression of the classical MHC class I (with the exception of HLA-C) and class II molecules is absent on fetal trophoblast cells that interface directly with maternal tissues and immune cells (Ober and van der Ven 1997). Conversely, expression of the non-classical class I genes, HLA-E, -F, and -G, is prominent in fetal trophoblast cells (Ishitani et al. 2003). It is likely that these non-classical MHC molecules, particularly HLA-G, contribute to the establishment and maintenance of maternal tolerance during pregnancy, possibly by directing maternal cells toward immunosuppressive phenotypes (Carosella et al. 2003; Hunt et al. 2005; Hviid 2006). HLA-G is unlike other MHC genes in several ways. First, although HLA-G is capable of classical peptide presentation, it is characterized by relatively low levels of protein polymorphism compared to other class I genes (Fig. 8.1a). Second, the HLA-G primary transcript is alternatively spliced into seven transcripts; at least two membrane-bound forms and two soluble forms are translated into protein. Third, HLA-G expression is tightly regulated and tissue-restricted in non-pathological circumstances; expression is high on fetal cells at the maternal–fetal interface, but HLA-G is otherwise expressed only in the adult thymus and cornea and on certain immune cells (reviewed in Carosella et al. 2003; Hunt et al. 2005; Hviid 2006). HLA-G also shows remarkable immuno-regulatory abilities: in vitro experiments showed that HLA-G molecules interact directly with several immune inhibitory receptors and exhibit potent immunosuppressive properties (reviewed in Carosella et al. 2003; Hunt et al. 2005; Hviid 2006).
The ability of HLA-G to induce immunosuppressive phenotypes in immune cells is critical to the induction of immune tolerance in pregnancy (Hunt et al. 2005), but it may also be significant in other contexts, such as organ transplantation (Carosella et al. 2003). For example, the expression of soluble HLA-G in heart tissue was associated with a lower risk of rejection after transplantation; a similar pattern was observed for patients receiving kidney–liver double transplants (references in Carosella et al. 2003; Le Maoult et al. 2003).

HLA-G in reproductive, autoimmune, and inflammatory pathologies

Because HLA-G plays such a critical role in pregnancy, mutations that affect HLA-G expression may be deleterious. Indeed, a number of case-control and cohort studies have linked HLA-G variation with reproductive complications such as miscarriage, preeclampsia, and pregnancy failure after in vitro fertilization (reviewed in Hviid 2006). In the Hutterites, the presence of a single nucleotide polymorphism (-725G) in the 5’ cis-regulatory region in both partners was associated with sporadic pregnancy loss (Ober et al. 2003). In outbred couples, specific HLA-G alleles, defined by coding sequence variation, are associated with recurrent spontaneous abortion (reviewed in Hunt et al. 2005; Hviid 2006). Finally, risk of recurrent spontaneous abortion was significantly increased in Danish women who were homozygous for a polymorphic 14-base pair insertion in the 3’ untranslated region of HLA-G (Hviid et al. 2004). Aberrant HLA-G expression may also contribute to pathologies such as tumor growth, viral progression, and autoimmune or inflammatory disorders (Carosella et al. 2003; Le Maoult et al. 2003). For example, the tightly regulated, tissue-restricted patterns of normal HLA-G expression may be altered in tumor or virally infected cells so as to facilitate escape from immune surveillance and response (Carosella et al. 2003; Le Maoult et al. 2003). Moreover, ectopic expression of HLA-G has been observed in a number of autoimmune and inflammatory diseases, such as psoriasis, atopic dermatitis, asthma, multiple sclerosis, and inflammatory myopathies. This ectopic expression may represent the cause or the consequence of those pathologies. On one hand, the expression of HLA-G in adult cells could contribute to the development of pathological inflammatory responses by skewing the host immune cells toward a disease-promoting (Th2-like) phenotype. The expression of HLA-G in bronchial epithelial cells from asthma patients (Nicolae et al. 2005) and perhaps in the central nervous system of multiple sclerosis patients (Wiendl et al. 2005) supports this possibility. On the other hand, HLA-G expression may be induced in response to inflammation or injury. In this case, HLA-G molecules may limit the local autoimmune and inflammatory processes in order to protect healthy tissues, as has been observed in heart transplants (Carosella et al. 2003).

Evolution of HLA-G

The unique role of HLA-G in human reproduction and immunomodulation suggests that this gene may have experienced a different evolutionary history from the other, classical HLA genes. Whereas the classical MHC genes show strong evidence for diversifying selection on amino acids that interact with T cells or bind peptides (discussed earlier), HLA-G shows little diversity in its coding region. However, variation in HLA-G that influences its expression appears to have been the target of selection. For example, a null mutation in exon 3 (encoding the α2 domain of the protein) common in populations of African descent shows evidence of positive selection (Aldrich et al. 2002), and the highly polymorphic promoter region shows evidence of long-standing balancing selection (Tan et al. 2005). It has been suggested that high-expressing HLA-G haplotypes may serve to protect the allogeneic fetus during pregnancy, but that lower expressing haplotypes might be adaptive in the presence of in utero infection during pregnancy by allowing for a more robust maternal immune response (Aldrich et al. 2002; Tan et al. 2005). Similarly, the higher expressing haplotypes may enhance the immunomodulatory properties of HLA-G in adult cells and predispose toward immune diseases, such as asthma, while protecting healthy tissues in the case of transplants.

The cost of protection: non-adaptive consequences of MHC diversity

In humans, variation in MHC genes is associated with differential susceptibility or resistance to more than 100 diseases belonging to several disease classes: autoimmune, infectious, reproductive, cardiovascular, neurodegenerative, psychiatric, and metabolic (Klein and Sato 2000; Lechler and Warrens 2000; Shiina et al. 2004). MHC theory predicts that natural and sexual selection should favor the production of offspring with enhanced immunocompetence. Indeed, there is substantial empirical evidence that specific MHC alleles (or combinations of alleles) are associated with increased resistance to the pathogenic and parasitic infections that cause many fitness-reducing human diseases, such as HIV/AIDS, hepatitis B, malaria, and tuberculosis (Lechler and Warrens 2000; Sommer 2005b). However, it is also true that some MHC allelic variants can increase an individual’s risk of disease. For example, MHC genes are among the strongest predisposing genetic factors for autoimmune diseases (Klein and Sato 2000). Genome-wide screens have consistently reported strong linkage to the MHC region, and numerous association studies have related specific MHC alleles (or haplotypes) to susceptibility or resistance to diseases like type I diabetes, rheumatoid arthritis, and multiple sclerosis (reviewed in Lechler and Warrens 2000). Autoimmune diseases represent a pathogenic failure of self tolerance characterized by the presence of tissue-damaging self-reactive antibodies and/or T cells. The predisposing effects of MHC genes may result from their involvement in two essential parts of the antigen presentation pathway: the elimination of strongly self-reactive T cells during thymic development and the presentation of antigens to T cells in the course of normal immune surveillance (Marrack et al. 2001). Thus, the antigen-binding characteristics of MHC molecules are relevant to the induction of autoimmunity. Even though autoimmune disease may reduce fitness, many of the MHC alleles linked with increased risk of autoimmune disease are relatively common in contemporary human populations (Apanius et al. 1997; Graham et al. 2005).
Why has selection not eliminated these autoimmune-predisposing alleles from human populations? Several non-mutually exclusive hypotheses have been proposed to explain their persistence (reviewed in Apanius et al. 1997; Graham et al. 2005).


First, the benefits conferred by an autoimmune-predisposing allele in the fight against infectious, genetic, and reproductive disease may outweigh even large costs imposed by the associated autoimmune phenotypes (Apanius et al. 1997; Graham et al. 2005). As long as the disease resistance benefit is greater than the autoimmune cost, selection will maintain the allele in the population. Second, some autoimmune diseases emerge late in life (i.e., after individuals have finished reproducing), and thus impose a very low fitness cost (Apanius et al. 1997; Graham et al. 2005). In these cases, the deleterious genetic variant would be subject to very weak selection, allowing it to persist in the population for a relatively long period of time. Third, autoimmunity may have emerged relatively recently in human history, perhaps in response to novel or changing pathogens or environments. If so, there might not have been sufficient time for selection to eliminate the autoimmune-predisposing alleles (Graham et al. 2005). Finally, autoimmune disease may result from the pathogen-induced manipulation of a normal immune response into a pathogenic response towards self, for example through molecular mimicry (Apanius et al. 1997; Marrack et al. 2001). Interestingly, the situations involving the infectious-autoimmune balance and molecular mimicry could, in certain circumstances, actually contribute to the maintenance of MHC polymorphism (Apanius et al. 1997).

Conclusions

The nervous and immune systems show significant overlap and integration at many levels, e.g., molecular, developmental, and functional. The MHC gene family contributes to this integration, as MHC molecules function in immune and neural signaling processes (and influence mate choice and reproduction) in vertebrates. Given these many roles, the extraordinary diversity of classical MHC genes is probably subject to, and maintained by, multiple selective forces, including pathogen-driven selection, sexual selection, and reproductive selection. With additional research, it may be possible to determine more precisely the relative importance of these selective mechanisms in MHC evolution, as well as the potential non-adaptive consequences, e.g., autoimmune and inflammatory disease, that may result.

Summary

1. The nervous and immune systems are intimately connected by shared developmental, functional, and biochemical pathways.
2. The extraordinary diversity and remarkable evolution of MHC genes are influenced by several distinct forces, including pathogen-mediated selection, sexual selection, and reproductive selection.
3. MHC diversity influences the risk and progression of infectious, reproductive, autoimmune, and inflammatory diseases.

4. MHC genes play a significant role in olfactory communication, behavior, and mate choice in vertebrates, including humans.
5. The unique evolution of MHC genes contributed to the prevalence of autoimmune and inflammatory diseases in modern human populations.

Acknowledgments

This work was supported by NIH Grants R01 HD21244 and R01 HL72414 to C.O., NSF Grants BCS-0323553 and IBN-0322613 to S.C.A., and funds from the Duke University Biology Department and Program in Genetics and Genomics to D.A.L.


Perspectives on human health and disease from evolutionary and behavioral ecology

Beverly I. Strassmann and Ruth Mace

Introduction

Evolutionary theory can inform the study of antibiotic resistance and other processes that lead to changes in allele frequencies over time (see also Chapters 11, 13, 15, 18, 23). It can also inform the study of genetic variation in susceptibility to disease and identify diseases caused by single genes and others reflecting multiple genes and their epistatic interactions (see also Chapters 2, 3, 4, 5, 6). The focus on genetic variation is important, but it should not eclipse what is actually a much larger picture. Human behavioral ecology uses evolutionary theory to understand patterns of behavior that do not map onto modern-day genetic variation. For example, the amount of paternal investment that males give to their wives’ offspring is more likely to be predicted by cultural variation that influences confidence of paternity than by genetic variation among males (Flinn 1981; Holden et al. 2003). A given man is expected to balance his paternal effort and his mating effort to achieve a higher reproductive pay-off than would be possible if he uniformly pursued a single strategy, but switching between strategies does not require the turning on or off of any genes. Behavioral geneticists might approach the problem of dads and cads by studying genetic differences in personality or in strength and attractiveness. Human behavioral ecologists, in contrast, usually consider problems that can be framed in terms of reproductive pay-offs to alternative, flexible strategies. Longitudinal field studies and quantitative cultural comparisons play a major role. The insights of human behavioral ecology help us to understand the role of evolution in shaping patterns of health and disease that reflect behavioral, social, cultural, environmental, and lifestyle differences. As this is a very large sphere of influence, the implications for medicine and public health are profound.

Phenotypic plasticity

Phenotypic plasticity refers to variation in the phenotype that can be attributed to environmental differences rather than genetic differences (West-Eberhard 2003); this definition does not, however, invoke the false dichotomy of ‘nature versus nurture.’ Nor does it mean that learning and other phenotypically plastic traits require less genetically transmitted information than do more canalized traits. The same genotype can lead to different phenotypes in different environments, but genes nonetheless play a role in the development of phenotypically plastic traits. Unpredictable and heterogeneous environments select for traits that have a high degree of phenotypic plasticity, whereas stable environments lead to more canalized development. Phenotypic plasticity can be adaptive because it tends to generate a better fit between the organism and the environment. As noted by Alexander (1979: 90), all organisms are to a greater or lesser degree phenotypically variable in ways that enhanced reproduction in the different environments to which they were typically exposed in the past, and ‘such plasticity is almost the definition of phenotype.’



The arrowleaf plant (Sagittaria sagittifolia) is helpful for illustrating the adaptive nature of phenotypic plasticity (Schmalhausen 1949). This plant assumes three radically different forms depending on whether it grows on land or is partially or fully aquatic (Fig. 9.1). The three morphologies result from an interaction between the plant’s genotype and its environment during development. The same genotype will lead to different developmental pathways and produce different phenotypes if the environment is altered. A useful concept is the ‘reaction norm,’ which refers to ‘the array of phenotypes that will be developed by the genotype over an array of environments’ (West-Eberhard 2003: 101). Since the sensitivity to environmental change is shaped by natural selection, plant species that for many generations have germinated under more homogeneous and predictable conditions have diminished plasticity and narrower reaction norms. Thus, rather than being an external property introduced by the environment, the capacity for phenotypic plasticity is naturally selected and has a genetic basis. We use the example of the arrowleaf plant for its clarity and historical importance, but examples from animals are numerous. West-Eberhard (2003) has recently authored a detailed evolutionary synthesis on the developmental processes that generate adaptive phenotypic variation in a wide range of species. Research on phenotypic plasticity has helped to predict reaction norms for such life-history traits as age and size at maturity (Stearns and Koella 1986; Stearns 1992). A particularly interesting example is the decline in age at menarche by two to five years because of an improvement in nutrition (Stearns and Koella 1986). Poorly fed women mature later; the delay allows more time for growth, which helps mitigate offspring mortality and increases fecundity, although size at maturity is still diminished. Stearns and Koella (1986) model age at menarche as a ‘single genotype sliding along a maturation trajectory’


Figure 9.1 Arrowleaf plant (Sagittaria sagittifolia). The three different morphologies illustrate phenotypic plasticity in response to three different environments: terrestrial, semi-aquatic, and aquatic.

Source: Schmalhausen (1949) after Mettler and Gregg (1969: 66).


from early maturity at large size attained through fast growth to late maturity at small size attained through slow growth. This example is important in that it links demography to evolutionary biology and shows that reaction norms for age at menarche in humans and age at maturity in other organisms are an evolved response to selection. Stearns and Koella’s model partitions the phenotypic response to nutrition, which accounts for a three-year decline in age at menarche, from the genetic change that they predict will eventually (after several thousand years) hasten menarche by up to one year. In this volume, phenotypic plasticity in humans is illustrated by variations in hormone levels in response to energetic constraints (Chapter 7), the role of the fetal environment as a risk factor for cardiovascular disease (Chapter 19), and the role of diet and lifestyle in the etiology of degenerative disease (Chapter 20). These chapters emphasize physiological and metabolic mechanisms. The molecular mechanisms behind plasticity include genomic imprinting and other epigenetic effects that lead to heritable variation not encoded in the DNA sequence (Chapter 19). Here we focus on plasticity in reproductive and behavioral outcomes and mention the intervening mechanisms only sparingly. In human behavioral ecology, the term phenotypic plasticity is invoked to explain the role of natural selection in shaping and constraining learned behaviors and many other aspects of an organism’s flexible response repertoire. The major organ responsible for phenotypic plasticity in humans is the brain. The unusually large brain of humans is a web of adaptations that allow us to deal with our complex, shifting, and unpredictable social environment (Humphrey 1976; Alexander 1989, 1990; but see Kaplan et al. 2000). The complexity of the human social environment reflects ever-changing conflicts and confluences of interest among individuals and groups in human societies (Alexander 1989, 1990). 
Human intelligence teaches us how to be sophisticated negotiators of our social world, and we use language to manipulate this world in our interest (Alexander 1989; Dunbar 1996). Rather than pitting genes against learning, evolutionary ecologists understand that we have genes for learning. In a fluid and unpredictable environment, learning confers fitness advantages over fixed and inflexible
strategies. Behavioral ecologists leave to neuroscientists and psychologists the laboratory study of brain mechanisms and instead look at observable patterns of behavior in the real world. They expect that environmental input into the phenotype does not simply create noise, but that it leads instead to context-specific adjustments in behavior that enhance the reproductive success of individuals. A vast body of research demonstrates that even nonhuman animals, with their much smaller cognitive capacity, make adaptive ‘decisions’ that are condition-sensitive (West-Eberhard 2003). For example, if you are an echiuroid worm (Bonellia viridis), a chemical in the environment during your larval development will direct you to become male or female (Mettler and Gregg 1969). Your body evolved to respond to this chemical because it is a reliable cue regarding the fitness pay-offs you can anticipate from your ‘choice’ of gender. Reproductive ‘decisions’ involve the brain, but need not be any more conscious or purposeful than the spinal reflex or the cerebral regulation of heart rate. At a conscious level, people tend to focus on immediate goals such as attracting a mate, sex, creating families, striving for status or wealth, staying warm and sheltered from the environment, helping their kin, forming alliances and friendships, and pursuing a means of subsistence. These goals tended to translate into enhanced reproduction over the human evolutionary past. When a person pursues these proximate stepping stones, she is engaged in reproductive striving even if she assumes these goals are ends in themselves (Alexander 1979; Irons 1979). In traditional societies that allow polygynous marriage and lack contraception, these stepping stones demonstrably lead to enhanced reproductive success, but in industrial societies, proximate goals may become uncoupled from their ultimate function (Irons 1998).
For example, among the Kipsigis of Kenya, the Dogon of Mali, and nineteenth-century Mormons of Utah, wealthy, high status men had more wives and more offspring (Mealey 1985; Borgerhoff Mulder 1987; Strassmann 2003), whereas high-status Canadian men had higher mating success, but did not have more children (Perusse 1993). The critical difference was socially imposed monogamy and contraception in Canada—both recent inventions absent
over most of human evolutionary history. Human behavioral ecologists recognize that adaptation is expected only when genes and the environment are matched to each other and that after rapid environmental (including cultural) change, natural selection will lag behind (Chapter 1). Four theoretical cornerstones of evolutionary theory that might interest physicians and public health workers are: (1) kin selection, (2) life history, (3) parental investment, and (4) sexual selection theory. We present medically informative examples of these theoretical perspectives to suggest how physicians can profitably use behavioral ecology to understand a wide array of medical and public health problems, including child abuse and homicide in step families; deadbeat dads; attachment disorders; failure to thrive; female infanticide; excess male mortality from accidents, suicide, and disease; risky behavior; immunosuppression; reproductive cancer; marital violence; and genital cutting. We do not suggest that there is a gene for child abuse, marital violence, or any of the other phenomena discussed here. However, our genes give us the capacity for phenotypic plasticity, which means that we should expect a tendency for people to respond to their environments in adaptive ways. Hence, in the absence of environmental novelty and other constraints, the flexible, circumstance-dependent behaviors of humans and other organisms are expected to enhance reproductive success.

Kin selection theory

The genes found in organisms today are not a random selection of the genes of our ancestors. Instead, the genes that have survived to the present have done so because they were unusually good at making copies of themselves (Williams 1966; Dawkins 1976). Genes get copied whether individuals reproduce directly or indirectly. Direct reproduction refers to an individual’s own reproductive success, whereas indirect reproduction refers to the help that an individual gives to siblings, cousins, and other kin. These two forms of reproduction combined constitute an individual’s inclusive fitness. Evolutionary biologist W. D. Hamilton defined the precise conditions under which an individual can
enhance her genetic representation in future generations by assisting a genetic relative (Hamilton 1964). Hamilton’s Rule states that an individual will get a genetic pay-off from helping a relative if rB > C, where r is the genetic relatedness between the donor and recipient, B is the reproductive benefit to the recipient, and C is the reproductive cost to the donor (see Chapter 1). Early critics of Hamilton’s theory raised a number of objections that have since been dispelled, including the mistaken notion that kin selection must operate at the conscious level (Dawkins 1979; Alexander 1989). Other researchers worked out the mechanisms by which animals as diverse as primates, social insects, and tadpoles discriminate kin from non-kin. It turns out that animals learn who their relatives are from shared associations, for example, being raised in the same burrow, or, in the case of humans, from being actively taught (Alexander 1990). The learning required for kin recognition further illustrates an important point: learned behavior is often adaptive, and this is especially true when it occurs in social contexts that do not depart radically from past human experience (Alexander 1990). Hamilton’s rule implies that individuals are expected to flexibly adjust their behavior based on the genetic pay-offs from alternative options (see West-Eberhard 2003: 43). It predicts that humans (and other animals) will: (a) preferentially help close rather than distant genetic relatives; (b) preferentially help individuals of high reproductive value (who can produce lots of offspring in the future); and (c) preferentially help those for whom the cost of helping is comparatively low (see Alexander 1979: 156–7; Madsen et al. 2007). In humans, kin selection is a powerful theoretical framework for understanding patterns of parental investment and health vulnerabilities within families.
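Hamilton’s Rule is simple enough to evaluate directly. In this minimal sketch, the benefit and cost values are made-up numbers for illustration; only the inequality rB > C and the standard pedigree coefficients of relatedness come from the theory:

```python
def helping_pays(r, benefit, cost):
    # Hamilton's Rule: helping is favoured when r * B > C, where r is the
    # relatedness of donor to recipient, B the reproductive benefit to the
    # recipient, and C the reproductive cost to the donor.
    return r * benefit > cost

# Standard pedigree coefficients of relatedness.
relatedness = {"full sibling": 0.5, "half sibling": 0.25, "first cousin": 0.125}

# An act costing the donor one offspring-equivalent and giving the recipient
# three pays off only for a full sibling: 0.5*3 > 1, but 0.25*3 < 1.
for kin, r in relatedness.items():
    print(kin, helping_pays(r, benefit=3, cost=1))
```

The same inequality captures predictions (b) and (c) above: raising the recipient’s reproductive value raises B, and lowering the donor’s cost lowers C, so both make helping more likely to pay.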
To understand the medical and public health implications of kin selection, it is helpful to consider the role played by proximate mechanisms. Proximate mechanisms were favored by natural selection because they tended, on average, to bring about adaptive behavior. Whether a particular
mechanism is adaptive or not is context dependent, but mechanisms that tended to promote fitness in past environments were selectively favored. One such mechanism is known as ‘discriminative parental solicitude’ (Daly and Wilson 1980). Birds that nest in colonies, such as guillemots, have evolved the ability to distinguish their eggs and chicks from those of other parents, whereas birds with dispersed nesting sites lack such abilities (Birkhead 1978). In colonies the potential exists for eggs and chicks of different parents to get mixed up, so egg recognition enables parents to preferentially invest in their own genetic offspring. Humans are social animals, and, as in the guillemot, discriminative parental solicitude is well documented, especially in step-relationships (Daly and Wilson 1988).

Step-parents

‘Cruel step-parent’ stories like the Cinderella tale are widespread and reflect a demographic reality: children are at increased risk of abuse and death in the presence of a step-parent. In a Canadian city, preschoolers living with a presumed genetic parent and a step-parent were 40 times more likely to become victims of child abuse than were children living with both genetic parents (Daly and Wilson 1985). In the United States overall in 1976, fatal child abuse was 100 times more likely when a step-parent was present in the household (Daly and Wilson 1988). In modern times, homicide and child abuse are extreme behaviors unlikely to enhance the reproductive success of the perpetrators; hence they are presently maladaptive. The discriminative parental solicitude from which these pathologies arise, however, is a proximate mechanism that was presumably adaptive, on average, in the human evolutionary past (Daly and Wilson 1988). Step-parents are much more likely to abuse a child than are genetic parents, but the baseline rate of abuse in step families is still very low (322 fatalities per million Canadian preschoolers) (Daly and Wilson 2001). This brings us to another question: Why do many men benevolently parent other men’s children? In a study of men in Albuquerque, New Mexico, Anderson et al. (1999) and Lancaster and Kaplan (2000) demonstrated that paternal care can be a form of mating effort. Albuquerque
men invested the most in their genetic offspring whose mothers were current mates and the least in step-offspring whose mothers were former mates. Interestingly, they invested about the same in their genetic offspring whose mothers were former mates as they did in their step-offspring through current mates. The authors conclude that men invest in children partly for the benefit of the relationships they have with the children’s mothers. This study has implications for the public health problem of deadbeat dads who withhold the child support they owe. According to Anderson et al. (1999), the reluctance of males to pay support arises from their reluctance to ‘direct a significant portion of their mating effort budget into nonmating relationships, decreasing their ability to attract or maintain subsequent mates.’ Understanding the psychology of deadbeat dads may be a first step toward solving this pervasive public health problem (Anderson et al. 1999). When step-families are formed, explicit education and public awareness about the risks and early warning signs could help reduce the prevalence of tragic outcomes.

Adoption

Adoption may seem to contradict kin selection theory, but it is easy to understand from a behavioral ecological perspective. In traditional societies in Oceania and other places where adoption has been well studied, the adoptive parents are usually the presumptive genetic kin of the adopted child (Silk 1980). In industrial societies, adoptive parents are often aunts and grandmothers who take over when necessity demands it, as in the case of maternal death or drug abuse, or adoptive parents may be childless. The adoption of a child is arguably an altruistic and humanitarian act, yet such acts are far less common among fertile than infertile couples. If fertile couples commonly preferred adoption over raising genetic offspring, then that would indeed be difficult for evolutionary theory to explain. Psychological and emotional mechanisms of parent–infant attachment play a crucial role and enable most adoptions to be successful. In past environments these mechanisms promoted parental investment in genetic offspring, but these same mechanisms can
be redirected toward nongenetic offspring in the novel environments of today. One novel feature of modern environments is the shortening of the reproductive span due to delayed childbearing, which forces many couples to choose between having a smaller than desired family and completing their family through adoption. Over the human evolutionary past, however, the adoption of nonrelatives was probably rare. The adoption of older children is often less successful than the adoption of infants because the window of attachment is most pronounced in the first year of life (Carlson et al. 2003). In developing countries young children who lose their mothers are about five or six times more likely to die by age 5 years than are children whose mothers survive (Mace and Sear 2005). Among the Ache of Paraguay, 100% of motherless children were killed by age 1 year (Hill and Hurtado 1996: 437), although recently such a child was adopted by a bereaved mother who was unrelated (K. Hill, personal communication). Over human evolutionary history the generally high death rate of motherless infants would have diminished selection for the ability to postpone attachment until later in childhood. Not many children were left waiting for a substitute caregiver. Unfortunately, this legacy is problematic for the thousands of orphaned children who are institutionalized and who do not find families before they are toddlers or older (Carlson et al. 2003). Policy reforms that speed up adoptions would give life-long emotional benefits to these children. Another possible reason for the brevity of the window of attachment is that, over the human evolutionary past, the adults who nurtured a child during infancy were more likely to be genetic parents and close kin. Children could trust these individuals more than the step-parents and strangers who followed later.
The window of attachment is another example of the adaptive nature of learning (see Chapter 1) and shows that adaptations are a product of genes and the environment. The genes that are around today were favored in past environments, but when a novel situation (such as the orphanage) is introduced, adaptation can break down (see Chapter 1).

Life-history theory

Trade-offs

Life-history theory concerns the allocation of resources and time to vital events, ranging from growth and body maintenance to reproduction and death (Stearns 1992). As such it is relevant to most of the chapters in this book. Trade-offs are introduced in Chapter 1 and taken to their inevitable conclusion of senescence and death in Chapters 18, 19, and 23. The essential notion that trade-offs exist is fundamental to understanding human health and development: energy expended on reproduction may not be available for immune function or growth or repair, and the constant repair of a body beyond a certain age may not be the best way for genes to replicate themselves in future generations. Within those resources allocated to reproduction, how much should we devote to mating and how much to parenting? Within those resources allocated to parenting, how much should we devote to producing more offspring versus looking after the children we already have? While medical practitioners operate on the basis of preserving life by whatever means possible, natural selection will balance survival against reproductive gains to maximize not health or survival but inclusive fitness. Thus selection on our physiology and our behavioral tendencies may not always be in the best interest of our own or our children’s well-being. Human behavioral ecologists have examined the trade-offs inherent in reproductive decision-making mainly in traditional populations, usually among women who are not using contraception (Cronk et al. 2000). These women are experiencing the high energetic and health costs of childbearing in an environment where food is often scarce. It is in these circumstances that we can best understand the selection pressures that operated on our ancestors, some of which may apply even in urban environments today and others of which may not.
We review some examples of studies that have helped to reveal the trade-offs that have shaped human life-history patterns in three different areas: offspring number versus quality, parental effort and longevity, and menopause and the post-reproductive life span.


Offspring number versus quality

A pioneering study that tested predictions from life-history theory among the Ache found little supporting evidence for these trade-offs, but helped lay the groundwork for studies that came later (Hill and Hurtado 1996). One of these was by Strassmann and Gillespie (2002), who found that Dogon women in Mali, West Africa, experienced diminishing returns to reproductive success (measured as the number of offspring who survived to age 10 years) from higher fertility (measured as the number of live births) (Fig. 9.2). Parents who had many children invested less in each one. For a given child, the odds of death by age 5 years increased 25% for each additional child present in the family (Strassmann 1997). When the ratio of children to married adults in the family increased by one extra child per adult, the odds of death increased nearly threefold (Strassmann 1997). These results held up after statistically controlling for a large number of predictor variables such as wealth; thus they provide evidence for the offspring number–offspring quality trade-off. Haig mentions in Chapter 6 that closely spaced birth intervals tend to increase infant mortality, yet the strategy of rapid reproductive rate can sometimes favor maternal reproductive success even when it elevates offspring mortality (Hobcraft et al. 1983).

Figure 9.2 The relationship between female fertility (number of live births) and lifetime reproductive success (number of offspring who survived to age 10 years) in the Dogon of Mali (n = 55 women, R² = 0.32). Each ‘petal’ represents an additional data point. From Strassmann and Gillespie (2002).

This is perhaps most clearly illustrated among mothers of twins. While the overwhelming majority of human births are singletons, twinning occurs in 1–8% of pregnancies (depending on the population). Twinning is an evolutionary puzzle as the health costs and elevated mortality risks to mother and babies are well documented. In rural Gambia prior to the arrival of medical services, twins died within the first month of life at roughly double the rate of singletons—suggesting that twinning has no fitness advantage and thus is maladaptive (Sear et al. 2001). Other studies of twinning in preindustrial Finland found broadly similar results, although if both twins happened to be girls (who are slightly smaller and less energetically costly to bring to term), mortality risks to the mother and her infants were not as great (Lummaa et al. 2001). However, in both studies, the mothers who gave birth to twins experienced significantly shorter birth intervals, even if twin births were counted as only one birthing event, and mothers of twins had higher lifetime reproductive success. These findings support the idea that twinning is a by-product of polyovulation (Anderson 1990) and that ovulating from both ovaries each month increases the chances that a cycle achieves a pregnancy. Occasionally a twin pregnancy results, but twins were only carried to term in taller and healthier mothers in the Gambian population. Twin mothers tend to be the supermums of their populations, and on balance, despite their higher infant mortality, they gained enhanced reproductive success from their shorter birth intervals. In the Gambian population, twin mothers are so much more reproductively successful than the non-twinning mothers that twinning appears to be under positive selection pressure.
It is tempting to speculate that populations with high twinning rates are those where fast reproductive rates are selected for by high extrinsic adult mortality risks, and one case study suggests this might be the case (Lummaa et al. 1998). Another hypothesis is that twinning rates are higher in populations with higher maternal body mass (Bonnelykke 1990).
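The diminishing returns to fertility shown in Fig. 9.2 follow naturally from compounding per-child mortality. A rough sketch: only the 25 per cent rise in the odds of death per additional child comes from Strassmann (1997); the baseline odds value is an illustrative assumption, not an estimate from the Dogon data:

```python
def expected_survivors(n_children, baseline_odds=0.25, odds_ratio=1.25):
    # Odds of a child dying by age 5 rise by ~25% for each additional child
    # in the family (Strassmann 1997). The baseline odds of 0.25 is an
    # illustrative assumption.
    odds = baseline_odds * odds_ratio ** (n_children - 1)
    p_death = odds / (1 + odds)
    return n_children * (1 - p_death)

for n in (1, 2, 4, 6, 8, 10, 12):
    print(f"{n} births -> {expected_survivors(n):.2f} expected survivors")
# returns rise quickly at first, then flatten and eventually decline
```

Under these assumptions the expected number of surviving offspring peaks at an intermediate fertility and declines thereafter, the same qualitative shape as the Dogon data.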

Parental effort versus longevity

Twinning illustrates the trade-offs between rapid reproductive rate and mortality risks for babies,
but the increased risk for mothers also needs to be considered. If the central tenet of life-history theory—that energy allocated to reproduction cannot be spent on body maintenance or growth—is true, it could be predicted that high reproductive rate is associated with increased adult female mortality, even after excluding cases of maternal mortality associated with the birth itself. However, in general, no clear association between life span and fertility emerges. Some studies have reported that reproduction shortens the life span in harsh environments (Lycett et al. 2000), but these results are controversial (Gavrilova and Gavrilov 2005). Other studies show a positive association between maternal fertility and longevity rather than the predicted negative one (Beeton et al. 1900; Gavrilova and Gavrilov 2005), which may reflect phenotypic correlation: fitter individuals experience both higher birth rates and a greater potential for long life. Twin mothers tend to be among the fittest in their populations in terms of phenotype; they tend to be taller, with higher Hb levels (Sear et al. 2001). It is not yet clear whether they also live longer than mothers of singletons—but if they do, their higher costs of reproduction are masked by their higher phenotypic quality. The experiment that needs to be done to demonstrate costs of reproduction, paid through reduced maternal longevity, is to make mothers who would normally have been capable of raising a family of six children raise eight children instead. Do these mothers experience higher mortality? Alternatively, one might make these mothers raise four children and see if they experience increased longevity compared to those left to raise the number of children they naturally bore. Of course the ethics of such an experiment preclude it in humans, but in fact that experiment has been done in birds and some other species. 
Collared flycatcher and kestrel parents that had additional eggs introduced into their nests died younger than parents that were allowed to raise only the clutch they had laid (Daan et al. 1990; Gustafsson and Part 1990). In human studies, we can only use sophisticated statistics to try to untangle these phenotypic effects. Doblhammer and Oeppen (2003) do just this and report evidence of a higher cost of reproduction in historical Britain among female aristocrats with
large families. However, their methodology may be problematic in that they dropped from their analysis both childless women and women with only one child (Gavrilova and Gavrilov 2005).

Menopause and the post-reproductive life span

Both the trade-off between investing in existing or future children and the trade-off between further reproduction and maternal survival combine to explain one of the unique features of the human life history—menopause. While some species have undergone a cessation of reproduction in old age in captivity, humans seem to be the only primate in which females routinely experience around 20 or more years of post-reproductive life, even in hunter-gatherer societies (Paul 2005). Proximately, menopause is caused by atresia—eggs die off rapidly and simply run out by age 50 (with genetically normal eggs running out sooner). Ultimate explanations for why natural selection has designed our physiology in such a way are mostly based on the notion that at some point in later life the costs of continuing to reproduce outweigh the inclusive fitness benefits of helping either offspring or grandoffspring to survive and reproduce (Williams 1957; Peccei 1995; Hawkes et al. 1998; Shanley and Kirkwood 2001). The risks of maternal mortality mean that the trade-off between caring for existing offspring and having more offspring comes into play. Clearly, as mothers do not reproduce after menopause in natural circumstances, it is not possible to test empirically what the precise costs and benefits of late reproduction would be, but alternative life histories can be modeled by computer. Shanley and Kirkwood (2001) conclude that human female menopause could evolve if maternal mortality risk is high and increasing with age, and mother and grandmother effects are important in reducing infant mortality and increasing female fertility.
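A crude sketch of such a computation follows. Every number and functional form here is an illustrative assumption chosen only to show how the trade-off can generate a mid-life switch from reproducing to helping; it is not the Shanley and Kirkwood model itself:

```python
def maternal_mortality(age):
    # Illustrative assumption: the risk of dying from a birth rises
    # steeply with maternal age, capped at 1.
    return min(1.0, 0.01 * 1.15 ** (age - 20))

def marginal_fitness_of_birth(age, dependents=3, dependent_cost=0.3):
    # Expected fitness gain from one more birth: a surviving offspring if
    # the mother lives, minus the expected loss of her existing dependents
    # if she dies (motherless infants rarely survived).
    m = maternal_mortality(age)
    return (1 - m) - m * dependents * dependent_cost

# Illustrative inclusive-fitness value of helping grandchildren instead
# (relatedness 0.25 times an assumed survival increment).
HELPING_PAYOFF = 0.25 * 1.2

for age in range(20, 61, 5):
    birth = marginal_fitness_of_birth(age)
    choice = "reproduce" if birth > HELPING_PAYOFF else "help kin"
    print(f"age {age}: marginal birth value {birth:.2f} -> {choice}")
```

With these made-up parameters the switch falls in mid-life: once maternal mortality risk rises far enough, the expected pay-off from another birth drops below the pay-off from helping existing descendants.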
While there is some debate in the literature as to whether it was the life span that extended, adding post-reproductive life to a chimp-like fertile span, or whether the termination of reproduction moved earlier in the life course, it is worth noting that models such as Shanley and Kirkwood’s treat those two scenarios as functionally equivalent.


In demographic and anthropological studies, one method of assessing the importance of the extended family in the raising of children is to examine the impact the death of each relative has on the chances of survival of a child of each age. Studies of patrilineal, patrilocal populations find that the costs of a mother’s death greatly exceed the costs of a father’s death, in terms of a child’s survival and nutrition, and the survival of a maternal grandmother is associated with more positive child outcomes than the survival of a paternal grandmother (Mace and Sear 2005). Raising a human family with multiple dependent young, who are unable to achieve self-sufficiency until their teens or later, seems to be a job that benefits from the help of both parents and an extended circle of kin. These studies do not, however, explain menopause as an adaptation because they do not show that the gain in reproductive success from helping children and grandchildren after menopause is greater than the cost to reproductive success from ceasing to maintain fertility. Theoretically, we expect transfers from parents and grandparents down the generations to be enabled by and to select for long life spans, which coevolve with larger brains and delayed onset of reproduction (childhood)—these are characteristics of human life-history strategies that are probably associated with the need to acquire complex skills for survival and reproduction (Kaplan et al. 2000; Lee 2003). Further empirical evidence, however, is needed to test this hypothesis and alternative explanations for the unique features of human life histories.

Parental investment theory

Infanticide

It is clear from the above discussion that child health outcomes are not solely related to genes or indeed exposure to pathogens, but can be greatly influenced by family circumstances, including decisions by parents in whom to invest (Hrdy 1999). Parents have finite resources and make strategic decisions about how much to invest in any given child (Hrdy 1999). In an extreme manifestation of maternal selectivity, Scheper-Hughes (1992) describes how women living in abject poverty on Brazilian sugar
plantations somehow disinvested in children they could not support. Despite the efforts of perplexed medical practitioners to help the children survive acute health crises if they could, the same children repeatedly fell sick, failed to recover, and often died. In food-limited populations, parental investment is part of the trade-off between a mother’s body maintenance, her ability to keep a mate or attract a new one, and her ability to keep her other children alive. In many traditional human societies, including most hunter-gatherer societies, infanticide was common for children born into circumstances where the necessary parental and kin support was unlikely to be forthcoming. For example, among the Ache of Paraguay, children without living fathers were 3.9 times more likely to be killed by another Ache in each year of childhood; children of divorced parents were 2.8 times more likely to be killed during each year of childhood (Hill and Hurtado 1996: 437).

Sex ratios

In developing the theory of sex ratio selection, Sir Ronald Fisher noted that all children have one mother and one father so the total reproductive success of males and females in a population is the same. It follows that if one sex becomes rare, individuals of that sex will have an advantage in terms of reproductive success. For example, if the breeding system is monogamous and the adult sex ratio is 1:2 (one male for two females), then males will have on average twice the reproductive success of females since half the females go unmated. If the mating system is polygynous, then each male will on average impregnate two females. Thus, regardless of the breeding system, the rare sex always has the reproductive advantage, and this produces a strong, frequency-dependent selective force that equilibrates only when the ratio of investment in the two sexes is 1:1 (Fisher 1930). The mechanisms of meiosis and the X–Y sex determination system of mammals reliably give rise to the approximately 1:1 sex ratio at birth, and these mechanisms may have evolved for Fisherian reasons. In many mammals, however, the 1:1 ratio is slightly adjusted by the uterine environment and the selective abortion of embryos of one sex.
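Fisher’s frequency-dependent argument is easy to check by simulation. In this sketch each lineage carries a heritable brood sex ratio, and offspring are weighted by the share of grandchildren they can expect given the current population ratio; the population size, mutation rate, and starting bias are arbitrary choices, not parameters from any study:

```python
import random

def grandchild_share(q, p):
    # A parent producing a fraction q sons when the population produces a
    # fraction p sons: each sex contributes half of all grandchildren, so a
    # son is "worth" 1/p and a daughter 1/(1 - p) (Fisher 1930).
    return q / p + (1 - q) / (1 - p)

random.seed(1)
# Start with a population biased toward producing sons.
pop = [random.uniform(0.55, 0.95) for _ in range(1000)]
for _ in range(400):
    p = sum(pop) / len(pop)  # current population sex ratio at birth
    weights = [grandchild_share(q, p) for q in pop]
    pop = random.choices(pop, weights=weights, k=len(pop))       # selection
    pop = [min(0.99, max(0.01, q + random.gauss(0, 0.03)))       # mutation
           for q in pop]

print(f"final sex ratio: {sum(pop) / len(pop):.2f}")  # close to 0.5
```

Whenever one sex is overproduced, lineages biased toward the other sex expect a larger share of grandchildren, so the population ratio is pulled back toward parity, which is exactly the equilibration the text describes.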



Another nuance of sex ratio theory bears emphasis: it is the total investment in each sex that is generally equal rather than the number of each sex that is born (Fisher 1930). If the average cost of raising one sex through to independence is more than that of the other, then parents will on average raise to adulthood fewer offspring of the costlier sex (Fisher 1930). In humans and most mammals, the average male conceived is less costly than the average female conceived because males tend to die at higher rates in the juvenile period and death puts an end to parental investment; sex ratios at conception consequently tend to be male-biased. When one considers only the offspring who are successfully raised, on the other hand, males are on average more costly than females and sex ratios at breeding tend to be female-biased. Behavioral ecologists have studied the cultural and ecological factors that contribute to the differential costs of raising boys and girls. For example, females are more costly in dowry societies in northern India (Das Gupta 1987). Females are also more costly in Inuit societies where male hunting is the primary source of food and females return fewer benefits to the household economy (Smith and Smith 1994). In these cases infanticide can occur and tends to be strongly female-biased. Moreover, if resources are more useful for males than females at attracting mates (which is true in most wealth-inheriting, polygynous, human societies), then parental investment of these resources will be predominantly male-biased (Trivers and Willard 1973). Sex-biased investment is frequently against females, but in some societies parents get a better return on their parental investment through daughters and treat them better than sons (Cronk 2000). For example, among the Mukogodo of Kenya, females tend to marry outside the ethnic group to higher status Maasai males, whereas Mukogodo males experience difficulty in attracting mates. 
Parents consequently favor daughters and girls are less nutritionally deprived and experience superior growth compared with boys (Cronk 2000). Levels of parental investment can also be influenced by birth order (Taylor 2005). For example, in some pastoralist societies, there is only enough wealth for a few sons to be supported into marriage; later born children are often left unmarried or marry very
late, and parents invest preferentially in their early-born sons (Mace 1996).

Sexual selection theory

In humans and other mammals, mothers expend resources on gestation and lactation, whereas males sometimes donate only the sperm that fertilizes the egg. Although human males tend to provide more paternal investment than the males of most primate species, humans often show a pronounced skew in the relative parental investment by mothers versus fathers (Trivers 1972; Lancaster et al. 1987). This skew arises from the fact that reproductive success in males is usually limited by the number of females that a male mates with, whereas additional matings usually do not increase reproductive success in females (Bateman 1948; Trivers 1972). Pregnancy and lactational amenorrhea in hunter-gatherer and traditional societies last about nine and twenty months, respectively, so if a female mated again during this time she would not get pregnant. Males on the other hand can greatly increase their reproductive success by copulating with many females in short order, so they tend to put more of their reproductive effort into securing extra matings (Trivers 1972). Sexual selection refers to selection for traits (including behaviors) that increase mating success (Chapter 1); it is more intense in males because they compete against each other to mate with fertile females (Darwin 1871; Trivers 1972). Human females compete against each other to marry the best males, and that is what dowry competition is all about (Gaulin and Boster 1990), but the stakes are not as high. The range of variation in reproductive success in females is usually lower than in males. The average reproductive success of both sexes is the same, but in males one finds more extreme ‘winners’ and ‘losers’ in the reproductive gambit (Trivers 1972). Betzig (1986) describes the extreme harem sizes of despotic males, which sometimes numbered in the hundreds.
Genghis Khan and his male kin employed such a successful polygynous strategy that 8% of all Asian males (sampled in 16 populations) can trace their Y chromosome to these twelfth-century Mongol rulers (Zerjal et al. 2003). For every man who gets





[Figure 9.3 appears here, plotting the ratio of male to female mortality (roughly 1–3) against age (5–80 years) for all causes, external causes, and internal causes.]

an extra fertile mating another man is disenfranchised, so there is fierce reproductive competition among males. Reproductive competition also exists among females, especially in polygynous families, but the intensity of competition is lower (Strassmann 2000, 2003).

Higher mortality of males than females

Males have evolved to put more of their total energy budget into reproductive competition and less into health and longevity (see Daly and Wilson 1983: 298–9). Males who did not take risks to fight against other males for fertilizations left few descendants. Females, on the other hand, could take pregnancy almost as a given and direct more energy to somatic maintenance and repair. Risky behaviors that would leave their children motherless did not pay off. The greater intensity of sexual selection in males was a self-reinforcing process. Males took more risks, experienced more trauma, and had shorter life expectancies, which caused them to discount the future. They invested in traits that increased reproduction in the short term at the expense of the long term, intensifying male–male competition in early adulthood (reviewed in Kruger and Nesse 2006). Even today, the greater intensity of sexual selection in males causes them to have higher mortality rates than females, with a peak difference in young adulthood (Wilson and Daly 1985). As shown in

Figure 9.3 The ratio of male to female mortality in the United States in 2000. From Kruger and Nesse (2006).

Fig. 9.3, the ratio of male to female mortality (MR) in the United States in 2000 was always greater than 1, peaking at 3.01 at age 20–4 years (Kruger and Nesse 2006). The highest MR for a specific cause averaged across all ages was 4.7 for suicide, which also had the highest peak for any age group: 7.8 for ages 75–9 years. Next came non-automobile accidents (MR = 4.89) and homicide (MR = 4.35) in the age group 20–4 years. Diseases with high MRs were led by infectious disease (MR = 2.46) in the age group 45–9 years, followed by liver disease (including cirrhosis) and cardiovascular disease. In all circumstances except female infanticide and severe discrimination against women, excess male mortality is the prevailing pattern. For example, in the Ache of Paraguay, males died from accidents at twice the rate of females (Hill and Hurtado 1996). The peak in the MR at the time when males are entering into reproductive competition, and the well-known male–female difference in life span are the signature of sexual selection (Kruger and Nesse 2006). Sexual selection is also responsible for sexual dimorphism and the higher somatic maintenance costs of males (Bribiescas 2001; Campbell et al. 2001); their skeletal musculature alone accounts for 22% of basal metabolic expenditure compared with 16% in females (Elia 1992). In a study of men and women aged 17 to 81 years, resting metabolic rate was 23% higher (1,740 ± 194 kcal/day) in men than in women (1,348 ± 125 kcal/day) (Arciero et al. 1993).



All traits result from an interaction between genes and the environment (West-Eberhard 2003; Chapter 1), and the MR is strongly influenced by social factors. Individuals are more risk averse and will pursue strategies with long-term pay-offs if they perceive that their environment is favorable. In humans, the social environment and educational and economic opportunities are critical, and MRs are higher in disadvantaged groups (Singh and Yu 1996; Kruger and Nesse 2006). In regard to health, few aspects of a person’s social niche are as important as marital status and marital satisfaction. In a study of American youth of both sexes aged 20 to 24 years, the hazard of death was 2.2-fold higher for divorced, separated, and widowed persons than for their married counterparts (Singh and Yu 1996). This result may reflect differences between those who marry and those who do not rather than the beneficial effect of marriage per se, but it did control for sex, income, education, race, residence, and immigration status. Married men are expected to decrease mating effort and to increase paternal effort and long-term investments. As discussed by Bribiescas and Ellison (Chapter 7), testosterone plays a role in mediating the trade-off between mating and parenting in human males (Gray et al. 2002); testosterone declines after marriage and increases after divorce when a man resumes mating competition (Mazur and Michalek 1998). In birds, the testes regress outside the breeding season to spare the high costs of testosterone, which include the risks of territorial aggression and mate guarding (Daly and Wilson 1983: 100–2). High testosterone levels are advantageous for mating competition, but are costly in terms of impacts on immune function (Campbell et al. 2001) and longevity (Hamilton and Mestler 1969).
In women, estrogen and progesterone are metabolically costly and contribute to reproductive cancers; but menopause, menstrual cyclicity, and hormonal suppression mitigate these costs (Chapter 7) (Eaton et al. 1994; Strassmann 1996; Jasienska et al. 2000; Ellison 2001: 168–213).

Sexual jealousy and genital cutting

Like songbirds, human males engage in paternal care and expend considerable effort on mate guarding, a behavior that protects against cuckoldry—a word derived from ‘cuckoo,’ a bird that lays its eggs in another bird’s nest. Evolutionary ecologists define cuckoldry as paternal investment in genetically unrelated offspring. Sexual jealousy helps to defend paternity certainty and is the leading cause of marital violence (Daly et al. 1982; Daly and Wilson 1988). Female genital cutting, performed on about two million females per year, reduces female sexual pleasure so as to enhance sexual fidelity (Strassmann 1997). Tools include glass and razor blades (often nonsterile), and the practitioners are usually women who act on behalf of parents who feel that genital cutting is a requirement for securing an advantageous marriage for daughters. Where men prefer clitoridectomized women as marriage partners, the reproductive benefits of cutting may outweigh the reproductive costs. For example, among the Kassena-Nankana of Ghana, girls who had undergone genital cutting had nearly one child more than uncut girls, and this was mediated by an earlier age at marriage and first pregnancy (Reason 2004). Nonetheless the health costs can be considerable and can include infection, repeated urinary infections, obstruction of menstrual flow, infertility, and chronic pelvic pain (http://www.womenshealth.gov). Morison and co-workers (2001) found a higher prevalence of herpes simplex virus 2, a known risk factor for HIV infection, in cut than in uncut women in rural Gambia. Modern medical practitioners are increasingly coming into contact with patients at risk for genital cutting and may feel pressure to let cultural relativism stand in the way of the need to forewarn these patients and their parents. Sexual selection theory elucidates the adaptive advantages for males (at the expense of females) and can unravel confusion created by ideologies whose root goal is male control of female sexuality.

Summary

1. In order to fully exploit the insights that can be gained from evolutionary medicine, the medical community needs to consider how the genes interact with their environments during development. Patients are phenotypes; thus all medical conditions are a product of genes and the environment. Under varying environmental conditions, the same genotype can lead to a wide array of phenotypes.

2. Natural selection has favored phenotypic plasticity because it promotes reproductive success by creating a better fit between the genotype and the environment. The genes give organisms the capacity for phenotypic plasticity, which means that organisms can be expected to respond to their environments in adaptive ways. In a nutshell, this is the evolutionary argument for expecting the flexible, circumstance-dependent behaviors of humans and other organisms to be fitness-promoting.

3. The cornerstones of evolutionary theory that guide the study of behavior include kin selection, life-history, parental investment, and sexual selection theory. This is a rich body of theory that can illuminate a wide variety of medical and public health problems, such as child abuse and homicide in step-families; deadbeat dads; attachment disorders; failure to thrive; female infanticide; excess male mortality from accidents, suicide and disease; risky behavior; immunosuppression; reproductive cancer; marital violence; and genital cutting.

4. Many of these problems reflect reproductive conflicts of interest between individuals, for example, between parents and offspring, siblings, members of the same sex, and males and females in mating relationships. Other conflicts occur within individuals and involve life-history trade-offs: mating versus parental effort; offspring number versus quality; present versus future reproduction; and reproductive effort versus longevity. Conflicts of interest within and between individuals constrain natural selection and prevent the creation of a Panglossian world wherein adaptation is maximized at all levels simultaneously.

5. It is hoped that health professionals will explore the reproductive conflicts of interest that underlie many different kinds of medical situations; they will thereby tap into a valuable new way of understanding human health and disease.

Acknowledgments We thank Stephen Stearns, Jacob Koella, Martin Daly, Robin Dunbar, Richard Alexander, Kim Hill, and Claudius Vincenz for helpful comments on the manuscript. B.I.S. thanks the American Philosophical Society for financial support.



Pathogens: resistance, virulence, variation, and emergence


CHAPTER 10

The ecology and evolution of antibiotic-resistant bacteria

Carl T. Bergstrom and Michael Feldgarden

Introduction

Nosocomial (hospital-acquired) infections are a severe and often underappreciated public health problem. In the United States alone, at least 200,000 people and probably far more suffer from a hospital-acquired infection every year. The associated mortality is considerable; the Centers for Disease Control and Prevention has estimated that 90,000 U.S. residents die each year from nosocomial infections. To place this number in context, AIDS/HIV kills approximately 17,000 per year in the United States, influenza 37,000 per year, and breast cancer roughly 40,000 per year. As large as these numbers are, some estimates suggest that the actual magnitude of the problem could be up to tenfold higher. In 2004, the state of Pennsylvania instituted a mandatory reporting program for hospital-acquired infections (Volavka 2005). That year, Pennsylvania hospitals reported 11,668 hospital-acquired infections to the Pennsylvania Health Care Cost Containment Council (PHC4)—but they told a very different story to the insurance companies. By examining insurance billing records for claims and diagnoses that potentially resulted from nosocomial infections, PHC4 found that insurance claims were tenfold higher (115,631) than were the direct reports that they received. If one extrapolates from the Pennsylvania data to the entire United States, there could be over 2 million nosocomial infections per year in the United States. If the higher estimate is correct, roughly 7.5% of all hospital visits in the United States result in nosocomial infections, and dealing with these infections would be a major cause of antibiotic use in the clinical setting. These problems are not restricted to the United States. In Latin America, 6–10% of all hospital visits result in nosocomial infections (Salvatierra-González 2004). Worldwide, the WHO estimates that 8.7% of all hospital visits result in a nosocomial infection, with 1.4 million patients suffering from these infections at any given time (Tikhomirov 1987). Nosocomial infections present such a great health challenge because they are often caused by antibiotic-resistant strains of bacteria well adapted to the hospital. Patients who are infected with antibiotic-resistant strains stay in the hospital longer, are more likely to die, and are more expensive to treat than are the patients who are infected with the drug-sensitive strains common outside of the hospital. For example, infection by Enterobacter strains resistant to third-generation cephalosporins increases mortality fivefold, and the length and cost of stay by 50%, relative to infection by drug-sensitive strains (Cosgrove 2006). Methicillin-resistant Staphylococcus aureus (MRSA) infection doubles the mortality and increases the cost of care by nearly 40% relative to methicillin-sensitive S. aureus (MSSA) infection (Cosgrove 2006). The total economic burden of antibiotic resistance in clinical settings in the United States may be as high as $80 billion annually (S. Foster, personal communication). In this chapter, we lay out what is known about the ecology and evolution of antibiotic resistance, with an emphasis on hospital-acquired strains of bacteria. We begin with a brief history of resistance to clinical antibiotics.
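The Pennsylvania extrapolation above is simple arithmetic, and a back-of-the-envelope sketch makes the reasoning explicit. All figures below are the ones quoted in the text; the tenfold scaling is the chapter's own logic, not an independent estimate.

```python
# Back-of-the-envelope check of the Pennsylvania underreporting argument.
# Figures are taken directly from the text above.

reported_pa = 11_668   # infections reported directly to PHC4 in 2004
claims_pa = 115_631    # insurance claims consistent with nosocomial infection

underreporting = claims_pa / reported_pa
print(f"Underreporting factor: {underreporting:.1f}x")   # ~9.9, i.e. 'tenfold'

# Scale the conservative national figure by the same factor.
reported_us = 200_000
implied_us = reported_us * round(underreporting)
print(f"Implied U.S. total: {implied_us:,}")             # 2,000,000 -> 'over 2 million'
```

The point is only that the "over 2 million" figure follows mechanically from the two Pennsylvania numbers and the conservative national baseline, with no additional assumptions.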



History of clinical antibiotic resistance

Although antibiotic-resistant bacteria are often characterized as emerging infectious diseases, physicians have been confronted with antibiotic resistance for as long as they have been using antibiotics. Modern antibiotics essentially began with penicillin, which was first used clinically in 1943 and was widely employed toward the end of the Second World War. Reports of penicillin resistance came within a year (Kirby 1949), and by 1945, a British hospital reported that nearly 8% of staphylococcal isolates were resistant to penicillin (see Table 10.1). Four years later, almost 60% of British clinical isolates were penicillin-resistant (Barber and Whitehead 1949). Similar patterns occurred in the United States (Rolinson 1971). An arms race between drug development and resistance evolution ensued. In the 1950s cephalosporin C and its derivatives were introduced, and that, combined with the release of broad-spectrum penicillins in the early 1960s, selected for the plasmid-encoded broad-spectrum β-lactamases, which confer resistance to penicillin as well as cephalosporins. The genes encoding the broad-spectrum β-lactamases were often located on genetically mobile plasmids of Escherichia coli and thus were easily transferred both to other E. coli lineages and to other species that had previously lacked cephalosporin resistance (also see ‘Mechanisms’ below) (Matthew 1979; Medeiros 1997). Resistance to third-generation cephalosporins was observed several years after their introduction and was widely disseminated by plasmid transfer in the following decade (Papanicolaou et al. 1990). During the mid-1980s, increasing use of carbapenems and monobactams also led to the evolution and widespread plasmid-mediated dissemination of carbapenem-hydrolyzing β-lactamases (Bush 2002). Finally, resistance to fourth-generation cephalosporins, such as cefepime, is also increasing to the point where these β-lactamases (‘CTX’) have spread into bacteria from agricultural habitats, although this particular family of β-lactamases is still rare in the United States (Damjanova et al. 2006; Oteo et al. 2006). These emergent resistance loci make it all the more distressing that the U.S. Food and Drug Administration has not yet decided to prevent the approval of the agricultural use of cefquinome, a cephalosporin analogue of cefepime. Another example is the rapid evolution of resistance to the macrolide antibiotic erythromycin. Introduced in 1952, erythromycin was heralded as a treatment of staphylococcal infections and used widely. In 1956, the first erythromycin-resistant staphylococci were isolated in France and the United States. By the late 1970s, erm-encoded

Table 10.1 The rapid evolution of antibiotic resistance in clinically important bacteria

Antibiotic                                   Year introduced   Year resistance observed
Penicillin                                   1943              1945
Chloramphenicol                              1949              1950
Erythromycin                                 1952              1956
Methicillin                                  1960              1961
Cephalothin (1st generation cephalosporin)   1964              1966
Vancomycin(a)                                1958(a)           1986
2nd & 3rd generation cephalosporins          1979, 1981        1987
Carbapenems                                  1985              1987
Linezolid                                    2000              2002

(a) Vancomycin was first released in 1958; however, it was not widely used until the early 1980s.


erythromycin resistance had successfully transferred to the respiratory pathogen Streptococcus pneumoniae. Currently, erythromycin resistance is extremely common in S. pneumoniae and should probably be viewed as the ‘wild-type’ phenotype: in parts of Asia, over 80% of S. pneumoniae isolates are resistant to erythromycin and other macrolides, and in many other countries erythromycin resistance ranges between 30 and 60% of clinical isolates (Bozdogan et al. 2003; Roberts and Sutcliffe 2005). We have seen similar patterns for numerous other antibiotics. In response to increasing resistance to penicillin and erythromycin, physicians began using methicillin in 1960; resistance was observed within a year (Deresinski 2005). Methicillin-resistant strains, most notably methicillin-resistant S. aureus, exploded to epidemic proportions in hospitals during the 1980s. Vancomycin, though released in 1958, was not used heavily until the 1980s, when it became a common response to MRSA. In turn, vancomycin-resistant strains of enterococci (VRE) were observed in 1986 and spread rapidly in hospitals throughout the 1990s (Levine 2006). Linezolid, from a new class of antibiotics called oxazolidinones, offered a way to deal with certain VRE strains, and was released in the US in 2000—but by 2002, linezolid resistance was already being reported in vancomycin-resistant strains (Potoski et al. 2002). Because widespread use of a particular antibiotic has always led to a rapid evolutionary response, the increasing realization that there is a ‘resistance problem’ stems from the decrease in the availability of new drugs since the 1980s and not from a fundamental change in the evolutionary response of microorganisms (Spellberg et al. 2004). Thus we see that antibiotic resistance, which generates considerable mortality, morbidity, and economic cost, inevitably evolves rapidly and spreads broadly following widespread antibiotic use. What are the genetic mechanisms by which resistant phenotypes arise?
Where do these phenotypes arise, and how do they make their way into human-associated bacterial strains? How do bacterial population genetics contribute to long-lasting, multidrug-resistant bacterial strains? What novel approaches to preventing or treating resistance can be derived from knowledge of ecology and evolutionary biology? These questions are addressed in turn.


Genetic mechanisms

From an evolutionary perspective, the mechanisms that facilitate antibiotic resistance arise in three distinct ways:

• by point mutation: single nucleotide changes alter the structure or function targeted by the antibiotic;
• by homologous recombination of existing point mutations: point mutations found in allelic variants are reassorted into a new allele containing the adaptive mutation; and
• by heterologous recombination of novel resistance loci: resistance loci, previously not present in the recipient strain, are acquired, often through the uptake of plasmids.

Point mutations

Point mutations are often the first genetic changes observed once a new drug is introduced. For example, a single base change in the structure of the peptide-binding groove of the ribosomal RNA can prevent some macrolide antibiotics from binding to their target (Hansen et al. 2002). Similarly, resistance to the quinolone antibiotics such as ciprofloxacin commonly results from point mutations in the gyrase and DNA polymerase subunits (Vila 2005). While the odds are low that any particular mutation will occur in the right place in the genome to confer resistance in this way, bacterial population sizes are so large that it is quite likely that such a mutation will occur somewhere within the population of bacteria inhabiting a human host. For example, the human small intestine supports 10¹⁰ to 10¹¹ bacterial cells per gram of fecal matter. Given mutation rates of roughly 2 × 10⁻³ per genome per replication and genome sizes on the order of 5 × 10⁶ base pairs, a single gram of fecal matter is likely to include at least one newly arisen instance of every single point mutation (Genereux and Bergstrom 2005). Moreover, even in large populations, allele frequencies can change substantially in as little as a day as a result of the rapid generation times of bacteria and the strong selection for resistance in a population exposed to antibiotics. Population bottlenecks associated with drug treatment or colonization events only further accelerate the process.
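The mutation-supply argument above can be made explicit with a short sketch. It uses only the figures quoted in the text (taking the lower-bound density of 10¹⁰ cells per gram) and assumes, for simplicity, that mutations fall uniformly across sites:

```python
# Expected number of new copies of any *specific* point mutation arising per
# gram of gut contents in a single round of replication. A rough sketch using
# the figures quoted in the text; it ignores variation among sites and strains.

cells_per_gram = 1e10    # lower bound: 10^10 bacterial cells per gram
genome_size_bp = 5e6     # genome size on the order of 5 x 10^6 base pairs
mu_per_genome = 2e-3     # ~2 x 10^-3 mutations per genome per replication

mu_per_site = mu_per_genome / genome_size_bp   # 4 x 10^-10 per site per replication
new_mutants = cells_per_gram * mu_per_site     # expected new carriers per gram

print(f"{new_mutants:.1f}")   # -> 4.0
```

Since the expectation is about four per gram per replication round, every possible single point mutation, including any that happens to confer resistance, is indeed expected to be present somewhere in a single gram of gut contents.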

Homologous recombination

Homologous recombination is a second mechanism that can result in the evolution of resistance. A classic example is the evolution of penicillin resistance in Neisseria gonorrhoeae, the pathogen that causes gonorrhea. Changes in penA, which encodes penicillin-binding protein 2, can result in resistance to penicillin and other β-lactam antibiotics (Spratt 1994; Antignac et al. 2001). Interestingly, most resistant Neisseria do not appear to have acquired penicillin resistance via point mutations in penA, even though it is possible to generate such mutants in the laboratory. Rather, resistance is associated with ‘mosaic’ alleles derived from multiple recombination events with other Neisseria species (Spratt 1994). Sequence divergence within the recombinant regions of penA can reach up to 20%, indicating that the donor alleles diverged millions of years ago (Thulin et al. 2006).

Heterologous recombination

The most elaborate resistance mechanisms typically arise through a third process: heterologous recombination, the acquisition of novel resistance loci. Resistance genes move easily; one important route is plasmid transfer, in which a plasmid ‘minichromosome’ containing resistance genes is transferred from one cell to another. Resistance genes may reside on plasmids either because of recent anthropogenic selection from heavy antibiotic use (Barlow and Hall 2002a) or because of long-term evolutionary associations lasting tens of millions of years (Barlow and Hall 2002b), making it all the easier for them to reach human-associated bacterial strains. The frequent association of resistance genes with integrons and other highly mobile genetic elements facilitates movement of these genes between plasmid and chromosome (Levin and Bergstrom 2000). Another route through which resistance genes move is the cellular uptake of heterologous stretches of chromosome from some other source, either by evolved mechanisms of DNA uptake, known as active transformation (Lorenz and Wackernagel 1994), or by virally mediated transduction. While transformation rates—and with them heterologous recombination rates—vary dramatically across species (Maynard Smith et al. 1993; Maynard Smith and Smith 1999), there is good evidence of reasonable recombination frequencies even among those species once considered to be entirely clonal (Feil et al. 2001). Heterologous recombination events typically involve the gain of a single locus or operon, such as the AcrAB efflux pump system in E. coli, which removes antibiotics by actively excreting them from the cell (George 2005). Efflux pumps are clinically problematic because they often confer resistance to multiple antibiotics and other antimicrobial compounds. For example, the AcrAB-TolC efflux pump found in E. coli confers resistance to chloramphenicol, tetracycline, erythromycin, novobiocin, fusidic acid, and various β-lactams. The same system also confers resistance to detergents, pine oil, fatty acids, bile salts, and organic solvents (Nikaido 1996). Because of its capacity to handle many substances, this efflux pump can now be maintained by selective pressures quite different from those that originally caused it to spread. This situation is not unique to therapeutic antibiotics and appears to apply to other antimicrobials, such as bacteriocins (Feldgarden and Riley 1999). Enterococcus species provide an even more dramatic example of resistance via heterologous recombination. Vancomycin-resistant enterococci (VRE) occur at very high rates in hospital intensive care units, where they increase mortality, morbidity, and cost of care significantly (Lodise et al. 2002; Kaye et al. 2004). They were essentially untreatable prior to the development of the new antibiotics quinupristin-dalfopristin and linezolid early in this decade. VRE derive their resistance from an altered ligase that changes the structure of their cell wall (Marshall et al.
1997, 1998) and is encoded by one of six different gene clusters: VanA, VanB, VanD, VanE, and VanG, which have been acquired by lateral gene transfer, and VanC, which is native to several Enterococcus species (Depardieu and Courvalin 2005). The operons vanA and vanB can be found on plasmids or chromosomes, while the other four operons are exclusively chromosomal (Arthur
et al. 1996b; Casadewall and Courvalin 1999; Arias et al. 2000; Abadia Patino et al. 2002; Depardieu et al. 2003; Depardieu et al. 2004). The operons vary both in the conditions under which they are induced and in the level of resistance conferred to glycopeptides. Some operons confer resistance to vancomycin alone, because the glycopeptide antibiotic teicoplanin does not induce resistance gene expression in these strains; when expression is induced in the laboratory, however, the products of these operons can confer resistance to teicoplanin as well. Other operons confer resistance of varying strength to both commonly used glycopeptide antibiotics (Arthur et al. 1996a; Quintiliani and Courvalin 1996; Depardieu et al. 2004). VRE are an immediate clinical challenge that also pose an evolutionary threat, for they can serve as a source of vancomycin resistance genes for other hospital-associated bacteria. Most worryingly, vancomycin resistance has been transferred several times from VRE to MRSA, creating the specter of a highly resistant superbug (Flannagan et al. 2003). Fortunately, the infections were controlled in all of these cases and vancomycin-resistant MRSA has not yet taken off in hospitals or other environments.

Natural ecology

Not only do antibiotic resistance mechanisms arise by several different genetic processes; they also enter human populations from a variety of ecological settings. In this section, we discuss three of the most important: natural soils, agricultural ecosystems, and patient care facilities such as hospitals and nursing homes.

Soil ecology

Most often, bacterial species that naturally produce antibiotics are the original source of laterally transmitted resistance genes. Many soil microbes live in environments that are highly structured spatially, where they compete fiercely with other species for space and resources. Over millions of years of natural selection in these environments, bacteria have evolved an extensive repertoire of chemical weapons (antibiotics) and defenses (resistance mechanisms). Antibiotic resistance is not only used to withstand attacks from other strains and species; bacteria that naturally produce antibiotics have had to evolve antibiotic resistance mechanisms to protect themselves from their own antibiotic products. A recent survey of resistance mechanisms in spore-forming soil bacteria revealed an astonishing preponderance of resistance phenotypes, with multidrug resistance in every one of the 480 strains screened and with no class of medical antibiotics effective against all of the strains (D’Costa et al. 2006). These antibiotic resistance mechanisms are not limited to soil-living bacteria; they may also be quite common in the enteric bacterial populations of wild animals. Sherley et al. (2000) sampled the enteric bacterial populations associated with Australian mammals. Even where human antibiotic use is rare to non-existent, many enteric bacteria were resistant to several drugs—and the genes conferring many of these resistant phenotypes predate the human use of antibiotics.

Agricultural use

Crucial as antibiotics are to human health, their use is by no means restricted to human or even veterinary medicine. In 2001, the Union of Concerned Scientists estimated that 70% of the antibiotic use in the United States was for non-therapeutic agricultural purposes (Mellon et al. 2001). Most of this non-therapeutic use was as a growth enhancer: antibiotics were used not to treat infections but rather at subtherapeutic doses to make animals larger. This is a particularly effective way to generate antibiotic resistance, because subtherapeutic doses allow for gradual step-wise evolution of resistance under a mild selective regime, whereas therapeutic dosing imposes a much stronger selective regime that requires a large phenotypic change if a bacterium is to survive. Therapeutic doses also eliminate virtually all of the antibiotic-susceptible cells that could potentially acquire resistance genes through gene transfer, whereas subtherapeutic doses do not (Lees et al. 2006). For these reasons, the use of antibiotic growth enhancers in animal feed was banned in 2006 in the European Union. Agricultural use has consequences. Resistance mechanisms that evolve in agricultural settings
can transfer into human populations, contributing to the overall problem of antibiotic resistance. Two of the most clearly documented cases of agricultural use leading to antibiotic resistance in humans are provided by nitrofurantoin resistance and vancomycin resistance. Nitrofurantoin has been increasingly used in human medicine to treat urinary tract infections as resistance to co-trimoxazole (trimethoprim-sulfamethoxazole) has evolved. Nitrofurantoin can be carcinogenic, as its mode of action is to damage bacterial DNA, and thus was banned from agricultural use in the United States and the European Union (Prescott 2006). In 2002–3, there was an ‘outbreak’ of illegal nitrofuran antibiotic use in Portuguese poultry farming (Antunes et al. 2006). The proportion of Salmonella isolates with decreased susceptibility to nitrofurantoin skyrocketed to 65%, and the prevalence in food animals of Salmonella enteritidis serogroup D isolates also increased (these particular salmonellae are very pathogenic in humans). Also, the frequency of nitrofurantoin resistance was indistinguishable between hospital isolates and poultry, while Salmonella from other sources had much lower frequencies of resistance to nitrofurantoin. In addition, nitrofurantoin-resistant Salmonella were more likely to be resistant to other antibiotics. Perhaps the most dramatic example of the effect of antibiotic use in agriculture occurred in the European Union, where the vancomycin analogue avoparcin was used extensively in feed. In Denmark, studies in the mid-1990s indicated that poultry and pigs on farms where avoparcin was used were three times as likely to carry vancomycin-resistant enterococci (Bager et al. 1997). Following a ban of avoparcin in 1997, the prevalence of VRE decreased both in farm animals and in the human population, with VRE in the human population dropping from 12% of isolates to 3% (Bager et al. 1999; Klare et al. 1999).
While the ban of avoparcin was followed by a decrease in VRE, VRE remains more frequent in Denmark than in the United States (Aarestrup et al. 2001). The ban was unable to eliminate vancomycin resistance, probably because of linkage between vancomycin resistance genes and other antibiotic resistance genes on transmissible plasmids; this linkage has maintained VRE at low

frequencies even in countries where avoparcin was never used in agriculture (Tomita et al. 2002; Lim et al. 2006). Given the burden of infection by VRE and other resistant bacteria, serious efforts to limit the spread of resistance in agriculture are worth undertaking.

Hospital transmission

Once antibiotic-resistant strains enter the human population and from there enter the hospital, they often spread rapidly through the hospital environment. To understand how and why, we need to understand the ecological circumstances that bacteria encounter within a hospital. Several features distinguish the ecology of hospital-acquired infection from the ecology of most community-acquired infectious diseases.

• Most of the resistant bacteria that cause problems in hospitals are species that are normally human commensals rather than pathogens (Bonten and Weinstein 1996). Many patients carry and transmit these bacteria asymptomatically, and patients often enter the hospital already colonized by sensitive strains of the species ultimately responsible for the resistance problem. If a patient is carrying a sensitive strain when admitted to the hospital, he or she may be less likely to be subsequently colonized by more dangerous hospital-associated strains.

• Antibiotics are used at very high rates in the hospital population in general and in intensive care units in particular. While some of the antibiotic use aims to treat pre-existing infections, more often antibiotics are used as prophylaxis to prevent infection of surgical incisions. This generates very strong selection for resistant variants, and it also clears sensitive populations, thereby making it easier for resistant strains to colonize a patient.

• Hospital staff can unwittingly act as disease vectors, shuttling bacterial strains from colonized to uncolonized patients (Stone et al. 2001).

• The patient population turns over at a very high rate. Unlike the population of a town or a country, the population of a hospital is rapidly changing, with patients staying only 5–10 days on average in a U.S. intensive care unit. Thus any given patient colonized by resistant bacteria may leave the hospital

EVOLUTION OF ANTIBIOTIC-RESISTANT BACTERIA

before the resistant strains are cleared. This influences the dynamics of disease transmission considerably. To remain endemic within the hospital, a strain must transmit before the patient departs (rather than before the strain is cleared), and this imposes a time scale of days to weeks on the transmission dynamics necessary for endemicity.

• Once the resistant bacteria leave the hospital, they are not necessarily gone for good. Patients who move back and forth between hospitals and long-term care facilities serve as a particularly important reservoir of antibiotic-resistant strains, reintroducing those strains into the hospital even after a hospital outbreak has been controlled (Cooper et al. 2004; Smith et al. 2004).

Several mathematical models (reviewed by Bonten et al. 2001) attempt to determine the influence of some or all of these factors on the dynamics of nosocomial resistance. These models may be of limited value for making precise quantitative predictions about the course of evolution or the rate of spread, both because human-associated bacteria evolve and spread in an extremely complicated ecological milieu and because of the inherent historicity and stochasticity of the evolutionary process. Nonetheless, mathematical models of antibiotic resistance in hospitals can be useful in a number of ways (Lipsitch and Bergstrom 2002). First, mathematical models can help researchers identify phenomena relatively robust to the specific details of the system. For example, a number of independently designed mathematical models have confirmed the positive effect that general infection control measures such as hand-washing and barrier precautions have in reducing antibiotic resistance frequencies within hospitals—and these models usually (though not always) find that infection control reduces resistance irrespective of the precise parameters chosen. Models can also provide mechanistic explanations for previously unexplained observations.
For example, in hospitals—but not in larger communities—bacterial populations change rapidly in response to changes in antibiotic use and other interventions. Lipsitch et al. (2000) showed that this is a result of the high turnover rate at which patients leave the hospital; this, rather than


bacterial clearance from individual patients or competition among bacterial strains, sets the time scale of change. Mathematical models also help researchers generate testable hypotheses, and they can help select the most salient hypotheses for further evaluation in clinical trials. Given that clinical trials are extremely costly to run, this can be a great benefit. Similarly, models are useful in refining trial design. They can help researchers determine how best to assess the effect of an intervention and how to distinguish random fluctuations from meaningful effects (Cooper and Lipsitch 2004).
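The importance of patient turnover can be made concrete with a deliberately minimal single-ward model. This is a sketch of ours, not the model of Lipsitch et al. (2000), and all names and parameter values are invented for illustration:

```python
# Minimal single-ward sketch: patients are either colonized with a resistant
# strain (fraction y) or uncolonized; colonized patients transmit at rate beta,
# and all patients are discharged and replaced by uncolonized admissions at
# rate mu. (Hypothetical parameters, chosen only to illustrate the time scale.)
def simulate(beta, mu, y0=0.5, days=60, dt=0.01):
    """Integrate dy/dt = beta*y*(1-y) - mu*y by forward Euler."""
    y = y0
    for _ in range(int(days / dt)):
        y += dt * (beta * y * (1 - y) - mu * y)
    return y

# Once an intervention pushes transmission below the discharge rate (beta < mu),
# resistant carriage decays on the time scale of patient turnover (roughly 1/mu):
slow = simulate(beta=0.05, mu=0.1)   # mean stay ~10 days
fast = simulate(beta=0.05, mu=0.2)   # mean stay ~5 days
print(fast < slow < 0.5)  # → True: faster turnover, faster decline
```

The point of the sketch is simply that the loss term mu, not the within-patient dynamics, dominates how quickly the ward responds to an intervention.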

Population genetics

In all three of the ecosystems considered above, resistance genes can persist beyond the period of antibiotic use due to population genetic processes: physical linkage of resistance genes and compensatory evolution that reduces the cost of resistance.

Linked genes

Once antibiotic resistance has evolved, reducing its frequency has obvious public health significance. The non-random association of multiple resistance and pathogenicity loci can allow resistance genes to be maintained by many different selective forces operating through several different mechanisms. First, as discussed previously in the case of efflux pumps (Nikaido 1996), one mechanism can confer resistance to many drugs. Second, resistance genes are often associated in clusters on mobile genetic elements, such as plasmids and conjugative transposons (Leverstein-van Hall et al. 2002; Johnson et al. 2006); when such a mobile genetic element is acquired, multiple resistance phenotypes are gained at once. Finally, resistances can accumulate through a history of exposure to multiple antibiotics (Dagan and Lipsitch 2004). Unfortunately, virulence genes and antibiotic resistance genes can be physically linked on the chromosome as well (Qin et al. 2006). A study of porcine E. coli found widespread correlations between dozens of virulence and resistance genes (Boerlin et al. 2005; Travis et al. 2006). These correlations also



appear to have a phylogenetic basis, with different E. coli lineages possessing different patterns of correlation (Johnson et al. 2003). These lineage-based correlations might result from a past history of selection in environments that simultaneously selected for virulence and resistance. Alternatively, biochemical mechanisms in different lineages could limit the acquisition of resistance plasmids. Finally, lineages differ in their rates of mutation to antibiotic resistance (Wirth et al. 2006). In addition, selection on other phenotypes can maintain antibiotic resistance genes even in the absence of antibiotics: in dairy cattle, for example, vitamin D supplementation in feed can select for antibiotic resistance plasmids (Khachatryan et al. 2006a; Khachatryan et al. 2006b).

Compensatory mutation

One might think that when antibiotic treatment is withdrawn, resistant bacterial populations will revert to sensitivity. After all, resistance can be expensive, and in the absence of antibiotic treatment, sensitive strains that do not pay the cost of resistance should replace the resistant strains. This is not necessarily true. One culprit is the occurrence of compensatory mutations: changes at other loci that reduce the fitness cost imposed by the resistant allele (Schrag and Perrot 1996; Schrag et al. 1997; Bjorkman et al. 1998). If the compensatory mutations that benefit the bacteria when coupled with the drug-resistant allele are harmful when

coupled with the drug-sensitive allele, this process can lead a bacterial strain into an ‘evolutionary lobster-trap’: a genotype easy to reach by selection in one direction, but difficult to leave even when selection goes in the other direction. We illustrate this process in Fig. 10.1, where we envision a two-locus model with wild type rc and evolved resistant strain RC. At the R locus, the R allele is drug-resistant and the r allele is drug-sensitive. At the C locus, the C allele compensates for the cost of resistance and the c allele is uncompensated. In the presence of antibiotics (left panel), there is no fitness valley between uncompensated sensitive and compensated resistant strains; resistance evolves easily by the pathway shown, from rc to Rc to RC. In the absence of antibiotics (right panel), a fitness valley appears between resistant compensated (RC) and sensitive uncompensated (rc) strains; the resistant RC genotype cannot easily evolve back to the sensitive rc genotype, because both of the intermediates, Rc and rC, have reduced fitness. While sensitive uncompensated rc individuals would be favored in the absence of antibiotics if they were produced, the repeated bottleneck structure of human-associated bacterial populations may largely preclude the emergence and fixation of the rc type from an RC population. This is because rc individuals may not arise within RC populations in time to reach high frequency and thus survive the ensuing bottleneck (Levin et al. 2000).



Figure 10.1 The lobster trap of compensatory mutation. When the antibiotic is present (left), resistance (R) is beneficial relative to sensitivity (r), and on a resistant background, compensation (C) is beneficial over the wild type (c). The resistant, compensated genotype RC is directly selected and can easily emerge. When the antibiotic is absent (right), the fitnesses change. Resistance is now costly relative to sensitivity, but epistatic interactions between the loci create a fitness valley (Rc and rC genotypes) between the resistant compensated type (RC) and the highest-fitness type, sensitive uncompensated (rc). Thus a resistant compensated population does not readily revert to drug sensitivity.
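The two-locus argument can be written down as a toy calculation. The fitness values below are hypothetical (the chapter assigns none) and are chosen only to reproduce the qualitative pattern of Fig. 10.1:

```python
# Genotypes: r/R = drug-sensitive/drug-resistant allele; c/C = uncompensated/
# compensated allele. Fitness values are invented for illustration.
fitness = {
    True:  {"rc": 0.0, "Rc": 0.9, "rC": 0.0, "RC": 1.0},   # antibiotic present
    False: {"rc": 1.0, "Rc": 0.7, "rC": 0.8, "RC": 0.95},  # antibiotic absent
}

def accessible(path, antibiotic):
    """A path of single mutations is selectively accessible if fitness rises at every step."""
    w = fitness[antibiotic]
    return all(w[a] < w[b] for a, b in zip(path, path[1:]))

# With the drug, resistance plus compensation evolves uphill all the way:
print(accessible(["rc", "Rc", "RC"], antibiotic=True))   # → True
# Without the drug, both single-mutant intermediates lie in a fitness valley,
# so RC cannot walk back to rc through beneficial substitutions alone:
print(accessible(["RC", "Rc", "rc"], antibiotic=False))  # → False
print(accessible(["RC", "rC", "rc"], antibiotic=False))  # → False
```

The asymmetry of the two landscapes, not any particular numerical choice, is what makes the trap: any fitness assignment with the same rank order gives the same result.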


Applying evolution: approaches for the future

So what can be done about antibiotic resistance? A two-pronged approach is needed. First, new drugs to which current strains are not resistant need to continue to be developed. This will provide a new set of responses to a future generation of ‘untreatables,’ much as quinupristin-dalfopristin and linezolid have offered a way to deal with previously untreatable VRE. Second, the growing understanding of the ecology and evolution of antibiotic resistance needs to be used to manage antibiotic use so that the evolution and spread of antibiotic-resistant strains can be slowed. These approaches are treated in turn below.

Predicting resistance evolution

Insights from evolutionary biology can be very helpful in drug design; one promising direction is the possibility of using in vitro evolution to predict the mechanisms by which resistance to a new drug can evolve. In vitro methods of generating sequence diversity, such as DNA shuffling and related approaches, can be used to identify candidate mutations or mutational combinations that give rise to resistance to a new drug (Stemmer 1994). Once candidate mutational combinations are identified by DNA shuffling, one can determine whether these combinations are likely to be reached by natural selection, using further in vitro evolution (Barlow and Hall 2002c, 2003), or even by exploring the full fitness surface defined by the various combinations of mutations at those candidate sites (Weinreich et al. 2006). In this way, pharmaceutical developers could screen drug candidates for the likelihood that resistance would rapidly evolve, and could furthermore search for drug combinations that hinder the evolution of resistance to the new therapy (Barlow and Hall 2003).
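As a sketch of what exploring such a fitness surface means in practice, the following enumerates mutational orderings on a hypothetical three-site landscape (all fitness values are invented) and counts those accessible to selection, i.e., fitness-increasing at every step, in the spirit of Weinreich et al. (2006):

```python
from itertools import permutations

# Hypothetical 3-site landscape: genotypes are bit-tuples (1 = candidate
# resistance mutation present at that site); fitnesses invented for illustration.
fitness = {
    (0, 0, 0): 1.00, (1, 0, 0): 1.10, (0, 1, 0): 0.95, (0, 0, 1): 1.05,
    (1, 1, 0): 1.20, (1, 0, 1): 1.02, (0, 1, 1): 1.15, (1, 1, 1): 1.30,
}

def accessible_paths(n=3):
    """Count orders of fixing the n mutations in which fitness rises monotonically."""
    count = 0
    for order in permutations(range(n)):
        g = [0] * n
        w_prev = fitness[tuple(g)]
        ok = True
        for site in order:
            g[site] = 1
            w = fitness[tuple(g)]
            if w <= w_prev:         # a downhill or neutral step blocks the path
                ok = False
                break
            w_prev = w
        count += ok
    return count

print(accessible_paths(), "of 6 orderings are selectively accessible")
```

On this invented surface only 2 of the 6 orderings are uphill throughout; a drug candidate whose landscape left few or no accessible paths to full resistance would be the more attractive one.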

Narrow spectrum antibiotics

Historically, the trend in antibiotic development has been to broaden the range of targeted microorganisms by moving from ‘narrow spectrum’ to ‘broad spectrum’ drugs. For example, penicillin, the first β-lactam antibiotic, was only effective against


some Gram-positive bacteria, while cefepime, a fourth generation cephalosporin, is active against most clinical species of Gram-positive and -negative bacteria. In part, this has been driven by the economic need to increase potential market share through wider potential use. However, widespread use of many antibiotics has promoted the evolution of resistance, for antibiotics target both the etiological agent and many other organisms that switch between commensal and pathogenic life histories (e.g., E. coli, Enterococcus spp.). Consequently, there is increasing interest in antibiotics that only target a few organisms (Gillor et al. 2005). Ironically, concerns about the MRSA epidemic and organisms such as VRE and Acinetobacter have renewed interest in narrow spectrum antibiotics (Talbot et al. 2006). The ‘market share’ (i.e., the disease load) of these pathogens is so great that developing drugs to combat just one of these organisms is profitable. We now turn to one promising avenue for narrow spectrum antibiotics: bacteriocins.

Bacteriocins

One class of antibiotics that has not yet been used therapeutically is the bacteriocins. These are antibiotics produced by bacteria that typically kill only closely related bacteria, often within the same species or genus, narrowing the set of ‘non-target’ organisms that could evolve resistance. Because bacteriocins do not affect eukaryotic cells, these compounds could have far fewer toxic side effects than many other classes of compounds. One class of bacteriocins currently being investigated is the colicins: plasmid-encoded protein antibiotics active against E. coli and related species. Colicins bind to a receptor molecule located on the surface of the cell and are then transported through the cell membrane into the interior of the cell (James et al. 1996), where they kill bacteria through several mechanisms, including DNA and RNA endonuclease activity and disruption of the integrity of the cell membrane (Pugsley 1984). Two mechanisms provide defense against colicins: ‘resistance’ and immunity. Colicin resistance involves the alteration of either the receptor or the transport systems so that the colicin cannot enter the cell. A single point mutation in either of these systems will confer resistance to multiple



colicins (Feldgarden and Riley 1999). In addition, some mutations confer costs in certain habitats. These costs result from the multiple functions of the receptors and transport systems, including interactions with efflux pumps (e.g., tolC), maintaining the integrity of the cell membrane, and nutrient uptake (Davies and Reeves 1975). For example, mutations in the FepA receptor confer resistance to colicins B and D, but also limit growth in iron-limited environments (Pugsley and Reeves 1976). Colicin immunity functions like a poison–antidote system. It is due to an immunity protein produced together with the colicin protein (Kleanthous et al. 1998). Immunity proteins are specific to the particular colicin with which they are produced and are not thought to have significant effects on sensitivity to other colicins. Because colicins must bind to a receptor and then be compatible with the uptake systems of the host cell, they often have a limited host range and can be used to target particular pathogens, such as extraintestinal pathogenic E. coli. In addition, if colicin resistance results in trade-offs, such as low growth under iron-limited conditions or the inability to withstand environmental stress, it can be disadvantageous in such habitats, limiting the spread of these resistant genotypes. Although in natural environments the frequency of colicin resistance is quite high (Feldgarden and Riley 1998), some pathogenic E. coli appear to be highly susceptible to colicins (Murinda et al. 1996). Recent work has attempted to modify colicins and other bacteriocins to be able to evade naturally occurring resistance mechanisms, often by altering receptor and transport targets (Gillor et al. 2005).

Quorum sensing disruptors

Another promising approach to non-traditional antimicrobial chemotherapy is to alter bacterial behavior rather than eliminating bacteria outright. Many of the most harmful activities that bacteria engage in—including toxin production and biofilm construction—are social activities that result from coordinated behavior. Coordination is often achieved using quorum sensing signals, which allow bacteria to regulate their activities in a density-appropriate manner and engage in social behavior only when densities are high enough (when a ‘quorum’ is

present) for this to be effective (Miller and Bassler 2001). If we could disrupt the quorum sensing systems that bacteria use to turn on toxin and biofilm production, we could potentially reduce the impact of many bacterial infections and possibly also hasten clearance by conventional antibiotics; thus quorum sensing disruptors are of considerable interest in antibacterial drug development (Finch et al. 1998; Hartman and Wise 1998; Hentzer et al. 2003). At first glance, this strategy may seem to have limited potential. Bacteria use quorum sensing to make certain gene products inducible rather than constitutive. Simple point mutations could presumably switch these gene products to be constitutively expressed. Thus one might think that resistance to quorum sensing disruptors—in the form of constitutive expression—would evolve rapidly. But in an elegant application of evolutionary modeling, André and Godelle (2005) point out that bacterial social behavior, once disrupted, may be extremely slow to return. Their argument is essentially this: bacterial social behavior requires cooperation from many reproductively distinct individuals. Thus social behavior, including biofilm formation and toxin production, requires that bacteria somehow solve the collective action problem so that free-riders do not cause cooperation to break down. Where bacterial cooperation occurs, it is not an unavoidable consequence of direct individual selection as antibiotic resistance usually is, but rather a finely balanced consequence of multilevel selection. Thus if bacterial cooperation is disrupted, it may not return as readily as individually selected traits. To see how this might work, imagine a population of bacteria in which social behavior has been halted by disrupting quorum sensing. Whereas with conventional antibiotics the first antibiotic-resistant mutant has a substantial growth advantage, with quorum sensing disruptors the first resistant mutant has a growth disadvantage.
It provides a public good by producing constitutively, but it receives no benefits from the other members of the population, who are not producing due to the quorum sensing disruptor. Moreover, because these behaviors are selected at the population level, if resistance does evolve it is likely to do so on the time scale of populations, rather than on the time scale of individuals. While a bacterium may


reproduce in a matter of hours, populations often turn over on scales of weeks to months, and thus resistance to quorum sensing disruptors is likely to evolve much more slowly than does resistance to conventional antibiotics.
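A toy public-goods calculation makes the disadvantage of the first constitutive mutant explicit. The payoffs are ours, chosen to follow the verbal argument of André and Godelle (2005), not any model in that paper:

```python
# Everyone in a group shares the benefit of toxin/biofilm production; only
# producers pay the cost. (Hypothetical benefit and cost values.)
def fitness(producers, group_size, benefit=0.5, cost=0.2):
    shared = benefit * producers / group_size
    return {"producer": 1 + shared - cost, "nonproducer": 1 + shared}

# A lone constitutive mutant in a quorum-disrupted (non-producing) group pays
# the full cost of production but captures almost none of the shared benefit:
w = fitness(producers=1, group_size=100)
print(w["producer"] < w["nonproducer"])  # → True: the 'resistant' mutant is disfavored

# The trait pays off only at the group level, where selection acts more slowly:
print(fitness(producers=100, group_size=100)["producer"]
      > fitness(producers=0, group_size=100)["nonproducer"])  # → True
```

Within any mixed group the producer is always the loser; only competition between groups favors production, which is why the return of the social trait is expected on the slower, population time scale.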

Ecological modeling

Population biology also provides a framework for thinking about ways to alter the hospital environment and to manage drug use so as to minimize the evolution and spread of resistance. Here, mathematical models of disease accord with proven clinical strategies (Lipsitch et al. 2000). Hand-washing reduces the rate of transmission within hospitals and thus makes it harder for hospital-adapted strains to persist endemically within the hospital. Clinically, this has been known for nearly fifty years as an effective response to antibiotic resistance outbreaks (Barber et al. 1960). Barrier precautions, increased staff-to-patient ratios, and other basic hygiene measures have a similar effect. A one-time shift in the formulary reduces the strength of selection favoring resistant strains; repeated experience reveals the value of this approach (Lilly and Lowbury 1978).


Antibiotic cycling

More recently, several authorities have speculated that antibiotic cycling—in which drug classes are rotated on a scheduled basis—may constitute an effective way of hindering resistance evolution and spread. The basic logic parallels that underlying crop rotation and HIV drug rotation: by confronting bacteria with a changing environment, their ability to track the environmental conditions may be reduced. Unfortunately, the clinical trials conducted thus far have been mostly disappointing, as revealed in a meta-analysis (Brown and Nathwani 2005). Mathematical modeling helps one see the reason for these results: antibiotic cycling does not necessarily reduce the environmental heterogeneity at the scale relevant to bacterial clones spreading through the hospital (Fig. 10.2) (Bergstrom et al. 2004). While antibiotic cycling may introduce longer-term heterogeneity in the hospital, it does not increase the local heterogeneity generated when different patients receive different drugs according to patient history, indications, and physician preference. In fact, by standardizing drug therapy at any given time, a cycling program may even decrease



Figure 10.2 Paradoxically, cycling provides a more homogeneous environment for bacteria than does ‘mixing,’ in which different patients receive different drugs but the proportions of drug use in a ward stay roughly constant over time. The black line indicates the trajectory of a bacterial lineage as it passes from patient to patient and bed to bed. Shaded rectangles indicate patients receiving one drug, open rectangles indicate patients receiving an alternative drug. The bacterial lineage in the mixing ward faces more heterogeneous selective conditions (antibiotic types) than does the bacterial lineage in the cycling ward. From Bergstrom et al. (2004).
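The heterogeneity argument behind the figure can be illustrated with a toy simulation (the parameters are ours, not those of the model in Bergstrom et al. 2004): follow a lineage as it hops between patients and count how often the drug it faces changes.

```python
import random

# A lineage hops to a new patient every few days; under cycling the whole ward
# is on one drug per period, under mixing each patient's drug is random.
# (Hop interval, cycle length, and hop count are invented for illustration.)
def switches_seen(policy, hops=1000, hop_days=5, cycle_days=90, seed=1):
    rng = random.Random(seed)
    t, prev, n = 0, None, 0
    for _ in range(hops):
        if policy == "cycling":
            drug = (t // cycle_days) % 2     # ward-wide drug 0 or 1, by period
        else:
            drug = rng.randrange(2)          # mixing: per-patient random drug
        n += (prev is not None and drug != prev)
        prev, t = drug, t + hop_days
    return n

# The lineage faces far fewer changes of drug under cycling than under mixing:
print(switches_seen("cycling") < switches_seen("mixing"))  # → True
```

With these toy numbers the cycling lineage changes drug only when a ninety-day period rolls over, while the mixing lineage changes drug on roughly half of its hops, which is the sense in which mixing is the more heterogeneous environment.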



local heterogeneity. Because successful hospital-associated clones pass among multiple patients within the span of a single antibiotic ‘cycle,’ local heterogeneity turns out to be a more important restriction on the bacterial population’s ability to track its environment than is the longer-term temporal heterogeneity introduced by cycling.

Conclusions

As grave as the public threat posed by antibiotic resistance may be, we should maintain perspective on which aspects of our lives are directly threatened and on which are not. Even if we lose much of our ability to resolve bacterial infection using antibiotics, we will not necessarily return to Dark Ages rates of infectious disease mortality—or even to nineteenth-century rates. Figure 10.3 illustrates infectious disease mortality rates, measured as deaths per 100,000 individuals per year, across the twentieth century. Over that century, infectious disease mortality declined dramatically, from roughly 800 deaths per 100,000 per year in 1900 to roughly 60 deaths per 100,000 per year in 1996. Remarkably, the vast majority of this decline had nothing to do with the development and use of antibiotics! By 1935, when the first antibacterials, known as sulfonamides, were released, disease mortality had already declined almost threefold from its 1900 rate. By the time that penicillin was first used clinically in 1943, infectious disease mortality had dropped to less than a quarter of its 1900 rate. Thus at most a third of the overall decline could conceivably be attributed to antibiotics—and even this is likely to be a considerable overestimate, given that the decreasing trend in disease mortality continued steadily from 1900 to 1950, rather than accelerating with the advent of antibiotics. Presumably, most of the decline in infectious disease mortality over the twentieth century was instead a consequence of increased sanitation, improved public health infrastructure and practice, improvements in food handling, storage, and preparation, and improvements in nutrition (Genereux and Bergstrom 2005). Thus if we fall behind in the race against the evolution of antibiotic-resistant bacteria, we are unlikely to return to the sort of infectious disease mortalities that plagued humankind in 1900. What

Figure 10.3 Infectious disease mortality rate in the United States (Armstrong et al. 1999; Genereux and Bergstrom 2005): deaths per 100,000 per year, plotted from 1900 onward, with annotations marking the 1918 influenza pandemic, the testing of the first sulfonamides, and the first clinical use of penicillin. Redrawn with permission from Armstrong et al. (1999).


would we lose if antibiotic resistance becomes even more widespread and we have fewer options for treating multidrug-resistant infections? With the rise of HIV (which is responsible for most of the post-1980 increase in infectious disease mortality shown in Fig. 10.3) and increasing numbers of patients who are immunocompromised for other reasons, we are dealing with growing populations of patients for whom antibiotics are a necessity. Furthermore, prophylactic use of antibiotics is critical for our ability to perform invasive surgical procedures without a serious risk of infection. Thus the loss of effective antibiotics would indeed be an enormous setback to medical practice.
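The mortality arithmetic underpinning this conclusion is easy to verify, using the chapter's approximate figures:

```python
# Back-of-the-envelope check: ~800 deaths/100,000/yr in 1900, ~60 in 1996.
rate_1900, rate_1996 = 800, 60
rate_1935 = rate_1900 / 3   # "almost threefold" decline by the first sulfonamides
total_decline = rate_1900 - rate_1996

# Even crediting antibiotics with the entire decline after 1935:
attributable = (rate_1935 - rate_1996) / total_decline
print(round(attributable, 2))  # → 0.28, i.e., under the text's one-third bound
```

So even the most generous accounting, which assigns every post-sulfonamide death averted to antibiotics, leaves them responsible for well under a third of the century's decline.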

Summary

1. The evolution of resistance to a clinical antibiotic occurs with near certainty after several years of widespread use.
2. Antibiotic resistance can evolve by a variety of genetic mechanisms and spread throughout and between species via gene transfer.
3. Resistance mechanisms that evolve in agricultural settings where antibiotics are used can transfer into human populations, contributing to the overall problem of antibiotic resistance.


4. Associations among resistance genes and the process of compensatory evolution can result in the retention of resistance genes even in the absence of selection favoring resistance.
5. Novel approaches to antimicrobial therapy—including narrow spectrum antibiotics, bacteriocins, and disruptors of quorum sensing—may provide alternatives to traditional broad spectrum antibiotics, ones to which resistance evolves less quickly.
6. To eradicate antibiotic resistance from a hospital setting, researchers need a thorough understanding of the underlying ecology. For example, ecological models have shown that antibiotic cycling, the hospital equivalent of crop rotation, does not necessarily reduce the environmental heterogeneity at the scale relevant to bacterial clones spreading through the hospital and thus may be ineffective at reducing the frequency of resistant strains in a hospital setting.

Acknowledgments

C.B. was supported in part by NIH R01 GM68657. M.F. was supported by the Alliance for the Prudent Use of Antibiotics (APUA) through NIH Grant U24 AI 50139.


CHAPTER 11

Pathogen evolution in a vaccinated world

Andrew F. Read and Margaret J. Mackinnon

Introduction

The evolution of drug resistance undermines the effectiveness of chemotherapy. In contrast, vaccines do not fail with the same depressing regularity in the face of pathogen evolution. Indeed, vaccination has eradicated one human disease, provides robust control of another eight in the developed world, and protects individuals against over a dozen more. Rightly, vaccination is viewed as a medical triumph. Yet we argue that the long-term control of acute childhood diseases like smallpox, polio, and measles does not mean vaccines are evolution-proof. The pathogens now being targeted are quite different from the organisms responsible for those diseases, and some of the vast evolutionary experiments currently being conducted with vaccines are generating pathogen evolution. As shall be seen, a variety of evolutionary responses to vaccination are possible, including the evolution of more virulent pathogens. In general, little is known of what will happen and how evolution can be directed for human betterment. For much of the past century, the development and use of new drugs took place with little consideration of the evolutionary consequences (cf. Chapter 10). It is our contention that we should not repeat that complacency with vaccines: their evolutionary consequences need to be understood so that the benefits of this most successful of disease control measures can continue to be reaped. We structure our discussion around the following superficially attractive statements. As shall be seen, each is at least partly wrong.

• Vaccine-induced immunity simply replaces natural immunity, and so vaccination has no consequences for pathogen evolution.
• Vaccination has worked effectively for more than 100 years. If vaccine-driven evolution were going to cause problems, it would be obvious by now.
• Even if vaccine-resistant mutants do evolve, they will do us less harm than wild type pathogens.

Vaccines have consequences for pathogen evolution

Host immunity imposes massive selection on pathogen populations: host–parasite interactions are one of the richest of evolutionary battlefields. But unlike drugs, which impose totally novel selection pressures, vaccines work by eliciting immune mechanisms generated by natural infection in any case. Unfortunately, as shown by the examples of the following eight diseases, this does not make vaccines evolution-proof.

Hepatitis B

Hepatitis B virus (HBV) is a globally significant cause of hepatitis, liver cirrhosis, and liver cancer. An effective vaccine has been used as part of national childhood vaccination programs since the early 1980s. These campaigns have dramatically reduced the prevalence of the virus and the associated disease (FitzSimons et al. 2005). The vaccine contains recombinant hepatitis B surface antigen (HBsAg). The so-called a determinant of HBsAg is the major target for the neutralizing antibodies




produced during natural infection or following vaccination. In 1990, the first single amino-acid substitution in the S gene coding for the a determinant was reported; since then more mutants have been discovered. These mutants exist at low frequencies in unvaccinated individuals and at least some are transmissible. A key feature is that these mutants can coexist with vaccine-induced anti-HBsAg antibodies; indeed, they are often responsible for vaccine breakthrough. Importantly, there is good evidence that these mutants are more often found in vaccinated individuals, and that they are increasing in frequency in vaccinated populations (Fig. 11.1) (François et al. 2001; Hsu et al. 2004; FitzSimons et al. 2005). Thus, HBV populations are evolving in response to widespread vaccination.

Figure 11.1 Examples of the evolution of vaccine-adapted mutants following the introduction of widespread vaccination. (a) The spread of mutant forms of the a determinant of the surface antigen of hepatitis B virus in Taiwan after universal vaccination began in 1984 (data from Hsu et al. 2004). (b) The spread of non-vaccine alleles of pertactin in Bordetella pertussis populations in Finland (data from Elomaa et al. 2005) and (c) The Netherlands (data from van Loo et al. 1999). Widespread pertussis vaccination was introduced into those countries in 1952 and 1953, respectively. (d) Increasing frequency of non-vaccine serotypes of Streptococcus pneumoniae after the introduction in the United States in 2000 of a pneumococcal conjugate vaccine containing 7 of 90 possible serotypes (data from Flannery et al. 2006). Note that in all cases, vaccination substantially reduced disease incidence.

Proposals to incorporate other HBV antigens into the vaccine are being considered (François et al.
2001; FitzSimons et al. 2005; Kimman et al. 2006), but it is not obvious that these will obviate the problem, for variants at these other sites occur, presumably because they also provide at least some escape from naturally acquired neutralizing antibodies.

Pertussis
Also known as whooping cough, pertussis is a respiratory disease caused by the bacterium Bordetella pertussis. The introduction of whole-cell vaccines in the middle of the last century resulted in dramatic decreases in disease incidence. The virulence factors of B. pertussis can be divided into adhesins, such as pertactin (prn), and toxins, such as pertussis toxin (ptx). Adhesins facilitate attachment to the host, and toxins are involved in immune evasion and possibly resource extraction. Many of these virulence factors are polymorphic, and major changes in allele frequencies have been recorded worldwide (van Loo et al. 1999; Hallander et al. 2005; Hardwick et al. 2007), some of which are associated with vaccination. For instance, in Finland and the Netherlands, the frequency of the vaccine-type pertactin allele (prn1) fell from essentially 100% of the pathogen population to less than 5% after the introduction of nationwide vaccination (Figs. 11.1b,c) (van Loo et al. 1999; Elomaa et al. 2005). Similarly, in several countries, the ptxA1 allele, not present in the vaccine strains, has replaced the vaccine alleles (Elomaa et al. 2005; van Amersfoorth et al. 2005). Non-vaccine alleles tend to be more frequent in vaccinated individuals than in unvaccinated individuals (Mooi et al. 1998; Mastrantonio et al. 1999).

Pneumococcal disease
Acute infections with the bacterium Streptococcus pneumoniae cause pneumococcal disease, which can present as meningitis, septicemia, and pneumonia, and which is an important cause of death among infants and the aged. Streptococcus pneumoniae has about 90 known serotypes (types of capsular polysaccharides); they vary in prevalence and virulence. In the 1990s, clinical trials with vaccines containing 7–11 of these polysaccharides showed decreases in the targeted serotypes and increases in non-vaccine strains, a phenomenon known as strain
(serotype) replacement (Lipsitch 1999). Following the widespread use of such vaccines in childhood immunization programs, strain replacement is now visible at a population level. For instance, in the United States, widespread use of a 7-valent conjugate vaccine began in 2000. Disease incidence has declined dramatically, but non-vaccine serotypes are now increasing in frequency both among disease cases (Fig. 11.1d) and in the asymptomatic carriage population (McEllistrem et al. 2003; Huang et al. 2005; Flannery et al. 2006). The increase in non-vaccine serotypes can fully compensate for the decline in the vaccine serotypes, resulting in no net change in S. pneumoniae prevalence in the community (Huang et al. 2005). Some of the evolution may be a consequence of a simple rise in the frequency of strains with non-vaccine capsular types filling the niche vacated by the strains targeted by the vaccine, although it is also possible that existing strains are acquiring non-vaccine capsular polysaccharides (Porat et al. 2004).
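The niche-filling interpretation can be illustrated with a deliberately minimal two-strain SIS model, in which a vaccine blocks transmission of the targeted strain only. All parameter values below are invented for illustration; they are not estimates for S. pneumoniae.

```python
# Toy two-strain SIS model of serotype replacement.
# Two strains compete for the same pool of susceptible hosts; a vaccine
# that blocks transmission of strain 1 only vacates strain 1's niche.
# All rates are illustrative assumptions, not pneumococcal estimates.

def simulate(beta1, beta2, gamma, i1, i2, t_end, dt=0.01):
    """Euler-integrate carriage prevalences i1, i2 (population fractions)."""
    for _ in range(int(t_end / dt)):
        s = 1.0 - i1 - i2                      # susceptible fraction
        di1 = beta1 * s * i1 - gamma * i1      # transmission minus clearance
        di2 = beta2 * s * i2 - gamma * i2
        i1 += di1 * dt
        i2 += di2 * dt
    return i1, i2

beta1, beta2, gamma = 3.0, 2.8, 1.0            # vaccine-type strain slightly fitter

# Pre-vaccine era: strain 1 dominates, strain 2 is rare.
i1, i2 = simulate(beta1, beta2, gamma, 0.05, 1e-4, 100)
pre_total = i1 + i2                            # near 1 - gamma/beta1 = 0.667

# Vaccination blocks 80% of strain-1 transmission; strain 2 is unaffected.
# (Strain 2 is re-seeded at a low level to represent ongoing rare carriage.)
i1, i2 = simulate(beta1 * 0.2, beta2, gamma, i1, max(i2, 1e-4), 300)
post_total = i1 + i2                           # near 1 - gamma/beta2 = 0.643

print(round(pre_total, 2), round(post_total, 2))
```

In this toy model the targeted strain is driven out while the non-vaccine strain rises to its own equilibrium, so total carriage barely changes, which is the pattern of serotype replacement described above.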

Diphtheria
Infections with toxin-producing strains of the bacterium Corynebacterium diphtheriae can cause respiratory disease characterized by lesions in the upper respiratory tract, particularly the tonsils, pharynx, larynx, and nose. The disease lesions are due to a specific phage-encoded cytotoxin; widespread immunization with the detoxified toxin ('toxoid' vaccine) has reduced the disease from a major child killer to one rarely seen at all, at least in rich countries. Various authors have attributed this to reductions in the frequency of the toxin-encoding phage in the bacterial population (e.g. Pappenheimer 1984; Ewald 1994, 1996, 2002; Soubeyrand and Plotkin 2002). So far as we are aware, the only data showing this evolution come from Romania during the third quarter of the last century (Fig. 11.2). There, the frequency of C. diphtheriae strains that were toxigenic declined from over 80% to less than 5% after the introduction of the vaccine (Pappenheimer 1982).

Malaria
There is currently no malaria vaccine, but many candidate vaccines are in various stages of trial.


[Figure 11.2 here: percentage of strains that were toxigenic (y-axis) by year, 1958–1972 (x-axis); Diphtheria, Romania.]
Figure 11.2 Decreasing frequency of toxin-producing strains of Corynebacterium diphtheriae in Romania after the introduction of widespread immunization with the diphtheria toxoid vaccine in 1958 (data from Pappenheimer 1982).

One of these trials demonstrated that vaccine-driven evolution is going to be an important consideration. The vaccine, known as Combination B, contained several antigens encoded by different Plasmodium falciparum genes. Antigenic loci in malaria are notoriously polymorphic, and Combination B contained a single antigen from each of three polymorphic loci. One of these loci, msp2, is dimorphic in nature: a single parasite carries one of two allelic forms. The vaccine contained just one of these forms. In a Phase II vaccine trial in Papua New Guinea, the vaccine did not protect against disease, but the non-vaccine allele rose in frequency in vaccinated people (Genton et al. 2002). This sort of evolution—selection against vaccine strain alleles—is a major concern for malaria vaccine developers (e.g. Mahantry et al. 2003; Matuschewski 2006).

Avian influenza
In 1995, the Mexican government was one of the first to use vaccination to try to control H5N2 avian influenza in poultry, with very widespread immunization of commercial chickens. The antigenic variants that existed prior to vaccination were well controlled by vaccine-induced immunity, but new lineages arose after vaccination and replaced the originals. These new viral lineages are antigenically distinct and less successfully controlled by the vaccine (Lee et al. 2004). Similar evolution
has been seen in H9N2 influenza viruses following widespread vaccination of poultry in China (Li et al. 2005). This may mean that poultry influenza vaccines need continual modification to track viral evolution in response to vaccine-induced immunity (Lee et al. 2004), just as human influenza vaccines need to track viral evolution in response to natural immunity.

Marek's disease
Marek's disease virus (MDV) is a cancer-causing herpes virus that costs the global poultry industry more than US$1 billion annually. The virus became economically important with the intensification of the chicken industry after WWII. In the United States, vaccination of chickens with live virus from a related non-oncogenic strain was used from the late 1960s. This first generation vaccine initially provided good control, but within a decade it was not providing adequate protection against virulent viral strains that appeared in the 1970s. In the 1980s, a second generation vaccine consisting of two non-oncogenic strains was introduced, but this too began to fail as more virulent strains subsequently evolved. In the 1990s a third generation vaccine consisting of an attenuated form of an oncogenic strain was introduced. Losses have once again subsided, but there is great concern in the poultry industry that the third generation vaccine may eventually be undermined by the evolution of even more pathogenic strains. Importantly, the two generations of vaccine that failed were undermined by strains antigenically identical to the oncogenic strains of the pre-vaccine era. Changes in viral aggression and immunosuppressive capacity, not antigenic type, caused the vaccine failure (Fig. 11.3) (Witter 2001; Davison and Nair 2004).

[Figure 11.3 here: photographs of lymphoid organs.]
Figure 11.3 Lymphoid organs of chickens: normal chicken (control) and those infected with vaccine-breakthrough strains of MDV. Virulent MDV (vMDV) strains were responsible for the downfall of the first generation of anti-MDV vaccines, and successively more virulent strains (vv and vv+) for the downfall of second generation vaccines. Photo by B. W. Calnek, with permission.

Infectious bursal disease (IBD)
IBD is an immunosuppressive disease of chickens caused by a birnavirus and is responsible for many cases of respiratory and enteric disease. From the mid 1980s, vaccination failures began to be described in poultry operations around the world. In the United States, vaccine breakthrough was due to the evolution of antigenically novel strains against which classical IBDV vaccines were not sufficiently protective. In contrast, vaccine-breakthrough strains in Europe belong to the ancestral IBDV serotype but were instead more aggressive. Like the newly evolved MDV strains, these very virulent European strains cause more severe disease in unvaccinated birds, with mortality rates of up to 60% (van den Berg 2000; Rautenschlein et al. 2005; Le Nouën et al. 2006).

Thus, vaccines are not evolution-proof
These examples show that vaccines can provoke and even be overcome by pathogen evolution. Most obviously, vaccines that target a subset of strains can give a competitive edge to those not present in the vaccine. But even in the absence of strain-specific effects, widespread vaccination can alter the immune pressures pathogens experience. By reducing the number of fully susceptible hosts in a population, vaccine programs alter the likelihood that a pathogen will encounter and evolve in non-immune hosts. Vaccination can also create a new niche of weakly immune hosts. Where the relative fitness of competing pathogen strains depends on the immune status of their host, changing immune profiles of a population will prompt pathogen evolution. In extreme cases, vaccines can even be the main or only source of pre-existing immune selection.


In many intensified farming situations, such as the poultry and pig industries, large numbers of fully susceptible individuals come together for relatively short periods. Here, naturally acquired immunity will often have little impact on pathogen evolution, partly because major efforts are made to prevent natural infections in the first place, but mostly because animals are slaughtered before natural immunity has time to build up in the host population (Lee et al. 2004; de Jong et al. 2007). When epidemics do occur, vaccine-induced immunity will be a major source of immune selection on the pathogen.

Why has vaccination worked despite evolution?
If vaccine adaptation is possible, why have pathogens subject to vaccination for many decades not evolved and caused the failure of immunization programs? Vaccination successfully eradicated smallpox and is close to eradicating polio. It has provided sustainable control for several decades against several other diseases, including measles, pertussis, diphtheria, mumps, and rubella. In this section, we argue that one should not draw from this history the lesson that evolution is unimportant. The success stories to date concern a peculiar subset of infectious diseases. The lesson of Marek's disease is salutary: vaccines can fail in the face of pathogen evolution. It is simply too early to be confident that MDV will be the exception rather than the rule.

Not all infectious diseases are alike
The vaccine success stories involve acute childhood infections. A striking feature of the natural history of these diseases is that first infections invoke immunity that is sterilizing, strain-transcending, and usually life-long. Such acute infections either kill their host or are rapidly cleared. The pathogens persist by exploiting susceptible individuals, typically non-immune children. Why natural selection failed to find these organisms a way to penetrate previously exposed hosts is unclear, but it is evident that it did not. There must have been intense selection on all of them to break through natural immunity in the pre-vaccine era. An evolutionary solution would have been no easier in the vaccine era. Acute childhood diseases were easy targets for vaccination: natural immunity was already evolution-proof; all that was needed was for vaccines to induce something similar. In the case of smallpox, eradication did not even require vaccines to elicit the evolution-proof level of immunity produced by natural infections. The proportion of people that need to be vaccinated to eradicate a disease is determined by R0, one measure of pathogen fitness. R0 for smallpox is one of the lowest for any human disease, and any mutants able to escape vaccine-induced immunity would have had an even lower R0 (or else they would have evolved anyway). Escape mutants would thus be even easier to eradicate than wild type. Only if the smallpox vaccine had been very weakly cross-protective could any epitope variants have escaped eradication and saved the species (McLean 1995).

The diseases that are the focus of much of today's vaccine development differ notably from acute childhood infections. The populations of pathogens causing diseases like flu, malaria, and pneumococcal disease frequently consist of a rich diversity of strains able to successfully infect previously infected individuals. Individual infections of diseases like HIV, sleeping sickness, tuberculosis, and malaria are often chronic, with infections persisting in partially immune hosts due to immunosuppression and antigenic variation. These 'hard diseases' are a great challenge for vaccine developers, arguably because for these pathogens, natural selection has found them an evolutionary solution to natural immunity: antigenic flexibility. Moreover, in contrast to vaccines against the successfully controlled diseases that induce sterilizing immunity, many vaccines against other diseases leak, allowing wild type pathogens to transmit through vaccinated hosts.
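The eradication arithmetic above can be made explicit: for a pathogen with basic reproductive number R0, the classic critical vaccination coverage is p_c = 1 − 1/R0, so an escape mutant with a lower R0 than wild type also has a lower eradication threshold. The R0 values in the sketch below are round illustrative numbers, not authoritative estimates.

```python
# Critical vaccination coverage for eradication: p_c = 1 - 1/R0.
# A mutant that escapes vaccine-induced immunity but has a lower R0
# than wild type is, by this criterion, easier to eradicate.
# The R0 values used here are illustrative, not authoritative estimates.

def critical_coverage(r0):
    """Fraction of the population that must be immune to push the
    effective reproductive number below 1."""
    if r0 <= 1.0:
        return 0.0          # spreads too poorly to persist in any case
    return 1.0 - 1.0 / r0

for name, r0 in [("smallpox-like", 5.0),
                 ("escape mutant", 3.0),
                 ("measles-like", 15.0)]:
    print(f"{name}: R0 = {r0:>4}, eradication threshold = "
          f"{critical_coverage(r0):.0%}")
```

With these numbers the wild type needs 80% coverage, the lower-R0 escape mutant only 67%, while a measles-like pathogen needs about 93%, which is why smallpox was a comparatively easy target.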
The absence of sterilizing immunity makes it possible for natural selection to probe vaccine-induced immunity for weaknesses. Notably, with Marek’s disease, where two generations of vaccines had to be abandoned in the face of viral evolution, immunization is extremely leaky (Islam et al. 2006).

Is it too soon to be confident?
Even for some of the acute childhood infections, it may be too early to say what the evolutionary outcome will be. A standard result in population genetics (and in the theory of drug resistance) is that advantageous mutations, even under strong selection, can spread in a population for some time—even decades—before they become detectable. Mathematical models of vaccine-driven evolution show the same thing: it might take decades to tip the balance in favor of vaccine-resistant strains (McLean 1995; Wilson et al. 1999; Gandon et al. 2001). This is particularly so when vaccine coverage is low to begin with (e.g., hepatitis B) and when the favored mutant is initially rare in the population.

Moreover, vaccines of the future might also impose novel immune selection. Whereas current vaccines work by mimicking natural immunity, many vaccine developers are deliberately attempting to stimulate immunity targeted at pathogen epitopes not seen by natural immunity, sometimes with effector mechanisms not deployed naturally. Frequently a motivation behind such attempts is to avoid the antigenic polymorphism of the loci seen by natural immunity (e.g., Alonso et al. 2005; Matuschewski 2006). Such technical breakthroughs have the potential to impose completely novel selection pressures, just as drugs do.
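The point that favored mutants can spread for a long time before becoming detectable follows from elementary haploid selection dynamics. The sketch below counts generations until a rare favored mutant reaches a detectable frequency; the selection coefficient, starting frequency, and detection threshold are arbitrary assumptions for illustration.

```python
# Haploid selection: a mutant with relative fitness 1 + s starts rare.
# One generation of selection: p' = p(1 + s) / (1 + p*s).
# Even under strong selection, a rare mutant can take many generations
# to reach a detectable frequency.  s, the starting frequency, and the
# detection threshold below are illustrative assumptions.

def generations_to_frequency(p0, s, target):
    """Generations until the mutant frequency first reaches `target`."""
    p, gens = p0, 0
    while p < target:
        p = p * (1 + s) / (1 + p * s)   # frequency after one generation
        gens += 1
    return gens

# 10% fitness advantage, starting at 1 in a million, detected at 5%:
print(generations_to_frequency(1e-6, 0.10, 0.05))
```

With a 10% fitness advantage, a mutant starting at one in a million takes on the order of a hundred pathogen generations to pass 5%; depending on how transmission generations map onto calendar time, that silent phase can span decades.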

Pathogen adaptation in vaccinated populations
In this section, we survey the sorts of pathogen phenotypes that vaccine-driven evolution might produce. In the next, we use this framework to address the consequences of these different evolutionary outcomes for public and animal health.

To study the evolutionary responses of pathogens to vaccine-imposed selection, it helps to consider the fate of pathogen variants that differ in some way from wild type pathogens. Wild type pathogens are those favored by natural selection in unvaccinated populations and, following Gandon and Day (2007), we refer to variants that rise in frequency after the introduction of widespread immunization as vaccine-favored or vaccine-adapted variants. These could also be called 'vaccine-escape'
or 'vaccine-resistant' mutants. We are tempted to use those terms because they capture the essential concept, but in much of the literature 'escape' and 'resistance' are equated with mutations at protective epitopes. For instance, François et al. (2001: 3803) explicitly equate vaccine escape in HBV with mutations in the envelope genes that result 'in nonrecognition by neutralizing antibodies induced by vaccination'. As we will argue below, epitope alterations are only one possible mechanism of vaccine adaptation.

To simplify matters, we hereafter refer to vaccine-favored variants as mutants, although they might be existing strains, genotypes or serotypes, or de novo mutants. All that matters is that their phenotypes be heritable. To rise in frequency, these vaccine-adapted mutants must be selectively favored in vaccinated hosts. This may mean that they are better able to invade, infect, or penetrate the defenses of a vaccinated host; that once inside the host, they have a higher per-day transmission rate; or that they are cleared less quickly by the immune system than are wild type strains.

In principle, parasite adaptation to immunized populations could involve a variety of different pathogen phenotypes. We argue that only a subset of this variety has so far been looked for—and therefore seen—and that broader thinking on this is required. Two pathogen traits have been the focus of previous considerations: epitope evolution (sensu François et al. 2001) and virulence evolution. Biomedical scientists have been concerned with the former; the latter has recently received some attention from evolutionary biologists. The distinction is somewhat artificial (epitope variants can differ in virulence), but we begin with these traits for historical and heuristic reasons.
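The requirement that a vaccine-adapted mutant be favored on average across host classes can be written out. In the sketch below, a strain's fitness at vaccine coverage p is the weighted average of its fitness in unvaccinated and vaccinated hosts; every fitness value is invented purely for illustration.

```python
# A variant spreads when its fitness, averaged over host classes,
# exceeds that of wild type.  With vaccine coverage p:
#     W(strain) = (1 - p) * w_unvacc + p * w_vacc
# The fitness values below are invented purely for illustration.

def mean_fitness(w_unvacc, w_vacc, coverage):
    """Population-average fitness across unvaccinated and vaccinated hosts."""
    return (1 - coverage) * w_unvacc + coverage * w_vacc

wild_type = (1.00, 0.20)   # strongly suppressed in vaccinated hosts
mutant    = (0.90, 0.60)   # pays a cost, but partially escapes immunity

for p in (0.0, 0.25, 0.5, 0.75):
    w_wt = mean_fitness(*wild_type, p)
    w_mu = mean_fitness(*mutant, p)
    winner = "mutant" if w_mu > w_wt else "wild type"
    print(f"coverage {p:.2f}: wt {w_wt:.2f}, mutant {w_mu:.2f} -> {winner}")

# Threshold coverage at which the mutant becomes favored:
# (1-p)*1.0 + p*0.2 = (1-p)*0.9 + p*0.6  =>  p = 0.1 / 0.5 = 0.2
```

With these invented numbers the mutant, disfavored in an unvaccinated population, is favored once coverage exceeds 20%, which is the sense in which a vaccine-adapted variant must be "selectively favored in vaccinated hosts."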

Epitope evolution
Vaccine-adapted mutants can arise by alterations in the genes encoding the pathogen epitopes that are recognized by vaccine-induced immunity. Some of these changes can make the structure of the protein they encode differ so much from wild type that the mutants are not recognized by vaccine-primed host responses. Epitope variants are well known in many diseases and are in part responsible for the chronic nature of many viral and protozoal diseases and the slow acquisition of protective natural immunity in diseases such as malaria. Epitope variants are a major cause of vaccine breakthrough in individual patients in several diseases including hepatitis B, whooping cough, and pneumococcal disease, where epitope differences among the capsular polysaccharides and surface antigens are responsible for the evolution seen in populations immunized against a subset of circulating strains (Fig. 11.1).

Virulence adaptation
A different class of possible outcomes is what might be called 'virulence adaptation.' Here, widespread immunization imposes direct selection on pathogen virulence determinants, so that the subsequent evolution involves the adjustment of intrinsic pathogen virulence, the virulence observed when the pathogen infects an immunologically naïve host. We define virulence as the harm done to hosts following infection.

• Decreased virulence. The intrinsic virulence of diphtheria went down in Romania after the introduction of a toxoid vaccine (Fig. 11.2). The standard explanation for this evolutionary outcome is as follows (Pappenheimer 1984; Ewald 1994, 1996, 2002; Soubeyrand and Plotkin 2002). Virulence is due to the production by the bacteria of a toxin that enhances pathogen fitness by allowing the bacteria to obtain nutrients when resources in the immediate vicinity are scarce. This gives toxin-producing strains (tox+) a fitness advantage over toxin-less (tox–) strains in unvaccinated hosts. But the toxoid vaccine induces anti-toxin immunity, reducing the benefits of producing the toxin. This immunity does not target the pathogen directly, just its products. The toxin constitutes as much as 5% of the total protein synthesized by the bacterium. Consequently, tox– strains are better adapted to an immunized population because they avoid the metabolic costs of producing toxin for little gain. Hence intrinsic virulence evolves downwards.

• Increased virulence. More recently, we proposed that vaccines could drive virulence in the opposite direction (Gandon et al. 2001, 2003; Read et al. 2004; Mackinnon et al. 2007). Our argument
is a logical consequence of the best-studied theory of why evolution does not always produce benign parasites. The virulence trade-off hypothesis posits that there are fitness benefits associated with virulence, as well as costs. The benefits (why selection favors virulent strains at the expense of the avirulent) are assumed to be the production of more transmission forms per unit time and/or longer time before immune clearance. The cost of virulence (why selection penalizes excessively virulent strains) is the truncation of the infectious period by host death. Natural selection favors those pathogen strains able to optimally balance these costs and benefits to maximize overall fitness. Further details of the trade-off model and other theories for the evolution of virulence are given by Ebert and Bull (Chapter 12). For vertebrate diseases, the strongest evidence supporting the trade-off model comes from myxomatosis and our own experimental work with rodent malaria in laboratory mice (Mackinnon and Read 2004a; Mackinnon et al. 2007). In Plasmodium chabaudi, we have documented genetic variation in virulence on which selection can act. Host death shortens infectious periods, but more virulent strains transmit more successfully, are less rapidly cleared by the host, and have an advantage in competition with less virulent strains. Importantly for what follows, these advantages of virulence also accrue in immunized hosts (Fig. 11.4). Now, consider what might happen if a new vaccination program is used to attack a pathogen that is optimally balancing the costs and benefits of virulence. Presumably a vaccine is used because it protects the host against death. This means the fitness cost of virulence—the force selecting against virulence—is relaxed by vaccination. Since there are still fitness benefits of virulence, more aggressive strains will spread in vaccinated populations because they are now less likely to kill the host. 
Even if vaccination reduces pathogen titers and transmission rates, virulent strains will still produce more transmission stages than less virulent strains. In fact, in an immunized host, they may produce disproportionately more transmission stages if immunity is more effective against less aggressive strains. Consequently, vaccinated individuals create the conditions that favor the spread of intrinsically more virulent parasites. This verbal argument is supported by rather general mathematical models (e.g., Gandon et al. 2001, 2003; Porco et al. 2005; André and Gandon 2006; Ganusov and Antia 2006; Massad et al. 2006; Miller et al. 2006). More specific population-level models, parameterized for endemic high-transmission malaria, confirm the argument and show that the evolution of virulence can take place on time scales of a few decades, comparable to the time taken for resistance to drugs like chloroquine to become clinically relevant (Gandon et al. 2001).

An important corollary to this argument is that the mode of action of the vaccine matters: the vaccine must leak (allow at least some transmission of wild type pathogens) and reduce disease (risk of death). Vaccines that stop hosts from becoming infectious do not alter the relative costs and benefits of virulence: they thus do not directly drive virulence evolution. Indeed, transmission reduction alone can even impose minor downward selection on virulence indirectly via epidemiological processes. Less transmission means fewer multiply infected hosts (Gandon et al. 2001), and in theory (e.g., Frank 1996) and for malaria in practice (de Roode et al. 2005; Bell et al. 2006) within-host competition can favor virulence.

Strikingly, outbreaks of virulent strains in vaccinated populations have been seen in two poultry diseases, Marek's disease and infectious bursal disease (see above), as well as in feline calicivirus disease in domestic cats (Hurley et al. 2004; Coyne et al. 2006; Radford et al. 2006). In all three cases, virulence evolution has eroded vaccine efficacy, and hosts infected with these vaccine-breakthrough strains are at greater risk of death.
While it is difficult to know whether these strains have arisen in response to vaccination, in all three cases, vaccination provides less protection against the virulent strains than it does against progenitor strains. Similarly, experimental evolution of rodent malaria parasites showed that immunization promoted the evolution of virulence (Mackinnon and Read 2004b). Because, from the pathogen's perspective, genetic host resistance is vaccination by another means, we also note that the evolution of more resistant rabbits in Australia


led to the evolution of more virulent myxoma virus (Fenner and Fantini 1999).

[Figure 11.4 here. Axes recovered from the original panels: transmissibility (log10 average no. gametocytes per red blood cell); rate of recovery (log10 no. parasites × 10^9 per day); lifetime transmission potential (total no. gametocytes × 10^6/ml; % of mosquitoes infected); virulence as minimum red blood cell density (log10 no. × 10^9/ml); panel (d) contrasts mice that survived with those that died (P < 0.05).]
Figure 11.4 Benefits and costs of virulence in rodent malaria. Groups of laboratory mice (C57/Bl, female, n = 7–17) were infected with one of 10 strains (clones) of the wild-caught rodent malaria parasite Plasmodium chabaudi. Parasite clones that were more virulent were also more transmissible (a, c) and more persistent (b). They were also more productive in semi-immune hosts: thus virulence was of positive benefit to parasite fitness. However, mice that died had lower transmission potential than mice that survived, causing a fitness cost of virulence (d). Thus these data support the assumptions of the virulence trade-off model. Virulence was measured as the minimum red blood cell density experienced by the mouse during its infection; transmissibility was measured by the average number of transmission stages (gametocytes) produced during the infection or as infectivity to mosquitoes over a 4-day period; and persistence was measured as the rate of decline in parasite density post-peak parasitemia. Each lettered symbol represents the mean for the parasite clone and the bars represent standard errors of the means. Black symbols indicate that the mouse was naïve to infection when inoculated; lighter symbols indicate that the mouse was made semi-immune prior to infection. Data from Mackinnon and Read (1999, 2003) and Mackinnon et al. (2002).
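The trade-off argument above can be sketched numerically. In the toy below, transmission rises with virulence with diminishing returns, virulence-induced mortality truncates the infectious period, and an anti-disease vaccine blocks a fraction of that mortality while leaving transmission intact. The functional forms and parameters are illustrative assumptions, not the parameterization used by Gandon et al.

```python
# Virulence trade-off sketch: transmission rises with virulence with
# diminishing returns, while virulence-induced mortality shortens the
# infectious period.  A leaky vaccine that blocks a fraction `e` of
# virulence-induced deaths relaxes the cost of virulence.
# All functional forms and parameters are illustrative assumptions.

MU = 1.0      # background rate of losing infectious hosts (recovery, etc.)

def fitness(v, e):
    """Per-infection fitness: transmission gain over rate of ending infection."""
    transmission = v ** 0.5          # diminishing returns to virulence
    loss = MU + (1 - e) * v          # vaccine blocks a share e of deaths
    return transmission / loss

def optimal_virulence(e):
    """Grid search for the virulence level maximizing fitness."""
    grid = [i / 1000 for i in range(1, 20001)]   # v from 0.001 to 20.0
    return max(grid, key=lambda v: fitness(v, e))

v_unvacc = optimal_virulence(0.0)   # analytic optimum: MU / (1 - e) = 1.0
v_vacc   = optimal_virulence(0.8)   # analytic optimum: 1 / 0.2      = 5.0
print(v_unvacc, v_vacc)
```

Under these assumptions the favored virulence in a vaccinated population is five times the unvaccinated optimum: relaxing the mortality cost while leaving the transmission benefit shifts the optimum upward, which is the verbal argument made above.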

Other possible vaccine-adapted phenotypes
Epitope changes and virulence adaptation are the main phenotypes that have been considered in the context of vaccine-driven evolution. But it seems likely that other phenotypes might also be favored in vaccinated populations, including these:

• enhanced immunosuppression, for instance by enhanced production of immunomodulatory substances;
• the evolution or enhanced production of smokescreen molecules, immunogenic molecules on pathogen surfaces whose only function is to distract the immune system from functionally important molecules;
• changes in patterns of antigenic variation, including antigenic repertoires and rates of change;
• changes in tissue tropism into immunologically privileged sites and/or increased sequestration; and
• activation of alternate host cell invasion pathways.

There are undoubtedly other strategies yet to be discovered.

The health consequences of vaccine-adapted pathogens
What will be the health outcomes of vaccine-driven pathogen evolution? There are several possibilities.

First, vaccine-adapted mutants might spread but with little health impact. Small differences in relative fitness can easily lead to large changes in gene frequencies with little impact on overall pathogenicity or infectiousness. In pertussis, several very large changes in allele frequencies at genes encoding toxins and adhesins occurred during the whole cell vaccine era with apparently little public health impact (Denoël et al. 2005).

Second, vaccine efficacy could be eroded, with the spread of vaccine-adapted mutants leading to reduced individual protection. This would be equivalent to classical drug resistance, and could, in extreme cases, lead to the abandonment of the vaccine, as happened with the early generation Marek's disease vaccines. There are calls to include new epitopes in some existing human and veterinary vaccines to pre-empt this (e.g., François et al. 2001; Radford et al. 2006). The existence of considerable standing variation among strains is one of the reasons vaccines do not yet exist for diseases such as malaria and HIV.

Third, intrinsic virulence could evolve. Clearly, this will enhance the population-wide health benefits of vaccination if virulence goes down: individuals will be exposed to less pathogenic strains. The continued circulation of unproblematic strains may also help boost or maintain vaccine-induced
immunity (Ewald 1996). In contrast, if more virulent strains evolve, unvaccinated individuals who get infected will suffer more severe disease, as will vaccinees if vaccine efficacy declines. But what of the population-wide health burden? Vaccines reduce disease severity, and often transmission. Does a lower force of infection, and hence reduction in the number of hosts at risk of becoming diseased, more than compensate for deaths due to increased intrinsic virulence?

We modeled the widespread use of a leaky blood-stage malaria vaccine in a human population subject to year-round high-intensity malaria transmission. Parameter values were chosen to mimic a situation like that in Tanzania. As predicted, intrinsic virulence increased, and the public health benefits were eroded through time, with vaccine-induced reductions in population-wide mortality rate receding from that achieved immediately after vaccination was introduced. At intermediate levels of vaccine coverage, total mortality actually exceeded that in the pre-vaccine era (Gandon et al. 2001). We are cautious about drawing strong policy conclusions from these models, because complex epidemiological situations are hard to capture. Nonetheless, the exercise does show that in principle, vaccine-driven evolution can worsen the public health burden. If such evolution did occur, withdrawal of the vaccine would expose even more people to the full intrinsic virulence of the vaccine-adapted strains.

Clearly, a key issue is whether the intrinsic virulence of vaccine-adapted strains is greater or less than that of the wild type. We think there is a general feeling in the vaccine community that because escape mutants will be less fit than wild type (otherwise, they would have evolved anyway), vaccine-adapted mutants will be less damaging than wild type pathogens.
Indeed, experimental insertion of immune-evading epitopes into wild type influenza A and HBV reduces viral replication rates relative to lines in which wild type epitopes have been inserted (Kalinina et al. 2003; Berkhoff et al. 2006). Apparently, the conformational changes required for immune evasion led to less efficient viral replication. Several other types of vaccine adaptation might similarly reduce replication efficiency. It is well known in the context of drug resistance (Chapter 10) that such fitness costs may in due course be reduced by compensatory mutations elsewhere in the genome (so far as we know, not yet studied for epitope mutants). Even so, initially penalized vaccine-adapted variants should at best recover, and never exceed, wild type replication efficiency by compensatory mutation.

But vaccine-adapted mutants need not have lower replication abilities than wild type. For instance, pathogens allocating more resources to smokescreen molecules might grow less aggressively than wild type pathogens in unvaccinated individuals, but they might also grow better in vaccinated hosts precisely because of the smokescreen. Overexpression of immunomodulatory substances might be favored in vaccinated individuals, while overexpression in unvaccinated hosts might kill them. Similarly, higher replication rates may enable more transmission and longer persistence in the face of vaccine-induced immunity. Indeed, as we have mentioned above, intrinsically more virulent vaccine-breakthrough strains have been seen in feline calicivirus, Marek's, and infectious bursal diseases.

Even epitope mutants need not be less virulent. Many epitopes are often themselves virulence factors. A vaccine-adapted strain might have immune-evading epitope conformations that also make it highly efficient at binding host cells and hence too pathogenic to be favored by selection in an unvaccinated world. Thus there is no a priori reason to assume that vaccine-adapted variants will necessarily be less virulent. They may not have evolved in an unvaccinated world precisely because of their excessive virulence.
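The point made at the start of this section, that small differences in relative fitness readily produce large changes in gene frequencies, is easy to quantify. Below is a minimal sketch of deterministic haploid selection; the selection coefficient, starting frequency, and function name are illustrative choices, not values from the chapter:

```python
# Deterministic haploid selection: a variant with relative fitness 1 + s
# competes against the resident strain. All numbers here are illustrative.

def generations_to_reach(p0, s, target):
    """Generations for the variant's frequency to rise from p0 to target
    under a constant selection coefficient s (discrete generations)."""
    p, gens = p0, 0
    while p < target:
        # Standard recursion: new frequency after one round of selection.
        p = p * (1 + s) / (p * (1 + s) + (1 - p))
        gens += 1
    return gens

# A 5% fitness edge carries a variant from 0.1% to 99% frequency in a
# few hundred pathogen generations; a 10% edge takes roughly half as long.
print(generations_to_reach(0.001, 0.05, 0.99))
print(generations_to_reach(0.001, 0.10, 0.99))
```

For a fast-replicating pathogen, a few hundred generations can elapse within a handful of transmission seasons, which is why modest fitness differences under vaccine-imposed selection can matter on public-health timescales even when each mutant's per-generation advantage is small.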

Predicting evolution

Predicting the direction vaccine adaptation will take is extremely challenging, because a lot of quantitative biological data are required. First, data are needed on the fitness costs and benefits associated with putative vaccine-favored mutants in vaccinated and non-vaccinated hosts. Whether a particular mutant spreads depends on the relative magnitudes of these costs and benefits. Costs and benefits vary widely with host, pathogen, and epidemiological circumstances and are unlikely to map easily onto a categorization of the cellular and molecular mechanisms involved in vaccine adaptation. Our argument that malaria vaccines could favor the spread of more virulent malaria parasites is based on costs and benefits of virulence that we measured in laborious experimental work with rodent malaria (Fig. 11.4). While we expect the logic will apply to other species of malaria and to other diseases where virulence is intimately linked with transmissibility, different natural histories can generate different predictions (e.g., Ganusov and Antia 2006).

Even very good knowledge of the fitness costs and benefits of vaccine adaptation is insufficient to predict evolutionary outcomes: a whole host of epidemiological factors are also important (Restif and Grenfell 2007). For instance, what is the level of natural immunity, and how will vaccination change this? What effect will vaccination have on levels of herd immunity? What fraction of the pathogen population lives in non-immune versus immunized hosts? What subset of the pathogen population does the vaccine cover or target? How does vaccine coverage alter the force of infection of the disease? For pathogen populations in which a rich strain structure interacts with virulence evolution, predictions become very problematic, particularly because complex population dynamics typically follow a perturbation of the epidemiological system. Evolutionary prediction requires disease-specific models parameterized by a very good knowledge of relevant details. We doubt that simple generalities will emerge.

For instance, it has been argued that it is highly desirable to have vaccines that selectively remove virulent strains, leaving mild strains to circulate and induce supplementary protection (the so-called 'virulence antigen strategy': Ewald 1994, 1996; Soubeyrand and Plotkin 2002; Ebert and Bull 2003). This sort of generality makes us nervous. Vaccines targeted at virulence determinants will not always lead to reduced virulence. Toxoid vaccination, for example, selectively targets parasite toxins.
PATHOGENS: RESISTANCE, VIRULENCE, ETC.

Anti-toxin immunity might indeed select against toxin production, but the production of more toxin may be a way for the pathogen to retain the resource-acquisition or immunosuppressive benefits of the toxin in the face of anti-toxin antibodies. Toxin epitopes could also evolve. The optimism behind the virulence antigen strategy may be particularly misplaced when virulent strains are targeted at antigens unconnected with the virulence. For example, if the capsular polysaccharides of Streptococcus pneumoniae and other pathogenic bacteria such as Neisseria meningitidis and Haemophilus influenzae are not the only cause of strain variation in virulence, vaccines directed at the currently more virulent serotypes could simply prompt capsular switching. In that case, virulent strains would take on non-vaccine capsular types (Maiden and Spratt 1999).

The classic case invoked in support of the virulence antigen strategy is diphtheria. In Romania, diphtheria evolved to be less virulent after the introduction of the toxoid vaccines because toxigenic strains decreased in frequency (Fig. 11.2). As we described earlier, the standard explanation for this is that anti-toxin vaccination greatly reduced the resource acquisition benefits of the toxin, making the metabolic costs of producing it not a price worth paying (Soubeyrand and Plotkin 2002). This argument may be correct. But an alternative expectation is that toxin production (and hence virulence) increases (that is, that tox++ strains should evolve) because the benefits of increased resource acquisition via toxins can be had with less risk of death because the host is now protected by the vaccine (Gandon et al. 2001). The contradictory predictions arise because the two arguments differ in what is considered the cost of virulence: host death or the metabolic costs of toxin production. In fact, increased or decreased virulence can evolve depending on the relative magnitude of the two costs (Gandon et al. 2002; Read et al. 2004). Thus, one needs to know a lot to predict what will happen.

In the case of diphtheria, one can of course ask the empirical question of what actually happened. Without doubt, the incidence of diphtherial disease has declined in the face of vaccination, but what we want to know is what evolution has occurred.
This is a question about the frequency of the toxigenic strains in the bacterial population. So far as we know, the only relevant data are those reproduced in Fig. 11.2. Frustratingly, the study that generated those data has apparently never been published in the primary literature, and so far as we know, there are no other such data published. However, we note that diphtheria case fatality rates have remained unchanged in the United States despite 60 years of widespread vaccination (Mortimer and Wharton 1999), and they have actually increased in Delhi (Singh et al. 1999). It is quite possible that geographically variable evolutionary outcomes are occurring in diphtheria.

In the veterinary context, different evolutionary trajectories have been seen even within the same host–pathogen combination. Epitope variants of infectious bursal disease virus were responsible for vaccine breakthrough in the U.S. poultry industry; in Europe, hyperpathogenic variants were responsible (van den Berg 2000; Rautenschlein et al. 2005; Le Nouën et al. 2006). We reiterate: very system-specific analyses are needed to predict evolutionary trajectories. We doubt that simple generalities will emerge.
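Several of the epidemiological quantities invoked in this section, notably the force of infection and how vaccine coverage changes it, can be made concrete with even a toy compartment model. The sketch below assumes a simple SIR-type pathogen with host turnover and a 'leaky' vaccine that merely scales susceptibility; all parameter values and names are illustrative assumptions, not taken from this chapter or fitted to any disease:

```python
# Toy SIR model with births/deaths and a leaky vaccine, integrated to
# equilibrium with a plain Euler scheme. Parameters are illustrative.

def equilibrium_force_of_infection(beta=0.3, gamma=0.1, mu=0.02,
                                   coverage=0.0, leakiness=0.5):
    """Return the equilibrium force of infection beta * I.
    'coverage' is the fraction of newborns vaccinated; 'leakiness'
    scales the susceptibility of vaccinated hosts (1 = useless vaccine,
    0 = fully sterilizing)."""
    S, V, I = 1.0 - coverage, coverage, 1e-4   # fractions of the population
    dt = 0.1
    for _ in range(200_000):                   # long enough to equilibrate
        lam = beta * I                         # force of infection
        dS = mu * (1 - coverage) - lam * S - mu * S
        dV = mu * coverage - leakiness * lam * V - mu * V
        dI = lam * S + leakiness * lam * V - (gamma + mu) * I
        S, V, I = S + dt * dS, V + dt * dV, I + dt * dI
    return beta * I

# Without vaccination the force of infection settles near 0.03 for these
# parameters; 80% coverage with a half-leaky vaccine roughly halves it
# without eliminating transmission.
print(equilibrium_force_of_infection(coverage=0.0))
print(equilibrium_force_of_infection(coverage=0.8))
```

A real analysis of the kind discussed in the text would add strain structure and virulence-dependent transmission and mortality. The point of the toy model is only that coverage feeds back on the force of infection, which in turn shapes the selection pressures described above.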

Watching evolution

The difficulties of prediction make it even more important that we watch and learn from the experiments already underway. One of the most obvious ways of doing this—taking samples from clinical cases of vaccine failure—is often done. However, to make sense, such analyses need to be set into context: samples of pathogens from vaccinated and unvaccinated people, and from disease cases and asymptomatic infections, are required. But even these are not sufficient to predict the direction of pathogen evolution, because there are population-level influences, such as herd immunity, at play: thus one needs to measure change at the population level.

To detect or study evolutionary change, random samples of a pathogen population are needed from each of those four groups before widespread vaccination is introduced, and then with successive samples over subsequent decades. The existence of such time-series data for pertussis (van Loo et al. 1999; Hallander et al. 2005; Hardwick et al. 2007) is one of the reasons it has played such an important part in discussions of vaccine-driven evolution. There are encouraging signs that such collections are being made for at least some other human diseases (e.g., Pebody et al. 2006). We believe this should be routine for all human diseases and for many veterinary diseases when new vaccines are being introduced.

Genetic change is easiest to study where relevant genes are known: molecular epidemiology on epitope variants and known virulence determinants is relatively straightforward. However, it is much harder when relevant genetic variation is unknown, as is the case for many virulence determinants. In these cases, what is needed is some phenotypic marker of virulence (e.g., determination of the frequency of toxigenic strains of diphtheria), but in many cases, virulence can only be assayed by comparing lines inoculated into a laboratory standard host. The ability to do these assays is what enabled the evolution of myxomatosis to be so successfully studied (Fenner and Fantini 1999).

It will be extremely challenging to detect virulence changes for human diseases. Against the background of falling disease incidence that should follow from the introduction of effective vaccination, it may be rather late in the day when increases in virulent breakthrough strains are noticed. There is also usually no ethically acceptable experimental host in which the virulence of strains can be compared. Comparisons of case fatality rates over time are confounded by changes in clinical medicine and other environmental changes, including changes in disease cofactors. It is substantially easier to track virulence changes in the veterinary context, where experimental infection of relevant hosts is possible. It may be no accident that this is where increasingly virulent strains in vaccinated populations have been detected (Marek's disease virus, infectious bursal disease virus, and feline calicivirus).
Based on our discussion above, we think surveillance and archiving of pathogen samples is especially warranted for infectious diseases where existing or new vaccines:

• fail to prevent transmission, or where immunity wanes and vaccinated hosts become leaky;
• target a subset of strains in a population;
• target virulence determinants;
• target epitopes not normally seen by natural immunity;
• involve novel technologies that, for instance, induce better than natural immunity or induce immune effectors not normally evoked;
• could be overcome by increased production of virulence factors, such as toxins and immunosuppressive substances; or
• provide prophylactic vaccination in normally naïve populations and thus constitute a major new source of immune selection on pathogens.


Assessment of the potential of vaccination to prompt virulence evolution should begin by considering what selection pressure prevents more virulent strains from dominating current populations, and then asking (1) Will vaccination relax that? and (2) Do virulent strains have a fitness advantage in vaccinated hosts? For diseases where both answers are yes, non-sterilizing vaccination could generate more virulent pathogens.

Coda

Society has developed very robust procedures for testing the safety and efficacy of vaccines for individuals. It is much harder to assess the long-term consequences of vaccine use: which vaccines will be undermined by evolution, and which will create more or less virulent pathogens. By definition, evolutionary experiments take some time, and replication at the whole-population level is problematic. Mathematical models clarify thinking but cannot yield the certainty of a clinical trial. Nonetheless, the complexity of the issue does not make the problem go away. In our view, vaccine evaluation processes should involve evolutionary assessments at all points in the vaccine pipeline, from design to rollout and beyond. In particular, during early trial stages, data can be gathered on the effects of vaccination on transmission of wild type pathogens, and on whether more virulent strains are likely to benefit from a vaccinated world. Such data are a necessary first step to evaluating evolutionary risk and deciding among competing vaccine strategies.

Research on vaccine-driven evolution is in its infancy, and there are many open issues. Some of these involve vaccine design. Can we prevent the evolution of antigenic targets by focusing on naturally invariant epitopes? Do broad-spectrum multitarget vaccines make epitope evolution less likely—and if so, virulence evolution more likely? Other questions involve the evolutionary experiments currently underway: will leaky vaccination of poultry against avian influenza lead to the evolution of strains more virulent to humans (so-called 'monster' strains; Anon 2004)? Will the efficacy of vaccines against bacterial capsular polysaccharides decline because of serotype replacement and capsular switching? Do changing farm practices lead to pathogen evolution that undermines current vaccines? What will be the evolutionary consequences of vaccines targeted at specific disease syndromes such as cerebral malaria or pregnancy-associated malaria?

Vaccines have been and continue to be the most successful infectious disease control measure after hygiene. Yet it seems likely that at least some national immunization programs are going to have to be adjusted in response to pathogen evolution (Kimman et al. 2006; Radford et al. 2006), and agricultural vaccines have already been rendered useless by pathogen evolution. If the repeated evolution of drug resistance has taught us anything, it is that it is better to think about evolution in advance.

Summary

1. The evolution of vaccine-resistant pathogens is not obviously as big a problem as the evolution of drug resistance. We argue that it is nonetheless a problem that is likely to grow. There is no reason for complacency.
2. Vaccine-driven pathogen evolution has been seen in several infectious diseases.
3. Vaccine-induced immunity does not simply substitute for the selective pressures imposed by infection-induced immunity. Vaccines can alter the immune landscape experienced by pathogens, and hence their evolution, by targeting subsets of strains in a population, reducing the number of fully susceptible individuals in a population, and creating or expanding classes of semi-immune hosts.
4. Vaccines of the future are likely to evoke novel immune pressures. These novel vaccine technologies will likely impose completely novel selection for resistance, as drugs do.
5. Vaccination against the acute childhood diseases such as smallpox, polio, and diphtheria has occurred for decades without being undermined by pathogen evolution. However, these diseases were easy targets: natural immunity was evolution-proof; all vaccination needed to do was to induce something very similar.
6. Infectious diseases now under assault by vaccination are different: natural infections induce leaky, often strain-specific immunity that usually wanes. Vaccines against these diseases will likely induce immunity to which natural selection has already found solutions.
7. Some agricultural vaccines have already failed in the face of pathogen evolution (e.g., Marek's disease), and the jury is still out on others, including some against human diseases (e.g., pertussis). It may take decades for vaccine resistance to become apparent.
8. A wide variety of pathogen phenotypes can be favored by natural selection in vaccinated populations. Most biomedical research has concentrated on epitope variation, but the evolution of increased virulence may also occur when vaccination relaxes the natural selection against virulence.
9. It seems likely that vaccines could provoke the evolution of enhanced immunosuppression and of changes in patterns of antigenic variation, tissue tropism, and invasion pathways. There has been little analysis of these possibilities.
10. Predicting the evolutionary consequences of vaccination in advance is extremely difficult. There may be no population-wide health implications of vaccine-driven pathogen evolution, or it could improve or worsen disease burdens. Anything is possible.
11. Evolutionary analysis is particularly warranted where vaccines are leaky, target subsets of strains or virulence determinants, involve novel technologies, or relax selection against virulence.
12. Vaccination is one of the most cost-effective methods of public and animal health improvement. Continuing past successes and realizing the full potential of vaccination requires evolutionary considerations at all stages of vaccine design and implementation.

Acknowledgments

This chapter was written while AR was at the Wissenschaftskolleg zu Berlin. Our thinking on the topic has been sharpened by discussions with V. Barclay, T. Day, S. Gandon, K. Grech, and M. Maiden. Our empirical work was funded by the Leverhulme Trust, the Royal Society, the Universities of Cambridge and Edinburgh, and the Wellcome Trust. This work is published with the permission of the director of KEMRI.

C H A P T E R 12

The evolution and expression of virulence Dieter Ebert and James J. Bull

Selection has nothing to do with what is necessary or unnecessary, or what is adequate, for continued survival. It deals only with an immediate better-vs.-worse within a system of alternative, and therefore competing, entities. It will act to maximize the mean reproductive performance regardless of the effect on long-term population survival. It is not a mechanism that can anticipate possible extinction and take steps to avoid it. —George C. Williams (1966)

Introduction

Studies of the evolution of virulence aim to understand the morbidity and mortality of hosts caused by parasites and pathogens as the result of an evolutionary process. The evolutionary perspective on virulence focuses on the costs and benefits of virulence from the points of view of both parasite and host, with the goal of identifying the selective processes at work. The degree of harm inflicted by the parasite on the host is the trait of interest, ranging from avirulent (asymptomatic) to highly virulent (rapidly killing).

Historically, virulence was considered a deleterious side effect of new host–parasite associations that would evolve to low levels with time. This simple hypothesis is no longer entertained by the field, for theoretical (Anderson and May 1982; Bull 1994) and empirical reasons (Herre 1993; Ebert 1994). It has been replaced by a set of new models. Depending on the costs and benefits of virulence, host and parasite physiology, historical constraints, and variance in biological, ecological, and epidemiological aspects of host–parasite interactions, almost any level of virulence can evolve (but not always predictably so), from highly virulent 'Andromeda strains' to mild, avirulent infections (Bull 1994). Of course, it is indeed the case that some of our most highly virulent infections are due to novel parasite associations (e.g., Ebola, bird flu, SARS, rabies, fox tapeworm), but it is no longer thought that high virulence cannot be maintained as an evolutionary optimum.

This new perspective offers many possibilities and promises, in that the evolution of virulence is now seen to have a rich set of causes. Furthermore, the new perspective means that the evolution of virulence is now a topic of more than mere academic interest: it offers a conceptual framework to professionals in many fields and may contribute to decision-making in public health fields. One major goal of understanding the evolution of virulence would be to manage virulence: to design interventions in which parasites evolve lower levels of virulence and to avoid practices that encourage the evolution of higher virulence. This hope was championed by Paul Ewald in the 1990s (Ewald 1994). Toward the goal of management, it would be desirable to discover simple rules that could be used to manage virulence evolution across many pathogens. At a less ambitious level, understanding the evolution of virulence might give insight into the future of new infectious diseases—whether the bird flu, Ebola, or SARS agents will evolve lower virulence if they become established as human epidemics, for example. So there are practical reasons to seek an understanding of virulence evolution. This chapter offers an overview of current ideas surrounding virulence and its evolution.



The emphasis is on an evolutionary rather than a mechanistic perspective. In discussing different hypotheses we have tried to consider the needs and expectations of different audiences. Public health workers and epidemiologists are concerned with the prevention of disease and the reduction of its average virulence; this interest overlaps broadly with that of evolutionary biologists, for both view populations as a whole. Agricultural biologists concerned with livestock and crop pests fall roughly into the same category. In contrast, most human and veterinary medicine deals with parasites on a case-by-case basis, aiming to reduce their harmful effects in individual patients. At first, evolutionary biology seems to have little to offer them, but as we point out below, this is not always the case.

The concepts in this chapter are based on parasites that are, for the most part, transmitted infectiously between hosts (horizontal transmission). The evolution of vertically transmitted parasites has been discussed in several reviews (Bull 1994; Ebert and Herre 1996).

Outline of this chapter

We begin with some preliminaries: possible definitions of virulence, and the relationship between virulence evolution in a test tube and the creation of live, attenuated vaccines. The latter example lies somewhat outside traditional work on the evolution of virulence, but it sets a precedent for what we hope to obtain from a general theory for the evolution of virulence, and it sheds light on its process. From there, the chapter discusses the evolution of virulence in three successive phases:

• Phase 1 is the first contact of a disease agent with a new host, as with accidental infections.
• Phase 2 occurs as a parasite has just established itself in a new host species and is far from the optimal virulence.
• Phase 3 applies to parasites that have been established for a prolonged period of time in a host species and population and evolve in response to changes in the environment, including host demography.

Although these phases are not well-defined biological categories, they help to structure current ideas about the evolution and expression of virulence. For example, the Ebola virus causes occasional local outbreaks in Africa that usually fade out quickly. Because there is no current evidence that it adapts toward its new host during these short episodes, it has never reached phase 2. The human immunodeficiency virus is a relatively young, but persisting, disease that entered the human host only a few decades ago and may still be adapting to it. This disease is still in phase 2 in many parts of the world. In phase 3 are established diseases, such as malaria, tuberculosis, and leprosy. A parasite in phase 3 that invades a new host population (e.g., human diseases introduced into the Americas from Europe and Africa) may fall back into phase 2.

In our discussion of established diseases (phase 3) we introduce the trade-off model and discuss several of its shortcomings, because this model has been the foundation of many models for the evolution of virulence. In the section on mechanisms of virulence we discuss features that future models may incorporate to improve their predictive power. Many discussions of the evolution of virulence assume that the host population does not show much variation with respect to virulence expression and that it evolves much more slowly than the parasite. Here we discuss the evolution of virulence in cases where host variability is taken into account. We also discuss the possibility that virulence gives the parasite a direct benefit. Finally, we address the question of whether we can use our knowledge about virulence to guide parasite evolution to our benefit.

Defining virulence

Central to all theories on the evolution of virulence is an understanding of what virulence is. For many purposes, virulence can be described simply as parasite-mediated morbidity and mortality. This definition encapsulates the entire range of disease symptoms that reduce host fitness, regardless of whether virulence is an evolved characteristic or a coincidental by-product. In specific cases, more precise definitions of virulence should be used and should precede any discussion of its evolution. It is useful here to introduce the distinction between virulence as damage to the host that does not directly benefit the parasite versus virulence in which the disease symptoms directly benefit the parasite—that is, in which the host damage per se (e.g., host death) is needed for parasite transmission. Most literature deals with the first form of virulence, as an unavoidable by-product of parasite replication.

Mathematical models of the evolution of virulence for horizontally transmitted parasites typically use the parasite-induced host death rate (PIHD) to measure virulence. PIHD is a fitness component of both the parasite and the host. However, whereas host death is considered a measure of virulence that directly impairs host fitness, effects of the parasite on host fitness traits such as the host's attractiveness to mating partners, host fecundity, or more generally morbidity, are not included here because they do not (necessarily) affect the parasite's transmission and, thus, are not expected to affect selection on the parasite. Nevertheless, mathematical models that use PIHD as a measure of virulence are often discussed in the sense that other virulent consequences of infection correlate closely with PIHD. As a consequence, empirical studies often use host fitness reduction (or a component of it) as a measure of virulence, including parasite-induced live-weight loss, anemia, reduced growth, and reduced fecundity (e.g., Ebert 1994; Mackinnon and Read 1999a). For most host-parasite systems this may be a reasonable first approximation of virulence, but to interpret data with regard to parasite evolution, the form and strength of the relationship between disease-related host traits and parasite fitness is important, though it is usually poorly known (Day 2002; Ebert and Bull 2003a). Even measures of host death associated with infections (e.g., case fatality rate) can be misleading for estimating PIHD (Day 2002).

Defining virulence becomes more complex when host evolution is considered. Hosts evolve to reduce the damage that parasites cause to their fitness. Some components of damage may impact both partners (e.g., PIHD), while others may not. For example, a parasite-induced reduction in the host's sexual attractiveness, which may strongly reduce host fitness, may occur without any effect on parasite fitness. Therefore, it must be included in models of host evolution and host–parasite coevolution, but not in models of parasite evolution (unless the parasite is sexually transmitted).
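The PIHD-based models referred to above typically embed virulence in the classic trade-off expression for parasite fitness (the Anderson–May form). As a sketch, using standard symbols rather than notation defined in this chapter:

```latex
% Fitness of a horizontally transmitted parasite, with virulence
% measured as the parasite-induced host death rate \alpha:
R_0(\alpha) = \frac{\beta(\alpha)}{\mu + \alpha + \gamma(\alpha)}
% \beta: transmission rate; \mu: background host mortality;
% \gamma: host recovery rate.
%
% If \beta rises with \alpha at a diminishing rate, R_0 is maximized
% at an intermediate virulence \alpha^* satisfying dR_0/d\alpha = 0:
\beta'(\alpha^*)\,\bigl(\mu + \alpha^* + \gamma(\alpha^*)\bigr)
  = \beta(\alpha^*)\,\bigl(1 + \gamma'(\alpha^*)\bigr)
```

This makes explicit why only host-fitness effects that feed back on transmission (here, the death rate \alpha shortening the infectious period) enter the parasite's optimization, while effects such as reduced host attractiveness do not.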


A different perspective on virulence comes from studies that try to explain the evolution of specific disease symptoms as directly beneficial to the parasite. For example, host castration has been directly linked to the parasite’s resource demands (Obrebski 1975; Ebert et al. 2004), impaired host mobility has been suggested to help vectors feed on the diseased hosts (Ewald 1983; Holmstad et al. 2006), and altered host behavior may benefit transmission to a second host that preys on the infected first host (reviewed in Moore 2002). These complications highlight the importance of specifying the bases of virulence when discussing a specific system.

Artificial virulence evolution and live vaccines

One of the oldest and most successful uses of virulence evolution has been technological: to create attenuated live vaccines. Live vaccines are strains of a formerly pathogenic organism that have evolved to become avirulent. They create a mild infection that stimulates the body to develop immunity without causing the disease. Live vaccines include the Sabin (oral) polio vaccine; the measles, mumps, rubella, yellow fever, and chicken pox (varicella) vaccines; one of the influenza vaccines (flu mist); the tuberculosis (BCG) vaccine; and one of the typhoid vaccines (Makela 2000).

Before the advent of genetic engineering, the standard method for developing a live vaccine was to adapt a virulent pathogen to growth in culture. As the pathogen evolved to grow better in culture, it also often evolved to grow poorly in the normal host, where its virulence was therefore reduced (Ebert 1998). The basis of this outcome is a trade-off between the ability to grow under one set of conditions and the ability to grow under another: to grow better in culture, the pathogen must sacrifice its ability to grow in the host. This method did not always succeed in creating a safe vaccine, and it cannot yet be predicted how much the pathogen must adapt to culture to become attenuated in the host. But the method was robust enough to succeed in many cases.

The low virulence of attenuated vaccines can be reversed by evolution. If the vaccine strain is again allowed to transmit between hosts, natural selection may promote the evolution of variants that grow better and re-acquire high virulence. Reversion to virulence, often observed for the Sabin vaccine, has caused several local polio epidemics (Kew et al. 2004).

The artificial evolution of virulence during live vaccine creation is not typically considered in treatises on the evolution of virulence. Yet there are several reasons to acknowledge it here. First, the creation of attenuated vaccines and the reversion to high virulence illustrate that virulence can evolve rapidly in both directions; reversion to high virulence upon escape of a vaccine strain further indicates that virulence is tied to some form of 'optimum'. Second, practical rules of virulence evolution can be simple, although we must acknowledge that the evolution of reduced virulence under radically changing growth conditions may be easier to predict than the evolution of virulence within a single host in response to changes in population structure. Third, the results support the basic idea that higher virulence (and in some cases transmission) is associated with higher parasite growth rate (Ebert 1998). So this example is useful for establishing several precedents and offering hope in managing the evolution of virulence.

The three phases of the evolution of infectious diseases
In this section we consider different phases in the adaptation of a parasite to a host. These stages are in essence a progression of disease 'emergence' in a new host, and the progression directly shapes what might be expected in the evolution of virulence.

Phase 1: Accidental infections
Many pathogens can infect and cause disease in a host that is not part of the normal transmission route but is a dead end for the parasite. The accidental infection may fail to cause secondary infections in other individuals (as with rabies, Lyme disease, West Nile virus, and anthrax in humans), or it may create a short chain of infections in the accidental host that quickly dies out (e.g., bird flu, SARS, Ebola, and pneumonic plague in humans).

Accidental human infections can be particularly virulent. Rabies, Ebola, SARS, bird flu, and pneumonic plague all have case fatality rates of 50% or more. Untreated alveolar echinococcosis caused by the fox tapeworm, Echinococcus multilocularis, results in mortality that approaches 100% over a course of 15 years (Ammann and Eckert 1996). Related to accidental infections are emerging infectious diseases. They are usually defined as diseases of infectious origin whose incidence in humans has increased within the past two decades or threatens to increase in the near future. Chapter 16 in this book deals in more detail with emerging infectious diseases. The level of virulence expressed when a parasite infects a new host for the first time is not expected to follow evolutionary principles based on natural selection. Because the genes responsible for virulence did not evolve under the conditions in which they are expressed, their effect on the host is not predictable (Levin and Svanborg-Edén 1990). The botulinum toxin produced by Clostridium botulinum did not evolve to kill people. The blindness caused by Toxocara canis did not result from evolution within human hosts. Thus virulence in novel hosts does not represent an equilibrium for parasite fitness. Furthermore, virulence in novel hosts is unlikely to be associated with high transmissibility (which accounts for the eventual die-out). It follows that the virulence of parasites newly introduced into humans cannot be managed or directed, although preventive measures may be used to reduce their harm or to prevent infections (e.g., vaccines). In one sense, accidental diseases are rare. Given the vast number of parasites with which humans have nearly daily contact, it is clear that only a small minority of them are able to infect hosts other than those they usually live on. Nearly all hosts are resistant to nearly all parasites. 
The frequent opportunistic infections suffered by immune-compromised people testify to the power of the immune system to protect us from most parasites. Although accidental infections are rare, the same zoonotic disease may emerge repeatedly, posing a severe health risk for those exposed, for example people working in agriculture. Host switches are facilitated by frequent contact between the novel and the reservoir


host, as when they share resources or use the same habitat (Epstein et al. 2003). Accidental infections seem more likely when the reservoir host and the novel host are phylogenetically related. Thus, humans become more easily infected with parasites from other primates than from rodents, but more easily with rodent parasites than with fish or even invertebrate parasites. This does not exclude the possibility of a host switch between distantly related hosts, but makes it a rather unlikely event (Krauss et al. 2003). Experimental data suggest that accidental infections are mostly (but not invariably) less virulent than are established ones. The relevant studies (often published under the rubric of local adaptation) compare well-established host–parasite combinations from the same geographic location with novel combinations from different geographic locations. In most cases, infectivity and virulence are, on average, higher in established associations than in novel ones (Lively 1989; Ebert 1994; Morand et al. 1996). There can, however, be considerable variation around this average (Ebert et al. 1998; Kaltz et al. 1999): a novel combination may be avirulent on average, but may occasionally be highly virulent (Ebert 1994). The low virulence of some accidental infections has been of practical value. Two hundred years ago, Edward Jenner used the cowpox virus to immunize humans against the related smallpox virus. The cowpox virus is largely avirulent in humans, but it is similar enough to the human smallpox virus to induce immunity against it. Cowpox was an accidental infection of women who milked cows, and the immunity of those women to smallpox led to the inference that cowpox was protective against smallpox. Conversely, the high virulence of some (deliberate) accidental infections has also been of practical value. Biological control efforts have at times used extremely virulent parasites of an unwanted pest species (e.g., rabbits released into Australia). 
The virulent agents are usually found as parasites of related species or strains and happen to be highly virulent in the pest species. This protocol was used in selecting the myxoma virus for release into Australian rabbit populations (Fenner and Ratcliffe 1965). Collectively, these applied examples reveal that


there is no universal pattern for the virulence of accidental infections. What, then, can be said about the virulence of new diseases? There is nothing that can be predicted based on evolutionary principles. There is certainly a bias in reporting novel diseases: more virulent ones are more likely to be noticed and reported. Because harmless infections are not noticed, we get the impression that novel diseases are usually virulent (Ebert 1994).

Phase 2: Evolution of virulence soon after successful invasion
Host switching appears to be common in the history of many infectious diseases when considered over long evolutionary time (see Chapter 16). The successful invasion of a new host species will probably be preceded by many accidental infections that die out (phase 1). To enter phase 2, an accidental infection of one individual spreads to another host, then another, forming a continual transmission chain that successfully invades the new host population. At least initially, this spread will constitute an epidemic, in which the number of infected hosts increases (in contrast to the endemic phase, in which the number of infected hosts is relatively constant, even though new infections are continuing). The invading parasite will generally not be well adapted to the new host, and consequently there will be rapid evolution of the parasite and possibly the host. Virulence may evolve rapidly in this phase. Cases in which a parasite switches hosts and rapidly causes a novel, spreading disease are rare. The fox tapeworm and the Ebola virus have so far failed to evolve into spreading human diseases. In contrast, the human immunodeficiency virus (HIV) is one of the rare cases that has established itself in the human population and is still in the initial epidemic phase in parts of the world. Experimental work and field observations have yielded several examples that illustrate the evolution of virulence and make it possible to study the evolution of diseases that emerge from novel parasite–host associations. Phase 2 also applies to the origin of novel variants of a pathogen in the old host. These are



extreme mutants of existing parasites that are able to start epidemics. For example, a variant with novel antigenic properties may spread through the host population as if it were a completely new parasite, unaffected by existing host immunity and other defenses. Thus, phase 2 is a broad category to represent parasites whose dynamics are initially (strongly) epidemic, regardless of the reason. Virulence and other phenotypes may be far from optimal when such mutants arise, but their advantage in one phenotype (e.g., immune escape) may outweigh other suboptimal traits, such as excessively high virulence, so that they spread nonetheless. Evolution may subsequently change these suboptimal traits. In contrast to parasites switching hosts and entering phase 2, novel variants of old diseases spreading epidemically and causing a threat 'from within' may be rather common. The classic example of virulence evolution in phase 2 is the European rabbit–myxoma virus system. The highly virulent myxoma virus was isolated from a South American hare species and released for biocontrol of a rabbit pest in Australia. Over the following years the virus first declined rapidly in virulence. Later (in what was probably phase 3) virulence increased slowly again, as tests on non-coevolving control rabbits showed. At the same time the wild rabbit evolved to suffer less from the original virus (Fenner and Kerr 1994). In this case, the virulence of the original virus, which was deliberately chosen to be a virulent pest control agent, was apparently well above the optimal level. This impressively documented example shows that virulence can evolve rapidly with large changes—the initial decline in virulence took only a few years—and that hosts evolve to reduce the costs of parasitism. Thus, virulence is influenced by both host and parasite evolution. We will return to this example when discussing the trade-off model.
There are few natural examples of phase 2 evolution that have been documented as well as the rabbit–myxoma system. However, there are a number of serial passage experiments that serve to illustrate what happens if a parasite is evolving under new conditions. Serial passage studies reveal the changes in virulence associated with

parasite adaptation to novel hosts or novel culture conditions.

Serial passage experiments
In the earlier discussion of serial passage experiments, it was noted that virulence in a former host often declines when a parasite is adapted to a new environment. Here we note that the flip side of this evolution is also true: when a parasite is adapted by serial passage to a new host, virulence in that new host typically increases. While this offers a tool for studying the evolution of virulence under experimental conditions, it has obvious limitations with regard to its application to natural populations. Typically, serial passage experiments involve the artificial infection of a novel, but related, host and subsequent transmission from one individual to the next in that new host species (e.g., by syringe transfer of blood). The host strains used are usually well defined and of low genetic diversity (inbred lines, full-sib families, clonally propagated cell lines). Host-to-host transmission is controlled by the experimenter, and selection for the natural mode of transmission is relaxed. Despite huge variation in the purpose and methodology of serial passage experiments, several results are consistent (Ebert 1998):
• Virulence usually increases during serial passage in the new host. It can evolve rapidly and on time scales of weeks or months. This effect can be so strong that even selection for reduced virulence during serial passage experiments results in an increase in virulence (Mackinnon and Read 1999b).
• The increase in virulence depends on the host's genotype. As noted earlier, parasites passed through one host type become 'attenuated' or show reduced virulence in the original host and may thus serve as a vaccine in that host.
• The increase in virulence likely results from within-host evolution, which drives an increase in within-host growth rate and thus virulence (Bull and Molineux 1992; Ni and Kemp 1992; Novella et al. 1995).
Competition trials between parasite strains with different within-host growth rates have shown that the most rapidly growing strains out-compete slower-growing strains (Ni and Kemp 1992; Novella et al. 1995). Thus, it seems that within-host competition between parasite genotypes drives


within-host growth rate, and that the growth rate is positively correlated with virulence. Given these findings, the question arises, 'Why does virulence not increase under normal, non-passage conditions?' The answer may relate to the evolution of between-host transmission. During serial passage experiments, the experimenter ensures host-to-host transmission among genetically homogeneous host lines (e.g., inbred lines, cell cultures) regardless of the level of virulence, so selection for host-to-host transmission is relaxed and adaptation to specific host lines is favored. Thus, serial passage experiments mimic endless within-host growth. Under such conditions, within-host competition drives selection for increased growth rates, a process that might have costs in terms of reduced host-to-host transmission (Ebert 1998). The examples of parasites spreading in novel hosts suggest several important insights into the evolution of virulence. Virulence can evolve quickly up or down when the parasite faces a new host from which it is able to transmit. While within-host competition is well established as the mechanism for the evolutionary increase in virulence, the mechanisms that reduce virulence are not yet understood. The trade-off model discussed in the following section is a possible solution.
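The core of this argument, that faster-growing strains displace slower ones when host-to-host transmission is guaranteed by the experimenter, can be sketched numerically. The following toy model is illustrative only: the growth rates, passage number, and inoculum scheme are hypothetical assumptions, not values from the experiments cited above.

```python
# Toy model of within-host competition under serial passage: two strains
# grow exponentially within each host, and a fixed-size sample seeds the
# next host, so the faster-growing strain rises in frequency regardless
# of any cost it might pay in natural host-to-host transmission.
# All parameter values are hypothetical.

def passage(freq_fast, n_passages, r_fast=1.2, r_slow=1.0, growth_time=5):
    """Frequency of the fast-growing strain after n_passages transfers."""
    for _ in range(n_passages):
        fast = freq_fast * r_fast ** growth_time
        slow = (1.0 - freq_fast) * r_slow ** growth_time
        freq_fast = fast / (fast + slow)  # resampling to a fixed inoculum
    return freq_fast

# An initially rare fast strain approaches fixation within ten passages.
print(round(passage(0.01, 10), 3))  # 0.989
```

With a mere 20% per-generation growth advantage, the fast (and, in the trade-off view, more virulent) strain goes from 1% to about 99% in ten passages, which is the sense in which serial passage mimics endless within-host growth.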

Phase 3: The evolution of optimal virulence
After persisting for some time in a new host population, the parasite should approach equilibrium virulence, and evolutionary changes in virulence should become less striking. The parasite may then reach a selection boundary, in which trade-offs among the parasite's various fitness components constrain further evolution of virulence and transmission (Ewald 1980; Anderson and May 1982). The main difference between phase 2 and phase 3 is whether the parasite has evolved to the trade-off boundary. Anderson and May (1982) found evidence consistent with a trade-off between parasite-induced host mortality (= virulence) and host-induced parasite mortality (= host recovery) in the data from the rabbit–myxoma virus system. Highly virulent myxoma strains cause many lesions (presumably necessary for transmission), are cleared slowly by the


host immune system, and quickly kill the rabbit; whereas mildly virulent strains are quickly cleared. The fittest strain has intermediate virulence and is cleared at intermediate rates. Epidemiological data showed that strains with intermediate virulence evolve to dominate the system, a finding supported by mathematical models (Fenner and Ratcliffe 1965; Anderson and May 1982). In its simplest form, the optimum in the tradeoff model is found by maximizing the number of secondary infections produced by a primary infection. Such selection is based entirely on maximizing between-host transmission. In most cases the trade-off is between ‘how long’ and ‘how fast’ the parasite can transmit (Mackinnon and Read 2004). High virulence, which is assumed to have a high rate of transmission, impedes net or total transmission because it kills the current host too quickly, impairs host interactions with other hosts, or induces the host immune system to react strongly. These positive correlations between virulence, within-host growth, and transmission rate are well-supported with data from many organisms (Bull and Molineux 1992; Ebert and Mangin 1997; Lipsitch and Moxon 1997; Thrall and Burdon 2003). In some cases, they can be interpreted as suggesting the diversion of resources from the host into the reproducing parasite. The trade-off model is versatile and makes it possible to predict changes in optimal virulence for different conditions. For example, Read and Mackinnon discuss possible evolutionary consequences of vaccination programs for virulence evolution in Chapter 11. Most commonly, it has been recognized that changes in external conditions can favor a different level of virulence, and a parasite with a short generation time can evolve a new level quickly. 
There are some recognized external effects on the virulence optimum. A short host life span favors higher parasite virulence, although we have no obvious reason to think that host life span will change enough to select for meaningful changes in parasite virulence. An empirical test of this prediction failed, however (Ebert and Mangin 1997), possibly because of an epidemiological feedback that arose as a consequence of altered host longevity. The idea that host longevity shapes virulence evolution is based on the notion that host death



curtails parasite transmission. Likewise, every other factor that influences the parasite's life span may influence virulence evolution. A further prediction of the trade-off model is that higher virulence is favored when the number of infected hosts is increasing rapidly (an epidemic); relatively low virulence is favored when the number of infected hosts is static (endemic) (Lenski and May 1994; Bonhoeffer et al. 1996; Bull 2006). Empirical data to support this prediction are lacking. A common misunderstanding of the trade-off model is that the opportunity for transmission affects virulence evolution, such that lower host densities per se were suggested to lead to a decrease in the optimal virulence (Ewald 1994). However, as pointed out by Lipsitch et al. (1995), when the parasite has reached dynamic equilibrium, each infected host gives rise to one new infected host on average, so the effect of host density drops out. Host density does affect the optimal virulence during the epidemic phase, just not when the parasite and host have reached dynamic equilibrium (Bull 1994; Lenski and May 1994).

Experimental tests
Heineman and Bull (submitted) experimentally evolved virulence (time to lysis) specific to different host densities in the bacteriophage T7. In this study, the virus population was maintained in the epidemic phase and not allowed to reach dynamic equilibrium, so the virulence optimum differed between high and low host density: high host density allowed the phage to spread quickly, favoring high virulence; low host density greatly slowed phage growth and favored low virulence. They found mixed support for a trade-off model explaining virulence evolution. At high host density, expected to select for high phage virulence (rapid lysis), the experimentally evolved outcome was quantitatively close to the predicted optimum. At low host density, expected to select for low phage virulence (late lysis), the phage evolved but still retained much faster lysis than was predicted.
In general, the phage's evolution between the conditions of high and low host density was less extreme than predicted. The central prediction of the trade-off model—that parasite lifetime transmission success peaks at intermediate virulence—has recently been shown

in a case study using a parasite that builds up progeny in the host and then must kill the host to release its progeny (as does bacteriophage T7). Such parasites are very well suited to be model species for the study of virulence, for one can estimate their lifetime transmission success and relate it to other variables. Jensen et al. (2006) showed that for a bacterial pathogen in a planktonic crustacean host, lifetime spore production of the bacterium peaks at an intermediate time to host death (see Box 12.1). This study is in line with the idea that virulence evolves to an optimum, but it does not allow us to conclude that the mechanism proposed by the trade-off model is the driving force in shaping this optimum.

Within-host evolution
The trade-off model has been extended along various lines, most importantly to include within-host selection, i.e., competition among parasites within hosts. As discussed in the previous section, it is thought that within-host competition usually selects for higher parasite growth rate, and thus for higher virulence (Bremermann and Pickering 1983; Nowak and May 1994; van Baalen 1994). This has led to predictions that factors that increase the rate of multiple infections (i.e., higher host density, higher transmission rates, and low host background mortality, which allows for a longer life span and thus more multiple infections) also select for higher virulence (van Baalen and Sabelis 1995; Frank 1996; Ebert and Mangin 1997). Likewise, under conditions of increased competition among unrelated parasite genotypes, for example with less spatial structure and less vertical transmission, virulence may increase (Frank 1996). The picture may change if co-infecting parasites cooperate to exploit their host (Turner and Chao 1999; Brown 2001) or evolve to exploit each other, as is the case with defective interfering particles (DIPs) of viruses, which are, in essence, parasites of viruses (Bull 1994).
Both phenomena may lead to reduced virulence, but it is currently not clear whether they are representative of natural systems. Trade-off models also predict that if parasites can adjust their virulence facultatively, they should express higher virulence when encountering a competitor within the same host. This prediction has been met in a malaria–mouse system (Taylor



Box 12.1 Experimental evidence from waterfleas
The trade-off virulence model predicts that transmission-stage production and host exploitation are balanced, such that the parasite's lifetime transmission success (LTS) is maximized. For parasites that suppress host reproduction, this simple model has been modified to account for the fact that they convert host reproductive resources into transmission stages (Ebert et al. 2004). Parasites that kill the host too early will not benefit from these resources, while postponing the killing of the host results in diminished returns, because the parasite grows more rapidly than its host. Therefore, killing the host after an intermediate time period results in maximal LTS. Earlier experimental studies have had difficulty finding direct evidence for maximal LTS at

Parasite spores (millions)

10 8 6 4 2 0 20


40 50 60 Time to host death (days)


Figure 12.1 Relationship between lifetime spore production of Pasteuria ramosa and longevity of its host Daphnia magna.

et al. 1998) but not in other systems (Lauria Pires and Teixeira 1997; Imhoof and Schmid-Hempel 1998; Vizoso and Ebert 2005).

Combining the trade-off model with within-host evolution
In the broad picture, intense host exploitation, and thus virulence, can be viewed as a selfish strategy favored by within-host competition but selected against by between-host competition (Antia and Koella 1994; Bonhoeffer and Nowak 1994; Bull 1994; Nowak and May 1994; van Baalen 1994;

intermediate virulence. Jensen et al. (2006) used a host–parasite system (Daphnia, a planktonic crustacean, and the bacterial parasite Pasteuria ramosa), in which the competition between host and parasite for resources is particularly strong. The parasite benefits by converting a large proportion of host biomass into parasite transmission stages (endospores). To gain more resources, P. ramosa must suppress reproduction of its host. The transmission stages produced by the parasite accumulate in the host and their number increases with the age of infection. The spores produced by the parasite are not released until host death, which permits one to estimate the parasite’s LTS accurately. To test for an optimal LTS at intermediate times to death, the authors infected individual Daphnia magna of one host clone with the bacterium and followed these individuals until their parasite-induced death. They found that the parasites showed strong variation in the time to kill their host, and that transmission-stage production peaked at an intermediate level of virulence (Fig. 12.1). Variation in time to death and LTS was shown to be at least partially based on genetic variation among parasite genotypes (Jensen et al. 2006). Another interesting finding of this study was the observation that some hosts died before the parasite had produced any transmission stages. Apparently some parasite lines were so virulent that they induced host death before they finished spore development. This may put an upper limit on virulence evolution.
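The logic of Box 12.1 can be captured in a toy calculation. Assume, purely for illustration (the functional form and parameter values are our assumptions, not Jensen et al.'s estimates), that spores accumulate with diminishing returns, S(t) = K(1 − e^(−rt)), while the host may die of unrelated causes at rate μ, so the expected LTS of a parasite killing its host at time t is LTS(t) = S(t)·e^(−μt). Killing too early forgoes spores; waiting too long risks losing them to background host death, so LTS peaks at an intermediate time.

```python
import math

# Toy LTS model (hypothetical parameters): spores accumulate with
# diminishing returns while background host mortality discounts late
# killing, so lifetime transmission success peaks at an intermediate
# time to host death, analytically t* = ln((r + mu) / mu) / r.

K, R, MU = 10.0, 0.1, 0.02  # spore capacity, accumulation rate, host mortality

def lts(t):
    spores = K * (1.0 - math.exp(-R * t))  # spores accumulated by time t
    survival = math.exp(-MU * t)           # chance the host survived that long
    return spores * survival

times = [0.5 * i for i in range(1, 201)]   # candidate killing times, in days
t_best = max(times, key=lts)
t_star = math.log((R + MU) / MU) / R       # analytic optimum, about 17.9 days
print(t_best)
```

On this grid the numerical optimum falls at 18 days, close to the analytic optimum of about 17.9; the humped shape mirrors Fig. 12.1, where spore production peaks at intermediate host longevity.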

Frank 1996). In this respect, the evolution of virulence is like other evolutionary problems in which selection operates differently at different levels of population structure, as in the classic group versus individual selection in the evolution of cooperation and altruism (Williams 1966). The level of virulence that evolves in hierarchical population structures depends on the conditions that influence the type and frequency of multiple infections and the costs imposed by early parasite death. Epidemiological conditions under which between-host competition is very important lead to low virulence, as



in vertically transmitted parasites (Jeon 1972; Bull et al. 1991; Herre 1993), while experimental exclusion of between-host transmission dynamics leads to high virulence, as in serial passage experiments (Ebert 1998; Mackinnon and Read 1999b). These cases are extremes along a continuum, on which most horizontally transmitted parasites evolve under less extreme conditions. Understanding the evolution of virulence under non-extreme conditions is a major challenge for the field in the coming years. The beauty of the trade-off model is that it makes predictions. The direction and magnitude of change may depend on the environment, which may be influenced by treatment. The chapters by Koella and Turner (Chapter 17) and by Read and Mackinnon (Chapter 11) in this book give various examples of the application of the trade-off model. In the following sections, aspects of virulence evolution not considered by the trade-off model are discussed.

Mechanisms of virulence remain to be considered
Understanding the mechanisms that generate virulence can shed light on how virulence evolves. Here we suggest that such analysis should partition the different components of morbidity and mortality rather than lumping them together into a single measure of virulence, as is done frequently in applying the trade-off model. Because such understanding has not been incorporated into most evolutionary literature on virulence—most mathematical models ignore the mechanistic basis of virulence—it is an area that begs to be developed. Here we focus on places where incorporating mechanisms is likely to improve our understanding of the evolution of virulence. For many infectious diseases a great deal is known about the molecular mechanisms generating virulence. A detailed marriage between mechanisms (proximate) and evolution (ultimate) should be possible. For example, microbiologists have identified 'virulence factors' in many pathogenic bacteria: genes that are essential for causing disease but not for the growth of the microbe. Virulence factors of bacteria are often clustered on plasmids or in 'islands' of the genome and are

easily identified by comparison of pathogenic and non-pathogenic strains. For many virulence factors their function is known in great detail. Often they are toxins or toxin-delivery systems that target host cells and interfere with the host’s normal cellular function. Although there are few attempts to incorporate such information into models of virulence, knowing the functions of the genes responsible for virulence should help in understanding the connection between benefit to the parasite (e.g., transmission or growth in the host) and virulence phenotype. This knowledge, in turn, may allow a more accurate modeling of whether and when the cost of virulence outweighs the benefit to the parasite. A general assumption that underlies most models of virulence evolution is that higher parasite densities within hosts lead to higher virulence and to higher rates of transmission. From this, a positive correlation between virulence and transmission rate is postulated, and has been found in several studies (see previous section). At a gross level, this seems plausible. Yet, there are biological reasons why this view may be misleading when dealing with more subtle variations in parasite density. The immediate cause of virulence may be the systemic levels of parasites, or it may be due to parasite levels in a specific tissue. This is particularly problematic if virulence results from a complex interaction between parasite levels and host tissue. For example, polio and bacterial meningitis are caused by the infection of tissues in the central nervous system (CNS) that are not part of the normal infection–transmission cycle. For these cases, higher parasite densities in the normal tissue of infection are usually not problematic, but infection of the CNS is. If parasite density in the normal tissue is correlated with infection of the CNS, then the generic model of ‘high parasite density = high virulence’ may work. 
If parasite density in the normal tissue is uncorrelated with CNS infection, then the model fails (Levin and Bull 1994). Virulence is an interaction between host and parasite. One might naively assume that virulence is due merely to the parasite’s killing of host cells. Yet it is increasingly realized that the causes of virulence are often an ‘over-reaction’ by the host immune system (Margolis and Levin 2007).


For example, the Spanish flu virus (influenza A), which killed up to 50 million people in the 1918 pandemic, is believed to trigger an aberrant innate immune reaction in its host, resulting in persistently elevated expression of inflammatory-response genes and a highly virulent infiltration of immune cells into the respiratory tract (Kobasa et al. 2007). Just as an allergic reaction and asthma can be fatal in the absence of a true threat, the immune response to a parasite can contribute to host morbidity and mortality in ways inappropriate for controlling the infection. Immune over-reaction may in fact put the parasite in a 'cruel bind'; it must find a way to avoid triggering a strong immune response yet still generate enough of an infection to be transmitted (Margolis and Levin 2007). How an immune over-response should be incorporated into models for the evolution of virulence has yet to be properly worked out.

Variation among hosts affects the expression and evolution of virulence
The trade-off model assumes that the level of expressed virulence reflects an adaptation of the parasite, ignoring the fact that the expression of virulence is usually the result of an interaction between host and parasite genotypes. Variation among hosts within populations is common, and virulence across different combinations of host and parasite genotypes has consistently been shown to vary markedly (Hill et al. 1991; Singh et al. 1997; Carius et al. 2001; Rauch et al. 2006). Furthermore, for many diseases only a small fraction of the infected hosts become sick, while the majority are symptom-free carriers. For example, only about 5% of humans infected with tuberculosis develop the disease. About 10% of Europeans carry Neisseria meningitidis at any given time, but only about 1 in 10,000 will develop meningitis. Only one in 200 polio virus infections leads to irreversible paralysis (WHO Factsheets 2003). In the absence of parasite evolution, virulence would decrease as a result of host selection for reduced costs of parasitism. In the absence of host evolution, parasites would evolve to exploit their hosts optimally. In a coevolutionary scenario, however, virulence is subject to antagonistic selection,


and the resulting outcome is not readily predicted (Ebert and Hamilton 1996). The expressed level of virulence in coevolving host–parasite populations may lie somewhere between the hosts’ and the parasites’ optima and may vary within the population, but epidemiological feedbacks can also produce very different outcomes (Restif et al. 2001; Gandon et al. 2002), making predictions difficult. The host-genotype specificity of virulence is enhanced when microparasites spend many generations on an individual host and adapt to deal specifically with this host genotype. Adaptation to one host genotype has been found to increase virulence on that host genotype and to reduce the parasites’ ability to exploit other host genotypes (Ebert 1998), but the generality of this is not clear. A corollary of this was suggested by a study of Garamszegi (2006): generalist species of malaria are less virulent than specialist species. Apparently, in genetically diverse host populations, parasites may have to adapt anew whenever they are transmitted, which goes hand in hand with a reduced level of virulence (Ebert 1998). This finding may—at least in part—explain the notorious sensitivity of agricultural monocultures to devastating epidemics (Barrett 1981). The cloning of livestock also needs consideration, for harmless diseases could evolve into deadly epidemics when infecting clonal herds.

Virulence has a direct benefit for the parasite

The explanations for the evolution and expression of virulence that have been discussed so far see virulence as an unavoidable by-product of parasitic infections, one linked to the parasite's other fitness components. In this sense, virulence has no direct benefit for the parasite or for the host. It is, however, possible that certain symptoms expressed by the diseased host do directly benefit parasite reproduction, survival, or transmission. Such examples of virulence may lend themselves to a practical theory, because the evolution of many traits, such as drug resistance, infectivity, and evasion of the immune system, is more easily predicted when the connection to parasite fitness is direct rather than indirect. Infected hosts


PATHOGENS: RESISTANCE, VIRULENCE, ETC.

display many diverse changes in behavior linked directly to parasite fitness (Moore 2002). Likewise, selection for avirulence in vertically transmitted parasites, where parasite fitness is directly linked to host reproduction, has been highly successful (Bull et al. 1991; Bull and Molineux 1992; Herre 1993). The idea that virulence gives an advantage to the parasite can potentially explain a wide range of disease-related symptoms, even diseases without host mortality such as the common cold, so there is a great potential for developing this line of theory. Ewald (1994) has presented some broad direct-benefit hypotheses based on mode of disease transmission. The direct benefit model and the trade-off model are not incompatible. A symptom with a direct benefit to the parasite may have side effects detrimental to the parasite. Furthermore, the overexpression of a beneficial symptom may kill the host, thus making an intermediate level of expression more favorable. Ewald (1983), for example, proposed that vector-borne diseases gain an advantage by reducing their hosts' ability to defend themselves against biting vectors. This may be achieved by increased fatigue and fever; however, the overexpression of fever may lead to unwanted host death, which is clearly not in the interest of the parasite. We have not presented many examples of disease symptoms that clearly benefit the parasite. Furthermore, the general model of direct benefit is prone to the abuse of simplistic speculation, as is illustrated by alternative explanations of disease symptoms. Coughing may promote the transmission of the parasite, but may also be a host response to clear the respiratory tract. Fever may be beneficial to the parasite, but is also recognized as beneficial to the host by impairing parasite growth (Nesse and Williams 1994).
Parasitic castration and gigantism in certain invertebrate hosts may help the parasite gain resources, or may help the host to allocate more resources to defense, if the castration is temporary (Minchella 1985; Ebert et al. 2004). To settle these arguments, quantitative data on host and parasite fitness effects are required. Unfortunately, such data are difficult to obtain, particularly in vertebrate hosts, and environmental variation may complicate the picture further.

Can we manage the evolution of virulence?

The onslaught of drug-resistant microbes in the past few decades has outstripped industry's ability to develop new drugs and has thus led to worldwide awareness of simple rules to manage the evolution of drug resistance. The use of imperfect vaccines and vaccines that do not cover the full spectrum of circulating strains is motivating models to predict and thus manage the impact of those vaccines (Lipsitch 1999; Chapter 11). Can we likewise realistically manage the virulence evolution of human and agricultural parasites? Virulence management of human diseases has been proposed as a viable strategy (Dieckmann et al. 2002). The ideas of managing virulence in human, agricultural, or natural populations have mostly been based on the trade-off model. In virulence management proposals, that model is interpreted as making it possible to predict changes in virulence when host density or demography, opportunities for transmission, or the frequency of multiple infections change. Recently we argued (Ebert and Bull 2003a,b) that this idea may be too simple and is not likely to result in significant reductions of the virulence of medically and economically important diseases, for the following reasons.
1. There are no good examples of predicted changes in virulence associated with trade-offs.
2. The few published successful experiments either used extreme conditions (e.g., comparison between vertically and horizontally transmitted diseases) or found a weak response.
3. Comparative evidence of virulence among different host–parasite combinations under different environmental conditions explains only a small portion of the variance in virulence, suggesting that simple environmental correlates play only a minor role in virulence evolution.
4. Host mortality, the single measure of virulence in most models, is not high enough for most human parasites of medical interest to have much impact on parasite evolution.
For example, human influenza mortality rates are typically less than 1%, which is not expected to be a limiting factor in the transmission to new hosts.


5. The idea of a two-dimensional trade-off may be too simplistic; the underlying genetic and physiological structure may be multidimensional. For example, trade-off models commonly ignore the immune response and behavioral and life-history changes of the hosts.
6. The trade-off model assumes constant parameter rates throughout the course of the infection. When rates early in the infection differ from those late in the infection, or when the level of virulence affects the immune response, predictions may change drastically.
7. The trade-off model ignores genetic variation among hosts and thus neglects one of the most prominent factors in disease expression.
The evolution of virulence is a young field, and we do not want to halt progress by being overly pessimistic. Instead we would like to see the field take an open-minded attitude and not fixate on trade-offs. Virulence in the trade-off model is a correlated trait, and correlated traits are notorious


for their slow evolutionary response. Few people have studied virulence traits under direct selection, where an evolutionary response should be much more rapid. For example, in the case of diphtheria (see Box 12.2), the vaccine produced direct selection against strains that produced the disease-causing toxin. As a result, the virulence of the overall population of the bacterium dropped significantly, due to a reduced frequency of strains carrying the toxin. Without specific knowledge of the biological details of this system, a general model would not have predicted this outcome, and indeed, the evolution of virulence in strains that continue to carry the toxin is a separate matter. Our view is that the best hope for virulence management will be the application of models to specific cases, and that, in contrast to the management of drug resistance evolution, there will be few practical generalities. One of the best arenas for such studies may be agriculture, where extreme crowding leads to strong selection for change.
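The trade-off model that these critiques target can be made concrete with a small numerical sketch. The functional forms below are our own illustrative assumptions, not taken from the chapter: transmission is assumed to rise with virulence with diminishing returns (beta proportional to the square root of virulence), while virulence shortens the infectious period. Net transmission is then maximized at an intermediate virulence.

```python
import numpy as np

# Illustrative trade-off model (assumed functional forms, for sketching only):
# transmission beta(alpha) = b * sqrt(alpha) rises with virulence alpha but
# saturates, while alpha adds to the rate mu at which infections end.
def net_transmission(alpha, b=1.0, mu=0.1):
    return b * np.sqrt(alpha) / (mu + alpha)

# Locate the virulence that maximizes net transmission on a fine grid.
alphas = np.linspace(1e-4, 1.0, 200_000)
alpha_star = alphas[np.argmax(net_transmission(alphas))]

# With beta proportional to sqrt(alpha), calculus gives alpha* = mu exactly:
# an intermediate optimum, neither zero nor maximal virulence.
print(round(alpha_star, 3))  # 0.1 (= mu)
```

The point of the sketch is only that a concave transmission–virulence trade-off yields an interior optimum; with other (multidimensional) trade-off structures, as argued in point 5 above, the prediction can change substantially.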

Box 12.2 Case histories

Perhaps the strongest justification for a theory on the evolution of virulence would be evidence that parasite virulence commonly evolves on a short time scale. Such observations are not easy to make, but they will hopefully be forthcoming as more attention is devoted to this subject. As noted elsewhere in this chapter, various environmental factors can affect virulence, such as nutrition, health care, and immunity. One well-documented example is measles, which has a high mortality rate, approaching 30%, in poorly nourished humans, but a vastly lower mortality rate in well-nourished people. Thus, broad social trends in environmental factors can alter virulence over time, giving the appearance that virulence has evolved when it has not. We discuss a few cases here. There are not many examples, as changes in virulence over time have rarely been documented. And the examples we do have do not clearly support any general model for the evolution of virulence; rather, they suggest that models must be specific to the details of the nature of the disease and virulence.

Diphtheria

One well-documented evolutionary change in virulence has occurred in the bacterium that causes diphtheria, Corynebacterium diphtheriae. The pathogenic form of this bacterium carries a set of genes that produce a toxin causing swelling in the throat. Widespread vaccination targeting the toxin specifically has reduced the frequency of the pathogenic form of the bacterium relative to the non-pathogenic form, in essence an evolved reduction in virulence. While this reduction in virulence can be explained post hoc on evolutionary principles, it is not clear whether the theory applied a priori would have predicted a decrease or an increase in virulence in response to the vaccine.

Transmissible gastroenteritis coronavirus (TGEV)

This virus infects the guts of pigs, causing diarrhea, and is a common source of mortality in piglets (Kim et al. 2000). A mutant form of this virus, porcine respiratory coronavirus (PRCV), differs only by a deletion and a few point



mutations, yet it infects the pig respiratory system and is often much less virulent than TGEV. Antibody cross-reactivity between the two viruses means that the population of one form of the virus interferes with the other, and it is suspected that the less virulent PRCV was responsible for the disappearance of TGEV in some pig farm areas. None of the existing evolution-of-virulence theories could have predicted either the evolution of a new tissue tropism in this virus, or the differential success of viruses according to their virulence. Indeed, the mutant strain may not have been at its virulence optimum when it first arose.

Influenza A

Perhaps the most dramatic changes in the virulence of human pathogens have been observed in the flu virus, although interpreting these changes as virulence evolution is equivocal. The flu virus causes annual epidemics and evolves rapidly in its major antigenic determinants, so the human population remains largely susceptible over time, and the mortality rates in typical years are moderately constant. Yet the mortality and extent of population-wide infection occasionally increase dramatically when 'pandemics' occur. Most year-to-year evolution in the influenza virus is within a type of virus. Antigenic type is designated H1N1, or H3N2, with 'H' referring to the antigenic type of viral hemagglutinin and 'N' referring to the type of viral neuraminidase. Three pandemics occurred in the 1900s, the most lethal being the 1918 flu, which was the first introduction of the H1 serotype. Mortality rates from this infection were not only unusually high, but the virus also disproportionately killed 20- and 30-year-olds compared to other epidemics. The mortality rate from this virus eventually dropped

Summary

1. Virulence is a complex trait. Its expression depends on the host and parasite genotypes, the evolutionary history of the association, and the current conditions.
2. The evolution of virulence by natural selection on the parasite can be partitioned into three stages: phase 1, the first contact of host and parasite, as in

to lower levels. It is not known if the virus evolved lower virulence per se, or if the human population acquired sufficient immunity to H1, but when the H1N1 type was reintroduced in the 1970s after disappearing for several decades, virulence was not appreciably high, even though much of the population had had no prior exposure. Experiments with a genetically reconstructed 1918 virus genotype in macaques suggest that this virus was so virulent because infected hosts mounted an aberrant innate immune response that was insufficient for protection (Kobasa et al. 2007). In comparison, the contemporary H1N1 virus elicits a transient and appropriate activation of the immune defense (Kobasa et al. 2007). The 1918 and the contemporary H1N1 virus differ in a number of other proteins. Thus, it is plausible that the virulence of the 1918 type in humans was indeed high for reasons other than the novelty of its antigens. Influenza occurs in many organisms besides humans. Current concerns about 'bird flu' center on an H5N1 variant that spreads rapidly and is highly lethal in birds. When it infects humans, mortality rates are 50% or more, but so far all introductions into humans have died out. Evolution-of-virulence theories can, however, potentially account for virulence in birds, which is due chiefly to one or a few amino-acid mutations in a proteolytic cleavage site in the hemagglutinin protein. Similar outbreaks of highly virulent flu strains have been reported previously, forcing the destruction of entire chicken farms. Since many of the birds transmitting these viruses are domestic and maintained at high densities, this case may be well suited to applications of evolution-of-virulence theories.

accidental infections; phase 2, the evolution toward an optimal virulence soon after successful invasion of a new host species; phase 3, evolution of virulence after the disease is well established. Most efforts to understand, predict, and manage the evolution of virulence have been applied to phase 3.
3. The most common model of virulence evolution assumes a simple trade-off between virulence and transmission and that selection optimizes the net


transmission between hosts. This model may be applied to phases 2 and 3; most efforts have focused on phase 3. Empirical data support the assumptions of the trade-off model, but do not support its predictions well.
4. Few current models include the mechanism of virulence. We suggest that models based on the biological details of specific diseases may result in better predictions, and therefore that future efforts should be directed to virulence evolution in specific cases.
5. Although host variability plays an important role in the expression of virulence, its impact on the evolution and expression of virulence has not been satisfactorily incorporated into the models.


6. We caution against the use of untested general models to guide attempts to manage infectious diseases.
7. With respect to public health and domestic livestock, considerations of virulence should focus on more than just its evolution. Dense host populations are susceptible to invasion by highly virulent parasites, even though the high virulence may not be optimal for the parasite.

Acknowledgments

We thank Steve Stearns and Jacob Koella for many detailed comments on three earlier versions of this chapter. Bruce Levin and Elisa Margolis drew our attention to some inconsistencies.


CHAPTER 13

Evolutionary origins of diversity in human viruses

Paul M. Sharp, Elizabeth Bailes, and Louise V. Wain

Introduction

The extent and pattern of contemporary genetic diversity among strains of a virus reflects the evolutionary history of that virus and may be important in the context of therapy or vaccine development. Analysis of this diversity may also provide insights into the history or epidemiology of the virus, and these in turn can provide vital clues for preventing or limiting further infection. Some viruses have had a profound effect on human history, while others have massive impact today; understanding viral evolution can help explain why. Here we focus on where the different species of viruses currently infecting humans originated, and on what factors shape the genetic diversity among strains. We begin by considering the origins and subsequent evolution of human viral diseases, then illustrate their possible histories with a closer examination of four particularly well-studied cases: herpesviruses, AIDS viruses, influenza A, and dengue viruses.

Origins of human viruses

Viruses are extremely diverse. They can be grouped according to the nature of their genetic material, i.e., single- or double-stranded RNA or DNA. But even within each of these categories there are viruses so different as to exhibit no recognizable similarity. The most important taxonomic grade for viruses is the family. Viruses are classified into more than 70 families; members of more than 20 of these families have been found infecting humans. Typically, two viruses placed in the same family contain similar

genome structures and identifiably homologous genes. Members of different families rarely share identifiably homologous genes. All cellular organisms (bacteria, archaea, and eukaryotes) can be placed in a single evolutionary tree, because they share at least some homologous genes; in contrast, no single evolutionary tree can be drawn containing all viruses— evolutionary relationships can be traced within virus families, but generally not among them. The genetic diversity of individual virus species is affected by both their own evolutionary history and that of their hosts. Ancestors of Homo sapiens were doubtlessly infected by many viruses, some of which we have inherited from them. It is also clear that many viruses must have been acquired by humans from other species over the past 5000 years or so. For example, viruses that cause ‘crowd diseases’ could not have been maintained in prehistoric human hunter-gatherer populations. The classic example is measles virus. Survivors of measles infection have lifelong immunity to the measles virus. Thus, to be maintained in a population, the virus requires a constant supply of previously uninfected individuals—typically children. Consequently, it has been estimated that measles can only persist in populations of at least 250,000 individuals (Bartlett 1957). Human populations first reached this size around 5000 years ago, in the Middle East, and it is commonly assumed that measles has been continuously circulating in humans since about this time (Cliff et al. 1993). Since humans entered the Americas more than 10,000 years ago, they were not exposed to any viruses subsequently acquired by humans in the




Old World until after 1492. It has been estimated that in the sixteenth century 90% of the native Americans died of diseases, particularly those caused by viruses such as measles and smallpox, introduced by invading Europeans (McNeill 1976). Other viruses have entered the human population very recently. HIV-AIDS is the prime example of a viral disease that appeared in the twentieth century and has spread throughout the world. Others, such as Marburg, Ebola, Nipah, Hendra, and SARS, have been recognized within humans recently, but have caused only limited outbreaks. 'Emerging' viral infections may reflect viruses that have either been transmitted to humans only recently, or have recently spread beyond their previous geographic range. For example, West Nile virus (a member of the Flaviviridae) emerged in North America in 1999 but was previously known to infect humans in the Old World; the strain that turned up in New York was very similar to those found in the Middle East (Lanciotti et al. 1999). Most recently, in 2005–6, Chikungunya virus (from the family Togaviridae) emerged on Reunion Island in the Indian Ocean, infecting nearly one-third of the population of 770,000; the strain responsible was closely related to viruses found in Central Africa (Schuffenecker et al. 2006).

Origins of diversity within human viruses

The extent of contemporary genetic diversity among strains of a virus reflects the time since they shared a common ancestor and their rate of evolution (Fig. 13.1). The pattern of this diversity may be further influenced by recombination and/or reassortment among divergent strains. The time since the last common ancestor (LCA) is influenced by several factors. In viruses that have been recently acquired from another species, and have been spreading rapidly among humans, the LCA may have been soon after cross-species transmission. However, a single 'species' of virus may reflect multiple transmission events, so that the total diversity reflects earlier diversification in the previous host. In viruses that have long infected humans, the time since the LCA is likely to reflect population genetic processes, such as random

[Figure 13.1 appears here: diversity plotted against time, with the LCA at the origin.] Figure 13.1 Origins of genetic diversity in a hypothetical virus. The current diversity reflects the time since the last common ancestor (LCA) and the rate of evolution.

genetic drift or natural selection. Under random genetic drift, through chance events a single virus becomes the ancestor of all members of the species at some later point in time. This process occurs faster when the effective population size (Ne: in this context, roughly the long-term average number of infected individuals at one time responsible for the next round of infected individuals) is smaller. Alternatively, if an advantageous mutation arises, it is expected to spread rapidly through the viral population removing most of the existing genetic diversity. Both random genetic drift and a selective sweep will be less far-reaching if the host population is subdivided. Thus, viewing the contemporary diversity of a virus, we may ask what factors have determined the ‘coalescence time’ (i.e., the time back to the LCA). Under random genetic drift (i.e., in the absence of adaptive changes) in an idealized population (without structure) the expected coalescence time is 2Ne generations. (In this context, a ‘generation’ may be the average time between host individuals in a transmission chain.) With a selective sweep this coalescence time is reduced; i.e., the LCA will have been more recent, dating to the time when the last sweep occurred. If the host population has been subdivided, the coalescence time may be longer, due to retention of different viral lineages in different host subpopulations. The rate of evolution of a virus is primarily influenced by its mutation rate, which in turn reflects the probability of mutation per round of viral replication and the average number of replications per unit time. RNA polymerases tend to be more error prone than DNA polymerases,


and so generally RNA viruses have higher mutation rates than DNA viruses. Population genetic theory indicates that the rate of neutral evolution is equivalent to the rate of mutation. Synonymous nucleotide substitutions (i.e., those that do not cause a change in the encoded protein) are likely to be neutral and thus reflect the mutation rate. In contrast, nonsynonymous substitution rates are predominantly reduced by selective constraints on protein sequences, but can also be increased by adaptive changes; the extent and nature of these selective forces vary among proteins. For these reasons, it is most useful to compare rates of synonymous nucleotide substitution between viruses, since these reflect a ‘basal rate’ of evolution. For rapidly evolving RNA viruses, the rate of evolution can be estimated from comparisons among strains isolated at different time points (Rambaut 2000). The rate can then be used as a ‘molecular clock’ to estimate the dates of ancestral viruses, such as the LCA. While such rates often appear to be less uniform among different strains than expected under a random model, they are nevertheless accurate enough for dating purposes (Jenkins et al. 2002). The extent of information on the evolution and diversity of different viruses varies greatly. Here we consider human viruses from four different families that have been particularly well studied. In each case we start by discussing the origins of the viruses infecting humans. Then we consider how evolutionary processes have affected genetic diversity subsequent to these origins.
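The 'molecular clock' dating described above amounts to a regression of root-to-tip genetic distance against sampling date: the slope estimates the substitution rate, and the x-intercept estimates the date of the LCA. A minimal sketch with synthetic data (the rate, sampling years, and distances below are invented for illustration, not taken from any real virus):

```python
import numpy as np

# Synthetic time-stamped samples: root-to-tip distances generated under
# an assumed strict clock of 2e-3 substitutions/site/year, LCA in 1968.
rate_true, lca_true = 2e-3, 1968.0
years = np.array([1985.0, 1990.0, 1995.0, 2000.0, 2005.0, 2010.0])
rng = np.random.default_rng(1)
dists = rate_true * (years - lca_true) + rng.normal(0.0, 1e-4, years.size)

# Least-squares fit: slope = substitution rate, x-intercept = LCA date.
rate_est, intercept = np.polyfit(years, dists, 1)
lca_est = -intercept / rate_est
print(f"rate ~ {rate_est:.2e} subs/site/year, LCA ~ {lca_est:.0f}")
```

Real analyses (e.g., Rambaut 2000) account for the shared ancestry of the sampled strains rather than treating the distances as independent, but the underlying logic is this regression.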


Herpesviruses

The family Herpesviridae provides the clearest examples of viruses that can be inferred to have been infecting humans throughout their evolutionary history. Eight different herpesviruses are known to commonly infect humans (Table 13.1). They cause a variety of diseases that are typically not life-threatening. All cause persistent infections: after a symptomatic phase the virus becomes latent, but may become reactivated at a later date. For most, transmission is via contact or exchange of saliva. Most have high prevalence rates in most populations. Each herpesvirus shows comparatively low genetic diversity. Thus two questions arise: (1) why are humans infected by a diverse set of herpesviruses, and (2) why does each virus exhibit low diversity? Many other herpesviruses have been isolated from other mammals, as well as from birds and reptiles. In addition, viruses isolated from other vertebrates (frogs and fish) and even an invertebrate (oyster) have been placed within the family Herpesviridae, on the basis of their morphology and genome structure, although they show little evidence of gene homology. Herpesviruses have large (100–240 kb), double-stranded DNA genomes. Mammalian herpesviruses share a core set of identifiably homologous genes that support their common ancestry, but they have also acquired many genes from their hosts' genomes at different points in the past (McGeoch 2001).

Table 13.1 Human herpes viruses

Virus   Subclade  Common name              Typical symptoms        Diversity
HHV-1   D-1       Herpes simplex type 1    Facial cold sores       0.014 (19)
HHV-2   D-1       Herpes simplex type 2    Genital ulcers          0.007 (14)
HHV-3   D-2       Varicella-zoster virus   Chickenpox, shingles    0.002 (26)
HHV-4   J-1       Epstein-Barr virus       Glandular fever         0.006 (5)
HHV-5   E-1       Cytomegalovirus          Glandular fever         0.066 (10)
HHV-6   E-2       None                     Mild fever, rash        0.047 (7)
HHV-7   E-2       None                     None known              0.003 (5)
HHV-8   J-2       KS-associated HV         Kaposi's sarcoma (KS)   0.001 (4)

Diversity refers to the maximum nucleotide sequence difference among available glycoprotein B gene sequences. The values must be taken with some caution, because the pattern of sampling from global strains is not the same in each case; the number in parentheses is the sample size.
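The diversity measure used in Table 13.1, the maximum pairwise nucleotide difference across an alignment, can be computed directly. A toy sketch (the sequences below are invented, not real herpesvirus data):

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Toy alignment of three 10-bp sequences (made up for illustration).
seqs = ["ACGTACGTAC", "ACGTACGTCC", "ACGAACGTAC"]
max_diversity = max(p_distance(a, b) for a, b in combinations(seqs, 2))
print(max_diversity)  # 0.2: the most divergent pair differs at 2 of 10 sites
```

As the table's footnote warns, this statistic depends on which strains happen to have been sampled, since it is set entirely by the single most divergent pair.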



The phylogeny of mammalian herpesviruses in Fig. 13.2 was derived from an analysis of glycoprotein B sequences; phylogenies estimated from different core genes are generally concordant (McGeoch et al. 2000). The mammalian herpesviruses fall into three major clades (D, E, and J), each of which can be split into at least two subclades (e.g., D-1 and D-2) designated as genera (e.g., subclade D-2 is the genus Varicellavirus). The human viruses fall into six different genera. Within most of these subclades are viruses isolated from numerous different host species and, in many cases, the phylogeny of the viruses is similar to that of their hosts (McGeoch et al. 2000). Thus, for all of the human herpesviruses except HHV-6 and HHV-7, closely related viruses have been isolated from other primates. Within subclades where viruses have been isolated from both Old World monkeys and New World monkeys (D-1, J-1, J-2), the Old World monkey viruses are the more closely related to the human viruses, consistent with the more recent common ancestry of Old World monkeys and apes; where viruses have been isolated from chimpanzees, they are typically the closest relatives of a human virus. The match between virus and host relationships extends to viruses from other mammalian orders, such as the carnivore (dog, seal, and cat) and artiodactyl (goat, bovine, and pig) viruses in the D-2 subclade, and artiodactyl (sheep, wildebeest, and pig) viruses in the J-2 subclade, while the two E-1 viruses from rodents (mouse and rat) are each other’s closest relatives. This remarkable concordance between host and viral phylogenies is most easily explained if the herpesviruses have been evolving and co-speciating with their hosts over many millions of years (McGeoch et al. 1995). 
For example, the last common ancestor of apes and Old World monkeys is thought to have lived around 25 Myr ago: under the host–virus co-speciation hypothesis, that ancestral primate was infected at that time with the last common ancestor of the closely related viruses infecting humans and Old World monkeys today (indicated by black circles in Fig. 13.2). While the majority of mammalian herpesviruses fall into host-dependent clades, there are nevertheless clear cases where cross-species transmissions must have occurred at some point in the past. For example, within the clade of D-1 herpesviruses from

primates lie two viruses from different species of wallabies and a virus from cattle (bovine herpesvirus 2; BoHV-2). The position of BoHV-2 within a clade of primate viruses has been interpreted as suggesting that cattle may have acquired this virus from humans after domestication about 10,000 years ago (Strauss and Strauss 2002). However, if the D-1 herpesviruses did indeed coevolve with the primates, the position of BoHV-2 within the tree indicates that cross-species transmission occurred more than 25 Myr ago. The divergence of the ancestor of the wallaby viruses would be similarly ancient. Since the hypothesis of co-speciation of herpesviruses and their hosts implies that the dates of viral ancestors were the same as those of their hosts, those host divergence dates (inferred from the fossil record, and from molecular clock extrapolations) can be used to estimate the rate of viral evolution. The rates of evolution for various herpesviruses have been estimated at 2–5 × 10 −8 synonymous nucleotide substitutions per site per year (Hatwell and Sharp 2000; Hughes 2002). These rates are about one order of magnitude higher than those estimated for mammals (Li et al. 1987). Analysis of a recently acquired gene has provided an estimate of the rate of herpesvirus evolution independent of the co-speciation assumption. BoHV-4 appears to have acquired a gene encoding C2GnTm quite recently, from African buffaloes (Markine-Goriaynoff et al. 2003). Subsequently, the viral gene has evolved about 25 times faster than its cellular homologue in buffaloes, consistent with the rate estimates derived assuming host– virus co-speciation, and thus providing evidence supporting the co-speciation hypothesis. Thus, the immediate reason why humans are infected by numerous different types of herpesviruses is because our primate ancestors were infected by the ancestors of these viruses. 
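The rate estimates above follow from simple arithmetic under the co-speciation assumption: the synonymous divergence between two present-day viruses accumulated along two lineages since their hosts split, so the per-lineage rate is the divergence divided by twice the split date. With illustrative numbers (assumed for the sketch, not taken from the cited studies):

```python
# Assumed pairwise synonymous divergence between a human herpesvirus and
# its Old World monkey counterpart, and the host split date (illustrative).
d_syn = 1.5      # synonymous substitutions per synonymous site (assumed)
t_split = 25e6   # years since the ape / Old World monkey ancestor split

# Divergence accumulates along both descendant lineages: rate = d / (2T).
rate = d_syn / (2 * t_split)  # substitutions per site per year, per lineage
print(f"{rate:.1e}")  # 3.0e-08, within the 2-5e-8 range quoted in the text
```

The same logic run in reverse, taking an independently estimated rate and dividing divergence by it, is what dates viral ancestors when no host calibration is available.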
The dearth of E-2 herpesviruses identified from other species means that it is currently unclear whether HHV-6 and HHV-7 evolved with primates or were transmitted to a human ancestor from another species. Inheritance can explain the origin of four other herpesviruses (HHV-3, HHV-4, HHV-5, and HHV-8), but not why humans have two distinct herpes simplex viruses (HHV-1 and HHV-2). The split between the ancestors of HHV-1 and HHV-2


[Figure 13.2 appears here: a phylogenetic tree whose tips include the human herpesviruses (HHV-1/HSV1, HHV-2/HSV2, HHV-5/CMV, HHV-4/EBV, and others) together with related viruses labeled by host, including chimpanzee, baboon, macaque, Patas monkey, marmoset, spider and squirrel monkey, wallaby, bovine, horse, dog, seal, cat, pig, goat, sheep, wildebeest, badger, tree shrew, guinea pig, mouse, rat, elephant, fowl, and turtle.]
Figure 13.2 Phylogeny of herpesviruses, derived from comparison of glycoprotein B sequences. The eight human herpesviruses are highlighted by larger font. For other viruses, the name of the host from which the virus was isolated is indicated (baboon, macaque, and Patas monkey are Old World monkeys; spider monkey, squirrel monkey, and marmoset are New World monkeys). The ancestral lineages of the D, E, and J clades are indicated; brackets at the right denote the subclades (D-1, etc.). The black circles indicate the last common ancestors of viruses from humans and Old World monkeys. The scale bar indicates 0.2 amino-acid replacements per site.



PATHOGENS: RESISTANCE, VIRULENCE, ETC.

appears to have occurred during the diversification of our ape ancestors. Since the only chimpanzee α-1 herpesvirus as yet identified is closely related to HHV-2, HHV-1 and HHV-2 seem to have diverged before the last common ancestor of humans and chimpanzees. HHV-1 may have been acquired by our ancestors from some other ape. Alternatively, there may have been a viral speciation event within a host species. HHV-1 and HHV-2 now occupy distinct niches (‘above and below the belt’, respectively, as a colleague delicately put it). It has been suggested that a change in sexual behaviour in an ancestral ape led to the separation of these two niches (Gentry et al. 1988), and thus to the split between the ancestors of HHV-1 and HHV-2. Whatever the cause, similar events must have occurred in the distant past, leading, for example, to the splits among the ancestors of the α, β, and γ lineages. The concordance between herpesvirus phylogenies derived from different core genes indicates that divergent lineages of viruses have not recombined with one another; again, this may reflect the occupation of separate niches. Thus, ultimately, ancient viral speciations explain why there were six different lineages of herpesviruses for humans to inherit.

The co-speciation hypothesis implies that a very large number of mammalian herpesviruses exist but have yet to be recognized. For example, chimpanzees should have counterparts to each of the human herpesviruses, but so far no close relatives of HHV-1, HHV-6, or HHV-7 have been described in chimpanzees. In a specific search for primate relatives of HHV-8, three distinct chimpanzee viruses were found (Lacoste et al. 2001), which raises the possibility that humans have other, as yet unrecognized, γ-2 herpesviruses. This need not be surprising, especially if (as with HHV-6, HHV-7, and HHV-8) infection causes few and/or mild symptoms in otherwise healthy individuals.
In fact, HHV-8 was only discovered in 1994, and then only because its pathogenicity is greatly enhanced in immunocompromised AIDS patients. Although most (if not all) human herpesviruses appear to represent ancient infections, their genetic diversity is generally quite limited. For example, HHV-1 strains from around the world typically differ by up to about 1% in DNA sequence

(Sakaoka et al. 1994). This is low compared to many other viruses (see below), but about an order of magnitude greater than the genetic diversity seen in humans (Li and Sadler 1991), mirroring the difference in evolutionary rates and implying that the coalescence time (the time back to the common ancestor) for HHV-1 variation is similar to that for human genetic variation. Human genetic variation is largely shaped by the small effective population size associated with the origin of modern H. sapiens in East Africa and the expansion from there about 50,000–60,000 years ago (Liu et al. 2006). Population genetic processes would be expected to affect herpesviruses similarly.

The extent of genetic diversity varies to some extent among the human herpesviruses (Table 13.1). It has been estimated that HHV-3 has about 10 times less diversity than HHV-1, while HHV-5 has about four times more diversity than HHV-1 (Barrett-Muir et al. 2002). This may reflect differences in evolutionary rate, perhaps because of differences in the fraction of time each virus is latent. Alternatively, the coalescence times of the viruses may vary. A deeper coalescence time could result from the survival of more than one viral lineage from before the emergence of modern humans; a more recent coalescence could reflect a selective sweep. HHV-3 is more transmissible than the other herpesviruses: it can spread via aerosol, whereas the others generally require more intimate contact. It has been suggested that this may explain a more recent coalescence time for HHV-3, because the host population is less subdivided (Barrett-Muir et al. 2002).

Recombination has been reported within various human herpesvirus species (Midgley et al. 2000; Norberg et al. 2004; Peters et al. 2006). While this means that genetic diversity can be shuffled, it has comparatively little impact, because the diversity is so low.
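The coalescence argument above can be checked with back-of-envelope arithmetic: under neutrality, the time back to the common ancestor is roughly the pairwise diversity divided by twice the substitution rate. The diversity figures below follow the text (HHV-1 about 1%, humans about an order of magnitude less); the two rates are illustrative values consistent with the roughly tenfold rate difference described, not estimates from the chapter.

```python
# Rough coalescence-time check: T ~ diversity / (2 x rate).
# All numbers are order-of-magnitude illustrations only.

def coalescence_time(diversity, rate_per_year):
    """Approximate time (years) to the common ancestor of two sequences."""
    return diversity / (2.0 * rate_per_year)

t_hhv1  = coalescence_time(0.01, 3e-8)    # HHV-1: ~1% diversity, herpesvirus-like rate
t_human = coalescence_time(0.001, 3e-9)   # humans: ~0.1% diversity, ~10x slower clock

print(f"HHV-1 coalescence ~ {t_hhv1:,.0f} years")
print(f"human coalescence ~ {t_human:,.0f} years")
# Both come out on the order of 10^5 years: tenfold higher viral
# diversity is offset by a roughly tenfold faster clock.
```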

AIDS viruses

The AIDS viruses represent the most extensively studied example of viruses that have recently emerged in humans following cross-species transmission. AIDS was first described in 1981 (Gottlieb et al. 1981), although it is now apparent that the





Figure 13.3 Evolution of AIDS viruses. (a) Phylogeny of primate lentiviruses. Representatives of those simian immunodeficiency viruses (SIVs) for which there are full-length genome sequences are included, as well as HIV-1 and HIV-2. Each SIV has a three-letter suffix indicating the species from which it was isolated. The tree was derived from the N-terminal half of the Pol protein. The scale bar indicates 0.1 amino-acid replacements per site. (b) Phylogeny of the clade including HIV-1. This schematic tree is a consensus of phylogenies derived from different regions of the genome. Ptt and Pts indicate strains of SIVcpz from Pan troglodytes troglodytes and P. t. schweinfurthii, respectively; Gor indicates SIVgor from Gorilla gorilla. For Ptt, MB, LB, MT, EK, and DP represent viruses sampled from different wild communities across southern Cameroon; multiple viruses (not shown) from each community cluster together (Keele et al. 2006). Each of the three branches on which transmission of SIV to humans occurred is marked by a black X. Transmission of SIV from chimpanzee to gorilla could have occurred on either of the branches marked by a gray X.

viruses responsible had been spreading among humans for several decades previously. Two related, but distinct, viruses cause AIDS in humans: human immunodeficiency virus types 1 and 2 (HIV-1, HIV-2). The vast majority of HIV-AIDS cases around the world are due to HIV-1; in comparison, HIV-2 infections are quite rare and were initially found only in people from West Africa. The HIVs are both members of the genus Lentivirus, within the family Retroviridae. When HIV-1 was first characterized, the closest known relative was visna virus, from sheep. Subsequently, much more closely related viruses have been found in more than 30 different species of nonhuman primates (Bibollet-Ruche et al. 2004). These simian immunodeficiency viruses (SIVs) have only been found naturally infecting monkeys and apes from sub-Saharan Africa. Each host species has its own form of SIV, and as far as is

known these viruses do not cause disease in their natural hosts. Humans appear to have acquired SIVs on numerous occasions during the twentieth century; cross-species transmission was most likely through blood splashed during butchery of primates hunted as bushmeat (Hahn et al. 2000). The first SIVs described were found in macaques. The phylogeny of the primate lentiviruses (Fig. 13.3a) reveals that these viruses (SIVmac), together with SIVsmm from sooty mangabeys (Cercocebus atys), are the closest relatives of HIV-2. However, macaques are not found in sub-Saharan Africa, and only captive individuals have been found to be infected. Moreover, SIV-infected macaques suffer from AIDS-like symptoms, and have become the standard experimental model for AIDS in humans. In contrast, sooty mangabeys are infected across their wild range in West Africa (Chen et al. 1996;



Santiago et al. 2005). Thus sooty mangabeys represent the natural reservoir of these viruses and the source of infection for macaques in captivity (Apetrei et al. 2005) and humans in West Africa (Hirsch et al. 1989; Gao et al. 1992). More detailed phylogenies including multiple strains of HIV-2 and SIVsmm show that HIV-2 strains fall into eight distinct groups, A–H (Damond et al. 2004). The interspersion of these HIV-2 groups among SIVsmm lineages indicates that each group of HIV-2 reflects a separate cross-species transmission (Gao et al. 1992; Hahn et al. 2000; Santiago et al. 2005). Two of these groups (A and B) have spread extensively, whereas each of the other groups has been found in only a single instance. The closest SIVsmm relatives of groups A and B have been found in sooty mangabeys from the Ivory Coast, suggesting that the cross-species transmissions giving rise to these human viruses occurred in the easternmost part of the sooty mangabey’s range (Santiago et al. 2005).

The closest relatives of HIV-1 are found in the chimpanzee, Pan troglodytes (Fig. 13.3a). Chimpanzees are classified into four subspecies (Gonder et al. 1997; Groves 2001), but SIV has been found in only two of these: SIVcpzPtt in central chimpanzees (P. t. troglodytes) from west central Africa (southern Cameroon, Gabon, and neighbouring areas), and SIVcpzPts in eastern chimpanzees (P. t. schweinfurthii) from central Africa (north of the Congo river in the Democratic Republic of Congo (DRC) and into adjacent countries to the north and east). The western subspecies (P. t. verus) from west Africa and the Nigerian chimpanzee (P. t. vellerosus) from Nigeria and northern Cameroon do not appear to be infected (Sharp et al. 2005). Strains of SIVcpz from central and eastern chimpanzees form two distinct, subspecies-specific clades; HIV-1 strains lie within the clade from central chimpanzees (Fig. 13.3b), indicating that the origin of HIV-1 was in west central Africa.
Strains of HIV-1 form three distinct clades, termed groups M, N, and O; since these are interspersed among lineages of SIVcpzPtt, they must each reflect a separate cross-species transmission. After transmission these viruses have spread to very different extents. The vast majority of HIV-1 infections worldwide are due to HIV-1 group M; in contrast, HIV-1 group N viruses have so far been

found in only about ten individuals in Cameroon (Yamaguchi et al. 2006). HIV-1 group O is predominantly found in individuals from Cameroon, and is more common than group N, but far less widespread than group M.

Until recently, only a few strains of SIVcpzPtt from captive chimpanzees had been identified. Because all other SIVs had been found in monkeys, and SIV infection of chimpanzees seemed very rare, doubt remained as to whether chimpanzees were the natural reservoir of these viruses and the source of HIV-1; chimpanzees might have sporadically acquired SIVs by hunting monkeys. However, recent noninvasive sampling of chimpanzees from across west central Africa has shown that in some communities the prevalence of SIV infection is as high as 30%, confirming that chimpanzees are the natural reservoir of SIVcpz (Keele et al. 2006). Sequences of multiple viruses from five different locations revealed that strains of SIVcpzPtt exhibit phylogeographic clustering (Keele et al. 2006). Strains from the south-eastern corner of Cameroon (the LB and MB communities) and south central Cameroon (the EK community) were extremely closely related to HIV-1 groups M and N, respectively, pointing to those regions as the locations of the chimpanzee-to-human transmissions.

Very recently, strains of SIV most closely related to HIV-1 group O have been found in gorillas (Van Heuverswyn et al. 2006). The phylogeny (Fig. 13.3b) strongly suggests that SIVgor was transmitted to gorillas from chimpanzees, but it is not yet clear whether group O-like viruses were also transmitted independently from chimpanzees to humans, or whether gorillas were the source of the human viruses (Van Heuverswyn et al. 2006). Thus, the divergence between HIV-1 and HIV-2 reflects their origins from different natural host species and the prior extensive diversification of SIVs during their evolution in different nonhuman primates.
The divergence between the different groups within HIV-1 and HIV-2 reflects further diversification within the natural reservoir species, before these viruses jumped into humans. The frequency of transmissions of SIV to humans suggests that additional groups of HIV-1 or HIV-2, or even additional HIVs originating from SIVs infecting other primate species, could yet arise.


Considerable genetic diversity has accumulated within HIV-1 group M since the jump from chimpanzees to humans, and group M has been further classified into subtypes A–K (Fig. 13.3b). The envelope protein sequences of viruses from different subtypes may differ at about 30% of amino-acid sites. This reflects a very rapid rate of evolution.

Retroviruses contain two copies of a single-stranded RNA genome, which is replicated using a virus-encoded reverse transcriptase (RT). The first consequence of this is that HIV-1 has a high mutation rate. The error rate of HIV-1 RT is about 3.4 × 10⁻⁵ per site per replication (Mansky and Temin 1995). HIV-1 replicates rapidly, going through about 300 viral generations per year (Wei et al. 1995). This implies a mutation rate of about 10⁻² per site per year, consistent with estimates of the rate of evolution of HIV-1 (Li et al. 1988; Korber et al. 2000). Thus, the basal rate of evolution of HIV-1 is more than one million times faster than that of its host.

Using molecular clocks, the last common ancestor of group M has been placed at about 1930 (Korber et al. 2000; Sharp et al. 2000; Salemi et al. 2001), indicating that the chimpanzee-to-human transmission must have occurred before then. HIV-1 sequences from a plasma sample obtained from a man in Leopoldville (now Kinshasa, DRC) in 1959 are consistent with this dating; that virus lies deep in the group M tree, about halfway from the root (the last common ancestor) to the tips represented by viruses isolated in the 1980s and 1990s (Zhu et al. 1998). Thus group M sequences have been diversifying for about 75 years. The greatest diversity of group M strains has been found in Kinshasa (Vidal et al. 2000), suggesting that the epidemic first expanded there. The majority of strains from North America and Europe, including those first characterized, fall within subtype B. All of the other subtypes are found in sub-Saharan Africa.
Subtype B thus emerged from a founder effect, with the epidemic in North America caused by a single ancestral strain that crossed the Atlantic. The other subtypes likely resulted from earlier but similar founder effects within central Africa. Subsequent founder events have shaped the diversity of HIV-1 today; for example, the vast majority of strains within southern Africa belong to subtype C. A second consequence of reverse transcription is recombination, which occurs when reverse transcriptase jumps


between the two copies of the RNA genome. Single host cells can be infected by multiple viruses (Jung et al. 2002), generating daughter viruses with two different RNA copies that recombine when the virus infects another cell. When an individual is infected by divergent viruses from different sources, hybrid viruses with mosaic genomes result. Numerous viruses have been identified with genomes that are mosaics of different subtypes (Robertson et al. 1995). This implies that multiple infections are not uncommon. Thus the diversity of HIV-1 group M reflects around 75 years of rapid evolution and recombination. The rapid spread of group M infections means that genetic bottlenecks have had little effect in reducing this diversity. The great diversity raises the possibility that drugs or vaccines developed initially to combat HIV-1 group M subtype B may be less effective against the viruses from other subtypes that dominate the global pandemic. One amino-acid change, which arises as a resistance mutation in subtype B viruses after treatment with non-nucleoside reverse transcriptase inhibitors, occurs as a ‘wild-type’ sequence in HIV-1 group O; however, antiretroviral drugs seem effective against all subtypes of HIV-1 group M (Parkin and Schapiro 2004).
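The mutation-rate arithmetic given earlier for HIV-1 is simple enough to verify directly. The RT error rate and the number of generations per year are the figures quoted in the text; the mammalian comparison rate is an illustrative value, included only to reproduce the "more than one million times faster" contrast.

```python
# Per-year mutation rate of HIV-1 from two quantities given in the text:
# the RT error rate per replication cycle and the number of viral
# generations per year.

rt_error_rate = 3.4e-5        # mutations/site/replication (Mansky and Temin 1995)
generations_per_year = 300    # viral generations/year (Wei et al. 1995)

rate_per_year = rt_error_rate * generations_per_year
print(f"~{rate_per_year:.0e} mutations/site/year")   # ~1e-02

# Compare with a rough mammalian host rate (illustrative value).
host_rate = 3.5e-9
print(f"HIV-1 evolves ~{rate_per_year / host_rate:.1e} times faster than its host")
```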

Influenza A viruses

Humans are infected by three related influenza viruses (termed A, B, and C), from the family Orthomyxoviridae. Influenza C virus infections are comparatively harmless. Influenza B virus infections are of sufficient consequence that a B virus component is included in influenza vaccines, but influenza A viruses cause the most serious annual epidemics. Typically, the antigenic properties of the predominant form of the influenza A virus vary a little from year to year, a process known as antigenic drift. However, on three occasions during the twentieth century, the antigenic properties of the predominant form changed radically, such that the viruses had a different serotype, creating a pandemic in which many more people became infected and died. This process is known as antigenic shift. Thus the two main features to be explained in the evolution of influenza A virus are antigenic shift and antigenic drift.






Figure 13.4 Evolution of influenza A viruses. (a) Phylogeny of hemagglutinin sequences; the neuraminidase serotype of the virus is also given. All viruses were isolated from birds, unless otherwise indicated. The scale bar indicates 0.1 amino-acid replacements per site. (b) Phylogeny of hemagglutinin gene sequences from human H3N2 viruses, isolated between 1968 and 2006. Where possible, two viruses from each year were chosen from those available. Symbols indicate the date for each virus. The scale bar indicates 0.02 nucleotide replacements per site.

Influenza A viruses have negative-sense, single-stranded RNA genomes comprising eight separate segments. Each segment corresponds roughly to a gene. The serotype is determined by the hemagglutinin (H) and neuraminidase (N) proteins, encoded by segments 4 and 6, respectively. Sixteen different H serotypes have been described, as well as nine different N serotypes, and these occur in many combinations. However, only a few serotypes have been reported in humans, and typically only one serotype is present in the human population at any time. All serotypes have been found in aquatic birds, which are the natural reservoir of influenza A viruses (Webster et al. 1992; Olsen et al. 2006). Particular serotypes of influenza A viruses

have also been found infecting many other mammals, particularly pigs and horses. Influenza viruses were first characterized in the 1930s, and the serotype at that time was termed H1N1. Antigenic shift occurred in 1957 when H2N2 viruses emerged, causing the pandemic known as ‘Asian flu.’ H1N1 viruses disappeared from the human population. Antigenic shift in 1968 gave rise to ‘Hong Kong flu’, and involved the emergence of H3N2 viruses (and the disappearance of H2N2 viruses). An evolutionary tree of hemagglutinin sequences (Fig. 13.4a) shows that the H1, H2, and H3 sequences of human viruses are highly divergent. H1 and H2 sequences differ at about 30% of amino acids, while H3 differs from H1 and H2 by nearly


60%. Human viruses form small clades at the tips of the tree; for example, the hemagglutinin sequences of H3N2 human viruses isolated between 1968 and 2006 (as seen in Fig. 13.4b) would all form a tight cluster with the single H3N2 virus included in Fig. 13.4a. Most of the viruses included in Fig. 13.4a were isolated from birds, where the evolutionary diversification among these hemagglutinin sequences has occurred. Clearly, the antigenic shifts in 1957 and 1968 did not involve the mutation of one hemagglutinin into another of a different serotype, but rather the emergence in humans of a different hemagglutinin from a bird virus.

It is notable that in the hemagglutinin tree the neuraminidase serotypes do not form clusters. However, in a tree derived from comparisons of neuraminidase sequences (not shown), all viruses form clades according to their neuraminidase serotype. Thus, the two different segments have different evolutionary histories. When a single individual is infected by two different influenza A viruses, reassortment can occur, in which hybrid viruses are formed with a combination of segments from both parents. The seemingly random combinations of H and N serotypes found suggest that superinfection and reassortment have been common in birds. From comparisons of all segment sequences it is apparent that the antigenic shifts in 1957 and 1968 involved reassortment. H2N2 human viruses combined three segments newly derived from bird viruses with five segments from the earlier H1N1 human viruses; H3N2 human viruses contain two bird-derived segments.

The ‘Spanish flu’ of 1918 was by far the most serious influenza pandemic of the twentieth century; it is estimated that at least 40 million people died of influenza. This marked the arrival of H1N1 strains in humans. In contrast to later antigenic shifts, it is thought that the 1918 virus was not a reassortant, but rather a complete avian virus introduced into humans.
A lesser pandemic in 1977, the ‘Russian flu,’ involved the re-emergence of H1N1 viruses very closely related to those circulating in humans around 1950. Since influenza A viruses accumulate nucleotide substitutions at steady rates, the lack of divergence of the 1977 virus from those of the early 1950s suggests that this strain was, quite literally,


frozen for about 25 years (Buonagurio et al. 1986). Unusually, this re-emergence of H1N1 viruses did not precipitate the disappearance of H3N2 viruses, perhaps because people over 20 years of age had been exposed to H1N1 viruses previously, so that in 1977 H1N1 viruses did not have the same antigenic novelty as H2N2 or H3N2 viruses did when they emerged. Consequently, for the past 30 years there have been two serotypes of influenza A viruses circulating in humans. Current influenza vaccines are trivalent, protecting against influenza A serotypes H1N1 and H3N2, and against influenza B. As might be anticipated, co-circulation of two serotypes in humans allows the possibility of reassortment: in the winter of 2001–2, the predominant strains of H1 influenza A viruses in the UK were H1N2 viruses, originating from reassortment of H1N1 and H3N2 human viruses (Barr et al. 2003).

In 1997, H5N1 viruses appeared in humans in Hong Kong. One-third of the 18 people known to be infected died. The virus was identical to an H5N1 strain circulating among domestic birds in Hong Kong at that time (see the two H5N1 viruses in Fig. 13.4a), and it appears that each person was directly infected from birds. Because avian and human influenza A viruses utilize different cell-surface receptors, avian strains need to evolve to become transmissible between humans. The 1997 outbreak was halted by a mass cull of domestic birds. However, in subsequent years there have been hundreds of cases of H5N1 infection of humans, predominantly in Southeast Asia, with a high mortality rate. The ancestry of these highly pathogenic H5N1 avian strains has been traced to several reassortments among different serotypes of avian viruses (Li et al. 2004). As the number of cases rises, so does the risk that an H5N1 strain might evolve to become transmissible among humans, by mutation or by reassortment with one of the current human strains, and create a new pandemic with high mortality.
Evolutionary trees restricted to comparisons of human influenza A viruses have a most unusual structure (Buonagurio et al. 1986). For example, the tree in Fig. 13.4b compares the hemagglutinin gene sequences from H3N2 viruses from the past 39 years. The tree is characterized by a long central trunk with very short side branches, quite



unlike the bush-like structures in the other trees shown here. There is a clear temporal progression, with viruses from later years falling further from the original 1968 strains. These characteristics of the tree reflect the process of antigenic drift. Apparently, every year or two, one virus undergoes one or a small number of mutations in antigenically important sites that confer a selective advantage; it outcompetes other viruses in the epidemic and becomes the ancestor of all subsequent human viruses. These selective sweeps lead to repeated genetic bottlenecks. As a consequence, between pandemics, the amount of contemporary genetic variation remains low, typically reaching a difference of no more than a few percent.

This process of antigenic drift should not be confused with random genetic drift. The virus that founds each wave of epidemic spread is not merely epidemiologically lucky, but rather has undergone adaptive mutation. The amino-acid changes occurring on the trunk of the tree (i.e., those in viruses which gave rise to future epidemics) have been mapped to antigenically important sites in the hemagglutinin structure (Bush et al. 1999).

The rate of human influenza A virus evolution can be estimated by plotting, for each virus, the amount of sequence difference from a virus near the origin of the tree (such as a 1968 H3N2 virus) against the year of isolation (Buonagurio et al. 1986). Hemagglutinin gene sequences evolve at a rate of about 7 × 10⁻³ nucleotide substitutions per site per year (Fitch 1995). Segment 8 sequences, encoding the nonstructural (NS) proteins, have fewer nonsynonymous substitutions (at least in part because these proteins are not subject to the same immune-related selection) and evolve at about 2 × 10⁻³ nucleotide substitutions per site per year (Buonagurio et al. 1986).
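The root-to-tip procedure just described amounts to a linear regression of divergence against sampling year, with the slope estimating the substitution rate. A minimal sketch follows; the data points are synthetic, generated to mimic the roughly 7 × 10⁻³ substitutions/site/year hemagglutinin rate, purely to illustrate the calculation.

```python
# Root-to-tip molecular-clock sketch: regress each virus's sequence
# divergence from a reference strain (e.g. a 1968 H3N2 virus) against
# its year of isolation; the least-squares slope is the rate estimate.

def fit_rate(years, divergences):
    """Ordinary least-squares slope, in substitutions/site/year."""
    n = len(years)
    mean_y = sum(years) / n
    mean_d = sum(divergences) / n
    num = sum((y - mean_y) * (d - mean_d) for y, d in zip(years, divergences))
    den = sum((y - mean_y) ** 2 for y in years)
    return num / den

# Synthetic (invented) data: divergence from the 1968 reference strain.
years = [1968, 1975, 1982, 1989, 1996, 2003]
divs  = [0.000, 0.050, 0.096, 0.149, 0.195, 0.247]

rate = fit_rate(years, divs)
print(f"estimated rate: {rate:.1e} substitutions/site/year")
```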
Although the segments are not physically linked, they are present in the same virions, and so the NS gene tree shows a structure similar to that in Fig. 13.4b. Because the reassortment events in 1957 and 1968 did not replace all segments in human influenza A viruses, some segments of today’s viruses are descended directly from the 1918 H1N1 viruses; these include segment 5, encoding the nucleoprotein (NP). In 1918, an epidemic of H1N1 influenza also occurred in swine. Comparisons of NP gene sequences from human and swine viruses show

that they are quite closely related, and using the molecular clock approach outlined earlier, their common ancestry can be traced to shortly before 1918 (Fitch 1995), indicating that the outbreaks in the two species had a common source. The genome of a 1918 virus has been isolated from the remains of an Alaskan woman who died of influenza and was buried in permafrost (Taubenberger et al. 1997). All eight segments have been sequenced (Taubenberger et al. 2005). As expected, in phylogenetic trees this virus lies close to the common ancestor of H1N1 viruses from humans and swine. With the completion of the genome sequence of the 1918 H1N1 virus, work is underway to understand why that virus was so virulent (Tumpey et al. 2005). Results from infection of macaques with a reconstructed 1918 virus suggest that the virus generated an aberrant innate immune response with sustained overexpression of proinflammatory cytokines and chemokines (Kobasa et al. 2007).

Dengue viruses

Dengue virus is one of the most important emerging pathogens. Dengue virus infection usually causes dengue fever, an acute febrile illness (‘breakbone fever’) that may be painful but is rarely fatal. In a small fraction of cases, infection leads to the more severe dengue hemorrhagic fever or dengue shock syndrome (DHF/DSS). Dengue fever has been known since the eighteenth century (one of the first recorded outbreaks was in Philadelphia in 1780), but only infrequent epidemics were reported until the middle of the twentieth century. Within the past 50–60 years dengue has spread and become pandemic throughout the tropics. It is estimated that there are now over 100 million cases of dengue infection each year, and up to 500,000 cases of DHF/DSS.

Unlike the viruses discussed earlier, dengue is an arbovirus, i.e., an arthropod-borne virus, transmitted from human to human via mosquitoes, primarily of the species Aedes aegypti. Dengue virus is a member of the genus Flavivirus within the family Flaviviridae, with a positive-sense single-stranded RNA genome. Closely related mosquito-borne viruses include Yellow fever virus, St. Louis encephalitis virus, Japanese encephalitis virus, and


West Nile virus. All are important pathogens, but in humans they are not as widespread as dengue.

Dengue viruses are classified into four serotypes (DENV-1 to DENV-4), which form four distinct clades in a phylogeny of flaviviruses. Within each clade, viruses show up to about 8–10% sequence difference, and they have been classified into genotypes. Members of different clades differ at about 30% of nucleotides. Recombination has been found to occur between genotypes within serotypes, but not between serotypes (Worobey et al. 1999).

Evidence of all four serotypes of dengue virus has been found among monkeys in Southeast Asia, transmitted by various Aedes species (but not A. aegypti) in the forest canopy. A limited number of these ‘sylvatic’ viruses, from three serotypes (DENV-1, DENV-2, and DENV-4), have been characterized. Within each serotype, the sylvatic viruses fall as outgroups to the human strains (Wang et al. 2000), much as the MB and LB forms of SIV from chimpanzees fall as outgroups to HIV-1 group M (at the top in Fig. 13.3b). Thus, dengue viruses seem to have originally infected nonhuman primates, and each serotype of human viruses evolved independently. These transitions to infection of humans each involved a switch of vector and of geographic niche, since A. aegypti is ‘peridomestic’, found primarily around human settlements, whereas the sylvatic forms are transmitted among monkeys by other Aedes species in the forest canopy. While sylvatic strains of DENV-2 have also been found in Africa, and all four serotypes infect humans across Africa and the Americas, it is thought that both the original diversification of the four serotypes and the transmissions to humans occurred in Southeast Asia (Wang et al. 2000).

The rate of synonymous substitution in DENV-2 has been estimated at about 5 × 10⁻⁴ per site per year (Wang et al.
2000); others have estimated a similar value for the overall rate of evolution, implying a somewhat faster rate of synonymous substitution (Twiddy et al. 2003). These values are about 10–20 times slower than HIV-1 or influenza A viruses, but about four orders of magnitude faster than herpesviruses, and indicate a rate typical of RNA viruses (Jenkins et al. 2002). Within each serotype, from the extent of diversity among human dengue viruses and the divergence of the human and


sylvatic strains, the times of origin of the human strains have been estimated as about 100–300 years ago (Twiddy et al. 2003). Since very few sylvatic strains have been characterized so far, some that are more closely related to the human viruses may yet be found, reducing the upper limit of this time depth. The divergence among the four serotypes would have occurred much earlier. Thus, the diversity of dengue viruses reflects (i) ancient diversification in nonhuman primate hosts, followed by four independent transmissions to humans giving rise to DENV-1–DENV-4, and (ii) subsequent rapid diversification within each serotype. The four transmissions have an important consequence, since only a tetravalent vaccine could protect against all four serotypes.

This diversity of dengue viruses could be relevant to whether an infected human develops dengue fever or DHF/DSS. It is currently unclear whether the difference in disease outcome is due to variation among hosts, variation among viral strains, or a combination of the two. One widely expounded but controversial theory (reviewed by Bielefeldt-Ohmann 1997) is that antibody-dependent cross-enhancement between serotypes leads to DHF/DSS; that is, the immune response raised by previous infection with one serotype may increase the probability of DHF/DSS developing upon infection with another serotype. However, it has recently been reported that the serotypes circulating in Bangkok, Thailand, over the 20 years from 1980 showed an alternating pattern, interpreted as resulting from some degree of cross-immunity (Adams et al. 2006). Thus the interaction among serotypes is complex and not yet fully understood.

Comparisons among viruses

The four viruses discussed above have evolved in very different ways. There is strong evidence that most human herpesviruses were inherited from our ancestors. While it is to be expected that the ancestors of H. sapiens were infected by many viruses, and that we might have inherited many of them, the same clear evidence of host-virus co-speciation is as yet lacking for other viruses. Perhaps the strongest cases can be made for members of two families of small, double-stranded DNA


PATHOGENS: RESISTANCE, VIRULENCE, ETC.

viruses, the Papillomaviridae and the Polyomaviridae. Around 100 different types of human papillomavirus (HPV) have been described, while a much more limited number have been found in other mammals and in birds. Two papillomaviruses, isolated from chimpanzees and bonobos (Pan paniscus) respectively, are each other's closest relatives and together with HPV13 form a clade consistent with host-virus co-speciation (Van Ranst et al. 1995). It has also been argued that phylogeographic variation within each HPV type is consistent with host-virus coevolution during the diversification of human races (Bernard et al. 2006). If these viruses have indeed been coevolving with mammals, each host species would be expected to harbour a similar number of papillomavirus types as humans; initial surveys of a range of mammals have indicated that DNA from a variety of previously undescribed papillomaviruses can be identified (Antonsson and Hansson 2002).

Strains of JC virus (JCV), a member of the Polyomaviridae, mostly cluster according to the racial origin of the individual, consistent with coevolution of JCV with humans, at least over the time scale of the diversification of human races subsequent to the emergence of modern humans from Africa; it has even been argued that JCV could be used to shed light on the history of different human populations (Agostini et al. 1997; Sugimoto et al. 1997). JCV resembles herpesviruses in that it causes persistent infections. Assuming that JCV has coevolved with human populations, its rate of evolution can be calculated as about 4 × 10^−7 synonymous nucleotide substitutions per site per year (Hatwell and Sharp 2000), about one order of magnitude faster than herpesviruses, but much slower than the RNA viruses discussed earlier. However, it has recently been suggested that overall there is no strong match between the tree of human populations and that linking the clades of JCV found in the same populations (Shackelton et al. 2006).
Moreover, Shackelton et al. (2006) estimated that its rate of evolution is much faster (about 2 × 10^−5 substitutions per site per year), implying that the LCA of global JCV strains existed only about 1300 years ago. This hypothesis leaves some features of the JCV phylogeny unexplained. For example, native American tribes from North, Central, and South America

all share similar viral strains, most closely related to JCV strains from East Asia. This is as expected under the coevolution hypothesis, but very difficult to explain if these strains all shared a common ancestor within the past few hundred years. Thus, the history of JCV has yet to be resolved.

In contrast, influenza A, dengue, and AIDS viruses have clearly been acquired from other species during recent human history. Influenza has plagued humans for many centuries, whereas the AIDS viruses were acquired within the last century. However, despite their similarly rapid rates of evolution, this disparity in time scale is not mirrored by accumulated genetic diversity. The two viruses are similar in that both exist as divergent types (i.e., H1N1 and H3N2 influenza, HIV-1 and HIV-2, or even the multiple groups of HIV-1) due to multiple cross-species transmissions. Influenza and AIDS viruses are also similar in having very fast rates of evolution. But a major difference is seen within the clades that each result from a single transmission. Diversity has continuously accumulated in HIV-1 but has been restricted by repeated bottlenecks during human influenza A virus evolution. In this regard, dengue viruses are very similar to HIV-1; the lower diversity seen within a dengue virus serotype than within an HIV-1 group reflects the slower evolutionary rate of the flavivirus. Herpesviruses have infected humans for much longer than these other viruses, but show the least intraspecific diversity. In their case, population genetic processes such as random genetic drift or selection have limited the number of generations back to the LCA, and the many orders of magnitude slower rate of evolution of herpesviruses means that little diversity has accumulated.

There is evidence of comparatively recent acquisition for many other human viruses. Earlier, it was noted that measles viruses are thought to have jumped into humans a few thousand years ago.
Measles viruses are single-stranded RNA viruses from the family Paramyxoviridae. Unlike influenza A viruses, there is no evidence that there have been subsequent reintroductions, and the phylogeny of global strains of measles (Sharp 2002) looks like that for HIV-1 group M (Fig. 13.3b), rather than that of influenza viruses (Fig. 13.4b). However, measles


viruses show limited sequence diversity (Sharp 2002). The rate of evolution of measles viruses may be about one order of magnitude lower than that of HIV-1 or influenza A (Jenkins et al. 2002), but the low diversity still points to a surprisingly recent LCA for contemporary measles viruses. We are currently investigating whether this reflects random genetic drift or a selective sweep.
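The arithmetic behind such clock-based dating is simple: two lineages separated t years ago accumulate roughly d = 2rt substitutions per site between them, so t = d/(2r). A minimal sketch, using the two JCV rates quoted above; the 5% pairwise divergence figure is purely illustrative:

```python
def time_to_lca(divergence, rate_per_site_per_year):
    """Estimate years back to the last common ancestor of two strains.

    Under a molecular clock, pairwise divergence d accumulates along both
    lineages, so d ~ 2 * rate * t, giving t = d / (2 * rate).
    """
    return divergence / (2.0 * rate_per_site_per_year)

# The same pairwise divergence implies very different time depths
# under the two proposed JCV substitution rates.
slow = time_to_lca(0.05, 4e-7)   # coevolution-scale rate (Hatwell and Sharp 2000)
fast = time_to_lca(0.05, 2e-5)   # faster rate (Shackelton et al. 2006)
print(round(slow), round(fast))  # 62500 1250
```

The faster rate compresses the LCA from tens of thousands of years to roughly a millennium, which is why the two rate estimates lead to such different histories for JCV.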

Summary

1. Among the many different viruses known to infect humans, our knowledge of the amount, pattern, and origins of genetic diversity varies enormously. For those viruses that have been less well studied, sequencing of large sets of strains reflecting the global diversity of different viruses is beginning to shed light on this, while the characterization of more viruses from other species clarifies the origins of human viruses.
2. The four groups of viruses discussed in detail here (herpesviruses, AIDS viruses, influenza A viruses, and dengue viruses) exhibit varied


patterns of diversity, with different factors having played key roles in influencing this.
3. Rates of evolution vary by 5–6 orders of magnitude between the most slowly evolving DNA viruses (herpesviruses) and the most rapidly evolving RNA viruses (AIDS and influenza A viruses), reflecting the error rate of the viral polymerase and the long-term average replication rate of the virus.
4. The time scales of diversification within a clade of human viruses vary by 4–5 orders of magnitude, from a few years for H3N2 influenza viruses, to perhaps 100,000 years or more for some herpesviruses. This time since the LCA depends on how long the viruses have been infecting humans, and whether the virus has been subject to random genetic drift, founder effects, and/or a selective sweep of an advantageous variant.
5. The various aspects of the population genetics of a virus can be influenced by its route of transmission and its interaction with the host immune system. The term 'phylodynamics' has been coined to denote this interplay among the various forces shaping viral diversity (Grenfell et al. 2004).


C H A P T E R 14

The population structure of pathogenic bacteria Daniel Dykhuizen and Awdhesh Kalia

Introduction

To control, treat, and possibly eradicate infectious diseases, we need to understand their population dynamics and structure. Only when the population dynamics of HIV was understood did it become clear that a triple cocktail of drugs was the only sensible treatment strategy (Simon et al. 2006). Similarly, fulfilling the promise of genomic sequencing to bring innovation to disease treatment depends upon understanding population structure, for only when population structure is understood can we determine which disease-associated genes are currently under directional selection for change. It is those genes that will be some of the first targeted for the development of species-specific antibiotics. Unless we have the luck of the blind, eradication of a pathogen-induced disease requires thorough understanding of the population structure of that disease.

Population structure is determined by birth and death processes in finite populations and is thus closely associated with random genetic drift. Since birth and death are organismal processes, the effects of random genetic drift are the same across the entire genome. The mathematical models for random genetic drift have been well worked out for sexually reproducing diploid organisms (Kimura 1983; Hartl and Clark 2007). Natural selection is also defined on the level of the organism; it is the differential survival and reproduction of different phenotypes when at least part of this difference in phenotype is due to differences in genotype. Natural selection changes the birth and death processes such that certain alleles of a gene are favored

over other ones and so can change the pattern of variation expected from genetic drift. This allows statistical tests to determine which loci are likely to have been targeted recently by selection, but these tests, which were derived for sexually reproducing diploids where the genes are unlinked, are often being used for studies on bacteria, where the genes are often tightly linked, without thinking about the underlying models. In this chapter we consider the components required for similar inferences about pathogenic bacteria. To do this we will focus on three human pathogens, Salmonella typhi, Streptococcus pyogenes, and Helicobacter pylori, which have very different population structures and nucleotide diversity.

Population structure

Clonality versus panmixia

Bacteria reproduce clonally, and recombination of genetic material between clones occurs by transformation, transduction, and conjugation (see Chapter 15 for a description of these processes). Thus the frequency and amount of recombined material varies from species to species depending on which mechanisms are used, how frequently they are induced, how effective they are, and, perhaps most importantly for infectious diseases, how often two genetically distinct clones occur in the same place. The effect of recombination can be quantified by measuring the linkage disequilibrium between loci. Linkage disequilibrium is a measure of the linkage between alleles of two different genes separated by some distance on the chromosome,



i.e., the probability that if you know the allele at one locus you can predict the allele at the other locus. If a bacterial species is purely clonal, it is in maximal linkage disequilibrium. If recombination occurs, disequilibrium decreases, and when the probability of finding an allele at the second locus, given the allele at the first, is simply the overall frequency of that allele, the genes are said to be in linkage equilibrium. If the alleles at most loci are in linkage equilibrium, the population is said to be panmictic. Panmixia usually describes a local population, not a species. An entire species is only panmictic when the migration rate is so high among populations that there are no differences in gene frequencies. When there are striking geographic differences in allele frequencies, as in H. pylori, combining populations would produce linkage disequilibrium.
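The contrast between clonality and panmixia can be quantified with the standard two-locus statistics D = p(AB) − p(A)p(B) and its normalization r². A minimal sketch on hypothetical two-locus haplotypes (the allele labels and sample sizes are invented for illustration, and both loci are assumed polymorphic):

```python
from collections import Counter

def linkage_disequilibrium(haplotypes):
    """Compute D and r^2 for two biallelic loci from a list of haplotypes.

    Each haplotype is a 2-character string, e.g. 'AB' means allele A at
    locus 1 and allele B at locus 2. D = 0 is linkage equilibrium: the
    allele at one locus tells you nothing beyond allele frequencies.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == 'A') / n   # freq. of A at locus 1
    p_b = sum(1 for h in haplotypes if h[1] == 'B') / n   # freq. of B at locus 2
    p_ab = counts['AB'] / n                               # freq. of haplotype AB
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2

# Purely clonal population: only the two parental haplotypes -> maximal LD
print(linkage_disequilibrium(['AB'] * 50 + ['ab'] * 50))      # (0.25, 1.0)
# Freely recombining: all four haplotypes at expected frequencies -> D = 0
print(linkage_disequilibrium(['AB', 'Ab', 'aB', 'ab'] * 25))  # (0.0, 0.0)
```

In practice, multilocus versions of this idea (such as the index of association over MLST loci) are used to place a bacterial species on the clonal-to-panmictic spectrum.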

Population structure and disease type

Infectious diseases can be categorized as epidemic or endemic and acute or chronic. These categories imply different underlying population structures. An epidemic of an infectious disease is likely to be caused by a clone that has rapidly infected many people in a locality. Thus the epidemic pathogen population often consists of a group of related clones that have caused epidemics in the host population in the past. In contrast, the population of an endemic pathogen is likely to consist of multiple distantly related clones, and two people infected at about the same time in the same general location are often colonized with different distantly related clones. This gives more opportunity for recombination and thus for clonal divergence promoted by recombination in endemic than in epidemic species. Pathogens that establish chronic or persistent, as opposed to acute, infections are expected to have more within-host diversity, and consequently even more recombination. If the within-host diversity is high enough, a pattern of linkage equilibrium (a lack of clonal structure because of random association of alleles) may be observed. Within each of these categories the mode of transmission of infection from one individual to another could strongly affect the pathogen population structure. While classification by disease type is useful for certain questions, other classifications based on

the natural history of the pathogens can also be enlightening.

Population structure and clonality

One such approach considers clonality (Maynard Smith et al. 1993). Largely clonal organisms have little or no recombination among strains (e.g., Mycobacterium tuberculosis, S. typhi), and genetic differences among isolates reflect changes that accumulated sequentially following descent from recent ancestors. These organisms can be either epidemic or endemic and can cause acute or chronic infections. In many normally endemic, chronic pathogens an epidemic phase will produce transient clonality due to higher fitness of particular clones in otherwise recombining populations (e.g., S. pyogenes, Neisseria meningitidis). Alternatively, strains can be panmictic, or quite freely recombining, with very little evidence of clonality or epidemic spread (e.g., H. pylori and also Neisseria gonorrhoeae). These are likely to cause chronic infections.

Population structure and genetic variation

Another approach to population structure focuses on quantifying the extent of neutral genetic diversity. The amount of neutral genetic diversity at equilibrium is a function of the mutation rate and the effective population size (discussed below). Table 14.1 gives the per site nucleotide diversity (πJC) and nucleotide polymorphism (θ) of a number of human pathogens estimated from sequences used for multilocus sequence typing (MLST), usually housekeeping genes that have a single, usually critical metabolic function conserved across all bacteria with no selection for change in function (Maiden 2006). The products of housekeeping genes are mostly cytoplasmic and usually do not interact with the host, so there is neither selection for immune escape nor for interaction with different chemical environments as the cells encounter different niches. Thus, they are the best examples of genes where the genetic variation is selectively neutral or nearly neutral. The general correspondence between π and θ suggests that the variation used to construct these measures is selectively neutral and the population sizes are generally constant.
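Both statistics can be estimated directly from aligned sequences: π is the mean number of pairwise differences per site, and Watterson's θ divides the count of segregating sites S by a_n·L, where a_n = 1 + 1/2 + ... + 1/(n−1). A minimal sketch without the Jukes-Cantor and finite-sites corrections applied in Table 14.1 (the toy alignment is invented):

```python
from itertools import combinations

def pi_per_site(seqs):
    """Nucleotide diversity: mean pairwise differences per site (uncorrected)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

def theta_per_site(seqs):
    """Watterson's estimator: S / (a_n * L), a_n = sum of 1/i for i = 1..n-1."""
    length = len(seqs[0])
    s = sum(len(set(col)) > 1 for col in zip(*seqs))  # segregating sites
    a_n = sum(1.0 / i for i in range(1, len(seqs)))
    return s / (a_n * length)

# Tiny toy alignment of three strains over four sites:
aln = ["AAAA", "AAAT", "AATT"]
print(round(pi_per_site(aln), 4), round(theta_per_site(aln), 4))  # 0.3333 0.3333
```

Under neutrality and constant population size the two estimators have the same expectation, which is why their general correspondence in Table 14.1 is informative.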



Table 14.1 Nucleotide diversity and variation in bacterial populations

Pathogen                         Number of STs(a)   Sequence length (kb)   Nucleotide diversity per site (πJC-corrected)   θ/site(b)
Salmonella typhi(c)                  –                   –                      –                                              –
Streptococcus agalactiae           308                 3.456                  0.00561                                        0.00563
Bordetella species(d)               32                 2.913                  0.00641                                        0.00642
Klebsiella pneumoniae              166                 3.012                  0.00755                                        0.00751
Streptococcus pyogenes             327                 3.134                  0.00786                                        0.0079
Pseudomonas aeruginosa             340                 2.874                  0.00843                                        0.00847
Staphylococcus aureus              845                 3.198                  0.00915                                        0.00914
Streptococcus pneumoniae          2613                 2.751                  0.01033                                        0.01039
Yersinia pseudotuberculosis        N/A                 2.627                  0.01211                                        0.0122
Acinetobacter baumannii             21                 2.94                   0.0141                                         0.0142
Escherichia coli                    81                 7.364                  0.01697                                        0.01712
Listeria monocytogenes              34                 3.255                  0.02839                                        0.02857
Campylobacter fetus                 30                 3.312                  0.03142                                        0.03108
Bacillus cereus                    381                 2.829                  0.04479                                        0.04588
Neisseria spp.                    5868                 3.282                  0.04685                                        0.04679
Helicobacter pylori                389                 3.846                  0.04711                                        0.0485
Burkholderia cepacia complex       114                 2.760                  0.05448                                        0.0563

(a) Sequences were concatenated and downloaded from http://www.pubmlst.org or http://www.MLST.net. ST, sequence type.
(b) Calculated from π using finite sites model.
(c) Data taken from Roumagnac et al. (2006); only housekeeping gene data are shown.
(d) Includes Bordetella bronchiseptica and three recently derived low-variation pathogenic species: Bordetella pertussis, Bordetella parapertussis, and the sheep pathogen.

From Table 14.1 we have picked three pathogens to consider in detail. A pathogen of low diversity is S. typhi. This species is said to be a genetically monomorphic pathogen, because it is a clone with no serotype diversity, even though it contains some nucleotide polymorphism. Streptococcus pyogenes is a species with moderate nucleotide diversity. This and the next species are said to be genetically polymorphic since they contain clones of different serotypes. Helicobacter pylori is the representative species with high nucleotide diversity. These three species will be discussed in detail below.

Effective population size

Population genetics deals with genetic diversity and population structure. One of its important tools for dealing with population structure is the concept of effective population size, or Ne. The effective

population is an abstract population with convenient mathematical properties whose size has been adjusted to properly represent important processes in real populations with inconvenient mathematical properties. The effective population size for a normal diploid organism can be defined several ways. The one most relevant here is the abstract population in which the rate of fixation of neutral alleles by genetic drift is equal to that seen in the actual population, thus leading to a particular level of heterozygosity. The larger the effective population size, the longer it will take neutral alleles to fix and consequently the higher the level of genetic variation in that population. If the population size fluctuates over time, the effective population size is closer to the smallest than the average population size. For example, if a population has 10, 100, and 1000 individuals over the three generations, the effective size is 27 while



the average population size is 370. Estimates of Ne are usually very much smaller than the census size of a population. (See Kimura (1983) or Hartl and Clark (2007) for a discussion of Ne.)
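The numerical example above can be checked directly: with fluctuating census size, Ne is approximately the harmonic mean of the per-generation sizes, which is dominated by the smallest value.

```python
def harmonic_mean_ne(sizes):
    """Effective population size under fluctuating census size:
    approximately the harmonic mean of the per-generation sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)

sizes = [10, 100, 1000]              # the three generations from the text
print(round(harmonic_mean_ne(sizes)))  # 27  (effective size)
print(sum(sizes) // len(sizes))        # 370 (arithmetic mean, for contrast)
```

The bottleneck generation of 10 individuals drags Ne down to 27 even though the population later grows a hundredfold, which is why pathogen bottlenecks at transmission matter so much for Ne.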

the important influences on population structure and genetic variation to reveal the differences in the population structures of the groups that produce differences in diversity.

Effective population size determined by infection dynamics

Helicobacter pylori

For pathogens causing infectious disease the basic unit determining Ne is the infection, not the individual bacterium, and the basic parameter is not how many offspring each bacterium produces, but the number of new secondary infections each primary infection produces. Most epidemic diseases have a very low Ne because a few infections produce an epidemic. Most endemic diseases have a higher Ne because more infections are likely to produce other infections and the number of infections does not fluctuate so much over time. Chronic endemic diseases can produce the highest Ne, relative to the number of infections. The number of individual pathogens usually transferred between infections is also an important parameter. This is a very different number from the LD50 (the number of pathogens transferred such that 50% of the infected animals die). First, even if the LD50 is 1000, if 999 of the cells are eliminated by the defenses of the body and only a single pathogen starts the next infection, then the new infection is started by a single individual, not 1000. Second, even if the number of individuals that start infections is greater than one when an LD50 experiment is done, the numbers are likely to be lower in nature because most diseases in nature do not kill 50% of the susceptible, infected hosts. Often only a single bacterium will be effectively transferred from one infection to another, leading to low Ne. However, for some infections, the environment within the host may have to be modified during colonization for the infection to succeed. In this and perhaps other cases, more bacteria are required to start an infection, which would give a larger Ne.

We now review the natural history and population dynamics of three organisms, S. typhi, S. pyogenes, and H. pylori, all of which are solely pathogens of humans and so have the same population of hosts.
These three species represent three very different levels of diversity and so should have very different population structures. We discuss

Helicobacter pylori infects billions of people worldwide, generally colonizing gastric epithelial cell surfaces and the overlying mucin. Its infections typically last decades despite inflammation and other host defenses and are important medically as a cause of chronic gastritis, peptic ulcers, and as an early risk factor for gastric cancer (Blaser and Atherton 2004; Atherton 2006).

Geographic variation

MLST analysis of global collections of H. pylori isolates reveals extensive geographic partitioning among housekeeping gene alleles: DNA sequences from different areas typically assort into different clades (Achtman et al. 1999a; Kersulyte et al. 2000). Admixture analysis of strains from different regions suggests that extant populations likely stem from ancestral African, Central Asian, and East Asian populations (Falush et al. 2003). Population genetic analysis suggests that populations from different areas have undergone significant levels of genetic differentiation and have evolved in geographic isolation. It has been proposed that H. pylori's evolutionary history parallels that of its human host, that its association with humans is ancient, and that it might have accompanied humans 'out of Africa' (Falush et al. 2003; Linz et al. 2007). As more than 90% of infections are benign and the clinical outcome of infection varies geographically, H. pylori probably coevolved with the physiology of the human host and adapted to the particularities of its host's local diet (Covacci et al. 1999).

Infection dynamics

Helicobacter pylori populations demonstrate remarkable genetic diversity (Table 14.1). Identical genes are uncommon in H. pylori strains, and isolates with similar or identical genetic fingerprints have only been found within families, close communities, and institutionalized patients (Akopyanz et al.


1992; Magalhaes Queiroz and Luzza 2006). As in many common gastrointestinal diseases, infection is associated with conditions of crowding, poor hygiene, and intrafamilial clustering (Goodman and Correa 1995). Helicobacter pylori is generally thought to transmit via oral-oral or fecal-oral routes (Parsonnet et al. 1999), but other routes are not ruled out. The organism has been recovered most reliably from vomitus and from stools during rapid gastrointestinal transit, and exposure to an H. pylori-infected symptomatic person with gastroenteritis, particularly vomiting, markedly increases risk of infection (Parsonnet et al. 1999; Perry et al. 2006). The large number of bacteria shed during gastroenteritis increases the probability of successful colonization once H. pylori negotiates the initial acid encounter in the gastric lumen. Thus, most infections are acquired from very close contact, and successful colonization might require many viable H. pylori cells.

Intrafamily transmission has been much investigated (Drumm et al. 1990; Bamford et al. 1993; Rothenbacher et al. 1999). For example, Japanese immigrants to Peru represent a small proportion of the population, and the Japanese type of H. pylori would soon have disappeared in the grandchildren of these immigrants if infection commonly happened outside family interactions. However, 50% of the grandchildren still carry the Japanese types, suggesting about three-fourths of the infections are picked up within the family. Epidemiological studies suggest that infected mothers are the main source of H. pylori infection of their children (Rothenbacher et al. 1999); such vertical transmission implies that the variance in the number of new infections from existing infections is low. Most infections give rise to only a few other new infections and most new infections are acquired in childhood (Miyaji et al. 2000). Such a transmission pattern implies that Ne will be much larger for H.
pylori than for a pathogen in which a few infections give rise to many but most do not give rise to any other infections.

Infection inoculum

Predominantly restricted H. pylori transmission might result if many cells are required to initiate a new infection (for example, see Graham et al.


(2004)). Most bacterial cells are localized on gastric epithelial cell surfaces and in the overlying mucin layer, a tissue that turns over rapidly, is infiltrated by inflammatory cells after infection, and is buffeted by gastric acidity on its luminal side (Blaser and Atherton 2004). The gastric mucosa is hostile to most bacterial species. It is an unstable niche to which only the Helicobacters among bacterial taxa have become well adapted. For example, although inflammatory responses help protect potential hosts against casual pathogen encounters, H. pylori is thought even to feed on metabolites leached from inflammation-damaged host tissues, and many strains use host-sialylated glycolipids, synthesized during the inflammatory response, as receptors for adherence (Mahdavi et al. 2002).

Mutation rate

Effective population size interacts with mutation rate to determine neutral diversity. While the mutation rate in many bacteria is about 10^−10 per nucleotide per generation (Drake 1991), the mutation rate in H. pylori may be much higher (Bjorkholm et al. 2001; Kang and Blaser 2006). Many mutator phenotypes (where the mutation rate has increased 10- to 1000-fold) are caused by the lack of genes involved in DNA repair. Helicobacter pylori genomes lack several of the well-known homologs involved in DNA repair, e.g., key mediators of the base-excision repair pathway (MutM and Nei), mismatch repair pathway (MutS, MutH, Dam, ExoI), recombinational repair pathway (RecBCD), and SOS repair (UmuC and LexA) (Kang and Blaser 2006). A study of 29 clinical isolates demonstrated a 700-fold range of mutation frequency and suggested that 25% of these isolates could be considered hypermutators (Bjorkholm et al. 2001). Furthermore, studies aimed at identifying antibiotic resistance targets in H. pylori suggest that the parallel pathways observed probably evolved because of high mutation frequencies (Albert et al. 2005). It is therefore plausible that lack of repair genes contributes significantly to nucleotide variability in H. pylori. It is also possible that the hostile human gastric microniche, which exposes H. pylori cells to mutagens such as reactive oxygen and nitrogen radicals, contributes to its high mutation rate (Kang et al. 2006), and that the variation in the microniches



in the human stomach might select for ecotypic microadaptations, further promoting the accumulation of genetic variation.
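The interaction between effective population size and mutation rate stated at the start of this subsection can be made concrete: for a haploid organism at mutation-drift equilibrium, expected neutral diversity scales as θ = 2·Ne·μ per site, so a mutator phenotype raises equilibrium diversity in direct proportion to its mutation rate. A minimal sketch (the Ne value is purely hypothetical; the rates bracket the mutator range discussed above):

```python
def expected_theta(ne, mu):
    """Equilibrium neutral diversity for a haploid organism: theta = 2 * Ne * mu."""
    return 2.0 * ne * mu

NE = 1e7                               # hypothetical effective population size
baseline = expected_theta(NE, 1e-10)   # typical bacterial mutation rate (Drake 1991)
mutator = expected_theta(NE, 1e-8)     # a 100-fold mutator phenotype
print(round(mutator / baseline))       # 100: diversity scales in proportion to mu
```

Because θ depends on the product Ne·μ, the high diversity of H. pylori in Table 14.1 is consistent with an elevated mutation rate, a large Ne, or both.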

Recombination in H. pylori

Three lines of evidence establish H. pylori as a paradigm for a panmictic bacterial population with an extensive history of recombination:
1. Analysis of nucleotide sequence data suggests that recombination of short DNA sequences (median size 417 bp) has had greater impact than mutation in diversifying clones, with a frequency of recombination high enough to classify H. pylori populations as panmictic (Falush et al. 2001).
2. A test that measures the expected frequencies of nucleotide polymorphisms that occur in multiple, independent branches of a phylogeny suggests that recombination in H. pylori is several orders of magnitude more frequent than in other bacteria (Achtman et al. 1999a).
3. Network analysis also suggests a history of extensive recombination, linking sequences to multiple parallel pathways that cannot be reflected on a phylogenetic tree (Kalia et al. 2004).

In addition, H. pylori cells are naturally competent and can pick up exogenous DNA (Nedenskov-Sorensen et al. 1990). However, the presence of several strain-specific restriction-modification systems suggests that barriers also exist against free horizontal gene transfer from other H. pylori strains and from other species (Ando et al. 2000).

Selective sweeps

Effective population size can be severely reduced by selective sweeps, which happen when new advantageous mutations appear and spread to fixation in the population. The new mutation occurs in a single cell, and if it is fixed by selection before there is any other mutation or recombination in this strain, all the genetic variation in the population will be lost. In a purely clonal species, the standing variation eliminated by a selective sweep must be regenerated by mutation. However, if there is recombination, the impact of the selective sweep will depend on the amount of recombination. If

there is only a little recombination of large pieces, a large region on the DNA around the advantageous mutation will be swept clean of all the standing variation and the rest of the genome will show a decrease in variation. As the recombination rate goes up and the piece size goes down, the region swept clean of variation decreases and the loss of variation in other regions also decreases. So far there is no evidence of species-wide selective sweeps in H. pylori, unlike Escherichia coli (Guttman and Dykhuizen 1994). This could be because the high rate of recombination reduces the region swept of variation to a very short piece, because the very high mutation rate means that advantageous mutations happen in multiple strains, or because selective sweeps are unusual in these pathogens because of their infection dynamics.
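The purely clonal case can be illustrated with a toy Wright-Fisher simulation: a beneficial mutation arises in a single cell and, because there is no recombination, its fixation erases all standing neutral variation. This is an illustrative sketch only, not an analysis from the chapter; the population size, selection coefficient, and marker counts are arbitrary:

```python
import random

def clonal_sweep(pop_size=200, s=0.5, n_neutral=10, seed=0):
    """Wright-Fisher sweep in a purely clonal (non-recombining) population.

    Each cell carries (selected allele, neutral marker). A beneficial
    mutation arises in a single cell; with no recombination, if it fixes
    it drags that cell's neutral marker to fixation, erasing all standing
    neutral variation. Returns (beneficial_fixed, neutral alleles left).
    """
    rng = random.Random(seed)
    pop = [(0, i % n_neutral) for i in range(pop_size)]  # standing variation
    pop[0] = (1, pop[0][1])                              # one advantageous mutant
    while 0 < sum(sel for sel, _ in pop) < pop_size:     # run to loss or fixation
        weights = [1.0 + s * sel for sel, _ in pop]      # selection on births
        pop = rng.choices(pop, weights=weights, k=pop_size)
    fixed = pop[0][0] == 1
    return fixed, len({marker for _, marker in pop})

# Drift often loses the new mutation; rerun until a sweep completes.
seed = 0
fixed, n_alleles = clonal_sweep(seed=seed)
while not fixed:
    seed += 1
    fixed, n_alleles = clonal_sweep(seed=seed)
print(n_alleles)  # 1: a completed clonal sweep leaves a single neutral allele
```

Adding recombination to such a model shrinks the swept region and preserves diversity elsewhere in the genome, which is the pattern described above.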

Role of diversifying selection in maintaining nucleotide diversity

Diversifying selection, whether frequency-dependent, driven by the genetic variation of the host population, or due to different selection in different geographic regions, maintains nucleotide diversity by selection that maintains different alleles. Diversity at the babA gene is maintained in H. pylori populations because of the variation in ABO blood types in humans. Helicobacter pylori needs the babA-encoded adhesin protein to adhere to gastric mucosal ABO-type glycans (Ilver et al. 1998). Analyses of receptor specificities of many H. pylori strains from different human populations distinguish 'specialist' babA adhesins that bind the simple Leb antigen (characteristic of people of blood group O) far better than bulkier ALeb and BLeb (characteristic of blood group A and B, respectively), versus 'generalist' babA adhesins that bind all three antigens with high affinity (Aspholm-Hurtig et al. 2004). Both phenotypes of babA are found in all populations, and changes at relatively few amino-acid sites in these proteins modulate the strength and specificity of this interaction. There is no phylogenetic clustering of babA sequences into these phenotypic groups, suggesting that BabA binding specificity does not depend on a large number of sequence differences and there has been continuous conversion of one into the other by mutation


and selection. Statistical estimates suggest that only 9.4% of the sites are under this diversifying selection for these different phenotypes (rate ratio of nonsynonymous to synonymous changes = 3.598). In contrast to babA, variation in other H. pylori genes, including the virulence-associated cagA and vacA genes, suggests adaptation to local geographic variation in human diet and physiology. CagA and VacA proteins each enter target cells and disrupt several normal cellular signal transduction pathways, with strengths and specificities that vary geographically. For example, east Asian and Western-type CagA proteins differ most in sequence in the region responsible for phosphorylation, which interacts with the host SHP-2 phosphatase, an intracellular regulator of various cell proliferative, morphogenetic, and motility signaling pathways (Atherton 2006). Similarly, highly active 's1,m1' type alleles of the vacA toxin gene predominate in Japanese isolates, whereas nontoxigenic s2,m2 type alleles predominate in the West (Ito et al. 1997; Yamaoka et al. 1999). An intermediate s1,m2 form is common in coastal China (Pan et al. 1998). Even among m1 allele types, dramatic differences are found between predominant sequence motifs from east Asia and the West. This is noteworthy because the protein domains encoded by m1 and m2 alleles determine the cell type specificity of VacA toxin (Atherton 2006) and perhaps contribute to geographic differences seen in the clinical outcome of infection. It is likely that these geographic differences reflect selection pressures that differed in the various human populations, either currently or over the millennia, and that geographic isolation of ancestral H. pylori populations has fostered adaptations to differences in local host populations. Such differences can be driven by host physiology, diet, or infections with other pathogens that shape the immune response. 
The adaptation of populations to local circumstances will retard species-wide selective sweeps. If a new advantageous allele enters a population, it will be linked to the alleles adapted to the population in which it originated rather than to those present in the population it is invading. Because selection acts on the whole genome, not just on individual genes, the advantage of the new allele will be much less in the population it is invading than in the one it came from, until it recombines into a background carrying the local allelic types. Therefore, differential adaptation will retard selective sweeps and promote the maintenance of genetic variability.

Expectation of high genetic variability in H. pylori

Since the amount of genetic variability seen in a species is a function of both Ne and the mutation rate, we expect very high variability in H. pylori, for at least six reasons:

1. It has a very high mutation rate.
2. It has such a high recombination rate that selection on an allele does not reduce diversity much elsewhere in the genome.
3. Its mode of infection results in low variance in the number of new infections that result from an existing infection.
4. Because it is a chronic, long-lasting infection, each population (infection) has time to diversify and create additional genetic variation.
5. The number of individuals required to establish an infection is expected to be large.
6. It has infected humans long enough for neutral genetic variation to accumulate.

This extraordinary level of genetic diversity must enable versatile adaptations to the host.
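The dependence of diversity on both Ne and the mutation rate can be sketched with the standard neutral expectation for haploids, π ≈ 2Neμ. The parameter values below are hypothetical, chosen only to illustrate the scaling, not estimates for H. pylori:

```python
# Neutral-equilibrium sketch: for a haploid organism, expected per-site
# pairwise nucleotide diversity is roughly pi = 2 * Ne * mu.
# All numbers are illustrative, not measured values for any species.

def expected_diversity(ne, mu_per_site):
    """Neutral expectation of per-site pairwise diversity (haploid)."""
    return 2 * ne * mu_per_site

# A species near the presumed minimum mutation rate of ~1e-10 per site:
baseline = expected_diversity(ne=5e7, mu_per_site=1e-10)
# The same hypothetical Ne with a tenfold-elevated (mutator-like) rate:
mutator = expected_diversity(ne=5e7, mu_per_site=1e-9)

print(baseline, mutator)  # 0.01 vs 0.1: mutation rate matters as much as Ne
```

At equilibrium, a high mutation rate and a large Ne compound one another, which is the qualitative point behind the six reasons listed above.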

Streptococcus pyogenes

The β-hemolytic group A streptococci (GAS, or Streptococcus pyogenes) are among the most prevalent human pathogens. Most often GAS cause mild throat (strep throat) or skin (impetigo) infections, although some cases develop into potentially fatal conditions, such as rheumatic fever, rheumatic heart disease, acute glomerulonephritis, and possibly neuropsychiatric disorders (Bisno and Stevens 2005). Occasionally, the pathogen gains access to normally sterile tissue, where it can cause life-threatening illnesses like toxic-shock syndrome or necrotizing fasciitis, for which it has gained notoriety as the 'flesh-eating' bacterium (Bisno and Stevens 2005). Because of the self-limiting and superficial nature of pharyngitis and impetigo, the throat and skin provide easy portals for transmission of GAS strains without seriously disabling their host. Furthermore, many infected individuals carry GAS strains asymptomatically. Although the densities of GAS are much lower in asymptomatic carriage than in acute infections, the pathogen remains capable of transmitting to new hosts. The reemergence of GAS as a cause of serious human infections throughout the United States and Western Europe in the mid-1980s and early 1990s renewed interest in understanding the molecular evolutionary mechanisms of GAS pathogenesis and host adaptations (Bisno 1990). GAS populations display a typical epidemic population structure: while at any given time a single clone may dominate in a population, there is an extensive history of background recombination in their deeper phylogenetic history (Musser et al. 1991; Enright et al. 2001; Feil et al. 2001; Kalia et al. 2002).

GAS epidemiology

GAS strains show strong niche preference, evident in the repeated isolation of GAS strains that infect either the skin or the pharyngeal epithelium. Three classical serological markers, the M-protein, the serum opacity factor (SOF), and the T-antigen protein, are often used to define GAS strain differences (Johnson and Kaplan 1993). M-proteins function as immunodominant antigens that confer type-specific protective immunity; more than 100 M-types exist. In recent years, M-serotyping has been almost entirely supplanted by sequence-typing of the hypervariable region of the emm gene, which encodes the M protein (emm-typing) (Beall et al. 1996). Decades of epidemiological investigations based on such markers suggest that distinct GAS serotypes show 'birth-and-death' dynamics in human populations, i.e., they achieve transient dominance and then decrease in frequency, and that distinct GAS serotypes associate with distinct clinical outcomes. For example, in developed countries, coincident with the implementation of improved public health policies and the availability of antibiotics, serious GAS diseases such as acute rheumatic fever and scarlet fever underwent a sharp decline. However, since the 1980s, epidemiological data suggest a worldwide resurgence in serious GAS diseases such as toxic shock and necrotizing fasciitis (Schwartz et al. 1990). These studies also indicate that severe invasive disease is most commonly caused by serotype M1 and M3 GAS strains, whereas acute rheumatic fever outbreaks are usually associated with M1, M3, M5, M6, and M18 serotypes (Johnson et al. 1992).

Clonal expansion

Both MLST and genome-wide studies indicate that intraserotype variation is significantly less than the variation between serotypes (Enright et al. 2001). For example, using comparative genomic resequencing, Musser and colleagues showed that allelic diversity within M3 GAS strains was ~177-fold less than that seen between M-types, thus confirming that serotype M3 strains represent a closely related lineage of GAS (Beres et al. 2006). Clones distinguished by emm-type are stable over long periods of time. In one study, MLST of 137 GAS isolates collected from a remote Aboriginal island community in tropical Australia revealed the expected one-to-one relationships between emm-types and MLST genotypes (McGregor et al. 2004). Likewise, a study of >200 predominantly U.S. isolates showed tight linkage between an emm-type and the MLST genotype (Enright et al. 2001). However, the MLST genotypes of the same emm-type from the two different regions were very different. A single emm-type may define a single clone within the United States and a single clone among Australian Aboriginals, but the clones are strikingly different from one another. This suggests that clonal expansions are local, not global.

Recombination

MLST analysis has revealed an extensive history of recombination among GAS strains (Feil et al. 2001; Kalia et al. 2002). Recombination can be measured by statistical tests for congruence among housekeeping gene tree topologies. In the absence of recombination, tree topologies from different loci should be congruent (i.e., identical or nearly so). Since recombination imparts different evolutionary histories to different genes, trees constructed from different genomic regions are often incongruent (i.e., gene trees have different topologies). For GAS, there is no congruence among housekeeping gene trees, and evolution has occurred in a reticulate fashion. If recent clonal expansions of GAS strains are ignored, 'index of association' measurements based on MLST data suggest that housekeeping alleles are randomly associated (i.e., show linkage equilibrium) (Enright et al. 2001). Thus, while GAS has a clonal population structure in local populations, recombination is frequent enough that over time the clonal structure is broken down.
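The index of association used in such analyses compares the observed variance in the number of loci at which pairs of isolates differ with the variance expected if loci assorted independently. A minimal sketch on invented allelic profiles (not the published implementation, which also applies sample-size corrections):

```python
from itertools import combinations

def index_of_association(profiles):
    """Sketch of I_A in the spirit of Maynard Smith et al. (1993).

    profiles: equal-length tuples of allele labels, one entry per locus.
    Returns V_O / V_E - 1; ~0 suggests linkage equilibrium, >0 linkage.
    """
    n_loci = len(profiles[0])
    pairs = list(combinations(profiles, 2))
    # Observed variance of the number of loci at which pairs differ.
    diffs = [sum(a != b for a, b in zip(p, q)) for p, q in pairs]
    mean = sum(diffs) / len(diffs)
    v_obs = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    # Expected variance if loci were independent: sum of per-locus
    # Bernoulli mismatch variances h_j * (1 - h_j).
    v_exp = 0.0
    for j in range(n_loci):
        h = sum(p[j] != q[j] for p, q in pairs) / len(pairs)
        v_exp += h * (1 - h)
    return v_obs / v_exp - 1

# Toy data: two clones with perfectly linked loci.
clonal = [(1, 1, 1), (1, 1, 1), (2, 2, 2), (2, 2, 2)]
print(index_of_association(clonal))  # strongly positive (complete linkage)
```

Collapsing each clone to a single representative before computing I_A is the move described in the text as 'ignoring recent clonal expansions'.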

Why is there clonal expansion?

Why do certain clones expand and not others? One possibility is that certain clones are more fit than others; the other is that certain clones got lucky and were in the right place at the right time to start many other infections. Closely related clones (or clonal complexes) defined by a single emm-type can differ in their virulence and epidemic potential. Genomic analysis of allelic variation within M3 serotype GAS strains showed that commensal and invasive strains do not represent two deeply differentiated genetic groups, and that no single nucleotide polymorphism (SNP) was uniformly present in one phenotypic group or the other (Beres et al. 2006). Thus it does not seem that these phenotypic traits are involved in clonal expansion. The expansion is local, and the genotypes of clones that have expanded in different areas are very different. Thus the expansion might be related to the particular immunological state of the host population at the time of expansion, and does not represent the incorporation of an advantageous mutation (Musser et al. 1991). However, some patterns related to clonal expansion have been seen. M1 GAS strains are the commonest cause of severe invasive disease, and M1 epidemic waves contain closely related strains expressing a remarkably heterogeneous array of streptococcal inhibitor of complement (Sic) variants (Hoe et al. 1999). Studies suggest that Sic diversity is driven by positive selection, and that selection favors Sic variants that enhance M1 GAS persistence in the upper respiratory tract (Stockbauer et al. 1998). Thus the expansion seems to require persistent infections, not just rapid passage from one individual to another.


In samples of GAS strains in which MLST alleles are in linkage equilibrium, there are nevertheless combinations of gene loci, present in some strains and absent in others, that occur non-randomly and whose products interact with the host or regulate those that do. The multigene activator (Mga) regulon (Caparon and Scott 1987) and the fibronectin-collagen-T antigen (FCT) region (Bessen and Kalia 2002) warrant particular attention in this regard because they encode multiple proteins that interact with the host. Strong linkage disequilibrium is observed between sequence clusters of the transcription regulatory loci (mga and rofA/nra within the Mga and FCT regions, respectively) and the subpopulations of throat and skin specialists (Bessen et al. 2005). It would be expected that the innate preference of GAS strains for either throat or skin colonization imposes an ecological barrier among strains and leads to local differentiation within subpopulations. However, MLST analysis shows that housekeeping genes from apparently skin- and throat-tropic strains recombine frequently (Kalia et al. 2002), suggesting that this ecological differentiation does not diminish gene flow between populations. Thus it is likely that a population structure with large variance in the number of offspring infections, i.e., one in which certain clones get lucky, combined with some selection for that lucky clone, can explain the expansion of local clones. An epidemic population structure of this type will lower Ne, but the fact that the epidemic expansions are local means that the effect on Ne will be much less than if the epidemic expansions were global.

Interspecies horizontal gene transfer

Transfer of genetic material from another species can either increase diversity, if the transferred material remains polymorphic with the species-specific homologues, or decrease diversity, if it initiates a selective sweep. Human isolates of β-hemolytic group C and G streptococci (GCS and GGS, respectively) inhabit the same human microniche environment as GAS and are close genetic relatives (Johnson and Tunkel 2005). MLST analysis suggests that GCS–GGS comprise a single species (Streptococcus dysgalactiae subsp. equisimilis) (Kalia et al. 2001); GCS–GGS are divergent from GAS (typically 12–15%, but with a range of 4–48%) and have a larger genome (2.25 vs. 1.8 Mb) than GAS (A. Kalia, unpublished data). GCS and GGS are routinely recovered as part of the normal flora but not as pathogens, and are considered to be commensal (Johnson and Tunkel 2005). Unlike in GAS, there is extensive genotype heterogeneity within a GCS–GGS emm-type (Kalia et al. 2001); i.e., they do not show the same clonal expansion, suggesting that the expansion seen in GAS may be related to virulence. There has been extensive importation of GAS alleles into GCS and GGS (Kalia et al. 2001). The close similarity between alleles in GAS and the transferred alleles suggests that this extensive introgression is a recent event. However, this transfer has not been reciprocal. The genetic variation at GAS housekeeping loci suggests that GAS have not frequently received genetic material from GCS or GGS or from other related streptococci. This is unlikely to be because GAS lack a mechanism for incorporating transferred DNA, for there is extensive evidence of recombination between GAS strains. Neither GAS nor GCS–GGS are known to be naturally transformable, and generalized transduction might conceivably be the primary mechanism for gene transfer; however, the vehicles for transfer of housekeeping genes within and between species remain unidentified. The evolutionary model best supported thus far is that GAS (S. pyogenes) and the human forms of GCS–GGS (S. dysgalactiae subsp. equisimilis) diverged in the relatively distant past and evolved separately, as reflected in the observed differences between the GAS and GCS–GGS housekeeping alleles (Fig. 14.1). Recently, genetic exchange with GAS isolates resulted in the replacement of the GCS–GGS housekeeping alleles at some loci with those from GAS, producing the mosaic genomes observed among the human isolates of GCS–GGS. Humans are believed to be the sole host for GAS, whereas many GCS and GGS are parts of large

[Figure 14.1 schematic: a common ancestor diverges, via human-host specialization (GAS hk genes, cluster I) and non-human-host specialization (GCS–GGS hk genes, cluster II), through ecological divergence and genomic uniformity to recent gene exchanges and genomic mosaics. Virulence genes speA, C, G, H, emm, and skn mark the hypothesized emergence of hypervirulence and of 'fitter' genotypes for human colonization.]
Figure 14.1 Dynamics of GAS and GCS–GGS interaction. Model for evolution based on housekeeping (hk) genes shows divergence of GAS and GCS–GGS into cluster I and II alleles, respectively, followed by more recent interspecies gene flow dominated by movement of cluster I alleles from GAS to GCS–GGS. This model also shows that hk gene flow from GAS donors to GCS–GGS recipients tends to involve larger blocks of DNA (allelic replacements; large arrows) than the genetic material transferred from the reverse direction (intragenic recombination; small arrows). Virulence gene flow is more complex. Virulence genes: emm, M-protein; skn, streptokinase; and spe, streptococcal pyrogenic exotoxins.


complexes, such as the Streptococcus equi complex, S. dysgalactiae subsp. dysgalactiae, and Streptococcus canis, which are usually found in animals, including horses, dogs, and cattle. The increased exposure of humans to animals that followed their domestication ~10,000 years ago may have led to the transfer of a few clones from streptococcal species previously associated only with animals into humans. In the process of adaptation to humans, these GCS–GGS clones picked up many genes from GAS. In contrast to the strong directionality observed for horizontal transfer of housekeeping genes, the transfer of virulence genes follows more complex patterns. For example, the emm genes undergo transfer from GAS to GCS–GGS and vice versa, the genes encoding the exotoxins SpeA and SpeC show directionality similar to that of housekeeping genes, and the streptokinase gene has been acquired by GAS from GCS–GGS. We hypothesize that such gene exchanges are mutually beneficial to both species. They may facilitate acclimatization of GCS–GGS to their new host environment (humans) while at the same time providing GAS with a repertoire for generating antigenic diversity, allowing immune escape and/or ecological niche expansion and the emergence of hypervirulence.

Expectation of moderate genetic variability in GAS

While the mutation rate in GAS has not been determined quantitatively, it is assumed to be low and typical. Infections are likely to be started by a single cell, but this is not yet known. However, infection rate is a linear function of the distance between beds in a military hospital, suggesting that only a single inoculum is required (Bisno and Stevens 2005). Both features differ from H. pylori in the direction of lower expected diversity. Recombination is moderate, high enough that selective sweeps would not be expected to reduce the effective population size by much. While there is evidence of considerable cross-species genetic transfer, this will not increase the diversity of MLST loci in GAS and so has little effect on π. The direction of transfer, mainly from GAS to GCS–GGS, suggests that GCS–GGS is the species invading humans, while GAS is well adapted to humans and has been a human pathogen for a long time. Thus, the measured diversity need not be low because of a recent niche change. In a species like GAS with an epidemic population structure, the sequence diversity estimated from samples (where identical genotypes are included) is expected to be much lower than the estimate of sequence diversity from sequence types (where identical genotypes are excluded). The two estimates of sequence diversity are expected to be more similar in endemic species like GCS–GGS.
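This effect of collapsing identical genotypes is easy to illustrate: the sketch below computes mean pairwise diversity for a hypothetical sample dominated by one epidemic clone, first with duplicates included and then over unique sequence types. The sequences and numbers are invented.

```python
from itertools import combinations

def pi(seqs):
    """Mean pairwise nucleotide diversity per site for aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(sum(a != b for a, b in zip(s, t)) / len(s)
               for s, t in pairs) / len(pairs)

# Hypothetical 10-bp 'alleles'; one epidemic clone dominates the raw sample.
sample = ["AAAAAAAAAA"] * 8 + ["AAAAATTTTT", "GGGGGTTTTT"]
sequence_types = sorted(set(sample))   # duplicates collapsed

print(pi(sample), pi(sequence_types))  # ~0.28 vs ~0.67
```

The many identical copies of the dominant clone drag the sample-based estimate down, exactly the contrast the text predicts between epidemic GAS and endemic GCS–GGS.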

Salmonella typhi

Salmonella typhi (Salmonella enterica subspecies enterica serovar Typhi) is a genetically monomorphic human-restricted pathogen that causes 21 million cases of typhoid fever and 200,000 deaths each year (Crump et al. 2004). The disease is endemic in many developing countries, particularly the Indian subcontinent, Southeast Asia, Africa, and Central America. Infection usually results from ingestion of contaminated food or water. Typhi can also enter an asymptomatic carrier state in certain people, such as 'Typhoid Mary' Mallon, who shed many bacteria and significantly increase the risk of food-borne outbreaks. The proposal that asymptomatic carriers are the principal reservoir of S. typhi is supported by epidemiological studies in the United States, in which 16 out of 26 food-borne outbreaks between 1960 and 1999 were associated with known asymptomatic food handlers (Olsen et al. 2000). While S. typhi is classified as a serovar of a subspecies of Salmonella, we consider it a separate species here because it is ecologically and genetically isolated from other Salmonella. S. typhi is a clone whose phylogenetic tree, derived from a collection of 105 strains, shows no homoplasy (similarity for reasons other than common ancestry), except in gyrA, which, because of widespread use of quinolones for treating enteric fever, is under tremendous selection for antibiotic resistance (Roumagnac et al. 2006). Homoplasy may arise either because of recurrent mutation or because of intraspecies recombination. Thus there is no convincing evidence of intraspecies recombination. However, there seem to be three distinct interspecies recombination events, two from Salmonella typhimurium and one from Salmonella paratyphi (Roumagnac et al. 2006), suggesting that S. typhi can undergo recombination if the opportunity arises. But the opportunity seems not to arise: strains of S. typhi apparently seldom if ever mix. The tree structure is consistent with neutrality; i.e., the only evidence for a recent selective sweep concerns the gyrA mutations (Roumagnac et al. 2006). The gyrA mutations provide nalidixic acid resistance, have been found in S. typhi since 1991, occur in many haplotypes, but also seem to be driving a clonal expansion of haplotype H58. Once it appears, antibiotic resistance usually increases in frequency quickly (Chapter 10). As yet most strains (>70%, NARMS 2000 Annual Report, CDC) remain susceptible to nalidixic acid, suggesting that the infection reservoir (the asymptomatic carrier) is not being treated with antibiotics.

The last common ancestor of S. typhi

At each polymorphic site in S. typhi, one of the two nucleotides is ancestral. Which nucleotide is ancestral is determined by the nucleotide found at that site in other Salmonella serovars. In this way the haplotype of the last common ancestor of S. typhi is defined, and this haplotype, H45, is still extant (Roumagnac et al. 2006). The time back to the last common ancestor is estimated to be about 20,000–30,000 years, assuming an effective population size (of infections) of about 3 × 10⁵, and an expansion of the effective population size matching the expansion of the human population (Roumagnac et al. 2006). The time back to the last common ancestor is thus after the Paleolithic expansion of the human population out of the Mideast into Europe and Asia, but before agriculture.

The origin of S. typhi

The last common ancestor of S. typhi is not necessarily the clone that originated S. typhi, but a descendant of this originating clone. There is a tendency to assume that the last common ancestor and the originating clone are the same for monomorphic virulent species. This assumption is supported for Yersinia pestis, which is a clone that arose out of Yersinia pseudotuberculosis about 13,000 years ago (Achtman et al. 1999b, 2004), and for Bordetella pertussis and Bordetella parapertussis, which are clones that arose out of Bordetella bronchiseptica relatively recently (Parkhill et al. 2003; Bjørnstad and Harvill 2004). In these clones there was a rapid reduction in genome size and an increase in the number of insertion sequences, unlike in S. typhi. The Salmonella that seem most closely related to S. typhi are Salmonella agona and Salmonella enteritidis, which separated from S. typhi some 3–10 million years ago (Kidgell et al. 2002). Thus S. typhi could have originated as a definable pathogenic phenotype as long ago as 3–10 million years, from about the time of the origin of the genus Homo back to before the separation of humans, chimpanzees, and gorillas. That S. typhi is likely, through an ancient association with humans, to be well adapted for human colonization and persistence suggests that S. typhi nucleotide diversity is at steady state, implying that the estimate of Ne reflects its population biology, not recent origin.

Carrier numbers determine Ne

We therefore agree with the suggestion by Roumagnac et al. (2006) that the number of asymptomatic carriers determines the effective population size in S. typhi, with the symptomatic individuals simply providing an amplification of S. typhi numbers. Since a single organism is likely to infect each carrier and some carriers are more likely to spread the disease than others, the effective population size of S. typhi should be smaller than the number of carriers. With 21 million cases of typhoid fever per year, it is easy to imagine 200,000 new carriers infected per year. If the carrier state lasts for an average of ten years, that would be 2 million carriers. So an effective population size of about 300,000 for S. typhi does not seem unreasonable. Such estimates are better made from data than from supposition, but if this one is basically correct, it means that eradication of S. typhi should focus on identifying carriers and eliminating S. typhi infections from these individuals.
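The arithmetic in this paragraph can be laid out explicitly; the conversion of roughly 1% of cases into new carriers is an assumption made only to reproduce the chapter's figures:

```python
# Back-of-the-envelope census of S. typhi carriers, using the chapter's
# figures. The carrier-conversion fraction is an assumption, not data.
cases_per_year = 21_000_000
new_carriers_per_year = 200_000          # just under 1% of cases (assumed)
mean_carriage_years = 10                 # assumed average carriage duration

census_carriers = new_carriers_per_year * mean_carriage_years
print(census_carriers)                   # 2,000,000 carriers at steady state

# Unequal transmission among carriers pushes the effective size Ne well
# below the census number, consistent with the estimate of Ne ~ 3e5.
ne_estimate = 300_000
print(ne_estimate / census_carriers)     # Ne is ~15% of the carrier census
```

The gap between the census of 2 million and an Ne of ~300,000 is what the text attributes to some carriers spreading the disease far more than others.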

Further considerations

Nucleotide diversity is an informative clue to population structure. The processes that determine nucleotide diversity include the following.


Mutation

While it is expected that for most species the mutation rate will be low, somewhere around the minimum of 10⁻¹⁰ per nucleotide per cell division, a higher rate is possible and should be checked for. A higher diversity is expected at equilibrium with a higher mutation rate, all else being equal. This minimum mutation rate seems to involve a trade-off between the cost of detrimental mutations in a well-adapted organism and the cost of proof-reading (Drake 1991). This trade-off is likely to be general across most species of bacteria, but, since adaptation requires new mutations, a higher mutation rate is likely to be found in populations rapidly adapting to a new niche or to changing environmental conditions.
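As a quick sanity check, the per-nucleotide figure converts to a per-genome rate by simple multiplication. The sketch below uses the GAS and GCS–GGS genome sizes quoted earlier in the chapter and lands within an order of magnitude of the ~10⁻³ per-genome figure given in the chapter summary:

```python
# Convert a per-site mutation rate into a per-genome, per-replication rate.
# Genome sizes (bp) are the 1.8 and 2.25 Mb figures quoted in the chapter.
mu_per_site = 1e-10                      # presumed minimum per-site rate
genomes = {"GAS": 1.8e6, "GCS-GGS": 2.25e6}

for name, size in genomes.items():
    per_genome = mu_per_site * size
    print(f"{name}: ~{per_genome:.1e} mutations per genome per replication")
# Both land around 2e-4, within an order of magnitude of the ~1e-3
# per-genome figure cited in the summary.
```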

Inoculation size

Although determining it has never been a priority in infectious disease biology, the number of organisms that initiates an infection is an important parameter. We assume that this number is usually one, but in some species it could be much higher, particularly where the host environment must be modified by the bacteria to establish an infection, as in the stomach. Species that need several individuals to start a new infection should have a larger π than ones that do not, all else being equal.

Recombination

The potential for recombination depends upon two factors. The first is the ability of the organism to transfer and incorporate DNA. One would expect species that are generally competent for transformation to have high rates of recombination, whereas a species like Borrelia burgdorferi, which incorporates only small pieces of DNA, perhaps using a gene transfer agent (Dykhuizen and Baranton 2001), would have low rates of recombination. The second factor is the opportunity for recombination. Only in sites where different strains mix is there likely to be effective recombination (recombination between identical strains is unimportant biologically and not observable). Thus a species like H. pylori, living in a site that transient strains can easily colonize, at least temporarily, will have much higher rates of recombination than pathogens living in otherwise sterile, isolated sites like the gall bladder or urinary tract. The recombinants detected in natural populations will be a subsample of those actually formed. In species with low recombination, the recombinants seen are those that are selected for, while in species with high recombination, the recombinants not seen are those that are selected against. The selectively neutral recombinants will thus usually be missing from low-recombination species but present in high-recombination species.

Selective sweeps

The importance of selective sweeps depends upon the degree of adaptation to the current environment and upon the recombination rate. If the species is well adapted to the environment, selective sweeps will be rare or non-existent, but, if the environment is novel or changes rapidly, selective sweeps will be common. When selective sweeps are important, they decrease diversity much more in a clonal or low-recombination species than in a high-recombination species. Low diversity in the genome or in a region of the genome is the signature of past selective sweeps.

Diversifying selection

Diversifying selection happens whenever two or more alleles at a locus are selectively favored, depending upon allele frequency, niche diversity, or environmental fluctuation. Diversifying selection will retard extinction of lineages by drift and thus further increase diversity. Any form of diversifying selection will retard selective sweeps, and the lower the rate of recombination, the more it will retard the sweep. In species with a very low recombination rate, the loci where recombination is most likely to be observed in natural populations will be loci under diversifying selection.

Species introgression and high diversity

Helicobacter pylori has all the characteristics expected of a species with high diversity. It causes a chronic infection in a location other strains can colonize; it has a high rate of recombination; it has a small variance in the number of infections arising from each existing infection; it probably requires several cells to start an infection; it has infected humans for a long period of time—all these properties are expected to give a large Ne—and, in addition, it has a high mutation rate. Another group with high diversity is Neisseria spp., a group of six species undergoing partial introgression (Dykhuizen 2005). They are all commensals in the mucous membranes of the human throat and are naturally competent for transformation. Pieces of genes are often found transferred between these species. For most genes, if these pieces are excluded, these six species are distinct and have a characteristic phylogeny. However, genes like adk are scrambled (Feil et al. 1995), with the average diversity of this gene nearly equal to the average diversity across the species; i.e., the phylogenetic signal is lost but the diversity of the different lineages remains. Thus the high diversity does not reflect the population dynamics, but rather is an offshoot of the introgression of different species. Similar introgression of separate species to form a highly diverse species may cause the high nucleotide diversity of Bacillus cereus.

Summary

1. Nucleotide diversity gives information about pathogen population dynamics and is now an easy number to obtain with relatively cheap DNA sequencing available.
2. Most species differences in nucleotide diversity reflect differences in population structure.
3. Population structure determines effective population size.
4. Effective population size is a genome-wide property. If there is no natural selection, but only random genetic drift operating on the system, mutation rate and effective population size determine nucleotide diversity.
5. Mutation rate will not usually influence species differences in nucleotide diversity, because in most species the mutation rate will be reduced to a minimum of about 10⁻¹⁰ per nucleotide or 10⁻³ per genome.
6. In some species an elevated mutation rate may signal that the species has undergone recent rapid evolutionary change.
7. Recombination, or the breaking of linkage, does not affect effective population size but does determine the effect that natural selection has on the system.
8. Selective sweeps (selection for new advantageous mutations) will eliminate diversity, while diversifying selection maintains diversity.
9. When there is little or no recombination, selection will strongly influence nucleotide diversity throughout the genome; when there is much recombination, selection will influence nucleotide diversity only locally.
10. Thus, nucleotide diversity (both local and genomic) gives us information about how pathogens have adapted to changing conditions.

Acknowledgments We thank Susanna Remold for a careful reading of the manuscript. D.E.D. is supported by Public Health Service Grant GM060731 and A.K. is supported by a Ralph Powe Junior Faculty Enhancement Award.

CHAPTER 15

Whole-genome analysis of pathogen evolution

Julian Parkhill

Introduction

Since the sequencing of the first bacterial genome, that of the human pathogen Haemophilus influenzae in 1995 (Fleischmann et al. 1995), microbial genomics has undergone an explosive expansion. At the time of writing, just eleven years later, there are around 430 complete bacterial genomes published, and nearly 1000 in progress (Liolios et al. 2006, http://www.genomesonline.org); these numbers will almost certainly increase rapidly. Among these genomes are those of many human and animal pathogens, with increasing numbers of intraspecies and even intrastrain comparisons. While the reasons for sequencing these genomes are usually couched in terms of enhancing research on epidemiology, therapeutic interventions, and vaccines, this wealth of whole-genome data also allows us to investigate the evolutionary histories of these pathogens and to suggest hypotheses about the selective pressures that have shaped, and continue to shape, their genomes. In this chapter genomic data on bacterial pathogens are used to investigate three aspects of their evolution:

1. Long-term evolution over perhaps millions of years: how have related groups of organisms diversified and adapted to different niches?
2. Short-term evolution: how have some pathogens changed in the past few tens of thousands of years in response to selective pressures caused by human expansion?
3. Variation in extant organisms: how do mechanisms of stochastic genome variation allow some pathogens to respond to immediate selective pressures associated with host interaction?

Long-term evolution of pathogens

Horizontal exchange of genes

Perhaps the most striking discovery of bacterial genomics has been the very large degree of divergence between and within species. The bacterial species concept is under active and sometimes vociferous debate (Cohan 2002; Gevers et al. 2005; Doolittle and Papke 2006; Fraser et al. 2007), but it is generally accepted that bacterial species are considerably broader, in terms of genetic diversity, than eukaryotic species. The primary reason for this is that most bacteria are capable of acquiring novel genes by horizontal transfer from close or distant relatives (Gogarten et al. 2002; see Table 15.1 for the definition of 'horizontal transfer' and other concepts). The balance between this mechanism and vertical inheritance of genes in shaping genomes varies among different groups of bacteria. However, it is clear that horizontal transfer is a much more frequent occurrence in bacteria than in eukaryotes, and this has an important consequence: closely related bacterial genomes differ not just in terms of sequence, but also in the presence and absence of genes. As bacteria evolve, their genomes are shaped by the constant acquisition and loss of genes, leading to relationships that can be described in terms of numbers of shared or unique genes, in addition to sequence similarity.
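Describing relatedness by shared and unique genes reduces, in the simplest case, to set operations on gene inventories; the gene names below are invented placeholders, and real comparisons require orthologue detection rather than name matching:

```python
# Toy illustration of core vs accessory genomes as set operations.
# Gene names are invented; real analyses compare orthologues, not labels.
genome_a = {"rpoB", "gyrA", "recA", "toxA", "adhP"}
genome_b = {"rpoB", "gyrA", "recA", "capB", "invQ"}

core = genome_a & genome_b       # shared genes, likely vertically inherited
unique_a = genome_a - genome_b   # accessory genes, often horizontally acquired
unique_b = genome_b - genome_a

print(sorted(core))      # ['gyrA', 'recA', 'rpoB']
print(sorted(unique_a))  # ['adhP', 'toxA']
```

Counting the core and unique sets across many genomes is the basis of the shared-versus-unique-gene relationships described in the text.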


PATHOGENS: RESISTANCE, VIRULENCE, ETC.

Table 15.1 Common acronyms and concepts

Bacteriophage: A bacterial virus. On infection, the bacteriophage can replicate immediately and kill the host, or it can integrate into the chromosome.
Commensal: An organism living in a host without causing disease.
Horizontal transfer: Movement of genes between bacterial lineages, other than by simple vertical inheritance.
Integration: A process involving the insertion of DNA at a specific site in a bacterial chromosome.
IS: Insertion sequence. A selfish mobile element capable of copying itself into novel sites in DNA.
PAI: Pathogenicity island. A large stretch of contiguous genes carrying pathogenicity determinants that has been acquired by horizontal transfer.
Phase variation: The random on/off switching of specific phenotypes within bacterial isolates.
Plasmid: A DNA element capable of independent replication within a host cell.
Promoter: Sequence directing transcription of a gene or group of genes.
Pseudogene: In bacteria, a gene that has been inactivated by mutation.
RM: Restriction/modification. An enzymatic system that modifies or cleaves DNA at a specific site. Modified DNA is usually resistant to subsequent cleavage. The term 'restriction' derives from the restriction of bacteriophage host range caused by the presence of these systems.
SSR: Simple sequence repeat. A perfect repeat of a short sequence of base pairs, e.g., GGGGGGG or TATATATATATA.
Transposition: The process of copying and insertion of a selfish DNA element.

Figure 15.1 illustrates this relationship for some members of the enteric bacteria. It can be seen immediately that more closely related organisms share more core genes, and more distantly related organisms have more unique genes. It can also be seen that, despite being on different branches, Yersinia pestis (Parkhill et al. 2001b) and Yersinia enterocolitica (Thomson et al. 2006) have approximately the same proportion of unique genes relative to each other as do Escherichia coli (Blattner et al. 1997) and Salmonella enterica serovar Typhi (Parkhill et al. 2001a), reflecting a similar time of evolutionary divergence. Despite this overall similarity in the rate of acquisition and loss, the unique genes themselves show no particular relatedness: they were probably acquired by horizontal transfer. Figure 15.1 also shows both that closely related organisms can have very different pathologies or different host ranges, and that more distantly related strains can cause similar diseases or infect similar hosts. The reason for this is that many of the pathogenicity and host interaction factors are encoded by genes unique to each strain (that is, part of the accessory genome), while genes in the core genome tend to encode common functions such as transcription, translation, and central metabolism. Thus we now conceive of bacteria as having a core genome, inherited vertically, that indicates deep relationships, and an accessory genome, inherited horizontally, whose genes often mediate interactions with hosts.

[Figure 15.1 shows a cladogram (divergence spanning roughly 100 Mya to the present) of Y. pestis (plague), Y. enterocolitica (gastroenteritis), E. coli O157:H7 (colitis), E. coli K12 (non-pathogen), S. enterica Typhi (typhoid fever), and S. enterica Typhimurium (gastroenteritis), paired with Venn diagrams of shared and unique gene counts for selected pairs.]

Figure 15.1 An illustration of the relationship between evolutionary distance and gene content in some members of the Enterobacteriaceae. On the left of the figure is a cladogram representing approximate divergence times, and on the right is a set of Venn diagrams showing shared genes and unique genes for selected pairs of strains.

Mechanisms of gene exchange

The bacterial genome typically consists of a single circular chromosome, attached to the cell membrane, and a collection of much smaller circular pieces of DNA, called plasmids, that inhabit the cytoplasm. Bacteria

WHOLE-GENOME ANALYSIS OF PATHOGEN EVOLUTION


can acquire DNA by three general mechanisms: transformation, transduction, and conjugation.

1. Transformation refers to the integration into the genome of naked DNA taken up directly from the environment; many bacteria are capable of this, and some encode specialized DNA-uptake systems. It has been hypothesized that transformation may have originally evolved for nutritional reasons, rather than specifically as a means to acquire genetic material (Redfield 2001), but it is clear that many bacteria do now use this mechanism to exchange genes.
2. Transduction involves bacteriophage, or bacterial viruses. All bacteriophage package their own DNA into a protein coat before dispersal from the infected cell, but mistakes occur, and some also package host DNA, producing infectious phage particles capable of injecting foreign chromosomal DNA into a new host.
3. Conjugation is a specialized mechanism used by plasmids to move themselves between hosts. The plasmids encode machinery capable of producing a protein tube that can carry the plasmid DNA into a naïve host. Again, under certain conditions, host chromosomal DNA can also be transferred.

Once transferred to a new cell, novel DNA can be integrated into the chromosome by two mechanisms: homologous recombination using standard cellular machinery, or site-specific integration using specific DNA recombinases called integrases.

Core and accessory genomes

The source and fate of this horizontally acquired DNA can vary, and the different transfer mechanisms can act on different types of DNA. For example, transformation can affect both chromosomal and plasmid DNA, and occasionally conjugation machinery can be encoded on the chromosome, transferring chromosomal genes (Fig. 15.2). Transduction can transfer plasmid-encoded DNA, or even whole plasmids, if they are of the correct size. To complicate matters further, genes can move from plasmids to the chromosome, and vice versa, within an individual lineage. Transfer of plasmid genes to the chromosome will enable the organism to maintain the conferred phenotype without the plasmid, and transfer of essential chromosomal genes to a plasmid would have the effect of fixing the plasmid in that strain. In the long term, such fixed plasmids can become de facto second chromosomes (Heidelberg et al. 2000).

Figure 15.2 Bacterial mobile elements and mechanisms of gene exchange. The black lines represent chromosomal DNA, and the grey lines mobile elements. Routes of DNA into and out of the cell are shown above; routes of DNA into and out of the genome are shown below. The figure center represents a phage particle, and the figure on the right represents a conjugation tube. Although the physical routes of transduction and conjugation are made by dedicated machinery from phage, plasmids, or some PAIs, random chromosomal or other DNA can also be transferred through these routes.

However, despite this extreme mobility, not all genes in bacterial genomes are equally subject to horizontal exchange. Genes involved in central metabolic functions, for example, are less likely to be gained and lost, while genes encoding peripheral functions are more easily exchanged. This has led to the concept of core and accessory genomes, where the accessory genome encodes functions not found in all members of the species (Young et al. 2006). Genes encoding pathogenicity determinants or host interaction factors are usually part of the accessory genome; they can be present or absent within a species. Whereas initial analyses of these accessory genes often assumed that they were all acquired from other organisms, it is becoming increasingly clear that many accessory genes are actually part of the genome of the whole species but are not present in every member of that species. They are specialized, or adapted, for exchange amongst members of the species. This is expressed in the concept of the pan-genome (Tettelin et al. 2005), whereby the species genome consists of all the genes, core and accessory, present in the species; any one isolate or strain will contain only a subset of the pan-genome. This continuous process of gene acquisition is generally balanced by gene loss, either through deletion or through inactivation and gradual erosion by mutation. These processes are not neutral; whether acquired genes are maintained in the population depends on their impact on the reproductive success of their hosts in their specific niche.
In this way the function of the acquired genes (and of those recently lost, see later) reflects the niche of the organism, which in pathogenic bacteria includes host range and pathology.
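The pan-genome concept lends itself to simple set arithmetic. The sketch below (Python, with made-up strain and gene identifiers purely for illustration) treats each isolate as a set of genes: the pan-genome is the union across isolates, the core genome is the intersection, and each isolate's accessory genes are what remain after removing the core.

```python
# Hypothetical strains and gene names, purely to illustrate the set logic.
isolates = {
    "strain_A": {"dnaA", "gyrB", "rpoB", "spi1", "fimH"},
    "strain_B": {"dnaA", "gyrB", "rpoB", "spi2"},
    "strain_C": {"dnaA", "gyrB", "rpoB", "spi1", "capK"},
}

pan_genome = set().union(*isolates.values())        # every gene seen in the species
core_genome = set.intersection(*isolates.values())  # genes present in all isolates
accessory = {name: genes - core_genome for name, genes in isolates.items()}

print(sorted(core_genome))            # ['dnaA', 'gyrB', 'rpoB']
print(len(pan_genome))                # 7
print(sorted(accessory["strain_A"]))  # ['fimH', 'spi1']
```

Any single isolate carries the whole core but only part of the pan-genome, exactly as the text describes.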

Pathogenicity islands

Many pathogenicity determinants are encoded on large stretches of novel DNA, called pathogenicity islands, containing 10 to 100 genes or more (Dobrindt et al. 2004). These islands often carry specific integrases that allow them to insert at a specific site in the chromosome (often a tRNA gene) and also allow them to excise themselves precisely. Some of the more complex pathogenicity islands also carry conjugation systems that promote their transfer from cell to cell. Such islands can often be recognized in genomic sequences because their composition differs from that of the core DNA in terms of G + C content, frequency of dinucleotides, or codon usage. Classical examples of pathogenicity islands include the SPI-1 and SPI-2 islands of S. enterica (Hansen-Wester and Hensel 2001). Both carry type-III secretion systems that allow the bacterium to inject specific effector proteins into the eukaryotic cell (Galan and Wolf-Watz 2006). SPI-1 is involved in the initial entry of S. enterica into host cells, and SPI-2 is involved in the maintenance of S. enterica cells within vesicles after entry. Both islands are entirely absent from related organisms, and each appears to have been acquired as a block of genes by a relatively recent ancestor of extant S. enterica.
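Compositional anomalies of this kind can be found with a simple sliding-window scan. The sketch below is purely illustrative (a toy sequence, and an arbitrary window size and threshold, not a published island-detection method): it flags windows whose G + C content deviates markedly from the genome-wide mean.

```python
def gc_content(seq: str) -> float:
    """Fraction of G + C in a DNA string."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def flag_anomalous_windows(genome: str, window: int, threshold: float):
    """Return (start, gc) for non-overlapping windows whose G + C content
    deviates from the genome mean by more than `threshold` -- a crude
    signal of horizontally acquired DNA."""
    mean_gc = gc_content(genome)
    hits = []
    for start in range(0, len(genome) - window + 1, window):
        gc = gc_content(genome[start:start + window])
        if abs(gc - mean_gc) > threshold:
            hits.append((start, round(gc, 2)))
    return hits

# Toy genome: an AT-rich 100 bp insert in a GC-balanced background.
background = "ACGT" * 50              # 200 bp, G + C = 0.5
island = "ATAT" * 25                  # 100 bp, G + C = 0.0
genome = background + island + background
print(flag_anomalous_windows(genome, window=100, threshold=0.2))
# → [(200, 0.0)]
```

Real analyses also use dinucleotide frequencies and codon usage, as the text notes, since recently transferred DNA gradually ameliorates toward the host's composition.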

Plasmids

In addition to pathogenicity islands on chromosomes, large contiguous blocks of novel genes can be carried on plasmids. These are self-replicating DNAs that carry mechanisms that control their replication and often also determine how many copies are produced and how those copies are partitioned between daughter cells. Some plasmids are entirely neutral, carrying genes only for their own replication. Others, however, are selfish, conferring no benefit to the host but encoding host addiction systems, such as toxin–antitoxin pairs that ensure that daughter cells without the plasmid die. Plasmids can also be self-mobilizable, carrying conjugation systems to allow transfer between hosts. Many plasmids carry genes that benefit the bacterial host; these commonly encode resistance to anti-microbial agents such as heavy metals or antibiotics. In many cases, plasmids carry accessory genes that define the pathology or host range of their bacterial hosts. A very well-studied example is found within the Yersiniae. All pathogenic Yersinia, for example Y. enterocolitica, which causes food poisoning in humans, carry a common plasmid encoding a


type-III secretion system essential for the pathogenicity of these bacteria in mammals. In addition, Y. pestis, the pathogen that causes plague, carries a further two plasmids. They enable survival inside the flea vector and dissemination from the site of the flea bite (Wren 2003). The majority of unique genes in Y. pestis are carried on these plasmids, again indicating that the unique characteristics of the strain are determined by the accessory, rather than the core, genes. A further example to underline this is provided by Shigella dysenteriae, the agent of dysentery. Shigella is, phylogenetically, an E. coli, and is in fact of polyphyletic origin within E. coli. Common to all the Shigella strains of different origins is the presence of a virulence plasmid, versions of which appear to have been acquired on several occasions by E. coli strains (Yang et al. 2007).

Bacteriophage

Finally, bacteriophage themselves can sometimes integrate into the bacterial chromosome and thus be carried by strains long term. They may do this silently, or they may carry genes that benefit the bacterial cell and thereby improve their own propagation (Brussow et al. 2004). A classic example of this is the diphtheria toxin, which is encoded by a bacteriophage in Corynebacterium diphtheriae and which may excise from the chromosome, leaving toxin-negative strains (Cianciotto and Groman 1997). Once integrated into the chromosome, most, if not all, phage genes are no longer under any selective constraint and begin to decay.

Homologous recombination

Plasmids, genomic islands, and chromosomally integrating bacteriophage can therefore be understood as individual replicating entities that experience their own selective pressures—they can survive and propagate in their host species by being silent in, parasitic on, or beneficial to their bacterial hosts. Although these large blocks of genes, which are often self-mobile, are clearly identifiable as horizontally acquired and can encode many well-studied pathogenicity determinants, they are not the only, or even the primary, cause of gene content


differences among bacterial strains and species. Again taking the Enterobacteriaceae as an example, it can be seen from Fig. 15.3 that the vast majority of insertion/deletion events of specific genes, both between S. Typhi and Salmonella Typhimurium (within species) and between S. Typhi and E. coli (between species), involve small blocks of five genes or fewer (Parkhill et al. 2001a). Of course, by this calculation the large pathogenicity islands (such as SPI-1 and SPI-2), which count as individual insertion events, carry large numbers of genes. However, even taking this into account, the number of genes gained and lost in events of 10 or fewer genes outweighs the number of genes gained and lost in larger events. These small blocks of genes generally do not carry mechanisms for self-mobility or for integration into the chromosome. In contrast to the large islands and phage, they integrate and excise from the chromosome by simple recombination between conserved flanking genes, which allows the exchange of virtually any gene at any chromosomal position between related strains and species.

An example of the importance of this mechanism of exchange is the capsular polysaccharide biosynthesis locus of Streptococcus pneumoniae. The majority of S. pneumoniae strains produce a surface polysaccharide layer termed the capsule. This can exist in up to 90 different antigenic forms (serotypes) on different strains, and the detection of these different serotypes using diagnostic antisera forms the basis for differentiation of the strains. Each of the capsular polysaccharides is encoded by a different gene cluster, between 10 and 30 kb in size, and these clusters all reside at the same chromosomal locus in different strains (Bentley et al. 2006). Thus the clusters can be thought of as alternate alleles at a single locus.
However, the cluster present in any specific strain is not determined solely by vertical inheritance—it can be exchanged between strains by DNA transfer (by transformation or transduction) and subsequent homologous recombination between flanking genes. In this way, each of the 90 clusters forms part of the species gene pool that is in principle accessible to any member of the species. This becomes of practical importance when we consider that the capsule is the target of anti-streptococcal vaccines (Hsu and Pelton 2003; Poolman 2004), and any vaccine can only cover a small proportion of the capsular types, usually those most commonly circulating. However, vaccinating the human population against some of the capsular types puts a strong selective pressure on the circulating S. pneumoniae, most of which exist in the population as commensals that rarely cause disease. Highly pathogenic strains targeted by the vaccine certainly have the mechanism, and may have the opportunity, to switch capsules with less pathogenic strains and thus evade the vaccine-induced immune response while maintaining their pathogenicity. Monitoring of the vaccinated populations is ongoing to detect this inevitable event.

Figure 15.3 Sizes of regions of difference between S. Typhi and S. Typhimurium, and between S. Typhi and E. coli. The Y axis represents the number of insertion/deletion events; the X axis represents the size of the event in terms of numbers of CDS. Bars above the axis represent genes present in S. Typhi; bars below the axis represent genes absent in S. Typhi. The black bars show the comparison to S. Typhimurium, and the open bars show the comparison to E. coli. Reproduced from Parkhill et al. 2001, Nature 413: 848–852, with permission of the publisher.
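The allele-replacement logic described above can be caricatured as a list operation. In the Python sketch below, chromosomes are ordered gene lists and the capsule cassette of a recipient is replaced by that of a donor via the conserved flanking genes. dexB and aliA are the genes usually cited as flanking the pneumococcal capsule locus; the cps gene names are illustrative placeholders, not real annotations.

```python
# Toy model: chromosomes as ordered gene lists; capsule switching as a
# double crossover between the conserved flanks dexB and aliA.
def swap_capsule_locus(recipient, donor, left="dexB", right="aliA"):
    """Replace the cassette between the flanking genes in `recipient`
    with the donor's cassette."""
    r_l, r_r = recipient.index(left), recipient.index(right)
    d_l, d_r = donor.index(left), donor.index(right)
    return recipient[:r_l + 1] + donor[d_l + 1:d_r] + recipient[r_r:]

serotype_4 = ["recA", "dexB", "cps4A", "cps4B", "aliA", "pbp2x"]
serotype_19F = ["recA", "dexB", "cps19FA", "aliA", "pbp2x"]
print(swap_capsule_locus(serotype_4, serotype_19F))
# → ['recA', 'dexB', 'cps19FA', 'aliA', 'pbp2x']
```

The point of the sketch is that the rest of the chromosome (here recA and pbp2x) is untouched: a strain can change serotype without changing anything else about its genotype.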

Short-term evolution of pathogens

Bacteria possess several genetic mechanisms that allow long- and short-term adaptive, incremental change in response to selective pressures. However, some human and animal pathogens show a distinctly different pattern of change: they are highly clonal, with strains isolated from different global locations being almost indistinguishable. Bacteria showing this pattern include Y. pestis, which causes plague; S. Typhi, which causes typhoid fever; and Bordetella pertussis, which causes

whooping cough. This extreme form of population structure indicates that these organisms have evolved into their present form very recently and spread rapidly around the world. Estimates of their age suggest that this may have occurred as recently as ~10,000 years ago for Y. pestis (Achtman et al. 2004) and ~50,000 years for S. Typhi (Kidgell et al. 2002). It is reasonable to suggest that such a population structure is associated with the acquisition of, and subsequent expansion into, a new niche. Genomic analyses of these organisms have uncovered a number of correlates of the process, including increased numbers of pseudogenes, expansion of selfish genetic elements (such as insertion sequences (IS)), and gross chromosomal rearrangements. Many of these correlates appear to be degradative, suggesting that the overall evolutionary process is associated with loss of genetic information. How can this be explained?

Yersinia pestis

Consider Y. pestis, whose genome contains two novel plasmids that encode virulence determinants and are not found in other Yersinia (Parkhill

et al. 2001b; Wren 2003). It is clear from population analyses that Y. pestis is effectively a clone that has emerged from Yersinia pseudotuberculosis (Achtman et al. 1999), a fairly broad species with a diverse genetic structure that is a gastrointestinal pathogen, transmitting through a fecal–oral route, and a cause of food poisoning in humans. Acquisition of the two plasmids carrying virulence determinants evidently enabled Y. pestis to change its niche from that of a gastrointestinal pathogen to that of a blood-borne pathogen with a flea vector. One plasmid carries genes allowing survival inside the flea; the second allows dissemination of the pathogen away from the site of the flea bite in the mammalian host.

The first genomic analysis of Y. pestis was of CO92, a strain isolated from the fatal infection of a veterinarian in Colorado in 1992. It identified many genes, both on the chromosome and on the plasmids, that appeared to have been inactivated by mutation. Initially, around 150 genes, or nearly 4% of the coding capacity of the genome, were thought to have been disrupted (Parkhill et al. 2001b). In bacteria, genes that have been inactivated by mutation are called pseudogenes. Genomic comparisons of this sequence with those of other Y. pestis strains (Deng et al. 2002) and of Y. pseudotuberculosis (Chain et al. 2004) confirmed these pseudogenes and identified others, with estimates of the number of pseudogenes increasing to 300. Although pseudogenes are now recognized as being more common than previously believed in most bacterial genomes (Ochman and Davalos 2006), it is clear that Y. pestis (and other organisms that have undergone a similar process) has a substantially higher proportion than most bacteria.

Another notable discovery in the Y. pestis genome was the profusion of insertion sequence elements. IS elements are selfish genetic elements that code only for their own replication and transmission.
They are generally short, encoding only a single protein, the transposase, whose function is to copy the IS element to a new genomic location. Although many bacteria contain active IS elements, their numbers are relatively low, for novel genomic insertions are usually deleterious to the host cell. Yersinia pestis CO92 contains nearly 140 IS elements, substantially more than the 20 found in the sequenced strain of Y. pseudotuberculosis (Chain


et al. 2004). This expansion of IS elements has destabilized the genome, and recombination between these perfect DNA repeats appears to have been frequent during the recent evolution of the organism, as seen by the large-scale chromosomal rearrangements among strains of Y. pestis (Chain et al. 2006). Rearrangement by recombination between IS elements was also seen during growth of a clonal culture of strain CO92 (Parkhill et al. 2001b). Such instability contrasts strikingly with the strong conservation of gene order and orientation usually found on bacterial chromosomes over long periods of evolutionary time (Rocha 2006). Recombination between IS elements has also led to the loss of gene function through deletion: at least 300 genes have lost function in Y. pestis by deletion since evolving from Y. pseudotuberculosis (Chain et al. 2004). Many of the gene functions lost in Y. pestis are those that had been necessary in the previous gastrointestinal niche of the organism, including flagellar biosynthesis (an important virulence factor for gut pathogens) and many host interaction factors, including those involved in invasion and adhesion. It is likely that loss of some of these functions was necessary for the niche change, as they may have interfered with the new modes of transmission and growth in the host. It is also likely that many functions have since been lost because they have no benefit in the new niche and are no longer maintained by selection. However, genes may also have been inactivated or deleted due to neutral processes such as drift (see below). Comparison of deletions and pseudogenes in several Y. pestis genomes shows that, while some are unique to different strains, showing recent continuing gene loss, many are identical in all current strains, indicating that they occurred very early on in the evolution of these organisms (Chain et al. 2004). This suggests that the origin of Y. 
pestis (probably involving plasmid acquisition) was accompanied by mutational events that produced many point mutations and an expansion of IS elements. This association of a number of mutational events may be a signal of the population genetic processes associated with the acquisition of a new niche. The change of niche is likely to have involved an evolutionary bottleneck with a very small effective population size. Such bottlenecks


allow, through accelerated genetic drift, the relatively rapid fixation of a larger number of random mutations in the population than is normal (Andersson and Hughes 1996). A history containing a genetic bottleneck can explain the increased numbers of pseudogenes caused by point mutations, and it can also explain the expansion of IS elements, for each movement of an IS element can be considered as a mutational event, like a base change in DNA. In a large population, most such events will be rapidly removed by selection, but in the small population during the bottleneck, many of them will be fixed in the population. Thus we do not need to invoke a change in the rate of transposition to explain IS element expansion, simply a change in the rate of fixation of transposition events.

Many recently evolved pathogens show such a signature of a recent niche change accompanied by an evolutionary bottleneck. They include both human pathogens, such as S. Typhi (Parkhill et al. 2001a), and animal pathogens, such as Burkholderia mallei, the causative agent of glanders (Nierman et al. 2004), which is a recently evolved clone of Burkholderia pseudomallei, a free-living soil-borne organism that can also cause disease in many hosts (Godoy et al. 2003). It has been suggested that these changes may have been a response to the new niches made available by the increases in the human population and the changes in farming practices associated with the Neolithic agricultural revolution (Mira et al. 2006).
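The effect of a bottleneck on fixation can be illustrated with a toy Wright–Fisher simulation (all parameter values are arbitrary, chosen only to make the contrast visible). A single copy of a mildly deleterious mutation is efficiently purged in a large population, where selection is effective, but fixes at close to the neutral rate when the effective population size is small.

```python
import random

def fixation_probability(pop_size, s, trials=1000, seed=1):
    """Monte Carlo Wright-Fisher model: the chance that a single copy of
    an allele with selection coefficient s (negative = deleterious)
    drifts to fixation in a haploid population of the given size."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        count = 1
        while 0 < count < pop_size:
            # Probability that any one offspring carries the allele,
            # weighted by relative fitness 1 + s.
            p = count * (1 + s) / (count * (1 + s) + (pop_size - count))
            count = sum(rng.random() < p for _ in range(pop_size))
        if count == pop_size:
            fixed += 1
    return fixed / trials

# A mildly deleterious mutation (s = -0.01) fixes far more often in a
# small (bottlenecked) population than in a large one.
p_bottleneck = fixation_probability(pop_size=10, s=-0.01)
p_large = fixation_probability(pop_size=200, s=-0.01)
print(p_bottleneck, p_large)
```

With these numbers the small population fixes the allele roughly a tenth of the time (near the neutral expectation of 1/N), while the large population almost never does, which is the asymmetry the bottleneck argument relies on.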

Bordetella pertussis

The example of B. pertussis suggests that changes in host range and virulence need not involve the acquisition of new DNA; they can be entirely degradative processes. Bordetella pertussis causes whooping cough and, like Y. pestis, is highly homogeneous on a global scale, suggesting very recent evolution (van Loo et al. 2002). Its genome shows the signatures of a recent evolutionary bottleneck described earlier: large numbers of pseudogenes (~350, or 9.4% of coding sequences), IS element expansion (>260 IS elements), and large-scale chromosomal rearrangement (Parkhill et al. 2003) (Fig. 15.4). Bordetella pertussis is host restricted, in that it infects only humans, causing acute whooping cough in infants and chronic cough in adults.

Figure 15.4 Extreme genome rearrangement and IS element expansion in B. pertussis compared to B. bronchiseptica. The top bars show the genome of B. bronchiseptica (forward and reverse strands), with a scale in base pairs. The bottom bars show the genome of B. pertussis, with IS elements marked as open boxes. Bars linking the genomes show DNA:DNA matches: dark gray are forward matches; light gray are reverse matches.

The ancestor of B. pertussis was Bordetella bronchiseptica, a broad species that infects several mammalian hosts, in which it causes chronic, and often asymptomatic, infections. The genome of a representative strain of B. bronchiseptica contained only 18 pseudogenes and no IS elements, although other strains of B. bronchiseptica are known to carry some IS elements. The initial genome sequence of B. pertussis identified around 140 genes not present in the sequenced B. bronchiseptica; however, a microarray study of other B. bronchiseptica strains (Cummings et al. 2004) showed that only eleven of these presumed B. pertussis-specific genes are truly B. pertussis specific. None of the eleven genes has an identifiable relationship to virulence. It appears, therefore, that B. pertussis has changed its niche from many hosts to only humans purely by gene loss. This mechanism of niche shift is also suggested by the fact that many of the lost or inactivated genes were involved in host interaction, biosynthesis of surface structures, and nutrient utilization. Losing such systems is likely to severely restrict the ability of B. pertussis to infect, and survive in, multiple hosts. Perhaps more surprisingly, B. pertussis appears to have increased in virulence in humans without acquiring novel genes. As described in the first


section of this chapter, the paradigm of bacterial evolution of new phenotypes is the horizontal acquisition of genes encoding those phenotypes. This is particularly so in pathogens, where the acquisition of virulence factors is well documented. Careful analysis of the patterns of gene loss and inactivation in B. pertussis reveals that several regulatory proteins have been lost, sometimes with striking effects. For example, the genes encoding the pertussis toxin are present in both B. bronchiseptica and B. pertussis, but only B. pertussis expresses the toxin, and this appears to be due to mutations in the region that regulates transcription of the toxin gene. Bordetella bronchiseptica probably expresses the toxin only at a very low level, or only in situations not yet assayed, whereas B. pertussis expresses the toxin at a high level during infections in humans. Thus it appears that B. pertussis has recently increased in virulence by loss of regulation of virulence factors that were present, but under tighter regulatory control, in the ancestor, rather than by acquisition of novel virulence factors.

Again, this change in the pathogen may be associated with changes in the human population. Pathogens often trade off virulence against transmission, and some pathogens keep virulence low to increase the chances of transmission between hosts (Chapters 11 and 12). Being able to infect multiple hosts will also increase the chances of transmission. However, if one host becomes much more numerous, and transmission therefore easier on that host, selection against virulence is relaxed, and the pathogen may evolve higher virulence within this host. Host restriction could follow from the degradative genetic changes imposed by the evolutionary bottleneck accompanying this change of niche.
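The trade-off argument can be made concrete with a toy calculation. The saturating transmission function and all parameter values below are assumptions for illustration, not estimates for any real pathogen: because transmission gains saturate while virulence keeps shortening the infectious period, the pathogen's reproduction number R0 is maximized at an intermediate virulence.

```python
# Classic virulence-transmission trade-off (illustrative functional forms
# and parameters only). Transmission beta(v) saturates with virulence v,
# while virulence adds to the rate at which infections end, so R0 peaks
# at an intermediate virulence rather than at zero or at the maximum.
def r0(v, beta_max=10.0, half_sat=0.5, clearance=1.0):
    beta = beta_max * v / (v + half_sat)  # transmission rises, with diminishing returns
    return beta / (clearance + v)         # but more virulent infections end sooner

virulences = [i / 100 for i in range(1, 501)]
best = max(virulences, key=r0)
print(round(best, 2), round(r0(best), 2))  # optimum virulence is intermediate
```

Anything that raises transmission for free (such as a much denser host population) weakens the penalty side of this trade-off, which is the verbal argument made above; capturing that effect properly requires the richer models discussed in Chapters 11 and 12.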

Stochastic variation/hypermutability

Phase variation

Bacteria divide by binary fission and hence are clonal over the very short term. This can make it difficult for them to respond to highly variable, unpredictable, or adaptive environmental stresses, such as those provided by the mammalian immune system. A regulatory system must be able to detect


a specific challenge, stress, or nutrient and to induce the machinery to deal with it. However, recognition by the immune system is difficult to detect and to respond to in a suitable way, for the vertebrate immune response is rapid, adaptive, and (from the bacterium’s point-of-view) unpredictable. Bacteria, and some other pathogens, cope with this problem by using a strategy called phase variation. Phase variation is, in effect, a heritable, random, reversible change in a specific phenotype. At its simplest, it involves a binary on/off genetic switch that flips randomly, then remains in the set position for a random length of time before switching back (see below for details). Using such random switches to control specific phenotypes allows an otherwise clonal population of bacteria to generate a variable population in which only a part is susceptible to a specific challenge, such as recognition of a particular surface protein by the immune system. More complex phase variation systems exist that allow discrete random expression of a succession of surface proteins, for example the recombination mechanisms used to sequentially vary surface proteins in Borrelia spp. (Koomey 1997). Bacterial phase-variation mechanisms have been well studied for many years, most notably in H. influenzae and Neisseria meningitidis (see Henderson et al. 1999; van der Woude and Baumler 2004). This section explores the insight into mechanisms that has been contributed by whole genome sequencing. Two common mechanisms for generating random on/off switches are variation in the length of simple sequence repeats (SSR), and DNA inversion (Figs. 15.5a,b). In this section the use of simple sequence repeats and DNA inversion to generate phase-variation in bacteria are first discussed and then genome sequences are used to show how they make it possible to catalog all of the variable genes in a bacterium, or a bacterial species. 
The section then investigates how the population sampling inherent in genomic sequencing makes it possible to detect variants that arise through rapid genomic change, with two examples: Campylobacter jejuni, which uses SSRs, and Bacteroides fragilis, which uses DNA inversion. Finally, phase-variable restriction-modification systems, and their effect on the interplay between bacteriophages and their bacterial hosts, are discussed.
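The on/off switching behaviour described above can be sketched in a few lines of simulation (a toy model; the switching rate, generation count, and function names are illustrative, not values from the chapter):

```python
import random

def grow_with_phase_variation(n_generations, switch_rate, seed=0):
    """Track the ON fraction in a clonal population whose cells carry a
    heritable, randomly reversible on/off switch (toy model)."""
    rng = random.Random(seed)
    population = [True]  # a single ON founder cell
    for _ in range(n_generations):
        next_gen = []
        for state in population:
            for _ in range(2):  # binary fission: two daughter cells
                # each daughter flips the switch with a small probability
                flipped = (not state) if rng.random() < switch_rate else state
                next_gen.append(flipped)
        population = next_gen
    return sum(population) / len(population)

# Without switching, the clone stays uniform; with switching, a mixed
# population appears even though every cell descends from one founder.
uniform = grow_with_phase_variation(12, 0.0)
mixed = grow_with_phase_variation(12, 0.05)
```

The key property is that the switch is heritable yet reversible, so the population is never permanently committed to either state.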


PATHOGENS: RESISTANCE, VIRULENCE, ETC.





Figure 15.5 Phase variation mechanisms: (a) Variation at simple sequence repeats. A gene containing a G(12) motif (underlined) is shown in the center. Increasing or decreasing the length of the repeat causes a shift in the reading frame leading to a premature stop codon (*). (b) An invertible promoter. The upper line shows the promoter (represented by a circle and arrow) in the ON position, driving transcription of the genes (represented by open arrows). Inversion of the promoter between inverted repeats (bold double-headed arrow) leads to the OFF position, represented on the lower line. (c) Inversion of alternative 3’ partial gene cassettes. The expressed gene is shown on the left of the invertible DNA (bold double-headed arrow) on the top line. Inversion of this DNA (bottom line) leads to the 3’-end of the expressed gene being replaced with an alternative 3’-end (hatched box). (d) Bacteroides fragilis restriction/modification system. The top section shows the expressed gene for the selectivity protein (first open/shaded box on left). Each half of the protein recognizes half of a bipartite DNA sequence for methylation or cleavage. The remaining boxes with different shading represent alternative 3’- and 5’-ends of the gene that can be placed in the expression site by inversion of DNA. The curved lines show all the inversions that can occur. The bottom section shows the eight different selectivity enzymes that can be produced by this system.

Simple sequence repeats

Simple sequence repeats are runs of a single base ((G)n, (C)n, etc.), or runs of two ((AT)n), four ((TCAG)n), or more (often up to seven) bases. Random errors occurring during DNA replication can cause addition or deletion of repeat units, lengthening or shortening the repeats. The repeats themselves are often located just within the 5' end of genes, and therefore increasing or decreasing the length of the repeats will change the translation phase of the encoded protein, leading to either an intact or a frame-shifted gene. While the intact gene can be translated normally, the frame-shifted gene has a premature termination codon that prevents translation of the protein. In this way, random changes in the length of the repeat produce a binary on/off switch in the expression of the encoded protein. More subtle control of the expression of the protein can be achieved if the variable repeat is contained within the promoter region that controls transcription of the gene; varying the distance between elements of the transcriptional promoter alters its strength, and therefore the level of expression of the gene. Uncorrected errors in long repeats of this type commonly occur in around 1 in 10³ cells per generation, leading to occasional variants at each locus arising within the population. The error rate in shorter repeats is much lower, and the repeats generally need to be above a certain threshold length before they will start varying at a biologically significant level.
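As a concrete (and entirely invented) illustration of the frameshift logic, the toy gene below is ON only when its poly-G tract keeps the downstream stop codon in frame; tract lengths one base longer or shorter shift the frame, as in Fig. 15.5a:

```python
# Toy sketch of SSR-based switching. The gene sequence is invented for
# illustration; it is not a real phase-variable gene.
STOPS = {"TAA", "TAG", "TGA"}

def codons_before_stop(seq):
    """Number of codons translated before the first stop codon."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            return i // 3
    return len(seq) // 3

def gene_state(g_run):
    """ON if translation reaches the gene's natural stop codon at the
    end of the sequence; OFF if the shifted frame terminates early
    (or misses the stop entirely)."""
    seq = "ATG" + "G" * g_run + "CTAAGTGATCCGTAA"
    return "ON" if codons_before_stop(seq) == len(seq) // 3 - 1 else "OFF"
```

With this sequence, `gene_state(12)` is ON, while a one-unit slippage to G(11) or G(13) switches the gene OFF, mirroring the binary switch described in the text.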

DNA inversion

A second common mechanism that can control a random on/off switch is site-specific DNA inversion. The paradigm for this control is the expression of one of two flagellar types in S. Typhimurium (Silverman et al. 1979). There are two genes, encoding H1 and H2 flagellins, located at different sites in the chromosome. The H2 gene is encoded downstream of a segment of DNA that can be inverted through the action of a site-specific recombinase (invertase), Hin, on short inverted repeats bounding the DNA segment. The invertible segment carries the hin gene itself and a promoter driving expression out of the invertible segment. Thus, when the segment is in one orientation the H2-flagellin gene is expressed, and a co-transcribed regulator situated after the H2 gene represses the H1 flagellin gene, which is found elsewhere on the chromosome. When the segment is inverted, the promoter no longer drives the expression of the H2 flagellin or the repressor, and the H1 gene is therefore de-repressed. Thus the random inversion of the segment allows a binary switch between the exclusive expression of the two flagellin genes. This switch appears to operate at roughly the same rate as phase variation controlled by SSR, about once per 10³ to 10⁵ cells per generation. DNA inversion is also used by some organisms to vary the gene sequence itself. In some cases, the invertible segment can contain two alternative 3' ends for the same gene, thus encoding two alternative C-termini for the protein encoded by the gene. The inversion of the DNA segment attaches one or other 3' end to the actively transcribed gene (Fig. 15.5c). Such a system is used by bacteriophages to generate alternative tail fiber proteins (van de Putte
et al. 1980). The tail fiber is used to recognize the surface of the host: it promotes entry. Randomly varying the gene in this way allows the phage to produce a population of phage particles with two alternative host-specificity proteins. The strength of this system is that, if one phage manages to infect a new host, it will regenerate both host specificities during growth within that single host. More complex systems exist where up to seven alternative 3’ ends can be attached to one gene through a system of nested invertible sequences. These systems, termed ‘shufflons’ (Sandmeier et al. 1991), are an elegant and compact way to generate host diversity within a small phage genome.
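The Hin-type flagellin switch described above can be sketched as a toy model (the orientation names, rate, and function names are illustrative only):

```python
import random

def expressed_flagellin(orientation):
    """Map the orientation of the invertible segment to the flagellin
    made: the forward segment drives H2 plus a repressor of H1; the
    inverted segment leaves H1 de-repressed."""
    return "H2" if orientation == "forward" else "H1"

def simulate_switching(n_divisions, inversion_rate, seed=1):
    """Invert the segment with a fixed probability at each division and
    record which flagellin the cell line expresses."""
    rng = random.Random(seed)
    orientation = "forward"
    history = []
    for _ in range(n_divisions):
        if rng.random() < inversion_rate:
            orientation = "inverted" if orientation == "forward" else "forward"
        history.append(expressed_flagellin(orientation))
    return history

history = simulate_switching(50, 0.3)
```

Because the two states are mutually exclusive at any moment but interconvert at random, a clonal lineage samples both flagellin types over time.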

Genomic discovery of phase-variable genes

What does genome analysis add to our understanding of phase variation? First, and perhaps most importantly, it allows us to generate a complete catalogue of all the phase-variable genes in a particular genome. Perhaps coincidentally, the first bacterial genome sequenced, that of the pathogen H. influenzae (Fleischmann et al. 1995), contains phase-variable repeats, and the complete catalogue was described very soon after the genome was published (Hood et al. 1996). This catalogue identified twelve genes containing long tetranucleotide repeats, and therefore potentially phase-variable, including genes previously undescribed in H. influenzae. The list contained genes that were homologues of genes in other organisms that were known to be involved in the production of surface structures. Using these data, the authors postulated that, as these genes were phase-variable, they could be involved in virulence in H. influenzae. This hypothesis was subsequently verified with an experiment that knocked out one of the genes, leading to a less virulent strain. This example shows that evidence of phase variation allows the rapid and relatively simple identification of candidate virulence genes in specific pathogens. However, genes that are phase-variable in one strain of a species may not necessarily be phase-variable in another, due to shortening of the SSRs to a length from which they cannot easily expand. Therefore comparisons amongst multiple strains of a species allow a fuller description of the phase-variable
gene repertoire of the whole species. This has been performed to good effect in N. meningitidis (Snyder et al. 2001).

Identification of rapid variation by genomic sampling

The second aspect of phase variation to which genome studies can contribute is perhaps more surprising, and to understand it, we need to understand a little of the mechanism of genome sequencing. The sequence of a genome is not read continuously from a single piece of DNA—even the best sequencing machines are only capable of sequencing around 1 kb at a time, and genome sizes are usually measured in megabases. In addition, single sequencing reads are not highly accurate: they may contain errors. For these reasons, genome sequences are generated from a large number of near-random short sequences (known as reads), with a high level of redundancy to ensure accuracy. It is common for 8–10 individual sequences to contribute to each base in the final sequence. Given this, each individual read is effectively a sample of the state of that sequence in the population of organisms from which the DNA was extracted. Generally, this is not evident: because the organisms are usually grown as a single batch from a single starter organism, and are therefore all clonally related, there should be no variation in the population. However, if diversity-generating mechanisms, such as those described above, exist in the organism, occasional variants in the sample of the population may be captured. At the rates described earlier (around 1 change per 10³ cells per generation) the variants are likely to be rare, unless they occurred very early in the growth of the batch. However, in some organisms, the rates of variation appear to be much more rapid, and many different variants are captured in the shotgun sequences, clearly identifying phase-variable genes directly. Here are two examples.
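Under the simple assumption that reads sample the population independently, the point about coverage can be made with a binomial calculation (the coverage, frequencies, and detection threshold below are illustrative, not the chapter's figures):

```python
# Back-of-envelope model of variant detection in shotgun reads: a
# variant must be reasonably common in the sequenced population before
# it is likely to appear in two or more of ~10 overlapping reads.
from math import comb

def p_variant_seen(coverage, variant_freq, min_reads=2):
    """P(at least `min_reads` of `coverage` reads carry the variant),
    assuming reads are independent draws from the cell population."""
    p_fewer = sum(
        comb(coverage, k) * variant_freq**k * (1 - variant_freq)**(coverage - k)
        for k in range(min_reads)
    )
    return 1 - p_fewer

rare = p_variant_seen(10, 0.001)  # SSR-like rate: essentially invisible
common = p_variant_seen(10, 0.3)  # rapid variation: usually detected
```

This is why only loci varying much faster than the typical SSR rate announce themselves directly in the shotgun data.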

Campylobacter jejuni—simple sequence repeats

Campylobacter jejuni is a commensal of birds that can cause severe gastrointestinal disease if ingested by humans. Its phenotype is highly variable in
culture, with rapid variation of colony morphology and other phenotypes. The first genome sequence of a strain of C. jejuni (11168; Parkhill et al. 2000) identified several regions in the shotgun where sequence variants existed within different reads representing the same part of the chromosomal sequence. All of these variants were differences in the length of homopolymeric tracts, usually poly-C or poly-G. Investigation of these variable regions showed that almost all were located just inside the 5' end of genes, and that many of the genes encoded surface-localized proteins, or proteins involved in the biosynthesis of surface structures (Table 15.2). Although function could be ascribed to many of the genes, several of them were members of two families of unknown function. As with the H. influenzae investigation, because they were phase-variable and were linked to genes involved in the synthesis or modification of surface structures, it was hypothesized that they were also involved in this process. Subsequent work on these proteins showed that they were actually involved in the phase-variable biosynthesis of the flagellum (Karlyshev et al. 2002). A second curiosity in the list of genes was a phase-variable restriction-modification system. The functional import of this will be discussed below. The phase-variable genes in C. jejuni were easily identified from the shotgun because they appeared to vary very rapidly—indeed, faster than was thought to be the case in better-studied systems. The reason for this is not immediately obvious—it is possible that it was an artifact of the growth conditions of the organism. However, several controls were performed to show that it was not an artifact of the sequencing process. An analysis of the gene content of the C.
jejuni chromosome showed that it lacked several previously studied DNA repair genes; DNA mismatch repair is known to be involved in the repair of lesions leading to changes in SSRs, and it is therefore possible that this is the reason for the apparent increase in the rate of change of these sequences in Campylobacter. Subsequent genomic analyses have identified similarly detectable rates of variation in relatives of Campylobacter, such as the pathogen associated with stomach ulcers, Helicobacter pylori (Alm et al. 1999). Very rapid rates of change like this allow the rapid generation of a highly variable population of cells



Table 15.2 Phase-variable genes in Campylobacter jejuni 11168

Repeat      Gene(s)             Putative function
G(8–10)     Cj0031/Cj0032       Probable restriction/modification enzyme
C(9–11)     Cj0045c             Hemerythrin-like putative iron-binding protein
G(9–13)     Cj0046              Pseudogene (transport protein)
G(9, 11)    Cj0170/Cj0171       Unknown, similar to Cj1325/Cj1326
G(8)        Cj0275 (clpX)       Clp protease ATP-binding subunit
G(12)       Cj0565              Non-coding, upstream of pseudogene
G(9)        Cj0617/Cj0618       Unknown, 617 family
T(4–5)      Cj0628/Cj0629       Lipoprotein
G(10)       Cj0628/Cj0629       Lipoprotein
G(9)        Cj0676 (kdpA)       Pseudogene (potassium-transporting ATPase A chain)
G(9–10)     Cj0685c             Possible sugar transferase
G(10–11)    non-coding          Upstream of rRNA
C(8–9)      Cj1139c             Galactosyltransferase
C(8–9)      Cj1144c/Cj1145c     Unknown
G(9)        Cj1295              Unknown
G(9)        Cj1296/Cj1297       Weak similarity to aminoglycoside N3'-acetyltransferases
C(9–10)     Cj1305c             Unknown, 617 family
C(8–9)      Cj1306c             Unknown, 617 family
C(9–10)     Cj1310c             Unknown, 617 family
G(10–11)    Cj1318              Unknown, 1318 family
G(10–11)    5' of Cj1321        Transferase
G(9–10)     Cj1325/Cj1326       Unknown
G(9–10)     Cj1335/Cj1336       Unknown, 1318 family
C(9–10)     Cj1342c             Unknown, 617 family
C(1–2)      Cj1367              Possible nucleotidyltransferase
C(9–10)     Cj1420c             Possible methyltransferase
C(8–10)     Cj1421c             Unknown, similar to putative sugar transferases
C(9–10)     Cj1422c             Unknown, similar to putative sugar transferases
C(10–11)    Cj1426c             Unknown
C(9–10)     Cj1429c             Unknown
C(9)        Cj1437c             Aminotransferase
T(7)        Cj1677/Cj1678       Unknown, similar to Cj0628/Cj0629

From Parkhill et al. (2000).

after clonal expansion from a very small founder population. This is likely to be of benefit to an organism that needs both to repeatedly establish an infection in a rapidly changing environment such as the gut and to avoid the immune system sufficiently to maintain long-term persistence as a commensal. Phase-variation mechanisms allow the generation of the diversity necessary for a commensal to survive repeated cycles of transmission, establishment, and survival. Many of the organisms that display phase-variation mechanisms, while capable of being pathogens, are better described as commensal organisms
adapted to long-term coexistence with a host. This way of generating diversity may be more important for a commensal than for an acute pathogen that has only a transitory association with any individual host.
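The way homopolymeric-tract variants like those in Table 15.2 reveal themselves when reads over one locus are compared can be mimicked with a small sketch (the reads are invented for illustration, not C. jejuni data):

```python
# Group reads covering the same locus by the length of their longest
# poly-G run; more than one length class signals a varying tract.
import re
from collections import Counter

def tract_length_spectrum(reads, base="G"):
    """Count, across reads, the longest run of `base` in each read."""
    spectrum = Counter()
    for read in reads:
        runs = re.findall(base + "+", read)
        spectrum[max((len(r) for r in runs), default=0)] += 1
    return spectrum

# invented reads over a single poly-G locus, as they might appear in a
# shotgun of a phase-varying population: three length states captured
reads = ["ATT" + "G" * n + "CA" for n in (10, 9, 10, 11)]
spectrum = tract_length_spectrum(reads)
```

A real analysis would first anchor reads to the same chromosomal position; this sketch only shows why length heterogeneity among overlapping reads is such a direct signature.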

Bacteroides fragilis—DNA inversion

Another rapidly varying diversity-generating system, utilizing a different mechanism, was identified during the genomic sequencing of B. fragilis, a commensal of the human gut. Although not the most frequently isolated Bacteroides species in the gut, it is responsible
for the largest proportion of opportunistic infections caused by perforation of the gut wall. As with Campylobacter, previous studies had shown it to have a highly variable surface, with three structurally different capsules identifiable by electron microscopy and up to seven different polysaccharides detectable using serotyping antibodies (Patrick et al. 1999). Experiments had shown that these polysaccharides were expressed in a discrete, phase-variable manner, but before the genome sequence project was initiated the mechanism of phase variation was unknown. The genomic shotgun rapidly identified many sequences within the chromosome that could exist in either of two orientations, indicating that not only was inversion common in the genome, it was happening at an extraordinarily rapid rate (Kuwahara et al. 2004; Cerdeno-Tarraga et al. 2005). The invertible sequences fell into several groups defined by the sequence of the short inverted repeats bounding them, each presumably acted on by a common invertase. The commonest invertible regions were short (around 200 bp) and did not appear to encode any genes. However, they were predicted to encode promoters, suggesting that inversion of these was controlling the expression of downstream genes in a phase-variable manner. Downstream of this group of invertible promoters was a set of seven operons, consisting of between eleven and twenty-two genes predicted to be involved in the biosynthesis of surface polysaccharides (three further non-variable operons were identified elsewhere in the chromosome). This immediately suggested a mechanism for the coordinated phase-variable production of entire surface polysaccharides. For each polysaccharide, all the genes required for biosynthesis are co-transcribed, and therefore controlled by a single promoter. Inversion of this promoter means that either all the genes are switched on, when the promoter is orientated toward them, or switched off, when it is orientated away (Fig. 15.5b). 
This mechanism was independently discovered and confirmed for some of the operons (Krinos et al. 2001), and the specific invertase responsible was later identified (Coyne et al. 2003), supporting the hypothesis that all of the invertible promoters controlling surface polysaccharide production were controlled by a single invertase.

This system is remarkable in its breadth and complexity, but it conforms to the understanding of phase-variation discussed above, where the mechanism drives diversity of surface structures to promote immune evasion, colonization, and other host interaction mechanisms. However, the use of phase-variation in B. fragilis goes beyond this system. A second set of invertible promoters, bounded by different inverted repeats, and presumably acted on by a different invertase, was also discovered. Thirteen of these promoters were identified, and most were upstream of a gene encoding a member of a family of outer membrane proteins related to SusC, which is involved in the degradation of starch by Bacteroides. It seems likely that each of these proteins is involved in the degradation of different dietary polysaccharides. Several other inversion-based systems were also found to control the expression of several other similar outer-membrane proteins. It can therefore be hypothesized that B. fragilis is using random phase variation processes to control nutrient utilization systems, rather than the standard regulatory mechanisms for detecting and responding to specific nutrients. What benefit can this have to the organism? As discussed earlier, phase variation can be used by bacteria to respond to rapidly changing or highly heterogeneous environmental challenges. It may be that in the gut, the presence of different dietary polysaccharides is highly variable, and essentially random in time and space. Using phase-variation mechanisms to generate a diverse population heterogeneous in terms of its capability of using environmental nutrients may well be the most efficient response to such a microheterogeneous environment.
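A back-of-envelope enumeration (our arithmetic, not a claim from the chapter) shows how much coordinated surface diversity a bank of independently invertible promoters can generate:

```python
# With n independently invertible operon promoters, each cell displays
# one of 2**n on/off combinations; seven polysaccharide promoters give
# 128 possible surface repertoires in an otherwise clonal population.
from itertools import product

def surface_states(n_promoters):
    """All on/off combinations of n independently invertible promoters."""
    return list(product(("ON", "OFF"), repeat=n_promoters))

states = surface_states(7)  # 2**7 combinations
```

The number of states grows exponentially with the number of switches, which is what makes a handful of invertible promoters such a compact diversity generator.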

Phase-variable restriction-modification systems

The final system to be discussed is a phase-variable restriction-modification system (Fig. 15.5d). Restriction-modification (RM) enzymes recognize specific DNA sequences and either cleave them or methylate them. Previously methylated DNA is protected from cleavage by the cognate restriction enzyme (Murray 2002). These systems are
generally understood to be involved in the protection of bacterial cells from invasion by foreign (usually bacteriophage) DNA. Chromosomal DNA is methylated, and therefore protected from cleavage; foreign DNA is unmethylated and therefore cleaved before it can cause an infection. The system in B. fragilis is a type-I system that uses a single selectivity protein (HsdS) to recognize a bipartite DNA sequence—each half of the protein recognizes one half of the DNA sequence. A complex series of nested inversions (Fig. 15.5d) means that any single B. fragilis cell will be randomly expressing one of eight possible restriction-modification systems, presumably recognizing one of eight different DNA sequences. To understand the possible benefit of such a system, we should consider what would happen if a bacteriophage managed to escape the RM system and productively infected a member of a population of cells expressing a single RM system. Bacteriophage particles produced from this infection would contain modified DNA and would therefore be able to infect easily any other member of the population. However, if the population is a random mixture of cells expressing different systems, phage managing to infect one will not be able to spread through the rest of the population. Thus this system is part of an arms race between phage and host. As seen earlier, phage can use phase-variable inversion systems to broaden their host range. Here it is seen that some hosts can use similar systems to attempt to restrict the spread of the phage. Thus the selective pressures driving the evolution of pathogens are not limited to virulence and host interaction. Other, more mundane, factors such as nutrient acquisition can be equally important,
and because our pathogens are subject to attack by their own parasites, they must themselves adopt strategies to survive this attack.
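The arms-race argument can be made quantitative with a toy model (the assumption of an even, well-mixed distribution of the eight specificities is ours, for illustration):

```python
# A phage whose DNA was methylated while growing in one cell is
# protected only against that cell's RM specificity; cells expressing
# any other specificity cleave the incoming DNA.
import random

def infectable_fraction(population_specs, phage_methylation):
    """Fraction of cells whose RM specificity matches the phage's
    methylation pattern."""
    hits = sum(1 for spec in population_specs if spec == phage_methylation)
    return hits / len(population_specs)

rng = random.Random(0)
# a clone phase-varying uniformly among eight RM specificities
population = [rng.randrange(8) for _ in range(10_000)]
escaped_phage_reach = infectable_fraction(population, phage_methylation=0)
```

In a population expressing a single RM system the escaped phage can reach every cell; in the mixed population it can productively infect only about one cell in eight, so the epidemic stalls.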

Summary

1. Bacteria can acquire genes by horizontal exchange, in addition to vertical inheritance. Over long periods of time, horizontal gene exchange can be the dominant mechanism of adaptation.
2. Bacterial chromosomes can be divided into core and accessory genes. Core genes are more likely to be involved in central processes such as transcription and translation, and are less likely to be horizontally exchanged. Accessory genes are more likely to be involved in niche adaptation (and pathogenicity), and to have been horizontally acquired.
3. Bacteria can evolve very rapidly in response to the availability of new niches. Such responses can sometimes involve genome degradation, as well as gene acquisition.
4. Many pathogenic bacteria encode specialized mechanisms for generating diversity in otherwise clonal populations. These are used to adapt to rapidly or randomly changing environments.
5. Bacteria evolve under many selective pressures, not just the requirements of pathogenicity. Many systems in bacterial genomes evolved as part of an 'arms race' with their own parasites.

Acknowledgments

J.P. thanks the Wellcome Trust and the Sanger Institute for financial support, and all the colleagues whose work has contributed to the analyses described in this chapter.


CHAPTER 16

Emergence of new infectious diseases

Mark Woolhouse and Rustom Antia

Introduction

Infectious diseases are responsible for a quarter of all human deaths. Although most of this burden is currently imposed by just a handful of diseases, notably malaria, AIDS, and tuberculosis, there is a huge diversity of disease-causing agents (i.e., pathogens): over 1400 different species have been reported (Taylor et al. 2001). Additions to this list are still being made, at an average rate of one or two new species per year over the past three decades (Table 16.1); examples that will be considered in more detail later on include human immunodeficiency virus (HIV), severe acute respiratory syndrome (SARS) virus, and Ebola virus. In addition, important new variants of existing species, such as Escherichia coli O157, Vibrio cholerae O139, H5N1 influenza A, and methicillin-resistant Staphylococcus aureus (MRSA), are regularly encountered. The dynamic nature of the infectious disease repertoire is of great concern. Over and above a persistent burden of endemic infectious diseases, human history has been marked by numerous plagues (Diamond 2002). The last century saw an influenza pandemic—Spanish flu—that killed tens of millions of people, and even the last generation has seen a completely new disease, AIDS, cause similarly massive mortality. Will there be others? Answering that question requires a better understanding of the biological processes that contribute to the emergence of new infectious diseases, which is the subject of this chapter. First, however, exactly what is meant by 'emergence' needs to be considered.

The most widely used definition of emerging infectious diseases is 'diseases of infectious origin whose incidence in humans has increased within the past two decades or threatens to increase in the near future' (Centers for Disease Control 1994). There are some problems with this definition: it may depend on a subjective judgment that incidence 'threatens' to increase, and it does not allow that the incidence of many diseases is naturally highly variable in both space and time. Alternative definitions have been proposed, for example, 'an infectious disease whose incidence is increasing following its first introduction into a defined host population or an infectious disease whose incidence is increasing in a defined host population as a result of long term changes in its underlying epidemiology' (Woolhouse and Dye 2001). This definition is intended to better reflect the epidemiological principles underlying disease emergence; the main problem is that it requires reliable information on long-term epidemiological trends that is not always available, especially for rarer or neglected diseases. In practice, the designation of an infectious disease or a pathogen as 'emerging' is decided by individual observers and may be applied in a variety of circumstances: (i) diseases caused by newly evolved pathogen species (e.g., HIV-1); (ii) novel variants of existing pathogens (e.g., MRSA); (iii) previously rare diseases or pathogens now on the increase (e.g., Borrelia burgdorferi, which causes




Table 16.1 List of human pathogen species newly recognized during the period 1975 to 2005

Name                               Disease
Human bocavirus                    Respiratory tract infection
SARS coronavirus                   Severe acute respiratory syndrome
Cryptosporidium hominis            —
Human metapneumovirus              Respiratory tract infection
Nipah virus                        —
Menangle virus                     Menangle virus disease
Laguna Negra virus                 Hantavirus pulmonary syndrome
Andes virus                        Hantavirus pulmonary syndrome
Australian bat lyssavirus          Bat rabies
BSE agent                          Variant Creutzfeldt-Jakob disease
Trachipleistophora hominis         —
Metorchis conjunctus               Trematode infection
Human herpesvirus 8                Kaposi's sarcoma
Hepatitis G virus                  Parenteral non-A, non-B hepatitis
Bagaza virus                       Spondweni fever
Hendra virus                       Acute respiratory disease, encephalitis
Sabia virus                        Sabia hemorrhagic fever (Brazilian)
Sin Nombre virus                   Hantavirus pulmonary syndrome
Encephalitozoon intestinalis       —
Bartonella henselae                Cat scratch disease
Guanarito virus                    Guanarito/Venezuelan hemorrhagic fever
Encephalitozoon hellem             —
Hepatitis C virus                  —
Corynebacterium amycolatum         —
Ehrlichia chaffeensis              Human monocytic ehrlichiosis
Barmah Forest virus                Barmah Forest virus infections
Hepatitis E virus                  Enteric non-A, non-B hepatitis
Human herpesvirus 6                Roseola infantum
Banna virus                        Banna virus disease
Human immunodeficiency virus 2     Acquired immunodeficiency disease syndrome
Rotavirus C                        Rotaviral enteritis
Cyclospora cayetanensis            Chronic diarrhea
Enterocytozoon bieneusi            Chronic diarrhea
Scedosporium prolificans           —
Human immunodeficiency virus 1     Acquired immunodeficiency disease syndrome
Rotavirus B                        Rotaviral enteritis
Helicobacter pylori                Peptic ulcer disease
Human T-lymphotropic virus 2       Myelopathy/tropical spastic paraparesis
Seoul virus                        Hemorrhagic fever with renal syndrome
Borrelia burgdorferi               Lyme borreliosis, Lyme disease
Puumala virus                      Hemorrhagic fever with renal syndrome
Human T-lymphotropic virus 1       Adult T-cell leukemia/lymphoma
Campylobacter jejuni               Enteric diseases
Legionella pneumophila             Legionnaires disease
Hantaan virus                      Hemorrhagic fever
Ebola virus                        Hemorrhagic fever
Cryptosporidium parvum             Acute and chronic diarrhea

Lyme disease), sometimes reflecting improved detection rather than changing epidemiology; (iv) newly recognized etiological agents of known diseases (e.g., Helicobacter pylori, which is associated with peptic ulcers); and (v) once common diseases or pathogens previously thought to be in decline but now making a comeback (e.g., Mycobacterium tuberculosis), often referred to as ‘re-emerging’. Here, the focus is narrower. The goal is to understand how pathogens successfully invade and spread in the human population. The problem is thus an example of a biological invasion (Elton 1958). The point of interest is the difference between pathogens (such as HIV-1) that have successfully invaded and spread dramatically through the human population and pathogens (such as rabies) that have a long history of repeated introductions into humans and yet have not taken off in the same way. The problem is approached in two ways. First, the characteristics of the pathogens associated with emerging diseases are examined to see whether commonalities can be discerned that would help determine which kinds of pathogen are most likely to emerge. Second, the process of emergence is examined in more biological detail to see what is involved in a successful invasion.

Which diseases emerge?

We begin by seeking to identify particular characteristics of newly emerged human pathogens. We do this by first reviewing the diversity of existing pathogens in terms of their taxonomic status and the source of human infections. We then consider
which characteristics are over-represented in the subset of newly emerged pathogens.

Diversity of pathogens

Human infectious diseases are caused by a wide variety of organisms. The total set of over 1400 human pathogen species breaks down into over 200 viruses, over 500 bacteria and rickettsia, over 300 fungi, over 50 protozoa, almost 300 helminths, and at least 2 kinds of prion. Moreover, a species count captures only part of the meaningful diversity of pathogens; within a single species there may be variants with different virulence factors (e.g., verocytotoxigenic E. coli such as E. coli O157), different propensities to infect humans (e.g., Trypanosoma brucei complex), different antibiotic resistance profiles (e.g., S. aureus), or different receptor specificities (e.g., influenza A virus). This diversity of pathogens exhibits a huge diversity of lifestyles. The aspect that will be of most concern here is the source of infections transmitted to humans. It is useful to distinguish three possibilities. First, human infections can be acquired from other humans. For the moment, it is not necessary to distinguish between transmission via direct contact, indirect transmission involving contamination of the environment, or transmission involving vectors—the point to consider is whether the ultimate source of the infection in one human was an infection in another human. Examples of infections exclusively or almost exclusively spread between humans include syphilis and falciparum malaria (which requires a vector). Although these kinds of infection are associated with some of the
world’s major public health problems, they are comparatively rare from a taxonomic perspective; fewer than 100 species are known only as pathogens in humans. Second, human infections can be acquired from animal reservoirs. Again, the route of transmission is not of concern, simply whether the ultimate source of an infection in a human was an infection in a non-human host. These are referred to as zoonoses; examples include rabies, Lyme disease, and West Nile fever. Some care is needed in using the term ‘zoonosis.’ It is usually taken to include only those pathogens which are transmissible to humans (by whatever route) from other species of vertebrate, and excludes human parasites with complex life cycles involving another vertebrate host. Even so, zoonotic pathogens range from those causing sporadic infections of humans (such as rabies virus) to those so highly transmissible between humans that other hosts are largely incidental to the epidemiology of human disease (such as measles virus, dengue fever virus, or Schistosoma haematobium). Here, all of these are referred to as zoonotic. Zoonotic infections (defined in this broad sense) are comparatively common from a taxonomic perspective. Humans share over 800 species of pathogen (around 60% of all known species) with other vertebrates. A range of different vertebrate taxa are involved, but mostly we share our pathogens with other mammals. In terms of species diversity of human pathogens the most important taxa are ungulates and carnivores, followed by rodents, primates, and various others, including bats, cetaceans, and marsupials. Birds too can be sources of zoonotic infections, and human pathogens also occur in reptiles, amphibians, and fish (Woolhouse and Gowtage-Sequeria 2005). Third, human infections can be acquired, by whatever route, from a source in the wider environment and not from another ‘infection’ at all. 
These are referred to as sapronoses: examples include anthrax, tetanus, and many fungal infections such as cryptococcosis. Again, some care is needed with the definition of sapronoses: here, it is not taken to include pathogens transmitted via the faecal–oral route or via a free-living stage of a complex parasite life cycle. Even so, hundreds of human pathogen species are sapronotic. Most

of these are bacteria or fungi, plus some protozoa, and cause sporadic infections of humans; few are highly transmissible (directly or indirectly) between humans, an important exception being V. cholerae. Some human pathogens (e.g., Listeria sp.) are both sapronotic and zoonotic.

There is also a related group of infectious diseases, caused mainly by bacteria and fungi, for which the ‘wider environment’ includes ourselves. These are ‘commensals,’ usually found on the skin, on mucosal surfaces, or in the gut. They are normally benign but can sometimes cause disease, for example if introduced into the bloodstream via a wound, or in association with AIDS or other immunosuppressive conditions. An example is fungal infection by species of Candida.

In summary, relatively few species of human pathogen occur only as human pathogens. The great majority also occur as infections of other animals or in the wider environment. This suggests three ways in which humans might acquire novel pathogens. The first is as a result of the evolution of existing pathogens or commensals. The second is as a result of exposure to novel pathogens from animal reservoirs; this may or may not involve any evolution of the pathogen. The third is as a result of exposure to novel pathogens in the wider environment; again, this need not involve any evolution of the pathogen. The next section will consider which of these may be most important in practice.

Characteristics of emerging pathogens

Given the underlying diversity of human pathogens, we can ask whether the pathogens associated with emerging infectious diseases, particularly those only recently associated with humans, have any special characteristics. The simplest question is whether recently emerged pathogens are a random subset of all human pathogens, and it is immediately apparent that they are not. From a taxonomic perspective, it is striking that the majority of novel pathogens associated with emerging diseases are viruses, especially RNA viruses: examples include HIV, SARS coronavirus, hantaviruses, and filoviruses (Ebola and Marburg). In contrast, viruses represent less than one in seven


of all human pathogen species (see previous section). In part this bias may be due to our improved ability to detect and identify viruses, particularly since the advent of PCR technology. But there may also be biological reasons why RNA viruses should be especially likely to emerge. An obvious feature of RNA viruses is that they have very high nucleotide substitution rates and so the potential to evolve extremely rapidly (Holmes and Rambaut 2004). This implies that they may be especially able to adapt to a new host species and, as is discussed in more detail below (see Role of evolution), this may greatly increase the likelihood that they will successfully invade human populations.

In terms of their lifestyles, the majority of new emerging infectious diseases of humans are zoonotic (Morse 1995; Taylor et al. 2001): examples include SARS, E. coli O157, and Ebola. Even HIV-1 and HIV-2, which are not now regarded as zoonotic, had animal origins. In contrast, very few novel human pathogens are sapronotic. The one robust example is Legionella pneumophila, the bacterium causing Legionnaires’ disease.

The evolution of existing pathogens has been important, but mainly in the very specific context of acquiring resistance to antibiotics and other drugs: examples include MRSA, multidrug-resistant tuberculosis, and chloroquine-resistant malaria. When longer time scales are examined, many important human diseases such as measles, influenza, and smallpox are also, like HIV, likely to have emerged as a consequence of animal pathogens crossing into humans and subsequently largely


or wholly escaping their zoonotic roots (Diamond 2002). In contrast, while pathogens have surely been evolving in humans ever since Homo sapiens first appeared as a separate species, there is little evidence of extensive pathogen diversification within human populations, at least into different species, over that period. Most of the diversity of human pathogens, at the species level, seems to have non-human origins. A possible counterexample is M. tuberculosis, which may have evolved from older human pathogens (Brosch et al. 2002).

Overall, animal reservoirs seem to be by far the most important source of new human pathogens, certainly over ecological time and probably over evolutionary time as well. However, by no means all animal pathogens have appeared in humans, and we need to consider any special characteristics of those that have. One important characteristic is host range. Recently emerged zoonotic pathogens are almost exclusively associated with mammalian reservoirs (Table 16.2), although some (notably H5N1 influenza A virus) also infect birds. However, a wide variety of different mammals are involved, including ungulates, rodents, carnivores, primates, bats, and marsupials. In this respect, emerging zoonoses are similar to established zoonoses; there do not seem to be any firm restrictions on the kinds of mammal from which humans acquire new pathogens. A consistent feature of emerging zoonoses, though, is that they tend to have broad host ranges (Woolhouse and Gowtage-Sequeria 2005). Nipah virus and Hendra virus, as examples of

Table 16.2 Examples of species jumps into humans from non-human hosts

Pathogen                      Original host                Year reported
Ebola virus                   Bats/primates/antelopes      1977
Escherichia coli O157:H7      Cattle                       1982
Borrelia burgdorferi          Rodents/deer                 1982
SIV/HIV-1                     Chimpanzees                  1983
SIV/HIV-2                     Sooty mangabeys              1986
Hendra virus                  Bats/horses                  1994
BSE/vCJD                      Cattle                       1996
Australian bat lyssavirus     Bats                         1996
H5N1 influenza A              Chickens                     1997
Nipah virus                   Bats/pigs                    1999
SARS coronavirus              Bats/palm civets             2003



novel human pathogens, can both infect mammals from several different taxonomic orders. Similarly, the bovine spongiform encephalopathy (BSE) agent, another new pathogen, spreads from cattle to a wide range of other mammal species as well as to humans, in whom it causes variant Creutzfeldt–Jakob disease (vCJD). In other words, it is pathogens already capable of infecting a broad taxonomic range of non-human hosts that are most likely to turn up in humans.

To summarize this section: novel, emerging pathogens are not a random subset of all pathogens but tend to have certain characteristics. There is a taxonomic bias in that most of our recently acquired pathogens are viruses. There is also an ecological bias in that we continue to acquire, and probably always have acquired, many of our pathogens from other species of mammal, and many of those pathogens already have a broad host range.

Disease emergence as a biological process

The biological factors that are (or could be) responsible for the emergence of a new infectious disease in the human population can now be considered. This is done by developing a conceptual framework, the pathogen pyramid, which is used to help understand the characteristics of new emerging diseases described in the previous section. The ability of a pathogen to invade the human population is determined by both the biology and the ecology of the pathogen–host interaction (Antia et al. 2003; Holmes and Rambaut 2004; Woolhouse et al. 2005). As a result, both evolutionary and ecological changes can contribute to the emergence of a new infectious disease. As explained below, it may sometimes be the interplay between ecology and evolution that sets the stage for a novel pathogen to emerge. In this section these ideas are discussed in general terms; in the following section some specific examples are examined.

The pathogen pyramid

The pathogen pyramid illustrates the concept that we may not (yet) have been exposed to all of a large pool of potential human pathogens, that not all of

those we are exposed to will be capable of infecting us, that not all of those can be transmitted by us, and that not all of those will be sufficiently transmissible to cause major epidemics. This is represented as a pyramid of four levels, with diminishing numbers of pathogens reaching each level (Fig. 16.1). The pyramid is used to structure a discussion of the biological processes involved in the emergence of new human pathogens, although the concept can be applied to any set of infectious agents, not just emerging ones (Wolfe et al. 2004).

Level 1: Exposure to the pathogen

The first step in the emergence of a new pathogen is the exposure of humans to that pathogen. This requires contact between humans and the pathogen reservoir (animal or environmental), where the degree of proximity implied by ‘contact’ is determined by the transmission route of the pathogen and may involve no more than contact with fecal material, a bite from an arthropod vector, or drinking contaminated water. Exposure thus reflects aspects of human ecology and behavior as reviewed below (see Role of ecology).

Level 2: Infection of the human host

The second step is the generation of an infection in the exposed human: exposure to a pathogen


Figure 16.1 The pathogen pyramid (following Wolfe et al. 2004). Each level represents a different degree of interaction between pathogens and humans, ranging from exposure through to epidemic spread. Only a fraction of pathogen species at one level is capable of reaching the next level. The emergence of a new pathogen species will be associated with overcoming the barriers to reaching higher levels of the pyramid.


does not always lead to an infection. To illustrate this point for zoonotic exposure, Cleaveland et al. (2001) catalogued over 500 different species of pathogen known to occur in domestic livestock. Of these, about 40% are regarded as zoonotic. Humans are, presumably, routinely exposed to many of the remainder, over 300 species, but these have not proved capable of infecting us. Similarly, for dogs and cats almost 400 pathogen species are known, but 30% of these do not occur in humans. These data suggest that many of the pathogens we are exposed to, even those from other mammals with which we are closely associated, cannot infect humans (although they also indicate that a substantial fraction can).

The inability of a pathogen from one host species to infect a different host species is referred to as the ‘species barrier.’ In some cases the species barrier may be quantitative rather than qualitative, such that the new host must be exposed to a higher dose than the original host for infection to result, as has been shown experimentally for rabies virus (Blancou and Aubert 1997). Several factors can affect the ability of zoonotic pathogens to overcome the species barrier. For example, viruses and other intracellular organisms need to be able to bind specific receptors on the surface of the host’s cells in order to gain entry. This suggests that the use of a phylogenetically conserved host cell receptor might facilitate infecting multiple host species (Woolhouse 2002) and predispose a pathogen to species jumps. This idea could be extended to other molecular interactions between host and pathogen.

An important contributor to the ability of a pathogen to infect humans is variation in human susceptibility. In some cases this variation might have a genetic basis; for example, there is apparently preexisting genetic variation in human susceptibility to HIV (Slatkin and Rannala 2000). More commonly, phenotypic variation in the human population will be important, particularly factors that weaken the host immune system, such as malnutrition, HIV co-infection, or immunosuppressive therapies.

Level 3: Spread in the human population

The next step is for the pathogen to be able to transmit (directly, indirectly, or via a vector) from one human to another. Even if a pathogen can


infect humans it may not be transmissible (by whatever route) between humans. This is the case for many zoonotic pathogens. Of the 800 or more zoonotic pathogen species recognized, only one-third are known to be transmissible within human populations (Taylor et al. 2001). Newly emerged pathogens show a similar trend: the majority are not transmissible or are very poorly transmissible between humans. An example is vCJD: humans cannot generally acquire vCJD from other humans, except possibly via an iatrogenic route. Transmissibility is often functionally related to the extent of pathogen replication as well as to tissue tropisms; that is, whether the pathogen can access locations or tissues within the host that provide it with a means of exit from the infected host, such as the gut, urinogenital tract, respiratory tract, or (particularly where vector-borne transmission is involved) blood.

Level 4: Widespread transmission in the human population

Most emerging pathogens, even if they are able to spread through human populations, are not sufficiently transmissible to cause large epidemics. The final step requires that the pathogen is sufficiently transmissible to spread extensively through the human population without the involvement of the original host. This implies that the basic reproduction number, R0, is greater than 1, i.e., that a single primary case will generate, on average, more than one secondary case (Anderson and May 1991; see also next section). There are two ways in which R0 can increase. The first is as a result of ecological changes: for example, increases in the population density of the host (for directly transmitted diseases), increases in the rate of partner change (for sexually transmitted diseases), or increases in the density of vectors (for vector-borne diseases). The second is by the pathogen evolving to become better adapted to the human host.
Such evolved pathogens may, however, do worse in the reservoir host, leading to specialization of the pathogen in the human population and ultimately to speciation (Woolhouse et al. 2001). We consider the roles of ecology and evolution, at all levels of the pathogen pyramid, in greater detail in the following sections.
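The threshold at R0 = 1 that defines Level 4 can be illustrated with a small stochastic simulation. The sketch below is my own hedged illustration, not from the chapter: each case infects a geometrically distributed number of others with mean R0, the offspring distribution for which the classical probability that a single introduction escapes extinction is 1 − 1/R0.

```python
import random

def chain_takes_off(r0, cap=300, rng=random):
    # One introduction: follow the chain of transmission until it either
    # goes extinct or reaches `cap` cumulative cases (counted as having
    # 'taken off'). Offspring per case ~ geometric with mean r0.
    p = 1.0 / (1.0 + r0)              # success probability; mean failures = r0
    cases = active = 1
    while active and cases < cap:
        new = 0
        for _ in range(active):
            while rng.random() > p:   # count failures before first success
                new += 1
        cases += new
        active = new
    return cases >= cap

def takeoff_fraction(r0, trials=3000):
    # Fraction of single introductions that take off rather than stutter out.
    return sum(chain_takes_off(r0) for _ in range(trials)) / trials

random.seed(42)
print(takeoff_fraction(0.8))  # subcritical: chains stutter to extinction
print(takeoff_fraction(2.0))  # supercritical: roughly 1 - 1/R0 = 0.5
```

With R0 = 0.8 essentially every chain stutters to extinction, while with R0 = 2 roughly half of introductions take off, in line with 1 − 1/R0.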



Role of ecology

Many different drivers have been associated with the emergence of infectious diseases (Institute of Medicine 2003; Woolhouse and Gowtage-Sequeria 2005). The majority of these can be thought of as ecological in nature. Here we define ‘ecology’ very broadly to cover a wide range of different factors. Some of these reflect the way humans interact with each other; others reflect the way humans interact with their wider environment and especially with domestic animals and wildlife. Changes in ecology will be most important at levels 1 and 4 of the pathogen pyramid.

At the base of the pathogen pyramid (Level 1) the concern is with ecological changes that increase exposure to novel pathogens. These include changes that increase the prevalence of the pathogen in the reservoir (whether the reservoir is an animal population or in the wider environment), changes that affect the density of humans in the pathogen’s habitat, and changes that alter the rate of contact between humans and the reservoir. Obvious examples include changes in land use, agricultural practices, the keeping and trading of livestock and pets (especially exotic species), the production and supply of food and water, and climate change.

Ecological changes can also affect the potential for a pathogen to spread within the human population (Level 4). At this level the concern is with the infectiousness of the pathogen in humans and the number of opportunities for transmission from infected to susceptible humans. The latter will be influenced by changes in factors such as urbanization and living conditions, hygiene practices, travel and migration, sexual and social behavior, and hospitalization and medical procedures.
The last of these is of special interest because health care facilities not only provide opportunities for infected individuals to make infectious contacts but also potentially expose to infection a population that is unusually susceptible as a result of immunosuppressive conditions or therapies.

The effects of ecological changes at these two levels are very different. All else being equal, increases in exposure will result in proportional increases in the probability of emergence in a given timeframe. In contrast, changes that affect the transmission of

the pathogen between humans will have a ‘threshold’ effect. This can best be understood if change in transmission is expressed in terms of the basic reproduction number for the pathogen, R0, which represents the average number of secondary cases generated by a single primary case introduced for the first time into a large population of hosts. If R0 is less than 1, each primary case will, on average, fail to replace itself, and even if there are occasional chains of transmission they will stutter to extinction. If R0 is greater than 1, each primary case will, on average, produce more than one secondary case in the initial stages of an outbreak, and the infection will spread through the host population. The expected size of an outbreak (in terms of the fraction of the total population infected) is a function of the number of primary cases and R0 (see Fig. 16.2a) and is particularly sensitive to whether R0 is less than or greater than 1. If R0 is less than 1, the size of the outbreak depends predominantly on the number of primary cases, because chains of transmission in the human population stutter to extinction relatively rapidly. If R0 is greater than 1, the size of the outbreak is largely determined by the magnitude of R0, which describes the extent of human-to-human spread, and by the size of the susceptible population. However, even if R0 is greater than 1, an outbreak is not inevitable, for the infection may die out due to stochastic effects when the number of infected individuals is low (i.e., shortly after introduction) (see Fig. 16.2b).
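The threshold effect just described can be sketched numerically. The following is a minimal, hedged illustration of the two standard formulas quoted in the Fig. 16.2 caption (Kermack–McKendrick final size; May et al. epidemic probability); the function names are my own.

```python
import math

def outbreak_size(r0, i0=0.001, iters=200):
    # Fixed-point iteration of the final-size relation
    #   I_inf = 1 - (1 - I0) * exp(-R0 * I_inf)
    # where I0 is the fraction of primary cases.
    i_inf = i0
    for _ in range(iters):
        i_inf = 1.0 - (1.0 - i0) * math.exp(-r0 * i_inf)
    return i_inf

def p_epidemic(r0):
    # Probability that a single introduced case sparks an epidemic:
    # 1 - 1/R0 when R0 > 1, and zero otherwise.
    return max(0.0, 1.0 - 1.0 / r0)

for r0 in (0.5, 0.9, 1.1, 2.0, 3.0):
    print(f"R0={r0}: final size {outbreak_size(r0):.4f}, "
          f"P(epidemic) {p_epidemic(r0):.2f}")
```

Below R0 = 1 the final size barely exceeds the number of primary cases; just above 1, both the final size and the chance of an epidemic rise steeply.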

Role of evolution

Thus far, the evolution of the pathogen has not been considered. Evolutionary changes in the pathogen can contribute both to the ability of the pathogen to infect exposed humans (Level 2) and to increases in the transmissibility of the pathogen from infected humans (Levels 3 and 4). However, it is not clear when such evolutionary changes are most likely to occur: they may occur before or after the pathogen enters the human population (Holmes and Rambaut 2004), i.e., they may represent pre-adaptation or adaptation or, less formally, emerging pathogens may be either ‘off-the-shelf’ or ‘tailor-made.’ Off-the-shelf pathogens could arise in various ways. Genetic changes in the pathogen in the reservoir


population(s) because of drift could coincidentally result in a pathogen better able to infect, grow in, or transmit between humans. Genetic change because of selection in the reservoir could have the same inadvertent effect. If, as a consequence of these changes, R0 in the human population rises above 1, then a situation arises in which an infectious disease could emerge (see previous section).

Figure 16.2 Population dynamics of infectious disease outbreaks. (a) Expected size of an outbreak (in terms of the fraction of the population that is infected) as a function of the basic reproductive number R0. The lines correspond to increasing numbers of primary cases (I0), with I0 increasing from 0.001 to 0.01. Following Kermack and McKendrick (1927) and Woolhouse (2002), outbreak size (Iinf) is calculated as

Iinf = 1 − (1 − I0) exp(−R0 Iinf)

(b) Probability that the introduction of a single infected individual sparks an epidemic, as a function of the basic reproductive number R0. Following May et al. (2001), for a single infected human introduced when R0 is greater than 1, this probability is calculated as

Pepidemic = 1 − 1/R0

Figure 16.3 Probability that a pathogen that has a basic reproduction number less than 1 (and so is not able to sustain itself in the human population) evolves to have a basic reproduction number greater than 1 (and so has the potential to cause an epidemic). The probability of successful emergence in this way is a function of the basic reproductive number of the introduced strain (R0), the probability μ that one of the subsequent chains of infection undergoes the single evolutionary change assumed to be required to increase R0, and the basic reproductive number of the evolved strain (R0*, here set to 2). Following Antia et al. (2003), this probability is calculated as

Pemergence ≈ [μ R0 / (1 − R0)] (1 − 1/R0*)

with μ increasing from 10⁻⁶ to 10⁻³ per infection.

Tailor-made pathogens could arise when a pathogen with an R0 less than 1 adapts to the human host

during the course of an infection in a human. When this change raises R0 above 1 it allows the pathogen to emerge. In this scenario, the pathogen with an R0 below 1 would, because of stochastic effects, occasionally cause chains of transmission in the human population (as in Fig. 16.2a for R0 < 1). In the absence of any evolutionary change, these chains of transmission would always go to extinction and there could be no epidemic. Evolution of the pathogen during these chains of transmission that drives R0 above 1 would make it possible for the evolved pathogen to cause an epidemic (with the probability shown in Fig. 16.2b). It turns out that the probability of such an evolutionary change occurring is very sensitive both to the rate of genetic change of which the pathogen is capable and to the magnitude of the R0 of the introduced strain, especially as R0 approaches 1 (see Fig. 16.3). This has an important consequence: even if ecological changes alone are not sufficient to raise R0 above 1, a small increase in the R0 of the introduced strain may greatly increase the probability of emergence, simply because it greatly increases the number of opportunities for an evolutionary change that does raise R0 above 1.
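The sensitivity just described can be made concrete with a short sketch of the Antia et al. (2003) approximation (my own hedged implementation; the parameter names are assumptions): the emergence probability is the expected number of onward infections in a stuttering chain, R0/(1 − R0), times the chance μ that one of them yields the adapted strain, times that strain's escape probability 1 − 1/R0*.

```python
def p_emergence(r0, mu, r0_star=2.0):
    # Antia et al. (2003): P ≈ (mu * R0 / (1 - R0)) * (1 - 1/R0*)
    # r0:      basic reproduction number of the introduced strain (< 1)
    # mu:      probability per infection of the single adaptive change
    # r0_star: basic reproduction number of the evolved strain (> 1)
    assert 0 < r0 < 1 < r0_star
    return (mu * r0 / (1.0 - r0)) * (1.0 - 1.0 / r0_star)

for r0 in (0.5, 0.9, 0.99):
    print(f"R0={r0}: P(emergence) = {p_emergence(r0, mu=1e-4):.2e}")
```

Note how steeply the probability climbs as R0 approaches 1: moving the introduced strain from R0 = 0.5 to R0 = 0.99 raises the emergence probability roughly a hundredfold, which is the point made in the text.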



In the longer term further evolution of a novel pathogen is possible. The evolution of pathogen virulence is the subject of Chapters 11, 12, and 17. If control measures, such as antibiotics or vaccines, are eventually implemented, then the pathogen may evolve resistance to these measures, as is discussed in Chapters 10 and 11.

Examples of emerging infectious diseases

In this section five examples of infectious diseases that have recently emerged, or may be in the process of emerging, are briefly reviewed. The examples are HIV/AIDS, influenza, SARS, Ebola, and monkeypox. All are viral infections that have jumped, or may yet jump, into the human population from an animal reservoir. All have high or very high case fatality rates, ranging from up to 10% for monkeypox to up to 90% for Ebola. However, these viruses show very different levels of transmissibility within the human population, with R0 ranging from (currently) near zero (e.g., H5N1 influenza A) to very much greater than 1 (e.g., group M HIV-1).

As set out below, it is generally far from clear whether these differences reflect evolutionary adaptation, or the lack of it, to human hosts. Nor is it clear whether key genetic changes occurred before or after the jump into humans, i.e., within the animal reservoir or within the human population. However, the role of ecological factors in providing opportunities for the pathogens to infect humans and in increasing the potential for spread within the human population is a consistent theme. In some cases these ecological changes alone may be sufficient to allow the emergence of a new pathogen. It is also possible, as described in the previous section, that ecological changes matter not only through their direct effects on the basic reproduction number, R0, but also by leading to larger outbreaks and so providing additional opportunities for the pathogen to evolve in the new human host. The five examples below illustrate this interplay between ecology and evolution.

HIV/AIDS

The classic example of an emerging infectious disease is HIV/AIDS. HIV/AIDS has killed approximately 25 million people, and an estimated 40 million are currently infected. Yet it was only recognized as a public health problem as recently as the early 1980s, and the etiological agent, a lentivirus, probably evolved within the past 100 years. HIV-1 is thought to have originated as a species jump by the chimpanzee variant of simian immunodeficiency virus (SIVcpz) into humans in west-central Africa (see Chapter 13). Initial infection is probably from infected tissues via cuts and abrasions, but the virus is now spread by a combination of sexual transmission, vertical transmission, and blood-based routes. Numerous factors have contributed to the scale of the AIDS pandemic, including patterns of sexual behavior and intravenous drug use.

The jump from chimpanzees is now believed to have occurred several times (Keele et al. 2006), resulting in the appearance of both pandemic HIV strains (HIV-1 group M) and non-pandemic strains (HIV-1 groups N and O). Since group N in particular has very limited transmissibility between humans, it seems likely that human exposure to SIVcpz occurs frequently, most likely through the handling of bush meat. HIV-2 appears to have a similar history; here the source is thought to be SIVsmm, which occurs naturally in sooty mangabey monkeys.

Both HIV-1 and HIV-2 have undergone considerable phylogenetic divergence from their SIV relatives, but we do not know whether the high transmissibility of the epidemic and pandemic strains (R0 > 1) is a consequence of evolution following their introduction. However, even without any evolution of the pathogen, an increase in R0 is likely to have been associated with the movement of the virus from remote communities in Cameroon and the Democratic Republic of Congo to major urban centers with higher rates of sexual partner change and other high-risk activities. In terms of the pathogen pyramid, some (perhaps most) HIV lineages are confined to levels 2 and 3, but others have reached level 4, with devastating consequences.

Influenza

Another classic emerging disease problem is that of novel influenza A viruses, exemplified by current concerns about influenza A subtype H5N1.


H5N1 was first reported in humans in 1997 and is highly pathogenic. However, it has not spread widely in the human population; to date there have been no more than a few hundred cases reported worldwide, almost all involving contact with, and presumably transmission from, infected poultry. In contrast, other influenza A viruses have historically caused global pandemics resulting in millions of deaths. The barriers that have so far prevented H5N1 from spreading effectively through the human population are not fully understood, so it cannot be determined whether the virus is likely to be able to overcome those barriers, achieve the R0 values of 2 to 3 typically associated with pandemic influenza, and successfully ‘emerge’ (Kuiken et al. 2006). Evaluating the likelihood of emergence of H5N1 will require a better understanding of the evolution of the virus both in humans and in its avian reservoir, particularly the evolution of viral affinity for specific cell-surface receptors (such as sialic acids) and of tissue tropisms within the respiratory tract. At present H5N1 is at around level 2 of the pathogen pyramid, but other strains of influenza A have reached level 4.

SARS

SARS is an example of a recently emerged infectious disease that rapid intervention managed to control. SARS is a coronavirus infection, and it emerged into the human population as a zoonosis, probably via the palm civet, which is traded as a food animal in parts of Asia. In humans the SARS virus causes a respiratory infection; it spreads both in the community and, very effectively, within health care facilities. The disease spread rapidly to several locations in east and southeast Asia and, by airline travel, to distant locations including North America. Interestingly, SARS viruses in palm civets differ genetically from those isolated from human infections (Chinese SARS Molecular Epidemiology Consortium 2004). However, it remains unclear how these changes affect the dynamics of human infections and transmission, and whether they originated in palm civets or another animal reservoir or occurred after the introduction of the virus into the human population. Either way, during the


early stages of the 2003 outbreak R0 for SARS was estimated at around 3 (Lipsitch et al. 2003), comparable with values estimated for pandemic influenza. Rapid, effective control measures restricted the outbreak to fewer than 10 000 cases worldwide and prevented what could otherwise have been a major epidemic (Stohr 2003). SARS appears to have climbed very quickly to level 4 of the pathogen pyramid. It is not known whether exposure to SARS coronavirus had occurred previously, but a combination of a flourishing bush meat trade and high rates of travel were important drivers of the 2003 outbreak.

Ebola

Outbreaks of hemorrhagic disease due to Ebola virus infection were first reported in Sudan and the Democratic Republic of Congo in 1976, and other outbreaks caused by related viruses (including Marburg virus) have been reported since. The virus is thought to have an animal reservoir, probably not a primate, possibly rodents or bats (Dobson 2005). There have been around 20 reported outbreaks, the largest comprising hundreds of cases with several generations of human-to-human transmission. Several of these have involved extensive spread in health care facilities. Initial R0 values, prior to control measures being implemented, exceeded 1 for the larger outbreaks (Chowell et al. 2004). So far, these extended chains of transmission do not seem to have resulted in virus evolution, nor have they allowed the virus to escape ecologically from settings in remote communities and hospitals. Thus, Ebola currently lies between levels 3 and 4 of the pathogen pyramid. It is not clear whether the Ebola viruses have a long history of occasional outbreaks in humans, or whether outbreaks are becoming more frequent, presumably as a result of human encroachment into the pathogen's natural habitat.

Monkeypox

Another interesting situation relates to pathogens that have been afforded new opportunities to spread within humans as an incidental result of the control of related species. An example of such a case is monkeypox, which is closely related to



smallpox, the first human pathogen to be eradicated. Occasional cross-species transmission of monkeypox into the human population (mainly in Central and West Africa) from animal reservoirs (which may be squirrels, rodents, or other species) can result in short chains of transmission. It is expected that these chains of transmission will increase in length as herd immunity to smallpox wanes because of the absence of natural infections and the discontinuation of smallpox vaccination programs, both of which provide substantial cross immunity to monkeypox. There are indications that this is already happening, resulting in larger outbreaks (Heymann et al. 1998), and that monkeypox is climbing from level 2 to level 3 of the pathogen pyramid. While R0 for monkeypox in unvaccinated human populations remains below 1, larger and more frequent outbreaks increase the likelihood of higher transmission rates evolving (Antia et al. 2003).

Practical implications of disease emergence

Emerging infectious diseases present a difficult challenge. Although most emerging pathogens cause only localized and, on a global scale, relatively minor disease problems, they tend to receive disproportionate attention. One reason for this is that when a new disease first appears, it often cannot immediately be determined whether it will become the next great plague. Such concerns may implicitly reflect the knowledge that pathogens sometimes evolve very rapidly, so that a pathogen that is only a minor public health concern today might become something much more serious tomorrow. Current interest in H5N1 influenza A is a good illustration of this kind of thinking. Rather than having to wait and see, it would be helpful to be able to predict which pathogens are most likely to emerge, and which of those are most likely to cause major problems.

Predicting pathogen emergence

The first point to make under this heading is that any attempt at rational prediction of pathogen emergence can be undermined by ‘out of the blue’

events such as the totally unanticipated appearance of BSE and vCJD in the UK in the 1980s and 1990s. Even so, recent experience suggests several characteristics of the kinds of pathogen most likely to be found emerging in human populations in the immediate future.

Firstly, most newly emerged pathogens are viruses, especially RNA viruses. Most new pathogens that are not viruses (and some of the viruses too) are more likely to be newly discovered than to be truly novel causes of human disease (see Table 16.1). However, there are exceptions: for example, the bacterium E. coli O157 (not a new species but a distinct new serotype) and the agent of vCJD.

Secondly, humans acquire most of their new pathogens from animal reservoirs. Here too there are exceptions, such as the bacterium L. pneumophila (see Characteristics of emerging pathogens), but these are rare. There is a particular propensity for pathogens with a very broad host range to make the jump into humans (Woolhouse and Gowtage-Sequeria 2005). This implies that it is important to understand the biological characteristics that predispose pathogens to crossing species barriers and infecting human hosts.

Thirdly, emerging pathogens frequently occur in association with changes in human ecology, especially changes in the nature and intensity of human interactions with animals, both domestic and wild. This suggests that activities such as trade in exotic pet species or farming livestock in new areas increase the likelihood of new diseases emerging.

Finally, most of the new pathogens acquired from animals or other sources are not highly transmissible within human populations. Of particular concern are the exceptions to this rule. This implies that it is important to understand the biological determinants of transmissibility. Beyond this, it needs to be recognized that pathogen transmissibility is as much a function of human ecology and behavior as it is of the biology of the pathogen.
Therefore a better understanding of how factors such as urbanization, travel, hospitalization, sexual behavior, or water supply affect the potential transmissibility of specific pathogens or kinds of pathogen is needed.



Given the taxonomic and ecological diversity of emerging pathogens, it is probably not realistic to anticipate precise predictions about which pathogens will emerge in the immediate future. However, as the kinds of changes in human ecology that have resulted in pathogen emergence over the past few decades look set to continue, it can at least be predicted with some confidence that emerging disease problems will also continue. The overall impression is that emerging pathogens are likely to be those best able to take advantage of new epidemiological opportunities presented by changes in human ecology and behavior and the wider environment. Reflecting this, emerging pathogens have been likened to weeds (Dobson and Foufopoulos 2001), which seems a helpful analogy.

Thus far only the ability of a new pathogen to emerge and cause an epidemic in the human population has been considered; of equal interest is the virulence of the newly emerged pathogen. The level of virulence to which pathogens are expected to evolve is thought to be determined largely by the trade-off between virulence and transmissibility, a topic considered in greater detail in Chapters 12 and 17. Understanding this trade-off will help us understand the virulence of newly emerging pathogens. A starting point is the observation that high virulence is a feature of many, though not all, novel human pathogens, including HIV, avian influenza, SARS, Ebola, and vCJD.

Public health response

The most important line of defense against emerging pathogens is effective surveillance allowing the early detection of unusual disease outbreaks as a precursor to implementing prevention and control measures. The importance of early detection cannot be overstated: even small delays may allow an infectious disease to spread to an extent where it becomes extremely difficult, if not impossible, to intervene effectively (Ferguson et al. 2006). However, the practical challenge is formidable: seeking to detect small clusters of cases of novel diseases against a huge and diverse background of endemic infectious diseases, often in regions with limited human resources and physical capacity to undertake surveillance and diagnosis. Various organizations have suggested ways forward (Institute of Medicine 2003; Office of Science and Innovation 2006) and the success of the global effort to combat SARS provides some encouragement (Stohr 2003).

Consideration of the ecological and evolutionary origins of emerging pathogens should help with this endeavor. For example, the importance of animal reservoirs for the epidemiology of emerging human pathogens suggests that surveillance could usefully be extended into animal populations, requiring an integrated effort cutting across public health and animal health agencies. It would also be useful to have a much more complete picture of the diversity of pathogens present in domestic animals and wildlife, some of which may represent a threat to public health in the future. It may also be possible to identify risk factors for exposure to novel pathogens and for subsequent spread in the human population, which would allow detection and surveillance efforts to be most efficiently targeted.

One of the key messages from this chapter is that the need to respond to the threat of novel human pathogens is likely to remain for some time. The quota of human pathogens was not determined at a distant point in our evolutionary, or even historical, past. Rather, it is still changing. Humans continue to acquire new pathogens at a very high rate and this is likely to remain the case for the foreseeable future.

1. Infectious diseases are caused by a wide variety of organisms (over 1400 documented human pathogen species) and these organisms exhibit a considerable diversity of lifestyles.
2. Almost all novel pathogens associated with newly emerging diseases come from animal reservoirs. From a taxonomic perspective, the majority are viruses, especially RNA viruses.
3. The emergence of a new infectious disease in the human population involves exposure to the pathogen, successful infection of the hosts, and sufficient transmission between hosts to raise the basic reproductive number, R0, above 1. These different levels of emergence make up the ‘pathogen pyramid.’



4. Both ecological and evolutionary changes, sometimes acting synergistically, can affect a pathogen’s position on the pyramid.
5. HIV/AIDS, influenza (H5N1), SARS, Ebola, and monkeypox are examples of emerging diseases. From the available data it is hard to tell whether ecological changes alone caused these diseases to emerge or whether evolutionary changes were sometimes involved as well.
6. Looking at past trends provides some guidelines as to which kinds of pathogen are most likely to emerge in the future and may help to target surveillance for newly emerging diseases.

CHAPTER 17

Evolution of parasites Jacob C. Koella and Paul Turner

Virulence and transmission in public health and evolution

Parasites remain major causes of mortality and morbidity, particularly in underdeveloped regions of the world, where they are responsible for about two-thirds of all deaths (World Health Organization 2003). For public health, our priority is to reduce the total burden of infection, which is a function of two factors: the rate of transmission and the severity of symptoms (i.e., virulence). These two factors are not only important for quantifying our ideas about public health; they are also at the center of most modern ideas about the evolution of parasites.

The simplest model of the evolution of parasite virulence assumes a strong connection between a parasite’s transmission and its virulence (which is generally defined as the decrease in host evolutionary fitness due to parasite infection). Consider a parasite that replicates to a relatively high density within its host. Although the parasite’s transmission success may increase with the number of within-host progeny, the more parasites there are, the more they damage the host, and the sooner the host dies and thus disappears as suitable habitat. The model further assumes that evolution leads to a parasite with levels of the two parameters that maximize total transmission during the parasite’s infectious period. In many cases, selection is predicted to favor the evolution of intermediate levels of both parameters: low virulence is associated with a low rate of transmission that enables only a few transmission events, whereas high virulence causes host death before the parasite can be transmitted many times. Intermediate transmission rates in combination with intermediate rates of host mortality often maximize the number of transmission events for a parasite within a single host.
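The verbal argument above corresponds to a standard optimization, which can be sketched numerically. In this minimal toy model (all functional forms and parameter values are illustrative assumptions, not taken from the chapter), the transmission rate saturates with virulence, beta(v) = b·v/(v + k), while the expected infectious period is 1/(mu + v), where mu is background host mortality and v the parasite-induced mortality. Lifetime transmission success beta(v)/(mu + v) then peaks at an intermediate virulence; calculus gives v* = sqrt(k·mu) for this particular form.

```python
import numpy as np

def lifetime_transmission(v, b=10.0, k=2.0, mu=1.0):
    """R0-like measure: transmission rate beta(v) times the expected
    infectious period 1/(mu + v). Parameters are purely illustrative."""
    beta = b * v / (v + k)          # saturating virulence-transmission trade-off
    return beta / (mu + v)          # shorter infectious period at high virulence

v = np.linspace(1e-3, 20.0, 200_001)
v_star = v[np.argmax(lifetime_transmission(v))]
print(round(v_star, 2), round(np.sqrt(2.0 * 1.0), 2))  # numeric optimum vs sqrt(k*mu)
```

Both very low virulence (little transmission) and very high virulence (rapid host death) yield low lifetime transmission, reproducing the intermediate optimum described in the text.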

Chapters 10–16 include aspects and variations of this simple idea; they describe current research on the evolution of virulence, potential evolutionary consequences of control, the evolutionary history of the parasites, and the emergence of new diseases. Here we link these chapters more closely to the basic theory of parasite evolution and introduce features of parasites whose incorporation into models of parasite evolution increases our predictive power. We focus on five main questions: (i) How can ideas about the evolution of virulence be used in control programs? (ii) What are the potential pitfalls in definitions of virulence? (iii) How can a deeper understanding of the virulence–transmission trade-off help to analyze the evolution of virulence? (iv) Are there situations where the trade-off model breaks down? (v) Can molecular approaches give us a better understanding of the constraints and trade-offs underlying virulence?

The evolution of virulence in control programs

General considerations

Using the assumption of a trade-off between virulence and transmission rate, evolutionary biologists have started to discuss virulence management in the context of models of the evolution of virulence (Dieckmann et al. 2005). Their main questions are: Can control programs that promote the evolution of lower virulence be designed? Or will certain kinds of control programs purposefully reduce the proportion of infected individuals in the population,



but inadvertently select for increased parasite virulence such that total deaths go up? One influential model combines the idea of virulence evolution sketched above with the possibility that the host evolves defense mechanisms against the parasite (van Baalen 1998). Despite the simple structure of the model, the interactions between epidemiology and coevolution lead to complex dynamics that give rise to two alternative evolutionary pathways. In one pathway, the evolution of more virulent parasites increases the benefit to hosts of defending themselves more strongly against a specialist parasite, despite the evolutionary cost of the defense mechanisms. Because parasites coevolve with their hosts, this can in turn trigger selection for a further increase of virulence. At equilibrium, hosts pay heavily to defend themselves against a rare but extremely virulent parasite. In the second evolutionary pathway, the hosts tolerate a widespread parasite that remains relatively benign. Human health care managers may thus be confronted with the ethical dilemma posed by a control program that changes the parasite from a common, mild pathogen to a rare, serious one. The following sections discuss potential evolutionary consequences of concrete control programs, such as vaccine and drug administration.

Vaccines

Vaccination will select for the invasion of escape mutants, strains that can avoid the immune response stimulated by the vaccine (McLean 2002). Even more worrying is a serious potential evolutionary consequence of vaccines that reduce but do not completely block infection (Gandon et al. 2001, 2003). As described in Chapter 11, while vaccines that reduce the probability of infection are predicted to select against virulence, those that reduce the growth of a parasite within its host reduce the cost of virulence by lowering the probability that the host will die. The vaccines can therefore select for increased parasite virulence. Although such vaccines will decrease the proportion of infected individuals in a population, they have the potential to increase the number of deaths in the long term.
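This logic can be illustrated with a toy model (our own construction for illustration, not the published Gandon et al. model). Suppose transmission saturates with intrinsic virulence, beta(v) = b·v/(v + k), and lifetime transmission is beta(v) divided by the host death rate during infection. If an anti-growth vaccine of efficacy r reduces the parasite-induced mortality that the host actually experiences to (1 − r)·v, the virulence maximizing beta(v)/(mu + (1 − r)·v) shifts upward, to sqrt(k·mu/(1 − r)): blunting the cost of virulence selects for more of it.

```python
import numpy as np

def optimal_virulence(r, b=10.0, k=2.0, mu=1.0):
    """Virulence maximizing beta(v) / (mu + (1 - r) * v), where an
    anti-growth vaccine of efficacy r damps the host mortality caused
    by intrinsic virulence v. All parameter values are illustrative."""
    v = np.linspace(1e-3, 50.0, 500_001)
    beta = b * v / (v + k)                 # saturating trade-off
    fitness = beta / (mu + (1.0 - r) * v)  # vaccine reduces the cost of virulence
    return v[np.argmax(fitness)]

for r in (0.0, 0.5, 0.9):
    print(r, round(optimal_virulence(r), 2))  # rises as sqrt(k*mu/(1-r))
```

In this sketch the parasite strain favored in vaccinated hosts is more virulent than the one favored in unvaccinated hosts, mirroring the long-term risk described in the text.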

Drug treatment

Widespread use of antimicrobial drugs has led to the emergence and spread of resistant strains of many parasites, particularly bacteria resistant to most of the commonly used antibiotics (Chapter 10). Indeed, these observations provide textbook examples of ‘evolution in action,’ and of the ability of selective pressures to change the proportion of resistant parasites in a population. Less attention has been paid to the possibility that as resistance spreads, it can affect other parasite traits. Might the spread of resistance drive the evolution of virulence? It could do so if the trade-off between transmission and virulence differed for sensitive and resistant parasites. This appears to be the case for malaria parasites, where resistance to chloroquine (until recently the most widely available antimalarial drug) slows the growth of replication stages (which should decrease virulence), affects the timing of the production of transmission stages in vitro (Koella 1994), and enhances transmission to mosquitoes in vivo (Ramkaran and Peters 1969; Wilkinson et al. 1976; Sucharit et al. 1977; Ichimori et al. 1990). Thus, resistance changes the association between virulence and transmission by enabling less virulent parasites to transmit more. The evolutionary model of virulence therefore predicts that the stable level of virulence should change. Unfortunately, no data are available that would make it possible to follow the level of virulence as resistance to chloroquine spreads through populations of malaria parasites.

While the malaria example demonstrates possible effects of resistance that make a bad situation worse, evolutionary ideas could also help manage antibiotic use to favor the desired outcome of lower virulence. One possibility is to design antibiotics that target bacterial traits important for virulence, especially structural appendages on bacteria that serve as ‘virulence factors’ (traits that promote pathogenicity).
Many of these virulence factors are surface-exposed cellular components necessary for host infection, disease progression, vector transmission, or toxin export. These structural appendages are often used by bacteriophages (bacteria-specific viruses) as their attachment sites to initiate infection. One intriguing idea is therefore


to use phage-therapy to treat infected humans, employing phages that target virulence factors (Smith and Huggins 1982; Levin and Bull 2004). Although the evolution of bacterial resistance to the phage may be inevitable, it would lead to bacteria that escape viral attack by losing the virulence factor and are therefore less pathogenic. Thus, the method would provide a rational therapy design to select for decreased virulence in the bacterial pathogen.

The problem of virulence

While such ideas are enticing, several problems have made their practical use difficult, leading some scientists to doubt their relevance (Ebert and Bull 2003), as discussed in Chapter 12. One problem is that the concept of virulence remains problematic in evolutionary biology, in contrast to its use in public health, where it is possible in principle, and often in practice, to measure the burdens imposed by most diseases. Many empirical studies in evolutionary biology use this public health view, defining virulence as any reduction in the host’s reproductive fitness (e.g., reduced life span or fertility). Evolutionary theory suggests, however, that for many parasites a more relevant measure of virulence is a decrease in the period of infection caused, for example, by an increased rate of mortality during infection. For the parasite’s evolution it is usually irrelevant whether, say, an infected host cannot find a mate or has fewer offspring than a healthy one. While there is some overlap between the evolutionary, parasite-centered and the public health, host-centered views of virulence, the two are not identical; tests of evolutionary predictions should use the appropriate measure of fitness.

A second problem is that because the rate of mortality is difficult to measure, many public health and evolutionary studies use a simple measure of mortality: case fatality rate. This measure, however, combines the host’s rates of mortality and of recovery. The importance of the distinction between virulence (mortality rate) and case fatality rate is easily understood with the examples of HIV-1 and Ebola virus. While the case fatality rate of untreated AIDS patients is essentially


100% and that of Ebola hemorrhagic fever patients is only slightly lower (50–90%), the mortality rate of HIV-infected patients is only about 0.1 per year (assuming a life expectancy after infection of about 10 years), whereas the mortality rate of Ebola fever patients is about 100-fold higher. This distinction generates very different evolutionary consequences for the two parasites that are not recognized if case fatality rate is used as the measure of virulence.

Third, although most evolutionary ideas on virulence assume that it is determined exclusively by the parasite, the host can contribute to virulence. This contribution is often called tolerance. Thus, virulence and tolerance are two names for the same effect, the parasite’s damage to the host: one emphasizes the role of the parasite and the other the role of the host. In some cases one can get away with ignoring one side of the interaction because variation in that source does not have much influence on the outcome, but in many cases all the critical information is in the interaction itself, because variation in both partners contributes significantly to the outcome. The classical example is malaria, where the sickle cell trait influences the severity of the disease. In addition, the symptoms of malaria can be a consequence of the host’s immune response rather than the parasite’s direct pathogenicity. Some of the most lethal symptoms occur in cerebral malaria. While these symptoms are due in part to the parasite’s direct effect of obstructing capillaries in the brain, they are also partly mediated by the human immune system. The parasite’s antigens activate platelets that stimulate the host’s local inflammatory response, disrupting brain microvasculature and resulting in hemorrhaging into the brain (van der Heyde et al. 2006). Other symptoms of severe malaria also appear to result from by-products of the immune response.
Thus, once T-cells are primed by a first infection, reinfection can make them produce much higher levels of interferon, which (in combination with malarial molecules) increases the production of the cytokine TNF-α by macrophages. This leads to increased risk of severe malaria (Riley 2002). Similarly, in HIV-positive individuals viral load is not a reliable indicator of CD4+ T-cell counts, suggesting that indirect



effects of the virus on the immune system and disease progression should receive greater attention (Rodríguez et al. 2006). In such cases it must be asked: to what extent can evolution change the parasite’s contribution to virulence or, in other words, how much of the observed virulence is due to the parasite’s rather than the host’s genes? Indeed, if virulence is determined by the interaction of the two partners’ genotypes, evolutionary predictions change qualitatively (Restif and Koella 2003). In traditional models, for example, increasing external (i.e., parasite-independent) host mortality rate selects for increased parasite virulence, for the parasite must exploit its host more efficiently to ensure its transmission before it dies alongside its host. If virulence is determined by an interaction between the host and the parasite genes, virulence decreases under many conditions with increasing external mortality (Restif and Koella 2003).

Fourth, virulence is generally context-dependent. Again, consider malaria. The mortality due to malaria tends to decrease with the number of infections an individual has experienced because of the cumulative development of immunity. Therefore, a population in an endemic area is composed of individuals with high and with low tolerance to malaria infection. In which individuals should virulence be measured? For the evolution of the parasite, it would probably be most meaningful to measure the average virulence (weighted by proportions of individuals with different levels of tolerance), a formidable task. Another aspect of context-dependence is that virulence can depend on the environmental conditions in which the host finds itself. Virulence can, for example, depend on the food conditions for microsporidian or malaria parasites in mosquitoes (Bedhomme et al. 2004; Lambrechts et al. 2006) or on the prevalence of the parasite in the population (Bedhomme et al. 2005).

The problem of the trade-off

Because there are few quantitative data on the virulence–transmission trade-off, the evolutionary approach, like other epidemiological approaches, makes ad hoc assumptions about its shape. Yet it is known that its shape is critical for predictions about

virulence: whether it is linear, curves upward, or curves downward will change the evolutionary outcome from a benign parasite like a common cold virus to a rapid killer like Ebola virus. Thus, using the wrong shape of the trade-off can seriously confound interpretations; its origin and nature need to be understood. One way to increase that understanding is to analyze the evolutionary pressures on the dynamics of a parasite within its host. As evolution tends to maximize a parasite’s transmission, models of a parasite’s within-host dynamics can be used to predict the trade-off between virulence and rate of transmission (Antia et al. 1994; Alizon and van Baalen 2005). This approach has recently been used to show that the evolutionary consequences of leaky vaccines mentioned earlier (Gandon et al. 2001, 2003) depend strongly on assumptions about the virulence–transmission trade-off. While the ad hoc assumptions in the earlier models predict the evolution of increased virulence only for vaccines that slow the growth of the parasite, a model that derives the trade-off from the details of within-host growth suggests that increased virulence evolves both for anti-growth vaccines and for anti-transmission vaccines (Ganusov and Antia 2006).

A more direct approach would be, of course, to measure the trade-off. Unfortunately, it is far from simple to measure what is evolutionarily relevant: the genetic basis of the trade-off. Indeed, the best-known study of a virulence–transmission trade-off—the case of myxomatosis in rabbits—relies on phenotypic correlations, not genetic details (Mead-Briggs and Vaughan 1975). Studies on the genetic correlation between virulence and transmission remain rare (exceptions are Kover and Clay 1998; Mackinnon and Read 1999).

More seriously, focusing on the virulence–transmission trade-off may take too narrow a view of the evolution of parasites (Ebert and Bull 2003).
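The sensitivity of the predicted outcome to the shape of the trade-off is easy to demonstrate numerically. In this illustrative sketch (functional forms and parameter values are our assumptions, not data), a transmission rate that curves downward (saturates) with virulence gives an interior optimum at intermediate virulence, whereas a linear trade-off makes lifetime transmission increase without limit, so selection pushes virulence toward its maximum.

```python
import numpy as np

mu = 1.0                                  # background host mortality (illustrative)
v = np.linspace(1e-3, 50.0, 500_001)      # candidate virulence levels

saturating = (10.0 * v / (v + 2.0)) / (mu + v)  # beta(v) curves downward
linear = (1.0 * v) / (mu + v)                   # beta(v) proportional to v

print(v[np.argmax(saturating)])   # interior optimum: an intermediate, 'benign' level
print(v[np.argmax(linear)])       # optimum sits at the upper end of the range
```

The same epidemiological machinery thus predicts either a relatively benign parasite or a rapid killer, depending solely on the assumed curvature of beta(v).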
Virulence may evolve through subtle and complex trade-offs with many other traits that determine the parasite’s fitness, including the rate at which the host recovers (Williams et al. 1990), the ability of the parasite to manipulate the host’s behavior (reviewed in Moore 2002), and the age of the host (Kysela and Turner unpublished).

The trade-offs involved in parasite manipulation of host behavior are exemplified in the interaction between malaria and its mosquito vector. On the one hand, the parasite’s infectious stage stimulates the mosquito’s biting (Koella et al. 1998, 2002; Anderson et al. 1999), which increases its own transmission and, as biting is risky, the mosquito’s mortality (Anderson et al. 2000). Thus, at this stage, virulence (increased mortality of the host) and rate of transmission are positively associated and form the classical trade-off for the parasite between persistence of the host habitat and ability to leave it. On the other hand, the parasite’s developmental stage in the mosquito that cannot be transmitted to humans decreases the mosquito’s biting rate (Anderson et al. 1999; Koella et al. 2002), which increases the mosquito’s survival and allows more parasite individuals to complete their development. This latter manipulation indirectly increases the rate of transmission by elevating parasite densities in the host while decreasing virulence. While this manipulation may reduce the mosquito’s fitness, e.g., via decreased fecundity because of decreased opportunities for blood feeding, these effects are unlikely to affect the parasite’s fitness. (Note again that we define virulence as a trait that affects the parasite’s fitness, which is not always identical to detrimental effects on the host’s fitness.)

The age of the host can influence the parasite’s evolution, for different host ages present different environments to the parasite. This is important, for example, for parasites that establish infections through both horizontal transmission to uninfected hosts and vertical transmission to the host’s offspring. Often, there is a trade-off between the success of horizontal and vertical transmission. For example, effective horizontal transmission can require high virulence, but high virulence decreases the host’s reproductive success and thus limits vertical transmission (Bull et al. 1991; Turner et al. 1998). Most theoretical treatments discriminate only infected and uninfected hosts and ignore the host’s age, although host senescence (see Chapter 18) clearly influences the parasite’s success, in particular its future opportunity for vertical transmission. If the host is immature, it is risky for the parasite to rely on vertical transmission, as the host might die before it matures to create infected offspring. If the host is old, vertical transmission is risky, for the host’s senescence makes future reproduction less likely than for hosts in peak condition. The parasite is therefore expected to allocate its resources to vertical transmission (i.e., invest in an avirulent strategy) when it infects young, reproductively mature host individuals, and to switch to horizontal transmission as the host ages (Fig. 17.1; Kysela and Turner unpublished). The age-related increase in virulence should be enhanced in the oldest individuals by additional selection for higher virulence as the higher host mortality caused by increased virulence causes the host to evolve more rapid senescence, setting off a coevolutionary downward spiral of increased age-related virulence and ever shorter intrinsic host life span.

[Figure 17.1 is a schematic plot of horizontal : vertical transmission effort (y axis) against host reproductive value (x axis), with reproductive maturity marked.] Figure 17.1 Hypothesized impact of the host’s reproductive value on the parasite’s investment in horizontal versus vertical modes of transmission. When the host has a relatively high reproductive value, selection should favor a parasite that invests in the vertical route of transfer, as this produces a high likelihood that the offspring will become infected. In contrast, as the host senesces, the parasite should switch strategies and invest relatively more effort in horizontal transmission. This hypothetical model assumes that the parasite cannot simultaneously maximize its two available modes of transfer; i.e., that greater horizontal transfer is benefited by increased virulence, but that this concomitant reduction in host fitness impairs the opportunity for vertical transfer. See Kysela and Turner (unpublished) for details of the model.

Simple models help to clarify issues in the evolution of parasite virulence. But to understand the evolution of a specific parasite and the potential methods by which it can be controlled, the detailed biology of the host–parasite interaction needs to be fully comprehended. Such details are important because they can strikingly alter our predictions. It is unlikely that a truly general theory of parasite evolution will be created, as parasites are a highly diverse group of organisms, with similar reproductive strategies but different evolutionary ancestries. Thus, it is expected that parasites will often differ in important ways that alter their evolutionary trajectories.
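The hypothesized switch from vertical toward horizontal transmission as the host senesces (Fig. 17.1) can be caricatured with a toy allocation model. This is entirely our own construction, not the Kysela and Turner model: a parasite splits effort h into horizontal and 1 − h into vertical transmission, with diminishing (square-root) returns on each route, and the vertical payoff scales with the host’s remaining reproductive value V. Maximizing W(h) = Bh·sqrt(h) + Bv·V·sqrt(1 − h) gives the closed form h* = Bh² / (Bh² + (Bv·V)²), which rises toward pure horizontal transmission as V declines.

```python
import numpy as np

def optimal_horizontal_effort(V, Bh=1.0, Bv=3.0):
    """Closed-form optimum of W(h) = Bh*sqrt(h) + Bv*V*sqrt(1 - h).
    Square roots model diminishing returns on each transmission route;
    the vertical payoff scales with host reproductive value V.
    Functional forms and parameters are illustrative assumptions."""
    return Bh**2 / (Bh**2 + (Bv * V) ** 2)

# A made-up schedule: reproductive value declines linearly with age.
ages = np.arange(0, 11)
V = np.maximum(0.0, 1.0 - 0.1 * ages)
h = optimal_horizontal_effort(V)
print(np.round(h, 2))   # increases toward 1 as the host senesces
```

The monotone rise of h* with host age is the qualitative pattern sketched in Fig. 17.1; any model with diminishing returns on both routes and an age-declining vertical payoff would behave similarly.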

Beyond the trade-off model

To this point we have considered the problem of the evolution of parasites, in particular virulence, as a question of optimizing reproductive success given the constraints posed by a trade-off, usually between virulence and rate of transmission. However, this idea is not relevant for all host–parasite associations. At least two alternatives can be important in some cases: coevolution and changes in selection during the emergence of disease.

Coevolution

Coevolution is generally not thought to be important for the control of human diseases, although humans do appear to be evolving in response to their environment (see below; Chapter 2), in particular to parasite pressure. Perhaps the best-known cases are the sickle cell trait and other blood antigens that protect against severe malaria (Hill et al. 1991; Jones 1997). Nevertheless, as most parasites have much shorter generation times than humans, they can, as a first approximation, evolve without facing a coevolutionary response by their host. However, one should not forget that many human disease pathogens are transmitted by vectors whose generation times are similar to the duration of the parasite’s life cycle. For example, in the snail–schistosome system, the available genetic variability and the selection pressure enable both virulence and resistance to evolve, and the snails and parasites have reciprocal effects on each other’s phenotypes (Webster et al. 2004). In the mosquito–malaria system, it has been suggested that it was the coevolution of the malaria vector’s ability to resist infection and the parasite’s ability to suppress the mosquito’s immune response that led to the surprising situation that mosquitoes in natural populations generally cannot melanize their parasites (Koella and Boëte 2003).

While coevolution of two species in a strict sense implies reciprocal effects on each other’s genotypes, the interactions between the human immune system and parasites represent coevolution in a broader sense. In parasites with antigenic switching, which include trypanosomes and malaria (Deitsch et al. 1997), each antigen stimulates the expansion of a different clone of B-cells that can control the parasites with this antigenic type (Vickerman 1989). This kind of immune-mediated coevolution changes only phenotypes in hosts and parasites. An example where coevolution between immunity and the parasite affects the genotype of the parasite is seen in the flu virus, where immunity of a large proportion of the population hinders the spread of the virus until it has mutated to a novel antigenic type. (This evolutionary change is called antigenic drift, in contrast to antigenic shift, due to reassortment of genes from different flu variants implicated in the major pandemics of 1918, 1957, and 1968.) Although these aspects of coevolution are clearest for examples considering antigenic differences in the selected parasite trait, other traits and in particular virulence can also respond to the immune response, setting the stage for coevolution. An empirical example is given for malaria in mice in the PhD thesis of Katrina Grech (2006).

Emerging diseases

While the emergence of novel diseases has attracted much attention from ecologists and evolutionary biologists, only rarely has emergence been connected to the evolution of virulence and the trade-offs involved with virulence. An exception is André and Hochberg's (2005) study, which shows that during the initial stages of an emerging epidemic, stochasticity creates an association between virulence and rate of transmission. If host density is low, they find that virulence is constrained to low values, whereas if host density is high, a wide range of virulence levels can be associated with emerging pathogens. This would explain observed high levels of virulence in the initial stages of epidemics, followed by the evolution of attenuation in accordance with the classical ideas. It is clearly worth considering the possibility that the evolution of virulence is driven by different processes in emerging and endemic situations.

Another way to connect emergence with virulence is to employ the source–sink paradigm from theoretical ecology. Here, the evolution of emerging pathogens takes place in spatially structured habitats with different frequencies of source and sink environments available to the parasite (Dennehy et al. 2006, 2007; Sokurenko et al. 2006). Source habitats allow populations to be self-sustaining, but sinks require that populations receive immigrants from established sources (Holt 1985). Reservoirs are sources (e.g., certain environmental sites, host organisms, or specific body compartments); they allow the parasite to thrive and transmit to other habitats. The virulence habitat is often modeled as a sink: a disease-susceptible host, or a compartment within the same host, in which pathogen growth causes clinical infection but is not self-sustaining. For example, Pseudomonas aeruginosa bacteria inhabit soil and fresh water and opportunistically infect humans with cystic fibrosis (CF) or extensive burns. These virulence habitats are clearly sinks that do not contribute significantly to the natural circulation of the bacteria. However, certain gene deletions in P. aeruginosa lead to beneficial mucus-coated phenotypes that resist immune attack and establish chronic infections in the lungs of CF hosts. These mutants are disadvantaged in the wild and quickly disappear from the source population after shedding. Thus, the distinctive sink-adaptive and source-deleterious nature of the mutations provides a genetic footprint that confirms the source–sink dynamics (Sokurenko et al. 2006). It is anticipated that source–sink models will provide additional useful insights into the population dynamics and molecular mechanisms of virulence evolution.
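The source–sink logic described here can be made concrete with a toy numerical model. This is an illustrative sketch only, not taken from the studies cited: the growth rates, migration fraction, and carrying capacity are arbitrary assumptions chosen to show that a habitat with negative intrinsic growth persists only through immigration.

```python
def simulate(r_source=0.5, carrying_cap=1000.0, r_sink=-0.3,
             migration=0.05, steps=200):
    """Discrete-time two-patch model: a logistic source population
    sheds a fraction `migration` each step into a sink whose intrinsic
    growth rate is negative (it cannot sustain itself)."""
    n_source, n_sink = 10.0, 0.0
    for _ in range(steps):
        migrants = migration * n_source
        n_source += r_source * n_source * (1.0 - n_source / carrying_cap) - migrants
        n_sink += r_sink * n_sink + migrants
    return n_source, n_sink

src, snk = simulate()                         # sink persists, fed by immigration
src_alone, snk_alone = simulate(migration=0.0)
# With migration cut off, the sink population stays at zero: it is
# occupied only because the source keeps reseeding it.
```

The sink settles at a positive quasi-equilibrium set by the immigration rate divided by its decay rate, which is the formal sense in which clinical infection by an environmental opportunist can be common yet contribute nothing to the pathogen's natural circulation.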

A molecular and an experimental approach to the evolution of parasites

Could genomic data help us to better understand these issues? With an increasing number of fully sequenced organisms—including many pathogenic bacteria and viruses, and the most important malaria parasite and one of its important mosquito vectors—genomic data have already become routine tools for addressing important evolutionary questions, including the extent of selection and other evolutionary changes in parasites (Chapter 14). The power of full-genome analysis is exemplified by research involving single nucleotide polymorphisms (SNPs; see Chapter 3) and full-genome scans of allelic variation. Analyses of SNPs in the human genome suggest that human populations have adapted to local conditions, responding to selection in the recent past (Voight et al. 2006). Whole-genome analyses of allelic variability in malaria parasites suggest that antimalarial drugs and the human immune response are the main selective forces operating on them (Kidgell et al. 2006). This result is perhaps not very surprising in itself, but it does show that genomic data yield reasonable results for intuitive predictions. Genomic data are also often used to estimate the rate and location of genetic recombination (Kong et al. 2002; see also Chapter 3). Recombination is important in the evolution of bacterial pathogens, where virulence determinants enter the genome via recombination, through horizontal gene transfer among species, and by movement of genetic elements such as bacteriophages (see Chapter 15). Recombination also remains a major puzzle in evolutionary biology, for in the simplest case it should disappear. Several ideas about how it is maintained are based on negative epistasis among deleterious mutations, i.e., the idea that mutations interact such that their combined deleterious effect is greater than the sum of their individual effects. Such epistasis would strongly influence the rate of spread of traits like drug resistance (Bretscher et al. 2004). Genomic data used to test these theories have as yet yielded little evidence for negative epistasis (Bonhoeffer et al. 2004).
Genomic data can also be brought to bear on the ideas that a parasite's virulence (or other traits) depends on the host's environment, its genetic background, or its physiological condition. The aim of one recent study, for example, was to identify the genes responsible for a mosquito's resistance to malaria as a step toward transforming mosquitoes for malaria control. While candidate genes for resistance had been identified and confirmed previously with knock-down techniques using RNAi, the effect of at least some of these genes (the ones that have been tested) on the parasite's development depends on the genetic background of the mosquito (Volz et al. 2006). While this study is bad news for the prospects of genetic manipulation for malaria control, it does demonstrate the feasibility of the approach and could be extended to ask whether knocking down candidate genes has similar effects in different parasite genotypes. Similarly, parasites can be transformed or their genes knocked out to answer questions concerning genetic and environmental variability, thus linking population and molecular biology.

The use of genomic data to find genetic associations, and through them the traits that trade off with virulence, is an idea proposed several years ago (Stearns and Magwene 2003). The proposal has not yet been realized, perhaps in part because of its difficulty. It involves manipulating two traits (e.g., host reproduction and, via parasite infection, host survival) and identifying the genes expressed differently in controls and in the manipulated individuals. The genes revealed as important for both traits would then be used to measure the trade-off. Although implementing this approach is difficult, it may prove to be a useful tool. An alternative, comparative approach to trade-offs is to correlate the means of the two traits among species (or populations). This approach is valid only if the effects of shared phylogenetic ancestry are controlled for in the analysis, and this can only be done within a reliable, well-tested phylogenetic framework; such frameworks are now usually based on DNA sequences (see Chapters 13 and 14). Such phylogenetic approaches have been used to analyze trade-offs of life-history traits in parasitic worms (e.g., Morand and Muller-Graf 2000), to reconstruct the history of infectious disease epidemics (Holmes et al.
1996), and to investigate the co-speciation of hosts and parasites (Page and Hafner 1996). It is striking, however, that the key trade-off between virulence and transmission rate has not yet been considered in this way.

Finally, light would be shed on many of the issues discussed here by a novel approach to the study of parasites: experimental evolution in semi-realistic laboratory situations (Ebert 1998). Like theoretical models, whose value lies in their ability to represent complex biological systems with a few key parameters, experimental evolution provides a tractable means to study evolutionary processes by simplifying the complexities of the world outside the laboratory. Model systems are thus highly useful for testing basic ideas in the evolution of parasite virulence and infectious disease (e.g., Bull et al. 1991; Turner et al. 1998; Duffy et al. 2006). Many such experiments have involved parasites closely related to those causing serious human illnesses, such as arthropod-borne viruses (e.g., Turner and Elena 2000; Greene et al. 2005), and are thus directly relevant to understanding the evolution of disease agents. But future experimental evolution research should mimic real systems to an even greater extent by, for example, examining the interactions between laboratory-evolved pathogens and whole organisms such as mice or arthropod vectors (e.g., sleeping sickness: Balmer 2006).

Many chapters in this book show that the evolution of parasites is a fascinating field full of important discoveries. This chapter emphasizes not only that evolutionary biology can be used to help control infectious disease, but also that many problems remain in our understanding of the evolution of virulence. Given the worldwide importance of parasites, solving these problems will be a major step forward in improving human health.

Summary

1. The simplest model of the evolution of virulence, which assumes that virulence and transmission are linked and that evolution maximizes total transmission, is a useful tool for understanding the potential consequences of control strategies.
2. To reach a more detailed understanding of parasite evolution, details of the host–parasite interactions must be taken into account. Important details include the facts that virulence is governed by the interaction of the host's and the parasite's genes and that virulence is context-dependent.


3. The evolution of virulence is likely to depend on more than a simple virulence–transmission trade-off. In particular, the trade-off may depend on the host's condition or age, and trade-offs between virulence and other traits of the parasite may be more important for some parasites.
4. Coevolution, or non-equilibrium situations such as the emergence of a new disease, can complicate matters further.


5. Novel approaches—genomics and experimental evolution—may help to make sense of the complexities of host–parasite interactions.
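The simplest trade-off model summarized in point 1 can be sketched numerically. Purely for illustration, assume a saturating transmission–virulence function beta(alpha) = b*alpha/(alpha + k) and a fitness proxy proportional to transmission rate times infectious period, beta/(mu + alpha); these functional forms and parameter values are assumptions for the sketch, not taken from this chapter.

```python
import math

def r0(alpha, b=1.0, k=0.5, mu=0.2):
    """Parasite fitness proxy: transmission rises but saturates with
    virulence alpha, while host death (mu + alpha) cuts the
    infectious period short."""
    beta = b * alpha / (alpha + k)     # saturating transmission rate
    return beta / (mu + alpha)         # transmission * infectious period

# Numerical search over a grid of virulence values:
alphas = [i / 1000.0 for i in range(1, 5000)]
best = max(alphas, key=r0)
# For this functional form the analytic optimum is sqrt(k * mu):
analytic = math.sqrt(0.5 * 0.2)
```

The point of the sketch is that maximizing total transmission under a saturating trade-off selects an intermediate, not a minimal, level of virulence, which is why the classical model predicts attenuation only down to that optimum.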

Acknowledgments

This work was partially supported by U.S. National Science Foundation grants to P.E.T. (DEB-0452163 and DEB-0608398).



Noninfectious and degenerative disease


CHAPTER 18

Evolutionary biology as a foundation for studying aging and aging-related disease
Martin Ackermann and Scott D. Pletcher

Introduction

Aging is a decline in condition with increasing age that manifests itself as a reduction in the rates of survival and fecundity (Partridge and Barton 1996). At first glance, it is surprising that organisms age at all. Why is an adult that emerges from the sophisticated process of development unable simply to maintain the condition it has achieved, instead undergoing a gradual deterioration that ends with death (Williams 1957)? Is there some advantage to aging that would explain why aging exists despite the obvious disadvantage for the individual? Evolutionary biologists were the first to pose these questions, and decades of theoretical and experimental work have culminated in two conclusions: aging does not have a function, and it exists because it is ignored by natural selection (Williams 1957). For the science of aging, the growth and success of molecular biology over the past 20 years make possible a synthesis that exploits the virtues of traditional evolutionary thinking and of new ideas about mechanisms.

Evolutionary biologists contend that aging is fundamentally different from most other biological processes, such as development. Because aging does not serve a function, individual biological systems and signaling pathways are not programmed (i.e., favored by natural selection) to produce the death of the organism. Aging may, however, be deterministic in the sense that aging-related changes result from a series of regulated steps executed in an ordered fashion. Such determinism may be a side-product of a diverse range of other physiological decisions and vital activities critical to the early reproductive success of the organism. This distinction is not merely semantic, because how one views aging—as programmed, deterministic, or stochastic—has important implications for how one expects genes to determine the rate of aging and for how this rate can be modified by mutation. Thus, evolutionary biology provides an important context for interpreting mechanisms of aging and for guiding research into effective interventions that may extend human life span.

Molecular biologists have uncovered a number of mechanisms that modulate life span in several species. Given this progress, it is now possible to ask whether the assumptions underlying the traditional evolutionary theory of aging are realistic and whether its predictions are borne out. While such an assessment re-emphasizes the importance of evolutionary thinking in aging research, it also reveals substantial gaps in the current evolutionary models.

The goal of this chapter is to provide an overview of popular themes in evolutionary and molecular aspects of the biology of aging, with the aim of synthesizing the two. Fundamental principles are discussed using examples from basic research in non-human organisms. We begin by defining aging and discussing the groundbreaking work that formed the basis for understanding why organisms age: the canonical evolutionary models of aging. We next discuss several key results concerning the molecular mechanisms of aging and highlight manipulations that are effective modulators of aging across multiple species. These examples establish that basic principles derived from research on simple systems provide insight that could not be achieved otherwise and are important for understanding human aging (Chapter 23). We end with discussions of conceptual limitations of current evolutionary theory and of the utility of evolutionary concepts for guiding research into mechanisms of aging.

Defining and measuring aging

As stated above, aging manifests itself as an increase with age in the probability that an organism will die for internal reasons and a decrease in the rate at which it successfully reproduces (Partridge and Barton 1996). To measure aging, one must therefore measure age-specific rates of survival and reproduction and then analyze how these rates decline with age. Life span is not an ideal measure of aging, because differences in life span do not necessarily indicate differences in aging (Pletcher et al. 2000). For example, if death by automobile accident or war were suddenly eliminated, the average life span of the population would increase, but it could not be argued that such a change in extrinsic factors would alter the rate of individual aging. Indeed, a few years back, one of the architects of the modern biology of aging, George Williams, argued that many researchers were mistakenly equating death with aging and that most aging research was wrongly focused on studying life span (Williams 1999). For mammalian systems this criticism was perhaps a bit too harsh. Putative signs of normal aging, such as hair loss, slowing reaction time, and increasing cancer rates, are readily observed, and their age progression has been studied for many years. Phenotypic characteristics of normal aging in other model systems, such as yeast, worms, or flies, are not well documented, however, and data consisting only of mean or maximum life span should be interpreted with caution.

One way of studying the aging process per se, even in the absence of good phenotypic markers, is to measure age-specific mortality rates directly. As previously mentioned, death is not a programmed event in an organism's life history. As physiological systems break down with advancing age, whether in a characteristic or a random fashion, the ability of an individual to avoid death from any number of different causes diminishes. Any specific individual is either alive or dead, but observing large cohorts of individuals allows an estimation of the age-dependent risk of death, also called the age-specific mortality. At any particular age, the level of mortality reflects the physiological condition of the individuals in the cohort, and increasing mortality is interpreted as underlying physiological deterioration. The mortality rate represents an instantaneous risk of death that is independent of previous or future observations. It is therefore ideal for describing a dynamic phenotype of aging and for characterizing temporally specific events, such as the specific ages over which an experimental manipulation affects death rates. While age-specific mortality is the preferred measure of aging, many articles do not contain this kind of information. Large sample sizes (several hundreds to thousands of individuals) are required for accurate estimation, and this often precludes its use in studies of the genetics of aging. When discussing such work, we must rely on life span data. It is plausible that in most cases changes in life span do indeed reflect changes in the rate of aging.
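The cohort logic just described can be illustrated with a short calculation. This sketch estimates, from invented death counts for a hypothetical fly cohort, the per-interval risk of death q and a corresponding hazard; the numbers are made up for illustration.

```python
import math

def age_specific_mortality(deaths):
    """From deaths counted per age interval in a fully followed cohort,
    return a list of (alive at interval start, death risk q, hazard)."""
    alive = sum(deaths)
    rates = []
    for d in deaths:
        q = d / alive                      # conditional risk of dying in this interval
        hazard = -math.log(1.0 - q) if q < 1.0 else math.inf
        rates.append((alive, q, hazard))
        alive -= d
    return rates

# Invented death counts per 10-day interval for a cohort of 1000 flies:
rates = age_specific_mortality([20, 30, 60, 140, 300, 450])
# The steadily rising q is the signature of aging; the final interval,
# in which all remaining flies die, yields an infinite hazard estimate.
```

Note how the calculation depends on the whole cohort: each q is estimated from the survivors entering that interval, which is why accurate late-age estimates demand the large sample sizes mentioned in the text.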

The canonical evolutionary models of aging

The seminal insight that formed the basis for evolutionary theories of aging was formulated by J. B. S. Haldane during his study of Huntington's disease (Haldane 1941). The disease is caused by a single, dominant mutation quite common in human populations. The unexpectedly high frequency of the Huntington's allele—about 1 in 18,000 in the UK—posed something of a paradox at the time, because the science of population genetics made it clear that dominant, deleterious mutations should be quickly eliminated by natural selection. Haldane recognized that this discrepancy could be explained by the fact that the disease usually manifests after forty years of age. Carriers of the Huntington mutation will have reproduced, and thus transmitted the defective allele to the next generation, before any symptoms of the disease become apparent. The phenotypic effects of the disease are dramatic, but the mutant allele has little or no effect on reproductive success. Haldane quickly generalized the explanation to any gene and any allele whose deleterious effect is confined to late ages.

Haldane began the movement to formalize the idea that natural selection, which relies on differences in reproductive success between individuals, ignores how an individual functions late in life. It is easy to see that, as with the majority of Huntington cases, if a mutation has effects only after reproduction has ceased, then it will have no impact on reproductive success. However, even mutations that have effects during the reproductive period can escape the force of natural selection. Throughout most of Earth's history, the natural conditions experienced by any species (including our own) were exceedingly harsh and unforgiving. Predation, starvation, and other external threats to survival meant that only a small number of individuals survived to the beginning of the reproductive period, and far fewer were around long enough to encounter the deleterious effects of aging. Old age would have been the rare privilege of the few, thereby weakening selection's ability to influence the frequency of mutations with effects at late ages and providing the opportunity for such deleterious alleles to persist in the population. This argument applies both to mutations that reduce survival and to mutations that impair the ability to reproduce successfully at later ages.

Evolutionary genetics of aging

That late life is not under strong selection does not necessarily mean that individuals will evolve to be frail when old. For that, it must be postulated that genetic factors with late-life effects accumulate in populations over evolutionary time and that aging is the cumulative effect of these mutations. Why should mutations that weaken condition late in life accumulate? There are two main ideas.

First, it is possible that, of all the mutations that randomly occur in a population, a sizable number have no effect early in life but are exclusively deleterious late in life. As discussed earlier, such mutations are nearly neutral in terms of their fitness consequences. Because any single mutation originates in one or a very small number of individuals, random sampling of alleles in each generation will most often result in its being lost from the population soon after it appears. Sometimes, however, a nearly neutral mutation may end up in a disproportionate number of offspring by chance alone. In the rare instance where this trend continues over many generations, a neutral allele can increase in frequency and even become fixed (i.e., reach a frequency of 100%) in a population. The idea that the slow accumulation of late-acting mutations over evolutionary time results in organismal aging is known as the mutation accumulation hypothesis (Medawar 1952). It is important to note that this hypothesis does not refer to the accumulation of somatic mutations within an individual's lifetime. Rather, it postulates that mutations accumulate in genomes over the course of many generations.

A second hypothesis postulates that some of the mutations that enhance reproductive success early in life have an associated cost later on. Such mutations would be selected for because of their advantage when animals are young and actively reproducing, which for reasons already discussed would easily compensate for any negative effects late in life. This idea is termed antagonistic pleiotropy and was introduced by George Williams (1957). The concept was later extended by Thomas Kirkwood, who proposed a specific mechanism for why genes would have opposite effects at different ages. His 'disposable soma theory' (Kirkwood 1977) suggests that organisms face a trade-off between reproduction on the one hand and investment in somatic maintenance and repair on the other, and that alleles that increase the allocation of resources to the former process thus compromise the latter.
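The drift argument behind mutation accumulation (a late-acting neutral allele is usually lost, but occasionally fixed by chance) can be illustrated with a toy Wright-Fisher simulation. The population size, replicate number, and random seed are arbitrary choices for this sketch.

```python
import random

def neutral_allele_fates(pop_size=50, reps=1000, max_gen=1000, seed=42):
    """Wright-Fisher drift of a neutral allele that starts as a single
    copy among 2N gene copies. Returns (# replicates lost, # fixed)."""
    rng = random.Random(seed)
    two_n = 2 * pop_size
    lost = fixed = 0
    for _ in range(reps):
        count = 1
        for _ in range(max_gen):
            p = count / two_n
            # Each of the 2N copies in the next generation is drawn
            # independently at the current allele frequency p.
            count = sum(1 for _ in range(two_n) if rng.random() < p)
            if count == 0:
                lost += 1
                break
            if count == two_n:
                fixed += 1
                break
    return lost, fixed

lost, fixed = neutral_allele_fates()
# Loss is overwhelmingly the common fate; fixation is expected with
# probability ~1/(2N), i.e., in roughly 1% of these replicates.
```

Over many loci and many generations, these rare chance fixations are the raw material the mutation accumulation hypothesis invokes: each one adds another allele whose deleterious effects are confined to ages that selection barely sees.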
One instance of the trade-off between survival and reproduction that has recently been discovered involves disease resistance. Interventions that increase testosterone levels increase reproductive performance and depress immune function in a range of species (see Chapter 7).

To summarize, the evolutionary theory of aging suggests that aging exists because selection does not operate efficiently late in life. This central point can be illustrated by an analogy proposed by Robert Holt concerning a similarity between aging and the observation that individuals tend to perform poorly under conditions not encountered during the evolutionary history of ancestral populations. For example, when humans ascend to high altitudes, they often experience altitude sickness. Genetic factors are likely to be involved in susceptibility to this condition, and one can ask why these genetic factors were not eliminated by natural selection. The answer may be that our ancestors very rarely went to such altitudes, and performance in these instances was largely irrelevant to the number of progeny that an individual contributed to future generations. Selection at low altitude might have favored genotypes that, as a side-effect, happened to be particularly bad in the mountains. Alternatively, it is conceivable that alleles that hampered performance at high altitude, but had no negative effect at low altitude, could invade the lowland populations over evolutionary time. These scenarios are analogous to antagonistic pleiotropy and mutation accumulation, respectively. The analogy serves to illustrate an important point: one would not argue that altitude sickness is programmed, in the sense of a mechanism that evolved to become activated at high altitude to serve a particular, beneficial purpose. More likely, this disorder exists because there was never strong selection for humans to function well at high altitudes. Evolutionary biology proposes that a similar argument holds for aging—it is not programmed to serve a specific function but persists because selection is not efficient at removing the genetic factors that cause it.

Predictions of the evolutionary models

The evolutionary biology of aging makes predictions that can be tested experimentally. First, patterns of aging must be heritable and influenced by genetic factors. The existence of genes that modulate aging was first shown by laboratory evolution experiments, in which artificial selection for late-life performance produced long-lived strains of fruit flies (Rose 1984). More recently, specific genes that modulate aging have been identified in several species (Tatar et al. 2003; Kenyon 2005)—a topic discussed at length later. Twin studies in humans estimate that roughly 50% of the variation in human life span is attributable to genetic variation (Herskind et al. 1996). These studies and others have definitively established that differences in life span and in the rate of aging between and within species have a significant genetic basis.

A second prediction of the evolutionary biology of aging is that the level of external risk should influence patterns of age-related mortality and functional decline. This prediction is derived from mathematical models which suggest that environmental risks of mortality, such as accidental death or predation, determine the strength of age-specific selection on survival and reproduction (Charlesworth 1994). For example, conditions under which adult individuals are exposed to high extrinsic mortality will usually lead to a fast decline in the strength of selection with age (but see Abrams 1993) and are thus expected to lead to the evolution of earlier aging. That faster aging can evolve under conditions of high external risk for adults has been shown in a laboratory evolution experiment with fruit flies (Stearns et al. 2000). Further support comes from comparisons of the life spans of venomous or poisonous animals with those of animals without chemical protection (Blanco and Sherman 2005), of birds with flightless mammals of similar size (Austad and Fischer 1991), and of opossums living on an island with opossums living on the mainland (Austad 1993). These studies showed that animals that are chemically protected, able to fly, or living on an island, and thus presumably less likely to fall victim to predation, exhibit longer life spans in captivity than comparable animals without such protection.

A third prediction of the evolutionary biology of aging is that mutations with age-specific effects exist and arise sufficiently often to account for aging-related decline.
A few experimental studies have investigated the frequency of mutations with age-specific effects. Mutation accumulation studies with fruit flies indicated that mutations with an age-specific effect on mortality early in life are quite common, but that mutations with an age-specific effect on mortality late in life are rare (Pletcher et al. 1998, 1999). In contrast, for the fecundity of fruit flies, the clearest age-specific effects are found for late-acting mutations (Leips and Mackay 2000). Although superficially these data appear to support the qualitative prerequisites of mutational dynamics that would account for aging under the classical hypotheses, the issue has yet to undergo rigorous mathematical investigation to determine whether the effects are quantitatively sufficient to account for the differences measured.
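The declining force of selection invoked in the second prediction (Charlesworth 1994) can be sketched with a simplified Hamilton-style calculation. The life table below is invented, and the indicator sets the population growth rate r to zero for simplicity, so this is a sketch of the idea rather than the full model.

```python
def force_of_selection(p_ext, max_age=10, fecundity=1.0, maturity=2):
    """Sensitivity of lifetime reproduction to survival through age a
    (Hamilton-style indicator with r = 0): proportional to the
    reproduction still expected after age a."""
    # l(x): probability of surviving to age x under extrinsic mortality p_ext
    survivorship = [(1.0 - p_ext) ** x for x in range(max_age + 1)]
    fert = [fecundity if x >= maturity else 0.0 for x in range(max_age + 1)]
    return [sum(survivorship[x] * fert[x] for x in range(a + 1, max_age + 1))
            for a in range(max_age + 1)]

low_risk = force_of_selection(p_ext=0.05)   # benign environment
high_risk = force_of_selection(p_ext=0.50)  # heavy extrinsic mortality
# In both environments the indicator declines with age, but under heavy
# extrinsic mortality the proportional decline is far steeper.
```

The comparison mirrors the verbal argument: when predation or accident removes most adults early, almost no expected reproduction remains at late ages, so mutations acting there are nearly invisible to selection and earlier aging can evolve.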

Molecular mechanisms of aging

While mutations that alter the rate of aging are central to the evolutionary theory of aging, evolutionary biologists do not usually focus on what these mutations actually do and how they could alter the rate of aging. Elucidating the mechanistic basis of aging is the focus of molecular biology. The two fields have for the most part developed independently, and they have pursued different routes. As discussed earlier, the evolutionary theory of aging is based on the premise that all organisms age for the same ultimate reason—because selection late in life is weak—but it does not assume that the specific mechanisms of aging are shared among organisms. In contrast, molecular biology has a long tradition of seeking molecular mechanisms that are broadly shared among different species. Relationship by descent means that despite the vast diversity of life, gene structure and function, as well as numerous biological processes including cell division and development, show surprising similarity across species. Evolutionary conservation of function has made it possible to understand the most basic functions of the human cell, and the machinery that controls it, from work on bacteria and yeast.

Is the principle of evolutionary conservation useful for studies of aging, despite its variability and apparent complexity? We have seen that aging is not likely to be selected for. It is reasonable, however, to assert that aging may be selected against, for example in situations that favor current survival and future reproductive output. The question then becomes: are there common ways in which organisms resist aging? The underlying premise, that there are common ways in which organisms age or resist aging, forms the basis for the use of model genetic organisms in laboratory study (Martin 2002). The organisms most commonly used for aging research include the brewer's yeast, Saccharomyces cerevisiae; the nematode worm, Caenorhabditis elegans; the fruit fly, Drosophila melanogaster; and the mouse, Mus musculus. The strength of these systems lies in their well-developed genetics, which gives researchers the ability to turn genes on and off at specific times and in specific tissues for rapid and precise analysis of gene function. Such manipulations make it possible to determine causation, as distinct from correlation, which is not possible with the traditional experiments employed by evolutionary biologists (e.g., laboratory selection experiments). Most importantly, the life spans of these organisms are relatively short—weeks or months in the case of yeast, worms, and flies; about 3 years in the case of mice. This makes it possible to screen for mutants with altered life span in a relatively short time.

Dietary restriction

The existence of 'public' mechanisms of aging (Martin 2002)—i.e., those shared among many organisms—was first suggested by results from experiments on the consequences of dietary restriction. Over 70 years ago it was recognized that reduced nutrient availability (without malnutrition) extends the life span of mice. Intensive research since that time has established dietary restriction as the most powerful modulator of the aging process in mammals, and it has been shown to be effective in species as diverse as yeast, nematode worms, Daphnia, and Drosophila (Masoro 2000). In rodents, dietary restriction maintains most physiological processes in an apparently youthful state, and many aging-related diseases are ameliorated by dietary restriction, including cancer, bone and muscle loss, and hearing loss. Age-dependent hyperactivation of the innate immune response in Drosophila, akin to aging-related inflammation in mammals, is also suppressed in diet-restricted animals. Long-term and ongoing experiments suggest similar effects in primates, including humans (Ingram et al. 2006). Recent work suggests that the effects of dietary manipulation depend on age. In humans, for example, dietary restriction in the womb or in infancy increases the chance of becoming obese later and may result in an increased rate of aging (Chapter 19).

Surprisingly, alterations in nutrient consumption may not be the only evolutionarily conserved mechanism that links environmental signals to aging and aging-related decline. Simple perception of the environment is sufficient to alter aging and adult physiology in both nematode worms and fruit flies (Libert et al. 2006). In humans, mere exposure to the sight, smell, and/or taste of food rapidly elicits cardiovascular, digestive, and endocrine responses reminiscent of those induced in mice by dietary restriction or by genetic mutations that increase life span (see below). Might such sensory systems be a useful target for therapeutic manipulations in humans that affect aging or retard aging-related functional decline?

The fact that dietary restriction is so salutary has stimulated intensive research into its basic mechanisms and into the development of effective mimetics. The workhorse for this research has been S. cerevisiae (Lin et al. 2000), with some help from Drosophila (Rogina and Helfand 2004). In S. cerevisiae, replicative aging is measured by the number of daughter cells that a single mother cell produces during its lifetime. Normally, wild-type cells divide roughly 20–30 times, but when extra copies of the gene Sir2 are present, the number of cell divisions is increased by 40% (Kaeberlein et al. 1999). Extra copies of Sir2 in nematodes and overexpression of the gene in flies (Rogina and Helfand 2004) also increase life span, suggesting that Sir2 and its relatives are important modulators of aging across taxa.

How does Sir2 act to resist aging? In yeast, Sir2 is critical for maintaining the stability of certain chromosomal regions and for suppressing unwarranted gene expression (Guarente 2000). One of the putative causes of death in yeast cells is the aging-related accumulation of extrachromosomal rDNA circles, and Sir2 suppresses this accumulation.

obese later, and may result in an increase in the rate of aging (Chapter 19). Surprisingly, alterations in nutrient consumption may not be the only evolutionarily conserved mechanism that links environmental signals to aging and aging-related decline. Simple perception of the environment is sufficient to alter aging and adult physiology in both nematode worms and fruit flies (Libert et al. 2006). In humans, mere exposure to the sight, smell, and/or taste of food rapidly elicits cardiovascular, digestive, and endocrine responses reminiscent of those induced in mice by dietary restriction or by genetic mutations that increase life span (see below). Might such sensory systems be a useful target for therapeutic manipulations in humans that affect aging or retard aging-related functional decline? The fact that dietary restriction is so salutary has stimulated intensive research into its basic mechanisms and into the development of effective mimetics. The workhorse for this research has been S. cerevisiae (Lin et al. 2000) with some help from Drosophila (Rogina and Helfand 2004). In S. cerevisiae, replicative aging is measured by the number of daughter cells that a single mother cell produces during its lifetime. Normally, wild-type cells divide roughly 20–30 times, but when extra copies of the gene Sir2 are present, the number of cell divisions is increased by 40% (Kaeberlein et al. 1999). Extra copies of Sir2 in nematodes and overexpression of the gene in flies (Rogina and Helfand 2004) also increase life span, suggesting that Sir2 and its relatives are important modulators of aging across taxa. How does Sir2 act to resist aging? In yeast, Sir2 is critical for maintaining the stability of certain chromosomal regions and for suppressing unwarranted gene expression (Guarente 2000). One of the putative causes of death in yeast cells is the aging-related accumulation of extrachromosomal rDNA circles, and Sir2 suppresses this accumulation. 
There is no evidence for rDNA circles in organisms other than yeast, but Sir2 and its relatives may play some role in genome stability. Sir2 also has a biochemical activity as an NAD-dependent protein deacetylase that removes acetyl groups from specific proteins and thereby regulates their function (Imai et al. 2000). Several
substrates of Sir2 have been described, many of which have been shown to be relevant to aging, apoptosis, and cellular response to stress (Brunet et al. 2004). Recent work has suggested that dietary restriction exerts its effects by activating Sir2. Yeast cells from certain genetic stocks that lack a functional Sir2 gene show little or no increase in longevity when exposed to a regimen of dietary restriction (Lin et al. 2000). The same result has been published for flies that lack Sir2 (Rogina and Helfand 2004). Moreover, in both of these systems, longevity of Sir2-overexpressing individuals is not further extended by reduced nutrient availability. Pharmacological activators of Sir2 were identified (Howitz et al. 2003), and several of these plant-derived polyphenols have been shown in some studies to increase life span of yeast, worms, and flies (Wood et al. 2004). The best known of these compounds, resveratrol, is now being studied in mammalian systems with promising results. Resveratrol seems to impact many aspects of mammalian health including NF-κB-related inflammation and neurodegeneration. Interestingly, many of these polyphenolic compounds are produced by plants in times of stress, suggesting the possibility that their longevity-promoting effects may be a type of molecular sensory perception that, like dietary restriction, may alert the animal to tough times ahead (Lamming et al. 2004). It should be noted, however, that the molecular and physiological basis of the effects of Sir2 and its role in dietary restriction are controversial. It remains to be seen whether overexpression of Sir2 increases rodent life span and whether the polyphenolic compounds work exclusively through Sir2 or involve other genes and proteins.
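As a minimal arithmetic check (ours, not the chapter's), the yeast numbers quoted above combine as follows: a 40% increase on 20 to 30 wild-type divisions predicts roughly 28 to 42 divisions for mother cells carrying extra copies of Sir2.

```python
# Replicative life span in yeast is counted as the number of daughter cells
# a mother cell produces. Figures follow the text: wild-type mothers divide
# roughly 20-30 times, and extra Sir2 copies raise this by about 40%.

def extended_divisions(wild_type_divisions, relative_increase=0.40):
    """Expected replicative life span after a proportional increase."""
    return wild_type_divisions * (1.0 + relative_increase)

print(extended_divisions(20), extended_divisions(30))
```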

Conserved pathways that influence the rate of aging

The description of the first long-lived genetic mutants has proven just as influential as the discovery of dietary restriction, and it too has given rise to a well-characterized public mechanism of aging: insulin/IGF-1 signaling (IGF refers to insulin-like growth factor). One of the strengths of model systems with short life spans is the ability to search
for genes that impact aging in an unbiased way using genetic screens. Using this simple approach in C. elegans, Klass (1983) along with Friedman and Kenyon and their co-workers described the long-lived mutants age-1 and daf-2, respectively. These mutant animals lived up to twice as long as wild-type worms. Further research revealed that the life span extension in both age-1 and daf-2 animals depended on a third gene, daf-16. Soon after, the genes that were affected in these mutants were isolated and, remarkably, all were shown to be involved in the insulin/IGF-1 signal transduction pathway (Tatar et al. 2003). daf-2 was shown to be the nematode insulin/IGF-1 receptor, age-1 was shown to be a downstream PI3 kinase, and daf-16 was identified as the target FOXO transcription factor. Down-regulation of insulin signaling, as occurs in daf-2 and age-1 mutants, results in activation of daf-16/FOXO, which subsequently activates transcription of genes that promote survival. As alluded to above, these results established a new paradigm for aging research. All of the genes were part of a well-known signal transduction pathway, implying that aging could be modulated and that such modulation could be controlled through a small number of molecules and biological processes. No longer could aging be viewed as solely the result of an intractable mess of accumulated mutations. Moreover, components of insulin signaling are evolutionarily conserved, and subsequent work showing that reduced signaling through this pathway extended life span in fruit flies established that its function was conserved as well (Tatar et al. 2001). The insulin/IGF-1 pathway, in addition to its well-known function in metabolism, is a critical component of a general stress response. Insulin receptor mutants in worms and flies exhibit increased energy storage, and they are resistant to various stressors.
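As an illustrative aside (not from the chapter), the genetic logic just described can be captured in a toy truth-table model: DAF-2 activates AGE-1, AGE-1 activity represses DAF-16, and the survival program requires functional daf-16. The gene names are real, but the boolean simplification is ours.

```python
# Toy boolean sketch of the C. elegans insulin/IGF-1 cascade described in
# the text. A deliberate simplification for illustration, not a quantitative
# model of the pathway.

def survival_genes_on(daf2_functional, age1_functional, daf16_functional):
    """Is the DAF-16/FOXO survival program transcribed?"""
    if not daf16_functional:
        return False            # daf-16 mutants cannot mount the program
    # Intact signaling through DAF-2 -> AGE-1 represses DAF-16.
    insulin_signaling = daf2_functional and age1_functional
    return not insulin_signaling

# Wild type: signaling intact, DAF-16 repressed, normal life span.
assert survival_genes_on(True, True, True) is False
# daf-2 or age-1 mutants: DAF-16 active, extended life span.
assert survival_genes_on(False, True, True) is True
assert survival_genes_on(True, False, True) is True
# daf-2; daf-16 double mutant: the extension is lost, as observed.
assert survival_genes_on(False, True, False) is False
print("cascade logic consistent with the mutant phenotypes described")
```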
The transcription factor daf-16/FOXO is activated by adverse stimuli, such as anoxia and heat, and in its activated form it induces the expression of genes involved in protection against oxidative damage, heat, and pathogens (Murphy et al. 2003). Thus, this pathway may play a vital role in somatic maintenance and endurance. While flies and worms have a single, ancestral insulin/IGF-1 receptor, in mammals there are two,
and both pathways downstream of these receptors have been implicated in the modulation of aging (see also Chapter 23). Whereas the single receptor in invertebrates regulates both growth and metabolism, these duties have been separated in vertebrates, with the insulin receptor primarily responsible for metabolic control and the IGF-1 receptor important for growth control. Ames dwarf mice carry a mutation in the gene prop1, which encodes a transcription factor required for normal development of the pituitary gland. These mice are deficient in growth hormone, prolactin, and thyroid-stimulating hormone. The lack of these three hormones leads to a reduction in the level of serum IGF-1, and the mice exhibit an increase in life span of roughly 55% (Brown-Borg et al. 1996). Mice that are mutant for the growth hormone-releasing hormone receptor have reduced levels only of growth hormone and of IGF-1 (they have normal levels of prolactin and thyroid-stimulating hormone), and they, too, are long-lived (Flurkey et al. 2001). Direct manipulation of the IGF-1 and insulin receptors also affects longevity. When a single copy of the IGF-1 receptor is knocked out, mice reach relatively normal size and fertility, but they are long-lived (their life span increases up to 25%) (Holzenberger et al. 2003). Finally, the gene Klotho is probably involved in inhibition of insulin and IGF-1 signaling. Mutations in Klotho lead to a progeria syndrome in mice, and overexpression of the gene results in extended life span (Kurosu et al. 2005). Population genetic studies in humans have identified an association between a specific allele of Klotho and life span (Arking et al. 2002). Research into the mechanisms of aging in model systems is accelerating. The number of known C. elegans mutations that increase worm life span exceeds 100, and a handful of such genes are now known in flies and mice (Kenyon 2005).
A general principle holds: the effective modulators of aging that are conserved through evolution are mechanisms that give organisms the flexibility to alter life-history patterns in response to environmental signals. The examples discussed above are but a few of many. The target of rapamycin (TOR) pathway, which is involved in sensing amino acids and controlling cell growth, is emerging as another conserved modulator of aging (Vellai et al. 2003).
It may be that dietary restriction acts by down-regulating TOR signaling. Several mutations in mitochondrial genes, specifically those encoding components of the respiratory chain, increase life span (Dillin et al. 2002), as do mutations in other metabolic and regulatory genes (Rogina et al. 2000).

Merging molecular mechanisms with evolutionary theory

Most of the evolutionary theory of aging was formulated without much specific information about the genetics of aging. Now, with a wealth of information about the molecular genetics of aging, we can ask whether theory and experimental results are in good agreement. Such an inquiry reveals that central experimental results were unanticipated by the classic evolutionary theory of aging. In this section we present the major discrepancies and discuss how their resolution can lead to new insights into aging. One of the tenets of the canonical evolutionary models is that patterns of aging and the incidence and severity of aging-related disease should be largely unaffected by single-gene manipulations. Because of the accumulation of mutations with random and deleterious effects at later ages, delaying aging or ameliorating aging-related functional decline would be impossible—there are just too many genes and too many processes expected to be affected. Moreover, each species would be exposed to unique constraints and selection pressures that would make manipulations that affect aging in one species likely to be ineffective in another. These predictions were proven wrong by the recent insights about the molecular basis of aging discussed earlier. Single mutations in molecular pathways can lead to dramatic extensions of life span in model organisms, and in many instances similar pathways and mechanisms are important in determining life span in vastly different organisms.

Adaptive responses to environmental signals

How can the existence of conserved pathways that regulate the rate of aging be reconciled with the viewpoint that aging exists because of selection’s indifference? One possibility is that some of
these pathways allow organisms to resist aging by altering their allocation of resources in response to environmental signals. Evolutionary biologists have long recognized the powerful influence that environmental cues have on biological systems, an influence that can trigger alterations in life-history patterns that range from the subtle, such as delayed reproduction and increased stress resistance, to the dramatic, such as abrogation of normal development entirely and permanent acceptance of a juvenile form. From an evolutionary standpoint, the benefit of such plasticity is clear. Variable environmental conditions challenge individuals to use external information and make calculated decisions about whether to allocate resources to somatic maintenance (i.e., short-term survival) or to reproduction to maximize their individual fitness. For example, environmentally driven variability in levels of age-specific mortality can favor the evolution of iteroparity and life-history patterns in which reproduction is distributed over increasing life spans or even concentrated toward the end of a long life (Tuljapurkar 1980). In this light, the existence of molecular mechanisms to modulate or enact such decisions is not surprising. The existence of conserved regulatory pathways may explain why some mutations have dramatic effects on life span. While there are indeed many genes with small effects on aging (Murphy et al. 2003), these genes are subject to regulation at a high level. Mutations that impact regulation may lead to the more dramatic effects on aging that were so surprising from an evolutionary standpoint. The experimental data suggest that the regulatory pathways and some of the targets of regulation are shared by many different organisms; they are public mechanisms of aging. The specific cues that initiate regulation and some of the downstream effectors are likely environment- and species-specific; they may be private mechanisms of aging (Martin 2002).
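The allocation logic sketched above can be made concrete with a toy simulation. All payoff numbers below are invented, and the model is ours rather than the chapter's; it shows only the qualitative point that, in an environment fluctuating between benign and stressful seasons, a genotype that senses conditions and switches strategies can achieve a higher long-run (geometric mean) fitness than either fixed strategy.

```python
import math
import random

# Invented per-season fitness payoffs: reproducing fast pays off in a benign
# season but is costly under stress; investing in maintenance ('slow') pays
# off under stress.
PAYOFF = {
    ('benign', 'fast'): 3.0,
    ('benign', 'slow'): 1.2,
    ('stressful', 'fast'): 0.2,
    ('stressful', 'slow'): 1.0,
}

def growth_rate(strategy, seasons=10_000, p_benign=0.5, seed=1):
    """Geometric-mean fitness (long-run growth rate) over random seasons."""
    rng = random.Random(seed)
    log_total = 0.0
    for _ in range(seasons):
        env = 'benign' if rng.random() < p_benign else 'stressful'
        log_total += math.log(PAYOFF[(env, strategy(env))])
    return math.exp(log_total / seasons)

plastic = lambda env: 'fast' if env == 'benign' else 'slow'
always_fast = lambda env: 'fast'
always_slow = lambda env: 'slow'

for name, strategy in [('plastic', plastic), ('always fast', always_fast),
                       ('always slow', always_slow)]:
    print(name, round(growth_rate(strategy), 3))
```

With a roughly even mix of seasons the plastic strategy out-grows both fixed strategies, which is the sense in which variable environments can select for the environment-sensing pathways discussed in the text.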
This view predicts that plastic responses to the environment are an integral aspect of aging in all organisms and that they evolved a long time ago (Fig. 18.1). If indeed organisms respond adaptively to alterations in environmental conditions, then experimental manipulation of these regulatory mechanisms should, in some cases, have maladaptive consequences. For example, stimulation of a dietary restriction response, in which animals experience somatic endurance and reduced reproductive effort, might be maladaptive from an evolutionary perspective if resources are abundant. There is evidence that this may be the case. In C. elegans, mutations in genes encoding components of the insulin signaling pathway significantly increase worm life span. Under standardized, non-competitive conditions, these mutations have little or no effect on other traits, such as development or reproductive effort. However, competition experiments reveal that loss of function of the insulin receptor, daf-2, results in severely reduced individual fitness in the laboratory environment (Jenkins et al. 2004). In general, life span-extending mutations are associated with significant reductions in other traits (Klass 1983), and some costs are environment-dependent (Marden et al. 2003). These observations are consistent with the notion that covarying sets of life-history traits reflect strategies that have evolved to maximize fitness in specific environmental circumstances. In short, genetic manipulations may effectively trigger inappropriate life-history decisions. Perhaps one goal of evolutionary medicine might be to devise interventions that stimulate ancient pathways of endurance and block the expression of strategies that are undesired but that, from an evolutionary perspective, would be more appropriate for today’s gluttonous society.

[Figure 18.1 near here. The schematic shows a variable environment driving the evolution of correlated suites of traits: insulin/IGF-1 signaling linked to early reproduction, somatic weakness, stress sensitivity, and short life; and the Sir2/stress response linked to distributed reproduction, somatic endurance, stress resistance, and long life.]

Figure 18.1 The plasticity of aging. The figure illustrates the hypothesis that variable environments lead to the evolution of conserved pathways that influence the rate of aging. While benign conditions might favor organisms that reproduce quickly and invest little in maintenance and stress resistance, stressful conditions might favor organisms that invest strongly in maintenance and delay reproduction until conditions improve. Environments that vary between benign and stressful can thus select for organisms that can sense the state of the environment and change their life-history strategy accordingly. Conserved pathways that regulate transitions between correlated suites of life-history traits have been identified in diverse organisms, for example in nematodes, fruit flies, and rodents. Insulin/IGF-1 signaling may be activated under benign conditions or in environments where nutrients are replete, and may promote reproduction while compromising longevity. In times of stress, Sir2 may be activated and insulin/IGF-1 signaling repressed, resulting in delayed reproduction and increased longevity.

Going beyond traditional evolutionary models of aging

The basic tenets of the canonical evolutionary models have been successful because they provided a convincing explanation for why aging persists despite its obvious disadvantages for the individual and because they explained part of the variation in aging and life span between and within species. The simplicity of these models allows one to study very general patterns of aging, but their scope is limited. As discussed in the previous section, evolutionary models were unsuccessful at predicting major trends unearthed by the molecular biologists, and there is little doubt that careful modification will be needed to integrate these results. In addition to looking toward molecular biologists to stimulate new advances, evolutionary biologists may also look to themselves. With an emphasis on new ecological and genetic considerations, the classical questions addressed by the evolution of aging will decrease in significance and thereby release intellectual constraints imposed by generations of teaching that, in our opinion, have hindered the development of new ideas. A primary example of intellectual stagnation is the perceived importance of the dichotomy between
the two classic models of aging, mutation accumulation and antagonistic pleiotropy. For a number of years, researchers yearned for ways to quantify the relative importance of the two mechanisms. Most of these experiments were carried out with Drosophila (e.g., Rose 1984), some with other insect species. They were largely inconclusive and reinforced the idea that both mechanisms are likely to be important. Both theories assume mutations with a deleterious effect that is specific for late age; they differ in whether the same mutation is neutral or beneficial early in life. Surely these two theories are two extremes of a continuum, and their distinction is artificial. At one extreme is mutation accumulation, where mutations have absolutely no beneficial effect early in life. At the other lies antagonistic pleiotropy, where mutations have substantial beneficial effects early in life that manifest clearly, can be measured, and lead to substantial selection for the alleles. Between these two extremes lies a continuum of mutations with continuously distributed beneficial effects early in life. Buried within the continuum of mutational effects is the complication that phenotypic effects of mutations are not constant and often depend on the genetic and physical environment. An influence of the genetic background has been found in experiments with fruit flies that analyzed the effect of defined mutations (Kaiser et al. 1997; Spencer and Promislow 2005) or quantitative trait loci (Leips and Mackay 2000) associated with long life span, and in studies with the nematode C. elegans that combined mutant alleles of the insulin-signaling pathway (Gems et al. 1998). Mutational effects are strongly influenced by the physical environment. Studies with C. elegans reported that the effects of individual mutations that extend life span are impacted by the type of bacteria used as a food source (Garsin et al. 2003).
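The continuum can be made quantitative with a toy calculation (our illustration, with invented numbers, in the spirit of standard declining-selection-gradient arguments rather than taken from the chapter): weight the effect of a mutation at each age by survivorship to that age, and vary the size of its early-life benefit.

```python
# Toy model: the selective weight of an age-specific effect shrinks with
# survivorship to that age, so a fixed late-life cost is heavily discounted.
# Mortality rate, ages, and effect sizes are all invented for illustration.

def selection_differential(b_early, c_late, mortality=0.2,
                           early_age=1, late_age=15):
    """Net selective effect of a mutation with an early-life benefit and a
    late-life cost, each weighted by survivorship l(x) = (1 - m)^x."""
    survivorship = lambda age: (1.0 - mortality) ** age
    return b_early * survivorship(early_age) - c_late * survivorship(late_age)

# b_early = 0 is the mutation-accumulation extreme: a full-sized late cost
# is nearly neutral.
print(round(selection_differential(0.0, 1.0), 3))   # -0.035
# A modest early benefit already outweighs the discounted late cost, the
# antagonistic-pleiotropy extreme; intermediate values of b fill the continuum.
print(round(selection_differential(0.05, 1.0), 3))  # 0.005
```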
Some mutations with very large effects on life span in laboratory-raised animals show little or no effect when animals are reared under conditions thought to represent their natural habitat (Van Voorhies et al. 2005). Correlated costs of life span-extending mutations can also be conditional; for example, the Indy mutation in Drosophila, which reduces the rate of aging, leads to a decrease in fecundity under nutrient limitation, but not when food is abundant (Marden et al. 2003).

These considerations reduce the logical utility of a distinction between mutation accumulation and antagonistic pleiotropy and suggest that the dichotomy, in addition to being artificial, may also be devoid of explanatory power. A mutation advantageous early in life in one environment might be neutral (or even deleterious) early in life in another environment or in another genetic background. Most genetic factors that contribute to human aging, for example, were fixed in human and prehuman populations a very long time ago and in a very different environment. It is possible that many of them had consequences early in life that were beneficial in these environments but are irrelevant in the modern environment or in the modern genetic background. Tampering with such genetic factors might, in the modern context, not have the deleterious consequences that antagonistic pleiotropy would suggest.

New directions

How do we move beyond outdated conceptual models toward ones that are more predictive and useful for understanding human aging and disease? We suggest three promising directions that can already be seen in current research: experiments designed to estimate the age-specific properties of new mutations, models of aging that incorporate the inherent structure of genomic regulatory networks, and investigations of the origin of aging itself. One area deserving of special attention is the spectrum of mutational effects that influence aging. The age-specific effects of new mutations have not received sufficient attention in evolutionary theory. Most researchers have focused on how selection changes over adult life and have taken mutations with the ‘right’ phenotypic effects for granted. We do not know how frequent mutations with age-specific effects are. This factor bears on a number of important questions, from whether a given species should age in the first place, to whether extensions of life span will come at a cost early in life, to how precisely age-specific selection can modulate the evolution of intrinsic mortality. Experiments with Drosophila have shed some light on the questions about age-specific effects of mutations (Stearns and Kaiser 1996; Pletcher et al.
1998, 1999; Leips and Mackay 2000); they showed that mutations with age-specific effects on survival and fecundity do occur, but that they are not ubiquitous. Further insight will no doubt come from detailed, age-specific analysis of animals carrying conditional mutations so that genes can be turned on and turned off at different ages and the effects on the rate of aging studied. Global characteristics of genomic regulatory networks illuminate both ultimate and proximate questions about aging. As has been seen, evolutionary studies have traditionally suffered because of a lack of discriminatory hypotheses and because of untested assumptions concerning the properties of genetic effects (Promislow and Pletcher 2002). It would be useful to know, for example, whether genes that impact aging have unusual characteristics. Are they pleiotropic, with effects at many different ages? Are their effects consistently in one direction (either beneficial or deleterious)? These questions were addressed in a creative study that examined the characteristics that aging-related proteins have with respect to the entire yeast protein–protein interaction network (Promislow 2004). Promislow showed that proteins that affect replicative aging in S. cerevisiae interact with an unexpectedly large number of other proteins, i.e., they are highly connected. In addition, a protein’s connectivity is positively correlated with its degree of pleiotropy. Therefore, ‘aging genes’ are highly pleiotropic. Difficult but important follow-up questions concern how protein–protein interactions change with age and whether the incorporation of network structure into evolutionary models helps predict variation in patterns of aging across different taxa.
These new network models (Kowald and Kirkwood 1996; Promislow 2004) have the potential to encompass the diversity of influences that manifest in recent experimental data, but it remains to be seen whether they will reveal general patterns and lead to testable predictions. A third area of research addresses the question of when aging originated in the evolutionary history of life. Initially, a distinction between germline and soma was thought to be required for aging to evolve, the idea being that the negative effects of age could be confined to the parental soma, while the germline remained protected.


However, unicellular organisms, such as S. cerevisiae, which reproduce by simple cell division, also age. The germline–soma requirement was subsequently relaxed to stipulate only a need for asymmetry between parent and offspring. Indeed, close examination of yeast replication reveals systematic differences between the two cells emerging from division; pre-existing subcellular structures segregate to one cell at division (the mother), while the other (the daughter) receives newly synthesized structures. Components that remain in the parent deteriorate over time and lead to aging-related phenotypes and death. Surprisingly, it has recently been shown that many bacteria also divide asymmetrically (although asymmetry does not always manifest morphologically) and that these cells age (Ackermann et al. 2003; Stewart et al. 2005). Aging is therefore not confined to eukaryotes; it originated earlier, in the prokaryotic world. That the most rudimentary types of asymmetry are present in bacteria and may prescribe aging raises the question of why asymmetry would evolve in the first place. Investigation into this question has motivated the development of novel theoretical models (Watve et al. 2006; Ackermann et al. 2007). These models investigate why unicellular organisms would distribute aged, damaged structures unequally at cell division, and why they would not repair such structures sufficiently to guarantee unlimited functioning. One general result from this work is that asymmetric division, and thus the distinction between an aging parent cell and a rejuvenated progeny, can readily evolve as a strategy to cope with damage that unavoidably arises as a consequence of cellular processes. This new area of research suggests that aging might be a more fundamental aspect of cellular life than previously thought. It also raises the issues of whether there exist any organisms that do not age and whether interventions that completely abolish aging are possible.
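A toy damage-segregation model (our sketch, with invented parameters) illustrates why asymmetry matters: if division splits accumulated damage evenly, every cell in the lineage carries a steady load, whereas an asymmetric split keeps one rejuvenated daughter damage-free while the mother ages.

```python
# Follow the damage carried by the 'fresh' branch of a dividing lineage.
# Damage accrues at a fixed (invented) rate per division and is either
# split evenly between the two cells or retained entirely by the mother.

def lineage_damage(symmetric, divisions, damage_per_division=1.0):
    """Damage in the followed cell after each of `divisions` divisions."""
    damage = 0.0
    history = []
    for _ in range(divisions):
        damage += damage_per_division   # damage accumulates before division
        if symmetric:
            damage /= 2                 # both offspring inherit half
        else:
            damage = 0.0                # the daughter is rejuvenated; the
                                        # mother keeps everything and ages
        history.append(damage)
    return history

symmetric_branch = lineage_damage(symmetric=True, divisions=20)
asymmetric_daughter = lineage_damage(symmetric=False, divisions=20)
print(round(symmetric_branch[-1], 3), asymmetric_daughter[-1])  # 1.0 0.0
```

Under symmetric division the whole lineage converges on a constant damage load (here, one unit per cell); with asymmetry, damage is concentrated in an aging mother, consistent with the general result described in the text that asymmetric division can evolve as a damage-coping strategy.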

Concluding remarks

The first scientific questions about aging were posed by evolutionary biologists, who convincingly answered why organisms age: selection does not operate efficiently late in life and cannot
maintain function at an advanced age. Some of the predictions of the canonical evolutionary models have withstood experimental attempts to falsify them, including the influence of heritable genetic factors on patterns of aging and documentation of the impact of external risk factors. However, the predicted inability of single-gene mutations to extend life span and the presumed absence of mechanisms of aging that are conserved across species have failed to hold up to recent molecular investigations. With the power of molecular genetics, much has been learned about how organisms age. Research with genetic model systems revealed that many different processes are affected by aging, but it also showed that many of these processes are under the control of conserved regulatory pathways. Changes in these pathways, either through genetic or environmental manipulation, may induce alternate life-history strategies that alter the allocation of resources to favor somatic maintenance over reproduction. Evolutionary biology has the potential to play an important role in modern aging research. If evolutionary biology wants to sustain something more than an academic role in the development of interventions that impact human aging and aging-related disease, then it must re-emerge with new life and new focus. Sophisticated theoretical models that embrace modern genomics, that incorporate reasonable information on the age-specific effects of new mutations, and that focus on the analysis of public mechanisms of aging are required. Evolutionary biology must work hand in hand with molecular biology to address emerging central questions about aging.
The genetics of the plasticity of aging in response to environmental conditions is a pressing issue, as is whether the molecular pathways that appear to be important regulators of aging under laboratory conditions are also responsible for evolutionary changes in life span, for example during the divergence of the lineages that led to humans and chimpanzees (Partridge and Gems 2006).

Summary

1. Aging does not have a function. It exists because individuals often die for other reasons before aging manifests, and natural selection is thus not efficient in maintaining late-life performance.
2. How fast an individual ages is influenced by genes. These genes do not function to specifically cause aging. It is, however, possible that they impact the allocation of available resources to somatic maintenance or reproduction and in so doing function to postpone aging in certain circumstances.
3. Many of the genes that influence aging are under the control of specific regulatory pathways. Mutations in these pathways often lead to significant changes in the rate of aging. The pathways are shared among very different types of organisms ranging from unicellular fungi to humans.
4. Some of the conserved regulatory pathways detect and decode cues from the environment, suggesting the possibility that environmental conditions favor certain life-history decisions. Genetic interventions may short-circuit normal processing of environmental cues and trigger inappropriate (from an evolutionary standpoint) life-history decisions that result in an increased life span under laboratory or otherwise benign conditions.
5. The ability to alter the investment in maintenance in response to external cues might be advantageous for organisms living in a variable environment. If conditions are harsh, an increased investment in maintenance and repair would improve survival to better times. That different organisms have related pathways suggests that this ability is an important aspect of aging in all eukaryotes.

Acknowledgments

M.A. was supported by the Swiss National Science Foundation. S.D.P. was supported by the U.S. National Institutes of Health (R01AG023166), the Ellison Medical Foundation, and the American Federation for Aging Research.

CHAPTER 19

Evolution, developmental plasticity, and metabolic disease Christopher W. Kuzawa, Peter D. Gluckman, Mark A. Hanson, and Alan S. Beedle

Introduction: diseases of excess or deficiency?

Cardiovascular disease (CVD) is now the leading cause of death worldwide (Mackay et al. 2004). Given their modern appearance as major public health scourges, cardiovascular diseases and related metabolic disorders had long been viewed as ‘lifestyle’ diseases caused primarily by the growing problems of adult overnutrition and weight gain. Then, beginning in the late 1980s, evidence started to accumulate that individuals born small also have higher rates of cardiovascular mortality as adults (Barker et al. 1989). Because fetal growth is largely determined by nutrient delivery across the placenta, this suggested that the risk of adult metabolic diseases might also be increased by undernutrition experienced prior to birth. Initial skepticism about these findings faded as similar results were described in other populations. Extensive animal studies now support findings from human populations and show that unbalanced or restricted maternal or fetal nutrition during pregnancy initiates a similar suite of physiological, metabolic, and morphological changes in adult offspring. These findings suggest that the classic ‘diseases of excess’ may also be characterized as ‘diseases of deficiency’ depending on the age at which the nutritional imbalance occurs. This new developmental approach to the epidemiology of chronic degenerative disease (the ‘developmental origins of health and disease,’ or DOHaD for short) poses an interesting challenge
to classic evolutionary explanations for the rise of these conditions. In its brief history, the field of evolutionary medicine has developed the elegant principle that a beneficial genetic adaptation to one environment can lead to disease when environments change rapidly, as has occurred with recent rapid cultural change (Neel 1962; Eaton and Konner 1985). Considered in an evolutionary light, today’s populations are more likely to become obese not because the gene pool has changed, but because we now inhabit environments markedly different from those that our ancestors confronted. Previously, we had to move on foot to gather food that was widely dispersed. Now, as we consume more but expend less, our scarcity-adapted genome is confronted with the novelty of chronic, positive energy balance, leading to weight gain and its attendant health problems. Thus a genome well-adapted to one environment—the energetically balanced lifestyle of our ancestors—can lead to disease and premature death when placed in the nutritional ecologies of contemporary human societies (see also Chapter 20). This concept of gene-environment ‘mismatch’ is a useful starting point for explaining why chronic overnutrition can lead to diseases like obesity, diabetes, and CVD. But why should undernutrition, when encountered earlier in the life cycle, lead to a similar constellation of adult ailments? Such outcomes could merely be the unavoidable, detrimental effects of an insult that disrupts the normal pattern of embryonic or fetal development (see Gluckman and Hanson 2005), such as the
well-known examples of embryopathy, congenital heart disease, and hypospadias. Although straightforward damage may contribute to the long-term effects of severe fetal undernutrition, it is difficult to see how it could explain the common pattern of outcomes seen across the normal range of birthweight or in response to normal variability in intrauterine nutritional sufficiency in both model organisms and humans. As discussed below, the responses induced by undernutrition early in life are not limited to pathologic outcomes. They include changes in the regulation of endocrine systems, central and peripheral changes in energy metabolism, and modifications in growth rate and maturational timing. A simple model of developmental insult might explain the attenuated growth or function of a specific organ or anatomical structure, but it is far less likely to explain this broad, consistent, and integrated response. In this chapter we pose an alternative explanation for these findings. It is well established that developmental plasticity, defined as the potential for a single genome to create a range of phenotypes in different environmental circumstances, can serve as a powerful mode of biological adaptation, allowing organisms to adjust their ‘hard-wired’ biological settings in a single lifetime, much more rapidly than could be achieved by the slow process of natural selection operating on gene frequencies (Stearns and Koella 1986; West-Eberhard 2003; Bateson et al. 2004). At least some of the biological changes triggered by prenatal stimuli may be components of such a capacity for adaptive developmental plasticity in which the fetus anticipates its postnatal environment from nutritional and endocrine cues conveyed across the placenta (Gluckman and Hanson 2005). 
A capacity for tuning developmental biology to anticipate postnatal conditions could enhance genetic fitness by allowing the organism to adjust its adaptive priorities, such as its body size, nutrient requirements, and tendency to store excess energy in protective stores of body fat. When the actual environment does not match the predicted one, this same capacity for flexibility can elevate risk for disease. We first give a brief overview of the extensive literature documenting the role played by early environments in determining risks of later chronic degenerative diseases in both animal models and humans. We then propose a model of how the underlying biological responses might be integrated into a broader strategy aimed at adjusting the body’s adaptive settings. Most of the biological changes that increase risk for future disease involve adjustments in the handling, metabolism, and allocation of energy within the body. These patterns provide insights into the likely function of these responses, and thus the selective pressures that may have shaped them. Having considered this functional perspective, we conclude by discussing the conditions under which a capacity to adjust metabolic settings in response to prenatal cues could lead to disease in humans. Collectively, these findings expand the classic model of evolutionary medicine by showing that there are not two, but three major sets of factors—genes, environments, and developmental history—that interact across the life cycle and even across generations to determine biological state and disease risk (Fig. 19.1). In this way, we build on the tremendous progress in this area in the past decade to extend the corresponding chapter in the first edition of this book (Barker 1999).

Figure 19.1 Expanding the classic model of cardiovascular epidemiology to account for developmental processes.

The developmental origins of health and disease (DOHaD) paradigm

Origin

Almost half a century ago, Neel (1962) focused attention on an evolutionary explanation for the rising prevalence of obesity and diabetes by proposing that a ‘thrifty genotype’ of alleles that were adaptive under the ‘feast and famine’ conditions
faced by our foraging hominin ancestors is now deleterious in a modern environment of nutritional excess. Despite our knowledge of the human genome, such ‘thrifty genes’ remain elusive, and CVD epidemiology is moving beyond simple genetic models of disease susceptibility to accommodate newer evidence for the importance of the early developmental environment. Some of the first evidence that the environment early in the life cycle influences later susceptibility to chronic degenerative diseases came from studies of historical cohorts in Scandinavia. Using records from Norway, Forsdahl (1977) found that the rate of CVD was higher in cohorts that had experienced higher rates of infant mortality, which he interpreted as a marker of poverty and therefore of undernutrition. This interpretation was later supported by the work of Barker and colleagues (1989; 1994), who published a series of studies documenting inverse relationships between birthweight and adult outcomes such as hypertension, type II diabetes mellitus, and CVD. This work led to the key finding that individuals born small but who went on to gain weight by adulthood were at highest risk for metabolic disease, suggesting that the combination of fetal undernutrition followed by subsequent improvement in nutrition was particularly harmful. Building from this finding, and as an explicit reference to Neel’s prior concept of the thrifty genotype, Hales and Barker (1992) posited the ‘thrifty phenotype’ hypothesis that the fetus faced with a compromised prenatal nutritional environment is forced to adjust organ growth to buffer the brain in utero but at the cost of increased risk for diabetes when that individual is faced with more abundant nutrition after birth. By framing the problem as one of fetal adaptation, this construct provided the foundation for a subsequent synthesis of evolutionary and developmental biology, life-history theory, and fetal medicine.

Evidence from epidemiology

Since the first observations of a relationship between birthweight and later disease risk, there has been a wealth of epidemiological studies linking measures of early nutrition and stress with a variety of clinical and physiological outcomes
later in life. The initial observations on historical cohorts from Britain have now been extended by more recent studies in, for example, Finland, India (Yajnik 2004), and the Philippines (Kuzawa and Adair 2003); for an extended review see Godfrey (2006). Moreover, data from the much-studied survivors of the Dutch Hunger Winter of 1944–5, a well-defined period of maternal undernutrition and stress, broadly support the epidemiological observations (Painter et al. 2005). Epidemiological studies underpinning the DOHaD phenomena highlight several key points, as follows. The outcomes shown to be affected by the human developmental environment include frank disease (CVD, type II diabetes, obesity, metabolic syndrome, osteoporosis, mood disorders, and cancers) and its underlying markers or precursors (cardiovascular function, hypertension, endothelial dysfunction, dyslipidemia, insulin resistance, and hormone levels) as well as factors reflecting growth and morphology (body size, muscle mass, neuron and nephron number), neurological and psychological development, and reproductive strategy (age of puberty, reproductive hormone levels and fecundity) (see Kuzawa 2005). In general, the epidemiological correlations are stronger for disease (e.g., heart disease) than for surrogate markers (e.g., blood pressure), which has been a source of controversy in the field. Although birthweight itself was initially seen as part of the causal pathway, it is now clear that this is an indirect indicator of environmental factors that modify physiology, metabolism, and the early developmental trajectory. The most commonly used marker in these epidemiological studies has been birth size—usually weight, although thinness at birth, judged from the combination of weight and length or from studies of body composition, has been found to be a more sensitive marker in some studies (Godfrey 2006). 
While individuals of lower birthweight have poorer long-term health prospects, there is a continuous relationship between birthweight and later outcomes across the full range of birth sizes (Barker 1994), making clear that pathologically low birthweight is not an obligatory part of the pathway to later disease. Although some have suggested that the pleiotropic effects of certain genes, for instance those modulating insulin’s effects on
both fetal growth and later diabetes risk, could explain some of the associations between birth size and adult disease (Hattersley and Tooke 1999), this perspective is not supported by experimental work on nutritional restriction in animal models. Yet, the increased disease risk conferred by polymorphisms in several genes involved in metabolic regulation is dependent on birthweight, suggesting an interaction with intrauterine nutrition or related factors. For example, susceptibility to later insulin resistance in people born small is modified by polymorphisms in the PPARγ2 receptor involved in adipocyte differentiation and metabolic signaling (Eriksson et al. 2002). Although the first studies tended to focus on intrauterine conditions, as reflected in the initial labeling of the field as ‘fetal origins of disease,’ it has become apparent that critical windows of sensitivity to environmental cues begin prior to conception and extend to at least mid-childhood. Maternal diet and body composition both before and after conception affect various outcomes in the child (Morton 2006). Studies of children born after in vitro fertilization provide further evidence for periconceptional influences and also show that the effects of prenatal conditions are not unidirectional, since girls born after such procedures are taller and have greater insulin sensitivity (lowering CVD risk) than normally conceived controls (Miles et al. 2005). Postnatal nutrition can also have repercussions in later life, in some cases by modifying the effects of prenatal nutrition. The nature of infant feeding, for example breast milk versus formula, can affect later cognitive performance, insulin resistance, or obesity (Morley and Lucas 1997; Stettler et al. 2005), providing strong evidence for plasticity in the establishment of physiologic and metabolic settings during early postnatal life. 
It is unlikely that the effects of prenatal undernutrition and postnatal overnutrition are independent, as postnatal growth patterns, presumably a reflection of nutrient availability, modify the effects of the intrauterine environment as indicated by birthweight. For example, slow growth in the first few months after birth increases the risk of diabetes in children of above-average birthweight, whereas children who are born small and later develop diabetes or insulin resistance are more likely to
have experienced an earlier childhood adiposity rebound and to put on weight faster in late childhood (Eriksson et al. 2003). In other domains, birth size and adult nutrition interact to determine cholesterol levels (Robinson et al. 2006) and the timing of menarche (Adair 2001). Finally, environmental exposures in early life can have transgenerational effects by which one generation’s environmental experiences influence outcomes in their grandchildren. In a historical cohort from northern Sweden, diabetes mortality increased in men if their paternal grandfather was exposed to abundant nutrition during his prepubertal growth period, an effect later extended to paternal grandmother/granddaughter pairs and shown to be transmitted in a gender-specific fashion (Pembrey et al. 2006).

Experimental evidence

These epidemiological observations are matched by experimental studies addressing the DOHaD phenomenon in a range of mammalian species varying in size, growth rate, and life span. Most such studies have focused on models immediately applicable to human disease, such as those relating altered nutrition or stress in pregnancy (or its proximal effector, glucocorticoid levels) to later outcomes such as blood pressure, fat deposition, or insulin resistance, although other phenotypic attributes such as behavior, body temperature, fluid balance, and longevity have also been studied. This area has been reviewed in detail (McMillen and Robinson 2005; Gluckman and Hanson 2006), and here some of the key findings will be summarized. First, the experimental models reproduce the main features of the human epidemiological studies (see contributions in Gluckman and Hanson 2006). Various manipulations of nutritional and endocrine status around the time of conception, at various periods of pregnancy, or in the neonatal period result in a similar phenotype characterized by a trend to visceral obesity, insulin resistance, hypertension, and endothelial dysfunction. These outcomes are induced by a prenatal low-protein or high-fat diet, maternal global undernutrition, and maternal glucocorticoid exposure at differing times in gestation in sheep, rats, mice, and guinea
pigs. In another parallel with the epidemiology, low birthweight may be, but is not necessarily, part of the phenotype observed. Second, the phenotype induced after an early nutritional insult involves parallel changes in different systems reminiscent of what is seen in humans born small. In the rat model of maternal undernutrition, the adult offspring show both centrally mediated alterations in behavior (hyperphagia and lethargy) and peripheral changes in traits such as skeletal muscle insulin sensitivity and obesity (Vickers et al. 2000) as well as alterations in endothelial function, learning capacity, and mood. Third, in another parallel with the human studies, the effects of prenatal manipulations such as undernutrition can be modified by postnatal nutrition or other interventions, resulting in either exacerbation (Vickers et al. 2000) or amelioration (Vickers et al. 2000; Jimenez-Chillaron et al. 2006) of the induced phenotype, indicating a window of plasticity that extends beyond the intrauterine period. The animal models have revealed some of the underlying mechanisms, which include modifications in physiologic, metabolic, morphologic, and epigenetic state. The few studies that have compared animal and human data have, encouragingly, pointed to similar pathways. For example, the proximate mechanisms underlying peripheral insulin resistance include reduced expression of specific insulin signaling proteins in skeletal muscle, which is observed both in low-birthweight humans and in the offspring of rats fed a low-protein diet (Fernandez-Twinn and Ozanne 2006). Other studies explain the similar effects of nutritional challenge and various forms of stress. Maternal protein malnutrition and low birthweight are associated with reduced activity of 11β-hydroxysteroid dehydrogenase type II in the placenta.
Because this steroid-metabolizing enzyme usually protects the fetus from maternal glucocorticoids, its reduced activity exposes the fetus to higher levels of stress hormones (Bertram et al. 2001). Conversely, maternal administration of glucocorticoids affects intermediary metabolism and reduces the activity of placental glucose transporters (Hahn et al. 1999), providing a mechanism for the reduced fetal growth observed in situations of
high maternal stress burden. These findings suggest why multiple forms of stress are experienced as similar exposures and induce similar responses in the developing fetus.

Epigenetic mechanisms

Epigenetic changes in gene expression are central to many induced developmental changes, and research in this area is currently brisk. The term epigenetics is increasingly being reserved for processes that establish and maintain distinct patterns of cellular gene expression that are mitotically heritable without changing the underlying DNA sequence. Such processes underlie a wide range of biological phenomena, including tissue differentiation during development, silencing of transposable elements, genomic imprinting, and X chromosome inactivation. The molecular basis for gene silencing by epigenetic processes has become clearer in recent years. Methylation of CpG islands in promoter regions generally results in reduced transcriptional activity, whereas chemical modification (particularly methylation, acetylation, and phosphorylation) of the histone proteins that package DNA controls the tightness of the chromatin framework, regulating access of transcription factors to DNA. Such epigenetic ‘marking’ is stable but potentially reversible, and is placed and cleared by a limited suite of DNA-binding proteins and DNA-modifying enzymes (Klose and Bird 2006; Nightingale et al. 2006). More recently, the epigenetic activity of small RNAs and microRNAs has become apparent; these non-coding molecules appear to act at multiple levels by regulating chromatin structure, directing the placement of other epigenetic marks, and modulating gene expression by pre- and post-transcriptional mechanisms (Krützfeldt and Stoffel 2006). Most epigenetic marks on DNA are cleared after fertilization to ensure the totipotency of the zygote, with reprogramming as tissue differentiation proceeds; nevertheless, imprinted genes and certain retrotransposon sequences retain their marking and such instances of stability may contribute to observations of intergenerational epigenetic inheritance in animal models (Chong and Whitelaw 2004) and humans (Pembrey et al. 2006).
Recent evidence
suggests that microRNAs may also mediate epigenetic inheritance (Rassoulzadegan et al. 2006). The growing recognition of such non-Mendelian processes adds considerably to the concept that phenotypic ‘memory’ can be transmitted over at least one or two generations; such parental effects are well recognized in plants and invertebrates, but their significance for human biology is only now coming to light. Although epigenetic changes are integral to development and tissue differentiation, it is now clear that environmental factors can affect epigenetic marks and downstream patterns of gene expression, providing a mechanism for the lasting imprint of many early life exposures. Recent experimental studies have demonstrated how epigenetic markings in offspring are sensitive to maternal diet (Lillycrop et al. 2005) or behavior (Weaver et al. 2004). For example, in rats subjected to maternal protein restriction during gestation, reciprocal postnatal changes occur in the offspring in gene promoter methylation and expression of the peroxisome proliferator-activated receptor α (PPARα) and glucocorticoid receptor genes in the liver, associated with changed expression of downstream genes such as PEPCK and acyl-CoA oxidase (Lillycrop et al. 2005). Supplementation of the low-protein diet with folate prevented these changes, suggesting the importance of methyl group provision. Although the epigenetic basis remains under investigation, phenotypes induced by prenatal undernutrition or stress hormone exposure may be transmitted to both matrilineal and patrilineal offspring and grand-offspring, influencing risk for outcomes like the metabolic syndrome and diabetes (Drake et al. 2005; Benyshek et al. 2006). The capacity for the environment to modify epigenetic patterns of gene expression helps explain the continuity of environmental effects on biology from early life into adulthood, and at times across generations.

An integrated response to developmental cues

The complexity of the epigenetic changes that underlie early nutritional programming, and the similarity in the cluster of responses induced by a
range of stressful prenatal stimuli, raise the important question of whether these responses evolved to provide the organism with a capacity to modify metabolic settings and what the advantage of this capacity might be at different stages of development. If we reassess the empirical work reviewed above with respect to the functional impacts of the responses, there is evidence for an integrated strategy in which postnatal nutritional requirements are reduced and glucose is spared for the most critical functions. Collectively, this may boost the body’s capacity to cope with and prepare for challenge or threat.

Reduced body size and lean mass

Individuals born small are often shorter and have reduced muscle mass (sarcopenia) as adults. This reduction in insulin-sensitive tissue contributes to peripheral insulin resistance and suggests a reduction in expenditure on body growth and alterations in metabolism that reduce overall nutritional requirements, which would improve survival in a nutritionally constrained environment. Experimental work has demonstrated some of the mechanisms that allow prenatal nutrition to modify the somatic priorities and postnatal growth trajectory of the developing body. For example, increased maternal nutritional intake during the first half of gestation in domestic pigs acts through effects on insulin-like growth factors to increase the number of secondary muscle fibers and the post-weaning growth rate in offspring, while reduced maternal nutrition has the opposite effect (Dwyer and Strickland 1994).

Muscle becomes insulin-resistant

In addition to a reduced number of muscle cells and reduced muscle mass, the muscle cells present are also less sensitive to insulin (Fernandez-Twinn and Ozanne 2006). The development of insulin resistance in muscle is a central component of the metabolic syndrome associated with obesity, and is a precursor for the development of non-insulin-dependent (type II) diabetes mellitus. In addition to this pathophysiologic role, insulin resistance has functional significance for the organism through its effects on energy substrate use. Most tissues
require insulin to acquire glucose from the circulation, and skeletal muscle is the largest insulin-sensitive tissue in the body. When muscle becomes insulin-resistant, this effectively shunts glucose to more vital insulin-independent organs like the brain and, perhaps, the fetoplacental unit during pregnancy (see also Chapter 6). Modifying muscle insulin sensitivity is an important mechanism of resource partitioning, and in this light insulin resistance can be viewed as a conservative strategy of glucose allocation. There are multiple mechanisms underlying insulin resistance in individuals born small: they include modifications in insulin-mediated glucose uptake and glycogen synthesis, reduced mitochondrial number, and post-receptor changes in insulin action (Fernandez-Twinn and Ozanne 2006). The multiple targets for modified insulin action in response to prenatal undernutrition suggest a coordinated strategy rather than impairment. Importantly, insulin resistance is not present at birth, but is only expressed in the subsequent months or years, and particularly in association with rapid fat gain (Ibáñez et al. 2006). These points will be returned to later.

Fat deposition is enhanced in highly labile visceral depots

Epidemiologic and experimental studies consistently show that prenatally undernourished animals deposit fat preferentially in central or visceral depots, which have well-known adverse effects on metabolic status and disease risk. Unlike fat in peripheral depots, central fat is highly innervated with neuroendocrine-sympathetic fibers, allowing the rapid mobilization and release of free fatty acids (FFA) into the circulation. These FFAs provide energy for muscle and for gluconeogenesis but also influence the organism’s strategy of energy use by inducing peripheral insulin resistance, thus amplifying the altered substrate partitioning discussed earlier. In addition, children born small-for-gestational age have adipocytes that release more FFA in response to sympathetic stimulation (Boiko et al. 2005). Thus, individuals faced with prenatal undernutrition not only deposit more fat in central depots, but also have an enhanced capacity
to mobilize these energy stores when faced with a stressful challenge.

Stress responses and reactivity are accentuated

The physiologic response to stress, as mediated by the hypothalamic–pituitary–adrenal (HPA) axis or the sympathetic nervous system, is also accentuated in animals undernourished or exposed to stress hormones prior to birth. These systems modulate both the deposition (glucocorticoids) and the mobilization (sympathetic responses) of visceral fat, and also have systemic effects on energy use and metabolism, influencing traits like blood pressure and heart rate. They are critical to the organism’s capacity to maintain homeostasis in the face of challenges such as danger, physical exertion, or the vagaries of dietary intake.

A developmental and evolutionary synthesis

Prenatally undernourished organisms are thus born with a set of metabolic biases that influence their developmental trajectory, physiology, and metabolism across the lifecourse. These changes are central to the adult disease consequences of fetal nutritional stress, but from a functional perspective they favor fat deposition in the more labile central depots, accentuate the capacity to mobilize these depots, and repartition glucose allocation away from the insulin-sensitive periphery to insulin-independent tissues, particularly the brain. These adjustments involve complex alterations at multiple levels of biological organization in both central and peripheral tissues, including epigenetic control of gene expression, modification of hormone sensitivity by changes in receptor density and post-receptor pathways, and adjustment of patterns of cell division in response to nutritional and endocrine cues. Although evidence for ‘design’ is a weak basis for evaluating adaptation (Stearns and Ebert 2001), the distributed nature of these responses, their apparent functional integration, and the molecular and epigenetic complexity underlying them make it unlikely that they represent simple organ or tissue
damage. At the same time, however, it remains unclear how these changes are coordinated, what sequence they follow during development, and how or why they tend to co-occur. One possibility is that a small set of early developmental changes have cascading effects on related systems by triggering a compensatory process of phenotypic accommodation (West-Eberhard 2003). For example, any increase in vascular resistance will be associated with changes in the size and structure of the heart as its pattern of growth and development compensates for the added functional loading. In this way, a single primary adjustment could account for multiple correlated changes in the phenotype. By analogy, one or several of the changes in metabolic settings described above could be primary and have cascading effects on other metabolic traits. Recent animal work has shown that simple hormonal cues can reverse many of the changes induced by prenatal nutritional stress, arguing against phenotypic accommodation as an explanation for their co-occurrence. The induction of all of the components of the phenotype associated with maternal undernutrition and postnatal high-fat feeding (obesity, insulin resistance, elevated leptin levels, hyperphagia and reduced activity) can be completely reversed by a simple hormonal manipulation—leptin administration—soon after birth (Vickers et al. 2005). This reversal is accompanied by correlated changes in epigenetic marking and expression of several key genes (Vickers et al. unpublished). Because leptin is an adipose-derived hormone that signals energy status, one interpretation of this finding is that exogenous leptin misleads the undernourished neonate into developing as if it were well-nourished, thus reversing the scarcity-anticipating adjustments initiated during fetal life. 
These findings suggest the presence of a small set of genes that are responsive to prenatal nutrition and that have pleiotropic effects on a range of interrelated metabolic characteristics. The ability to modify the entire suite of adjustments with a single cue is reminiscent of a polyphenism, a context-dependent switch in developmental form found in many species with an evolved capacity for adaptive developmental plasticity (West-Eberhard 2003).

Anticipating the future from maternal cues: predictive developmental plasticity

Why is such an integrated suite of changes triggered by poor early nutrition or stress? If there is indeed a design to these responses, what is their purpose and at what age are they meant to be expressed? The changes initiated by prenatal cues could be functional at more than one developmental stage (for an expanded discussion see Gluckman et al. 2005). Some adjustments triggered in response to prenatal stress, such as the premature termination of a difficult pregnancy or the redirection of blood flow to protect the brain, clearly accrue their benefit immediately. Indeed, early DOHaD proposals, such as the thrifty phenotype hypothesis (Hales and Barker 1992), assumed that the bulk of the benefit was experienced during gestation. Some components of the fetal response to intrauterine stress may not have immediate benefits but instead be made in anticipation of postnatal conditions. Such adaptive anticipatory adjustments have been described as ‘predictive adaptive responses’ (Gluckman and Hanson 2005). Such long-term adaptive adjustment has been criticized and defended on a variety of theoretical and empirical bases (e.g., Wells 2003; Kuzawa 2005; Gluckman et al. 2007). There are many examples of mammalian taxa that convey predictive information to offspring in utero, providing a precedent to consider the operation of similar processes in humans (for examples see Gluckman and Hanson 2005). Many of these are shorter-lived species, however, and the extrapolation of these findings to long-lived species has been questioned (Kuzawa 2005). In addition, Wells (2003) proposed that the postnatal nutritional environment that the newborn experiences is constructed by the mother in service of her own goal of balancing reproductive investment across present and future offspring.
In this scenario, the suite of metabolically thrifty adjustments made in utero is a fetal strategy for enhancing postnatal survival against the backdrop of less than optimal (from the offspring’s perspective) maternal provisioning. What is clear is that many of the metabolic changes induced by prenatal nutritional stress appear after birth, thus pointing to a predominantly postnatal function. Some of the more important changes in
biology triggered in response to prenatal stress are only expressed postnatally, and in some instances take months or years to emerge. This is seen clearly for insulin resistance, which is central to the induced metabolic phenotype and is an important factor in its pathological consequences in adults. Clinical studies show that individuals born small are more insulin sensitive at birth and do not develop insulin resistance until later in childhood, particularly in association with weight gain (Ibáñez et al. 2006). Recent work in rats suggests that the initial insulin sensitivity after prenatal stress reflects increased insulin sensitivity in adipose tissue, which, when coupled with insulin resistance in muscle, promotes rapid deposition of fat (Cettour-Rose et al. 2005). Thus, organisms born under stressful conditions enter the world with a set of tissue-specific metabolic biases that favor accrual of fat, which is preferentially deposited in rapidly mobilizable central depots. If the constellation of adjustments in metabolic partitioning represents a functional complex, as argued above, it must serve this function at or after the age when the central components of the complex—insulin resistance and enhanced fat depots—are established. Such modifications of energy partitioning and metabolism could provide an advantage for an organism entering a world of marginal or less predictable nutrition. Weaning represents an important developmental bottleneck that could have particular relevance for understanding anticipatory metabolic adjustments in mammals. Nutritional stress has its greatest impact on human mortality during infancy and early childhood, and the pre-reproductive timing of this mortality peak means that selection for metabolic traits that improve survival—whether genetic or induced through plasticity—will be strongly favored. 
The brain is a particularly important determinant of this metabolic stress in humans because cerebral metabolism accounts for greater than 50% of energy use throughout human infancy and early childhood (see Kuzawa 1998). The requirement to devote a large fraction of the body’s energy budget to the brain reduces flexibility in metabolic expenditure at this age, and the period of peak brain metabolism largely overlaps with the heightened infectious disease and nutritional stress that often accompany weaning. The
metabolic risk imposed by this convergence of high demand and disruption in supply may explain the unique human tendency to begin deposition of sizeable body fat reserves prior to birth (Kuzawa 1998, see also Chapter 6). By the same reasoning, the constellation of traits induced by prenatal undernutrition—the tendency to deposit visceral fat late in gestation and postnatally, the enhanced ability to mobilize this energy reserve when faced with stress, and the emergence of an insulin-resistant, glucose-sparing phenotype—might help the infant or young child faced with th