Encyclopedia of
Nonlinear Science
Alwyn Scott Editor
ROUTLEDGE NEW YORK AND LONDON
Published in 2005 by Routledge Taylor & Francis Group 270 Madison Avenue New York, NY 10016 www.routledge-ny.com Published in Great Britain by Routledge Taylor & Francis Group 2 Park Square Milton Park, Abingdon Oxon OX14 4RN U.K. www.routledge.co.uk Copyright © 2005 by Taylor & Francis Books, Inc., a Division of T&F Informa. Routledge is an imprint of the Taylor & Francis Group.
This edition published in the Taylor & Francis e-Library, 2006. “To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.” All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage and retrieval system, without permission in writing from the publisher. 10 9 8 7 6 5 4 3 2 1 Library of Congress Cataloging-in-Publication Data Encyclopedia of nonlinear science/Alwyn Scott, Editor p. cm. Includes bibliographical references and index. ISBN 1-57958-385-7 (hb: alk.paper) 1. Nonlinear theories-Encyclopedias. 1. Scott, Alwyn, 1931-QA427, E53 2005 003:75---dc22
ISBN 0-203-64741-6 Master e-book ISBN ISBN 0-203-67889-3 (Adobe eReader Format) (Print Edition)
2004011708
Contents
Introduction vii
Editorial Advisory Board xiii
List of Contributors xv
List of Entries xxxiii
Thematic List of Entries xxxix
Entries A to Z 1
Index 1011
Introduction
Among the several advances of the 20th century, nonlinear science is exceptional for its generality. Although the invention of radio was important for communications, the discovery of DNA structure for biology, the development of quantum theory for theoretical physics and chemistry, and the invention of the transistor for computer engineering, nonlinear science is significant in all these areas and many more. Indeed, it plays a key role in almost every branch of modern research, as this Encyclopedia of Nonlinear Science shows.
In simple terms, nonlinear science recognizes that the whole is more than a sum of its parts, providing a context for consideration of phenomena like tsunamis (tidal waves), biological evolution, atmospheric dynamics, and the electrochemical activity of a human brain, among many others. For a research scientist, nonlinear science offers novel phenomena, including the emergence of coherent structures (an optical soliton, e.g., or a nerve impulse) and chaos (characterized by the difficulties in making accurate predictions for surprisingly simple systems over extended periods of time). Both these phenomena can be studied using mathematical methods described in this Encyclopedia.
From a more fundamental perspective, a wide spectrum of applications arises because nonlinear science introduces a paradigm shift in our collective attitude about causality. What is the nature of this shift? Consider the difference between linear and nonlinear analyses. Linear analyses are characterized by the assumption that individual effects can be unambiguously traced back to particular causes. In other words, a compound cause is viewed as the linear (or algebraic) sum of a collection of simple causes, each of which can be uniquely linked to a particular effect. The total effect responding to the total cause is then considered to be just the linear sum of the constituent effects.
A fundamental tenet of nonlinear science is to reject this convenient, but often unwarranted, assumption. Of course, the notion that components of complex causes can interact among themselves is not surprising to any thoughtful person who manages to get through an ordinary day of normal life, and it is not at all new. Twenty-four centuries ago, Aristotle described four types of cause (material, efficient, formal, and final), which overlap and intermingle in ways that were often overlooked in 20th-century thought but are now under scrutiny. Consider some examples of linear scientific thinking that are presently being reevaluated in the context of nonlinear science.
--- Around the middle of the 20th century, behavioral psychologists adopted the theoretical position that human mental activity can be reduced to a sum of individual responses to specific stimuli that have been learned at earlier stages of development. Current research in neuroscience shows this perspective to be unwarranted.
--- Some evolutionary psychologists believe that particular genes, located in the structure of DNA, can always be related in a one-to-one manner to individual features of an adult organism, leading to hunts for a crime gene that seem abhorrent to moralists. Nonlinear science suggests that the relation between genes and features of an adult organism is more intricate than the linear perspective assumes.
--- The sad disintegration of space shuttle Columbia on the morning of February 1, 2003, set off a search for the cause of the accident, ignoring Aristotelian insights into the difficulties of defining such a concept, never mind sorting out the pieces. Did the mishap occur because the heat-resistant tiles were timeworn (a material cause)? Or because 1.67 pounds of debris hit the left wing at 775 ft/s during takeoff (an efficient cause)? Perhaps a management culture that discounted the importance of safety measures (a formal cause) should shoulder some of the blame.
--- Cultural phenomena, in turn, are often viewed as the mere sum of individual psychologies, ignoring the grim realities of war hysteria and lynch mobs, not to mention the tulip craze of 17th-century Holland, the more recent dot-com bubble, and the outbreak of communal mourning over the death of Princess Diana.
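The distinction between linear and nonlinear analysis drawn above can be stated compactly. The following is only an illustrative sketch: the symbols $L$, $N$, $c_1$, and $c_2$ are assumptions introduced here and do not appear in the original text, and the quadratic response is chosen merely as the simplest nonlinear example.

```latex
% Linear response L: the effect of a compound cause is the sum of the separate effects.
L(c_1 + c_2) = L(c_1) + L(c_2)

% A simple nonlinear response, N(c) = c^2, does not obey this rule:
N(c_1 + c_2) = (c_1 + c_2)^2 = N(c_1) + N(c_2) + 2 c_1 c_2
```

The cross term $2 c_1 c_2$ depends on both causes jointly, so it cannot be attributed to either one alone; in this small example it is what makes the whole differ from the sum of its parts.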
Evolution of the Science
As the practice of nonlinear science involves such abstruse issues, one might expect its history to be checkered, and indeed it is. Mathematical physics began with the 17th-century work of Isaac Newton, whose formulation of the laws of mechanical motion and gravitation explained how the Earth moves about the Sun, replacing a final cause (God’s plan) with an efficient cause (the force of gravity). Because it assumed that the net gravitational force acting on any celestial body is the linear (vector) sum of individual forces, Newton’s theory provides support for the linear perspective in science, as has often been emphasized. Nonetheless, the mathematical system Newton developed (calculus) is the natural language for nonlinear science, and he used this language to solve the two-body problem (collective motion of Earth and Moon)---the first nonlinear system to be mathematically studied.
Also in the 17th century, Christiaan Huygens noted that two pendulum clocks (which he had recently invented) kept exactly the same time when hanging from a common support. (Confined to his room by an indisposition, Huygens observed the clocks over a period of several days, during which the swinging pendula remained in step.) If the clocks were separated to opposite sides of the room, one lost several seconds a day with respect to the other. From small vibrations transmitted through the common support, he concluded, the two clocks became synchronized---a typical nonlinear phenomenon.
In the 18th century, Leonhard Euler used Newton’s laws of motion to derive nonlinear field equations for fluid flow, which were augmented a century later by Louis Navier and George Stokes to include the dissipative effects of viscosity that are present in real fluids.
In their generality, these equations defied solution until the middle of the 20th century when, together with the digital computer, elaborations of the Navier--Stokes equations provided a basis for general models of the Earth’s atmosphere and oceans, with implications for the vexing question of global warming. During the latter half of the 19th century, however, special analytic solutions were obtained by Joseph Boussinesq and related to experimental observations of hydrodynamic solitary waves by John Scott Russell. These studies---which involved a decade of careful observations of uniformly propagating heaps of water on canals and in wave tanks---were among the earliest research
programs in the area now recognized as nonlinear science. At about the same time, Pierre François Verhulst formulated and solved a nonlinear differential equation---sometimes called the logistic equation---to model the population growth of his native Belgium. Toward the end of the 19th century, Henri Poincaré returned to Newton’s original theme, presenting a solution of the three-body problem of celestial motion (e.g., a planet with two moons) in a mathematical competition sponsored by the King of Sweden. Interestingly, a serious error in this work was discovered prior to its publication, and he (Poincaré, not the Swedish king) eventually concluded that the three-body problem cannot be exactly solved. Now regarded by many as the birth of the science of complexity, this negative result had implications that were not widely appreciated until the 1960s, when numerical studies of simplified atmospheric models by Edward Lorenz showed that nonlinear systems with as few as three degrees of freedom can readily exhibit the nonlinear phenomenon of chaos. (A key observation here was of an unanticipated sensitivity to initial conditions, popularly known as the butterfly effect from Lorenz’s speculation that the flap of a butterfly’s wings in Brazil [might] set off a tornado in Texas.) During the first half of the 20th century, the tempo of research picked up. Although still carried on as unrelated activities, there appeared a notable number of experimental and theoretical studies now recognized as precursors of modern nonlinear science.
Among others, these include Albert Einstein’s nonlinear theory of gravitation; nonlinear field theories of elementary particles (like the recently discovered electron) developed by Gustav Mie and Max Born; experimental observations of local modes in molecules by physical chemists (for which a nonlinear theory was developed by Reinhard Mecke in the 1930s, forgotten, and then redeveloped in the 1970s); biological models of predator-prey population dynamics formulated by Vito Volterra (to describe year-to-year variations in fish catches from the Adriatic Sea); observations of a profusion of localized nonlinear entities in solid-state physics (including ferromagnetic domain walls, crystal dislocations, polarons, and magnetic flux vortices in superconductors, among others); a definitive experimental and theoretical study of nerve impulse propagation on the giant axon of the squid by Alan Hodgkin and Andrew Huxley; Alan Turing’s theory of pattern formation in the development of biological organisms; and Boris Belousov’s observations of pattern formation in a chemical solution, which were at first ignored (under the mistaken assumption that they violated the second law of thermodynamics) and later confirmed and extended by Anatol
Zhabotinsky and Art Winfree. Just as the invention of the laser in the early 1960s led to numerous experimental and theoretical studies in the new field of nonlinear optics, the steady increases in computing power throughout the second half of the 20th century enabled ever more detailed numerical studies of hydrodynamic turbulence and chaos, whittling away at the long-established Navier--Stokes equations and confirming the importance of Poincaré’s negative result on the three-body problem. Thus, it was evident by 1970 that nonlinearity manifests itself in several remarkable properties of dynamical systems, including the following. (There are others, some no doubt waiting to be discovered.)
--- Many nonlinear partial differential equations (wave equations, diffusion equations, and more complicated field equations) are often observed to exhibit localized or lump-like solutions, similar to Russell’s hydrodynamic solitary wave. These coherent structures of energy or activity emerge from initial conditions as distinct dynamic entities, each having its own trajectory in space-time and characteristic ways of interacting with others. Thus, they are things in the normal sense of the word. Interestingly, it is sometimes possible to compute the properties of emergent entities (their speeds and shapes) from initial conditions and express them as tabulated functions (theta functions or elliptic functions), thereby extending the analytic reach of nonlinear analysis. Examples of emergent entities include tornadoes, nerve impulses, magnetic domain walls, tsunamis, optical solitons, Jupiter’s Great Red Spot, black holes, schools of fish, and cities, to name but a few. A related phenomenon, exemplified by meandering rivers, bolts of lightning, and woodland paths, is called filamentation, which also causes spotty output beams in poorly designed lasers.
--- Surprisingly simple nonlinear systems (Poincaré’s three-body problem is the classic example) are found to have chaotic solutions, which remain within a bounded region, while the difference between neighboring solution trajectories grows exponentially with time. Thus, the course of a solution trajectory is strongly sensitive to its initial conditions (the butterfly effect). Chaotic solutions arise in both energy-conserving (Hamiltonian) systems and dissipative systems, and they are fated to wander unpredictably as trajectories that cannot be accurately extended into the future for unlimited periods of time. As Lorenz pointed out, the chaotic behavior of the Earth’s atmosphere makes detailed meteorological predictions problematic, to the delight of the mathematician and the despair of the weatherman. Chaotic systems also exhibit strange attractors in the solution space, which are characterized by fractal (non-integer) dimensions.
--- Nonlinear problems often display threshold phenomena, meaning that there is a relatively sharp boundary across which the qualitative nature of a solution changes abruptly. This is the basic property of an electric wall switch, the trigger of a pistol, and the flip-flop circuit that a computer engineer uses to store a bit of information. (Indeed, a computer can be viewed as a large, interconnected collection of threshold devices.) Sometimes called tipping points in the context of social phenomena, thresholds are an important part of our daily experience, where they complicate the relationship of causality to legal responsibility. Was it the last straw that broke the camel’s back? Or did all of the straws contribute to some degree? Should each be blamed according to its weight? How does one assign culpability for the Murder on the Orient Express?
--- Nonlinear systems with several spatial coordinates often exhibit spontaneous pattern formation, examples of which include fairy rings of mushrooms, oscillatory patterns of heart muscle activity under fibrillation (leading to sudden cardiac arrest), weather fronts, the growth of form in a biological embryo, and the Gulf Stream. Such patterns can be chaotic in time and regular in space, regular in time and chaotic in space, or chaotic in both space and in time, which in turn is a feature of hydrodynamic turbulence.
--- If the input to (or stimulation of) a nonlinear system is a single frequency sinusoid, the output (or response) is nonsinusoidal, comprising a spectrum of sinusoidal frequencies. For lossless nonlinear systems, this can be an efficient means for producing energy at integer multiples of the driving frequency, through the process of harmonic generation. In electronics, this process is widely used for digital tuning of radio receivers.
Taking advantage of the nonlinear properties of certain transparent crystals, harmonic generation is also employed in laser optics to create light beams of higher frequency, for example, conversion of red light to blue.
--- Another nonlinear phenomenon is the synchronization of weakly coupled oscillators, first observed by the ailing Huygens in the winter of 1665. Now recognized in a variety of contexts, this effect crops up in the frequency locking of electric power generators tied to the same grid and the coupling of biological rhythms (circadian rhythms in humans, hibernation of bears, and the synchronized flashing of Indonesian fireflies), in addition to many applications in electronics. Some suggest that neuronal firings in the neocortex may be mutually synchronized.
--- Shock waves are familiar to most of us as the boom of a jet airplane that has broken the sound barrier or the report of a cannon. Closely related
from a mathematical perspective are the bow wave of a speedboat, the breaking of onshore surf, and the sudden automobile pileups that can occur on a highway that is carrying traffic close to its maximum capacity.
--- More complicated nonlinear systems can be hierarchical in nature. This comes about when the emergence of coherent states at one level provides a basis for new nonlinear dynamics at a higher level of description. Thus, in the course of biological evolution, chemical molecules emerged from interactions among the atomic elements, and biological molecules then emerged from simpler molecules to provide a basis for the dynamics of a living cell. From collections of cells, multicellular organisms emerged, and so on up the evolutionary ladder to creatures like ourselves, who comprise several distinct levels of biological dynamics. Similar structures are observed in the organization of coinage and of military units, not to mention the hierarchical arrangement of information in the human brain.
Often, qualitatively related behaviors---involving one or more of such nonlinear manifestations---are found in models that arise from different areas of application, suggesting the need for interdisciplinary communications. By the early 1970s, therefore, research in nonlinear science was in a state that the physical chemists might describe as supersaturated. Dozens of people across the globe were working on one facet or another of nonlinear science, often unaware of related studies in traditionally unrelated fields. During the mid-1970s, this activity experienced a phase change, which can be viewed as a collective nonlinear effect in the sociology of science. Unexpectedly, a number of conferences devoted entirely to nonlinear science were organized, with participants from a variety of professional backgrounds, nationalities, and research interests eagerly contributing.
Solid-state physicists began to talk seriously with biologists, neuroscientists with chemical engineers, and meteorologists with psychologists. As interdisciplinary barriers crumbled, these unanticipated interactions led to the founding of centers for nonlinear science and the launching of several important research journals amid an explosion of research activity. By the early 1980s, nonlinear science had gained recognition as a key component of modern inquiry, playing a central role in a wide spectrum of activities. In the terminology introduced by Thomas Kuhn, a new paradigm had been established.
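The sensitivity to initial conditions described in this section can be illustrated numerically. The following is a minimal sketch rather than anything drawn from the original text: it assumes the standard three-variable Lorenz equations with the commonly quoted parameter values (sigma = 10, rho = 28, beta = 8/3), a hand-written fourth-order Runge-Kutta integrator, and an arbitrarily chosen initial offset of 1e-9. All function names are hypothetical.

```python
# Sketch: two nearby trajectories of the Lorenz equations drift apart
# while both remain inside a bounded region.
# Parameter values, step size, and starting points are illustrative assumptions.


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three-variable Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""

    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation(p, q):
    """Euclidean distance between two states."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def run(steps=2500, dt=0.01, eps=1e-9):
    """Integrate two trajectories whose starting points differ by eps in x.

    Returns the separation after every step and the largest coordinate
    magnitude reached by the first trajectory.
    """
    a = (1.0, 1.0, 1.0)
    b = (1.0 + eps, 1.0, 1.0)
    history = []
    largest = 0.0
    for _ in range(steps):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
        largest = max(largest, max(abs(v) for v in a))
        history.append(separation(a, b))
    return history, largest


if __name__ == "__main__":
    history, largest = run()
    for n in (0, 500, 1000, 1500, 2000, 2499):
        print(f"t = {(n + 1) * 0.01:5.2f}  separation = {history[n]:.3e}")
    print(f"largest coordinate magnitude: {largest:.1f}")
```

Running the script prints the distance between the two trajectories at a few sample times. Under these assumptions the distance should grow by many orders of magnitude while the coordinates themselves stay bounded, which is the combination of properties the text attributes to chaotic solutions.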
About this Book
The primary aim of this Encyclopedia is to provide a source from which undergraduate and graduate
students in the physical and biological sciences can study how concepts of nonlinear science are presently understood and applied. In addition, it is anticipated that teachers of science and research scientists who are unfamiliar with nonlinear concepts will use the work to expand their intellectual horizons and improve their lectures. Finally, it is hoped that this book will help members of the literate public---philosophers, social scientists, and physicians, for example---to appreciate the wealth of natural phenomena described by a science that does not discount the notion of complex causality.
An early step in writing the Encyclopedia was to choose the entry subjects---a difficult task that was accomplished through the efforts of a distinguished Board of Advisers (see page xiii), with members from Australia, Germany, Italy, Japan, Russia, the United Kingdom, and the United States. After much sifting and winnowing, an initial list of about a thousand suggestions was reduced to the 438 items given on pages 1--1010. Depending on the subject matter, the entries are of several types. Some are historical or descriptive, while others present concepts and ideas that require notations from physics, engineering, or mathematics. Although most of the entries were planned to be about a thousand words in length, some---covering subjects of greater generality or importance---are two or four times as long.
Of the many enjoyable aspects in editing this Encyclopedia, the most rewarding has been working with those who wrote it---the contributors. The willing way in which these busy people responded to entry invitations and their enthusiastic preparation of assignments underscores the degree to which nonlinear science has become a community with a healthy sense of professional responsibility. In every case, the contributors have tried to present their ideas as simply as possible, with a minimum of technical jargon.
For a list of the contributors and their affiliations, see pages xv--xxxi, from which it is evident that they come from about 30 different countries, emphasizing the international character of nonlinear science.
A proper presentation of the diverse professional perspectives that make up nonlinear science requires careful organization of the Encyclopedia, which we attempt to provide. Although each entry is self-contained, the links among them can be explored in several ways. First, the Thematic List on pages xxxix--xliii groups entries within several categories, providing a useful summary of related entries through which the reader can surf. Second, the entries have See also notes, both within the text and at the end of the entry, encouraging the reader to browse outwards from a starting node. Finally, the Index contains a detailed list of
topics that do not have their own entries but are discussed within the context of broader entries. If you cannot find an entry on a topic you expected to find, use the Thematic List or Index to locate the title of the entry that contains the item you seek. Additionally, all entries have selected bibliographies or suggestions for further reading, leading to original research and textbooks that augment the overview approach to which an encyclopedia is necessarily limited.
Although much of nonlinear science evolved from applied mathematics, many of the entries contain no equations or mathematical symbols and can be absorbed by the general reader. Some entries are necessarily technical, but efforts have been made to explain all terms in simple English. Also, many entries have either line diagrams expanding on explanations given in the text or photographs illustrating typical examples. Typographical errors will be posted on the encyclopedia web site at http://www.routledgeny.com/ref/nonlinearsci/.
The editing of this Encyclopedia of Nonlinear Science culminates a lifetime of study in the area, leaving me indebted to many. First is the Acquisitions Editor, Gillian Lindsey, who conceived of the project, organized it, and carried it from its beginnings
in London across the ocean to publication in New York. Without her dedication, quite simply, the Encyclopedia would not exist. Equally important to reaching the finished work were the efforts of the advisers, contributors, and referees, who, respectively, planned, wrote, and vetted the work, and to whom I am deeply grateful. On a broader time-span are colleagues and students from the University of Wisconsin, Los Alamos National Laboratory, the University of Arizona, and the Technical University of Denmark, with whom I have interacted over four decades. Although far too many to list, these collaborations are fondly remembered, and they provide the basis for much of my editorial judgment. Finally, I express my gratitude for the generous financial support of research in nonlinear science that has been provided to me since the early 1960s by the National Science Foundation (USA), the National Institutes of Health (USA), the Consiglio Nazionale delle Ricerche (Italy), the European Molecular Biology Organization, the Department of Energy (USA), the Technical Research Council (Denmark), the Natural Science Research Council (Denmark), the Thomas B. Thriges Foundation (Denmark), and the Fetzer Foundation (USA).
Alwyn Scott
Tucson, Arizona
2004
Editorial Advisory Board

Friedrich H. Busse
Theoretical Physics, Universität Bayreuth, Germany

Antonio Degasperis
Dipartimento di Fisica, Università degli Studi di Roma "La Sapienza", Italy

William D. Ditto
Applied Chaos Lab, Georgia Institute of Technology, USA

Chris Eilbeck
Department of Mathematics, Heriot-Watt University, UK

Sergej Flach
Max Planck Institut für Physik komplexer Systeme, Germany

Hermann Flaschka
Department of Mathematics, The University of Arizona, USA

Hermann Haken
Center for Synergetics, University of Stuttgart, Germany

James P. Keener
Department of Mathematics, University of Utah, USA

Yuri Kivshar
Nonlinear Physics Center, Australian National University, Canberra, Australia

Yoshiki Kuramoto
Department of Physics, Kyoto University, Japan

Dave McLaughlin
Courant Institute of Mathematical Sciences and Provost, New York University, USA

Lev A. Ostrovsky
Zel Technologies/University of Colorado, Boulder, and Institute of Applied Physics, Nizhny Novgorod, Russia

Edward Ott
Institute for Research in Electronics and Applied Physics, University of Maryland, USA

A.T. Winfree (deceased)
Formerly Department of Ecology and Evolutionary Biology, University of Arizona, USA

Ludmila V. Yakushevich
Institute of Cell Biophysics, Russian Academy of Science, Pushchino, Russia

Lai-Sang Young
Courant Institute of Mathematical Sciences, New York University, USA
List of Contributors

Ablowitz, Mark J.
Professor, Department of Applied Mathematics, University of Colorado, Boulder, USA
Ablowitz--Kaup--Newell--Segur system

Aigner, Andreas A.
Research Associate, Department of Mathematical Sciences, University of Exeter, UK
Atmospheric and ocean sciences
General Circulation models of the atmosphere
Navier--Stokes equation
Partial differential equations, nonlinear

Albano, Ezequiel V.
Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), University of La Plata, Argentina
Forest fires

Aratyn, Henrik
Professor, Physics Department, University of Illinois at Chicago, USA
Dressing method

Aref, Hassan
Dean of Engineering and Reynolds Metals Professor, Virginia Polytechnic Institute & State University, USA
Bernoulli’s equation
Chaos vs. turbulence
Chaotic advection
Cluster coagulation
Hele-Shaw cell
Newton’s laws of motion

Arrowsmith, David
Professor, School of Mathematical Sciences, Queen Mary University of London, UK
Symbolic dynamics
Topology

Athorne, Christopher
Senior Lecturer, Department of Mathematics, University of Glasgow, UK
Darboux transformation

Bahr, David
Assistant Professor, Department of Computer Science, Regis University, Colorado, USA
Glacial flow

Ball, Rowena
Department of Theoretical Physics, Australian National University, Australia
Fairy rings of mushrooms
Kolmogorov cascade
Singularity theory

Barnes, Howard
Unilever Research Professor of Industrial Rheology, Department of Mathematics, University of Wales Aberystwyth, Wales
Rheology

Barthes, Mariette
Groupe de Dynamique des Phases Condensées, UMR CNRS 5581, Université Montpellier 2, France
Rayleigh and Raman scattering and IR absorption

Beck, Christian
Professor, School of Mathematical Sciences, Queen Mary University of London, UK
Free energy
Multifractal analysis
String theory

Beeckman, Jeroen
Department of Electronics and Information Systems, Ghent University, Belgium
Liquid crystals

Benedict, Keith
Senior Lecturer, School of Physics and Astronomy, University of Nottingham, UK
Anderson localization
Frustration

Bergé, Luc
Commissariat à l’Energie Atomique, Bruyères-le-Châtel, France
Development of singularities
Filamentation
Kerr effect

Berland, Nicole
Chimie Général et Organique, Lycée Faidherbe de Lille, France
Belousov--Zhabotinsky reaction

Bernevig, Bogdan A.
Physics Department, Massachusetts Institute of Technology, USA
Holons

Biktashev, Vadim N.
Lecturer in Applied Maths, Mathematical Sciences, University of Liverpool, UK
Vortex dynamics in excitable media

Binczak, Stephane
Laboratoire d’Electronique, Informatique et Image, Université de Bourgogne, France
Ephaptic coupling
Myelinated nerves

Biondini, Gino
Assistant Professor, Department of Mathematics, Ohio State University, USA
Einstein equations
Harmonic generation

Blair, David
Professor, School of Physics, The University of Western Australia, Australia
Gravitational waves

Boardman, Alan D.
Professor of Applied Physics, Institute for Materials Research, University of Salford, UK
Polaritons

Bollt, Erik M.
Associate Professor, Departments of Mathematics & Computer Science and Physics, Clarkson University, Potsdam, N.Y., USA
Markov partitions
Order from chaos

Boon, J.-P.
Professor, Faculté des Sciences, Université Libre de Bruxelles, Belgium
Lattice gas methods

Borckmans, Pierre
Center for Nonlinear Phenomena & Complex Systems, Université Libre de Bruxelles, Belgium
Turing patterns

Boumenir, Amin
Department of Mathematics, State University of West Georgia, USA
Gel’fand--Levitan theory

Bountis, Tassos
Professor, Department of Mathematics and Center for Research and Application of Nonlinear Systems, University of Patras, Greece
Painlevé analysis

Boyd, Robert W.
Professor, The Institute of Optics, University of Rochester, USA
Frequency doubling

Bradley, Elizabeth
Associate Professor, Department of Computer Science, University of Colorado, USA
Kirchhoff’s laws

Bullough, Robin
Professor, Mathematical Physics, University of Manchester Institute of Science and Technology, UK
Maxwell--Bloch equations
Sine-Gordon equation

Bunimovich, Leonid
Regents Professor, Department of Mathematics, Georgia Institute of Technology, USA
Billiards
Deterministic walks in random environments
Lorentz gas

Busse, Friedrich (Adviser)
Professor, Theoretical Physics, University of Bayreuth, Germany
Dynamos, homogeneous
Fluid dynamics
Magnetohydrodynamics

Calini, Annalisa M.
Associate Professor, Department of Mathematics, College of Charleston, USA
Elliptic functions
Mel’nikov method

Caputo, Jean Guy
Laboratoire de Mathématiques, Institut National des Sciences Appliquées de Rouen, France
Jump phenomena
List of Contributors Censor, Dan Professor, Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Israel Volterra series and operators Chen, Wei-Yin Professor, Department of Chemical Engineering, University of Mississippi, USA Stochastic processes Chernitskii, Alexander A. Department of Physical Electronics, St. Petersburg Electrotechnical University, Russia Born--Infeld equations Chiffaudel, Arnaud ´ CEA-Saclay (Commissariat a` l’Energie Atomique) & CNRS (Centre National de la Recherche Scientifique), France Hydrothermal waves Choudhury, S. Roy Professor, Department of Mathematics, University of Central Florida, USA Kelvin--Helmholtz instability Lorenz equations Christiansen, Peter L. Professor, Informatics and Mathematical Modelling and Department of Physics, Technical University of Denmark, Denmark Separation of variables Christodoulides, Demetrios Professor, CREOL/School of Optics, University of Central Florida, USA Incoherent solitons Coskun, Tamer Assistant Professor, Department of Electrical Engineering, Pamukkale University, Turkey Incoherent solitons Cruzeiro, Leonor CCMAR and FCT, University of Algarve, Campus de Gambelas, Faro, Portugal Davydov soliton Cushing, J.M. Professor, Department of Mathematics, University of Arizona, USA Population dynamics Dafilis, Mathew School of Biophysical Sciences and Electrical Engineering, Swinbume University of Technology, Australia Electroencephalogram at mesoscopic scales
Davies, Brian Department of Mathematics, Australian National University, Australia Integral transforms Period doubling
Davis, William C. Formerly, Los Alamos National Laboratory, USA Explosions
deBruyn, John Professor, Department of Physics and Physical Oceanography, Memorial University of Newfoundland, Canada Phase transitions Thermal convection
Deconinck, Bernard Assistant Professor, Department of Applied Mathematics, University of Washington, USA Kadomtsev--Petviashvili equation Periodic spectral theory Poisson brackets
Degallaix, Jerome School of Physics, The University of Western Australia, Australia Gravitational waves
Deift, Percy Professor, Department of Mathematics, Courant Institute of Mathematical Sciences, New York University, USA Random matrix theory IV: Analytic methods Riemann--Hilbert problem
Deryabin, Mikhail V. Department of Mathematics, Technical University of Denmark, Denmark Kolmogorov--Arnol’d--Moser theorem
Dewel, Guy (deceased) Formerly Professor, Faculté des Sciences, Université Libre de Bruxelles, Belgium Turing patterns
Diacu, Florin Professor, Department of Mathematics and Statistics, University of Victoria, Canada Celestial mechanics N-body problem
Ding, Mingzhou Professor, Department of Biomedical Engineering, University of Florida, USA Intermittency
Dmitriev, S.V. Researcher, Institute of Industrial Science, University of Tokyo, Japan Collisions
Elgin, John Professor, Maths Department, Imperial College of Science, Technology and Medicine, London, UK Kuramoto--Sivashinsky equation
Dolgaleva, Ksenia Department of Physics, M.V. Lomonosov Moscow State University, Moscow and The Institute of Optics, University of Rochester, USA Frequency doubling
Emmeche, Claus Associate Professor and Head of Center for the Philosophy of Nature and Science Studies, University of Copenhagen, Denmark Causality
Donoso, José M. E.T.S.I. Aeronáuticos, Universidad Politécnica, Madrid, Spain Ball lightning
Enolskii, Victor Professor, Heriot-Watt University, UK Theta functions
Doucet, Arnaud Signal Processing Group, Department of Engineering, Cambridge University, UK Monte Carlo methods
Falkovich, Gregory Professor, Department of Physics of Complex Systems, Weizmann Institute of Science, Israel Mixing Turbulence
Dritschel, David Professor, Department of Applied Mathematics, The University of St. Andrews, UK Contour dynamics
Dupuis, Gérard Chimie générale et organique, Lycée Faidherbe de Lille, France Belousov--Zhabotinsky reaction
Easton, Robert W. Professor, Department of Applied Mathematics, University of Colorado, Boulder, USA Conley index
Eckhardt, Bruno Professor, Fachbereich Physik, Philipps Universität, Marburg, Germany Chaotic advection Maps in the complex plane Periodic orbit theory Quantum chaos Random matrix theory I: Origins and physical applications Shear flow Solar system Universality
Falqui, Gregorio Professor, Mathematical Physics Sector, International School for Advanced Studies, Trieste, Italy Hodograph transform N-soliton formulas
Faris, William G. Professor, Department of Mathematics, University of Arizona, USA Martingales
Feddersen, Henrik Research Scientist, Climate Research Division, Danish Meteorological Institute, Denmark Forecasting
Fedorenko, Vladimir V. Senior Scientific Researcher, Institute of Mathematics, National Academy of Science of Ukraine, Ukraine One-dimensional maps
Fenimore, Paul W. Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, USA Protein dynamics
Efimov, I. Associate Professor of Biomedical Engineering, Stanley and Lucy Lopata Endowment, Washington University, Missouri, USA Cardiac muscle models
Flach, Sergej (Adviser) Max Planck Institut für Physik komplexer Systeme, Germany Discrete breathers Symmetry: equations vs. solutions
Eilbeck, Chris (Adviser) Professor, Department of Mathematics, Heriot-Watt University, UK Discrete self-trapping system
Flaschka, Hermann (Adviser) Professor, Department of Mathematics, The University of Arizona, USA Toda lattice
Fletcher, Neville Professor, Department of Electronic Materials Engineering, Australian National University, Australia Overtones
Garnier, Nicolas Laboratoire de Physique, École Normale Supérieure de Lyon, France Hydrothermal waves
Floría, Luis Mario Department of Theory and Simulation of Complex Systems, Instituto de Ciencia de Materiales de Aragón, Spain Aubry--Mather theory Commensurate-incommensurate transition Frenkel--Kontorova model
Gaspard, Pierre P. Center for Nonlinear Phenomena & Complex Systems, Université Libre de Bruxelles, Belgium Entropy Maps Quantum theory Rössler systems
Forrester, Peter Department of Mathematics and Statistics, University of Melbourne, Australia Random matrix theory II: Algebraic developments
Gendelman, Oleg Faculty of Mechanical Engineering, Israel Institute of Technology, Israel Heat conduction
Fowler, W. Beall Emeritus Professor, Physics Department, Lehigh University, USA Color centers
Giuliani, Alessandro Environment and Health Department, Istituto Superiore di Sanità, Rome, Italy Algorithmic complexity
Fraedrich, Klaus Professor, Meteorologisches Institut, Universität Hamburg, Germany Atmospheric and ocean sciences General circulation models of the atmosphere
Glass, Leon Isadore Rosenfeld Chair and Professor of Physiology, McGill University, Canada Cardiac arrhythmias and the electrocardiogram
Freites, Juan Alfredo Department of Physics and Astronomy, University of California, Irvine, USA Molecular dynamics
Glendinning, Paul Professor, Department of Mathematics, University of Manchester Institute of Science and Technology, UK Hénon map Invariant manifolds and sets Routes to chaos
Frieden, Roy Optical Sciences Center, University of Arizona in Tucson, USA Information theory
Goriely, Alain Professor, Department of Mathematics, University of Arizona, USA Normal forms theory
Friedrich, Joseph Professor, Lehrstuhl für Physik Weihenstephan, Technische Universität München, Germany Hole burning
Grand, Steve Director, Cyberlife Research Ltd., Shipham, UK Artificial life
Fuchikami, Nobuko Department of Physics, Tokyo Metropolitan University, Japan Dripping faucet
Gallagher, Marcus School of Information Technology & Electrical Engineering, The University of Queensland, Australia McCulloch--Pitts network Perceptron
Gratrix, Sam Maths Department, Imperial College of Science, Technology and Medicine, UK Kuramoto--Sivashinsky equation
Grava, Tamara Mathematical Physics Sector, International School for Advanced Studies, Trieste, Italy Hodograph transform N-soliton formulas Zero-dispersion limits
Grimshaw, Roger Professor, Department of Mathematical Sciences, Loughborough University, UK Group velocity Korteweg--de Vries equation Water waves
Haken, Hermann (Adviser) Professor Emeritus, Fakultät für Physik, University of Stuttgart, Germany Gestalt phenomena Synergetics
Halburd, Rodney G. Lecturer, Department of Mathematical Sciences, Loughborough University, UK Einstein equations
Hallinan, Jennifer Institute for Molecular Bioscience, The University of Queensland, Australia Game of life Game theory
Hamilton, Mark Professor, Department of Mechanical Engineering, University of Texas at Austin, USA Nonlinear acoustics
Hamm, Peter Professor, Physikalisch-Chemisches Institut, Universität Zürich, Switzerland Franck--Condon factor Hydrogen bond Pump-probe measurements
Hasselblatt, Boris Professor, Department of Mathematics, Tufts University, USA Anosov and Axiom-A systems Measures Phase space
Henry, Bruce Department of Applied Mathematics, University of New South Wales, Australia Equipartition of energy Hénon--Heiles system
Henry, Bryan Department of Chemistry and Biochemistry, University of Guelph, Canada Local modes in molecules
Hensler, Gerhard Professor, Institut für Astronomie, Universitäts-Sternwarte Wien, Austria Galaxies
Herrmann, Hans Institute for Computational Physics, University of Stuttgart, Germany Dune formation
Hertz, John Professor, Nordic Institute for Theoretical Physics, Denmark Attractor neural networks
Hietarinta, Jarmo Professor, Department of Physics, University of Turku, Finland Hirota’s method
Hill, Larry Technical Staff Member, Detonation Science & Technology, Los Alamos National Laboratory, USA Evaporation wave
Hjorth, Poul G. Associate Professor, Department of Mathematics, Technical University of Denmark, Denmark Kolmogorov--Arnol’d--Moser theorem
Hawkins, Jane Professor, Department of Mathematics, University of North Carolina at Chapel Hill, USA Ergodic theory
Holden, Arun Professor of Computational Biology, School of Biomedical Sciences, University of Leeds, UK Excitability Hodgkin--Huxley equations Integrate and fire neuron Markin--Chizmadzhev model Periodic bursting Spiral waves
Helbing, Dirk Institute for Economics and Traffic, Dresden University of Technology, Germany Traffic flow
Holstein-Rathlou, N.-H. Professor, Department of Medical Physiology, University of Copenhagen, Denmark Nephron dynamics
Hastings, Alan Professor, Department of Environmental Science and Policy, University of California, USA Epidemiology
Hommes, Cars Professor, Center for Nonlinear Dynamics in Economics and Finance, Department of Quantitative Economics, University of Amsterdam, The Netherlands Economic dynamics
Hone, Andrew Lecturer in Applied Mathematics, Institute of Mathematics & Actuarial Science, University of Kent at Canterbury, UK Extremum principles Ordinary differential equations, nonlinear Riccati equations
Hood, Alan Professor, School of Mathematics and Statistics, University of St Andrews, UK Characteristics
Houghton, Conor Department of Pure and Applied Mathematics, Trinity College Dublin, Ireland Instantons Yang--Mills theory
Howard, James E. Research Associate, Department of Physics, University of Colorado at Boulder, USA Nontwist maps Regular and chaotic dynamics in atomic physics
Ivey, Thomas A. Department of Mathematics, College of Charleston, USA Differential geometry Framed space curves
Jiménez, Salvador Professor, Departamento de Matemáticas, Universidad Alfonso X El Sabio, Madrid, Spain Charge density waves Dispersion relations
Joannopoulos, John D. Professor, Department of Physics, Massachusetts Institute of Technology, USA Photonic crystals
Joshi, Nalini Professor, School of Mathematics and Statistics, University of Sydney, Australia Solitons
Kaneko, Kunihiko Department of Pure and Applied Sciences, University of Tokyo, Japan Coupled map lattice
Kantz, Holger Professor of Theoretical Physics, Max Planck Institut für Physik komplexer Systeme, Germany Time series analysis
Kennedy, Michael Peter Professor of Microelectronic Engineering, University College, Cork, Ireland Chua’s circuit
Kevrekidis, I.G. Professor, Department of Chemical Engineering, Princeton University, USA Wave of translation
Kevrekidis, Panayotis G. Assistant Professor, Department of Mathematics and Statistics, University of Massachusetts, Amherst, USA Binding energy Collisions Wave of translation
Khanin, Konstantin Professor, Department of Mathematics, Heriot-Watt University, UK Denjoy theory
Khovanov, Igor A. Department of Physics, Saratov State University, Russia Quasiperiodicity
Khovanova, Natalya A. Department of Physics, Saratov State University, Russia Quasiperiodicity
Johansson, Magnus Department of Physics and Measurement Technology, Linköping University, Sweden Discrete nonlinear Schrödinger equations
King, Aaron Assistant Professor, Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, USA Phase plane
Johnson, Steven G. Assistant Professor, Department of Mathematics, Massachusetts Institute of Technology, USA Photonic crystals
Kirby, Michael J. Professor, Department of Mathematics, Colorado State University, USA Nonlinear signal processing
Kirk, Edilbert Meteorologisches Institut, Universität Hamburg, Germany General circulation models of the atmosphere
Kivshar, Yuri (Adviser) Nonlinear Physics Center, Australian National University, Australia Optical fiber communications
Kiyono, Ken Research Fellow of the Japan Society for the Promotion of Science, Educational Physiology Laboratory, University of Tokyo, Japan Dripping faucet
Knott, Ron Department of Mathematics, University of Surrey, UK Fibonacci series
Kocarev, Liupco Associate Research Scientist, Institute for Nonlinear Science, University of California, San Diego, USA Damped-driven anharmonic oscillator
Konopelchenko, Boris G. Professor, Dipartimento di Fisica, University of Lecce, Italy Multidimensional solitons
Konotop, Vladimir V. Centro de Física Teórica e Computacional, Complexo Interdisciplinar da Universidade de Lisboa, Portugal Wave propagation in disordered media
Kosevich, Arnold B. Verkin Institute for Low Temperature Physics and Engineering, National Academy of Sciences of Ukraine, Kharkov, Ukraine Breathers Dislocations in crystals Effective mass Landau--Lifshitz equation Superfluidity Superlattices
Kovalev, Alexander S. Institute for Low Temperature Physics and Engineering, National Academy of Sciences of Ukraine, Ukraine Continuum approximations Topological defects
Kramer, Peter R. Assistant Professor, Department of Mathematical Sciences, Rensselaer Polytechnic Institute, USA Brownian motion Fokker--Planck equation
Krinsky, Valentin Professor, Institut Non-Linéaire de Nice, France Cardiac muscle models
Kuramoto, Yoshiki (Adviser) Department of Physics, Kyoto University, Japan Phase dynamics
Kurin, V. Institute for Physics of Microstructures, Russian Academy of Science, Russia Cherenkov radiation
Kuvshinov, Viatcheslav I. Professor, Institute of Physics, Belarus Academy of Sciences, Belarus Black holes Cosmological models Fractals General relativity
Kuzmin, Andrei Professor, Institute of Physics, Belarus Academy of Sciences, Belarus Fractals
Kuznetsov, Vadim Advanced Research Fellow, Department of Applied Mathematics, University of Leeds, UK Rotating rigid bodies
LaBute, Montiago X. Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, USA Protein structure
Lakshmanan, Muthusamy Professor, Department of Physics, Bharathidasan University, Tiruchirapalli, India Equations, nonlinear Nonlinear electronics Spin systems
Landa, Polina S. Professor, Department of Physics, Moscow State University, Russia Feedback Pendulum Quasilinear analysis Relaxation oscillators
Landsberg, Peter Professor, Faculty of Mathematical Studies, University of Southampton, UK Detailed balance
Lansner, Anders Department of Numerical Analysis and Computer Science (NADA), Royal Institute of Technology (KTH), Sweden Cell assemblies Neural network models
Lee, John Professor, Department of Mechanical Engineering, McGill University, Canada Flame front
Lega, Joceline Associate Professor, Department of Mathematics, University of Arizona, USA Equilibrium Fredholm theorem
Lepeshkin, Nick The Institute of Optics, University of Rochester, USA Frequency doubling
Levi, Decio Professor, Dipartimento di Ingegneria Electronica, Università degli Studi Roma Tre, Italy Delay-differential equations
Lichtenberg, Allan J. Professor, Department of Electrical Engineering and Computer Science, University of California at Berkeley, USA Arnol’d diffusion Averaging methods Electron beam microwave devices Fermi acceleration and Fermi map Fermi--Pasta--Ulam oscillator chain Particle accelerators Phase-space diffusion and correlations
Liley, David School of Biophysical Sciences and Electrical Engineering, Swinburne University of Technology, Australia Electroencephalogram at mesoscopic scales
Lonngren, Karl E. Professor, Department of Electrical and Computer Engineering, University of Iowa, USA Plasma soliton experiments
Losert, Wolfgang Assistant Professor, Department of Physics, IPST and IREAP, University of Maryland, USA Granular materials Pattern formation
Lotrič, Maja-Bračič Faculty of Electrical Engineering, University of Ljubljana, Slovenia Wavelets
Luchinsky, Dmitry G. Department of Physics, Lancaster University, UK Nonlinearity, definition of
Lücke, Manfred Institut für Theoretische Physik, Universität des Saarlandes, Saarbrücken, Germany Thermo-diffusion effects
Lunkeit, Frank Meteorologisches Institut, Universität Hamburg, Germany General circulation models of the atmosphere
Ma, Wen-Xiu Department of Mathematics, University of South Florida, USA Integrability
Macaskill, Charles Associate Professor, School of Mathematics and Statistics, University of Sydney, Australia Jupiter’s Great Red Spot
MacClune, Karen Lewis Hydrologist, SS Papadopulos & Associates, Boulder, Colorado, USA Glacial flow
Maggio, Gian Mario ST Microelectronics and Center for Wireless Communications (CWC), University of California at San Diego, USA Damped-driven anharmonic oscillator
Maini, Philip K. Professor, Centre for Mathematical Biology, Mathematical Institute, University of Oxford, UK Morphogenesis, biological
Mainzer, Klaus Professor, Director of the Institute of Interdisciplinary Informatics, Department of Philosophy of Science, University of Augsburg, Germany Artificial intelligence Cellular nonlinear networks Dynamical systems
Malomed, Boris A. Professor, Department of Interdisciplinary Studies, Faculty of Engineering, Tel Aviv University, Israel Complex Ginzburg--Landau equation Constants of motion and conservation laws Multisoliton perturbation theory Nonlinear Schrödinger equations Power balance
Manevitch, Leonid Professor, Institute of Chemical Physics, Russia Heat conduction Mechanics of solids Peierls barrier
Manneville, Paul Laboratoire d’Hydrodynamique (LadHyX), École Polytechnique, Palaiseau, France Spatiotemporal chaos
Marklof, Jens School of Mathematics, University of Bristol, UK Cat map
Marsden, Jerrold E. Professor of Control and Dynamical Systems, California Institute of Technology, Pasadena, USA Berry’s phase
Martínez, Pedro Jesús Department of Theory and Simulation of Complex Systems, Instituto de Ciencia de Materiales de Aragón, Spain Frenkel--Kontorova model
Masmoudi, Nader Associate Professor, Department of Mathematics, Courant Institute of Mathematical Sciences, New York University, USA Boundary layers Rayleigh--Taylor instability
Mason, Lionel Mathematical Institute, Oxford University, UK Twistor theory
Mayer, Andreas Institute for Theoretical Physics, University of Regensburg, Germany Surface waves
McLaughlin, Richard Associate Professor, Department of Mathematics, University of North Carolina, Chapel Hill, USA Plume dynamics
McMahon, Ben Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, USA Protein dynamics Protein structure
Meiss, James Professor, Department of Applied Mathematics, University of Colorado at Boulder, USA Hamiltonian systems Standard map Symplectic maps
Minkevich, Albert Professor of Theoretical Physics, Belorussian State University, Minsk, Belarus Cosmological models General relativity
Miura, Robert Professor, Department of Mathematical Sciences, New Jersey Institute of Technology, USA Nonlinear toys
Moloney, Jerome V. Professor, Department of Mathematics, University of Arizona, USA Nonlinear optics
Moore, Richard O. Assistant Professor, Department of Mathematical Sciences, New Jersey Institute of Technology, USA Harmonic generation
Mørk, Jesper Professor, Optoelectronics, Research Center COM, Technical University of Denmark, Denmark Semiconductor laser
McKenna, Joe Professor, Department of Mathematics, University of Connecticut, USA Tacoma Narrows Bridge collapse
Mornev, Oleg Senior Researcher, Institute of Theoretical and Experimental Biophysics, Russia Geometrical optics, nonlinear Gradient system Zeldovich--Frank-Kamenetsky equation
McLaughlin, Kenneth Associate Professor, Department of Mathematics, University of North Carolina at Chapel Hill, USA Random matrix theory III: Combinatorics
Mosekilde, E. Professor, Department of Physics, Technical University of Denmark, Denmark Nephron dynamics
Mueller, Stefan C. Department of Biophysics, Otto-von-Guericke-Universität Magdeburg, Germany Scroll waves
Olsder, Geert Jan Faculty of Technical Mathematics and Informatics, Delft University of Technology, The Netherlands Idempotent analysis
Mullin, Tom Professor of Physics and Director of Manchester Centre for Nonlinear Dynamics, University of Manchester, UK Bifurcations Catastrophe theory Taylor--Couette flow
Olver, Peter J. Professor, School of Mathematics, University of Minnesota, USA Lie algebras and Lie groups
Mygind, Jesper Professor, Department of Physics, Technical University of Denmark, Denmark Josephson junctions Superconducting quantum interference device
Nakamura, Yoshiharu Associate Professor, Institute of Space and Astronautical Science, Kanagawa, Japan Plasma soliton experiments
Natiello, Mario Centre for Mathematical Sciences, Lund University, Sweden Lasers Winding numbers
Newell, Alan Professor, Department of Mathematics, University of Arizona, USA Inverse scattering method or transform
Newton, Paul K. Professor, Department of Aerospace and Mechanical Engineering, University of Southern California, USA Berry’s phase Chaos vs. turbulence
Neyts, Kristiaan Professor, Department of Electronics and Information Systems, Ghent University, Belgium Liquid crystals
Ostrovsky, Lev (Adviser) Professor, Zel Technologies/University of Colorado, Boulder, Colorado, USA, and Institute of Applied Physics, Nizhny Novgorod, Russia Hurricanes and tornadoes Modulated waves Nonlinear acoustics Shock waves
Ottova-Leitmannova, Angelica Department of Physiology, Michigan State University, USA Bilayer lipid membranes
Palmer, John Professor, Department of Mathematics, University of Arizona, USA Monodromy preserving deformations
Pascual, Pedro J. Associate Professor, Departamento de Ingeniería Informática, Universidad Autónoma de Madrid, Spain Charge density waves
Pedersen, Niels Falsig Professor, Department of Power Engineering, Technical University of Denmark, Denmark Long Josephson junctions Superconductivity
Nicolis, G. Professor, Faculté des Sciences, Université Libre de Bruxelles, Belgium Brusselator Chemical kinetics Nonequilibrium statistical mechanics Recurrence
Pelinovsky, Dmitry Associate Professor, Department of Mathematics, McMaster University, Canada Coupled systems of partial differential equations Energy analysis Generalized functions Linearization Manley--Rowe relations Numerical methods N-wave interactions Spectral analysis
Nuñez, Paul Professor, Brain Physics Group, Department of Biomedical Engineering, Tulane University, USA Electroencephalogram at large scales
Pelletier, Jon D. Assistant Professor, Department of Geosciences, University of Arizona, USA Geomorphology and tectonics
Pelloni, Beatrice Mathematics Department, University of Reading, UK Boundary value problems Burgers equation
Reucroft, Stephen Professor of Physics, Northeastern University, Boston, USA Higgs boson
Petty, Michael Professor, Centre for Molecular and Nanoscale Electronics, University of Durham, UK Langmuir--Blodgett films
Ricca, Renzo L. Professor, Dipartimento di Matematica e Applicazioni, Università di Milano-Bicocca, Milan, Italy Knot theory Structural complexity
Peyrard, Michel Professor of Physics, Laboratoire de Physique, École Normale Supérieure de Lyon, France Biomolecular solitons
Pikovsky, Arkady Department of Physics, Universität Potsdam, Germany Synchronization Van der Pol equation
Pitchford, Jon Lecturer, Department of Biology, University of York, UK Random walks
Pojman, John A. Professor, Department of Chemistry and Biochemistry, The University of Southern Mississippi, USA Polymerization
Pumir, A. Directeur de Recherche, Institut Non-Linéaire de Nice, France Cardiac muscle models
Pushkin, Dmitri O. Department of Theoretical and Applied Mechanics, University of Illinois, Urbana--Champaign, USA Cluster coagulation
Robinson, James C. Mathematics Institute, University of Warwick, UK Attractors Dimensions Function spaces Functional analysis
Robnik, Marko Professor, Center for Applied Mathematics and Theoretical Physics, University of Maribor, Slovenia Adiabatic invariants Determinism
Rogers, Colin Professor, Australian Research Council Centre of Excellence for Mathematics and Statistics of Complex Systems, School of Mathematics, University of New South Wales, Australia Bäcklund transformations
Romanenko, Elena Senior Scientific Researcher, Institute of Mathematics, National Academy of Science of Ukraine, Ukraine Turbulence, ideal
Rosenblum, Michael Department of Physics, University of Potsdam, Germany Synchronization Van der Pol equation
Rabinovich, Mikhail Research Physicist, Institute for Nonlinear Science, University of California at San Diego, USA and Institute of Applied Physics, Russian Academy of Sciences Chaotic dynamics
Rouvas-Nicolis, C. Climatologie Dynamique, Institut Royal Météorologique de Belgique, Belgium Recurrence
Rañada, Antonio F. Facultad de Física, Universidad Complutense, Madrid, Spain Ball lightning
Ruijsenaars, Simon Center for Mathematics and Computer Science, The Netherlands Derrick--Hobart theorem Particles and antiparticles
Recami, Erasmo Professor of Physics, Faculty of Engineering, Bergamo State University, Bergamo, Italy Tachyons and superluminal motion
Rulkov, Nikolai Institute for Nonlinear Science, University of California at San Diego, USA Chaotic dynamics
Sabatier, Pierre Professor, Physique Mathématique, Université Montpellier II, France Inverse problems
Sakaguchi, Hidetsugu Department of Applied Science for Electronics and Materials, Kyushu University, Japan Coupled oscillators
Salerno, Mario Professor, Dipartimento di Fisica “E.R. Caianiello”, Università degli Studi, Salerno, Italy Bethe ansatz Salerno equation
Sandstede, Bjorn Associate Professor, Department of Mathematics, Ohio State University, USA Evans function
Satnoianu, Razvan Centre for Mathematics, School of Engineering and Mathematical Sciences, City University, UK Diffusion Reaction-diffusion systems
Sauer, Tim Professor, Department of Mathematics, George Mason University, USA Embedding methods
Savin, Alexander Professor, Moscow Institute of Physics and Technology, Russia Peierls barrier
Schöll, Eckehard Professor, Institut für Theoretische Physik, Technische Universität Berlin, Germany Avalanche breakdown Diodes Drude model Semiconductor oscillators
Schuster, Peter Institut für Theoretische Chemie und Molekulare Strukturbiologie, Austria Biological evolution Catalytic hypercycle Fitness landscape
Scott, Alwyn (Editor) Emeritus Professor of Mathematics, University of Arizona, USA Candle Discrete self-trapping system Distributed oscillators Emergence Euler--Lagrange equations Hierarchies of nonlinear systems Laboratory models of nonlinear waves Lifetime Matter, nonlinear theories of Multiplex neuron Nerve impulses Neuristor Quantum nonlinearity Rotating-wave approximation Solitons, a brief history State diagrams Symmetry groups Tachyons and superluminal phenomena Threshold phenomena Wave packets, linear and nonlinear
Schaerf, Timothy School of Mathematics and Statistics, University of Sydney, Australia Jupiter’s Great Red Spot
Segev, Mordechai Professor, Technion-Israel Institute of Technology, Haifa, Israel Incoherent solitons
Schattschneider, Doris Professor, Department of Mathematics, Moravian College, Pennsylvania, USA Tessellation
Shalfeev, Vladimir Head of Department of Oscillation Theory, Nizhni Novgorod State University, Russia Parametric amplification
Schirmer, Jochen Professor, Institute for Physical Chemistry, Heidelberg, Germany Hartree approximation
Sharkovsky, Alexander N. Institute of Mathematics, National Academy of Sciences of Ukraine, Ukraine One-dimensional maps Turbulence, ideal
Schmelcher, Peter Institute for Physical Chemistry, University of Heidelberg, Germany Hartree approximation
Sharman, Robert National Center for Atmospheric Research, Boulder, Colorado, USA Clear air turbulence
Shinbrot, Troy Associate Professor, Department of Chemical and Biochemical Engineering, Rutgers University, USA Controlling chaos
Sosnovtseva, O. Lecturer, Department of Physics, Technical University of Denmark, Denmark Nephron dynamics
Shohet, J. Leon Professor, Department of Electrical and Computer Engineering, University of Wisconsin-Madison, USA Nonlinear plasma waves
Spatschek, Karl Professor, Institut für Theoretische Physik 1, Heinrich-Heine-Universität Düsseldorf, Germany Center manifold reduction Dispersion management
Siwak, Pawel Department of Electrical Engineering, Poznan University of Technology, Poland Integrable cellular automata
Skufca, Joe D. Department of Mathematics, US Naval Academy, USA Markov partitions
Skufca, Joseph Center for Computational Science and Mathematical Modelling, University of Maryland, USA Markov partitions
Smil, Vaclav Professor, Department of Environment, University of Manitoba, Canada Global warming
Sobell, Henry M. Independent scholar, New York, USA DNA premelting
Solari, Hernán Gustavo Departamento Física, University of Buenos Aires, Argentina Lasers Winding numbers
Soljačić, Marin Principal Research Scientist, Research Laboratory of Electronics, Massachusetts Institute of Technology, USA Photonic crystals
Sørensen, Mads Peter Associate Professor, Department of Mathematics, Technical University of Denmark, Denmark Collective coordinates Multiple scale analysis Perturbation theory
Sornette, Didier Professor, Laboratoire de Physique de la Matière Condensée, Université de Nice-Sophia Antipolis, France Sandpile model
Stadler, Michael A. Professor, Institut für Psychologie und Kognitionsforschung, Bremen, Germany Gestalt phenomena
Stauffer, Dietrich Institute for Theoretical Physics, University of Cologne, Germany Percolation theory
Stefanovska, Aneta Head, Nonlinear Dynamics and Synergetics Group, Faculty of Electrical Engineering, University of Ljubljana, Slovenia Flip-flop circuit Inhibition Nonlinearity, definition of Quasiperiodicity Wavelets
Storb, Ulrich Institut für Experimentelle Physik, Otto-von-Guericke-Universität, Magdeburg, Germany Scroll waves
Strelcyn, Jean-Marie Professeur, Département de Mathématiques, Université de Rouen, Mont Saint Aignan Cedex, France Poincaré theorems
Suris, Yuri B. Department of Mathematics, Technische Universität Berlin, Germany Integrable lattices
Sutcliffe, Paul Professor of Mathematical Physics, Institute of Mathematics & Actuarial Science, University of Kent at Canterbury, UK Skyrmions
Sverdlov, Masha TEC High School, Newton, Massachusetts, USA Hurricanes and tornadoes
Swain, John David Professor, Department of Physics, Northeastern University, Boston, USA Doppler shift Quantum field theory Tensors
Tabor, Michael Professor, Department of Mathematics, University of Arizona, USA Growth patterns
Tajiri, Masayoshi Emeritus Professor, Department of Mathematical Sciences, Osaka Prefecture University, Japan Solitons, types of Wave stability and instability
Tass, Peter Professor, Institut für Medizin, Forschungszentrum Jülich, Germany Stochastic analysis of neural systems
Taylor, Richard Associate Professor, Materials Science Institute, University of Oregon, USA Lévy flights
Temam, Roger Laboratoire d’Analyse Numérique, Université de Paris Sud, France Inertial manifolds
Thompson, Michael Emeritus Professor (UCL) and Honorary Fellow, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, UK Duffing equation Stability
Tien, H. Ti (deceased) Formerly Professor, Membrane Biophysics Laboratory, Michigan State University, USA Bilayer lipid membranes
Tobias, Douglas J. Associate Professor, Department of Chemistry, University of California at Irvine, USA Molecular dynamics
Toda, Morikazu Emeritus Professor, Tokyo University of Education, Japan Nonlinear toys
Trueba, José L.
Departamento de Matemáticas y Física Aplicadas y Ciencias de la Naturaleza, Universidad Rey Juan Carlos, Móstoles, Spain
Ball lightning

Tsimring, Lev S.
Research Physicist, Institute for Nonlinear Science, University of California, San Diego, USA
Avalanches

Tsinober, Arkady
Professor, Iby and Aladar Fleischman Faculty of Engineering, Tel Aviv University, Israel
Helicity

Tsironis, Giorgos P.
Department of Physics, University of Crete, Greece
Bjerrum defects
Excitons
Ising model
Local modes in molecular crystals

Tsygvintsev, Alexei
Maître de Conférences, Unité de Mathématiques Pures et Appliquées, École Normale Supérieure de Lyon, France
Poincaré theorems

Tuszynski, Jack
Department of Physics, University of Alberta, Canada
Critical phenomena
Domain walls
Ferromagnetism and ferroelectricity
Fröhlich theory
Hysteresis
Order parameters
Renormalization groups
Scheibe aggregates

Ustinov, Alexey V.
Physikalisches Institut III, University of Erlangen-Nürnberg, Germany
Josephson junction arrays

van der Heijden, Gert
Centre for Nonlinear Dynamics, University College London, UK
Butterfly effect
Hopf bifurcation

Vázquez, Luis
Professor, Facultad de Informática, Universidad Complutense de Madrid, Spain. Senior Researcher and Cofounder of the Centro de Astrobiología, Instituto Nacional de Técnica Aeroespacial, Madrid, Spain
Charge density waves
Dispersion relations
FitzHugh--Nagumo equation
Virial theorem
Wave propagation in disordered media

Verboncoeur, John P.
Associate Professor, Nuclear Engineering Department, University of California, Berkeley, USA
Electron beam microwave devices

Veselov, Alexander
Professor, Department of Mathematical Sciences, Loughborough University, UK
Huygens principle

Vo, Ba-Ngu
Electrical and Electronic Engineering Department, The University of Melbourne, Victoria, Australia
Monte Carlo methods

Voiculescu, Dan-Virgil
Professor, Department of Mathematics, University of California at Berkeley, USA
Free probability theory

Voorhees, Burton H.
Professor, Department of Mathematics, Athabasca University, Canada
Cellular automata

Wadati, M.
Professor, Department of Physics, University of Tokyo, Japan
Quantum inverse scattering method

Walter, Gilbert G.
Professor Emeritus, Department of Mathematical Sciences, University of Wisconsin-Milwaukee, USA
Compartmental models
Waymire, Edward C.
Professor, Department of Mathematics, Oregon State University, USA
Multiplicative processes

West, Bruce J.
Chief Scientist, Mathematics, US Army Research Office, North Carolina, USA
Branching laws
Fluctuation-dissipation theorem
Kicked rotor

Wilhelmsson, Hans
Professor Emeritus of Physics, Chalmers University of Technology, Sweden
Alfvén waves

Wilson, Hugh R.
Centre for Vision Research, York University, Canada
Neurons
Stereoscopic vision and binocular rivalry

Winfree, A.T. (Adviser) (deceased)
Formerly, Department of Ecology and Evolutionary Biology, University of Arizona, USA
Dimensional analysis

Wojtkowski, Maciej P.
Professor, Department of Mathematics, University of Arizona, USA
Lyapunov exponents

Yakushevich, Ludmilla (Adviser)
Researcher, Institute of Cell Biophysics, Russian Academy of Sciences, Russia
DNA solitons

Young, Lai-Sang (Adviser)
Professor, Courant Institute of Mathematical Sciences, New York University, USA
Anosov and Axiom-A systems
Horseshoes and hyperbolicity in dynamical systems
Sinai--Ruelle--Bowen measures

Yiguang, Ju
Assistant Professor, Department of Mechanical and Aerospace Engineering, Princeton University, USA
Flame front

Yukalov, V.I.
Professor, Bogolubov Laboratory of Theoretical Physics, Joint Institute for Nuclear Research, Russia
Bose--Einstein condensation
Coherence phenomena

Zabusky, Norman J.
Professor, Department of Mechanical and Aerospace Engineering, Rutgers University, USA
Visiometrics
Vortex dynamics of fluids

Zbilut, Joseph P.
Professor, Department of Molecular Biophysics and Physiology, Rush University, USA
Algorithmic complexity

Zhou, Xin
Professor, Department of Mathematics, Duke University, USA
Random matrix theory IV: Analytic methods
Riemann--Hilbert problem

Zolotaryuk, Alexander V.
Bogolyubov Institute for Theoretical Physics, Ukraine
Polarons
Ratchets

Zorzano, María-Paz
Young Researcher, Centro de Astrobiología, Instituto Nacional de Técnica Aeroespacial, Madrid, Spain
FitzHugh--Nagumo equation
Virial theorem
List of Entries

Ablowitz--Kaup--Newell--Segur system
Adiabatic invariants
Alfvén waves
Algorithmic complexity
Anderson localization
Anosov and Axiom-A systems
Arnol’d diffusion
Artificial intelligence
Artificial life
Atmospheric and ocean sciences
Attractor neural network
Attractors
Aubry--Mather theory
Avalanche breakdown
Avalanches
Averaging methods
Bäcklund transformations
Ball lightning
Belousov--Zhabotinsky reaction
Bernoulli’s equation
Berry’s phase
Bethe ansatz
Bifurcations
Bilayer lipid membranes
Billiards
Binding energy
Biological evolution
Biomolecular solitons
Bjerrum defects
Black holes
Born--Infeld equations
Bose--Einstein condensation
Boundary layers
Boundary value problems
Branching laws
Breathers
Brownian motion
Brusselator
Burgers equation
Butterfly effect
Candle
Cardiac arrhythmias and the electrocardiogram
Cardiac muscle models
Cat map
Catalytic hypercycle
Catastrophe theory
Causality
Celestial mechanics
Cell assemblies
Cellular automata
Cellular nonlinear networks
Center manifold reduction
Chaos vs. turbulence
Chaotic advection
Chaotic dynamics
Characteristics
Charge density waves
Chemical kinetics
Cherenkov radiation
Chua’s circuit
Clear air turbulence
Cluster coagulation
Coherence phenomena
Collective coordinates
Collisions
Color centers
Commensurate-incommensurate transition
Compartmental models
Complex Ginzburg--Landau equation
Conley index
Constants of motion and conservation laws
Continuum approximations
Contour dynamics
Controlling chaos
Cosmological models
Coupled map lattice
Coupled oscillators
Coupled systems of partial differential equations
Critical phenomena
Damped-driven anharmonic oscillator
Darboux transformation
Davydov soliton
Delay-differential equations
Denjoy theory
Derrick--Hobart theorem
Detailed balance
Determinism
Deterministic walks in random environments
Development of singularities
Differential geometry
Diffusion
Dimensional analysis
Dimensions
Diodes
Discrete breathers
Discrete nonlinear Schrödinger equations
Discrete self-trapping system
Dislocations in crystals
Dispersion management
Dispersion relations
Distributed oscillators
DNA premelting
DNA solitons
Domain walls
Doppler shift
Dressing method
Dripping faucet
Drude model
Duffing equation
Dune formation
Dynamical systems
Dynamos, homogeneous
Economic system dynamics
Effective mass
Einstein equations
Electroencephalogram at large scales
Electroencephalogram at mesoscopic scales
Electron beam microwave devices
Elliptic functions
Embedding methods
Emergence
Energy analysis
Entropy
Ephaptic coupling
Epidemiology
Equations, nonlinear
Equilibrium
Equipartition of energy
Ergodic theory
Euler--Lagrange equations
Evans function
Evaporation wave
Excitability
Excitons
Explosions
Extremum principles
Fairy rings of mushrooms
Feedback
Fermi acceleration and Fermi map
Fermi--Pasta--Ulam oscillator chain
Ferromagnetism and ferroelectricity
Fibonacci series
Filamentation
Fitness landscape
FitzHugh--Nagumo equation
Flame front
Flip-flop circuit
Fluctuation-dissipation theorem
Fluid dynamics
Fokker--Planck equation
Forecasting
Forest fires
Fractals
Framed space curves
Franck--Condon factor
Fredholm theorem
Free energy
Free probability theory
Frenkel--Kontorova model
Frequency doubling
Fröhlich theory
Frustration
Function spaces
Functional analysis
Galaxies
Game of life
Game theory
Gel’fand--Levitan theory
General circulation models of the atmosphere
General relativity
Generalized functions
Geometrical optics, nonlinear
Geomorphology and tectonics
Gestalt phenomena
Glacial flow
Global warming
Gradient system
Granular materials
Gravitational waves
Group velocity
Growth patterns
Hamiltonian systems
Harmonic generation
Hartree approximation
Heat conduction
Hele-Shaw cell
Helicity
Hénon map
Hénon--Heiles system
Hierarchies of nonlinear systems
Higgs boson
Hirota’s method
Hodgkin--Huxley equations
Hodograph transform
Hole burning
Holons
Hopf bifurcation
Horseshoes and hyperbolicity in dynamical systems
Hurricanes and tornadoes
Huygens principle
Hydrogen bond
Hydrothermal waves
Hysteresis
Idempotent analysis
Incoherent solitons
Inertial manifolds
Information theory
Inhibition
Instantons
Integrability
Integrable cellular automata
Integrable lattices
Integral transforms
Integrate and fire neuron
Intermittency
Invariant manifolds and sets
Inverse problems
Inverse scattering method or transform
Ising model
Josephson junction arrays
Josephson junctions
Jump phenomena
Jupiter’s Great Red Spot
Kadomtsev--Petviashvili equation
Kelvin--Helmholtz instability
Kerr effect
Kicked rotor
Kirchhoff’s laws
Knot theory
Kolmogorov cascade
Kolmogorov--Arnol’d--Moser theorem
Korteweg--de Vries equation
Kuramoto--Sivashinsky equation
Laboratory models of nonlinear waves
Landau--Lifshitz equation
Langmuir--Blodgett films
Lasers
Lattice gas methods
Lévy flights
Lie algebras and Lie groups
Lifetime
Linearization
Liquid crystals
Local modes in molecular crystals
Local modes in molecules
Long Josephson junctions
Lorentz gas
Lorenz equations
Lyapunov exponents
Magnetohydrodynamics
Manley--Rowe relations
Maps
Maps in the complex plane
Markin--Chizmadzhev model
Markov partitions
Martingales
Matter, nonlinear theory of
Maxwell--Bloch equations
McCulloch--Pitts network
Measures
Mechanics of solids
Mel’nikov method
Mixing
Modulated waves
Molecular dynamics
Monodromy preserving deformations
Monte Carlo methods
Morphogenesis, biological
Multidimensional solitons
Multifractal analysis
Multiple scale analysis
Multiplex neuron
Multiplicative processes
Multisoliton perturbation theory
Myelinated nerves
Navier--Stokes equation
N-body problem
Nephron dynamics
Nerve impulses
Neural network models
Neuristor
Neurons
Newton’s laws of motion
Nonequilibrium statistical mechanics
Nonlinear acoustics
Nonlinear electronics
Nonlinear optics
Nonlinear plasma waves
Nonlinear Schrödinger equations
Nonlinear signal processing
Nonlinear toys
Nonlinearity, definition of
Nontwist maps
Normal forms theory
N-soliton formulas
Numerical methods
N-wave interactions
One-dimensional maps
Optical fiber communications
Order from chaos
Order parameters
Ordinary differential equations, nonlinear
Overtones
Painlevé analysis
Parametric amplification
Partial differential equations, nonlinear
Particle accelerators
Particles and antiparticles
Pattern formation
Peierls barrier
Pendulum
Perceptron
Percolation theory
Period doubling
Periodic bursting
Periodic orbit theory
Periodic spectral theory
Perturbation theory
Phase dynamics
Phase plane
Phase space
Phase-space diffusion and correlations
Phase transitions
Photonic crystals
Plasma soliton experiments
Plume dynamics
Poincaré theorems
Poisson brackets
Polaritons
Polarons
Polymerization
Population dynamics
Power balance
Protein dynamics
Protein structure
Pump-probe measurements
Quantum chaos
Quantum field theory
Quantum inverse scattering method
Quantum nonlinearity
Quantum theory
Quasilinear analysis
Quasiperiodicity
Random matrix theory I: Origins and physical applications
Random matrix theory II: Algebraic developments
Random matrix theory III: Combinatorics
Random matrix theory IV: Analytic methods
Random walks
Ratchets
Rayleigh and Raman scattering and IR absorption
Rayleigh--Taylor instability
Reaction-diffusion systems
Recurrence
Regular and chaotic dynamics in atomic physics
Relaxation oscillators
Renormalization groups
Rheology
Riccati equations
Riemann--Hilbert problem
Rössler systems
Rotating rigid bodies
Rotating-wave approximation
Routes to chaos
Salerno equation
Sandpile model
Scheibe aggregates
Scroll waves
Semiconductor laser
Semiconductor oscillators
Separation of variables
Shear flow
Shock waves
Sinai--Ruelle--Bowen measures
Sine-Gordon equation
Singularity theory
Skyrmions
Solar system
Solitons
Solitons, a brief history
Solitons, types of
Spatiotemporal chaos
Spectral analysis
Spin systems
Spiral waves
Stability
Standard map
State diagrams
Stereoscopic vision and binocular rivalry
Stochastic analysis of neural systems
Stochastic processes
String theory
Structural complexity
Superconducting quantum interference device
Superconductivity
Superfluidity
Superlattices
Surface waves
Symbolic dynamics
Symmetry groups
Symmetry: equations vs. solutions
Symplectic maps
Synchronization
Synergetics
Tachyons and superluminal motion
Tacoma Narrows Bridge collapse
Taylor--Couette flow
Tensors
Tessellation
Thermal convection
Thermo-diffusion effects
Theta functions
Threshold phenomena
Time series analysis
Toda lattice
Topological defects
Topology
Traffic flow
Turbulence
Turbulence, ideal
Turing patterns
Twistor theory
Universality
Van der Pol equation
Virial theorem
Visiometrics
Volterra series and operators
Vortex dynamics in excitable media
Vortex dynamics of fluids
Water waves
Wave of translation
Wave packets, linear and nonlinear
Wave propagation in disordered media
Wave stability and instability
Wavelets
Winding numbers
Yang--Mills theory
Zeldovich--Frank-Kamenetsky equation
Zero-dispersion limits
Thematic List of Entries

General

HISTORY OF NONLINEAR SCIENCE Bernoulli’s equation, Butterfly effect, Candle, Celestial mechanics, Davydov soliton, Determinism, Feedback, Fermi--Pasta--Ulam oscillator chain, Fibonacci series, Hodgkin--Huxley equations, Introduction, Integrability, Lorenz equations, Manley--Rowe relations, Markin--Chizmadzhev model, Martingales, Matter, nonlinear theory of, Poincaré theorems, Solar system, Solitons, a brief history, Tacoma Narrows Bridge collapse, Van der Pol equation, Zeldovich--Frank-Kamenetsky equation
COMMON EXAMPLES OF NONLINEAR PHENOMENA Avalanches, Ball lightning, Brownian motion, Butterfly effect, Candle, Clear air turbulence, Diffusion, Dripping faucet, Dune formation, Explosions, Fairy rings of mushrooms, Filamentation, Flame front, Fluid dynamics, Forest fires, Glacial flow, Global warming, Hurricanes and tornadoes, Jupiter’s Great Red Spot, Nonlinear toys, Order from chaos, Pendulum, Phase transitions, Plume dynamics, Solar system, Tacoma Narrows Bridge collapse, Traffic flow, Water waves
Methods and Models

ANALYTICAL METHODS Bäcklund transformations, Bethe ansatz, Center manifold reduction, Characteristics, Collective coordinates, Continuum approximations, Dimensional analysis, Dispersion relations, Dressing method, Elliptic functions, Energy analysis, Evans function, Fredholm theorem, Gel’fand--Levitan theory, Generalized functions, Hamiltonian systems, Hirota’s method, Hodograph transform, Idempotent analysis, Integral transforms, Inverse scattering method or transform, Kirchhoff’s laws, Multiple scale analysis, Multisoliton perturbation theory, Nonequilibrium statistical mechanics, Normal forms theory, N-soliton formulas, Painlevé analysis, Periodic spectral theory, Perturbation theory, Phase dynamics, Phase plane, Poisson brackets, Power balance, Quantum inverse scattering method, Quasilinear analysis, Riccati equations, Rotating-wave approximation, Separation of variables, Spectral analysis, Stability, State diagrams, Synergetics, Tensors, Theta functions, Time series analysis, Volterra series, Wavelets, Zero-dispersion limits
COMPUTATIONAL METHODS Averaging methods, Cellular automata, Cellular nonlinear networks, Characteristics, Compartmental models, Contour dynamics, Embedding methods, Extremum principles, Fitness landscape, Forecasting, Framed space curves, Hartree approximation, Integrability, Inverse problems, Lattice gas methods, Linearization, Maps, Martingales, Monte Carlo methods, Numerical methods, Recurrence, Theta functions, Time series analysis, Visiometrics, Volterra series and operators, Wavelets
TOPOLOGICAL METHODS Bäcklund transformations, Cat map, Conley index, Darboux transformation, Denjoy theory, Derrick--Hobart theorem, Differential geometry, Extremum principles, Functional analysis, Horseshoes and hyperbolicity in dynamical systems, Huygens principle, Inertial manifolds, Invariant manifolds and sets, Knot theory, Kolmogorov--Arnol’d--Moser theorem, Lie algebras and Lie groups, Maps, Measures, Monodromy preserving deformations, Multifractal analysis, Nontwist maps, One-dimensional maps, Periodic orbit theory, Phase plane, Phase space, Renormalization groups, Riemann--Hilbert problem, Singularity theory, Symbolic dynamics, Symmetry groups, Topology, Virial theorem, Winding numbers
CHAOS, NOISE AND TURBULENCE Attractors, Aubry--Mather theory, Butterfly effect, Chaos vs. turbulence, Chaotic advection, Chaotic dynamics, Clear air turbulence, Dimensions, Entropy, Ergodic theory, Fluctuation-dissipation theorem, Fokker--Planck equation, Free probability theory, Frustration, Hele-Shaw cell, Horseshoes and hyperbolicity in dynamical systems, Lévy flights, Lyapunov exponents, Martingales, Mel’nikov method, Order from chaos, Percolation theory, Phase space, Quantum chaos, Random matrix theory, Random walks, Routes to chaos, Spatiotemporal chaos, Stochastic processes, Turbulence, Turbulence, ideal
COHERENT STRUCTURES Biomolecular solitons, Black holes, Breathers, Cell assemblies, Davydov soliton, Discrete breathers, Dislocations in crystals, DNA solitons, Domain walls, Dune formation, Emergence, Fairy rings of mushrooms, Flame front, Higgs boson, Holons, Hurricanes and tornadoes, Instantons, Jupiter’s Great Red Spot, Local modes in molecular crystals, Local modes in molecules, Multidimensional solitons, Nerve impulses, Polaritons, Polarons, Shock waves, Skyrmions, Solitons, types of, Spiral waves, Tachyons and superluminal motion, Turbulence, Turing patterns, Wave of translation
DYNAMICAL SYSTEMS Anosov and Axiom-A systems, Arnol’d diffusion, Attractors, Aubry--Mather theory, Bifurcations, Billiards, Butterfly effect, Cat map, Catastrophe theory, Center manifold reduction, Chaotic dynamics, Coupled map lattice, Deterministic walks in random environments, Development of singularities, Dynamical systems, Equilibrium, Ergodic theory, Fitness landscape, Framed space curves, Function spaces, Gradient system, Hamiltonian systems, Hénon map, Hopf bifurcation, Horseshoes and hyperbolicity in dynamical systems, Inertial manifolds, Intermittency, Kicked rotor, Kolmogorov--Arnol’d--Moser theorem, Lyapunov exponents, Maps, Measures, Mel’nikov method, One-dimensional maps, Pattern formation, Periodic orbit theory, Phase plane, Phase space, Phase-space diffusion and correlations, Poincaré theorems, Reaction-diffusion systems, Rössler systems, Rotating rigid bodies, Routes to chaos, Sinai--Ruelle--Bowen measures, Standard map, Stochastic processes, Symbolic dynamics, Synergetics, Universality, Visiometrics, Winding numbers
GENERAL PHENOMENA Adiabatic invariants, Algorithmic complexity, Anderson localization, Arnol’d diffusion, Attractors, Berry’s phase, Bifurcations, Binding energy, Boundary layers, Branching laws, Breathers, Brownian motion, Butterfly effect, Causality, Chaotic dynamics, Characteristics, Cluster coagulation, Coherence phenomena, Collisions, Critical phenomena, Detailed balance, Determinism, Diffusion, Domain walls, Doppler shift, Effective mass, Emergence, Entropy, Equilibrium, Equipartition of energy, Excitability, Explosions, Feedback,
Filamentation, Fractals, Free energy, Frequency doubling, Frustration, Gestalt phenomena, Group velocity, Harmonic generation, Helicity, Hopf bifurcation, Huygens principle, Hysteresis, Incoherent solitons, Inhibition, Integrability, Intermittency, Jump phenomena, Kolmogorov cascade, Lévy flights, Lifetime, Mixing, Modulated waves, Multiplicative processes, Nonlinearity, definition of, N-wave interactions, Order from chaos, Order parameters, Overtones, Pattern formation, Period doubling, Periodic bursting, Power balance, Quantum chaos, Quantum nonlinearity, Quasiperiodicity, Recurrence, Routes to chaos, Scroll waves, Shear flow, Solitons, Spiral waves, Structural complexity, Symmetry: equations vs. solutions, Synergetics, Tachyons and superluminal motion, Tessellation, Thermal convection, Threshold phenomena, Turbulence, Universality, Wave packets, linear and nonlinear, Wave propagation in disordered media, Wave stability and instability
MAPS Aubry--Mather theory, Bäcklund transformations, Cat map, Coupled map lattice, Darboux transformation, Denjoy theory, Embedding methods, Fermi acceleration and Fermi map, Hénon map, Maps, Maps in the complex plane, Monodromy preserving deformations, Nontwist maps, One-dimensional maps, Periodic orbit theory, Recurrence, Renormalization groups, Singularity theory, Standard map, Symplectic maps
MATHEMATICAL MODELS Ablowitz--Kaup--Newell--Segur system, Attractor neural network, Billiards, Boundary value problems, Brusselator, Burgers equation, Cat map, Cellular automata, Compartmental models, Complex Ginzburg--Landau equation, Continuum approximations, Coupled map lattice, Coupled systems of partial differential equations, Delay-differential equations, Discrete nonlinear Schrödinger equations, Discrete self-trapping system, Duffing equation, Equations, nonlinear, Euler--Lagrange equations, FitzHugh--Nagumo equation, Fokker--Planck equation, Frenkel--Kontorova model, Game of life, General circulation models of the atmosphere, Hénon--Heiles system, Integrable cellular automata, Integrable lattices, Ising model, Kadomtsev--Petviashvili equation, Knot theory, Korteweg--de Vries equation, Kuramoto--Sivashinsky equation, Landau--Lifshitz equation, Lattice gas methods, Lie algebras and Lie groups, Lorenz equations, Markov partitions, Martingales, Maxwell--Bloch equations, McCulloch--Pitts network, Navier--Stokes equation, Neural network models, Newton’s laws of motion, Nonlinear Schrödinger equations, One-dimensional maps, Ordinary differential equations, nonlinear, Partial differential equations, nonlinear, Random walks, Riccati equations, Salerno equation, Sandpile model, Sine-Gordon equation, Spin systems, Stochastic processes, Structural complexity, Symbolic dynamics, Synergetics, Toda lattice, Van der Pol equation, Zeldovich--Frank-Kamenetsky equation
STABILITY Attractors, Bifurcations, Butterfly effect, Catastrophe theory, Controlling chaos, Development of singularities, Dispersion management, Dispersion relations, Emergence, Equilibrium, Excitability, Feedback, Growth patterns, Hopf bifurcation, Lyapunov exponents, Nonequilibrium statistical mechanics, Stability
Disciplines

ASTRONOMY AND ASTROPHYSICS Alfvén waves, Black holes, Celestial mechanics, Cosmological models, Einstein equations, Galaxies, Gravitational waves, Hénon--Heiles system, Jupiter’s Great Red Spot, N-body problem, Solar system

BIOLOGY Artificial life, Bilayer lipid membranes, Biological evolution, Biomolecular solitons, Cardiac arrhythmias and the electrocardiogram, Cardiac muscle models, Catalytic hypercycle, Compartmental models, Davydov soliton, DNA premelting, DNA solitons, Epidemiology, Excitability, Fairy rings of mushrooms, Fibonacci series, Fitness landscape, Fröhlich theory, Game of life, Growth patterns, Morphogenesis, biological, Nephron dynamics, Protein dynamics, Protein structure, Scroll waves, Turing patterns
CHEMISTRY Belousov--Zhabotinsky reaction, Biomolecular solitons, Brusselator, Candle, Catalytic hypercycle, Chemical kinetics, Cluster coagulation, Flame front, Franck--Condon factor, Hydrogen bond, Langmuir--Blodgett films, Molecular dynamics, Polymerization, Protein structure, Reaction-diffusion systems, Scheibe aggregates, Turing patterns, Vortex dynamics in excitable media
CONDENSED MATTER AND SOLID-STATE PHYSICS Anderson localization, Avalanche breakdown, Bjerrum defects, Bose--Einstein condensation, Charge density waves, Cherenkov radiation, Color centers, Commensurate-incommensurate transition, Discrete breathers, Dislocations in crystals, Domain walls, Drude model, Effective mass, Excitons, Ferromagnetism and ferroelectricity, Franck--Condon factor, Frenkel--Kontorova model, Frustration, Heat conduction, Hydrogen bond, Ising model, Langmuir--Blodgett films, Liquid crystals, Local modes in molecular crystals, Mechanics of solids, Nonlinear acoustics, Peierls barrier, Percolation theory, Regular and chaotic dynamics in atomic physics, Scheibe aggregates, Semiconductor oscillators, Spin systems, Superconductivity, Superfluidity, Surface waves
EARTH SCIENCE Alfvén waves, Atmospheric and ocean sciences, Avalanches, Ball lightning, Butterfly effect, Clear air turbulence, Dune formation, Dynamos, homogeneous, Fairy rings of mushrooms, Forest fires, General circulation models of the atmosphere, Geomorphology and tectonics, Glacial flow, Global warming, Hurricanes and tornadoes, Kelvin--Helmholtz instability, Sandpile model, Water waves
ENGINEERING Artificial intelligence, Cellular automata, Cellular nonlinear networks, Chaotic advection, Chua’s circuit, Controlling chaos, Coupled oscillators, Diodes, Dispersion management, Dynamos, homogeneous, Electron beam microwave devices, Explosions, Feedback, Flip-flop circuit, Frequency doubling, Hele-Shaw cell, Hysteresis, Information theory, Josephson junction arrays, Josephson junctions, Langmuir--Blodgett films, Lasers, Long Josephson junctions, Manley--Rowe relations, Neuristor, Nonlinear electronics, Nonlinear optics, Nonlinear signal processing, Optical fiber communications, Parametric amplification, Particle accelerators, Ratchets, Relaxation oscillators, Semiconductor laser, Semiconductor oscillators, Superconducting quantum interference device, Synchronization, Tacoma Narrows Bridge collapse
FLUIDS Alfvén waves, Atmospheric and ocean sciences, Bernoulli’s equation, Chaos vs. turbulence, Chaotic advection, Clear air turbulence, Contour dynamics, Electron beam microwave devices, Evaporation wave, Fluid dynamics, Forecasting, General circulation models of the atmosphere, Glacial flow, Hele-Shaw cell, Hurricanes and tornadoes, Hydrothermal waves, Jump phenomena, Jupiter’s Great Red Spot, Kelvin--Helmholtz instability, Laboratory models of nonlinear waves, Lattice gas methods, Liquid crystals, Lorentz gas, Magnetohydrodynamics, Navier--Stokes equation, Nonlinear plasma waves, Plasma soliton experiments, Plume dynamics, Rayleigh--Taylor instability, Shear flow, Shock waves, Superfluidity, Surface waves, Taylor--Couette flow, Thermal convection, Thermo-diffusion effects, Traffic flow, Turbulence, Turbulence, ideal, Visiometrics, Vortex dynamics of fluids, Water waves
NEUROSCIENCE Artificial intelligence, Attractor neural network, Cell assemblies, Compartmental models, Electroencephalogram at large scales, Electroencephalogram at mesoscopic scales, Ephaptic coupling, Evans function, FitzHugh--Nagumo equation, Gestalt phenomena, Hodgkin--Huxley equations, Inhibition, Integrate and fire neuron, Multiplex neuron, Myelinated nerves, Nerve impulses, Neural network models, Neurons, Pattern formation, Perceptron, Stereoscopic vision and binocular rivalry, Stochastic analysis of neural systems, Synergetics
NONLINEAR OPTICS Cherenkov radiation, Color centers, Damped-driven anharmonic oscillator, Dispersion management, Distributed oscillators, Excitons, Filamentation, Geometrical optics, nonlinear, Harmonic generation, Hole burning, Kerr effect, Lasers, Liquid crystals, Maxwell--Bloch equations, Nonlinear optics, Optical fiber communications, Photonic crystals, Polaritons, Polarons, Pump-probe measurements, Rayleigh and Raman scattering and IR absorption, Semiconductor laser, Tachyons and superluminal motion
PLASMA PHYSICS Alfvén waves, Ball lightning, Charge density waves, Drude model, Dynamos, homogeneous, Electron beam microwave devices, Magnetohydrodynamics, Nonlinear plasma waves, Particle accelerators, Plasma soliton experiments
SOCIAL SCIENCE Economic system dynamics, Epidemiology, Game theory, Hierarchies of nonlinear systems, Population dynamics, Synergetics, Traffic flow
SOLID MECHANICS AND NONLINEAR VIBRATIONS Avalanche breakdown, Bilayer lipid membranes, Bjerrum defects, Charge density waves, Cluster coagulation, Color centers, Detailed balance, Dislocations in crystals, Domain walls, Frustration, Glacial flow, Granular materials, Growth patterns, Heat conduction, Hydrogen bond, Ising model, Kerr effect, Langmuir--Blodgett films, Liquid crystals, Local modes in molecular crystals, Mechanics of solids, Molecular dynamics, Nonlinear acoustics, Protein dynamics, Ratchets, Rheology, Sandpile model, Scheibe aggregates, Shock waves, Spin systems, Superlattices, Surface waves, Tessellation, Topological defects
THEORETICAL PHYSICS Berry’s phase, Black holes, Born--Infeld equations, Celestial mechanics, Cherenkov radiation, Cluster coagulation, Constants of motion and conservation laws, Cosmological models, Critical phenomena, Derrick--Hobart theorem, Detailed balance, Einstein equations, Entropy, Equipartition of energy, Fluctuation-dissipation theorem, Fokker--Planck equation, Free energy, Galaxies, General relativity, Gravitational waves, Hamiltonian systems, Higgs boson, Holons, Instantons, Matter, nonlinear theory of, N-body problem, Newton’s laws of motion, Particles and antiparticles, Quantum field theory, Quantum theory, Regular and chaotic dynamics in atomic physics, Rotating rigid bodies, Skyrmions, String theory, Tachyons and superluminal motion, Twistor theory, Virial theorem, Yang--Mills theory
A AB INITIO CALCULATIONS
The NLS equation was known to arise in many physical contexts (Benney & Newell, 1967) and in 1973 Hasegawa and Tappert showed that the NLS equation describes the long-distance dynamics of nonlinear pulses in optical fibers (Hasegawa & Tappert, 1973). Motivated by these developments and indications that other equations fit into this category, David Kaup, Alan Newell, Harvey Segur, and the present author (Ablowitz et al., 1973, 1974) studied the following modification of the Zakharov–Shabat system:
See Molecular dynamics
ABLOWITZ–KAUP–NEWELL–SEGUR SYSTEM In 1967, Gardner, Greene, Kruskal, and Miura (or GGKM) (Gardner et al., 1967) showed that the Kortegweg–de Vries (KdV) equation qt + 6qqx + qxxx = 0,
(1)
with rapidly decaying initial data on $-\infty < x < \infty$, can be linearized using direct and inverse scattering methods associated with the linear Schrödinger equation

$$v_{xx} + [k^2 + q(x,t)]\,v = 0. \tag{2}$$

The KdV equation is of practical interest, having been first derived in the study of long water waves (Korteweg & de Vries, 1895) and subsequently in several other areas of applied science. In the method proposed by Gardner et al., the solitary wave (soliton) solution to the KdV equation (1), $q = 2\kappa^2\,\mathrm{sech}^2\,\kappa(x - 4\kappa^2 t - x_0)$, and multisoliton solutions are associated with the discrete spectrum of Equation (2). The discrete eigenvalues were shown to be invariants of the KdV motion; for example, the above soliton solution is associated with the discrete eigenvalue of Equation (2) at $k = i\kappa$. At that time, it was not clear whether the method could be applied to other physically significant equations. In 1972, however, Zakharov and Shabat (1972) used an operator formalism developed by Lax (1968) to show that the nonlinear Schrödinger (NLS) equation

$$i q_t + q_{xx} + \sigma |q|^2 q = 0, \tag{3}$$

with rapidly decaying initial data on $-\infty < x < \infty$, could also be linearized by direct and inverse scattering methods.

Consider the postulated linear systems

$$v_{1x} = -i\zeta v_1 + q v_2, \qquad v_{2x} = i\zeta v_2 + r v_1, \tag{4}$$

$$v_{1t} = A v_1 + B v_2, \qquad v_{2t} = C v_1 + D v_2. \tag{5}$$

In Equations (4) and (5), $v_1$ and $v_2$ are auxiliary functions obeying the postulated linear systems; Equations (4) play the same role as Equation (2), whereas Equations (5) determine the temporal evolution of the functions $v_1$ and $v_2$. (The evolution equation associated with the auxiliary function $v$ for the KdV equation was not given above.) The method establishes that the functions $q = q(x, t)$ and $r = r(x, t)$ satisfy nonlinear equations when the (yet to be determined) functions $A$, $B$, $C$, and $D$ are properly chosen. The key to this approach is to make Equations (4) and (5) compatible, that is, to set the $x$-derivative of $v_{it}$ equal to the $t$-derivative of $v_{ix}$. In other words, we set the $x$-derivative of the right-hand side of Equations (5) equal to the $t$-derivative of the right-hand side of Equations (4). The result of this calculation yields the following equations for $A$, $B$, $C$, and $D$:

$$\begin{aligned}
A_x &= qC - rB,\\
B_x + 2i\zeta B &= q_t - 2Aq,\\
C_x - 2i\zeta C &= r_t + 2Ar,\\
D &= -A.
\end{aligned} \tag{6}$$
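The compatibility calculation that leads to Equations (6) can be written out explicitly. The following is a sketch that uses only Equations (4) and (5), assumes that $\zeta$ does not depend on $t$, and sets to zero the function of $t$ that appears when the last relation is integrated (an assumption of this sketch). Differentiating the first of Equations (4) in $t$ and the first of Equations (5) in $x$ gives

$$\begin{aligned}
(v_{1x})_t &= -i\zeta\,(A v_1 + B v_2) + q_t v_2 + q\,(C v_1 + D v_2),\\
(v_{1t})_x &= A_x v_1 + A\,(-i\zeta v_1 + q v_2) + B_x v_2 + B\,(i\zeta v_2 + r v_1).
\end{aligned}$$

Equating the coefficients of $v_1$ and of $v_2$ yields

$$A_x = qC - rB, \qquad B_x + 2i\zeta B = q_t + (D - A)\,q.$$

Repeating the calculation for $v_2$,

$$\begin{aligned}
(v_{2x})_t &= i\zeta\,(C v_1 + D v_2) + r_t v_1 + r\,(A v_1 + B v_2),\\
(v_{2t})_x &= C_x v_1 + C\,(-i\zeta v_1 + q v_2) + D_x v_2 + D\,(i\zeta v_2 + r v_1),
\end{aligned}$$

so that

$$C_x - 2i\zeta C = r_t + (A - D)\,r, \qquad D_x = rB - qC = -A_x.$$

The last relation allows the choice $D = -A$, and substituting it into the other three relations reproduces Equations (6) term by term.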
In Ablowitz et al. (1973, 1974; see also Ablowitz & Segur, 1981), methods to solve these equations are described. The simplest procedure is to look for finite power series expansions such as $A = \sum_{i=0}^{N} \zeta^i A_i$, and similarly for $B$ and $C$. For example, with $N = 2$, we find with $r = \mp q^*$ that the nonlinear Schrödinger equation (3) with $\sigma = \pm 1$ is a necessary condition. In this case there are 11 equations for the nine unknowns $\{A_i, B_i, C_i\}$, $i = 0, 1, 2$, and the remaining two equations determine the nonlinear evolution equations for $q$ and $r$ (in this case NLS when $q = \mp r^*$). With $N = 3$ and $r = -1$, we find that $q$ must satisfy the KdV equation. Also, with $r = \mp q$, the modified KdV equation

$$q_t \pm 6q^2 q_x + q_{xxx} = 0 \tag{7}$$

results. If we look for expansions containing inverse powers of $\zeta$, additional interesting equations can be obtained. For example, postulating $A = a/\zeta$, $B = b/\zeta$, $C = c/\zeta$ results in the sine-Gordon and sinh-Gordon equations

$$u_{xt} = \sin u, \tag{8}$$

$$u_{xt} = \sinh u, \tag{9}$$
where $q = -r = -u_x/2$ in Equation (8) and $q = r = u_x/2$ in Equation (9). The sine-Gordon equation has been known to be an important equation in the study of differential geometry since the 19th century (cf. Bianchi, 1902), and it has found applications in the 20th century as a model for dislocation propagation in crystals, domain walls in ferromagnetic and ferroelectric materials, short-pulse propagation in resonant optical media, and magnetic flux propagation in long Josephson junctions, among others. Thus, a number of physically interesting nonlinear wave equations are obtained from the above formalism.

In Ablowitz et al. (1973, 1974; see also Ablowitz & Segur, 1981), it was further shown how this approach could be generalized to a class of nonlinear equations described in terms of certain nonlinear evolution operators that were subsequently referred to in the literature as recursion operators. Further, the whole class of nonlinear equations with rapidly decaying initial data on $-\infty < x < \infty$ was shown to be linearized via direct and inverse scattering methods. Special soliton solutions are associated with the discrete spectrum of the linear operator (4), and via (5) the discrete eigenvalues were shown to be invariants of the motion. In subsequent years, asymptotic analysis of the integral equations yielded the long-time behavior of the continuous spectrum, which in turn showed the ubiquitous role that the Painlevé equations play in integrable systems (cf. Ablowitz & Segur, 1981). Because this formulation is analogous to the method of Fourier transforms, the method was termed the inverse scattering transform or simply the IST.
MARK J. ABLOWITZ
See also Integrability; Inverse scattering method or transform; Korteweg–de Vries equation; Nonlinear Schrödinger equations; Solitons

Further Reading
Ablowitz, M.J., Kaup, D.J., Newell, A.C. & Segur, H. 1973. Nonlinear equations of physical significance. Physical Review Letters, 31: 125–127
Ablowitz, M.J., Kaup, D.J., Newell, A.C. & Segur, H. 1974. The inverse scattering transform–Fourier analysis for nonlinear problems. Studies in Applied Mathematics, 53: 249–315
Ablowitz, M.J. & Segur, H. 1981. Solitons and the Inverse Scattering Transform, Philadelphia, PA: Society for Industrial and Applied Mathematics
Benney, D.J. & Newell, A.C. 1967. The propagation of nonlinear envelopes. Journal of Mathematics and Physics (name changed to: Studies in Applied Mathematics), 46: 133–139
Bianchi, L. 1902. Lezioni di Geometria Differenziale, 3 vols, Pisa: Spoerri
Gardner, C.S., Greene, J.M., Kruskal, M.D. & Miura, R.M. 1967. Method for solving the Korteweg–de Vries equation. Physical Review Letters, 19: 1095–1097
Hasegawa, A. & Tappert, F. 1973. Transmission of stationary nonlinear optical pulses in dispersive dielectric fibers. I. Anomalous dispersion. Applied Physics Letters, 23: 142–144
Korteweg, D.J. & de Vries, G. 1895. On the change of form of long waves advancing in a rectangular canal, and on a new type of long stationary waves. Philosophical Magazine, 39: 422–443
Lax, P.D. 1968. Integrals of nonlinear equations of evolution and solitary waves. Communications on Pure and Applied Mathematics, 21: 467–490
Zakharov, V.E. & Shabat, A.B. 1972. Exact theory of two-dimensional self-focusing and one-dimensional self-modulation of waves in nonlinear media. Soviet Physics, JETP, 34: 62–69
ABLOWITZ–LADIK EQUATION See Discrete nonlinear Schrödinger equations
ACOUSTIC SOLITONS See Nonlinear acoustics
ACTION POTENTIAL See Nerve impulses
ACTION-ANGLE VARIABLES See Hamiltonian systems
ACTIVATOR-INHIBITOR SYSTEM See Reaction-diffusion systems
ADIABATIC APPROXIMATION See Davydov soliton
ADIABATIC INVARIANTS Adiabatic invariants, denoted by $I$, are approximate constants of motion of a given dynamical system (not necessarily Hamiltonian), which are approximately preserved during a process of slow change of the system's parameters (denoted by $\lambda$). This change is on a time scale $T$, which is supposed to be much larger than any typical dynamical time scale such as traversal time or the period of the shortest periodic orbits. This is an asymptotic statement, in the sense that the adiabatic invariants are better preserved, the slower the driving of the system. In other words, the more slowly the switching function $\lambda = \lambda(t)$ varies (the larger the evolutionary time scale $T$), the better the preservation, and the preservation is perfect in the limit $T \to \infty$. The important point is that while the system's parameters $\lambda(t)$ and its dynamical quantities such as the total energy and angular momentum can change by arbitrarily large amounts, their combination involved in the adiabatic invariant $I$ is preserved to a very high degree of accuracy, and this allows us to calculate changes of important quantities in dynamical systems. Examples arise in celestial mechanics, in other Hamiltonian systems, and in the motion of charged particles in magnetic and electric fields.

The accuracy of preservation can be calculated in systems with one degree of freedom and is exponentially good with $T$ if the switching function $\lambda(t)$ is analytic (of class $C^\infty$); that is to say, the change $\Delta I$ of the adiabatic invariant is of the form

$$\Delta I = \alpha \exp(-\beta T), \tag{1}$$

where $\alpha$ and $\beta$ are known constants. If, however, the switching function $\lambda(t)$ is only of class $C^m$ ($m$-times continuously differentiable), then the change of the adiabatic invariant during an adiabatic change over a time period of length $T$ is algebraic only, namely

$$\Delta I = \alpha T^{-(m+1)}. \tag{2}$$

In both cases, $\Delta I \to 0$ as $T \to \infty$. The fact that the evolutionary time scale $T$ is large compared to the typical shortest dynamical time scales (average return time, etc.) suggests the averaging method or the so-called averaging principle. Here the long-term evolution (adiabatic evolution) of the system can be calculated by replacing the actual dynamical system with its averaged correspondent, obtained by averaging over the shortest dynamical time scales (the fast variables). Such a procedure is well known, for example, in celestial mechanics where the secular effects of the third-body perturbations of a planet are obtained by averaging the perturbations over one period of revolution of the perturbers. This was done by Carl Friedrich Gauss
in 1801 in the context of studying the dynamics of planets. The adiabatic invariants can be easily calculated in one-dimensional systems and in completely integrable systems with $N$ degrees of freedom. Something is known about the ergodic Hamiltonian systems, while little is known about adiabatic invariants in mixed-type Hamiltonian systems (with divided phase space), where for some initial conditions in the classical phase space, we have regular motion on invariant tori and irregular (chaotic) motion for other (complementary) initial conditions. One elementary example is the simple (mathematical) pendulum, of point mass $m$ and of length $l$ with the declination angle $\varphi$, described by the Hamiltonian

$$H = \frac{p_\varphi^2}{2ml^2} - mgl\cos\varphi, \tag{3}$$
where $p_\varphi = ml^2\dot{\varphi}$ is the angular momentum. For small oscillations, $\varphi \ll 1$, around the stable equilibrium $\varphi = 0$, the pendulum is described by the harmonic Hamiltonian

$$H = \frac{p_\varphi^2}{2ml^2} + \frac{mgl}{2}\varphi^2. \tag{4}$$

Here the angular oscillation frequency is $\omega = 2\pi\nu = \sqrt{g/l}$, where $\nu$ is the frequency and $g$ is the gravitational acceleration. We denote the total energy of the Hamiltonian $H$ by $E$. Paul Ehrenfest discovered that the quantity $I = E/\omega$ is the adiabatic invariant of the system, so the change of $E(t)$ on large time scales $T \gg 1/\nu$ is such that $I = E(t)/\omega(t)$ remains constant. Therefore, if for example the length of the pendulum $l = l(t)$ is slowly, adiabatically changing, then the energy of the system will change according to the law

$$E = E_0\sqrt{\frac{l_0}{l}}, \tag{5}$$

where $E_0$ and $l_0$ are the initial values and $E$ and $l$ the final values of the two variables. One can easily show that the oscillation amplitude $\varphi_0$ changes as $l^{-3/4}$ as the length $l$ changes. This is an elementary example of a dynamically driven system in which the change of energy $E$ can be very large, as is the change of $\omega$, but $I = E/\omega$ is a well-preserved adiabatic invariant; in fact, it is exponentially well preserved if the switching function $\lambda(t)$ is analytic.

More generally, for Hamiltonian systems $H(q, p, \lambda)$ with one degree of freedom, whose state is described by the coordinate $q$ and canonically conjugate momentum $p$ in the phase space $(q, p)$, and where $\lambda = \lambda(t)$ is the system's parameter (slowly changing on time scale $T$), one can show that the action integral

$$I(E, \lambda) = I(E(t), \lambda(t)) = \frac{1}{2\pi}\oint p\,dq \tag{6}$$

is the adiabatic invariant of the system, where the contour integral is taken at a fixed total energy $E$ and a fixed value of $\lambda$. In this case, $2\pi I$ is interpreted as the area inside the curve $E = \text{const.}$ in the phase plane $(q, p)$. The accuracy is exponentially good if $\lambda(t)$ is an analytic function and algebraic if it is of class $C^m$. Moreover, the theorem holds true only if the frequency $\omega$ is nonzero. This implies that a passage through a separatrix (in the phase space of a one-dimensional system) is excluded because $\omega = 0$ there; thus a different approach is necessary with a highly nontrivial result. When crossing a separatrix of a one-dimensional double potential well from outside in an adiabatic way going inside, a bifurcation takes place, and the capture of the trajectory in either of the two wells is possible with some probabilities. These probabilities can be calculated quite easily, and the spread of the adiabatic invariant $I$ after such a passage can also be calculated, but this is more difficult. Important applications are found in celestial mechanics, where an adiabatic capture of a small body near a resonance with a planet can take place; in plasma physics; and in quantum mechanics of states close to the separatrix (in the semiclassical limit).

The adiabatic invariance of the action is an interesting result, because $I$ is precisely that quantity which according to the "old quantum mechanics" of Bohr and Sommerfeld has to be quantized, that is, made equal to an integer multiple of Planck's constant $\hbar$. Of course, the old quantum mechanics is generally wrong, but it can be a good approximation to the solution of the Schrödinger equation. Even then, strictly speaking, the quantization condition in the sense of EBK or Maslov quantization must be written in the form

$$I = \frac{1}{2\pi}\oint p\,dq = \left(n + \frac{\alpha}{4}\right)\hbar, \tag{7}$$

where $n = 0, 1, 2, \ldots$ is the quantum number and $\alpha$ is the Maslov index, that is, the number of caustics (projection singularities) round the cycle $E = \text{const.}$ in the phase plane.
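The adiabatic invariance of $I = E/\omega$ for the pendulum in its small-oscillation limit can be checked numerically. The sketch below is illustrative only: it integrates Hamilton's equations for the harmonic Hamiltonian (4) with a slowly varying length, using a classical fourth-order Runge–Kutta step. The smooth ramp `length`, the parameter values, and all function names are assumptions chosen for this example and do not come from the text.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2
M = 1.0   # point mass in kg (arbitrary for this illustration)


def length(t, T, l0, l1):
    """Hypothetical smooth switching function: l goes from l0 to l1 over [0, T]."""
    u = min(max(t / T, 0.0), 1.0)
    s = u - math.sin(2.0 * math.pi * u) / (2.0 * math.pi)
    return l0 + (l1 - l0) * s


def rhs(t, phi, p, T, l0, l1):
    """Hamilton's equations for H = p^2/(2 m l^2) + m g l phi^2 / 2."""
    l = length(t, T, l0, l1)
    return p / (M * l * l), -M * G * l * phi


def energy(t, phi, p, T, l0, l1):
    """Value of the harmonic Hamiltonian at time t."""
    l = length(t, T, l0, l1)
    return p * p / (2.0 * M * l * l) + 0.5 * M * G * l * phi * phi


def simulate(T=100.0, l0=1.0, l1=2.0, phi0=0.1, dt=0.005):
    """Integrate with RK4 and return (E0, I0, E1, I1), where I = E / omega."""
    phi, p, t = phi0, 0.0, 0.0
    e0 = energy(t, phi, p, T, l0, l1)
    i0 = e0 / math.sqrt(G / l0)
    for _ in range(int(round(T / dt))):
        k1 = rhs(t, phi, p, T, l0, l1)
        k2 = rhs(t + dt / 2, phi + dt / 2 * k1[0], p + dt / 2 * k1[1], T, l0, l1)
        k3 = rhs(t + dt / 2, phi + dt / 2 * k2[0], p + dt / 2 * k2[1], T, l0, l1)
        k4 = rhs(t + dt, phi + dt * k3[0], p + dt * k3[1], T, l0, l1)
        phi += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
    e1 = energy(t, phi, p, T, l0, l1)
    i1 = e1 / math.sqrt(G / l1)
    return e0, i0, e1, i1


if __name__ == "__main__":
    e0, i0, e1, i1 = simulate()
    print("E1/E0 =", e1 / e0, " expected about", math.sqrt(1.0 / 2.0))
    print("I1/I0 =", i1 / i0)
```

With the length doubled over many oscillation periods, the energy ratio should come out close to $\sqrt{l_0/l} = \sqrt{1/2}$, as in Equation (5), while $I$ stays nearly unchanged. Shortening `T` toward a few oscillation periods should make the drift in $I$ visibly larger.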
For smooth systems with quadratic kinetic energy, it is typically $\alpha = 2$. Thus, at this semiclassical level, we have the semiclassical adiabatic invariant, stating that in one-dimensional systems under an adiabatic change, the quantum number (and thus the eigenstate) is preserved. This agrees with the exact result in the theory of the Schrödinger equation in quantum mechanics. Round a closed loop in a parameter space, a quantum system returns to its original state, except for the phase. (This closed-loop phase change is essentially the so-called Berry's phase.)

The method of averaging can also be used in $N$-dimensional Hamiltonians $H = H(q, p)$, where $q$ and $p$ are $N$-dimensional vectors, but it works only in two extreme cases: the integrable case and the ergodic case. In a classical integrable Hamiltonian system we have $N$ analytic, global, and functionally independent constants of motion $A_i = A_i(q, p)$, $i = 1, 2, \ldots, N$, pairwise in involution; that is, all Poisson brackets $\{A_i, A_j\}$ vanish identically everywhere in phase space. The orbits in phase space are then confined to an invariant $N$-dimensional surface, and according to the Liouville–Arnol'd theorem the topology of these surfaces must be the topology of an $N$-dimensional torus. Then an action integral $I = (1/2\pi)\oint p \cdot dq$ along a closed loop on a torus will be zero if the loop can be continuously shrunk to a point on the torus. But there are loops that cannot be shrunk to a point due to the topology of the torus. Then the integral $I$ is different from zero, but otherwise its value does not depend on the particular loop, so in a sense it is a topological invariant of the torus. On an $N$-dimensional torus, there are $N$ such independent elementary closed loops $C_i$, $i = 1, 2, \ldots, N$. The integrals that we call simply actions or action variables

$$I_i = \frac{1}{2\pi}\oint_{C_i} p \cdot dq \tag{8}$$

are then the most natural momentum variables on the torus, whilst angle variables $\Theta$ specifying the position on the torus labeled by $I$ can be generated from the transformation

$$\Theta = \frac{\partial S(I, q)}{\partial I}, \tag{9}$$

where $S = \int p \cdot dq$ is an action integral on the torus. Applying the averaging principle (the method of averaging), one readily shows that for an integrable system the actions $I$ are $N$ adiabatic invariants, provided the system is nondegenerate, which means that the frequencies

$$\omega = \frac{\partial H}{\partial I} \tag{10}$$

on the given torus are not rationally connected; that is to say, there is no nonzero integer vector $k$ such that $\omega \cdot k = 0$. The problem is that during an adiabatic process the frequencies $\omega$ will change, and therefore, strictly speaking, there will be infinitely many values of $\lambda = \lambda(t)$ at which $\omega \cdot k = 0$, which will invalidate the theorem. However, it is thought that if the degree of resonances or rationality conditions $\omega \cdot k = 0$ is of a very high order, meaning that all components of $k$ are very large, then the adiabatic invariants $I_i$ will be quite well preserved. But low-order resonances (rationality conditions) must be excluded. The details of such a process call for further investigation. When the $N$ actions $I_i$ of an integrable system are quantized in the sense of Maslov, as explained above in the one-dimensional case, we again find agreement, at this semiclassical level, with quantum mechanics: in a family of integrable systems, all $N$ quantum numbers and the corresponding eigenstates are preserved under an adiabatic change.
The other extreme, that of classically ergodic and thus fully chaotic systems, was considered already by Hertz. He found that in such ergodic Hamiltonian systems the phase space volume enclosed by the energy surface $H(q, p) = E = \text{constant}$ is the adiabatic invariant, denoted by

$$\Omega(E) = \int_{H(q,p) \le E} d^N q\, d^N p. \tag{11}$$

Of course, here it is required that while the system's parameter $\lambda(t)$ is slowly changing, the system itself must be ergodic for all $\lambda(t)$. Sometimes, this condition is difficult to satisfy, but sometimes it is easily fulfilled. Examples are the stadium of Bunimovich with varying length of the straight line between the two semicircles, or the Sinai billiard with varying radius of the circle inside a square. For an ergodic two-dimensional billiard of area $A$ and point mass $m$, we have

$$\Omega(E) = 2\pi m A E. \tag{12}$$
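Equation (12) follows directly from the definition (11). The short derivation below is a sketch that assumes the usual billiard Hamiltonian $H = (p_x^2 + p_y^2)/2m$ inside a hard-walled domain of area $A$, so that the position and momentum integrals factorize:

$$\Omega(E) = \int_{\text{billiard}} d^2 q \int_{p_x^2 + p_y^2 \le 2mE} d^2 p = A \cdot \pi\,(2mE) = 2\pi m A E.$$

The momentum integral is simply the area of a disk of radius $\sqrt{2mE}$, and the position integral is the area of the billiard.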
Therefore, when $A$ is adiabatically changing, the energy $E$ of the billiard particle is changing reciprocally with $A$. Diminishing $A$ implies increasing $E$, and this can be interpreted as work being done against the "pressure" of only one particle, if we define the pressure as the time average of the momentum transfer at collisions with the boundary of our ergodic billiard. There is a formalism to proceed with this analysis close to the thermodynamic formalism, as derived from statistical mechanics, except that here we are talking about time averages rather than phase averages of classical variables. Again, this general result for ergodic systems is interesting from the quantum point of view because $\mathcal{N} = \Omega(E)/(2\pi\hbar)^N$ is precisely the number of energy levels below the energy $E$ in the semiclassical limit of very large $\mathcal{N}$, which is known as the Thomas–Fermi rule. It is the number of elementary quantum Planck cells inside the volume $H(q, p) \le E$. Indeed, quantum mechanically, the eigenstate and the (energy counting sequential) main quantum number $\mathcal{N}$ are preserved under an adiabatic change.

In case of a mixed-type Hamiltonian system, which is a typical case in nature, adiabatic theory is in its infancy. Moreover, in three or higher degrees of freedom, we have universal diffusion on the Arnol'd web, which is dense on the energy surface, even for KAM-type Hamiltonian systems that are very close to integrability, like our solar planetary system. On the Arnol'd web we have diffusional chaotic motion, and there is a theory by Nekhoroshev giving a rigorous upper bound to the diffusion rate in such a case. However, when compared with numerical calculations, it is found that the diffusion rate is many orders of magnitude smaller than the Nekhoroshev limit. In other words, the actual diffusion time is much longer than estimated by Nekhoroshev, implying that there we have some approximate adiabatic invariant for long times, but not very long times.
MARKO ROBNIK

See also Averaging methods; Berry's phase; Billiards; Phase space; Quantum theory; Quasilinear analysis

Further Reading
Arnol'd, V.I. 1989. Mathematical Methods of Classical Mechanics, 2nd edition, New York and Heidelberg: Springer
Cary, J.R. & Rusu, P. 1992. Separatrix eigenfunctions. Physical Review A, 45: 8501–8512
Cary, J.R. & Rusu, P. 1993. Quantum dynamics near a classical separatrix. Physical Review A, 47: 2496–2505
Landau, L.D. & Lifshitz, E.M. 1996. Mechanics (Course of Theoretical Physics, vol. 1), 3rd edition, Oxford: Butterworth-Heinemann
Landau, L.D. & Lifshitz, E.M. 1997. Quantum Mechanics: Non-Relativistic Theory (Course of Theoretical Physics, vol. 3), 3rd edition, Oxford: Butterworth-Heinemann
Lichtenberg, A.J. & Lieberman, M.A. 1992. Regular and Chaotic Dynamics, 2nd edition, New York and Heidelberg: Springer
Lochak, P. & Meunier, C. 1988. Multiphase Averaging for Classical Systems, New York and Heidelberg: Springer
Reinhardt, W.P. 1994. Regular and irregular correspondences. Progress of Theoretical Physics Supplement, 116: 179–205
ALFVÉN WAVES The essence of Hannes Alfvén’s contributions to cosmic and laboratory plasmas is his idea of combining electromagnetics and hydrodynamics (Alfvén, 1942), thus introducing the new concept of magnetohydrodynamics (MHD). Electromagnetic waves associated with the motion of conducting liquids in magnetic fields, now known as Alfvén waves, were first observed experimentally (Lundquist, 1949; Lehnert, 1954). Later on, waves of this nature have turned out to be fundamental constituents of numerous phenomena in all parts of the universe (Fälthammar, 1995; Wilhelmsson, 2000). In the pioneering experiments, liquid mercury was used by Lundquist, and liquid sodium by Lehnert, who achieved higher electrical conductivity and lower density, leading to a higher Lundquist number (lower damping). Alfvén used his early results to give a possible explanation for sunspots and the solar cycle (periodicity in the Sun’s activity) (Alfvén, 1942). Alfvén noticed that the Sun has a general magnetic field and that solar matter is a good conductor, thus fulfilling idealized requirements for the notion of an electromagnetic wave in a gaseous conductor or plasma. At a very early age, Alfvén was given a copy of a popular astronomy book by Camille Flammarion, which greatly stimulated his lifelong interest in astronomy and astrophysics. His early experiences building radio receivers at the school radio club were also important for his later activities. Interestingly, another great scientist, Albert Einstein, received a small compass as a present when he was five years old, which
entirely absorbed his interest. He asked everybody around him what a magnetic field was and what gravity was, and later on in his life he admitted that this early experience might have influenced his lifelong scientific activities. Other similarities between the two scientists were that in their professional work both Einstein and Alfvén were very creative individualists, striving for simplicity of their solutions, and being skilled in many areas, they often looked at problems with fresh eyes. Both received Nobel Prizes in physics: Einstein in 1922, Alfvén in 1970.

The simplest form of an Alfvén wave, the propagation of an electromagnetic wave in a highly conducting plasma, was first rejected by critics on the grounds that it could not be correct, otherwise it would already have been discovered by Maxwell. Furthermore, experiments had been performed with magnetic fields and conductive media by Ampère and others long ago. Nevertheless, "The Alfvén wave, in fact, is the very foundation on which the entire structure of magnetohydrodynamics (MHD) is erected. Beginning from a majestic original simplicity, it has acquired a rich and variegated character, and has ended up dictating most of the low-frequency dynamics of magnetized plasmas" (Mahajan, 1995).

To visualize the interaction between the magnetic field and the motion of the conductive fluid, one may use an analogy with the theory of stretched strings to obtain a wave along the magnetic lines of force with a velocity $v_A$, where

$$v_A^2 = \frac{B^2}{\mu_0 \rho} \tag{1}$$

and $\rho$ is the mass density of the fluid, $\mu_0$ is the permeability, and $B$ is the magnetic field. The variations in velocity and current are mutually perpendicular and the magnetic field variations are in the direction of the fluid velocity variations, all variations being perpendicular to the direction of propagation. One may say that the variations of the magnetic field lines are frozen to those of the fluid motion, as can be deduced from electromagnetic equations, together with the hydrodynamic equation for the case of an incompressible fluid of infinite conductivity. The Alfvén wave is a low-frequency wave ($\omega < \omega_{ci}$, $\omega_{ci}$ being the ion cyclotron frequency) for which the displacement current is negligible. In fact, there are two types of Alfvén waves, for which

$$\omega/k_\parallel = v_A \quad \text{(torsional or shear wave)} \tag{2}$$

and

$$\omega/k = v_A \quad \text{(compressional wave)} \tag{3}$$

where $\omega$ is the frequency, $k^2 = k_\parallel^2 + k_\perp^2$, with $k_\parallel$ and $k_\perp$ being the wave numbers along and perpendicular to the magnetic field. For the shear wave, the frequency
depends only on $k_\parallel$ and not on $k_\perp$, which has profound consequences and leads to a continuous spectrum (Mahajan, 1995). For determining plasma stability and in selecting schemes for plasma heating and current drive in fusion plasma devices, the understanding of Alfvén wave dynamics is of great importance and has led to a vast literature. Nonlinear effects are of relevance to large-amplitude disturbances frequently observed in laboratory and space plasmas (Wilhelmsson, 1976). The formation and propagation of Alfvén vortices with geocosmophysical and pulsar (electron-positron plasma) applications are just two examples. Alfvén waves have also found interesting applications in solid-state plasmas in semiconductors as well as in metals and semimetals. Such studies have resulted in refined methods of measuring magnetic fields.

It is often said that 99% of the universe consists of plasma. Alfvén used to say that it seems as if only the crust of the Earth is not plasma. In the mid-1960s, this author gave a talk at the Royal Institute of Technology in Stockholm about plasmas in solids (electrons and holes), which Hannes Alfvén himself attended. Among other things, the talk described recent observations of Alfvén waves in such plasmas, and Alfvén said: "Ah, they are here also. How interesting, I did not know that." It was not until the middle of the 20th century that more intensive investigations on Alfvén waves in space and laboratory plasmas began. The slow development of the field of space plasmas was possibly because many physicists were not acquainted with the fact that electric currents can be distributed in large volumes and magnetic fields in such volumes can be present. Since then, the gigantic laboratory of the universe, from the aurora originating in the Earth's magnetosphere to quasars at the rim of the universe, has attracted immense interest with regard to Alfvén waves.
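Returning to Equations (1) to (3), the relations are simple enough to evaluate directly. The following is a minimal sketch in SI units; the function names are hypothetical, the value of $\mu_0$ is the classical vacuum permeability, and the numbers in the usage example (a 1 T field and a density of roughly that of liquid mercury) are illustrative choices, not values taken from the experiments cited above.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in H/m (classical value)


def alfven_speed(b_tesla, rho_kg_m3):
    """Alfven speed v_A = B / sqrt(mu_0 * rho), from Equation (1)."""
    if rho_kg_m3 <= 0.0:
        raise ValueError("mass density must be positive")
    return abs(b_tesla) / math.sqrt(MU_0 * rho_kg_m3)


def shear_frequency(k_par, k_perp, b_tesla, rho_kg_m3):
    """Equation (2): omega = k_parallel * v_A; k_perp is accepted but unused."""
    return abs(k_par) * alfven_speed(b_tesla, rho_kg_m3)


def compressional_frequency(k_par, k_perp, b_tesla, rho_kg_m3):
    """Equation (3): omega = k * v_A with k^2 = k_parallel^2 + k_perp^2."""
    return math.hypot(k_par, k_perp) * alfven_speed(b_tesla, rho_kg_m3)


if __name__ == "__main__":
    # Illustrative inputs only: B = 1 T, rho = 1.35e4 kg/m^3.
    print("v_A =", alfven_speed(1.0, 1.35e4), "m/s")
    print("shear omega =", shear_frequency(2.0, 5.0, 1.0, 1.35e4), "rad/s")
    print("compressional omega =", compressional_frequency(2.0, 5.0, 1.0, 1.35e4), "rad/s")
```

Changing `k_perp` alters only the compressional frequency, which is the property of the shear wave that the text links to a continuous spectrum.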
When propagating in inhomogeneous plasmas, for example, in the magnetosphere, the Alfvén wave experiences many interesting phenomena, including mode coupling, resonant mode conversion, and resonant absorption. We now know that shear Alfvén waves lie behind the phenomena of micropulsations in the geomagnetic field and also acceleration of particles. Micropulsations were detected a hundred years ago with simple magnetometers on the ground. It took more than 50 years before it was understood that they were related to the magnetosphere. Solar physics is another fascinating field where Alfvén waves occur, giving rise to sunspots (Alfvén, 1942). The vast amount of energy exhibited in eruptions of particles on the solar surface, originating in the interior of the Sun, is probably transported by Alfvén waves. These also play a role in heating the solar corona. Alfvén waves were first identified in the solar wind by means of spacecraft measurements by the end of the 1960s. They also occur in the exosphere of comets. A new and promising area of research
is laboratory astrophysics using high-intensity particle and photon beams that may shed light on superstrong fields in plasmas. For applications to confinement and heating of fusion plasmas, for example, in Tokamak devices, shear Alfvén waves have been studied in toroidal plasmas, accounting for nonuniform plasmas in axisymmetric situations. It is believed that the remaining exciting challenges lie in the area of nonlinear physics of shear Alfvén waves and associated particle dynamics and anomalous losses of α particles in a deuterium-tritium plasma. Collective modes in inhomogeneous plasmas as well as energy and particle transport in plasmas with transport barriers are of paramount importance for the design of a future Tokamak power plant (Parail, 2002). Nonlinear transport processes in laboratory and cosmic plasmas have much in common (Wilhelmsson, 2000; Wilhelmsson & Lazzaro, 2001). Similarities (and discrepancies) could be highly indicative and beneficial for an improved understanding of specific phenomena as well as for plasma dynamics in general, possibly even for describing the evolution of the universe (Wilhelmsson, 2002).
HANS WILHELMSSON

See also Magnetohydrodynamics; Nonlinear plasma waves; Plasma soliton experiments
Further Reading
Alfvén, H. 1942. Existence of electromagnetic-hydrodynamic waves. Nature, 150: 405–406
Fälthammar, C.-G. 1995. Hannes Alfvén. In Alfvén Waves in Cosmic and Laboratory Plasmas, edited by A.C.-L. Chian, A.S. de Assis, C.A. de Azevedo, P.K. Shukla & L. Stenflo, Proceedings of the International Workshop on Alfvén Waves, Physica Scripta, T60: 7
Lehnert, B. 1954. Magnetohydrodynamic waves in liquid sodium. Physical Review, 94: 815
Lundquist, S. 1949. Experimental demonstration of magnetohydrodynamic waves. Nature, 164: 145
Mahajan, S.M. 1995. Spectrum of Alfvén waves, a brief review. Physica Scripta, T60: 160–170
Marston, E.H. & Kao, Y.H. 1969. Damped Alfvén waves in bismuth. A determination of charge-carrier relaxation times. Physical Review, 182: 504
Parail, V.V. 2002. Energy and particle transport in plasmas with transport barriers. Plasma Physics and Controlled Fusion, 44: A63–85
Wilhelmsson, H. (editor). 1976. Plasma Physics: Nonlinear Theory and Experiments, New York and London: Plenum Press
Wilhelmsson, H. (editor). 1982. The physics of hot plasmas. Proceedings of the International Conference on Plasma Physics, Göteborg, May 1982, Physica Scripta, T2 (1 and 2)
Wilhelmsson, H. 2000. Fusion: A Voyage through the Plasma Universe, Bristol and Philadelphia: Institute of Physics Publishing
Wilhelmsson, H. 2002. Gravitational contraction and plasma fusion burn; universal expansion and the Hubble Law. Physica Scripta, 66: 395
Wilhelmsson, H. & Lazzaro, E. 2001. Reaction-Diffusion Problems in the Physics of Hot Plasmas, Bristol and Philadelphia: Institute of Physics Publishing
ALGORITHMIC COMPLEXITY The notion of complexity as an object of scientific interest is relatively new. Prior to the 20th century, the main concern was that of simplicity, with complexity being the denigrated opposite. This idea of simplicity has had a long history enshrined in the dictum of the 14th-century Franciscan philosopher, William of Occam, that "pluralitas non est ponenda sine necessitate" [plurality is not to be posited without necessity], and passed on simply as "Occam's razor," or more prosaically, "keep it simple" (Thorburn, 1918). Indeed, the razor has been invoked by such notables as Isaac Newton and, in modern times, Albert Einstein and Stephen Hawking to justify parsimony in the adoption of physical principles. Although the dictum has proved its usefulness as a support for many scientific theories, the last century witnessed a gradual concern for simplicity's complement. Implicit was a recognition that beyond logical partitions was a need to quantify the simple/complex continuum.

Perhaps the first milestone on the road to quantifying complexity came with Claude Shannon's famous information entropy in the late 1940s. Although it was not specifically developed as a complexity measure, the information connection made by Warren Weaver soon provided an impetus for sustained interest in information as a unifying concept for complexity (Weaver, 1948). Shannon approached information as a statistical measure of receiving a message (Shannon, 1948): if $p_1, p_2, \ldots, p_N$ are the probabilities of receiving messages $m_1, m_2, \ldots, m_N$, then the information carried is defined by

$$I = -\sum_{i=1}^{N} p_i \log_2 p_i. \tag{1}$$
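Equation (1) is straightforward to evaluate. The following minimal sketch computes the information in bits for a list of message probabilities; the function name and the example distributions are illustrative assumptions, and terms with $p_i = 0$ are taken to contribute nothing (the usual convention, assumed here).

```python
import math


def shannon_information(probs):
    """Return I = -sum_i p_i log2 p_i in bits for a probability list."""
    if any(p < 0.0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 are skipped, following the convention 0 * log 0 = 0.
    return sum(-p * math.log2(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    print(shannon_information([0.25, 0.25, 0.25, 0.25]))  # four equally likely messages
    print(shannon_information([0.7, 0.1, 0.1, 0.1]))      # one message dominates
    print(shannon_information([1.0, 0.0, 0.0, 0.0]))      # a certain message
```

Comparing the three cases shows the uniform distribution giving the largest value and the certain message giving zero.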
Information is typically referred to as a measure of surprise; that is, the more unlikely a message, the more information it contains. To some degree, information is related to the notion of randomness in that the more regular (less complex, less random) something is, the less surprise is available. A simple calculation of this entropy demonstrates that the maximum of the function is achieved when all probabilities are equal. Shannon had a measure of capacities of a communication channel as his goal and did not concern himself with individual objects of the messages. Nonetheless, the quantification in terms of probabilities provided a basis for viewing complexity. This theme was soon independently taken up by Ray Solomonoff (1964), Andrey Kolmogorov (1965),
and Gregory Chaitin (1966). In a sense, Solomonoff was looking for a way to measure the effect of Occam's razor; that is, how can one measure objectively the simplicity of a scientific theory? Kolmogorov and Chaitin, on the other hand, were interested in a measure of complexity of individual objects, as opposed to Shannon's average. This Kolmogorov–Chaitin complexity has come to be known variously as algorithmic complexity, algorithmic entropy, and algorithmic randomness, among other designations. Both Kolmogorov and Chaitin were interested in binary number strings as objects and the ability to define the complexity of a string in terms of the shortest algorithm that prints out the string. Again, regularity and randomness are involved (Gammerman & Vovk, 1999). Consider, for example, the simple bit string 101010101010...; the minimal program to write the string requires only the pattern 10, the length of the string, and the "repeat, write" instructions. Formally,

$$K(s) = \min\{|p| : s = C_T(p)\}, \tag{2}$$
where $K(s)$ is the Kolmogorov complexity of the string, $|p|$ is the program length in bits, and $C_T(p)$ is the result of running program $p$ on a universal Turing machine $T$. Clearly, the recognition that patterns play an important role in defining complexity re-emphasized their importance in terms of data compression. In the early 1950s, David Huffman recognized their importance, and algorithmic complexity reaffirmed their utility with the ascendancy of computers and their demand for storage space (Huffman, 1952). Thus, numerous coding schemes were developed to take advantage of the fact that a simple algorithm can compress long data streams based upon the idea that recurrent patterns exist.
The efforts of Solomonoff, Kolmogorov, and Chaitin spawned numerous alternative measures of complexity, often seeking to address identified deficiencies in the definitions (Shalizi & Crutchfield, 2001). Among the deficiencies pointed out were the following: (i) Complexity is defined in terms of randomness: it is maximized by random strings. Is this what is really sought? (ii) Complexity is uncomputable, since there is no algorithm to compute it on a universal computing machine. (iii) Complexity does not provide information regarding structural patterns or organizations that have the same amount of disorder.
These questions were compounded by the expanding field of nonlinear dynamics. Kolmogorov’s earlier entropy (1958), developed to determine the rate of information creation, was among the invariants used to distinguish chaotic systems and to infer complexity. Moreover, the description of physical dynamical systems became an additional issue (Zurek, 1989). Physical processes were typically described along the continuum of two extremes: periodic or random. However, both such systems are simply described, one by a recurrent pattern and the other by a statistical description. While information is high in a random system, it is low in a periodic process. Amalgams of two such processes might require considerable computational effort, yet no metric sufficiently expressed this. Certainly, such combined processes (random and periodic) may exhibit moderate information, but the most concise description may be quite complex. Some of these difficulties have been addressed with varying degrees of acceptance (Wackerbauer et al., 1994).
Increasingly, however, it appears that the question is evolving along two different lines: a formal approach (rules), with its main ramifications redounding to mathematics and computer science, and a physical approach (equations), dealing with the characterization of systems. Both approaches have in common the emphasis on the reconstruction (prediction) of an observed system and on the need to give the most parsimonious recipe for generating the studied entity. The metric of complexity is thus expressed in terms of program lines for the mathematics-oriented option and in dimensionality for the physics-oriented definition, but there is the same basic notion of complexity as the inverse of compressibility of a given object (Boffetta et al., 2002).
This notion of compressibility has an immediate translation in terms of both multidimensional statistics and technology. In multidimensional statistics, the compressibility of a given data set corresponds to the percentage of variance explained by the optimal (generally in a least-squares sense) model of the data. More generally, something can be compressed if there exists some sort of correlation structure linking the different portions of a system, the existence of such correlations implying that the information about one part of the system is implicit in another part.
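The link between regularity and compressibility can be illustrated with a general-purpose compressor. This is only a sketch under stated assumptions: the compressed length produced by Python's `zlib` is a computable upper-bound proxy, not the Kolmogorov complexity itself (which is uncomputable), and the helper name `compression_ratio`, the string length, and the random seed are arbitrary choices made for illustration.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means more regular)."""
    return len(zlib.compress(data, 9)) / len(data)

n_bytes = 10_000
periodic = b"10" * (n_bytes // 2)          # the repeating pattern 1010...
rng = random.Random(0)                     # fixed seed, illustrative only
noisy = bytes(rng.randrange(256) for _ in range(n_bytes))

ratio_periodic = compression_ratio(periodic)
ratio_noisy = compression_ratio(noisy)
```

The periodic string shrinks to a small fraction of its length because one part of it predicts the rest, whereas the pseudo-random bytes do not compress at all.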
Thus, not all of the information is needed to reconstruct the entire system. It is evident how this concept corresponds to the cognate concept of redundancy, bringing us back to the notion of entropy (Giuliani et al., 2001). Clearly, the diverse algorithms designed to measure complexity suggest a commonality. The question remains as to whether one metric is sufficient for its characterization.
JOSEPH P. ZBILUT AND ALESSANDRO GIULIANI
See also Entropy; Information theory; Structural complexity
Further Reading
Boffetta, G., Cencini, M., Falcioni, M. & Vulpiani, A. 2002. Predictability: a way to characterize complexity. Physics Reports, 356: 367–474
Chaitin, G.J. 1966. On the length of programs for computing finite binary sequences. Journal of the Association for Computing Machinery, 13: 547–569
Gammerman, A. & Vovk, V. 1999. Kolmogorov complexity: sources, theory and applications. The Computer Journal, 42: 252–255
Gell-Mann, M. & Lloyd, S. 1999. Information measures, effective complexity, and total information. Complexity, 2: 44–52
Giuliani, A., Colafranceschi, M., Webber Jr., C.L. & Zbilut, J.P. 2001. A complexity score derived from principal components analysis of nonlinear order measures. Physica A, 301: 567–588
Huffman, D.A. 1952. A method for the construction of minimum redundancy codes. Proceedings IRE, 40: 1098–1101
Kolmogorov, A.N. 1958. A new metric invariant of transitive dynamical systems and automorphism in Lebesgue spaces. Doklady Akademii Nauk SSSR [Proceedings of the Academy of Sciences of the USSR], 119: 861–864
Kolmogorov, A.N. 1965. Tri podkhoda k opredeleniiu poniatiia “kolichestvo informatsii”. [Three approaches to the quantitative definition of information.] Problemy Peredachy Informatsii [Problems Information Transmission], 1: 3–11
Lempel, A. & Ziv, J. 1976. On the complexity of finite sequences. IEEE Transactions on Information Theory, 22: 75–81
Li, M. & Vitányi, P. 1997. An Introduction to Kolmogorov Complexity and Its Applications, 2nd edition, New York: Springer
Salomon, D. 1998. Data Compression, the Complete Reference, New York: Springer
Shalizi, C.R. & Crutchfield, J.P. 2001. Computational mechanics: pattern and prediction, structure and simplicity. Journal of Statistical Physics, 104: 819–881
Shannon, C.E. 1948. A mathematical theory of communication. Bell System Technical Journal, 27: 379–423
Solomonoff, R.J. 1964. The formal theory of inductive inference, parts 1 and 2. Information and Control, 7: 1–22, 224–254
Thorburn, W.M. 1918. The myth of Occam’s razor. Mind, 27: 345–353
Wackerbauer, R., Witt, A., Atmanspacher, H., Kurths, J. & Scheingraber, H. 1994. A comparative classification of complexity measures. Chaos, Solitons & Fractals, 4: 133–173
Weaver, W. 1948. Science and complexity. American Scientist, 36: 536–544
Zurek, W. 1989. Thermodynamic cost of computation, algorithmic complexity, and the information metric. Nature, 341: 119–124
ALL-OR-NOTHING RESPONSE See Nerve impulses
ALMOST PERIODIC FUNCTIONS See Quasiperiodicity
AMBIGUOUS FIGURES See Cell assemblies
ANDERSON LOCALIZATION
Anderson localization is a phenomenon associated with the interference of waves in random media. Although Philip Anderson’s original publication (Anderson, 1958) was actually motivated by experiments on the propagation of spin-waves in random magnets, the greatest application of the concept has been the study of electrical transport phenomena in metals and semiconductors. Over the past 10 years, more attention has been focused on other wave phenomena in random media, particularly optical phenomena.
Our current understanding of electronic transport in metals and semiconductors is based on the Schrödinger equation for the wave function of conduction electrons of the form
$$-\frac{\hbar^2}{2m^*} \nabla^2 \psi(\mathbf{r}) + [U(\mathbf{r}) + V(\mathbf{r})]\, \psi(\mathbf{r}) = E \psi(\mathbf{r}),$$
where $U(\mathbf{r})$ is a periodic potential representing the regular lattice in the solid and $V(\mathbf{r})$ is a random function of position, which represents the presence of impurities in the system. In the absence of the random potential, the allowed energies of such an electron fall within a series of bands separated by energy gaps. The eigenfunctions in the absence of the random potential are all of the form $e^{i\mathbf{k}\cdot\mathbf{r}} u_{j,\mathbf{k}}(\mathbf{r})$, where the wavevector $\mathbf{k}$ lies within the first Brillouin zone, $j$ labels the energy bands, and the Bloch function, $u_{j,\mathbf{k}}(\mathbf{r})$, has the periodicity of the regular potential, $U$. Such states are extended in the sense that their support covers the entire system and hence they can contribute to electrical conduction.
The eigenstates of an electron subject to a random potential may be of two types. Some are extended, although there may be strong local modulations in the amplitude. These states can contribute to electrical conduction through the material, even at zero temperature, as they connect the two ends of a sample. Other states, however, are localized in that their amplitude vanishes exponentially outside a specific finite region. These states are referred to as Anderson localized and can only contribute to conduction via thermal activation.
One can understand the existence of localized states by considering the low-energy states of an electron moving in a very rough, random potential, $V(\mathbf{r})$ (Lee & Ramakrishnan, 1985). The lowest energy states will be those bound to very deep troughs in the potential function. The mixing between states localized in different wells will be very weak because states with significant spatial overlap will have very different energies, while states with similar energies are spatially well separated so that the wave function overlaps are exponentially small. The scale on which the wave functions of localized states decay to zero defines the localization length $\xi$, which depends on the energy of the state and the strength of the disorder. The balance between extended and localized states depends on the strength of the disorder and the spatial dimensionality of the
system. In a one-dimensional (1-d) system, all of the electronic eigenstates are strongly localized by any amount of disorder. In two dimensions, it is believed that all states are actually localized, but that the localization length can be very long in the center of a band. The application of a strong magnetic field to a disordered 2-d electron system, such as may be formed at a semiconductor heterojunction at low temperature, causes the conduction band to break up into a sequence of disorder broadened Landau bands, each with an extended state at its center, an essential feature of the quantum Hall effect.
In three dimensions, the eigenstates at the center of a band are truly extended while those in the low- and high-energy tails are localized. It is believed that there are two well-defined critical energies within the band at which the nature of the states changes so that localized and extended states do not co-exist at a given energy. The critical energies are usually referred to as mobility edges because the zero-temperature conductance of the system will be zero when the Fermi energy lies in the regime of localized states but nonzero in the extended regime (see Figure 1). The location of the mobility edges depends on the strength of the disorder: in very clean systems, only the states in the tails of a band will be localized, while in a very dirty system the mobility edges may meet in the band center so that all states are localized.
Figure 1. A schematic plot of the density of states showing a single disorder broadened band for a 3-d system. The states in the band center are extended while those in the tails are localized (shaded regions); the mobility edges between the two types of state are marked.
The behavior at the mobility edge has been studied by performing experiments on a series of devices with increasing amounts of disorder. The transition between metallic (conducting) and insulating behavior is closely analogous to other phase transitions. Mott (1973) supposed that this metal-insulator transition was first order, with the conductivity jumping from a fixed finite value, $\sigma_{\min}$, to zero. In 1979, a renormalization group analysis was carried out by the so-called “gang of four,” based on earlier scaling arguments by Thouless and co-workers (Thouless, 1974), which favored a continuous metal-insulator transition for 3-d systems (Abrahams et al., 1979).
The discussion of the change in nature of the states from extended to localized in terms of a zero-temperature quantum phase transition has been very fruitful. In this description, the localization length, $\xi$, plays the same role as the correlation length for fluctuations in a thermal transition. It is supposed that $\xi$ diverges at the mobility edges according to a universal power law
$$\xi \sim |E - E_c|^{-\nu}.$$
Numerical evidence indicates that the value of the exponent is indeed universal and has the value $\nu \sim 1.6$. Although the underlying physics of Anderson localization is that of linear waves in random media, the discussion can be recast in terms of nonlinear models without disorder, which are in the same family of statistical field theories used to describe thermal phase transitions, specifically nonlinear sigma models (Efetov, 1997). This has led to the notion that the spatial variation of the wave functions of states at the mobility edge displays a multifractal character.
The application of these ideas to other wave phenomena in random media has been slower. It is much harder to observe strong localization in bosonic and classical wave systems, but recently much experimental work has been carried out on optical and acoustic localization (see John (1990) for a good introduction). This work shows that Anderson localization is not an essentially quantum mechanical effect but is ubiquitous for wave propagation in random media. Similarly, the interplay between the physics of randomly disordered systems and quantum chaos is also proving very rich and fruitful (Altshuler & Simons, 1994).
There are a number of other mechanisms whereby wave excitations may become spatially localized. The propagation of excitations within macromolecules, for example, may become localized both because of interference effects associated with “random” changes in structure and also because of self-trapping effects associated with nonlinearity in the wave equation for these modes. In the case of electrons in solids, electronic excitations may become localized both by random variations in potential and by interaction effects that give rise to the so-called Mott transition. Such interaction effects are responsible for the phenomenon of Coulomb blockade observed in semiconductor nanostructures.
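The statement that any amount of disorder localizes all states in one dimension can be explored numerically. The sketch below is a simplification made for illustration, not taken from the entry: it replaces the continuum Schrödinger equation with a 1-d tight-binding lattice, ψ(n+1) + ψ(n−1) + ε(n) ψ(n) = E ψ(n), with site energies ε(n) drawn uniformly from [−W/2, W/2]. The exponential growth rate of the recursion is used as an estimate of the inverse localization length 1/ξ. The function name, disorder strengths, chain length, and seed are all hypothetical choices.

```python
import math
import random

def inverse_localization_length(energy, disorder, n_sites=20_000, seed=1):
    """Estimate 1/xi as the growth rate of psi along a disordered 1-d chain."""
    rng = random.Random(seed)
    psi, psi_prev = 1.0, 0.0          # arbitrary starting amplitudes
    log_growth = 0.0
    for _ in range(n_sites):
        site_energy = disorder * (rng.random() - 0.5)   # uniform in [-W/2, W/2]
        psi, psi_prev = (energy - site_energy) * psi - psi_prev, psi
        norm = math.hypot(psi, psi_prev)
        log_growth += math.log(norm)  # accumulate growth, then renormalize
        psi, psi_prev = psi / norm, psi_prev / norm
    return log_growth / n_sites

gamma_clean = inverse_localization_length(0.0, 0.0)    # no disorder
gamma_weak = inverse_localization_length(0.0, 2.0)     # moderate disorder
gamma_strong = inverse_localization_length(0.0, 4.0)   # stronger disorder
xi_weak = 1.0 / gamma_weak                             # in lattice spacings
```

In this sketch the clean chain shows no exponential growth at the band center, while nonzero disorder gives a positive growth rate that increases with W, so the estimated localization length shrinks as the disorder grows.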
KEITH BENEDICT
See also Discrete self-trapping system; Local modes in molecular crystals
Further Reading
Abrahams, E., Anderson, P.W., Licciardello, D.C. & Ramakrishnan, T.V. 1979. Scaling theory of localization: Absence of quantum diffusion in two dimensions. Physical Review Letters, 42: 673
Altshuler, B. & Simons, B.D. 1994. Universalities: from Anderson localization to quantum chaos. In Mesoscopic Quantum Physics, Proceedings of the 61st Les Houches Summer School, edited by E. Akkermans, G. Montambaux, J.-L. Pichard & J. Zinn-Justin, Amsterdam: North-Holland
Anderson, P.W. 1958. The absence of diffusion in certain random lattices. Physical Review, 109: 1492
Efetov, K. 1997. Supersymmetry in Disorder and Chaos. Cambridge and New York: Cambridge University Press
John, S. 1990. The localization of waves in disordered media. In Scattering and Localization of Classical Waves in Random Media, edited by P. Sheng. Singapore: World Scientific, pp. 1–96
Lee, P.A. & Ramakrishnan, T.V. 1985. Disordered electronic systems. Reviews of Modern Physics, 57: 287
Mott, N.F. 1973. In Electronic and Structural Properties of Amorphous Semiconductors, edited by P.G. LeComber & J. Mort, London: Academic Press, p. 1
Thouless, D.J. 1974. Electrons in disordered systems and the theory of localization. Physics Reports, 13: 93
ANNIHILATION (KINK-ANTIKINK) See Sine-Gordon equation
ANOSOV AND AXIOM-A SYSTEMS
Two classes of dynamical systems exhibiting chaotic behavior were axiomatically defined and systematically studied for the first time in the 1960s. Previous studies had concentrated on more specific situations. Axiom-A systems were introduced by Stephen Smale in his seminal paper (Smale, 1967). Anosov systems, which are a special case of Axiom-A systems, were studied independently in Moscow around the same period. Today, Anosov and Axiom-A systems are valued as idealized models of chaos: while the conditions defining Axiom A are too stringent to include many real-life examples, it is recognized that they have features shared in various forms by most chaotic systems.
Definitions
First, we give the definitions in the discrete-time case. Let $f$ be a smooth invertible map (for basic notions, See Phase space). A compact invariant set of $f$ is said to be hyperbolic if at every point in this set, the tangent space splits into a direct sum of two subspaces $E^u$ and $E^s$ with the property that these subspaces are invariant under the differential $df$, that is, $df(x)E^u(x) = E^u(f(x))$, $df(x)E^s(x) = E^s(f(x))$, and that $df$ expands vectors in $E^u$ and contracts vectors in $E^s$. If $E^u = \{0\}$ in the definition above, then the invariant set is made up of attracting fixed points or periodic orbits. Similarly, if $E^s = \{0\}$, then the orbits are repelling. If neither subspace is trivial, then the behavior is locally “saddle-like,” that is to say, relative to the orbit of a point $x$, most nearby orbits diverge exponentially fast both in forward and in backward time. This is why hyperbolicity is a mathematical notion of chaos.
An Anosov diffeomorphism is a smooth invertible map of a compact manifold with the property that the entire space is a hyperbolic set. Axiom A, which is a larger class, focuses on the part of the system that is not transient. More precisely, a point $x$ in the phase space is said to be nonwandering if every neighborhood $U$ of $x$ contains an orbit that returns to $U$. A map is said to satisfy Axiom A if its nonwandering set is hyperbolic and contains a dense set of periodic points.
Definitions in the continuous-time case are analogous: $f$ above is replaced by the time-$t$-maps of the flow, and the tangent spaces now decompose into $E^u \oplus E^0 \oplus E^s$ where $E^0$, which is 1-d, represents the direction of the flow lines.
Phase Space Structures and Properties
Anosov and Axiom-A systems are defined by the behavior of the differential. Corresponding to the linear structures left invariant by $df$ are nonlinear structures, namely stable manifolds tangent to $E^s$ and unstable manifolds tangent to $E^u$. Thus, two families of invariant manifolds are associated with an Anosov map and each one of these fills up the entire phase space; they are sometimes called the stable and unstable foliations. The leaves of these foliations are transverse at each point, forming a kind of (topological) coordinate system. The map $f$ expands distances along the leaves of one of these foliations and contracts distances along the leaves of the other. For Axiom-A systems, one has a similar local product structure or “coordinate system” at each point in the nonwandering set, but the picture is local, and there are gaps: the stable and unstable leaves do not necessarily fill out open sets in the phase space.
In addition to these local structures, Axiom-A systems have a global structure theorem known as spectral decomposition. It says that the nonwandering set of every Axiom-A map can be written as $X_1 \cup \cdots \cup X_r$ where the $X_i$ are disjoint closed invariant sets on which $f$ is topologically transitive. The $X_i$ are called basic sets. Each $X_i$ can be decomposed further into a finite union $\bigcup X_{i,j}$, where each $X_{i,j}$ is invariant and topologically mixing under some iterate of $f$. (Topological transitivity and mixing are irreducibility conditions; See Phase space.) This decomposition is reminiscent of the corresponding result for finite-state Markov chains.
One of the reasons why hyperbolic sets are important is their robustness: they cannot be perturbed away. More precisely, let $f$ be a map with a hyperbolic set $\Lambda$ that is locally maximal, that is, it is the largest invariant set in some neighborhood $U$. Then for every map $g$ that is $C^1$ near $f$, the largest invariant set of $g$ in
$U$, denoted $\Lambda'$ here, is again hyperbolic; moreover, $f$ restricted to its hyperbolic set $\Lambda$ is topologically conjugate to $g$ restricted to $\Lambda'$. This is mathematical shorthand for saying that not only are the two sets $\Lambda$ and $\Lambda'$ topologically indistinguishable, but the orbit structure of $f$ on $\Lambda$ is indistinguishable from that of $g$ on $\Lambda'$.
The above phenomenon brings us to the idea of structural stability. A map $f$ is said to be structurally stable if every map $g$ that is $C^1$ near $f$ is topologically conjugate to $f$ (on the entire phase space). It turns out that a map is structurally stable if and only if it satisfies Axiom A and an additional condition called strong transversality.
Next, we discuss the idea of pseudo-orbits versus real orbits. Letting $d(\cdot, \cdot)$ be the metric, a sequence of points $x_0, x_1, x_2, \ldots$ in the phase space is called an $\varepsilon$-pseudo-orbit of $f$ if $d(f(x_i), x_{i+1}) < \varepsilon$ for every $i$. Computer-generated orbits, for example, are pseudo-orbits due to round-off errors. A fact of consequence to people performing numerical experiments is that in hyperbolic systems, small errors at each step get magnified exponentially fast. For example, if the expansion rate is $\geq 3$, then an $\varepsilon$-error made at one step is at least tripled at each subsequent step, that is, after only $O(|\log \varepsilon|)$ iterates, the error is $O(1)$, and the pseudo-orbit bears no relation to the real one. There is, however, a theorem that states that every pseudo-orbit is shadowed by a real one. More precisely, given a hyperbolic set, there is a constant $C$ such that if $x_0, x_1, x_2, \ldots$ is an $\varepsilon$-pseudo-orbit, then there is a phase point $z$ such that $d(x_i, f^i(z)) < C\varepsilon$ for all $i$. Thus, paradoxical as it may first seem, this result asserts that on hyperbolic sets, each pseudo-orbit approximates a real orbit, even though it may deviate considerably from the one with the same initial condition.
The shadowing orbit corresponding to a bi-infinite pseudo-orbit is, in fact, unique. From this, one deduces the following Closing Lemma: for any hyperbolic set, there is a constant $C$ such that the following holds: every finite orbit segment $x, f(x), \ldots, f^{n-1}(x)$ that nearly closes up, that is, $d(x, f^{n-1}(x)) < \varepsilon$ for some small $\varepsilon$, lies within $C\varepsilon$ of a genuine periodic orbit of period $n$. Thus, hyperbolic sets contain many periodic points.
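The exponential magnification of small errors can be made concrete with the simplest map of expansion rate 3, the circle map x → 3x mod 1. This map is chosen here purely for illustration: it is expanding but not invertible, so it is not itself an Anosov diffeomorphism. Exact rational arithmetic is used so that the only error present is the deliberately introduced offset; all names and numerical values are hypothetical.

```python
from fractions import Fraction
import math

def circle_distance(x, y):
    """Distance between two points on the circle of circumference 1."""
    d = abs(x - y) % 1
    return min(d, 1 - d)

def steps_until_separated(x, y, threshold, expansion=3):
    """Iterate x -> expansion * x mod 1 on both points until they are far apart."""
    steps = 0
    while circle_distance(x, y) < threshold:
        x = (expansion * x) % 1
        y = (expansion * y) % 1
        steps += 1
    return steps

eps = Fraction(1, 10**9)                 # initial error
x0 = Fraction(1, 7)                      # arbitrary starting point
steps = steps_until_separated(x0, x0 + eps, Fraction(1, 10))
predicted = math.ceil(math.log(0.1 / 1e-9) / math.log(3))
```

An initial error of one part in a billion reaches size 0.1 after 17 iterates, which matches the logarithmic estimate log(0.1/ε)/log 3 rounded up.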
Examples
A large class of Anosov diffeomorphisms comes from linear toral automorphisms, that is, maps of the $n$-dimensional torus induced by $n \times n$ matrices with integer entries, $\det = \pm 1$, and no eigenvalues of modulus one. (See Cat map for a detailed example of this). We remark that due to their structural stability, (nonlinear) perturbations of linear toral automorphisms continue to have the Anosov property. This remark also applies to all of the examples below. In fact, all known Anosov diffeomorphisms are topologically identical to a linear toral automorphism (or a slight generalization of these).
Geodesic flows describe free motions of points on manifolds. Let $M$ be a manifold. Given $x \in M$ and a unit vector $v$ at $x$, there is a unique geodesic starting from $x$ in the direction $v$. The geodesic flow $\varphi^t$ is given by $\varphi^t(x, v) = (x', v')$, where $x'$ is the point $t$ units down the geodesic and $v'$ is the direction at $x'$. Geodesic flows on manifolds of strictly negative curvature are the main examples of Anosov flows. They were studied by Jacques Hadamard (ca. 1900) and Gustav Hedlund and Eberhard Hopf (in the 1930s) considerably before Anosov theory was developed.
Smale’s horseshoe is the prototypical example of a hyperbolic invariant set. This map, so called because it bends a rectangle $B$ into the shape of a horseshoe and puts it back on top of $B$, is shown in Figure 1. The set $\{x : f^n(x) \in B \text{ for all } n = 0, \pm 1, \pm 2, \ldots\}$ is hyperbolic (See Horseshoes and hyperbolicity in dynamical systems; Phase space).
Figure 1. The horseshoe.
Finally, we mention the solenoid (see Figure 2, and also in the color plate section as the Smale solenoid), which is an example of an Axiom-A attractor. Here, the map $f$ is defined on a solid torus $M = S^1 \times D^2$, where $D^2$ is a 2-d disk. It is easiest to describe it in two steps: first it maps $M$ into a long thin solid torus, which is then placed inside $M$ winding around the $S^1$ direction twice. The attractor is given by $\Lambda = \bigcap_{n \geq 0} f^n(M)$.
Figure 2. The solenoid.
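The three conditions on a linear toral automorphism (integer entries, determinant ±1, no eigenvalue of modulus one) are easy to check in the 2 × 2 case. The following is a minimal sketch restricted to 2 × 2 matrices; the function name and the tolerance are assumptions made for illustration.

```python
import cmath

def is_hyperbolic_toral_automorphism(a, b, c, d, tol=1e-12):
    """Check the three conditions for the 2x2 matrix [[a, b], [c, d]]."""
    if not all(isinstance(entry, int) for entry in (a, b, c, d)):
        return False                      # entries must be integers
    trace, det = a + d, a * d - b * c
    if det not in (1, -1):
        return False                      # must be invertible over the integers
    # Eigenvalues are the roots of lambda^2 - trace * lambda + det = 0.
    root = cmath.sqrt(trace * trace - 4 * det)
    eigenvalues = ((trace + root) / 2, (trace - root) / 2)
    return all(abs(abs(lam) - 1) > tol for lam in eigenvalues)

cat_map = is_hyperbolic_toral_automorphism(2, 1, 1, 1)     # eigenvalues (3 ± sqrt 5)/2
identity = is_hyperbolic_toral_automorphism(1, 0, 0, 1)    # eigenvalue 1 twice
rotation = is_hyperbolic_toral_automorphism(0, -1, 1, 0)   # eigenvalues ±i
```

The matrix with rows (2, 1) and (1, 1), the one usually associated with the cat map, passes the test, while the identity and the quarter-turn rotation fail because their eigenvalues lie on the unit circle.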
Symbolic Coding of Orbits and Ergodic Theory
An important tool for studying the orbit structure of Axiom-A systems is the Markov partition, constructed for Anosov systems by Sinai and extended to Axiom-A basic sets by Bowen. Given a partition $\{R_1, \ldots, R_k\}$ of the phase space, there is a natural way to attach to each point $x$ in the phase space a sequence of symbols, namely $(\ldots, a_{-1}, a_0, a_1, a_2, \ldots)$ where $a_i \in \{1, 2, \ldots, k\}$ is the name of the partition element containing $f^i(x)$, that is, $f^i(x) \in R_{a_i}$ for each $i$. In general, not all sequences are realized by orbits of $f$. Markov partitions are designed so that the set of symbol sequences that correspond to real orbits has Markovian properties; it is called a shift of finite type (See Symbolic dynamics).
The ergodic theory of Axiom-A systems has its origins in statistical mechanics. In a 1-d lattice model in statistical mechanics, one has an infinite array of sites indexed by the integers; at each site, the system can be in any one of a finite number of states. Thus, the configuration space for a 1-d lattice model is the set of bi-infinite sequences on a finite alphabet. Identifying this symbol space with the one from Markov partitions, Sinai and Ruelle were able to transport some of the basic ideas from statistical mechanics, including the notions of Gibbs states and equilibrium states, to the ergodic theory of Axiom-A systems.
The notion of equilibrium states, which is equivalent to Gibbs states for Axiom-A systems, has the following meaning in dynamical systems in general: given a potential function $\varphi$, an invariant measure is said to be an equilibrium state if it maximizes the quantity
$$h_\mu(f) - \int \varphi \, d\mu,$$
where $h_\mu(f)$ denotes the Kolmogorov–Sinai entropy of $f$ and the supremum is taken over all $f$-invariant probability measures $\mu$. In particular, when $\varphi = 0$, this measure is the measure that maximizes entropy; and when $\varphi = \log |\det(df|_{E^u})|$, it is the Sinai–Ruelle–Bowen (SRB) measure. From a physical or observational point of view, SRB measures are the most important invariant measures for dissipative dynamical systems (See Sinai–Ruelle–Bowen measures).
Periodic Points and Their Growth Properties
We discuss briefly some further results related to the abundance of periodic points in Axiom-A systems. For an Axiom-A diffeomorphism $f$, if $P(n)$ is the number of periodic points of period $\leq n$, then $P(n) \sim e^{hn}$ where $h$ is the topological entropy of $f$. That is to say, the dynamical complexity of $f$ is reflected in its periodic behavior. An analogous result holds for Axiom-A flows.
Finally, we mention the dynamical zeta function, which sums up the periodic information of a system. In the discrete-time case, $\zeta(z) := \exp\left(\sum_{n=1}^{\infty} P(n) z^n / n\right)$ has been shown to be a rational function analytic on $|z| < e^{-h}$. In the continuous-time case, the zeta function is given by $\zeta(z) := \prod_{\gamma} (1 - \exp(-z\, l(\gamma)))^{-1}$, where the product is taken over all (nonstationary) periodic orbits $\gamma$ and $l(\gamma)$ is the smallest positive period of $\gamma$. This function is known to be meromorphic on a certain domain, but the locations of its poles, which are intimately related to correlation decay properties of the system, remain one of the yet unresolved issues in Axiom-A theory.
BORIS HASSELBLATT AND LAI-SANG YOUNG
See also Cat map; Horseshoes and hyperbolicity in dynamical systems; Phase space; Sinai–Ruelle–Bowen measures; Symbolic dynamics
Further Reading
Bowen, R. 1975. Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Berlin and New York: Springer
Branner, B. & Hjorth, P. (editors). 1995. Real and Complex Dynamical Systems. Proceedings of the NATO Advanced Study Institute held in Hillerød, June 20–July 2, 1993, Dordrecht and Boston: Kluwer
Fiedler, B. (editor). 2002. Handbook of Dynamical Systems, Vol. 2, Amsterdam and New York: Elsevier
Hasselblatt, B. & Katok, A. (editors). 2002. Handbook of Dynamical Systems, Vol. 1A, Amsterdam and New York: Elsevier
Hasselblatt, B. & Katok, A. 2003. Dynamics: A First Course, Cambridge and New York: Cambridge University Press
Katok, A. & Hasselblatt, B. 1995. Introduction to the Modern Theory of Dynamical Systems, Cambridge and New York: Cambridge University Press
Smale, S. 1967. Differentiable dynamical systems. Bulletin of the American Mathematical Society, 73: 747–817
ANTISOLITONS See Solitons, types of
ANTI-STOKES SCATTERING See Rayleigh and Raman scattering and IR absorption
ARNOL’D CAT MAP See Cat map
ARNOL’D DIFFUSION For near-integrable Hamiltonian systems with more than two degrees of freedom, stochastic and regular trajectories are intimately co-mingled in the 2Ndimensional phase space. Stochastic layers in phase space exist near resonances of the motion. The thickness of the layers expands with increasing perturbation, leading to primary resonance overlap, motion across the layers, and the appearance of strong stochasticity in the motion. In the limit of weak perturbation, however, primary resonance overlap does not occur. A new physical behavior of the motion then makes its appearance: motion along the resonance layers called Arnol’d diffusion (AD). For two degrees of freedom, with a weak perturbation, two-dimensional
14
ARNOL’D DIFFUSION The diffusion rate (D) along a layer has been calculated by Chirikov (1979) for the important case of three resonances, and by Tennyson et al. (1979) for an equivalent mapping model, which they called a stochastic pump. These models predict, for a single action I corresponding to J2 in Figure 1, D = (I )2 /t ∝ e−A/ε
Figure 1. Illustration of the directions of the fast diffusion across a resonance layer and the slow diffusion along the resonance layer.
Kolmogorov–Arnol’d–Moser (KAM) surfaces divide the three-dimensional energy “volume” in phase space into a set of closed volumes each bounded by KAM surfaces, much as lines isolate regions of a plane. For N > 2 degrees of freedom, the N-dimensional KAM surfaces do not divide the (2N −1)-dimensional energy volume into distinct regions. Thus, for N > 2, in the generic case, all stochastic layers of the energy surface in phase space are connected into a single complex network—the Arnol’d web. The web permeates the entire phase space, intersecting or lying infinitesimally close to every point. For an initial condition within the web, the subsequent stochastic motion will eventually intersect every finite region of the energy surface in phase space, even in the limit as the perturbation strength approaches zero. The merging of stochastic trajectories into a single web was proved (Arnol’d, 1964) for a specific nonlinear Hamiltonian. A general proof of the existence of a single web has not been given, but many computational examples support the conjecture. From a practical point of view, there are two major questions with respect to AD in a particular system: what is the relative measure of stochastic trajectories (fraction of the phase space that is stochastic) in the region of interest? And for a given initial condition, how fast will system points diffuse along the thin threads of the Arnol’d web? We illustrate the motion along the resonance layer in Figure 1. A projection of the motion onto the J1 , θ1 plane is shown, illustrating a resonance with a stochastic layer. At right angles to this plane, the action of the other coordinate J2 is shown. If there are only two degrees of freedom in a conservative system, the fact that the motion is constrained to lie on a constant energy surface restricts the change in J2 for J1 constrained to the stochastic layer. 
However, if there is another degree of freedom, or if the Hamiltonian is time dependent, then this restriction is lifted, and motion along the stochastic layer in the J2 direction can occur.
$$D \propto e^{-A/\varepsilon^{1/2}}, \tag{1}$$
where t is the time, ε is a perturbation parameter, and A ≈ 1. For coupling among many resonances, a rigorous upper bound on the diffusion rate (Nekhoroshev, 1977) generally overestimates the rate by orders of magnitude. Using a similar formalism with a somewhat more restrictive class of Hamiltonians, but still encompassing most physical problems, the upper bound can be improved (Benettin et al., 1985; Lochak & Neistadt, 1992) to give what they considered to be an optimal upper bound:
$$D \propto e^{-A/\varepsilon^{\gamma}}, \qquad \gamma \approx N^{-1}. \tag{2}$$
If N is large, such an exponentially small diffusion could only hold for very small ε (specified within the theory), otherwise the exponential factor could be essentially unity. Also, an upper bound must be related to the fastest local diffusion. This may be much more rapid than an average global diffusion, which would be controlled by the portions of the phase space where the diffusion is slowest. For upper bound calculations, consult the original papers of Nekhoroshev (1977), Benettin et al. (1985), and Lochak & Neistadt (1992).
The simplest way to calculate local AD is to couple two standard maps together with a weak coupling term $\mu \sin(\theta_n + \phi_n)$, where $\theta_n$ and $\phi_n$ are the map phases and $\mu \ll 1$. Using the stochastic pump model, with a regular orbit (in the absence of coupling) in the $(I, \theta)$ map being driven by stochasticity in the $(J, \phi)$ map, the Hamiltonian of the mapping is approximated as $H \approx H_i + H_j$, with
$$H_i = \frac{I^2}{2} + K_i \cos\theta + 2\mu \cos(\theta + \phi), \qquad H_j = \frac{J^2}{2} + K_j \cos\phi + 2K_j \cos\phi \cos 2\pi n, \tag{3}$$
where n is the time normalized to mapping periods. We have retained only the lowest Fourier term from the mapping frequency in the $H_j$ equation of (3), and considered that the stochasticity in $H_i$ is driven by the coupling. To calculate the changes in $H_i$ per iteration due to kicks delivered by $(J, \phi)$, we take the derivative
$$\frac{dH_i}{dn} = \frac{\partial H_i}{\partial n} = \frac{d}{dn}\left[2\mu \cos(\theta + \phi)\right] + 2\mu \sin[\theta + \phi(n)]\,\frac{d\theta}{dn}. \tag{4}$$
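For rotational orbits $\theta = \omega_i n + \theta_0$, scaling the time variable to revolutions of the map ($s = \omega_j n$), and defining the ratio of frequencies ($Q_0 = \omega_i/\omega_j = \omega_i/K_j^{1/2}$), Equation (4) is integrated to obtain
$$\Delta H_i = 2\mu Q_0 \left[ \cos\theta_0 \int_{-\infty}^{\infty} \sin[Q_0 s + \phi(s)]\,ds + \sin\theta_0 \int_{-\infty}^{\infty} \cos[Q_0 s + \phi(s)]\,ds \right]. \tag{5}$$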
The first of the integrals in (5) integrates to zero; the second is a Mel’nikov–Arnol’d integral (Chirikov, 1979, Appendix A), which can be evaluated to give the change in $H_i$ over one characteristic half-period of the $(J, \phi)$ map. Squaring $\Delta H_i$ and averaging over $\theta_0$ gives
$$\langle (\Delta H_i)^2 \rangle = 32\pi^2 Q_0^4 \mu^2\, \frac{\sinh^2(\pi Q_0/2)}{\sinh^2(\pi Q_0)}. \tag{6}$$
To determine the diffusion constant D, divide $\langle (\Delta H_i)^2 \rangle$ by twice the average number of iterations in this half-period
$$T_j = \frac{1}{\omega_j} \ln\frac{32e}{w_1}, \tag{7}$$
where $w_1 = \Delta H/H_{\text{separatrix}}$ is the relative energy of the edge of the stochastic region, $w_1 = 8\pi (2\pi/K_j^{1/2})^3 e^{-\pi^2/K_j^{1/2}}$, and e is the base of natural logarithms. Combining (6) and (7), and using $\Delta H_i = I\,\Delta I$, the diffusion constant in action space can be approximated in a form that exhibits the main $Q_0$ scaling:
$$D \approx 16\mu^2 n Q_0^2 e^{-\pi Q_0}, \tag{8}$$
where we have assumed that $I \approx \omega_i$. Comparing (8) to (1) with $Q_0 = \omega_i/K_j^{1/2}$, we observe that $K_j \propto \varepsilon$, the perturbation parameter. The numerical results agreed well with (8) (see Lichtenberg & Aswani (1998) and references therein). Chirikov et al. (1979) found, numerically, that one could distinguish the diffusion in a range where ε was sufficiently large and a single resonance was dominant, such that a three-resonance model scaling as in (1) holds, from a range of smaller values of ε with many overlapping weak resonances, where the scaling in (2) applies. The results of their numerical investigation demonstrated the transition between the two regimes. In another approach, the diffusion through a large number of weakly coupled standard mappings was determined numerically, with the strength of the coupling controlled in a manner such that the three-resonance model could be applied in a statistical manner to determine the diffusion rate (Lichtenberg & Aswani, 1998).
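A numerical experiment of the kind described above can be sketched in a few lines. The following is a minimal illustration only, assuming one common way of writing two standard maps coupled through the term µ sin(θ + φ); the precise map, the function names (`step`, `action_spread`, `diffusion_estimate`), the fixed initial phase of the driving map, and all parameter values are assumptions made for illustration and are not taken from the works cited.

```python
import math

TWO_PI = 2.0 * math.pi


def step(I, theta, J, phi, Ki, Kj, mu):
    """One iteration of two standard maps coupled by mu*sin(theta + phi).

    Assumed form (illustrative): both actions receive the same coupling kick.
    """
    kick = mu * math.sin(theta + phi)
    I_new = I + Ki * math.sin(theta) + kick
    J_new = J + Kj * math.sin(phi) + kick
    theta_new = (theta + I_new) % TWO_PI
    phi_new = (phi + J_new) % TWO_PI
    return I_new, theta_new, J_new, phi_new


def action_spread(I0, J0, Ki, Kj, mu, n_steps, n_orbits=50):
    """Mean-square change of the action I after n_steps iterations,
    averaged over n_orbits equally spaced initial phases theta0."""
    total = 0.0
    for k in range(n_orbits):
        theta = TWO_PI * k / n_orbits
        I, J, phi = I0, J0, 0.1  # phi0 = 0.1 is an arbitrary choice
        for _ in range(n_steps):
            I, theta, J, phi = step(I, theta, J, phi, Ki, Kj, mu)
        total += (I - I0) ** 2
    return total / n_orbits


def diffusion_estimate(I0, J0, Ki, Kj, mu, n_steps, n_orbits=50):
    """Crude diffusion estimate D ~ <(Delta I)^2> / (2 n)."""
    return action_spread(I0, J0, Ki, Kj, mu, n_steps, n_orbits) / (2.0 * n_steps)


if __name__ == "__main__":
    # Illustrative parameters only: a strongly kicked (J, phi) map
    # driving a weakly coupled, unkicked (I, theta) map.
    print(diffusion_estimate(I0=1.0, J0=0.3, Ki=0.0, Kj=1.5, mu=0.01, n_steps=2000))
```

With µ = 0 and K_i = 0 the action I is conserved exactly, which gives a simple sanity check; a quantitative comparison with (8) would need far longer runs and initial conditions placed carefully inside the stochastic layer.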
These studies indicate that the basic models can be used to determine Arnol’d diffusion in multidimensional systems if the system parameters can be sufficiently controlled. For more information on these and related topics, the reader is referred to Chirikov (1979) and to Lichtenberg & Lieberman (1991, Chapter 6).
ALLAN J. LICHTENBERG
See also Kolmogorov–Arnol’d–Moser theorem; Phase space diffusion and correlations; Standard map
Further Reading
Arnol’d, V.I. 1964. Instability of dynamical systems with several degrees of freedom. Russian Mathematical Surveys, 18: 85
Benettin, G., Galgani, L. & Giorgilli, A. 1985. A proof of Nekhoroshev’s theorem for the stability times of nearly integrable Hamiltonian systems. Celestial Mechanics, 37: 1–25
Chirikov, B.V. 1979. A universal instability of many-dimensional oscillator systems. Physics Reports, 52: 265–379
Chirikov, B.V., Ford, J. & Vivaldi, F. 1979. Some numerical studies of AD in a simple model. In Nonlinear Dynamics and the Beam-Beam Interaction, edited by M. Month & J.C. Herrera, New York: American Institute of Physics
Lichtenberg, A.J. & Aswani, A.M. 1998. Arnold diffusion in many weakly coupled mappings. Physical Review E, 57: 5325–5331
Lichtenberg, A.J. & Lieberman, M.A. 1991. Regular and Chaotic Dynamics, 2nd edition, New York: Springer
Lochak, P. & Neistadt, A.I. 1992. Estimates in the theorem of N.N. Nekhoroshev for systems with quasi-convex Hamiltonian. Chaos, 2: 495–499
Nekhoroshev, N.N. 1977. An exponential estimate of the time of stability of nearly integrable Hamiltonian systems. Russian Mathematical Surveys, 32: 1–65
Tennyson, J.L., Lieberman, M.A. & Lichtenberg, A.J. 1979. Diffusion in near-integrable Hamiltonian systems with three degrees of freedom. In Nonlinear Dynamics and the Beam-Beam Interaction, edited by M. Month & J.C. Herrera, New York: American Institute of Physics
ARNOL’D TONGUES See Coupled oscillators
ARTIFICIAL INTELLIGENCE Artificial intelligence (AI) is a field of research in computer science that aims to reproduce intelligent reasoning. AI programs are mainly based on logic-oriented symbolic languages such as, for example, Prolog (Programming in Logic) or LISP (List Processing). Historically, AI was inspired by Alan Turing’s question: “Can machines think?” According to the Turing test for AI, a machine is intelligent if a human user cannot distinguish whether he or she is interacting and communicating with a machine or a human being. Thus, before starting with AI, a general concept of computer and computability must be defined in computer science. In 1936, Turing and Emil Post independently suggested the following definition of computability.
A “Turing machine” consists of: (a) a control box in which a finite program is placed, (b) a potentially infinite tape, divided lengthwise into squares, and (c) a device for scanning, or printing on one square of the tape at a time, and for moving along the tape or stopping, all under the command of the control box. If the symbols used by a Turing machine are restricted to a stroke / and a blank *, then every natural number x can be represented by a sequence of x strokes (e.g., 3 by ///), each stroke on a square of the Turing tape. The blank is used to denote that the square is empty (or the corresponding number is zero). In particular, a blank is necessary to separate sequences of strokes representing numbers. Thus, a Turing machine computes a numerical function f with arguments $x_1, \ldots, x_n$ if the machine program starts with the input tape $\ldots * x_1 * x_2 * \ldots * x_n * \ldots$ and stops after finitely many steps with an output $\ldots * x_1 * x_2 * \ldots * x_n * f(x_1, \ldots, x_n) * \ldots$ on the tape. From a logical point of view, John von Neumann’s general-purpose computer is a technical realization of a universal Turing machine that can simulate any kind of Turing program. Besides Turing machines, there are many other mathematically equivalent procedures for defining computability (e.g., register machines, recursive functions). According to Alonzo Church’s thesis, the informal intuitive notion of an algorithm is identical to one of these equivalent mathematical concepts, for example, the program of a Turing machine. With respect to AI, the paradigm of effective computability implies that the mind is represented by program-controlled machines, and mental structures refer to symbolic data structures, while mental processes implement algorithms.
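The stroke-and-blank convention can be made concrete with a small simulator. The sketch below is illustrative only: the names `run_turing_machine`, `ADD_PROGRAM`, and `add_unary` are hypothetical, and the addition program simplifies the convention described above by merging the two blocks of strokes into one block rather than leaving the arguments on the tape next to the result.

```python
BLANK, STROKE = "*", "/"

# Transition table: (state, scanned symbol) -> (symbol to print, head move, next state).
# Hypothetical program for unary addition x1 + x2 on a tape "x1 * x2".
ADD_PROGRAM = {
    ("q0", STROKE): (STROKE, "R", "q0"),   # skip over the first block of strokes
    ("q0", BLANK): (STROKE, "R", "q1"),    # fill the separating blank with a stroke
    ("q1", STROKE): (STROKE, "R", "q1"),   # skip over the second block
    ("q1", BLANK): (BLANK, "L", "q2"),     # reached the end, step back
    ("q2", STROKE): (BLANK, "N", "halt"),  # erase one surplus stroke and stop
}


def run_turing_machine(program, tape, state="q0", head=0, max_steps=10_000):
    """Run a one-tape machine; squares never written are treated as blank."""
    cells = dict(enumerate(tape))
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        symbol = cells.get(head, BLANK)
        write, move, state = program[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    if not cells:
        return ""
    squares = range(min(cells), max(cells) + 1)
    return "".join(cells.get(i, BLANK) for i in squares).strip(BLANK)


def add_unary(x1, x2):
    """Add two natural numbers by running ADD_PROGRAM on their stroke encodings."""
    tape = STROKE * x1 + BLANK + STROKE * x2
    return run_turing_machine(ADD_PROGRAM, tape).count(STROKE)
```

For example, `add_unary(3, 2)` runs the machine on the tape `///*//`. The control box of the definition corresponds to the finite transition table, and the potentially infinite tape to a dictionary that grows only where the head has visited.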
Historically, the hard core of AI was established during the Dartmouth Conference in 1956 when leading researchers, such as John McCarthy, Allen Newell, Herbert Simon, and others from different disciplines, formed the new scientific community of AI. If human thinking can be represented by an algorithm, then according to Church’s thesis, it can be represented by a Turing program that can be computed by a universal Turing machine. Thus, human thinking could be simulated by a general-purpose computer and, in this sense, Turing’s question (“Can machines think?”) must be answered with a “yes.” The premise that human thinking can be codified and represented by recursive procedures is, of course, doubtful. Even processes of mathematical thinking can be more complex than recursive functions. The first period of AI was dominated by questions of heuristic programming, which means the automated search for human problem solutions in trees of possible derivations, controlled and evaluated by heuristics. In 1962, these simulative procedures were generalized and enlarged for the so-called General Problem Solver (GPS), which was assumed to be the heuristic
framework of human problem solving. But GPS could only solve some insignificant problems in a formalized microworld. Thus, AI researchers tried to construct specialized systems of problem solving that use the specialized knowledge of human experts. The architecture of an “expert system” consists of the following components: knowledge base, problem-solving component (inference system), explanation component, knowledge acquisition, and dialogue component. Knowledge is the key factor in the performance of an expert system. The knowledge is of two types. The first type is the facts of the domain that are written in textbooks and journals in the field. Equally important to the practice of a field is the second type of knowledge, called heuristic knowledge, which is the knowledge of good practice and judgment in a field. It is experiential knowledge, the art of good guessing, that a human expert acquires over years of work. Expert systems are computational models of problem-solving procedures that need symbolic representation of knowledge. Unlike program-controlled serial computers, the human brain and mind are characterized by learning processes without symbolic representations. With respect to the architecture of von Neumann computers, an essential limitation derives from the sequential and centralized control, but complex dynamical systems like the brain are intrinsically parallel and self-organized. In their famous paper “A Logical Calculus of the Ideas Immanent in Nervous Activity” in 1943, Warren McCulloch and Walter Pitts offered a complex model of neurons as threshold logic units with excitatory and inhibitory synapses. Their “McCulloch–Pitts neuron” fires an impulse along its axon at time t + 1 if the weighted sum of its inputs at time t exceeds the threshold of the neuron. The weights are numbers corresponding to the neurochemical interactions of the neuron with other neurons.
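The firing rule just described can be written down directly. The following sketch is a minimal illustration: the function name `mcculloch_pitts` and the particular weights and thresholds chosen for the logic gates are assumptions for the example, not values from the 1943 paper.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 (fire) if the weighted sum of the binary inputs exceeds the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


# Illustrative threshold units with fixed weights (positive weights act as
# excitatory synapses, negative weights as inhibitory ones).
def logic_and(a, b):
    return mcculloch_pitts((a, b), weights=(1, 1), threshold=1.5)


def logic_or(a, b):
    return mcculloch_pitts((a, b), weights=(1, 1), threshold=0.5)


def logic_not(a):
    return mcculloch_pitts((a,), weights=(-1,), threshold=-0.5)
```

Because AND, OR, and NOT can each be realized by a single unit with fixed weights, networks of such units can be wired together to evaluate any finite Boolean expression; nothing in the units themselves changes with experience.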
But, in a McCulloch–Pitts network, the function of an artificial neuron is fixed for all time. McCulloch and Pitts succeeded in demonstrating that a network of formal neurons of their type could compute any finite logical expression. In order to make a neural computer capable of complex tasks, it is necessary to find procedures of learning. A learning procedure is nothing else than an adjustment of the many weights so that the desired output vector (e.g., a perception) is achieved. The first learning neural computer was Frank Rosenblatt’s “Perceptron” (1957). Rosenblatt’s neural computer is a feedforward network with binary threshold units and three layers. The first layer is a sensory surface called a “retina” that consists of stimulus cells (S-units). The S-units are connected with the intermediate layer by fixed weights that do not change during the learning process. The elements of the intermediate layer are
called associator cells (A-units). Each A-unit has a fixed weighted input of some S-units. In other words, some S-units project their output onto an A-unit. An S-unit may also project its output onto several A-units. The intermediate layer is completely connected with the output layer, the elements of which are called response cells (R-units). The weights between the intermediate layer and the output layer are variable and thus able to learn. The Perceptron was viewed as a neural computer that can classify a perceived pattern in one of several possible groups. In 1969, Marvin Minsky and Seymour Papert proved that Perceptrons cannot recognize and distinguish the connectivity of patterns, in general. The Perceptron’s failure is overcome by more flexible networks with supervised and unsupervised learning algorithms (e.g., Hopfield systems, Chua’s cellular neural networks, Kohonen’s self-organizing maps). In the age of globalization, communication networks such as the Internet are a tremendous challenge to AI. From a technical point of view, we need intelligent programs distributed in the nets. There are already more or less intelligent virtual organisms (agents), learning, self-organizing, and adapting to our individual preferences of information, to select our e-mails, to prepare economic transactions, or to defend against attacks of hostile computer viruses, like the immune system of our body. Although the capability to manage the complexity of modern societies depends decisively on progress in AI, we need computational ecologies with distributed AI to support human life and not human-like robots to replace it.
KLAUS MAINZER
See also Artificial life; Cell assemblies; Game of life; McCulloch–Pitts network; Neural network models; Perceptron
Further Reading
Lenat, D.B. & Guha, R.V. 1990. Building Large Knowledge-Based Systems, Reading, MA: Addison-Wesley
Mainzer, K. 2003. Thinking in Complexity. The Computational Dynamics of Matter, Mind, and Mankind, 4th edition, Berlin and New York: Springer
Minsky, M. & Papert, S.A. 1969. Perceptrons, Cambridge, MA: MIT Press
Minsky, M. 1985. The Society of Mind, New York: Simon & Schuster
Nilsson, N.J. 1982. Principles of Artificial Intelligence, Berlin and New York: Springer
Palm, G. (editor). 1984. Neural Assemblies. An Alternative Approach to Artificial Intelligence, Berlin and New York: Springer
ARTIFICIAL LIFE The term artificial life (AL) was coined in 1987 by Christopher Langton, who organized a workshop by that name in frustration with the lack of a forum
for discussing work on the computer simulation of biological systems. In Langton’s characterization, AL seeks to “contribute to theoretical biology by locating life-as-we-know-it within the larger picture of life-as-it-could-be” (Langton, 1989). In other words, AL aims to use computer simulation to synthesize alternative lifelike systems and, thus, find out which characteristics and principles are essential and which are merely contingent on how life happened to evolve on this planet. While other branches of biology may use simulation to understand specific mechanisms, AL is broader, more abstract, and highly interdisciplinary, in addition to implying certain ideological convictions. Chief among these is the assumption that life is a process, rather than a metaphysical substance or an atomic property of matter, which emerges in a bottom-up fashion from local interactions among suitably arranged populations of individually lifeless components. Opinions differ on whether such artificial systems may be logically equivalent to their natural counterparts and therefore really alive, or whether they are simply life-like simulacra. The former view is called the strong AL hypothesis, to associate it with a similarly functionalist standpoint known as strong artificial intelligence. However, the strong position in AL is considerably more tenable than its AI analog, which fails to distinguish between emergent and explicitly predetermined sources of behavior. Related to this “strong versus weak” argument is the unresolved question of whether life is an absolute category in nature at all or simply a useful way of grouping certain phenomena.
Early History Attempts to construct living or life-like artifacts from mechanical parts date back at least to the ancient Greeks, and we can presume that many of these experiments were motivated by questions similar to those posed today. Nevertheless, these early systems tended to employ the “if it quacks like a duck it is a duck” principle, and so were only superficially lifelike, rather than in the deeper sense presently hoped for. One of the most ingenious of these early automata was indeed a duck (or at least something that moved, ate, defecated, and quacked like one), built by Jacques de Vaucanson around 1730. Mary Shelley’s Frankenstein explores similar issues in a fictional context. Contrary to popular belief, Shelley’s monster was apparently not made from human body parts but from raw materials (cadavers are only mentioned with regard to Frankenstein’s research). These components were then imbued with the “spark of life” (which Shelley associates with electricity) in order to animate them. Her viewpoint was still partially vitalistic, but there is a link between Shelley and her
contemporary Charles Babbage, whose interpretation of intelligence (if not life itself) was more formalized, abstract, and mechanical. The mechanization of the mind continued with George Boole’s logical algebra and then the work of Alan Turing and John von Neumann on automating thought processes, which led directly to the invention of the digital computer and the beginnings of artificial intelligence. It was a similar inquiry into the abstract nature of life, as distinct from mind, that prompted von Neumann’s investigations into self-replicating machinery and Turing’s work on embryogenesis and “unorganized machines” (related to neural networks).
Formal Methods While complexity theory is concerned with the manner in which complex behavior arises from simple systems, AL is interested in how systems generate continually increasing levels of complexity. The most striking feature of living systems is their ability to self-organize and self-maintain, a property that Humberto Maturana and Francisco Varela have termed “autopoiesis” (Maturana & Varela, 1980). Evolution, embryogenesis, learning, and the development of social organizations are therefore the mechanisms of primary interest to AL researchers. The key features of AL models are the use of populations of semi-autonomous entities, the coupling of these through simple local interactions (no centralized control and little or no globally accessible information), and the consequent emergence of collective, persistent phenomena that require a higher level of description than that used to describe their substrate. Conventional mathematical notation is not usually appropriate for such distributed and labile systems, and the individual computer programs are often their own best description. There are, however, a number of frequently used abstract structures and formal grammars, including the following:
Cellular automata, in which the populations are arrays of finite state machines and interactions occur between neighboring cells according to simple rules. Under the right conditions, emergent entities (such as the glider in John Conway’s Game of Life) arise and persist on the surface of the matrix, interacting with other entities in computationally interesting ways.
Genetic algorithms, in which the populations are genomes in a gene pool and interactions occur between their phenotypes and some form of stressful environment. Natural selection (or sometimes human choice) drives the population to adapt and grow ever fitter, perhaps solving real practical problems in the process.
L-systems, or Lindenmayer systems, which provide a grammar for defining the growth of branching (often plant-like) physical structures, as insights into morphology and embryology.
Autonomous agents, which are composite code and data objects, representing mobile physical entities (robots, ants, stock market traders) embedded in a real or simulated environment. They interact locally by sensing their environment and receiving messages from other agents, giving rise to emergent phenomena of many kinds including cooperative social structures, nest-building, and collective problem-solving.
Autocatalytic networks, in which the populations are of simulated enzymes and the interactions are equivalent to catalysis. Such networks are capable of self-generation and a growth in complexity, mimicking the bootstrapping process that presumably gave rise to life on Earth.
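The glider mentioned under cellular automata gives a compact example of an emergent, persistent entity. The sketch below is illustrative: it uses the standard rules of Conway’s Game of Life on an unbounded grid, while the set-of-coordinates representation and the names `life_step`, `GLIDER`, and `run` are choices made for this example.

```python
from collections import Counter

# Live cells are stored as a set of (x, y) coordinates on an unbounded grid.
GLIDER = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}


def life_step(live):
    """Advance one generation of Conway's Game of Life.

    A dead cell with exactly 3 live neighbors is born; a live cell with
    2 or 3 live neighbors survives; every other cell is dead.
    """
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


def run(live, generations):
    """Apply life_step repeatedly and return the final set of live cells."""
    for _ in range(generations):
        live = life_step(live)
    return live
```

After four generations the five-cell pattern reappears displaced by one cell along each axis, so `run(GLIDER, 4)` equals the original set shifted by (1, 1). No rule mentions gliders or motion; the moving object exists only at a level of description above that of the individual cells.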
Current Status Like most new fields, AL has undergone cycles of hubris and doubt, innovation and stasis, and differentiation and consolidation. The listing of topics for the latest in the series of workshops started by Langton in 1987 is as broad as ever, although probably the bulk of AL work today (2004) is focused on artificial evolution. Most research concentrates on fine details, while the basic philosophical questions remain largely unanswered. Nevertheless, AL remains one of relatively few fields where one can ask direct questions about one’s own existence in a practical way.
STEVE GRAND
See also Catalytic hypercycle; Cellular automata; Emergence; Game of life; Hierarchies of nonlinear systems; Turing patterns
Further Reading
Adami, C. 1998. Introduction to Artificial Life, New York: Springer
Boden, M.A. (editor). 1996. The Philosophy of Artificial Life, Oxford and New York: Oxford University Press
Langton, C.G. (editor). 1989. Artificial Life, Redwood City, CA: Addison-Wesley
Levy, S. 1992. Artificial Life: The Quest for a New Creation, New York: Pantheon
Maturana, H.R. & Varela, F.J. 1980. Autopoiesis and Cognition: The Realization of the Living, Dordrecht and Boston: Reidel
ASSEMBLY OF NEURONS See Cell assemblies
ATMOSPHERIC AND OCEAN SCIENCES The earliest works on the study of the atmosphere and ocean date back to Aristotle and his student Theophrastus around 350 BC. The subject progressed further through Torricelli’s invention of the barometer in 1643, Boyle’s law in 1657, and Celsius’s temperature scale of 1742 (the thermometer itself being due to Galileo in 1607). The first rigorous theoretical model for the study of the atmosphere was proposed by
Vilhelm Bjerknes in 1904, following which many scientists began to apply fundamental physics to the atmosphere and ocean. The advent of these theoretical approaches and the invention of efficient communication technologies in the mid-20th century made numerical weather prediction feasible, an effort encouraged in particular by Lewis Fry Richardson and John von Neumann in 1946, using the differential equations proposed by Bjerknes. Today, advanced numerical modeling and observational techniques exist, which are constantly being developed further in order to understand and study the complex nonlinear dynamics of the atmosphere and ocean. This overview article summarizes the governing equations used in atmospheric and ocean sciences, features of atmosphere-ocean interaction, and processes for an idealized geometry and structure with reference to a one-dimensional vertical scale (Figure 1), a two-dimensional vertically averaged scale (Figures 3(a) and 4(a)), a two-dimensional zonally averaged meridional scale (Figures 3(b) and 4(b)), and a three-dimensional scale (Figure 2), and regimes of interacting systems (such as El Niño and Southern Oscillation and North Atlantic Oscillation) (Figures 5–7). The entry serves as an introduction to the many nonlinear processes taking place (for example, chaos, turbulence) and provides a few illustrative examples of self-organizing coherent structures of the nonlinear dynamics of the atmosphere and ocean.
Governing Equations The combined atmosphere and ocean system can be regarded as a huge volume of fluid resting on a rotating oblate spheroid with varying surface topography moving through space, with an interface (which in general is discontinuous) between two fluid masses of differing densities. This coupled atmosphere-ocean system is driven by energy input through solar radiation (see Figure 1), gravity (for example, through interaction with other stellar bodies such as the Sun and Moon, i.e., tides), and inertia. The entire fluid is described by equations for conserved quantities such as momentum, mass (of air, water vapor, water, salt), and energy together with equations of state for air and water (See Fluid dynamics; Navier–Stokes equation). The movement of large water or air masses in a rotating reference frame adds to the complexity of motions, due to the presence of Coriolis forces, introduced by Coriolis in 1835. Atmosphere-ocean interactions can be defined as an exchange of momentum, heat, and water (vapor and its partial masses: salts, carbon, oxygen, nitrogen, etc.) between air and water masses. The governing equations in the Euler formulation and a cartesian coordinate system are given by:
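(i) The conservation of momentum
$$\frac{d\mathbf{u}}{dt} + 2\boldsymbol{\Omega} \times \mathbf{u} = -\frac{1}{\rho}\nabla p - \mathbf{g} + \mathbf{F}_{\text{ext}} + \mathbf{F}_{\text{fric}}, \tag{1}$$
where the second term $2\boldsymbol{\Omega} \times \mathbf{u}$ is the term due to the Coriolis force ($\boldsymbol{\Omega}$ is the angular velocity of the Earth; $|\boldsymbol{\Omega}| = 7.29 \times 10^{-5}\ \text{s}^{-1}$), and forces due to a pressure gradient $\nabla p$, gravity ($|\mathbf{g}| = 9.81\ \text{m s}^{-2}$) and external ($\mathbf{F}_{\text{ext}}$) as well as frictional ($\mathbf{F}_{\text{fric}}$) forces are included. Note that the operator d/dt is defined by
$$\frac{d\mathbf{v}}{dt} = \left( \frac{\partial}{\partial t} + \mathbf{v} \cdot \nabla \right) \mathbf{v}.$$
(ii) The conservation of mass (or continuity equation)
$$\frac{1}{\rho}\frac{d\rho}{dt} + \nabla \cdot \mathbf{u} = 0. \tag{2}$$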
Note that there are alternative formulations such as the Lagrangian and impulse-flux form for these equations, and cartesian coordinate systems can be mapped to different geometries such as spherical coordinates by appropriate transformations.
(iii) The conservation of energy (First Law of Thermodynamics) and Gibbs’s equation (Second Law of Thermodynamics)
$$\frac{dQ}{dt} = \frac{d\varepsilon}{dt} + p\,\frac{d\alpha}{dt}, \qquad \frac{d\eta}{dt} = \frac{1}{T}\frac{d\varepsilon}{dt} + \frac{p}{T}\frac{d\alpha}{dt} - \sum_i \frac{\mu_i}{T}\frac{d\gamma_i}{dt}, \tag{3}$$
where Q is the heat supply (sensible, latent, and radiative heat fluxes; see Figure 1), T is the temperature, ε the internal energy and α the specific volume (α = 1/ρ), η the entropy, $\mu_i$ the chemical potentials, and $\gamma_i$ the partial masses. The conservation of energy states in brief that the change in heat is balanced by a change in internal energy and mechanical work performed, and Gibbs’s equation determines the direction of an irreversible process, relating entropy to a change in internal energy, volume, and partial masses.
(iv) The conservation of partial masses of water and air, that is, salinity for water, where all constituents are represented as salts and water vapor for air, yields equations similar to (2)
$$\frac{1}{\rho_v}\frac{d\rho_v}{dt} + \nabla \cdot \mathbf{u} = W_v \quad \text{and} \quad \frac{d(\rho s)}{dt} + \rho s\,\nabla \cdot \mathbf{u} = W_s, \tag{4}$$
where $\rho_v$ is the density of water vapor, s the specific salinity (gram salts per gram water), and $W_v$, $W_s$ contain possible source and sink terms as well as the effect of molecular diffusion in terms of the concentration flux density S (−∇·S) and possible phase changes.
Figure 1. Sketch of the vertical structure of the atmosphere–ocean system and radiation balance and processes in the global climate system. Adapted from National Academy of Sciences (1975). Note that lengths are not to scale and temperatures indicate only global averages.
(v) The equation of state for a mixture of salts and gases for air and water, whose constituent concentrations are virtually constant in the atmosphere and ocean is
$$p \approx \rho R T (1 + 0.6078\,q), \tag{5}$$
where R is the gas constant for dry air ($R = 287.04\ \text{J kg}^{-1}\ \text{K}^{-1}$) and $q = \rho_v/\rho$ is the specific humidity. Similarly, the equation of state for near incompressible water is
$$\rho \approx \rho_0 [1 - \alpha (T - T_0) + \beta (S - S_0)], \tag{6}$$
where $\rho_0$, $T_0$, and $S_0$ are reference values for density, temperature, and salinity ($\rho_0 = 1028\ \text{kg m}^{-3}$, $T_0 = 283$ K (= 10°C), $S_0 = 35$‰), and α and β are the coefficients of thermal expansion and saline contraction ($\alpha = 1.7 \times 10^{-4}\ \text{K}^{-1}$, $\beta = 7.6 \times 10^{-4}$), see Krauss (1973); Cushman-Roisin (1993). Equations detailed in (i)–(v) form a set of hydrothermodynamic equations for the atmosphere-ocean system to which various approximations and scaling limits can be applied. Among them are the shallow-water equations, primitive equations, the Boussinesq and anelastic approximation, quasigeostrophic, and semigeostrophic equations and variants or mixtures of these. These equations have to be solved with appropriate boundary conditions and conditions at the air-sea interface; for details refer to Krauss (1973), Gill (1982) and Kraus & Businger (1994). For studies of the upper atmosphere, further equations for the geomagnetic field can also be taken into account (Maxwell’s equations).
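The two equations of state (5) and (6) are simple enough to evaluate directly. The sketch below is a minimal illustration using the constants quoted above; the function names `moist_air_pressure` and `seawater_density` are hypothetical, and salinity is assumed to be expressed in parts per thousand so that it matches the reference value of 35‰.

```python
# Constants as quoted in the text for equations (5) and (6).
R_DRY = 287.04      # gas constant for dry air, J kg^-1 K^-1
RHO_0 = 1028.0      # reference seawater density, kg m^-3
T_0 = 283.0         # reference temperature, K
S_0 = 35.0          # reference salinity, parts per thousand (assumed unit)
ALPHA = 1.7e-4      # coefficient of thermal expansion, K^-1
BETA = 7.6e-4       # coefficient of saline contraction, per part per thousand (assumed unit)


def moist_air_pressure(rho, temperature, q):
    """Equation (5): pressure (Pa) of moist air from density (kg m^-3),
    temperature (K), and specific humidity q (kg water vapor per kg air)."""
    return rho * R_DRY * temperature * (1.0 + 0.6078 * q)


def seawater_density(temperature, salinity):
    """Equation (6): linearized density (kg m^-3) of seawater from
    temperature (K) and salinity (parts per thousand)."""
    return RHO_0 * (1.0 - ALPHA * (temperature - T_0) + BETA * (salinity - S_0))
```

At the reference state the linearized density returns $\rho_0$ exactly; warming lowers the density and added salt raises it, which is the sign convention behind thermohaline effects. Both functions are linearizations and should not be expected to hold far from the reference values.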
Atmospheric Structure and Circulation In the vertical dimension, several atmospheric layers can be differentiated (see Figure 1). Figure 2 gives the length and time scales of typical atmospheric processes. From sea level up to about 2 km is the atmospheric boundary layer, characterized by momentum, heat, moisture, and water transfer between the atmosphere and its underlying surface. Above the boundary layer is the troposphere (Greek, tropos meaning turn, change), which extends to about 10 km height, constitutes most of the total mass of the atmosphere, and is largely in hydrostatic balance, with temperature decreasing with height. Above the troposphere lies the stratosphere, which contains the ozone layer and in which temperatures rise throughout. The mesosphere, which is bounded by the stratopause (about 50 km height) below and mesopause (about 85 km height) above, is a layer of very thin air where temperatures drop to extreme lows. Above the mesopause, temperatures increase again throughout the thermosphere (from about 85 km to 700 km), the largest layer of the atmosphere, where the ionosphere is located (between about 100 km and 300 km). The ionosphere contains ionized atoms and free electrons and permits the reflection of electromagnetic waves. Above the thermosphere is the exosphere, which is the outermost layer of the atmosphere and the transition region between the atmosphere and outer space, the magnetosphere in particular, where atoms can escape into space beyond the so-called escape velocity and where the Van Allen belt is situated.
[Figure 2 graphic not reproduced: panels "Atmosphere" and "Ocean"; horizontal axis: characteristic horizontal scale L (m); vertical axis: characteristic time scale T (sec). Labels of Figure 3(a), the climate regions of an idealized continent, were interleaved with this graphic.]
Figure 2. Schematic logarithmic time and horizontal length scales of typical atmospheric and oceanic phenomena. Note Richardson's L ∝ T^(3/2) relation; CE stands for circumference of the Earth. Modified from Lettau (1952), Smagorinsky (1974), and World Meteorological Organization (1975).
[Figure 3 graphic not reproduced: (a) climate regions of an idealized continent; (b) meridional height section showing the polar front, trade winds, doldrums, ITCZ, westerlies, Hadley cell, and tropical cyclone.]

Figure 3. Sketch of the near-surface climate and atmospheric circulation of the Earth with an idealized continent. (a) Averaged isotherms of the coldest month (dash-dotted, −3 °C, 18 °C) and the warmest month (solid, 0 °C, 10 °C), and the boundaries of the periodic dry season (α) and of the dry climate (β). The following climate regions are indicated: wet equatorial climate (Af), tropical wet/dry climate (Aw), desert climate (BW), steppe climate (BS), sinic climate (Cw), Mediterranean climate (Cs), humid subtropical climate (Cf), humid continental climate (Df), continental subarctic climate (Dw), tundra climate (ET), and snow and ice climate (EF). Modified from Köppen (1923). (b) The zonal mean jet streams (primary circulation) and mass overturning (secondary circulation) in a meridional height section, the subtropical highs (H) and subpolar lows (L), polar easterlies, westerlies, polar front, trade winds, and intertropical convergence zone (ITCZ). Front symbols in the figure denote a cold front and a warm front. Adapted from Palmén (1951), Defant and Defant (1958), and Hantel in Bergmann & Schäfer (2001).

A low (high) in meteorology refers to a system of low (high) pressure, a closed area of minimum (maximum) atmospheric pressure (closed isobars, or contours of constant pressure) on a constant height chart. A low (high) is always associated with cyclonic (anticyclonic) circulation and is thus also called a cyclone (anticyclone). Anticyclonic means clockwise in the Northern Hemisphere (and counterclockwise in the Southern Hemisphere). Cyclonic means counterclockwise in the Northern Hemisphere (and clockwise in the Southern Hemisphere). At zeroth order, pressure gradient forces and Coriolis forces balance (geostrophic balance), so that air flows along isobars instead of across them (in the direction of the pressure gradient). A front is a discontinuous interface or a region of strong gradients between two air masses of differing densities or temperatures, thus encouraging conversion of potential into kinetic energy (examples are the polar front, arctic front, cold front, and warm front). Hurricanes and typhoons (local names for tropical cyclones) develop over oceans and transport large amounts of heat from low to mid and high latitudes. Little is known about the initial stages of their formation, although they are triggered by small low-pressure systems in the Intertropical Convergence Zone (See Hurricanes and tornadoes). Because of their strong winds, cyclones are particularly active in inducing
[Figure 4 graphic not reproduced: (a) idealized ocean basin with subpolar gyre, subtropical gyres, equatorial currents, and the Antarctic circumpolar current; (b) meridional depth section from 90°N to 90°S showing the subtropical convergence, Antarctic polar front, AAIW, NADW, and AABW.]

Figure 4. Sketch of the oceanic circulation of an idealized basin: (a) the global wind-induced distribution of ocean currents (primary circulation) and (b) the zonal mean thermohaline circulation (secondary circulation) in a meridional depth section showing upper, intermediate, deep, and bottom water masses. NE denotes the north equatorial, EC the equatorial, and SE the south equatorial current. PC stands for polar current, ACC for Antarctic circumpolar current, WD for west drift, AAIW for Antarctic intermediate waters, NADW for North Atlantic deep water, and AABW for Antarctic bottom water. Adapted from Hasse and Dobson (1986).
upwelling and wind-driven surface water transport. Extratropical cyclones are frontal cyclones of mid to high latitudes (see Figures 2 and 3). Meteorology and oceanography are concerned with understanding, predicting, and modeling the weather, climate, and oceans due to their fundamental socioeconomic and environmental impact. In meteorology, one distinguishes between short (1–3 days) and medium-range (4–10 days) numerical weather prediction models (NWPs) for the atmosphere and general circulation models (GCMs, See General circulation models of the atmosphere). While NWPs for local and regional weather prediction are usually not coupled to ocean models, GCMs are global three-dimensional complex coupled atmosphere-ocean models (which even include the influence of land masses), used to study global climate change, modeling radiation, photochemistry, transfer of heat, water vapor, momentum,
greenhouse gases, clouds, ocean temperatures, and ice boundaries. The atmosphere-ocean interface couples the “fast” processes of the atmosphere with the comparably “slow” processes of the ocean through evaporation, precipitation, and momentum interaction. GCMs are validated using statistical techniques and correlated to the actual climate evolution. Additionally, the application of GCMs to different planetary atmospheres, for example, on Mars and Jupiter, leads to a greater understanding of the planet’s history and environment. The complexity of the dynamics of the atmosphere and ocean is largely due to the intrinsic coupling between these two large masses at the air-sea interface.
Ocean
Ocean circulation is forced by tidal forces due to gravitational attraction (which also force atmospheric tides); by wind stress, that is, shear forces acting on the air-sea interface; and by external, mainly solar, radiation, which penetrates the sea surface and affects the heat budget and, through evaporation, the water mass. The primary sources of tidal forcing, on which the earliest work was undertaken by Pierre-Simon Laplace in 1778, are the Moon and the Sun. One distinguishes between diurnal, semidiurnal, and mixed-type tides. In the ocean, one distinguishes between two types of circulation: surface (wind-driven) circulation and deep (thermohaline) circulation. Separating the two is the thermocline, a thin layer of strong gradients of temperature, salinity, and density, which acts as an interface between the two types of circulation. Surface circulation, reaching down to about 400 m in depth, is forced by the prominent westerly winds in the midlatitudes and the trade winds in the tropical regions (see Figures 3 and 4), which are both forced by solar heating and Coriolis forces; solar heating also leads to expansion of water near the equator and thus decreased density, but increased salinity due to evaporation. An example of a wind-driven surface current is the Gulf Stream in the North Atlantic. The surface wind stress, solar heating, Coriolis forces, and gravity lead to the creation of large gyres in all ocean basins, with clockwise (anticyclonic) circulation in the Northern Hemisphere and counterclockwise circulation in the Southern Hemisphere. The North Atlantic Gyre, for example, consists of four currents: the north equatorial current, the Gulf Stream, the North Atlantic current, and the Canary current. Ekman transport, which results from the combination of wind stress and Coriolis forces, leads to a convergence of water masses in the center of such gyres, which increases the sea surface elevation.
The layer of Ekman transport can be 100–150 m deep. By conservation of mass, Ekman transport also leads to upwelling on the western (eastern) coasts for winds from the north (south) in the Northern (Southern) Hemisphere. As a consequence, nutrient-rich
[Figure 5 graphic not reproduced: world map (90°W, 0°, 90°E, 180°) marking regions of cold deep water creation and warm surface water creation, the warm, less salty surface circulation, and the cold, saline deep circulation.]

Figure 5. Sketch of the global conveyor belt through all oceans, showing the cold saline deep circulation, the warm, less salty surface circulation, and the primary regions of their creation. Note that this circulation is only characteristic of the actual global circulation. Adapted from Broecker (1987).
deep water is brought to the surface. With the opposite wind direction, Ekman transport acts to induce downwelling. Another important combination of forces is the balance of Coriolis forces and pressure gradient forces (due to gravity), which is called geostrophic balance and leads to the movement of mass along isobars instead of across them (geostrophic current), as in the atmosphere. The boundary currents along the eastern and western coastlines are the major geostrophic currents in a gyre. Owing to the Earth's rotation, the current on the western side of the gyre is stronger than that on the eastern side, an effect called western intensification. Deep circulation makes up 90% of the total water mass and is driven by gravity acting on density differences; density in turn is a function of temperature and salinity. High-density deep water originates where the sea surface is cooled strongly in the polar regions, sinking to large depths as a density current, a strongly nonlinear phenomenon. When the warm Gulf Stream waters, which have increased salinity due to excessive evaporation in the tropics, move north with the North Atlantic Gyre, they are cooled by Arctic winds from the north and sink to great depths, forming the high-density Atlantic deep waters (see Figure 5). The downward transport of water is balanced by upward transport in low- and mid-latitude regions.
The most prominent example of the interaction between atmospheric and ocean dynamics is the global conveyor belt, which links the surface (wind-driven) and deep (thermohaline) circulation to the atmospheric circulation. The global conveyor belt is a global circulatory system of distinguishable and recognizable water masses traversing all oceans (see Figure 5). The water masses of this global conveyor belt transport heat and moisture, contributing to the climate globally. In Earth's history, the global conveyor belt has experienced flow reversals and perturbations leading to changes in the global circulatory system. The rather recent anthropogenic impact on climate and oceans through greenhouse gas emissions has the potential to create instability in this large-scale dynamical system, which could alter Earth's climate and have devastating environmental and agricultural effects.
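The geostrophic balance invoked earlier in this section for both atmosphere and ocean can be made concrete with a small calculation. The sketch below uses the standard textbook relations u_g = −(1/ρf) ∂p/∂y and v_g = (1/ρf) ∂p/∂x with Coriolis parameter f = 2Ω sin φ; these formulas, the function names, and the sample numbers are assumptions brought in for illustration and do not appear in the entry itself.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate, rad s^-1


def coriolis_parameter(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def geostrophic_velocity(dp_dx, dp_dy, rho, lat_deg):
    """Return (u_g, v_g) in m/s for pressure gradients in Pa/m.

    Convention (assumed here): x points east, y points north, and
    u_g = -(1 / (rho * f)) * dp/dy,  v_g = (1 / (rho * f)) * dp/dx.
    """
    f = coriolis_parameter(lat_deg)
    if abs(f) < 1e-12:
        raise ValueError("geostrophic balance is undefined at the equator (f = 0)")
    return -dp_dy / (rho * f), dp_dx / (rho * f)


if __name__ == "__main__":
    # Hypothetical ocean case: pressure falls northward by 0.1 Pa per metre
    # at 45 degrees north, roughly a 1 m sea-level drop over 100 km.
    u_g, v_g = geostrophic_velocity(0.0, -0.1, 1028.0, 45.0)
    print(f"u_g = {u_g:.2f} m/s (eastward), v_g = {v_g:.2f} m/s")
```

The flow comes out parallel to the isobars (here purely zonal) with low pressure to its left in the Northern Hemisphere; repeating the call with a negative latitude reverses the direction.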
ENSO and NAO
Another example of atmosphere-ocean coupling is the combination of El Niño and the Southern Oscillation (ENSO). The El Niño ocean current (and associated
[Figure 6 and Figure 7 graphics not reproduced. Figure 6: panels "Normal/La Niña" and "El Niño", each showing the atmosphere (Walker circulation), the Pacific Ocean with warm and cold SST, and the thermocline. Figure 7: panels "NAO+" and "NAO−" over the North Atlantic Ocean, showing warm and cold SST and sea ice.]

Figure 6. Sketch of the El Niño in the tropical Pacific, showing a reversal in (trade) wind direction from easterlies to westerlies during an El Niño period, bringing warmer water (warm corresponds to a positive sea-surface temperature [SST]) close to the South American coast and displacing the equatorial thermocline downwards. Note the change in atmospheric tropical convection and associated heavy rainfall. After McPhaden, NOAA/TAO (2002) and Holton (1992).
wind and rain change) is named from the Spanish for Christ Child, due to its annual occurrence off the South American coast around Christmas, and may also be sensitive to anthropogenic influence (see Figure 6). The Southern Oscillation occurs as a 2–5-year periodic reversal in the east-west pressure gradient associated with the present equatorial wind circulation, called Walker circulation, across the Pacific, leading to a reversal in wind direction and changes in temperature and precipitation. The easterly wind in the West Pacific becomes a westerly. As a consequence, the strong trade winds are weakened, affecting climate globally (e.g., crop failures in Australia, flooding in the USA, and the monsoon in India). The Southern Oscillation in turn leads to large-scale oceanic fluctuations in the circulation of the Pacific Ocean and in sea-surface temperatures, which are called El Niño. The interannual variability, though, is not yet fully understood; consideration of a wider range of tropical and extratropical influences is needed. A counterpart to the ENSO in the Pacific is the North Atlantic Oscillation (NAO), which is essentially an oscillation in the pressure difference across the North Atlantic and is described further in Figure 7.

Figure 7. Sketch of the North Atlantic Oscillation (NAO) during the Northern Hemisphere winter season. Positive NAO (NAO+) shows an unusually strong subtropical high-pressure center and subpolar low, resulting in increased wind strengths and storms crossing the Atlantic toward northern Europe. NAO+ is associated with a warm wet winter in Europe and a cold dry winter in North America. Central America experiences mild wet winter conditions. Negative NAO (NAO−) shows a weaker subtropical high and subpolar low, resulting in lower wind speeds and weaker storms crossing the Atlantic toward southern Europe, and receded sea ice masses around Greenland. NAO− is associated with cold weather in northern Europe and moist air in the Mediterranean. Central America experiences colder climates and more snow. Adapted from Wanner (2000).
Monsoons
The monsoons (derived from the Arabic mausim, meaning season or shift in wind) are seasonally reversing winds and one of the most pertinent features of the global atmospheric circulation. The best-known examples are the monsoons over the Indian Ocean and, to some extent, the western Pacific Ocean (tropical region of Australia), the western coast of Africa, and the Caribbean. Monsoons are characterized by wet summer and dry winter seasons, associated with strong winds and cyclone formation. They occur due to differing thermal characteristics of the land and sea surfaces. Land, having a much smaller heat capacity than the ocean, emits heat from solar radiation more easily, leading to upward heat (cumulus) convection. In the summer season, this leads to a pressure gradient and thus wind from the land to the ocean in the upper layers of the atmosphere and a subsequent mass-conserving flow of moisture-rich air from the sea back inland at lower levels. This leads to monsoonal rains, increased latent heat release, and intensified monsoon circulation. During the winter season, the opposite of the summer monsoon takes place, although it is less pronounced, since the thermal gradient between the land and sea is reversed. The winter monsoons thus lead to precipitation over the sea and cool dry land surfaces.
ANDREAS A. AIGNER AND KLAUS FRAEDRICH
See also Fluid dynamics; General circulation models of the atmosphere; Hurricanes and tornadoes; Lorenz equations; Navier–Stokes equation
Further Reading
Apel, J. 1989. Principles of Ocean Physics, London: Academic Press
Barry, R.G., Chorley, R.J. & Chase, T. 2003. Atmosphere, Weather and Climate, 8th edition, London and New York: Routledge
Bergmann, K., Schaefer C. & von Raith, W. 2001. Lehrbuch der Experimentalphysik, Band 7, Erde und Planeten, Berlin: de Gruyter
Cushman-Roisin, B. 1993. Introduction to Geophysical Fluid Dynamics, Englewood Cliffs, NJ: Prentice–Hall
Defant, A. & Defant, Fr. 1958. Physikalische Dynamik der Atmosphäre, Frankfurt: Akademische Verlagsgesellschaft
Gill, A. 1982. Atmosphere–Ocean Dynamics, New York: Academic Press
Hasse, L. & Dobson, F. 1986. Introductory Physics of the Atmosphere and Ocean, Dordrecht and Boston: Reidel
Holton, J.R. 1992. An Introduction to Dynamic Meteorology, 3rd edition, New York: Academic Press
Kraus, E.B. & Businger, J.A. 1994. Atmosphere–Ocean Interaction, New York: Oxford University Press, and Oxford: Clarendon Press
Krauss, W. 1973. Dynamics of the Homogeneous and Quasihomogeneous Ocean, vol. I, Berlin: Bornträger
LeBlond, P.H. & Mysak, L.A. 1978. Waves in the Ocean, Amsterdam: Elsevier
Lindzen, R.S. 1990. Dynamics in Atmospheric Physics, Cambridge and New York: Cambridge University Press
Pedlosky, J. 1986. Geophysical Fluid Dynamics, New York: Springer
Philander, S.G. 1990. El Niño, La Niña, and the Southern Oscillation, New York: Academic Press
ATTRACTOR NEURAL NETWORKS
Neural networks with feedback can have complex dynamics; their outputs are not related in a simple way to their inputs. Nevertheless, they can perform computations by converging to attractors of their dynamics. Here, we analyze how this is done for a simple example problem: associative memory, following the treatment by Hopfield (1984) (see also Hertz et al., 1991, Chapters 2 and 3). Let us assume that input data are fed into the network by setting the initial values of the units that make it up (or a subset of them). The network dynamics then lead to successive changes in these values. Eventually, the network will settle down into an attractor, after which the values of the units (or some subset of them) give the output of the computation. The associative memory problem can be described in the following way: there is a set of p patterns to be stored. Given, as input, a pattern that is a corrupted version of one of these, the attractor should be a fixed point as close as possible to the corresponding uncorrupted pattern. We focus on networks described by systems of differential equations such as
\[ \tau_i \frac{du_i}{dt} + u_i(t) = \sum_{j \neq i} w_{ij}\, g[u_j(t)]. \qquad (1) \]
Here, u_i(t) is the net input to unit i at time t and g(·) is a sigmoidal activation function (g′ > 0), so that V_i = g(u_i) is the value (output) of unit i. The connection weight to unit i from unit j is denoted w_ij, and τ_i is the relaxation time. We can also consider discrete-time systems governed by
\[ V_i(t+1) = g\Big[ \sum_j w_{ij} V_j(t) \Big]. \qquad (2) \]
Here, it is understood that all units are updated simultaneously. In either case, the “program” of such a network is its connection weights w_ij. In general, three kinds of attractors are possible: fixed point, limit cycle, and strange attractor. There are conditions under which the attractors will always be fixed points. For nets described by the continuous dynamics of Equation (1), a sufficient (but not necessary) condition is that the connection weights be symmetric: w_ij = w_ji. General results about the stability of recurrent nets were proved by Cohen & Grossberg (1983). They showed, for dynamics (1), that there is a Lyapunov function, that is, a function of the state variables u_i, which always decreases under the dynamics, except for special values of the u_i at which it does not change. These values are fixed points. For values of the u_i close to such a point, the system will evolve either toward it (an attractor) or away from it (a repellor). For almost all starting states, the dynamics will end at one of the attracting fixed points. Furthermore, these are the only attractors. We treat the case g(u) = tanh(βu) and consider the ansatz
\[ w_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} \xi_i^{\mu} \xi_j^{\mu}. \qquad (3) \]
That is, for each pattern, there is a contribution to the connection weight proportional to the product of sending (ξ_j^μ) and receiving (ξ_i^μ) unit activities when the network is in a stationary state V_i = ξ_i^μ. This is just the form of synaptic strength proposed by Hebb (1949) as the basis of animal memory, so this ansatz is sometimes called a Hebbian storage prescription. To see how well the network performs this computation, we examine the fixed points of (1) or (2), which solve
\[ V_i = \tanh\Big( \beta \sum_j w_{ij} V_j \Big). \qquad (4) \]
The quality of retrieval of a particular stored pattern ξ_i^μ is measured by the quantity m^μ = N^{−1} Σ_i ξ_i^μ V_i. Using (4), with the weight formula (3), we look for solutions in which the configuration of the network is correlated with only one of the stored patterns, that is, just one of the m^μ's is not zero. If the number of stored patterns p ≪ N, we find a simple equation for m^μ:
\[ m^{\mu} = \tanh(\beta m^{\mu}). \qquad (5) \]
This equation has nontrivial solutions whenever the gain β > 1, and for β large, m^μ → 1, indicating perfect retrieval. If the gain is high enough, there are other attractors in addition to the ones we have tried to program into the network with the choice (3), but by keeping the gain between 1 and 2.17 we can limit the attractor set to the desired states. When p is of the same order as N, the analysis is more involved. We define a parameter α = p/N. For small α, the overlaps m^μ between the stored patterns and the fixed points are less than, but still close to, 1. However, there is a critical value of α, α_c(β), above which there are no longer fixed points close to the patterns to be stored and the memory breaks down catastrophically. One finds α_c(1) = 0 and, in the limit β → ∞, α_c(β) → 0.14. Thus, attractor computation works in this system over a wide range of the model parameters α and β. It can be shown to be robust with respect to many other variations, including dilution (random removal of connections), asymmetry (making some of the w_ij ≠ w_ji), and quantization or clipping of the weight values. Its breakdown at the boundary α_c(β) is a collective effect like a phase transition in a physical system. The weight formula (3) was only an educated guess. It is possible to obtain better weights, which reduce the crosstalk and increase α_c, by employing systematic learning algorithms. It is also possible to extend the above-described model to store pattern sequences by including suitable delays in the discrete-time dynamics (2). It appears that attractor networks play a role in computations in the brain. One example of current interest is working memory: some neurons in the prefrontal cortex that are selectively sensitive to a particular visual stimulus exhibit continuing activity after the stimulus is turned off, even though the animal sees other stimuli. Thus, they seem to be involved in the temporary storage of visual patterns. Computational network models based on the simple concepts described above (Renart et al., 2001) are able to reproduce the main features seen in recordings from these neurons.
JOHN HERTZ
See also Cellular nonlinear networks; McCulloch–Pitts network; Neural network models

Further Reading
Amit, D.J. 1989. Modeling Brain Function, Cambridge and New York: Cambridge University Press
Cohen, M. & Grossberg, S. 1983. Absolute stability of global pattern formation and parallel memory storage by competitive neural networks. IEEE Transactions on Systems, Man and Cybernetics, 13: 815–826
Hebb, D.O. 1949. The Organization of Behavior, New York: Wiley
Hertz, J.A., Krogh, A.S., & Palmer, R.G. 1991. Introduction to the Theory of Neural Computation, Redwood City, CA: Addison-Wesley
Hopfield, J.J. 1984. Neurons with graded responses have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences USA, 81: 3088–3092
Renart, A., Moreno, R., de la Rocha, J., Parga, N. & Rolls, E.T. 2001. A model of the IT-PF network in object working memory which includes balanced persistent activity and tuned inhibition. Neurocomputing, 38–40: 1525–1531
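The retrieval process described in the attractor neural network entry above can be tried out numerically. The following is a minimal sketch, not the author's implementation: it builds the Hebbian weights of Equation (3), iterates the synchronous dynamics of Equation (2) with g(u) = tanh(βu) from a corrupted pattern, and separately solves the mean-field equation (5) by fixed-point iteration. The network size, the number of patterns, the gain β = 2 (chosen inside the window 1 < β < 2.17 mentioned in the entry), the 20% corruption level, the zeroed self-couplings w_ii, and all function names are assumptions made for illustration.

```python
import math
import random


def hebbian_weights(patterns):
    """Eq. (3): w_ij = (1/N) * sum_mu xi_i^mu * xi_j^mu, with w_ii set to 0
    (dropping self-couplings is an assumption, matching the j != i sum in Eq. (1))."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += xi[i] * xi[j] / n
    return w


def update(w, v, beta):
    """One synchronous step of Eq. (2) with g(u) = tanh(beta * u)."""
    return [math.tanh(beta * sum(wij * vj for wij, vj in zip(row, v)))
            for row in w]


def overlap(xi, v):
    """Retrieval quality m = (1/N) * sum_i xi_i * V_i."""
    return sum(a * b for a, b in zip(xi, v)) / len(xi)


def mean_field_overlap(beta, iterations=200):
    """Solve Eq. (5), m = tanh(beta * m), by fixed-point iteration from m = 1."""
    m = 1.0
    for _ in range(iterations):
        m = math.tanh(beta * m)
    return m


def demo(n=200, p=5, beta=2.0, flip_fraction=0.2, steps=20, seed=0):
    """Store p random +/-1 patterns, corrupt the first one, and let the net relax."""
    rng = random.Random(seed)
    patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    w = hebbian_weights(patterns)
    v = list(patterns[0])
    for i in rng.sample(range(n), int(flip_fraction * n)):
        v[i] = -v[i]  # flip a randomly chosen subset of units
    m_start = overlap(patterns[0], v)
    for _ in range(steps):
        v = update(w, v, beta)
    return m_start, overlap(patterns[0], v)


if __name__ == "__main__":
    m_start, m_end = demo()
    print(f"overlap with stored pattern: before {m_start:.2f}, after {m_end:.3f}")
    print(f"mean-field overlap for beta = 2: {mean_field_overlap(2.0):.4f}")
```

With p much smaller than N, the overlap reached by the simulated network should lie close to the mean-field value; for β below 1 the fixed-point iteration collapses to m = 0, in line with the statement that nontrivial solutions require β > 1.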
ATTRACTORS
A wide variety of problems arising in physics, chemistry, and biology can be recast within the framework of dynamical systems. A dynamical system is made up of two parts: the phase space, which consists of all possible configurations of the physical system, and the “dynamics,” a rule describing how the state of the system changes over time. The fundamental insight of the theory is that some problems, which initially appear extremely complicated, can be greatly simplified if we are prepared to concentrate on their long-term behavior, that is, what happens eventually.
This idea finds mathematical expression in the concept of an attractor. The simplest possibility is that the system settles down to a constant state (e.g., a pendulum damped by air resistance will end up hanging vertically downward). In the phase space, this corresponds to an attractor that is a single “fixed point” for the dynamics. If the system settles down to a repeated oscillation, then this corresponds to a “periodic orbit,” a closed curve in the phase space. For two coupled ordinary differential equations (ODEs), it is a consequence of the Poincaré–Bendixson Theorem that these fixed points and periodic orbits are essentially the only two kinds of attractors that are possible (see Hirsch & Smale, 1974 for a more exact statement). In higher dimensions, it is possible for the limiting behavior to be quasi-periodic with k different frequencies, corresponding to a k-torus in the phase space (cf. Landau's picture of turbulence as in Landau & Lifschitz, 1987). However, with three or more coupled ODEs (or in one-dimensional maps), the attractor can be an extremely complicated object. The famous “Lorenz attractor” was perhaps the first explicit example of an attractor that is not just a fixed point or (quasi) periodic orbit. Edward Lorenz highlighted this in the title of his 1963 paper, “Deterministic Nonperiodic Flow.” The phrase “strange attractor” was coined by Ruelle & Takens (1971) for such complicated attracting sets. These attractors, and the chaotic dynamics associated with them, have been the focus of much attention, particularly in relation to the theory of turbulence (the subject of Ruelle & Takens' paper; see also Ruelle, 1989). There is no fixed definition of a “strange attractor”; some authors use the phrase as a signature of chaotic dynamics, while others use it to denote a fractal attractor (e.g., Grebogi et al. (1984) discuss “strange nonchaotic attractors”).
Over the years, various authors have given precise (but different) definitions of an attractor: Milnor (1985) discusses many of these (and proposes a new one of his own). Most definitions require that an attractor attract a “large” set of initial conditions and satisfy some kind of minimality property (without this, the whole phase space could be called an attractor). We refer to the set of all those points in the phase space whose trajectories are attracted to some set A as the basin of attraction of A and write this B(A). There are essentially two choices of what it means to attract a large set of initial conditions: the more common one is that B(A) contains an open neighborhood of A, while Milnor (1985) suggested that a more realistic requirement is that B(A) has positive Lebesgue measure. Exactly what type of minimality assumption we require depends on what we want our attractor to say about the dynamics. At the very least, there should be no smaller (closed) set with the same basin of
[Figure 1 graphic not reproduced: panel (a) plots the potential; panel (b) shows the phase portrait.]

Figure 1. (a) A symmetric double-well potential. (b) Phase portrait of a particle moving in the potential of (a) with friction.
attraction: this excludes any “unnecessary” points from the attractor. A consequence of this minimality property is that the attractor is invariant: if A is the attractor of a map f, this means that f(A) = A (there is, of course, a similar property for the attractor of a flow). In particular, this means that it is possible to talk about the “dynamics on the attractor.” If we want one attractor to describe the possible asymptotic dynamics of every initial condition, then there is no need to impose any further minimality assumption. Figure 1(b) shows the phase portrait for a particle moving with friction in the symmetric double-well potential of Figure 1(a); the basin of attraction of the fixed point corresponding to the bottom of the left-hand well is shaded. (The equations of motion are \(\dot{x} = y\) and \(\dot{y} = -y/2 + x - x^3\).) We could say that the attractor consists of the three points {(−1, 0), (1, 0), (0, 0)}, but this discards much of the information contained in the phase portrait. This motivates the further requirement that an attractor be “indecomposable”: it should not be possible to split it into two disjoint invariant subsets. (Some definitions require there to be a dense orbit in the attractor: essentially, this means that one trajectory “covers” the entire attractor, so in particular the attractor cannot be split into two pieces.) This gives us two possible attractors: (−1, 0) and (1, 0) (the origin does not attract a neighborhood of itself, nor any set of positive measure). In this example, the boundary between the basins of attraction of the two competing attractors is a smooth curve. However, in many examples this boundary is a fractal set. This was first noticed by McDonald et al. (1985), who observed that near a fractal boundary, it is harder to predict the asymptotic behavior of imprecisely known initial conditions. An extreme version of this occurs with the phenomenon of “riddled basins,” first observed by Alexander et al. (1992): arbitrarily close to a point attracted to one attractor, there can be a point attracted to another. In this case, an arbitrarily small change in the initial condition can lead to completely different asymptotic behavior. (In addition to treating some analytically tractable examples, Alexander et al. (1992) give an impressive array of pictures from their numerical simulations.) Attractors can also be meaningfully defined for the infinite-dimensional dynamical systems arising from partial and functional differential equations (e.g., Hale, 1988; Robinson, 2001; Temam, 1988/1996), and for random and nonautonomous systems (Crauel et al. (1997) adopt an approach that includes both these cases).
JAMES C. ROBINSON
See also Chaos vs. turbulence; Dynamical systems; Fractals; Phase space; Turbulence
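The double-well example discussed in this entry lends itself to a quick numerical check of which initial conditions end up at which fixed point. The sketch below integrates the damped system with a classical fourth-order Runge–Kutta scheme. It assumes the friction term is −y/2, that is, \(\dot{x} = y\), \(\dot{y} = -y/2 + x - x^3\) (a reading of the printed equation in which the friction coefficient is 1/2); the step size, integration time, and function names are likewise choices made for illustration.

```python
def rhs(x, y, friction=0.5):
    """Right-hand side of x' = y, y' = -friction * y + x - x**3.
    The friction coefficient 0.5 is an assumed reading of the printed equation."""
    return y, -friction * y + x - x ** 3


def rk4_step(x, y, dt):
    """Advance (x, y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x, y


def final_state(x0, y0, t_end=80.0, dt=0.01):
    """Integrate from (x0, y0) up to time t_end and return the end point."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        x, y = rk4_step(x, y, dt)
    return x, y


if __name__ == "__main__":
    # Two starts at rest inside opposite wells, and one exactly at the origin.
    for start in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.0)]:
        x, y = final_state(*start)
        print(f"start {start} -> end ({x:.4f}, {y:.4f})")
```

A particle released at rest inside one well has negative energy and cannot cross the barrier at x = 0, so it spirals into the fixed point at the bottom of that well. A start exactly at the origin stays there, although any small perturbation sends it to one of the two wells, which is why the origin is not counted as an attractor.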
Further Reading
Alexander, J.C., Yorke, J.A., You, Z. & Kan, I. 1992. Riddled basins. International Journal of Bifurcation and Chaos, 2: 795–813
Crauel, H., Debussche, A. & Flandoli, F. 1997. Random attractors. Journal of Dynamics and Differential Equations, 9: 307–341
Grebogi, C., Ott, E., Pelikan, S. & Yorke, J.A. 1984. Strange attractors that are not chaotic. Physica D, 13: 261–268
Hale, J.K. 1988. Asymptotic Behavior of Dissipative Systems, Providence, RI: American Mathematical Society
Hirsch, M.W. & Smale, S. 1974. Differential Equations, Dynamical Systems and Linear Algebra, New York: Academic Press
Landau, L.D. & Lifschitz, E.M. 1987. Fluid Mechanics, 2nd edition, Oxford: Pergamon Press
Lorenz, E.N. 1963. Deterministic non-periodic flow. Journal of the Atmospheric Sciences, 20: 130–141
Milnor, J. 1985. On the concept of attractor. Communications in Mathematical Physics, 99: 177–195
Robinson, J.C. 2001. Infinite-Dimensional Dynamical Systems, Cambridge and New York: Cambridge University Press
Ruelle, D. 1989. Chaotic Evolution and Strange Attractors, Cambridge and New York: Cambridge University Press
Ruelle, D. & Takens, F. 1971. On the nature of turbulence. Communications in Mathematical Physics, 20: 167–192
Temam, R. 1996. Infinite-dimensional Dynamical Systems in Mechanics and Physics, 2nd edition, Berlin and New York: Springer
AUBRY–MATHER THEORY
Named after Serge Aubry and John Mather, who independently shaped the seminal ideas, the Aubry–Mather theory addresses one of the central problems of modern dynamics: the characterization of nonintegrable Hamiltonian time evolution beyond the realm of perturbation theory. In general terms, when a Hamiltonian system is near-integrable, perturbation theory provides a rigorous generic description of the invariant sets of motion (closed sets containing trajectories) as smooth surfaces (KAM tori), each one parametrized by the rotation number ω of the angle variable (in angle-action coordinates): all the trajectories born and living inside the invariant torus share this common value of ω. An invariant set has an associated natural invariant measure, which describes the measure-theoretical (or statistical) properties of the trajectories inside the invariant set. The invariant measure on a torus is a continuous measure, so that the distribution function of the angle variable is continuous. In this near-integrable regime of the dynamics, perturbative schemes converge adequately and future evolution is predictable, to a desired arbitrary degree, for arbitrary initial conditions on each torus. Far from integrable Hamiltonian dynamics, what is the fate of these invariant natural measures, or invariant sets of motion, beyond the borders of validity of perturbation theory? The answer is that each torus breaks down and its remaining pieces form an invariant fractal set, called by Percival a cantorus (or Aubry–Mather set), characterized by the rotation number common to all trajectories in the cantorus. The statistical properties of the trajectories on the invariant cantorus are now described by a purely discrete measure or Cantor distribution function (see Figure 1).
Basic Theorems
The formal setting of Aubry–Mather theory for the transition from regular motion on invariant tori to orbits on hierarchically structured, nowhere dense cantori is the class of maps of a cylindrical surface $C = S^1 \times \mathbb{R}$ (cylindrical coordinates $(u, p)$; see Figure 2) onto itself,

$$f : C \to C, \tag{1}$$

characterized by preservation of areas (symplectic) and the “twist” property, meaning that the torsion produced by an iteration of the map on a vertical segment of the cylindrical surface converts it into a part of the graph of some function $\varphi(u)$ of the angular coordinate. More explicitly, if we denote by $(u', p') = f(u = 0, p)$ the image of the vertical segment, then $u'$ is a monotone function of $p$, so that $(u', p')$ is the graph of a single-valued function.

Figure 1. Left: Construction of a Cantor set from the unit real interval (or circle $S^1$) as a limiting process. At each step $n$, a whole piece is cut out from each remaining full segment. Right: The distribution function $F(u)$ of the projection onto the angular component $u_n$ of a cantorus orbit. $F(u)$ is the limiting proportion of the values of $u_n < u$, $-\infty < n < +\infty$.

Figure 2. Schematic illustration on the unfolded cylindrical surface $C = S^1 \times \mathbb{R}$ of the twist (upper right) and area-preserving (lower right) properties of the map $f : C \to C$. In the upper right area the curve is a single-valued function $\varphi(u)$ of the angular variable.

An area-preserving twist map has an associated action-generating function, related to the map via a variational (extremal action) principle:
• An action-generating function, $H(u, u')$, of a twist map is a two-variable function that is strictly convex:

$$\frac{\partial^2 H}{\partial u\,\partial u'} \le K < 0. \tag{2}$$

• If $u_0$ is a critical point of $L(x) = H(u_{-1}, x) + H(x, u_1)$, then a certain sequence $(u_{-1}, p_{-1}), (u_0, p_0), (u_1, p_1)$ is a segment of a cylinder orbit of $f$.
(Given a sequence $\{u_j\}_{j=0}^{n}$ with fixed ends ($u_0 = a$, $u_n = b$), the associated action functional $L$ is the sum $\sum_{j=0}^{n-1} H(u_j, u_{j+1})$.)
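The link between a generating function and the orbits of a twist map can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the standard (Chirikov–Taylor) map as a concrete twist map, with the generating function $H(u, u') = (u' - u)^2/2 - (k/4\pi^2)\cos(2\pi u)$, for which $\partial^2 H/\partial u\,\partial u' = -1 < 0$. The parameter value, the initial condition, and all function names are hypothetical choices, not taken from the entry.

```python
import math

K_STRENGTH = 0.9  # assumed nonlinearity parameter, chosen only for illustration


def generating_function(u, u_next, k=K_STRENGTH):
    """H(u, u') = (u' - u)^2 / 2 + V(u), with V(u) = -(k / 4 pi^2) cos(2 pi u)."""
    return 0.5 * (u_next - u) ** 2 - k / (4.0 * math.pi ** 2) * math.cos(2.0 * math.pi * u)


def twist_map(u, p, k=K_STRENGTH):
    """Lift of the standard map: p' = p + V'(u), u' = u + p'."""
    p_next = p + k / (2.0 * math.pi) * math.sin(2.0 * math.pi * u)
    return u + p_next, p_next


def orbit(u0, p0, n, k=K_STRENGTH):
    """Return the first n + 1 points (u_j, p_j) of the orbit through (u0, p0)."""
    points = [(u0, p0)]
    for _ in range(n):
        points.append(twist_map(points[-1][0], points[-1][1], k))
    return points


def action_gradient(us, j, k=K_STRENGTH, h=1e-6):
    """Central-difference derivative of L(x) = H(u_{j-1}, x) + H(x, u_{j+1}) at x = u_j."""
    def local_action(x):
        return generating_function(us[j - 1], x, k) + generating_function(x, us[j + 1], k)
    return (local_action(us[j] + h) - local_action(us[j] - h)) / (2.0 * h)


points = orbit(0.13, 0.31, 50)
us = [u for u, _ in points]
# Every interior point of an orbit should be a critical point of the local action.
max_residual = max(abs(action_gradient(us, j)) for j in range(1, len(us) - 1))
```

In this sketch `max_residual` measures how far the computed orbit is from satisfying the extremal-action condition; it should vanish up to finite-difference and rounding error.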
A cylinder orbit is called ordered when it projects onto an angular sequence ordered in the same way as a uniform rotation of angle ω. An invariant set is a minimal invariant set if it does not include proper invariant subsets, and it is called ordered if it contains only ordered orbits. The proper definition of an Aubry–Mather set is a minimal invariant ordered set that projects one-to-one on a nowhere dense Cantor set of the circle $S^1$. The following points comprise the main core (Golé, 2001; Katok & Hasselblatt, 1995) of the Aubry–Mather theory:
• For each rational value ω = p/q of the rotation number, there exist (Poincaré–Birkhoff theorem) at least two ordered periodic orbits (Birkhoff periodic orbits of type (p, q)), which are obtained by, respectively, minimizing and maximizing the action over the appropriate set of angular sequences. In general, periodic orbits of rational rotation number ω = p/q do not form an invariant circle, in which case there are nonperiodic orbits approaching two different periodic orbits as n → −∞ and as n → +∞, called heteroclinic orbits (or homoclinic in some contexts). These orbits connect two Birkhoff periodic orbits through a minimal action path. Usually, the number of map iterations needed for such action-minimizing orbits to pass over the action barriers is exponentially small.
• For each irrational value of ω, there exists either an invariant torus or an Aubry–Mather set. There are also homoclinic trajectories connecting orbits on the Aubry–Mather set.
The hierarchical structure of gaps that break up the torus has its origins in the path-dependent action barriers. Note also that heteroclinics to nearby periodic orbits (of rational rotation number) pass over nearly the same action barriers, leading to a somewhat metaphorical view of nearby resonances biting the tori and leaving gaps. Certainly, the action barriers fractalize the invariant measure according to the proximity of the irrational rotation number ω to rationals.
The explanatory power of the extremal action principle in the fractalization of invariant sets of motion suggests one of the immediate physical applications of this theory. Indeed, Aubry’s work was originally motivated by the equilibrium problem of a discrete field $(u_n)_{n \in \mathbb{Z}}$ under some energy functional, whose extremalization defines equilibrium field configurations. Under some conditions on the energy-generating function $H(u, u')$, the two are equivalent mathematical physics problems, and a one-to-one correspondence between orbits $(u_n, p_n)$ and equilibrium field configurations $(u_n)$ does exist (Aubry, 1985).
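The rotation number that labels tori, periodic orbits, and Aubry–Mather sets can be estimated from a single orbit as the mean advance of the lifted angle per iteration. The following minimal sketch assumes the standard map as the twist map (an illustrative choice, not specified in the entry); the function names, the iteration count, and the initial conditions are hypothetical.

```python
import math


def standard_map_lift(u, p, k):
    """One iteration of the standard map on the lifted (unwrapped) angle u."""
    p_next = p + k / (2.0 * math.pi) * math.sin(2.0 * math.pi * u)
    return u + p_next, p_next


def rotation_number(u0, p0, k, n=20000):
    """Finite-time estimate (u_n - u_0) / n of the rotation number."""
    u, p = u0, p0
    for _ in range(n):
        u, p = standard_map_lift(u, p, k)
    return (u - u0) / n


# Integrable case k = 0: every orbit is a uniform rotation with omega = p0.
omega_free = rotation_number(0.2, 0.381966, k=0.0)
# Moderately perturbed case (assumed value of k): the estimate is no longer p0.
omega_perturbed = rotation_number(0.2, 0.381966, k=0.5)
```

For k = 0 the estimate reproduces the initial momentum, since the map reduces to a rigid rotation; for k > 0 the finite-time value is only an approximation whose convergence depends on the orbit.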
Application to the Generalized Frenkel–Kontorova Model
From this perspective, the Aubry–Mather theory gives rigorous variational answers in the description of equilibrium discrete nonlinear fields such as the generalized Frenkel–Kontorova (FK) model with convex interactions. Although the terminology changes, every aspect of cylinder dynamics has a counterpart in the equilibrium problem of this interacting nonlinear many-body model.
• Commensurate (periodic) field configurations correspond to Birkhoff periodic orbits, and as such they are connected by the field configurations associated with heteroclinics, which are here called discrete (sine-Gordon) solitons or elementary discommensurations.
• Incommensurate (quasiperiodic) field configurations can correspond either to tori trajectories or to Aubry–Mather trajectories. The macroscopic physical properties (formally represented by averages on the invariant measure) of the field configuration experience drastic changes when passing from one case (tori) to the other (Aubry–Mather sets). This transition (called breaking of analyticity by Aubry) has been characterized as a critical phenomenon using renormalization group methods by MacKay (1993).
The Aubry–Mather theory puts on a firm basis what is known as discommensuration theory, which is the description of a generic incommensurate or (higher-order) commensurate field configuration as an array of discrete field solitons, strongly interacting when tori subsist, but almost noninteracting and deeply pinned when only Cantor invariant measures remain. Aubry’s work provided a satisfactory understanding of the complexity of the phase diagrams and the singular character of the equations of state of the generalized FK model (Griffiths, 1990).
LUIS MARIO FLORÍA
See also Commensurate–incommensurate transition; Frenkel–Kontorova model; Hamiltonian systems; Kolmogorov–Arnol’d–Moser theorem; Phase transitions; Standard map; Symplectic maps
Further Reading
Aubry, S. 1985. Structures incommensurables et brisure de la symétrie de translation I [Incommensurate structures and the breaking of translation symmetry I]. In Structures et Instabilités, edited by C. Godrèche, Les Ulis: Editions de Physique, pp. 73–194
Golé, C. 2001. Symplectic Twist Maps. Global Variational Techniques, Singapore: World Scientific
Griffiths, R.B. 1990. Frenkel–Kontorova models of commensurate-incommensurate phase transitions. In Fundamental Problems in Statistical Mechanics VII, edited by H. van Beijeren, Amsterdam: North-Holland, pp. 69–110
Katok, A. & Hasselblatt, B. 1995. Introduction to the Modern Theory of Dynamical Systems. Cambridge and New York: Cambridge University Press
MacKay, R.S. 1993. Renormalisation in Area-preserving Maps, Singapore: World Scientific
AUTO-BÄCKLUND TRANSFORMATION See Bäcklund transformations
AUTOCATALYTIC SYSTEM See Reaction-diffusion systems
AUTOCORRELATION FUNCTION See Coherence phenomena
AUTONOMOUS SYSTEM See Phase space
AUTO-OSCILLATIONS See Phase plane
AVALANCHE BREAKDOWN
Charge transport in condensed matter is simply described by the current density j as a function of the local electric field E. For bulk materials, the current density (current per unit area) is given by j(E) = −env, where e > 0 is the electron charge, n is the conduction electron density per unit volume, and v is the drift velocity. In the simplest case, v is a linear function of the field: v = −µE, with mobility µ. Thus, the conductivity σ = j/E = enµ is proportional to the number of conduction electrons. In metals, n is given by the number of valence electrons, which is temperature independent. In semiconductors, however, the concentration of electrons in the conduction band varies greatly and is determined by generation-recombination (GR) processes that induce transitions between valence band, conduction band, and impurity levels (donors and acceptors). Charge carrier concentration depends not only upon temperature but also upon the electric field, which explains why the conductivity can change over many orders of magnitude.
A GR process that depends particularly strongly on the field is impact ionization, the inverse of the Auger effect. It is a process in which a charge carrier with high kinetic energy collides with a second charge carrier, transferring its kinetic energy to the latter, which is thereby lifted to a higher energy level. The kinetic energy is increased by the local electric field, which heats up the carriers. As a certain minimum energy is necessary to overcome the difference in the energy levels of the second carrier, the impact ionization probability depends in a threshold-like manner on the applied voltage. Impact ionization processes may be classified as band-band processes or band-trap processes depending on whether the second carrier is initially in the valence band and makes a transition from the valence band to the conduction band, or whether it is initially at a localized level (impurity, donor, acceptor) and makes a transition to a band state. Further, impact ionization processes are classified as electron or hole processes according to whether the ionizing hot carrier is a conduction band electron or a hole in the valence band. Schematically, impact ionization may be written as one of the following reaction equations, in analogy with chemical kinetics:

$$e \longrightarrow 2e + h, \tag{1}$$
$$e + e_t \longrightarrow 2e + h_t, \tag{2}$$
$$h \longrightarrow 2h + e, \tag{3}$$
$$h + h_t \longrightarrow 2h + e_t, \tag{4}$$
where e and h denote band electrons and holes, respectively, and et and ht stand for electrons and holes trapped at impurities (donors, acceptors, or deep levels). The result of the process is carrier multiplication (avalanching), which may induce electrical instabilities at sufficiently high electric fields. Impact ionization from shallow donors or acceptors is responsible for impurity breakdown at low temperatures. Being an autocatalytic process (i.e., each carrier ionizes secondary carriers that might, in turn, impact ionize other carriers), it induces a nonequilibrium phase transition between a low- and high-conductivity state and may lead to a variety of spatiotemporal instabilities, including current filamentation, self-sustained oscillations, and chaos. The conductivity saturates when all impurities are ionized. Band-to-band impact ionization eventually induces avalanche breakdown, limiting the bias voltage that can be safely applied to a device. The conductivity increases much more
strongly than during impurity breakdown because of the large number of valence band electrons available for ionization. Impurity impact-ionization breakdown at helium temperatures (ca. 5 K) in p-Ge, n-GaAs, and other semiconductor materials has been thoroughly studied both experimentally and theoretically as a model system for nonlinear dynamics in semiconductors. It displays S-shaped current–voltage characteristics because the GR kinetics incorporating impact ionization from at least two impurity levels (ground state and excited state) allows for three different values of the carrier density n(E) in a certain range of fields E. As a result of the negative differential conductivity, a variety of temporal and spatiotemporal instabilities occur, ranging from stationary and breathing current filaments and traveling charge density waves to various chaotic scenarios. Band-to-band impact ionization of a reverse biased p–n junction is the basis of a variety of electronic devices. A number of these devices depend on a combination of impact ionization of hot electrons and transit time effects. The IMPATT (impact ionization avalanche transit time) diodes can generate the highest continuous power output at frequencies > 30 GHz. The originally proposed device (Read diode) involves a reverse biased n+ –p–i–p+ multilayer structure, where n+ and p+ denote strongly n- or p-doped semiconductor regions, and i denotes an intrinsic (undoped) region. In the n+ –p region (avalanche region), carriers are generated by impact ionization across the bandgap; the generated holes are swept through the i region (drift region) and collected at the p+ contact. When a periodic (ac) voltage is superimposed on the time-independent (dc) reverse bias, a π phase lag of the ac current behind the voltage can arise. This phase lag is due to the finite buildup time of the avalanche current and the finite time carriers take to cross the drift region (transit-time delay). 
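The transit-time delay just mentioned is simply the drift length divided by the drift velocity, so it is easy to put rough numbers on it. The short sketch below is illustrative arithmetic only: the saturated drift velocity of 10^5 m/s, the 30 GHz operating frequency, and the choice of a quarter-cycle transit delay are assumed values, and the function names are hypothetical.

```python
def transit_time(drift_length_m, drift_velocity_m_s):
    """Time a carrier needs to cross the drift region at constant drift velocity."""
    return drift_length_m / drift_velocity_m_s


def delay_fraction_of_period(delay_s, frequency_hz):
    """Express a delay as a fraction of one ac cycle at the given frequency."""
    return delay_s * frequency_hz


def drift_length_for_fraction(fraction, drift_velocity_m_s, frequency_hz):
    """Drift length whose transit time equals the given fraction of an ac cycle."""
    return fraction * drift_velocity_m_s / frequency_hz


ASSUMED_DRIFT_VELOCITY = 1.0e5  # m/s, assumed order of magnitude for a saturated drift velocity
ASSUMED_FREQUENCY = 30.0e9      # Hz, illustrative operating frequency

# Drift length giving a quarter-cycle transit delay (an assumed design target).
length = drift_length_for_fraction(0.25, ASSUMED_DRIFT_VELOCITY, ASSUMED_FREQUENCY)
fraction = delay_fraction_of_period(
    transit_time(length, ASSUMED_DRIFT_VELOCITY), ASSUMED_FREQUENCY
)
```

With these assumed numbers the drift region comes out a little under one micrometer long, which indicates why devices operating at such frequencies need very thin layers.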
If the sum of these delay times is approximately one-half cycle of the operating frequency, negative conductance is observed; in other words, the carrier flow drifts opposite to the ac electric field. This can be achieved by properly matching the length of the drift region with the drift velocity and the frequency. Other devices using the avalanche breakdown effect are the TRAPATT (trapped plasma avalanche triggered transit) diode and the avalanche transistor. The Zener diode is a p–n junction that exhibits a sharp increase in the magnitude of the current at a certain reverse voltage where avalanche breakdown sets in. It is used to stabilize and limit the dc voltage in circuits (overload and transient suppressor) since the current can vary over a large range at the avalanche breakdown threshold without a noticeable change in the voltage. The original Zener effect, on the other hand, is due to quantum mechanical tunneling across the bandgap at high fields, and is effective in highly doped (resulting in
narrow depletion layers) Zener diodes at lower breakdown voltages.
ECKEHARD SCHÖLL
See also Diodes; Drude model; Nonlinear electronics; Semiconductor oscillators
Further Reading
Landsberg, P.T. 1991. Recombination in Semiconductors, Cambridge and New York: Cambridge University Press
Schöll, E. 1987. Nonequilibrium Phase Transitions in Semiconductors, Berlin: Springer
Schöll, E. 2001. Nonlinear Spatio-temporal Dynamics and Chaos in Semiconductors, Cambridge and New York: Cambridge University Press
Schöll, E., Niedernostheide, F.-J., Parisi, J., Prettl, W. & Purwins, H. 1998. Formation of spatio-temporal structures in semiconductors. In Evolution of Spontaneous Structures in Dissipative Continuous Systems, edited by F.H. Busse & S.C. Müller, Berlin: Springer, pp. 446–494
Shaw, M.P., Mitin, V.V., Schöll, E. & Grubin, H.L. 1992. The Physics of Instabilities in Solid State Electron Devices, New York: Plenum Press
AVALANCHES
An avalanche is a downhill slide of a large mass, usually of snow, ice, or rock debris, prompted by a small initial disturbance. Avalanches, along with landslides, are among the major natural disasters that still present significant danger for people in the mountains. On average, 25 people die in avalanches every winter in Switzerland alone. Dozens of people were killed on September 20, 2002, in a gigantic avalanche in Northern Ossetia, Russia, when a 150 m thick chunk of the Kolka Glacier broke off and triggered an avalanche of ice and debris that slid some 25 km along Karmadon gorge. In 1999, some 3000 avalanches occurred in the Swiss Alps.
Avalanches vary widely in size, from minor slides to large movements of snow reaching a volume of $10^5$ m$^3$ and a weight of 30,000 tons. The speed of the downhill snow movement can reach 100 m/s. There are two main types of avalanches, loose avalanches and slab avalanches, depending on the physical properties of the snow. Soft dry snow typically produces loose avalanches that form a wedge downward from the starting point, mainly determined by the physical properties of the granular material. In wet or icy conditions, on the other hand, a whole slab of solid dense snow may slide down. The initiation of the second type occurs as a fracture line at the top of the slab.
The study of real avalanches and landslides is mostly an empirical science that is traditionally a part of geophysics and draws from the physics of snow, ice, and soil. Semi-empirical computer codes have been developed for prediction of avalanches dependent on the weather conditions (snowfall, wind, temperature profiles) and topography.
Figure 1. Only several layers of mustard seeds are involved in the rolling motion inside the avalanche: moving grains are smeared out in this long-exposure photograph. Reproduced with permission from Jaeger et al. (1998).
More fundamental aspects of avalanche dynamics have been studied in controlled laboratory experiments with dry or wet granular piles, or sandpiles. A granular slope can be characterized by two angles of repose: the static angle of repose $\theta_s$, which is the maximum angle at which the granular slope can remain static, and the dynamic angle of repose $\theta_d$, the minimum angle at which the granular flow down the slope can persist. Typically, in dry granular media, the difference between static and dynamic angles of repose is about 2–5°; for smooth glass beads, $\theta_s \approx 25°$ and $\theta_d \approx 23°$. Avalanches may occur in the bistable regime when the slope angle satisfies $\theta_d < \theta < \theta_s$. The bistability is explained by the need to dilate the granular material for it to enter the flowing regime (Bagnold’s dilatancy). An avalanche can be initiated by a small localized fluctuation from which the fluidized region expands downhill and sometimes also uphill, while the sand always slides downhill. An avalanche in a deep sandpile usually involves a narrow layer near the surface (see Figure 1).
Avalanches have also been studied in finite-depth granular layers on inclined planes. The two-dimensional structure of a developing avalanche depends on the thickness of the granular layer and the slope angle. For thin layers and small angles, wedge-shaped avalanches are formed similar to the loose snow avalanches (Figure 2a). In thicker layers and at higher inclination angles, avalanches have a balloon-type shape that expands both down- and uphill (Figure 2b).

Figure 2. Structure of the avalanche in a thin (4 grain diameters) layer of glass beads: (a) wedge-shaped avalanche for θ = 31.5°; (b) balloon-shaped avalanche propagating both up- and downhill for θ = 32.5°. Reprinted by permission from Nature (Daerr & Douady, 1999). Copyright (1999) Macmillan Publishers Ltd.

The kinematics of the fluidized layer in one dimension can be described by a set of hydraulic equations for the local thickness $R(x, t)$ of the layer of rolling particles flowing over a sandpile of immobile particles with variable profile $h(x, t)$ (the BCRE model, after Bouchaud et al., 1994),

$$\partial_t R = -v\,\partial_x R + \Gamma(R, h) + (\text{diffusive terms}), \tag{1}$$
$$\partial_t h = -\Gamma(R, h) + (\text{diffusive terms}), \tag{2}$$

where $\Gamma$ is the entrainment flux of immobile particles into the rolling layer and the downhill transport velocity $v$ is assumed constant. $\Gamma$ becomes positive when the local slope becomes steeper than the static repose angle $\theta_s$, and in the simplest case, $\Gamma = \gamma R(\partial_x h - \tan\theta_s)$. This model allows for a complete analytical treatment.
A more sophisticated continuum theory of granular avalanches is based on the fluid dynamics (Navier–Stokes) equations coupled with a phenomenological description of the first-order phase transition from a static to a fluidized state driven by the local shear stress (Aranson & Tsimring, 2001). The local phase state is described by the local order parameter $\rho$ that is controlled by a Ginzburg–Landau-type equation with bistable free energy $F(\rho, \delta)$:

$$\partial_t \rho = D\nabla^2 \rho - \partial_\rho F(\rho, \delta). \tag{3}$$
The control parameter δ in this equation depends on the ratio of shear to normal stress. This theory can describe a variety of “partially fluidized” granular flows, including avalanches in sandpiles. In a “shallow-water” approximation, it yields the BCRE-type equations for the local slope and the thickness of the rolling layer.
The wide distribution of scales in real avalanches led Bak, Tang, and Wiesenfeld (Bak et al., 1988) to propose a “sandpile cellular automaton” (see Sandpile model) as a paradigm model for self-organized criticality (SOC), the phenomenon that occurs in slowly driven nonequilibrium spatially extended systems when they asymptotically reach a critical state characterized by a power-law distribution of event sizes. The Bak–Tang–Wiesenfeld (BTW) model is remarkably simple, yet it exhibits a highly nontrivial behavior. The sandpile is formed on a lattice by dropping “grains” on a random site from above, one at a time. “Grains” form stacks of integer height at each lattice site. After each grain is dropped, the sandpile is allowed to relax. Relaxation occurs when the slope (a difference in heights of two adjacent stacks) reaches a critical value (“angle of repose”) and the grain hops to a lower stack. This may prompt a series of subsequent hops and so trigger an avalanche. The size of the avalanche is determined by the number of grains set into motion by adding a single grain to a sandpile. In the asymptotic regime in a large system, the avalanche size distribution becomes scale-invariant, $P(s) \propto s^{-\alpha}$ with α ≈ 1.5.
The relevance of this model and its generalizations to real avalanches is still a matter of debate. The sandpile model is defined via a single repose angle, and so its asymptotic behavior has the properties of the critical state for a second-order phase transition. Real sandpiles are characterized by two angles of repose and thus exhibit features of the first-order phase transition. Experiments with avalanches in slowly rotating drums do not confirm the scale-invariant distribution of avalanches. However, in such experiments, the internal structures of the sandpile (the force chains) are constantly changing in the process of rotation. In other experiments with large monodispersed glass beads dropped on a conical sandpile, SOC with α ≈ 1.5 was observed. The characteristics of the size distribution depend on the geometry of the sandpile and the physical and geometrical properties of grains. SOC was also observed in the avalanche statistics in a three-dimensional pile of long rice; however, a smaller scaling exponent α ≈ 1.2 was measured for the avalanche size distribution.
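The drop-and-relax rule of the sandpile automaton is easy to sketch in code. The version below is a minimal illustration under stated assumptions, not the exact rule described above: it uses the widely used height-threshold variant on a square lattice (a site holding four or more grains passes one grain to each of its four neighbors, and grains pushed over the edge are lost) rather than a slope-based rule, and it measures avalanche size as the number of topplings. Lattice size, number of grains, and random seed are arbitrary choices.

```python
import random


def relax(grid, size, threshold=4):
    """Topple unstable sites until the pile is stable; return the number of topplings."""
    topplings = 0
    unstable = [(i, j) for i in range(size) for j in range(size) if grid[i][j] >= threshold]
    while unstable:
        i, j = unstable.pop()
        if grid[i][j] < threshold:
            continue  # already relaxed by an earlier toppling
        grid[i][j] -= threshold
        topplings += 1
        if grid[i][j] >= threshold:
            unstable.append((i, j))
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < size and 0 <= nj < size:  # grains leaving the lattice are lost
                grid[ni][nj] += 1
                if grid[ni][nj] >= threshold:
                    unstable.append((ni, nj))
    return topplings


def simulate(size=20, grains=5000, seed=0):
    """Drop grains one at a time on random sites and record each avalanche size."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    sizes = []
    for _ in range(grains):
        i, j = rng.randrange(size), rng.randrange(size)
        grid[i][j] += 1
        sizes.append(relax(grid, size))
    return grid, sizes


grid, sizes = simulate()
```

A histogram of the nonzero entries of `sizes` on logarithmic axes is the usual way to look for power-law behavior; a lattice this small gives only a rough impression because of finite-size cutoffs.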
An avalanche in a pile of sand has been used as a metaphor in many other physical phenomena, including avalanche diodes, vortices in type-II superconductors, the Barkhausen effect in ferromagnets, and 1/f noise.
LEV TSIMRING
See also Granular materials; Sandpile model
Further Reading
Aranson, I.S. & Tsimring, L.S. 2001. Continuum description of avalanches in granular media. Physical Review E, 64: 020301
Bak, P., Tang, C. & Wiesenfeld, K. 1988. Self-organized criticality. Physical Review A, 38: 364–374
Bouchaud, J.-P., Cates, M.E., Ravi Prakash, J. & Edwards, S.F. 1994. A model for the dynamics of sandpile surfaces. Journal de Physique I, 4: 1383–1410
Daerr, A. & Douady, S. 1999. Two types of avalanche behaviour in granular media. Nature, 399: 241–243
Duran, J. 1999. Sands, Powders, and Grains: An Introduction to the Physics of Granular Materials, Berlin and New York: Springer
Jaeger, H.M., Nagel, S.R. & Behringer, R.P. 1996. Granular solids, liquids, and gases. Reviews of Modern Physics, 68: 1259–1273
Jensen, H.J. 1998. Self-Organized Criticality, Cambridge: Cambridge University Press
Nagel, S.R. 1992. Instabilities in a sandpile. Reviews of Modern Physics, 64(1): 321–325
Rajchenbach, J. 2000. Granular flows. Advances in Physics, 49(2): 229–256
AVERAGING METHODS
Averaging methods are generally used for dynamical systems of two or more degrees of freedom when time scales or space scales are well separated. An average over the rapidly varying coordinates of one degree of freedom, considering the coordinates of the second degree of freedom to be constant during the average, can, with appropriate variables, retain a time-invariant quantity that enters into the solution of the slower motion. This solution, in turn, supplies a parameter to the rapid motion, which can then be solved in a lower-dimensional space.
The averaging method is closely related to the calculation of adiabatic invariants, which are the approximately constant integrals of the motion that are obtained by averaging over the fast angle variables. The lowest-order calculation is generally straightforwardly performed in canonical coordinates. A transformation from momentum and position coordinates (p, q), for the fast oscillation, to action-angle form (J, θ) gives a constant of the motion J, if all other variables are held constant. The action J is then the constant parameter in the equation for the slower motion. It is not always convenient to transform to action-angle form directly, but the underlying constants are related to the action variables.
The formal expansion procedure that is employed is to develop the solution in an asymptotic series. The mathematical method applied to ordinary differential equations was developed by Nikolai Bogoliubov (Bogoliubov & Mitropolsky, 1961) and in a somewhat different form by Martin Kruskal (1962). The expansion techniques can be formally extended to all orders in the perturbation parameter but are actually divergent. For multiple periodic systems, higher-order local nonlinear resonances between the degrees of freedom may destroy the ordering in their neighborhood. We will return to this problem below.

Figure 1. A hierarchy of adiabatic invariants for the charged particle gyrating in a nonaxisymmetric, magnetic mirror field. The three adiabatic invariants are the magnetic moment µ, the longitudinal invariant $J_\parallel$, and the guiding center flux invariant $\Phi$.
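A simple numerical experiment illustrates the near-constancy of the action. For a harmonic oscillator of frequency ω the action is J = E/ω, where E is the energy. The sketch below integrates an oscillator whose frequency is ramped slowly from 1 to 2 and compares J before and after. The linear ramp, its duration, the step size, and the leapfrog (kick-drift-kick) integrator are all assumptions made for illustration, as are the function names.

```python
def omega(t, ramp_time):
    """Slowly varying frequency: rises linearly from 1 to 2 over ramp_time."""
    return 1.0 + t / ramp_time


def energy(q, p, w):
    """Energy of a unit-mass harmonic oscillator with frequency w."""
    return 0.5 * (p * p + w * w * q * q)


def action(q, p, w):
    """Action J = E / w of the harmonic oscillator."""
    return energy(q, p, w) / w


def integrate(ramp_time=2000.0, dt=0.01):
    """Leapfrog integration of q'' = -omega(t)^2 q over the whole ramp."""
    q, p, t = 1.0, 0.0, 0.0
    for _ in range(int(round(ramp_time / dt))):
        p -= 0.5 * dt * omega(t, ramp_time) ** 2 * q  # half kick at time t
        q += dt * p                                   # drift
        t += dt
        p -= 0.5 * dt * omega(t, ramp_time) ** 2 * q  # half kick at time t + dt
    return q, p, t


RAMP_TIME = 2000.0  # assumed ramp duration, long compared with the oscillation period
j_start = action(1.0, 0.0, omega(0.0, RAMP_TIME))
e_start = energy(1.0, 0.0, omega(0.0, RAMP_TIME))
q_end, p_end, t_end = integrate(RAMP_TIME)
j_end = action(q_end, p_end, omega(t_end, RAMP_TIME))
e_end = energy(q_end, p_end, omega(t_end, RAMP_TIME))
```

In this sketch the energy roughly doubles as the frequency doubles, while the ratio `j_end / j_start` stays close to one; shortening `RAMP_TIME` makes the change in J larger, as expected when the time scales are no longer well separated.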
Averaging over the fastest oscillation of an N degree-of-freedom system reduces the number of degrees of freedom to N − 1. A second average over the next fastest motion then produces a second adiabatic invariant to reduce the number to N − 2. This process may be continued to obtain a hierarchy of adiabatic invariants, until the system is reduced to one degree of freedom, which can be integrated to obtain a final integrable equation. The process is well known in plasma physics where, for a charged particle gyrating in a magnetic mirror field, we first find the magnetic moment invariant µ associated with the fast gyration, then find the longitudinal invariant $J_\parallel$ associated with the slower bounce motion, and finally find the flux invariant $\Phi$ associated with the drift motion. The three degrees of freedom are shown in Figure 1. The small parameters in this case are ε1, the ratio of bounce frequency to gyration frequency; ε2, the ratio of guiding center drift frequency to bounce frequency; and ε3, the ratio of the frequency of the time-varying magnetic field to the drift frequency. This example motivated the development of averaging methods. The derivations of these invariants are given in detail in Northrop (1963) or in other plasma physics texts.
Although the asymptotic expansions are formally good to all orders in a small dimensionless parameter of the form $\varepsilon = |\dot{\omega}/\omega^2|$, where ω is the frequency of the fast oscillation that is slowly changing in time and $\dot{\omega} \equiv d\omega/dt$, the series generally diverge. The physical reason is that resonances or near-resonances between degrees of freedom lead to small denominators in the coefficients of terms. For nonlinear coupled oscillatory systems, exact resonances for certain values of the action locally change the structure of the phase-space orbits so that they do not follow the values obtained by averaging. This led to the development of the secular perturbation theory (Born, 1927), in which a local transformation of the coordinates around the resonance can be made. The frequency of the oscillatory motion in the neighborhood of the exact resonance is then slow compared with the other frequencies in the transformed coordinates, and averaging can then be applied locally.
A review of the various methods, their limitations, practical examples, and references to original sources can be found in Lichtenberg (1969) and Lichtenberg & Lieberman (1991).
The above discussion is related to the study of finite-dimensional systems governed by ordinary differential equations. The methods are usually applied to relatively low-dimensional systems, for example, the motion of a magnetically confined charged particle as described above. However, averaging methods are also applied to systems governed by partial differential equations, such as nonlinear wave propagation problems and wave instabilities. For example, waves on discrete oscillator chains can be obtained by first averaging over the discreteness using a Taylor expansion.
ALLAN J. LICHTENBERG
See also Adiabatic invariants; Breathers; Collective coordinates; Modulated waves; Solitons
Further Reading
Bogoliubov, N.N. & Mitropolsky, Y.A. 1961. Asymptotic Methods in the Theory of Nonlinear Oscillations, New York: Gordon & Breach
Born, M. 1927. The Mechanics of the Atom, London: Bell
Kruskal, M.D. 1962. Asymptotic theory of Hamiltonian systems with all solutions nearly periodic. Journal of Mathematical Physics, 3: 806–828
Lichtenberg, A.J. 1969. Phase Space Dynamics of Particles, New York: Wiley
Lichtenberg, A.J. & Lieberman, M.A. 1991. Regular and Chaotic Dynamics, 2nd edition, New York: Springer
Northrop, T.G. 1963. The Adiabatic Motion of Charged Particles, New York: Wiley
B
BÄCKLUND TRANSFORMATIONS
Bäcklund transformations (BTs) originated in investigations conducted in the late 19th century into invariance properties of pseudospherical surfaces, namely surfaces of constant negative Gaussian curvature. In 1862, it was Edmond Bour who derived the well-known sine-Gordon equation

$$\omega_{uv} = \frac{1}{\rho^2}\sin\omega \tag{1}$$

via the Gauss–Mainardi–Codazzi system for pseudospherical surfaces with total curvature $K = -1/\rho^2$, parametrized in asymptotic coordinates. In 1883, Albert Bäcklund published his now classical result whereby pseudospherical surfaces may be generated in an iterative manner. Thus, if $\mathbf{r}$ is the position vector of a pseudospherical surface corresponding to a seed solution $\omega$ of Equation (1) and $\omega'$ denotes the Bäcklund transformation of $\omega$ via the BT

$$\mathbb{B}_\beta:\quad \omega'_u - \omega_u = \frac{2\beta}{\rho}\sin\left(\frac{\omega' + \omega}{2}\right), \qquad \omega'_v + \omega_v = \frac{2}{\beta\rho}\sin\left(\frac{\omega' - \omega}{2}\right), \tag{2}$$

then the position vector $\mathbf{r}'$ of the one-parameter class of surfaces corresponding to $\omega'$ is given by (Bäcklund, 1883)

$$\mathbf{r}' = \mathbf{r} + \frac{L}{\sin\omega}\left[\sin\left(\frac{\omega - \omega'}{2}\right)\mathbf{r}_u + \sin\left(\frac{\omega + \omega'}{2}\right)\mathbf{r}_v\right], \tag{3}$$

where $L = \rho\sin\zeta$ and $\beta = \tan(\zeta/2)$, $\zeta$ being the constant angle between the normals to the two surfaces and $\beta$ being termed the Bäcklund parameter. Sophus Lie subsequently observed that $\mathbb{B}_\beta$ may be decomposed according to $\mathbb{B}_\beta = \mathbb{L}_\beta^{-1}\,\mathbb{B}_{\beta=1}\,\mathbb{L}_\beta$, where $\mathbb{L}_\beta$ and $\mathbb{L}_\beta^{-1}$ are Lie invariances. Thus, Lie transformations play a crucial role in introducing the Bäcklund parameter $\beta$ into the parameter-independent “Bianchi” transformation $\mathbb{B}_{\beta=1}$ to produce $\mathbb{B}_\beta$. It was in 1892 that Luigi Bianchi in his masterly paper Sulla Trasformazione di Bäcklund per le Superficie Pseudosferiche established that the BT $\mathbb{B}_\beta$ admits a commutation property $\mathbb{B}_{\beta_2}\mathbb{B}_{\beta_1} = \mathbb{B}_{\beta_1}\mathbb{B}_{\beta_2}$, a consequence of which is a nonlinear superposition principle embodied in what is termed a “permutability theorem.”

Bianchi’s Permutability Theorem
If $\omega$ is a seed solution of the sine-Gordon equation (1), let $\omega_1$, $\omega_2$ denote the BTs of $\omega$ via $\mathbb{B}_{\beta_1}$ and $\mathbb{B}_{\beta_2}$, that is, $\omega_1 = \mathbb{B}_{\beta_1}(\omega)$ and $\omega_2 = \mathbb{B}_{\beta_2}(\omega)$. Let $\omega_{12} = \mathbb{B}_{\beta_2}(\omega_1)$ and $\omega_{21} = \mathbb{B}_{\beta_1}(\omega_2)$. Then, imposition of the commutativity requirement $\omega_{12} = \omega_{21}$ yields a new solution of (1), namely

$$\omega_{12} = \omega_{21} = \omega + 4\tan^{-1}\left[\frac{\beta_2 + \beta_1}{\beta_2 - \beta_1}\tan\left(\frac{\omega_2 - \omega_1}{4}\right)\right]. \tag{4}$$

This result is commonly encapsulated in what is termed a “Lamb diagram” as shown in Figure 1. This solution-generation procedure may be iterated via what is sometimes termed a Bianchi lattice. At each iteration, a new Bäcklund parameter $\beta_i$ is introduced.

Figure 1. The Lamb diagram.

The discovery of the BT for the iterative construction of pseudospherical surfaces along with its concomitant permutability theorem led to an intensive search by geometers at the turn of the 20th century for other classes of privileged surfaces that possess Bäcklund-type transformations. In this connection, Luther Eisenhart, in the preface to his monograph Transformations of Surfaces published in 1922, asserted that: “During the past twenty-five years many of the advances in differential geometry of surfaces in Euclidean space have had to do with transformations of surfaces of a given type into surfaces of the same type.” Thus, distinguished geometers such as Bianchi, Calapso, Darboux, Demoulin, Guichard, Jonas, Tzitzeica, and Weingarten all conducted extensive investigations into various classes of surfaces that admit BTs. The particular Lamé system descriptive of triply orthogonal systems in the case when one of the constituent coordinate surfaces is pseudospherical was shown by Bianchi (1885) to admit an auto-BT, that is, a BT that renders the system invariant. Bianchi followed this in 1890 with the construction of a BT for the Gauss–Mainardi–Codazzi system associated with the class of hyperbolic surfaces with Gaussian curvature $K = -1/\rho^2$ subject to the constraint

$$\rho_{uv} = 0, \tag{5}$$

descriptive of isothermic surfaces with fundamental forms

$$\mathrm{I} = e^{2\theta}(dx^2 + dy^2), \qquad \mathrm{II} = e^{2\theta}(\kappa_1\,dx^2 + \kappa_2\,dy^2), \tag{7}$$

where $\kappa_1$, $\kappa_2$ are principal curvatures and $x$, $y$ are conjugate coordinates. The classical BT for system (6) has been set in a modern solitonic context by Cieśliński (1997). In the first decade of the 20th century, the Romanian geometer Gheorghe Tzitzeica embarked upon an investigation of an important class of surfaces for which, in asymptotic coordinates, the Gauss–Mainardi–Codazzi system reduces to the nonlinear hyperbolic equation

$$(\ln h)_{uv} = h - h^{-2} \tag{8}$$

to be rediscovered some 70 years later in a soliton context. Tzitzeica (1910) not only constructed a BT for (8) but also set down what, in modern terms, is a linear representation containing a spectral parameter. Tzitzeica surfaces may be subsumed in the more general class of projective minimal surfaces for which BTs can be established (Rogers & Schief, 2002). In particular, this class contains the Demoulin system (1933) (ln h)uv = h −
1 . hk
ut + 6uux + uxxx = 0,
(9)
(10)
(namely, preservation of velocity and shape following interaction as well as the concomitant phase shift) were all recorded. Bianchi’s permutability theorem was subsequently employed in an investigation of the propagation of ultrashort optical pulses in a resonant medium by Lamb (1971). A BT for the Korteweg–de Vries equation (10), namely ( + )x = β − 21 ( − )2 , ( + )t = (u − u )(uxx − uxx )
where (6)
(ln k)uv = k −
The application of BTs in physics began with the work of Seeger et al. (1953) on crystal dislocation theory. Therein, within the context of Frenkel and Kontorova’s dislocation theory, the superposition of so-called “eigenmotions” was obtained via the permutability relation (4). The interaction of what are today called breathers with kink-type dislocations was both described analytically and displayed graphically. The typical solitonic features to be subsequently discovered for the Korteweg–de Vries (KdV) equation
where u, v are asymptotic coordinates. In 1899, Gaston Darboux constructed a BT for the nonlinear system θxx + θyy + κ1 κ2 e2θ = 0, κ1,y + (κ1 − κ2 )θy = 0, κ2,x + (κ2 − κ1 )θx = 0
1 , hk
−2(u2x + ux ux + ux2 ), x = u(σ, t)dσ ∞
(11) (12)
was established by Wahlquist and Estabrook (1973). The spatial part of the BT was used to construct a permutability theorem, whereby multi-soliton solutions may be generated. This permutability theorem makes a remarkable appearance in numerical analysis as the so-called ε-algorithm. A BT for the celebrated nonlinear Schrödinger (NLS) equation iqt + qxx + 2|q|2 q = 0
(13)
was established by Lamb (1974) employing a direct method due to Clairin (1902) and by Chen (1974) via the inverse scattering transform (IST) formalism. The BT adopts the form qx + qx = (q − q )(4β 2 − |q + q |2 )1/2 , qt + qt = i(qx − qx )(4β 2 − |q + q |2 )1/2 i + (q + q )(|q + q |2 + |q − q |2 ), (14) 2 the spatial part of which may be used to construct a permutability theorem (Rogers & Shadwick, 1982).
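Returning to the sine-Gordon case, the permutability formula (4) can be checked numerically. The sketch below is an illustration only, not part of the original entry: it starts from the vacuum seed ω = 0, for which the BT (2) integrates to the kink ω′ = 4 tan⁻¹ exp[(βu + v/β)/ρ], combines two such kinks through (4), and estimates the residual of Equation (1) by central differences. The function names, the sample points, and the step size `h` are hypothetical choices made for this example.

```python
import math

def kink(u, v, beta, rho=1.0):
    """One-soliton solution: Baecklund transform (2) of the vacuum seed omega = 0."""
    return 4.0 * math.atan(math.exp((beta * u + v / beta) / rho))

def two_soliton(u, v, beta1, beta2, rho=1.0):
    """Permutability formula (4) applied to the seed omega = 0."""
    w1 = kink(u, v, beta1, rho)
    w2 = kink(u, v, beta2, rho)
    ratio = (beta2 + beta1) / (beta2 - beta1)
    return 4.0 * math.atan(ratio * math.tan((w2 - w1) / 4.0))

def sine_gordon_residual(f, u, v, rho=1.0, h=1e-3):
    """Central-difference estimate of omega_uv - sin(omega)/rho^2 at (u, v)."""
    w_uv = (f(u + h, v + h) - f(u + h, v - h)
            - f(u - h, v + h) + f(u - h, v - h)) / (4.0 * h * h)
    return w_uv - math.sin(f(u, v)) / rho**2

if __name__ == "__main__":
    points = [(-1.0, 0.5), (0.0, 0.0), (0.7, -0.3), (1.5, 1.0)]
    solution = lambda u, v: two_soliton(u, v, beta1=0.5, beta2=2.0)
    worst = max(abs(sine_gordon_residual(solution, u, v)) for u, v in points)
    print(f"largest finite-difference residual: {worst:.2e}")
```

The residual should be small, limited by the finite-difference step rather than by the formula. No derivative of the new solution is integrated at any stage, which is the practical appeal of the permutability theorem: the second-generation solution is obtained algebraically from the first-generation ones.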
Crum’s theorem may be adduced to show that, at the level of the linear representation of soliton equations, the action of the BT is to add a discrete eigenvalue to the spectrum. The role of BTs in the context of the IST and their action on reflection coefficients is treated in detail by Calogero and Degasperis (1982).

That the Toda lattice equation

$$\ddot{y}_n = \exp[-(y_n - y_{n-1})] - \exp[-(y_{n+1} - y_n)] \tag{15}$$

admits a BT, namely

$$\begin{aligned} \dot{y}_n - \dot{y}'_{n-1} &= \beta\,[\exp\{-(y'_n - y_n)\} - \exp\{-(y'_{n-1} - y_{n-1})\}],\\ \dot{y}'_n - \dot{y}_n &= \beta^{-1}[\exp\{-(y_{n+1} - y'_n)\} - \exp\{-(y_n - y'_{n-1})\}], \end{aligned} \tag{16}$$

was established by Wadati and Toda (1975). BTs for a range of integrable differential-difference as well as integro-differential equations may be conveniently derived via Hirota’s bilinear operator approach (see Rogers & Shadwick, 1982). BTs have by now been constructed for the gamut of known solitonic equations as well as their Painlevé reductions (Gromak, 1999).

The importance of BTs in soliton theory with regard to such aspects as multi-soliton generation, geometric connections, and integrable discretization is well established. Moreover, BTs also have extensive applications in continuum mechanics (Rogers & Shadwick, 1982). Important connections between infinitesimal BTs as originally introduced in a gas dynamics context (Loewner, 1952) and the construction of 2 + 1 dimensional solitonic systems have also been uncovered.
COLIN ROGERS
See also Hirota’s method; Inverse scattering method or transform; N-soliton formulas; Sine-Gordon equation; Solitons
Further Reading
Bäcklund, A.V. 1883. Om ytor med konstant negativ krökning. Lunds Universitets Årsskrift, 19: 1–48
Bianchi, L. 1885. Sopra i sistemi tripli ortogonali di Weingarten. Annali di Matematica, 13: 177–234
Bianchi, L. 1890. Sopra alcune nuove classi di superficie e di sistemi tripli ortogonali. Annali di Matematica, 18: 301–358
Bianchi, L. 1892. Sulla trasformazione di Bäcklund per le superficie pseudosferiche. Rendiconti Lincei, 5: 3–12
Calogero, F. & Degasperis, A. 1982. Spectral Transform and Solitons, Amsterdam and New York: North-Holland
Chen, H.H. 1974. General derivation of Bäcklund transformations from inverse scattering problems. Physical Review Letters, 33: 925–928
Cieśliński, J. 1997. The Darboux–Bianchi transformation for isothermic surfaces. Classical results versus the soliton approach. Differential Geometry and Its Applications, 7: 1–28
Clairin, J. 1902. Sur les transformations de Bäcklund. Annales de l’Ecole Normale Supérieure, 27: 451–489
Darboux, G. 1899. Sur les surfaces isothermiques. Comptes Rendus, 128: 1299–1305
Demoulin, A. 1933. Sur deux transformations des surfaces dont les quadriques de Lie n’ont que deux ou trois points caractéristiques. Bulletin de l’Académie Belgique, 19: 479–501, 579–592, 1352–1363
Gromak, V. 1999. Bäcklund transformations of Painlevé equations and their applications. In The Painlevé Property: One Century Later, edited by R. Conte, New York: Springer
Konopelchenko, B.G. & Rogers, C. 1993. On generalised Loewner systems: novel integrable equations in 2+1 dimensions. Journal of Mathematical Physics, 34: 214–242
Lamb, G.L. Jr. 1971. Analytical descriptions of ultra short optical pulse propagation in a resonant medium. Reviews of Modern Physics, 43: 99–124
Lamb, G.L. Jr. 1974. Bäcklund transformations for certain nonlinear evolution equations. Journal of Mathematical Physics, 15: 2157–2165
Loewner, C. 1952. Generation of solutions of systems of partial differential equations by composition of infinitesimal Bäcklund transformations. Journal d’Analyse Mathématique, 2: 219–242
Rogers, C. & Schief, W.K. 2002. Bäcklund and Darboux Transformations: Geometry and Modern Applications in Soliton Theory, Cambridge and New York: Cambridge University Press
Rogers, C. & Shadwick, W.F. 1982. Bäcklund Transformations and Their Applications, New York: Academic Press
Seeger, A., Donth, H. & Kochendörfer, A. 1953. Theorie der Versetzungen in eindimensionalen Atomreihen III. Versetzungen, Eigenbewegungen und ihre Wechselwirkung. Zeitschrift für Physik, 134: 173–193
Tzitzeica, G. 1910. Sur une nouvelle classe de surfaces. Comptes Rendus, 150: 955–956
Wadati, M. & Toda, M. 1975. Bäcklund transformation for the exponential lattice. Journal of the Physical Society of Japan, 39: 1196–1203
Wahlquist, H.D. & Estabrook, F.B. 1973. Bäcklund transformations for solutions of the Korteweg–de Vries equation. Physical Review Letters, 31: 1386–1390

BAKER–AKHIEZER FUNCTION See Integrable lattices

BAKER MAP See Maps
BALL LIGHTNING
Properties
Ball lightning is an impressive natural phenomenon for which there is as yet no accepted scientific explanation. It consists of flaming balls or fireballs, usually bright white, red, orange, or yellow, which appear unexpectedly, sometimes near the ground following the discharge of a lightning flash, or in midair coming from a cloud. Most observations of ball lightning are associated with thunderstorms, and they exhibit the following more detailed properties: (i) Their shape is usually spherical or spheroidal with diameters between 10 and 50 cm. (ii) They tend to move horizontally. (iii) The observed distribution of lifetimes has a most probable value between 2 and 5 s and an average value of about 10 s or higher (some cases of more than 1 min having been reported). (iv) Ball lightning is bright enough to be clearly seen in daylight, the visible output being in the range 10–150 W (similar to that of a home electric light bulb). (v) Some balls have appeared within aircraft, traveling inside the fuselage along the aisle from front to rear. (vi) There are reports of odors, similar to those of ozone, burning sulfur, or nitric oxide, and of sounds, mainly hisses, buzzes, or flutters. (vii) Most balls decay silently, but some expire with an explosion. (viii) Ball lightning has killed or injured people and animals and damaged trees, buildings, cars, and electric equipment. (ix) Fires have been started, showing that there is something hot inside. In such events, the released energy has been estimated to be between 10 kJ and more than 1 MJ. (x) Ball lightning has never been produced in laboratories, in spite of many attempts and some interesting results, including anode spots and luminous objects that decay very quickly. Consequently, the properties of ball lightning are derived from reports by witnesses, who are often excited by the phenomenon and have no scientific training.

A possibly related phenomenon has been observed in submarines, after a short circuit of the batteries. Balls of plasma that float in air for several seconds have appeared at the electrodes. On these occasions, the current was about 150 kA and the energy was estimated to be between 200 and 400 kJ.
Classification of the Models
Three main characteristics must be accounted for by a successful model but seem very difficult to explain: the tendency toward horizontal motion (hot air or plasma in air tends to rise), relatively long lifetimes, and contradictions among witnesses. For example, some report that balls are cold since they did not feel any warmth when one passed nearby, while others were burned and needed medical care.

The many different models proposed to explain the phenomenon can be classified into two groups, according to whether the energy source is internal or external. In the first group, some are based on plasmoids (equilibrium configurations of plasmas), high-density plasmas with quantum mechanical properties, closed loops of currents confined by their own magnetic field (in some cases, the linking of the currents playing an important role), vortex structures such as whirlwinds or rotating spheres, bubbles containing microwave radiation, chemical reactions or combustion, fractal structures, aerosols, filaments of silicon, carbon nanotubes, nuclear processes, or new physics, and even primordial mini black holes. In the second group, some assume that the balls are powered by electrical discharges or by high-frequency microwaves focused from thunderclouds. None of them is generally accepted.
Chemical and Electromagnetic Models
The association of ball lightning with electrical discharges suggests strongly that it has an electromagnetic nature. However, Michael Faraday argued that ball lightning cannot be an electric phenomenon as it would decay almost instantaneously, in contrast to its observed lifetime of at least several seconds. Finkelstein & Rubinstein (1964) used the time-independent magnetic virial theorem to place a stringent upper limit on the energy of a fireball. This limit has been viewed as a compelling argument against electromagnetic models, stimulating non-electromagnetic chemical approaches.

Recently, aerosol models have received considerable attention. In one model (Abrahamson & Dinniss, 2000), a lightning discharge vaporizes silicon dioxide in the soil that, after interacting with carbon compounds, is transformed into pure silicon droplets of nanometer scale. These droplets become coated with an insulating layer of oxides and are polarized, after which they become aligned with electric fields and form networks of filaments, in loose structures called “fluff balls.” In another model (Bychkov, 2002), the discharges pick up organic material from the soil and transform it into a kind of “spongy ball” that can hold electric charges. Models of this type fail to explain why some balls appear in midair, where there is neither silicon nor organic nor any other similar material.

Electromagnetic models that include chemical effects are promising candidates for an explanation of ball lightning. Indeed, there are now counterarguments to the three main objections that express Faraday’s argument in modern language. These are based on the radiated output, the pinch effect, and magnetic pressure. The first objection is that the power emitted by a plasma of the size of a ball lightning is too high (one liter of air plasma at 15,000 K emits about 5 MW, several orders of magnitude too much).
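To put the numbers in this objection side by side, the following rough sketch scales the quoted figure of 5 MW per liter to balls of the reported sizes, and computes the volume fraction that would have to be hot for the output to match a given reported power. It assumes, purely for illustration, a spherical ball whose hot part radiates at the quoted rate while the rest radiates nothing; the function names are hypothetical.

```python
import math

AIR_PLASMA_W_PER_LITER = 5.0e6  # quoted emission of air plasma at 15,000 K

def ball_volume_liters(diameter_cm):
    """Volume of a sphere in liters (1 L = 1 dm^3), given its diameter in cm."""
    radius_dm = diameter_cm / 20.0
    return 4.0 / 3.0 * math.pi * radius_dm**3

def radiated_power_w(diameter_cm, hot_fraction=1.0):
    """Emitted power if only `hot_fraction` of the volume is hot plasma."""
    return AIR_PLASMA_W_PER_LITER * hot_fraction * ball_volume_liters(diameter_cm)

def hot_fraction_for(diameter_cm, target_power_w):
    """Hot volume fraction needed to radiate `target_power_w`."""
    return target_power_w / radiated_power_w(diameter_cm)

if __name__ == "__main__":
    for d in (10, 20, 30, 50):
        full = radiated_power_w(d)
        frac = hot_fraction_for(d, 100.0)
        print(f"d = {d:2d} cm: fully hot {full:.2e} W, "
              f"hot fraction for 100 W = {frac:.1e}")
```

For a 20 cm ball, a fully hot plasma would radiate roughly 2 × 10⁷ W under these assumptions, so matching an output of order 100 W requires a hot fraction of a few parts per million.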
This may be, however, an indication that most of the ball is at ambient temperature, only a small fraction being hot, concentrated in filamentary structures (such as hot current streamers). If this fraction is of the order of 1 ppm, the radiated output would be of the order of 10–100 W, in agreement with reports.

But the solution to the first problem raises another one. Any plasma current channel inside the ball would be necked and cut in a very short time by the pinch effect (the Lorentz force); thus, a ball structured by such currents could not last long enough. However, in 1958, Chandrasekhar and Woltjer showed in an astrophysical context that plasmas relax to minimum energy states, satisfying the condition that the electric current and the magnetic field are parallel, so that $\nabla \times \mathbf{B} = \lambda \mathbf{B}$, in which there can be no pinch effect since the Lorentz force vanishes. They concluded that such states, known as force-free fields, can confine large amounts of magnetic energy. Although the minimum energy of an uncontained plasma (as in ball lightning) is zero and corresponds to an infinitely expanded magnetic field, an almost force-free condition could be attained first in a very short time at a finite radius, a slow expansion continuing afterward with negligible pinch effect. This could take several seconds.

The third objection is based on the magnetic virial theorem, which states that a system of charges in electromagnetic interactions has no equilibrium state in the absence of external forces, because the large magnetic pressure must produce an explosion with no other force to compensate it. But it is not certain that the fireballs are in equilibrium; they could be just in metastable states with slow evolution, the streamers, moreover, clearly not being in equilibrium themselves. Still more important, the force-free condition annihilates the magnetic pressure or at least reduces it to a much smaller value if the field is almost force-free. Furthermore, the problem needs a much more complex analysis than has been offered up to now. For instance, one must include the thermochemical and quantum effects on the transport processes in the plasma as well as other nonlinear effects. Faddeev and Niemi (2000) have proposed compelling arguments that challenge certain widely held views on plasmas, showing that the virial theorem does allow nontrivial equilibrium states of streamers and electromagnetic fields inside a background of plasma, which are “topologically stable solitons that describe knotted and linked flux tubes of helical magnetic fields.” This kind of configuration was proposed in 1998 by Rañada et al. (2000) in the context of ball lightning.
Therefore, it seems that the virial theorem does not necessarily support Faraday’s view. That ball lightning may contain force-free magnetic configurations of plasmas seems plausible. Because electric conduction in air proceeds through thin channels called streamers, as happens in ordinary lightning, it can be imagined that plasma inside the fireball consists of a self-organized set of metastable, highly conductive, wire-like or filamentary currents. Furthermore, unusually long-lived filaments (even closed loops) in high-density structures have been theoretically predicted and experimentally observed in many plasma systems within a great range of length scales, for instance, in astrophysics, tokamaks, and ordinary discharges in air. Thus, filamentary structures are currently receiving considerable attention.

The strongly nonlinear behavior of a plasma is enhanced when filamentary structures appear, leading to a complex non-isotropic system, which should be studied within a more general theory than ideal magnetohydrodynamics (MHD). Still, the main features of such systems can be described in the framework of resistive MHD. The important dissipative effects depend on the transport coefficients, such as thermal and electrical conductivities, which are highly nonlinear functions of the electromagnetic fields and the temperature, as well as of the chemical and quantum properties. From the point of view of the MHD approximation, the dimensionless parameters of the plasma inside ball lightning may be quite similar to those found in other plasma scenarios, implying stable or metastable currents along a set of closed loops in filamentary structures.

An interesting and unexplored possibility is the establishment, inside the streamers, of a quasi-collision-free highly conductive regime in the direction of the magnetic field, which is strong and parallel to the streamer axis. In such a regime, both the electric and the thermal conductivities would become highly anisotropic. The first would be considerably enhanced along the axis of the streamer. On the other hand, both conductivities would be greatly reduced in the transverse directions, behaving as $1/B^{2}$ according to classical predictions. In this way, the dissipation and the spreading in the streamers would be much smaller than in ordinary regimes and should produce a long-lived strongly magnetized global plasma structure within an intricate stabilizing topology of filamentary currents.

In summary, even though the phenomenon of ball lightning has been known for many years, there is still no accepted theory to explain it, the alternatives currently most favored being the chemical and the electromagnetic models. The latter seem promising now, after recent results showed that some classical objections are not always applicable. As an example, a number of filamentary plasma structures have generated considerable interest, which are similar to stable plasma scenarios observed in nature, for instance, in astrophysics.
These kinds of models could possibly embody chemical and electromagnetic elements.
ANTONIO F. RAÑADA, JOSÉ L. TRUEBA, AND JOSÉ M. DONOSO
See also Helicity; Magnetohydrodynamics; Nonlinear plasma waves

Further Reading
Abrahamson, J. & Dinniss, J. 2000. Ball lightning caused by oxidation of nanoparticle networks from normal lightning strikes on soil. Nature, 403: 519–521
Barry, J.D. 1980. Ball Lightning and Bead Lightning. Extreme Forms of Atmospheric Electricity, New York: Plenum Press
Bychkov, V.L. 2002. Polymer-composite ball lightning. Philosophical Transactions of the Royal Society A, 360: 37–60
Faddeev, L. & Niemi, A.J. 2000. Magnetic geometry and the confinement of electrically conducting plasmas. Physical Review Letters, 85: 3416–3419
Finkelstein, D. & Rubinstein, J. 1964. Ball lightning. Physical Review A, 135: 390
Ohtsuki, Y.-H. (editor). 1988. Science of Ball Lightning (Fire Ball), Singapore: World Scientific
Rañada, A.F., Soler, M. & Trueba, J.L. 2000. Ball lightning as a force-free magnetic knot. Physical Review E, 62: 7181–7190
Singer, S. 1971. The Nature of Ball Lightning, New York and London: Plenum Press
Smirnov, B.M. 1993. Physics of ball lightning. Physics Reports, 224: 151–236
Stenhoff, M. 1999. Ball Lightning: An Unsolved Problem in Atmospheric Physics, Dordrecht and New York: Kluwer
Trubnikov, B.A. 2002. Current filaments in plasmas. Plasma Physics Reports, 28: 312–326
BANACH SPACE See Function spaces

BASIN OF ATTRACTION See Phase space

BAXTER’S Q-OPERATOR See Bethe ansatz

BELOUSOV–ZHABOTINSKY REACTION
In 1950, Boris Pavlovich Belousov was working at the Institute of Biophysics of the Ministry of Public Health of the USSR when he observed that the reaction between citric acid, bromate ions, and ceric ions (as catalyst) produced a regular, periodic, and reproducible change of color between an oxidized state and a reduced state. The reaction oscillated in time, like a chemical clock. His 1951 paper on this study was largely rejected by the scientific community because it seemed to violate the Second Law of thermodynamics.

In 1961, Anatol Zhabotinsky, a Russian postgraduate student guided by his professor, Simon Shnoll, modified the previous reaction by replacing citric acid with malonic acid and adding ferroin sulfate as an indicator. As the oxidized state of ferroin is blue and the reduced one is red, he was able to observe an oscillating temporal reaction with larger oscillating amplitudes. This reaction

$$3\,\mathrm{CH_2(CO_2H)_2} + 4\,\mathrm{BrO_3^-} = 4\,\mathrm{Br^-} + 9\,\mathrm{CO_2} + 6\,\mathrm{H_2O} \tag{1}$$

was named the Belousov–Zhabotinsky (BZ) reaction. The first publications in English recognizing the work of Belousov and Zhabotinsky appeared in 1967, authored by the Danish scientist Hans Degn. However, Zhabotinsky was unable to propose a complete mechanism for the system. Experimentally, the periodic evolution of the potential of the reacting solution can be followed by a potentiometric method, as shown in Figure 1.

Figure 1. A photograph showing the periodic potential obtained between a platinum electrode and a reference electrode immersed in a BZ solution.

In 1972, Richard Field, Endre Körös, and Richard Noyes, at the University of Oregon, studied this mechanism using a bromide-selective electrode to follow the reaction. They proposed 18 steps involving 21 chemical species, using the same principles of chemical kinetics and thermodynamics that govern ordinary chemical reactions; this was the FKN mechanism. In 1974, the same scientists proposed a simplified mechanism with penetrating chemical insight: the “Oregonator” (in honor of the University of Oregon).
Compound:  BrO₃⁻   Organic species   HBrO   HBrO₂   Br⁻   Ce⁴⁺
Notation:  A       B                 P      X       Y     Z

With these notations, the simplified mechanism (the Oregonator) is
• A + Y → X + P,
• A + X → 2X + 2Z,
• X + Y → 2P,
• 2X → A + P,
• B + Z → (f/2)Y + other products.

The second step is fundamental for the observation of oscillations; it is an autocatalytic reaction or retroaction loop. In the fifth step, B represents all oxidizable organic species present, and f is a stoichiometric factor. The kinetic differential equations are:

$$\frac{d[X]}{dt} = k_1[A][Y] + k_2[A][X] - k_3[X][Y] - 2k_4[X]^{2}, \tag{2}$$

$$\frac{d[Y]}{dt} = -k_1[A][Y] - k_3[X][Y] + \frac{f}{2}\,k_5[B][Z], \tag{3}$$

$$\frac{d[Z]}{dt} = 2k_2[A][X] - k_5[B][Z]. \tag{4}$$
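The kinetic equations (2)–(4) translate directly into a right-hand-side function, sketched below. Several things here are assumptions made for illustration and are not stated in the entry: the concentrations `a` and `b` of bromate and organic species, the value `f = 1`, and the assignment of the tabulated values 33.6 and 2.4 × 10⁶ L mol⁻¹ s⁻¹ to the autocatalytic step A + X and to the step X + Y, respectively. The helper `critical_bromide` is a hypothetical name for the bromide level below which the autocatalytic production of X outpaces its removal by Y.

```python
# Rate constants in L mol^-1 s^-1, indexed by the step order used in this entry.
K1 = 1.28    # A + Y -> X + P
K2 = 33.6    # A + X -> 2X + 2Z (autocatalytic); assignment assumed
K3 = 2.4e6   # X + Y -> 2P; assignment assumed
K4 = 3.0e3   # 2X -> A + P
K5 = 1.0     # B + Z -> (f/2) Y + other products

def oregonator_rhs(x, y, z, a=0.06, b=0.02, f=1.0):
    """Mass-action rates d[X]/dt, d[Y]/dt, d[Z]/dt of Equations (2)-(4).

    a and b (mol/L) are treated as constants; their values are assumed.
    """
    dx = K1 * a * y + K2 * a * x - K3 * x * y - 2.0 * K4 * x * x
    dy = -K1 * a * y - K3 * x * y + 0.5 * f * K5 * b * z
    dz = 2.0 * K2 * a * x - K5 * b * z
    return dx, dy, dz

def critical_bromide(a=0.06):
    """[Y] at which autocatalytic gain K2*a*[X] equals the loss K3*[X]*[Y]."""
    return K2 * a / K3

if __name__ == "__main__":
    print("critical [Br-] (mol/L):", critical_bromide())
    print("rates at low bromide:", oregonator_rhs(1e-8, 1e-7, 0.0))
```

Because the rate constants span more than six orders of magnitude, the system is stiff; integrating it in time generally calls for an implicit or otherwise stiff-capable solver rather than a simple explicit scheme.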
The rate constants are:

Rate constant:         k1      k2      k3          k4        k5
Step:                  A + Y   A + X   X + Y       2X        B + Z
Value (L mol⁻¹ s⁻¹):   1.28    33.6    2.4 × 10⁶   3 × 10³   1

This system is clearly nonlinear. Solutions can be obtained by numerical methods, and for 0.5 ≤ f ≤ 2.4, oscillating temporal solutions are observed, which depend on the initial conditions.

According to the Second Law of thermodynamics, all spontaneous chemical changes in a homogeneous and closed system involve a decrease in free enthalpy of this system. If a fluctuation disrupts the system close to equilibrium, the system will return irrevocably to this stable state, making oscillations impossible. Nonetheless, it is possible to observe oscillations when the system is far from equilibrium. One of the striking properties of nonlinear systems is the effect of fluctuations (of the concentrations of intermediates), which can transform an unstable system into new states that are more organized than the initial state. Ilya Prigogine (who was awarded the 1977 Nobel Prize in chemistry for his work on thermodynamics) gave the name “dissipative structures” to such systems to emphasize the importance of irreversible phenomena far from equilibrium.

Continuing to work on oscillating reactions, in 1970, Zhabotinsky published, with Zaikin, a paper that announced the existence of two-dimensional waves in the BZ reaction; also in 1972, Arthur Winfree observed spiral wave patterns in a BZ reaction (see Figure 2). This reaction took place without stirring in a thin (approximately 2 mm thick) layer of reactants poured into a Petri dish. Blue concentric circles (called targets) radiated across the dish on a red background and self-generating spirals appeared. Soon afterward, scientists reported that a blue target center can produce waves of oxidation propagating through the reduced medium, and as the waves advance toward the interior of the center, they transform from red to blue. The period of oscillation is variable, but the speed of propagation is roughly constant.
Thus, the BZ reaction proves to be a stationary spatiotemporal oscillating reaction. With the aid of the FKN mechanism, Field and Noyes showed how to understand the development of such target waves. Diffusion is the transport of species from areas of high concentration to those of low concentration. When a chemical reaction with an autocatalytic step (or feedback-retroaction loop), as in the BZ reaction, is coupled with the diffusion of species, spatial organization phenomena can occur; these are therefore called “reaction-diffusion systems.” In such systems, molecules react chemically with each other when they collide, and as the concentrations of components change, a chemical wave propagates.

Figure 2. Spiral waves in a BZ reaction (Courtesy of A.T. Winfree). See text for details.

In 1984, Oleg Mornev (at the Institute of Theoretical and Experimental Biophysics of the Russian Academy of Sciences) showed that in an infinite plane stationary system of reaction-diffusion oscillations, Snell’s sine law of refraction was verified. Thus the simple rule

$$\frac{\sin\psi_1}{\sin\psi_2} = \frac{v_1}{v_2} \tag{5}$$

dictates the angles $\psi_1$ and $\psi_2$ when waves hit an interface separating two regions with different speeds ($v_1$ and $v_2$) of wave propagation. This result was surprising because reaction-diffusion systems are nonlinear; thus, the medium is an active and an integral part of the wave. In fact, $\psi_1$ cannot be set arbitrarily but is slaved to $\psi_2$ due to the nature of the two regions. In 1993, Zhabotinsky (at the Department of Chemistry, Brandeis University) demonstrated Snell’s law experimentally.

In 1998, Rui Dilão and Joaquim Sainhas (at the Instituto Superior Técnico, Lisbon, Portugal) showed the following constraint in a reaction-diffusion system: after the reaction, the medium must be chemically identical to its initial form. In other words, while the waves are propagating and reactions are taking place, the medium has different properties, but these waves transform the medium back to the original species. By solving reaction-diffusion equations under this constraint, Dilão and Sainhas showed (using computer simulations) that their formulation agrees with experiments, and that mathematically chemical waves obey Snell’s sine law. Continuing to work on the phenomenon of refraction, Mornev is developing formulations that hold both in infinite and finite media.

There are other examples of reaction-diffusion systems. In 1983, Patrick de Kepper (a French chemist in Toulouse) highlighted Turing structures in a ClO₂⁻, I⁻, malonic acid reaction (CIMA reaction). In Turing structures, stationary zones of varying concentrations appear in space. For these observations, the reaction must have steps with retroaction loops containing activators and other steps with inhibitors, and the activators must diffuse more slowly than the inhibitors.

Although the intense study of oscillating chemical reactions and nonlinear dynamics in chemistry is only about 30 years old, its progress has been impressive. Depending on the initial conditions, the BZ reactions can have unpredictable behaviors, even though they are described by deterministic laws. Thus, the BZ reaction belongs to the group of physical or chemical systems that exhibit deterministic chaos.
GÉRARD DUPUIS AND NICOLE BERLAND
See also Brusselator; Chemical kinetics; Fairy rings of mushrooms; Reaction-diffusion systems; Turing patterns; Vortex dynamics in excitable media

Further Reading
Dilão, R. & Sainhas, J. 1998. Wave optics in reaction-diffusion systems. Physical Review Letters, 80: 5216
Epstein, I.R. & Pojman, J.A. 1998. An Introduction to Nonlinear Chemical Dynamics: Oscillations, Waves, Patterns and Chaos, Oxford and New York: Oxford University Press
Field, R.J., Körös, E. & Noyes, R.M. 1972. Journal of the American Chemical Society, 94: 8649
Gray, P. & Scott, S.K. 1990. Chemical Oscillations and Instabilities, Nonlinear Chemical Kinetics, Oxford: Clarendon Press and New York: Oxford University Press
Mornev, O.A. 1984. Elements of the “optics” of autowaves. In Self-Organization of Autowaves and Structures Far from Equilibrium, edited by V.I. Krinsky, Berlin: Springer, pp. 111–118
Zaikin, A.N. & Zhabotinsky, A.M. 1970. Concentration wave propagation in two dimensional liquid phase self oscillating system. Nature, 225: 535–537
Zhabotinsky, A.M. 1964. Biofizika, 9: 306
Zhabotinsky, A.M., Eager, M.D. & Epstein, I.R. 1993. Refraction and reflection of chemical waves. Physical Review Letters, 71: 1526–1529
BENJAMIN–BONA–MAHONY EQUATION See Water waves
BENJAMIN–FEIR INSTABILITY See Wave stability and instability
BENJAMIN–ONO EQUATION See Solitons, types of
BERNOULLI SHIFT See Maps
BERNOULLI’S EQUATION
Bernoulli’s equation is possibly the best-known result in fluid mechanics, and the most frequently abused. Bernoulli’s equation may be viewed as an energy-conservation budget for a fluid particle as it travels up and down the “hills” of potential energy, due to the fields of gravity and the pressure within the fluid, acquiring and relinquishing kinetic energy. In its simplest form, it states that

$$V^{2}/2 + p/\rho + gz = C, \tag{1}$$
where $p$ is the pressure in the fluid of density $\rho$, $V$ is the flow speed, and $z$ is the vertical coordinate. (The flow takes place in a uniform gravitational field of acceleration $g$.) The sum on the left-hand side of Equation (1) is a constant, $C$. The first term is the kinetic energy of the fluid per unit mass, the second and third terms are the potential energy (again per unit mass) in the combined energy “landscape” of pressure and gravity.

The result is credited to Daniel Bernoulli (1700–1782), son of Johann Bernoulli (1667–1748), and to his monograph Hydrodynamica, initiated in 1729 and ultimately published in 1738. The history of the equation is, however, much richer than this simple sequence of events would suggest. Hunter Rouse puts the issues this way in a book containing English translations of the writings of both the son and the father:

“Why [these] works should have been singled out for translation seems at first sight rather obvious, if only because of the frequency with which the name Bernoulli is on the hydraulician’s lips. But it is only Daniel to whom one is making reference, and the word is gradually spreading that the theorem bearing his name is nowhere to be found in his habitually cited Hydrodynamica. Not until the last few years has mention of either the work Hydraulica or its author Johann Bernoulli appeared in the fluids literature with any frequency, and this almost exclusively in the writings of C. Truesdell. It is Truesdell’s thesis that, whereas Daniel has received too much credit for the formulation of the Bernoulli theorem, Johann has received too little.” (Carmody et al., 1968)
A complicated set of circumstances ensued in which both father (Johann) and son (Daniel) sent their manuscripts to Leonhard Euler for comment and this led to Johann’s manuscript, which appears to have been composed later than Daniel’s, being published in the Memoirs of the Imperial Academy of Science in St. Petersburg in 1737 and 1738 (although these were not printed until a decade later). Indeed, Johann’s treatise first appeared in his collected works published in Switzerland in 1743. While Daniel’s treatise has the gist of what we today call Bernoulli’s equation, Johann’s treatment is more mature and complete.
In the form stated in Equation (1), Bernoulli’s equation applies only to steady, constant-density, irrotational flow, that is, to a flow pattern that is unchanging in time and that has no vorticity. More refined versions may be derived. Thus, in a steady, constant-density flow with vorticity, Equation (1) still holds along each streamline, but the “constant” on the right-hand side may vary from streamline to streamline. Indeed, the gradient of this changing “Bernoulli constant,” $\nabla C$, equals the Lamb vector, the vector product of flow velocity and vorticity,
$$\mathbf{V} \times \boldsymbol{\omega} = \nabla C.$$
If the flow is irrotational but unsteady, a version of Bernoulli’s equation again holds, but the constant on the right-hand side of (1) is replaced by (minus) the time derivative of the velocity potential. (In an irrotational flow, the velocity field is the gradient of a scalar known as the velocity potential.) With $\mathbf{V} = \nabla\phi$, where $\phi$ is the velocity potential, we obtain Bernoulli’s equation in the form
$$(\nabla\phi)^2/2 + p/\rho + gz = -\frac{\partial\phi}{\partial t}, \tag{2}$$
which, coupled with Laplace’s equation for the potential of a constant-density irrotational flow,
$$\nabla^2\phi = 0, \tag{3}$$
gives a system of two partial differential equations for the fields p and φ. Bernoulli’s equation in the simplistic form “high flow speed implies low pressure and vice versa” is often applied as a first, crude explanation of many flow phenomena from the ability to balance a ball atop a plume of air to the lift on an airfoil in flight. Some of these explanations are too simplistic, not to say incorrect. Nevertheless, Bernoulli’s equation, when properly applied under the assumptions that ensure its validity, can be an extremely useful and powerful tool of fluid flow analysis. It is remarkable—and important to note—that Bernoulli’s equation (1) is not invariant to a Galilean transformation, ordinarily a prerequisite for a physical law to be useful. Thus, if one wants to use Bernoulli’s equation (1) to calculate the pressure distribution for flow around an object, assuming the velocity field is known, it is essential to do so in a frame of reference in which the flow satisfies the necessary assumptions, in particular, that the flow is steady. The correct result is obtained by carrying out such a calculation in a frame of reference moving with the body. If the calculation is attempted in the “laboratory frame” through which the object is moving, one has to tackle the much more complex version of Bernoulli’s equation given in (2). If the version in Equation (1) is applied, one obtains an incorrect result. HASSAN AREF See also Fluid dynamics
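The frame-of-reference remark above can be illustrated with a minimal sketch. It assumes the classical potential-flow solution for a circular cylinder, whose surface speed in the body frame is V = 2U sin θ (a textbook result that is not derived in this entry), and it neglects gravity. The function names and the sample values of U, ρ, and p∞ are hypothetical and chosen only for illustration.

```python
import math

def surface_pressure_coefficient(theta):
    """Pressure coefficient on a circular cylinder, evaluated in the body frame.

    In the frame moving with the cylinder the flow is steady, so Equation (1)
    (with gravity neglected) gives p + rho*V**2/2 = p_inf + rho*U**2/2.
    Assumption: potential flow, so the surface speed is V = 2*U*sin(theta).
    """
    speed_ratio = 2.0 * math.sin(theta)   # V / U on the surface
    return 1.0 - speed_ratio ** 2         # (p - p_inf) / (rho * U**2 / 2)

def surface_pressure(theta, U, rho, p_inf):
    """Surface pressure from the steady Bernoulli equation in the body frame."""
    return p_inf + 0.5 * rho * U ** 2 * surface_pressure_coefficient(theta)

# Stagnation point (theta = 0) and the point of maximum speed (theta = pi/2).
cp_stagnation = surface_pressure_coefficient(0.0)
cp_shoulder = surface_pressure_coefficient(math.pi / 2)
```

Repeating the calculation with Equation (1) in the laboratory frame, where the flow past the moving cylinder is unsteady, would not reproduce these values; that is the situation in which Equation (2) is required.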
Further Reading
Batchelor, G.K. 1967. An Introduction to Fluid Dynamics, Cambridge: Cambridge University Press
Carmody, T., Kobus, H. & Rouse, H. 1968. Hydrodynamica by Daniel Bernoulli and Hydraulica by Johann Bernoulli, translated from the Latin, with a preface by Hunter Rouse, New York: Dover
Lamb, H. 1932. Hydrodynamics, 6th edition, Cambridge: Cambridge University Press
BERRY’S PHASE
Consider the parallel transport of an orthonormal frame along a line of constant latitude on the surface of a sphere. In going once around the sphere, the frame undergoes a rotation through an angle θ = 2π cos α, where α is the colatitude. This may be shown using the geometry of Figure 1. As is also evident from the figure, this phase shift is purely geometric in character: it is independent of the time it takes to traverse the closed loop. This construction underlies the well-known phase shift exhibited by the Foucault pendulum as the Earth rotates through one full period. Although arising through a dynamical process involving two widely separated time scales (the period of the Earth’s rotation and the oscillation period of the pendulum), the phase shift in this and other examples is now understood in a more unified way. Holonomic effects such as these arise in a host of applications, including superconductivity theory, fiber optic design, magnetic resonance imaging (MRI), amoeba propulsion, robotic locomotion and control, micromotor design, molecular dynamics, rigid-body motion, vortex dynamics in incompressible fluid flows (Newton, 2001), and satellite orientation control. For a survey and further references on the use of phases in locomotion problems, see Marsden & Ostrowski (1998). That the falling cat learns quickly to re-orient itself optimally in mid-flight while maintaining zero angular momentum is a manifestation of the fact that controlling and manipulating a system’s internal, or shape, variables can lead to phase changes in the external, or group, variables. This process can be exploited, and it has deeper connections to the dynamics of Yang–Mills particles moving in their associated gauge field, a link that is the content of the falling cat theorem of Montgomery (1991a) (see further discussion and references in Marsden (1992) and Marsden & Ratiu (1999)).
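The latitude-circle rotation angle quoted above can be checked numerically. The following is a minimal sketch, not the construction of Figure 1: it approximates parallel transport on the unit sphere by repeatedly projecting a tangent vector onto the tangent plane at the next point of the circle and renormalizing. The function name and step count are hypothetical, and the comparison with 2π cos α is made modulo 2π and up to the sign fixed by the direction of traversal.

```python
import math

def holonomy_angle(colatitude, steps=20000):
    """Net rotation of a tangent vector carried once around a latitude circle.

    Assumption: parallel transport is approximated by small projection steps
    (project onto the new tangent plane, then renormalize).
    """
    a = colatitude

    def point(phi):
        return (math.sin(a) * math.cos(phi), math.sin(a) * math.sin(phi), math.cos(a))

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def cross(u, v):
        return (u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0])

    p0 = point(0.0)
    v0 = (math.cos(a), 0.0, -math.sin(a))   # unit tangent pointing south at the start
    v = v0
    for k in range(1, steps + 1):
        p = point(2.0 * math.pi * k / steps)
        c = dot(v, p)
        w = tuple(vi - c * pi for vi, pi in zip(v, p))   # remove the normal component
        norm = math.sqrt(dot(w, w))
        v = tuple(wi / norm for wi in w)
    # Signed angle from the initial to the final vector, measured about p0.
    return math.atan2(dot(p0, cross(v0, v)), dot(v0, v))

alpha = 1.0                                  # colatitude in radians (illustrative value)
numerical = holonomy_angle(alpha)
expected = 2.0 * math.pi * math.cos(alpha)   # formula from the text, modulo 2*pi
```

Because the loop is traversed in a fixed number of steps with no reference to elapsed time, the sketch also makes the purely geometric character of the angle explicit.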
Figure 1. Parallel transport of a frame around a line of latitude. (Panel labels: cut and unroll cone; parallel translate frame along a line of latitude; start; end.)

One can read many of the original articles leading to our current understanding of the geometric phase in the collection edited by Shapere & Wilczek (1989). Problems of this type have a long and complex history dating back to work on the circular polarization of light in an inhomogeneous medium by Vladimirskii and Rytov in the 1930s and by Pancharatnam in the 1950s, who studied interference patterns produced by plates of
an anisotropic crystal. Much of this early history is described in the articles by Michael Berry (Berry, 1988, 1990). The more recent literature was initiated by his earlier articles (Berry, 1984, 1985), which investigated the evolution of quantum systems whose Hamiltonian depends on external parameters that are slowly varied around a closed loop. The adiabatic theorem of quantum mechanics states that for infinitely slow changes of the parameters, the complex wave function, governed by the time-dependent Schrödinger equation, remains at each instant in an eigenstate of the frozen Hamiltonian. At the end of one cycle, when the parameters recur, the wave function returns to its original eigenstate, but with a phase change that is related to the geometric properties of the closed loop. This phase change now goes by the name Berry’s phase. Geometric developments started with the work of Simon (1983) and Marsden et al. (1989). One can introduce a bundle of eigenstates of the slowly varying Hamiltonian, as well as a natural connection on it; the Berry phase is then the bundle holonomy associated with this connection, while the curvature of the connection, when integrated over a closed two-dimensional (2-d) surface in parameter space, gives rise to the first Chern class characterizing the topological twisting of this bundle.
The classical counterpart to Berry’s phase was originally developed by Hannay (1985) (hence the terminology Hannay’s angle) and is most naturally described by considering slowly varying integrable Hamiltonian systems in action-angle form. If we let $(I_1, \ldots, I_n; \theta_1, \ldots, \theta_n)$ represent the action-angle variables of a given integrable system, then the governing Hamiltonian can be expressed as $H(I_1, \ldots, I_n; R(t))$, where $R(t)$ is a slowly varying parameter that cycles through a closed loop in time period $T$, that is,
$$R(t + T) = R(t), \qquad \dot{R}(t) \sim \varepsilon R, \qquad \varepsilon \ll 1.$$
The configuration space for the system is an n-dimensional torus $T^n$ and we seek a formula for the angle variables as the parameter or parameters slowly evolve around the closed loop C in parameter space. The time-dependent system is governed by
$$\dot{I} = \dot{R}(t) \cdot \frac{\partial I}{\partial R}, \tag{1}$$
$$\dot{\theta} = \omega(I) + \dot{R}(t) \cdot \frac{\partial \theta}{\partial R}, \tag{2}$$
where
$$\omega(I) \equiv \frac{\partial H}{\partial I}.$$
Since R is slowly varying, we can average the system around level curves of the frozen (i.e., ε = 0) Hamiltonian. If we let $\langle\,\cdot\,\rangle$ denote this phase-space average, then the averaged canonical system becomes
$$\dot{I} = \dot{R}(t) \cdot \left\langle \frac{\partial I}{\partial R} \right\rangle, \tag{3}$$
$$\dot{\theta} = \omega(I) + \dot{R}(t) \cdot \left\langle \frac{\partial \theta}{\partial R} \right\rangle. \tag{4}$$
The well-known adiabatic theorem guarantees that the action variables are nearly constant, owing to their adiabatic invariance, whereas the angle variables can be integrated over the period T:
$$\theta^i_T = \int_0^T \omega_i(I)\,dt + \int_0^T \dot{R}(t) \cdot \left\langle \frac{\partial \theta^i}{\partial R} \right\rangle dt \tag{5}$$
$$= \Delta\theta_d + \Delta\theta_g. \tag{6}$$
The first term, $\Delta\theta_d$, called the dynamic phase, is due to the frozen system, while the second term, $\Delta\theta_g$, arises from the time variation. This geometric phase can be rewritten in a revealing manner as
$$\Delta\theta_g = \int_0^T \dot{R}(t) \cdot \left\langle \frac{\partial \theta^i}{\partial R} \right\rangle dt \tag{7}$$
$$= \oint_C \left\langle \frac{\partial \theta^i}{\partial R} \right\rangle dR. \tag{8}$$
The contour integral is taken over the closed loop C in parameter space. Although arising through a dynamical process, it is ultimately a purely geometric quantity that results from a delicate balance of two compensating effects in the limit ε → 0. On the one hand, T → ∞ in (7), while on the other, $\dot{R}(t) \to 0$. Their rates exactly balance so that the integral leaves a residual term in the limit ε → 0, as given in (8).
A nice example developed in Hannay (1985) is the bead-on-hoop problem, in which a frictionless bead is constrained to slide along a closed planar wire hoop that encloses area A and has perimeter length L. As the bead slides around the hoop, the hoop is slowly rotated about its vertical axis (which is aligned with the gravitational vector) through one full revolution. We are interested in the angular position of the bead with respect to a fixed point on the hoop after one full revolution of the hoop. When compared with its angular position had the hoop been held fixed (the frozen problem), this angle difference represents the geometric phase and is given by
$$\Delta\theta = -8\pi^2 A/L^2. \tag{9}$$
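The bead-on-hoop formula (9) depends on the hoop only through the ratio A/L², which a few lines of code make concrete. This is a minimal sketch: the helper that measures a polygonal approximation of the hoop, and all function names, are hypothetical additions for illustration.

```python
import math

def polygon_area_and_perimeter(vertices):
    """Area (shoelace formula) and perimeter of a closed planar polygon."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter

def hannay_angle_hoop(area, perimeter):
    """Geometric phase of Equation (9): -8 * pi**2 * A / L**2."""
    return -8.0 * math.pi ** 2 * area / perimeter ** 2

# Circular hoop of radius r: A = pi r^2, L = 2 pi r, so the angle is -2 pi.
r = 0.3
circle_angle = hannay_angle_hoop(math.pi * r ** 2, 2.0 * math.pi * r)

# Square hoop of side s: A = s^2, L = 4 s, so the angle is -pi^2 / 2.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_angle = hannay_angle_hoop(*polygon_area_and_perimeter(square))
```

By the isoperimetric inequality A/L² ≤ 1/(4π), so the magnitude of the angle never exceeds 2π and reaches it only for the circular hoop.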
Montgomery (1991b) shows that, modulo 2π, we have the following rigid-body phase formula:
$$\Delta\theta = -\Lambda + 2ET/R. \tag{10}$$
Figure 2. The geometry of the rigid-body phase formula. (Labels: true trajectory; dynamic phase; geometric phase; horizontal lift; projection to body angular momentum space; periodic orbit of the body angular momentum trajectory; spherical cap.)
Let us explain the notation in this remarkable formula. When a rigid body is freely spinning about its center of mass, one learns in mechanics that this dynamics can be described by the Euler equations, which are equations for the body angular momentum Π. This vector in $\mathbb{R}^3$ moves on a sphere (of radius $R = \|\Pi\|$) and describes periodic orbits (or, exceptionally, heteroclinic orbits). This orbit is schematically depicted by the closed curve on the sphere shown in Figure 2. However, the full dynamics includes the dynamics of the rotation matrix describing the attitude of the rigid body as well as its conjugate momentum. There is a projection from the full dynamic phase space (which is 6-d) to the body angular momentum space (which is 3-d). After one period of the motion on the sphere, the actual rigid-body motion is not periodic; the body has rotated about the spatial angular momentum vector by an angle Δθ, the left-hand side of the above formula. The quantity Λ is the spherical angle subtended by the cap shown in the figure, E is the energy of the trajectory, and T is the period of the closed orbit on the sphere. A detailed history of this formula is given in Marsden & Ratiu (1999).
PAUL K. NEWTON AND JERROLD E. MARSDEN
See also Adiabatic invariants; Averaging methods; Hamiltonian systems; Integrability; Phase space
Further Reading
Berry, M.V. 1984. Quantal phase factors accompanying adiabatic changes. Proceedings of the Royal Society, London A, 392: 45–57
Berry, M.V. 1985. Classical adiabatic angles and quantal adiabatic phase. Journal of Physics A, 18: 15–27
Berry, M.V. 1988. The geometric phase. Scientific American, December, 46–52
Berry, M.V. 1990. Anticipations of the geometric phase. Physics Today, 43(12), 34–40
Hannay, J.H. 1985. Angle variable holonomy in adiabatic excursion of an integrable Hamiltonian. Journal of Physics A, 18: 221–230
Marsden, J.E. 1992. Lectures on Mechanics, Cambridge and New York: Cambridge University Press
Marsden, J.E., Montgomery, R. & Ratiu, T. 1989. Cartan–Hannay–Berry phases and symmetry. Contemporary Mathematics, 97: 279–295
Marsden, J.E. & Ostrowski, J. 1998. Symmetries in motion: geometric foundations of motion control, Nonlinear Science Today. (http://link.springer-ny.com)
Marsden, J.E. & Ratiu, T. 1999. Introduction to Mechanics and Symmetry, 2nd edition, New York: Springer
Montgomery, R. 1991a. Optimal control of deformable bodies and its relation to gauge theory. In The Geometry of Hamiltonian Systems, edited by T. Ratiu, New York: Springer, pp. 403–438
Montgomery, R. 1991b. How much does a rigid body rotate? A Berry’s phase from the 18th century, American Journal of Physics, 59: 394–398
Newton, P.K. 2001. The N-Vortex Problem: Analytical Techniques, New York: Springer, Chapter 5
Shapere, A. & Wilczek, F. (editors). 1989. Geometric Phases in Physics, Singapore: World Scientific
Simon, B. 1983. Holonomy, the quantum adiabatic theorem, and Berry’s phase. Physical Review Letters, 51(24): 2167–2170
BETHE ANSATZ
The Bethe ansatz is the name given to a method for exactly solving quantum many-body systems in one spatial dimension (1-d) or classical statistical lattice models (vertex models) in two spatial dimensions (Baxter, 1982; Korepin et al., 1993). The method was developed by Hans Bethe in 1931 (Bethe, 1931) in order to diagonalize the Hamiltonian of a chain of N spins with isotropic exchange interactions, introduced by Werner Heisenberg some years before as the simplest model for a 1-d magnet. This result was achieved by assuming the wave function to be of the form
$$f(x_1, x_2, \ldots, x_M) = \sum_P A_P \exp\Bigl(i \sum_{j=1}^{M} k_{Pj}\, x_j\Bigr) \tag{1}$$
with the sum performed on all possible permutations P of M distinct wave numbers $\{k_1, \ldots, k_M\}$, corresponding to down spins in the system (Bethe ansatz). By imposing invariance under the physical symmetries of the system (discrete translations and total spin rotations), Bethe obtained conditions on the coefficients $A_P$, which were satisfied if a set of M nonlinear equations (Bethe equations) in M complex parameters (Bethe numbers) were fulfilled. Surprisingly, the wave functions thus constructed were simultaneous eigenfunctions not only of the translation operator, the total spin $S$, and its projection $S_z$ along the z-direction but also of the isotropic Heisenberg Hamiltonian
$$H = \sum_{i=1}^{N} \left( \mathbf{S}_i \cdot \mathbf{S}_{i+1} - \frac{1}{4} \right). \tag{2}$$
The energy and the crystal momentum were expressed as symmetric functions of the Bethe numbers; thus, the eigenvalue problem for H was reduced to the solution of an algebraic problem—solution of the Bethe equations.
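The simplest instance of this reduction is a single down spin (M = 1), where the ansatz (1) is a plane wave. The sketch below assumes periodic boundary conditions (site N + 1 identified with site 1) and uses the standard action of Hamiltonian (2) on the one-down-spin sector, which hops the down spin to either neighbor with amplitude 1/2 and contributes −1 on the diagonal. The function names are hypothetical.

```python
import cmath
import math

def apply_heisenberg_one_magnon(f):
    """Apply H = sum_i (S_i . S_{i+1} - 1/4) to a one-down-spin wave function.

    f[x] is the amplitude for the down spin sitting at site x of a periodic
    chain; the remaining spins are up. Valid for chains with at least 3 sites.
    """
    n = len(f)
    return [0.5 * (f[(x - 1) % n] + f[(x + 1) % n]) - f[x] for x in range(n)]

def one_magnon_check(n_sites, m):
    """Wave number, energy, and eigen-residual for the plane wave exp(i k x)."""
    k = 2.0 * math.pi * m / n_sites          # periodicity condition exp(i k N) = 1
    f = [cmath.exp(1j * k * x) for x in range(n_sites)]
    hf = apply_heisenberg_one_magnon(f)
    energy = math.cos(k) - 1.0               # symmetric function of the single wave number
    residual = max(abs(h - energy * a) for h, a in zip(hf, f))
    return k, energy, residual

k, energy, residual = one_magnon_check(8, 3)
```

For M = 1 the only condition is the quantization of k; for M > 1 the conditions couple the wave numbers to one another, and this coupling is what makes the equations nonlinear.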
This remarkable result was possible because of the existence of additional symmetries of the Heisenberg Hamiltonian, which emerged thanks to the ansatz made by Bethe on the wave function. In this original formulation, the method is known as the coordinate Bethe ansatz. Progress in clarifying the role of symmetries in the Bethe ansatz, the link with integrable systems, and the algebraic aspect of the method was achieved by the Saint Petersburg (formerly Leningrad) School in the course of developing the quantum inverse scattering method (QISM) (Faddeev, 1984; Sklyanin & Faddeev, 1978; Korepin et al., 1993), after the work of Baxter on the integrability of vertex models (Baxter, 1982). In this approach, a key role is played by the monodromy operator defined as $\tau(\lambda) = L_N(\lambda) L_{N-1}(\lambda) \cdots L_1(\lambda)$, with λ being a complex number (spectral parameter) and $L_n(\lambda)$ being the quantum local Lax operator defined for the isotropic Heisenberg model as
$$L_n(\lambda) = \begin{pmatrix} \lambda + iS_n^z & iS_n^- \\ iS_n^+ & \lambda - iS_n^z \end{pmatrix} \tag{3}$$
with $S^+$, $S^-$ raising and lowering spin-1/2 operators. Note that $L_n$ can be viewed as an operator acting on the space $h_n \otimes V$, where $h_n$ ($\equiv \mathbb{C}^2$) is the physical Hilbert space at site n (the space of pairs of complex numbers), and V is an auxiliary space related to the matrix representation of $L_n$ (for the present case, V is also identified with $\mathbb{C}^2$). The product of Lax operators, taken in the auxiliary space, coincides with the usual matrix multiplication, so that the monodromy matrix can be rewritten as
$$\tau(\lambda) = \begin{pmatrix} A(\lambda) & B(\lambda) \\ C(\lambda) & D(\lambda) \end{pmatrix}, \tag{4}$$
with A, B, C, D operators acting on the full physical Hilbert space $\mathcal{H} = \otimes_i h_i$. As is known from QISM (Korepin et al., 1993; Sklyanin & Faddeev, 1978; Faddeev, 1984), the commutation relations between elements of the monodromy matrix can be written in a compact form as
$$R(\lambda - \mu)\,\bigl(\tau(\lambda) \otimes \tau(\mu)\bigr) = \bigl(\tau(\mu) \otimes \tau(\lambda)\bigr)\,R(\lambda - \mu), \tag{5}$$
where R is a 4 × 4 matrix (quantum R-matrix) satisfying the Yang–Baxter equation. From Equation (5), it follows that the trace of the monodromy operator (also known as the transfer matrix) $T(\lambda) \equiv \mathrm{tr}(\tau(\lambda)) = A(\lambda) + D(\lambda)$ gives rise, for different values of the spectral parameter, to an abelian algebra of operators
$$[T(\lambda), T(\mu)] = 0. \tag{6}$$
One can prove that the Hamiltonian is also an element of this algebra so that, for a system of N sites, there are N quantum integrals of motion in involution,
corresponding to Liouville integrability in the classical limit. The diagonalization of the Hamiltonian and the other integrals of motion is thus reduced to the solution of the eigenvalue problem for the transfer matrix T. This problem can be solved by the so-called algebraic Bethe ansatz, a procedure that resembles the algebraic diagonalization of the harmonic oscillator by means of creation and annihilation operators. It relies on the existence of a vector $|\Omega\rangle$ (pseudovacuum) in the Hilbert space, which is annihilated by the operator C of the monodromy matrix:
$$C(\lambda)|\Omega\rangle = 0. \tag{7}$$
For the Heisenberg chain, $|\Omega\rangle$ can be chosen as $|\Omega\rangle = \bigotimes_{i=1}^{N} |\uparrow\rangle_i$, with $|\uparrow\rangle_i$ denoting the spin-up state at site i. From Equation (3) it is clear that $L_n$ acts on the state $|\uparrow\rangle_n$ as a triangular matrix, and the same is true for $\tau(\lambda)$ acting on $|\Omega\rangle$; thus, Equation (7) is automatically satisfied (C plays the role of an annihilation operator). From Equations (3) and (4), it is also evident that $A(\lambda)|\Omega\rangle = (\lambda + i/2)^N |\Omega\rangle$ and $D(\lambda)|\Omega\rangle = (\lambda - i/2)^N |\Omega\rangle$. Moreover, one can show that the operator B in Equation (4) can be used as a creation operator. By taking M different values of the spectral parameter, $\lambda_1, \lambda_2, \ldots, \lambda_M$, one constructs a trial wave function as
$$|\Psi(\lambda_1, \ldots, \lambda_M)\rangle = \prod_{i=1}^{M} B(\lambda_i)|\Omega\rangle. \tag{8}$$
A direct calculation shows that
$$T(\lambda)|\Psi(\lambda_1, \ldots, \lambda_M)\rangle = \Lambda(\lambda)|\Psi(\lambda_1, \ldots, \lambda_M)\rangle + \text{unwanted terms}, \tag{9}$$
where the unwanted terms can be calculated using the commutation relations of A and D with B, obtained from Equation (5). The unwanted terms, however, are eliminated if the $\lambda_i$ are taken as solutions of the Bethe equations that, for the isotropic Heisenberg chain, are of the form
$$\left( \frac{\lambda_\alpha - i/2}{\lambda_\alpha + i/2} \right)^{N} = -\prod_{\beta=1}^{M} \frac{\lambda_\alpha - \lambda_\beta - i}{\lambda_\alpha - \lambda_\beta + i}, \qquad \alpha = 1, 2, \ldots, M. \tag{10}$$
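The algebraic statements above can be verified by brute force on a very short chain. The following sketch builds the monodromy matrix from the Lax operator (3) for N = 3 sites with plain nested lists, then checks the commutativity (6), the annihilation property (7), and the action of A and D on the pseudovacuum. The basis ordering (up, down) at each site, the helper functions, and the sample spectral parameters are assumptions made only for this illustration.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

I2 = [[1, 0], [0, 1]]
SZ = [[0.5, 0], [0, -0.5]]   # basis order at each site: (up, down)
SP = [[0, 1], [0, 0]]        # S^+ maps down to up
SM = [[0, 0], [1, 0]]        # S^- maps up to down

def site_operator(op, n, n_sites):
    """Embed a single-site operator at site n (0-based) of the chain."""
    result = [[1]]
    for site in range(n_sites):
        result = kron(result, op if site == n else I2)
    return result

def monodromy(lam, n_sites):
    """Blocks A, B, C, D of tau(lam) = L_N(lam) ... L_1(lam)."""
    dim = 2 ** n_sites
    one = [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]
    zero = [[0] * dim for _ in range(dim)]
    A, B, C, D = one, zero, zero, one
    for n in range(n_sites):
        sz, sp, sm = (site_operator(op, n, n_sites) for op in (SZ, SP, SM))
        a = mat_add(mat_scale(lam, one), mat_scale(1j, sz))
        b = mat_scale(1j, sm)
        c = mat_scale(1j, sp)
        d = mat_add(mat_scale(lam, one), mat_scale(-1j, sz))
        # Left-multiply the running product by L_n in the auxiliary space.
        A, B, C, D = (mat_add(mat_mul(a, A), mat_mul(b, C)),
                      mat_add(mat_mul(a, B), mat_mul(b, D)),
                      mat_add(mat_mul(c, A), mat_mul(d, C)),
                      mat_add(mat_mul(c, B), mat_mul(d, D)))
    return A, B, C, D

def transfer_matrix(lam, n_sites):
    A, _, _, D = monodromy(lam, n_sites)
    return mat_add(A, D)

def commutator_norm(lam, mu, n_sites):
    """Largest entry (in modulus) of [T(lam), T(mu)]."""
    t1, t2 = transfer_matrix(lam, n_sites), transfer_matrix(mu, n_sites)
    diff = mat_add(mat_mul(t1, t2), mat_scale(-1, mat_mul(t2, t1)))
    return max(abs(x) for row in diff for x in row)

N_SITES = 3
LAM, MU = 0.7 + 0.2j, -1.3 + 0.5j
A, B, C, D = monodromy(LAM, N_SITES)
```

With this ordering the pseudovacuum is the first basis vector, so the first column of C should vanish and the (0, 0) entries of A and D should equal (λ + i/2)^N and (λ − i/2)^N. The check is exponentially expensive in N and is meant only as a consistency test, not as a solution method.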
The set of states obtained from Equation (8) for the solutions of this system of nonlinear equations can be shown to be complete. The diagonalization of $T(\lambda)$, and hence of the Hamiltonian and all the quantum integrals of motion, is thus reduced to the problem of solving the Bethe equations. For finite-size systems, this is a difficult problem to solve due to the nonlinearity of the equations, and one usually resorts to numerical tools. In the thermodynamic limit, however, it is possible to obtain exact results for the energy spectrum by deriving a linear integral equation for the density distribution of the Bethe solutions in the complex plane (this requires, however, an assumption on the nature of the solution known as the string hypothesis). The algebraic Bethe ansatz has been successfully applied to a large class of many-body problems, including anisotropic generalizations of the Heisenberg chain, the Hubbard model, and the Kondo model, and has stimulated a variety of related approaches including Baxter’s Q-operator method (Baxter, 1982) and the notion of quantum groups. Recent progress in the computation of correlation functions of quantum-integrable many-body problems has also been made using the Bethe ansatz (Korepin et al., 1993).
MARIO SALERNO
See also Quantum inverse scattering method; Salerno equation
Further Reading
Baxter, R.J. 1982. Exactly Solved Models in Statistical Mechanics, New York: Academic Press
Bethe, H. 1931. Zur Theorie der Metalle I. Eigenwerte und Eigenfunktionen der linearen Atomkette. Zeitschrift für Physik, 71: 205–226
Faddeev, L.D. 1984. Integrable models in 1+1 dimensional quantum field theory. In Recent Advances in Field Theory and Statistical Mechanics, Les Houches 1982, edited by J.B. Zuber & R. Stora, Amsterdam: North-Holland
Korepin, V.E., Bogoliubov, N.M. & Izergin, A.G. 1993. Quantum Inverse Scattering Method and Correlation Functions, Cambridge and New York: Cambridge University Press, and references therein
Sklyanin, E.K. & Faddeev, L.D. 1978. Quantum mechanical approach to completely integrable field theory models. Soviet Physics Doklady, 23: 902
BIFURCATIONS
Bifurcations are critical events that arise in systems when an external control parameter is varied (Arnol’d et al., 1994). For small values of the parameter, the system typically behaves linearly and has a unique fixed point. As the parameter is changed into ranges where nonlinearity becomes important, instabilities in the form of new fixed points or solutions with qualitatively different dynamical behavior may arise at bifurcations. These critical events are of mathematical and practical interest since they are amenable to analysis and they form organizing centers for observed dynamics (Guckenheimer & Holmes, 1986). As an example, consider the simple physical system of a plastic ruler that is compressed lengthwise between your hands. This was first considered by Leonhard Euler in 1744 and is often referred to as the Euler strut problem (Acheson, 1997). At small forces, the ruler is approximately straight and supports the applied load.
This is called the trivial state of the system. However, as the load increases, buckling takes place so that the ruler is deflected up or down. The straight trivial state becomes unstable and is replaced by a pair of solutions, each of which corresponds to one of the buckled states. If the ruler and the application of the load were both perfect, this would provide a physical example of a symmetry-breaking supercritical pitchfork bifurcation of the type shown in the bifurcation diagram of Figure 1(a). The symbol X represents the deflection of the center of the ruler, which is used as the measure of the state of the system and is plotted as a function of the applied load λ. The symmetry that is broken is the mirror-plane symmetry of the straight ruler. The bifurcation is called supercritical because the nontrivial branches are stable and exist beyond the critical load, where the trivial state has lost its stability; it is termed pitchfork due to its shape. When the bifurcating solutions are instead unstable and coexist with the stable trivial state below the critical value, the bifurcation is called subcritical. A sketch of such a bifurcation is given in Figure 1(b), where an increase in the parameter λ beyond the critical value λc would involve a jump to a large X state. In order to regain the trivial state, λ would then have to be reduced to reach the folded part of the solution branches, and a sudden change back to the trivial state would occur. Hence, hysteresis takes place between the two transitions, and such a path is labeled C in Figure 1(b). The pair of folds in Figure 1(b) are called saddle-node bifurcations (Iooss & Joseph, 1990). A physical example of this is provided by the buckling of an elastic wire such as the outer portion of a bicycle brake cable. When a short length is held vertically, it will stand upright. If you push the remaining length upward through your hand, it will eventually become long enough so that gravity will cause it to fall over through a large angle of deflection.
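As an aside to the buckling examples, the branch structure of Figures 1(a) and (b) can be reproduced from model equations. The sketch below assumes the standard normal forms dx/dt = λx − x³ (supercritical) and dx/dt = λx + x³ − x⁵ (subcritical, with a stabilizing quintic term); these equations are not taken from this entry and are stand-ins for the ruler and the cable, not models of them. In these units the critical value is λc = 0 and the folds lie at λ = −1/4.

```python
import math

def _classify(slope, tol=1e-12):
    """Linear stability of a fixed point from the sign of f'(x)."""
    if abs(slope) < tol:
        return "marginal"
    return "stable" if slope < 0 else "unstable"

def supercritical_fixed_points(lam):
    """Fixed points of dx/dt = lam*x - x**3 (assumed normal form)."""
    points = [0.0]
    if lam > 0:
        r = math.sqrt(lam)
        points += [r, -r]
    return [(x, _classify(lam - 3 * x ** 2)) for x in points]

def subcritical_fixed_points(lam):
    """Fixed points of dx/dt = lam*x + x**3 - x**5 (assumed normal form).

    Nonzero fixed points satisfy u**2 - u - lam = 0 with u = x**2.
    """
    points = [0.0]
    disc = 1.0 + 4.0 * lam
    if disc >= 0:
        for sign in (+1.0, -1.0):
            u = (1.0 + sign * math.sqrt(disc)) / 2.0
            if u > 0:
                r = math.sqrt(u)
                points += [r, -r]
    return [(x, _classify(lam + 3 * x ** 2 - 5 * x ** 4)) for x in points]

beyond_critical = supercritical_fixed_points(1.0)
hysteresis_window = subcritical_fixed_points(-0.1)
```

For −1/4 < λ < 0 the subcritical model has a stable trivial state, a stable pair of large-amplitude states, and an unstable pair between them, which is the coexistence that produces a hysteresis loop when λ is swept up and then down.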
Now pull it back downward through your hand and you will find that the deflected state remains over a range of lengths before flipping back to the vertical. This is an example of a hysteresis loop. The two models in Figures 1(a) and (b) contain a reflection symmetry. If this is not present, the bifurcation is transcritical and an example of such a bifurcation is given in Figure 1(c). In the physical example of the buckling of the ruler, this type of bifurcation would be observed if a constant side load was applied in addition to the end load. In this case, the mid-plane symmetry is automatically broken.

Figure 1. Sketches of (a) supercritical pitchfork, (b) subcritical pitchfork and (c) transcritical bifurcations. Solid lines indicate stable solutions and dashed lines indicate unstable. (Axes: x against λ, with the critical value λc marked in each panel and the hysteresis path C marked in panel (b).)

Of course, in any real system, physical imperfections will be present. These can be taken into account using imperfect bifurcation theory (Golubitsky & Schaeffer, 1985). The effect of an imperfection is to disconnect the supercritical pitchfork bifurcation as shown in Figure 2. It can be seen that there is one state that evolves smoothly with an increase in parameter λ and another disconnected branch. In the example of the ruler, imperfections could arise from irregularities in the shape of the ruler or a slight imbalance in the applied load. The lower limit of the disconnected branch is defined by another type of bifurcation, a saddle node. The disconnected state can be attained either by variation of two parameters (e.g., variation of side load for the Euler strut) or by a discontinuous or sudden jump in the parameter λ. In the latter case, there is a finite chance that the system will land on the disconnected solution. Examples of observations of such behavior in fluid flows are provided by Taylor–Couette flow (Pfister et al., 1988), a flow through a sudden expansion (Fearn et al., 1990), and convection (Arroyo & Saviron, 1992).

Figure 2. Imperfect pitchfork bifurcation. (Axes: x against λ.)

Another very important bifurcation is a Hopf bifurcation, where a simple periodic cycle arises from a fixed point as a parameter is changed. As in the case of pitchforks, Hopf bifurcations may also be super- or subcritical, with hysteresis present in the latter case. An interesting feature of supercritical Hopf bifurcations is that the system takes more and more time to reach the equilibrium state as the bifurcation point nears. The observed long-term dynamics are analogous to critical slowing down (Landau & Lifshitz, 1980) in phase transitions and have been found in a fluid flow (Pfister & Gerdts, 1981), for example. Further interesting global bifurcations (Glendinning, 1994) may occur when pitchfork and Hopf bifurcations occur sequentially. An example of this is shown schematically in Figure 3, where the pair of asymmetric states that arise at the pitchfork P then undergo a pair of Hopf bifurcations at the points labeled H. The cycles that arise at H then join together at a gluing bifurcation (Coullet et al., 1984) when λ is increased. This point is marked G in Figure 3, and it is an example of a homoclinic bifurcation. In this case, a single large orbit is formed from the pair of cycles as λ is increased beyond G, with the period going to infinity exactly at G. Interesting dynamical behavior including chaos can be observed near such points in experiments when physical imperfections are taken into account (Glendinning et al., 2001).

Figure 3. Schematic of a gluing bifurcation sequence. (Axes: x, y, and λ; labeled points: pitchfork P, Hopf bifurcations H, gluing bifurcation G.)

TOM MULLIN
See also Catastrophe theory; Critical phenomena; Equilibrium; Hopf bifurcation; Phase transitions
Further Reading Acheson, D. 1997. From Calculus to Chaos, Oxford and New York: Oxford University Press Arnol’d, V.I.,Afrajmovich, V.S., Il’yashenko,Yu.S. & Shil’nikov, L.P. 1994. Bifurcation Theory and Catastrophe Theory, Berlin and New York: Springer Arroyo, M.P. & Saviron, J.M. 1992. Rayleigh Bénard convection in a small box-spatial features and thermal-dependence of the velocity field. Journal of Fluid Mechanics, 235: 325–348 Coullet, P., Gambaudo, J.M. & Tresser, C. 1984. Une nouvelle bifurcation de codimension 2: le collage de cycles. Comptes Rendu de l’Academie des Sciences de Paris, Ser. I–Mathematische 299: 253–256 Fearn, R.M., Mullin, T. & Cliffe, K.A. 1990. Nonlinear flow phenomena in a symmetric sudden-expansion. Journal of Fluid Mechanics, 211: 595–608 Glendinning, P. 1994. Stability, Instability and Chaos: An Introduction to the Theory of Nonlinear Differential Equations, Cambridge and New York: Cambridge University Press Glendinning, P., Abshagen, J. & Mullin, T. 2001. Imperfect homoclinic bifurcations. Physical Review E, 64: 036208 Golubitsky,M. & Schaeffer,D.G.1985. Singularities and Groups in Bifurcation Theory I, Berlin and New York: Springer Guckenheimer, J. & Holmes, P.J. 1986. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, 2nd edition, Berlin and New York: Springer Iooss, G. & Joseph, D.D. 1990. Elementary Stability and Bifurcation Theory, 2nd edition, Berlin and New York: Springer
BILAYER LIPID MEMBRANES Landau, L.D. & Lifshitz, E.M. 1980. Statistical Physics, part 1, 3rd edition, London: Pergamon Pfister, G. & Gerdts, U. 1981. The dynamics of Taylor wavy vortex flow. Physics Letters A, 83: 23–27 Pfister, G., Schmidt, H. Cliffe, K.A. & Mullin, T. 1988. Bifurcation phenomena in Taylor–Couette flow in a very short annulus. Journal of Fluid Mechanics, 191: 1–18
51 (when light waves reflecting from one layer of soap molecules destructively interfere with light waves reflecting from the second layer of soap molecule) to be about 3–8×10−6 in. thick. (Modern measurements give thicknesses between 5 and 9 nm, depending on the soap solution used.)
BI-HAMILTONIAN STRUCTURE
Origins of the Lipid Bilayer Concept
See Integrable lattices
The recognition of the lipid bilayer as a model for biomembranes dates back to the work of Hugo Fricke, in the 1920s and 1930s, who calculated the thickness of red blood cell (RBC) membranes to be between 3.3 and 11 nm, based on frequency-dependent measurements of the impedance of cell suspensions. Modern measurements on experimental bilayer lipid membranes (BLMs) and biomembranes confirm Fricke’s estimation of the thickness of the plasma membrane (Tien & Ottova, 2000). In his 1917 studies of the molecular organization of fatty acids at the air-water interface, Irving Langmuir had demonstrated that a simple trough apparatus could provide the data to estimate the dimensions of a molecule. Evert Gorter and F. Grendel (respectively, a pediatrician and a chemist) used Langmuir’s trough to determine the area occupied by lipids extracted from red blood cell (from human, pig, or rat sources) “ghosts” (empty membrane sacs) and found that there was enough lipid to form a layer two molecules thick over the whole cell surface. In other words
BILAYER LIPID MEMBRANES When a group of unknown researchers reported the artificial assembly of a bimolecular lipid membrane in vitro (at a 1961 symposium on the plasma membrane sponsored by the American and New York Heart Association), it was initially met with skepticism. The research group led by Donald O. Rudin began their report with a description of mundane soap bubbles, followed by “black holes” in soap films, ending with an invisible “black” lipid membrane made from extracts of cows’ brains. The reconstituted structure (7.5 nm thick) was created just like a cell membrane separating two aqueous solutions. The speaker then said: upon adding one, as yet unidentified, heat-stable compound. . .from fermented egg white. . . to one side of the bathing solutions. . .lowers the resistance. . .by 5 orders of magnitude to a new steady state. . .which changes with applied potential. . .Recovery is prompt. . . the phenomenon is indistinguishable. . . from the excitable alga Valonia. . ., and similar to the frog nerve action potential (Ottova & Tien, 2002).
The first report was published a year later (Mueller et al., 1962). In reaction to that report, the subsequent inventor of liposomes (artificial spherical bilayer lipid membranes) wrote recently in an article entitled “Surrogate cells or Trojan horses” (Bangham, 1995): ... a preprint of a paper was lent to me by Richard Keynes, then Head of the Department of Physiology [Cambridge University], and my boss. This paper was a bombshell ... They [Mueller, Rudin, Tien, and Wescott] described methods for preparing a membrane ... not too dissimilar to that of a node of Ranvier ... The physiologists went mad over the model, referred to as a “BLM”, an acronym for Bilayer or by some for Black Lipid Membrane. They were as irresistible to play with as soap bubbles. Indeed, the Rudin group was playing with soap bubbles using equipment purchased from a local toy shop. But scientific experimentation with soap bubbles began with the observations of Robert Hooke (who coined the word “cell” in 1665 to describe the structure of a thin slice of cork tissue observed through a microscope he had constructed), with his observation of “black spots” in soap bubbles and films. Years later, Isaac Newton estimated the blackest soap film
$$ \frac{\text{surface area occupied (from monolayer experiment)}}{\text{surface area of red blood cell}} \approx 2. \tag{1} $$
Thus, Gorter and Grendel suggested that the plasma membrane of red blood cells may be thought of as a lipid bilayer, with the polar (hydrophilic) head groups oriented outward.
Experimental Realization
The structure of black soap films led to the realization by Rudin and his co-workers in 1960 that a soap film in its final stages of thinning has a structure composed of two fatty acid monolayers sandwiching an aqueous solution, as follows:
air | monolayer | soap solution | monolayer | air.
With the above background in mind, Rudin et al. simply proceeded to make a BLM under an aqueous solution, which may be represented as follows:
aqueous solution | BLM | aqueous solution.
Their effort was successful (Tien & Ottova, 2001, p. 86). Rudin and his colleagues showed that a
BLM formed from brain extracts and separating two aqueous solutions was self-sealing to punctures, with many physical and chemical properties similar to those of biomembranes. Upon modification with a certain compound called excitability-inducing molecule (EIM), this otherwise electrically inert structure became excitable, displaying characteristic features similar to those of action potentials of the nerve membrane. By the early 1970s, it had been determined that an unmodified bilayer lipid membrane separating two similar aqueous solutions is about 5 nm thick and is in a liquid-crystalline state with the following electrical properties: membrane potential ($E_m \approx 0$), membrane resistivity ($R_m \approx 10^9\ \Omega$ cm), membrane capacitance ($C_m \approx 0.5$–$1\ \mu$F cm$^{-2}$), and dielectric breakdown ($V_b > 250{,}000$ V/cm). In spite of its very low dielectric constant ($\varepsilon \approx 2$–$7$), this liquid-crystalline BLM is surprisingly permeable to water (8–24 μm/s) (Tien & Ottova, 2000).
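The quoted electrical figures can be cross-checked with a rough estimate. The sketch below treats the bilayer as an ideal parallel-plate capacitor, which is an assumption made here for illustration and not a model taken from the text; the function names are likewise hypothetical. It converts the 5 nm thickness and the dielectric constant range into a specific capacitance, and the breakdown field into a voltage across the membrane.

```python
# Order-of-magnitude check of the BLM figures, assuming an ideal
# parallel-plate capacitor (an illustrative assumption, not from the text).
EPS0 = 8.854e-12      # vacuum permittivity, F/m
THICKNESS = 5e-9      # bilayer thickness quoted in the text, m


def specific_capacitance_uF_per_cm2(eps_r, d=THICKNESS):
    """Parallel-plate estimate C/A = eps_r * eps0 / d, in microfarads per cm^2."""
    c_per_m2 = eps_r * EPS0 / d       # F/m^2
    return c_per_m2 * 1e6 / 1e4       # F/m^2 -> uF/cm^2


def breakdown_voltage(field_v_per_cm=250_000, d=THICKNESS):
    """Voltage across the bilayer at the quoted breakdown field strength."""
    return field_v_per_cm * 100.0 * d  # V/cm -> V/m, times thickness in m


for eps_r in (2, 7):
    print(eps_r, specific_capacitance_uF_per_cm2(eps_r))
print(breakdown_voltage())
```

Under this assumption the dielectric constant range 2 to 7 gives roughly 0.35 to 1.2 μF cm−2, the same order as the quoted 0.5–1 μF cm−2, and the breakdown field corresponds to about an eighth of a volt across the 5 nm film.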
The Lipid Bilayer Principle
In spite of their variable compositions, the fundamental structural element of all biomembranes is a liquid-crystalline phospholipid bilayer. Thus, the lipid bilayer principle of cell or biological membranes may be summarily stated as follows: all living organisms are made of cells, and every cell is enclosed by a plasma membrane, the indispensable component of which is a lipid bilayer. The key property of lipid bilayer-based cells is that they are separated from the environment by a permeability barrier that allows them to preserve their identity, take up nutrients, and remove waste. This 5 nm thick liquid-crystalline lipid bilayer serves not only as a physical barrier but also as a conduit for transport, a reactor for energy conversion, a transducer for signal processing, a bipolar electrode for redox reactions, or a site for molecular recognition. The liquid-crystalline lipid bilayer of biomembranes not only provides the physical barrier separating the cytoplasm from its extracellular surroundings, it also separates organelles inside the cell to protect important processes and events. More specifically, the lipid bilayer of the cell membrane must prevent its molecules of life (genetic materials and many proteins) from diffusing away. At the same time, the lipid bilayer must keep out foreign molecules that are harmful to the cells. To be viable, the cell must also communicate with the environment to continuously monitor the external conditions and adapt to them. Further, the cell needs to pump in nutrients and release toxic products of its metabolism. How does the cell carry out all of these multi-faceted activities? A brief answer is that the cell depends on its lipid-protein-carbohydrate complexes (i.e., glycoproteins,
proteolipids, glycolipids, etc.) embedded in the lipid bilayer to gather information about the environment in various ways. Examples include communication with hundreds of other cells about a variety of vital tasks such as growth, differentiation, and death (apoptosis). Glycoproteins are responsible for regulating the traffic of material to and from the cytoplasmic space. Paradoxically, the intrinsic structure of cell membranes creates a bumpy obstacle to these vital processes of intercellular communication. The cell shields itself behind its lipid bilayer, which is virtually impermeable to all ions (e.g., Na+, K+, Cl−) and most polar molecules (except H2O). This barrier must be overcome, however, for a cell to inform itself of what is happening in the world outside, as well as to carry out vital functions. Thus, over millions and millions of years of evolution, the liquid-crystalline lipid bilayer, besides acting as a physical restraint, has been modified to serve as a conduit for material transport, as a reactor for energy conversion, as a bipolar electrode for redox reactions, as a site for molecular recognition, and for other diverse functions such as apoptosis and signal transduction. Insofar as membrane transport is concerned, cells make use of three approaches: simple diffusion, facilitated diffusion, and active transport. Although simple diffusion is an effective transport mechanism for some substances such as water, the cell must make use of other mechanisms for moving substances in and out of the cell. Facilitated diffusion utilizes membrane channels to pass charged molecules, which otherwise could not diffuse across the lipid bilayer. These channels are especially useful with small ions such as K+, Na+, and Cl−. The number of protein channels available limits the rate of facilitated transport, whereas the speed of simple diffusion is controlled by the concentration gradient.
Under active transport, the expenditure of energy is necessary to translocate the molecule from one side of the lipid bilayer to the other, against the concentration gradient. Similar to facilitated diffusion, active transport is limited by either the capacity of membrane channels or the number of carriers present. Today, ion channels are found ubiquitously. To name a few, they are in the plasma membrane of sperm, bacteria, and higher plants; the sarcoplasmic reticulum of skeletal muscle, nerve membrane, synaptic vesicle membranes of rat cerebral cortex, and the skin of carp. As a weapon of attack, many toxins released by living organisms such as dermonecrotic toxin, hemolysin, brevetoxin, and bee venom are polypeptide-based ion-channel formers. For example, functioning of membrane proteins, in particular, ionic channels, can be modulated by alteration of their arrangement in membranes (e.g., electroporation, Tien & Ottova, 2003). At the membrane level, most cellular activities involve some kind of lipid bilayer-based
receptor-ligand contact interactions. Outstanding examples among these are ion-sensing, molecular recognition (e.g., antigen-antibody binding and enzyme-substrate interaction), light conversion and detection, gated channels, and active transport. The development of self-assembled bilayer lipid membranes (BLMs and liposomes) has made it possible to investigate directly the electrical properties and transport phenomena across a 5 nm thick biomembrane element separating two aqueous phases. A modified or reconstituted BLM is viewed as a dynamic structure that changes in response to environmental stimuli as a function of time, as described by the so-called dynamic membrane hypothesis. Under this hypothesis, each type of receptor interacts specifically with its own ligand. That is, the so-called G-receptor is usually coupled to a guanosine nucleotide-binding protein that in turn stimulates or inhibits an intracellular, lipid bilayer-bound enzyme. G-protein-linked receptors mediate the cellular responses to a vast variety of signaling molecules, including local mediators, hormones, and neurotransmitters, which are as varied in structure as they are in function. G-protein-linked receptors usually consist of a single polypeptide chain, which threads back and forth across the lipid bilayer up to seven times. The members of this receptor family have a similar amino acid sequence and functional relationship. The binding sites for G-proteins have been reported to be the second and third intracellular loops and the carboxy-terminal tail. The endogenous ligands, such as hormones and neurotransmitters, and exogenous stimulants, such as odorants, belonging to this class are important target analytes for biosensor technology (Tien & Ottova, 2003).
H. Ti. TIEN AND ANGELICA OTTOVA-LEITMANNOVA
See also Langmuir–Blodgett films; Nerve impulses; Neurons
Further Reading
Bangham, A.D. 1995. Surrogate cells or Trojan horses. BioEssays, 17: 1081–1088
Mueller, P., Rudin, D.O., Tien, H.T. & Wescott, W.C. 1962. Reconstitution of cell membrane structure in vitro and its transformation into an excitable system. Nature, 194: 979–980
Ottova, A. & Tien, H.T. 2002. The 40th anniversary of bilayer lipid membrane research. Bioelectrochemistry, 56: 171–173
Ottova, A., Tvarozek, V. & Tien, H.T. 2003. Supported BLMs. In Planar Lipid Bilayers (BLMs) and Their Applications, edited by H.T. Tien & A. Ottova-Leitmannova, Amsterdam: Elsevier
Tien, H.T. 1974. Bilayer Lipid Membranes (BLM): Theory and Practice, New York: Marcel Dekker
Tien, H.T. & Ottova, A.L. 2000. Membrane Biophysics: As Viewed from Experimental Bilayer Lipid Membranes (Planar Lipid Bilayers and Spherical Liposomes), Amsterdam and New York: Elsevier Science
Tien, H.T. & Ottova, A. 2001. The lipid bilayer concept and its experimental realization: from soap bubbles, the kitchen sink, to bilayer lipid membranes. Journal of Membrane Science, 189: 83–117
Tien, H.T. & Ottova, A. 2003. The bilayer lipid membrane (BLM) under electrical fields. IEEE Transactions on Dielectrics and Electrical Insulation, 10(5): 717–727
BILLIARDS
In mathematical physics, the singular noun “billiards” denotes a dynamical system corresponding to the inertial motion of a point mass within a region that has a piecewise smooth boundary. The reflections from the boundary are taken to be elastic; that is, the angle of reflection equals the angle of incidence. This model arises naturally in optics, acoustics, and classical and statistical mechanics. In fact, two fundamental models in statistical mechanics, the gas of hard spheres (Boltzmann gas) and the Lorentz gas, are billiards. The billiards concept occupies a central position in nonlinear physics because it provides ideal visible models for analysis of dynamical properties leading to classical chaos and an ideal testing ground for the semiclassical analysis of quantum systems. Billiards models are Hamiltonian systems. Hence, the phase volume is preserved under the dynamics, and the system can be studied in the framework of ergodic theory. The boundary of the billiard region is assumed to be only piecewise smooth; that is, it consists of smooth components. Therefore, the dynamics of billiards is not defined for orbits that hit singular points of the boundary. However, the phase volume of such orbits equals zero. The dynamics of billiards is completely defined by the shape of its boundary. A smooth component of the boundary is called dispersing, focusing, or neutral if it is convex inward, convex outward with respect to the billiard region, or flat (has zero curvature), respectively. Any billiard orbit is a broken line in its configuration space. The classical examples of integrable billiards are provided by circular and elliptical boundaries. Configuration spaces of these billiards are foliated by caustics, which are smooth curves (or surfaces) γ such that if one link of the billiard orbit is tangent to γ, then every other link of this orbit is tangent to γ.
Billiards in a circle has one family of caustics formed by (smaller) concentric circles, while billiards in an ellipse has two families of caustics (confocal ellipses and confocal hyperbolas), which are separated by orbits such that each link passes through a focus of the ellipse. Birkhoff’s conjecture (Birkhoff, 1927) claims that among all billiards inside smooth convex curves, only billiards in ellipses are integrable. Berger (1990) has shown that in three dimensions (d = 3), only billiards in ellipsoids produce foliations of a billiard region by smooth convex caustics. However, this does not imply that only billiards in ellipsoids are integrable, because if a billiard in d > 2 has an invariant hypersurface, then this hypersurface does not necessarily consist of
rays tangent to some hypersurface in the configuration space. Using KAM theory, Lazutkin has shown that if a billiard boundary is a strictly convex, sufficiently smooth curve whose curvature never vanishes, then there exists an uncountable number of smooth caustics in the vicinity of the boundary, and moreover, the phase volume of the orbits tangent to these caustics is positive (Lazutkin, 1991). An opposite situation occurs when a boundary is everywhere dispersing. Such models were introduced by Sinai (1970) in his seminal paper, and they are called Sinai (or dispersing) billiards (Figure 1(a)). Sinai billiards have the strongest chaotic properties; that is, they are ergodic, mixing, have a positive metric entropy, and are Bernoulli systems. If a (narrow) parallel beam of rays is made to fall onto a dispersing boundary, then after reflection it becomes divergent and, therefore, the distance between the rays in this beam increases with time. It is the mechanism of dispersing that generates sensitive dependence on initial conditions (hyperbolicity) and is responsible for strong chaotic properties of dispersing billiards. On the other hand, focusing boundaries produce the opposite effect. Indeed, a narrow parallel beam of rays after reflection from the focusing boundary becomes convergent; that is, the distance between rays in such a beam decreases with time. Therefore, it has been the general understanding that a dispersing boundary always produces chaotization of the dynamics, while a focusing boundary produces stabilization of the dynamics. However, there exists another mechanism of chaos in billiards (and, in general, in Hamiltonian systems), which is called defocusing (Bunimovich, 1974, 1979). The point is that a narrow parallel beam of rays, after focusing because of reflection from a focusing boundary, may become divergent provided that a free path between two consecutive reflections from the boundary is long enough.
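A small calculation makes the dispersing and defocusing mechanisms concrete. The sketch below relies on the standard mirror equation of geometric optics for the curvature of a narrow beam's wavefront, which is brought in here as an assumption since the text does not state it: a reflection adds 2K/cos φ to the wavefront curvature, where K is the boundary curvature (positive for dispersing, negative for focusing) and φ is the angle of incidence measured from the normal. The function names are hypothetical.

```python
import math


def reflect(kappa_in, boundary_curvature, phi):
    """Wavefront curvature just after reflection (mirror equation).

    boundary_curvature > 0: dispersing; < 0: focusing; 0: flat.
    phi: angle of incidence measured from the normal.
    Positive wavefront curvature means a divergent beam.
    """
    return kappa_in + 2.0 * boundary_curvature / math.cos(phi)


def free_flight(kappa, t):
    """Wavefront curvature after travelling a distance t without reflection."""
    return kappa / (1.0 + t * kappa)


phi = 0.3          # arbitrary angle of incidence, radians
r = 1.0            # radius of a circular focusing arc

# Dispersing arc: a parallel beam (curvature 0) leaves divergent.
print(reflect(0.0, +1.0 / r, phi))

# Focusing arc: the beam leaves convergent and focuses after a distance -1/kappa.
kappa = reflect(0.0, -1.0 / r, phi)
focal_distance = -1.0 / kappa
chord = 2.0 * r * math.cos(phi)   # free path to the next hit on the same circle
print(focal_distance, chord, free_flight(kappa, chord))
```

For a circular arc the focus falls a quarter of the way along the chord, so the beam converges for one quarter of the free path and diverges for the remaining three quarters, and it reaches the next reflection with positive (divergent) curvature.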
Assuming that the time of divergence exceeds (averaged over all orbits) the time of convergence, one obtains chaotic billiards. One of the first, and the most famous, examples of such billiards is called a stadium (Figure 1(b)). One obtains a stadium by cutting a circle into two semicircles and connecting them by two common tangent segments. The length of these segments can be arbitrarily small, which demonstrates that the mechanism of defocusing can work under small deformations of even the integrable (circular) billiards. Focusing billiards can have chaotic properties as strong as those of Sinai’s billiards (Bunimovich, 2000).
There are no other mechanisms of chaos in billiards. Indeed, billiards in polygons and polyhedrons have zero metric entropy (Boldrighini et al., 1978). Nevertheless, a typical billiard in a polygon is ergodic (Kerckhoff et al., 1986). Because focusing components can form parts of the boundary of integrable as well as chaotic billiards, a natural question is whether there are some restrictions. Two classes of focusing components admissible in chaotic billiards were found (Wojtkowski, 1986; Markarian, 1988). The most general class of such focusing components is formed by absolutely focusing mirrors (AFM) (Bunimovich, 1992). AFMs form a new notion in geometric optics. A mirror γ (or a smooth component of a billiard’s boundary) is called absolutely focusing if any narrow parallel beam of rays that falls on γ becomes focused after its last reflection in a series of consecutive reflections from γ. Observe that a mirror is focusing if any parallel beam of rays becomes focused just after the first reflection from this mirror. AFMs can also be characterized in terms of their local properties (Donnay, 1991; Bunimovich, 1992). Generic Hamiltonian systems are neither integrable nor chaotic. Instead, their phase spaces get divided into KAM-islands and chaotic sea(s). The only clear and clean example of this phenomenon is a billiard in a mushroom (Bunimovich, 2001). The mushroom consists of a semicircular hat sitting on a foot (Figure 1(c)). A mushroom becomes a stadium when the width of the foot equals the width of the hat. Clearly, the mechanism of dispersing works in more than two dimensions as well (Sinai, 1970). This is not at all obvious for the mechanism of defocusing because of astigmatism. However, chaotic focusing billiards also exist in dimension d ≥ 3 (Bunimovich & Rehacek, 1998). But one pays the price of astigmatism by not allowing the focusing component to be as large as it can be in d = 2.
Many properties of classical dynamics of billiards are closely related to the properties of the corresponding quantum problem. Consider the Schrödinger equation with a potential equal to zero inside the billiard region and equal to infinity on the boundary. The eigenfunctions become uniformly distributed over the regions of ergodic billiards for high wave numbers (Shnirelman, 1991). On the contrary, there exist infinite series of eigenfunctions localized in the vicinity of convex caustics of billiards (Lazutkin, 1991).
LEONID BUNIMOVICH
See also Ergodic theory; Horseshoes and hyperbolicity in dynamical systems; Lorentz gas
Further Reading
Figure 1. (a) Sinai billiard. (b) Stadium. (c) Mushroom.
Berger, M. 1990. Sur les caustiques de surfaces en dimension 3. Comptes Rendus de l’Académie des Sciences, 311: 333–336
Birkhoff, G. 1927. Dynamical Systems. New York: American Mathematical Society
Boldrighini, C., Keane, M. & Marchetti, F. 1978. Billiards in polygons. Annals of Probability, 6: 532–540
Bunimovich, L.A. 1974. On billiards close to dispersing. Mathematical USSR Sbornik, 95: 49–73 (originally published in Russian)
Bunimovich, L.A. 1979. On the ergodic properties of nowhere dispersing billiards. Communications in Mathematical Physics, 65: 295–312
Bunimovich, L.A. 1992. On absolutely focusing mirrors. In Ergodic Theory and Related Topics, edited by U. Krengel, et al., Berlin and New York: Springer, pp. 62–82
Bunimovich, L.A. 2000. Billiards and other hyperbolic systems with singularities. In Dynamical Systems, Ergodic Theory and Applications, edited by Ya. G. Sinai, Berlin: Springer
Bunimovich, L.A. 2001. Mushrooms and other billiards with divided phase space. Chaos, 11: 802–808
Bunimovich, L.A. & Rehacek, J. 1998. How many dimensional stadia look like. Communications in Mathematical Physics, 197: 277–301
Donnay, V. 1991. Using integrability to produce chaos: billiards with positive entropy. Communications in Mathematical Physics, 141: 225–257
Kerckhoff, S., Mazur, H. & Smillie, J. 1986. Ergodicity of billiard flows and quadratic differentials. Annals of Mathematics, 124: 293–311
Lazutkin, V.F. 1991. The KAM Theory and Asymptotics of Spectrum of Elliptic Operators. Berlin and New York: Springer
Markarian, R. 1988. Billiards with Pesin region of measure one. Communications in Mathematical Physics, 118: 87–97
Shnirelman, A.I. 1991. On the asymptotic properties of eigenfunctions in the regions of chaotic motion. Addendum in The KAM Theory and Asymptotics of Spectrum of Elliptic Operators by V.F. Lazutkin. Berlin and New York: Springer
Sinai, Ya. G. 1970. Dynamical systems with elastic reflections. Ergodic properties of dispersing billiards. Russian Mathematical Surveys, 25: 137–189 (originally published in Russian 1970)
Wojtkowski, M. 1986. Principles for the design of billiards with nonvanishing Lyapunov exponents. Communications in Mathematical Physics, 105: 391–414
BINDING ENERGY
When two particles form a bound state under a certain kind of physical interaction, the resulting state has an energy smaller than the sum of the rest energies of its constituent elements. That is why, by definition (or one could say by construction), bound states are ones in which work has to be done to separate the constituents. The energy (equivalently, the work) that one has to provide in order to separate a bound state into its elements is called the binding energy, $E_b$, and from the above, it can be directly inferred that $E_b > 0$. The mass equivalent to this energy (under the Einstein relation) also bears a name and is called the “mass defect,” $\Delta m = E_b/c^2$, where c is the speed of light. Examples of binding energy can be easily found among the fundamental forces in nature, such as the gravitational force, the electromagnetic force, and the nuclear force.
Considering an approximately circular (in reality, elliptical) motion of the Earth around the Sun, and equating the gravitational force $F_g = G M_s M_e / R^2$ (where G is the gravitational constant, the subscripts s and e denote Sun and Earth, respectively, and R is their relative distance) with the centripetal force $F_c = M_e v^2 / R$, one obtains $v = \sqrt{G M_s / R}$. This leads to a kinetic energy $E_k = G M_e M_s / (2R)$, which, combined with the potential energy $E_p = -G M_e M_s / R$, results in a binding energy for the Sun–Earth system of the form
$$ E_b = G \frac{M_e M_s}{2R}. \tag{1} $$
Using the relevant masses for the Earth and Sun and their separation, this quantity can be approximately calculated as $E_b \approx 2.6 \times 10^{33}$ J ($\Delta m \approx 2.9 \times 10^{16}$ kg). However, what actually matters in terms of physical “observability” is the ratio of the mass defect to the bound state mass, $\Delta m / M_b$ (the closer this ratio is to 1, the greater the possibility of observing the mass defect). In the case of the gravitational system, $\Delta m / M_b \approx 1.5 \times 10^{-14}$; hence, the mass defect for the gravitational force will not be observable. Similar calculations can be performed classically for the hydrogen atom (following the same path, but substituting $G \to 1/(4\pi\varepsilon_0)$, $M_s \to |q_e|$, and $M_e \to q_e$, where $\varepsilon_0$ is the dielectric constant of the vacuum and $q_e$ is the charge of the electron). In this case, for the electrostatic force,
$$ E_b = \frac{1}{2} \frac{q_e^2}{4\pi\varepsilon_0 R}. \tag{2} $$
In this case, however, $R \approx 0.53 \times 10^{-10}$ m (while in the previous example, it was $\approx 1.5 \times 10^{11}$ m!). In the case of the hydrogen atom, $E_b \approx 13.6$ eV and $\Delta m \approx 2.5 \times 10^{-35}$ kg. The ratio $\Delta m / M_b \approx 1.5 \times 10^{-8}$, indicating that in this case also it is not possible to observe the mass defect. In the case of the nuclear force, however, the ratio $\Delta m / M_b$ is of order $10^{-3}$, and hence it is possible to observe the mass defect. For example, the mass of an α particle consisting of two protons and two neutrons is $6.6447 \times 10^{-27}$ kg, while the individual masses of these particles add up to $6.6951 \times 10^{-27}$ kg, resulting in a binding energy of 28.3 MeV and $\Delta m / M_b \approx 0.0075$. In fact, a very common diagram in nuclear physics is the so-called nuclear binding energy curve (see, e.g., http://hyperphysics.phy-astr.gsu.edu/hbase/nucene/nucbin.html), which shows the binding energy per nucleon of various elements as a function of their mass number. In this graph, the larger the binding energy per nucleon, the more stable the nucleus; iron (with mass number A = 56 and binding energy 8.8 MeV/nucleon) is the most stable element. Lighter elements can yield energy by fusion, while heavier elements can yield energy by means of fission, emitting energies in the MeV range.
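The three estimates can be reproduced with a few lines of arithmetic. The sketch below uses rounded textbook values for the physical constants and for the solar, terrestrial, and proton masses; these inputs are assumptions supplied here, while the α-particle masses are the ones quoted in the text. The bound-state mass is approximated by the Sun's mass plus the Earth's mass in the gravitational case and by the proton mass for hydrogen.

```python
import math

# Rounded constants (assumed values, SI units).
G = 6.674e-11          # gravitational constant
C = 2.998e8            # speed of light
EPS0 = 8.854e-12       # vacuum permittivity
Q_E = 1.602e-19        # elementary charge (also joules per eV)
M_SUN = 1.989e30
M_EARTH = 5.972e24
R_ORBIT = 1.496e11     # mean Sun-Earth distance
M_PROTON = 1.673e-27
R_BOHR = 0.529e-10     # radius used for the classical hydrogen estimate

# Gravitational case: E_b = G M_e M_s / (2R).
e_grav = G * M_EARTH * M_SUN / (2.0 * R_ORBIT)
dm_grav = e_grav / C**2
ratio_grav = dm_grav / (M_SUN + M_EARTH)

# Electrostatic case: E_b = q_e^2 / (8 pi eps0 R).
e_hyd = Q_E**2 / (8.0 * math.pi * EPS0 * R_BOHR)
e_hyd_ev = e_hyd / Q_E
dm_hyd = e_hyd / C**2
ratio_hyd = dm_hyd / M_PROTON

# Nuclear case: alpha particle, masses as quoted in the text.
m_alpha = 6.6447e-27
m_parts = 6.6951e-27
dm_alpha = m_parts - m_alpha
e_alpha_mev = dm_alpha * C**2 / Q_E / 1e6
ratio_alpha = dm_alpha / m_alpha

print(e_grav, dm_grav, ratio_grav)
print(e_hyd_ev, dm_hyd, ratio_hyd)
print(e_alpha_mev, ratio_alpha)
```

The printed ratios span roughly eleven orders of magnitude, which is the point of the comparison: only in the nuclear case is the mass defect a measurable fraction of the bound-state mass.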
Binding Energy in Nonlinear Systems
Naturally, bound states of multiple waves can be formed in nonlinear systems. To fix ideas, we will examine such bound states and their corresponding binding energies in the specific context of the well-known sine-Gordon equation. For a detailed exposition of the features and applications of this equation, see Dodd et al. (1982). The sine-Gordon equation in (1 + 1) dimensions is
$$ u_{tt} = u_{xx} - \sin(u). \tag{3} $$
Perhaps the best-known nonlinear wave solution of this equation is the topological soliton (kink), which is of the form (in the static case)
$$ u(x) = 4 \tan^{-1}\left(e^{sx}\right), \tag{4} $$
$s \in \{-1, 1\}$, where the case of s = 1 corresponds to a kink, while s = −1 corresponds to an antikink. The energy of such a static kink solution,
$$ E = \int_{-\infty}^{\infty} \left[ \frac{1}{2}\left(u_t^2 + u_x^2\right) + 1 - \cos(u) \right] dx, \tag{5} $$
can be calculated as E = 8. Another elemental solution of the equation is the breather-like solution of the form
$$ u(x, t) = 4 \tan^{-1}\left[ \frac{(1-\omega^2)^{1/2} \sin(\omega t)}{\omega \cosh\left((1-\omega^2)^{1/2} x\right)} \right]. \tag{6} $$
This solution, exponentially localized in space and periodic in time, can be considered as the result of a merger of a kink and an antikink. Hence, this is perhaps the simplest example of a bound state in this nonlinear system. The bound state character of this solution can also be revealed by the expression for its energy. Using expression (6) in Equation (5), we obtain
$$ E_{\text{breather}} = 16 (1 - \omega^2)^{1/2}. \tag{7} $$
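Both energy values can be checked numerically. The following sketch evaluates the energy integral of Equation (5) with a trapezoidal sum, using central finite differences for the derivatives; the grid size, domain width, difference step, and sample time are arbitrary choices made for this illustration, and the function names are hypothetical.

```python
import math


def kink(x, t, s=1):
    """Static kink (s = 1) or antikink (s = -1); t is unused."""
    return 4.0 * math.atan(math.exp(s * x))


def breather(omega):
    """Return the breather u(x, t) with internal frequency omega in (0, 1)."""
    beta = math.sqrt(1.0 - omega**2)

    def u(x, t):
        return 4.0 * math.atan(beta * math.sin(omega * t) / (omega * math.cosh(beta * x)))

    return u


def energy(u, t=0.3, half_width=40.0, n=8000, h=1e-5):
    """Trapezoidal estimate of E = int [(u_t^2 + u_x^2)/2 + 1 - cos u] dx."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * dx
        u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)   # central difference in t
        u_x = (u(x + h, t) - u(x - h, t)) / (2.0 * h)   # central difference in x
        density = 0.5 * (u_t**2 + u_x**2) + 1.0 - math.cos(u(x, t))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * density * dx
    return total


print(energy(kink))                                   # compare with 8
print(energy(breather(0.5)), 16.0 * math.sqrt(0.75))  # compare with Equation (7)
```

The breather energy does not depend on the sample time t, since the energy is conserved; changing `t` in the call is a quick consistency check. For frequencies close to 1 the breather becomes wide, and `half_width` would need to be increased accordingly.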
Hence, this energy, for any ω ∈ (0, 1) (ω is the frequency of the internal breathing oscillation), is less than the sum of the kink and antikink energies, verifying that the binding energy of such a state is
$$ E_b = 16 \left[ 1 - (1 - \omega^2)^{1/2} \right]. \tag{8} $$
It is also worthwhile to note that the energy of such a breather excitation varies in the interval (0, 16) depending on its frequency. Hence, there is no threshold for the excitation of such a wave, but even for small amounts of energy, such waveforms will be excited (large frequency/small period ones for small excitation energy). One can generalize the solution of the form (6) to a periodic breather lattice solution of the sine-Gordon equation in the form (see, e.g., McLachlan, 1994)
$$ u(x, t) = 4 \tan^{-1}\left[ a\, \mathrm{sn}(bt, k^2)\, \mathrm{dn}(cx, 1 - m^2) \right], \tag{9} $$
where sn(x, k) and dn(x, k) are the Jacobi elliptic functions with modulus k. Here
$$ a = \sqrt{\frac{k}{m}}, \quad b = \sqrt{\frac{m}{(m + k)(1 + mk)}}, \quad \text{and} \quad c = \sqrt{\frac{k}{(m + k)(1 + mk)}}. $$
One can then evaluate the energy (per breather) of this infinite periodic breather lattice configuration (for details of the calculation, the interested reader is directed to Kevrekidis et al., 2001) to be
$$ E = 16 \sqrt{\frac{k}{(k + m)(1 + km)}}\; E(1 - m^2), \tag{10} $$
where $E(1 - m^2)$ is the complete elliptic integral of the second kind. Depending on the values of the elliptic moduli, k and m, the expression of Equation (9) represents a lattice of different entities. For m, k → 0, it corresponds to genuine sine-Gordon breathers. For k, m → 1, the limit gives the “pseudosphere” solution $u = 4 \tan^{-1}(\tanh(t/2))$, which resembles a π-kink but in time rather than space (see McLachlan, 1994). On the other hand, the k → finite, m → 0 limit gives the kink-antikink pair solution $4 \tan^{-1}(t\, \mathrm{sech}\, x)$. This solution has the character of a kink-antikink pair “breathing” in time. The above different limits illustrate why Equation (10) is an important result, since it can be used (see below) to obtain the asymptotic interaction between entities such as breathers, pseudospheres, or kink-antikink pairs. When taking the appropriate above limits of expression (10), the leading-order term will be the energy of a single such entity. However, the correction to that will be the (per particle) binding energy in a configuration of multiple such entities (or, as it is often referred to, the energy of interaction between two such entities). To calculate the breather-breather interaction (their binding energy), we take the limit m, k → 0, with $k/m = (1 - \omega^2)/\omega^2$, where ω is the breather frequency (see McLachlan, 1994), to obtain
$$ E = \left[ 16 \sqrt{1 - \omega^2} - 8 m^2 \frac{(1 - \omega^2)^{3/2}}{\omega^2} + 6 m^4 \frac{(1 - \omega^2)^{5/2}}{\omega^4} \right] E(1 - m^2). \tag{11} $$
Hence, the corrections to the (single) breather energy in Equation (11) correspond to the binding energy of the formed breather bound states. Similar expressions can be found for the pseudosphere:
$$ E = 4\pi - \frac{\pi}{2}(k - 1)^2 + \frac{\pi}{4}(m - 1)^2 + \frac{3\pi}{32}(m - 1)^2 (k - 1)^2 \tag{12} $$
(again, 4π is the energy of a single pseudosphere) and for kink-antikink pairs
$$ E = \left[ 16 - 8m\left(k + \frac{1}{k}\right) + 2m^2\left(2 + 3k^2 + \frac{3}{k^2}\right) \right] E(1 - m^2). \tag{13} $$
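The small-parameter expansions can be compared against the full lattice energy of Equation (10) numerically. The sketch below is an illustration under stated assumptions: the argument of E is treated as the parameter μ in $E(\mu) = \int_0^{\pi/2} \sqrt{1 - \mu \sin^2\theta}\, d\theta$, the integral is evaluated with Simpson's rule, the linear term of the kink-antikink expansion is taken with a factor m (which is what expanding Equation (10) in m gives), and all function names are hypothetical.

```python
import math


def ellip_e(mu, n=4000):
    """Complete elliptic integral of the second kind with parameter mu (Simpson's rule, n even)."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n + 1):
        f = math.sqrt(1.0 - mu * math.sin(i * h) ** 2)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0


def lattice_energy(k, m):
    """Energy per breather of the lattice, Equation (10)."""
    return 16.0 * math.sqrt(k / ((k + m) * (1.0 + k * m))) * ellip_e(1.0 - m * m)


def breather_series(omega, m):
    """Expansion for m, k -> 0 with k/m = (1 - omega^2)/omega^2."""
    s = 1.0 - omega**2
    bracket = 16.0 * math.sqrt(s) - 8.0 * m**2 * s**1.5 / omega**2 + 6.0 * m**4 * s**2.5 / omega**4
    return bracket * ellip_e(1.0 - m * m)


def pseudosphere_series(k, m):
    """Expansion around k = m = 1."""
    return (4.0 * math.pi - 0.5 * math.pi * (k - 1.0) ** 2 + 0.25 * math.pi * (m - 1.0) ** 2
            + 3.0 * math.pi / 32.0 * (m - 1.0) ** 2 * (k - 1.0) ** 2)


def pair_series(k, m):
    """Expansion for m -> 0 at finite k (kink-antikink pairs)."""
    bracket = 16.0 - 8.0 * m * (k + 1.0 / k) + 2.0 * m**2 * (2.0 + 3.0 * k**2 + 3.0 / k**2)
    return bracket * ellip_e(1.0 - m * m)


omega, m = 0.6, 0.01
k = m * (1.0 - omega**2) / omega**2
print(lattice_energy(k, m), breather_series(omega, m))
print(lattice_energy(1.01, 0.99), pseudosphere_series(1.01, 0.99))
print(lattice_energy(0.5, 0.001), pair_series(0.5, 0.001))
```

Each pair of printed values should agree up to the order at which the corresponding series is truncated; the leading terms recover the single-entity energies 16(1 − ω²)^{1/2}, 4π, and 16.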
Similar examples of breather lattices also exist for other equations, such as the well-known Korteweg–de Vries (KdV) equation, and one can again infer breather-breather state binding energies in a similar manner. In general, one can say that the concept of a bound state for nonlinear evolutionary partial differential equations supporting soliton (or solitary wave) solutions persists in a form very similar to the way it manifests itself for fundamental physical forces and their particle carriers. In the present case, the elements of the bound states are the nonlinear waves proper (a feature reminiscent of the particle-like character of such waves, manifest evidently also in their interactions). In a number of (most often integrable) cases, where the form of the bound state solutions is analytically tractable, the calculation of the bound state energy and of the energy of its constituent elements again provides information, through the difference between the two, for the binding energy (or energy of interaction) of such waves.
P.G. KEVREKIDIS
See also Breathers; Partial differential equations, nonlinear; Sine-Gordon equation; Solitons
Further Reading
Dodd, R.K., Eilbeck, J.C., Gibbon, J.D. & Morris, H.C. 1982. Solitons and Nonlinear Wave Equations, London: Academic Press
Kevrekidis, P.G., Saxena, A. & Bishop, A.R. 2001. Physical Review E, 64: 026613
McLachlan, R. 1994. Math. Intelligencer, 16: 31
BIOLOGICAL EVOLUTION
The term evolution defines a process that is driven by internal and/or external forces. In quantum mechanics, an evolution operator conducts change in time. Cosmic evolution in the standard model aims at a consistent description of the process from the “big bang” to the present universe. “Prebiotic evolution” deals with chemical precursors of present-day life and is determined by the conditions at the early Earth, be it in the primordial atmosphere, in the surrounding of volcanic hot springs at the sea floor, or at some other location. “Biological evolution” follows the prebiotic scenario, and it shaped and is still shaping the biosphere on Earth. A temporal change in the biosphere manifests itself as the appearance, alteration, and extinction of biological species. This view was not
generally accepted before Charles Darwin. Influenced by the geologist Charles Lyell and his concept of uniformitarianism, Darwin and the proponents of the theory of evolution suggested that changes in the biosphere occur gradually, continuously, or at least, in small steps. In this aspect, which is not essential for the mechanism of evolution, Darwin’s theory contrasted with the view held by the majority of his contemporaries, who assumed constancy of biological species and change exclusively through catastrophic events leading to mass extinction (Ruse, 1979). The opponents of evolutionary thinking, Louis Agassiz, Georges Cuvier, and others, considered species as invariant entities. The remnants of extinct species in the fossil record were interpreted by them as witnesses from earlier worlds destroyed by punctual events, the great deluge, and other catastrophes that wiped out major parts of the organismic world. In society, the concept of evolution was heavily attacked by representatives of the Christian Churches because it was seen to be in conflict with the Genesis report in the Bible (Ruse, 2001). During the 20th century, European religious thought has reconciled religious belief and the idea of an evolving biosphere. In North America, the strong opposition of some groups of religious fanatics led to the peculiar development of Creationism, whose claim of being an alternative to the theory of evolution is rejected by the established scientific community (NAS, 1999). The current theory of biological evolution originated from two epochal contributions by Charles Darwin and Gregor Mendel. Darwin conceived a mechanism for evolutionary change of the biosphere based on variation and selection, and he gathered empirical data providing evidence for the action of natural and artificial selection, the latter exercised in animal breeding and nursery gardens.
Darwin’s principle (published in On the Origin of Species by Means of Natural Selection in 1859) has two consequences: species adapt to their environments and are related to their ancestors in terms of phylogenies, or branches of an ancestral tree of species. In 1866, Gregor Mendel introduced quantitative statistics into the evaluation of data in biology and performed the first precisely controlled fertilization experiments with plants. He discovered and interpreted correctly the action of genes in determining the properties of organisms. Mendel’s work was considered irrelevant by the evolutionists of the second half of the 19th century and was “rediscovered” around 1900. Only in 1930 were the Darwinian concept of selection and Mendel’s rules of inheritance combined into a common mathematical formalism by the population geneticists Ronald Fisher, John Haldane, and Sewall Wright (for a recent text in population genetics, see Hartl & Clark, 1997). In the 1940s, finally, Darwinian evolution and Mendelian genetics were united in the Synthetic or Neo-Darwinian Theory of Evolution by the works of the experimental biologists Theodosius
Dobzhansky, Julian Huxley, Ernst Mayr, and others (Mayr, 1997). In the second half of the 20th century, molecular biology put evolutionary theory on the firm foundations of chemistry and physics. Comparison of genes and, more recently, of whole genomes allows for reconstruction of phylogenies on the basis of nucleotide sequence divergence through mutation (Judson, 1979); the exploration of molecular structures provides insights into the chemistry of present-day life; and knowledge of biomolecular properties eventually led to the construction of laboratory systems that allow for observation of the evolution of molecules in the test tube (Spiegelman, 1971; Watts & Schwarz, 1997).

Darwinian evolution results from the interplay of variation and selection, both being consequences of reproduction in populations. Variation operates on genomes or genotypes, which are polynucleotide sequences carrying the genetic information, and occurs in two fundamentally different ways: (i) mutation causes local changes in genomic sequences, whereas (ii) recombination exchanges corresponding segments between two genotypes. Selection is based on differences in fitness, which is a property of the phenotype. The phenotype is defined as the union of all properties, structural as well as dynamic, of an individual organism. Unfolding of the phenotype is programmed by the genome but, at the same time, requires a highly specific environment. In addition, it is influenced by epigenetic factors (epigenetic refers to every nonenvironmental factor that interferes with the development of the organism, except those encoded in the nucleotide sequence of DNA; many epigenetic factors are already understood at the molecular level and involve specific modifications of genomic DNA). Fitness, in essence, counts the number of fertile descendants reaching the reproductive age. It has two major components: (i) the probability of survival to reproduction, and (ii) the number of viable and fertile offspring.

To illustrate selection in a population of n asexually reproducing phenotypes, we consider a continuous-time model that describes change by the differential equation

$$\frac{dx_i}{dt} = x_i\,(f_i - \Phi), \quad i = 1, \ldots, n, \quad \text{with} \quad \Phi(t) = \sum_{j=1}^{n} f_j x_j = \bar{f}. \tag{1}$$

The variables denote the frequencies of reproducing variants: $x_i(t) = N_i(t) / \sum_{j=1}^{n} N_j(t)$, with $N_i(t)$ counting the number of individuals with phenotype $S_i$ or genotype $I_i$ at time t. (For several genotypes giving rise to the same phenotype, see neutrality below.) Fitness values $f_i$, when averaged over the entire population, yield the mean fitness expressed by a time-dependent flux $\Phi(t)$. Frequencies of variants with fitness values above average, $f_i > \Phi$, increase with time, those of below-average variants, $f_i < \Phi$, decrease and, as a consequence, the mean fitness increases. The flux $\Phi(t)$ is a nondecreasing function of time, and selection continues until all variants except the fittest have died out (see also Fitness landscape). For two variants, $I_0$ and $I_1$, the solution boils down to

$$\frac{x_1}{x_0}(t) = \frac{x_1}{x_0}(0) \cdot \exp(mt)$$

or

$$\left(\frac{X_1}{X_0}\right)_t = \left(\frac{X_1}{X_0}\right)_0 w^{t},$$

where the upper equation refers to continuously varying x(t) and the lower equation refers to populations with discrete time variables $X_t$ and synchronized reproduction. The Malthusian fitness difference $m = f_1 - f_0$ is related to the Darwinian relative fitness $w = (1 + f_1)/(1 + f_0)$ by $m \approx \ln w$ (see Hartl & Clark, 1997). The conditions for selection are m > 0 or w > 1, respectively. An example is shown in Figure 1.

Figure 1. Illustration of selection in populations (fraction of advantageous variant plotted against time in generations, from 0 to 1000). The plotted curves represent the frequencies of advantageous mutants $I_1$ in a population of individuals $I_0$ with a Malthusian fitness difference of $m = \Delta f = f_1 - f_0$ = 0.1, 0.02, and 0.01. The population size is N = 10,000, and the mutants were initially present in a single copy: $N_1(0) = 1$ or $x_1(0) = 0.0001$.

Sexual reproduction of diploid organisms involves Mendelian genetics (see Figure 2). Every gene (A) comes in two copies, identical or different, which are chosen from a reservoir of variants $A_i$ called alleles. Recombination occurs in the process of reproduction when the two copies are separated and reassembled in pieces through random combination.

Figure 2. Mendelian genetics (crosses P × P giving F1, F1 × F1 giving F2, and P × F1, shown for an intermediate pair of alleles and for a dominant/recessive pair of alleles). In sexual reproduction, the two parental genomes are split into pieces and recombined randomly, which means each of the four alleles has a 50% chance to be incorporated in the genome of an offspring. Mendel’s laws are of a statistical nature and hold as mean values in the limit of large numbers of observations. Two cases are shown: (i) the heterozygote unfolds into a phenotype with intermediate properties (gray through blending of black and white), and (ii) the property of one allele (black) is dominant. In the latter case, the other allele (white) is called recessive. Interbreeding of two homozygous individuals (parent generation P) leads to a first offspring generation (F1) of identical heterozygous individuals; the phenotypes in the next (F2) generation show a distribution of 1:2:1 in the intermediate and 1:3 in the dominant/recessive case. Crossing of the (recessive) parent genotype with an F1 individual yields a 1:1 ratio of phenotypes.

The differential equation (1) is extended to describe selection in the diploid case in the form of Fisher’s selection equation:

$$\frac{dx_i}{dt} = x_i \left( \sum_{j=1}^{n} a_{ij} x_j - \Phi \right) = x_i\,(\bar{a}_i - \Phi), \quad i = 1, \ldots, n, \tag{2}$$

with

$$\bar{a}_i = \sum_{j=1}^{n} a_{ij} x_j, \qquad \Phi(t) = \bar{a} = \sum_{i=1}^{n} \bar{a}_i x_i = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j .$$
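Fisher's selection equation can be explored numerically. The following is a minimal sketch, not part of the original entry, that integrates dx_i/dt = x_i (sum_j a_ij x_j - Phi) with an explicit Euler scheme for two alleles. The fitness matrix `A_INFERIOR`, the step size, and the starting frequencies are illustrative assumptions chosen so that the heterozygote is less fit than either homozygote.

```python
# Sketch: explicit Euler integration of Fisher's selection equation
#   dx_i/dt = x_i * (sum_j a_ij x_j - Phi),  Phi = sum_ij a_ij x_i x_j.
# Fitness values, step size, and initial frequencies are illustrative assumptions.

def mean_fitness(x, a):
    """Mean fitness Phi = sum_i sum_j a_ij x_i x_j."""
    n = len(x)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def fisher_step(x, a, dt):
    """Advance the allele frequencies x by one Euler step of size dt."""
    n = len(x)
    abar = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    phi = sum(abar[i] * x[i] for i in range(n))
    new = [x[i] + dt * x[i] * (abar[i] - phi) for i in range(n)]
    total = sum(new)  # equals 1 up to rounding; renormalize as a guard
    return [v / total for v in new]


def integrate(x0, a, dt=0.05, steps=4000):
    """Return the final frequencies and the recorded history of Phi."""
    x = list(x0)
    phis = [mean_fitness(x, a)]
    for _ in range(steps):
        x = fisher_step(x, a, dt)
        phis.append(mean_fitness(x, a))
    return x, phis


# Symmetric fitness matrix with an inferior heterozygote: a12 < min(a11, a22).
A_INFERIOR = [[1.0, 0.5],
              [0.5, 0.9]]

if __name__ == "__main__":
    for x1_start in (0.6, 0.3):
        x_end, phis = integrate([x1_start, 1.0 - x1_start], A_INFERIOR)
        print(f"x1(0) = {x1_start}: x1(end) = {x_end[0]:.4f}, "
              f"Phi from {phis[0]:.3f} to {phis[-1]:.3f}")
```

With these assumed values, the two starting points lie on opposite sides of the interior equilibrium at x1 = 4/9, so the runs end at different homozygous populations while the recorded mean fitness never decreases along either run.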
The variables refer to alleles $A_j$ rather than to whole genomes, and the rate coefficient $a_{ij}$ represents the individual fitness value for the combination $A_iA_j$. Fitness is assumed to be independent of the positioning of alleles, $A_iA_j$ or $A_jA_i$, and hence $a_{ij} = a_{ji}$ holds. The term $\bar{a}_i$ is the population-averaged mean fitness of the allele combinations carrying $A_i$ at least once: $A_iA_j$, $j = 1, \ldots, n$. Fisher’s fundamental theorem states that the flux $\Phi(t) = \sum_{i=1}^{n} \bar{a}_i x_i = \bar{a}$ is a nondecreasing function of time, but the outcome of selection need not be unique, as optimization might end in a local optimum of $\Phi$ (see also Fitness landscape). For example, in the two-allele case, inferiority of the heterozygote $A_1A_2$, $a_{12} < \min\{a_{11}, a_{22}\}$, results in bistability, since homogeneous populations of either homozygote, $A_1A_1$ or $A_2A_2$, represent stable equilibrium points. Then, the initial conditions determine the outcome of selection. The optimization principle is not universally valid: when mutation is included or when more complex cases of recombination are considered, optimization of mean fitness is restricted to certain ranges of initial conditions, whereas different behavior is observed for other starting values. Still, optimization remains an important heuristic in evolution, as it is frequently observed.

Innovation is introduced into genes by mutation, consisting of a local change in the sequence of nucleotides that results from imperfect replication of genetic information or from externally caused damage. Two scenarios are distinguished: (i) rare mutation, treated by conventional population genetics and typically occurring with multicellular organisms and most bacteria, and (ii) frequent mutation, handled by quasispecies theory (Eigen, 1971; Eigen & Schuster, 1977) and determining the evolution of viruses. Higher mutation rates are often advantageous because they allow for adaptation, but there exists an error threshold of replication beyond which inheritance breaks down because too many mutations destroy the genetic message. RNA viruses are under a strong selection constraint by the host, and their mutation rates are close to the error threshold.

The idea that genotypes and phenotypes are related one-to-one turned out to be wrong. Molecular genetics revealed a high degree of neutrality (Kimura, 1983): many different genotypes give rise to the same phenotype. Advantageous mutations are rare, and deleterious mutations are eliminated by selection, thus leaving a majority of observed changes in the genomes to result from neutral mutations. Neutrality gives rise to random drift of populations in genotype space, which was also found to be important for the mechanism of evolution since it allows populations to escape from minor local fitness optima or evolutionary traps (Schuster, 1996; Schuster in Crutchfield & Schuster, 2003).
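Random drift of a neutral variant can be illustrated with a small simulation. The entry does not specify a model, so the sketch below assumes a Wright-Fisher-style resampling scheme for a haploid population of constant size; the function names, population size, and seeds are hypothetical choices made only for this illustration.

```python
import random

# Sketch: neutral drift under an assumed Wright-Fisher-style resampling scheme.
# Each generation, every one of the pop_size offspring independently inherits
# the neutral variant with probability equal to its current frequency.

def neutral_drift(pop_size, start_count, seed, max_generations=100_000):
    """Return the frequency trajectory of a neutral variant until loss or fixation."""
    rng = random.Random(seed)
    count = start_count
    freqs = [count / pop_size]
    for _generation in range(max_generations):
        if count == 0 or count == pop_size:
            break  # variant lost or fixed: drift has run its course
        p = count / pop_size
        count = sum(1 for _offspring in range(pop_size) if rng.random() < p)
        freqs.append(count / pop_size)
    return freqs


def fixation_fraction(pop_size, start_count, runs, seed=0):
    """Fraction of independent runs in which the variant takes over the population."""
    fixed = 0
    for r in range(runs):
        if neutral_drift(pop_size, start_count, seed + r)[-1] == 1.0:
            fixed += 1
    return fixed / runs


if __name__ == "__main__":
    trajectory = neutral_drift(pop_size=50, start_count=25, seed=1)
    print("generations until loss or fixation:", len(trajectory) - 1)
    print("fraction of runs ending in fixation:",
          fixation_fraction(pop_size=50, start_count=25, runs=200))
```

No fitness difference enters the update, so every change in frequency is due to sampling alone. For a neutral variant the fraction of runs ending in fixation is expected to scatter around the initial frequency, here 0.5.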
Random drift leads to an almost constant mutation rate per year and nucleotide independent of the species being tantamount to a molecular clock of evolution. This clock is used for dating in the reconstruction of phylogenies from comparison of present-day genome sequences. Molecular clock dates yield substantially longer time spans compared with those from the fossil record. The discrepancy seems to be reconcilable because paleontological datings are too young and molecular clock datings are too old by systematic errors (Benton & Ayala, 2003). The Darwinian mechanism is powerful because it makes no reference to the specific nature of the reproducing entities. Therefore, it is likewise valid for molecules, viruses, bacteria, or higher organisms. Selection based on the Darwinian principle is observed in many disciplines outside biology, for example, in
physics and chemistry, in economics, and in the social sciences. Since its introduction, the theory of evolution has undergone changes and modifications. The rejection of catastrophic events as an important source of change in the history of life on Earth was a political issue rather than one based on scientific data. Geological evidence for impacts of large meteorites as well as major floods is now available, and such events wiped out substantial parts of the biosphere. The paleontological record reflects the interplay between continuous evolution and external influences, which resulted in epochs of gradual development interrupted by punctuated events. Interestingly, evolution of bacteria or molecules under constant conditions also showed punctuation without external triggers: populations “wait” during quasistationary periods for rare mutations that initiate fast periods of change. Still, there are open problems in current evolutionary theory. Recent sequence data challenge the idea of a tree of life. Although animal phylogeny appears to be on a firm basis, there are problems with the reconstruction of a tree-like history of plant species. Prokaryote evolution cannot be cast into a tree: archaebacteria and eubacteria exchange genetic information across species and kingdoms. Such horizontal gene transfer occurs frequently and obscures the descent of species. Darwinian evolution, although successful in describing the mechanisms of optimization and adaptation of species, is unable to provide explanations for the major evolutionary transitions that lead from one hierarchical level of life to the next higher forms (Maynard Smith & Szathmáry, 1995; Schuster, 1996).
Examples of such transitions are the origin of the genetic code; the transition from the prokaryotic to the eukaryotic cell; the transition from unicellular organisms to multicellular plants, fungi, and animals; the transition from solitary animals to animal societies; and eventually the transition to man and human societies. Common to all these transitions is the integration of individual competitors as cooperating elements into a novel functional unit. Simple model mechanisms have been proposed that can explain cooperation of competitors (see, e.g., the hypercycle; Eigen & Schuster, 1978), but no real solution to the problem has been found yet.
PETER SCHUSTER
See also Catalytic hypercycle; Fitness landscape
Further Reading
Benton, M.J. & Ayala, F.J. 2003. Dating the tree of life. Science, 300: 1698–1700
Crutchfield, J.P. & Schuster, P. (editors). 2003. Evolutionary Dynamics: Exploring the Interplay of Selection, Accident, Neutrality, and Function, Oxford and New York: Oxford University Press
Eigen, M. 1971. Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften, 58: 465–523
Eigen, M. & Schuster, P. 1977. The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften, 64: 541–565
Eigen, M. & Schuster, P. 1978. The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften, 65: 7–41
Hartl, D.L. & Clark, A.G. 1997. Principles of Population Genetics, 3rd edition, Sunderland, MA: Sinauer Associates
Judson, H.F. 1979. The Eighth Day of Creation. The Makers of the Revolution in Biology, London: Jonathan Cape and New York: Simon and Schuster
Kimura, M. 1983. The Neutral Theory of Molecular Evolution, Cambridge and New York: Cambridge University Press
Maynard Smith, J. & Szathmáry, E. 1995. The Major Transitions in Evolution, Oxford and New York: Freeman
Mayr, E. 1997. The establishment of evolutionary biology as a discrete biological discipline. BioEssays, 19: 263–266
National Academy of Sciences (NAS). 1999. Science and Creationism. A View from the National Academy of Sciences, 2nd edition, Washington, DC: National Academy Press
Ruse, M. 1979. The Darwinian Revolution, Chicago, IL: University of Chicago Press
Ruse, M. 2001. Can a Darwinian Be a Christian? The Relationship Between Science and Religion, Cambridge and New York: Cambridge University Press
Schuster, P. 1996. How does complexity arise in evolution? Complexity, 2(1): 22–30
Spiegelman, S. 1971. An approach to the experimental analysis of precellular evolution. Quarterly Reviews of Biophysics, 4: 213–253
Watts, A. & Schwarz, G. (editors). 1997. Evolutionary Biotechnology: From Theory to Experiment. Biophysical Chemistry, vol. 66, nos. 2–3, Amsterdam: Elsevier, pp. 67–284
BIOMOLECULAR SOLITONS Biological molecules are complex systems that evolve in an ever-changing environment and nevertheless exhibit a remarkable stability of their functions. This feature, which is reminiscent of the exceptional stability of solitons in the presence of perturbations, is perhaps what led to suggestions that solitons could have a role in some biological functions. Beyond this analogy, there are more solid arguments to consider the role of nonlinearity in biological molecules. They are very large atomic assemblies performing their function through large conformational changes, which have a cooperative character because they involve many atoms moving in a coherent manner, and are highly nonlinear due to their amplitude of motion, which is much larger than the standard thermal motions observed in small molecules. Additional nonlinearities can originate from the coupling of different degrees of freedom, as proposed for proteins. Dispersion, necessary to balance the effect of nonlinearity in order to obtain solitonlike excitations, is introduced by the discreteness of the molecular lattice, which behaves in a manner different from a continuous medium. Besides conformational changes involved in many biological functions, issues important for biological
molecules are energy transport and storage, and charge transport. Nonlinear excitations have been proposed as possible contributors to these phenomena, in the two main classes of biological molecules: nucleic acids and proteins (Peyrard, 1995). Following Erwin Schrödinger in his prophetic book What Is Life? (Schrödinger, 1944), the nucleic acid DNA can be viewed as an “aperiodic crystal.” The static structure of DNA is a fairly regular pile of flat base pairs, linked by hydrogen bonds and connected by sugar–phosphate strands that form a double helix (Calladine & Drew, 1997). The lack of periodicity occurs because the base pairs can be either A−T (adenine–thymine) or G−C (guanine–cytosine), their sequence defining the genetic code. The static picture that emerges from crystallographic data has little to do with actual DNA in a living cell. The genetic code, buried in the double helix, would not be accessible if DNA were not a highly dynamical structure. Biologists have observed the “breathing of DNA,” which is a fluctuational opening in which one or a few base pairs open temporarily. These motions are probed experimentally by monitoring deuterium-hydrogen exchange, based on the assumption that the imino-protons that bind the bases can only be exchanged for open base pairs. The DNA double helix is also opened by enzymes during the transcription of a gene, that is, the reading of the genetic code. This phenomenon is complex, but there are related experimental observations that are more amenable to physical analysis, on the thermal denaturation of DNA. When the double helix is heated, one first observes local openings over a few to a few tens of base pairs. These grow and invade the whole molecule, leading to a thermal separation of the two strands, which can be monitored by measuring the UV absorbance of the molecule, which is highly sensitive to the disturbance of the base stacking.
This “DNA melting”—which appears as a phase transition in one dimension—poses challenging questions because, in order to cause the local openings, one has to break many hydrogen bonds between the bases, which requires the localization of a large amount of thermal energy in a small region of the molecule. Nonlinear effects could be at the origin of this phenomenon. All these observations led to many investigations and models of the nonlinear dynamics of the DNA molecule. A description at the scale of the individual atoms is not necessary to analyze base-pair openings, so the bases are generally described as rigid objects in these models. The earliest attempt to describe DNA opening in terms of solitons is due to Englander et al. (1980), who viewed it as a cooperative motion involving 10 or more base pairs and propagating as a localized defect along the molecule. This idea was further formalized by Yomosa (1984), and Takeno & Homma (1983), who introduced a coupled base rotator model for the structure and dynamics of DNA. The
general ideas behind this approach are schematized in Figure 1(a). Only rotational degrees of freedom of the bases are introduced (denoted by the angles $\chi_n$ and $\chi'_n$ in Figure 1). The pairing of the bases is described by an on-site potential $V(\chi_n, \chi'_n)$, which in its simplest form is

$$V(\chi_n, \chi'_n) = A(1 - \cos\chi_n) + A(1 - \cos\chi'_n) + B(1 - \cos\chi_n \cos\chi'_n),$$

and the stacking along the molecule is represented by a potential

$$W(\chi_n, \chi'_n, \chi_{n-1}, \chi'_{n-1}) = S[1 - \cos(\chi_n - \chi_{n-1})] + S[1 - \cos(\chi'_n - \chi'_{n-1})],$$

where A, B, and S are constants. Adding the kinetic energy of the bases, $\tfrac{1}{2} I\,(\dot{\chi}_n^{2} + \dot{\chi}_n'^{\,2})$, where I is the moment of inertia of the bases around their rotation axis, and summing over n, one obtains the Hamiltonian of the model.

Figure 1. (a) Schematic view of the plane base rotator model for DNA. (b) A symmetric open state of the model.

Various nonlinear excitations are possible depending on the symmetries of the motion (such as $\chi_n = \chi'_n$ or $\chi_n = -\chi'_n$) and the values of the constants. If the stacking interaction is strong enough, a continuum approximation can be made. This approximation replaces the discrete variables $\chi_n(t)$ by the function $\chi(x, t)$ and finite differences such as $\chi_n - \chi_{n-1}$ by derivatives $a(\partial\chi/\partial x)$, where a is the spacing between the bases and x denotes the continuous coordinate along the helix axis. When A = 0, in its simplest form, the model leads to a sine-Gordon equation

$$I \frac{\partial^2 \chi}{\partial t^2} - S a^2 \frac{\partial^2 \chi}{\partial x^2} + B \sin\chi = 0, \tag{1}$$

which has topological solutions such as the one schematized in Figure 1(b), where the bases undergo a 2π rotation, generating an open state that may slide along the chain. Models for the rotation of the base pairs have been further refined by Yakushevich (1998) and are discussed in the entry on DNA solitons.

Another point of view was chosen later by Dauxois et al. (1993), who were interested in the statistical physics of DNA thermal denaturation. This problem had been studied by Ising models, which simply use a two-state variable equal to 0 or 1 to specify whether a base pair is closed (0) or open (1). Such models cannot describe the intermediate states, but they can be generalized by introducing a real variable $y_n(t)$ that measures the stretching of the hydrogen bonds
in a base pair, which is equal to 0 in the equilibrium structure and grows to infinity when the two strands are fully separated. With such a variable, a natural shape of the on-site potential is the Morse potential $V(y_n) = D[\exp(-\alpha y_n) - 1]^2$ (D and α are constants), which has a minimum corresponding to the binding of the two bases in their equilibrium state by the hydrogen bonds, and a plateau at large $y_n$, which is associated with the vanishing of the pairing force $\partial V/\partial y_n$ when the bases are far apart. Such a model does not have topological solitons, but its nonlinear dynamics leads to localized oscillatory modes, called breathers, which are approximately described by solitons of the nonlinear Schrödinger equation in the continuum limit and turn into permanently open states at a high temperature (see Breathers). These studies have focused attention on the importance of discreteness for nonlinear energy localization. In DNA, the stacking interactions are not very strong, and this is why imino-proton exchange experiments can detect the exchange on one base pair while the neighboring base pairs are not affected. As a result, a continuum approximation is very crude. What could appear as a problem because it complicates analytical studies of the nonlinear dynamics turns out to have a far-reaching consequence, because it has been shown that discreteness is crucial for the existence and formation of nonlinear localized modes (Sievers & Takeno, 1988; MacKay & Aubry, 1994), which correspond to the “breathing” of DNA observed by biologists. It is important to notice that the existence of these nonlinear solutions is not linked to a particular mathematical expression of the potentials. Instead, it is a generic feature of nonlinear lattices having interactions qualitatively similar to those that connect the bases in DNA.
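A short worked expansion may help to show where the nonlinearity of the Morse on-site potential comes from. It uses only the expression V(y) = D[exp(−αy) − 1]² given in the text; the shorthand u = αy and the omission of the site index n are conventions introduced here for brevity.

```latex
% Morse on-site potential with the shorthand u = \alpha y (introduced here)
V(y) = D\left(e^{-u} - 1\right)^{2},
\qquad
\frac{\partial V}{\partial y} = 2 D \alpha \, e^{-u}\left(1 - e^{-u}\right).

% Small stretching (|u| << 1): harmonic term plus softening corrections
V(y) \simeq D\alpha^{2} y^{2} - D\alpha^{3} y^{3} + \tfrac{7}{12}\, D\alpha^{4} y^{4} + \dots

% Large stretching (u -> infinity): plateau and vanishing pairing force
V(y) \to D, \qquad \frac{\partial V}{\partial y} \to 0 .

% The derivative is largest at e^{-u} = 1/2, i.e. y = \ln 2 / \alpha,
% where it takes the value D\alpha / 2.
\max_{y}\, \frac{\partial V}{\partial y} = \frac{D\alpha}{2} .
```

The quadratic term fixes the small-amplitude behavior, while the negative cubic term means that the restoring force weakens as the stretching grows. This amplitude dependence is the kind of intrinsic nonlinearity that the text associates with localized modes.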
Moreover, it has also been shown that thermal fluctuations can self-localize in such lattices so it is likely that related nonlinear excitations could exist in DNA. But discreteness has another consequence. Large-amplitude modes are strongly localized due to their high nonlinearity. Their width becomes of the order of the spacing between the bases and they lose the translational invariance of solitons in continuum media. The image of freely moving solitons has to be corrected by the pinning effect of discreteness, and the translation of the nonlinear excitations in DNA, if it occurs, has to be activated, for instance, by thermal fluctuations. Proteins are much more complex than DNA because they do not have a quasi-periodic structure, but some of their substructures are nevertheless fairly regular. They are biological polymers composed of amino acids of the general formula
H2N−CH(R)−COOH
where R is an organic radical that determines the amino acid. These building blocks are linked by a peptide bond that can be viewed as a result of the elimination of a water molecule between consecutive amino acids, leading to the generic formula
···−NH−CH(R1)−CO−NH−CH(R2)−CO−NH−CH(R3)−CO−···
A given protein is defined by its sequence of amino acids chosen from 20 possible types and the length of the chain (typically 150–180 residues), but this so-called primary structure does not determine the function, which depends on the spatial organization of the residues. Segments of the chain tend to fold into secondary structures having the shape of helices (called α-helices) or sheets (called β-sheets) stabilized by hydrogen bonds formed mainly between the negatively charged C=O groups and the positively charged protons linked to the nitrogen atom of a peptide bond. The different components of the secondary structure assemble in the tertiary structure, which is the functional form of the protein. Proteins perform numerous functions, and one of them is the storage and transport of the energy released by the hydrolysis of adenosine-triphosphate (ATP), which plays the role of the fuel necessary for many biological processes, such as muscle contraction. The hydrolysis of a single ATP molecule releases approximately 0.4 eV, which is transmitted to a protein for later use. This raises a puzzling question because, if this energy were distributed among all the degrees of freedom of a protein, each atom would carry such a small amount that the energy would be useless. There must be a mechanism that maintains this energy sufficiently localized, and moreover, as it will not be used at the site where it has been released, it must be transported efficiently within the molecule. Recent experiments at the molecular scale have shown that the hydrolysis of a single ATP molecule can be used for several steps of a molecular motor involved in muscle contraction (Kitamura et al., 1999), providing evidence of the temporary storage of the energy.
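A rough order-of-magnitude estimate makes the puzzle concrete. The figures below are assumptions introduced only for illustration and do not come from the entry: a temperature of 300 K, and a round number of N = 10³ atoms, which is at least the order of the atom count of a chain of 150–180 residues.

```latex
% Assumptions (not from the text): T = 300 K, N = 10^3 atoms
k_B T = 8.617\times10^{-5}\ \mathrm{eV\,K^{-1}} \times 300\ \mathrm{K}
      \approx 0.026\ \mathrm{eV},
\qquad
\frac{E_{\mathrm{ATP}}}{k_B T} \approx \frac{0.4\ \mathrm{eV}}{0.026\ \mathrm{eV}} \approx 15 .

% Same energy shared equally among N atoms
\frac{E_{\mathrm{ATP}}}{N} = \frac{0.4\ \mathrm{eV}}{10^{3}}
      = 4\times10^{-4}\ \mathrm{eV} \approx 0.015\, k_B T .
```

Under these assumptions the energy of one ATP hydrolysis stands well above the thermal scale while it is concentrated, but falls far below it once shared among all atoms, which is why a localization mechanism is needed.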
Attempting to understand these phenomena in 1973, Alexander Davydov noticed that the energy released by ATP hydrolysis almost coincides with 2 quanta of the vibrational energy of the C = O bond, which led him to the conclusion that this energy was stored as vibrational energy in the peptide bond (Scott, 1992). He conjectured that it could stay localized through an extrinsic nonlinearity associated with a distortion of the chain of hydrogen bonds that spans the α-helix. The underlying mechanism is similar to the one leading
to the formation of a polaron in solid-state physics (Ashcroft & Mermin, 1976). The vibration of the C=O bond strains the lattice in its vicinity, resulting in slight displacements of the neighboring amino acids. But, as the frequency of the C=O vibration is affected by its interactions with the neighboring atoms, the frequency of the excited C=O bond becomes slightly shifted and no longer coincides with the resonating frequencies of the neighboring C=O bonds, preventing an efficient transfer of energy to the neighboring sites. As a consequence, the energy released by the ATP hydrolysis does not spread along the protein. Therefore, the basic idea behind the mechanism proposed by Davydov is nonlinear energy localization due to the shift of the frequency of an oscillator when it is excited. For the protein, it is not due to an intrinsic nonlinearity of the C=O bond (as was the case for the Morse potential linking the bases in a pair for DNA), but due to a coupling with another degree of freedom, which is an acoustic mode of the lattice of amino acids connected by hydrogen bonds. As only a few quanta of the C=O vibrational motion are excited, the theory cannot ignore quantum effects, and in order to go beyond the qualitative picture discussed above, one has to solve the time-dependent Schrödinger equation. Davydov proposed a simple ansatz to describe the quantum state of the system. In this simple approximation, the motion of the self-trapped energy packet is described by a discrete form of the nonlinear Schrödinger equation. When one introduces proper parameters for the α-helix, the calculation of the soliton width shows that it is much broader than the lattice spacing, which should allow its motion without pinning by discreteness. As a result, energy transfer by solitons in the α-helix is plausible, but a definitive conclusion about the existence of such solitons is still pending.
This is because the role of thermal fluctuations, which could destroy the coherence of the lattice distortion around the excited C=O site and hence the self-trapping, and the extent of quantum effects are hard to evaluate quantitatively (Peyrard, 1995). A direct experimental observation on a protein has not been possible up to now. These uncertainties prompted physicists and physical chemists to experimentally investigate model systems that are simpler than proteins but, nevertheless, show chemical bonds comparable to the peptide bonds in proteins. Crystalline acetanilide consists of quasi-one-dimensional chains of hydrogen-bonded peptide groups. In the early 1980s, it was recognized by spectroscopic studies that the C=O stretching and N−H stretching bands of crystalline acetanilide exhibit anomalies, and tentative explanations involve self-trapped states similar to the Davydov solitons. These ideas have been confirmed by recent pump–probe experiments (Edler et al., 2002). A direct observation of self-trapping could be achieved, and it appears that the
crystal structure is essential to stabilize the excitation, which decays 20 times faster for isolated molecules than for molecules linked by hydrogen bonds in the crystal. Although the lifetime of the self-trapped state (20 ps) is shorter than expected by Davydov, this study supports the original idea of the importance of the coupling with the lattice degrees of freedom. A possible translational motion of the self-trapped state and its possible role for biological functions are still open questions. The News and Views section of the journal Nature attests that nonlinear excitations in biomolecules have been the object of strong controversy, ranging from enthusiastic approval (Maddox, 1986, 1989) to harsh criticisms (Frank-Kamenetskii, 1987), which were justified by some of the overstatements by theoreticians. Today, passionate opinions have subsided and experiments at the scale of a single molecule have become feasible, showing us how biomolecules work or take their shape. Thus, it appears likely that while freely moving solitons along DNA or protein α-helices may not exist, nonlinear excitations leading to energy localization or storage, and perhaps transport, could well provide useful clues to understand some of the phenomena occurring in biomolecules.
MICHEL PEYRARD
See also Davydov soliton; DNA premelting; DNA solitons; Pump-probe measurements
Further Reading
Ashcroft, N.W. & Mermin, N.D. 1976. Solid State Physics, Philadelphia: Saunders Company
Calladine, C.R. & Drew, H.R. 1997. Understanding DNA: The Molecule and How It Works, 2nd edition, San Diego and London: Academic Press
Dauxois, T., Peyrard, M. & Bishop, A.R. 1993. Dynamics and thermodynamics of a nonlinear model for DNA denaturation. Physical Review E, 47: 684–695 and R44–R47
Edler, J., Hamm, P. & Scott, A.C. 2002. Femtosecond study of self-trapped vibrational excitons in crystalline acetanilide. Physical Review Letters, 88: 067403 (1–4)
Englander, S.W., Kallenbach, N.R., Heeger, A.J., Krumhansl, J.A. & Litwin, S. 1980. Nature of the open state in long polynucleotide double helices: Possibility of soliton excitations. Proceedings of the National Academy of Sciences USA, 77: 7222–7227
Frank-Kamenetskii, M. 1987. Physicists retreat again. Nature, 328: 108
Kitamura, K., Tokunaga, M., Iwane, A.H. & Yanagida, T. 1999. A single myosin head moves along an actin filament with regular steps of 5.3 nanometres. Nature, 397: 129–134
MacKay, R.S. & Aubry, S. 1994. Proof of existence of breathers for time-reversible or Hamiltonian networks of weakly coupled oscillators. Nonlinearity, 7: 1623–1643
Maddox, J. 1986. Physicists about to hi-jack DNA? Nature, 324: 11
Maddox, J. 1989. Towards the calculation of DNA. Nature, 339: 577
Peyrard, M. (editor). 1995. Nonlinear Excitations in Biomolecules, Berlin and New York: Springer
Schrödinger, E. 1944. What is Life? The Physical Aspect of the Living Cell, Cambridge: Cambridge University Press
Scott, A.C. 1992. Davydov’s soliton. Physics Reports, 217: 1–67
Sievers, A.J. & Takeno, S. 1988. Intrinsic localized modes in anharmonic crystals. Physical Review Letters, 61: 970–973
Takeno, S. & Homma, S. 1983. Topological solitons and modulated structures of bases in DNA double helices. Progress of Theoretical Physics, 70: 308–311
Yakushevich, L.V. 1998. Nonlinear Physics of DNA, Chichester and New York: Wiley
Yomosa, S. 1984. Solitary excitations in deoxyribonucleic acid (DNA) double helices. Physical Review A, 30: 474–480
BIONS See Breathers
BIRGE–SPONER RELATION See Local modes in molecules
BIRKHOFF–SMALE THEOREM See Phase space
BISTABILITY See Equilibrium
BISTABLE EQUATION See Zeldovich–Frank-Kamenetsky equation
BJERRUM DEFECTS
Ice is the most common and important member in a class of solids in which the conduction of electricity is carried almost exclusively by protons. Its dc electrical conductivity reaches the level of $10^{-7}\ \Omega^{-1}\,\mathrm{m}^{-1}$, placing ice among the semiconductors. Known protonic semiconductors include disordered forms of solid water and several salt hydrates and gas hydrates. In the case of lithium hydrazinium sulfate (LiN₂H₅SO₄), one finds quasi-one-dimensional hydrogen-bonded chains (HBCs) along the c-crystallographic axis. Electrical conductivity is three orders of magnitude larger in the c-direction (compared with the perpendicular directions), demonstrating that proton conductivity is directly related to the presence of hydrogen bonds. In addition to inorganic crystals, protonic conductivity plays a significant role in biological systems, where it participates in energy transduction and formation of proton pumps. Of particular significance is proton transport across cellular membranes through the use of hydrogen-bonded side-chains of proteins embedded in membrane pores.
As is shown in Figures 1 and 2, protonic conductivity in HBCs takes place through ionic defects and bonding or “Bjerrum” defects (named after Danish physical chemist Niels Bjerrum). Ionic defects are formed by an excess proton (H₃O⁺) or a proton vacancy (HO⁻), while bonding defects are misfits in the orientations of neighboring atoms resulting in either vacant bonds (L defect) or two protons occupying the same bond (D defect). Bonding defects do not obey the Bernal–Fowler rule of one proton per hydrogen bond for the ideal ice crystal. When a proton is transported along an HBC through an ionic defect, after the passage of the proton, the chain remains blocked to further proton movement since all chain protons have been moved, say, from the left-hand to the right-hand side of each hydrogen bond. The chain gets unblocked through cooperative rotations, that is, through the passage of a bonding defect. Thus, protons move in an HBC through coordinated ionic and Bjerrum defects, using a mechanism that is also found in hydrogen-bonded protein side chains. Coordinated proton transport in biological macromolecules leads to the formation of proton pumps that channel protons across membranes and, through reversals, produce cyclic motor actions of mechanical nature. Defects in HBCs are topological in nature, with the rotational activation energy being smaller than the activation energy of ionic defects. The total charge of the topological ionic and bonding defects is not the same; in ice, the ionic defect charge is $e_I = 0.64e$ ($e$ is the proton charge), while the bonding defect charge is $e_B = 0.36e$. Only after a coordinated passage of an ionic and bonding defect is one entire proton charge transferred across the HBC. A simple one-dimensional cooperative model of an HBC is similar to the Frenkel–Kontorova model but with two alternating barriers modeling bonding and ionic activation energies.
The minima separating the barriers correspond to equilibrium positions of protons that interact mutually through dipole-dipole interactions. In equilibrium under this model, there is initially one proton per hydrogen bond; transitions of protons over the large barriers correspond to ionic defects, while bonding defects result from transitions over the smaller barrier. Both classes of defects are modeled through topological solitons. There are two kink solutions corresponding to HO⁻ and L-bonding defects, while the corresponding antikinks are the H₃O⁺ and D-bonding defects, respectively. This simple model can be made quantitative through the introduction of the one-dimensional Hamiltonian

$$H = \sum_n \left[ \frac{p_n^2}{2} + \frac{1}{2}\left(u_{n+1} - u_n\right)^2 + \omega V(u_n) \right], \tag{1}$$

where $u_n$ and $p_n$ are the dimensionless displacement from an equilibrium position and momentum, respectively, of the $n$th hydrogen that is coupled to its nearest neighboring protons through harmonic spring interaction, while $\omega$ sets the energy scale. A typical choice for $V(u_n)$, the nonlinear substrate potential that models the ionic and bonding barriers, is

$$V(u_n) = \frac{2}{1-\alpha^2}\left[\cos\left(\frac{u_n}{2}\right) - \alpha\right]^2. \tag{2}$$

The substrate potential (2) is periodic and (for appropriate values of the parameter $\alpha$) is doubly periodic with two distinct alternating maxima that separate degenerate minima. In this model, one assumes one proton per unit cell, the latter consisting of the larger ionic barrier with its adjacent minima, one on each side. In the strongly cooperative limit, where neighboring hydrogen displacements do not differ substantially, the proton displacement becomes a function $u(x, t)$ of the continuous space variable $x$ as well as time $t$, and it obeys the double sine-Gordon partial differential equation:

$$u_{tt} - c^2 u_{xx} + \frac{\omega^2}{1-\alpha^2}\left[-\sin u + 2\alpha \sin\frac{u}{2}\right] = 0, \tag{3}$$

where $c$ is the speed of sound of the linearized lattice oscillations. This double sine-Gordon model has as solutions two sets of soliton kinks as well as their corresponding antikinks representing L-Bjerrum (kink I), D-Bjerrum (antikink I), HO⁻ ionic (kink II) and H₃O⁺ (antikink II).

Figure 1. Ionic defects present in a hydrogen-bonded chain: (a) negative ionic defect HO⁻ (negative effective charge) and (b) positive ionic defect H₃O⁺ (positive effective charge). Large open circles denote ions, for example, oxygen ions in ice, while small black dots are protons. A hydrogen bond that links two ions contains a covalent part (solid line) that places in equilibrium the proton closer to one of the two oxygens. In the ionic defect region, there is a gradual transition in the equilibrium locations of protons within the hydrogen bonds; this transitional region is modeled through a topological soliton.

Figure 2. Bonding defects in a hydrogen-bonded chain: (a) negative bonding defect (L) and (b) positive bonding defect (D). Molecular rotations introduce additional protons or remove protons from the quasi-one-dimensional HBC and produce bonding defects.

More complex nonlinear models can be constructed that also include an acoustic interaction between neighboring ions as well as coupling of protons with ions. In these cases, one obtains two-component solitons where the defects in the proton sublattice are topological solitons that induce a polaronic-like deformation in the ionic lattice. This more complex defect can travel along an HBC when an external electric field is applied in the system. Numerical
simulations demonstrate that these nonlinear defects do indeed encompass some of the basic dynamical properties of the ionic and bonding defects found in hydrogen-bonded networks.
G.P. TSIRONIS
See also Frenkel–Kontorova model; Hydrogen bond; Sine-Gordon equation; Topological defects
Further Reading
Hobbs, P.V. 1974. Ice Physics, Oxford: Clarendon Press
Pnevmatikos, St. 1988. Soliton dynamics of hydrogen-bonded networks: a mechanism for proton conductivity. Physical Review Letters, 60: 1534–1537
Pnevmatikos, St., Tsironis, G.P. & Zolotaryuk, A.V. 1989. Nonlinear quasiparticles in hydrogen-bonded systems. Journal of Molecular Liquids, 41: 85–103
Savin, A.V., Tsironis, G.P. & Zolotaryuk, A.V. 1997. Reversal effects in stochastic kink dynamics. Physical Review E, 56: 2457–2466
Zolotaryuk, A.V., Pnevmatikos, St. & Savin, A.V. 1991. Charge transport by solitons in hydrogen-bonded materials. Physical Review Letters, 67: 707–710
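As an illustration of the substrate potential (2) in the Bjerrum defects entry above, the following minimal Python sketch evaluates the two alternating barrier heights and checks numerically that the derivative of $V$ reproduces the bracketed term of the double sine-Gordon equation (3). It assumes the reading $V(u) = \frac{2}{1-\alpha^2}[\cos(u/2) - \alpha]^2$; the function names and the value $\alpha = 0.5$ are illustrative choices, not taken from the entry.

```python
import math


def substrate_potential(u, alpha):
    """Substrate potential of Eq. (2): V(u) = 2/(1 - alpha^2) * (cos(u/2) - alpha)^2."""
    return 2.0 / (1.0 - alpha ** 2) * (math.cos(u / 2.0) - alpha) ** 2


def substrate_force_term(u, alpha):
    """dV/du, which equals the bracket of Eq. (3) divided by (1 - alpha^2)."""
    return (-math.sin(u) + 2.0 * alpha * math.sin(u / 2.0)) / (1.0 - alpha ** 2)


def barrier_heights(alpha):
    """Heights of the two alternating maxima, located at u = 0 and u = 2*pi."""
    return substrate_potential(0.0, alpha), substrate_potential(2.0 * math.pi, alpha)


if __name__ == "__main__":
    alpha = 0.5  # hypothetical parameter value, chosen only for illustration
    at_zero, at_two_pi = barrier_heights(alpha)
    u_min = 2.0 * math.acos(alpha)  # degenerate minima sit where cos(u/2) = alpha
    print("barrier at u = 0:    ", at_zero)
    print("barrier at u = 2*pi: ", at_two_pi)
    print("V at the minimum:    ", substrate_potential(u_min, alpha))
```

For $0 < \alpha < 1$ the two maxima differ in height, $2(1-\alpha)/(1+\alpha)$ at $u = 0$ and $2(1+\alpha)/(1-\alpha)$ at $u = 2\pi$, which is how a single parameter encodes the smaller and the larger of the two barriers.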
BLACK HOLES
A massive body like the Earth is characterized by an “escape velocity,” which is the speed that a particle must have on leaving the surface of the body in order to escape its gravitational attraction. Consider a bullet at the surface of the Earth that is moving upward with a speed of 11.2 km/s and neglect atmospheric friction. Such a bullet will just escape Earth’s gravitational field by exchanging its kinetic energy for the potential energy of the gravitational field. If the mass of the Earth were compressed into a smaller radius, this escape velocity would be larger, because the gravitational energy to be overcome by the kinetic energy of the bullet would be larger.
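The energy balance described above can be sketched numerically. Equating the kinetic energy $\frac{1}{2}mv^2$ to the Newtonian potential energy $GMm/R$ gives $v = \sqrt{2GM/R}$; the short Python example below evaluates this for the Earth and for the same mass at half the radius. The constants are rounded reference values and the function name is an illustrative choice.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
M_EARTH = 5.972e24  # mass of the Earth, kg (rounded)
R_EARTH = 6.371e6   # mean radius of the Earth, m (rounded)


def escape_velocity(mass, radius):
    """Newtonian escape velocity from (1/2) m v^2 = G M m / R."""
    return math.sqrt(2.0 * G * mass / radius)


if __name__ == "__main__":
    v_surface = escape_velocity(M_EARTH, R_EARTH)
    v_compressed = escape_velocity(M_EARTH, R_EARTH / 2.0)
    print(f"escape velocity at Earth's surface: {v_surface / 1e3:.1f} km/s")
    print(f"same mass at half the radius:       {v_compressed / 1e3:.1f} km/s")
```

Halving the radius at fixed mass raises the escape velocity by a factor of $\sqrt{2}$, which is the trend the paragraph describes.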
In 1916, Karl Schwarzschild used Einstein’s gravitational field equations to show that if a body of mass $m$ is compressed to a radius

$$r_s = \frac{2Gm}{c^2}, \tag{1}$$

where $G$ is the gravitational constant, then an object traveling at the speed of light $c$ will be unable to escape the influence of gravity (Schwarzschild, 1916). For a body having the mass of the Earth, this “Schwarzschild radius” is about 1 cm, and for the Sun, it is about 3 km. Interestingly, Schwarzschild’s idea was first suggested in the 18th century (Mitchell, 1783; Laplace, 1796). The term “black hole” was coined by John Archibald Wheeler in 1967 to denote a cosmic object with its mass concentrated within the Schwarzschild radius. Neither particles nor light can overcome gravitational attraction and travel outside the sphere of radius $r_s$. Interestingly, Stephen Hawking (1974) has shown that, due to quantum fluctuations, a black hole should radiate as a black body with the temperature

$$T = \frac{\hbar c}{4\pi k r_s}, \tag{2}$$

where $k$ and $\hbar$ are, respectively, the Boltzmann constant and the reduced Planck constant. Indirect evidence for black holes is provided by the fact that there do not exist stable cold stars with masses larger than about three solar masses. Under its own gravitational field, according to theory, such a star should collapse into a black hole. In 1931, Subrahmanyan Chandrasekhar was the first to conclude that above some critical mass of white dwarfs, the pressure given by the equation of state (of a quantum relativistic gas of degenerate Fermi particles) is too weak to counter the gravitational forces, leading to the formation of black holes. Both Lev Landau (1932) and Arthur Eddington (1924) rejected this implication of relativistic quantum mechanics rather than accept the possibility of black holes. Albert Einstein also concluded that Schwarzschild singularities do not exist in the real world (Einstein, 1939). In 1939, however, J. Robert Oppenheimer and his colleagues used general relativity (rather than Newtonian gravity) to show that when all thermonuclear sources of energy are exhausted (with no further outward pressure due to radiation), a sufficiently heavy star will continue to contract indefinitely, never reaching equilibrium (Oppenheimer & Volkoff, 1939; Oppenheimer & Snyder, 1939). Oppenheimer et al. further noted that if one considers stellar collapse from the outside, a stationary observer sees the stellar surface moving inward to the Schwarzschild sphere and finally sees the surface freeze as it nears the Schwarzschild sphere. Moreover, they showed that observers who move inward with the collapsing matter do not observe such freezing; these observers could cross the critical surface (“event horizon”) after a finite time on their own
clocks, after which they have no possibility of sending a signal that could be detected by an observer located outside the collapsing matter. Recently, the scientific history of black holes has been characterized by a rapid growth of observational, theoretical, and mathematical studies, in which the discovery of such compact objects becomes the main purpose (Thorne et al., 1986). Currently, the most important classes are black holes of stellar masses (about 3–10 solar masses) and super-massive black holes. The most convincing candidates for stellar black holes are binary X-ray sources, one component of which is an ordinary star and the other component is a black hole or neutron star (Novikov & Zeldovich, 1966). Estimates of the masses of compact objects in these systems are substantially greater than three solar masses, and one example of such a system is Cygnus X-1 (V 1357 Cyg). The present number of systems mentioned as possible candidates for black holes with stellar masses is about 20, all of which are X-ray sources in binary systems (Novikov & Frolov, 2001). In the case of super-massive black holes and nuclei of Seyfert galaxies, interpretations of the observable effects using the black hole theory seem the most simple and natural; for example, Galaxy M 31 is a black hole candidate having a mass of about $3 \times 10^7$ solar masses (Novikov & Frolov, 2001). Presently, the concept of black holes continues to be confirmed by direct observations and is used to explain observable astronomical effects related to exceptionally strong emission of energy.
Thus, it is expected that in the future new astronomical objects will be detected near black holes, and new physical phenomena will be discovered that can be interpreted using the black hole concept. Along these lines, it is interesting to note recent work in which concepts from thermodynamics and information theory (such as temperature and entropy) are connected with black holes based on Hawking’s ideas (Markov, 1965; Hawking, 1977). Also of interest are “artificial black holes,” which do not compress a large amount of mass into a small volume, but reduce the speed of light in a moving medium to less than the speed of the medium, thereby creating an event horizon (Leonhardt, 2001).
VIATCHESLAV KUVSHINOV
See also Binding energy; Einstein equations; General relativity
Further Reading
Chandrasekhar, S. 1931. The maximum mass of ideal white dwarfs. Astrophysical Journal, 74: 81
Eddington, A. 1924. A comparison of Whitehead’s and Einstein’s formulae. Nature, 113: 192
Einstein, A. 1939. On a stationary system with spherical symmetry consisting of many gravitating masses. Annals of Mathematics, 40: 922
Hawking, S.W. 1974. Black hole explosions. Nature, 248: 30–31
Hawking, S.W. 1977. Gravitational instantons. Physics Letters A, 60: 81
Landau, L.D. 1932. On the theory of stars. Physikalische Zeitschrift der Sowjetunion, 1: 285
Laplace, P.-S. 1796. Exposition du système du monde [Description of the World System], Paris
Leonhardt, U. 2001. A laboratory analogue of the event horizon using slow light in an atomic medium. Nature, 415: 406–409
Markov, M.A. 1965. Can the gravitational field prove essential for the theory of elementary particles? Progress of Theoretical Physics, (Suppl): 85–95
Mitchell, J. 1783. Transactions of the Royal Society of London, 74: 35
Novikov, I.D. & Frolov, V.P. 2001. Black holes in the Universe. Physics-Uspekhi, 44(3): 291
Novikov, I.D. & Zeldovich, Ya.B. 1966. Nuovo Cimento, 4 (Suppl.): 810
Oppenheimer, J.R. & Snyder, H. 1939. On continued gravitational contraction. Physical Review, 56: 455
Oppenheimer, J.R. & Volkoff, G.M. 1939. On massive neutron cores. Physical Review, 55: 374–381
Schwarzschild, K. 1916. Über das Gravitationsfeld eines Massenpunktes nach der Einsteinschen Theorie. Sitzungsberichte der Preussischen Akademie der Wissenschaften zu Berlin, Physik-Mathematik, Kl.: 189–196
Thorne, K.S., Price, R.H. & MacDonald, D.A. (editors). 1986. Black Holes: The Membrane Paradigm, New Haven: Yale University Press
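The two formulas of the black holes entry above, the Schwarzschild radius (1) and the black-body temperature (2), can be evaluated with a few lines of Python. This is a minimal sketch: the constants are rounded reference values, the function names are illustrative, and the temperature formula is taken with the reduced Planck constant $\hbar$, which is an assumption about the notation of Eq. (2).

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
HBAR = 1.0546e-34  # reduced Planck constant, J s (rounded)
K_B = 1.3807e-23   # Boltzmann constant, J/K (rounded)
M_EARTH = 5.972e24  # kg (rounded)
M_SUN = 1.989e30    # kg (rounded)


def schwarzschild_radius(mass):
    """Eq. (1): r_s = 2 G m / c^2."""
    return 2.0 * G * mass / C ** 2


def hawking_temperature(mass):
    """Eq. (2), assuming hbar: T = hbar c / (4 pi k r_s)."""
    return HBAR * C / (4.0 * math.pi * K_B * schwarzschild_radius(mass))


if __name__ == "__main__":
    print(f"r_s (Earth): {schwarzschild_radius(M_EARTH) * 100:.2f} cm")
    print(f"r_s (Sun):   {schwarzschild_radius(M_SUN) / 1e3:.2f} km")
    print(f"T (one solar mass): {hawking_temperature(M_SUN):.2e} K")
```

The radii come out close to the "about 1 cm" and "about 3 km" quoted in the entry, and because $T$ is inversely proportional to $r_s$, heavier black holes are colder.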
BLOCH DOMAIN WALL See Domain walls
BLOCH FUNCTIONS See Periodic spectral theory
BLOWOUT BIFURCATION See Intermittency
BLOW-UP (COLLAPSE) See Development of singularities
BOHR–SOMMERFELD QUANTIZATION See Quantum theory
BOOMERONS See Solitons, types of
BORN–INFELD EQUATIONS
Classical linear vacuum electrodynamics with point massive charged particles has two limiting properties: the electromagnetic energy of a point particle field is infinite, and a Lorentz force must be postulated to describe interactions between point particles and an electromagnetic field. Nonlinear vacuum electrodynamics can be free of these imperfections. Gustav Mie (1912–1913) considered a nonlinear electrodynamics model in the framework of his “Fundamental unified theory of matter.” In this theory the electron is represented by a nonsingular solution with a finite electromagnetic energy, but Mie’s field equations are not invariant under the gauge transformation for an electromagnetic four-potential (addition of the four-gradient of an arbitrary scalar function). Max Born (1934) considered a nonlinear electrodynamics model that is invariant under the gauge transformation. A stationary electron in this model is represented by an electrostatic field configuration that is everywhere finite, in contrast to the case of linear electrodynamics when the electron’s field is infinite at the singular point (see Figure 1). The central point in Born’s electron is also singular because there is a discontinuity of electrical field at this point (hedgehog singularity). The full electromagnetic energy of this electron’s field configuration is finite. Born and Leopold Infeld (1934) then considered a more general nonlinear electrodynamics model, which has the same solution associated with the electron. Called Born–Infeld electrodynamics, this model is based on the Born–Infeld equations, which have the form of Maxwell’s equations, including electrical and magnetic field strengths $\mathbf{E}$, $\mathbf{H}$, and inductions $\mathbf{D}$, $\mathbf{B}$ with nonlinear constitutive relations $\mathbf{D} = \mathbf{D}(\mathbf{E}, \mathbf{B})$, $\mathbf{H} = \mathbf{H}(\mathbf{E}, \mathbf{B})$ of a special kind. For inertial reference frames and in the region outside of field singularities, these equations are

$$\begin{cases} \operatorname{div} \mathbf{B} = 0, \\ \dfrac{1}{c}\dfrac{\partial \mathbf{B}}{\partial t} + \operatorname{curl} \mathbf{E} = 0, \\ \operatorname{div} \mathbf{D} = 0, \\ \dfrac{1}{c}\dfrac{\partial \mathbf{D}}{\partial t} - \operatorname{curl} \mathbf{H} = 0, \end{cases} \tag{1}$$

where

$$\mathbf{D} = \frac{1}{L}\left(\mathbf{E} + \chi^2 J\, \mathbf{B}\right), \qquad \mathbf{H} = \frac{1}{L}\left(\mathbf{B} - \chi^2 J\, \mathbf{E}\right), \tag{2}$$

$$L \equiv \sqrt{\left|\, 1 - \chi^2 I - \chi^4 J^2 \,\right|}, \qquad I = \mathbf{E}\cdot\mathbf{E} - \mathbf{B}\cdot\mathbf{B}, \qquad J = \mathbf{E}\cdot\mathbf{B}. \tag{3}$$

Relations (2) can be resolved for $\mathbf{E}$ and $\mathbf{H}$:

$$\mathbf{E} = \frac{1}{\mathcal{H}}\left(\mathbf{D} - \chi^2\, \mathbf{P}\times\mathbf{B}\right), \qquad \mathbf{H} = \frac{1}{\mathcal{H}}\left(\mathbf{B} + \chi^2\, \mathbf{P}\times\mathbf{D}\right), \tag{4}$$

where $\mathcal{H} = \sqrt{1 + \chi^2\left(\mathbf{D}^2 + \mathbf{B}^2\right) + \chi^4 \mathbf{P}^2}$ and $\mathbf{P} \equiv \mathbf{D}\times\mathbf{B}$. When relations (4) are used in Eqs. (1), the unknown fields are $\mathbf{D}$ and $\mathbf{B}$.

Figure 1. Radial components of electrical field for Born’s electron and purely Coulomb field (dashed). (Axes: $-E_r$ versus $r$.)

The symmetrical energy-momentum tensor for Born–Infeld equations has the following components:

$$T^{00} = \frac{1}{4\pi\chi^2}\left(\mathcal{H} - 1\right), \qquad T^{0i} = \frac{1}{4\pi} P^i, \qquad T^{ij} = \frac{1}{4\pi}\left\{ \delta^{ij}\left[\mathbf{D}\cdot\mathbf{E} + \mathbf{B}\cdot\mathbf{H} - \chi^{-2}\left(\mathcal{H} - 1\right)\right] - \left(D^i E^j + B^i H^j\right) \right\}. \tag{5}$$

In spherical coordinates, the field of Born’s static electron solution may have only radial components

$$D_r = \frac{e}{r^2}, \qquad E_r = \frac{e}{\sqrt{\bar{r}^4 + r^4}}, \tag{6}$$

where $e$ is the electron’s charge and $\bar{r} \equiv \sqrt{|\chi e|}$. At the point $r = 0$, the electrical field has the maximum absolute value

$$|E_r(0)| = \frac{|e|}{\bar{r}^2} = \frac{1}{\chi}, \tag{7}$$

which Born and Infeld called the absolute field constant. The energy of field configuration (6) is

$$m = \int T^{00}\, dV = \frac{2}{3}\,\beta\, \frac{\bar{r}^3}{\chi^2}, \tag{8}$$

where the volume integral is calculated over the whole space, and

$$\beta \equiv \int_0^\infty \frac{dr}{\sqrt{1 + r^4}} = \frac{\left[\Gamma\left(\tfrac{1}{4}\right)\right]^2}{4\sqrt{\pi}} \approx 1.8541. \tag{9}$$
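The constant $\beta$ in Eq. (9) and the factor $\frac{2}{3}\beta$ in Eq. (8) can be checked numerically. The sketch below is a minimal illustration using only the Python standard library. It relies on two reductions that are not spelled out in the entry and should be read as assumptions of this example: the substitution $r \to 1/r$, which maps the tail of each integral onto $[0, 1]$, and the rescaling $x = r/\bar{r}$, under which Eqs. (5) and (6) give $m = (\bar{r}^3/\chi^2)\int_0^\infty \left(\sqrt{1 + x^4} - x^2\right) dx$. The helper names are illustrative.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def beta_numeric():
    """beta = int_0^inf dr / sqrt(1 + r^4) = 2 * int_0^1 dr / sqrt(1 + r^4)."""
    return 2.0 * simpson(lambda r: 1.0 / math.sqrt(1.0 + r ** 4), 0.0, 1.0)


def beta_closed_form():
    """Closed form quoted in Eq. (9): Gamma(1/4)^2 / (4 sqrt(pi))."""
    return math.gamma(0.25) ** 2 / (4.0 * math.sqrt(math.pi))


def energy_integral_numeric():
    """int_0^inf (sqrt(1 + x^4) - x^2) dx, with the tail mapped to [0, 1] via x = 1/t."""
    head = simpson(lambda x: math.sqrt(1.0 + x ** 4) - x ** 2, 0.0, 1.0)
    # (sqrt(1 + t^4) - 1) / t^4 rewritten as 1 / (sqrt(1 + t^4) + 1) to avoid cancellation
    tail = simpson(lambda t: 1.0 / (math.sqrt(1.0 + t ** 4) + 1.0), 0.0, 1.0)
    return head + tail


if __name__ == "__main__":
    print("beta (numeric):     ", beta_numeric())
    print("beta (closed form): ", beta_closed_form())
    print("energy integral:    ", energy_integral_numeric())
    print("(2/3) * beta:       ", 2.0 * beta_closed_form() / 3.0)
```

Agreement of the last two printed numbers is consistent with the prefactor $\frac{2}{3}\beta$ in Eq. (8).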
In view of the definition for $\bar{r}$ below (6), Equation (8) yields

$$\bar{r} = \frac{2}{3}\,\beta\, \frac{e^2}{m}. \tag{10}$$

Considering $m$ as the mass of electron and using (7), Born & Infeld (1934) estimated the absolute field constant $\chi^{-1} \approx 3 \times 10^{20}$ V/m. Later, Born & Schrödinger (1935) gave a new estimate (two orders of magnitude less) based on some considerations taking into account the spin of the electron. (Of course, such estimates may be corrected with more detailed models.) An electrically charged solution of the Born–Infeld equations can be generalized to a solution with the singularity having both electrical and magnetic charges (Chernitskii, 1999). A corresponding hypothetical particle is called a dyon (Schwinger, 1969). Nonzero (radial) components of fields for this solution have the form

$$D_r = \frac{C_e}{r^2}, \qquad E_r = \frac{C_e}{\sqrt{\bar{r}^4 + r^4}}, \qquad B_r = \frac{C_m}{r^2}, \qquad H_r = \frac{C_m}{\sqrt{\bar{r}^4 + r^4}}, \tag{11}$$

where $C_e$ is the electric charge and $C_m$ is the magnetic one; $\bar{r} \equiv \left[\chi^2\left(C_e^2 + C_m^2\right)\right]^{1/4}$. The energy of this solution is given by Equation (8) with this definition for $\bar{r}$. It should be noted that space components of electromagnetic potential for the static dyon solution have a line singularity. A generalized Lorentz force appears when a small, almost constant field $\tilde{\mathbf{D}}$, $\tilde{\mathbf{B}}$ is considered in addition to the moving dyon solution. The sum of the field $\tilde{\mathbf{D}}$, $\tilde{\mathbf{B}}$ and the field of the dyon with varying velocity is taken as an initial approximation to some exact solution. Conservation of total momentum gives the following trajectory equation (Chernitskii, 1999):

$$m\,\frac{d}{dt}\,\frac{\mathbf{v}}{\sqrt{1 - v^2}} = C_e\left(\tilde{\mathbf{D}} + \mathbf{v}\times\tilde{\mathbf{B}}\right) + C_m\left(\tilde{\mathbf{B}} - \mathbf{v}\times\tilde{\mathbf{D}}\right), \tag{12}$$

where $\mathbf{v}$ is the velocity of the dyon, and $m$ is the energy for static dyon defined by (8). A solution with two dyon singularities (called a bidyon) having equal electric ($C_e = e/2$) and opposite magnetic charges can be considered as a model for a charged particle with spin (Chernitskii, 1999). Such a solution has both angular momentum and magnetic moment. A plane electromagnetic wave with arbitrary polarization and form in the direction of propagation (without coordinate dependence in a perpendicular plane) is an exact solution to the Born–Infeld equations. The simplest case assumes one nonzero component of the vector potential ($A_y \equiv \phi(t, x)$), whereupon Equations (1) reduce to the linearly polarized plane wave equation

$$\left(1 + \chi^2 \phi_x^2\right)\phi_{tt} - 2\chi^2 \phi_x \phi_t \phi_{xt} - \left(c^2 - \chi^2 \phi_t^2\right)\phi_{xx} = 0 \tag{13}$$

with indices indicating partial derivatives. Sometimes called the Born–Infeld equation, Equation (13) has solutions $\phi = \zeta(x^1 - x^0)$ and $\phi = \zeta(x^1 + x^0)$, where $\zeta(x)$ is an arbitrary function (Whitham, 1974). Solutions comprising two interacting waves propagating in opposite directions are obtained via a hodograph transform (Whitham, 1974). Brunelli & Das (1998) have found a Lax representation for solutions of this equation. A solution to the Born–Infeld equations, which is the sum of two circularly polarized waves propagating in different directions, was obtained by Erwin Schrödinger (1943). Equations (1) with relations (2) have an interesting characteristic equation (Chernitskii, 1998)

$$\tilde{g}^{\mu\nu}\,\frac{\partial \Phi}{\partial x^\mu}\frac{\partial \Phi}{\partial x^\nu} = 0, \qquad \tilde{g}^{\mu\nu} \equiv g^{\mu\nu} - 4\pi\chi^2\, T^{\mu\nu}, \tag{14}$$

where $\Phi(x^\mu) = 0$ is an equation of the characteristic surface and $T^{\mu\nu}$ are defined by (5). This form for $\tilde{g}^{\mu\nu}$, which includes the energy-momentum tensor as an additional term, is specific to the Born–Infeld equations. The Born–Infeld model also appears in the quantized string theory (Fradkin & Tseytlin, 1985) and in Einstein’s unified field theory with a nonsymmetrical metric (Chernikov & Shavokhina, 1986). In general, this nonlinear electrodynamics model is connected with ideas of space-time geometrization and general relativity (see Eddington, 1924; Chernitskii, 2002).
ALEXANDER A. CHERNITSKII
See also Einstein equations; Hodograph transform; Matter, nonlinear theory of; String theory
Further Reading
Born, M. 1934. On the quantum theory of the electromagnetic field. Proceedings of the Royal Society of London A, 143: 410–437
Born, M. & Infeld, L. 1934. Foundation of the new field theory. Proceedings of the Royal Society of London A, 144: 425–451
Born, M. & Schrödinger, E. 1935. The absolute field constant in the new field theory. Nature, 135: 342
Brunelli, J.C. & Das, A. 1998. A Lax representation for Born–Infeld equation. Physics Letters B, 426: 57–63
Chernikov, N.A. & Shavokhina, N.S. 1986. The Born–Infeld theory as part of Einstein’s unified field theory. Soviet Mathematics (Izvestiya Vysshikh Uchebnykh Zavedenii), 30(4): 81–83
Chernitskii, A.A. 1998. Light beams distortion in nonlinear electrodynamics. Journal of High Energy Physics, 11, 15: 1–5
Chernitskii, A.A. 1999. Dyons and interactions in nonlinear (Born–Infeld) electrodynamics. Journal of High Energy Physics, 12, 10: 1–34
Chernitskii, A.A. 2002. Induced gravitation as nonlinear electrodynamics effect. Gravitation & Cosmology, 8 (Suppl.), 123–130
Eddington, A.S. 1924. The Mathematical Theory of Relativity, Cambridge: Cambridge University Press
Fradkin, E.S. & Tseytlin, A.A. 1985. Nonlinear electrodynamics from quantized strings. Physics Letters B, 163: 123–130
Mie, G. 1912–13. Grundlagen einer Theorie der Materie. Annalen der Physik, 37: 511–534; 39: 1–40; 40: 1–66
Schrödinger, E. 1942. Dynamics and scattering-power of Born’s electron. Proceedings of the Royal Irish Academy A, 48: 91–122
Schrödinger, E. 1943. A new exact solution in non-linear optics (two-wave system). Proceedings of the Royal Irish Academy A, 49: 59–66
Schwinger, J. 1969. A magnetic model of matter. Science, 165: 757–761
Whitham, G.B. 1974. Linear and Nonlinear Waves, New York: Wiley
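As a worked check of the traveling-wave solutions quoted for Equation (13) of the Born–Infeld entry above, one can substitute $\phi = \zeta(x - ct)$ directly. The short derivation below is a sketch that assumes $x^1 = x$ and $x^0 = ct$, an identification the entry does not state explicitly; primes denote derivatives of $\zeta$ with respect to its argument.

```latex
\begin{align*}
\phi &= \zeta(x - ct): \quad
\phi_x = \zeta', \quad \phi_t = -c\,\zeta', \quad
\phi_{xx} = \zeta'', \quad \phi_{tt} = c^2 \zeta'', \quad \phi_{xt} = -c\,\zeta'', \\[4pt]
&\left(1 + \chi^2 \phi_x^2\right)\phi_{tt}
 - 2\chi^2 \phi_x \phi_t \phi_{xt}
 - \left(c^2 - \chi^2 \phi_t^2\right)\phi_{xx} \\
&\quad = c^2 \zeta''\left(1 + \chi^2 \zeta'^2\right)
 - 2\chi^2 c^2 \zeta'^2 \zeta''
 - c^2 \zeta''\left(1 - \chi^2 \zeta'^2\right) \\
&\quad = c^2 \zeta''\left[\,1 + \chi^2 \zeta'^2 - 2\chi^2 \zeta'^2 - 1 + \chi^2 \zeta'^2\,\right] = 0 .
\end{align*}
```

The nonlinear terms cancel identically for any twice-differentiable $\zeta$, and the same cancellation occurs for $\phi = \zeta(x + ct)$ because only products of an even number of time derivatives appear.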
BOSE–EINSTEIN CONDENSATION
Bose–Einstein condensation (BEC) is the occupation of a single quantum state by a large number of identical particles, which implies that the particles are bosons, satisfying Bose–Einstein statistics and allowing for many particles to pile up in the same quantum state. This is in contrast to fermions, satisfying Fermi–Dirac statistics, for which the Pauli exclusion principle forbids the occupation of any single quantum state by more than one particle. The role of quantum correlations caused by Bose–Einstein statistics is crucial for the occurrence of BEC. These statistics were advanced by Satyendranath Bose (1924) for photons, having zero mass, and generalized by Albert Einstein (1924) to particles with nonzero masses. Einstein (1925) also described the phenomenon of condensation in ideal gases. The possibility of BEC in weakly nonideal gases was theoretically demonstrated by Nikolai Bogolubov (1947). The wave function of Bose-condensed particles in dilute gases satisfies the Gross–Pitaevskii equation, suggested by Gross (1961) and Pitaevskii (1961). Its mathematical structure is that of the nonlinear Schrödinger equation. Experimental evidence of BEC in weakly interacting confined gases was achieved 70 years after Einstein’s prediction, almost simultaneously, by three experimental groups (Anderson et al., 1995; Bradley et al., 1995; Davis et al., 1995). To say that many particles are in the same quantum state implies that these particles display state coherence, a particular example of coherence phenomena requiring the particles to be strongly correlated with each other. The necessary conditions may be qualitatively understood by applying the de Broglie duality arguments to an ensemble of atoms in thermodynamic equilibrium at temperature $T$. Then the thermal energy of an atom is given by $k_B T$, where $k_B$ is the Boltzmann constant. This energy defines the thermal wavelength

$$\lambda_T \equiv \sqrt{\frac{2\pi\hbar^2}{m_0 k_B T}} \tag{1}$$

for an atom of mass $m_0$, with $\hbar$ being the reduced Planck constant. Thus, an atom can be associated with a matter wave characterized by the wavelength $\lambda_T$. Atoms become correlated with each other when their related waves overlap, which requires that the wavelength be larger than the mean interatomic distance, $\lambda_T > a$. The average atomic density $\rho \equiv N/V$ for $N$ atoms in volume $V$ is related to the mean distance $a$ through the equality $\rho a^3 = 1$. Hence, condition $\lambda_T > a$ may be rewritten as $\rho\lambda_T^3 > 1$. With the thermal wavelength (1), this yields the inequality

$$T < \frac{2\pi\hbar^2}{m_0 k_B}\,\rho^{2/3}, \tag{2}$$

which implies that state coherence may develop if the temperature is sufficiently low or the density of particles is sufficiently high. An accurate description of BEC for an ideal gas is based on the Bose–Einstein distribution

$$n(p) = \left[\exp\left(\frac{\varepsilon_p - \mu}{k_B T}\right) - 1\right]^{-1}, \tag{3}$$

describing the density of particles with a single-particle energy $\varepsilon_p = p^2/2m_0$ for a momentum $p$ and with a chemical potential $\mu$. The latter is defined from the condition $N = \sum_p n(p)$ for the total number of particles. Assuming the thermodynamic limit

$$N \to \infty, \qquad V \to \infty, \qquad \frac{N}{V} \to \mathrm{const}$$

allows the replacement of summation over $p$ by integration. Then, the fraction of particles condensing to the state with $p = 0$ is

$$n_0 \equiv \frac{N_0}{N} = 1 - \left(\frac{T}{T_c}\right)^{3/2} \tag{4}$$

below the condensation temperature

$$T_c = \frac{2\pi\hbar^2 \rho^{2/3}}{m_0 k_B\, \zeta^{2/3}}, \tag{5}$$

where $\zeta \approx 2.612$. Above the critical temperature (5), $n_0 = 0$. This critical temperature is about half of the right-hand side of inequality (2). The condensate fraction (4) is derived for an ideal (noninteracting) Bose gas. A weakly nonideal (weakly interacting) Bose gas also displays Bose–Einstein condensation, although particle interactions deplete the condensate so that at zero temperature the condensate fraction is smaller than unity ($n_0 < 1$). A system is called weakly interacting if the characteristic interaction radius $r_{\mathrm{int}}$ is much shorter than the mean interparticle distance ($r_{\mathrm{int}} \ll a$). This inequality can be rewritten as $\rho r_{\mathrm{int}}^3 \ll 1$, and such a system is termed dilute. Superfluid liquids, such as liquid ⁴He, are far from being dilute, but it is commonly believed that the phenomenon of superfluidity is somehow connected with BEC. Although an explicit relation between the superfluid and condensate fractions is not known, theoretical calculations and experimental observations for superfluid helium estimate the condensate fraction at $T = 0$ as $n_0 \approx 0.1$. A strongly correlated pair of fermions can be treated approximately as a boson, allowing superfluidity in liquid ³He to be interpreted as the condensation of coupled fermions. Similarly, superconductivity is often compared with the condensation of the Cooper pairs that are formed by correlated electrons or holes. One should understand, however, that the superconductivity of fermions is analogous to but not identical to BEC of bosons. An ideal object for the experimental observation of BEC is a dilute atomic Bose gas confined in a trap and cooled down to temperatures satisfying condition (2). Such experiments with different atomic gases have been recently realized, BEC has been explicitly observed, and a variety of its features have been carefully investigated. It has been demonstrated that the system of Bose-condensed atoms displays a high level of state coherence. There exist different types of traps (single- and double-well), magnetic, optical, and their combinations, which make it possible to confine atoms for sufficiently long times of up to 100 s. Using a standing wave of laser light, multi-well periodic effective potentials called optical lattices have been obtained, which have allowed the demonstration of a number of interesting effects, including Bloch oscillations, Landau–Zener tunneling, Josephson current, Wannier–Stark ladders, Bragg diffraction, and so on. Displaying a high level of state coherence, an ensemble of Bose-condensed atoms forms a matter wave that is analogous to a coherent electromagnetic wave from a laser. Therefore, a device emitting a coherent beam of Bose atoms is called an atom laser. The realization of BEC of dilute trapped gases is important for several reasons. First, this demonstrated the phenomenon predicted by Einstein in the 1920s. Note that a direct observation of BEC in superfluid helium, despite enormous experimental efforts, has never been achieved. Second, dilute atomic gases are simple statistical systems that can serve as a touchstone for testing different theoretical approaches.
Finally, Bose-condensed trapped gases display a variety of interesting properties that promise diverse practical applications.
V.I. YUKALOV
See also Coherence phenomena; Critical phenomena; Lasers; Nonequilibrium statistical mechanics; Nonlinear optics; Nonlinear Schrödinger equations; Order parameters; Phase transitions; Quantum nonlinearity; Quantum theory; Superconductivity; Superfluidity
Further Reading
Anderson, M.H., Ensher, J.R., Matthews, M.R., Wieman, C.E. & Cornell, E.A. 1995. Observation of Bose–Einstein condensation in a dilute atomic vapor. Science, 269: 198–201
Bogolubov, N.N. 1947. On the theory of superfluidity. Journal of Physics, 11: 23–32
Bose, S.N. 1924. Plancks gesetz und lichtquantenhypothese. Zeitschrift für Physik, 26: 178–181
Bradley, C.C., Sackett, C.A., Tollett, J.J. & Hulet, R.G. 1995. Evidence of Bose–Einstein condensation in an atomic gas with attractive interactions. Physical Review Letters, 75: 1687–1690
Coleman, A.J. & Yukalov, V.I. 2000. Reduced Density Matrices, Berlin: Springer
Courteille, P.W., Bagnato, V.S. & Yukalov, V.I. 2001. Bose–Einstein condensation of trapped atomic gases. Laser Physics, 11: 659–800
Dalfovo, F., Giorgini, S., Pitaevskii, L.P. & Stringari, S. 1999. Theory of Bose–Einstein condensation in trapped gases. Reviews of Modern Physics, 71: 463–512
Davis, K.B., Mewes, M.O., Andrews, M.R., van Drutten, N.J., Durfee, D.S., Kurn, D.M. & Ketterle, W. 1995. Bose–Einstein condensation in a gas of sodium atoms. Physical Review Letters, 75: 3969–3973
Einstein, A. 1924. Quantentheorie des einatomigen idealen gases. Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physik-Mathematik, 261–267
Einstein, A. 1925. Quantentheorie des einatomigen idealen gases. Zweite abhandlung. Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physik-Mathematik, 3–14
Gross, E.P. 1961. Structure of a quantized vortex in boson systems. Nuovo Cimento, 20: 454–477
Huang, K. 1963. Statistical Mechanics, New York: Wiley
Klauder, J.R. & Skagerstam, B.S. 1985. Coherent States, Singapore: World Scientific
Lifshitz, E.M. & Pitaevskii, L.P. 1980. Statistical Physics: Theory of Condensed State, Oxford: Pergamon
Nozières, P. & Pines, D. 1990. Theory of Quantum Liquids: Superfluid Bose Liquids, Redwood, CA: Addison-Wesley
Parkins, A.S. & Walls, D.F. 1998. The physics of trapped dilute-gas Bose–Einstein condensates. Physics Reports, 303: 1–80
Pitaevskii, L.P. 1961. Vortex lines in an imperfect Bose gas. Journal of Experimental and Theoretical Physics, 13: 451–455
Ter Haar, D. 1977. Lectures on Selected Topics in Statistical Mechanics, Oxford: Pergamon
Yukalov, V.I. & Shumovsky, A.S. 1990. Lectures on Phase Transitions, Singapore: World Scientific
Ziff, R.M., Uhlenbeck, G.E. & Kac, M. 1977. The ideal Bose–Einstein gas revisited. Physics Reports, 32: 169–248
BOSONS See Quantum nonlinearity
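BOUNDARY LAYERS
The Navier–Stokes system is the basic mathematical model for viscous incompressible flows. It reads

$$(\mathrm{NS}_\nu)\qquad \begin{cases} \partial_t \mathbf{u} + \mathbf{u}\cdot\nabla\mathbf{u} - \nu\Delta\mathbf{u} + \nabla p = 0,\\ \operatorname{div}(\mathbf{u}) = 0,\\ \mathbf{u} = 0 \quad \text{on } \partial\Omega, \end{cases} \tag{1}$$

where $\mathbf{u}$ is the velocity, $p$ is the pressure, and $\nu$ is the viscosity. We can define a typical length scale $L$ and a typical velocity $U$. The dimensionless parameter $Re = UL/\nu$, the Reynolds number, is central to comparing different flows: two flows in geometrically similar configurations with the same $Re$ are dynamically similar. When $Re$ is very large ($\nu$ very small), the Navier–Stokes system $(\mathrm{NS}_\nu)$ behaves like the Euler system

$$(\mathrm{Euler})\qquad \begin{cases} \partial_t \mathbf{U} + \mathbf{U}\cdot\nabla\mathbf{U} + \nabla p = 0,\\ \operatorname{div}(\mathbf{U}) = 0,\\ \mathbf{U}\cdot\mathbf{n} = 0 \quad \text{on } \partial\Omega. \end{cases} \tag{2}$$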
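In the region close to the boundary, the length scale becomes very small and we cannot neglect viscous effects. In 1905, Ludwig Prandtl suggested that there exists a thin layer, called the boundary layer, where the solution $\mathbf{u}$ undergoes a sharp transition from a solution of the Euler system to the no-slip boundary condition $\mathbf{u} = 0$ on $\partial\Omega$ of the Navier–Stokes system. In other words, $\mathbf{u} = \mathbf{U} + \mathbf{u}_{BL}$, where $\mathbf{u}_{BL}$ is small except near the boundary.

To illustrate this, we consider a two-dimensional (planar) flow $\mathbf{u} = (u, v)$ in the half-space $\{(x, y) \mid y > 0\}$ subject to the initial condition $\mathbf{u}(t = 0, x, y) = \mathbf{u}_0(x, y)$, the boundary condition $\mathbf{u}(t, x, y = 0) = 0$, and $\mathbf{u} \to (U_0, 0)$ when $y \to \infty$. Taking the typical length and velocity of order one, the Reynolds number reduces to $Re = \nu^{-1}$. Let $\varepsilon = Re^{-1/2} = \sqrt{\nu}$. Near the boundary, the Euler system is not a good approximation. We introduce new independent variables and new unknowns

$$\tilde t = t,\quad \tilde x = x,\quad \tilde y = \frac{y}{\varepsilon},\qquad (\tilde u, \tilde v)(\tilde t, \tilde x, \tilde y) = \Big(u, \frac{v}{\varepsilon}\Big)(\tilde t, \tilde x, \varepsilon\tilde y).$$

Notice that when $\tilde y$ is of order one, $y = \varepsilon\tilde y$ is of order $\varepsilon$. Rewriting the Navier–Stokes system in terms of the new variables and unknowns yields

$$\begin{cases} \tilde u_{\tilde t} + \tilde u\,\tilde u_{\tilde x} + \tilde v\,\tilde u_{\tilde y} - \tilde u_{\tilde y\tilde y} - \varepsilon^2 \tilde u_{\tilde x\tilde x} + p_{\tilde x} = 0,\\ \varepsilon^2\big(\tilde v_{\tilde t} + \tilde u\,\tilde v_{\tilde x} + \tilde v\,\tilde v_{\tilde y} - \tilde v_{\tilde y\tilde y}\big) - \varepsilon^4 \tilde v_{\tilde x\tilde x} + p_{\tilde y} = 0,\\ \tilde u_{\tilde x} + \tilde v_{\tilde y} = 0. \end{cases} \tag{3}$$

Neglecting the terms of order $\varepsilon^2$ and $\varepsilon^4$ yields

$$\begin{cases} \tilde u_{\tilde t} + \tilde u\,\tilde u_{\tilde x} + \tilde v\,\tilde u_{\tilde y} - \tilde u_{\tilde y\tilde y} + p_{\tilde x} = 0,\\ p_{\tilde y} = 0,\\ \tilde u_{\tilde x} + \tilde v_{\tilde y} = 0. \end{cases} \tag{4}$$

Since $p$ does not depend on $\tilde y$, we deduce that the pressure does not vary within the boundary layer and can be recovered from the Euler system (2) at $y = 0$, namely $p_x(t, x) = -(U_t + U U_x)(t, x, y = 0)$, since $V(t, x, y = 0) = 0$ (here $\mathbf{U} = (U, V)$). Going back to the old variables, we obtain

$$\begin{cases} u_t + u u_x + v u_y - \nu u_{yy} + p_x = 0,\\ u_x + v_y = 0, \end{cases} \tag{5}$$

which is the so-called Prandtl system. It should be supplemented with the following boundary conditions:

$$\begin{cases} u(t, x, y = 0) = v(t, x, y = 0) = 0,\\ (u, v)(t, x, y) \to (U(t, x, 0), 0) \quad \text{as } y \to \infty. \end{cases} \tag{6}$$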
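The scaling $\varepsilon = \sqrt{\nu}$ can be illustrated on the simplest special case of the Prandtl system. The sketch below assumes a flow that does not depend on $x$, with constant outer velocity $U_0$ and $p_x = 0$, so that $v = 0$ and the first Prandtl equation reduces to the heat equation $u_t = \nu u_{yy}$. For initial data $u = U_0$ this has the classical error-function solution. The function names and the 99% threshold used to define the layer thickness are choices made here for illustration only.

```python
import math


def rayleigh_profile(y, t, nu, U0=1.0):
    """Solution of u_t = nu * u_yy with u(t, 0) = 0 and u -> U0 as y -> infinity."""
    return U0 * math.erf(y / (2.0 * math.sqrt(nu * t)))


def layer_thickness(t, nu, U0=1.0, level=0.99):
    """Height y at which u first reaches level * U0, located by bisection."""
    lo, hi = 0.0, 1.0
    # Enlarge the bracket until the profile exceeds the threshold at y = hi.
    while rayleigh_profile(hi, t, nu, U0) < level * U0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if rayleigh_profile(mid, t, nu, U0) < level * U0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def thickness_ratios(viscosities, t=1.0):
    """Thickness divided by sqrt(nu); constant if the layer scales like sqrt(nu)."""
    return [layer_thickness(t, nu) / math.sqrt(nu) for nu in viscosities]


if __name__ == "__main__":
    viscosities = [1e-2, 1e-4, 1e-6]
    for nu, ratio in zip(viscosities, thickness_ratios(viscosities)):
        print(f"nu = {nu:.0e}: thickness / sqrt(nu) = {ratio:.4f}")
```

The printed ratio should be the same for every viscosity, which is the statement that the layer thickness is proportional to $\sqrt{\nu}$ at fixed time in this special case.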
Formally, a good approximation of $\mathbf{u}$ should be $\mathbf{U} + \mathbf{u}_{BL}$, where $\mathbf{U}$ is the solution of the Euler system (2) and $\mathbf{u}_{BL} + \mathbf{U}(t, x, 0)$ is the solution of the Prandtl system (5), (6). Replacing the Navier–Stokes system by the Euler system in the interior and the Prandtl system near the boundary requires a justification. Mathematically, this can be formulated as a convergence theorem when $\nu$ goes to 0; namely, $\mathbf{u} - (\mathbf{U} + \mathbf{u}_{BL})$ goes to 0 when $\nu$ goes to 0 in $L^\infty$ or in some energy space (see Masmoudi, 1998 for a special case). In its full generality, this is still a major open problem in fluid mechanics. The difficulty comes from the well-posedness of the Prandtl system, as well as from the instability of some solutions of the Prandtl system, which may prevent the convergence.

Let us explain the first problem for the steady Prandtl system

$$\begin{cases} u u_x + v u_y - \nu u_{yy} + p_x = 0,\\ u_x + v_y = 0 \end{cases} \tag{7}$$

in $\Omega = \{(x, y) \mid 0 < x < X,\ y > 0\}$ subject to the extra boundary condition $u(x = 0, y) = u_0(y)$. Here, $x$ should be thought of as a time-like variable. If we assume that $U, u_0 \ge 0$ and $u > 0$ if $y > 0$, we can introduce the von Mises transformation

$$(x, y) \to (x, \psi) \quad\text{and}\quad w = u^2,$$

where $\psi_y = u$, $\psi_x = -v$, and $\psi(x, 0) = 0$. In $(x, \psi)$, the steady Prandtl system reads

$$w_x = \nu\sqrt{w}\, w_{\psi\psi} - 2p_x,$$

which is a degenerate parabolic equation, with the boundary conditions

$$w(x, 0) = 0,\qquad w(0, \psi) = w_1(\psi),\qquad w(x, \psi) \to U^2(x) \ \text{ as } \psi \to \infty,$$

where $w_1\big(\int_0^y u_0(s)\,ds\big) = u_0^2(y)$. Using this new equation, one can prove existence for the steady Prandtl system (see Oleinik and Samokhin, 1999). In the case of a favorable pressure gradient, namely $p_x \le 0$, the solution is global ($X = +\infty$). If $p_x > 0$, then a separation of the boundary layer may occur: $x_0$ is said to be a point of separation if $u_y(x_0, 0) = 0$ and $u_y(x, 0) > 0$ for $0 < x < x_0$. Qualitatively, the separation of the boundary layer is caused by a downward flow that drives the boundary layer away from the boundary. In that case, the assumption that the tangential velocity is large compared with the normal one is not valid, and the derivation of the Prandtl system is not justified.

A second obstacle to the convergence can come from the instability of the solution to the Prandtl system itself. Consider a two-dimensional shear flow $\mathbf{u}_s = (u_s(y), 0)$, which is a steady solution of the Euler system. It is well known that the linear stability of such
a flow is linked to the presence of inflection points in the profile $u_s$: a necessary condition for instability is that the profile has an inflection point. The solution to the Prandtl system with initial data $u_s$ and $U = 0$ is just the solution of the heat equation $u_t = \nu u_{yy}$. If the profile $u_s$ is linearly unstable for the Euler system, then this Prandtl solution $u$ is not a good approximation of the solution of the Navier–Stokes system when $\nu$ goes to 0 (see Grenier, 2000).

Boundary layer theory is a very powerful tool in asymptotic analysis and is present in many different fields of partial differential equations, including the boundary layers of magnetohydrodynamic flows. In fluid mechanics and atmospheric dynamics, we can also mention the Ekman layer, which is due to the balance between the viscosity and the rapid rotation of a fluid (Coriolis forces). Different types of boundary layers also arise in kinetic theory, in systems of conservation laws, and in the passage to the limit from a parabolic to a hyperbolic system.
NADER MASMOUDI
See also Fluid dynamics; Navier–Stokes equation
Further Reading
Grenier, E. 2000. On the nonlinear instability of Euler and Prandtl equations. Communications on Pure and Applied Mathematics, 53: 1067–1091
Grenier, E. & Masmoudi, N. 1997. Ekman layers of rotating fluids, the case of well prepared initial data. Communications in Partial Differential Equations, 22: 953–975
Masmoudi, N. 1998. The Euler limit of the Navier–Stokes equations, and rotating fluids with boundary. Archive for Rational Mechanics and Analysis, 142(4): 375–394
Oleinik, O.A. & Samokhin, V.N. 1999. Mathematical Models in Boundary Layer Theory, Boca Raton, FL: Chapman & Hall
Prandtl, L. 1905. Mathematiker-Kongresses. Boundary Layer, Heidelberg: Verhandlung Internationalen, pp. 484–494
E, W. 2000. Boundary layer theory and the zero-viscosity limit of the Navier–Stokes system. Acta Mathematica Sinica, 16(2): 207–218
BOUNDARY VALUE PROBLEMS
For a given ordinary or partial differential equation, a boundary value problem (BVP) requires finding a solution of the equation valid in a given domain and satisfying a set of given conditions on the boundary of that domain. To define a boundary value problem, therefore, one needs to give an equation, a domain, and an appropriate number of functions supported on the boundary of the given domain, defining the boundary conditions. For example,

$$\begin{aligned} &q_t + q_x + q_{xxx} = 0 &&\text{(the PDE)},\\ &x \in [0, \infty),\quad t \in [0, T] &&\text{(the domain)},\\ &q(x, 0) = f_0(x),\quad q(0, t) = g_0(t) &&\text{(the boundary conditions)}. \end{aligned}$$

Finding a solution of a given BVP is more difficult than finding a function that satisfies the PDE, because of the constraint imposed on the solution at the boundary of the domain. Indeed, for nonlinear equations, there exists no general method to find the solution of a given BVP. The question of the solvability of such a problem must also be addressed, and in general, it does not have an easy answer (in the above example, if we prescribe two conditions rather than just one at the boundary $x = 0$, there exists, in general, no solution). The existence and uniqueness of the solution of a given BVP can be guaranteed only when a specific, well-defined number of boundary conditions are prescribed, and this number depends on the highest-order derivatives appearing in the equation with respect to each variable (Coddington & Levinson, 1955).

For linear ordinary differential equations (ODEs), there is a general methodology for solving a BVP, based on defining the particular solution of a related problem. This particular solution is called the Green function associated with the BVP, and it depends on the boundary conditions. The Green function is used to define an integral operator, and if this operator is sufficiently regular, one can use it to express the solution of the original problem (Stakgold, 1979). This approach is powerful and fairly general, but it is not always successful, and it cannot be used for nonlinear equations.

No general methods are available to construct solutions for nonlinear ODEs or even to assert their existence. Most techniques rely on perturbing in some way the solution of an associated linearized problem or an integrable nonlinear problem. If one hopes to extract information about the nonlinear problem by studying a corresponding linearized one, the correct way to linearize must also be evaluated. Examples of such techniques are branching theory, eigenvalue perturbation, and perturbation of the boundary conditions or of the domain.
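As a concrete instance of the Green function method for a linear ODE, the sketch below treats the textbook problem $-u'' = f$ on $[0, 1]$ with $u(0) = u(1) = 0$. This example is not taken from the entry. Its Green function is $G(x, s) = x(1 - s)$ for $x \le s$ and $s(1 - x)$ for $x \ge s$, and the solution is the integral $u(x) = \int_0^1 G(x, s) f(s)\,ds$. The function names and the midpoint quadrature are choices made here for illustration.

```python
def green(x, s):
    """Green function of -u'' = f on [0, 1] with u(0) = u(1) = 0."""
    return x * (1.0 - s) if x <= s else s * (1.0 - x)


def solve_bvp(f, x, n=2000):
    """Approximate u(x) = integral over [0, 1] of G(x, s) f(s) ds (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h  # midpoint of the j-th cell
        total += green(x, s) * f(s)
    return total * h


if __name__ == "__main__":
    # For f = 1 the exact solution is u(x) = x (1 - x) / 2.
    for x in (0.25, 0.5, 0.75):
        print(x, solve_bvp(lambda s: 1.0, x), x * (1.0 - x) / 2.0)
```

Changing the boundary conditions changes $G$, which is the sense in which the Green function depends on the boundary conditions.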
For linear PDEs in two variables, the classical approach for solving a BVP (going back to Jean D’Alembert’s work in the 1740s) is separation of variables. The aim of this technique is to reduce the problem to two distinct linear problems for two ODEs. However, the separability of the problem depends on the specific domain and boundary conditions. For example, depending on the specific boundary conditions prescribed, the ODE one obtains may lead, via the associated Green function, to a non-self-adjoint transform problem, for which few general results are available.

An important theoretical result for the solvability of BVPs for linear PDEs is the fundamental principle of Ehrenpreis (1970), which states that there always exists an appropriate generalization of the Fourier transform capable of representing the general solution of a BVP for a linear PDE in the variables $(x_1, x_2)$, posed in a smooth convex domain. This result assumes the well-posedness of the problem. It then ensures that there exists a measure $\rho(k)$ and a contour $\Gamma$ in the complex plane such that the solution of the problem can be expressed as an integral in the form

$$\int_\Gamma e^{f_1(k)x_1 + f_2(k)x_2}\, d\rho(k),$$

where $k$ is a complex parameter. The functions $f_1(k)$ and $f_2(k)$ are given explicitly. For example, in the case of an evolution equation $u_t = Lu$, where $u = u(x, t)$ and $L$ is a linear differential operator, the representation takes the form

$$u(x, t) = \int_\Gamma e^{ikx - i\omega(k)t}\, d\rho(k), \tag{1}$$

where $\omega(k)$ is the dispersion relation of the equation. In representation (1), the dependence on the solution variables $(x, t)$ is explicit; the integration variable $k$ is called the spectral parameter, and such a representation is said to be spectrally decomposed. However, this result is, in general, not constructive, as $\rho(k)$ and $\Gamma$ are not known. In some cases, it is possible to obtain this representation via separation of variables, but this is not always the case. Consider, for example, the second-order BVP

$$iq_t + q_{xx} = 0,\quad 0 < x < \infty,\quad 0 < t < \infty,\qquad q(x, 0) = q_0(x),\quad q(0, t) = f(t), \tag{2}$$

where $q = q(x, t)$ and it is assumed that all functions are infinitely differentiable and vanish as $x \to \infty$. By separating variables, one obtains an ODE in $x$, which can be solved using the sine transform. Assuming that a unique solution exists, this procedure yields the representation

$$q(x, t) = \frac{2}{\pi}\int_0^\infty \sin(kx)\, e^{-ik^2 t}\left[\hat q_0(k) + ik\int_0^t e^{ik^2 s} f(s)\,ds\right] dk,$$

where

$$\hat q_0(k) = \int_0^\infty \sin(kx)\, q_0(x)\,dx.$$
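The step from the BVP (2) to the sine-transform representation can be written out as follows. This is a sketch under the stated decay assumptions; the symbol $\hat q(k, t)$ for the sine transform of $q$ in $x$ is introduced here for illustration and is not the entry's notation.

```latex
\begin{align*}
\hat q(k,t) &:= \int_0^\infty \sin(kx)\, q(x,t)\, dx, \\
\int_0^\infty \sin(kx)\, q_{xx}\, dx
  &= \Big[\sin(kx)\, q_x - k\cos(kx)\, q\Big]_{x=0}^{x=\infty} - k^2 \hat q
   = k f(t) - k^2 \hat q, \\
i\hat q_t + k f(t) - k^2 \hat q &= 0
  \quad\Longrightarrow\quad
  \hat q_t + i k^2 \hat q = i k f(t), \\
\hat q(k,t) &= e^{-ik^2 t}\Big[\hat q_0(k) + ik \int_0^t e^{ik^2 s} f(s)\, ds\Big], \\
q(x,t) &= \frac{2}{\pi}\int_0^\infty \sin(kx)\, \hat q(k,t)\, dk .
\end{align*}
```

The boundary value $q(0, t) = f(t)$ enters through the integration by parts in the second line, and solving the resulting first-order ODE in $t$ is what places $t$ in the upper limit of the inner integral.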
This representation is not in form (1), as the variable $t$ also appears as a limit of integration. This fact not only makes this representation less convenient for extracting information about the $t$ dependence of the solution but also makes the rigorous proof of existence and uniqueness of a solution more cumbersome, as the relevant integral is not uniformly convergent as $x \to 0$.

For nonlinear PDEs, no general method is available (Logan, 1994). Perturbation techniques can be of some use in the study of evolutionary PDEs of the form $u_t + P(u) = 0$, where $u = u(t, x)$ and $P(u)$ is a nonlinear differential expression containing the $x$-derivatives of $u$. Solutions of this problem such that $u_t = 0$ are called steady-state solutions: these are the solutions that are independent of time. In this context, one studies the linearized stability of the steady state by using the same perturbative techniques discussed for ODEs, as this yields information about the qualitative behavior in time of the solution of the nonlinear problem. The results available in this area are, in general, of limited applicability and practical use for finding explicit solutions.

The special class of nonlinear PDEs known as integrable deserves separate consideration. For these equations, there exists a particular linearizing technique, the inverse scattering transform, which yields the solution of the Cauchy problem. Some of these equations, such as the Korteweg–de Vries and sine-Gordon equations, have been considered on simple domains, and specific BVPs have been solved by ad hoc PDE techniques. The first such result was obtained some 40 years ago (Cattabriga, 1959), but recently this field has witnessed a new surge of interest. To obtain such results, the nonlinear problem is often treated as a linear problem in which the nonlinear term plays the role of a nonhomogeneous (or forcing) term; thus, the analysis rests on the analysis of the linearized equations by classical PDE techniques (Bona et al., 2001). A different approach involves the attempt to extend the inverse scattering linearizing technique to BVPs, as done, for example, in Leon (2002) for the sine-Gordon equation.

Recently, a general approach to solving BVPs for two-dimensional linear PDEs has been proposed and successfully used to solve many different types of such problems (Fokas, 2000). Its relevance is enhanced by the fact that this approach can be generalized to treat integrable nonlinear PDEs. This methodology yields a spectral transform associated directly with the PDE rather than transforms associated with the two ODEs obtained by separating variables.
For Example (2), this yields, for the solution, the representation

$$q(x, t) = \frac{1}{2\pi}\int_0^\infty e^{ikx - ik^2 t}\, \hat q_0(k)\,dk + \frac{1}{2\pi}\int_\Gamma e^{ikx - ik^2 t}\, \hat q(k)\,dk,$$

where $\Gamma$ is the boundary of the first quadrant of the complex $k$-plane, and

$$\hat q(k) = \frac{\hat q_0(k) + \hat q_0(-k)}{2} - k\int_0^\infty e^{ik^2 t} f(t)\,dt.$$

This representation is in Ehrenpreis form, and in addition, the measure and the contour are explicitly constructed. The above-mentioned approach provides a unification of the integral representation of the solution of a linear PDE in terms of the Ehrenpreis fundamental principle with the inverse scattering transform for integrable nonlinear PDEs. Indeed, when the problem reduces to an initial value problem for decaying solutions (i.e., the domain for the spatial variable is the whole real line, and the solution is assumed to vanish at $\pm\infty$), the transform obtained is precisely the inverse scattering transform.

The essential ingredients of this approach are the reformulation of the PDE as the closure condition of a certain differential form, and the definition in the complex plane of a Riemann–Hilbert problem depending on both the PDE and the domain. The differential form can be found algorithmically for linear PDEs and is equivalent to the Lax pair formulation for integrable nonlinear PDEs (Lax, 1968). The solution of this Riemann–Hilbert problem (which can be found in closed form in many cases) takes the role of the classical Green formula and yields an integral representation for the solution, which is independent of the particular boundary conditions and indeed contains all the boundary values of the solution. What this approach crucially provides (when the domain of definition is connected) is a global relation among these boundary values, which is the tool necessary to express the solution only in terms of the given boundary conditions and to prove rigorously the well-posedness of the problem, as well as existence and uniqueness results.
BEATRICE PELLONI
See also Integrability; Inverse scattering method or transform; Riemann–Hilbert problem; Separation of variables
Further Reading
Bona, J., et al. 2001. A non-homogeneous boundary value problem for the Korteweg–de Vries equation. Transactions of the American Mathematical Society, 354: 427–490
Cattabriga, L. 1959. Un problema al contorno per una equazione parabolica di ordine dispari. Annali della Scuola Normale Superiore di Pisa, 13: 163–203
Coddington, E.A. & Levinson, N. 1955. Theory of Ordinary Differential Equations, New York: McGraw-Hill
Ehrenpreis, L. 1970. Fourier Analysis in Several Complex Variables, New York: Wiley-Interscience
Fokas, A.S. 2000. On the integrability of linear and nonlinear PDEs. Journal of Mathematical Physics, 41: 4188
Ghidaglia, J.L. & Colin, T. 2001. An initial-boundary value problem for the Korteweg–de Vries equation posed on a finite interval. Advances in Differential Equations, 6(12): 1463–1492
Lax, P.D. 1968. Integrals of nonlinear equations of evolution and solitary waves. Communications on Pure and Applied Mathematics, 21: 467–490
Leon, J. 2002. Solution of the Dirichlet boundary value problem for the sine-Gordon equation. Physics Letters A, 298: 243–252
Logan, J.D. 1994. An Introduction to Nonlinear Partial Differential Equations, New York: Wiley-Interscience
Stakgold, I. 1979. Green’s Functions and Boundary Value Problems, New York: Wiley-Interscience
BOUSSINESQ EQUATIONS See Water waves
BOX COUNTING See Dimensions
BRAIN WAVES See Electroencephalogram at large scales
BRANCHING LAWS In this entry, we briefly trace the history of a familiar phenomenon, branching, in physical and biological systems and the laws governing them. The simplest type of branching tree is one in which a single conduit enters a vertex and two conduits emerge. This dichotomous process is clearly seen in the patterns of biological systems, such as botanical trees, neuronal dendrites, lungs, and arteries, as well as in the patterns of physical systems, such as lightning, river networks, and fluvial landscapes. The quantification of branching through the construction of the mathematical laws that govern them can be traced back to Leonardo da Vinci (1452–1519). In his Notebooks, he writes (Richter, 1970): All the branches of a tree at every stage of its height when put together are equal in thickness to the trunk [below them]. All the branches of a water [course] at every stage of its course, if they are of equal rapidity, are equal to the body of the main stream.
He also admonished his readers with: “Let no man who is not a Mathematician read the elements of my work.” This statement increases in significance when we consider that da Vinci wrote nearly two centuries before Galileo (Galilei, 1638), who is generally given the credit for establishing the importance of mathematics in modern science. The first sentence in the da Vinci quote is further clarified in subsequent paragraphs of the Notebooks. With the aid of da Vinci’s sketch reproduced in Figure 1, this sentence has been interpreted as follows: if a tree has a trunk of diameter $d_0$ that bifurcates into two limbs of diameters $d_1$ and $d_2$, the three diameters are related by

$$d_0^{\alpha} = d_1^{\alpha} + d_2^{\alpha}. \tag{1}$$
Simple geometrical scaling yields the diameter exponent α = 2, which corresponds to rigid pipes carrying fluid from one level of the tree to the next, while retaining a fixed cross-sectional area through successive generations of bifurcation. Although the pipe model has a number of proponents from hydrology, the diameter exponent for botanical trees was determined empirically by Cecil D. Murray in 1927 to be insensitive to the kind of botanical tree and to have a value 2.59 rather than 2 (Murray, 1927). Equation (1) is referred to as Murray’s law in the physiology literature.
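Equation (1) is simple to evaluate. The sketch below computes the trunk diameter from two limb diameters, and the trunk-to-limb ratio of a symmetric bifurcation, for the two exponents quoted above: $\alpha = 2$ for the pipe model and Murray's empirical $\alpha = 2.59$. The function names and the unit limb diameters are illustrative choices, not part of the entry.

```python
def parent_diameter(d1, d2, alpha):
    """Trunk diameter d0 from Equation (1): d0**alpha = d1**alpha + d2**alpha."""
    return (d1 ** alpha + d2 ** alpha) ** (1.0 / alpha)


def symmetric_ratio(alpha):
    """Ratio d0 / d1 for a symmetric bifurcation (d1 == d2), i.e. 2**(1/alpha)."""
    return 2.0 ** (1.0 / alpha)


if __name__ == "__main__":
    for alpha in (2.0, 2.59):
        d0 = parent_diameter(1.0, 1.0, alpha)
        print(f"alpha = {alpha}: d0 = {d0:.4f}, d0/d1 = {symmetric_ratio(alpha):.4f}")
```

A larger exponent gives a trunk that is thinner relative to its limbs, so the total cross-sectional area grows from one generation to the next instead of staying fixed as in the $\alpha = 2$ pipe model.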
Figure 1. Sketch of tree from Leonardo da Vinci’s Notebooks, PL XXVII (Richter, 1970).
The significance of Murray’s law was not lost on D’Arcy Thompson (1942). In the second edition of his classic On Growth and Form, first published in 1917, Thompson argues that the geometrical properties of biological systems can often be the limiting factor in the development and final function of an organism. This is stated in his principle of similitude, which is a generalization of certain observations made by Galileo regarding the trade-off between the weight and strength of bone (Galilei, 1638). Thompson goes on to argue that the design principle for biological systems is that of energy minimization.

The second sentence in the da Vinci quotation is as suggestive as the first. In modern language, we would interpret it to mean that the flow of a river remains constant as tributaries emerge along the river’s course. This equality must be true in order for the water to continue flowing in one direction and not stop and reverse course at the mouth of a tributary. Using the pipe model introduced above, and minimizing the energy with respect to the pipe radius, yields $\alpha = 3$ in Equation (1). Thus, the value of the diameter exponent obtained empirically by Murray falls between the theoretical limits of geometric self-similarity and hydrodynamic conservation, $2 \le \alpha \le 3$.

In da Vinci’s tree, it is easy to assign a generation number to each of the limbs, but the counting procedure can become complicated in more complex systems like the generations of the bronchial tubes in the mammalian lung. One form taken by the branching laws is that the ratio of the radii of the tubes from one generation to the next is constant, as expressed by the scaling relation

$$\frac{r_j}{r_{j+1}} = R. \tag{2}$$
Equation (2) is analogous to Horton’s law for river trees and fluvial landscapes, which involves the ratio of the number of branches in successive generations of branches, rather than radii (Mandelbrot, 1977). In either case, the parameter $R$ determines the branching law and Equation (2) implies a geometrical self-similarity in the tree, as anticipated by Thompson. In the branching of bronchial airways, $d_1 = d_2$ at each stage of the tree, so that from Equation (1) we deduce the relationship between the radii of the pipes in successive generations as

$$r_j = 2^{1/\alpha}\, r_{j+1}. \tag{3}$$

In the case of the lung, the diameter of an airway is reduced by a factor $2^{-1/3}$ at each generation, since $\alpha = 3$ for the bronchial tree. Therefore, after $j$ generations, $r_j = r_0 \exp(-\lambda j)$, where the exponential rate of reduction, $\lambda = \ln(2)/3$, is the same for each generation beyond the trachea (radius $r_0$), as argued by Weibel (2000). A less space-filling value of the scaling index is obtained for the arterial system, where it is empirically determined that $\alpha = 2.7$. For a general noninteger branching index, the scaling relation Equation (3) defines a fractal tree. Such trees have no characteristic scale length and were first organized and discussed as a class by Benoit Mandelbrot (1977), the father of fractals. The classical approach relied on the assumption that biological processes, like their physical counterparts, are continuous, homogeneous, and regular. However, most biological systems and many physical ones are discontinuous, inhomogeneous, and irregular and are necessarily that way in order to perform a particular function, such as gas exchange in lungs and arteries.

An entirely different kind of fractal tree is that of neuronal dendrites. The branching trees of neurons interleave the brain and form the communication system within the body. In the neurophysiology literature, Equation (1) is known as Rall’s law with $\alpha = 1.5$ (Rall, 1959). More recent measurements of the scaling index, at each generation of dendritic branching, show a change with generation number; that is, the single parameter $R$ is replaced with $R_j$. This nonconstant scaling coefficient implies that Thompson’s principle of similitude is violated. A fractal model of the bronchial tree assumes that the ratio of successive generations of radii is dependent on the generation number, giving rise to a renormalization group relation, with the solution given by

$$r_j = \frac{a(j)}{j^{u}}, \qquad j > 0. \tag{4}$$

Here, the average radius is an inverse power law in the generation number $j$, modulated by a slowly oscillating function $a(j)$ as observed in the human, dog, rat, and hamster data shown in Figure 2 (West & Deering, 1994). In this way, the fractal concept is used as a design principle in biology (Weibel, 2000; West, 1999; Mandelbrot, 1977) and in the development of branching laws.

Figure 2. The variation in diameter $d$ of the bronchial airways is depicted as a function of generation number $j$ for rats, hamsters, humans, and dogs (axes: log diameter in mm, log $d(z)$, versus log generation, log $z$). The modulated inverse power law from the fractal model of the bronchial airway is observed in each case (West & Deering, 1994).

BRUCE J. WEST
See also Fibonacci series; Geomorphology and tectonics; Martingales; Multiplex neuron
Further Reading
Galilei, G. 1638. Dialogue Concerning Two New Sciences, translated by H. Crew & A. deSalvio in 1914, New York: Dover, 1954
Mandelbrot, B.B. 1977. The Fractal Geometry of Nature, San Francisco: W.H. Freeman
Murray, C.D. 1927. A relationship between circumference and weight and its bearing on branching angles. Journal of General Physiology, 10: 725–729
Rall, W. 1959. Theory of physiological properties of dendrites. Annals of New York Academy of Science, 96: 1071–1091
Richter, J.P. 1970. The Notebooks of Leonardo da Vinci, vol. 1, New York: Dover, unabridged edition of the work first published in London in 1883
Thompson, D.W. 1942. On Growth and Form, 2nd edition, Cambridge: Cambridge University Press, republished New York: Dover, 1992
Weibel, E.R. 2000. Symmorphosis: On Form and Function in Shaping Life, Cambridge, MA: Harvard University Press
West, B.J. 1999. Physiology, Promiscuity and Prophecy at the Millennium: A Tale of Tails, Singapore: World Scientific
West, B.J. & Deering, W. 1994. Fractal physiology for physicists: Levy statistics. Physics Reports, 246: 1–100
BREATHERS The term breather (also called a “bion”) arose from studies of the sine-Gordon (SG) equation utt − uxx + sin u = 0,
(1)
which has localized solutions that oscillate periodically with time and decay exponentially in space. Such a
BREATHERS
77
solution of Equation (1) is given by $ λ sin ωt , λ = 1 − ω2 , u(x, t) = 4 tan−1 ω cosh λx
4 3
which is shown in Figure 1. Although the breather of Equation (2) is a nontopological soliton of Equation (1), it can be considered as a bound state of two topological solitons of the SG equation (kink and antikink), one of which is shown in Figure 2(a). The kink and antikink oscillate with respect to each other with the period T = 2π/ω. Thus, such a soliton is also called a “doublet.” A sketch of the bion at small frequencies (ω 1) and large enough t is presented in Figure 2(b). At some initial time, the kink and antikink move outward in opposite directions and separate in space with increasing time up to some finite distance (at t = T /4). The kink and antikink components of the breather never become fully free of distortions in their shapes due to interactions between each other, and finally oscillate in a kind of bound state. At 1 − ω2 1, Equation (2) reduces to a smallamplitude breather u(x, t) = 4 Re ψ(x, t), where ψ(x, t) = −i
λ exp(−iωt) cosh λx
u(x, t)
2
(2)
iψt +
ψxx − ψ
+ |ψ|2 ψ
= 0,
−3 −4 −10
which is regarded as a breather and can be written as ψ(x, t) = φ(x) exp(iωt). In this form, the spatial dependence of the soliton amplitude and the time dependence of the phase (of the complex function ψ) are separated. As a result, the nonlinearity appears only in the amplitude, but not in the phase of the NLS soliton. Although such a separation of the spatial and time dependencies in a soliton expression does not take place in the general form of the SG breather, the limiting case of the SG breather coincides with the amplitude of the NLS soliton. At present, it is not known whether other nonlinear Klein–Gordon equations similar to Equation (1), but differing from it only by the nonlinear term, possess exact breather solutions (Segur & Kruskal, 1987). However, if certain nonlinear terms in a Klein–Gordontype equation differ only slightly from sin u (slightly perturbed SG equation), a breather-like solution may persist in the first order with respect to the perturbation (Birnir et al., 1994). The breather of the SG equation can move along the space coordinate axis with a stationary velocity V . As Equation (1) is a relativistic invariant equation (invariant under a Lorentz transformation
−5
0 x
5
10
Figure 1. u(x, t) from Equation (2) versus x for 26 different times equally spaced and covering one period, with λ = 0.5.
u(x)
x
a u(x) t< T
(3)
(4)
0 −1 −2
t=0, T, T x
√ and λ = 2(1 − ω). Equation (3) is a soliton solution of the nonlinear Schrödinger (NLS) equation 1 2
1
t> T
b Figure 2. (a) A sketch of the sine-Gordon kink. (b) Three profiles of the kink-antikink oscillations.
of the independent variables), one can obtain a moving breather from Equation (2) substituting x → (x − V t)/(1 − V 2 )1/2 and t → (t − V x)/(1 − V 2 )1/2 . Consequently, the moving breathers form a twoparametric (ω and V ) family of solutions of the SG equation. Possible values of the breather parameters ω and V can be compared with the dispersion relation 2 (ω = 1 + k 2 ) for small vibrations (phonons) described by the linearized version of Equation (1). These phonons have frequencies ω > 1 and phase velocities ω/k > 1. A breather frequency, on the other hand, is smaller than the minimum frequency of the phonons (ω < 1), and the breather velocity is smaller than the minimum phonon phase velocity (V < 1). Therefore, the dynamical breather parameters lie outside of the spectrum of the linear vibrations. Although the time dependence of the breather includes the higher temporal harmonics of the oscillations, the phonons cannot be resonantly excited by the breather. Thus, breathers and phonons are asymptotically independent vibrational
modes of the system described by the SG equation. This asymptotic independence of nonlinear excitations (breathers and kinks) and phonons follows from the integrability of Equation (1).

An important way to study nonlinear integrable equations is the inverse scattering method. According to this method, breathers are characterized by poles in the complex phase plane of scattering parameters for the equation under consideration.

It is known that several nonlinear differential equations possess breather solutions. The Landau–Lifshitz (LL) equation provides an example of a nonlinear equation generalizing the results that are described by the SG and NLS equations. The breather-like solution of the LL equation has a more complicated form than the one presented above; however, it is also a two-parameter soliton called a dynamic magnetic soliton (Kosevich et al., 1977). Its oscillatory behavior is characterized by a frequency ω, and its center can move with a velocity V. In the general case, the magnetic soliton can be described by some complex function of x and t, but the time and spatial dependencies are not separated in the analytical expression for such a soliton.

An important class of breathers is that of the so-called discrete breathers (also known as intrinsic localized modes, self-localized anharmonic modes, or nonlinear localized excitations). These are solutions of a nonlinear equation on a lattice, and they are periodic in time and localized in space. Although most such investigations are performed by numerical calculations, there exist nonlinear dynamic equations on a lattice possessing exact analytical breather solutions. One of them is the following discrete version of the NLS equation proposed by Ablowitz and Ladik (AL) (Ablowitz & Ladik, 1976):
\[ i\,\partial_t \psi_n - (\psi_{n+1} + \psi_{n-1})(1 + |\psi_n|^2) = 0. \tag{5} \]
The AL lattice is integrable and it allows for breather-like solutions, the simplest of which has a form close to that of Equation (2):
\[ \psi_n = \frac{\sinh\beta\,\exp(-i\omega t)}{\cosh[\beta(n - x_0)]}, \tag{6} \]
where n is the integer number of a lattice site, x0 = constant, and ω = −2 cosh β.
ARNOLD KOSEVICH
See also Discrete breathers; Discrete nonlinear Schrödinger equations; Integrability; Inverse scattering method or transform; Sine-Gordon equation; Solitons
Further Reading
Ablowitz, M.J. & Ladik, J.F. 1976. Nonlinear differential-difference equations and Fourier analysis. Journal of Mathematical Physics, 17: 1011
Birnir, B., McKean, H.P. & Weinstein, A. 1994. The rigidity of sine-Gordon breathers. Communications on Pure and Applied Mathematics, 47: 1043
Flach, S. & Willis, C.R. 1998. Discrete breathers. Physics Reports, 295: 181
Kosevich, A.M., Ivanov, B.A. & Kovalev, A.S. 1977. Nonlinear localized magnetization wave of a ferromagnet as a bound-state of a large number of magnons. Pis’ma Zhurnal Eksperimental’noy i Teoreticheskoy Fiziki, 25: 516 (in Russian); JETP Letters, 25: 486
Kosevich, A.M., Ivanov, B.A. & Kovalev, A.S. 1990. Magnetic solitons. Physics Reports, 194: 117
Segur, H. & Kruskal, M.D. 1987. Nonexistence of small-amplitude breather solutions in φ⁴ theory. Physical Review Letters, 58: 747
BROUWER’S FIXED POINT THEOREM See Winding numbers
BROWNIAN MOTION In 1828, Robert Brown, a leading botanist, observed that a wide variety of particles suspended in liquid exhibit an intrinsic, irregular motion when viewed under a microscope. While not the first to witness such motion, his experimental focus on this phenomenon, which would bear his name, established its universality and intrinsic nature, thereby raising it as an issue for fundamental scientific inquiry (Nelson, 1967). Deutsch has recently raised the question of whether Brown actually witnessed Brownian motion or fluctuations due to some external contaminating influence (Peterson, 1991). Indeed, while Brownian motion is a ubiquitous phenomenon, not all irregular motions can be ascribed to Brownian motion. The dancing of dust particles in sunlight is dominated by imperceptible turbulent currents, not Brownian motion. True Brownian motion is generally only visible on scales of microns and below, but has important macroscopic ramifications because all microscopic particles manifest it. For example, Brownian motion makes possible both the fine-scale mixing of initially segregated substances in nature and industry and the passive transport of ions, nutrients, and fuel, which allows biological cells to support life.

The origin of Brownian motion remained under debate throughout the 19th century, with Cantoni, Delsaux, Gouy, and C. Wiener proposing that thermal motions in the suspending liquid were responsible, as discussed in Einstein (1956, pp. 86–88), Gallavotti (1999, Chapter 8), and Russel et al. (1989, pp. 65–66). Attempts to examine this hypothesis quantitatively were hampered by the inability to measure accurately the velocity of particles undergoing Brownian motion, since such motion loses coherence over time scales (microseconds) that are shorter than those which experimental observations were able to resolve. In
one of three ground-breaking papers that Einstein published in 1905, he offered a statistical mechanical means for theoretical calculations involving Brownian motion (Einstein, 1956). Einstein realized that the quantity involving Brownian motion that can be best observed under a microscope in an experiment is the “diffusivity”:
\[ D = \lim_{t \to \infty} \frac{\langle |X(t) - X(0)|^2 \rangle}{2t}, \tag{1} \]
where X(t) denotes the observed displacement of the Brownian particle along a fixed direction at time t. In practice, t is simply taken as some satisfactorily long time of observation, and there is no need for fine temporal resolution as there would be if the velocity were to be measured. Einstein employed a random walk model for his analysis and showed that the diffusivity defined in (1) is identical to the diffusion constant that describes the macroscopic evolution of the concentration density n(x, t) of a large number of Brownian particles:
\[ \frac{\partial n(x, t)}{\partial t} = D \sum_{j=1}^{3} \frac{\partial^2 n(x, t)}{\partial x_j^2}. \tag{2} \]
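The definition in (1) can be tried out on the simplest random walk model. The following is a minimal sketch, not Einstein's own calculation: it assumes an unbiased walk with steps of ±`step` every `dt` (both hypothetical parameters), for which the diffusivity is step²/(2 dt), and estimates D from the mean-square displacement at one long time.

```python
import random

def estimate_diffusivity(n_walkers=2000, n_steps=400, step=1.0, dt=1.0, seed=0):
    """Estimate D = <|X(t) - X(0)|^2> / (2 t) for an unbiased +/-step walk."""
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(n_walkers):
        x = 0.0
        for _ in range(n_steps):
            # Each step is +step or -step with equal probability.
            x += step if rng.random() < 0.5 else -step
        total_sq += x * x
    t = n_steps * dt
    mean_sq_displacement = total_sq / n_walkers
    return mean_sq_displacement / (2.0 * t)

D_est = estimate_diffusivity()
D_theory = 1.0 ** 2 / (2.0 * 1.0)  # step^2 / (2 dt) for this walk
print(D_est, D_theory)
```

Only the displacement at the final time enters the estimate, which mirrors the point made above: no fine temporal resolution of the trajectory is needed.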
Through an elegant argument based on equilibrium statistical mechanics, Einstein showed that the diffusivity D of a Brownian particle must be related to its friction coefficient ξ in the following way:
\[ D = \frac{k_B T}{m \xi}, \tag{3} \]
where $k_B$ is Boltzmann’s constant, T is the absolute temperature (measured in kelvins), and m is the particle’s mass. The friction coefficient ξ appears in the relation between the drag force $F_{\mathrm{drag}}$ and velocity v of the particle in steady-state motion (assuming a low Reynolds number):
\[ F_{\mathrm{drag}} = m \xi v. \tag{4} \]
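To get a feel for the magnitudes in (3), here is a small worked calculation. It is only a sketch under stated assumptions: the drag on the particle is taken from Stokes's law for a sphere, mξ = 6πµa, and the particle size, viscosity, and temperature are illustrative values (a sphere of 1 micron diameter in water near 20 °C), not data from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def stokes_einstein_diffusivity(radius_m, viscosity_pa_s, temperature_k):
    """D = k_B T / (m xi), assuming Stokes drag m xi = 6 pi mu a on a sphere."""
    drag_coefficient = 6.0 * math.pi * viscosity_pa_s * radius_m  # m * xi
    return K_B * temperature_k / drag_coefficient

# Illustrative inputs (assumptions, not values from the article).
a = 0.5e-6   # sphere radius in m
mu = 1.0e-3  # dynamic viscosity of water in Pa s
T = 293.15   # absolute temperature in K

D = stokes_einstein_diffusivity(a, mu, T)
# Root-mean-square displacement along one axis after 1 s, from <X^2> = 2 D t.
rms_1s = math.sqrt(2.0 * D * 1.0)
print(D, rms_1s)
```

With these inputs the diffusivity comes out at a few times 10⁻¹³ m²/s, so the particle wanders about a micron per second along any one direction, which is why a microscope is needed to see the motion.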
For a sphere of radius a moving through a fluid with dynamic viscosity µ, the friction coefficient is given by ξ = 6πµa/m. The remarkable property of the “Einstein relation” in (3) is that it links a quantity D pertaining to statistically unpredictable dynamical fluctuations to a quantity ξ, which involves deterministic, steady-state properties. Later work generalized the Einstein relation (3) to “fluctuation-dissipation theorems,” which relate the structure of the spontaneous statistical fluctuations in a wide class of physical systems to the structure of the dissipative (frictional) dynamics (Kubo et al., 1991, Chapter 1). The basic theory of Brownian motion was developed by Einstein in 1905, a time when the premises of the atomic theory of matter were not yet fully agreed upon (Gallavotti, 1999; Nelson, 1967). Einstein realized that a careful observation of Brownian motion and his relation between the diffusivity of
a Brownian particle and its mobility could be used to calculate the number of particles making up a given mass of fluid if the atomic theory were valid. Under a microscope sufficient to resolve the Brownian motion of a particle, all quantities in (3) are directly observable except for Boltzmann’s constant $k_B$. Therefore, the Einstein relation (3) can be used to compute a value for $k_B$ based on Brownian motion data. Now, $k_B$ is in turn related to Avogadro’s number $N_A$, which is the number of molecules in a “mole” (a certain well-defined macroscopic amount) of a substance. The Brownian motion data and the Einstein relation, therefore, furnish an independent prediction for Avogadro’s number $N_A$ and, thereby, the number of molecules per unit mass of the fluid. In other words, the number (and therefore mass) of the individual fluid particles could be calculated without having to observe them at an individual level, an experimental feat that has become possible only in recent years. Instead, their individual mass and number could be assessed through their collective influence on a much larger and, therefore, observable immersed particle. In 1908, Jean Perrin experimentally confirmed that the value of $N_A$ computed in this way agreed with those obtained from other techniques (Gallavotti, 1999), providing strong support for the atomic theory of matter. Since the 1970s, Brownian motion has been investigated in the laboratory through dynamic light scattering techniques (Russel et al., 1989, Chapter 3).

The most idealized mathematical representation of Brownian motion with diffusivity D is defined as $(2D)^{1/2} W(t)$, where W(t) is a canonical continuous random process with Gaussian statistics such that $W(0) = 0$, $\langle W(t) \rangle = 0$, and
\[ \langle (W(t) - W(t'))^2 \rangle = |t - t'|. \tag{5} \]
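The defining properties in (5) can be checked on sampled paths. The sketch below is one simple way to do so, assuming a uniform time grid on which W is built from independent Gaussian increments of variance dt; the function name and grid parameters are illustrative choices.

```python
import math
import random

def wiener_paths(n_paths, n_steps, t_max, seed=1):
    """Sample paths of W on a uniform grid from independent N(0, dt) increments."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    sigma = math.sqrt(dt)
    paths = []
    for _ in range(n_paths):
        w = [0.0]  # W(0) = 0
        for _ in range(n_steps):
            w.append(w[-1] + rng.gauss(0.0, sigma))
        paths.append(w)
    return paths

paths = wiener_paths(n_paths=5000, n_steps=100, t_max=1.0)
i, j = 30, 80  # grid indices for t' = 0.3 and t = 0.8
mean_w = sum(p[j] for p in paths) / len(paths)
mean_sq_incr = sum((p[j] - p[i]) ** 2 for p in paths) / len(paths)
print(mean_w, mean_sq_incr)  # compare with 0 and |t - t'| = 0.5
```

Multiplying such a path by $(2D)^{1/2}$ gives a sample of the idealized Brownian motion with diffusivity D.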
This mathematical Brownian motion is often referred to as the Wiener process (Borodin & Salminen, 2002; Gallavotti, 1999; Nelson, 1967). This idealized Brownian motion has independent increments (no inertia). Physical Brownian motion, of course, has some small inertia as well as several other complicating influences from the fluid environment and from the presence of other nearby Brownian particles (Russel et al., 1989). These extra features can be built into a dynamical description using the mathematical Brownian motion as the basic noise input with influence mediated by the other physical parameters. The mathematical Brownian motion has a similar role in modeling noise input in a wide variety of stochastic models in physics, biology, finance, and other fields. More precisely, the Lévy–Khinchine theorem indicates that any system affected by noise in a continuous way, such that the noise on disjoint time intervals is independent, can be modeled in terms of mathematical Brownian motion (Reichl, 1998, Chapters 4, 5).
Figure 1. Sample trajectories of fractional Brownian motion (x versus t, for 0 ≤ t ≤ 1) using Fourier-wavelet method (Elliott et al., 1997). Top panel: H = 1/3, lower panel: H = 2/3. Both simulations used the same random numbers.
Discontinuous noise-induced jumps, in contrast, are modeled in terms of Poisson processes or more generally Lévy processes (Reichl, 1998, Chapters 4, 5). Continuous noise with long-range correlations (so that the independent increment property is not satisfied), on the other hand, can often be usefully modeled in terms of “fractional Brownian motion” (FBM) (Mandelbrot, 2002). This is an idealized Gaussian random process Z(t) with $Z(0) = 0$, $\langle Z(t) \rangle = 0$, and
\[ \langle (Z(t) - Z(t'))^2 \rangle = |t - t'|^{2H}, \tag{6} \]
where the Hurst exponent H is chosen from the interval 0 < H < 1. The FBM with H = 1/2 corresponds to ordinary Brownian motion with independent increments. FBMs with 1/2 < H < 1 have positive, long-ranged correlations with less rough trajectories and large excursions, while FBMs with 0 < H < 1/2 have negative, long-ranged correlations with rougher trajectories and a more oscillatory character (Figure 1). All FBMs have a statistical self-similarity property; the statistics of the rescaled FBM $\lambda^{-H} Z(\lambda t)$ are identical to those of the original Z(t). That is, these processes have no finite length or time scale associated to them and can be thought of as random fractals. Fractional Brownian motions are therefore particularly appropriate for modeling systems with fluctuations occurring over a wide range of scales; cutoff lengths
can be introduced by filtering an input FBM. Models built from FBMs have been developed in turbulence theory, natural landscape and cloud structures, surface adsorption processes, neural signals in biology, and self-organized critical systems such as earthquakes, forest fires, and sandpiles.
PETER R. KRAMER
See also Fluctuation-dissipation theorem; Fluid dynamics; Fokker–Planck equation; Lévy flights; Random walks
Further Reading
Borodin, A.N. & Salminen, P. 2002. Handbook of Brownian Motion: Facts and Formulae, 2nd edition, Basel: Birkhäuser
Einstein, A. 1956. Investigations on the Theory of the Brownian Movement, edited with notes by R. Fürth, translated by A.D. Cowper, New York: Dover
Elliott, Jr, F.W., Horntrop, D.J. & Majda, A.J. 1997. A Fourier-wavelet Monte Carlo method for fractal random fields. Journal of Computational Physics, 132(2): 384–408
Gallavotti, G. 1999. Statistical Mechanics, Berlin: Springer
Kubo, R., Toda, M. & Hashitsume, N. 1991. Statistical Physics, vol. 2, 2nd edition, Berlin: Springer
Mandelbrot, B.B. 2002. Gaussian Self-affinity and Fractals, Chapter IV. New York: Springer
Mazo, R.M. Brownian Motion. Fluctuations, Dynamics and Applications, Oxford and New York: Oxford University Press
Nelson, E. 1967. Dynamical Theories of Brownian Motion, Princeton, NJ: Princeton University Press
Peterson, I. 1991. Did Brown see Brownian motion? Science News, 139: 287
Reichl, L.E. 1998. A Modern Course in Statistical Physics, 2nd edition, New York: Wiley
Russel, W.B., Saville, D.A. & Schowalter, W.R. 1989. Colloidal Dispersions, Cambridge and New York: Cambridge University Press
BRUSSELATOR The Brusselator is an autocatalytic model involving two intermediates. It illustrates how the fundamental laws of thermodynamics and chemical kinetics as applied to open systems far from equilibrium can give rise to self-organizing behavior and to dissipative structures in the form of temporal oscillations and spatial pattern formation. Chemical kinetics imposes stringent conditions on the concentrations of the species involved in a reaction scheme and on the associated parameters. In a scheme consisting entirely of elementary steps, the overall rates are given (in an ideal system) by mass action kinetics, featuring particular combinations of products of concentrations preceded by stoichiometric coefficients (integer numbers specifying how the relevant constituents are produced or consumed). This guarantees the positivity of the solutions of the mass balance equations. A second condition is detailed balance, which requires that in chemical equilibrium, each individual reaction step is balanced by its inverse (See Detailed balance). This gives rise to relations linking the concentrations of initial reactants and final products to the rate constants, independent of the concentrations of the intermediates. An additional set of requirements stems from the fact that self-organization in chemical kinetics must arise through an instability. One reason for this is that equilibrium and the states in its vicinity, obtained as the constraints are gradually increased (called the “thermodynamic branch”), are stable. To overcome this property and evolve to states that are qualitatively different from equilibrium, new branches of solutions must be generated, which can only take place through the mechanisms of instability and bifurcation. This, in turn, requires that the non-equilibrium constraints exceed a critical value (Glansdorff & Prigogine, 1971). 
Because the evolution laws generated by chemical kinetics at the macroscopic level of description are dissipative, the bifurcating states are attractors, attained by families of initial conditions belonging to an appropriate part of phase space. This guarantees structural stability, that is to say, the robustness of the solution toward small perturbations—a condition to be fulfilled by a model aiming to describe a physical phenomenon. Note that non-equilibrium instabilities and self-organization collapse when the
system becomes closed to the external environment or when the kinetics involves only first-order steps, in which case one obtains a monotonic decay to a unique steady state (Denbigh et al., 1948). Following the pioneering work of Alfred Lotka, several authors in the late 1940s proposed models of open nonlinear systems deriving from mass action kinetics and giving rise to sustained oscillations (Lotka, 1956; Moore, 1949). These models do not have structural stability (as they give rise to a continuum of initial condition-dependent solutions), and they do not exhibit the role of the constraints in a transparent manner (as they are usually formulated in the limit of irreversible reactions). When the non-equilibrium constraints are explicitly accounted for, it is found (Lefever et al., 1967) that there is no instability threshold in these models. As the first known chemical model that is both fully compatible with the laws of thermodynamics and chemical kinetics and generates dissipative structures through non-equilibrium instabilities, the Brusselator is free from such deficiencies.
Model Presentation
In the interest of transparency, one desires a minimal model, and if oscillatory behavior is one of the required properties, this necessitates two coupled variables representing the concentrations of intermediate products. As in the models of the Lotka family, one seeks steps that are not only nonlinear but also include feedback processes, the simplest chemical version of which is autocatalysis. But contrary to these models, one now needs to scan the whole range of near to far from equilibrium situations and to undergo an instability somewhere in this range. As the Lotka-type models contain only second-order steps, a natural solution is to amend them by replacing these steps by a third-order one. This leads to the scheme (Prigogine & Lefever, 1968)
\[ A \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} X, \qquad B + X \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} Y + D, \qquad 2X + Y \underset{k_{-3}}{\overset{k_3}{\rightleftharpoons}} 3X, \qquad X \underset{k_{-4}}{\overset{k_4}{\rightleftharpoons}} E. \tag{1} \]
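For a well-stirred system, the mass action rate equations implied by scheme (1) can be written out step by step. The block below is a worked sketch in which the symbols A, B, D, E, X, Y are reused to denote the concentrations of the corresponding species, and each term is the rate of one forward or reverse step.

```latex
\begin{aligned}
\frac{dX}{dt} &= k_1 A - k_{-1} X \;-\; k_2 B X + k_{-2} Y D \;+\; k_3 X^2 Y - k_{-3} X^3 \;-\; k_4 X + k_{-4} E, \\
\frac{dY}{dt} &= k_2 B X - k_{-2} Y D \;-\; k_3 X^2 Y + k_{-3} X^3 .
\end{aligned}
```

The autocatalytic step contributes the cubic terms: 2X + Y → 3X produces one net X and consumes one Y at rate $k_3 X^2 Y$, and its reverse does the opposite at rate $k_{-3} X^3$. Setting the reverse constants to zero and the forward ones to unity leaves $A - (B + 1)X + X^2 Y$ and $BX - X^2 Y$ on the right-hand sides.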
Hanusse, Tyson, and Light have shown that a two-variable system compatible with the above requirements necessarily comprises a third-order step. Here A, B are the initial reactants, D, E the final products, and X, Y the intermediates: X can be thought of as an activator generating Y at its own expense, which acts as an inhibitor if the B concentration is large. From the standpoint of irreversible thermodynamics, the Brusselator can be driven out of equilibrium through two independent constraints (affinities) related to the
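overall reactions
\[ A \underset{k_{-1} k_{-4}}{\overset{k_1 k_4}{\rightleftharpoons}} E, \qquad B \underset{k_{-3} k_{-2}}{\overset{k_3 k_2}{\rightleftharpoons}} D. \]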
This offers sufficient flexibility to allow one to take the limit of purely irreversible steps and of fixed reactant and product concentrations (also referred to as pool chemical approximation, ensuring that the (X, Y) subsystem becomes open to the external environment), while satisfying the positivity of concentrations, detailed balance, and mass conservation (Lefever et al., 1988). When diffusion is also included, and upon performing a suitable scaling transformation, this leads to the Brusselator equations
\[ \frac{\partial X}{\partial t} = A - (B + 1)X + X^2 Y + D_x \nabla^2 X, \qquad \frac{\partial Y}{\partial t} = BX - X^2 Y + D_y \nabla^2 Y, \tag{2} \]
in which B, $D_x/D_y$, and the system size usually play the role of the parameters controlling the instabilities. A number of variants of this canonical form have also been developed, including the Brusselator in an open well-stirred reactor, the Brusselator in a non-ideal system, including coupling between non-equilibrium instabilities and phase transitions, and coupling with external fields or advection.
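The role of B as a control parameter can be seen already in the spatially uniform case. The following sketch drops the diffusion terms of Equations (2) (an assumption made here for brevity) and integrates the two remaining ordinary differential equations with a hand-written fourth-order Runge–Kutta step. For this reduced system the steady state is X = A, Y = B/A, and linearizing about it gives a trace B − 1 − A² and determinant A², so oscillations are expected to set in for B > 1 + A². The parameter values and function names are illustrative.

```python
def rhs(state, a, b):
    """Right-hand side of the well-stirred Brusselator (no diffusion terms)."""
    x, y = state
    return (a - (b + 1.0) * x + x * x * y, b * x - x * x * y)

def rk4_step(state, dt, a, b):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(state, a, b)
    k2 = rhs((state[0] + 0.5 * dt * k1[0], state[1] + 0.5 * dt * k1[1]), a, b)
    k3 = rhs((state[0] + 0.5 * dt * k2[0], state[1] + 0.5 * dt * k2[1]), a, b)
    k4 = rhs((state[0] + dt * k3[0], state[1] + dt * k3[1]), a, b)
    return (
        state[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
        state[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0,
    )

def late_time_amplitude(a, b, t_end=100.0, dt=0.01):
    """Peak-to-peak excursion of X over the second half of the run."""
    state = (a + 0.1, b / a)  # small perturbation of the steady state (A, B/A)
    n_steps = int(round(t_end / dt))
    xs = []
    for i in range(n_steps):
        state = rk4_step(state, dt, a, b)
        if i >= n_steps // 2:
            xs.append(state[0])
    return max(xs) - min(xs)

amp_below = late_time_amplitude(1.0, 1.5)  # B < 1 + A^2: perturbation decays
amp_above = late_time_amplitude(1.0, 3.0)  # B > 1 + A^2: sustained oscillation
print(amp_below, amp_above)
```

Below the threshold the late-time amplitude is negligible, while above it the trajectory settles on a limit cycle whose amplitude does not depend on the size of the initial perturbation.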
Behavior of the Solutions
Since the first bifurcation analysis of the Brusselator equations (Nicolis & Auchmuty, 1974), several studies have been devoted to the various modes of spatiotemporal organization generated by Equations (2): limit cycles, Turing patterns, and traveling waves in one-dimensional systems (Nicolis & Prigogine, 1977); spatiotemporal chaos arising from the diffusive coupling of local limit cycle oscillators (Kuramoto, 1984); patterns in two- and three-dimensional systems including patterns arising from the interference of different instability mechanisms such as Turing, Hopf (De Wit, 1999); and the effect of confinement (Herschkowitz-Kaufman & Nicolis, 1972). Many phenomena now known to be generic were first discovered in these Brusselator-based analyses, which have also helped to test the limits of traditional theoretical approaches and to explore new methodologies such as normal forms and phase dynamics. The Brusselator has also been used to explore possible thermodynamic signatures of dissipative structures. No clear-cut tendencies seem to exist, suggesting that global thermodynamic quantities like entropy and entropy production do not provide adequate measures of dynamic complexity. Finally, attention has been focused on the new insights afforded when the mean-field equations (2) are augmented to account for fluctuations, a study for which the
Brusselator is well suited thanks to its mechanistic basis. The interest here is to provide a fundamental understanding of how large-scale order can be sustained despite the locally prevailing thermal disorder. Early accounts of the results, with emphasis on critical behavior in the vicinity of bifurcations, can be found in Nicolis & Prigogine (1977) and in Walgraef et al. (1982).
G. NICOLIS
See also Chemical kinetics; Detailed balance; Emergence; Turing patterns
Further Reading
Denbigh, K.G., Hicks, M. & Page, F.M. 1948. Kinetics of open systems. Transactions of the Faraday Society, 44: 479–494
De Wit, A. 1999. Spatial patterns and spatio-temporal dynamics in chemical systems. Advances in Chemical Physics, 109: 453–513
Glansdorff, P. & Prigogine, I. 1971. Thermodynamic Theory of Structure, Stability and Fluctuations, London and New York: Wiley
Herschkowitz-Kaufman, M. & Nicolis, G. 1972. Localized spatial structures and nonlinear chemical waves in dissipative systems. Journal of Chemical Physics, 56: 1890–1895
Kuramoto, Y. 1984. Chemical Oscillations, Waves and Turbulence, Berlin and New York: Springer
Lefever, R., Nicolis, G. & Prigogine, I. 1967. On the occurrence of oscillations around the steady state in systems of chemical reactions far from equilibrium. Journal of Chemical Physics, 47: 1045–1047
Lefever, R., Nicolis, G. & Borckmans, P. 1988. The Brusselator: it does oscillate all the same. Journal of the Chemical Society, Faraday Transactions 1, 84: 1013–1023
Lotka, A. 1956. Elements of Mathematical Biology, New York: Dover
Moore, M.J. 1949. Kinetics of open reaction systems: chains of simple autocatalytic reactions. Transactions of the Faraday Society, 45: 1098–1109
Nicolis, G. & Auchmuty, J.F.G. 1974. Dissipative structures, catastrophes, and pattern formation: a bifurcation analysis. Proceedings National Academy of Sciences USA, 71: 2748–2751
Nicolis, G. & Prigogine, I. 1977. Self-organization in Nonequilibrium Systems, New York: Wiley
Prigogine, I. & Lefever, R. 1968. Symmetry-breaking instabilities in dissipative systems. II. Journal of Chemical Physics, 48: 1695–1700
Walgraef, D., Dewel, G. & Borckmans, P. 1982. Nonequilibrium phase transitions and chemical instabilities. Advances in Chemical Physics, 491: 311–355
BULLETS See Solitons, types of
BURGERS EQUATION In 1915, Harry Bateman considered a nonlinear equation whose steady solutions were thought to describe certain viscous flows (Bateman, 1915). This equation, modeling a diffusive nonlinear wave, is now widely known as the Burgers equation, and is given by
\[ u_t + u u_x = \frac{\mu}{2} u_{xx}, \tag{1} \]
where µ is a constant measuring the viscosity of the fluid. It is a nonlinear parabolic equation, simply describing a temporal evolution where nonlinear convection and linear diffusion are combined, and it can be derived as a weakly nonlinear approximation to the equations of gas dynamics. Although nonlinear, Equation (1) is very simple, and interest in it was revived in the 1940s, when Dutch physicist Jan Burgers proposed it to describe a mathematical model of turbulence in gas (Burgers, 1940). As a model for gas dynamics, it was then studied extensively by Burgers (1948), Eberhard Hopf (1950), Julian Cole (1951), and others, in particular after the discovery of a coordinate transformation that maps it to the heat equation. While as a model for gas turbulence the equation was soon rivaled by more complicated models, the linearizing transformation just mentioned added importance to the equation as a mathematical model, which has since been extensively studied. The limit µ → 0 is a hyperbolic equation, called the inviscid Burgers equation:
\[ u_t + u u_x = 0. \tag{2} \]
This limiting equation is important because it provides a simple example of a conservation law, capturing the crucial phenomenon of shock formation. Indeed, it was originally introduced as a model to describe the formation of shock waves in gas dynamics. A first-order partial differential equation for u(x, t) is called a conservation law if it can be written in the form $u_t + (f(u))_x = 0$. For Equation (2), $f(u) = u^2/2$. Such conservation laws may exhibit the formation of shocks, which are discontinuities appearing in the solution after a finite time and then propagating in a regular manner. When this phenomenon arises, an initially smooth wave becomes steeper and steeper as time progresses, until it forms a jump discontinuity, the shock. Once a discontinuity forms, the solution is no longer a globally differentiable function; thus, the sense in which it can be considered as a solution of the PDE must be clarified. A discontinuous function u(x, t) can still be considered as a solution in the weak sense if it satisfies
\[ \iint_D \left( u\,\varphi_t + \tfrac{1}{2} u^2 \varphi_x \right) dx\,dt = 0, \tag{3} \]
where D is any rectangle in the (x, t) plane, and ϕ(x, t) is any smooth function vanishing on the boundary ∂D. Any regular solution is a weak solution, as is seen by
multiplying the equation by ϕ(x, t), integrating over D, and using Green’s theorem (the boundary term along ∂D vanishes because ϕ does). In physical applications, one often considers the discontinuous solution as the limit, as µ → 0, of smooth solutions of the viscous Equation (1). This idea is correct from a physical point of view, as it takes into account the significance of these solutions as a physical description of gas dynamics. From the form of the equation (or its weak formulation (3)), one can derive the velocity s of a shock separating two regimes, $u_r$ to the right and $u_l$ to the left of a discontinuity. The result is the Rankine–Hugoniot formula, valid in general for conservation laws, which for the case of the Burgers equation yields
\[ s = \tfrac{1}{2}(u_r + u_l). \tag{4} \]
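Formula (4) is the special case, for $f(u) = u^2/2$, of the general jump condition $s = (f(u_r) - f(u_l))/(u_r - u_l)$. It can be checked numerically with a conservative finite-volume scheme. The sketch below is one possible check, not a method taken from the article: it assumes a first-order Godunov scheme, step initial data with $u_l > u_r$ placed at x = 0, and illustrative grid parameters, and it locates the shock as the first cell whose value has dropped below the mean of the two states.

```python
def godunov_flux(ul, ur):
    """Exact Riemann flux for f(u) = u^2 / 2 at a cell interface."""
    if ul > ur:  # shock: upwind according to the sign of the shock speed
        return 0.5 * ul * ul if 0.5 * (ul + ur) > 0.0 else 0.5 * ur * ur
    if ul > 0.0:  # rarefaction moving right
        return 0.5 * ul * ul
    if ur < 0.0:  # rarefaction moving left
        return 0.5 * ur * ur
    return 0.0    # transonic rarefaction (or u = 0)

def measured_shock_speed(u_left=2.0, u_right=0.0, x_min=-1.0, x_max=3.0,
                         n_cells=400, t_end=1.0, cfl=0.4):
    """Evolve step data (requires u_left > u_right) and return shock position / t_end."""
    dx = (x_max - x_min) / n_cells
    centers = [x_min + (i + 0.5) * dx for i in range(n_cells)]
    u = [u_left if x < 0.0 else u_right for x in centers]
    dt = cfl * dx / max(abs(u_left), abs(u_right))
    n_steps = int(round(t_end / dt))
    dt = t_end / n_steps  # land exactly on t_end
    for _ in range(n_steps):
        padded = [u_left] + u + [u_right]  # fixed ghost cells at both ends
        flux = [godunov_flux(padded[i], padded[i + 1]) for i in range(n_cells + 1)]
        u = [u[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n_cells)]
    mid = 0.5 * (u_left + u_right)
    x_shock = next(x for x, value in zip(centers, u) if value < mid)
    return x_shock / t_end  # the shock starts at x = 0

speed = measured_shock_speed()
print(speed, 0.5 * (2.0 + 0.0))  # measured speed versus (u_r + u_l) / 2
```

Because the scheme is written in conservation form, the discrete shock moves at the Rankine–Hugoniot speed up to an error of the order of the cell size.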
Even this, however, is not enough to guarantee uniqueness of the solution, because there are several ways of writing the equation in the form of a conservation law. Often, the way to select the physically relevant solution is to consider the vanishing viscosity limit. To obtain this solution mathematically, an additional entropy condition, $u_l > s > u_r$, must be imposed.

Besides its significance as a model for shocks, the Burgers equation is prominent among PDEs because it is completely integrable. Indeed, the nonlinear change of variable
\[ u = -\mu (\log \psi)_x \tag{5} \]
transforms Equation (1) into the heat equation $\psi_t = \frac{\mu}{2} \psi_{xx}$, with initial conditions transforming simply into initial conditions for this latter equation: if u(x, 0) = f(x) is the given condition, then the corresponding initial condition for the heat equation is given by
\[ \psi(x, 0) = \exp\left( -\frac{1}{\mu} \int_0^x f(\xi)\, d\xi \right). \]
The relation between the Burgers and the heat equation was already mentioned in an earlier book (Forsyth, 1906), but the former had not been recognized as physically relevant; hence, the importance of this connection was seemingly not noticed at the time. Using the transformation of Equation (5), known as the Cole–Hopf transformation, it is easy to solve the initial value problem for this equation. Recently, a generalization of the Cole–Hopf transformation has been successfully used to linearize the boundary value problem for the Burgers equation posed on the semiline x > 0 (Calogero & De Lillo, 1989). The existence of this linearizing transformation, which is a transformation of Bäcklund type (Rogers & Shadwick, 1982) relating the solutions of two different PDEs, stimulated work to extend this approach to generalized versions of the Burgers equation, such as the Korteweg–de Vries–Burgers equation, given by
\[ u_t + u u_x = \frac{\mu}{2} u_{xx} - \varepsilon u_{xxx}, \qquad \varepsilon > 0. \tag{6} \]
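For the Burgers equation itself (that is, without the dispersive term), the effect of the Cole–Hopf transformation (5) can be verified in a few lines. The derivation below is a sketch that introduces the auxiliary function $\phi = \log\psi$ (a name chosen here for convenience) and absorbs the function of t arising from the integration in x into ψ.

```latex
\begin{aligned}
&u = -\mu\,\phi_x, \qquad \phi = \log\psi, \\
&u_t + \left(\tfrac{1}{2}u^2\right)_x = \tfrac{\mu}{2}\,u_{xx}
\;\Longrightarrow\;
-\mu\,\phi_{xt} + \tfrac{\mu^2}{2}\left(\phi_x^2\right)_x = -\tfrac{\mu^2}{2}\,\phi_{xxx}, \\
&\text{integrating once in } x:\qquad
\phi_t = \tfrac{\mu}{2}\left(\phi_{xx} + \phi_x^2\right), \\
&\psi = e^{\phi}:\qquad
\psi_t = \psi\,\phi_t, \quad \psi_{xx} = \psi\left(\phi_{xx} + \phi_x^2\right)
\;\Longrightarrow\;
\psi_t = \tfrac{\mu}{2}\,\psi_{xx}.
\end{aligned}
```

With the normalization of Equation (1), the resulting heat equation therefore has diffusivity µ/2. Repeating the same steps with the extra term $-\varepsilon u_{xxx}$ leaves contributions that are not linear in ψ, which is consistent with the absence of a directly linearizing transformation for Equation (6).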
Although it was found that such a directly linearizing transformation did not exist, efforts in this direction were rewarded by several discoveries. Indeed, the importance of a linearizing transformation became evident when the inverse scattering transform (IST) was discovered, leading to the full analytical understanding of the solution of the Cauchy problem for the Korteweg–de Vries (KdV) equation, and later of all integrable evolution equations in one spatial dimension, such as the KdV, the nonlinear Schrödinger, and the sine-Gordon equations. A crucial step in the discovery of the IST was an observation made by Robert Miura. In analogy to gas dynamics, he noted that one needs conservation laws to compute jump conditions across the region where the solution is small and essentially dispersive, isolating the solitonic part, which is thought of as a kind of reversible shock. This led to the connection between the KdV and modified KdV equation via the Miura transformation and eventually to the IST through which these nonlinear equations are solved through a series of linear problems (Gardner et al., 1967).

Nowadays, the Burgers equation is used as a simplified model of a kind of hydrodynamic turbulence (Case & Chiu, 1969), called Burgers turbulence. Burgers himself wrote a treatise on the equation now known by his name (Burgers, 1974), where several variants are proposed to describe this particular kind of turbulence. Generalizations such as the KdV–Burgers equation (6) arose from the need to model more complicated physical situations and introduce more factors than those that the Burgers equation takes into account. Lower-order friction terms may be considered that reduce the amplitude of the wave, although in a different scale and manner than the reduction due to the higher-order diffusion term $u_{xx}$. For example, the KdV–Burgers equation is an appropriate model when a different higher-order amplitude-reducing effect, namely dispersion, is introduced.
Depending on the relative sizes of µ and ε, this equation may exhibit either an essentially shock-like structure, with a dispersive tail present, or mainly dispersive phenomena; thus, Equation (6) has been proposed as a natural model for hydrodynamic turbulence. In the context of the study of gas dynamics (particularly turbulent and vorticity phenomena), the Burgers equation has also been used to model phase diffusion along vortex filaments.
BEATRICE PELLONI
See also Constants of motion and conservation laws; Inverse scattering method or transform; Shock waves; Turbulence
Further Reading
Bateman, H. 1915. Some recent research on the motion of fluids. Monthly Weather Review, 43: 163–170
Burgers, J. 1940. Application of a model system to illustrate some points of the statistical theory of free turbulence. Proceedings of the Nederlandse Akademie van Wetenschappen, 43: 2–12
Burgers, J. 1948. A mathematical model illustrating the theory of turbulence. Advances in Applied Mechanics, 1: 171–199
Burgers, J. 1974. The Nonlinear Diffusion Equation: Asymptotic Solutions and Statistical Problems, Dordrecht and Boston: Reidel
Calogero, F. & De Lillo, S. 1989. The Burgers equation on the semiline. Inverse Problems, 5: L37
Case, K.M. & Chiu, S.C. 1969. Burgers turbulence models. Physics of Fluids, 12: 1799–1808
Cole, J. 1951. On a quasilinear parabolic equation occurring in aerodynamics. Quarterly Journal of Applied Mathematics, 9: 225–236
Forsyth, A.R. 1906. Theory of Differential Equations, Cambridge: Cambridge University Press
Gardner, C.S., Greene, J.M., Kruskal, M.D. & Miura, R.M. 1967. Method for solving the Korteweg-de Vries equation. Physical Review Letters, 19: 1095–1097
Hopf, E. 1950. The partial differential equation u_t + u u_x = µ u_xx. Communications in Pure and Applied Mathematics, 3: 201–230
Lax, P.D. 1973. Hyperbolic Systems of Conservation Laws and the Mathematical Theory of Shock Waves, Philadelphia: Society for Industrial and Applied Mathematics
Newell, A.C. (editor). 1974. Nonlinear Wave Motion, Providence, RI: American Mathematical Society
Rogers, C. & Shadwick, W.F. 1982. Bäcklund Transformations and Their Applications, New York: Academic Press
Sachdev, P.L. 1987. Nonlinear Diffusive Waves, Cambridge and New York: Cambridge University Press
Smoller, J. 1983. Shock Waves and Reaction-Diffusion Equations, Berlin and New York: Springer
Whitham, G.B. 1974. Linear and Nonlinear Waves, New York: Wiley-Interscience
BUTTERFLY EFFECT The Butterfly Effect serves as a metaphor for what in technical language is called “sensitive dependence on initial conditions” or “deterministic chaos,” the fact that small causes can have large effects. As recounted by Gleick (1987, Chapter 1), in the early 1960s, Edward Lorenz was carrying out computer simulations on a 12-dimensional weather model. One day, he decided to run a particular time series for longer. In order to save time, he restarted his code from data from a previous printout. After returning from a coffee break, he found that the weather simulation had diverged sharply from that of his earlier run. After some checks, he could only conclude that the difference was caused by the difference in initial conditions: he had typed in only the first three of the six decimal digits that the computer worked with internally. Apparently, his assumption that the fourth digit would be unimportant was false. Lorenz realized the importance of his observation: “If, then, there is any error whatever in observing the present state—and in any real system such errors seem inevitable—an acceptable prediction of an instantaneous state in the distant future may well be
impossible” (Lorenz, 1963, p. 133). Indeed, the error made by discarding the fourth and higher digits is so small that it can be imagined to represent the effect of the flap of the wings of a butterfly. Lorenz originally used the image of a seagull. The more lasting name seems to have come from his address at the annual meeting of the American Association for the Advancement of Science in Washington, December 29, 1972, which was entitled “Predictability: does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?” The text of this talk was never published but is presented in its original form as an appendix in Lorenz (1993).
Sensitive dependence on initial conditions forces us to distinguish between determinism and predictability, two concepts often confused by scientists and popular writers alike. Determinism has to do with how Nature (or, less ambitiously, any system under consideration) behaves, while predictability has to do with what we, human beings, are able to observe, analyze, and compute. We have determinism if we have a law or a formula describing exactly, and fully, how the system behaves given its present state. To have predictability we need, in addition, to be able to measure the present state of the system with sufficient precision and to compute with the given formula (to solve the equations) in a sufficiently accurate computational scheme.
Determinism is most famously expressed by Pierre-Simon Laplace (1814, p. 2):
An intelligence that, at a given instant, could comprehend all the forces by which nature is animated and the respective situation of the beings that make it up, if moreover it were vast enough to submit these data to analysis, would encompass in the same formula the movements of the greatest bodies of the universe and those of the lightest atoms. For such an intelligence nothing would be uncertain, and the future, like the past, would be open to its eyes.
Laplace’s dramatic statement is often erroneously interpreted as a belief in perfect predictability, now rendered untenable by chaos theory. But he was describing determinism: given the state of the system (the universe) at some time, we have a formula (a set of differential equations) that gives, in principle, the state of the system at any later time. Nowhere will one find a claim about the computability, by us humans, of all the consequences of the laws of mechanics. Indeed, the quote appears in the introduction of a book on probability. Laplace is, in fact, assuming incomplete knowledge from the start and uses probabilities to make rational inferences. If it were not for quantum mechanics, Laplace’s statement would still stand, unaffected by deterministic chaos.
To illustrate the problems with computability, consider the simple but important example of the (deterministic) Bernoulli shift map defined by $f : [0, 1] \to [0, 1]$, $x_{n+1} = 2x_n \pmod{1}$.
On numbers in binary representation, this map has a particularly simple effect: shift the binary point one place to the right and discard the first digit. For example, if $x_0 = 0.10110$ (which corresponds to the decimal 0.6875), then $x_1 = 0.01100$ (decimal 0.375). Now, any rational starting number $x_0$ is represented by an (eventually) repeating sequence of 0s and 1s and hence leads to an (eventually) periodic orbit of $f$, while any irrational $x_0$ is represented by a nonrepeating sequence of 0s and 1s and hence leads to a nonperiodic orbit. This latter sequence would appear as unpredictable as the sequence of heads and tails generated by flipping a coin, the quintessentially random process. Since there is an irrational number arbitrarily close to every rational number and vice versa, the map exhibits a sensitive dependence on initial conditions. In practice, on a computer, numbers are always represented with finite precision; hence, the computations become completely meaningless once, after a finite number of iterations, all significant digits have been removed. In the standard 32-bit (4-byte) floating point arithmetic with 23-bit mantissa, this will be after roughly 23 iterations.
The significance of the Bernoulli shift map is that dynamical systems theory tells us that its dynamics lies at the heart of the so-called “horseshoe dynamics,” which in turn is commonly found in (the wide class of) systems with homoclinic (i.e., expanding and reinjecting) orbits (Wiggins, 1988). It means that in many situations, all we can say about a system’s dynamics is of a statistical nature. A quantitative measure of the sensitivity on initial conditions, and therefore a measure of the predictability horizon, is provided by the leading Lyapunov exponent.
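The behavior of the shift map described above can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function names are invented here, Python floats are 64-bit (53-bit significand) rather than the 32-bit format mentioned in the text, and the pair of starting values 0.506127 and 0.506 is an illustrative stand-in for a six-digit state and its three-digit truncation.

```python
from fractions import Fraction

def shift(x):
    """One step of the Bernoulli shift map x -> 2x (mod 1)."""
    return (2 * x) % 1

def orbit(x0, n):
    """Return the list [x0, x1, ..., xn]."""
    xs = [x0]
    for _ in range(n):
        xs.append(shift(xs[-1]))
    return xs

def circle_distance(a, b):
    """Distance between two points of [0, 1) with the endpoints identified."""
    d = (a - b) % 1
    return min(d, 1 - d)

def steps_until_zero(x0, max_steps=200):
    """Count iterations until a floating-point orbit is exactly 0.0."""
    x = x0
    for n in range(max_steps):
        if x == 0.0:
            return n
        x = shift(x)
    return None

# Worked example from the text: 0.10110 (binary) maps to 0.01100 (binary).
assert shift(0.6875) == 0.375

# Exact rational arithmetic: the orbit of 1/10 settles onto a repeating cycle.
exact = orbit(Fraction(1, 10), 9)

# Floating point: each step discards one significant bit until none remain.
collapse = steps_until_zero(0.1)

# Restart from a truncated value (illustrative numbers, exact arithmetic).
full = orbit(Fraction(506127, 10**6), 12)
cut = orbit(Fraction(506, 10**3), 12)
gaps = [float(circle_distance(a, b)) for a, b in zip(full, cut)]

print(exact[1:6], collapse)
print([round(g, 6) for g in gaps])
```

The gap between the two exact orbits doubles at every step until it is comparable to the size of the interval, so an initial discrepancy in the fourth decimal place becomes an order-one error after about a dozen iterations. The floating-point orbit of 0.1 reaches zero after a number of steps set by the width of the significand, which is why the horizon is longer for 64-bit floats than the figure quoted for 32-bit arithmetic.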
The possibility of small causes having large effects (in a perfectly deterministic universe) was anticipated by many scientists before Lorenz, and even before the birth of dynamical systems theory, which is generally accepted to have its origins in Poincaré’s work on differential equations toward the end of the 19th century. Maxwell (1876, p. 20) wrote: “There is a maxim which is often quoted, that ‘The same causes will always produce the same effects.’” After discussing the meaning of this principle, he adds: “There is another maxim which must not be confounded with [this], which asserts that ‘Like causes produce like effects.’ This is only true when small variations in the initial circumstances produce only small variations in the final state of the system.” He then gives the example of how a small displacement of railway points sends a train on different courses. Others have often used the image of the weather:
Wiener (1954/55): It is quite conceivable that the general outlines of the weather give us a good, large picture of its course for hours or possibly even for days. However, I am profoundly skeptical of the unimportance of the unobserved part of the weather for longer periods. To assume that these factors which determine the infinitely complicated pattern of the winds and the temperature will not in the long run play their share in determining major features of weather, is to ignore the very real possibility of the self-amplification of small details in the weather map. A tornado is a highly local phenomenon, and apparent trifles of no great extent may determine its exact track. Even a hurricane is probably fairly local where it starts, and phenomena of no great importance there may change its ultimate track by hundreds of miles.
Poincaré (1908, p. 67): Why have meteorologists such difficulty in predicting the weather with any certainty? Why is it that showers and even storms seem to come by chance, so that many people think it quite natural to pray for rain or fine weather, though they would consider it ridiculous to ask for an eclipse by prayer? We see that great disturbances are generally produced in regions where the atmosphere is in unstable equilibrium. The meteorologists see very well that the equilibrium is unstable, that a cyclone will be formed somewhere, but exactly where they are not in a position to say; a tenth of a degree more or less at any given point, and the cyclone will burst here and not there, and extend its ravages over districts it would otherwise have spared. If they had been aware of this tenth of a degree, they could have known it beforehand, but the observations were neither sufficiently comprehensive nor sufficiently precise, and that is the reason why it all seems due to the intervention of chance.
Even earlier, Franklin (1898, p. 173) had used an analogy surprisingly similar to Lorenz’s: . . . an infinitesimal cause may produce a finite effect. Long range detailed weather prediction is therefore impossible, and the only detailed prediction which is possible is the inference of the ultimate trend and character of a storm from observations of its early stages; and the accuracy of this prediction is subject to the condition that the flight of a grasshopper in Montana may turn a storm aside from Philadelphia to New York! Duhem (1954, p. 141) used Hadamard’s theorem of 1898 on the complicated geodesic motion on surfaces of negative curvature to “expose fully the absolutely irremediable physical uselessness of certain mathematical deductions." If such incomputable behavior is possible in mechanics, “the least complex of physical theories,” Duhem goes on to ask rhetorically, “Should we not meet that ensnaring conclusion in a host of other, more complicated problems, if it were possible to analyse the solutions closely enough?”
Many had contemplated the possibility of sensitive dependence on initial conditions, but Lorenz was the first to see it actually happening quantitatively in the numbers spit out by his Royal McBee computing machine, and to be sufficiently intrigued by it to study it more closely in the delightfully simple system of equations now bearing his name (Lorenz, 1963). Indeed, while most scientists, with Duhem, had looked to complicated systems for unpredictable behavior, Lorenz found it in simple ones and thereby made it amenable to analysis.
GERT VAN DER HEIJDEN
See also Chaotic dynamics; Determinism; General circulation models of the atmosphere; Horseshoes and hyperbolicity in dynamical systems; Lorenz equations
Further Reading
Bricmont, J. 1995. Science of chaos or chaos in science? Physicalia Magazine, 17: 159–208
Duhem, P. 1954. The Aim and Structure of Physical Theory, translated by Ph.P. Wiener, Princeton: Princeton University Press (original French edition, La Théorie Physique: Son Objet, Sa Structure, 2ème éd., Paris: Marcel Rivière & Cie, 1914; 1st edition, 1906)
Franklin, W.S. 1898. A book review of P. Duhem, Traité Elémentaire de Mécanique Chimique fondée sur la Thermodynamique, vols. I and II, Paris, 1897, The Physical Review, 6: 170–175
Gleick, J. 1987. Chaos: Making a New Science, London: Heinemann, and New York: Viking
Laplace, P.-S. 1814. Philosophical Essay on Probabilities, translated from the fifth French edition of 1825 by A.I. Dale, Berlin and New York: Springer, 1995 (first edition published in French, 1814)
Lorenz, E.N. 1963. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20: 130–141
Lorenz, E.N. 1993. The Essence of Chaos, Seattle: University of Washington Press and London: University College London Press
Maxwell, J.C. 1876. Matter and Motion, New York: Van Nostrand
Poincaré, H. 1908. Chance. In Science and Method, pp. 64–90, translated by F. Maitland, London: Thomas Nelson and Sons, 1914 (original French edition, Science et Méthode, Paris: E. Flammarion, 1908)
Wiener, N. 1954/55. Nonlinear prediction and dynamics. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Berkeley: University of California Press, vol. 3, pp. 247–252; Mathematical Reviews, 18: 949 (1957); reprinted in 1981. Norbert Wiener: Collected Works with Commentaries, vol. III, edited by P. Masani, Cambridge, Massachusetts: MIT Press, pp. 371–376
Wiggins, S. 1988. Global Bifurcations and Chaos, Berlin and New York: Springer
C
CALOGERO–MOSER MODEL
See Particles and antiparticles
CANDLE
At about the time that John Scott Russell was systematically studying hydrodynamic solitons on a Scottish canal, Michael Faraday, the brilliant English experimental physicist and physical chemist, organized his annual Christmas Lectures on facets of natural philosophy for young people. These included a series on the candle that began with the claim (Faraday, 1861):
There is no better, there is no more open door by which you can enter into the study of natural philosophy than by considering the physical phenomena of a candle.
Although this assertion may have startled some of his listeners, Faraday went on to support it with a sequence of simple yet elegant experiments that clearly expose the structure and composition of a candle flame, demonstrating a stream of energy-laden vapor feeding into the flame and suggesting an analogy with the process of respiration in living organisms (Day & Catlow, 1994). (An engraving showing Faraday presenting one of these lectures can be found on a recent British 20-pound note.)
While the details are intricate (Fife, 1988), the flame of a candle can be regarded globally as a dynamic process balancing two flows of energy: the rate at which energy is dissipated by the flame (through emission of heat and light) and the rate at which energy is released from the wax as the flame eats its way down the candle. Let us define variables as follows.
• P is the power dissipation by the flame, in units of (say) joules per hour.
• E is the chemical energy stored in the wax of the candle, in units of (say) joules per centimeter.
• v is the speed at which the flame moves down the candle.
If the rates of dissipation and energy input are equal, then the velocity of the flame is determined by the power balance condition
$$P = vE, \tag{1}$$
implying v = P/E (cm/h). Metaphorically, the flame must digest energy at the same rate at which it is eaten. Consider a family of cylindrical candles with various diameters (d), and assume the dissipation rates of their flames to be independent of the sizes of the candles. Since stored chemical energy is proportional to the area of cross section ($E \propto d^2$), power balance implies
$$v \propto 1/d^2. \tag{2}$$
Some measured flame speeds for typical candles are plotted on a log–log scale in Figure 1, where the dashed line of slope −2 indicates a $1/d^2$ dependence. From this figure, it is evident that the inverse square dependence of Equation (2) is obeyed for larger candles. For smaller candles, v is somewhat less than expected, because the flames are not so large.
Figure 1. Measurements of flame speeds (v) for candles of different diameters (d). The error bars indicate rms deviations of about six individual measurements. (Data courtesy of Lela Scott MacNeil Scott (2003).) [Log–log plot of flame speed (cm/hour) against diameter (cm), with a dashed line of slope −2.]
Although Equation (2) was derived for candles, Equation (1) is quite general, expressing a global constraint that governs the dynamics of many kinds of nonlinear diffusion, including the propagation of nerve impulses (Scott, 2002). For a smooth axon described by the Hodgkin–Huxley equations, power balance is established between electrostatic energy released from the fiber membrane and ohmic dissipation by circulating ionic currents, implying $v \propto \sqrt{d}$. (Plotted in Figure 1, this dependence would have a slope of $+\tfrac{1}{2}$.) For myelinated nerves, on the other hand, evolutionary pressures require that $v \propto d$, corresponding to a slope of unity in Figure 1.
In the language of nonlinear dynamics, a candle flame provides a physical example of an attractor, which is evidently stable because moderate disturbances (small gusts of air) do not extinguish the flame by forcing it out of its basin of attraction. As the air becomes still, the flame returns to its original shape and size. The task of lighting a candle, on the other hand, requires getting the wick hot enough (above an ignition threshold and into the basin of attraction) so that a viable flame is established. Qualitatively similar conditions govern the firing of a nerve axon, leading to an all-or-nothing response.
From a more intuitive perspective, an ignition threshold implies the power balance indicated in Equation (1), where the corresponding flame is unstable. Above the threshold, instability arises from the establishment of a positive feedback loop, in which the flame releases more chemical energy than is needed to maintain its temperature. Such a positive feedback loop is represented by the diagram
Release of energy (vE)
↓ ↑
Dissipation of energy (P),
with the gain about the loop being greater than unity, implying an increase in the flame size with time (Scott, 2003). Eventually, this temporal increase is limited by nonlinear effects in the release and dissipation of energy, reducing the loop gain to unity as the fully developed flame is established.
The candle flame is an example of a reaction-diffusion (or autocatalytic) process, going back to an early suggestion by Robert Luther, a German physical chemist (Luther, 1906). Following a lecture demonstration of a chemical wave, Luther claimed that such systems should support traveling waves at a speed proportional to $\sqrt{D/\tau}$, where D is the diffusion constant for the reacting components (in units of distance squared per unit of time) and τ is a delay time for the onset of the reaction. During the 1930s, autocatalytic systems were studied in the context of genetic diffusion through spatially dispersed biological species, and in 1938 the equation
$$D\,\frac{\partial^2 u}{\partial x^2} - \frac{\partial u}{\partial t} = \frac{1}{\tau}\, u(u - a)(u - 1), \tag{3}$$
where a is a threshold parameter lying in the range $(0, \tfrac{1}{2}]$, was used by Soviet scientists Yakov Zeldovich and David Frank-Kamenetsky to represent a flame front. These authors showed that uniform traveling-wave solutions of Equation (3) propagate at a fixed speed given by the expression
$$v = (1 - 2a)\sqrt{D/2\tau}, \tag{4}$$
which includes both Luther’s factor $\sqrt{D/\tau}$ and the power balance condition. Long overlooked by the neuroscience community, this early work on flame propagation offers a convenient model for the leading edge of a nerve impulse, confirming Faraday’s intuition (Scott, 2002).
ALWYN SCOTT
See also Attractors; Flame front; Hodgkin–Huxley equations; Power balance; Zeldovich–Frank-Kamenetsky equation
Further Reading
Day, P. & Catlow, C.R.A. 1994. The Candle Revisited: Essays on Science and Technology, Oxford and New York: Oxford University Press
Faraday, M. 1861. A Course of Six Lectures on the Chemical History of a Candle. Reprinted as Faraday’s Chemical History of a Candle, Chicago: Chicago Review Press, 1988
Fife, P.C. 1988. Dynamics of Internal Layers and Diffusive Interfaces, Philadelphia: Society for Industrial and Applied Mathematics
Luther, R. 1906. Räumliche Fortpflanzung chemischer Reaktionen. Zeitschrift für Elektrochemie 12(32): 596–600 (English translation in Journal of Chemical Education 64 (1987): 740–742)
Scott, A.C. 2002. Neuroscience: A Mathematical Primer, Berlin and New York: Springer
Scott, A.C. 2003. Nonlinear Science: Emergence and Dynamics of Coherent Structures. 2nd edition, Oxford and New York: Oxford University Press
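As a numerical footnote to the CANDLE entry, the power balance of Equation (1), the inverse-square scaling of Equation (2), and the front speed of Equation (4) can be evaluated with a few lines of Python. This is a minimal sketch: the function names are invented here, and the flame power and wax energy density are hypothetical round numbers chosen only to give speeds of a plausible order of magnitude, not measured values from the entry.

```python
import math

def energy_per_length(diameter_cm, energy_density_j_per_cm3):
    """Stored chemical energy per unit length of a cylindrical candle (E ~ d^2)."""
    return energy_density_j_per_cm3 * math.pi * diameter_cm**2 / 4.0

def flame_speed(power_j_per_h, energy_j_per_cm):
    """Power balance, Equation (1): P = vE, so v = P / E in cm/h."""
    return power_j_per_h / energy_j_per_cm

def zf_front_speed(a, diffusion, tau):
    """Traveling-wave speed of Equation (4): v = (1 - 2a) * sqrt(D / (2 tau))."""
    return (1.0 - 2.0 * a) * math.sqrt(diffusion / (2.0 * tau))

# Hypothetical values (assumptions, not data from the entry).
POWER = 2.9e5            # joules per hour dissipated by the flame
ENERGY_DENSITY = 3.8e4   # joules per cubic centimeter of wax

speeds = {d: flame_speed(POWER, energy_per_length(d, ENERGY_DENSITY))
          for d in (1.0, 2.0, 4.0)}
for d, v in speeds.items():
    print(f"d = {d:.0f} cm  ->  v = {v:.2f} cm/h")

# Front speed falls linearly to zero as the threshold a approaches 1/2.
print(zf_front_speed(0.25, 2.0, 1.0), zf_front_speed(0.5, 2.0, 1.0))
```

With the flame power held fixed, doubling the diameter quarters the speed, which is the slope of −2 on a log–log plot. In Equation (4) the factor (1 − 2a) makes the front stall when the threshold parameter reaches one half.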
CANONICAL VARIABLES See Hamiltonian systems
CANTOR SETS See Fractals
CAPACITY DIMENSION See Dimensions
CAPILLARY WAVES See Water waves
CARDIAC ARRHYTHMIAS AND THE ELECTROCARDIOGRAM
In the early 1900s, Willem Einthoven developed the string galvanometer to measure the potential differences on the body surface associated with the heartbeat and introduced a nomenclature for the deflections of the electrocardiogram that is still used today. For this work, Einthoven was awarded a Nobel Prize in 1924 (Katz & Hellerstein, 1964).
The electrocardiogram (ECG) is a measurement of the potential difference between two points on the surface of the body. Because the heart generates waves of electrical activation that propagate through the heart during the cardiac cycle, ECG measurements reflect cardiac activity. Over the past century, physicians have learned how to interpret the electrocardiogram to diagnose a variety of different cardiac abnormalities. Although interpreting an ECG is difficult, this entry introduces the basic principles.
In order to appreciate the ECG, it is first necessary to have a rudimentary knowledge about the spread of the cardiac impulse in the heart. The heart is composed of four chambers, the right and left atria, and the right and left ventricles (see Figure 1). The atria are electrically connected to each other, but are insulated from the ventricles everywhere except in a small region called the atrioventricular (AV) node. The ventricles are also electrically connected to each other. The rhythm of the heart is set by the sinoatrial node located in the right atrium, which acts as the pacemaker of the heart. From a mathematical perspective, this pacemaker is an example of a nonlinear oscillator. Thus, if the rhythm is perturbed, for example, by delivering a shock to the atria, then in general the timing of subsequent firings of the sinus node may be reset (i.e., they occur at different times than they would have if the shock had not been delivered), but the frequency and amplitude of the oscillation will remain the same.
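The resetting property just described can be illustrated with a simple limit-cycle caricature sometimes called the radial isochron clock. It is offered here only as a sketch and is not a model of the sinoatrial node taken from this entry: the state is a point in the plane whose phase advances uniformly while its radius relaxes toward 1, and a stimulus is represented, by assumption, as an instantaneous horizontal displacement of size b. The function names and parameter values are hypothetical.

```python
import math

def kick(r, phase, b):
    """Apply a brief stimulus that displaces the state horizontally by b."""
    x = r * math.cos(2 * math.pi * phase) + b
    y = r * math.sin(2 * math.pi * phase)
    new_phase = (math.atan2(y, x) / (2 * math.pi)) % 1.0
    return math.hypot(x, y), new_phase

def evolve(r, phase, t, k=5.0, period=1.0):
    """Advance the oscillator by time t.

    The radius obeys dr/dt = k r (1 - r), solved here in closed form,
    and the phase advances uniformly by t / period cycles.
    """
    growth = math.exp(k * t)
    r_t = r * growth / (1.0 - r + r * growth)
    return r_t, (phase + t / period) % 1.0

# Unperturbed oscillator on its limit cycle (r = 1), followed for 10 cycles.
r_free, phase_free = evolve(1.0, 0.25, 10.0)

# The same oscillator, kicked at phase 0.25 and then left alone.
r_kick, phase_kick = kick(1.0, 0.25, b=0.5)
r_late, phase_late = evolve(r_kick, phase_kick, 10.0)

# Lasting change in timing, measured in cycles.
phase_shift = (phase_late - phase_free) % 1.0
print(r_late, phase_shift)
```

Long after the stimulus the radius is back at 1 and the period is unchanged, but the phase differs from that of the unperturbed oscillator by a fixed amount, so every subsequent firing occurs at a shifted time. The size and sign of the shift depend on the phase at which the stimulus arrives.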
A wave of excitation initiated in the sinus node travels through the atria, then through the atrioventricular node, and then through specialized Purkinje fibers to the ventricles. The wave of electrical excitation is associated with a wave of mechanical contraction so that the cardiac cycle is associated with contraction and pumping of the blood through the body. The right and left atria are comparatively small chambers and act as collection points for blood. The right atrium collects blood from the body and the left atrium collects blood from the lungs. The right ventricle pumps blood to the lungs to be oxygenated, whereas the left ventricle pumps blood that has returned to the heart from the lungs to the rest of the body. The right atrium and right ventricle are separated by the tricuspid valve that prevents backflow of blood during the ventricular contraction. Similarly, the left atrium and left ventricle are separated by the mitral valve. In order to pump the blood, the ventricles are comparatively large and muscular.
Figure 1. A schematic diagram of the heart. Adapted from Goldberger & Goldberger (1994) with permission.
In the normal ECG, there are several main deflections labeled the P wave, the QRS complex, and the T wave, Figure 2a (Goldberger & Goldberger, 1994). The P wave is associated with the electrical activation of the atria, the QRS complex is associated with the electrical activation of the ventricles, and the T wave is associated with the repolarization of the ventricles. The duration of the PR interval reflects the conduction time from the atria to ventricles, which is typically 120–200 ms. The duration of the QRS complex reflects the time that it takes for the wave of excitation to activate the ventricles. Because of the specialized Purkinje fibers, the wave of activation spreads rapidly through the ventricles so that the normal duration of the QRS complex is less than 100 ms. The time interval from the beginning of the QRS complex to the end of the T wave, called the QT interval, reflects the time that the ventricles are in the contraction phase. The duration of the QT interval depends somewhat on the basic heart rate. It is shorter when the heart is beating faster. For heart beats in the normal range, the QT interval is typically of the order of 300–450 ms.
Figure 2. Sample electrocardiograms. In all traces, one large box represents 0.2 s. (a) The normal electrocardiogram. The P wave, QRS complex, and T wave are labeled. (b) 3:2 Wenckebach rhythm, an example of a second-degree heart block. There are 3 P waves for every 2 R waves in a repeating pattern. (c) Parasystole. The normal beats, labeled N, occur with a period of about 790 ms, and the abnormal ectopic beats, labeled E, occur with a regular period of 1300 ms. However, when ectopic beats fall too soon after the normal beats, they are blocked. Normal beats that occur after an ectopic beat are also blocked. If a normal and ectopic beat occur at the same time, the complex has a different geometry, labeled F for fusion. In this record, the number of normal beats occurring between ectopic beats is either 4, 2, or 1, satisfying the rules given in the text. Panels (a) and (b) are adapted from Goldberger & Goldberger (1994), with permission. Panel (c) is adapted from Courtemanche et al. (1989) with permission.
On examining an ECG, one first looks for P waves. The presence of the P wave indicates that the heart beat is being generated by the normal pacemaker. In the normal heart, each P wave is followed by a QRS complex and then a T wave. The heart rate is often measured by time intervals between two consecutive R waves. Abnormally fast heart rates, faster than about 90 beats per minute, are called tachycardia, and abnormally slow heart rates, slower than about 50 beats per minute, are called bradycardia.
Reduced to the basics, all cardiac arrhythmias (i.e., abnormal cardiac rhythms) are associated with abnormal initiation of a wave of cardiac excitation, abnormal propagation of a wave of cardiac excitation, or some combination of the two. Given such a simple underlying concept, it is not surprising that mathematicians have been attracted to the study of cardiac arrhythmias, or that many cardiologists are mathematically inclined. However, despite the apparent simplicity, cardiac arrhythmias can manifest themselves in many different ways, and it is still not always possible to figure out the mechanism of an arrhythmia in any given individual. The following is focused on some arrhythmias that are well understood and that have interesting mathematical analyses.
One class of cardiac arrhythmias is associated with conduction defects through the AV node. In first-degree heart block, the PR interval is elevated above its normal value, but each P wave is followed by a QRS complex and T wave. However, in second-degree heart block, there are more P waves than QRS complexes, as some of the atrial activations do not propagate to the ventricles.
Rhythms of this type, sometimes called Wenckebach rhythms (after Karel Frederik Wenckebach, a Dutch-born Austrian physician, who studied these rhythms at the beginning of the 20th century), have repeatedly attracted theoretical interest (Katz & Hellerstein, 1964). It is common to classify Wenckebach rhythms by a ratio giving the number of P waves to the number of QRS complexes. For example, Figure 2b shows a 3:2 heart block. In the 1920s, Balthasar van der Pol and J. van der Mark developed a mathematical model of the heart as coupled nonlinear oscillators that displayed striking similarities to the Wenckebach rhythms (van der Pol & van der Mark, 1928). We now understand that in a number of different models, as the frequency of atrial activation is increased, different types of N : M heart block can be observed. In fact, theoretical models have demonstrated that if there is N : M heart block at one stimulation frequency and an N′ : M′ heart block at a higher frequency, then the (N + N′) : (M + M′) heart block is expected at some intermediate stimulation frequency. This result provides a mathematical classification complementary to the cardiological classification, and can be confirmed in clinical settings (Guevara, 1991). Finally, in third-degree heart block, there is a regular atrial rhythm and a regular ventricular rhythm (at a slower frequency), but there is no coupling between the two rhythms. In mathematics, such rhythms are called quasi-periodic.
A different type of rhythm that appeals to mathematicians is called parasystole. In the “pure” case, the normal sinus rhythm beats at a constant frequency, and an abnormal (ectopic) pacemaker in the ventricles beats at a second slower frequency (Glass et al., 1986; Courtemanche et al., 1989). Figure 2c shows the normal (N) beats and the ectopic (E) beats. If the ectopic pacemaker fires at a time outside the refractory period of the ventricles, then there is an abnormal ectopic beat, identifiable on the ECG by a morphology distinct from the normal beat, and the following normal sinus beat is blocked. If the normal and abnormal beats occur at the same time, this leads to a fusion (F) beat. Surprisingly, this simple mechanism has amazing consequences that can be appreciated by forming a sequence of integers that counts the number of sinus beats between two ectopic beats. In general, for fixed sinus and ectopic frequencies and a fixed refractory period, in this sequence, there are at most three integers, where the sum of the two smaller integers is one less than the largest integer. Moreover, given the values of the parameters, it is possible to predict the three integers. The mathematics for this problem is related to the “gaps and steps” problem in number theory. Both AV heart block and parasystole lead to mathematical predictions of cardiac arrhythmias in humans, which have been tested in experimental models and in humans.
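The rule for intermediate heart blocks stated above amounts to adding the two ratios term by term. A minimal Python sketch follows; the function names are invented for illustration, and the 2:1 block used as the faster-rate example is an assumption chosen to pair with the 3:2 rhythm of Figure 2b.

```python
from fractions import Fraction

def intermediate_block(block_a, block_b):
    """Return the (N + N') : (M + M') block expected between two N:M blocks."""
    (n_a, m_a), (n_b, m_b) = block_a, block_b
    return (n_a + n_b, m_a + m_b)

def conducted_fraction(block):
    """Fraction of atrial activations (P waves) that reach the ventricles."""
    n, m = block
    return Fraction(m, n)

low_rate_block = (3, 2)    # 3:2 Wenckebach rhythm, as in Figure 2b
high_rate_block = (2, 1)   # 2:1 block at a faster atrial rate (assumed example)
between = intermediate_block(low_rate_block, high_rate_block)

print(between, conducted_fraction(between))
```

For these inputs the intermediate rhythm is a 5:3 block, and its conducted fraction of 3/5 lies between the 2/3 of the 3:2 block and the 1/2 of the 2:1 block. The same term-by-term sum applied to 1:1 conduction and 2:1 block gives the 3:2 rhythm itself.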
Arrhythmias such as AV heart block and parasystole are diagnosed, and treated when necessary, by physicians who have no knowledge of the underlying mathematics. Thus, to date, the mathematical analysis of these arrhythmias is of little medical interest.
From a medical perspective, the most important class of arrhythmias is called re-entrant arrhythmias. In these arrhythmias, the period of the oscillation is set by the time an excitation takes to travel in a circuitous path, rather than the period of oscillation of a pacemaker. The re-entrant circuit can be found in a single chamber of the heart or can involve several anatomical features of the heart (Josephson, 2002). In typical atrial flutter, there is a wave circulating around the tricuspid valve in the right atrium; in Wolff–Parkinson–White syndrome, there can be excitation traveling in the normal circuit from atria to the AV node to the ventricles, but then traveling retrogradely back to the atria via an abnormal accessory pathway between the ventricles and the atria. Also in some patients who have had a heart attack, there is a re-entrant circuit contained entirely in the ventricles. In all three of these re-entrant arrhythmias, a part of the circuit is believed to be a comparatively thin strand of tissue. Considering these re-entrant arrhythmias from a mathematical perspective, the wave often appears to be circulating on a one-dimensional ring. This conceptualization developed by cardiologists has an important implication for therapy: “if you cut the ring, you can cure the rhythm.” By inserting catheters directly into a patient’s heart and delivering radio frequency radiation to precisely identified loci, the cardiologist destroys heart tissue and can often cure these serious arrhythmias. In these cases, the cardiologist is thinking like a topologist since changing the topology of the heart cures the arrhythmia. Moreover, there is a body of mathematics that has studied the properties of excitation traveling on one-dimensional rings (Glass et al., 2002).
Other re-entrant arrhythmias are not as well understood and are not as easily treated. Many theoretical and experimental studies (see Cardiac muscle models) have documented spiral waves circulating stably in two dimensions and scroll waves circulating in three dimensions (Winfree, 2001). Since real hearts are three-dimensional, and there is still no good technology to image excitation in the depth (as opposed to the surface) of the cardiac tissue, the actual geometry of excitation waves in cardiac tissue associated with some arrhythmias is not as well understood and is now the subject of intense study. From an operational point of view, it is suggested that any arrhythmia that cannot be cured by a small localized lesion in the heart is a candidate for a circulating wave in two or three dimensions. Such rhythms include atrial and ventricular fibrillation. In these rhythms, there is evidence of strong fractionation (breakup) of excitation waves giving rise to multiple small spiral waves and patterns of shifting blocks.
Tachycardias can also arise in the ventricles in patients other than those who have experienced a heart attack, or perhaps occasionally in hearts with completely normal anatomy, and in these individuals it is likely that spiral and scroll waves are the underlying geometries of the excitation. A particularly dangerous arrhythmia, polymorphic ventricular tachycardia (in which there is a continually changing morphology of the ECG complexes), is probably associated with spiral and scroll waves that undergo a meander. New technologies are presenting unique opportunities to image cardiac arrhythmias in model systems and the clinic, and nonlinear dynamics is suggesting new strategies for controlling cardiac arrhythmias. For a summary of advances up until 2002 see the collection of papers in Christini & Glass (2002). Despite the great advances in research and medicine over the past 100 years, there is still a huge gap between what is understood and what actually happens in the human heart. The only way to appreciate this gap is to
toss out the models and start looking at real data from patients who are experiencing complex arrhythmia as measured on the ECG. All who plan to model cardiac arrhythmias are encouraged to take this step.
LEON GLASS
See also Cardiac muscle models; Scroll waves; Spiral waves; Van der Pol equation
Further Reading
Christini, D. & Glass, L. (editors). 2002. Focus issue: mapping and control of complex cardiac arrhythmias. Chaos, 12(3)
Courtemanche, M., Glass, L., Bélair, J., Scagliotti, D. & Gordon, D. 1989. A circle map in a human heart. Physica D, 40: 299–310
Glass, L., Goldberger, A. & Bélair, J. 1986. Dynamics of pure parasystole. American Journal of Physiology, 251: H841–H847
Glass, L., Nagai, Y., Hall, K., et al. 2002. Predicting the entrainment of reentrant cardiac waves using phase resetting curves. Physical Review E, 65: 021908
Goldberger, A.L. & Goldberger, E. 1994. Clinical Electrocardiography: A Simplified Approach, 5th edition, St Louis: Mosby
Guevara, M.R. 1991. Iteration of the human atrioventricular (AV) nodal recovery curve predicts many rhythms of AV block. In Theory of Heart: Biomechanics, Biophysics, and Nonlinear Dynamics of Cardiac Function, edited by L. Glass, P. Hunter & A. McCulloch, New York: Springer, pp. 313–358
Josephson, M.E. 2002. Clinical Cardiac Electrophysiology: Techniques and Interpretations, Philadelphia and London: Lippincott Williams & Wilkins
Katz, L.N. & Hellerstein, H.K. 1964. Electrocardiography. In Circulation of the Blood: Men and Ideas, edited by A.P. Fishman & D.W. Richards, Oxford and New York: Oxford University Press, pp. 265–354
van der Pol, B. & van der Mark, J. 1928. The heartbeat considered as a relaxation oscillation, and an electrical model of the heart. Philosophical Magazine, 6: 763–765
Winfree, A.T. 2001. The Geometry of Biological Time, 2nd edition, Berlin and New York: Springer
CARDIAC MUSCLE MODELS
Cardiac muscle was created by evolution to pump the blood, and contractions of cardiac muscle, as of any muscle, are governed by calcium (Ca) ions. Increased concentration of calcium ions inside a cardiac cell ($[\mathrm{Ca}]_i$) induces a contraction, and diminished concentration induces relaxation (diastole). Calcium ion concentration in a cardiac cell is governed by many mechanisms. Importantly, a signal to increase $[\mathrm{Ca}]_i$ is given by an abrupt increase in membrane potential $E$, which is called an action potential (AP). The membrane potential $E$ in cardiac cells is described by reaction-diffusion equations. Cardiac models have the mathematical structure of the well-known Hodgkin–Huxley (HH) equations:
$$\partial E/\partial t = -I(E, g_1, \ldots, g_N)/C + \nabla\cdot(D\nabla E), \tag{1}$$
$$\partial g_i/\partial t = (\tilde{g}_i(E) - g_i)/\tau_i(E), \quad i = 1, \ldots, N, \tag{2}$$
where $g_i$ are the gating variables (describing opening and closing of the gates of ionic channels), $\tilde{g}_i(E)$ are the steady values of those variables, $\tau_i(E)$ are the associated time constants, $I(\cdots)$ is the transmembrane ionic current, and the diffusive term $\nabla\cdot(D\nabla E)$ describes the current flowing from the neighboring cells. In this description of the coupling, the anisotropy of the heart is accounted for by taking $D$ to be an anisotropic diffusivity tensor.

Equations (1) and (2) are similar to the standard HH equations (with $N = 3$), and this formulation was used for the first cardiac model, introduced by Denis Noble in 1962. As more and more ionic channels were discovered, they were incorporated into more detailed models: $N = 6$ for a modified Beeler–Reuter (BR) model, and $N > 10$ for the Luo–Rudy model and for recent Noble models. Although the original HH formulation was based on analytic functions for the $\tilde{g}_i(E)$ and $\tau_i(E)$, experimentally measured functions can be used in the equations, as shown in Figure 1. This results in both physical transparency and faster numerical calculations.

The functions $\tilde{g}_i(E)$ and $\tau_i(E)$ have a simple physical interpretation: the dynamics of each gating variable $g_i$ are governed by the membrane potential $E$ only. For fixed $E$, Equations (2) are linear and independent; each of them describes relaxation to the steady value $\tilde{g}_i(E)$ with the time constant $\tau_i(E)$. The characteristic times $\tau_i(E)$ span four orders of magnitude (see Figure 1(b)), which permits a qualitative understanding based on time-scale separation.
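The relaxation described by Equation (2) at fixed membrane potential can be sketched numerically. The following minimal Python example compares the exact exponential solution with a forward-Euler integration for a fast and a slow gate. The steady value 0.8 and the time constants 1 ms and 100 ms are illustrative assumptions, not values taken from any of the cardiac models discussed here.

```python
import math


def gate_exact(g0, g_steady, tau, t):
    """Exact solution of dg/dt = (g_steady - g)/tau at fixed E."""
    return g_steady + (g0 - g_steady) * math.exp(-t / tau)


def gate_euler(g0, g_steady, tau, t_end, dt=0.001):
    """Forward-Euler integration of the same linear equation."""
    g = g0
    for _ in range(int(round(t_end / dt))):
        g += dt * (g_steady - g) / tau
    return g


if __name__ == "__main__":
    # Hypothetical gates sharing the same steady value but with
    # time constants two orders of magnitude apart (in ms).
    for tau in (1.0, 100.0):
        print(tau, gate_exact(0.0, 0.8, tau, 5.0), gate_euler(0.0, 0.8, tau, 5.0))
```

After 5 ms the fast gate has essentially reached its steady value while the slow gate has barely moved, which is the separation of time scales that the reduction of the models relies on.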
Figure 1. BR model with eight variables. (a) Gating variables $\tilde{g}_i(E)$. (b) Time constants $\tau_i(E)$ (log scale).

Noble’s 1962 Model
This original cardiac model (N4 model), which is the key for understanding all subsequent models, consists of the following four equations (Noble, 1962):
$$\partial E/\partial t = -(I_{Na}(E, m, h) + I_K(E, n))/C + D\nabla^2 E, \tag{3}$$
$$\partial n/\partial t = (\tilde{n}(E) - n)/\tau_n, \tag{4}$$
$$\partial m/\partial t = (\tilde{m}(E) - m)/\tau_m, \tag{5}$$
$$\partial h/\partial t = (\tilde{h}(E) - h)/\tau_h, \tag{6}$$
where
$$I_{Na}(E, m, h) = (g_{Na} m^3 h + \varepsilon_1)(E - E_{Na}), \tag{7}$$
$$I_K(E, n) = (\alpha g_K n^4 + \varepsilon_2)(E - E_K). \tag{8}$$
A cardiac action potential has a duration of about 200 ms, which is about two orders of magnitude longer than that of a typical nerve impulse. To describe the cardiac action potential, just one time constant in the HH equations was increased by a factor of 100, two small terms ($\varepsilon_1$ and $\varepsilon_2$) were added, and other time constants were adjusted to incorporate cardiac experimental results.

To see how the N4 model works, note that it has three gating variables: $m$, $n$, and $h$. The characteristic time $\tau_n$ is about 100 ms, while $\tau_m$ and $\tau_h$ are almost two orders of magnitude faster (shorter). This permits adiabatic elimination of the fast equations (5) and (6), thereby replacing Equations (3)–(6) with a system of only two equations (system N2) (Krinsky & Kokoz, 1973):
$$\partial E/\partial t = -(I_{Na}(E, \tilde{m}, \tilde{h}) + I_K(E, n))/C + D\nabla^2 E, \tag{9}$$
$$\partial n/\partial t = (\tilde{n}(E) - n)/\tau_n. \tag{10}$$
Note that the current $I_{Na}$ becomes
$$I_{Na} = (g_{Na}\,\tilde{m}^3(E)\,\tilde{h}(E) + \varepsilon_1)(E - E_{Na}), \tag{11}$$
which contains only the known functions $\tilde{m} \equiv \tilde{m}(E)$ and $\tilde{h} \equiv \tilde{h}(E)$. The behavior of this model is illustrated in Figure 2. The nullclines $dE/dt = 0$ and $dn/dt = 0$ for a Purkinje fiber are shown in Figure 2(a). A Purkinje fiber has pacemaker activity, which the nullclines (in the phase plane) show directly. Note that there is only one fixed point S, which is unstable, thus giving rise to limit cycle oscillations. The nullclines for a myocyte are shown in Figure 2(b). There is no pacemaker activity in myocytes; thus, the nullclines show the absence of a limit cycle. Although two of the three fixed points are unstable, they do not induce a limit cycle because the third fixed point is stable and determines the resting potential.
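The idea behind adiabatic elimination can be illustrated on a toy fast-slow system. The sketch below is not the Noble model: the sigmoidal steady-state function, the time constants (a ratio of 100, mimicking the two orders of magnitude mentioned above), and the initial condition are all assumptions chosen for illustration. It integrates a full two-variable system and the reduced one-variable system, in which the fast variable is replaced by its steady value, and reports the largest difference between the two slow trajectories.

```python
import math


def steady(x):
    """Hypothetical steady-state function of the fast variable."""
    return 1.0 / (1.0 + math.exp(-x))


def compare(tau_slow=100.0, tau_fast=1.0, t_end=500.0, dt=0.01):
    """Integrate the full and the reduced system with forward Euler.

    Full system:    dx/dt = (y - x)/tau_slow,  dy/dt = (steady(x) - y)/tau_fast
    Reduced system: dx/dt = (steady(x) - x)/tau_slow   (y replaced by steady(x))
    """
    x_full = x_red = -2.0
    y = steady(x_full)
    max_diff = 0.0
    for _ in range(int(round(t_end / dt))):
        dx_full = (y - x_full) / tau_slow
        dy = (steady(x_full) - y) / tau_fast
        dx_red = (steady(x_red) - x_red) / tau_slow
        x_full += dt * dx_full
        y += dt * dy
        x_red += dt * dx_red
        max_diff = max(max_diff, abs(x_full - x_red))
    return x_full, x_red, max_diff


if __name__ == "__main__":
    print(compare())
```

With this choice of time constants the slow variable of the reduced system stays close to that of the full system throughout the run, in the same spirit as the agreement between the N4 and N2 models reported by Krinsky & Kokoz (1973).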
Figure 2. Analysis of the N4 model based on adiabatic elimination of two fast variables. (a) A Purkinje fiber with pacemaker activity. The nullclines of the reduced system N2 contain only one fixed point S, which is unstable, thus giving rise to the limit cycle oscillations. (b) An excitable myocyte. There are three fixed points: two are unstable, and the third determines the resting potential. (c) Analysis of the effect of the arrhythmogenic drug aconitine, showing oscillations on the plateau of the action potential (AP). (d) As in (c), where the nullclines show a small-amplitude limit cycle on the plateau of the AP. (Parameters: (a) $\alpha = 1.2$; (b) $\alpha = 1.3$; (c) and (d) $\alpha = 1.3$; $h = \tilde{h}(E - \delta E)$; $\delta E = 2.8$ mV.)

Figure 3. (a) Action potential showing the principal currents that flow in each phase, with the corresponding subunit clones shown in parentheses. (Courtesy of S. Nattel.) (b) Main ionic currents. (Courtesy of Y. Rudy.)
Analysis of the effect of the arrhythmogenic drug aconitine (inducing oscillations on the plateau of the action potential) is shown in Figures 2(c) and (d). Aconitine induces dangerous oscillations because of a shift of the voltage dependence of the Na inactivation variable h. Oscillations on the plateau of the action potential are shown in Figure 2(c), and the corresponding nullclines are shown in Figure 2(d). The shift of the h(E) dependence results in the disappearance of the stable resting point and the appearance of a small-amplitude limit cycle on the plateau of the AP. The electrophysiological characteristics in the full (N4) and reduced (N2) models are the same with an accuracy of 0.1–0.2 mV (Krinsky & Kokoz, 1973).
Contemporary Models
Recent models include more ionic currents (see Figures 1 and 3), and also incorporate a change of intracellular ionic concentrations. For example, the BR model contains an additional equation for the concentration $[\mathrm{Ca}]_i$ of intracellular Ca:
$$\partial \mathrm{Ca}_i/\partial t = I_{Ca}. \tag{12}$$
The Luo–Rudy (LR) model and the Noble model are widely known. The LR model (Rudy, 2001) can be downloaded from http://www.cwru.edu/med/CBRTC/LRdOnline. Noble models are described at http://www.cellml.org/ and http://cor.physiol.ox.ac.uk/. The model of the human ventricular myocyte (Ten Tusscher et al., 2003) is also available at http://www-binf.bio.uu.nl/ khwjtuss/HVM.

In contemporary models the gap between the 100 and 1 ms time scales has been filled by many ionic currents, and (contrary to what was seen in the first cardiac model) the graphs even intersect each other (see Figure 1(b)). This makes the separation into fast and slow variables dependent upon the phase of the AP, so results such as those in Figure 2 cannot be obtained directly. Instead, events on each time scale must be analyzed separately, adiabatically eliminating equations with a faster time scale and treating variables with a slower time scale as fixed; alternatively, one can postulate a model with only two to four equations (Keener & Sneyd, 1998; Fenton & Karma, 1998).

Models that do not follow the HH formalism are needed because HH-type models predict that a point stimulation will create a circular or an elliptical distribution of membrane potential, while in experiments a quadrupolar distribution was found. Thus, bidomain models were created that describe separately the potentials inside ($E_i$) and outside ($E_o$) of a cell, instead of taking $E_o = 0$ as in the HH formulation. These models correctly reproduce many important electrophysiological effects, and turn out to be useful for understanding the mechanisms of cardiac defibrillation (Trayanova et al., 2002). The integration of these models, however, is computationally more expensive than the integration of HH models.

Markov chain models are used to describe transitions between states of single ionic channels, linking genetic defects with cardiac arrhythmias. They also depart from the HH formalism.

Anatomical models incorporate cardiac geometry and tissue structure, but they require months of laborious measurements on anatomical slices cut from the heart. A new approach is being developed at INRIA in France, which aims to create models for every cardiac patient and is intended to be used in clinics. Cardiac contractions are measured and incorporated into the model. NMR tomography methods are used that provide anatomical data in a few minutes rather than months (see http://www-sop.inria.fr/epidaure/). More cardiac models and authors can be found at http://www.cardiacsimulation.org/.
Dynamics of Myocardial Tissue
In the past, breakthroughs were due to very simple models, beginning with the pioneering work of Norbert Wiener and Arturo Rosenblueth (Wiener & Rosenblueth, 1946). They led the way to understanding rotating waves using a cellular automaton model, in which a cardiac cell can be in only three states: resting, excited, and refractory. This model explained anatomical reentry (a wave rotating around an obstacle) and predicted a freely rotating wave. Partial differential equations yield more refined results (Keener & Sneyd, 1998).

As two- and three-dimensional studies of rotating waves are time consuming, it is often convenient to use a two-variable model, which speeds up calculations by two orders of magnitude. Numerical simulation of ionic models can be accelerated either by adiabatic elimination of the fast Na activation variable m or by slowing down its dynamics and increasing the diffusion coefficient to keep the propagation velocity unchanged.
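The three-state automaton lends itself to a very short simulation. The sketch below is a minimal one-dimensional version written for illustration only; the update rule (excited cells become refractory, refractory cells recover after one step, resting cells fire when a neighbor is excited), the ring size, and the initial condition are assumptions and are not taken from Wiener and Rosenblueth's paper. It shows a wave circulating indefinitely on a closed ring and dying out once the ring is cut.

```python
REST, EXCITED, REFRACTORY = 0, 1, 2


def step(cells, ring=True):
    """Advance the three-state automaton by one synchronous time step."""
    n = len(cells)
    new_cells = []
    for i, state in enumerate(cells):
        if state == EXCITED:
            new_cells.append(REFRACTORY)
        elif state == REFRACTORY:
            new_cells.append(REST)
        else:
            # A resting cell fires if either neighbor is excited.
            # On a cut ring the two end cells are not neighbors.
            left = cells[i - 1] if (ring or i > 0) else REST
            right = cells[(i + 1) % n] if (ring or i < n - 1) else REST
            new_cells.append(EXCITED if EXCITED in (left, right) else REST)
    return new_cells


def excited_after(n_cells, n_steps, ring=True):
    """Number of excited cells after n_steps, starting from a single wave."""
    cells = [REST] * n_cells
    cells[0] = EXCITED
    cells[-1] = REFRACTORY  # refractory tail forces one-way propagation
    for _ in range(n_steps):
        cells = step(cells, ring)
    return cells.count(EXCITED)


if __name__ == "__main__":
    print("closed ring:", excited_after(20, 100, ring=True))
    print("cut ring:   ", excited_after(20, 100, ring=False))
```

On the closed ring exactly one excited cell survives at every step, whereas on the cut ring the wave reaches the free end and disappears, a toy version of the statement that cutting the ring terminates a re-entrant rhythm.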
Propagation Failure
As cardiac cells are connected via gap junctions, an excitation can be blocked when propagating from one cardiac cell to another, similar to the propagation failure in myelinated nerves. When an excitation propagates from the atria (A) to the ventricles (V) via the AV node, a periodic pattern can be observed: for example, of every three pulses, only two propagate, and the third pulse is blocked (3:2 Wenckebach periodicity). Other periodicities $N : (N - 1)$ can also be observed. Propagation block can be observed on any cardiac heterogeneity when the period $T$ of stimulation is shorter than the restoration time $R$ (refractory period). Usually, Wenckebach rhythms with only small periods $N$ are observed because
$$T_N \sim N^{-2}, \tag{13}$$
where $T_N$ is the interval of $T$ that can yield Wenckebach rhythms with period $N$. When an excitation block (Wenckebach rhythm) occurs in a two- or three-dimensional excitable medium, it generically gives rise to a wave break that evolves into a rotating wave. For cardiac muscle, initiation of a rotating wave is a dangerous event, often leading to life-threatening cardiac arrhythmias.

A new approach is being developed at INRIA in France to create patient-specific models to be used in clinics. A three-dimensional electromechanical model of the heart is automatically adapted to a time series of volumetric cardiac images gated on the ECG (Ayache et al., 2001), providing useful quantitative parameters on heart function in a few minutes, not months (see http://www-sop.inria.fr/epidaure/).
V. KRINSKY, A. PUMIR, AND I. EFIMOV
See also Hodgkin–Huxley equations; Myelinated nerves; Neurons; Scroll waves; Synchronization; Van der Pol equation
Further Reading
Ayache, N., Chapelle, D., Clément, F., Coudiére, Y., Delingette, H., Désidéri, J.A., Sermesant, M., Sorine, M. & Urquiza, J. 2001. Towards model-based estimation of the cardiac electromechanical activity from ECG signals and ultrasound images. In Functional Imaging and Modeling of the Heart (FIMH’01), Helsinki, Finland, Lecture Notes in Computer Sciences, vol. 2230, Berlin: Springer, pp. 120–127
Chaos, topical issue: Ventricular fibrillation, 8(1), 1998
Fenton, F. & Karma, A. 1998. Vortex dynamics in three-dimensional continuous myocardium with fiber rotation: Filament instability and fibrillation. Chaos, 8: 20–47
Journal of Theoretical Biology, topical issue devoted to the work of Winfree, 2004
Keener, J. & Sneyd, J. 1998. Mathematical Physiology, New York: Springer
Krinsky, V. & Kokoz, Ju. 1973. Membrane of the Purkinje fiber. Reduction of the Noble equations to a second order system. Biophysics, 18: 1133–1139
Noble, D. 1962. A modification of the Hodgkin–Huxley equations applicable to Purkinje fiber action and pacemaker potential. Journal of Physiology, 160: 317–352
Noble, D. & Rudy, Y. 2001. Models of cardiac ventricular action potentials: iterative interaction between experiment and simulation. Philosophical Transactions of the Royal Society A, 359: 1127–1142
Pumir, A., Romey, G. & Krinsky, V. 1998. De-excitation of cardiac cells. Biophysical Journal, 74: 2850–2861
Rudy, Y. 2001. The cardiac ventricular action potential. In Handbook of Physiology: A Critical, Comprehensive Presentation of Physiological Knowledge and Concepts. Section 2, The cardiovascular system, vol. 1, The heart, edited by E. Page, H.A. Fozzard & R.J. Solaro, Oxford: Oxford University Press, pp. 531–547
Sambelashvili, A. & Efimov, I.R. 2002. Pinwheel experiment re-revisited. Journal of Theoretical Biology, 214: 147–153
Ten Tusscher, K.H.W.J., Noble, D., Noble, P.J. & Panfilov, A.V. 2003. A model of the human ventricular myocyte. American Journal of Physiology, 10: 1152
Trayanova, N., Eason, J. & Aguel, F. 2002. Computer simulations of cardiac defibrillation: a look inside the heart. Computing and Visualization in Science, 4: 259–270
Wiener, N. & Rosenblueth, A. 1946. The mathematical formulation of the problem of conduction of impulses in a network of connected excitable elements, specifically in cardiac muscle. Archivos del Instituto de cardiologia de Mexico, 16: 205–265
CASIMIRS
See Poisson brackets

CAT MAP
The cat map is perhaps the simplest area-preserving transformation that exhibits a high degree of chaos. In the development of the theory of dynamical systems, it served as a guiding example to illustrate new concepts such as entropy (Sinai, 1959) and Markov partitions (Adler & Weiss, 1967). The cat map owes its name to an illustration by V.I. Arnol’d showing the image of a cat before and after the application of the map. In the mathematical literature it is also referred to as a “hyperbolic toral automorphism.”

The torus $M$, which topologically has the shape of a doughnut, may be described by the points in the unit square (see Figure 1) where opposite sides are identified. Alternatively, one may think of a point on $M$ as a point in the plane $\mathbb{R}^2$ modulo integer translations in $\mathbb{Z}^2$. This yields a representation of $M$ as the coset space $\mathbb{R}^2/\mathbb{Z}^2 := \{x + \mathbb{Z}^2 : x \in \mathbb{R}^2\}$. The cat map is now a mapping $\phi : M \to M$ defined by $x \mapsto Ax \pmod{\mathbb{Z}^2}$, where
$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \tag{1}$$
provided $a, b, c, d \in \mathbb{Z}$ are chosen such that $|\det A| = 1$ and $A$ has eigenvalues $\lambda_\pm$ with modulus not equal to one. (This implies that both eigenvalues are real and distinct.) The matrix $A$ used in our illustration is
$$A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}. \tag{2}$$
Let us now explore some of the dynamical properties that show that the cat map is indeed completely “chaotic” (see Chaotic dynamics).

Sensitive dependence on initial conditions is measured by the rate of divergence of two nearby points, $x$ and $x + \delta x$, under iterations of $\phi$. For any smooth map $\phi$, the Taylor expansion for small $\delta x$ yields $\phi(x + \delta x) = \phi(x) + D\phi_x\,\delta x + O(\delta x^2)$, where $D\phi_x$ is the differential of $\phi$ at $x$; it may be viewed as a linear map from $TM_x$ to $TM_{\phi(x)}$, the tangent spaces at $x$ and $\phi(x)$, respectively. Because $\phi$ is linear in the case of the cat map, the above Taylor expansion is in fact exact, that is, $\phi(x + \delta x) = \phi(x) + D\phi_x\,\delta x$ with $D\phi_x = A$. If $v_\pm$ are the eigenvectors of $A$ corresponding to the eigenvalues $\lambda_\pm$, let us denote by $E_x^+$ and $E_x^-$ the subspaces of $TM_x$ spanned by $v_+$ and $v_-$, respectively. As $\lambda_+ \neq \lambda_-$, we have
$$TM_x = E_x^+ \oplus E_x^-. \tag{3}$$
Furthermore, since $|\lambda_+| = 1/|\lambda_-| > 1$, we find
$$\|D\phi_x(\xi)\| \ge e^{\lambda}\|\xi\| \quad \text{if } \xi \in E_x^+, \tag{4}$$
$$\|D\phi_x(\xi)\| \le e^{-\lambda}\|\xi\| \quad \text{if } \xi \in E_x^-, \tag{5}$$
where $\lambda = \ln|\lambda_+| > 0$ and $\|\cdot\|$ is the Euclidean norm. Hence, $\phi$ is expanding in the direction of $v_+$ and contracting in the direction of $v_-$, which will therefore be referred to as the unstable and stable directions, respectively. [Here, inequalities (4) and (5) are in fact equalities; inequalities become necessary in the case of more general Anosov maps, if one seeks uniform bounds with $\lambda$ independent of $x$.] For the $n$th forward or backward iterates of the map ($n$ a positive integer), we have, by the above arguments with $\phi$ replaced by $\phi^{\pm n}$,
$$\|D\phi_x^{n}(\xi)\| \ge e^{n\lambda}\|\xi\|, \quad \|D\phi_x^{-n}(\xi)\| \le e^{-n\lambda}\|\xi\|, \quad \text{if } \xi \in E_x^+, \tag{6}$$
$$\|D\phi_x^{n}(\xi)\| \le e^{-n\lambda}\|\xi\|, \quad \|D\phi_x^{-n}(\xi)\| \ge e^{n\lambda}\|\xi\|, \quad \text{if } \xi \in E_x^-. \tag{7}$$
Figure 1. The cat map: the image of a cat in the unit square (left) is stretched by the matrix A (middle) and then re-assembled by cutting and translating (without rotation) the different parts of the cat’s face back into the unit square (right). (Illustration by Federica Vasetti.)
The expansion/contraction is thus exponentially fast in time. Relations (3), (6), and (7) are equivalent to the statement that the cat map is an Anosov system. Special features of Anosov systems are ergodicity, mixing, structural stability, exponential proliferation of periodic orbits, and positive entropy $h$. There is a particularly simple formula for the entropy $h$ due to Sinai (1959), which states that $h = \lambda = \ln|\lambda_+|$. The number of fixed points of the $n$th iterate $\phi^n$ is equal to $|\det(1 - A^n)| = |(1 - \lambda_+^n)(1 - \lambda_-^n)|$ and is therefore asymptotically given by $\exp(nh)$, for large $n$. The notion of ergodicity implies that for any $f \in L^1(M, dx)$, for almost every $x \in M$ (that is, for all $x$ up to a set of Lebesgue measure zero), we have
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} f(\phi^n x) = \langle f \rangle, \qquad \langle f \rangle := \int_M f(x)\,dx. \tag{8}$$
“Mixing” means that, if $f, g \in L^2(M, dx)$, then
$$\lim_{n \to \pm\infty} \int_M f(\phi^n x)\,g(x)\,dx = \langle f \rangle \langle g \rangle. \tag{9}$$
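The fixed-point count and the entropy formula quoted above can be checked with a few lines of exact integer arithmetic. The sketch below uses the matrix $A$ of Equation (2); the function names are hypothetical. The brute-force count relies on the observation that every solution of $(A^n - 1)x \in \mathbb{Z}^2$ has coordinates that are integer multiples of $1/D$ with $D = |\det(1 - A^n)|$, so it suffices to search that rational grid.

```python
import math

A = ((2, 1), (1, 1))  # the matrix of Equation (2)


def mat_mul(X, Y):
    """Product of two 2x2 integer matrices."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mat_pow(X, n):
    """n-th power of a 2x2 integer matrix (n >= 0)."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = mat_mul(result, X)
    return result


def fixed_points_formula(n):
    """Number of fixed points of the n-th iterate, |det(1 - A^n)|."""
    (a, b), (c, d) = mat_pow(A, n)
    return abs((1 - a) * (1 - d) - b * c)


def fixed_points_brute(n):
    """Count fixed points of the n-th iterate on the grid (p/D, q/D)."""
    D = fixed_points_formula(n)
    count = 0
    for p in range(D):
        for q in range(D):
            x, y = p, q
            for _ in range(n):
                # one application of the cat map to the numerators, mod D
                x, y = (2 * x + y) % D, (x + y) % D
            if (x, y) == (p, q):
                count += 1
    return count


lambda_plus = (3 + math.sqrt(5)) / 2  # larger root of t^2 - 3t + 1 = 0
h = math.log(lambda_plus)             # entropy h = ln|lambda_+|

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, fixed_points_formula(n), fixed_points_brute(n))
    print("h =", h, " ln(N_10)/10 =", math.log(fixed_points_formula(10)) / 10)
```

The two counts agree for small $n$, and the growth rate $\ln(N_n)/n$ of the number of fixed points approaches $h$ as $n$ increases.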
Although the mixing property follows from general arguments for Anosov systems, it can be proved for φ directly by means of Fourier analysis. What is more, the rate of convergence in (9) is in fact exponentially fast in n for suitably smooth test functions f, g (“exponential decay of correlations”) and super-exponentially fast for analytic ones. Markov partitions are a powerful tool in the analysis of dynamical systems. In the case of the cat map the torus is divided into a finite collection of nonoverlapping parallelograms P1 , . . . , PN whose sides point in the directions of the eigenvectors v+ and v− , such that if φ(Pi ) (or φ −1 (Pi )) intersects with Pj , then it extends all the way across Pj . Let us construct an N by N matrix B whose coefficients are Bij = 1 if the intersection of φ(Pi ) with Pj is non-empty, and Bij = 0 otherwise. A symbolic description of the
dynamics of the cat map can now be obtained as follows. Consider doubly infinite sequences of the form $\cdots b_{-2} b_{-1} b_0 b_1 b_2 \cdots$, where $b_n$ is an integer $1, \ldots, N$, with the condition that the number $b_n$ can be followed by $b_{n+1}$ only if $B_{b_n b_{n+1}} = 1$. To each such sequence we can associate a point $x$ on $M$ by requiring that for every $n \in \mathbb{Z}$, the parallelogram $P_{b_n}$ contain the iterate $\phi^n(x)$. The symbolic dynamics of $\phi$ is now given by shifting the mark by one step to the right: the new word $\cdots b_{-1} b_0 b_1 b_2 b_3 \cdots$ indeed represents the point $\phi(x)$. The dynamical properties of $\phi$ are thus encoded in the matrix $B$. In particular, the higher the rate of mixing of $\phi$, the smaller the number of coefficients $B_{ij}$ that are zero. This in turn means that we have fewer restrictions on $b_n \mapsto b_{n+1}$, and a typical orbit will be represented by a more “random” sequence of symbols.

Cat maps, as well as higher-dimensional toral automorphisms, are featured in most textbooks on dynamical systems and ergodic theory. An introduction to the basic concepts may be found in Arnol’d & Avez (1968), and more advanced topics, such as entropy, Markov partitions, and structural stability, are discussed, for example, in Adler & Weiss (1970), Katok & Hasselblatt (1995), Pollicott & Yuri (1998), and Shub (1987).
JENS MARKLOF
See also Anosov and Axiom-A systems; Chaotic dynamics; Horseshoes and hyperbolicity in dynamical systems; Maps; Markov partitions; Measures
Further Reading
Adler, R.L. & Weiss, B. 1967. Entropy, a complete metric invariant for automorphisms of the torus. Proceedings of the National Academy of Sciences of the United States of America, 57: 1573–1576
Adler, R.L. & Weiss, B. 1970. Similarity of Automorphisms of the Torus, Providence, RI: American Mathematical Society
Arnol’d, V.I. & Avez, A. 1968. Ergodic Problems of Classical Mechanics, New York and Amsterdam: Benjamin
Katok, A. & Hasselblatt, B. 1995. Introduction to the Modern Theory of Dynamical Systems, Cambridge and New York: Cambridge University Press
Pollicott, M. & Yuri, M. 1998. Dynamical Systems and Ergodic Theory, Cambridge and New York: Cambridge University Press
Shub, M. 1987. Global Stability of Dynamical Systems, Berlin and New York: Springer
Sinai, Ya.G. 1959. On the concept of entropy for a dynamic system. Doklady Akademii Nauk SSSR, 124: 768–771
CATALYTIC HYPERCYCLE
The concept of the “hypercycle” was invented in the 1970s in order to characterize a functional entity that integrates several autocatalytic elements into an organized unit (Eigen, 1971; Eigen & Schuster, 1977, 1978a,b). This concept is a key to understanding the dynamics of living organisms. A catalytic hypercycle is defined as a cyclic network of autocatalytic reactions (Figure 1). Autocatalysts, in general, compete when they are supported by the same source of energy or material. Hypercyclic coupling introduces mutual dependence of elements and suppresses competition. Consequently, the fate of all members of a hypercycle is identical to that of the entire system; in other words, no element of a hypercycle dies out provided the hypercycle as such survives.

The current view of biological evolution distinguishes periods of dominating Darwinian evolution based on variation, competition, and selection, interrupted by rather short epochs of radical innovations often called major transitions (Maynard Smith & Szathmáry, 1995; Schuster, 1996). In the course of biological evolution, major transitions introduce higher hierarchical levels. Examples are (i) the origin of translation from nucleic acid sequences into proteins including the invention of the genetic code, (ii) the transition from independent replicating molecules to chromosomes and genomes, (iii) the transition from the prokaryotic to the eukaryotic cell, (iv) the transition from independent unicellular individuals to differentiated multicellular organisms, (v) the transition from solitary animals to animal societies, and (vi) presumably a series of successive transitions from animal societies to humans. All major transitions introduce a previously unknown kind of cooperation into biology. The hypercycle is one of very few mechanisms that can deal with cooperation of otherwise competing individuals. It is used as a model system in prebiotic chemistry, evolutionary biology, theoretical economics, as well as in cultural sciences.

Figure 1. Definition of hypercycles. Replicator equations as described by the differential equation (2) can be symbolized by directed graphs: the individual species are denoted by nodes and two nodes are connected by an edge, $j \to i$, if and only if $a_{ij} > 0$. The graphs of hypercycles consist of single Hamiltonian arcs as sketched on the left-hand side of the figure. These dynamical systems are permanent independent of the choice of rate parameters $f_i$. For $n \le 5$ they represent the only permanent systems, but for $n \ge 6$ the existence of a single Hamiltonian arc is only a sufficient but not a necessary condition for permanence. The graph on the right-hand side, for example, does not contain a Hamiltonian arc but the corresponding replicator equation is permanent for certain choices of rate parameters (Hofbauer & Sigmund, 1998).

The simplest example of a catalytic hypercycle is the elementary hypercycle. It is described by the dynamical system
$$\frac{dx_i}{dt} = x_i \left( f_i x_{i-1} - \sum_{j=1}^{n} f_j x_{j-1} x_j \right); \quad i, j = 1, 2, \ldots, n; \quad i, j \bmod n. \tag{1}$$
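Equation (1) is straightforward to integrate numerically. The sketch below is a minimal forward-Euler integration written for illustration; the equal rate parameters $f_i = 1$, the initial condition (one species near 1, the others at 0.025), the step size, and the integration times are assumptions of this sketch, and the function names are hypothetical.

```python
def hypercycle_rhs(x, f):
    """Right-hand side of Equation (1) for concentrations x and rates f."""
    n = len(x)
    # x[i - 1] wraps around for i = 0, which implements the index mod n.
    growth = [f[i] * x[i - 1] for i in range(n)]
    flux = sum(growth[j] * x[j] for j in range(n))
    return [x[i] * (growth[i] - flux) for i in range(n)]


def simulate(n, t_end, dt=0.01):
    """Forward-Euler integration of the elementary hypercycle with f_i = 1."""
    f = [1.0] * n
    x = [0.025] * n
    x[0] = 1.0 - (n - 1) * 0.025  # concentrations sum to one
    for _ in range(int(round(t_end / dt))):
        dx = hypercycle_rhs(x, f)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


if __name__ == "__main__":
    print("n = 3:", simulate(3, 200.0))
    print("n = 5:", simulate(5, 300.0))
```

In this sketch the three-membered cycle settles at equal concentrations of 1/3, while the five-membered cycle keeps oscillating away from the symmetric point; in both cases all concentrations stay positive and their sum remains equal to one.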
The catalytic interactions within a hypercycle form a directed closed loop comprising all elements, often called a Hamiltonian arc: $1 \to 2 \to 3 \to \cdots \to n \to 1$ (Figure 1). Hypercycles are special cases of replicator equations of the class
$$\frac{dx_i}{dt} = x_i \left( \sum_{j=1}^{n} a_{ij} x_j - \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k \right); \quad i, j, k = 1, 2, \ldots, n \tag{2}$$
with $a_{ij} = f_i\,\delta_{i-1,j}$; $i, j \bmod n$. (The ‘mod n’ function implies a cyclic progression of integers, $1, \ldots, n-1, n, 1, \ldots$. The symbol $\delta_{i,j}$ represents Kronecker’s symbol: $\delta = 1$ for $i = j$ and $\delta = 0$ for $i \neq j$.) For positive rate parameters and initial conditions inside the positive orthant, the trajectory of a hypercycle remains within the orthant: $f_i > 0,\ x_i(0) > 0\ \forall\, i = 1, 2, \ldots, n \Longrightarrow x_i(t) > 0\ \forall\, t \ge 0$. (The notion of an orthant refers to the entire section of a Cartesian coordinate system in which the signs of the variables do not change; in $n$ dimensions, the positive orthant is defined by $\{x_i > 0\ \forall\, i = 1, 2, \ldots, n\}$.) In other words, none of the variables is going to vanish and hence, the system is permanent in the sense that no member of a hypercycle dies out in the limit of long times, $\lim_{t \to \infty} x_i(t) \neq 0\ \forall\, i = 1, \ldots, n$. The existence of a Hamiltonian arc, that is, a closed loop of directed edges visiting all nodes once, is a sufficient condition for permanence (Hofbauer & Sigmund, 1998). It is also a necessary condition for low-dimensional systems with $n \le 5$, but there exist permanent dynamical systems for $n \ge 6$ without a Hamiltonian arc; one example is shown in Figure 1.

The dynamics of Equation (1) remains qualitatively unchanged when all rate parameters are set equal: $f_1 = f_2 = \ldots = f_n = f$, which is tantamount to a barycentric transformation of the differential equation (Hofbauer, 1981). The hypercycle is invariant with respect to a rotational change of variables, $x_i \Longrightarrow x_{i+1}$ with $i = 1, 2, \ldots, n$; $i \bmod n$, it has one equilibrium point in the center, and its dynamics depends exclusively on $n$. Some examples with small $n$ are shown in Figure 2. The systems with $n \le 4$ converge toward stable equilibrium points, whereas the trajectories of Equation (1) with $n \ge 5$ approach limit cycles. Independent of $n$, elementary hypercycles do not sustain chaotic dynamics.

Figure 2. Solution curves of small elementary hypercycles. The figure shows the solution curves of Equation (1) with $f_1 = f_2 = \ldots = f_n = 1$ for $n = 2$ (upper left picture), $n = 3$ (upper right picture), $n = 4$ (lower left picture), and $n = 5$ (lower right picture). The initial conditions were $x_1(0) = 1 - (n - 1) \cdot 0.025$ and $x_k(0) = 0.025\ \forall\, k = 2, 3, \ldots, n$. The sequence of the curves $x_k(t)$ is $k = 1$ full black line, $k = 2$ full gray line, $k = 3$ hatched black line, $k = 4$ hatched gray line, and $k = 5$ black line with long hatches. The cases $n = 2$, 3, and 4 have stable equilibrium points in the middle of the concentration space, $c = (1/n, 1/n, \ldots, 1/n)$; Equation (1) with equal rate parameters, $n = 4$, and linearized around the midpoint $c$ exhibits a marginally stable “center,” and very slow convergence is caused by the nonlinear term, which becomes smaller as the system approaches $c$. For $n = 5$, the midpoint $c$ is unstable and the trajectory converges toward a limit cycle (Hofbauer et al., 1991).

Hypercycles have two inherent instabilities, which are easily illustrated for molecular species: (i) the members of the cycle may also catalyze the formation of nonmembers that do not contribute to the growth of the hypercycle, and thus hypercycles are vulnerable to parasitic exploitation (Eigen & Schuster, 1978a,b), and (ii) concentrations of individual species in oscillating hypercycles ($n \ge 5$) go through very small values, and these species might become extinct
through random fluctuations. More elaborate kinetic mechanisms can stabilize the system in Case (ii). Exploitation by parasites, Case (i), can be avoided by compartmentalization.

Competition between different hypercycles is characterized by a strong nonlinearity in selection (Hofbauer, 2002): once a hypercycle has been formed and established, it is very hard to replace it by another hypercycle. Epochs with hypercyclic dynamics provide explanations for “once for ever” decisions or “frozen accidents.”
PETER SCHUSTER
See also Artificial life; Biological evolution
Further Reading
Eigen, M. 1971. Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften, 58: 465–523
Eigen, M. & Schuster, P. 1977. The hypercycle. A principle of natural self-organization. Part A: Emergence of the hypercycle. Naturwissenschaften, 64: 541–565
Eigen, M. & Schuster, P. 1978a. The hypercycle. A principle of natural self-organization. Part B: The abstract hypercycle. Naturwissenschaften, 65: 7–41
Eigen, M. & Schuster, P. 1978b. The hypercycle. A principle of natural self-organization. Part C: The realistic hypercycle. Naturwissenschaften, 65: 341–369
Hofbauer, J. 1981. On the occurrence of limit cycles in the Volterra–Lotka equation. Nonlinear Analysis, 5: 1003–1007
Hofbauer, J. 2002. Competitive exclusion of disjoint hypercycles. Zeitschrift für Physikalische Chemie, 216: 35–39
Hofbauer, J. & Sigmund, K. 1998. Evolutionary Games and Replicator Dynamics, Cambridge and New York: Cambridge University Press
Hofbauer, J., Mallet-Paret, J. & Smith, H.L. 1991. Stable periodic solutions for the hypercycle system. Journal of Dynamics and Differential Equations, 3: 423–436
Maynard Smith, J. & Szathmáry, E. 1995. The Major Transitions in Evolution, Oxford and New York: Freeman
Schuster, P. 1996. How does complexity arise in evolution? Complexity, 2(1): 22–30
Figure 1. (a) Sketch of Zeeman’s catastrophe machine. (b) Sketch of the three-dimensional solution surface and its projection onto two dimensions near the cusp A.
CATASTROPHE THEORY
Many natural phenomena (cell division, the bursting of bubbles, the collapse of buildings, and so on) involve discontinuous changes, whereas the majority of applied mathematics is directed toward modeling continuous processes. Catastrophe theory, by contrast, is concerned primarily with singular behavior and so deals directly with the properties of discontinuities. This approach has found use in many diverse fields and at one time was heralded as a new direction in mathematics, uniting singularity and bifurcation theories and their applications (Zeeman, 1976). A simple mechanical system that illustrates the important ideas of discontinuous changes and hysteresis is provided by Zeeman’s catastrophe machine (Zeeman, 1972; Poston & Stewart, 1996). A sketch showing its construction is given in Figure 1(a). Readers are encouraged to build such a device and experiment with it. The lines between Q, P, and the pointer represent rubber bands attached to a disk that rotates around O. Moving the pointer so that it remains outside the region ABCD results in a smooth motion of the wheel from one equilibrium state to another. This is illustrated by path 1 in Figure 1(b), where x and θ are as defined in Figure 1(a) and k is a measure of the stiffness of the bands. However, starting with the pointer below AB and moving it horizontally to cross AD causes an anticlockwise jump in the wheel. This is equivalent to following path 2 in Figure 1(b). The new equilibrium persists until AB is crossed when the pointer is moved back. The loci DAB form a cusp in the parameter space of the system, where AD and AB are lines of folds that meet at the cusp point A. The cusp is the projection onto the plane of a three-dimensional folded surface. The region labeled ABCD comprises four such cusps, each of which can be described by a simple cubic equation.
Indeed, the set can be obtained from an approximate model for the machine (Poston & Stewart, 1996). The term catastrophe was introduced by René Thom in 1972 to describe such discontinuous changes in a system where a parameter is changed smoothly.
Zeeman (1976) then coined the phrase catastrophe theory, and an explosion of applications arose, ranging from the physical to the social sciences. An important idea put forward is that there are seven elementary catastrophes that classify most types of observed discontinuous behavior. They are the fold, cusp, swallowtail, and butterfly catastrophes and the elliptic, hyperbolic, and parabolic umbilic catastrophes. These describe all possible discontinuities for potential systems that are controlled by up to four variables. They are ordered according to how typically they occur, with the fold being the most common. A path of folds is represented by a line of singular behavior in parameter space, and a cusp is formed when two such lines meet. Indeed, these two singularities are sufficient to cover most of the observed macroscopic critical behavior in practical applications. One area in which catastrophe theory has been used with considerable success is optical caustics (Nye, 1999). A common experience of this phenomenon is the observation of a distant light source through a drop of water on a window pane, where a web of bright lines separated by dark regions can often be seen. The bright lines on the bottom of swimming pools are also examples of caustics, where bright sunlight is focused by the surface of the water. In this case, the line caustics are examples of paths of folds in the ray surfaces. An example of an optical cusp is provided by strong sunlight focused on the surface of a cup of coffee, where two principal fold lines are made to meet by the curvature of the cup. In an outstanding series of experiments (Berry, 1976), a laser beam was shone through a water drop and a whole sequence of catastrophes was uncovered, including swallowtails. All of the observed patterns can be reproduced in detail using the equations of ray optics (Nye, 1999).
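As an illustration of how two fold lines meet at a cusp, the following Python sketch uses the standard cusp normal form, with potential V(x) = x^4/4 + a x^2/2 + b x, rather than the specific model of Zeeman’s machine. That choice, the function names, and the numerical parameters are assumptions made for this example only. Equilibria satisfy the cubic x^3 + a x + b = 0, and the fold lines are given by 4a^3 + 27b^2 = 0. Sweeping b slowly forward and then backward at fixed a < 0 gives a jump and a hysteresis loop of the kind described for the machine.

```python
def count_equilibria(a, b):
    """Number of distinct real roots of x**3 + a*x + b = 0 (equilibria of V)."""
    disc = -(4.0 * a**3 + 27.0 * b**2)  # discriminant of the depressed cubic
    if disc > 0:
        return 3   # inside the cusp: two minima and one maximum
    if disc < 0:
        return 1   # outside the cusp: a single minimum
    return 2 if (a != 0 or b != 0) else 1  # on a fold line, or at the cusp point


def relax(x, a, b, steps=5000, dt=0.01):
    """Follow the gradient flow dx/dt = -(x**3 + a*x + b) to a nearby minimum."""
    for _ in range(steps):
        x -= dt * (x**3 + a * x + b)
    return x


def sweep(a, b_values, x0):
    """Vary b slowly, letting the state settle at every step."""
    xs, x = [], x0
    for b in b_values:
        x = relax(x, a, b)
        xs.append(x)
    return xs


a = -3.0                                   # a < 0: the sweep crosses both fold lines
b_fold = (4.0 * (-a) ** 3 / 27.0) ** 0.5   # fold lines lie at b = +/- b_fold
n = 41
forward = [-3.0 + 6.0 * i / (n - 1) for i in range(n)]
backward = forward[::-1]

x_up = sweep(a, forward, x0=2.0)           # start on the upper branch
x_down = sweep(a, backward, x0=x_up[-1])   # return from where the sweep ended

mid = n // 2                               # index at which b = 0 in both sweeps
x_forward_at_zero = x_up[mid]
x_backward_at_zero = x_down[mid]
print(count_equilibria(a, 0.0), x_forward_at_zero, x_backward_at_zero)
```

At b = 0 the forward sweep rests on the upper branch and the backward sweep on the lower one, so the state depends on the history of the control parameter. This is the same hysteresis seen when the pointer of the machine crosses AD and is then brought back across AB.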
Catastrophe theory has also been used to explore in some detail the state selection process in Taylor–Couette flow between concentric cylinders. In this case, even numbers of vortices are generated in the flow field, and the number that is realized depends on the length of the cylinders. For a given length of the cylinder, one state develops smoothly with the control parameter, and neighboring states are delimited by fold lines in parameter space. The fold lines meet in a cusp that has been observed experimentally (Benjamin, 1978) and calculated numerically (Cliffe, 1988) from the Navier–Stokes equations. There has been considerable criticism of catastrophe theory on both technical and practical grounds (Arnol’d, 1986; Arnol’d et al., 1994). For example, it is known that critical behavior or bifurcations in some multidimensional gradient systems do not reduce to critical points of potentials (Guckenheimer, 1973). Also, the ideas have been applied to a wide range of social, financial, and biological applications where the governing rules are not known or are very primitive. Very often, it is a case of reinterpretation of common experience in terms of technical mathematical language, which is most often qualitative rather than quantitative (Arnol’d, 1986). Hence, it is often the case that disparate systems superficially appear the same, but closer examination reveals that they are quite different in terms of important details.
TOM MULLIN
See also Bifurcations; Critical phenomena; Development of singularities; Equilibrium; Taylor–Couette flow
Further Reading
Arnol’d, V.I. 1986. Catastrophe Theory, Berlin and New York: Springer
Arnol’d, V.I., Afrajmovich, V.S., Il’yashenko, Yu.S. & Shil’nikov, L.P. 1994. Bifurcation Theory and Catastrophe Theory, Berlin and New York: Springer
Benjamin, T.B. 1978. Bifurcation phenomena in steady flows of a viscous fluid. Proceedings of the Royal Society of London, Series A, 359: 1–43
Berry, M.V. 1976. Waves and Thom’s theorem. Advances in Physics, 25(1): 1–26
Cliffe, K.A. 1988. Numerical calculations of the primary-flow exchange process in the Taylor problem. Journal of Fluid Mechanics, 197: 57–79
Guckenheimer, J. 1973. Bifurcation and catastrophe. In Dynamical Systems: Proceedings of the Symposium of University of Bahia, Salvador, 1971, pp. 95–109
Nye, J.F. 1999. Natural Focusing and Fine Structure of Light: Caustics and Wave Dislocations, Bristol: Institute of Physics Publishing
Poston, T. & Stewart, I. 1996. Catastrophe Theory and Its Applications, New York: Dover
Saunders, P.T. 1980. An Introduction to Catastrophe Theory, Cambridge and New York: Cambridge University Press
Thom, R. 1972. Structural Stability and Morphogenesis, Reading, MA: Benjamin
Zeeman, E.C. 1972. A catastrophe machine. In Towards a Theoretical Biology, vol. 4, edited by C.H. Waddington, Edinburgh: Edinburgh University Press, pp. 276–282
Zeeman, E.C. 1976. Catastrophe theory: a reply to Thom. In Dynamical Systems Warwick 1974, edited by A. Manning, Berlin and New York: Springer, pp. 405–419
CAUSALITY
Basic to science as well as to common sense is the root notion of causality, that things and processes in the world we experience are not totally random, but ordered in specific ways that allow for rational understanding through explanations of various types. In the transition from a mythological worldview to a rational one, the notion of guilt (in Greek aitia), as in a criminal being guilty of a crime, was metaphorically used to describe nonpersonal natural processes whenever one phenomenon would necessarily follow another. As one aim of modern science is to uncover the deep structure of the world beyond our immediate experience, its explanations deal with the different determinants (or causes) of processual order. Three classical conceptions of inquiry, associated with the traditions of Plato, Aristotle, and Archimedes, have provided influential ideas about the role of causal explanations in science. In the Platonic tradition, certain properties of nature could be derived from a priori given mathematical structures. As discovered by the Pythagorean philosophers of nature, a specific mathematical structure (relations between small whole numbers) could be the key to a part of nature (such as acoustics), so why not see whether that same structure could also describe other areas (such as astronomy)? Although the latter attempt failed, the general idea of using the power of demonstrative or formal necessity in mathematics as a descriptive tool of natural causality is still vital in many areas of science, including cosmology and high-energy physics. Also, the use of analogous mathematical structures to describe phenomena has become standard. The Aristotelian tradition did not refuse mathematical description but saw it only as a tool in a search for the real causes of things.
For Aristotle, there were four kinds of causes: the material cause (hyle or causa materialis) that describes the stuff of which something is made, the formal cause (eidos or causa formalis) that describes the organization of something, the efficient cause (to kinetikon or causa efficiens) that describes the active forces by which the phenomenon comes into being, and the final cause (to telos or causa finalis) that describes the purpose that it serves. Thus, for a house, bricks and mortar are its material cause, the plan of the house is its formal cause, the mason building it is its efficient cause, and the purpose of sheltering a family is its final cause. In that ancient world of Aristotle, each phenomenon generally served a purpose. Aristotle did not consider the four causes as necessarily separate aspects of nature, but more like principles of explanation that may sometimes merge, as in the sprouting acorn becoming an oak tree where the formal, efficient, and final causes work together to actualize the characteristics of an adult oak. The popular Renaissance critique of the final cause as implying the paradox of a future state (a goal)
influencing a present state led to a dismissal of any pluralist conception of causes. In the subsequent mechanical world picture, only efficient causes were left as explanatory. The life sciences could not live up to this reduction but continued as a descriptive natural history with an essentially Aristotelian outlook, at least until Darwin would explain the final cause of adaptations by the efficient causes of natural selection and heredity. Yet, even the Darwinian paradigm could not account for the nonlinear mechanisms of self-organization in the organism’s embryonic development. Such goal-like (teleological) properties of development and self-reproduction remained necessary yet unexplained preconditions for the mechanics of natural selection. The Archimedean tradition was founded by disciplines more physical than mathematical, although combining the two, such as optics, astronomy, mechanics, and music theory. The mathematical relations discovered by Archimedes (ca. 287–212 BC) in his books on mechanics were not a priori, as in the Platonic tradition, but derived from experience. However, the Aristotelian pursuit of the causes of the phenomena, especially the final ones, was regarded as metaphysical and so ignored. The Archimedean tradition includes such names as Ptolemy, Johannes Kepler, Galileo Galilei, and Isaac Newton. Kepler started out as a Platonist, aiming to explain the Copernican system (which placed the Sun at the center of the solar system, in opposition to the Ptolemaic system) by regular polyhedrons, but failed and found the right laws for planetary movements through a mathematical analysis of Tycho Brahe’s empirical observations. Galileo found his laws of falling bodies by eschewing the search for a hypothetical cause of gravitational force and instead using measures proportional to the velocity of a moving body for the effect of this force.
Although the mechanical worldview emphasizes only the role of efficient causes as principles of explanation in physics, the very idea of cause gave way for a long period to skepticism about proving any real existence of causes (the positivism of David Hume), eventually seeing the concept of cause as a feature of the observing subject (the transcendental idealism of Immanuel Kant). Yet, in physics, the laws of nature as expressed in terms of mathematics came to play the explanatory role of the causes of a system’s movement. It was assumed that any natural system could be encoded into some formalism (e.g., a set of differential equations representing the basic laws governing the system) and that the entailment structure of that formalism perfectly mirrored the (efficient) causal structure of that part of nature. This view was compatible with a micro-determinism where a system’s macroscopic properties and processes are seen as completely determined by the behavior of the system’s constituent particles, governed by deterministic laws.
This view was deeply questioned by quantum physics, and by Rosen’s work on fundamental limits on dynamic models of causal systems. The complexity of causality, especially in goal-directed systems, was presaged by cybernetic research in the 1940s, dealing with negative feedback control (in animals and artifacts such as self-guiding missiles) and the role of information processing for the regulation of dynamic systems. A paradigmatic example is the closed causal loops connecting various physiological levels of hormones in the body, essential for maintaining a constant internal environment (homeostasis), a modern version of the ancient symbol of uroboros, the snake biting its own tail. The emergence of nonlinear science in the late 20th century increased interest in the old idea that causal explanations may not all reduce to simple one-to-one correspondences between cause and effect. The realization that complex systems may occupy different areas in phase space characterized by qualitatively distinct attractors, eventually separated by fractal borders, has questioned micro-determinism even more than the fact that many such nonlinear systems have a high sensitivity to the initial conditions (the butterfly effect). Another insight is that complex things often self-organize as high-level patterns via processes of local interactions between simple entities. This emergence of wholes (or collective behavior of units) may be mimicked in causal explanations. Instead of top-down reductive explanations, nonlinear science provides additional bottom-up explanations of emergent phenomena. Although these explanations are still reductive (in the methodological sense that one can show exactly what is going on from step to step in a simulation of the system), the complexity makes prediction impossible; thus, computational shortcuts to predict a future state can rarely be found.
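The difference between a deterministic rule and a predictable outcome can be illustrated with a short numerical experiment. The sketch below uses the logistic map x → 4x(1 − x), a standard textbook example that is not discussed in this entry and is chosen here purely for illustration; the function name and the initial values are arbitrary assumptions. Every step follows exactly from the previous one, yet two orbits that start 10^-10 apart become completely different after a few dozen iterations.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate the deterministic rule x -> r*x*(1-x) and return the whole orbit."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


orbit_a = logistic_orbit(0.2)
orbit_b = logistic_orbit(0.2 + 1e-10)   # almost identical initial condition

# Distance between the two orbits at every step.
separation = [abs(x - y) for x, y in zip(orbit_a, orbit_b)]

# First step at which the two orbits differ by more than 0.1.
first_large = next(t for t, d in enumerate(separation) if d > 0.1)
print("orbits differ by more than 0.1 from step", first_large)
```

Because the map can stretch a small difference by at most a factor of 4 per step, the divergence cannot happen in fewer than 15 steps, but it does happen. Knowing a later state therefore requires running the iteration step by step, which is the sense in which computational shortcuts are rarely available.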
As an emergent whole is formed bottom-up, its organization constrains its components in a top-down manner that has been called downward causation (DC). There are three interpretations of DC. In strong DC, the emergent whole (a human mind) effectuates a change in the very laws that govern the lower level (as free will might suspend what normally determines the action of the brain’s neurons). This interpretation is often related to vitalist and dualist conceptions of life and mind and is hard to reconcile with science. In medium DC, lower-level laws remain unaffected; yet, their boundary conditions are constrained by the emergent pattern (a mental representation), which is considered just as real as the components of the system (neuronal signaling). Here, the state of the higher level works as a factor selecting which of the many possible next states of the high level may emerge from the low level. In weak DC, the emergent higher levels are seen as regulated by stable (cyclic or chaotic) attractors for the dynamics of the lower level. The fact that a biological species consists of
stable organisms is not solely a product of natural selection, but is a result of such internal, formal properties in the system’s organization, the job of natural selection being to sort out the possible stable organisms and find those most fit for the given milieu (Kauffman, 1993; Goodwin, 1994). It should be emphasized that DC is not a form of efficient causation (involving a temporal sequence from cause to effect); rather, it is a modern version of the Aristotelian formal and final cause. Nonlinear science may be said to integrate a Platonic appreciation of universality (as found in the equations governing the passage to chaos in systems of quite distinct material nature), an Aristotelian acceptance of several types of causes, and an Archimedean pragmatism regarding the deeper status of determinism and causality. The latter is reflected in the fact that although deterministic chaos characterizes a large class of systems, this does not imply that these systems (or nature) are fully deterministic. The determinism refers to the mathematical tools used rather than an ontological notion of causality.
CLAUS EMMECHE
See also Biological evolution; Butterfly effect; Determinism; Feedback
Further Reading
Depew, D.J. & Weber, B.H. 1995. Darwinism Evolving: System Dynamics and the Genealogy of Natural Selection, Cambridge, MA: MIT Press
Emmeche, C., Stjernfelt, F. & Køppe, S. 2000. Levels, emergence, and three versions of downward causation. In Downward Causation. Minds, Bodies and Matter, edited by P.B. Andersen, C. Emmeche, N.O. Finnemann & P.V. Christiansen, Aarhus: Aarhus University Press, pp. 13–34
Fox, R.F. 1982. Biological Energy Transduction: The Uroboros, New York: Wiley
Goodwin, B. 1994. How the Leopard Changed Its Spots: The Evolution of Complexity, New York: Scribner’s
Kauffman, S.A. 1993. The Origins of Order. Self-organization and Selection in Evolution, Oxford and New York: Oxford University Press
Pedersen, O. 1993. Early Physics and Astronomy: A Historical Introduction, Cambridge and New York: Cambridge University Press
Rosen, R. 2000. Essays on Life Itself, New York: Columbia University Press
Weinert, F. (editor). 1995. Laws of Nature: Essays on the Philosophical, Scientific and Historical Dimensions, Berlin and New York: Walter de Gruyter
CAUSTICS See Catastrophe theory
CAVITY SOLITONS See Solitons, types of
CELESTIAL MECHANICS
Although its origins can be traced back to antiquity, to the first attempts to explain the apparently irregular wandering of the planets, celestial mechanics was born in 1687 with the release of Isaac Newton’s Principia. In 1799, Pierre-Simon Laplace introduced the term mécanique céleste (Laplace, 1799), which was adopted to describe the branch of astronomy that studies the motion of celestial bodies under the influence of gravity. Celestial mechanics is researched and developed by astronomers and mathematicians; the methods used to investigate it include numerical analysis, the theory of dynamical systems, perturbation theory, the quantitative and qualitative theory of differential equations, topology, the theory of probabilities, differential and algebraic geometry, and combinatorics. Ptolemy’s idea of epicycles (according to which planets orbit on small circles, whose centers move on larger circles, whose centers move on even larger circles around the Earth) dominated astronomy in antiquity and the Middle Ages. In 1543, after working for more than 30 years on a new theory, Copernicus finished writing De Revolutionibus, a book in which he expressed the motion of the planets with respect to a heliocentric reference system, that is, one with the Sun at its origin. This allowed Kepler to use existing observations and formulate three laws of planetary motion, the first two published in 1609 in Astronomia Nova and the third in 1619 in Harmonices Mundi: (i) The law of motion: every planet moves on an ellipse having the Sun at one of its foci. (ii) The law of areas: every planet moves such that the segment planet-Sun sweeps equal areas in equal intervals of time. (iii) The harmonic law: the squares of the periods of any two planets are to each other as the cubes of their mean distances from the Sun. But all these achievements were empirical, based on observations, not on deductions obtained from a more general physical law.
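The harmonic law can be checked directly against orbital data. The short Python sketch below is illustrative only: the semi-major axes (in astronomical units) and sidereal periods (in years) are rounded textbook values supplied here as assumptions, not figures taken from this entry, and the function name is hypothetical. In these units the law predicts that T^2/a^3 is the same for every planet and equal to 1.

```python
# Approximate semi-major axis a (astronomical units) and sidereal period T (years).
# Rounded textbook values, included only to illustrate the harmonic law.
PLANETS = {
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
}


def harmonic_ratio(a_au, t_years):
    """Return T**2 / a**3, which Kepler's third law predicts is the same for all planets."""
    return t_years**2 / a_au**3


ratios = {name: harmonic_ratio(a, t) for name, (a, t) in PLANETS.items()}
for name, value in ratios.items():
    print(f"{name:8s} T^2/a^3 = {value:.4f}")
```

With these inputs every ratio lies within one percent of unity, which is the kind of empirical regularity Kepler extracted from observations before any underlying physical law was known.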
In 1666, Newton came up with the idea that the attractive force responsible for the free fall of objects might be the same as the one keeping the Moon in its orbit. He conjectured that this force is directly proportional to the product of the masses and inversely proportional to the square of the distance between the bodies. The tools of calculus, which he had invented independently of, and at about the same time as, Gottfried Wilhelm von Leibniz, allowed him to proceed with the computations. Two decades later, in Principia, Newton proved the correctness of his theory. Kepler’s laws follow as consequences. They are obtained from the differential equations of the Newtonian two-body problem (also called the Kepler problem) given by a potential energy of the form $U(r) = -G m_1 m_2 / r$, where G is the gravitational constant and r is the distance between the bodies of masses $m_1$ and $m_2$. After Newton, mathematicians such as Johann Bernoulli, Alexis Clairaut, Leonhard Euler, Jean d’Alembert, Laplace, Joseph-Louis Lagrange, Siméon Poisson, Carl Jacobi, Karl Weierstrass, and Spiru Haretu attacked various theoretical questions of celestial mechanics (e.g., the 2- and 3-body problem, the lunar problem, the motion of Jupiter’s satellites, and the stability of the solar system), mostly with the quantitative tools of analysis, algebra, and the theory of differential equations. On the practical side, the first resounding success in the field was the prediction of the return of Halley’s comet, which occurred in 1758, as the calculations had shown. An even more spectacular achievement came in 1846 with the discovery of the planet Neptune on the basis of perturbation theory, through computations independently performed by John Couch Adams and Urbain Jean-Joseph Le Verrier. Having its origin in one of Euler’s papers, which applied the calculus of trigonometric functions to the 3-body problem, perturbation theory is now an independent branch of mathematics (see, e.g., Verhulst, 1990; Guckenheimer & Holmes, 1983) that is often used in celestial mechanics. An important theoretical advance was achieved by Henri Poincaré toward the end of the 19th century, when the questions of celestial mechanics, especially those concerning the Newtonian 3-body problem, received substantial attention. While working on this problem, Poincaré understood that the quantitative methods of obtaining explicit solutions for differential equations were not strong enough to help him make significant progress; thus, he tried to describe the qualitative behavior of orbits (e.g., stability, the motion in the neighborhood of collisions and at infinity, existence of periodic solutions) even when their expressions were too complicated or impossible to derive, which is the case in general.
His ideas led to the birth of several branches of mathematics, including the theory of dynamical systems, nonlinear analysis, chaos, stability, and algebraic topology (Barrow-Green, 1997; Diacu & Holmes, 1996). Today’s astronomers working in celestial mechanics are primarily interested in questions directly related to the solar system, such as the accurate prediction of eclipses, orbits of comets and asteroids, the motion of Jovian moons, Saturn’s rings, and artificial satellites. The invention of the electronic computer had a significant impact on the practical aspects of the field. The development of numerical methods allowed researchers to obtain good approximations of the planets’ motion for long intervals of time. These types of results are also used in astronautics. No space mission, from the Sputnik, Apollo, and Pioneer programs to the space shuttle, the Hubble telescope launch, and the recent international space
collaboration projects, could have been possible without the contributions of celestial mechanics. Contemporary mathematicians active in the field mostly deal with theoretical issues, for example, the study of the general N-body problem and its particular cases (Wintner, 1947) (N = 2, 3, 4, the collinear, isosceles, rhomboidal, Sitnikov, and planetary problems, central configurations, etc.), attempting to answer questions regarding motion near singularities and at infinity, periodic orbits, stability and chaos, oscillatory behavior, Arnol’d diffusion, etc. Some researchers also study alternative gravitational forces like that suggested by Manev (Diacu et al., 2000; Hagihara, 1975; Moulton, 1970), which offers a good relativistic approximation at the level of the solar system. Celestial mechanics and mathematics have always influenced each other’s development, a trend that is far from slowing down today. The contemporary needs of space science bring a new wave of interest in the theoretical and practical aspects of celestial mechanics, making its connections with mathematics stronger than ever before.
FLORIN DIACU
See also N-body problem; Perturbation theory; Solar system
Further Reading
Barrow-Green, J. 1997. Poincaré and the Three-Body Problem, Providence, RI: American Mathematical Society
Diacu, F. & Holmes, P. 1996. Celestial Encounters: The Origins of Chaos and Stability, Princeton, NJ: Princeton University Press
Diacu, F., Mioc, V. & Stoica, C. 2000. Phase-space structure and regularization of Manev-type problems. Nonlinear Analysis, 41: 1029–1055
Guckenheimer, J. & Holmes, P. 1983. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, Berlin and New York: Springer
Hagihara, Y. 1975. Celestial Mechanics, vol. 2, part 1, Cambridge, MA: MIT Press
Laplace, P.-S. 1799. Traité de mécanique céleste. 5 vols. Paris, 1799–1825
Moulton, F.R. 1970. An Introduction to Celestial Mechanics, New York: Dover
Verhulst, F. 1990. Nonlinear Differential Equations and Dynamical Systems, Berlin and New York: Springer
Wintner, A. 1947. The Analytical Foundations of Celestial Mechanics, Princeton, NJ: Princeton University Press
CELL ASSEMBLIES
The term cell assembly became well established as a neuropsychological concept with the publication in 1949 of Donald Hebb’s book The Organization of Behavior. A cell assembly forms when a group of cortical neurons gets wired together through experience-dependent synaptic potentiation induced by nearly synchronous activity in pairs of connected neurons. Once formed, it serves as a cooperative and
holistic mental representation. The mutual excitation within the active cell assembly influences network dynamics such that (i) the entire group of cells can be activated by stimulating only a part of it and (ii) once active, the ensemble displays afteractivity outlasting the stimulus triggering it. An important extension of the concept was suggested by Peter Milner, who proposed that negative feedback from lateral inhibition is essential to prevent mass activation of the entire network and the emergence of epileptic-like activity (Milner, 1957). This also introduces an element of competitive interaction between the cell assemblies, allowing one assembly at a time to dominate the network. Hebb further proposed that cell assemblies activated in succession would wire together to form “phase sequences” that might be the physiological substrate of chains of association and the flow of thought. The concept of cell assemblies has been extensively elaborated in the context of cortical associative memory (see, e.g., Fuster, 1995). Notably, abstract attractor network models (Hopfield, 1982; Rolls & Treves, 1997) may be regarded as mathematical instantiations of Hebb’s cell assembly theory. They have been useful, for example, to estimate how many assemblies can be stored in a network of a given size, the existence of spurious attractors, and the effect of sparse activity and diluted network connectivity. The extensive recurrent neuronal circuitry required for the formation of cell assemblies is abundant in the cerebral cortex in the form of horizontal intracortical and cortico-cortical connections. The latter are myelinated and support fast communication over large distances. The existence of experience-dependent synaptic changes in the form of “Hebbian” long-term synaptic potentiation is very well established experimentally today.
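The two defining properties of a cell assembly, and the role Milner assigned to lateral inhibition, can be reproduced in a very small toy network. The Python sketch below is a minimal illustration under stated assumptions, not a model from the literature cited here: units are binary, updates are synchronous, and the network size, the weak background coupling (0.2), the inhibition strength (0.3), the threshold (0.5), and all function names are hypothetical choices made for the example. Hebbian learning is represented by adding a unit of weight between every pair of cells that belong to the same assembly.

```python
def hebbian_weights(n_units, assemblies, background=0.2):
    """Weak background coupling, plus +1 for each pair co-active in an assembly."""
    w = [[0.0 if i == j else background for j in range(n_units)]
         for i in range(n_units)]
    for members in assemblies:
        for i in members:
            for j in members:
                if i != j:
                    w[i][j] += 1.0
    return w


def run(w, stimulus, stim_steps=2, total_steps=6, inhibition=0.3, threshold=0.5):
    """Synchronous threshold dynamics with global (lateral) inhibition.

    Returns the set of active units after each time step.
    """
    n = len(w)
    state = [0] * n
    history = []
    for t in range(total_steps):
        total_activity = sum(state)
        new_state = []
        for i in range(n):
            excitation = sum(w[i][j] * state[j] for j in range(n))
            external = 1.0 if (t < stim_steps and i in stimulus) else 0.0
            drive = excitation - inhibition * total_activity + external
            new_state.append(1 if drive > threshold else 0)
        state = new_state
        history.append({i for i in range(n) if state[i]})
    return history


assembly_a = set(range(0, 6))
assembly_b = set(range(6, 12))
weights = hebbian_weights(12, [assembly_a, assembly_b])

# Stimulate only half of assembly A, for the first two time steps.
with_inhibition = run(weights, stimulus={0, 1, 2})
without_inhibition = run(weights, stimulus={0, 1, 2}, inhibition=0.0)

print("with inhibition:   ", sorted(with_inhibition[-1]))
print("without inhibition:", sorted(without_inhibition[-1]))
```

With inhibition, stimulating three cells recruits the whole of assembly A, which then stays active after the stimulus is removed while assembly B remains silent. With the inhibition set to zero, the weak background coupling is enough to activate every unit in the network, the mass activation that lateral inhibition is proposed to prevent.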
In light of what we now know about brain circuitry and neuronal response properties, it is reasonable to assume that a cell assembly may extend across large areas of sensory, motor, and association cortices, perhaps even involving subcortical structures like the thalamus and basal ganglia. Rather than individual neurons, the actual nodes of the cell assembly are likely to be cortical modules, that is, minicolumns, comprising a couple of hundred neurons and with a diameter of about 30 µm (Mountcastle, 1998). Feature detectors in primary sensory areas like the visual orientation columns described by David Hubel and Torsten Wiesel are typical examples. Although the modular organization of higher-order cortical areas is less obvious, it has been proposed that the cortical sheet actually comprises a mosaic of such minicolumns. A single cell assembly would then engage a small fraction of them distributed over a large part of the brain. The reverberatory afteractivity in an active cell assembly was proposed by Hebb (1949) to correspond to “a single content in perception” and last some
hundred milliseconds to half a second. Notably, this is of the same order as the duration of visual fixations and the period of cognitively relevant neurodynamical phenomena evident in EEG (e.g., the theta rhythm) and evoked potentials (Pulvermueller et al., 1994). Persistent activity over seconds but otherwise of the same origin has recently been proposed as a mechanism underlying working memory (Compte et al., 2000). The cell assembly theory connects the cellular and synaptic levels of the brain with psychological phenomenology. It suggests explanations for the close interaction between memory and perception, including the holistic and reconstructive nature of Gestalt perception, perceptual fusion of polymodal stimuli, and perceptual illusions. Typical examples are perceptual completion and filling in, as when looking at the Kanizsa triangle, and perceptual rivalry, as demonstrated by the slowly alternating perception of an ambiguous three-dimensional stimulus like the Necker cube (Figure 1). An analogy to Gestalt perception in the motor domain would be motor synergies, and their existence can also be understood based on the cell assembly theory. In the motor control domain, however, the temporal component of the underlying neurodynamics is critical, and, for instance, finely tuned temporal sensory-motor coordination cannot be explained within this paradigm alone.
Figure 1. Two perceptual illusions that may be understood in terms of cell assembly dynamics.
The neurodynamics of cell assemblies has been studied in a network with biophysically detailed model neurons (see, e.g., Lansner & Fransén, 1992; Fransén & Lansner, 1998), showing that the cellular properties of cortical pyramidal cells promote sustained activity. Moreover, the experimentally observed level of mutual excitation is sufficient to support such activity, and the measured magnitude of cortical lateral inhibition is effective in preventing coactivation of assemblies. The time to activate an entire assembly from a part of it was found to be within 50–100 ms, in accordance with psychological experimental results. Modeling has further shown that neuronal adaptation due to accumulation of slow afterhyperpolarization (presumably together with synaptic depression) is a likely cause of termination of activity in an active assembly. Afteractivity may typically last some 300 ms, but the network dynamics is quite sensitive to endogenous monoamines such as serotonin, which acts by modulating the neuronal conductances underlying such adaptation. Further, simulations demonstrate that a cell assembly can survive even if the average conduction delay between the participating neurons increases to 10 ms, corresponding to a spatial extent of about 50 mm at axonal conduction velocities of 5 m/s. Despite quite powerful mutual excitation, reasonably low firing rates of cortical cells can be obtained in models with saturating synapses with slow kinetics (e.g., NMDA receptor gated channels) together with cortical feedback inhibition. Although biologically highly plausible and supported by computational models, solid experimental evidence for the existence of cell assemblies is still lacking.
Detection of their transient and highly distributed and diluted activity requires simultaneous noninvasive measurement in awake animals of the activity in a large number of neurons with high spatial and temporal resolution. This is still beyond the reach of current experimental techniques. Nevertheless, Hebb’s original proposal has remained a vital hypothesis for more than half a century and it continues to inspire much experimental and computational research aimed at understanding how the brain works. ANDERS LANSNER See also Attractor neural network; Electroencephalogram at large scales; Gestalt phenomena; Neural network models; Neurons
Further Reading
Compte, A., Brunel, N., Goldman-Rakic, P.S. & Wang, X.-J. 2000. Synaptic mechanisms and network dynamics underlying visuospatial working memory in a cortical network model. Cerebral Cortex, 10: 910–923
Fransén, E. & Lansner, A. 1998. A model of cortical associative memory based on a horizontal network of connected columns. Network: Computation in Neural Systems, 9: 235–264
Fuster, J.M. 1995. Memory in the Cerebral Cortex. Cambridge, MA: MIT Press
Hebb, D.O. 1949. The Organization of Behavior, New York: Wiley
Hopfield, J.J. 1982. Neural networks and physical systems with emergent collective computational properties, Proceedings of the National Academy of Sciences, USA, 81: 3088–3092
Lansner, A. & Fransén, E. 1992. Modeling Hebbian cell assemblies comprised of cortical neurons. Network: Computation in Neural Systems, 3: 105–119
Milner, P.M. 1957. The cell assembly: Mark II. Psychological Review, 64: 242–252
Mountcastle, V.B. 1998. Perceptual Neuroscience: The Cerebral Cortex. Cambridge, MA: Harvard University Press
Pulvermueller, F., Preissl, H., Eulitz, C., Pantev, C., Lutzenberger, W., Elbert, T. & Birbaumer, N. 1994. Brain rhythms, cell assemblies and cognition: evidence from the processing of words and pseudowords. Psycoloquy, 5(48)
Rolls, E. & Treves, A. 1997. Neural Networks and Brain Function. Oxford and New York: Oxford University Press
Scott, A.C. 2002. Neuroscience: A Mathematical Primer. Berlin and New York: Springer
CELLULAR AUTOMATA
Following a suggestion by Stanislaw Ulam, John von Neumann developed the concept of cellular automata (CA) in 1948. Von Neumann wanted to formalize a set of primitive logical operations that were sufficient to evolve the complex forms of organization necessary for life. In doing this, he constructed a two-dimensional self-replicating automaton and initiated not only the study of CA but also the idea, now popular among students of complexity theory, that highly complex global behavior can be generated from local interaction rules. Much of the interest in CA has been motivated by their ability to generate complex spatial and temporal patterns from simple rules. Because of this, they provide a rich modeling environment, having the twin virtues of mathematical tractability and representational robustness (see Burks, 1970, for discussion of much of the early work on CA). The non-obvious connection between simple local rules of interaction and emergent complex global patterns offers a possible approach to an explanation of complexity through determination of its generating local interactions. Some researchers have gone so far as to suggest that the universe itself is a CA, or CA-like object, and that sets of local generative interaction rules can replace differential equations as the standard mathematical expression for physical models. CA are spatially and temporally discrete symbolic dynamical systems defined in terms of a lattice of sites (or cells), an alphabet of symbols that can be assigned to lattice sites, and a local update rule that
determines what symbol is assigned to each site at time t + 1 on the basis of the site values in a local neighborhood of that site at time t. Given the local neighborhood structure, a neighborhood state is defined by an assignment of symbols from the alphabet to each site in the neighborhood. The local update rule can be specified in terms of a look-up table that lists the set of neighborhood states together with the symbol that each one assigns to the designated site at the next time step. For example, if the lattice is isomorphic to the integers, the alphabet is {0, 1}, and the neighborhood of any site s is the set of sites s − 1, s, s + 1, then there are 256 possible update rules, defined by the look-up table:

neighborhood state at time t:    000  001  010  011  100  101  110  111
updated symbol at time t + 1:    x_0  x_1  x_2  x_3  x_4  x_5  x_6  x_7

(The updated symbol is entered in the central cell. This defines what are called nearest-neighbor rules.) Here, each of the x_i is either 0 or 1 depending on the specific rule. Thus, they can be thought of as components of the rule. A cellular automaton can be defined in any number of dimensions. Von Neumann’s original automaton was two dimensional, as is what is perhaps the best-known CA, Conway’s Game of Life (Gardner, 1970). This is one of the simplest two-dimensional CA known to be equivalent to a universal Turing machine. If E denotes the full state space for a CA, then the local update rule defines a global mapping ψ : E → E. Much of the analytical work on CA has been directed at determining the mathematical properties of the map ψ. A fundamental paper published by Hedlund in 1969 showed that CA are just the shift-commuting endomorphisms of the shift dynamical system.
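The look-up table description translates almost directly into code. The following is a minimal sketch, not a reference implementation: it assumes that component x_i is the symbol assigned to the neighborhood whose three sites, read as a binary number, equal i, that rules are numbered by the sum of x_i · 2^i (the numbering popularized by Wolfram), and that the lattice is a finite ring with periodic boundaries rather than the full set of integers. The function names are hypothetical.

```python
def rule_table(number):
    """Return the components [x_0, ..., x_7] of a nearest-neighbor rule.

    Assumed convention: x_i is the updated symbol for the neighborhood
    whose sites (s-1, s, s+1) read as the binary number i.
    """
    return [(number >> i) & 1 for i in range(8)]


def step(cells, table):
    """Apply the local update rule once to every site of a periodic lattice."""
    n = len(cells)
    return [
        table[4 * cells[(s - 1) % n] + 2 * cells[s] + cells[(s + 1) % n]]
        for s in range(n)
    ]


def evolve(cells, table, steps):
    """Return the space-time history: the initial row plus `steps` updates."""
    history = [list(cells)]
    for _ in range(steps):
        cells = step(cells, table)
        history.append(cells)
    return history


if __name__ == "__main__":
    table = rule_table(90)  # rule 90: new symbol = left neighbor XOR right neighbor
    seed = [0] * 15 + [1] + [0] * 15
    for row in evolve(seed, table, 15):
        print("".join("#" if c else "." for c in row))
```

Because a rule is just a list of eight bits, all 256 nearest-neighbor rules can be explored by looping `number` over `range(256)`.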
It is also known that surjectivity (a function is surjective, or onto, if every state has at least one predecessor) of the map ψ is decidable only in dimension one and that for one-dimensional additive rules (i.e., those satisfying the condition ψ(µ + µ′) = ψ(µ) + ψ(µ′)), injectivity is equivalent to a certain rule-dependent complex polynomial having no roots that are nth roots of unity for any n. (A function is injective, or one to one, if every state has at most one predecessor. A function that is both surjective and injective is called reversible.) From the early 1960s until the early 1980s, much of the work on CA was either simple applications or mathematical analysis. The terminology was not settled, and work can be found under the names cellular structures, cellular spaces, homogeneous structures, iterative arrays, tessellation structures, and tessellation arrays. As computers became powerful enough to support the intense calculations required, however, an experimental mathematics approach became possible. In addition, solution of systems of differential equations by computer makes use of numerical combination rules
on a discrete lattice, and CA are the simplest examples of such rules, adding impetus to interest in their study. Concurrent with the appearance of powerful computers, work was initiated on the physics of computation, and the construction of reversible automata that, it was supposed, would be a discrete equivalent to the time-invariant differential equations of physics. More or less simultaneously, Stephen Wolfram began to publish a series of papers that popularized the study of elementary automata (see Wolfram, 1994), and by the mid-1980s, CA had emerged as a major field of interest among researchers in complex systems theory. Because of their generality as a modeling platform, CA have found wide application in many areas of science. In chemistry and physics, they have provided models of pattern formation in reaction-diffusion systems, the evolution of spiral galaxies, spin exchange systems and Ising models, fluid and chemical turbulence (especially as lattice gas automata), dendritic crystal growth, and solitons, among other applications. Spatially recurring patterns that propagate in the space-time evolution of certain CA have been likened to particles moving in physical space-time. The interactions of these “particles” are important in attempts to use CA for computational tasks (e.g., Crutchfield & Mitchell, 1995). It has also been pointed out that these particles are analogous to the defects, or coherent structures, found in pattern formation processes in condensed matter physics, and to solitons in hydrodynamics. The best-known examples of such particles are the so-called “gliders” that occur in Conway’s Game of Life. Numerous connections have also been shown between fractals and cellular automata. Rescaling the space-time output of a CA often generates a fractal; for example, the two-site rule defined by 00, 11 → 0; 01, 10 → 1 generates the well-known Sierpinski gasket (Peitgen, Jürgens & Saupe, 1992).
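The two-site rule mentioned above is easy to try out. The sketch below assumes a single nonzero seed site surrounded by quiescent (zero) sites, so that each row is one site longer than the previous one; the function names are hypothetical. With this setup, row t reproduces the binomial coefficients C(t, k) modulo 2, which is the familiar discrete picture of the Sierpinski gasket.

```python
from math import comb


def two_site_step(row):
    """Apply the rule 00, 11 -> 0; 01, 10 -> 1 to each pair of adjacent sites."""
    padded = [0] + row + [0]  # quiescent sites on both sides of the pattern
    return [padded[k] ^ padded[k + 1] for k in range(len(padded) - 1)]


def space_time(steps):
    """Space-time output starting from a single site in state 1."""
    rows = [[1]]
    for _ in range(steps):
        rows.append(two_site_step(rows[-1]))
    return rows


def render(rows):
    """Draw the rows as a centered triangle of '#' (1) and '.' (0)."""
    width = 2 * len(rows) - 1
    return "\n".join(
        " ".join("#" if c else "." for c in row).center(width) for row in rows
    )


if __name__ == "__main__":
    print(render(space_time(15)))
```

The rule is additive (it is addition modulo 2 of the two sites), which is why the pattern coincides with Pascal's triangle reduced modulo 2.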
In biology and medicine, CAs have been applied in models of heart fibrillation, developmental processes, evolution, propagation of infectious diseases, plant growth, and ecological simulations. In computation, CAs have been used as parallel computers, sorters, prime number sieves, and tools for encryption and for image processing and pattern recognition. Some automata have the capacity for universal computation, although how to implement this capacity remains problematic. Cellular automata have also been used to model social dynamics (Axtell & Epstein, 1996), the spread of forest fires, neural networks, and military combat situations. Extensive references to these applications and others can be found in Voorhees (1995) and Ilachinski (2001).
Work on CA has also stimulated work on other systems that generate complex patterns based on local rules. There are close connections to the fields of artificial life, random Boolean networks, genetic programming and evolutionary computation (Mitchell, 1996), and the general theory of computational mechanics. A web search under the key word “cellular automata” will turn up literally hundreds of sites devoted to various aspects of their study. A particularly useful program for the study of CA, Boolean networks, and other discrete iterated systems is Discrete Dynamics Lab (Wuensche & Lesser, 1992), available for downloading from http://www.ddlab.com.
BURTON H. VOORHEES
See also Artificial life; Chaotic dynamics; Emergence; Fractals; Game of life; Integrable cellular automata; Lattice gas methods; Neural network models; Solitons
Further Reading
Axtell, R. & Epstein, J.M. 1996. Growing Artificial Societies: Social Science from the Bottom Up, Cambridge, MA: MIT Press
Burks, A.W. (editor). 1970. Essays on Cellular Automata, Champaign, IL: University of Illinois Press
Crutchfield, J.P. & Mitchell, M. 1995. The evolution of emergent computation. Proceedings of the National Academy of Sciences, 92(10): 10,742–10,746
Doolen, G.D. (editor). 1991. Lattice Gas Methods: Theory, Applications, and Hardware, New York: Elsevier
Gardner, M. 1970. The fantastic combinations of John Conway’s new solitaire game of life. Scientific American, 223: 120–123
Ilachinski, A. 2001. Cellular Automata, Singapore: World Scientific
Mitchell, M. 1996. An Introduction to Genetic Algorithms, Cambridge, MA: MIT Press
Peitgen, H.-O., Jürgens, H. & Saupe, D. 1992. Chaos and Fractals: New Frontiers in Science, New York: Springer
Toffoli, T. & Margolus, N. 1987. Cellular Automata Machines: A New Environment for Modeling, Cambridge, MA: MIT Press
Voorhees, B.H. 1995. Computational Analysis of One-Dimensional Cellular Automata. Singapore: World Scientific
Wolfram, S. 1994. Cellular Automata and Complexity, Reading, MA: Addison-Wesley
Wuensche, A. & Lesser, M. 1992. The Global Dynamics of Cellular Automata, Reading, MA: Addison-Wesley
CELLULAR NONLINEAR NETWORKS
The development of cellular nonlinear networks (CNN) is embedded in the history of the electronic and computer industry, which is characterized by three revolutions: cheap computing power via microprocessors (since the 1970s), cheap bandwidth (since the end of the 1980s), and cheap sensors and MEMS (micro-electromechanical system) arrays (since the end of the 1990s). These research and technology breakthroughs led the way for several important economic enterprises such as the PC industry of the 1980s, the Internet industry of the 1990s, and the future analog computing industry, which is growing, together with optical and nanoscale implementations on the atomic and molecular level. Analog cellular computers have been the technical response to the sensors revolution, mimicking the anatomy and physiology of sensory and processing organs. The CNN was invented by Leon O. Chua and Lin Yang in Berkeley in 1988. The main idea behind the CNN paradigm is Chua’s so-called “local activity principle,” which asserts that no complex phenomena can arise in any homogeneous media without local activity. Obviously, local activity is a fundamental property in microelectronics, where, for example, vacuum tubes and, later on, transistors are locally active devices in the electronic circuits of radios, televisions, and computers. The demand for local activity in neural networks was motivated by practical technological reasons. In 1985, John Hopfield theoretically suggested a neural network, which, in principle, could overcome the failures of pattern recognition in Frank Rosenblatt’s perceptron (see Perceptron). However, its globally connected architecture was highly impractical for technical realizations in VLSI (very-large-scale-integrated) circuits of microelectronics: the number of wires in a fully connected Hopfield network grows quadratically with the number of cells in the array. A CNN only needs electrical interconnections in a prescribed sphere of influence. In general, a CNN is a nonlinear analog circuit that processes signals in real time. It is a multicomponent system of regularly spaced identical (“cloned”) units, called cells, which communicate with each other directly only through their nearest neighbors. However, the locality of direct connections also permits global information processing. Communication between units that are not directly connected (remote units) is obtained by passing through other units.
The idea that complex and global phenomena can emerge from local activities in a network dates back to John von Neumann’s first paradigm of cellular automata (CA). In this sense, the CNN paradigm is a higher development of the CA paradigm under the new conditions of information processing and chip technology. Unlike conventional cellular automata, CNN host processors accept and generate analog signals in continuous time with real numbers as interaction values. Furthermore, the CNN paradigm allows deep insights into the dynamic complexity of computational processes. While the classification of complexity by CA was more or less inspired by empirical observations of pattern formation in computer experiments, the CNN approach delivers a mathematically precise measure of dynamic complexity. The basic idea is to understand cellular automata as a special case of CNNs that can be characterized by a precise code for attractors of nonlinear dynamical systems and by a unique complexity index.
Applications
Figure 1. Standard CNN with array (a), 3 × 3 and 5 × 5 neighborhood (b,c).
Mathematical Definition
A CNN is defined by (1) a spatially discrete set of continuous nonlinear dynamical systems (cells or neurons) where information is processed into each cell via three independent variables (input, threshold, initial state) and (2) a coupling law relating relevant variables of each cell to all neighboring cells within a prescribed sphere of influence. A standard CNN architecture consists of an M × N rectangular array of cells C(i, j) with Cartesian coordinates (i, j) with i = 1, 2, ..., M and j = 1, 2, ..., N (Figure 1a). Figures 1b–c show examples of cellular spheres of influence as 3 × 3 and 5 × 5 neighborhoods. The dynamics of a cell’s state is defined by a nonlinear differential equation (CNN state equation) with scalars for state x_ij, output y_ij, input u_ij, and threshold z_ij, and coefficients, called “synaptic weights,” modeling the intensity of synaptic connections of the cell C(i, j) with the inputs (feedforward signals) and outputs (feedback signals) of the neighboring cells C(k, l). The CNN output equation connects the states of a cell with the outputs. The majority of CNN applications use space-invariant standard CNNs with a cellular neighborhood of 3 × 3 cells and no variation of synaptic weights and cellular thresholds in the cellular space. A 3 × 3 sphere of influence at each node of the grid contains nine cells: the cell in its center and its eight neighbors. In this case, the contributions of the output (feedback) and input (feedforward) weights can be reduced to two fixed 3 × 3 matrices, which are called feedback (output) cloning template A and feedforward (input) cloning template B. Thus, each CNN is uniquely defined by the two cloning templates A, B, and a threshold z, which consist of 3 × 3 + 3 × 3 + 1 = 19 real numbers. They can be ordered as a string of 19 scalars with a uniform threshold, nine feedforward, and nine feedback synaptic weights.
This string is called a “CNN gene” because it completely determines the dynamics of the CNN. Consequently, the universe of all CNN genes is called the “CNN genome.” In analogy to the human genome project, steady progress can be made by isolating and analyzing various classes of CNN genes and their influences on CNN genomes.
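The verbal definition can be made concrete with a small simulation. The sketch below assumes the standard Chua–Yang form of the state equation, dx_ij/dt = −x_ij + Σ A(k, l) y_kl + Σ B(k, l) u_kl + z, with the piecewise-linear output y = ½(|x + 1| − |x − 1|), where both sums run over the 3 × 3 sphere of influence. Everything else is an assumption of this illustration: the explicit Euler integration, the fixed boundary cells held at −1, the coding white = −1 and black = +1, the function names, and in particular the edge-detection template values, which should be checked against a published template library before being relied on.

```python
def output(x):
    """Piecewise-linear CNN output: y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def weighted_sum(template, grid, i, j, boundary):
    """Sum of template weights times grid values over the 3 x 3 neighborhood."""
    rows, cols = len(grid), len(grid[0])
    total = 0.0
    for k in range(3):
        for l in range(3):
            r, c = i + k - 1, j + l - 1
            inside = 0 <= r < rows and 0 <= c < cols
            total += template[k][l] * (grid[r][c] if inside else boundary)
    return total


def simulate(A, B, z, u, x0, dt=0.05, steps=400, boundary=-1.0):
    """Integrate the state equation with explicit Euler and return the outputs."""
    rows, cols = len(u), len(u[0])
    # For a static input image the feedforward term B*u + z is constant in time.
    drive = [[weighted_sum(B, u, i, j, boundary) + z for j in range(cols)]
             for i in range(rows)]
    x = [row[:] for row in x0]
    for _ in range(steps):
        y = [[output(v) for v in row] for row in x]
        x = [[x[i][j] + dt * (-x[i][j]
                              + weighted_sum(A, y, i, j, boundary)
                              + drive[i][j])
              for j in range(cols)] for i in range(rows)]
    return [[output(v) for v in row] for row in x]


# Assumed edge-detection gene (19 numbers): templates A, B and threshold z.
A_EDGE = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
B_EDGE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
Z_EDGE = -1.0

if __name__ == "__main__":
    image = [[1.0 if 1 <= i <= 3 and 1 <= j <= 3 else -1.0 for j in range(5)]
             for i in range(5)]
    zeros = [[0.0] * 5 for _ in range(5)]
    for row in simulate(A_EDGE, B_EDGE, Z_EDGE, image, zeros):
        print("".join("#" if v > 0 else "." for v in row))
```

With these assumed values, a black pixel stays black only if at least one of its eight neighbors is white, so the filled square in the usage example is reduced to its outline.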
In visual computing, the triple A, B, z, and its 19 real numbers can be considered as a CNN macroinstruction on how to transform an input image into an output image. Simple examples are subclasses of CNNs with practical relevance, such as the class C(A, B, z) of space-invariant CNNs with excitatory and inhibitory synaptic weights, the zero-feedback (feedforward) class C(0, B, z) of CNNs without cellular feedback, the zero-input (autonomous) class C(A, 0, z) of CNNs without cellular input, and the uncoupled class C(A0 , B, z) of CNNs without cellular coupling. In A0 , all weights are zero except for the weight of the cell in the center of the matrix. Their signal flow and system structure can be illustrated in diagrams that can easily be applied to electronic circuits as well as to typical living neurons. CNN templates are extremely useful for standards in visual computing. Simple examples are CNNs detecting edges either in binary (black-and-white) input images or in gray-scale input images. An image consists of pixels corresponding to the cells of a CNN with binary or gray scale. Logic operators can also be realized by simple CNN templates in order to combine CNN templates for visual computing. The logic NOT CNN operation inverts intensities of all binary image pixels, the foreground pixels becoming the background, and vice versa. The logic AND (logic OR, respectively) CNN operation performs a pixel-wise logic AND (logic OR operation, respectively) on corresponding elements of two binary images. These operations can be used as elements of some Boolean logic algorithms that operate in parallel on data arranged in the form of images. The simplest form of a CNN can be characterized via Boolean functions. We consider a space-invariant binary CNN belonging to the uncoupled class C(A0 , B, z) with a 3 × 3 neighborhood that maps any static 3 × 3 input pattern into a static binary 3 × 3 output pattern.
It can be uniquely defined by a Boolean function of nine binary input variables, where each variable denotes one of the nine pixels within the sphere of influence of a cell. Although there are infinitely many distinct templates of the class C(A0 , B, z), there are only a finite number of distinct combinations of 3 × 3 patterns of black and white cells, namely, 2^9 = 512. As each nine-bit binary input pattern can map to either 0 (white) or 1 (black), there are 2^512 distinct Boolean maps of nine binary variables. Thus, every binary standard CNN can be uniquely characterized by a CNN truth table, consisting of 512 rows with one for each distinct 3 × 3 black-and-white pattern, nine input columns with one for each binary input variable, and one output column with binary values of the output variable. 2^512 ≈ 1.3408 × 10^154 > 10^154 is an “immense” number (in the sense proposed by Walter Elsasser), although the uncoupled C(A0 , B, z) CNNs are only
a small subclass of all CNNs. So, the question arises as to which subclass of Boolean functions exactly characterizes the uncoupled CNNs. In their critique of the perceptron (1969), M. Minsky and S. Papert introduced the concept of linearly separable and nonseparable Boolean functions. It can be proven that the class C(A0 , B, z) of all uncoupled CNNs with binary inputs and binary outputs is identical to the linearly separable class of Boolean functions. Thus, linearly nonseparable Boolean functions such as, for example, the XOR function cannot be realized by an uncoupled CNN. But the uncoupled CNNs can be used as elementary building blocks that are connected by CNNs of logical operations. It can be proved that every Boolean function of nine variables can be realized by using uncoupled CNNs with nine inputs and either one logic OR CNN, or one logic AND CNN, in addition to one logic NOT CNN. Every uncoupled CNN C(A0 , B, z) with static binary inputs is completely stable in the sense that any solution converges to an equilibrium point. The waveform of the CNN state increases or decreases monotonically to the equilibrium point if the state at this point is positive or negative. Moreover, except for some degenerate cases, the steady-state output solution can be explicitly calculated by an algebraic formula without solving the associated nonlinear differential equations. Obviously, this is an important result to characterize a CNN class of nonlinear dynamics with robust CNN templates. Completely stable CNNs are the workhorses of most current CNN applications. But there are also even simpler CNNs with oscillatory or chaotic behavior. Future applications will exploit the immense potentials of the unexplored terrains of oscillatory and chaotic operating regions. Then, Cellular Neural Networks will actually be transformed to Cellular Nonlinear Networks with all kinds of phase transitions and attractors of nonlinear dynamics.
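The linear-separability statement in this section can be checked by brute force on a smaller case. The sketch below is an illustration under stated assumptions: an uncoupled binary cell is idealized as a threshold unit y = sgn(b · u + z) with inputs coded as −1 (white) and +1 (black), two inputs are used instead of nine, and the small grid of weights and thresholds is an arbitrary choice that happens to reach every separable function of two variables. All names are hypothetical.

```python
from itertools import product

INPUTS = list(product((-1, 1), repeat=2))  # the four input patterns (u1, u2)


def threshold_unit(b1, b2, z):
    """Truth table of y = sgn(b1*u1 + b2*u2 + z) over all four input patterns."""
    return tuple(1 if b1 * u1 + b2 * u2 + z > 0 else -1 for u1, u2 in INPUTS)


def separable_functions():
    """Collect every truth table reachable on a small grid of weights."""
    weights = (-1, 0, 1)
    thresholds = (-1.5, -0.5, 0.5, 1.5)  # half-integers avoid ties at zero
    return {threshold_unit(b1, b2, z)
            for b1, b2, z in product(weights, weights, thresholds)}


XOR = tuple(1 if u1 != u2 else -1 for u1, u2 in INPUTS)


def xor_from_units(u1, u2):
    """XOR assembled from two threshold units combined by a logic OR unit."""
    only_u1 = 1 if u1 - u2 - 1.5 > 0 else -1       # u1 AND NOT u2
    only_u2 = 1 if -u1 + u2 - 1.5 > 0 else -1      # NOT u1 AND u2
    return 1 if only_u1 + only_u2 + 1.5 > 0 else -1  # logic OR of the two


if __name__ == "__main__":
    found = separable_functions()
    print(len(found), "of 16 two-input Boolean functions are reachable")
    print("XOR reachable by a single unit:", XOR in found)
    print("XOR via combined units:", [xor_from_units(*u) for u in INPUTS])
```

For two inputs, 14 of the 16 Boolean functions are linearly separable; XOR and its complement are the two exceptions, and XOR becomes realizable once separable units are combined through a logic operation, mirroring the construction described for nine inputs.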
Complexity Paradigm
From the perspective of nonlinear dynamics, it is convenient to think of standard CNN state equations as a set of ordinary differential equations with the components of the CNN gene as bifurcation parameters. Then, the dynamical behavior of standard CNNs can be studied in detail. Numerical examples deliver CNNs with limit cycles and chaotic attractors. The emergence of complex structures in nature can be explained by the nonlinear dynamics and attractors of complex systems. They result from the collective behavior of interacting elements in a complex system. The different paradigms of complexity research promise to explain pattern formation and pattern recognition in nature by specific mechanisms (e.g., Prigogine’s chemical dissipation, Haken’s work on lasers). From the CNN point of view, it is convenient to
study the subclass of autonomous CNNs where the cells have no inputs. In these systems, it can be explained how patterns can arise, evolve, and sometimes converge to an equilibrium by reaction-diffusion processes. Pattern formation starts with an initial uniform pattern in an unstable equilibrium that is perturbed by small, random displacements. Thus, in the initial state, the symmetry of the unstable equilibrium is disturbed, leading to rather complex patterns. Obviously, in these applications, cellular networks do not refer only to neural activities in nervous systems, but to pattern formation in general. A CNN is defined by the state equations of isolated cells and the cell coupling laws. For simulating reaction-diffusion processes, the coupling law describes a discrete version of diffusion (with a discrete Laplacian operator). CNN state equations and CNN coupling laws can be combined in a CNN reaction-diffusion equation, determining the dynamics of autonomous CNNs. If we replace their discrete functions and operators by their limiting continuum version, then we obtain the well-known continuous partial differential equations of reaction-diffusion processes that have been studied in different complexity approaches. Chua’s version of the CNN reaction-diffusion equation delivers computer simulations of these pattern formations in chemistry and biology (e.g., concentric waves, autowaves, and spiral waves). On the other hand, for any nonlinear partial differential equation, many appropriate CNN equations can be associated with it. In many cases, it is sufficient to study the computer simulations of associated CNN equations, in order to understand the nonlinear dynamics of these complex systems.
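The discrete diffusion coupling described in this section can be sketched in a few lines. This is an illustration only: the five-point Laplacian, the periodic boundary, the explicit Euler step, and the function names are assumptions, and the cubic reaction term is a generic bistable example rather than a cell model taken from the CNN literature.

```python
import random


def laplacian(x):
    """Five-point discrete Laplacian on a grid with periodic boundaries."""
    n, m = len(x), len(x[0])
    return [[x[(i - 1) % n][j] + x[(i + 1) % n][j]
             + x[i][(j - 1) % m] + x[i][(j + 1) % m] - 4.0 * x[i][j]
             for j in range(m)] for i in range(n)]


def reaction_diffusion_step(x, reaction, diffusion, dt):
    """One Euler step of dx/dt = reaction(x) + diffusion * Laplacian(x)."""
    lap = laplacian(x)
    return [[x[i][j] + dt * (reaction(x[i][j]) + diffusion * lap[i][j])
             for j in range(len(x[0]))] for i in range(len(x))]


def bistable(v):
    """Illustrative cell dynamics: v = 0 is unstable, v = -1 and +1 are stable."""
    return v - v ** 3


def perturbed_uniform_state(rows, cols, amplitude=0.01, seed=0):
    """Uniform unstable equilibrium plus small random displacements."""
    rng = random.Random(seed)
    return [[rng.uniform(-amplitude, amplitude) for _ in range(cols)]
            for _ in range(rows)]


if __name__ == "__main__":
    state = perturbed_uniform_state(16, 32)
    for _ in range(200):
        state = reaction_diffusion_step(state, bistable, diffusion=0.2, dt=0.1)
    for row in state:
        print("".join("#" if v > 0 else "." for v in row))
```

Starting from the slightly perturbed uniform state, the cells split into domains near −1 and +1, a simple instance of symmetry breaking from an unstable equilibrium.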
CNN Universal Machine and Programming
There are practical and theoretical reasons for introducing a CNN Universal Machine (CNN-UM). From an engineering point of view, it is totally impractical to implement different CNN components or templates with different hardwired CNNs. Historically, John von Neumann’s general-purpose computer was inspired by Alan Turing’s universal machine in order to replace the many different special-purpose hardware machines that the 1930s and 1940s had needed for different applications. From a theoretical point of view, CNN-UM opens new avenues of analog neural computers. In the CNN-UM, analog (continuous) and logic operations are mixed and embedded in an array computer. It is a complex nonlinear system, which combines two different types of operations, namely continuous nonlinear array dynamics in continuous time and local and global logic. Obviously, the mixture of analog and digital components closely resembles neural information processing in living organisms. The stored program, as a sequence of templates, could
be considered as a genetic code for the CNN-UM. The elementary genes are the templates. After the introduction of the architecture with standard CNN universal cells and the global analog programming unit (GAPU), the complete sequence of an analog CNN program can be executed on a CNN-UM. The description of such a program contains the global task, the flow diagram of the algorithm, the description of the algorithm in a high-level α (analog) programming language, and the sequence of macroinstructions generated by a compiler in the form of an analog macro code (AMC). At the lowest level, the chips are embedded in their physical environment of circuits. The AMC will be translated into hardware circuits and electrical signals. At the highest level, the α compiler generates the macro-level code, the AMC. The input of the α compiler is the description of the flow diagram of the algorithm using the α language. In Figure 2, the levels of the software and the core engines are described. The analog macro code is used for software simulations running on a Pentium chip in a PC and for applications in a CNN-UM Chip with a CNN Chip Prototyping System (CCPS). The CNN-UM is technically realized by analog and digital VLSI implementation. It is well known that any complex system of digital technology can be built from a few implemented building blocks by wiring and programming. In the same way, the CNN-UM, also containing analog building blocks, can be constructed. A core cell needs only three building blocks: a capacitor, a resistor, and a VCCS (voltage-controlled current source). If a switch, a logic register, and a logic gate are added to the three building blocks, the extended CNN cell of the CNN-UM can be implemented. In principle, six building blocks plus wiring are sufficient to build the CNN-UM: resistor,
capacitor, switch, VCCS, logic register, logic gate. As in a digital computer, stored programmability can also be introduced for analog neural computers, enabling the fabrication of visual microprocessors. Similar to classical microprocessors, stored programmability needs a complex computational infrastructure with high-level language, compiler, macro-code, interpreter, operating system, and physical code, in order to make it understandable for the human user. Using this computational infrastructure, a visual microprocessor can be programmed by downloading the programs onto the chips, as in the case of classical digital microprocessors. Writing a program for an analog CNN algorithm is as easy as writing a BASIC program. With respect to computing power, CNN computers offer an orders-of-magnitude speed advantage over conventional technology when the task is complex. There are also advantages in size, complexity, and power consumption. A complete CNN-UM on a chip consists of an array of 64 × 64 cell processors in 0.5 µm CMOS technology. Each cell is endowed not only with a sensor for direct optical input of images and video but also with communication and control circuitries, as well as local analog and logic memories. CNN cells are interfaced with their nearest neighbors, as well as with the outside world. A CNN chip with 4096 cell processors on a chip means more than 3.0 Tera-OPS (operations per second) equivalent of computing power, which is about 1000 times the computing power of an advanced Pentium processor. By exploiting the state-of-the-art vertical packaging technologies, close to 10^15 OPS CNN-UM architectures can be constructed on chips with 200 × 200 arrays.
Thus, CNN universal chips will realize Tera-OPS or even Peta-OPS (10^15 OPS), which are required for high-speed target recognition and tracking, real-time visual inspection of manufacturing processes, and intelligent machine vision capable of recognizing context-sensitive and moving scenes.
KLAUS MAINZER
See also Attractor neural network; Cellular automata; Integrable cellular automata
Further Reading
Figure 2. Levels of the software and the core engines in the CNN-UM.
Chua, L.O. 1998. A Paradigm for Complexity, Singapore: World Scientific
Chua, L.O., Gulak, G., Pierzchala, E. & Rodriguez-Vázquez (editors). 1998. Cellular neural networks and analog VLSI. Analog Integrated Circuits and Signal Processing. An International Journal, 15(3)
Chua, L.O. & Roska, T. 2002. Cellular Neural Networks and Visual Computing: Foundations and Applications, Cambridge and New York: Cambridge University Press
Chua, L.O., Sbitnev, V.I. & Yoon, S. 2003. A nonlinear dynamics perspective of Wolfram’s new kind of science. Part II: universal neuron. International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, 13: 2377–2491
Chua, L.O. & Yang, L. 1988. Cellular neural networks: theory. IEEE Transactions on Circuits and Systems, 35: 1257–1272; Cellular neural networks: applications. IEEE Transactions on Circuits and Systems, 35: 1273–1290
Chua, L.O., Yoon, S. & Dogaru, R. 2002. A nonlinear dynamics perspective of Wolfram’s new kind of science. Part I: threshold of complexity. International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, 12: 2655–2766
Elsasser, W.M. 1987. Reflections on a Theory of Organisms: Holism in Biology, Frelighsburg, Quebec: Editions Orbis
Huertas, J.L., Chen, W.K. & Madan, R.N. (editors). 1999. Visions of Nonlinear Science in the 21st Century. Festschrift Dedicated to Leon O. Chua on the Occasion of his 60th Birthday, Singapore: World Scientific
Mainzer, K. 2003. Thinking in Complexity: The Computational Dynamics of Matter, Mind, and Mankind, 4th edition, Berlin and New York: Springer
Tetzlaff, R. (editor). 2002. Cellular Neural Networks and Their Applications. Proceedings of the 7th IEEE International CNN Workshop, Singapore: World Scientific
CENTER MANIFOLD REDUCTION
A dynamical system might be difficult to solve, even numerically. To better understand its behavior in the neighborhood of an equilibrium, a reduction can be performed. For this, one starts with the eigenvalue spectrum $\lambda$ of the linearized system. In the linear evolution $\sim \exp(\lambda t)$, eigenvalues with $\operatorname{Re}\lambda < 0$ ($\operatorname{Re}\lambda > 0$) are called stable (unstable), whereas those with $\operatorname{Re}\lambda = 0$ are called central. In the neighborhood of an equilibrium point, $P$, of a dynamical system, in general, three different types of invariant manifolds exist: the trajectories belonging to the stable manifold $M^s$ are attracted by $P$, whereas those of the unstable manifold $M^u$ are repelled. The dynamics on the center manifold $M^c$ depend on the nonlinearities. For the linearized problem, $E^s \equiv M^s$, $E^u \equiv M^u$, and $E^c \equiv M^c$ are uniquely determined linear subspaces that span the whole space. The transition to the nonlinear system causes (only) deformations of the linearly determined manifolds $M^s$, $M^u$, and $M^c$. However, the form of the latter critically depends on the nonlinear terms. Let us elucidate that behavior for the very simple system (Grosche et al., 1995)

$$\dot{x} = -xy, \qquad \dot{y} = -y + x^2. \tag{1}$$

Here, the dot means differentiation with respect to time $t$. The equilibrium point $P$ is $(0, 0)$, and the linearized system is

$$\dot{x} = 0, \qquad \dot{y} = -y. \tag{2}$$

Here, the stable manifold $E^s$ is identical to the $y$-axis and the center manifold $E^c$ is identical to the $x$-axis. The linearized problem can be visualized by the graph shown in Figure 1.

Figure 1. The linear stable ($E^s$) and center ($E^c$) manifolds for example (1).

Both manifolds will be deformed in the transition to the nonlinear system. The (nonlinear, perturbed) center manifold can be described in the present example by

$$M^c: \quad y = h(x) \tag{3}$$

with $h(0) = h'(0) = 0$. Using that ansatz for the center manifold in (1), we obtain

$$\dot{x} = -x\,h(x). \tag{4}$$

Differentiating (3) with respect to $t$ leads to

$$\dot{y} = h'(x)\,\dot{x} \tag{5}$$

or

$$-h(x) + x^2 = h'(x)\,[-x\,h(x)], \tag{6}$$

that is, a differential equation for $h = h(x)$. Performing a power series ansatz $h(x) = cx^2 + dx^3 + \cdots$, we find that $c = 1$. The (nonlinear) center manifold is thus given by

$$y = x^2 + \cdots, \tag{7}$$

and the dynamics on it follows, to leading order, from

$$\dot{x} = -x^3; \tag{8}$$
that is, the trajectories are being attracted by P . For an illustration, see Figure 2.
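The power series for $h$ in example (1) can be generated order by order. Rearranging the invariance condition (6) as $h = x^2 + x\,h\,h'$ and matching powers of $x$ gives a recursion in which each coefficient depends only on lower-order ones. The following is a minimal sketch of that bookkeeping; the function name and the dictionary layout are illustrative choices, not part of the original treatment.

```python
def center_manifold_coefficients(max_order):
    """Coefficients c_k of h(x) = sum_k c_k x^k for example (1).

    Condition (6), rearranged as h = x^2 + x h h', gives
        c_k = [k == 2] + sum over i + j = k (i, j >= 2) of j * c_i * c_j,
    so every c_k is fixed by coefficients of lower order.
    """
    c = {}
    for k in range(2, max_order + 1):
        total = 1 if k == 2 else 0
        for i in range(2, k - 1):      # i = 2, ..., k - 2, hence j = k - i >= 2
            j = k - i
            total += j * c[i] * c[j]
        c[k] = total
    return c


coefficients = center_manifold_coefficients(6)
print(coefficients)  # {2: 1, 3: 0, 4: 2, 5: 0, 6: 12}
```

With these values the reduced equation (4) reads $\dot{x} = -x^3 - 2x^5 - 12x^7 - \cdots$, whose leading term is the cubic decay in (8). The series should be read as an asymptotic expansion near the origin; nothing is claimed here about its convergence.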
Figure 2. The stable ($M^s$) and center ($M^c$) manifolds for example (1).

More General Theoretical Background
Let us now generalize the idea and consider a system of ordinary differential equations (ODEs)
$$\dot{a} = Aa + N(a, b), \tag{9}$$
$$\dot{b} = Bb + M(a, b), \tag{10}$$
describing the dynamics of the amplitudes $a_1, \ldots, a_n$ and $b_1, \ldots, b_m$ of $n$ linearly marginally stable modes and $m$ linearly stable modes, respectively [$(a_1, \ldots, a_n) := a$, $(b_1, \ldots, b_m) := b$]. This implies that the real parts of the eigenvalues of the matrix $A$ vanish, and the real parts of the eigenvalues of the matrix $B$ are negative. The functions $N(a, b), M(a, b) \in C^r$ on the right-hand sides of Equations (9) and (10) represent the nonlinear terms. Let $E^c$ be the $n$-dimensional (generalized) eigenspace of $A$ and $E^s$ be the $m$-dimensional (generalized) eigenspace of $B$. Under these assumptions, the center manifold theorem provides the following statement (Guckenheimer & Holmes, 1983): There exists an invariant $C^r$ manifold $M^s$ and an invariant $C^{r-1}$ manifold $M^c$ that are tangent at $(a, b) = (0, 0)$ to the eigenspaces $E^s$ and $E^c$, respectively. The stable manifold $M^s$ is unique, but the center manifold $M^c$ is not necessarily unique.
Locally, the center manifold $M^c$ can be represented as a graph,
$$M^c = \{(a, b) \mid b = h(a)\}, \qquad h(0) = 0, \quad Dh(0) = 0, \tag{11}$$
where the $C^{r-1}$ function $h$ is defined in a neighborhood of the origin, and $Dh$ denotes the Jacobi matrix. Introducing (11) in Equations (9) and (10), we obtain
$$\dot{a} = Aa + N(a, h(a)), \tag{12}$$
$$Dh(a)\,[Aa + N(a, h(a))] = Bh(a) + M(a, h(a)). \tag{13}$$
The solution $h$ of Equation (13) can be approximated by a power series. The ambiguity of the center manifold is manifested by the fact that $h$ is determined only up to a $C^\infty$ but non-analytic function; thus, the power series approximation of the function $h$ is unique. The importance of the center manifold theory is reflected by the following theorem (Marsden & McCracken, 1976; Carr, 1981): If there exists a neighborhood $U^c$ of $(a, b) = (0, 0)$ on $M^c$, so that every trajectory starting in $U^c$ never leaves it, then there exists a neighborhood $U$ of $(a, b) = (0, 0)$ in $\mathbb{R}^n \times \mathbb{R}^m$, so that every trajectory starting in $U$ converges to a trajectory on the center manifold.
Therefore, it is sufficient to discuss the dynamics on the center manifold, described by Equation (12). If all solutions are confined to some neighborhood of the origin, then we have described all features of the asymptotic behavior of Equations (9) and (10). In order to fulfill the condition, the function $N(a, h(a))$ has to be expanded up to a sufficiently high order. We end up with normal forms; for example, the third order may be adequate. Very often, the problems contain parameters and, in addition, the systems may be infinite dimensional. In both cases, one can generalize the theory presented so far. Parameters can be taken into account by extending Equations (9) and (10) to
$$\dot{a} = A(\Lambda)\,a + N(a, b, \Lambda), \tag{14}$$
$$\dot{b} = B(\Lambda)\,b + M(a, b, \Lambda), \tag{15}$$
$$\dot{\Lambda} = 0, \tag{16}$$
where $\Lambda = (a_{n+1}, \ldots, a_{n+l})$ contains $l$ parameters. The center manifold now has dimension $n + l$.
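As a hedged illustration of the parameter extension, consider a hypothetical variant of example (1) with one scalar parameter, written `eps` here: $\dot{x} = \text{eps}\cdot x - xy$, $\dot{y} = -y + x^2$. This system is not taken from the text. Treating `eps` as an additional center variable with zero time derivative, the lowest-order center manifold is again $y = x^2 + \cdots$, and the reduced equation $\dot{x} = \text{eps}\cdot x - x^3 + \cdots$ predicts a pitchfork bifurcation with equilibria near $x = \pm\sqrt{\text{eps}}$. The sketch below integrates the full two-dimensional system with a fixed-step Runge-Kutta scheme (step size and initial point are arbitrary choices) and compares the long-time state with that prediction.

```python
import math


def rhs(x, y, eps):
    # Hypothetical parameter-dependent variant of example (1):
    #   x' = eps*x - x*y,   y' = -y + x^2
    return eps * x - x * y, -y + x * x


def rk4_step(x, y, eps, dt):
    """One classical fourth-order Runge-Kutta step for the full system."""
    k1x, k1y = rhs(x, y, eps)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, eps)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, eps)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, eps)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def integrate(x, y, eps, dt=0.02, steps=20000):
    """Integrate to t = steps * dt and return the final state."""
    for _ in range(steps):
        x, y = rk4_step(x, y, eps, dt)
    return x, y


eps = 0.04
x_end, y_end = integrate(0.05, 0.0, eps)
print("full system:     ", x_end, y_end)
print("reduced estimate:", math.sqrt(eps), eps)
```

For this particular variant the nonzero equilibria of the full system are exactly $(\pm\sqrt{\text{eps}}, \text{eps})$, so the reduced equation and the full system agree on their location; in general one would expect agreement only to the order retained in $h$.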
PDE Reduction and Symmetry Considerations
The theory is also valid in the infinite-dimensional case, if the spectrum of the linear operator can be split into two parts. The first part contains a finite number of eigenvalues whose real parts are zero, and the second part contains (an infinite number of) eigenvalues with negative real parts that are bounded away from zero. To elucidate the power of center manifold reduction, let us consider the partial differential equation (PDE)
$$\frac{\partial \phi}{\partial t} + \phi\,\frac{\partial \phi}{\partial y} + \alpha\,\frac{\partial^2 \phi}{\partial y^2} + \beta\,\frac{\partial^3 \phi}{\partial y^3} + \frac{\partial^4 \phi}{\partial y^4} + \nu\phi = 0. \tag{17}$$
All coefficients $\alpha$, $\beta$, and $\nu$ are nonnegative. In the following, we treat $\beta$ as a fixed parameter, and consider the dynamics as a function of $\alpha$ and $\nu$. The linearization with $\phi \equiv 0$ as the equilibrium solution leads, for perturbations proportional to $\exp(iky - i\omega t)$, to
$$\omega = -\beta k^3 + i\left(-k^4 + \alpha k^2 - \nu\right) \tag{18}$$
when we assume a unit cell of length $2\pi$ with periodic boundary conditions. A typical dependence of the linear growth (or damping) rate $\gamma := \operatorname{Im}\omega$ is shown in Figure 3 for $\alpha = 5.25$ and $\nu = 3.8$. The case of two unstable modes ($k = 1, 2$) is already highly nontrivial. Let us choose $\alpha = \alpha_c = 5$ and $\nu = \nu_c = 4$. Then, the modes $\phi^{(1)} = \sin y$, $\phi^{(2)} = \cos y$, $\phi^{(3)} = \sin 2y$, and $\phi^{(4)} = \cos 2y$ belonging to $k = 1$ and $2$, respectively, are marginally stable. We introduce the four (real) amplitudes $a_1$, $a_2$, $a_3$, and $a_4$, as well as $a_5 = \alpha - \alpha_c$ and $a_6 = \nu - \nu_c$. The center manifold theory will allow us to derive a closed set of nonlinear amplitude equations
$$\dot{a}_n = f_n(a_1, \ldots, a_6), \qquad n = 1, \ldots, 6, \tag{19}$$
which are valid in the neighborhood of the critical point $\alpha_c$, $\nu_c$. One has $f_5 \equiv f_6 \equiv 0$. The other functions $f_n$ are written as a power series in the $a_n$,
$$f_n = \sum_{1 \le m \le 6} A_n^{m}\, a_m + \sum_{1 \le m \le p \le 6} A_n^{mp}\, a_m a_p + \cdots. \tag{20}$$

Figure 3. Growth rate curve, with two unstable modes at $k = 1$ and $k = 2$ for the PDE (17).

The dynamics on the center manifold is characterized by $a_1, \ldots, a_6$. Thus, we can make the ansatz (Carr, 1981)
$$\phi(y, t) = \sum_{1 \le n \le 4} a_n(t)\,\phi^{(n)}(y) + \sum_{1 \le n \le m \le 6} a_n(t)\,a_m(t)\,\phi^{(nm)}(y) + \cdots, \tag{21}$$
where the $\binom{7}{2} = 21$ new functions $\phi^{(nm)}$ and, of course, the next $\binom{8}{3} = 56$ functions $\phi^{(nmp)}$, and so on, can be chosen orthogonal to $\phi^{(n)}$, $n = 1, \ldots, 4$. The technical procedure is now as follows. One inserts ansatz (21) into the basic equation (17) and compares equal orders in the amplitudes. For example, in second order, one collects equal powers $a_r a_s$; the “coefficients” (being equated to zero) will determine the unknown functions $\phi^{(nm)}$ via ODEs. Taking into account the (periodic) boundary conditions, we have to satisfy the solvability conditions. Collecting equal powers of the amplitudes $a_n$, we find the solutions for the coefficients $A_r^{np}$. With these values, we can solve for $\phi^{(mn)}$. This procedure should be continued to higher orders. Actually, when written explicitly, one faces considerable work (in second order, we have to solve for 84, and in third order for 224 coefficients $A_r^{\cdots}$, and so on). One can simplify the calculations by making use of symmetries.
Translational invariance implies the following. If $\phi(y)$ is a solution,
$$T_{y_0}\phi(y) := \phi(y + y_0) \tag{22}$$
will also satisfy the dynamical equation (17), where $y_0$ is a real shift parameter. (In the case $\beta \equiv 0$, we also have the mirror symmetry $\phi(y) = \phi(-y)$.) Remember the structure of the center manifold reduction: the modes $\phi^{(nm)}, \phi^{(nmp)}, \ldots$ have to be determined from inhomogeneous differential equations. The inhomogeneities contain (in nonlinear forms) the marginal modes $\phi^{(r)}$, $r = 1, \ldots, 4$. Thus, the so-called slaved modes can be written in symbolic form as
$$\phi^{(m\ldots)} = h^{(m\ldots)}\left[\{\phi^{(r)}\}\right]. \tag{23}$$
Thus, the following symmetry should hold:
$$T_{y_0}\, h^{(m\ldots)}\left[\{\phi^{(r)}\}\right] = h^{(m\ldots)}\left[T_{y_0}\{\phi^{(r)}\}\right]. \tag{24}$$
The consequences of the translational symmetry are most easily seen when combining the marginal modes to
$$\varphi := \sum_{r=1}^{4} a_r\,\phi^{(r)} \equiv \operatorname{Im}\left(c_1 e^{iy} + c_2 e^{2iy}\right) \tag{25}$$
with the complex amplitudes $c_1 := a_1 + i a_2$, $c_2 := a_3 + i a_4$. The (complex) amplitude equations are
$$\dot{c}_n = g_n(c_1, c_2, a_5, a_6), \qquad n = 1, 2, \tag{26}$$
$$\dot{a}_m = 0, \qquad m = 5, 6. \tag{27}$$
The translational symmetry (22) requires
$$e^{i n y_0}\, g_n(c_1, c_2, a_5, a_6) = g_n\left(e^{i y_0} c_1, e^{i 2 y_0} c_2, a_5, a_6\right) \tag{28}$$
for $n = 1, 2$. The vector field $(g_1, g_2)$ is called equivariant with respect to the operation
$$(c_1, c_2) \to \left(e^{i y_0} c_1, e^{i 2 y_0} c_2\right). \tag{29}$$
The most general form of vector fields being equivariant under operation (29) is
$$(g_1, g_2) = \left(c_1 P_1 + \bar{c}_1 c_2 Q_1,\; c_2 P_2 + c_1^2 Q_2\right), \tag{30}$$
where $P_1$, $P_2$, $Q_1$, and $Q_2$ are polynomials in $|c_1|^2$, $|c_2|^2$, and $\operatorname{Re}(c_1^2 \bar{c}_2)$; of course, they can also depend on $a_5$ and $a_6$. Keeping in mind the symmetry properties, the general form of the amplitude equations reduces to
$$\dot{c}_1 = \lambda c_1 + A\,\bar{c}_1 c_2 + C\,c_1 |c_1|^2 + E\,c_1 |c_2|^2 + O(|c|^4), \tag{31}$$
$$\dot{c}_2 = \mu c_2 + B\,c_1^2 + D\,c_2 |c_1|^2 + F\,c_2 |c_2|^2 + O(|c|^4), \tag{32}$$
$$\dot{a}_5 = \dot{a}_6 = 0. \tag{33}$$
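Two of the statements above lend themselves to a quick numerical check: the mode count implied by the growth rate in (18), and the equivariance condition (28) for the truncated equations (31) and (32). The sketch below is only illustrative. The coefficient values passed to `g` are arbitrary placeholders chosen for the test, not the values derived for PDE (17), and all function names are assumptions of this example.

```python
import cmath


def growth_rate(k, alpha, nu):
    """Linear growth rate gamma = Im(omega) from the dispersion relation (18)."""
    return -k**4 + alpha * k**2 - nu


def unstable_modes(alpha, nu, k_max=6):
    """Integer wavenumbers 1..k_max with positive linear growth rate."""
    return [k for k in range(1, k_max + 1) if growth_rate(k, alpha, nu) > 0]


def g(c1, c2, lam, mu, A, B, C, D, E, F):
    """Right-hand sides of (31) and (32), truncated after the cubic terms."""
    g1 = lam * c1 + A * c1.conjugate() * c2 + C * c1 * abs(c1)**2 + E * c1 * abs(c2)**2
    g2 = mu * c2 + B * c1**2 + D * c2 * abs(c1)**2 + F * c2 * abs(c2)**2
    return g1, g2


def equivariance_defect(c1, c2, y0, coeffs):
    """Largest violation of condition (28) for n = 1, 2 at one sample point."""
    r1, r2 = cmath.exp(1j * y0), cmath.exp(2j * y0)
    g1, g2 = g(c1, c2, *coeffs)
    h1, h2 = g(r1 * c1, r2 * c2, *coeffs)
    return max(abs(r1 * g1 - h1), abs(r2 * g2 - h2))


# Arbitrary placeholder coefficients (lam, mu, A, B, C, D, E, F), for testing only.
coeffs = (0.1 + 0.3j, -0.2 + 1.0j, 0.5, -0.5, 0.0,
          -0.04 + 0.01j, -0.02 + 0.005j, -0.005 + 0.001j)

print(unstable_modes(5.25, 3.8))                   # parameters of Figure 3
print(growth_rate(1, 5, 4), growth_rate(2, 5, 4))  # critical point (alpha_c, nu_c)
print(equivariance_defect(0.3 + 0.1j, -0.2 + 0.4j, 0.7, coeffs))
```

Because each term in (31) and (32) picks up exactly the phase factor required by (28), the defect should vanish up to rounding for any choice of coefficients; a term such as $c_2^2$ in the first equation would break this. Near the critical point, the growth rates evaluate to $\gamma(1) = a_5 - a_6$ and $\gamma(2) = 4a_5 - a_6$, which is what one expects for the real parts of the linear coefficients in (31) and (32).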
A straightforward analysis leads to
$$\lambda = a_5 - a_6 + i\beta, \quad \mu = 4a_5 - a_6 + i8\beta, \quad A = \tfrac{1}{2}, \quad B = -\tfrac{1}{2}, \quad C = 0, \quad D = -\frac{3}{4(20 - i9\beta)}, \quad E = \tfrac{1}{2}D, \quad F = -\frac{1}{12(15 - i4\beta)}. \tag{34}$$
This completes the center manifold reduction. Very interesting conclusions result, for example, with respect to the number of modes and their interplay in time, from the systematic treatment with the center manifold theory. For example, one interesting aspect is that the present codimension-two analysis can describe successive bifurcations of one unstable mode, which, in some cases, can lead to chaos in time.
KARL SPATSCHEK
See also Inertial manifolds; Invariant manifolds and sets; Synergetics
Further Reading
Carr, J. 1981. Applications of Center Manifold Theory, New York: Springer
Grosche, G., Ziegler, V., Ziegler, D. & Zeidler, E. (editors). 1995. Teubner-Taschenbuch der Mathematik, Teil II, Stuttgart: Teubner
Guckenheimer, J. & Holmes, P. 1983. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, Berlin and New York: Springer
Marsden, J.E. & McCracken, M. 1976. The Hopf Bifurcation and Its Applications, New York: Springer

CENTRAL LIMIT THEOREM
See Martingales

CHAOS VS. TURBULENCE
The notion of chaos has its genesis in the work of Henri Poincaré (See Poincaré theorems) on the three-body problem of celestial mechanics. Poincaré realized that this problem cannot be reduced to quadratures and solved in the manner of the two-body problem. A precise definition of chaos or non-integrability can be given in terms of the absence of conserved quantities necessary to yield a solution. It took several decades for the full significance of non-integrable dynamical systems to be appreciated and for the term “chaos” to be introduced (See Chaotic dynamics). An important step was the 1963 paper by Edward N. Lorenz, entitled “Deterministic Nonperiodic Flow” (Lorenz, 1963), on a model describing thermal convection in a layer of fluid heated from below. The Lorenz model truncates the basic fluid dynamical equations, written in terms of Fourier amplitudes, to just three modes (See Lorenz equations):
$$\dot{x} = -\sigma x + \sigma y, \qquad \dot{y} = -xz + rx - y, \qquad \dot{z} = xy - bz. \tag{1}$$
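System (1) is straightforward to explore numerically. The sketch below integrates it with a fixed-step fourth-order Runge-Kutta scheme and records the distance between two trajectories whose initial conditions differ by $10^{-9}$ in $x$. The parameter values $\sigma = 10$, $r = 28$, $b = 8/3$ are the ones commonly associated with Lorenz's study and are an assumption here, since the text does not fix them; the step size, initial point, and function names are likewise illustrative.

```python
def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of system (1); the default parameters are assumptions."""
    x, y, z = state
    return (-sigma * x + sigma * y, -x * z + r * x - y, x * y - b * z)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def advance(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = lorenz(state)
    k2 = lorenz(advance(k1, 0.5 * dt))
    k3 = lorenz(advance(k2, 0.5 * dt))
    k4 = lorenz(advance(k3, dt))
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def distance(p, q):
    """Euclidean distance between two points in phase space."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) ** 0.5


def separation_history(delta0=1e-9, dt=0.005, t_end=40.0, sample_every=1000):
    """Distance between two initially close trajectories, sampled in time."""
    p = (1.0, 1.0, 1.0)
    q = (1.0 + delta0, 1.0, 1.0)
    history = [(0.0, distance(p, q))]
    for n in range(1, int(round(t_end / dt)) + 1):
        p, q = rk4_step(p, dt), rk4_step(q, dt)
        if n % sample_every == 0:
            history.append((n * dt, distance(p, q)))
    return history


history = separation_history()
for t, sep in history:
    print(f"t = {t:5.1f}   separation = {sep:.3e}")
```

With these assumed parameters the printed separation should grow by many orders of magnitude before levelling off at roughly the size of the attractor, while each individual trajectory stays bounded.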
In this system, x is the time-dependent amplitude of a stream-function mode, while y and z are mode amplitudes of the temperature field. The parameters σ , r, and b depend on the geometry, the boundary conditions, and the physical parameters of the fluid. Equations (1) are a subset of the full, infinite system of mode amplitude equations, chosen such that it exactly captures the initial instability of the thermally conducting state to convecting rolls when the parameter r, known as the Rayleigh number, is increased. What Lorenz observed in numerical solutions of (1), and verified by analysis, was that very complicated, erratic solutions would arise when r was increased well beyond the conduction-to-convection transition. In fact, Lorenz had found the first example of what is today called a strange attractor (See Figure 1 and Attractors). System (1) is clearly deterministic, yet it can produce non-periodic solutions. There were other intriguing aspects of the solutions to (1) in the chaotic regime. Solutions arising from close initial conditions would separate exponentially in time, leading to an apparently random dependence on initial conditions of the solution after a finite time (See Butterfly effect). Today, this would be associated with the existence of a positive characteristic Lyapunov exponent. A list of “symptoms” can be established that are shared by systems having the property of chaos, including: complex temporal evolution, exponential separation from close initial conditions, a strange attractor in phase space (if the system is dissipative), and positive Lyapunov exponents. An important difference from Poincaré’s work was that Lorenz’s system described a dissipative system in which energy is not conserved. From the start, the potential connection between chaos and other concepts in statistical physics, such as ergodicity and turbulence, was of central interest. For example, chaos was thought to imply ergodic
Figure 1. Strange attractor associated with the Lorenz equations. Reproduced with permission from Images by Paul Bourke, http://astronomy.swin.edu.au/ pbourke/fractals/lorenz/.
behavior in the sense of the “ergodic hypothesis” underlying equilibrium statistical mechanics (See Ergodic theory). Similarly, a connection between chaos and turbulence was sought, which seemed particularly appropriate given that Lorenz’s model described a fluid flow. Experiments on other fluid systems by Gollub, Swinney, Libchaber, and later many others established that the transition from laminar to turbulent flow typically takes place through a regime of chaotic fluid motion. The well-known route to chaos via period-doubling bifurcations of Mitchell J. Feigenbaum belongs here as well (Feigenbaum, 1980; Eckmann, 1981). In view of this, it is natural to think that turbulent flow itself is simply some kind of chaotic flow state.
Turbulence is a common state of fluid flow that shares several “symptoms” with chaotic dynamical systems, but also has distinct features not easily duplicated by chaos. The word “turbulence” was apparently first used by Leonardo da Vinci to describe a complex flow. In mathematical terms, turbulent flows should be solutions of the Navier–Stokes equation, usually written in the dimensionless form (See Navier–Stokes equation)
$$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = -\nabla p + R^{-1}\,\Delta \mathbf{u}, \tag{2}$$
$$\nabla \cdot \mathbf{u} = 0. \tag{3}$$
We have restricted attention to incompressible flows by insisting in (3) that the velocity field $\mathbf{u}(\mathbf{x}, t)$ be divergence free. In (2) the field $p$ represents the pressure; the constant density has been absorbed in the nondimensionalization. The sole dimensionless parameter $R$ is the Reynolds number. In terms of physical variables $R = UL/\nu$, where $U$ is a typical scale of velocity, $L$ a typical length scale of the flow, and $\nu$ is the kinematic viscosity of the fluid. For small values of $R$, say $0 < R \le 1$, the flow is laminar. For moderate $R$, say $1 < R \le 100$, various periodic flow phenomena may arise, such as the shedding of vortices from blunt bodies. For large $R$, the flow eventually breaks down into many interacting eddies; this is turbulent flow. Since most flowing fluid is, in fact, flowing at large $R$, turbulence is the prevailing flow state of fluids in our surroundings (oceans and atmosphere), in the universe in general, in many industrial processes, and to some extent, within our bodies. The characterization of what makes a flow turbulent is not nearly so clear as what makes a dynamical system chaotic. First, the issue of whether the particular set of nonlinear partial differential equations (2) and (3) even has a smooth solution for all time, given smooth initial conditions, is still unsettled and is one of the prize challenges set by the Clay Mathematics Institute (http://www.claymath.org). In spite of several attempts, a convincing example of a flow with smooth initial conditions, evolving under (2) and (3), that develops a singularity in a finite time has not been found.
Conversely, there is no proof that solutions with the requisite number of derivatives will exist for all time. Turbulent flows are also recognized by a variety of “symptoms.” The flow velocity as a function of time at any given point in a turbulent flow is a random function (roughly a Gaussian). However, the overall nature of the velocity field viewed as a random vector field is not Gaussian. The random nature of turbulent velocity fields is today thoroughly familiar to the flying public. The randomness is not just temporal at a fixed point in space; the spatial variation of the flow field at a given time constitutes a multitude of interacting eddies of different sizes. Because of their random character, turbulent flows stir vigorously, leading to rapid dispersal of a passively advected substance or a field, such as temperature, and to a rapid exchange of momentum with contiguous fluid. In the classic pipe flow experiment of Osborne Reynolds, for example, in which the transition from laminar to turbulent flow was first demonstrated to depend only on the dimensionless number R, a streak of dye introduced at the inlet would remain a thin streak (except for a bit of molecular diffusion) when the flow in the pipe was laminar. When the flow rate was increased and the flow became turbulent, the dye rapidly dispersed across the pipe. In a turbulent flow, the large scales of motion, which are typically in contact with some kind of forcing from the outside, will generate smaller scales through interactions and instabilities. This process continues through a broad range of length scales, ultimately reaching small scales where molecular dissipation is effective and quells the motion altogether. The repeated process of “handing down” energy from larger scales to smaller scales is a key process in turbulence. It is usually referred to as the Kolmogorov cascade (See Kolmogorov cascade). 
The qualitative nature of this process was already envisaged by Lewis Fry Richardson and was described by him in an adaptation of a verse by Jonathan Swift:
Big whorls have little whorls,
Which feed on their velocity;
And little whorls have lesser whorls,
And so on to viscosity
(in the molecular sense).
Because of its broad range of length scales, the energy in a turbulent flow may be considered partitioned among modes of different wavenumbers k. The energy spectrum E(k) is defined such that E(k) dk is the amount of kinetic energy of the turbulent flow associated with motions with wavenumbers between k and k + dk. The cascade implies a transfer of energy from scale to scale with a characteristic energy flux per unit mass, ε, which must also be equal to the rate at which energy is fed to the flow from the largest scales, and to the rate at which energy is dissipated by viscosity at the smallest scales. A simple dimensional argument then (See Dimensional
analysis) gives the dependence of $E(k)$ on $\varepsilon$ and $k$ to be
$$E(k) = C\,\varepsilon^{2/3}\,k^{-5/3}. \tag{4}$$
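The dimensional argument can be made explicit. Assuming, as in the text, that $E(k)$ depends only on $\varepsilon$ and $k$, one writes $E \sim \varepsilon^{a} k^{b}$ and matches the exponents of length and time on both sides. The sketch below does this bookkeeping with exact fractions; the tuple encoding of dimensions and the function name are illustrative choices.

```python
from fractions import Fraction

# Physical dimensions as (length exponent, time exponent), per unit mass.
DIM_E = (3, -2)    # E(k): kinetic energy per unit wavenumber, m^3 s^-2
DIM_EPS = (2, -3)  # epsilon: energy flux, m^2 s^-3
DIM_K = (-1, 0)    # k: wavenumber, m^-1


def kolmogorov_exponents():
    """Solve E ~ eps^a * k^b by matching time and length exponents.

    time:    -3 a     = -2
    length:   2 a - b =  3
    """
    a = Fraction(DIM_E[1], DIM_EPS[1])
    b = (DIM_E[0] - a * DIM_EPS[0]) / DIM_K[0]
    return a, b


a, b = kolmogorov_exponents()
print(a, b)  # 2/3 -5/3
```

The result, $a = 2/3$ and $b = -5/3$, reproduces the exponents in (4); the dimensionless constant $C$ is not fixed by this argument.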
This is the well-known Kolmogorov spectrum, predicted by Andrei N. Kolmogorov in 1941 (Hunt et al., 1991; Frisch, 1995) and only subsequently verified by experiments in a tidal channel (see Figure 2). Turbulence has many further intriguing statistical properties, which remain subjects of active research. A major shift in our thinking on turbulence occurred in the late 1960s and in the 1970s when experiments by Kline and Brown & Roshko demonstrated that even in turbulent shear flows at very large Reynolds number, one can identify coherent structures that organize the flow to some extent (Figure 3). Later investigations have shown that even in homogeneous, isotropic turbulence,
(Figure 2 plot: one-dimensional spectrum function E(k) in m³ s⁻² versus wavenumber k in m⁻¹, on logarithmic axes, with a reference line of slope −5/3.)
the flow is often organized into strong filamentary vortices. The persistence of these organized structures, which can dominate the flow for long times and interact dynamically, forces a strong coupling among the spectral modes, reducing the effective number of degrees of freedom of the problem.
Chaos and turbulence both describe states of a deterministic dynamical system in which the solutions appear random. Our current understanding of chaos is largely restricted to few-degree-of-freedom systems. Turbulence, on the other hand, is a many-degree-of-freedom phenomenon. It seems somewhat unique to fluid flows; related phenomena such as plasma turbulence or wave turbulence appear to be intrinsically different. The emergence of collective modes in the form of coherent structures in turbulence amidst the randomness is an intriguing feature, somewhat reminiscent of the mix between regular “islands” and the “chaotic sea” observed in chaotic, low-dimensional dynamical systems. The coherent structures themselves approximately form a deterministic, low-dimensional dynamical system. However, it seems impossible to fully eliminate all but a finite number of degrees of freedom in a turbulent flow: the modes not included explicitly form an essential, dissipative background, often referred to as an eddy viscosity, that must be included in the description. Turbulence is intrinsically spatiotemporal, whereas chaotic behavior in a fluid system can be merely temporal with a simple spatial structure. It is possible for the flow field to be perfectly regular in space and time, yet the trajectories of fluid particles moving within the flow will be chaotic. This is the phenomenon of chaotic advection (See Chaotic advection), which points out the hugely increased complexity of a turbulent flow relative to chaos in a dynamical system.
PAUL K.A.NEWTON AND HASSAN AREF
See also Attractors; Butterfly effect; Celestial mechanics; Chaotic advection; Chaotic dynamics; Diffusion; Ergodic theory; Kolmogorov cascade; Lorenz equations; Lyapunov exponents; Navier–Stokes equation; N-body problem; Partial differential equations, nonlinear; Period doubling; Phase space; Poincaré theorems; Routes to chaos; Shear flow; Thermal convection; Turbulence
Figure 2. One-dimensional spectrum in a tidal channel from data in Grant et al. (1962).
Further Reading
Figure 3. Coherent structures in a turbulent mixing layer. From Brown & Roshko (1974), reprinted from An Album of Fluid Motion, M. Van Dyke, Parabolic Press, 1982.
Aref, H. & Gollub, J.P. 1996. Application of dynamical systems theory to fluid mechanics. Research Trends in Fluid Dynamics, Report of the US National Committee on Theoretical and Applied Mechanics, edited by J.L. Lumley et al., New York: AIP Press, pp. 15–30 Eckmann, J.P. 1981. Roads to turbulence in dissipative dynamical systems. Reviews of Modern Physics, 53: 643–654 Feigenbaum, M.J. 1980. Transition to aperiodic behavior in turbulent systems. Communications in Mathematical Physics, 77: 65–86
Frisch, U. 1995. Turbulence: The Legacy of A. N. Kolmogorov, Cambridge and New York: Cambridge University Press
Grant, H.L., Stewart, R.W. & Moilliet, A. 1962. Turbulent spectra from a tidal channel. Journal of Fluid Mechanics, 12: 241–268
Hunt, J.C.R., Phillips, O.M. & Williams, D. (editors). 1991. Turbulence and stochastic processes: Kolmogorov’s ideas 50 years on. Proceedings of the Royal Society, London A, 434: 1–240
Lorenz, E.N. 1963. Deterministic nonperiodic flow. Journal of Atmospheric Sciences, 20: 130–141
Ruelle, D. 1991. Chance and Chaos, Princeton, NJ: Princeton University Press
CHAOTIC ADVECTION
In fluid mechanics, advection means the transport of material particles by a fluid flow, as when smoke from a chimney is blown by the wind. The term passive advection is sometimes used to emphasize that the substance being carried by the flow is sufficiently inert that it follows the flow entirely, the velocity of the advected substance at every point and every instant adjusting to that of the prevailing flow. To describe the kinematics of a fluid, two points of view may be adopted: the Eulerian representation focuses on the velocity field $\mathbf{u}$ as a function of position and time, $\mathbf{u}(\mathbf{x}, t)$; the Lagrangian representation emphasizes the trajectories $\mathbf{x}_P(t)$ of a fluid particle as it is advected by the flow. The two points of view are linked by stating that the value of the velocity field at a given point in space and instant in time equals the velocity of the fluid element passing through that same point at that instant, that is,
$$\dot{\mathbf{x}}_P(t) = \mathbf{u}(\mathbf{x}_P(t), t). \tag{1}$$
The Eulerian representation is used extensively for measurements and numerical simulations of fluid flow since it allows one to fix the points in space and time where the field is to be determined. The Lagrangian representation, on the other hand, is often more natural for theoretical analysis, as it explicitly addresses the nonlinearity of the Navier–Stokes equation. For a given flow, the equations of motion (1), sometimes called the advection equations, are a system of ordinary differential equations that define a dynamical system. These equations can be integrable or non-integrable. Chaotic advection appears when the equations are non-integrable and the trajectories of fluid elements become chaotic. The dynamical system defined by (1) has two or more degrees of freedom. For a two-dimensional time-independent or steady flow, there are just two degrees of freedom and no chaotic motion is possible. However, already for a 2-d time-dependent or a 3-d steady flow, there are enough degrees of freedom to allow for chaotic trajectories. In other words, chaotic advection can appear even for flows that would otherwise be considered laminar. The phenomenon of chaotic advection is also known as Lagrangian chaos, or sometimes Lagrangian
turbulence. Usually, the word turbulence refers to the Eulerian representation and to flows in which the velocity field fluctuates across a wide range of spatial and temporal scales with limited correlations. In such flows, the trajectories of fluid elements are always chaotic. By contrast, chaotic advection or Lagrangian chaos can arise in situations where the velocity field is spatially coherent and the time dependence is no more complicated than a simple periodic modulation. Many examples have now been given to illustrate the point that the complexity of the spatial structure of material advected by a flow can be much greater than one might surmise from a picture of the instantaneous streamlines of the flow. Thus, in the paper that introduced the notion of chaotic advection (Aref (1984) and Figure 1), the case of two stirrers that act alternately on fluid confined to a disk was considered. Each stirrer was modeled as a point vortex that could be switched on and off. There are several parameters in the system, such as the strengths and positions of the vortex stirrers and the time interval over which each acts. For a wide range of parameter values, the dynamics is as shown in Figure 1; after just a few periods, the 10,000 particles being advected are spread out over a large fraction of the disk. Chaotic advection gives rise to very efficient stirring of a fluid. Material lines are stretched at a rate given by the Lyapunov exponent. In bounded flows, these exponentially growing material lines have to be folded back over and over again, giving rise to ever finer and denser striations. They are familiar from the mixing of paint or from marbled paper. On the smallest scales, diffusion takes over and smoothes the steep gradients, giving rise to mixing on the molecular scale. The interplay between stirring and diffusion is the
Figure 1. Spreading of 10,000 particles in a cylindrical container (disk) under the alternating action of two stirrers. The positions of the stirrers are marked by crosses. (a) initial distribution; (b)–(g) positions of the particles after 1, 2, …, 6 periods; (h) after 9 periods; (i) after 12 periods. From (Aref, 1984).
source of the efficient mixing in the presence of chaotic advection. This phenomenon is being exploited in various procedures for mixing highly viscous fluids, including applications to materials processing, in micro-fluidics, and even in large-scale atmospheric, oceanographic, and geological flows. It may play a role in the feeding of microorganisms. In the case of 2-d incompressible flows, the equations of motion allow for an interesting connection to Hamiltonian dynamics. The velocity field can be represented through a stream function $\psi(x, y, t)$, so that $\mathbf{u} = \nabla \times (\psi\,\mathbf{e}_z)$ and Equations (1) for the trajectories of fluid elements become
$$\dot{x} = \frac{\partial \psi}{\partial y}, \qquad \dot{y} = -\frac{\partial \psi}{\partial x}. \tag{2}$$
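A small numerical sketch may help make the role of the stream function in (2) concrete. For a steady flow, $\psi$ is constant along every particle path, so the motion is confined to streamlines; once $\psi$ depends on time this constraint is lost. The example below uses a hypothetical cellular flow, $\psi = \sin(x + \text{amp}\,\sin \omega t)\,\sin y$, which is not taken from the text, and tracks how far the steady stream function $\sin x \sin y$ drifts along one trajectory. All names, parameter values, and the integrator settings are assumptions of this illustration.

```python
import math


def psi_steady(x, y):
    """Stream function of the unperturbed (steady) cellular flow."""
    return math.sin(x) * math.sin(y)


def velocity(x, y, t, amp, omega):
    """(dpsi/dy, -dpsi/dx) for psi = sin(x + amp*sin(omega*t)) * sin(y)."""
    phase = x + amp * math.sin(omega * t)
    return math.sin(phase) * math.cos(y), -math.cos(phase) * math.sin(y)


def max_psi_drift(x, y, amp, omega=1.0, dt=0.01, steps=2000):
    """Largest change of psi_steady along one RK4-integrated trajectory."""
    psi0 = psi_steady(x, y)
    drift = 0.0
    t = 0.0
    for _ in range(steps):
        k1 = velocity(x, y, t, amp, omega)
        k2 = velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], t + 0.5 * dt, amp, omega)
        k3 = velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], t + 0.5 * dt, amp, omega)
        k4 = velocity(x + dt * k3[0], y + dt * k3[1], t + dt, amp, omega)
        x += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += dt
        drift = max(drift, abs(psi_steady(x, y) - psi0))
    return drift


print("steady flow:   ", max_psi_drift(0.5, 1.0, amp=0.0))
print("modulated flow:", max_psi_drift(0.5, 1.0, amp=0.3))
```

In the steady case the drift should be at the level of the integration error, while in the modulated case it is of visible size. This shows only that $\psi$ ceases to be a conserved quantity when the flow is time dependent; it does not by itself demonstrate that a given trajectory is chaotic.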
The relation to Hamilton’s canonical equations is established through the identification $x$ = position, $y$ = momentum, and $\psi$ = Hamilton function. Thus, what is the phase space in Hamiltonian systems can be visualized as the position space in the hydrodynamic situation. The structures that appear in 2-d periodically driven flows are, therefore, similar to the phase space structures in a Poincaré surface of section for a chaotic Hamiltonian system, and the same techniques can be used to analyze the transport of particles and the stretching and folding of material lines. The phenomena that arise in chaotic advection by simple flows may be relevant to turbulent flows when a separation of length and time scales is possible. Consider, for example, the small-scale structures that appear in the density of a tracer substance when the molecular diffusivity $\kappa$ of the tracer is much smaller than the kinematic viscosity $\nu$ of the liquid, that is, in a situation where the Schmidt number $Sc = \nu/\kappa$ is much larger than one. Then, the velocity field is smooth below the Kolmogorov scale, $\lambda_K = (\nu^3/\varepsilon)^{1/4}$, where $\varepsilon$ is the rate of kinetic energy dissipation, but the scalar field has structures on even smaller scales, down to $\lambda_s = (Sc)^{-1/2}\lambda_K$. These arise from Lagrangian chaos with a randomly fluctuating velocity field. The patterns produced in this so-called Batchelor regime are strikingly similar to the ones observed in laminar flows. On larger scales, ideas from chaotic advection are relevant when there are large-scale coherent structures with slow spatial and temporal evolution. Typical examples are 2-d or quasi-2-d flows, for example, in the atmosphere or in the oceans. Fluid volumes can be trapped in regions bounded by separatrices or by stable and unstable manifolds of stagnation points and may have very little exchange with their surroundings. Such a reduction in stirring appears to occur in the Wadden sea (Ridderinkhof & Zimmerman, 1992).
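To give a feeling for the scale separation in the Batchelor regime mentioned above, the two length scales can be evaluated for sample numbers. The values used below (a water-like viscosity, a tracer diffusivity a thousand times smaller, and a dissipation rate of $10^{-4}$ m² s⁻³) are purely illustrative assumptions and do not come from the text; the function names are likewise hypothetical.

```python
def kolmogorov_scale(nu, eps):
    """lambda_K = (nu^3 / eps)^(1/4), with nu in m^2/s and eps in m^2/s^3."""
    return (nu**3 / eps) ** 0.25


def batchelor_scale(nu, kappa, eps):
    """lambda_s = Sc^(-1/2) * lambda_K, with Schmidt number Sc = nu / kappa."""
    schmidt = nu / kappa
    return schmidt ** -0.5 * kolmogorov_scale(nu, eps)


nu = 1.0e-6      # assumed kinematic viscosity, m^2/s
kappa = 1.0e-9   # assumed tracer diffusivity, m^2/s (so Sc = 1000)
eps = 1.0e-4     # assumed dissipation rate, m^2/s^3

print("Kolmogorov scale [m]:", kolmogorov_scale(nu, eps))
print("Batchelor scale  [m]:", batchelor_scale(nu, kappa, eps))
```

For these assumed inputs the scalar field carries structure down to scales roughly thirty times finer than the smallest velocity structures, since the ratio of the two scales is $Sc^{-1/2}$.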
Equations (1) apply in this form to fluid elements and ideal particles only. For realistic particles with finite volume and inertia, further terms must be added. A
significant change on the qualitative side is that the effective velocity field for inertial particles can have a nonvanishing divergence even for incompressible flows (Maxey & Riley, 1983). The book by Ottino (1989) and the two conference proceedings (Aref, 1994; IUTAM, 1991) provide good starting points for entering the many aspects of chaotic advection and Lagrangian chaos in engineering applications, geophysical flows, turbulent flows, and theoretical modeling. Historical remarks may be found in the Introduction to Aref (1994) and in Aref (2002). Today, the term chaotic advection designates an established subtopic of fluid mechanics that is used as a classification keyword by leading journals and conferences in the field.
HASSAN AREF AND BRUNO ECKHARDT
See also Chaotic dynamics; Chaos vs. turbulence; Dynamical systems; Hamiltonian systems; Lyapunov exponents; Turbulence
Further Reading
Aref, H. 1984. Stirring by chaotic advection. Journal of Fluid Mechanics, 143: 1–21
Aref, H. (editor). 1994. Chaos applied to fluid mixing. Chaos, Solitons and Fractals, 4: 1–372
Aref, H. 2002. The development of chaotic advection. Physics of Fluids, 14: 1315–1325
IUTAM Symposium on fluid mechanics of stirring and mixing. 1991. Physics of Fluids, 3: 1009–1496
Maxey, M. & Riley, J. 1983. Equation of motion for a small rigid sphere in a nonuniform flow. Physics of Fluids, 26: 883–889
Ottino, J.M. 1989. The Kinematics of Mixing: Stretching, Chaos and Transport, Cambridge: Cambridge University Press
Ridderinkhof, H. & Zimmerman, J.T.F. 1992. Chaotic stirring in a tidal system. Science, 258: 1107–1111
CHAOTIC BILLIARDS See Billiards
CHAOTIC DYNAMICS
When we say “chaos”, we usually imagine a very complex scene with many different elements that move in different directions, collide with each other, and appear and disappear randomly. Thus, according to everyday intuition, the system’s complexity (e.g., many degrees of freedom) is an important attribute of chaos. It seems reasonable to think that in the opposite case, for example, a system with only a few degrees of freedom, the dynamical behavior must be simple and predictable. In fact, this point of view is Laplacian determinism. The discovery of dynamical chaos has destroyed this traditional view. Dynamical chaos is a phenomenon that can be described by mathematical models for many natural systems (for example, physical, chemical, biological, and social systems) that evolve in time according to a deterministic rule and demonstrate capricious and seemingly unpredictable behavior. To illustrate such behavior, consider a few examples.
Examples
Hyperion: Using Newton’s laws, one can compute relatively easily all future solar eclipses not only for the next few hundred years but also for thousands and millions of years into the future. This is indicative of a real predictability of the system’s dynamical behavior. But even in the solar system, there exists an object with unpredictable behavior: a small irregularly shaped moon of Saturn, Hyperion. Its orbit is regular and elliptic, but its attitude (orientation) in the orbit is not. Hyperion is tumbling in a complex and irregular pattern while obeying the laws of gravitational dynamics. Hyperion may not be the only example of chaotic motion in the solar system. Recent studies indicate that chaotic behavior possibly exists in Jovian planets (Murray & Holman, 1999), resulting from the overlap of components of the mean motion resonance among Jupiter, Saturn, and Uranus. Chaos in Hamiltonian systems, which represent the dynamics of the planets, arises when one resonance is perturbed by another one (See Standard map).
Chaotic mixing is an example of the complex irregular motion of particles in a regular periodic velocity field, like drops of cream in a cup of coffee; see Figure 1. Such mixing, caused by sequential stretching and folding of a region of the flow, illustrates the general mechanism of the origin of chaos in the phase space of simple dynamical systems (See Chaotic advection; Mixing).
Figure 1. Mixing of a passive tracer in a Newtonian flow between two rotating cylinders with different rotation axes. The rotation speed of the inner cylinder is modulated with constant frequency. The flow is stretched and folded in a region of the flow. The repetition of these operations leads to a layered structure (folds within folds), producing a fractal structure (Ottino, 1989).
Billiards: For its conceptual simplicity, nothing could be more deterministic and completely predictable than the motion of a single ball on a billiard table. However, in the case of a table bounded by four quarters of a circle curved inward (Sinai billiard), the future fate of a rolling billiard ball is unpredictable beyond a surprisingly small number of bounces. As indicated by Figure 2, a typical trajectory of the Sinai billiard is irregular and a statistical approach is required for a quantitative description of this simple mechanical system. Such an irregularity is the result of having a finite space and an exponential instability of individual trajectories resulting in a sensitive dependence on initial conditions. Due to the curved shape of the boundary, two trajectories emanating from the same point but in slightly different directions, with angle δ between them, hit the boundary ∂ (see Figure 2) at different points that are cδ apart, where c > 0. After a bounce, the direction of the trajectories will differ by angle (1 + 2c)δ, and because an actual difference between the directions is multiplied by a factor µ = (1 + 2c) > 1, the small perturbation δ will grow more or less exponentially (Sinai, 2000). Such sensitive dependence on initial conditions is the main feature of every chaotic system.
A Markov map: To understand in more detail how randomness appears in a nonrandom system, consider a simple dynamical system in the form of a one-dimensional map
$$x_{n+1} = 2x_n \bmod 1. \qquad (1)$$
Since the distance between any two nearby trajectories ($|x_n - x'_n| \ll 1$) after each iteration increases at least two times ($|dx_{n+1}/dx_n| = 2$), any trajectory of the map is unstable. The map has a countable infinity of unstable periodic trajectories, which can be seen as fixed points when one considers the shape of the map $x_{n+k} = F^{(k)}(x_n)$; see Figure 3(b). Since all fixed points and periodic trajectories are repelling, the only possibility left for the most arbitrarily selected initial condition is that the map will produce a chaotic motion that never exactly repeats itself. The irregularity of such dynamics can be illustrated using a binary symbolic description ($s_n = 0$ if $x_n < 1/2$ and $s_n = 1$ if $x_n \ge 1/2$). In this case, any value of $x_n$ can be represented as a binary decimal
$$x_n = 0.s_{n+1}s_{n+2}s_{n+3}\ldots \equiv \sum_{j=n+1}^{\infty} 2^{-j} s_j.$$
If the initial state happens to be a rational number, it can be written as a periodic sequence of 0’s and 1’s. For instance, 0.10111011101110111… is the rational number 11/15. Each iteration $x_n \to x_{n+1}$ of map (1) corresponds to setting the symbol $s_{n+1}$ to zero and then moving the decimal point one space to the right (this is known as a Bernoulli shift). For example, the iterations of the number 11/15 yield
0.10111011101110111 . . . ,
0.01110111011101110 . . . ,
0.11101110111011101 . . . ,
0.11011101110111011 . . . ,
0.10111011101110111 . . . ,
which illustrates a periodic motion of period 4. Selecting an irrational number as the initial condition, one chooses a binary sequence that cannot be split into groups of 0’s and 1’s periodically repeated an infinite number of times. As a result, each iteration of the irrational number generates a new irrational number. Since the irrational numbers appear in the interval $x_n \in [0, 1]$ with probability one, one can observe only the aperiodic (chaotic) motions. Random-like behavior of the chaotic motions is illustrated in a separate figure in the color plate section (See the color plate section for a comparison of chaos generated by Equation (1) and a truly random process).
Figure 2. Illustration of the trajectory sensitivity to the initial conditions in a billiard model with convex borders.
Figure 3. Simple map diagram: (a) two initially close trajectories diverge exponentially (axes: $x_{n+1}$ versus $x_n$, with initial separation δx); (b) illustration of the increasing of the number of unstable periodic trajectories with the number of iterations (axes: $x_{n+3}$ versus $x_n$).
The degree of such chaoticity is characterized by Lyapunov exponents that can be defined for one-dimensional maps ($x_{n+1} = f(x_n)$). The stability or instability of a trajectory with the initial state $x_0$ is determined by the evolution of neighboring trajectories starting at $\tilde{x}_0 = x_0 + \delta x_0$ with $|\delta x_0| \ll 1$. After one iteration
$$\tilde{x}_1 = x_1 + \delta x_1 = f(x_0 + \delta x_0) \approx f(x_0) + \left.\frac{df}{dx}\right|_{x=x_0} \delta x_0.$$
Now, the deviation is $\delta x_1 \approx f'(x_0)\,\delta x_0$. After the nth iteration it becomes $\delta x_n = \left(\prod_{m=0}^{n-1} f'(x_m)\right)\delta x_0$. The evolution of the distance between the two trajectories is calculated by taking the absolute value of this product. For infinitesimally small perturbations and large enough n, it is expected that $|\delta x_n| = \alpha^n |\delta x_0|$, where
$$\alpha \approx \lim_{n\to\infty} \left|\frac{\delta x_n}{\delta x_0}\right|^{1/n} = \lim_{n\to\infty} \left(\prod_{m=0}^{n-1} |f'(x_m)|\right)^{1/n}$$
or
$$\ln\alpha \approx \lambda = \lim_{n\to\infty} \frac{1}{n}\sum_{m=0}^{n-1} \ln|f'(x_m)|. \qquad (2)$$
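The Bernoulli-shift description of map (1) can be checked with a short sketch. It is an illustration under stated assumptions, not part of the original entry: the function names are hypothetical, and exact rational arithmetic (Python's `fractions.Fraction`) is used for the initial condition 11/15. The last lines show a known numerical caveat: in binary floating point every stored number is a finite binary fraction, so the shift discards one bit per iteration and the computed orbit ends at exactly zero.

```python
from fractions import Fraction


def doubling_map(x):
    """One iteration of map (1): x -> 2x mod 1."""
    return (2 * x) % 1


def orbit(x0, steps):
    """Return [x0, x1, ..., x_steps] under repeated application of the map."""
    xs = [x0]
    for _ in range(steps):
        xs.append(doubling_map(xs[-1]))
    return xs


def leading_bits(xs):
    """Leading binary digit of each state: 0 if x < 1/2, otherwise 1."""
    return [0 if x < Fraction(1, 2) else 1 for x in xs]


# Exact arithmetic: the rational initial state 11/15 gives a period-4 orbit.
exact_orbit = orbit(Fraction(11, 15), 4)
print(exact_orbit)
print(leading_bits(exact_orbit))

# Floating point: 0.1 is stored as a finite binary fraction, so every
# iteration shifts one bit out and the orbit collapses to exactly 0.0.
float_orbit = orbit(0.1, 60)
print(float_orbit[-1])
```

The leading bits of the first four states reproduce the repeating block 1011 of the binary expansion of 11/15. The floating-point collapse is a property of the number representation, not of the map, which is why exact arithmetic is used for the periodic orbit.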
Limit (2) exists for a typical trajectory $x_m$ and defines the Lyapunov exponent, λ, which is the time average of the rate of exponential divergence of nearby trajectories. For map (1), $f' = 2$ for all values of x and, therefore, λ = ln 2 (See Lyapunov exponents).
Assuming that the initial state cannot be defined with absolute accuracy, the prediction of the state of the map after a sufficiently large number of iterations becomes impossible. The only description that one can use for defining that state is a statistical one. The statistical ensemble in this case is the ensemble of initial conditions. The equation of evolution for the probability density of states, $\rho_{n+1}(F(x))$, can be written as (Ott, 1993, p. 33)
$$\rho_{n+1}(F(x)) = \sum_{j=1,2} \rho_n(x_j) \Big/ \left|\frac{dF(x)}{dx}\right|_{x=x_j}, \qquad (3)$$
where the summation is taken over both branches of F(x), with $x_j$ the point on branch j that is mapped to F(x). Considering the evolution of a sharp initial distribution $\rho_0(x)$, one can see that at each step this distribution becomes smoother. As n approaches infinity, the distribution asymptotically approaches the steady state ρ(x) = 1.
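The smoothing of a sharp initial distribution can be illustrated numerically by following an ensemble of initial conditions instead of solving the density equation directly. The sketch below is an assumption-laden illustration: the ensemble size, the initial interval [0, 0.01], the ten-bin histogram, and all function names are hypothetical choices, not taken from the entry. Only twelve iterations are used, which keeps the floating-point iteration of map (1) well away from the bit-loss regime.

```python
def doubling_map(x):
    """Map (1): x -> 2x mod 1 (exact in floating point for a few iterations)."""
    return (2.0 * x) % 1.0


def histogram(points, bins=10):
    """Histogram on [0, 1], normalized so that a uniform density equals 1."""
    counts = [0] * bins
    for x in points:
        counts[min(int(x * bins), bins - 1)] += 1
    return [bins * c / len(points) for c in counts]


def evolve_density(n_points=20000, width=0.01, iterations=12, bins=10):
    """Evolve an ensemble that starts evenly spaced on [0, width]."""
    points = [width * i / (n_points - 1) for i in range(n_points)]
    history = [histogram(points, bins)]
    for _ in range(iterations):
        points = [doubling_map(x) for x in points]
        history.append(histogram(points, bins))
    return history


history = evolve_density()
for n in (0, 4, 8, 12):
    print(n, [round(v, 2) for v in history[n]])
```

At n = 0 the whole ensemble sits in the first bin; each iteration doubles the width of the occupied interval, and once it has wrapped around the unit interval many times the histogram is close to the uniform steady state ρ(x) = 1.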
Figure 4. Bifurcation diagram for the logistic map (horizontal axis: α from 2.8 to 4; vertical axis: $x_n$ from 0 to 1).
Figure 5. Return map measured in the Belousov–Zhabotinsky autocatalytic reaction (axes: $X_{n+1}$ versus $X_n$).
Figure 6. Chaotic oscillation of a periodically driven pendulum, in phase-space plot of angular velocity versus angular position (Deco & Schürmann, 2000).
Population dynamics: A popular model of population growth is the logistic map $x_{n+1} = \alpha x_n(1 - x_n)$, 0 ≤ α ≤ 4 (See Population dynamics). The formation of chaos in this map is illustrated in the bifurcation diagram shown in Figure 4. This diagram presents the evolution of the attracting set as the value of α grows. Below the Feigenbaum point $\alpha_\infty = 3.569\ldots$, the attractor of the map is periodic. Its period increases through a sequence of period-doubling bifurcations as the value of α approaches $\alpha_\infty$ (See Period doubling). For $\alpha > \alpha_\infty$, the behavior is chaotic but some windows of periodic attractors exist (See Order from chaos).
Belousov–Zhabotinsky (BZ) autocatalytic reaction: In the BZ reaction (See Belousov–Zhabotinsky reaction), an acid bromate solution oxidizes malonic acid in the presence of a metal-ion catalyst and other important chemical components in a well-stirred reactor (Roux et al., 1983). The concentration of the bromide ions is measured and parameterized by the return map (plotting a variable against its next value in time) $x_{n+1} = \alpha x_n \exp[-b x_n]$ (see Figure 5). This map exhibits chaotic behavior for a very broad range of parameter values.
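As a hedged illustration of how the Lyapunov exponent of Equation (2) separates the periodic and chaotic regimes of the logistic map, the following sketch averages ln|f'(x)| along a trajectory, using f'(x) = α(1 − 2x). The initial condition, transient length, and number of iterations are arbitrary assumptions, and the function names are hypothetical.

```python
import math


def logistic(x, alpha):
    """Logistic map x -> alpha * x * (1 - x)."""
    return alpha * x * (1.0 - x)


def lyapunov_logistic(alpha, x0=0.2, transient=1000, n=100000):
    """Estimate the Lyapunov exponent as the time average of ln|f'(x_m)|."""
    x = x0
    for _ in range(transient):        # discard the approach to the attractor
        x = logistic(x, alpha)
    total, count = 0.0, 0
    for _ in range(n):
        slope = abs(alpha * (1.0 - 2.0 * x))
        if slope > 0.0:               # skip the exceptional point x = 1/2
            total += math.log(slope)
            count += 1
        x = logistic(x, alpha)
    return total / count


for alpha in (3.2, 3.5, 3.7, 4.0):
    print(alpha, round(lyapunov_logistic(alpha), 3))
```

A negative estimate indicates a stable periodic attractor (for example, the period-2 cycle at α = 3.2), whereas a positive estimate indicates chaos. At α = 4 the exact value is ln 2, the same as for map (1).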
Figure 7. Ueda attractor. The fractal structure of the attractor is typical for all chaotic sets (compare this picture with Figure 1) (Ueda, 1992).
Simple chaotic oscillators: The dynamics of the periodically driven pendulum shown in Figure 6 is described by
$$\frac{d^2\theta}{dt^2} + \nu\frac{d\theta}{dt} + \frac{g}{l}\sin\theta = B\cos 2\pi f t, \qquad (4)$$
where the term on the right-hand side is the forcing (sinusoidal torque) applied to the pivot and f is the forcing frequency. Chaotic motions of the pendulum computed for ν = 0.5, g/l = 1, B = 1.15, f = 0.098, and visualized with stroboscopic points at moments of time t = i/f are shown in Figure 6. A similar example of chaotic behavior was intensively studied in an oscillator where the restoring force is proportional to the cube of the displacement (Ueda, 1992, p. 158)
$$\frac{d^2\theta}{dt^2} + \nu\frac{d\theta}{dt} + \theta^3 = B\cos t. \qquad (5)$$
The stroboscopic image (with t = i) of the strange attractor in this forced Duffing-type oscillator computed with ν = 0.05 and B = 7.5 is shown in Figure 7.
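A stroboscopic section of the forced cubic oscillator (5) can be sketched with a fixed-step fourth-order Runge–Kutta integrator. This is a minimal illustration, not the method used to produce Figure 7: the step size, initial state, number of forcing periods, and function names are assumptions, and the state is sampled once per forcing period 2π, which is also an assumption about the sampling convention.

```python
import math


def ueda_rhs(t, x, v, nu, B):
    """Equation (5) as a first-order system: x' = v, v' = -nu*v - x^3 + B*cos(t)."""
    return v, -nu * v - x ** 3 + B * math.cos(t)


def rk4_step(t, x, v, dt, nu, B):
    """One classical Runge-Kutta step for the pair (x, v)."""
    k1x, k1v = ueda_rhs(t, x, v, nu, B)
    k2x, k2v = ueda_rhs(t + dt / 2, x + dt / 2 * k1x, v + dt / 2 * k1v, nu, B)
    k3x, k3v = ueda_rhs(t + dt / 2, x + dt / 2 * k2x, v + dt / 2 * k2v, nu, B)
    k4x, k4v = ueda_rhs(t + dt, x + dt * k3x, v + dt * k3v, nu, B)
    x_new = x + dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v_new = v + dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x_new, v_new


def stroboscopic_points(nu=0.05, B=7.5, periods=200, steps_per_period=200,
                        skip=50, x0=0.0, v0=0.0):
    """Sample (x, v) once per forcing period, after discarding a transient."""
    dt = 2.0 * math.pi / steps_per_period
    x, v = x0, v0
    points = []
    for p in range(periods):
        for s in range(steps_per_period):
            t = (p * steps_per_period + s) * dt
            x, v = rk4_step(t, x, v, dt, nu, B)
        if p >= skip:
            points.append((x, v))
    return points


points = stroboscopic_points()
print(len(points), points[:3])
```

Plotting the returned pairs (displacement against velocity) for a much larger number of periods gives a point cloud that can be compared with the stroboscopic image in Figure 7.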
Figure 8. Chaotic attractor generated by an electric circuit, which is a modification of the van der Pol oscillator: $\dot{x} = hx + y - gz$; $\dot{y} = -x$; $\mu\dot{z} = x - f(x)$, where $f(x) = x^3 - x$ (Pikovsky & Rabinovich, 1978).
Figure 8 presents a chaotic attractor generated by an electronic circuit. Such circuits are a popular topic in engineering studies today.
Characteristics of Chaos
Lyapunov exponents: Consider the Lyapunov exponents for a trajectory $\tilde{x}(t)$ generated by a d-dimensional autonomous system
$$\frac{dx}{dt} = F(x), \qquad (6)$$
with initial condition $x_0$, $x \in \mathbb{R}^d$. Linearizing Equation (6) about this solution, one obtains a linear system which describes the evolution of infinitesimally small perturbations $w = x(t) - \tilde{x}(t)$ of the trajectory, in the form
$$\frac{dw}{dt} = M(\tilde{x})\,w, \qquad (7)$$
where $M(x) = \partial F(x)/\partial x$ is the Jacobian of F(x) that changes in time in accordance with $\tilde{x}(t)$. In d-dimensional phase space of (7), consider a sphere of initial conditions for perturbations w(t) of diameter l, that is, |w(0)| ≤ l. The evolution of this ball in time is governed by linear system (7) and depends on trajectory $\tilde{x}(t)$. As the system evolves in time, the ball transforms into an ellipsoid. Let the ellipsoid have d principal axes of different length $l_j$, j = 1, …, d. Then, the values of Lyapunov exponents of the trajectory $\tilde{x}(t)$ are defined as
$$\lambda_j(\tilde{x}) = \lim_{t\to\infty} \frac{1}{t}\ln\frac{l_j(\tilde{x}, t)}{l(x_0, 0)}. \qquad (8)$$
Although limit (8) depends on $\tilde{x}(t)$, the spectrum of the Lyapunov exponents $\lambda_j$ for the selected regime of chaotic oscillations generated by (6) is independent of the initial conditions for the typical trajectories and characterizes the chaotic behavior. The Lyapunov exponents, $\lambda_j$, can be ordered in size: $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$. Self-sustained oscillations in autonomous time-continuous systems always have at least one Lyapunov exponent that is equal to zero. This is the exponent characterizing the stretching of phase volume along the trajectory. The spectrum of $\lambda_j$ for chaotic trajectories contains one or more Lyapunov exponents with positive values.
Kolmogorov–Sinai entropy is a measure of the degree of predictability of further states visited by a chaotic trajectory started within a small region. Due to the divergence, a long-term observation of such a trajectory gives more and more information about the actual initial condition of the trajectory. In this sense, one may say that a chaotic trajectory creates information. Consider a partitioning of the d-dimensional phase space into small cubes of volume $\varepsilon^d$. Observing a continuous trajectory during T instances of time, one obtains a sequence $\{i_0, i_1, \ldots, i_T\}$, where $i_0, i_1, \ldots$ are the indexes of the cubes consecutively visited by the trajectory. As a result, the type of the trajectory observed during the time interval from 0 to T is specified by the sequence $\{i_0, i_1, \ldots, i_T\}$. As Kolmogorov and Sinai showed, in dynamical systems whose behavior is characterized by exponential instability, the number of different types of trajectories, $K_T$, grows exponentially with T:
$$0 < H = \lim_{T\to\infty} \frac{1}{T}\log K_T.$$
The quantity H is the Kolmogorov–Sinai (KS) entropy. The number of unique random sequences $\{i_0, i_1, \ldots, i_T\}$ that can be obtained without any rules applied increases exponentially with T. In the case of nonrandom sequences where there is a strict law for the generation of future symbols, like the periodic motion, the number of possible sequences grows in time more slowly than exponentially.
Since the exponential growth takes place for the segments of trajectories in the unstable dynamical system producing chaos, such a dynamical system is capable of generating “random” sequences. The Kolmogorov–Sinai entropy is a measure of such “randomness” in a “nonrandom” system, for example, a dynamical system. Since both KS entropy and Lyapunov exponents reflect the properties of the divergence of the nearby trajectories, these characteristics are related to each other. The formula describing this relation is given by Ruelle’s Inequality
$$H \le K = \sum_{j=1}^{m} \lambda_j \ge 0, \qquad (9)$$
where m is the number of positive $\lambda_j$ (K = 0, when m = 0). The equality H = K holds when the system has a physical measure (Sinai–Ruelle–Bowen measure) (Young, 1998). The invariant set of trajectories characterized by a positive Kolmogorov–Sinai entropy is a chaotic set.
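The counting argument behind the KS entropy can be illustrated with a rough sketch: record which cell of a partition a trajectory visits at each step, count the distinct words of length T, and compare ln(K_T)/T for a chaotic and a periodic regime. Several simplifications are assumptions of this example rather than part of the definition above: the logistic map is used as the system, the partition is the coarse two-cell split at x = 1/2 instead of small ε-cubes, T is kept small, and the function names are hypothetical.

```python
import math


def symbolic_orbit(alpha, n, x0=0.2, transient=1000):
    """Symbols 0/1 (x < 1/2 or x >= 1/2) along a logistic-map trajectory."""
    x = x0
    for _ in range(transient):
        x = alpha * x * (1.0 - x)
    symbols = []
    for _ in range(n):
        symbols.append(0 if x < 0.5 else 1)
        x = alpha * x * (1.0 - x)
    return symbols


def count_words(symbols, T):
    """Number K_T of distinct length-T words seen in the symbol sequence."""
    return len({tuple(symbols[i:i + T]) for i in range(len(symbols) - T + 1)})


def entropy_estimate(symbols, T):
    """Finite-T estimate ln(K_T) / T of the growth rate of word counts."""
    return math.log(count_words(symbols, T)) / T


chaotic = symbolic_orbit(4.0, 200000)
periodic = symbolic_orbit(3.5, 200000)
for T in (2, 4, 6):
    print(T, count_words(chaotic, T), count_words(periodic, T))
print(entropy_estimate(chaotic, 6), entropy_estimate(periodic, 6))
```

For the chaotic regime the word count doubles with each added symbol, so the estimate stays at ln 2, which matches the Lyapunov exponent of this map as the equality case of inequality (9) suggests. For the periodic regime the count saturates and the estimate decays toward zero as T grows.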
Forecasting
If a sufficiently long experimental time series capturing the chaotic process of an unknown dynamical system is available in the form of scalar data $\{x_n\}_{n=0}^{N}$, it is possible, in principle, to predict $x_{N+m}$ with finite accuracy for some m ≥ 1. Such predictions are based on the assumption that the unknown generating mechanism is time independent. As a result, “what happened in the past may happen again – even stronger: that what is happening now has happened in the past” (Takens, 1991). In classical mechanics (no dissipation), this idea of “what happens now has happened in the past” is related to the Poincaré Recurrence Theorem.
Usually, the prediction procedure consists of two steps. First, it is necessary to consider all values of n in the “past,” that is, with n < N, such that $\sum_{k=0}^{K} |x_{n-k} - x_{N-k}| < \varepsilon$, where ε is a small constant. If there are only a few of such n, then one can try again with a smaller value of K or a larger value of ε. In the second step, it is necessary to consider the corresponding elements $x_{n+l}$ for all the values of n found in the first step. Finally, taking a union of the ε-neighborhoods of all these elements, one can predict that $x_{N+l}$ will be in this union.
To understand when and why forecasting is possible and when it is not, it is reasonable to use characteristics such as dimension and entropy that can be computed directly from time series (Takens, 1991). If we want to make a catalog of essentially different segments of length k + 1 in $\{x_n\}_{n=0}^{N}$, this can be done with C(k, ε, N) elements. C(k, ε, N) is a function of N that has a limit $C(k, \varepsilon) = \lim_{N\to\infty} C(k, \varepsilon, N)$, and for prediction, we need $C(k, \varepsilon) \ll N$. The quantitative measure for the way in which C(k, ε) increases as ε goes to zero is
$$D = \lim_{k\to\infty}\lim_{\varepsilon\to 0} \frac{\ln C(k, \varepsilon)}{\ln(1/\varepsilon)}. \qquad (10)$$
If D is large, the prediction is problematic. The quantity D defined by (10) is the dimension of the time series. The quantitative measure for the way in which C(k, ε) increases with k is
$$H = \lim_{\varepsilon\to 0}\lim_{k\to\infty} \frac{\ln C(k, \varepsilon)}{k}. \qquad (11)$$
This is the entropy of the time series. For the time series generated by a differentiable dynamical system, both the dimension and entropy are finite, but for a random time series they are infinite. Suppose each $x_n$ is taken at random in the interval [0,1] (with respect to the uniform distribution) and for each $n_1, \ldots, n_k$ (different), the choices of $x_{n_1}, \ldots, x_{n_k}$ are independent. For such time series, one can find $C(k, \varepsilon) = (1 + [1/2\varepsilon])^{k+1}$, where [1/2ε] is the integer part of 1/(2ε). From this formula, it immediately follows that both dimension and entropy in such random time series are infinite.
Models of the Earth’s atmosphere are generally considered as chaotic dynamical systems. Due to the instability, even infinitesimally small uncertainties in the initial conditions grow exponentially fast and make a forecast useless after a finite time interval. This is known as the butterfly effect. However, in the tropics, there are certain regions where wind patterns and rainfall are so strongly determined by the temperature of the underlying sea surface that they do not show such sensitive dependence on the atmosphere. Therefore, it should be possible to predict large-scale tropical circulation and rainfall for as long as the ocean temperature can be predicted (Shukla, 1998).
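The two-step prediction procedure described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the test series is generated by the logistic map with α = 4 rather than measured, the matching criterion is read as a sum of absolute differences over the last K + 1 values, ε is doubled automatically when no analog is found, and the predicted set is reported as the interval hull of the ε-neighborhoods rather than their exact union. All function names are hypothetical.

```python
def logistic_series(n, alpha=4.0, x0=0.3):
    """Synthetic scalar time series x_0, ..., x_{n-1} from the logistic map."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(alpha * xs[-1] * (1.0 - xs[-1]))
    return xs


def find_analogs(xs, K, eps):
    """Step 1: past indices n < N whose last K+1 values resemble the present."""
    N = len(xs) - 1
    analogs = []
    for n in range(K, N):
        dist = sum(abs(xs[n - k] - xs[N - k]) for k in range(K + 1))
        if dist < eps:
            analogs.append(n)
    return analogs


def forecast(xs, lead=1, K=1, eps=0.01):
    """Step 2: bracket x_{N+lead} using the successors of the analogs."""
    N = len(xs) - 1
    while True:
        analogs = [n for n in find_analogs(xs, K, eps) if n + lead <= N]
        if analogs:
            break
        eps *= 2.0                    # too few analogs: enlarge eps and retry
    successors = [xs[n + lead] for n in analogs]
    return min(successors) - eps, max(successors) + eps, eps, len(analogs)


series = logistic_series(5001)
data, truth = series[:5000], series[5000]
low, high, eps_used, n_analogs = forecast(data)
print(n_analogs, eps_used, (low, high), truth)
```

Because this map stretches distances by at most a factor of four per step, the true next value lies within 4ε of every analog successor; for larger lead times the bracket widens roughly exponentially, which is the practical meaning of the finite forecasting horizon discussed above.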
History
The complex behavior of nonlinear oscillatory systems was observed long before dynamical chaos was understood. In fact, the possibility of complex behavior in dynamical systems was discovered by Henri Poincaré in the 1890s in his unsuccessful efforts to prove the regularity and stability of planetary orbits. Later on, experiments with an electrical circuit by van der Pol and van der Mark (1927) and the double-disk model experiments of the magnetic dynamo (Rikitake, 1958) also indicated the paradoxically complex behavior of a simple system. At that time, several mathematical tools were available to aid the description of the nontrivial behavior of dynamical systems in phase space, such as homoclinic Poincaré structures (homoclinic tangles). However, at the time, neither physicists nor mathematicians realized that deterministic systems may behave chaotically. It was only in the 1960s that the understanding of randomness was revolutionized as a result of discoveries in mathematics and in computer modeling (Lorenz, 1963) of real systems. An elementary model of chaotic dynamics was suggested by Boris Chirikov in 1959. During the last few decades, chaotic dynamics has moved from mystery to familiarity.
Standard map and homoclinic tangle: The standard map (Chirikov, 1979) is an area-preserving map
$$I_{n+1} = I_n + K\sin\theta_n, \qquad \theta_{n+1} = I_n + \theta_n + K\sin\theta_n, \qquad (12)$$
where θ is an angle variable (computed modulo 2π) and K is a positive constant. This map was proposed as a model for the motion of a charged particle in a magnetic field. For K larger than $K_{cr}$, map (12) demonstrates an irregular (chaotic) motion; see Figure 9. The complexity of the phase portrait of this map is related to the existence of homoclinic tangles formed by stable and unstable manifolds of a saddle point or saddle periodic orbits when the manifolds intersect transversally. The complexity of the manifold’s geometry stems from the fact that, if stable and unstable manifolds intersect once, then they must intersect an infinite number of times. Such a complex structure results in the generation of a horseshoe mapping, which persistently stretches and then folds the area around the manifolds generating a chaotic motion. The layers of the chaotic motion are clearly seen in Figure 9.
Figure 9. Examples of chaos in the standard map for two different values of K. The coexistence of the “chaotic sea” and “regular islands” that one can see in the panel on the right is typical for Hamiltonian systems with chaotic regimes (Lichtenberg & Lieberman, 1992).
Lorenz system: The first clear numerical manifestation of chaotic dynamics was obtained in the Lorenz model. This model is a three-dimensional dynamical system derived from a reasonable simplification of the fluid dynamics equations for thermal convection in a liquid layer heated from below. The differential equations
$$\dot{x} = \sigma(y - x), \qquad \dot{y} = rx - y - xz, \qquad \dot{z} = -bz + xy$$
are written for the amplitude of the first horizontal harmonic of the vertical velocity (x), the amplitude of the corresponding temperature fluctuation (y), and a uniform correction of the temperature field (z) (Lorenz, 1963). σ is the Prandtl number, r is the reduced Rayleigh number, and b is a geometric factor. The phase portrait of the Lorenz attractor, time series, and the return mapping generated on the Poincaré cross section computed for r = 28, σ = 10, and b = 8/3 are presented in Figure 10. A simple mechanical model illustrating the dynamical origin of oscillations in the Lorenz system is shown in Figure 11 (See Lorenz equations).
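The sensitivity of the Lorenz system to initial conditions can be quantified with a rough estimate of its largest Lyapunov exponent. The sketch below follows a reference trajectory and a slightly displaced companion, and rescales their separation at regular intervals; this two-trajectory renormalization is a commonly used numerical procedure swapped in here for illustration, not a method described in the entry. The parameters r = 28, σ = 10, b = 8/3 are the ones quoted for Figure 10; the time step, run length, initial state, and separation d0 are assumptions, and the function names are hypothetical.

```python
import math


def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, -b * z + x * y)


def rk4_step(state, dt):
    """One classical Runge-Kutta step for the three-component state."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * p + 2 * q + d)
                 for s, a, p, q, d in zip(state, k1, k2, k3, k4))


def largest_lyapunov(dt=0.01, transient_steps=2000, steps=30000,
                     d0=1e-8, renorm_every=10):
    """Average logarithmic growth rate of a small separation."""
    ref = (1.0, 1.0, 1.0)
    for _ in range(transient_steps):          # relax onto the attractor
        ref = rk4_step(ref, dt)
    pert = (ref[0] + d0, ref[1], ref[2])
    log_sum = 0.0
    for i in range(1, steps + 1):
        ref = rk4_step(ref, dt)
        pert = rk4_step(pert, dt)
        if i % renorm_every == 0:
            diff = tuple(p - q for p, q in zip(pert, ref))
            d = math.sqrt(sum(c * c for c in diff))
            log_sum += math.log(d / d0)
            # rescale the separation back to d0 along its current direction
            pert = tuple(q + d0 * c / d for q, c in zip(ref, diff))
    return log_sum / (steps * dt)


estimate = largest_lyapunov()
print(estimate)
```

A positive estimate means that an initial error grows by a factor of e roughly every 1/λ time units, so the useful prediction time increases only logarithmically as the initial error is reduced.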
Definition of Chaos
As was shown above, dynamical chaos is related to unpredictability. For quantitative measurement of the unpredictability, it is reasonable to use the familiar characteristics dimension and entropy. These characteristics are independent: it is possible to generate a time series that has a high dimension and at the same time entropy equal to zero. This is a quasiperiodic motion. It is also simple to imagine a low-dimensional dynamical system with high entropy (see, e.g., the map in Figure 3). Various definitions of chaos exist, but the common feature of these definitions is the sensitive dependence on initial conditions that was formalized above as positive entropy. Thus, dynamical chaos is the behavior of a dynamical system that is characterized by finite positive entropy.
Chaotic Attractors and Strange Attractors
A region in the phase space of a dissipative system that attracts all neighboring trajectories is called an attractor. An attractor is the phase space image of the behavior established in the dissipative system, for example, a stable limit cycle is the image of periodic oscillations. Therefore, the image of chaotic oscillations is a chaotic attractor. A chaotic attractor (CA) possesses the following two properties that define any attractor of the dynamical system:
• There exists a bounded open region U containing a chaotic attractor (CA ⊂ U) in the phase space such that all points from this neighborhood converge to a chaotic attractor when time goes to infinity.
• A chaotic attractor is invariant under the evolution of the system.
In addition, the motion on a chaotic attractor has to be chaotic, for example:
• each trajectory of a chaotic attractor has at least one positive Lyapunov exponent.
Such types of attractors represent some regimes of chaotic oscillations generated by a Lorenz system and the piece-wise linear maps. However, most of the chaotic oscillations observed in dynamical systems correspond to attractors that do not precisely satisfy the latter property. Although almost all trajectories in such attractors are unstable, some stable periodic orbits may exist within the complex structure of unstable trajectories. Chaos in such systems is persistent both in physical experiments and in numerical simulations because all of these stable orbits have extremely narrow basins of attraction. Due to natural small perturbations of the system, the trajectory of the system never settles down on one of the stable orbits and wanders within the complex set of unstable orbits.
Figure 10. Lorenz attractor (left) and the return map $z_{n+1} = F(z_n)$ plotted for maximum values of variable z for the attractor trajectory (right).
Figure 11. A toy model invented by Willem Malkus and Lou Howard illustrates dynamical mechanisms analogous to oscillations and chaos in the Lorenz system. Water steadily flowing into the top (leaky) bucket makes it heavy enough to start the wheel turning. When the flow is large enough, the wheel can start generating chaotic rotations characterized by unpredictable switching of the rotation direction; see Strogatz (1994, p. 302) for details.
The definition of a strange attractor is related to the complicated geometrical structure of an attractor. A strange attractor is defined as an attractor that cannot be presented by a union of a finite number of smooth manifolds. For example, an attractor whose topology can be locally represented by the direct product of a Cantor set and a manifold is a strange attractor. In many cases, the geometry of a chaotic attractor satisfies the definition of a strange attractor. At the same time, the definition of a strange attractor can be satisfied in the case of a nonchaotic strange attractor. This is an attractor that has fractal structure, but does not have positive Lyapunov exponents.
The origin of chaotic dynamics in dissipative systems and Hamiltonian systems in many cases is the same and is related to coexistence in the phase space of infinitely many unstable periodic trajectories as a part of homoclinic or heteroclinic tangles. The Lorenz attractor, like many other attractors in systems with a small number of degrees of freedom, can appear through a finite number of easily observable bifurcations. The bifurcation of a sudden birth and death of a strange attractor is called a crisis. Usually, it is related to the collision of the attractor with an unstable periodic orbit or its stable manifold (Arnol’d et al., 1993; Ott, 1993).
Order in Chaos
How does the dynamical origin leave its imprint on chaos? In other words, how can the rules or order of the dynamical system be found inside chaotic behavior? Consider the images (portraits) of the dynamical chaos shown in Figures 7, 8, and 10. The elegance of these images reflects the existence of order in dynamical chaos. The dynamical origin of such elegance is very similar: different trajectories with close initial conditions have to remain close for a time $t_l \approx 1/\lambda$, where λ is the largest positive Lyapunov exponent. The domain occupied by the strange attractor in phase space is finite; thus, the divergence of the phase space flow changes to convergence, and as a result of sequential action of divergence and convergence of the phase flow in the finite domain, the mixing of trajectories occurs. Such mixing can be illustrated with the motions of liquids in the physical space experimentally observed by Ottino (1989; see Figure 1).
Another way to recognize the existence of order in chaos is to analyze its dependence on a control parameter. The macroscopic features of real stochastic processes, for example, Brownian motion or developed turbulence, depend on this parameter and change without any revolutionary events such as bifurcations. But for dynamical chaos, the picture is different. A continuous increase of control parameters of the logistic map does not necessarily gradually increase the degree of chaos: within chaos, there are windows, that is, intervals of control parameter values in which the chaotic behavior of the system changes to stable periodic behavior; see Figure 4. In a spatially extended system, for example, in convection or Faraday flow, order within chaos is related to the existence of coherent structures inside the chaotic sea (Rabinovich et al., 2001); see Figure 12.
Figure 12. Appearance of spatiotemporal chaos in the extended Faraday experiment: chaotic patterns on the surface of the liquid layer in the oscillating gravitational field. The irregular chain of the localized structures (dark solitons) can be seen beneath a background of the square capillary lattice (Gaponov-Grekhov & Rabinovich, 1988).
Figure 13. Dynamics of chaotic bursts of spikes generated by two living neurons coupled with an electrical synapse, a gap junction (Elson et al., 1998). Chaotic bursts in naturally coupled neurons synchronize (a). When natural coupling is compensated by additional artificial coupling $g_a$, the chaotic oscillations are independent oscillations (b). The neurons coupled with negative conductivity fire in the regimes of antiphase synchronization (c). Panels show the voltages V1, V2 [mV] versus time [sec] for (a) $g_a$ = 0 nS, (b) $g_a$ = −200 nS, and (c) $g_a$ = −275 nS.
Spatiotemporal Chaos
Similar to regular (e.g., periodic) motions, low-dimensional chaotic behavior is observed not only in simple (e.g., low-dimensional) systems but also in systems with many, and even with an infinite number of, degrees of freedom. The dynamical mechanisms behind the formation of low-dimensional chaotic spatiotemporal patterns in dissipative and nondissipative systems are different. In conservative systems, such patterns are related to the chaotic motion of particle-like localized structures. For example, a soliton that is described by a nonlinear Schrödinger equation with the harmonic potential
$$i\frac{\partial a}{\partial t} + \beta\frac{\partial^2 a}{\partial x^2} + \left(|a|^2 + \alpha\sin qx\right)a = 0 \qquad (13)$$
moves chaotically in physical space x and reminds us of the chaotic motion occurring in the phase space of a parametrically excited conservative oscillator (the equations of such an oscillator can be derived from (13) for slow variables characterizing the motion of the soliton center mass). The interaction of the localized structures (particles) in a finite area, large in comparison with the size of the structure, can also lead to the appearance of spatiotemporal chaos. It was observed that collisions of solitons moving in two-dimensional space result in chaotic scattering similar to the chaotic motion observed in billiards (Gorshkov et al., 1992). In dissipative nonlinear media and high-dimensional discrete systems, the role of coherent structures is also very important (such as defects in convection, clusters of excitations in neural networks, and vortices in the wake behind a cylinder; see Rabinovich et al., 2001). However, the origin of low-dimensional chaotic motions in such systems is determined by dissipation. There are two important mechanisms of finite dynamics (including chaos) that are due to dissipation: (1) the truncation of the number of excited modes (in hydrodynamic flows) due to high viscosity of the small-scale perturbations and (2) the synchronization of the modes or individual oscillators. Dissipation makes synchronization possible not only among periodic modes or oscillators but even in the case when the interacting subsystems are chaotic (Afraimovich et al., 1986). Figure 13 illustrates the synchronization of chaotic bursts of spikes observed experimentally in two coupled living neurons. In the case of a dissipative lattice of chaotic elements (e.g., neural lattices or models of an extended autocatalytic chemical reaction), complete synchronization leads to the onset
Figure 14. Coherent patterns generated in the chaotic medium with Rössler-type dynamics of medium elements. Left: an example of coherent patterns with defects. Right: evolution of the attractor with increasing distance r from a defect. The attractor changed from the limit cycle of period T at r = r1 to the period 2T limit cycle at r = r2 > r1 , then to the period 4T limit cycle at r = r3 > r2 , and finally to the chaotic attractor for r = r4 > r3 (Goryachev & Kapral, 1996).
of a spatially homogeneous chaotic state. When this state becomes unstable against spatial perturbations, the system moves to the spatiotemporal chaotic state. A snapshot of such spatiotemporal chaos, which is observed in the model of chaotic media consisting of diffusively coupled Rössler-type chaotic oscillators, is presented in Figure 14. Figure 14 also illustrates the sequence of period-doubling bifurcations that are observed in the neighborhood of the defect in such a medium.
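The synchronization of chaotic subsystems through dissipative coupling, mentioned above, can be illustrated with a toy model. The following is a minimal sketch and not a model of the neuron experiment in Figure 13: it assumes two identical logistic maps in their chaotic regime with a symmetric coupling of strength eps. The choice of map, the parameter values, and all function names are hypothetical and introduced only for illustration.

```python
def logistic(x, r=4.0):
    """Logistic map; r = 4 is the fully chaotic case (assumed for illustration)."""
    return r * x * (1.0 - x)


def coupled_step(x, y, eps):
    """One step of two logistic maps with symmetric (diffusive) coupling eps."""
    fx, fy = logistic(x), logistic(y)
    return (1.0 - eps) * fx + eps * fy, (1.0 - eps) * fy + eps * fx


def final_mismatch(eps, x0=0.3, y0=0.6, n_steps=1000):
    """Iterate the coupled pair and return |x - y| after n_steps."""
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = coupled_step(x, y, eps)
    return abs(x - y)


if __name__ == "__main__":
    for eps in (0.05, 0.4):
        print(f"eps = {eps}: |x - y| after 1000 steps = {final_mismatch(eps):.3e}")
```

For eps = 0.4 the mismatch obeys |x' - y'| = |1 - 2 eps| |f(x) - f(y)| <= 0.8 |x - y| on the unit interval, so the two trajectories converge to a common orbit that is still chaotic. For weak coupling (for example eps = 0.05) this contraction argument fails and the synchronized state is generally unstable.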
Edge of Chaos
In dynamical systems with many elements and interconnections (e.g., complex systems), the transition between ordered dynamics and chaos is similar to phase transitions between states of matter (crystal, liquid, gas, etc.). Based on this analogy, an attractive hypothesis named “edge of chaos” (EOC) appeared at the end of the 1980s. The EOC hypothesis suggests a fundamental equivalence between the dynamics of phase transitions and the dynamics of information processing (computation). One of the simplest frameworks in which to formulate relations between complex system dynamics and computation at the EOC is a cellular automaton. There is currently some controversy over the validity of this idea (Langton, 1990; Mitchell et al., 1993).
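As a concrete illustration of the cellular-automaton framework mentioned above, the sketch below implements a one-dimensional, two-state, nearest-neighbor automaton on a ring, together with a simple version of Langton's lambda parameter (the fraction of rule-table entries that map to the non-quiescent state). Restricting to this elementary class, taking 0 as the quiescent state, and the function names `ca_step` and `langton_lambda` are assumptions made for illustration; Langton's own study used larger rule spaces, where lambda is a more informative parameter.

```python
def ca_step(cells, rule):
    """Advance a ring of 0/1 cells one step under an elementary rule (0-255)."""
    n = len(cells)
    out = []
    for i in range(n):
        # Read (left, center, right) as a 3-bit index into the rule table.
        idx = (cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        out.append((rule >> idx) & 1)
    return out


def langton_lambda(rule):
    """Fraction of the 8 neighborhoods mapped to the non-quiescent state 1."""
    return bin(rule).count("1") / 8.0


if __name__ == "__main__":
    cells = [0] * 15
    cells[7] = 1
    for _ in range(8):
        print("".join("#" if c else "." for c in cells))
        cells = ca_step(cells, 90)
    print("lambda(rule 90) =", langton_lambda(90))
```

Rules with lambda near 0 tend to die out quickly, while intermediate values admit both ordered and disordered behavior; how sharply this separates the two regimes is part of the controversy cited above.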
Chaos and Turbulence
The discovery of dynamical chaos has fundamentally changed the accepted concept of the origin of hydrodynamic turbulence. When dealing with turbulence at finite Reynolds number, the main point of interest is the established irregular motion. The image of such irregularity in the phase space could be a chaotic attractor. Experiments in closed systems (for example, ones in which fluid particles continuously recirculate through points previously visited) have shown the most common scenarios for the transition to chaos. These are (i) transition through the destruction of quasiperiodic motion, observed in Taylor–Couette flow (Gollub & Swinney, 1975); (ii) a period-doubling sequence, observed in Rayleigh–Bénard convection (Libchaber & Maurer, 1980); and (iii) transition through intermittency (Gollub & Benson, 1980). Observation of these canonical scenarios for particular flows proved the validity of the concept of a dynamical origin of the transition to turbulence in closed systems. It is possible to reconstruct a chaotic set in the phase space of the flow directly from observed data; see Brandstäter et al. (1982). At present it is difficult to say how dynamical chaos theory can be useful for the understanding and description of developed turbulence.
The discovery and understanding of chaotic dynamics have important applications in all branches of science and engineering and, in general, to our evolving culture. An understanding of the origins of chaos in the last decades has produced many clear and useful models for the description of systems with complex behavior, such as the global economy (Barkly Russel, 2000), the human immune system (Gupta et al., 1998), animal behavior (Varona et al., 2002), and more. Thus, chaos theory provides a new tool for the unification of the sciences.
M.I. RABINOVICH AND N.F. RULKOV
See also Attractors; Billiards; Butterfly effect; Chaos vs. turbulence; Controlling chaos; Dripping faucet; Duffing equation; Entropy; Fractals; Hénon map; Horseshoes and hyperbolicity in dynamical systems; Intermittency; Kicked rotor; Lorenz equations; Lyapunov exponents; Maps; Maps in the complex plane; Markov partitions; Multifractal analysis; One-dimensional maps; Order from chaos; Period doubling; Phase space; Quasiperiodicity; Rössler systems; Routes to chaos; Sinai–Ruelle–Bowen measures; Spatiotemporal chaos; Synchronization; Time series analysis
Further Reading
Abarbanel, H.D.I. 1996. Analysis of Chaotic Time Series, New York: Springer
Afraimovich, V.S., Verichev, N.N. & Rabinovich, M.I. 1986. Stochastic synchronization of oscillations in dissipative systems. Izvestiya Vysshikh Uchebnykh Zavedenii Radiofizika. RPQAEC, 29: 795–803
Arnol’d, V.I., Afraimovich, V.S., Ilyashenko, Yu.S. & Shilnikov, L.P. 1993. Bifurcation theory and catastrophe theory. In Dynamical Systems, vol. 5, Berlin and New York: Springer
Barkly Russel, J., Jr. 2000. From Catastrophe to Chaos: A General Theory of Economic Discontinuities, 2nd edition, Boston: Kluwer
Brandstäter, A., Swift, J., Swinney, H.L., Wolf, A., Doyne Farmer, J., Jen, E. & Crutchfield, P.J. 1982. Low-dimensional chaos in a hydrodynamic system. Physical Review Letters, 51: 1442–1445
Chirikov, V.A. 1979. A universal instability of many-dimensional oscillator systems. Physics Reports, 52: 264–379
Deco, G. & Schürmann, B. 2000. Information Dynamics: Foundations and Applications, Berlin and New York: Springer
Elson, R.C., Selverston, A.I., Huerta, R., Rulkov, N.F., Rabinovich, M.I. & Abarbanel, H.D.I. 1998. Synchronous behavior of two coupled biological neurons. Physical Review Letters, 81: 5692–5695
Gaponov-Grekhov, A.V. & Rabinovich, M.I. 1988. Nonlinearity in Action: Oscillations, Chaos, Order, Fractals, Berlin and New York: Springer
Gollub, J.P. & Benson, S.V. 1980. Many routes to turbulent convection. Journal of Fluid Mechanics, 100: 449–470
Gollub, J.P. & Swinney, H.L. 1975. Onset of turbulence in rotating fluid. Physical Review Letters, 35: 927–930
Gorshkov, K.A., Lomov, A.S. & Rabinovich, M.I. 1992. Chaotic scattering of two-dimensional solitons. Nonlinearity, 5: 1343–1353
Goryachev, A. & Kapral, R. 1996. Spiral waves in chaotic systems. Physical Review Letters, 76: 1619–1622
Gupta, S., Ferguson, N. & Anderson, R. 1998. Chaos, persistence, and evolution of strain structure in antigenically diverse infectious agent. Science, 280: 912–915
Langton, C.C. 1990. Computation at the edge of chaos: phase transitions and emergent computation. Physica D, 42: 12–37
Libchaber, A. & Maurer, J. 1980. Une expérience de Rayleigh–Bénard en géométrie réduite; multiplication, accrochage et démultiplication de fréquences. Journal de Physique Colloques, 41: 51–56
Lichtenberg, A.J. & Lieberman, M.A. 1992. Regular and Chaotic Dynamics, Berlin and New York: Springer
Lorenz, E.N. 1963. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20: 130–136
Mitchell, M., Hraber, P. & Crutchfield, J. 1993. Revisiting the edge of chaos: evolving cellular automata to perform computations. Complex Systems, 7: 89–130
Murray, N. & Holman, M. 1999. The origin of chaos in the outer solar system. Science, 283: 1877–1881
Ott, E. 1993. Chaos in Dynamical Systems, Cambridge and New York: Cambridge University Press
Ottino, J.M. 1989. The Kinetics of Mixing: Stretching, Chaos, and Transport, Cambridge and New York: Cambridge University Press
Pikovsky, A.S. & Rabinovich, M.I. 1978. A simple generator with chaotic behavior. Soviet Physics Doklady, 23: 183–185 (see also Rabinovich, M.I. 1978. Stochastic self-oscillations and turbulence. Soviet Physics Uspekhi, 21: 443–469)
Rabinovich, M.I., Ezersky, A.B. & Weidman, P.D. 2001. The Dynamics of Patterns, Singapore: World Scientific
Rikitake, T. 1958. Oscillations of a system of disk dynamos. Proceedings of the Cambridge Philosophical Society, 54: 89–105
Roux, J.C., Simoyi, R.H. & Swinney, H.L. 1983. Observation of a strange attractor. Physica D, 8: 257–266
Shukla, J. 1998. Predictability in the midst of chaos: a scientific basis for climate forecasting. Science, 282: 728–731
Sinai, Ya.G. 2000. Dynamical Systems, Ergodic Theory and Applications, Berlin and New York: Springer
Strogatz, S.H. 1994. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering, Reading, MA: Addison-Wesley
Takens, F. 1991. Chaos. In Structures in Dynamics: Finite Dimensional Deterministic Studies, edited by H.W. Broer, F. Dumortier, S.J. van Strien & F. Takens, Amsterdam: North-Holland Elsevier Science
Ueda, Y. 1992. The Road to Chaos, Santa Cruz, CA: Aerial Press
van der Pol, B. & van der Mark, B. 1927. Frequency demultiplication. Nature, 120: 363–364
Varona, P., Rabinovich, M.I., Selverston, A.I. & Arshavsky, Yu.I. 2002. Winnerless competition between sensory neurons generates chaos: A possible mechanism for molluscan hunting behavior. Chaos, 12: 672–677
Young, L.S. 1998. Developments in chaotic dynamics. Notices of the AMS, 17: 483–504
CHARACTERISTICS
Systems of first-order partial differential equations describe many physical phenomena, including the behavior of fluids, gases, and plasmas. To introduce the method of characteristics, consider the simple scalar conservation law of the form

$$\frac{\partial U}{\partial t} + A(U)\,\frac{\partial U}{\partial x} = 0. \tag{1}$$

Here, U = U(x, t), where x is a spatial coordinate and t is the time coordinate. The function A(U) defines the speed of propagation of a disturbance. It may be independent of U, in which case equation (1) is a linear partial differential equation, or it may depend explicitly on the dependent variable U, in which case the equation is a nonlinear partial differential equation. It is important to specify the initial or boundary conditions that the solution U(x, t) must satisfy. Consider the simple case where the function is known at the initial time t = 0. Thus,

$$U(x, 0) = F(x). \tag{2}$$

The idea is to simplify Equation (1) by choosing a suitable curve in the x-t plane. This curve can be written in parametric form as

$$x = x(s), \qquad t = t(s), \tag{3}$$

where s is the parameter that can be thought of as measuring the distance along the curve. To understand how to select the particular form of the curve, note that, using (3),

$$U(x, t) = U(x(s), t(s)), \tag{4}$$

implies that U is a function of s. Hence, the chain rule gives the derivative of U along the curve as

$$\frac{dU}{ds} = \frac{\partial U}{\partial t}\frac{dt}{ds} + \frac{\partial U}{\partial x}\frac{dx}{ds}. \tag{5}$$

Comparing the right-hand side of (5) with the left-hand side of (1), it is clear that they are identical, provided the parametric form of the curve is chosen as

$$\frac{dt}{ds} = 1, \tag{6}$$

$$\frac{dx}{ds} = A(U), \tag{7}$$

and (1) reduces to

$$\frac{dU}{ds} = 0, \quad \Rightarrow \quad U = \text{constant along the curve}. \tag{8}$$

The curve satisfying (6) and (7) is called the characteristic curve. Along this curve, U is constant. However, the value of the constant may be different on each characteristic curve. The solution of the characteristic equations requires some initial conditions for x and t. These are taken as

$$x = x_0, \qquad t = 0, \qquad \text{at } s = 0. \tag{9}$$

Note that x0 covers the same domain as x. Solving (6)–(8) yields

$$t = s, \qquad x = A(U)\,s + x_0, \tag{10}$$

on using the initial conditions (9). x0 can be thought of as a constant of integration and so it has a fixed value along the characteristic curve. This implies, using (2), that

$$U(x, 0) = F(x_0). \tag{11}$$

Note that the particular characteristic curve is determined by the value of the parameter x0. Hence, eliminating the parameter s and solving for x0 in terms of x, U, and t, the solution given by (11) is

$$U(x, t) = F(x_0) = F(x - A(U)\,t). \tag{12}$$

If A(U) is a constant, say c, then the solution is simply

$$U(x, t) = F(x - ct), \tag{13}$$

but if A(U) depends explicitly on U, then the solution is an implicit solution. The characteristic curves in this example are straight lines in the x-t plane with a gradient given by A(U). When A(U) = c is a constant, the characteristic curves are parallel straight lines. This means that the shape of the initial disturbance propagates unchanged, to the right if c is a positive constant. If A(U) depends on U, then the characteristic curves are straight lines but with different gradients. There exists the possibility that the characteristic curves may cross, and this corresponds to the formation of a shock. When the characteristic curves diverge, the solution exhibits an expansion fan that can be expressed in terms of a similarity variable. Note that the method of characteristics can be used when A = A(U, x, t) depends explicitly on the space and time coordinates. In this case, the coupled equations, (6)–(8), may be solved numerically. A detailed description of the method of characteristics for general first-order partial differential equations is given in Rubenstein & Rubenstein (1993).

Example: Burgers Equation
Consider the case when A(U) = U. Then, (1) becomes the inviscid Burgers equation

$$\frac{\partial U}{\partial t} + U\,\frac{\partial U}{\partial x} = 0, \tag{14}$$

and the solution satisfying the initial condition

$$U(x, 0) = \begin{cases} 0, & x < -1, \\ 1 + x, & -1 \le x < 0, \\ 1 - x, & 0 \le x \le 1, \\ 0, & x > 1 \end{cases} \tag{15}$$

is

$$U = \begin{cases} 0, & x < -1 + Ut, \\ 1 + x - Ut, & -1 + Ut \le x < Ut, \\ 1 - x + Ut, & Ut \le x \le 1 + Ut, \\ 0, & x > 1 + Ut. \end{cases} \tag{16}$$

Thus, solving for U in each region gives the solution as

$$U = \begin{cases} 0, & x < -1, \\ (1 + x)/(1 + t), & -1 \le x < t, \\ (1 - x)/(1 - t), & t \le x \le 1, \\ 0, & x > 1. \end{cases} \tag{17}$$

Note that the solution becomes multi-valued for t > 1. This can be understood by considering the characteristic curves defined by (10). In the x-t plane, they are straight lines of the form

$$x = Ut + x_0, \tag{18}$$

so that the gradient depends on the value of U at t = 0 and x = x0. Thus, using the initial conditions, the characteristic curves, valid in the region t ≤ x ≤ 1, can be expressed as

$$x = (1 - x_0)\,t + x_0, \tag{19}$$

for 0 ≤ x0 ≤ 1. Considering two different values of x0, say xa and xb, the straight lines cross when

$$x = (1 - x_a)\,t + x_a = (1 - x_b)\,t + x_b, \quad \Rightarrow \quad t = 1 \ \text{and} \ x = 1. \tag{20}$$

Hyperbolic Systems of Several Dependent Variables
Systems of first-order hyperbolic equations can be expressed in vector and matrix form as

$$\frac{\partial \mathbf{U}}{\partial t} + A(\mathbf{U}, x, t)\,\frac{\partial \mathbf{U}}{\partial x} = 0, \tag{21}$$

where U is a column vector of n elements containing the dependent variables and A is an n × n matrix whose coefficients may depend on the dependent variables. The problem is linear if the matrix A has elements independent of U and nonlinear otherwise. The characteristic curves in this case are given by the equations

$$\frac{dt}{ds} = 1, \tag{22}$$

$$\frac{dx}{ds} = \lambda_i(\mathbf{U}, x, t) \tag{23}$$

for i = 1, 2, ..., n, where λi is an eigenvalue of the matrix A. Here, it is assumed that the matrix A has n distinct eigenvalues. For the linear problem, and in particular, for the case where the matrix A has constant coefficients, the full solution can be obtained by using a suitable linear combination of dependent variables so that the equations reduce to a set of simple advection equations. Hence, the first step is to determine the eigenvalues, λi, of the matrix A and the corresponding eigenvectors zi, where

$$A\,\mathbf{z}_i = \lambda_i\,\mathbf{z}_i. \tag{24}$$

Next, use the change of variable

$$\mathbf{U} = Q\,\mathbf{V}, \tag{25}$$

where Q is an n × n matrix whose jth column is the jth eigenvector zj. Substituting into (21) yields

$$Q\,\frac{\partial \mathbf{V}}{\partial t} + AQ\,\frac{\partial \mathbf{V}}{\partial x} = 0. \tag{26}$$

Finally, pre-multiplying by Q^{-1}, the inverse of Q, results in a decoupled system of equations, since Q^{-1}AQ is a diagonal matrix whose elements are the eigenvalues λi. Thus, the final set of n equations is

$$\frac{\partial V_i}{\partial t} + \lambda_i\,\frac{\partial V_i}{\partial x} = 0, \tag{27}$$

for i = 1, 2, ..., n. The solutions to (27) are simply

$$V_i = F_i(x - \lambda_i t), \tag{28}$$

where Fi is an arbitrary function determined by the initial conditions. Once all the solutions for Vi are determined from the initial conditions, the solution in terms of the original variables is obtained using (25). Note that while the original variables may depend on all the characteristic variables, the Vi solution is constant along the ith characteristic curve.

Example: The Second-Order Wave Equation
The second-order wave equation

$$\frac{\partial^2 U}{\partial t^2} = c^2\,\frac{\partial^2 U}{\partial x^2} \tag{29}$$

can be expressed as a pair of first-order equations as

$$\frac{\partial U}{\partial t} = -c\,\frac{\partial p}{\partial x}, \qquad \frac{\partial p}{\partial t} = -c\,\frac{\partial U}{\partial x}.$$

Thus,

$$A = \begin{pmatrix} 0 & c \\ c & 0 \end{pmatrix}. \tag{30}$$

The eigenvalues are simply λ1 = −c and λ2 = c, and the corresponding eigenvectors are

$$\mathbf{z}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \qquad \mathbf{z}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{31}$$

Thus,

$$Q = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \qquad Q^{-1} = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. \tag{32}$$

Equation (27) reduces to the pair of equations

$$\frac{\partial V_1}{\partial t} - c\,\frac{\partial V_1}{\partial x} = 0, \qquad \frac{\partial V_2}{\partial t} + c\,\frac{\partial V_2}{\partial x} = 0. \tag{33}$$

The solutions are V1 = F1(x + ct) and V2 = F2(x − ct) and, in terms of the original variables, the solution is

$$U = F_1(x + ct) + F_2(x - ct), \qquad p = -F_1(x + ct) + F_2(x - ct). \tag{34}$$
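The wave-equation example can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original entry: it builds U and p from initial data by forming the characteristic variables V1 = (U − p)/2 and V2 = (U + p)/2 (the rows of Q^{-1} applied to the initial data), transporting them along x + ct and x − ct, and recombining them as in (34). The function name `solve_wave` and the Gaussian initial profile are hypothetical choices.

```python
import math


def gaussian(x):
    """Illustrative initial profile for U (an assumption, not from the text)."""
    return math.exp(-x * x)


def zero(x):
    """Illustrative initial profile for p."""
    return 0.0


def solve_wave(u0, p0, c):
    """Return U(x, t) and p(x, t) for dU/dt = -c dp/dx, dp/dt = -c dU/dx."""
    def f1(x):
        # V1 at t = 0; constant along x + c t = const.
        return 0.5 * (u0(x) - p0(x))

    def f2(x):
        # V2 at t = 0; constant along x - c t = const.
        return 0.5 * (u0(x) + p0(x))

    def u(x, t):
        return f1(x + c * t) + f2(x - c * t)

    def p(x, t):
        return -f1(x + c * t) + f2(x - c * t)

    return u, p


if __name__ == "__main__":
    u, p = solve_wave(gaussian, zero, c=2.0)
    for x in (-1.0, 0.0, 1.0):
        print(f"U({x}, 0.5) = {u(x, 0.5):.6f}   p({x}, 0.5) = {p(x, 0.5):.6f}")
```

With p initially zero, the initial hump splits into two half-amplitude copies moving left and right with speed c, which is the familiar d'Alembert picture.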
Riemann Invariants
A Riemann invariant may be thought of as a function of the dependent variables that is constant along a characteristic curve. In the previous example, it is clear that U + p = 2F2(x − ct), so U + p is a Riemann invariant along the characteristic curve defined by x − ct = constant. Similarly, U − p is constant along x + ct = constant.

Example: Isentropic Flow
The dimensionless equations for isentropic fluid flow with p/ρ^γ = 1 can be expressed in terms of the fluid velocity, u, and the sound speed c = (γp/ρ)^{1/2} as

$$\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} + \frac{2}{\gamma - 1}\,c\,\frac{\partial c}{\partial x} = 0, \tag{35}$$

$$\frac{\partial c}{\partial t} + \frac{\gamma - 1}{2}\,c\,\frac{\partial u}{\partial x} + u\,\frac{\partial c}{\partial x} = 0. \tag{36}$$

A detailed derivation of these equations is given in Kevorkian (1989). The matrix A is

$$A = \begin{pmatrix} u & \dfrac{2}{\gamma - 1}\,c \\[2mm] \dfrac{\gamma - 1}{2}\,c & u \end{pmatrix} \tag{37}$$

having eigenvalues λ1 = u + c and λ2 = u − c. There are two characteristic curves given by the solution of the coupled differential equations

$$\frac{dt}{ds} = 1, \qquad \frac{dx}{ds} = \lambda_i,$$

where i = 1 or i = 2. For i = 1, the initial conditions are t = 0 and x = ξ at s = 0, which implies that the characteristic curve is defined by

$$\xi = x - \int_0^t (u + c)\,ds = \text{constant}.$$

For i = 2, with t = 0 and x = η at s = 0, the second curve is defined by

$$\eta = x - \int_0^t (u - c)\,ds = \text{constant}.$$

Multiplying (35) by (γ − 1)/2 and then adding and subtracting (36) gives two equations

$$\frac{\partial R}{\partial t} + (u + c)\,\frac{\partial R}{\partial x} = 0, \qquad \frac{\partial S}{\partial t} + (u - c)\,\frac{\partial S}{\partial x} = 0,$$

where R = c + (γ − 1)u/2 and S = c − (γ − 1)u/2. R and S are Riemann invariants since R is constant along the characteristic curve defined by ξ = constant and S is constant along η = constant. A more detailed derivation of Riemann invariants is described in Kevorkian (1989).
ALAN HOOD
See also Burgers equation; Coupled systems of partial differential equations; Hodograph transform; Shock waves
Further Reading
Kevorkian, J. 1989. Partial Differential Equations: Analytical Solution Techniques, New York: Chapman & Hall
Rubenstein, I. & Rubenstein, L. 1993. Partial Differential Equations in Classical Mathematical Physics, Cambridge and New York: Cambridge University Press
CHARGE DENSITY WAVES
A charge density wave (CDW) is a collective transport phenomenon, whose origin lies in the interaction between electrons and phonons in a solid (Grüner & Zettl, 1985; Grüner, 1988). As envisioned by Rudolf Peierls in 1930, in some quasi-one-dimensional metals (where the influence of each electron on the others is much stronger than in higher dimensions), the elastic energy needed to displace the atoms may be balanced by a lowering of the conduction electron energy. In such cases, the more stable configuration may have a periodic distortion of the lattice; thus, there is a modulation of the electronic charge density, which gives rise to a CDW. The wave vector turns out to be Q = 2kF, where kF is the Fermi wave vector, and the modulation of the electronic density becomes δρ = ρ0 cos(2kF x + φ). Due to this periodic lattice distortion, a gap at the Fermi level appears, and the conduction electrons lower their kinetic energy. At high temperatures, thermal excitation of electrons across the band gap makes the normal metallic state stable. When the temperature is sufficiently low, a second-order phase transition (known as the Peierls transition) takes place, and a CDW is formed.
In 1954, Herbert Fröhlich suggested that if Q was not commensurate with the lattice constant, the CDW energy would be independent of the phase φ, and thus, an electrical current would appear under any electric field, independent of its intensity. For a while, this phenomenon was speculated to be a possible origin of superconductivity. Interestingly, the interplay and relationship among CDWs, superconductivity, and spin density waves is still a field of study (Gabovich et al., 2002). If the translational invariance of φ is disrupted, there is a phase for which the CDW energy is the lowest, and there is also a minimum threshold field needed to overcome this energy reduction and to initiate the conduction. A possible cause of the invariance break could be that the CDW is commensurate with the lattice. Although this case is unusual and mostly of theoretical interest, such a CDW (with a period quasi-multiple of the lattice constant) may contain solitons in the form of constant phase zones, separated by areas of abrupt change. This soliton behavior is modeled by sine-Gordon-like equations. However, empirical evidence suggests that the pinning of the CDW to the lattice and the appearance of a threshold field stem from impurities.
Experimental evidence of CDW behavior became available in the 1970s, and nowadays several materials are known to show CDW behavior, both inorganic, like NbSe3, NbS3, or K0.3MoO3 (“blue bronze”), and organic, like (fluoranthene)2PF6. Evidence of this kind of transition is detected through quantities affected by the gap at the Fermi level, including magnetic susceptibility, resistivity, thermoelectric power, scattering experiments where the CDW wave vector manifests itself, and more recently, by means of scanning tunneling microscope images.
Conductivity is among the more interesting properties of CDWs. The dielectric constants for these materials are high, and the conductivity changes abruptly, by orders of magnitude, from insulating to metallic values. The Hall and thermoelectric effects suggest that their conductivity consists of an ohmic linear term and a CDW nonlinear term. The response of the CDW to a field higher than the threshold value is twofold. First, there appears a high-frequency coherent current, or narrow band noise, which seems to be due to the displacement of the CDW over the pinning potential. Second, a low-frequency incoherent response, or broad band noise, is also detected. It is also found that the conductivity saturates for high values of the external field, and it seems that this is due to electrons leaving the CDW region or to the elimination of 2kF phonons.
When an a.c. field is present, the CDW exhibits a strong dependence on the field frequency, and its conductivity, σac, also saturates for high frequencies. There also appear an induced conductivity, σdc, which increases when Vac increases, and some interference phenomena between the narrow band noise and the a.c. field. The external field Eac cos ωe t causes oscillations of the current at frequency ωe, and if there is also a d.c. field, Edc, there are oscillations at frequency ωi corresponding to the narrow band noise. These two frequencies may interact to produce mode-locking phenomena when they are commensurate (nωi = mωe). In CDW systems, this locking shows up in the step structure of the differential resistance as a function of the d.c. field. As the external d.c. field changes, the nonlinearity of the system keeps the ratio of the two frequencies, ωi/ωe, constant over a finite interval of the external parameter, corresponding to the intervals where dV/dI is constant. When the external parameter moves far from the locking region, the system undergoes a transition to an unlocked state, which is quasi-periodic, with two incommensurate frequencies.
The interference between the internal frequency ωi and the external one ωe is the origin of the coherent and incoherent responses of the system. Usually, the low-frequency region of the power spectra consists of a broad band noise, while the narrow components show up at high frequencies as narrow band noise. A systematic elimination of the broad band noise when the CDW entered mode locking (Sherwin & Zettl, 1985) and a reinforcement of this noise in the unlocked regime have been observed. The interplay between the internal frequency and the external one may give rise to chaotic behavior, with a period-doubling route to chaos. Studied in the context of self-organized criticality, CDWs are an example of systems that reorganize themselves near the edge of stability, and any small change in the external electrical field gives rise to a drastic change in the response of the CDW (a large increase of conductivity).
Although several models have been proposed to explain CDW behavior, none is completely satisfactory. The classical model considers the CDW as a rigid carrier, without any internal degree of freedom, using the forced oscillator equations with some analogy to the Josephson junctions. The tunneling model focuses on the gap in the excitation spectrum of the CDW, explaining the nonlinear conductivity and the scale relationship between σac(ω) and σdc(E). However, these models do not explain the interference phenomena between the narrow band noise and the external field frequency ω.
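The threshold behavior of the rigid-carrier picture can be sketched numerically. The equation used below is an assumption, since the entry does not write one down: a dimensionless overdamped forced oscillator, dθ/dt = e_dc + e_ac cos(ωt) − sin θ, where θ stands for the CDW phase, sin θ for a periodic pinning force, and the time-averaged dθ/dt for the CDW current. All names and parameter values are hypothetical.

```python
import math


def mean_phase_velocity(e_dc, e_ac=0.0, omega=1.0, t_end=200.0, dt=0.01):
    """Time-averaged d(theta)/dt for the overdamped rigid-CDW sketch."""
    def rhs(theta, t):
        # Drive (d.c. plus a.c.) minus the periodic pinning force.
        return e_dc + e_ac * math.cos(omega * t) - math.sin(theta)

    theta, t = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(theta, t)
        k2 = rhs(theta + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = rhs(theta + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = rhs(theta + dt * k3, t + dt)
        theta += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return theta / t_end


if __name__ == "__main__":
    for e_dc in (0.5, 0.9, 1.1, 2.0):
        print(f"e_dc = {e_dc}: mean velocity = {mean_phase_velocity(e_dc):.4f}")
```

In this sketch the phase stays pinned for e_dc below 1 and slides for e_dc above 1, where without an a.c. drive the mean velocity approaches sqrt(e_dc^2 − 1); the periodic part of the sliding motion plays the role of the narrow band noise. Scanning e_dc with e_ac > 0 would be the natural way to look for plateaus where the mean velocity locks to the drive frequency.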
There are other models that consider the internal degrees of freedom of the CDW (segmenting the CDW either through a hydrodynamical description or the Klemm–Schrieffer model), but none of them completely explains the phenomenology observed in a CDW. Another interesting model is the Fukuyama–Lee–Rice model, which treats the CDW as a classical extended elastic medium, interacting with impurities and an electric field. Discrete versions of these models have also been used. For the commensurate case, Frenkel–Kontorova and soliton models (such as the sine-Gordon equation) have been used. Several applications have been suggested for these materials, including tunable condensers, optical detectors, memory devices, and switches, among others.
LUIS VÁZQUEZ, P. PASCUAL, AND S. JIMÉNEZ
See also Coupled oscillators; Frenkel–Kontorova model; Polarons; Sine-Gordon equation; Superconductivity
Further Reading
Brown, S. & Grüner, G. 1994. Charge and spin density waves. Scientific American, April 1994: 50–56
Gabovich, A.M., Voitenko, A.I. & Ausloos, M. 2002. Charge- and spin-density waves in existing superconductors: competition between Cooper pairing and Peierls or excitonic instabilities. Physics Reports, 367: 583–709
Grüner, G. & Zettl, A. 1985. Charge density wave conduction: a novel collective transport phenomenon in solids. Physics Reports, 119: 117–232
Grüner, G. 1988. The dynamics of charge-density waves. Reviews of Modern Physics, 60: 1129–1181
Sherwin, M. & Zettl, A. 1985. Complete charge density-wave mode locking and freeze-out of fluctuations in NbSe3. Physical Review B, 32: 5536–5539
Thorne, R.E. 1996. Charge-density-wave conductors. Physics Today, May 1996: 42–47
CHEMICAL KINETICS
Chemical kinetics is a well-defined field of physical chemistry that arose in the 1850s as a complement to the investigation of chemical equilibria. The question of how fast a reactive mixture in a closed vessel reaches equilibrium gave rise to the concept of reaction velocity. The mass action law, enunciated by Cato Guldberg and Peter Waage in 1863, provided a quantitative expression of the velocity of an elementary reaction step in a homogeneous medium in terms of the concentrations or the mole fractions of the reactants involved, and a parameter known as the rate constant. Chemical kinetics is intrinsically nonlinear, since the law of mass action features products of concentrations of the species involved.
Early Developments
Evidence that chemical reactions can generate complex behavior was reported in the early days of chemical kinetics (Pacault & Perraud, 1997). In 1899, Wilhelm Ostwald discovered that in a reaction involving chromium in concentrated acid solution, the release of hydrogen gas was periodic. In 1906, Robert Luther observed propagating chemical reaction fronts in connection with the catalytic hydrolysis of alkyl sulfates. These studies remained isolated for a long time. Possible origins, including the systematic study of reaction mechanisms, were hardly touched upon and there was little or no modeling effort. Not surprisingly, therefore, they came to be regarded by the scientific community as curiosities or even as artifices. On the theoretical side, in the 1920s Alfred Lotka devised a model formally deriving from chemical kinetics and giving rise to sustained oscillations. As the model did not apply to any known chemical system, it was discarded by chemists but was far better received in population dynamics, where it played a seminal role. This connection was further reinforced in 1926 when Vito Volterra advanced an explanation of ecological cycles in connection with predator-prey systems, using ideas similar to those of Lotka.
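The sustained oscillations of the Lotka and Volterra models can be sketched with the usual textbook form dx/dt = ax − bxy, dy/dt = −cy + dxy, which is written here as an assumption since the entry gives no equations. The rate terms are mass-action products, and every orbit with positive x and y conserves H = dx − c ln x + by − a ln y, so the oscillation amplitude is set by the initial condition. Function names and parameter values below are hypothetical.

```python
import math


def lv_rhs(x, y, a, b, c, d):
    """Mass-action rates: x grows, y decays, and the product x*y couples them."""
    return a * x - b * x * y, -c * y + d * x * y


def lv_invariant(x, y, a, b, c, d):
    """Quantity conserved along every orbit with x > 0 and y > 0."""
    return d * x - c * math.log(x) + b * y - a * math.log(y)


def lv_simulate(x, y, a=1.0, b=1.0, c=1.0, d=1.0, t_end=20.0, dt=1e-3):
    """Integrate with classical RK4 and return the list of (x, y) states."""
    states = [(x, y)]
    for _ in range(int(round(t_end / dt))):
        k1x, k1y = lv_rhs(x, y, a, b, c, d)
        k2x, k2y = lv_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, a, b, c, d)
        k3x, k3y = lv_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, a, b, c, d)
        k4x, k4y = lv_rhs(x + dt * k3x, y + dt * k3y, a, b, c, d)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        states.append((x, y))
    return states


if __name__ == "__main__":
    states = lv_simulate(1.5, 1.0)
    h_start = lv_invariant(*states[0], 1.0, 1.0, 1.0, 1.0)
    h_end = lv_invariant(*states[-1], 1.0, 1.0, 1.0, 1.0)
    print(f"H at start = {h_start:.10f}, H at end = {h_end:.10f}")
```

Because the oscillations form a continuous family of closed orbits rather than an isolated limit cycle, they are not robust to perturbations of the model, which is one reason this scheme is a poor description of real chemical oscillators.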
The Phenomenology of Nonlinear Chemical Kinetics
Nonlinear chemical kinetics in its modern form owes much to the Belousov–Zhabotinsky reaction (Zhabotinsky, 1964; Field et al., 1972) dealing with the oxidation of a weak acid by bromate in the presence of a metal ion redox catalyst. In addition to the possibility of displaying long records of oscillatory behavior in batch (closed reactor), this reaction gave rise for the first time to a thorough mechanistic study which highlighted the important role of feedback in the onset of complex behavior, in addition to nonlinearity. Nonlinear phenomena in chemical kinetics have been observed on whole classes of systems giving rise to a large variety of complex behaviors, as reviewed in a Faraday discussion held in 2001. Quantitative phase diagrams have been constructed separating different behavioral modes as some key parameters are varied (Gray & Scott, 1990; Epstein & Pojman, 1998).

Open Well-Stirred Reactors
Simple periodic, multi-periodic, and chaotic oscillations are observed as the residence time of reactants (inversely proportional to their pumping rate into the reactor) is varied. A second type of phenomenon is multistability, the possibility of exhibiting more than one simultaneously stable steady-state concentration level. A third type of phenomenon is excitability whereby, once perturbed, a system performs an extended excursion resembling a single pulse of an aborted oscillation, before settling back to its stable steady state. Finally, an interesting phenomenology pertains to combustion reactions, where the dependence of the rate constant on temperature is the source of a universal (reaction mechanism-independent) positive feedback.

Open Unstirred Reactors
In the absence of stirring, chemical dynamics coexists with transport phenomena. This can give rise to the generic phenomenon of propagating wave fronts. In a bistable system, a typical front may join the two stable states, with one of them progressing at the expense of the other. In two- and three-dimensional reactors undergoing excitable or oscillatory dynamics, the fronts can take the more spectacular form of circles (target patterns), rotating spirals, and scrolls. An exciting form of spatial self-organization is spontaneous symmetry-breaking leading to sustained steady-state patterns, anticipated by Alan Turing in 1952 and realized experimentally by Patrick De Kepper and coworkers; see Figure 1 (Turing, 1952; De Kepper et al., 2000).
Figure 1. Stationary concentration patterns arising in the chlorite–iodide–malonic acid reaction beyond a symmetry-breaking instability (courtesy of P. De Kepper).
Heterogeneous Systems
Since the late 1980s, a series of novel developments has been initiated following the encounter of nonlinear chemical kinetics with surface science as it is manifested, for instance, in heterogeneous catalysis. Complex behavior in all the above-mentioned forms is observed. Furthermore, the development of sophisticated monitoring techniques, such as field ion microscopy, opens the perspective of monitoring chemical dynamics at the nanoscale level (Hildebrand et al., 1999).
Theoretical Developments: Dynamical Systems and Nonlinear Chemical Kinetics
The essence of nonlinear chemical kinetics is captured by the reaction-diffusion equations (Nicolis & Prigogine, 1977)

$$\frac{\partial c_i}{\partial t} = v_i(\{c_j\}, k_\alpha, H_\alpha, \ldots) + D_i \nabla^2 c_i, \tag{1}$$
where ci (i = 1, . . . , n) denotes the concentrations or the temperature, kα and Hα the rate constants and heats of reaction of the steps involved, and Di the mass or heat diffusivity coefficients. The rate function vi accounts for the nonlinearities and feedbacks, whereas the contribution of transport processes is linear. The reaction-diffusion equations (1) exhibit nonlinearity in its simplest expression, as a property arising from intrinsic and local cooperative events. Because of this, complex behavior may arise in the absence of spatial degrees of freedom and persist even when few variables are present. In thermodynamic language, reactions are purely dissipative processes, whereas in nonlinear mechanics, inertia plays a very important role in the onset of complex behavior. Understanding how purely dissipative systems can come to terms with the restrictions imposed by the laws of thermodynamics and statistical mechanics has stimulated several fundamental developments (Glansdorff & Prigogine, 1971; Nicolis & Prigogine, 1977). It has also led to the design of canonical models, such as the Brusselator, that are being used with success to test ideas and to assess the limits of validity of approximations.
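As an illustration of how a canonical model of this kind is used, the sketch below integrates the well-stirred (diffusion-free) Brusselator in its standard dimensionless form, dx/dt = a − (b + 1)x + x²y, dy/dt = bx − x²y. This form, the parameter values, and the function names are assumptions, since the entry only names the model. The steady state (x, y) = (a, b/a) is stable for b < 1 + a² and gives way to a limit cycle beyond that threshold.

```python
def brusselator_rhs(x, y, a, b):
    """Reaction rates of the dimensionless Brusselator (no diffusion)."""
    return a - (b + 1.0) * x + x * x * y, b * x - x * x * y


def late_swing(a, b, t_end=100.0, dt=0.01):
    """Peak-to-peak variation of x over the last quarter of the run.

    The run starts slightly off the steady state (a, b/a).
    """
    x, y = a + 0.01, b / a
    n_steps = int(round(t_end / dt))
    late = []
    for n in range(n_steps):
        # Classical fourth-order Runge-Kutta step.
        k1x, k1y = brusselator_rhs(x, y, a, b)
        k2x, k2y = brusselator_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, a, b)
        k3x, k3y = brusselator_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, a, b)
        k4x, k4y = brusselator_rhs(x + dt * k3x, y + dt * k3y, a, b)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        if n >= 3 * n_steps // 4:
            late.append(x)
    return max(late) - min(late)


if __name__ == "__main__":
    for b in (1.5, 3.0):
        print(f"a = 1, b = {b}: late swing of x = {late_swing(1.0, b):.4f}")
```

Unlike a conservative oscillator, the amplitude reached beyond the threshold does not depend on the small initial perturbation, which is the signature of a purely dissipative limit cycle.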
The intrinsic parameters $k$ and $D$ in Equations (1) have dimensions of $[\text{time}]^{-1}$ and $[\text{length}^2/\text{time}]$, respectively. It follows that a reaction-diffusion system possesses intrinsic time ($k^{-1}$) and space ($(D/k)^{1/2}$) scales. This places nonlinear kinetics at the forefront for understanding the origin of endogenous rhythmic and patterning phenomena as observed, in particular, in biology and in materials science. In thermodynamic equilibrium, these intrinsic time and length scales remain dormant, owing to detailed balance. Nonequilibrium allows for the excitation and eventual stabilization of finite amplitude disturbances bearing these characteristic scales. Equations (1) form the basis of interpretation of the experiments surveyed above. They also constitute some of the earliest and most widely used models of bifurcation and chaos theories. The classical tools used in their analysis are stability theory and the reduction to normal form (amplitude) equations using perturbation techniques and/or symmetry arguments, complemented by interactive numerical simulations (Nicolis, 1995). A most interesting development is the prediction of an impressive variety of intrinsically generated spatial and spatiotemporal patterns, including spatiotemporal chaos, when two or more mechanisms of instability interfere.
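The dimensional argument above can be made concrete with a short Python sketch that returns the intrinsic time scale $k^{-1}$ and length scale $(D/k)^{1/2}$. The function name and the numerical values (a rate constant of 1 per second and a diffusivity of $10^{-5}$ cm²/s) are hypothetical choices for illustration only.

```python
import math


def intrinsic_scales(k, D):
    """Return (time_scale, length_scale) = (1/k, sqrt(D/k)).

    k is a rate constant with dimensions 1/time and D a diffusivity with
    dimensions length^2/time; both must be positive and in consistent units.
    """
    if k <= 0 or D <= 0:
        raise ValueError("k and D must be positive")
    return 1.0 / k, math.sqrt(D / k)


# Hypothetical illustrative values: k in 1/s, D in cm^2/s.
tau, ell = intrinsic_scales(k=1.0, D=1.0e-5)
```

With these assumed values the time scale is one second and the length scale is a few tens of micrometers, well above molecular dimensions.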
Nonlinear Chemical Kinetics in the Life Sciences Research in nonlinear chemical kinetics has led to a semi-quantitative interpretation of a wide spectrum of dynamical behaviors in biochemistry (Goldbeter, 1996). This has been possible thanks to the development of models in which the involvement of cooperative enzymes in some key steps provides the principal source of nonlinearity and feedback. Glycolytic oscillations, calcium oscillations and waves, the cell division cycle, cAMP-induced aggregation in amoebae, and synchronization in cell populations are among the main achievements of this effort that helped to identify the principal mechanisms behind the observed behavior. Nonlinear kinetics has also been a source of inspiration for approaching dynamical phenomena of crucial importance in biology, in which modeling involving a few variables and/or well-established molecular mechanisms is still not available. Immune response, the electrical activity of the brain, embryonic development, cooperative processes such as food recruitment or building activity in social insects, and, last but not least, chemical and biochemical evolution itself (Eigen & Schuster, 1979; See Biological evolution) have been explored in one way or the other in the light of the concepts and techniques of nonlinear kinetics. As a closing remark, Equations (1) anticipate a decoupling between the evolution laws of the
macroscopic observables and dynamics at the microscopic level, which may actually break down when reactive systems are embedded on low-dimensional supports owing to the generation of anomalous inhomogeneous fluctuations. This leads to interesting synergies among nonlinear chemical kinetics, statistical mechanics, and computational science (Nicolis, 2001).
G. NICOLIS
See also Belousov–Zhabotinsky reaction; Brusselator; Population dynamics; Reaction-diffusion systems; Turing patterns
Further Reading
De Kepper, P., Dulos, E., Boissonade, J., De Wit, A., Dewel, G. & Borckmans, P. 2000. Reaction–diffusion patterns in confined chemical systems. Journal of Statistical Physics, 101: 495–508
Eigen, M. & Schuster, P. 1979. The Hypercycle: A Principle of Natural Self-organization, Berlin and New York: Springer
Epstein, I.R. & Pojman, J.A. 1998. An Introduction to Nonlinear Chemical Dynamics, Oxford and New York: Oxford University Press
Field, R.J., Körös, E. & Noyes, R. 1972. Oscillations in chemical systems. II. Thorough analysis of temporal oscillation in the bromate–cerium–malonic acid system. Journal of the American Chemical Society, 94: 8649–8664
Glansdorff, P. & Prigogine, I. 1971. Thermodynamic Theory of Structure, Stability and Fluctuations, London and New York: Wiley
Goldbeter, A. 1996. Biochemical Oscillations and Cellular Rhythms, Cambridge and New York: Cambridge University Press
Gray, P. & Scott, S.K. 1990. Chemical Oscillations and Instabilities, Oxford: Clarendon Press and New York: Oxford University Press
Hildebrand, M., Kuperman, M., Wio, H., Mikhailov, A.S. & Ertl, G. 1999. Self-organized chemical nanoscale microreactors. Physical Review Letters, 83: 1475–1478
Nicolis, G. 1995. Introduction to Nonlinear Science, Cambridge and New York: Cambridge University Press
Nicolis, G. 2001. Nonlinear kinetics: at the crossroads of chemistry, physics and life sciences. Faraday Discussions, 120: 1–10
Nicolis, G. & Prigogine, I. 1977. Self-organization in Nonequilibrium Systems, New York: Wiley
Pacault, A. & Perraud, J.-J. 1997. Rythmes et Formes en Chimie, Paris: Presses Universitaires de France
Royal Society of Chemistry. 2002. Nonlinear Chemical Kinetics: Complex Dynamics and Spatio-temporal Patterns. Faraday Discussion no. 120, London: Royal Society of Chemistry; pp. 1–431
Turing, A.M. 1952. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, Series B, 237: 37–72
Zhabotinsky, A.M. 1964. Periodic liquid phase oxidation reactions. Doklady Akademii Nauk SSSR, 157: 392–395
CHERENKOV RADIATION
Cherenkov radiation is the electromagnetic radiation of a charged particle moving uniformly in a medium with a velocity exceeding the velocity of light (c) in that medium. It was discovered experimentally by Pavel Cherenkov in 1934 and was theoretically explained by Igor Tamm and Il’ja Frank in 1937. (In 1958, Cherenkov, Tamm, and Frank were awarded a Nobel Prize in physics for this work.) Earlier, Cherenkov radiation was theoretically predicted by Arnold Sommerfeld, who, in 1904, solved a formal problem on the radiation of a charged particle moving in vacuum with a velocity v > c, and by Oliver Heaviside (Heaviside, 1950) at the end of the 19th century.
At present, all radiation phenomena of waves of any origin, created by a source that moves with a velocity exceeding the phase velocity of the waves, are regarded as Cherenkov radiation. Common examples that can be observed in ordinary life include waves created on a water surface by moving objects (the so-called bow waves, described by a theory developed by Lord Kelvin (William Thomson) in the middle of the 19th century) and acoustic shock waves brought about in the atmosphere by a supersonic jet or a rocket, first described by Ernst Mach in 1877.
Essentially, Cherenkov radiation can be understood from the following simple considerations. A source moving steadily with a velocity $\mathbf{v}$ and depending on coordinates and time as $f(\mathbf{r} - \mathbf{v}t)$ can be presented in the form of a Fourier integral

$$f(\mathbf{r}, t) = \int f(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r} - i\mathbf{k}\cdot\mathbf{v}t}\, d^3k,$$

which suggests that each Fourier harmonic has frequency $\mathbf{k}\cdot\mathbf{v}$. In electrodynamics, the role of the source is played by the distribution of charge density $\rho(\mathbf{r} - \mathbf{v}t)$ and current $\mathbf{j} = \mathbf{v}\rho(\mathbf{r} - \mathbf{v}t)$, and in hydrodynamics by external forces and moving particles. If a source is to excite the waves in a medium whose wave vector $\mathbf{k}$ and frequency $\omega$ are related by the dispersion equation $\omega = \omega(\mathbf{k})$, a resonance condition must be satisfied, under which the wave frequency $\omega(\mathbf{k})$ coincides with the external force frequency $\mathbf{k}\cdot\mathbf{v}$. This equality yields the Cherenkov condition

$$\omega(\mathbf{k}) = \mathbf{k}\cdot\mathbf{v}, \tag{1}$$

which can be fulfilled only if the moving source velocity exceeds the phase velocity of the waves, $v > v_{ph} = \omega(k)/k$. The radiated wave vectors form a characteristic Cherenkov cone, which is similar to the Mach cone in hydrodynamics. Thus, the angle $\theta$ between the wave vector of the radiated wave and the direction of particle velocity is determined by the Cherenkov condition of Equation (1), which requires $\cos\theta = v_{ph}/v$, showing that the angular and frequency distributions of the radiation are related to each other. Figure 1 shows the wave front configuration of a wave of frequency $\omega$, emitted by a moving source.
Figure 1. Wave front configuration for Cherenkov radiation: v is the particle velocity, k is the wave vector of the emitted wave, and θ is the angle between them.
Nowadays, Cherenkov radiation is used in high-energy physics as an important experimental tool (the Cherenkov counter), enabling identification and velocity measurements of fast charged particles, and in electronics for Cherenkov electron oscillators and amplifiers of electromagnetic waves, such as traveling-wave tubes and backward-wave oscillators. Many diverse phenomena can be related to manifestations of Cherenkov radiation, including Landau damping and instabilities in plasma physics, and the Kelvin–Helmholtz instability in hydrodynamics (excitation of waves on a water surface by wind).
VLADISLAV V. KURIN
See also Dispersion relations; Shock waves
Further Reading
Heaviside, O. 1950. Electromagnetic Theory, 3 vols, London: The Electrician; reprinted New York: Dover
Jackson, J.D. 1998. Classical Electrodynamics, New York: Wiley
Landau, L.D. & Lifshitz, E.M. 1963. Theoretical Physics, vol. 6: Fluid Mechanics, Oxford: Pergamon Press
Landau, L.D. & Lifshitz, E.M. 1963. Theoretical Physics, vol. 8: Electrodynamics of Continuous Media, Oxford: Pergamon Press
Landau, L.D. & Lifshitz, E.M. 1963. Theoretical Physics, vol. 10: Physical Kinetics, Oxford: Pergamon Press
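As a supplementary illustration of the Cherenkov condition in the entry above, the relation $\cos\theta = v_{ph}/v$ can be evaluated with a few lines of Python. This is only a sketch: the function name is hypothetical, and the example numbers (a refractive index of about 1.33 for water and a particle at 99% of the vacuum speed of light) are assumed values that do not come from the entry.

```python
import math


def cherenkov_angle(v, v_phase):
    """Angle in radians between the emitted wave vector and the source velocity.

    Uses cos(theta) = v_phase / v. Returns None when the source is slower than
    the wave phase velocity, in which case the condition cannot be met.
    """
    if v <= 0 or v_phase <= 0:
        raise ValueError("velocities must be positive")
    if v < v_phase:
        return None
    return math.acos(v_phase / v)


# Assumed example: light in water (refractive index ~1.33, an approximate value)
# and a charged particle moving at 0.99 of the vacuum speed of light.
c_vacuum = 299_792_458.0          # m/s
n_water = 1.33                    # assumed refractive index
theta = cherenkov_angle(0.99 * c_vacuum, c_vacuum / n_water)
theta_degrees = math.degrees(theta)
```

The same function applies to any wave type, for example a supersonic source in air, if `v_phase` is set to the relevant phase velocity.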
CHI(2) MATERIALS AND SOLITONS See Nonlinear optics
CHIRIKOV MAP See Maps
CHUA’S CIRCUIT
Having witnessed futile attempts at producing chaos in an electrical analog of the Lorenz equations while on a visit to Japan in 1983, Leon Chua was prompted to develop a chaotic electronic circuit. He realized that chaos could be produced in a piecewise-linear circuit if it possessed at least two unstable equilibrium points: one to provide stretching and the other to fold the trajectories. With this insight and using nonlinear circuit theory (Chua et al., 1987), he systematically identified those third-order piecewise-linear circuits containing a single voltage-controlled nonlinear resistor that could produce chaos. Specifying that the V–I characteristic of the nonlinear resistor $N_R$ should be chosen to yield at least two unstable equilibrium points, he invented the circuit shown in Figure 1.
Figure 1. Chua’s circuit consists of a linear inductor L, two linear capacitors (C1 and C2), a linear resistor R, and a voltage-controlled nonlinear resistor NR (called a Chua diode).
This circuit is described by three ordinary differential equations

$$\begin{aligned}
C_1 \frac{dV_1}{dt} &= -G V_1 - f(V_1) + G V_2, \\
C_2 \frac{dV_2}{dt} &= G V_1 - G V_2 + I_3, \\
L \frac{dI_3}{dt} &= -V_2,
\end{aligned} \tag{1}$$

where $G = 1/R$. Here $f(\cdot)$ is the piecewise-linear V–I characteristic of the nonlinear resistor $N_R$ (known as a Chua diode), defined by

$$f(V_R) = G_b V_R + \tfrac{1}{2}(G_a - G_b)\left(|V_R + E| - |V_R - E|\right), \tag{2}$$

where $\pm E$ are the breakpoints in the characteristic, and $G_a$ and $G_b$ are the slopes in the inner and outer regions, respectively, as shown in Figure 2. If the values of the circuit parameters are chosen such that the circuit contains three equilibrium points (two in the outer regions and one at the origin), all of which are unstable with saddle-focus stability, then a homoclinic trajectory can be formed, potentially producing chaos.
Figure 2. The V–I characteristic of the Chua diode NR has breakpoints at ±E and slopes Ga and Gb in the inner and outer regions, respectively.
Soon after its conception, the rich dynamical behavior of Chua’s circuit was confirmed by computer simulation and experiment and in 1986 was proven to exhibit chaos in the sense of Shilnikov (Chua et al., 1986). An intensive effort since then to understand every aspect of the dynamics of this circuit has resulted in its widespread acceptance as a powerful paradigm for learning, understanding, and teaching about nonlinear dynamics and chaos (Madan, 1993; Chua, 1992). By adding a linear resistor $R_0$ in series with the inductor, Chua’s circuit has been generalized to the Chua oscillator (Chua, 1993), with the last of Equations (1) changing to

$$L \frac{dI_3}{dt} = -V_2 - R_0 I_3. \tag{3}$$
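The state equations (1) to (3) can be explored numerically. The following Python sketch integrates them with a fixed-step fourth-order Runge-Kutta scheme, using the component values quoted in Table 1 and the accompanying text of this entry (C1 = 10 nF, C2 = 100 nF, L = 18 mH, Ga = −25/33 mS, Gb = −9/22 mS, E ≈ 1 V). The function names, the integration step, and the initial state are illustrative assumptions, and an idealized model of this kind need not reproduce the measured bifurcation values of R exactly.

```python
import math

# Component values quoted in this entry (Table 1 and accompanying text).
C1 = 10e-9                   # farads
C2 = 100e-9                  # farads
L = 18e-3                    # henries
GA = -25.0 / 33.0 * 1e-3     # siemens, inner slope of the Chua diode
GB = -9.0 / 22.0 * 1e-3      # siemens, outer slope of the Chua diode
E = 1.0                      # volts, breakpoint


def chua_diode(v):
    """Piecewise-linear V-I characteristic f(V_R) of Equation (2)."""
    return GB * v + 0.5 * (GA - GB) * (abs(v + E) - abs(v - E))


def rhs(state, R, R0=0.0):
    """Time derivatives (dV1/dt, dV2/dt, dI3/dt) from Equations (1) and (3)."""
    v1, v2, i3 = state
    g = 1.0 / R
    dv1 = (-g * v1 - chua_diode(v1) + g * v2) / C1
    dv2 = (g * v1 - g * v2 + i3) / C2
    di3 = (-v2 - R0 * i3) / L
    return (dv1, dv2, di3)


def rk4_step(state, dt, R, R0=0.0):
    """Advance the state by one fixed Runge-Kutta step of length dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = rhs(state, R, R0)
    k2 = rhs(shifted(state, k1, 0.5 * dt), R, R0)
    k3 = rhs(shifted(state, k2, 0.5 * dt), R, R0)
    k4 = rhs(shifted(state, k3, dt), R, R0)
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(R, state=(0.1, 0.0, 0.0), dt=1e-6, steps=20000, R0=0.0):
    """Return the list of states (V1, V2, I3), including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, R, R0)
        trajectory.append(state)
    return trajectory


def outer_equilibrium(R):
    """Equilibrium in the outer region V1 > E for R0 = 0 (requires 1/R + GB > 0)."""
    g = 1.0 / R
    v1 = (GB - GA) * E / (g + GB)
    return (v1, 0.0, -g * v1)
```

A call such as `simulate(R=1800.0)` returns a trajectory whose (V2, V1) projection can be plotted and compared with the oscilloscope traces of Figure 4; sweeping `R` downward from 2000 Ω mimics the experiment described in the entry.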
Chua’s oscillator is canonical in the sense that it is equivalent (topologically conjugate) to a 13-parameter family of three-dimensional ordinary differential equations with odd-symmetric piecewise-linear vector fields. The circuit can exhibit every dynamical behavior known to be possible in a system described by a continuous odd-symmetric three-region piecewise-linear vector field. Unlike the Lorenz or Rössler equations, which have more complex multiplicative nonlinearities, the only nonlinearity in Chua’s circuit is a scalar function of one variable. With an appropriate choice of parameters, the circuit can be made to follow the classic period-doubling, intermittency, and torus breakdown routes to chaos; in addition, over 60 different types of strange attractors have been reported in Chua’s oscillator.
Chua’s circuit/oscillator can be realized in a variety of ways by using standard or custom-made electronic components. Since all of the linear elements (capacitors, resistors, and inductor) are readily available as two-terminal devices, only the nonlinear diode must be synthesized using a combination of standard electronic components. The most robust practical realization of Chua’s circuit/oscillator, shown in Figure 3, uses two operational amplifiers (op-amps) and six resistors to implement the nonlinear diode (Kennedy, 1992). The op-amp subcircuit consisting of A1, A2, and R1 through R6 functions as a Chua diode with V–I characteristic as shown in Figure 2. Using two 9 V power supplies for the Analog Devices AD712 op-amps sets their saturation voltages at approximately ±8.3 V, yielding breakpoints E ≈ 1 V. With R2 = R3 and R5 = R6, the V–I characteristic of the Chua diode is defined by $G_a = -1/R_1 - 1/R_4 = -25/33$ mS and $G_b = 1/R_3 - 1/R_4 = -9/22$ mS. Note that the value of the resistance R0 is ideally zero in Chua’s circuit. In practice, the real inductor L has a small parasitic resistance that can be modeled by R0; the TOKO type 10RB inductor is preferred because it has a sufficiently low parasitic resistance R0 for this application.
Figure 3. Robust practical implementation of Chua’s circuit/oscillator using two op amps and six resistors to realize the Chua diode. In the case of Chua’s circuit, R0 is zero; in Chua’s oscillator, R0 may assume negative or positive values. Component values for Chua’s circuit are listed in Table 1.
Table 1. Component list for Chua’s circuit.
Element   Description                    Value     Tolerance (%)
A1        Op-amp (1/2 AD712 or TL082)
A2        Op-amp (1/2 AD712 or TL082)
C1        Capacitor                      10 nF     ±1
C2        Capacitor                      100 nF    ±1
R         Potentiometer                  2 kΩ
R1        1/4 W resistor                 3.3 kΩ    ±1
R2        1/4 W resistor                 22 kΩ     ±1
R3        1/4 W resistor                 22 kΩ     ±1
R4        1/4 W resistor                 2.2 kΩ    ±1
R5        1/4 W resistor                 220 Ω     ±1
R6        1/4 W resistor                 220 Ω     ±1
L         Inductor (TOKO type 10RB)      18 mH     ±5
By reducing the value of the variable resistor R from 2000 Ω to zero, with all other components as in Table 1, the circuit exhibits a Hopf bifurcation from dc equilibrium, a sequence of period-doubling bifurcations to a spiral attractor, periodic windows, a double-scroll strange attractor, and a boundary crisis (Kennedy, 1995). Although the diode characteristic in Equation (2) is piecewise-linear, qualitatively similar behavior is observed when a smooth nonlinearity such as a cubic is used instead. The piecewise-linear nonlinearity is more convenient for circuit realization, while the smooth nonlinearity is more appropriate for bifurcation analysis.
Figure 4. Typical experimental bifurcation sequence in Chua’s circuit (component values as in Table 1) recorded using a Hitachi VC-6025 Digital Storage Oscilloscope. Horizontal axis V2, 200 mV/div; vertical axis V1, 1 V/div. (a) R = 1.83 kΩ, period 1; (b) R = 1.82 kΩ, period 2; (c) R = 1.81 kΩ, period 4; (d) R = 1.80 kΩ, spiral attractor; (e) R = 1.76 kΩ, spiral attractor; (f) R = 1.73 kΩ, double-scroll attractor [reproduced from Kennedy (1993)].
MICHAEL PETER KENNEDY
See also Attractors; Bifurcations; Chaotic dynamics; Hopf bifurcation; Horseshoes and hyperbolicity in dynamical systems; Period doubling; Routes to chaos
Further Reading
Chua, L.O. 1992. The genesis of Chua’s circuit. Archiv für Elektronik und Übertragungstechnik, 46(4): 250–257
Chua, L.O. 1993. Global unfolding of Chua’s circuit. IEICE Transactions Fundamentals, E76A(5): 704–734
Chua, L.O., Desoer, C.A. & Kuh, E.S. 1987. Linear and Nonlinear Circuits, New York: McGraw-Hill
Chua, L.O., Komuro, M. & Matsumoto, T. 1986. The double scroll family, Parts I and II. IEEE Transactions on Circuits and Systems, 33(11): 1073–1118
Kennedy, M.P. 1992. Robust op amp realization of Chua’s circuit. Frequenz, 46(3–4): 66–80
Kennedy, M.P. 1993. Three steps to chaos, Parts I and II. IEEE Transactions on Circuits and Systems, Part I, 40(10): 640–674
Kennedy, M.P. 1995. Experimental chaos from autonomous electronic circuits. Philosophical Transactions of the Royal Society London A, 353(1701): 13–32
Madan, R.N. 1993. Chua’s Circuit: A Paradigm for Chaos, Singapore: World Scientific

CIRCLE MAP See Denjoy theory
CLEAR AIR TURBULENCE
Description
In 1966, the National Committee for Clear Air Turbulence officially defined clear air turbulence (CAT) as “all turbulence in the free atmosphere of interest in aerospace operations that is not in or adjacent to visible convective activity (this includes turbulence found in cirrus clouds not in or adjacent to visible convective activity).” FAA Advisory Circular AC 00-30B (1997) has simplified this somewhat to “turbulence encountered outside of convective clouds.” Thus, CAT is considered to mean turbulence in the clear air, not in or near convective clouds, usually at upper levels of the atmosphere (above 6 km).
CAT was first observed by high-flying fighter aircraft in the mid-to-late 1940s. It was expected that turbulence encounters would be rare at high levels due to the lack of clouds at upper levels. However, it was soon discovered that turbulence encounters in clear air were not only frequent but sometimes quite severe. Since then, CAT encounters have become a significant problem for commercial aircraft flying at cruising altitudes (18,000–45,000 ft above the mean sea level). In fact, various reviews of National Transportation Safety Board (NTSB) reports indicate that in the U.S., turbulence encounters account for approximately 65% of all weather-related accidents or incidents for commercial aircraft; probably more than half of these are due to CAT. Although most turbulence encounters are generally just an annoyance to passengers and crew, on average there are about eight commercial aircraft turbulence-related incidents per year that are significant enough to be reported to the NTSB, accounting for 10 serious and 32 minor injuries. Fortunately, fatalities and substantial damage to the aircraft structure are rare but can occur, as shown in Figure 1. It should be noted that only a certain range of frequencies or wavelengths of turbulent eddies is felt by aircraft as bumpiness. For most commercial aircraft, this wavelength is anywhere from about 10 m to 1 km.
Figure 1. Damage sustained to a DC-8 cargo aircraft in an encounter with CAT on 9 December 1992 at 31,000 ft over the Rocky Mountains near Evergreen, Colorado. Note the loss of left outboard engine and approximately 6 m of the wing. (Photo by Kent Meiries.)
Shorter wavelengths are integrated out over the aircraft structure and longer wavelengths are felt as “waves” and do not generally have vertical accelerations large enough to be felt as “bumps.” CAT can be measured quantitatively with instrumented research aircraft or remotely by instruments such as clear air radar and lidar, but by far the majority of measurements are through pilot reports (PIREPs) of turbulent encounters. PIREPs usually report turbulence on intensity scales of smooth, light, moderate, severe, or extreme. Although definitions of these categories are provided in terms of normal accelerations or airspeed fluctuations, there is still an amount of subjectivity associated with these reports. The pilot reporting system is fairly successful in warning other aircraft of turbulence regions encountered, but to use PIREPs to deduce CAT climatology is difficult, since they are biased by air traffic patterns, non-uniform reporting practices, and the tendency to avoid known turbulence areas. One way to reduce these biases is to examine the ratio of moderate or severe PIREPs to total reports in the three-dimensional airspace averaged over many years of reports. This has been done over the continental U.S. (Sharman et al., 2002). The distribution of this ratio of moderate or greater (MOG) severity PIREPs to total PIREPs by altitude for both CAT and in-cloud encounters is shown in Figure 2. Note that above about 8 km, the majority of reports are in clear air. Similar analyses show the occurrence of CAT to be about twice as frequent in winter as in summer. CAT encounters also tend to be more frequent and more severe over mountainous regions, for example, the Colorado Rockies. One characteristic of CAT is its patchiness in both time and space. 
These patches tend to be relatively thin compared with their length (the median thickness is about 500 m, whereas the median horizontal dimension is about 50 km); the median duration is about 6 hours (Vinnichenko et al., 1980). Within a patch, the turbulence may be continuous or may occur in discrete bursts that may be very severe but very narrow (1–2 km).
Figure 2. Vertical distribution of the fraction of moderate or greater-intensity (MOG) turbulence pilot reports taken over a two-year period. Solid lines indicate reports in clear air, and dashed lines indicate reports in cloud. (Axes: altitude in km; fraction of MOG PIREPs.)
Relation to Kelvin–Helmholtz Instability
From Figure 2, it can be seen that the altitude of maximum occurrence of CAT is at upper levels near the tropopause and jet stream levels. This relation has been known since the 1950s. For example, Bannon (1952) noted that severe CAT tended to occur above and below the jet stream core on the low-pressure side. These areas tend to have large values of the vertical shear of the horizontal wind, and this led to the hypothesis (e.g., Dutton & Panofsky, 1970) that, at least in some cases, CAT may be related to Kelvin–Helmholtz (KH) instability (KHI; see Kelvin–Helmholtz instability). The KHI process occurs in stably stratified shear flows when dynamic instabilities due to wind shear exceed the restoring forces due to stability. KH waves (also known as “billows”) are, in fact, commonly observed in the atmosphere near the top of clouds, where the KH distortions become visible (Figure 3). Further, the KHI connection to CAT has been verified on occasion by simultaneous measurements of KH billows by high-powered radar and aircraft measurements of turbulence (Browning et al., 1970). Although Figure 3 shows a KH wave train at an instant in time, the KHI process is an evolutionary one, where waves develop, amplify, roll up, and break down into turbulent patches. The resultant turbulent mixing will eventually destroy the wave structure and mix out the shear and density distributions that created it, but if larger scale processes continue to reinforce the shears, the entire process may reinitiate.
Figure 3. An example of Kelvin–Helmholtz billows observed in the presence of clouds. © 2003 University Corporation for Atmospheric Research.
The names associated with KHI derive from the early works of Hermann von Helmholtz (1868), who realized the destabilizing effects of shear, and later of Lord Kelvin (William Thomson) (1871), who posed and solved the instability problem mathematically. Richardson (1920), using simple energy considerations, deduced that a sufficient condition for stability of disturbances in shear flow is that the restoring force of stability (as measured by the buoyancy frequency $N$) be greater than the destabilizing force of the mean horizontal velocity shear ($dU/dz$) in the vertical ($z$) direction. Thus, when the ratio $N^2/(dU/dz)^2$ is greater than unity, the flow should be stable. In honor of Richardson’s insight, it has become common to refer to this ratio as the Richardson number (Ri). The linear problem for various stratified shear flow configurations is well reviewed in the texts by Chandrasekhar (1961, Chapter 11) and by Drazin and Reid (1981). The sufficient condition for stability to linear two-dimensional disturbances is that Ri > 0.25; Ri < 0.25 is necessary but not sufficient for instability. More recently, Abarbanel et al. (1984), using the method of Arnol’d, were able to show that the necessary and sufficient condition for Liapunov stability to three-dimensional nonlinear disturbances is Ri > 1, in agreement with Richardson’s deduction. Through these theoretical studies, laboratory experiments (e.g., Thorpe, 1987), and more recently, very high-resolution numerical simulations (e.g., Werne & Fritts, 1999), considerable progress has been made in understanding the intricacies of KHI. One (probably common) way in which KHI is initiated in the atmosphere is through longer-wavelength gravity-wave-induced perturbations that lead to local reductions (e.g., in the crest of the wave) in Ri to a value small enough to initiate instability. Gravity waves are ubiquitous in the atmosphere and can be generated in a variety of ways, for example, by flow over mountains or by strong updrafts and downdrafts in convective storms. However, the processes by which KHI may lead to turbulent breakdowns within three-dimensional transient gravity waves are not yet completely understood. It should be mentioned that gravity wave breakdown into turbulent patches may also occur through other mechanisms besides KHI.
Examples include convective overturning in large-amplitude waves or nonlinear wave–wave interactions (see the reviews by Wurtele et al. (1996) and Staquet & Sommeria (2002) for discussions of some of these effects). Further, other instability mechanisms besides KHI may lead to CAT, for example, inertial instability or critical-level instability. Thus, the processes by which CAT may be generated at any given time and place are complex, involving many different sources, making its forecasting quite difficult. One promising new avenue of research is the use of high-resolution numerical simulations of the atmosphere to reconstruct the atmospheric processes that led to particularly severe encounters of CAT (e.g., Clark et al., 2000; Lane et al., 2003). These types of studies have only recently become possible with advances in computing capabilities that allow model runs to contain both the large-scale processes that create conditions conducive to turbulence and the smaller scales that may affect aircraft. Further studies such as these, along with continued theoretical and numerical studies and field measurement campaigns, should lead to a better understanding of CAT genesis and evolution processes. Until this understanding is available, forecasting of CAT must be accomplished by empirical means. This is done by forecasting various large-scale atmospheric conditions that are known through experience to be related to CAT. Until recently, these diagnostics for likely regions of turbulence had to be performed by laborious weather map analyses of jet streams and upper-level fronts. Nowadays, these diagnostics can be computed from the output of routine numerical weather prediction forecast models. However, the reliability of these turbulence diagnostics is highly variable, and at the moment it seems that better success may be achieved by combining the various diagnostics within an artificial intelligence framework (Tebaldi et al., 2002).
ROBERT SHARMAN
See also Kelvin–Helmholtz instability; Turbulence
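The Richardson-number criteria discussed in this entry can be sketched in a few lines of Python. The function names and the sample values of the buoyancy frequency and wind shear are hypothetical; the thresholds 0.25 and 1 are the ones quoted in the text, and a value below 0.25 indicates only that Kelvin–Helmholtz instability is not ruled out.

```python
def richardson_number(buoyancy_frequency, shear):
    """Ri = N^2 / (dU/dz)^2, with N and dU/dz both in 1/s."""
    if shear == 0:
        return float("inf")  # no shear: nothing to destabilize the flow
    return buoyancy_frequency ** 2 / shear ** 2


def classify(ri):
    """Map a Richardson number onto the stability statements quoted in the entry."""
    if ri > 1.0:
        return "stable to nonlinear 3-D disturbances (Ri > 1)"
    if ri > 0.25:
        return "linearly stable to 2-D disturbances (Ri > 0.25)"
    return "KHI not ruled out (Ri < 0.25 is necessary, not sufficient)"


# Hypothetical illustrative layer: N = 0.01 1/s, wind shear of 0.03 1/s.
ri = richardson_number(0.01, 0.03)
label = classify(ri)
```

In an operational setting such a diagnostic would be evaluated on gridded forecast-model output, which is beyond the scope of this sketch.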
Further Reading
Abarbanel, H.D.I., Holm, D.D., Marsden, J.E. & Ratiu, T. 1984. Richardson number criterion for the nonlinear stability of three-dimensional stratified flow. Physical Review Letters, 52: 2352–2355
Bannon, J.K. 1952. Weather systems associated with some occasions of severe turbulence at high altitude. Meteorological Magazine, 81: 97–101
Browning, K.A., Watkins, C.D., Starr, J.R. & McPherson, A. 1970. Simultaneous measurements of clear air turbulence at the tropopause by high-power radar and instrumented aircraft. Nature, 228: 1065–1067
Chandrasekhar, S. 1961. Hydrodynamic and Hydromagnetic Stability, Oxford: Clarendon Press
Clark, T.L., Hall, W.D., Kerr, R.M., Middleton, D., Radke, L., Ralph, F.M., Nieman, P.J. & Levinson, D. 2000. Origins of aircraft-damaging clear-air turbulence during the 9 December 1992 Colorado downslope windstorm: numerical simulations and comparison with observations. Journal of the Atmospheric Sciences, 57: 1105–1131
Drazin, P.G. & Reid, W.H. 1981. Hydrodynamic Stability, Cambridge: Cambridge University Press
Dutton, J.A. & Panofsky, H.A. 1970. Clear air turbulence: a mystery may be unfolding. Science, 167: 937–944
von Helmholtz, H. 1868. On discontinuous movements of fluids. Philosophical Magazine, 36: 337–346 (originally published in German, 1862)
Kelvin, Lord. 1871. Hydrokinetic solutions and observations. Philosophical Magazine, 42: 362–377
Lane, T.P., Sharman, R.D., Clark, T.L. & Hsu, H.-M. 2003. An investigation of turbulence generation mechanisms above deep convection. Journal of the Atmospheric Sciences, 60: 1297–1321
Richardson, L.F. 1920. The supply of energy from and to atmospheric eddies. Proceedings of the Royal Society, London, A97: 354–373
Sharman, R., Fowler, T.L., Brown, B.G. & Wolff, J. 2002. Climatologies of upper-level turbulence over the continental U.S. and oceans. Preprints, 10th Conference on Aviation, Range, and Aerospace Meteorology, Portland OR: American Meteorological Society, J29–J32
Staquet, C. & Sommeria, J. 2002. Internal gravity waves: from instabilities to turbulence. Annual Reviews of Fluid Mechanics, 34: 559–593
Tebaldi, C., Nychka, D., Brown, B.G. & Sharman, R. 2002. Flexible discriminant techniques for forecasting clear-air turbulence. Environmetrics, 13(8): 859–878
Thorpe, S.A. 1987. Transitional phenomena and the development of turbulence in stratified fluids: a review. Journal of Geophysical Research, 92: 5231–5248
Vinnichenko, N.K., Pinus, N.Z., Shmeter, S.M. & Shur, G.N. 1980. Turbulence in the Free Atmosphere, 2nd edition, New York: Consultants Bureau
Werne, J. & Fritts, D.C. 1999. Stratified shear turbulence: evolution and statistics. Geophysical Research Letters, 26: 439–442
Wurtele, M.G., Sharman, R.D. & Datta, A. 1996. Atmospheric lee waves. Annual Reviews of Fluid Mechanics, 28: 429–476
CLUSTER COAGULATION
In 1916, nine years before he was awarded the Nobel prize for his studies of colloidal solutions, the Austro-Hungarian chemist Richard Zsigmondy (1865–1929) brought forth the first model for cluster coagulation. Interpreting the behavior of aqueous solutions of gold colloidal particles, he posited that each cluster of particles is surrounded by a sphere of influence. According to this model, clusters execute independent Brownian motions when their spheres of influence do not overlap. Whenever the spheres of influence of a pair of clusters touch, the clusters instantaneously stick together to form a new cluster. This kind of non-equilibrium kinetics (See Nonequilibrium statistical mechanics) has proven to be truly ubiquitous: bond formation between polymerization sites; the coalescence of rain drops, smog, smoke, and dust; the aggregation of bacteria into colonies; the formation of planetesimals from submicron dust grains; the coalescence arising in genetic trees; and even the merging of banks to form ever-larger financial institutions are all examples of cluster coagulation.
Cluster coagulation results in a broad distribution of cluster sizes described by $\{n_i(t)\}_{i = 1, \ldots, \infty}$, where $n_i(t)$ is the number of clusters of size $i$ present in the system at time $t$. The size of a cluster is defined as the number of unit clusters that it comprises. The primary goal of coagulation theory is to determine the evolution of $n_i(t)$ for all $i$.
The most important theory of coagulation was given by the Polish physicist Marian Smoluchowski (1872–1917) (Smoluchowski, 1916, 1917). In 1916, prompted by a request from Zsigmondy to provide a mathematical description of coagulation, Smoluchowski postulated that (1) clusters are randomly distributed in space and this feature persists throughout the coagulation process, (2) only collisions between pairs of clusters are significant, and (3) the number of new clusters of size $i + j$, formed per unit time and unit volume due to collisions of clusters of sizes $i$ and $j$, is proportional to the product of the concentrations $c_i = n_i/V$ and $c_j = n_j/V$:

$$\frac{\text{number of new clusters}}{V\,\Delta t} = K_{i,j}\, c_i c_j. \tag{1}$$
Here, $V$ is the volume of the coagulating system and $K_{i,j}$ is the collision frequency factor, also called the coagulation kernel. The rate equation describing the evolution of $c_i(t)$ follows from the balance between the total number of clusters of size $i$ created and annihilated as a result of coagulation:

$$\dot c_i(t) = \frac{1}{2}\sum_{j=1}^{i-1} K_{i-j,j}\, c_{i-j}(t)\, c_j(t) - \sum_{j=1}^{\infty} K_{i,j}\, c_i(t)\, c_j(t), \qquad i = 1, \dots, \infty. \tag{2}$$
Here, $\dot c_i(t)$ is the time derivative of the concentration $c_i(t)$. This equation (in fact, an infinite chain of coupled nonlinear ordinary differential equations) is called the Smoluchowski coagulation equation (SCE). Given the kernel $K_{i,j}$ and an initial distribution $c_i(0)$, it describes the evolution of the distribution $c_i(t)$ in a homogeneous aggregating system. The SCE does not depend on the spatial dimension in which the coagulation process is taking place.

According to modern terminology, Smoluchowski theory is a mean field (See Phase transitions) theory of nonequilibrium growth. It neglects fluctuations of the concentrations $c_k$; that is, it presumes the existence of the thermodynamic limit: $V \to \infty$, $n_k \to \infty$, $n_k/V \to c_k$. For a broad variety of aggregating systems, this proves to be a reasonable assumption. However, the first assumption of Smoluchowski, that correlations in the distribution of the clusters may be disregarded, is not always satisfied. Aggregating systems fulfilling this assumption are called well-mixed. In low-dimensional systems with no “external” mixing mechanism, the cluster collisions are often not able to provide sufficient mixing. This may result in correlation build-up and, therefore, a breakdown of Equation (2).

The last, essential, albeit obvious, condition for the applicability of the SCE is that interactions between aggregating clusters be treated as instantaneous collisions, that is, $\tau_{\rm col} \ll \tau_{\rm coag}$, where $\tau_{\rm col}$ is the characteristic time scale of collisions and $\tau_{\rm coag}$ is the characteristic time scale of coagulation. Thus, similar to the Boltzmann kinetic equation (See Nonequilibrium statistical mechanics), the SCE describes a slow evolution of the distribution due to fast collisions.

For a continuous distribution $c(t, v)$, SCE takes the form of an integro-differential equation (Müller, 1928):

$$\partial_t c(t, v) = \frac{1}{2}\int_0^{v} K(u, v - u)\, c(t, u)\, c(t, v - u)\, du - c(t, v)\int_0^{\infty} K(u, v)\, c(t, u)\, du. \tag{3}$$
Here $u$, $v$ are the physical sizes of clusters. Although the SCE establishes a firm foundation for our understanding of a great variety of cluster coagulation processes, other models have also been devised. They include the Oort–van de Hulst–Safronov equation, which describes cluster coagulation as a continuous growth process, and various stochastic models, such as Kingman’s coalescent and the Marcus–Lushnikov process. In this article, we limit ourselves to the Smoluchowski coagulation theory.

The mathematical structure of SCE can be traced to the master equation for stochastic processes:

$$\dot P_i(t) = \sum_j \left[ w_{i,j} P_j(t) - w_{j,i} P_i(t) \right]. \tag{4}$$
Here, $P_i(t)$ is the probability of finding the system in a state $i$ at time $t$, and $w_{i,j}$ is the probability of transition from state $j$ to state $i$ per unit time. Under the aforementioned assumptions of Smoluchowski theory, $P_i = c_i$, $w_{i,j} = K_{i,j} P_i$. Thus, we arrive at the probabilistic interpretation of $K_{i,j}$ as the probability of coagulation of a pair of clusters $i$ and $j$ in unit volume per unit time.

Calculating the coagulation kernel for a particular coagulation mechanism is a separate problem that is beyond the scope of this article. However, a few remarks are appropriate. First, due to the aforementioned timescale separation, calculation of $K$ can be treated as a stationary problem. Second, the probability of cluster collisions will depend on the cluster geometry. Very often, the aggregates prove to be fractals (See Fractals; Pattern formation) having no characteristic size. The coagulation kernel $K$ will then depend on their fractal dimension $D_f$. Third, the probability of coagulation of a pair of clusters is a product of the probability of their collision and the sticking efficiency $E$. The latter is defined as the probability of clusters merging once they have collided. Two practically important limiting cases are distinguished for coagulation of diffusing clusters: diffusion-limited cluster-cluster aggregation (DLCA), when $E \approx 1$, and reaction-limited cluster-cluster aggregation (RLCA), when $E \ll 1$. DLCA and RLCA produce aggregates of different fractal dimensions and have kinetics of different speeds. Fourth, when the coagulation mechanism is scale-free and the aggregates are fractals, the coagulation kernel $K$ should also be scale-free. In mathematical terms this amounts to the following requirement on the function $K(u, v)$:

$$K(\lambda u, \lambda v) = \lambda^{\alpha} K(u, v) \tag{5}$$

for any real $\lambda > 0$. Such kernels are called homogeneous, and $\alpha$ is called the homogeneity index. Several examples of widely used kernels are listed in Table 1.

Our present knowledge of when solutions of (2) exist and are unique is limited, as the nonlinearity of SCE presents challenging problems for rigorous mathematical analysis. Existence and uniqueness of solutions for all times have been proven for the kernels $K(u, v) \le C(u + v)$, where $C$ is a constant. This result has recently been extended to the kernels $K(u, v) \le r(u) r(v)$, where $r(v) = o(v)$ as $v \to \infty$ (Norris, 1999; Leyvraz, 2003).

The distribution function is significant because any macroscopic property characterizing a given coagulating system can be calculated in the continuous (discrete) case as an integral (sum) over the distribution. From a mathematical point of view, the distribution function is simply a time-dependent measure on the set of all cluster sizes. Therefore, it is natural to look for weak solutions to SCE. Weak solutions can be conveniently defined in the continuous case by means of a Laplace transformed SCE, which describes the time evolution of the Laplace transform of the concentration. Weak solutions are inverse Laplace transforms of the solutions to this equation. The discrete case can be treated analogously with the help of generating functions. In fact, most of the presently known exact solutions of SCE were obtained by this approach.

Since the total mass of clusters is evidently conserved during collisions, the SCE is expected to conserve the first moment of the distribution

$$M_1(t) = \int_0^{\infty} v\, c(t, v)\, dv. \tag{6}$$
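As a rough numerical illustration of Equation (2) and of the conserved first moment, the sketch below integrates the discrete SCE for the constant kernel $K_{i,j} = 2$ with monodisperse initial conditions $c_1(0) = 1$, and compares the result with the classical closed-form solution $c_k(t) = t^{k-1}/(1+t)^{k+1}$ for this case. The truncation at a largest cluster size `n_max`, the fixed-step Runge-Kutta integrator, and all function names are assumptions made for illustration; they are not part of the theory described above.

```python
import numpy as np

def sce_rhs(c, K):
    """Right-hand side of Equation (2), truncated at a largest size n = len(c).

    c[k-1] holds the concentration of clusters of size k and
    K[i-1, j-1] holds the kernel K_{i,j}.
    """
    n = len(c)
    gain = np.zeros(n)
    for k in range(2, n + 1):
        i = np.arange(1, k)                      # sizes i = 1..k-1, partner j = k-i
        gain[k - 1] = 0.5 * np.sum(K[i - 1, k - i - 1] * c[i - 1] * c[k - i - 1])
    loss = c * (K @ c)                           # c_k * sum_j K_{k,j} c_j
    return gain - loss

def integrate_rk4(c0, K, t_end, dt):
    """Fixed-step classical fourth-order Runge-Kutta integration."""
    c = c0.copy()
    for _ in range(int(round(t_end / dt))):
        k1 = sce_rhs(c, K)
        k2 = sce_rhs(c + 0.5 * dt * k1, K)
        k3 = sce_rhs(c + 0.5 * dt * k2, K)
        k4 = sce_rhs(c + dt * k3, K)
        c = c + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return c

n_max = 60                                       # assumed truncation size
K = np.full((n_max, n_max), 2.0)                 # constant kernel
c0 = np.zeros(n_max)
c0[0] = 1.0                                      # monomers only at t = 0
t_end = 2.0

c = integrate_rk4(c0, K, t_end, dt=0.01)
sizes = np.arange(1, n_max + 1)
exact = t_end ** (sizes - 1) / (1.0 + t_end) ** (sizes + 1)
m1 = np.sum(sizes * c)                           # discrete first moment

print("max deviation from closed form:", np.max(np.abs(c - exact)))
print("first moment M1:", m1)
```

The truncation is harmless here only because the concentrations decay geometrically with size for this kernel; for kernels that gel, mass would escape through the upper boundary and the same check would fail.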
It appears as a deceptively simple exercise to prove this by substituting (6) into SCE. The proof, however, hinges on the condition that the infinite sums involved are convergent. Violation of this condition gives rise to the important phenomenon of gelation, also known as the gel-sol transition or runaway growth. It was first predicted for the kernel $K(u, v) = uv$, which serves as a model of polymerization (See Polymerization) in which new links are formed randomly between polymerization sites. In the mean field approximation, this model also describes random graph growth and bond percolation (See Percolation theory). An exact solution of this problem shows that starting with monodisperse initial conditions, the first moment $M_1(t)$ begins to decay after a finite time, $t_c$, whereas the second moment $M_2(t)$, measuring the average cluster size, diverges. This behavior corresponds to the formation of an infinite cluster, or gel, in a finite time due to the coagulation kinetics “accelerating” with growing cluster sizes. The sol mass $M_1$ decreases, as part of it is being lost to the gel. This kind of kinetics has also proved to be a key to the explanation of the rapid growth of Jupiter and planetesimal growth in the terrestrial planets. It has been shown that $M_1(t) = M_1(0)$ for all times, that is, the system is nongelling, when $K(u, v) \le C(u + v)$. A wealth of data suggests that this is, in fact, the exact bound separating gelation at finite times from nongelling behavior, although a rigorous proof has not yet been given.

Table 1. Examples of kernels. All kernels are given up to a non-dimensional prefactor. $D_f$ is the fractal dimension of the coagulates.

| Coagulation mechanism | Kernel | Originator (year) |
|---|---|---|
| “Mating” | $2$ | Smoluchowski (1916) |
| Brownian motion | $(r(u) + r(v))^2 / (r(u)\, r(v))$; $r(v) \propto v^{1/D_f}$ | Smoluchowski (1916) |
| Isotropic turbulent shear | $(r(u) + r(v))^3$; $r(v) \propto v^{1/D_f}$ | Saffman and Turner (1956) |
| Gravitational coalescence | $(r(u) + r(v))^2\, \lvert r(u)^2 - r(v)^2 \rvert$; $r(v) \propto v^{1/D_f}$ | Findheisen (1939) |
| Polymerization ($RA_\infty$ model) | $uv$ | Flory (1953) |
| Polymerization ($ARB_\infty$ model) | $u + v$ | |

The nonlinear character of SCE complicates mathematical analysis for arbitrary kernels, and the set of exactly solvable kernels is limited. Considerable progress in understanding coagulation kinetics has been achieved for the wide class of homogeneous (or asymptotically homogeneous) kernels by looking for similarity solutions. This can be done in two different ways, which we shall refer to as the self-preservation theory and the self-similarity theory.

The first approach embodies the notion of a single characteristic size in the system, which can be chosen to be equal to an average cluster size, $s(t)$. The asymptotic solution of SCE is then expected to have a self-preserving (scaling) form:

$$c(t, v) = s(t)^{-2}\, \Phi\!\left(v/s(t)\right). \tag{7}$$
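The self-preserving form can be checked by hand in the one case where everything is known in closed form. For the constant kernel $K = 2$ with $c_1(0) = 1$, the classical exact solution of the discrete SCE is $c_k(t) = t^{k-1}/(1+t)^{k+1}$, the mean cluster size is $s(t) = M_1/M_0 = 1 + t$, and the rescaled distribution $s^2 c$ should approach the scaling function $e^{-x}$ with $x = k/s$. The sketch below measures how quickly that collapse happens; the choice of sample points `x` and the helper names are illustrative assumptions.

```python
import numpy as np

def log_exact(k, t):
    """Logarithm of c_k(t) = t^(k-1) / (1+t)^(k+1) (constant kernel K = 2, c_1(0) = 1).

    Working with logarithms avoids overflow of t^(k-1) at large k.
    """
    return (k - 1) * np.log(t) - (k + 1) * np.log1p(t)

def collapse_error(t, x):
    """Largest deviation of s^2 * c_k from exp(-k/s) at sizes k close to x * s."""
    s = 1.0 + t                                   # mean cluster size M1/M0
    k = np.maximum(np.rint(x * s), 1.0)           # nearest integer sizes to v = x * s
    rescaled = s ** 2 * np.exp(log_exact(k, t))   # s(t)^2 * c(t, v)
    return np.max(np.abs(rescaled - np.exp(-k / s)))

x = np.array([0.5, 1.0, 2.0, 4.0])                # rescaled sizes v / s(t)
times = (10.0, 100.0, 1000.0)
errors = [collapse_error(t, x) for t in times]

for t, err in zip(times, errors):
    print(f"t = {t:7.1f}   max |s^2 c - exp(-v/s)| = {err:.2e}")
```

The deviation shrinks roughly in proportion to $1/s(t)$, which is the sense in which the scaling form is only asymptotic.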
The self-preserving form has been further studied with the objective of identifying when it will lead to a power-law distribution asymptotically (Leyvraz, 2003). This approach allows one to obtain sensible results for a large variety of problems, including some extensions of SCE, in an almost automatic manner. However, since the scaling hypothesis is postulated, one cannot estimate a priori the accuracy of these solutions.

Experimental data on coagulation often display a power-law distribution over some range of cluster sizes. The second approach deals with a coagulating system maintained at steady state by an external source of monomers. By analogy with the scaling theories for turbulent flows and the theory of critical phenomena, it may be expected that the steady-state distribution at large sizes will “forget” the forcing scale $v_0$ and, therefore, will evolve to a scale-free form,

$$c(v) = \text{const} \times v^{-\tau}. \tag{8}$$
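A small worked case makes the scale-free form concrete. Assume the discrete SCE with a constant kernel $K$ and a steady input of monomers at rate $J$ (both symbols, and the function name below, are introduced here only for illustration). Setting the time derivative to zero and summing over all sizes gives the total number concentration $M_0 = \sqrt{2J/K}$, after which the steady state follows from the recursion $c_1 = J/(K M_0)$, $c_k = \frac{1}{2M_0}\sum_{i+j=k} c_i c_j$ for $k \ge 2$. The sketch evaluates this recursion and estimates the local power-law exponent at large sizes, which for this kernel comes out close to $3/2$.

```python
import numpy as np

def steady_state(n_max, K=2.0, J=1.0):
    """Steady-state concentrations c_1..c_n_max for a constant kernel K
    with monomer input rate J (illustrative assumption, see lead-in)."""
    m0 = np.sqrt(2.0 * J / K)                 # total number concentration
    c = np.zeros(n_max + 1)                   # c[k] for k = 1..n_max, c[0] unused
    c[1] = J / (K * m0)
    for k in range(2, n_max + 1):
        # sum over pairs (i, k - i) with i = 1..k-1
        c[k] = np.dot(c[1:k], c[k - 1:0:-1]) / (2.0 * m0)
    return c[1:]

c = steady_state(2000)

# Local slope of log c versus log k between k = 1000 and k = 2000.
slope = np.log(c[1999] / c[999]) / np.log(2.0)
print("c_1, c_2, c_3 =", c[0], c[1], c[2])
print("local exponent at large sizes:", -slope)
```

The recursion is exact for this kernel, so the only approximation in the exponent estimate is the finite size range over which the slope is measured.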
A careful mathematical analysis has shown that this is indeed the case for a wide range of homogeneous kernels, and that the asymptotic distribution equals

$$c(v) = \left(\frac{E}{\kappa}\right)^{1/2} v^{-(3+\alpha)/2}. \tag{9}$$

Here, $E$ is the total influx of mass into the system due to the forcing and $\kappa$ is a kernel-dependent constant.

DMITRI O. PUSHKIN AND HASSAN AREF

See also Brownian motion; Dimensional analysis; Fractals; Nonequilibrium statistical mechanics; Pattern formation; Percolation theory; Phase transitions; Polymerization
Further Reading
Aldous, D.J. 1999. Deterministic and stochastic methods for coalescence (aggregation, coagulation): review of the mean-field theory for probabilists. Bernoulli, 5: 3
Drake, R.L. 1972. A general mathematical survey of the coagulation equation. In Topics in Current Aerosol Research, edited by G.M. Hidy and J.R. Brock, part 2, Oxford: Pergamon Press
Family, F. & Landau, D.P. (editors). 1984. Kinetics of Aggregation and Gelation, Amsterdam and New York: Elsevier Science
Findheisen, W. 1939. Zur Frage der Regentropfenbildung in reinen Wasserwolken. Meteorologische Zeitschrift, 56: 365–368
Flory, P. 1953. Principles of Polymer Chemistry. Ithaca: Cornell University Press
Friedlander, S.K. 1960. Similarity consideration for the particle-size spectrum of a coagulating, sedimenting aerosol. 17(5): 479
Friedlander, S.K. 2000. Smoke, Dust, and Haze: Fundamentals of Aerosol Dynamics, 2nd edition, Oxford and New York: Oxford University Press
Friedlander, S.K. & Wang, C.S. 1966. The self-preserving particle size distribution for coagulation by Brownian motion. Journal of Colloid and Interface Science, 22: 126
Galina, H. & Lechowicz, J.B. 1998. Mean-field kinetic modeling of polymerization: the Smoluchowski coagulation equation. Advances in Polymer Science, 137: 135–172
Hunt, J.R. 1982. Self-similar particle-size distributions during coagulation: theory and experimental verification. Journal of Fluid Mechanics, 122: 169
Leyvraz, F. 2003. Scaling theory and exactly solved models in the kinetics of irreversible aggregation. Physics Reports, 383: 95–212
Müller, H. 1928. Zur allgemeinen Theorie der raschen Koagulation. Kolloid-chemische Beihefte, 27: 223
Norris, J.R. 1999. Uniqueness, non-uniqueness, and a hydrodynamic limit for the stochastic coalescent. Annals of Applied Probability, 9: 78
Pushkin, D.O. & Aref, H. 2002. Self-similarity theory of stationary coagulation. Physics of Fluids, 14(2): 694
Saffman, P.G. & Turner, J.S. 1956. On the collision of drops in turbulent clouds. Journal of Fluid Mechanics, 1: 16–30
Smoluchowski, M.V. 1916. Drei Vorträge über Diffusion, Physikalische Zeitschrift, 17: 557
Smoluchowski, M.V. 1917. Versuch einer mathematischen Theorie der Koagulationskinetik kolloider Lösungen. Zeitschrift für Physikalische Chemie, 92: 129
van Dongen, P.G.J. & Ernst, M.H. 1985. Dynamic scaling in the kinetics of clustering. Physical Review Letters, 54: 1396
van Dongen, P.G.J. & Ernst, M.H. 1988. Scaling solution of Smoluchowski coagulation equation. Journal of Statistical Physics, 50: 295
CNOIDAL WAVE See Elliptic functions
COHERENCE PHENOMENA The word coherence comes from the Latin cohaerens, meaning “being in relation.” Thus, coherence phenomena are those displaying a high level of correlation between several objects.
From the physical point of view, it is necessary to distinguish between two types of coherence: state coherence, which characterizes correlations between static properties of the considered objects, and transition coherence, which describes correlated dynamical processes. These types of coherence are two sides of the same coin, and one obtains a better insight from considering them together.

To gain an intuitive idea of these two types of coherence, imagine a group of soldiers all standing at attention, without moving. This corresponds to state coherence. If the soldiers were all in different positions (some standing, some sitting, some lying down), there would be no state coherence between them. Now, imagine well-aligned rows of soldiers in a parade, moving synchronously with respect to each other. This corresponds to transition coherence. If, instead, they were to march with different speeds and in different directions, transition coherence would be absent.

Coherence is related to the existence of a kind of order, be it a static order defining the same positions or an ordered motion of a group. The antonym of coherence is then chaos. Thus, state chaos means the absence of any static order among several objects, and transition chaos implies an absolutely disorganized motion of an ensemble of constituents.

The notion of coherence is implicit in the existence of correlation among several objects (enumerated with the index $i = 1, 2, \dots, N$). Each object, placed at the spatial point $\mathbf{r}_i$ at time $t$, can be associated with a set $\{Q_\alpha(\mathbf{r}_i, t)\}$ of observable quantities labeled by $\alpha$. To formalize the definition of state and transition coherence, one may write $Q_i^\alpha = Q_\alpha(\mathbf{r}_i, t)$, where $Q_i^z$ corresponds to a state property of an object, while $Q_i^x$ and $Q_i^y$ describe its motion. As an illustration, assume that $Q_i^\alpha$ are spin components. Another example assumes that $Q_i^z$ is the population difference of a resonant atom, while $Q_i^x$ and $Q_i^y$ are its transition dipoles. Instead of considering the latter separately, it is convenient to introduce the complex combinations $Q_i^{\pm} \equiv Q_i^x \pm i Q_i^y$. (In general, $Q_i^\alpha$ are not restricted to classical quantities but may be operators.) If the system is associated with a statistical operator $\hat\rho$, then the observable quantities are the statistical averages

$$\langle Q_i^\alpha \rangle \equiv \mathrm{Tr}\, \hat\rho\, Q_i^\alpha, \tag{1}$$

expressed by means of the trace operation. A convenient way of describing the system features is by introducing dimensionless quantities, normalized to the number of objects $N$ and to the maximal value $Q \equiv \max \langle Q_i^z \rangle$. Then, one may define the state variable

$$s \equiv \frac{1}{QN} \sum_{i=1}^{N} \langle Q_i^z \rangle \tag{2}$$

and the transition variable

$$u \equiv \frac{1}{QN} \sum_{i=1}^{N} \langle Q_i^- \rangle. \tag{3}$$
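The state and transition variables of Equations (2) and (3) are easy to evaluate for a classical toy ensemble. The sketch below assumes unit-length classical quantities ($Q = 1$): for the state variable each object carries $Q_i^z = \pm 1$, and for the transition variable each object is an oscillator with $Q_i^- = Q_i^x - iQ_i^y = e^{-i\varphi_i}$. The ensemble size, the random-number seed, and the function names are illustrative assumptions, and plain arithmetic means stand in for the statistical averages.

```python
import numpy as np

def state_variable(qz, q_max=1.0):
    """s = (1 / (Q N)) * sum_i Q_i^z, as in Equation (2)."""
    return np.sum(qz) / (q_max * len(qz))

def transition_variable(qx, qy, q_max=1.0):
    """u = (1 / (Q N)) * sum_i Q_i^-, with Q^- = Q^x - i Q^y, as in Equation (3)."""
    q_minus = qx - 1j * qy
    return np.sum(q_minus) / (q_max * len(q_minus))

rng = np.random.default_rng(seed=1)
n = 10_000

# State variable: identical states versus randomly assigned states.
s_ordered = state_variable(np.ones(n))
s_random = state_variable(rng.choice([-1.0, 1.0], size=n))

# Transition variable: identical phases versus uniformly random phases.
phase_same = np.full(n, 0.3)
phase_rand = rng.uniform(0.0, 2.0 * np.pi, size=n)
u_ordered = transition_variable(np.cos(phase_same), np.sin(phase_same))
u_random = transition_variable(np.cos(phase_rand), np.sin(phase_rand))

print("|s| ordered / random:", abs(s_ordered), abs(s_random))
print("|u| ordered / random:", abs(u_ordered), abs(u_random))
```

For the disordered ensembles the moduli are not exactly zero but of order $1/\sqrt{N}$, which is the finite-size residue of a quantity that vanishes only in the limit of many objects.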
One may distinguish two opposite cases, when the individual states of all objects are the same and when they are randomly distributed. These two limiting cases give

$$|s| = \begin{cases} 1, & \text{state coherence}, \\ 0, & \text{state chaos}. \end{cases} \tag{4}$$

Next consider the transition characteristic (3) and collective motion of an ensemble of oscillators. Again, there can be two opposite situations, when the oscillation frequencies of all oscillators, as well as their initial phases, are identical and when these take randomly different values. For the corresponding limiting cases of completely synchronized oscillations and of an absolutely random motion, respectively, one has

$$|u| = \begin{cases} 1, & \text{transition coherence}, \\ 0, & \text{transition chaos}. \end{cases} \tag{5}$$

In the intermediate situation, one may say that there is partial state coherence if $0 < |s| < 1$ and partial transition coherence when $0 < |u| < 1$.

Accepting that coherence is not necessarily total, it is convenient to define qualitative characteristics for partial coherence by introducing correlation functions. Let $Q_\alpha^+(\mathbf{r}, t)$ denote the Hermitian conjugate of an operator $Q_\alpha(\mathbf{r}, t)$. When $Q_\alpha(\mathbf{r}, t)$ is a classical function, Hermitian conjugation means complex conjugation. For any two operators from the set $\{Q_\alpha(\mathbf{r}, t)\}$, one may define the correlation function

$$C_{\alpha\beta}(\mathbf{r}_1, t_1, \mathbf{r}_2, t_2) \equiv \langle Q_\alpha^+(\mathbf{r}_1, t_1)\, Q_\beta(\mathbf{r}_2, t_2) \rangle. \tag{6}$$

The function $C_{\alpha\alpha}(\dots)$ for coinciding operators is called the autocorrelation function. There is also a shifted correlation function

$$B_{\alpha\beta} \equiv \langle Q_\alpha^+ Q_\beta \rangle - \langle Q_\alpha^+ \rangle \langle Q_\beta \rangle,$$

where, for brevity, the spatiotemporal variables are not written explicitly. For describing coherent processes, it is convenient to use the normalized correlation function

$$K_{\alpha\beta} \equiv \frac{\langle Q_\alpha^+ Q_\beta \rangle}{\left( \langle Q_\alpha^+ Q_\alpha \rangle \langle Q_\beta^+ Q_\beta \rangle \right)^{1/2}}, \tag{7}$$

which is sometimes termed a “coherence function.” Functions (6) and (7) can be specified as second-order correlation functions since, in general, it is possible to define higher-order correlation functions, such as the $2p$-order function

$$C_{\alpha_1 \dots \alpha_{2p}} = \langle Q_{\alpha_1}^+ \dots Q_{\alpha_p}^+ Q_{\alpha_{p+1}} \dots Q_{\alpha_{2p}} \rangle,$$

which are closely related to reduced density matrices.
Correlations are usually strongest among nearby spatiotemporal points. Thus, function (7) varies in the interval $0 \le |K_{\alpha\beta}| \le 1$, being maximal for the autocorrelation function, $|K_{\alpha\alpha}| = 1$, at the coinciding points $\mathbf{r}_1 = \mathbf{r}_2$, $t_1 = t_2$. When either the spatial or temporal distance between two points increases, correlations diminish; this is named correlation decay. At an asymptotically large distance, the correlation function (6) for two local observables displays the property of correlation weakening (correlation decoupling)

$$\langle Q_\alpha^+(\mathbf{r}_1, t_1)\, Q_\beta(\mathbf{r}_2, t_2) \rangle \simeq \langle Q_\alpha^+(\mathbf{r}_1, t_1) \rangle \langle Q_\beta(\mathbf{r}_2, t_2) \rangle, \tag{8}$$

where either $|\mathbf{r}_1 - \mathbf{r}_2| \to \infty$ or $|t_1 - t_2| \to \infty$. It is important to stress that property (8) holds only for local observables; thus, for operators representing no observable quantities, correlation decoupling generally has no meaning.

Coherence characteristically implies correlations between similar objects, which require the use of autocorrelation functions. To describe coherence decay, it is also necessary to fix a point from which this decay is measured (usually at $\mathbf{r} = 0$ and $t = 0$), whereupon coherence decay is studied by considering an autocorrelation function

$$C_\alpha(\mathbf{r}, t) \equiv \langle Q_\alpha^+(\mathbf{r}, t)\, Q_\alpha(0, 0) \rangle. \tag{9}$$
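Correlation decay can be seen in a minimal classical model. The sketch below generates a complex first-order autoregressive signal $q_{n+1} = \rho\, q_n + \sqrt{1 - \rho^2}\, \xi_n$ with unit-variance complex noise $\xi_n$, for which the normalized autocorrelation at lag $\tau$ is $\rho^{\tau}$, and estimates that autocorrelation from the data. Replacing the statistical average by a time average over one long realization is an assumption (it presumes a stationary, ergodic signal), and the model, the parameter values, and the function name are purely illustrative.

```python
import numpy as np

def normalized_autocorrelation(q, max_lag):
    """Time-average estimate of <q*(t + lag) q(t)> / <|q|^2> for lag = 0..max_lag."""
    n = len(q)
    norm = np.mean(np.abs(q) ** 2)
    return np.array([np.mean(np.conj(q[lag:]) * q[:n - lag]) / norm
                     for lag in range(max_lag + 1)])

rng = np.random.default_rng(seed=0)
rho, n = 0.9, 100_000

# Complex white noise with unit variance.
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2.0)

# First-order autoregressive signal with stationary variance one.
q = np.empty(n, dtype=complex)
q[0] = noise[0]
for i in range(1, n):
    q[i] = rho * q[i - 1] + np.sqrt(1.0 - rho ** 2) * noise[i]

k_est = normalized_autocorrelation(q, max_lag=40)
k_model = rho ** np.arange(41)

print("lag  0:", abs(k_est[0]), " model:", k_model[0])
print("lag 10:", abs(k_est[10]), " model:", k_model[10])
print("lag 40:", abs(k_est[40]), " model:", k_model[40])
```

The estimate equals one at zero lag by construction and falls off geometrically with lag, up to statistical noise of order $1/\sqrt{n}$.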
In many cases, there exists a spatial direction of particular importance, for example, the direction of field propagation. It is natural to associate this special direction with the longitudinal $z$-axis and the transverse direction with the radial variable $r_\perp$. The characteristic scale of coherence decay in the longitudinal direction is called the coherence length $l_{\rm coh}$, where

$$l_{\rm coh}^2 \equiv \frac{\int z^2\, |C_\alpha(\mathbf{r}, t)|^2\, d\mathbf{r}}{\int |C_\alpha(\mathbf{r}, t)|^2\, d\mathbf{r}}, \tag{10}$$

and the integration is over the entire space volume. Coherence decay in the transverse direction is characterized by the transverse coherence radius $r_{\rm coh}$, where

$$r_{\rm coh}^2 \equiv \frac{\int r_\perp^2\, |C_\alpha(\mathbf{r}, t)|^2\, d\mathbf{r}}{\int |C_\alpha(\mathbf{r}, t)|^2\, d\mathbf{r}}. \tag{11}$$

For isotropic systems, one replaces $r_\perp$ by the spherical radius $r$ and obtains a coherence radius from equation (11). It is natural to call $A_{\rm coh} \equiv \pi r_{\rm coh}^2$ the coherence area and $V_{\rm coh} \equiv A_{\rm coh}\, l_{\rm coh}$ the coherence volume. The typical scale of temporal correlation decay is termed the coherence time $t_{\rm coh}$, where

$$t_{\rm coh}^2 \equiv \frac{\int_0^\infty t^2\, |C_\alpha(\mathbf{r}, t)|^2\, dt}{\int_0^\infty |C_\alpha(\mathbf{r}, t)|^2\, dt}. \tag{12}$$

As seen, the coherence length (10) and coherence radius (11) are related to a fixed moment of time, while the coherence time (12) defines the temporal coherence decay at a given spatial point. Equations (10)–(12) all have to do with a particular coherence phenomenon characterized by the correlation function (9).

Phase transitions in equilibrium statistical systems are collective phenomena demonstrating different types of state coherence arising under adiabatically slow variation of thermodynamic or system parameters (temperature, pressure, external fields, and so on). Phase transitions are conventionally specified by means of order parameters, which are defined as statistical averages of operators corresponding to some local observables. The order parameter is assumed to be zero in a disordered phase and nonzero in an ordered phase. For example, the order parameter at Bose–Einstein condensation is the fraction or density of particles in the single-particle ground state. The order parameter for the superconducting phase transition is the density of Cooper pairs or the related gap in the excitation spectrum. Superfluidity is characterized by the fraction or density of the superfluid component. For magnetic phase transitions, the order parameter is the average magnetization.

Thermodynamic phases can also be classified by order indices. Let the autocorrelation function (9) be defined for the operator related to an order parameter. Then, for a disordered phase, the coherence length is close to the interparticle distance and the coherence time is roughly the interaction time. But for an ordered phase, the coherence length is comparable with the system size and the coherence time becomes infinite. Taking account of heterophase fluctuations in the quasiequilibrium picture of phase transitions, there appear mesoscopic coherent structures, with the coherence length being much larger than the interparticle distance, but much smaller than the system size.
The coherence time of these mesoscopic coherent structures (their lifetime) is much longer than the local equilibrium time, although it may be shorter than the observation time. Such coherent structures are similar to those arising in turbulence.

Electromagnetic coherent radiation by lasers and masers presents a good example of transition coherence. Such radiation processes are accompanied by interference patterns, a phenomenon that is typical of coherent radiation and can be produced by atoms, molecules, nuclei, or other radiating objects. Interference effects caused by light beams are studied in nonlinear optics. But coherent radiation and related interference effects also exist in other ranges of electromagnetic radiation frequencies, including infrared, radio, or gamma regions. Moreover, there exist other types of field radiation, such as acoustic radiation or emission of matter waves formed by Bose-condensed atoms. Registration of interference between a reference beam and that reflected by an object is the basis for holography, which is the method of recording and reproducing wave fields.

The description of interference involves correlation functions. Let $Q_i(t)$ represent a field at time $t$, produced by a radiator at a spatial point $\mathbf{r}_i$. The radiation intensity of a single emitter may be defined as

$$I_i(t) \equiv \langle Q_i^+(t)\, Q_i(t) \rangle, \tag{13}$$

whereupon the radiation intensity for an ensemble of $N$ emitters is

$$I(t) = \sum_{i,j=1}^{N} \langle Q_i^+(t)\, Q_j(t) \rangle. \tag{14}$$

Separating the sums with $i = j$ and with $i \ne j$ yields

$$I(t) = \sum_{i=1}^{N} I_i(t) + \sum_{i \ne j}^{N} \langle Q_i^+(t)\, Q_j(t) \rangle, \tag{15}$$
which shows that intensity (14) is not simply a sum of the intensities (13) of individual emitters but also includes the interference part, expressed through the autocorrelation functions of type (9). The first term in equation (15) is the intensity of incoherent radiation, while the second term corresponds to the intensity of coherent radiation.

V.I. YUKALOV

See also Bose–Einstein condensation; Chaotic dynamics; Critical phenomena; Ferromagnetism and ferroelectricity; Lasers; Nonequilibrium statistical mechanics; Nonlinear optics; Order parameters; Phase transitions; Spatiotemporal chaos; Spin systems; Structural complexity; Superconductivity; Superfluidity; Turbulence

Further Reading
Andreev, A.V., Emelyanov, V.I. & Ilinski, Y.A. 1993. Cooperative Effects in Optics, Bristol: Institute of Physics Publishing
Benedict, M.G., Ermolaev, A.M., Malyshev, V.A., Sokolov, I.V. & Trifonov, E.D. 1996. Superradiance: Multiatomic Coherent Emission, Bristol: Institute of Physics Publishing
Bogolubov, N.N. 1967. Lectures on Quantum Statistics, Vol. 1, New York: Gordon and Breach
Bogolubov, N.N. 1970. Lectures on Quantum Statistics, Vol. 2, New York: Gordon and Breach
Coleman, A.J. & Yukalov, V.I. 2000. Reduced Density Matrices, Berlin: Springer
Klauder, J.R. & Skagerstam, B.S. 1985. Coherent States, Singapore: World Scientific
Klauder, J.R. & Sudarshan, E.C.G. 1968. Fundamentals of Quantum Optics, New York: Benjamin
Lifshitz, E.M. & Pitaevskii, L.P. 1980. Statistical Physics: Theory of Condensed State, Oxford: Pergamon Press
Mandel, L. & Wolf, E. 1995. Optical Coherence and Quantum Optics, Cambridge and New York: Cambridge University Press
Nozières, P. & Pines, D. 1990. Theory of Quantum Liquids: Superfluid Bose Liquids, Redwood, CA: Addison-Wesley
Perina, J. 1985. Coherence of Light, Dordrecht: Reidel
Scott, A.C. 1999. Nonlinear Science: Emergence and Dynamics of Coherent Structures, Oxford and New York: Oxford University Press
Ter Haar, D. 1977. Lectures on Selected Topics in Statistical Mechanics, Oxford: Pergamon Press
Yukalov, V.I. 1991. Phase transitions and heterophase fluctuations. Physics Reports, 208: 395–492
Yukalov, V.I. & Yukalova, E.P. 2000. Cooperative electromagnetic effects. Physics of Particles and Nuclei, 31: 561–602
COHERENT EXCITON See Excitons
COHERENT STRUCTURES See Emergence
COLE–HOPF TRANSFORM See Burgers equation
COLLAPSE See Development of singularities
COLLECTIVE COORDINATES

The soliton equations describe a number of important nonlinear physical phenomena. However, in real life, these phenomena are not precisely modeled by, say, the sine-Gordon (SG) equation, the nonlinear Schrödinger (NLS) equation, or some of the other pure soliton equations. Corrective and often small terms are added to include, for example, inhomogeneities, dissipation, or energy input. The resulting wave phenomena possess modified solitonic features that can be treated approximately by starting with a pure soliton solution and then allowing the parameters of the soliton solution to vary slowly with time under the influence of the perturbations instead of being constant. Solution parameters that are chosen to vary with time are called collective coordinates. They encompass the influence of the perturbations in the pure soliton equations. The advantage of introducing collective coordinates is to reduce a problem with infinitely many degrees of freedom to a problem with a few degrees of freedom (Kivshar & Malomed, 1989; Sánchez & Bishop, 1998).

To illustrate the use of a collective coordinate approach, we shall investigate the NLS equation with the perturbative term $\varepsilon R$:

$$i u_t + u_{xx} + 2|u|^2 u = \varepsilon R. \tag{1}$$

Here, $u = u(x, t)$, where $x$ is the spatial coordinate and $t$ is time. Subscripts $x$ and $t$ denote partial derivatives with respect to these variables. The simple single soliton solution of the pure NLS equation ($\varepsilon = 0$) reads

$$u(x, t) = a\, \mathrm{sech}(a\theta)\, \exp(i\xi\theta + i\sigma), \tag{2}$$

where $\theta = x - x_0$. This solution possesses four parameters $(a, \xi, x_0, \sigma) = (y_1, y_2, y_3, y_4)$, which we shall choose as collective coordinates. For weak perturbations, $\varepsilon \ll 1$, we shall assume that the collective coordinates depend slowly on time $t$ due to the influence of $\varepsilon R$. In addition, the perturbation leads to a radiation field of small amplitude, which is neglected.

A variational approach is employed to determine the time evolution of the collective coordinates, but this is not the only method available. In the framework of a variational approach, the collective coordinates are the generalized coordinates. Although perturbations may destroy the Hamiltonian property of the pure soliton equations, dissipative effects and external nonconservative forces can be accounted for in the variation of a Lagrangian function by introducing generalized forces associated with the generalized coordinates. Below, this is done by introducing a generalized force for each collective coordinate as in classical mechanics. The unperturbed NLS equation (and its complex conjugate) can be derived from the Lagrangian density (Caputo et al., 1995; Scott, 2003):

$$\mathcal{L}(u, u^*, u_t, u_t^*, u_x, u_x^*) = \frac{i}{2}\left(u^* u_t - u_t^* u\right) - |u_x|^2 + |u|^4. \tag{3}$$
The total Lagrangian function is

$$L(y_i, \dot y_i) = \int_{-\infty}^{\infty} \mathcal{L}\, dx, \tag{4}$$

where we denote the collective coordinates $y_i(t)$, $i = 1, 2, 3, 4$, and $\dot y_i$ is the time derivative of $y_i$. Together with Equation (1), the variation of the total Lagrangian leads to the Euler–Lagrange equations (Caputo et al., 1995; Scott, 2003)

$$\frac{\partial L}{\partial y_i} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot y_i}\right) = \varepsilon \int_{-\infty}^{\infty} R\, \frac{\partial u^*}{\partial y_i}\, dx + \text{c.c.}, \tag{5}$$

where c.c. stands for the complex conjugate of the preceding term on the right-hand side. The inhomogeneous term on the right-hand side is interpreted as the generalized force associated with the collective coordinate $y_i(t)$, which is a key result as we do not rely on a perturbation with a strict Hamiltonian nature. Another important feature is that the above approach provides as many dynamical equations as we choose collective coordinates; thus it is straightforward to determine the generalized forces associated with each collective coordinate.
To illustrate the method, we calculate the total Lagrangian for the NLS equation using the simple single soliton solution in (2):

$$L = \frac{2}{3}a^3 - 2a\xi^2 + 2a\xi\, \frac{dx_0}{dt} - 2a\, \frac{d\sigma}{dt}. \tag{6}$$
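The total Lagrangian (6) can be checked by direct quadrature. The sketch below builds the soliton (2) with time-dependent parameters, forms $u_t$ by the chain rule $u_t = \sum_i (\partial u/\partial y_i)\, \dot y_i$, inserts both into the Lagrangian density (3), and sums over a fine grid. The parameter values, their rates of change, the grid, and the function name are arbitrary illustrative choices; the check also shows that the terms proportional to $\dot a$ and $\dot\xi$ integrate to zero, which is why they do not appear in (6).

```python
import numpy as np

def soliton_fields(x, y, ydot):
    """Return u, u_x and u_t for the soliton (2) with time-dependent parameters.

    y    = (a, xi, x0, sigma)   collective coordinates
    ydot = their time derivatives
    """
    a, xi, x0, sigma = y
    da, dxi, dx0, dsigma = ydot
    theta = x - x0
    sech = 1.0 / np.cosh(a * theta)
    tanh = np.tanh(a * theta)
    phase = np.exp(1j * (xi * theta + sigma))

    u = a * sech * phase
    u_x = (-a * a * sech * tanh + 1j * xi * a * sech) * phase
    du_da = (sech - a * theta * sech * tanh) * phase
    # Chain rule: du/dxi = i*theta*u, du/dx0 = -u_x, du/dsigma = i*u.
    u_t = du_da * da + 1j * theta * u * dxi - u_x * dx0 + 1j * u * dsigma
    return u, u_x, u_t

y = (1.3, 0.4, 0.7, 0.2)            # a, xi, x0, sigma (arbitrary test values)
ydot = (0.05, -0.03, 0.8, 1.1)      # their rates of change (arbitrary test values)
a, xi, x0, sigma = y
da, dxi, dx0, dsigma = ydot

# Grid centred on the soliton; the integrand is negligible at the ends.
x = x0 + np.linspace(-30.0, 30.0, 60001)
dx = x[1] - x[0]

u, u_x, u_t = soliton_fields(x, y, ydot)
density = (0.5j * (np.conj(u) * u_t - np.conj(u_t) * u)).real \
          - np.abs(u_x) ** 2 + np.abs(u) ** 4
lagrangian_numeric = np.sum(density) * dx

lagrangian_formula = (2.0 / 3.0) * a ** 3 - 2.0 * a * xi ** 2 \
                     + 2.0 * a * xi * dx0 - 2.0 * a * dsigma

print("quadrature :", lagrangian_numeric)
print("formula (6):", lagrangian_formula)
```

A plain Riemann sum is enough here because the integrand is smooth and decays exponentially, so the quadrature error is far below the tolerance one would care about.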
Consider the perturbation

$$\varepsilon R = -i\Gamma u + \frac{i g_0 u}{1 + p/p_s}, \tag{7}$$

describing light propagation through an optical fiber amplifier with a loss factor $\Gamma$, gain $g_0$, and power $p = \int_{-\infty}^{\infty} |u|^2\, dx$. The constant $p_s$ is a saturation power level that is characteristic for a given fiber amplifier. From Equation (5), the resulting dynamical equations for the collective coordinates are

$$\frac{da}{dt} = -2\Gamma a + \frac{2 g_0 a}{1 + 2a^2/p_s}, \tag{8}$$

$$\frac{d\xi}{dt} = 0, \tag{9}$$

$$\frac{dx_0}{dt} = 2\xi, \tag{10}$$

$$\frac{d\sigma}{dt} = a^2 + \xi^2, \tag{11}$$

which is a drastic simplification compared to the original perturbed NLS problem. As for the sine-Gordon system, one can define a power balance condition by requiring $\dot a(t) = 0$, implying an equilibrium amplitude $a_\infty$ given as $a_\infty = \sqrt{(g_0 - \Gamma)\, p_s / (2\Gamma)}$.

A number of other strategies have been designed to determine the slow time variation of collective coordinates in perturbed soliton solutions. These include slow variation of the scattering data of the inverse scattering theory for pure solitons (Kivshar & Malomed, 1989; Lamb, 1980), more direct perturbation approaches (McLaughlin & Scott, 1978), and utilizing Hamiltonian structures in cases where the perturbation leads to Hamiltonian systems (Caputo & Flytzanis, 1991). An important result of perturbations is the formation of trailing radiation fields, which are low-amplitude linear waves created as perturbed solitons propagate (McLaughlin & Scott, 1978; Kivshar & Malomed, 1989; Willis, 1997). The solitons lose energy to the radiation field, and the variational approach used here can be extended to include such radiation.

MADS PETER SØRENSEN

See also Constants of motion and conservation laws; Energy analysis; Euler–Lagrange equations; Hamiltonian systems; Inverse scattering method or transform; Nonlinear optics; Nonlinear Schrödinger equations; Perturbation theory; Solitons
Further Reading
Caputo, J.G. & Flytzanis, N. 1991. Kink-antikink collisions in sine-Gordon and φ⁴ models: problems in the variational approach. Physical Review A, 44(10): 6219–6225
Caputo, J.G., Flytzanis, N. & Sørensen, M.P. 1995. The ring laser configuration studied by collective coordinates. Journal of the Optical Society of America B, 12(1): 139–145
Kivshar, Y.S. & Malomed, B.A. 1989. Dynamics of solitons in nearly integrable systems. Reviews of Modern Physics, 61: 763–915
Lamb, G.L. 1980. Elements of Soliton Theory, New York: Wiley
McLaughlin, D.W. & Scott, A.C. 1978. Perturbation analysis of fluxon dynamics. Physical Review A, 18: 1652–1680
Sánchez, A. & Bishop, A.R. 1998. Collective coordinates and length-scale competition in spatially inhomogeneous soliton-bearing equations. SIAM Review, 40(3): 579–615
Scott, A.C. 2003. Nonlinear Science: Emergence and Dynamics of Coherent Structures, 2nd edition, Oxford and New York: Oxford University Press
Willis, C.R. 1997. Spontaneous emission of a continuum sine-Gordon kink in the presence of a spatially periodic potential. Physical Review E, 55(5): 6097–6100
COLLISIONS

An interesting consequence of Hamiltonian structures is that there typically exist symmetries and invariances that allow one to generate mobile localized states from standing ones. For instance, the sine-Gordon (SG) equation (Dodd et al., 1982)

$$u_{tt} = u_{xx} - \sin(u), \tag{1}$$

where the subscripts denote partial derivatives, is invariant under the Lorentz transformation $x \to x' = \gamma(x - vt)$ and $t \to t' = \gamma(t - vx)$, where $\gamma = 1/\sqrt{1 - v^2}$. As a result, static kinks and antikinks, corresponding to the two signs of the solution

$$u(x,t) = \pm 4\tan^{-1}\exp(x - x_0), \tag{2}$$

can be boosted to any subluminal speed $|v| < 1$, as

$$u(x,t) = \pm 4\tan^{-1}\exp[\gamma(x - x_0 - vt)]. \tag{3}$$

Similarly, standing waves of the nonlinear Schrödinger (NLS) equation

$$iu_t = -u_{xx} - |u|^2 u \tag{4}$$

can be boosted by Galilean invariance into traveling ones with any speed $v$, of the form (Sulem & Sulem, 1999)

$$u(x,t) = (2\alpha)^{1/2}\, e^{i(v/2)x + i(\alpha - v^2/4)t}\, \mathrm{sech}[\alpha^{1/2}(x - x_0 - vt)]. \tag{5}$$
Note that localized solutions of dissipative partial differential equations do not typically share such features, because their traveling wave speeds are determined by the dynamics rather than initial conditions. Hereafter, we will focus on Hamiltonian models.
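The boosted solutions quoted above can be checked numerically. The following Python sketch is an illustrative addition, not part of the original entry. It evaluates finite-difference residuals of the SG equation (1) for a kink written in the standard form u = 4 tan⁻¹ exp[γ(x − x0 − vt)] with γ = 1/√(1 − v²), and of the NLS equation (4) for a soliton of the form (5) whose sech argument is scaled by α^{1/2}. The function names, the step size h, and the sample parameters are assumptions made for the example.

```python
import cmath
import math


def sg_kink(x, t, v=0.5, x0=0.0, sign=1):
    """Boosted SG kink (sign=+1) or antikink (sign=-1) with speed |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)  # Lorentz factor
    return sign * 4.0 * math.atan(math.exp(gamma * (x - x0 - v * t)))


def nls_soliton(x, t, alpha=1.0, v=0.5, x0=0.0):
    """Galilean-boosted NLS soliton with peak amplitude (2*alpha)**0.5."""
    envelope = math.sqrt(2.0 * alpha) / math.cosh(math.sqrt(alpha) * (x - x0 - v * t))
    phase = 0.5 * v * x + (alpha - 0.25 * v * v) * t
    return envelope * cmath.exp(1j * phase)


def sg_residual(u, x, t, h=1e-3):
    """Central-difference estimate of u_tt - u_xx + sin(u) at (x, t)."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return u_tt - u_xx + math.sin(u(x, t))


def nls_residual(u, x, t, h=1e-3):
    """Central-difference estimate of i*u_t + u_xx + |u|^2 u at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return 1j * u_t + u_xx + abs(u(x, t)) ** 2 * u(x, t)


if __name__ == "__main__":
    kink = lambda x, t: sg_kink(x, t, v=0.6)
    soliton = lambda x, t: nls_soliton(x, t, alpha=0.8, v=0.6)
    # Residuals should be small (limited by the finite-difference error).
    for x, t in [(-1.0, 0.0), (0.3, 0.7), (2.0, 1.5)]:
        print(x, t, abs(sg_residual(kink, x, t)), abs(nls_residual(soliton, x, t)))
```

Because the residuals are computed pointwise, the check is independent of any time-stepping scheme; it only confirms that the closed-form expressions satisfy the equations for the chosen speeds.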
Figure 1. Elastic collisions: (a) linear-shaped antikink and kink in the linear wave equation; (b) kink-kink repulsive collision in SG; (c) antikink-kink attractive collision in SG. Top panels show u(x, t), while bottom panels show the trajectories of soliton cores in the (x, t)-plane. (d) Inelastic collisions between two NLS solitons in the presence of weak perturbation.
Given the mobility of the localized coherent structures, it is natural to consider the outcome of their collisions. In fact, it was the “solitary” nature of such interactions (Zabusky & Kruskal, 1965) that inspired the term soliton in the case of integrable systems.
Linear versus Nonlinear Collisions

Consider the collision of two wavepackets that are governed by a linear wave equation. In this case, the superposition principle inherent in linearity guarantees that the two packets do not “feel” each other and survive the collision without change of shape, speed, or trajectory (see Figure 1(a)). On the other hand, nonlinear dynamics offers more interesting collisions. Here, the result of the interaction of two excitations does not resemble their sum and (at least) a phase shift is present (compare Figures 1(b) and 1(c) with 1(a)). In 1(b) and 1(c), the kink-kink and the antikink-kink collisions in the nonlinear SG equation are shown. The soliton cores do not merge in mutually repulsive collisions 1(b) while they do so in attractive collisions 1(c).
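The contrast between linear superposition and a genuinely nonlinear collision can be illustrated with a short Python sketch (an addition for illustration, not part of the original entry). It checks that the sum of two counter-propagating pulses solves the linear wave equation u_tt = u_xx, that the naive sum of a boosted SG kink and antikink does not solve the SG equation u_tt = u_xx − sin u, and that the well-known exact kink-antikink solution u = 4 tan⁻¹[sinh(γvt)/(v cosh(γx))] does. The Gaussian pulse shapes, the speed v = 0.5, and all function names are assumptions chosen for the example.

```python
import math

V = 0.5                                # collision speed (assumed), |V| < 1
GAMMA = 1.0 / math.sqrt(1.0 - V * V)   # Lorentz factor


def second_diffs(u, x, t, h=1e-3):
    """Return central-difference estimates (u_tt, u_xx) at (x, t)."""
    u0 = u(x, t)
    u_tt = (u(x, t + h) - 2.0 * u0 + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2.0 * u0 + u(x - h, t)) / h**2
    return u_tt, u_xx


def linear_residual(u, x, t):
    """Residual of the linear wave equation u_tt = u_xx."""
    u_tt, u_xx = second_diffs(u, x, t)
    return u_tt - u_xx


def sg_residual(u, x, t):
    """Residual of the sine-Gordon equation u_tt = u_xx - sin(u)."""
    u_tt, u_xx = second_diffs(u, x, t)
    return u_tt - u_xx + math.sin(u(x, t))


def two_pulses(x, t):
    """Sum of a right-moving and a left-moving Gaussian pulse (unit wave speed)."""
    return math.exp(-(x - t + 3.0) ** 2) + math.exp(-(x + t - 3.0) ** 2)


def naive_sum(x, t):
    """Plain sum of a right-moving kink and a left-moving antikink."""
    kink = 4.0 * math.atan(math.exp(GAMMA * (x - V * t)))
    antikink = -4.0 * math.atan(math.exp(GAMMA * (x + V * t)))
    return kink + antikink


def exact_kink_antikink(x, t):
    """Exact SG kink-antikink collision solution."""
    return 4.0 * math.atan(math.sinh(GAMMA * V * t) / (V * math.cosh(GAMMA * x)))


if __name__ == "__main__":
    x, t = 0.3, 0.5  # a point inside the interaction region
    print("linear, sum of pulses :", linear_residual(two_pulses, x, t))
    print("SG, naive sum         :", sg_residual(naive_sum, x, t))
    print("SG, exact two-soliton :", sg_residual(exact_kink_antikink, x, t))
```

Only the naive sum leaves a residual of order one, which is the quantitative content of the statement that the result of a nonlinear interaction does not resemble the sum of the two excitations.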
Elastic versus Inelastic Collisions

In fully integrable systems, solitons have the remarkable property (often used to define them) of colliding elastically (Ablowitz & Segur, 1981). In these special systems, the dynamics are severely restricted by the existence of an infinite set of conservation laws. Although realistic systems are typically non-integrable, the non-integrability in many applications is weak and can be treated by including small perturbative terms in integrable equations. Such perturbations are called Hamiltonian if the total energy of the perturbed system remains a dynamical invariant. While collisions in linear systems are much simpler than those in integrable nonlinear systems (such as the SG and NLS equations), the latter can, in turn, also be very different from the much more complex picture of inelastic collisions in near-integrable or more generally non-integrable systems.

In general non-integrable systems, the inelasticity of collisions may be manifested through emission of radiation, excitation of soliton internal modes, and energy exchange between solitons. Internal modes are the long-lived, spatially localized, oscillatory excitations (corresponding to the point spectrum of the linearization around the wave). These can typically be excited only for a particular sign of the perturbation (Campbell et al., 1983). Small-amplitude radiation waves correspond to an irreversible chunk of energy lost in the collision (radiated toward the boundaries). Such modes are extended, plane waves of the continuous spectrum. Notice that the energy of internal modes can be partly restored to solitons if they collide again, while this is not possible for radiation waves (in the first-order approximation).

Strong energy exchange between nonlinear waves in near-integrable systems can also occur in the absence of the above two mechanisms (and for different types and signs of the original perturbation). There are two necessary conditions for this recently manifested mechanism (Dmitriev et al., 2001, 2002). It can be observed only if the energy exchange is not forbidden by the conservation laws existing in the perturbed system and if the collision is of attractive type, as in Figure 1(c). For example, in the SG equation perturbed by the term $\varepsilon u_{xxxx}$, where energy and momentum are conserved for the one (free)-parameter kink-solitons, energy exchange is possible only when more than two solitons participate in the collision.
The energy exchange between only kinks or only antikinks is also not possible because SG solitons with the same parity repel each other. Similarly, the effect is not possible in the Korteweg–de Vries (KdV) equation, where soliton interactions are always mutually repulsive. In the NLS equation, in-phase solitons attract each other, while out-of-phase solitons repel. Each soliton has two parameters (amplitude and phase), and for many practically important perturbations, there are two conserved quantities: the Hamiltonian and the $L^2$ norm of the solution. Thus, energy exchange between two nearly in-phase NLS solitons is possible in the presence of a weak perturbation. The effect of radiationless energy exchange between solitons survives even for a very weak perturbation, as it decreases linearly with a decrease in perturbation amplitude, while other effects of the perturbation decay faster. If the perturbation is not small, the energy exchange effect mingles with radiation emission and possibly with the excitation of internal modes.
Probabilistic Nature of Soliton Collisions

In many examples of perturbed, non-integrable models related to applications in optics, fluid mechanics, or condensed-matter physics (Kivshar & Malomed, 1989), the result of soliton collisions can be predicted only in a probabilistic sense. The following sources of stochastic behavior can be identified. First, chaotic soliton scattering can arise from resonant interaction with the soliton internal modes (Campbell et al., 1986; Gorshkov et al., 1992). Second, in discrete systems, the result of inelastic collisions can be sensitive to the coordinate of the collision point, $x_c$, with respect to the lattice site (Dmitriev et al., 2003; Papacharalampous et al., 2003). Because the coordinate of the collision point usually cannot be controlled, it is natural to describe the result of the collision as a function of the random variable $x_c$. Finally, the result of the collision can be extremely sensitive to some other uncontrolled characteristics, such as the relative phase of the colliding solitons, which has been shown to dramatically affect the collisions between NLS solitons or between kinks and breathers in SG (Dmitriev et al., 2001, 2002, 2003; Papacharalampous et al., 2003). This last source of randomness is important when energy exchange between solitons is possible.

In Figure 1(d), an example of inelastic interaction between two NLS solitons in the presence of a quintic perturbation, $\varepsilon|u|^4 u$ in Equation (4), with a small $\varepsilon > 0$ is presented (Dmitriev & Shigenari, 2002). (The regions of the (x, t)-plane are shown, where the real part of the solution exceeds a certain value.) After each collision, the properties of the solitons such as the frequency and amplitude are different depending on the collision phase. With a certain probability, the solitons can attain (after a number of collisions) a velocity sufficient to overcome their weak mutual attraction. In the example presented, the solitons split after the fourth collision.
Emission of extended wave radiation is monitored and found to be very weak in this case; as a result, the solitons may continue to collide for a very long time, forming a two-soliton bound state. However, the probability $P$ to obtain a bound state with the lifetime $T$ decays algebraically as $P \sim T^{-\alpha}$ (Dmitriev & Shigenari, 2002), which is a manifestation of the chaotic character of their interaction.

In conclusion, linear waves do not “feel” each other, and nonlinear waves of integrable equations emerge unscathed from collisions, retaining their solitary character. However, solitary wave collisions in more realistic, non-integrable models remain a fascinating topic, where a number of basic mechanisms (such as internal mode resonances, extended wave radiation, and radiationless energy exchange) have been elucidated, but the full picture is far from complete. The probabilistic interpretation of collisions mentioned above may prove a fruitful viewpoint in future studies.

P.G. KEVREKIDIS AND S.V. DMITRIEV

See also Nonlinear Schrödinger equations; Partial differential equations, nonlinear; Sine-Gordon equation; Solitons; Solitons, types of

Further Reading
Ablowitz, M.J. & Segur, H. 1981. Solitons and the Inverse Scattering Transform, Philadelphia: SIAM
Campbell, D.K., Schonfeld, J.F. & Wingate, C.A. 1983. Resonance structure in kink–antikink interactions in φ⁴ field theory. Physica D, 9: 1–32
Campbell, D.K., Peyrard, M. & Sodano, P. 1986. Kink-antikink interactions in the double sine-Gordon equation. Physica D, 19: 165–205
Dmitriev, S.V., Kivshar, Yu.S. & Shigenari, T. 2001. Fractal structures and multiparticle effects in soliton scattering. Physical Review E, 64: 056613
Dmitriev, S.V., Semagin, D.A., Sukhorukov, A.A. & Shigenari, T. 2002. Chaotic character of two-soliton collisions in the weakly perturbed nonlinear Schrödinger equation. Physical Review E, 66: 046609
Dmitriev, S.V. & Shigenari, T. 2002. Short-lived two-soliton bound states in weakly perturbed nonlinear Schrödinger equation. Chaos, 12: 324
Dmitriev, S.V., Kevrekidis, P.G., Malomed, B.A. & Frantzeskakis, D.J. 2003. Two-soliton collisions in a near-integrable lattice system. Physical Review E, 68: 056603
Dodd, R.K., Eilbeck, J.C., Gibbon, J.D. & Morris, H.C. 1982. Solitons and Nonlinear Wave Equations, London: Academic Press
Gorshkov, K.A., Lomov, A.S. & Gorshkov, M.I. 1992. Chaotic scattering of two-dimensional solitons. Nonlinearity, 5: 1343–1353
Kivshar, Yu.S. & Malomed, B.A. 1989. Dynamics of solitons in nearly integrable systems. Reviews of Modern Physics, 61: 763–915
Papacharalampous, I.E., Kevrekidis, P.G., Malomed, B.A. & Frantzeskakis, D.J. 2003. Soliton collisions in the discrete nonlinear Schrödinger equation. Physical Review E, 68: 046604
Sulem, C. & Sulem, P.L. 1999. The Nonlinear Schrödinger Equation, New York: Springer
Zabusky, N.J. & Kruskal, M.D. 1965. Interactions of solitons in a collisionless plasma and the recurrence of initial states. Physical Review Letters, 15: 240–243
COLOR CENTERS

Gemstones are brightly colored because of the presence of color centers, atomic-scale imperfections that absorb light in otherwise transparent crystals. Historically, the term color centers has been associated with such imperfections in a special class of crystals called alkali halides, because it was in these relatively simple transparent hosts that the scientific study of color centers flourished. A German research program that began in the 1930s soon led to the recognition that alkali halides are excellent hosts for scientific studies of defects in nonmetallic solids, and the understanding of fundamental properties in these materials has been of great significance in the studies of similar phenomena in more complicated (and sometimes more practical) situations and materials.

Sodium chloride (common table salt) is the most familiar alkali halide. It consists of positively charged sodium and negatively charged chlorine ions, alternating positions in a simple cubic array. In the perfect crystal, each ion has six nearest neighbors of opposite charge. Simple defects may involve chemical impurities, such as a positively charged silver or thallium ion on an alkali site, or the removal of one or more ions from normal positions. The most fundamental of the latter class is the F center (from the German Farbzentren), a halogen ion vacancy that has trapped an electron. Defects involving more than one F center, or chemical impurities next to one or more F centers, may also occur.

Understanding the properties of color centers in detail requires some knowledge of quantum concepts. For example, the trapped electron of an F center can exist only in certain quantum states. In order for light to be absorbed, the energy of an incident photon must match the energy difference between the lowest quantum state and a higher one. This energy difference depends on the host crystal, and so F centers induce different colors in different alkali halides. The situation becomes more complicated when one considers that the electron is not trapped in a static host, but rather that the neighboring ions are vibrating. The electron’s interaction with these vibrations, whose time scale is long with respect to that of the (light) electron’s motion, means that the energy difference between ground and excited electronic states does not have a single value, but rather a distribution of values about some mean.
Thus, for example, whereas the mean photon absorption energy for F centers at low temperatures in NaCl is 2.77 eV, the mean width of the distribution of absorption energies is 0.26 eV.

This electron-lattice interaction has other consequences. In most cases, an F center at low temperature that has been excited by light will re-emit light. However, the mean energy of the emitted photon is found to be considerably smaller than that of the absorbed photon. This Stokes shift is exemplified in NaCl, where the mean photon emission energy at low temperatures is 0.98 eV. Why is there a Stokes shift? In simple terms, the energy-level structure of the F center is determined by the mean position of the neighboring ions. However, the mean position of the neighboring ions is in turn determined by the quantum state of the F center electron. After the electron is excited from the ground state to another quantum state, the neighboring ions relax in response to the change in the average force exerted on them by the electron. This relaxation then leads to a change in the energy-level separations, as well as other fundamental aspects of their properties. The emitted photon has an energy smaller than the absorbed photon had, and energy is conserved in the total cycle as the relaxation processes create quanta of lattice vibrations, or phonons, which remove the excess energy.

This cyclic process may be visualized by means of a configuration coordinate diagram. In the simplest case, this diagram consists of two equal parabolas, displaced vertically (in energy, E) and horizontally (in the displacement of neighboring ions, R), as shown in Figure 1. Each parabola represents the vibration of neighboring ions and leads to quantized vibrational states. According to the Franck–Condon principle, electronic transitions take place vertically on the diagram (the massive ions do not respond instantly to the excitation by the photon), so absorption corresponds to a transition from A to B, emission C to D. This picture is highly oversimplified, but it does represent a physical and visual way to understand phenomena associated with the optical properties of F centers (and by extension, many other types of defects in both insulators and semiconductors).

We now consider a perfect alkali halide crystal in which one electron has been removed by light (ionizing radiation) from the array of negative ions. At low temperatures, this is found to lead to the phenomenon of self-trapping. Whereas in semiconductors the removal of an electron from a valence band of occupied states leads to motion of the empty state, or hole, and resulting electrical conduction, in most alkali halides the missing charge becomes localized in space, or self-trapped. What happens in detail is that the halogen ion that has lost one electron (and become a neutral atom) can form a chemical bond with a nearby halogen ion.
In the process, both of these move toward each other, the lattice around them relaxes, and the missing charge is equally shared by this halogen molecule-ion. Self-trapping is not a universal process; it “costs” energy to localize a quantum particle, and the energy gained by chemical bonding and relaxation must overcome this. Self-trapping of holes occurs in most alkali halides, but not in semiconductors and not in many other insulators.

Figure 1. Schematic configuration coordinate diagram.

Also of interest is the creation of defects by the self-trapping of excitation energy in alkali halides, leading to a self-trapped exciton. To approach this, we consider the trapping of an electron by a self-trapped hole. Since the hole is effectively positive (it resulted from the removal of an electron), we might imagine that the trapped electron would find itself in loosely bound quantum states about the self-trapped hole. However, there is another possibility. If it does not cost too much energy for the halogen molecule-ion to move into one halogen site, rather than be shared by two sites, then the electron may be trapped in the other, empty halogen site. One then has an F center next to a halogen molecule-ion; hence, two defects have been formed. This situation is shown in Figure 2. Since the two defects are adjacent, this may be an unstable arrangement, and indeed there is a finite probability that the system will revert back to the perfect crystal, either before or after the emission of a photon. But, in many cases, it is found that this nearest-neighbor arrangement is the precursor for the creation of a stable defect pair: the halogen molecule-ion may migrate through the halogen sublattice, not by long-range atomic motion, but rather by short-range halogen motion accompanied by motion of the hole. This results in sequential sharing of the hole by the halogens as the hole migrates away from the F center.

Figure 2. Adjacent F center and self-trapped hole center in an alkali halide. “+” and “−” denote alkali and halogen ions, respectively.

W. BEALL FOWLER

See also Excitons; Franck–Condon factor; Quantum theory

Further Reading
Crawford, J.H. & Slifkin, L.M. (editors). 1972 (vol. 1), 1975 (vol. 2). Point Defects in Solids, New York: Plenum
Fowler, W.B. (editor). 1968. Physics of Color Centers, New York: Academic
Hayes, W. & Stoneham, A.M. 1985. Defects and Defect Processes in Nonmetallic Solids, New York: Wiley
Schulman, J.H. & Compton, W.D. 1962. Color Centers in Solids, New York: Macmillan
Song, K.S. & Williams, R.T. 1993, 1996. Self-Trapped Excitons, Berlin and New York: Springer

COMMENSURATE-INCOMMENSURATE TRANSITION
When some local property in a crystalline solid (atomic positions or orientation of local magnetic moments) develops a spatial modulation with a wavelength b that differs from the underlying lattice spacing a, one speaks of a “modulated phase.” A modulated phase is said to be “commensurate” when the ratio b/a is rational and “incommensurate” when b/a is irrational. Modulated phases are experimentally observed by the appearance of “satellite spots” in X-ray, neutron, or electron diffraction patterns (Janssen & Janner, 1987; Cummins, 1990). Observations of spatially modulated structures are abundant in condensed matter physical systems, such as the ferrimagnetic phases of the rare earths and their compounds, long-period structures of binary alloys, graphite intercalation compounds, or the polytypic phases of spinelloids, perovskites, and micas, among other minerals. The wavelength b characterizing the modulation varies with external parameters, such as temperature, pressure, or magnetic field. Sometimes, this variation is smooth, but often it remains constant at a rational locking value through some range of values of the external parameter before changing to another rational locking value, and so on. The ubiquity of modulated phases shows that the physical origin of these behaviors cannot be tied to particularities of specific types of systems, but must be of a general character. It is widely recognized that modulated phases appear whenever different terms in the free energy of the system compete, each one trying to impose its own characteristic length scale. The external parameters control the relative strength of the competing interactions and new compromises are reached as they vary. An insight into the physics of modulated phases was obtained through detailed analyses of simple model systems of competing interactions. 
Although these simple models are unlikely to fit experiments on specific materials, they help to understand the complexity of behaviors that emanate from length-scale competition and to discern essential features. One of the best-studied model systems with competing interactions is the axial next-nearest-neighbor Ising (ANNNI) model (motivated by the magnetic structures of erbium), which was introduced by R.J. Elliott in 1961 (Yeomans, 1988). This is an Ising model with a two-state spin, $S = \pm 1$, on each site of a cubic lattice. Interactions between spins on nearest-neighbor sites are ferromagnetic, but there are second (next-nearest) neighbor antiferromagnetic interactions along the axial direction, z. The Hamiltonian is

$$H = -\frac{1}{2} J_0 \sum_{i,j,j'} S_{i,j} S_{i,j'} - J_1 \sum_{i,j} S_{i,j} S_{i+1,j} - J_2 \sum_{i,j} S_{i,j} S_{i+2,j}, \tag{1}$$
where $i$ indicates the two-dimensional layers perpendicular to the axial direction, and $j$, $j'$ are nearest-neighbor sites within a layer. Both $J_0$ and $J_1$ are positive (ferromagnetic interactions), thus favoring the same value of neighboring spins, but $J_2 < 0$ (antiferromagnetic) so that it favors opposite values of second neighbor spins along the z-direction. In the absence of thermal effects ($T = 0$), the favored spin configurations along the z-direction are the ferromagnetic alignment (· · · + + + + + + · · ·) if $\kappa = -J_2/J_1 < 1/2$, and the antiphase (· · · + + − − + + · · ·) configuration if $\kappa > 1/2$. Exactly at $\kappa = 1/2$, any configuration containing a stripe of two or more spins “+” followed by a stripe of two or more spins “−”, etc., has the same energy; thus there is a multiphase point. A convenient notation is to use $\langle n_1, \ldots, n_m \rangle$ to represent a state in which a set of stripes of width $n_i$ of alternating spins repeat. For example, (· · · + + − − + + + − − + + − − − · · ·) is denoted by $\langle 223 \rangle$. Ferromagnetic and antiphase configurations are consistently denoted by $\langle \infty \rangle$ and $\langle 2 \rangle$, respectively.

When temperature increases from zero, entropic effects regulate the competition among the ferro and antiferro interactions, and a flower (called a “devil’s flower” by Per Bak) of petal-like phases of modulated structures opens up from the multiphase point. Figure 1 shows how new commensurate phases appear as temperature increases, demonstrating qualitatively what is commonly observed in experiments: locking to a few short-wavelength commensurate phases separated either by first-order phase transitions (as in the low-temperature regime of Figure 1) or by regions where the wave vector appears to vary smoothly (as at higher temperatures). The following question naturally arises: what determines the behavior of the modulation wave vector as parameters vary?
Figure 1. Mean-field phase diagram of the ANNNI model showing the main commensurate phases. (Axes: κ = −J2/J1, horizontal; kBT/J0, vertical. Labeled regions include the paramagnetic phase and the commensurate phases ⟨23⟩, ⟨34⟩, ⟨45⟩, ⟨223⟩, and ⟨2333⟩.)

The answer to this question is closely tied to a central paradigm of nonlinear science, the soliton concept in the context of its discrete counterpart called discommensuration. To clarify this issue, consider the simplest model of competing interactions (Griffiths, 1990), the Frenkel–Kontorova model, which can be visualized as an array of atoms at positions $(u_n)$, $-\infty < n < +\infty$, experiencing a periodic substrate potential $V(u) = V(u+1)$ and nearest-neighbor interaction $W(\Delta u)$. The Hamiltonian is

$$H = \sum_n [K V(u_n) + W(\Delta u_n)], \tag{2}$$
where the parameter $K$ controls the relative strength of the interactions. The standard Frenkel–Kontorova model corresponds to

$$V(u) = \frac{1}{(2\pi)^2}\,[1 - \cos(2\pi u)], \qquad W(\Delta u) = \frac{1}{2}(\Delta u)^2 - \sigma\,\Delta u. \tag{3}$$

Note that $V$ favors an integer value of the interspacing $\Delta u = u_{n+1} - u_n$, while $W$ favors a uniform value $\sigma$, so both interactions compete to determine the configuration $(u_n)$, characterized here by the average interspacing $\omega = \langle \Delta u \rangle$. If (as in the standard model) the interaction $W(\Delta u)$ is a convex function, a complete rigorous characterization of the model phase diagram was given by Aubry (1985). Thus, we restrict the analysis to convex $W(\Delta u)$.

In the thermodynamic limit of the system, care must be taken when defining what is meant by configurations of minimum energy or even by the energy of a configuration. Thus, the mean energy $\varepsilon$ per particle of a configuration $(u_n)$ is defined as

$$\varepsilon = \lim_{(N-M)\to\infty} \frac{1}{N-M} \sum_{j=M}^{N-1} [K V(u_j) + W(u_{j+1} - u_j)]. \tag{4}$$
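To make definition (4) concrete, the following Python sketch (an illustrative addition, not part of the original entry) evaluates the finite-chain average of K V(u_j) + W(u_{j+1} − u_j) for two trial configurations. It assumes the standard forms V(u) = [1 − cos(2πu)]/(2π)² and W(Δu) = (Δu)²/2 − σΔu; the chain length, the value σ = 0.9, and the function names are arbitrary choices. The trial configurations are not claimed to be true minimum energy configurations; they only indicate how increasing K tilts the balance from the spacing preferred by W toward the integer spacing preferred by V.

```python
import math


def substrate(u):
    """Periodic substrate potential V(u) = [1 - cos(2*pi*u)] / (2*pi)**2 (assumed form)."""
    return (1.0 - math.cos(2.0 * math.pi * u)) / (2.0 * math.pi) ** 2


def interaction(du, sigma):
    """Convex nearest-neighbor interaction W(du) = du**2 / 2 - sigma * du."""
    return 0.5 * du * du - sigma * du


def mean_energy(u, K, sigma):
    """Finite-chain estimate of Eq. (4): average over the len(u) - 1 bonds."""
    n_bonds = len(u) - 1
    total = sum(
        K * substrate(u[j]) + interaction(u[j + 1] - u[j], sigma)
        for j in range(n_bonds)
    )
    return total / n_bonds


if __name__ == "__main__":
    sigma = 0.9
    n = 1001
    locked = [float(j) for j in range(n)]    # integer spacing, omega = 1
    uniform = [sigma * j for j in range(n)]  # spacing preferred by W, omega = sigma
    for K in (0.1, 1.0):
        print(K, mean_energy(locked, K, sigma), mean_energy(uniform, K, sigma))
```

For small K the uniformly spaced chain has the lower mean energy, while for larger K the chain locked to the substrate period wins, which is the elementary mechanism behind locking to commensurate values.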
A minimum energy configuration (MEC) $(u_j)$ is that for which the arbitrary displacement of any finite segment $(u_j + \delta_j)$, $M$ […]

COMPLEX GINZBURG–LANDAU EQUATION

[…] 0 (its imaginary part can be trivially removed), $a$ and $b$ are real coefficients accounting for the diffusion and dispersion, while $d$ and $c$ are parameters controlling the nonlinear dissipation and frequency shift. Note that $a$ and $d$ must be positive, otherwise (3) is an ill-posed equation. In the case $|a| \ll |b|$, $|d| \ll |c|$, Equation (3) may be treated as a perturbed version of the nonlinear Schrödinger (NLS) equation. In the case $b = c = 0$, Equation (3) is called a real GL equation, to stress that all its coefficients are real, although $\psi$ remains complex. The real GL equation may be represented in the gradient form, $\psi_t = -\delta L\{\psi, \psi^*\}/\delta\psi^*$, where $\delta/\delta\psi^*$ stands for the functional derivative, and $L = \int [-|\psi|^2 + |\nabla\psi|^2 + (1/2)|\psi|^4]\,dV$ is a real Lyapunov functional. A consequence of the gradient representation is that $L$ may only decrease, $dL/dt \le 0$. This fact simplifies the dynamics of the real GL equation.

A fundamental feature of Equation (3) is that its zero solution, $\psi = 0$, becomes unstable as the gain $g$ changes its sign from negative to positive. In this case, a transition from the stable trivial solution to a nontrivial state is called supercritical. In particular, the supercritical transition in the one-dimensional (1-d) case yields a solitary-pulse (SP) solution that can be found in an exact analytical form,

$$\psi = A\,[\cosh(\kappa x)]^{-(1+i\mu)} \exp(-i\omega t), \tag{4}$$

where $A$, $\kappa$, $\mu$, and $\omega$ are uniquely determined by parameters of Equation (3). If the CCGL equation reduces to a perturbed NLS equation, the SP (4) can be obtained from the NLS soliton by means of the perturbation theory, provided that $bc < 0$ (otherwise, the NLS equation does not have bright-soliton solutions). However, the SP solution (4) is always unstable because the zero background around it is unstable.

In many cases (for instance, in the case of thermal convection in a binary fluid), a nontrivial state may be excited by a finite-amplitude perturbation, while the trivial solution is stable against small perturbations. The simplest model that describes the corresponding subcritical transition to a nontrivial state is the cubic-quintic complex GL (CQCGL) equation, first proposed by Petviashvili and Sergeev (1984) as

$$\psi_t = -g\psi + (a + ib)\nabla^2\psi + (d - ic)|\psi|^2\psi - (f + ih)|\psi|^4\psi. \tag{5}$$

Here, $g > 0$ implies stability of the zero solution, the last term with $f > 0$ guarantees overall stabilization of the system, the coefficient $h$ accounts for a quintic nonlinear correction to the wave frequency, while $d > 0$ provides for a possibility of nonlinear gain. The CQCGL equation may give rise to nontrivial states, coexisting with the stable zero solution, if the nonlinear gain coefficient exceeds a value $d_{\min} = 2\sqrt{gf}$. An important result, obtained by means of analytical and numerical methods, is that the 1-d and 2-d versions of the CQCGL equation support SP solutions that may be stable (in the 2-d case, the localized pulse may carry vorticity, having a spiral structure). If all the parameters $g$, $a$, $d$, $f$, and $h$ are small, the 1-d pulse can be constructed on the basis of the NLS soliton by means of the perturbation theory. However, the CQCGL equation does not make it possible to find stable SP solutions in an exact analytical form (one exact solution for a 1-d SP is known, but it is always unstable).

Patterns in nonlinear dissipative media may be supported not only by intrinsic gain; another possibility is to apply an external field, which is, for instance, the case for the pattern formation in laser cavities (Arecchi et al., 1999). In this case, the appropriate CCGL equation is

$$\psi_t = -g\psi + (a + ib)\nabla^2\psi - (d + ic)|\psi|^2\psi + P, \tag{6}$$

where the driving term, induced by the external field, may be of two different types: direct drive, $P = \varepsilon$, or parametric drive, $P = -i\omega_0\psi + \varepsilon\psi^*$, where the asterisk stands for complex conjugation, $\omega_0$ fixes the frequency, and $\varepsilon$ is the drive’s amplitude. Equation (6) with either type of drive can support stable SPs in 1-d and 2-d cases (in the case of the direct drive, SP settles on a nonzero background).

An important generalization is to consider systems of coupled GL equations. These may describe counterpropagating waves (for instance, in thermal convection in a binary-fluid layer), or second-harmonic generation in a lossy medium. In the latter case, the nonlinearity is not cubic, but quadratic, viz., $\psi_1^*\psi_2$ in the equation for the fundamental-frequency field $\psi_1$, and $\psi_1^2$ in the equation for the second-harmonic field $\psi_2$. An alternative to the CQCGL equation is a system originating in nonlinear optics, in which the stability of the zero solution is provided by an extra linear equation,

$$\psi_t = g\psi + (a + ib)\nabla^2\psi - (d + ic)|\psi|^2\psi - i\kappa\chi, \qquad \chi_t = -G\chi - i\kappa\psi, \tag{7}$$

where $\kappa$ is a real coupling constant, and $G > 0$ is the loss coefficient in the additional equation. The zero solution in this system is stable if the loss in the second equation and coupling are strong enough, $G > g$ and $\kappa^2 > Gg$. System (7) has exact SP solutions, in which both fields $\psi$ and $\chi$ take the form (4) (with different amplitudes), but, in contrast to the CCGL equation proper, in this case the pulse may be stable (Atai & Malomed, 1998).

Yet another type of a system occurs if, due to a specific nature of the underlying physical problem, the complex order parameter $\psi$ is coupled to an extra real order parameter $\phi$, which accounts for the existence of a conserved quantity in the medium (Matthews & Cox, 2000): $\psi_t = \psi + \nabla^2\psi - |\psi|^2\psi - \phi\psi$, $\phi_t = \nabla^2[\sigma\phi + \mu|\psi|^2]$. In this simplest version of the system, $\sigma > 0$ and $\mu$ are real constants.
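The stability condition quoted above for the zero solution of system (7) follows from linearizing about ψ = χ = 0. For perturbations proportional to exp(ikx + λt), the growth rates λ are the eigenvalues of a 2×2 matrix; at k = 0 they obey λ² + (G − g)λ + (κ² − Gg) = 0, so both roots have negative real parts exactly when G > g and κ² > Gg. The Python sketch below is an illustrative addition (the function name and the sample parameter values are assumptions) that evaluates the largest growth rate directly.

```python
import cmath


def growth_rate(g, G, kappa, a=1.0, b=0.0, k=0.0):
    """Largest Re(lambda) for the linearization of system (7) about psi = chi = 0.

    Perturbations are taken proportional to exp(i*k*x + lambda*t), so the
    Laplacian contributes a factor -k**2 in the psi equation.
    """
    m11 = g - (a + 1j * b) * k**2  # linear gain minus diffusion/dispersion
    m22 = -G                       # loss in the additional linear equation
    off = -1j * kappa              # coupling terms -i*kappa in both equations
    trace = m11 + m22
    det = m11 * m22 - off * off
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    lam_plus = (trace + disc) / 2.0
    lam_minus = (trace - disc) / 2.0
    return max(lam_plus.real, lam_minus.real)


if __name__ == "__main__":
    # G > g and kappa**2 > G*g: the zero solution should be stable.
    print("stable case        :", growth_rate(g=0.5, G=1.0, kappa=1.0))
    # kappa**2 < G*g: coupling too weak.
    print("weak coupling      :", growth_rate(g=0.5, G=1.0, kappa=0.5))
    # G < g: loss too weak.
    print("weak loss          :", growth_rate(g=1.5, G=1.0, kappa=2.0))
```

With a > 0 and b = 0, finite wave numbers only lower the effective gain g − ak², so the k = 0 mode is the most dangerous one in this sketch.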
A common feature of the various GL equations displayed above is their universality. Each is a generic representative of a class of models with given qualitative properties (for instance, super- or subcritical character of the excitation of nontrivial states, and the absence or presence of a conserved quantity). In more specific situations, there arise generic equations of other types. In particular, the complex Swift–Hohenberg (SH) equation describes a situation (for instance, in Rayleigh–Bénard convection) when the instability of the zero solution appears at a finite wave number $k_0$ of small perturbations; thus,

$$\psi_t = g\psi - (a + ib)\left(\nabla^2 + k_0^2\right)^2\psi - (d + ic)|\psi|^2\psi \qquad (8)$$

with g, a, d > 0. Quasi-1-d solutions of Equation (8) can be looked for as $\psi(x, y, t) = \Psi(x, y, t)\exp(ik_0x)$, where Ψ is a slowly varying function, whose x- and y-dependences are characterized by large scales $X, Y \gg k_0^{-1}$. An asymptotic consideration is then consistent in the case $X \sim Y^2$, reducing the SH equation (8) to an anisotropic complex Newell–Whitehead–Segel equation,

$$\Psi_t = g\Psi - (a + ib)\left(2ik_0\partial_x + \partial_y^2\right)^2\Psi - (d + ic)|\Psi|^2\Psi.$$

GL equations generate rich dynamics. The simplest exact solutions are plane waves (PWs). In the case of the CCGL equation (3) (here rescaled so that g = a = d ≡ 1), a family of PWs is

$$\psi = A\exp(iQx - i\omega t), \qquad A = \sqrt{1 - Q^2}, \qquad \omega = c + (b - c)Q^2, \qquad (9)$$

where Q is a wave number (a parameter of the solution family) taking values −1 < Q < 1. A necessary condition for the stability of the PWs against long-wavelength perturbations is 1 + bc > 0 (the Benjamin–Feir–Newell (BFN) condition). The consideration of finite-wavelength perturbations gives rise to more complex stability conditions. Therefore, following Aranson and Kramer (2002), the structure of a full stability region for the PWs can be shown by means of its cross sections in the space (b, c, Q); see Figure 1.

Figure 1. A set of cross sections of the stability region for the plane-wave solutions (9) of the cubic complex GL equation (3) in the space (b, c, |Q|). The two top panels, the left bottom panel, and the right bottom panel show, respectively, the cross sections b = const, b = −c, and c = const. The filled circles mark turning points on the border between absolute and convective instabilities. (Image not reproduced; its panels mark stable, convectively unstable, and absolutely unstable regions.)

The figure makes a distinction between convective
and absolute instabilities, in which the growing perturbation, respectively, travels away or stays put. The transition from the 1-d CCGL equation to its multidimensional counterpart does not introduce extra instabilities of the PWs.

If the BFN combination 1 + bc becomes negative, the PW develops phase turbulence, which means that |ψ| remains roughly constant, while the phase φ of the complex field ψ demonstrates spatiotemporal chaos. In the 1-d case, close to the BFN instability threshold, the chaotic evolution of the phase gradient $p \equiv \phi_x$ obeys the Kuramoto–Sivashinsky equation, which, in a rescaled form, is

$$p_t + p_{xx} + p_{xxxx} + p\,p_x = 0.$$

Deeper into the instability region, phase-slip points (PSPs) arise, at which the amplitude |ψ| vanishes. Multiple creation of PSPs leads to a transition from phase turbulence to defect turbulence, which is distinguished by random dynamics of the PSP ensemble. Mixed turbulence at the border between these two types also occurs (Aranson & Kramer, 2002).

In the case when PWs are stable, shock waves can be generated by a collision between two PWs (9) with different wave numbers $Q_1$ and $Q_2$. Although exact solutions for shocks are not available, they can be obtained in an approximate form, provided that $Q_1 - Q_2$ is small, or the coefficients b and c are small. In particular, a transient layer between the two PWs moves at the velocity $v = (b - c)(Q_1 + Q_2)$, which is exactly the mean of the group velocities of the colliding PWs. In the 2-d case, PWs can collide obliquely; in this case, shocks take the form of a domain wall. Generally, the shocks are stable. Besides the shocks (which are sources that emit PWs), the 1-d CCGL equation also gives rise to sinks,
that is, localized hole-type structures that absorb PWs. If the CCGL equation (3) is a perturbed version of the NLS equation, sinks can be constructed as perturbed counterparts of NLS dark solitons, provided that bc > 0 (particular solutions for the sinks are available in an exact analytical form). A standing sink is actually a PSP, as |ψ| vanishes at its center, and it may be dynamically stable. A moving sink is a finite dip in the profile of |ψ|; it is structurally unstable, as it is either decelerated (turning into a standing sink) or accelerated (eventually vanishing) by a quintic term added to the CCGL equation (Lega, 2001; Aranson & Kramer, 2002).

The 2-d CCGL equation displays spiral waves (SWs) in the form $\psi = A(r)\exp(iN\theta - i\omega t)$, where r and θ are the polar coordinates, N is an integer vorticity, and A(r) is a complex function. An asymptotic form of the SW is $A(r) \approx \sqrt{1 - Q^2}\,\exp(iQr)$ at r → ∞, where Q is related to ω as in Equation (9), and $A(r) \sim r^N$ at r → 0. (In the case of the real GL equation, the SW has Q = 0, which corresponds to a vortex solution. Similar solutions are generated by Equations (1) and (2), which represent Abrikosov vortices in superconductivity. For the prediction of these vortices, A.A. Abrikosov shared the Nobel Prize for Physics in 2003.) The asymptotic wave number Q is an eigenvalue of the 2-d CCGL equation, as it is uniquely selected by parameters of the equation. All the SWs with N > 1 are unstable. The SW with N = 1 may be subject to specific instabilities localized near its core (Aranson & Kramer, 2002).

An extension of the SW is a vortex line in three dimensions, which, in particular, may be closed into a ring. A vortex line with an additional wave number directed along its axis (twisted vortex) is also possible. The dynamics of 3-d vortex lines are quite complicated (Aranson & Kramer, 2002).
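As a quick consistency check on the plane-wave family (9), which also fixes the relation between Q and ω used for the spiral waves, the following sketch substitutes the PW into the rescaled 1-d CCGL equation. The form assumed here for Equation (3), ψ_t = ψ + (1 + ib)ψ_xx − (1 + ic)|ψ|²ψ, is inferred from the coefficients quoted in the entry, and the function names are illustrative.

```python
import cmath
import math


def plane_wave_residual(Q, b, c, x=0.3, t=0.7):
    """|psi_t - RHS| for the plane wave (9) in the assumed rescaled CCGL
    equation psi_t = psi + (1 + ib) psi_xx - (1 + ic) |psi|^2 psi."""
    A = math.sqrt(1.0 - Q * Q)
    omega = c + (b - c) * Q * Q
    psi = A * cmath.exp(1j * (Q * x - omega * t))
    psi_t = -1j * omega * psi      # exact time derivative of the plane wave
    psi_xx = -(Q * Q) * psi        # exact second x-derivative
    rhs = psi + (1 + 1j * b) * psi_xx - (1 + 1j * c) * abs(psi) ** 2 * psi
    return abs(psi_t - rhs)


def group_velocity(Q, b, c):
    """d(omega)/dQ for the dispersion relation in (9)."""
    return 2.0 * (b - c) * Q


def shock_velocity(Q1, Q2, b, c):
    """Velocity of the layer between two colliding plane waves."""
    return (b - c) * (Q1 + Q2)


if __name__ == "__main__":
    print(plane_wave_residual(0.4, 1.5, -0.7))
    print(shock_velocity(0.2, -0.5, 1.5, -0.7))
```

The residual vanishes to rounding error for any |Q| < 1, and the shock velocity equals the mean of the two group velocities, as stated in the text.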
BORIS MALOMED

See also Nonlinear Schrödinger equations; Partial differential equations, nonlinear; Pattern formation; Spatiotemporal chaos; Superconductivity

Further Reading
Aranson, I.S. & Kramer, L. 2002. The world of the complex Ginzburg–Landau equation. Reviews of Modern Physics, 74: 99–143
Arecchi, F.T., Boccaletti, S. & Ramazza, P. 1999. Pattern formation and competition in nonlinear optics. Physics Reports, 318: 1–83
Atai, J. & Malomed, B.A. 1998. Exact stable pulses in asymmetric linearly coupled Ginzburg–Landau equations. Physics Letters A, 246: 412–422
Cross, M.C. & Hohenberg, P.C. 1993. Pattern-formation outside of equilibrium. Reviews of Modern Physics, 65: 851–1112
Ginzburg, V.L. & Landau, L.D. 1950. On the theory of superconductivity. Zhurnal Eksperimentalnoy i Teoreticheskoy Fiziki (USSR), 20: 1064–1082 (in Russian) [English translation: in Men of Physics, vol. 1. 1965. Oxford: Pergamon Press, pp. 138–167]
Ipsen, M., Kramer, L. & Sorensen, P.G. 2000. Amplitude equations for description of chemical reaction-diffusion systems. Physics Reports, 337: 193–235
Lega, J. 2001. Traveling hole solutions of the complex Ginzburg–Landau equation: a review. Physica D, 152: 269–287
Matthews, P.C. & Cox, S.M. 2000. Pattern formation with a conservation law. Nonlinearity, 13: 1293–1320
Petviashvili, V.I. & Sergeev, A.M. 1984. Spiral solitons in active media with an excitation threshold. Doklady Akademii Nauk SSSR, 276: 1380–1384 (in Russian)
Tinkham, M. 1996. Introduction to Superconductivity. New York: McGraw-Hill
COMPLEXITY, MEASURES OF See Algorithmic complexity
CONDENSATES See Bose–Einstein condensation
CONLEY INDEX

In a five-page paper published in the Proceedings of the 1970 International Congress of Mathematicians (Conley, 1971), Charles Conley gave the first definition of his index. For context, he chose the phase space to be a compact connected metric space X and F to be the space of flows on X with the compact open topology. (A flow is a continuous function f : X × R → X such that f(x, 0) = x and f(x, t + s) = f(f(x, t), s) for all choices of x, t, s.) For a compact subset Y of X define the set

Inv(Y, f) = {y ∈ Y : f(y, t) ∈ Y for all t}.

This set is the maximal invariant set contained in Y. An isolating neighborhood for a flow f is a compact subset N of X such that Inv(N, f) is contained in the interior of N. The set Inv(N, f) is the isolated invariant set associated with N.

It is easy to show that isolating neighborhoods persist: if N is an isolating neighborhood for f, then there exists a neighborhood U of f in F such that N is an isolating neighborhood for every flow g in U. Suppose that M is an isolating neighborhood for g. If Inv(M, g) = Inv(N, g), then the isolated invariant set Inv(M, g) is said to be a local continuation of the set Inv(N, f). Two invariant sets Inv(N, f) and Inv(M, g) are related by continuation if there is a finite sequence of local continuations linking one to the other.

For a flow f and an arbitrary compact subset W of X, one defines forward and backward exit time functions from W into the extended real line [−∞, ∞] as follows:

$$t^+(x) = \sup\{t \ge 0 : f(x, [0, t]) \subset W\}, \qquad t^-(x) = \inf\{t \le 0 : f(x, [t, 0]) \subset W\}.$$

Certain subsets of W are associated with these functions. The forward asymptotic set is the set $A^+ = \{x : t^+(x) = \infty\}$, and the backward asymptotic set is the set $A^- = \{x : t^-(x) = -\infty\}$. Note that the maximal invariant set Inv(W, f) is the set $A^+ \cap A^-$. Forward and backward exit sets are the sets $W^\pm = \{x : t^\pm(x) = 0\}$.

An isolating block B for a flow f is a special type of isolating neighborhood with the following property. The boundary of B is the union of the exit sets $B^+$ and $B^-$. The intersection of these sets is the "tangency" set τ of boundary points that immediately exit in both time directions. Thus, a block has no internal tangencies where an orbit comes to the boundary from inside the block and does not exit.

If B is an isolating block for a flow f, one can show that the exit time functions are continuous. Using these functions and the flow, one may define deformation retractions of $B - A^+$ onto $B^+$ and of $B - A^-$ onto $B^-$. For example, a retraction r of $B - A^+$ onto $B^+$ is defined by the formula $r(x) = f(x, t^+(x))$. (This property was first used by Wazewski (1954).)

Suppose that f is a flow on $\mathbb{R}^3$ and also that T is a solid torus that is an isolating block for the flow. Suppose that the exit set is a disk D on the boundary of T. Then, the invariant set Inv(T, f) must not be empty. If it were empty, one could define a deformation retraction of T onto D, which is impossible.

Under these definitions, the following theorem is fundamental (Conley & Easton, 1971). Given an isolating neighborhood N for a flow f, there exists an isolating block B contained in N such that Inv(N, f) = Inv(B, f). We use this theorem to define the Conley index of the set Inv(N, f) to be the homotopy types of the pair of quotient spaces $[B/B^+]$ and $[B/B^-]$. These spaces are obtained by collapsing the exit sets to points and using the quotient topology on the resulting spaces.

Consider, for example, the flow f defined on $\mathbb{R}^2$ by $f((x, y), t) = (e^{-t}x, e^{t}y)$. This flow has the origin as a saddle point. Let B be a square centered at the origin. Then B is an isolating block for the flow.
The exit set consists of the top and bottom sides of B. The quotient space $[B/B^-]$ has the homotopy type of the pair of spaces consisting of a circle and a point on this circle.

The Conley index has the following properties (Conley, 1978; Herman et al., 1988; Smoller, 1983; Easton, 1998; Hofer & Zehnder, 1995; Mischaikow, 2002):
(i) The Conley index is well defined. Thus it is independent of the choice of block B.
(ii) If Inv(N, f) and Inv(M, g) are two isolated invariant sets that are related by continuation, then they have the same Conley index.
(iii) The index $[B/B^-]$ of a saddle point for a smooth flow on a manifold is a sphere together with a point on the sphere. The sphere has the same dimension as that of the unstable manifold of the saddle point. Thus, the Conley index is a generalization of the Morse index of a saddle point.
(iv) The index of two disjoint isolated invariant sets is the "sum" or "join" of their indices.
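The saddle example can be made concrete with a small sketch. It assumes the block is the square B = [−1, 1]² (the entry only says "a square centered at the origin"), and the function names are illustrative. For this flow the forward exit time has a closed form, and flowing a point for exactly that time lands it on the top or bottom side of B.

```python
import math


def flow(point, t):
    """The saddle flow f((x, y), t) = (exp(-t) x, exp(t) y)."""
    x, y = point
    return (math.exp(-t) * x, math.exp(t) * y)


def forward_exit_time(point):
    """Forward exit time from the assumed block B = [-1, 1]^2.
    The x-coordinate shrinks, so only |y| reaching 1 can cause an exit."""
    _, y = point
    if y == 0.0:
        return math.inf          # points on the x-axis stay in B forever
    return math.log(1.0 / abs(y))


def retract_to_exit_set(point):
    """r(p) = f(p, t_plus(p)), defined for points with finite exit time."""
    return flow(point, forward_exit_time(point))


if __name__ == "__main__":
    for p in [(0.8, 0.25), (-0.5, -0.1), (0.3, 1.0)]:
        print(p, "->", retract_to_exit_set(p))
```

Points already on the top or bottom side have exit time zero and are left fixed, which is what makes the map a retraction onto the exit set.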
Traveling Waves

One of the early applications of the Conley index was to find traveling waves for reaction-diffusion equations (Smoller, 1983). We will use as an example the FitzHugh–Nagumo (FN) equations

$$u_t = \varepsilon v, \qquad v_t = v_{xx} + f(v) - u,$$

which are a simplification of the Hodgkin–Huxley equations used to model nerve impulses. The parameter ε is assumed to be small, and one seeks a solution of the form U(x, t) = u(s), V(x, t) = v(s), where s = x + θt and θ is the wave velocity. Substituting the trial solutions into the FN equations, one obtains the following system of ordinary differential equations:

$$\frac{du}{ds} = \frac{\varepsilon}{\theta}v, \qquad \theta\frac{dv}{ds} = \frac{d^2v}{ds^2} + f(v) - u,$$

where f(v) is assumed to have the general shape of a cubic polynomial that is decreasing, increasing, and then decreasing. To be specific, we take f(v) = −v(v − 1)(v − 2). The corresponding first-order system of ordinary differential equations is

$$u' = \sigma v, \qquad v' = w, \qquad w' = \theta w + u - f(v),$$

where σ = ε/θ and the prime denotes the derivative with respect to the independent variable, written t from here on. Our goal is to find a periodic solution of this system for small values of the parameters σ and θ, and thereby to find a periodic solution of the FN equations.

One can completely understand the phase portrait of the system when the parameters σ and θ are set to zero. In this case, u(t) is constant and the equations for v, w form a Hamiltonian system with Hamiltonian

$$H(v, w) = \frac{w^2}{2} + F(v) - uv, \qquad F(v) = \int_0^v f(r)\,dr = -\frac{v^2(v - 2)^2}{4}.$$

The phase portrait in the u = 0 plane has saddle points at (v, w) = (0, 0) and (v, w) = (2, 0), and a center at (v, w) = (1, 0). The saddle points are connected by heteroclinic orbits implicitly defined by the equation H(v, w) = 0, 0 < v < 2. Note that the set of equilibrium points for the system is the set {(u, v, w) : u = f(v), w = 0}.

Next consider the system with σ = 0, θ > 0. For this system, we have $(d/dt)H(u(t), v(t), w(t)) = \theta w^2(t)$. Let a(u) < b(u) < c(u) denote the three solutions v of the cubic equation u − f(v) = 0.
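The Hamiltonian structure of the reduced system can be checked numerically. The sketch below takes f(v) = −v(v − 1)(v − 2) and the first-order system as reconstructed from the entry (u′ = σv, v′ = w, w′ = θw + u − f(v)); the function names and the finite-difference step are illustrative assumptions. It evaluates dH/dt along the flow with σ = 0 by the chain rule and compares it with θw².

```python
import math


def f(v):
    return -v * (v - 1.0) * (v - 2.0)


def F(v):
    """Antiderivative of f with F(0) = 0."""
    return -(v ** 2) * (v - 2.0) ** 2 / 4.0


def H(u, v, w):
    return 0.5 * w * w + F(v) - u * v


def rhs(state, sigma, theta):
    """Right-hand side of u' = sigma v, v' = w, w' = theta w + u - f(v)."""
    u, v, w = state
    return (sigma * v, w, theta * w + u - f(v))


def dH_dt(state, theta, h=1e-6):
    """Rate of change of H along the flow with sigma = 0 (u constant),
    using central differences for the partial derivatives of H."""
    u, v, w = state
    _, dv, dw = rhs(state, 0.0, theta)
    H_v = (H(u, v + h, w) - H(u, v - h, w)) / (2.0 * h)
    H_w = (H(u, v, w + h) - H(u, v, w - h)) / (2.0 * h)
    return H_v * dv + H_w * dw


if __name__ == "__main__":
    state, theta = (0.1, 0.7, -0.4), 0.3
    print(dH_dt(state, theta), theta * state[2] ** 2)
    # On the heteroclinic orbit in the plane u = 0, w = v (2 - v) / sqrt(2):
    v = 0.6
    print(H(0.0, v, v * (2.0 - v) / math.sqrt(2.0)))
```

With θ = 0 the same function returns zero, which is the statement that H is conserved in the Hamiltonian limit.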
One can show that there are values $0 < u_1 < u_2$ such that for j = 1, 2, there is a heteroclinic solution joining the equilibrium points $(u_j, a(u_j), 0)$ and $(u_j, c(u_j), 0)$ in the plane $u = u_j$. We now have a cycle consisting of the two heteroclinic orbits together with arcs of equilibrium points $\{(u, a(u), 0) : u_1 < u < u_2\}$ and $\{(u, c(u), 0) : u_1 < u < u_2\}$. This cycle is an invariant set for the system. However, the cycle is not isolated since it intersects the set of equilibrium points in two arcs. The two arcs of equilibrium points may be viewed as normally hyperbolic invariant manifolds, whose stable and unstable manifolds intersect transversally along the two heteroclinic orbits.

Finally, consider the system for small positive values of σ. In this case, u is increasing when v is positive and decreasing when v is negative. The hard part is to construct an isolating block, which topologically is a solid torus containing the cycle. The transversal intersection noted above is essential to this construction. The cycle is no longer invariant when σ is positive. However, one shows that the isolating block must contain a periodic solution of the full system of equations. The periodic solution thus constructed may be viewed as a periodic traveling wave solution of the FN equations.
Applications to Discrete Dynamical Systems

Consider the discrete dynamics generated by iterating a homeomorphism f of a compact metric space X. It is natural to study orbits with "errors" such as truncation or round-off errors in numerical algorithms. Thus, an ε-chain for f is a finite sequence $(y_0, y_1, y_2, \ldots)$ such that $d(f(y_n), y_{n+1}) \le \varepsilon$, where d(x, y) is the distance function on X. Conley (1978) and Bowen (1975) both understood the importance of studying orbits with errors. Bowen asked when such an orbit could be shadowed by a true orbit of the system. Conley defined the ε-chain recurrent set CR(f, ε) to be the set of points that are contained in periodic ε-chains of length at least 2. The chain recurrent set is the set CR(f) = ∩{CR(f, ε) : ε > 0}. Points in CR(f) are chain equivalent if for any positive ε there is a periodic ε-chain containing both points. He showed that every orbit uniformly approaches a unique chain equivalence class in CR(f). This result is known as the Conley decomposition theorem, and it generalizes Smale's decomposition of an Axiom A system into basic sets (Bowen, 1975).

An isolating block for f is a compact subset N of X such that whenever $x, f(x), f^2(x) \in N$, then f(x) is contained in the interior of N. This is something like having no internal tangencies. The exit set of N is the set $E = \{x \in N : f(x) \notin \mathrm{int}(N)\}$. Because N is a block, the exit set is compact. Define an equivalence relation on N making all points of E equivalent and all points not in E equivalent only to themselves. Let $N^\#$ denote the space of equivalence classes with the quotient topology obtained by projecting N onto $N^\#$ by sending a point to the equivalence class to which it belongs. Let $E^\#$ denote the image of E. The index space of N is the pair $(N^\#, E^\#)$. Define an index map $f^\# : N^\# \to N^\#$ by $f^\#(x) = E^\#$ if $x = E^\#$ or $f(x) \in E^\#$. Otherwise, define $f^\#(x) = f(x)$. Note that the index map is continuous.

The pair $(N^\#, f^\#)$ plays the role of the Conley index in this context. If the index map is not homotopic to a constant, then one can prove that the set Inv(N, f) is non-empty.
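The definition of an ε-chain given earlier in this section is easy to state in code. The sketch below uses a rigid rotation of the circle as the homeomorphism and models "errors" by rounding each iterate to a fixed number of digits; the rotation angle, the rounding rule, and all function names are illustrative assumptions rather than anything prescribed by the entry.

```python
ALPHA = 0.123456  # rotation angle (illustrative assumption)


def rotation(x):
    """A homeomorphism of the circle R/Z: rigid rotation by ALPHA."""
    return (x + ALPHA) % 1.0


def circle_dist(a, b):
    """Distance on the circle R/Z."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def is_eps_chain(seq, f, dist, eps):
    """True if dist(f(y_n), y_{n+1}) <= eps for every consecutive pair."""
    return all(dist(f(y), y_next) <= eps for y, y_next in zip(seq, seq[1:]))


def rounded_orbit(f, y0, steps, digits):
    """An orbit with errors: every iterate is rounded to `digits` decimals."""
    seq = [round(y0, digits)]
    for _ in range(steps):
        seq.append(round(f(seq[-1]), digits))
    return seq


if __name__ == "__main__":
    chain = rounded_orbit(rotation, 0.2, 50, digits=3)
    print(is_eps_chain(chain, rotation, circle_dist, 1e-3))
    print(is_eps_chain(chain, rotation, circle_dist, 1e-5))
```

Rounding to three decimals moves each point by at most about 5 × 10⁻⁴, so the rounded orbit is an ε-chain for ε = 10⁻³ but not for a much smaller ε.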
Smale's horseshoe map in the plane may be used as an example. Suppose that a rectangle B is mapped across itself so that the image crosses the rectangle in a horizontal strip, then curves back and crosses the rectangle again in another horizontal strip above the first. In this case, the exit set consists of three vertical strips, one in the center of the rectangle and the other two on the left and right sides containing the vertical edges of B. The index space $[B/B^-]$ has the homotopy type of a figure eight and the index map is non-trivial.

Sequences of compact sets (called "windows") may sometimes be constructed to contain an orbit with errors. If each window in the sequence is "correctly" mapped across the next one, then a true orbit runs through the sequence of windows and shadows the orbit with errors (Easton, 1998).

ROBERT W. EASTON

See also Anosov and Axiom-A systems; FitzHugh–Nagumo equation; Horseshoes and hyperbolicity in dynamical systems

Further Reading
Bowen, R. 1975. Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, New York: Springer
Conley, C.C. 1971. On the continuation of invariant sets of a flow. In Proceedings of the International Congress of Mathematicians 1970, Paris: Gauthier-Villars, pp. 909–913
Conley, C.C. & Easton, R.W. 1971. Isolated invariant sets and isolating blocks. Transactions of the American Mathematical Society, 158: 35–61
Conley, C.C. 1978. Isolated Invariant Sets and the Morse Index, Providence, RI: American Mathematical Society
Easton, R. 1998. Geometric Methods for Discrete Dynamical Systems, Oxford and New York: Oxford University Press
Herman, M., McGehee, R., Moser, J. & Zehnder, E. (editors). 1988. Charles Conley Memorial Volume, Special Issue of Ergodic Theory and Dynamical Systems, 8
Hofer, H. & Zehnder, E. 1995. Symplectic invariants and Hamiltonian dynamics. The Floer Memorial Volume, Progress in Mathematics, vol. 133, Basel: Birkhäuser
Mischaikow, K. 2002. Topological techniques for efficient rigorous computations in dynamics. Acta Numerica, 11: 435–477
Smoller, J. 1983. Shock Waves and Reaction-diffusion Equations, New York: Springer
Wazewski, T. 1954. Sur un principe topologique de l'examen de l'allure asymptotique des integrales des equations differentielles. Proceedings of the International Congress of Mathematicians, vol. 3, 132–139
CONSTANTS OF MOTION AND CONSERVATION LAWS

Although nonlinear spatiotemporal processes may be very complicated, they frequently obey simple constraints in the form of conservation laws. It is sometimes possible to construct one or several constants of motion (also called dynamical invariants, DIs), in the form of spatial integrals of local densities expressed in terms of the physical fields and their derivatives, which are conserved in time, as a consequence of the underlying dynamics. Such commonly known conserved quantities as energy, momentum, and angular momentum belong to this class.

Typically, the existence of conservation laws can be established if the underlying dynamics is dissipation-free; however, a specific DI may sometimes also exist in dissipative systems. Examples of the latter are provided by the diffusion equation, $u_t = u_{xx}$, and its important nonlinear counterparts in the form of the Burgers equation,

$$u_t = u_{xx} + u u_x, \qquad (1)$$

and the Cahn–Hilliard equation,

$$u_t + \left(u - u^3 + u_{xx}\right)_{xx} = 0 \qquad (2)$$

(the subscripts stand for the partial derivatives). They all conserve $\int_{-\infty}^{+\infty} u(x, t)\,dx$, which is simply the total mass of the substance in the case of diffusion. A more sophisticated example of the "dissipative conservation" occurs in physically important models based on the nonlinear Schrödinger (NLS) equation with special additional terms:

$$i u_t + u_{xx} + F'(|u|^2)u = \varepsilon Q, \qquad (3)$$

where the function F describes conservative nonlinearity of the medium (the prime stands for the derivative with respect to the argument of F; in particular, $F' = |u|^2$ corresponds to the most generic case of the cubic NLS equation), ε is a real parameter, and the "special perturbation" is, for instance, the nonlinear Landau-damping term in the NLS equation for Langmuir waves in plasmas,

$$Q = -u\int_{-\infty}^{+\infty} dx'\,(x - x')^{-1}\,|u(x')|^2, \qquad (4)$$

or the stimulated Raman scattering term in the equation for electromagnetic waves in nonlinear optical fibers, $Q = \left(|u|^2\right)_x u$. While these terms are dissipative, the corresponding perturbed NLS equation conserves a single DI, namely, the total wave action (alias "number of quanta"),

$$N = \int_{-\infty}^{+\infty} |u(x, t)|^2\,dx. \qquad (5)$$
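Conservation of the total mass in a dissipative model such as the Burgers equation (1) follows from writing the equation in flux form, u_t = (u_x + u²/2)_x. The sketch below is one way to see this numerically; it assumes a periodic grid instead of the infinite line, an explicit Euler step, and a Gaussian initial profile, and all names and parameter values are illustrative.

```python
import math


def burgers_step(u, dx, dt):
    """One explicit Euler step of u_t = (u_x + u^2/2)_x in flux form
    on a periodic grid."""
    n = len(u)
    # Flux through the interface between cells j and j+1.
    flux = [(u[(j + 1) % n] - u[j]) / dx
            + 0.25 * (u[j] ** 2 + u[(j + 1) % n] ** 2)
            for j in range(n)]
    # flux[j - 1] with j = 0 wraps around to the last interface.
    return [u[j] + dt * (flux[j] - flux[j - 1]) / dx for j in range(n)]


def total_mass(u, dx):
    """Discrete counterpart of the integral of u over the domain."""
    return sum(u) * dx


def evolve(u, dx, dt, steps):
    for _ in range(steps):
        u = burgers_step(u, dx, dt)
    return u


def gaussian_initial_state(n=64, length=2.0 * math.pi):
    dx = length / n
    u0 = [math.exp(-((j * dx - 0.5 * length) ** 2)) for j in range(n)]
    return u0, dx


if __name__ == "__main__":
    u0, dx = gaussian_initial_state()
    u1 = evolve(u0, dx, dt=0.001, steps=500)
    print(total_mass(u0, dx), total_mass(u1, dx))
```

Because each interface flux is added to one cell and subtracted from its neighbor, the sum over cells telescopes and the discrete mass stays constant up to rounding, even though the profile itself spreads and decays.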
In the general case, equations that govern the dissipation-free spatiotemporal dynamics can be derived from the underlying action functional $S\{u^{(n)}\}$: $\delta S/\delta u^{(n)} = 0$, where $u^{(n)}(\mathbf{r}, t)$ is the nth field variable, $\mathbf{r}$ is the set of the spatial coordinates, $\delta/\delta u^{(n)}$ stands for the variational (functional) derivative, and the action is expressed in terms of the Lagrangian density L, so that

$$S = \int L\left(u^{(n)}, \nabla u^{(n)}, u^{(n)}_t\right) d\mathbf{r}\,dt. \qquad (6)$$

For instance, the density

$$L = \frac{1}{2}\left[u_t^2 - (\nabla u)^2\right] - F(u) \qquad (7)$$

yields a nonlinear Klein–Gordon (NKG) equation for a single real field u,

$$u_{tt} - \nabla^2 u + F'(u) = 0. \qquad (8)$$

The fundamental nature of DIs in Lagrangian systems is established by a theorem that was published by Emmy Noether in 1918: any continuous symmetry of the system, that is, a family of transformations of the field variables, spatial coordinates, and time, which depend on an arbitrary continuous parameter ξ and leave the action invariant, generates a constant of motion. If the infinitesimal symmetry transformation is written in the form

$$u^{(n)} \to u^{(n)} + U_n\,d\xi, \qquad \mathbf{r} \to \mathbf{r} + \mathbf{R}\,d\xi, \qquad t \to t + T\,d\xi, \qquad (9)$$

then the main result following from the Noether theorem is the continuity equation $\mathcal{I}_t + \nabla\cdot\mathbf{J} = 0$, with the following density and current:

$$\mathcal{I} = \sum_n \frac{\partial L}{\partial u^{(n)}_t}\left(\mathbf{R}\cdot\nabla u^{(n)} + T u^{(n)}_t - U_n\right) - T L, \qquad (10)$$

$$\mathbf{J} = \sum_n \frac{\partial L}{\partial\left(\nabla u^{(n)}\right)}\left(\mathbf{R}\cdot\nabla u^{(n)} + T u^{(n)}_t - U_n\right) - \mathbf{R} L \qquad (11)$$

($\partial/\partial(\nabla u^{(n)})$ is realized as a vector with the components $\partial/\partial u^{(n)}_x$, $\partial/\partial u^{(n)}_y$, $\partial/\partial u^{(n)}_z$). Then, assuming, as usual, that the fields vanish at $|\mathbf{r}| \to \infty$, the continuity equation immediately yields the conservation law in the form of dI/dt = 0, with $I \equiv \int \mathcal{I}\,d\mathbf{r}$. A detailed derivation of this fundamental result can be found in the book by Bogoliubov & Shirkov (1973); for discussion of the Noether theorem in various contexts, see also Sulem & Sulem (1999) and Whitham (1974).

If the underlying equations are complex, the Lagrangian density and all the DIs are nevertheless real. The obvious invariance of the action against arbitrary temporal and spatial shifts, which are described by Equation (9) with $U_n = 0$ and, respectively, $\mathbf{R} = 0$, T = 1, or $\mathbf{R} = \mathbf{e}_j$, T = 0 ($\mathbf{e}_j$ is the unit vector corresponding to the jth spatial coordinate) gives rise
to the conservation of the energy (Hamiltonian) H and momentum P. For the important classes of the NKG and multidimensional NLS equations, Equations (8) and (3), Equation (10) yields

$$H_{\rm NKG} = \int\left[\frac{1}{2}\left(u_t^2 + (\nabla u)^2\right) + F(u)\right] d\mathbf{r}, \qquad \mathbf{P}_{\rm NKG} = -\int u_t\nabla u\,d\mathbf{r}, \qquad (12)$$

$$H_{\rm NLS} = \int\left[|\nabla u|^2 - F(|u|^2)\right] d\mathbf{r}, \qquad \mathbf{P}_{\rm NLS} = i\int\left(u\nabla u^* - u^*\nabla u\right) d\mathbf{r}, \qquad (13)$$

where ∗ stands for the complex conjugation (the transition from the Lagrangian to the Hamiltonian density as per Equation (10) in the case of the temporal-shift invariance is called the Legendre transformation). The invariance against rotations in the three-dimensional space leads to the conservation of the angular momentum,

$$\mathbf{M} = \int\left(\mathbf{r}\times\boldsymbol{\mathcal{P}}\right) d\mathbf{r}, \qquad (14)$$

where $\boldsymbol{\mathcal{P}}$ is the density in the expressions for the momentum in Equations (12) and (13); in the two-dimensional case, there is only one component of the conserved angular momentum. Additionally, in the NLS-type equations, the invariance against the phase shift (alias gauge invariance), $u \to u\exp(i\xi)$ with an arbitrary constant ξ, generates the conservation of the above-mentioned wave action (5), which is $\int|u|^2\,d\mathbf{r}$ in the multidimensional case.

Another important class of models in one dimension is based on equations of the Korteweg–de Vries (KdV) type for a real function u(x, t),

$$u_t + u_{xxx} + F''(u)u_x = 0 \qquad (15)$$

(the most important cases of the KdV equation proper and modified KdV equation correspond to $F = u^3$ and $F = u^4$, respectively). The Lagrangian representation of Equation (15) is possible in terms of the potential field v, defined so that $v_x \equiv u$, but the Hamiltonian and momentum are expressed solely in terms of the original field u,

$$H_{\rm KdV} = \int_{-\infty}^{+\infty}\left[\frac{1}{2}u_x^2 - F(u)\right] dx, \qquad P_{\rm KdV} = \int_{-\infty}^{+\infty} u^2\,dx. \qquad (16)$$

The invariance of the action, written in terms of the potential v, against the arbitrary shift v → v + ξ additionally generates the conservation of the "mass," $\int_{-\infty}^{+\infty} u\,dx$.
Besides being a dynamical invariant, the Hamiltonian gives rise to a canonical representation of the equation(s) in the Hamiltonian form, which is dual to the Lagrangian representation. In particular, for the complex and real equations of the NLS and KdV types, respectively, this representation is

$$u_t = -i\frac{\delta H}{\delta u^*}, \qquad u_t = \frac{\partial}{\partial x}\frac{\delta H}{\delta u}. \qquad (17)$$

The conservation of H itself and the conservation of the mass in the KdV-type equations are immediate consequences of the general form of Equations (17). The conservation of the wave action in the NLS-type equation is also a consequence of its representation in the form of Equations (17).

If a multicomponent Lagrangian system possesses an additional ("isotopic") symmetry against linear transformations of the components, this also gives rise to a specific DI. An important example is a system of coupled NLS equations of Manakov's type (Manakov, 1973),

$$i\begin{pmatrix} u_t \\ v_t \end{pmatrix} + \nabla^2\begin{pmatrix} u \\ v \end{pmatrix} + F'\left(|u|^2 + |v|^2\right)\begin{pmatrix} u \\ v \end{pmatrix} = 0, \qquad (18)$$

which are invariant against rotation in the plane of (u, v). In this case, Equation (10) gives rise to the DI ("isotopic spin") in the form

$$S = i\int\left(uv^* - u^*v\right) d\mathbf{r}. \qquad (19)$$

A very special situation arises for the DIs in the case of integrable equations, that is, those that are amenable to the application of the inverse scattering transform (IST) (Ablowitz & Segur, 1981; Newell, 1984; Zakharov et al., 1984). Integrable equations have an infinite set of hidden dynamical symmetries, which, unlike the above-mentioned elementary invariances against temporal and spatial shifts, spatial rotations, phase shift, etc., do not have a straightforward meaning. In compliance with the Noether theorem, each hidden symmetry generates the corresponding DI, which is an integral expression with a density that, unlike those corresponding to the elementary DIs (see Equations (12), (13), and (16)), involves higher-order derivatives. For instance, in the integrable KdV equation (15) with $F = u^3$, the first higher-order DI is

$$I = \int_{-\infty}^{+\infty}\left(u_{xx}^2 + 5u^2u_{xx} + 5u^4\right) dx. \qquad (20)$$

In fact, it was an empirical discovery of several higher-order DIs in the KdV equation that was a major incentive for the study that resulted in the discovery of the IST technique.
The IST provides a systematic method to derive the infinite set of the DIs in terms of the corresponding scattering data, into which the original wave field is mapped to make the hidden integrability explicit (Ablowitz & Segur, 1981; Newell, 1984; Zakharov et al., 1984). The use of the scattering data makes it possible to explicitly introduce a full system of the action-angle variables for the integrable equations, and demonstrate that the infinite set of the action variables is in one-to-one correspondence with the set of the DIs. It is also possible to prove that all the DIs are in involution among themselves; that is, the Poisson bracket between any two DIs, defined as per the corresponding symplectic (Hamiltonian) structure, is zero. Thus, integrable equations are direct counterparts, for the case of the infinite number of degrees of freedom, of finite-dimensional Hamiltonian systems that are Liouville-integrable; that is, with a set of DIs that are in involution, their number being equal to the number of the degrees of freedom.

The presence of the infinite set of the DIs in the integrable equations helps to understand such a well-known property as the completely elastic character of collisions between solitons (Ablowitz & Segur, 1981; Newell, 1984; Zakharov et al., 1984): roughly speaking, the necessity to satisfy the infinite set of the conservation laws leaves no room for changes of the solitons, except for phase shifts. On the other hand, some equations amenable to the application of the IST technique, such as, for instance, the standard three-wave system,

$$\left(u_{1,2}\right)_t + c_{1,2}\left(u_{1,2}\right)_x = i u^*_{2,1}u_3, \qquad \left(u_3\right)_t = i u_1 u_2, \qquad (21)$$

where $u_{1,2}$ and $u_3$ are, respectively, the "daughter" and pump waves, and $c_{1,2}$ are group velocities, feature nontrivial "soliton reactions," for instance, a spontaneous split of a pump-wave soliton into separating daughter ones. This possibility is explained by the fact that the above-mentioned one-to-one correspondence between the infinite sets of the degrees of freedom and DIs does not really hold for these equations: the set of the DIs is not "infinite enough" (Fokas & Zakharov, 1992). Such equations are sometimes called "solvable," to stress their difference from the genuinely integrable ones (integrable in the sense of Liouville, as generalized to systems with infinitely many degrees of freedom).

Integrable lattice (discrete) models feature another important property: due to the lack of continuous translational invariance, lattice systems generally lack momentum conservation. Nevertheless, integrable lattice models do possess a conserved momentum, due to their hidden symmetry. For example, the Ablowitz–Ladik equation,

$$2i\frac{du_n}{dt} + \left(u_{n+1} + u_{n-1} - 2u_n\right) + |u_n|^2\left(u_{n+1} + u_{n-1}\right) = 0, \qquad (22)$$

which is an integrable discretization of the cubic NLS equation, conserves the real momentum in the form of

$$P = i\sum_{n=-\infty}^{+\infty}\left(u_n u^*_{n+1} - u^*_n u_{n+1}\right). \qquad (23)$$
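The momentum conservation in the Ablowitz–Ladik lattice can be probed numerically. The sketch below makes several assumptions that are not in the entry: Equation (22) is read as 2i du_n/dt + (u_{n+1} + u_{n−1} − 2u_n) + |u_n|²(u_{n+1} + u_{n−1}) = 0, the momentum (23) is taken with a prefactor i so that it is real, the infinite lattice is replaced by a periodic ring, and a classical fourth-order Runge–Kutta step is used for time integration. All names and parameter values are illustrative.

```python
import cmath
import math


def al_rhs(u):
    """du_n/dt for the assumed form of Eq. (22) on a periodic ring."""
    n = len(u)
    out = []
    for m in range(n):
        up, um = u[(m + 1) % n], u[m - 1]
        out.append(0.5j * ((up + um - 2.0 * u[m]) + abs(u[m]) ** 2 * (up + um)))
    return out


def rk4_step(u, dt):
    """One classical Runge-Kutta step (not exactly conservative)."""
    k1 = al_rhs(u)
    k2 = al_rhs([a + 0.5 * dt * b for a, b in zip(u, k1)])
    k3 = al_rhs([a + 0.5 * dt * b for a, b in zip(u, k2)])
    k4 = al_rhs([a + dt * b for a, b in zip(u, k3)])
    return [a + dt * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
            for a, b1, b2, b3, b4 in zip(u, k1, k2, k3, k4)]


def momentum(u):
    """P = i * sum(u_n conj(u_{n+1}) - conj(u_n) u_{n+1}), a real number."""
    n = len(u)
    s = sum(u[m] * u[(m + 1) % n].conjugate() - u[m].conjugate() * u[(m + 1) % n]
            for m in range(n))
    return (1j * s).real


def sample_state(n=32):
    """A localized pulse with a phase gradient (illustrative initial data)."""
    return [0.8 / math.cosh(0.6 * (m - n // 2)) * cmath.exp(0.4j * m)
            for m in range(n)]


if __name__ == "__main__":
    u = sample_state()
    p_start = momentum(u)
    for _ in range(500):
        u = rk4_step(u, 0.01)
    print(p_start, momentum(u))
```

The small residual drift in P comes from the time integrator; the lattice sum itself telescopes to zero for the exact flow on a ring.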
In fact, conservation of the momentum is a specific integrability feature of discrete models, in contrast to continuum ones.

Elementary DIs find specific applications in systems perturbed by small nonconservative terms of order ε. In that case, the conservation laws no longer hold; however, using evolution (balance) equations for the former DI(s) is a convenient way to derive effective equations of motion for solitons (or other collective nonlinear excitations) in the weakly perturbed model. For instance, in the cubic NLS equation (3) with the above-mentioned terms (Kerr and stimulated Raman scattering ones), $F = |u|^2$ and $Q = (|u|^2)_x u$, an exact soliton solution with arbitrary amplitude η and velocity c, in the case of ε = 0, is

$$u_{\rm sol} = \eta\,\mathrm{sech}\left(\eta(x - ct)\right) \exp\left[i(c/2)x + i\left(\eta^2 - c^2\right)t\right]. \tag{24}$$
In the presence of small ε > 0, the wave action (4) remains a DI (see above), and the balance equation for the formerly conserved momentum $P_{\rm NLS}$, see Equation (13), is

$$\frac{dP}{dt} = 2\varepsilon \int_{-\infty}^{+\infty} \left[\left(|u|^2\right)_x\right]^2 dx. \tag{25}$$

Substitution of the unperturbed soliton (24) into Equation (25), and into the conservation of the wave action, yields evolution equations for the amplitude and velocity:

$$\frac{d\eta}{dt} = 0, \qquad \frac{dc}{dt} = \frac{16}{15}\,\varepsilon\eta^4. \tag{26}$$
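The coefficient 16/15 in (26) can be reproduced by evaluating the integral in (25) for the soliton profile. The following is a minimal sketch: it uses a plain trapezoid rule, and it assumes that the momentum of the soliton (24) evaluates to P = 2cη. That normalization belongs to Equation (13), which is not reproduced in this part of the text, so it is an assumption of the example and not a statement about the original derivation.

```python
import math


def raman_integral(eta, half_width=30.0, n=6000):
    """Trapezoid estimate of the integral of [(|u|^2)_x]^2 for |u| = eta*sech(eta*x)."""
    h = 2 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * h
        sech = 1.0 / math.cosh(eta * x)
        # d/dx of eta^2 sech^2(eta x)
        d_intensity = -2 * eta ** 3 * sech ** 2 * math.tanh(eta * x)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * d_intensity ** 2 * h
    return total


eta, eps = 1.5, 0.01
integral = raman_integral(eta)   # analytically (16/15) * eta**5
dP_dt = 2 * eps * integral       # right-hand side of Eq. (25)
dc_dt = dP_dt / (2 * eta)        # assumes P = 2*c*eta with eta constant
print(integral, 16 / 15 * eta ** 5)
print(dc_dt, 16 / 15 * eps * eta ** 4)
```

Under the stated normalization the two printed pairs agree, which is the content of (26).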
For further details and references, see the review by Kivshar & Malomed (1989).

BORIS MALOMED

See also Hamiltonian systems; Integrability; Integrable lattices; Inverse scattering method or transform; Korteweg–de Vries equation; N-wave interactions; Symmetry groups

Further Reading
Ablowitz, M. & Segur, H. 1981. Solitons and the Inverse Scattering Transform, Philadelphia: SIAM
Bogoliubov, N.N. & Shirkov, D.V. 1973. Introduction to the Theory of Quantized Fields, Moscow: Nauka (in Russian); English translation, 2nd edition: New York: Wiley, 1980
Fokas, A.S. & Zakharov, V.E. 1992. The dressing method and nonlocal Riemann-Hilbert problems. Journal of Nonlinear Science, 2: 109–134
Kivshar, Y.S. & Malomed, B.A. 1989. Dynamics of solitons in nearly integrable systems. Reviews of Modern Physics, 61: 763–915
Manakov, S.V. 1973. Theory of two-dimensional stationary self-focusing of electromagnetic waves. Zhurnal Eksperimentalnoy i Teoreticheskoy Fiziki, 65: 505–516 (in Russian); translated in Soviet Physics—Journal of Experimental and Theoretical Physics, 38: 248 (1974)
Newell, A.C. 1984. Solitons in Mathematics and Physics, Philadelphia: SIAM
Sulem, C. & Sulem, P.-L. 1999. The Nonlinear Schrödinger Equation, New York: Springer
Whitham, G.B. 1974. Linear and Nonlinear Waves, New York: Wiley
Zakharov, V.E., Manakov, S.V., Novikov, S.P. & Pitaevskii, L.P. 1984. Theory of Solitons, New York: Consultants Bureau
CONTINUITY EQUATION See Constants of motion and conservation laws
CONTINUOUS SPECTRUM See Inverse scattering method or transform
CONTINUUM APPROXIMATIONS

In general, a physical system belongs to one of three broad classes: (i) media with distributed parameters (electromagnetic fields, fluids, or liquids), (ii) discrete media (crystal lattices, polymers, or macromolecules), and (iii) artificial periodic systems (layered structures, lattices of nano-dots, or Josephson arrays). In the first class, the dynamics of a system is described by partial differential equations (PDEs) for the field variable $u(x, t)$, and in the other two cases by discrete differential equations (DDEs) for the field variable at the lattice sites, $u(n, t) = u_n(t)$. For simplicity, only one-dimensional (1-d) models are discussed here. Well-known examples of discrete nonlinear dynamical systems include the following:

• The discrete 1-d elastic chain with a nonlinear interaction between nearest neighbors (generalized Fermi–Pasta–Ulam (FPU) model), whose equation of motion reads

$$m\,\frac{d^2 u_n}{dt^2} = \varphi'(u_{n+1} - u_n) - \varphi'(u_n - u_{n-1}), \tag{1}$$
where $u_n$ is the displacement of the nth atom in the chain, a prime indicates the derivative with respect to the argument, and $\varphi(u_n - u_{n-1})$ is the energy of the interatomic interaction (Fermi et al., 1955). The choice $\varphi(\xi) = A\xi^2/2 + \alpha\xi^3/3 + \beta\xi^4/4$ gives the α-FPU (for β = 0) and β-FPU (for α = 0) models, and the choice $\varphi(\xi) = c\exp(-p\xi) + q\xi$ gives the Toda model.
• The discrete 1-d chain with linear interatomic interaction exposed to a nonlinear external field (discrete Frenkel–Kontorova (FK) model), with the following equation of motion:

$$m\,\frac{d^2 u_n}{dt^2} = A(u_{n+1} + u_{n-1} - 2u_n) - w'(u_n), \tag{2}$$
where $w(u_n)$ is the nonlinear on-site external potential. The particular choice $w = U(1 - \cos(2\pi u_n/a))$, where $a$ is the interatomic distance, corresponds to the traditional FK model (Frenkel & Kontorova, 1939).

• 1-d photonic crystals (periodic arrays of optical waveguides) or the discrete spin lattice, which may be described by the discrete nonlinear Schrödinger (DNLS) equation

$$i\,\frac{d\psi_n}{dt} + B(\psi_{n+1} + \psi_{n-1} - 2\psi_n) + F(|\psi_n|^2)\psi_n = 0, \tag{3}$$

where in the simplest case $F$ is a linear function of its argument. $\psi_n$ denotes the value of the effective field at the nth element of the discrete system, which can be assigned different physical meanings in different applications.

Even in the 1-d case, the solution of nonlinear DDEs poses a fairly complicated mathematical problem, and only a few of them can be solved exactly (the Toda and Ablowitz–Ladik equations). Thus, it is often easier to study discrete problems in the "continuum approximation" (CA) within the framework of PDEs. Clearly, some information about the nonlinear dynamics of discrete systems is lost, and some phenomena cannot be described in this continuum limit; but in the long-wave limit, this approach provides good qualitative and even quantitative agreement with the results obtained for the discrete system.

In the CA, the discrete site number $n$ is replaced with the continuous coordinate $x$: $na \to x$, with $a$ being the interatomic equilibrium distance or the period of a mesoscopic periodic structure, and $u_n(t) = u(na, t)$ is replaced with $u(x, t)$. The finite differences $u_{n\pm 1} - u_n$ have to be expanded in Taylor series:

$$u_{n\pm 1} - u_n = \pm a\,\frac{\partial u}{\partial x} + \frac{a^2}{2}\frac{\partial^2 u}{\partial x^2} \pm \frac{a^3}{6}\frac{\partial^3 u}{\partial x^3} + \frac{a^4}{24}\frac{\partial^4 u}{\partial x^4} + \ldots \tag{4}$$

This expansion is valid under the condition

$$\left|(u_{n+1} - u_n)/u_n\right| \ll 1. \tag{5}$$

For linear waves of the form $u_n = u_0 \sin(kna - \omega t)$, this expansion agrees with the long-wavelength approximation $ak \ll 1$, where $k$ is the wave number.
Substitution of expansion (4) into DDE (1) yields, in the leading approximation, the Boussinesq equation for the α-FPU and Toda models:

$$m\,\frac{\partial^2 u}{\partial t^2} - Aa^2\frac{\partial^2 u}{\partial x^2} - \frac{Aa^4}{12}\frac{\partial^4 u}{\partial x^4} - 2\alpha a^3\,\frac{\partial u}{\partial x}\frac{\partial^2 u}{\partial x^2} = 0. \tag{6}$$

(In the case of the β-FPU model, a modification of this equation with the nonlinear term $3\beta a^4 (\partial u/\partial x)^2 (\partial^2 u/\partial x^2)$ is obtained.) Equation (2) for the discrete FK model is transformed in the continuum limit, in the leading approximation, into the nonlinear Klein–Gordon (NKG) equation

$$m\,\frac{\partial^2 u}{\partial t^2} - Aa^2\frac{\partial^2 u}{\partial x^2} + w'(u) = 0. \tag{7}$$
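For the α-FPU potential, the step from (1) and (4) to (6) can be sketched as follows. This is a reconstruction of the intermediate algebra under the assumption $\varphi'(\xi) = A\xi + \alpha\xi^2$ (that is, β = 0); it is added here for orientation and is not part of the original derivation.

```latex
% Right-hand side of (1) with \varphi'(\xi) = A\xi + \alpha\xi^2:
\varphi'(u_{n+1}-u_n) - \varphi'(u_n-u_{n-1})
  = A\,(u_{n+1}+u_{n-1}-2u_n)
  + \alpha\,(u_{n+1}-u_{n-1})\,(u_{n+1}+u_{n-1}-2u_n).

% From expansion (4), adding and subtracting the two signs:
u_{n+1}+u_{n-1}-2u_n = a^2\,\frac{\partial^2 u}{\partial x^2}
  + \frac{a^4}{12}\,\frac{\partial^4 u}{\partial x^4} + O(a^6),
\qquad
u_{n+1}-u_{n-1} = 2a\,\frac{\partial u}{\partial x} + O(a^3).

% Keeping the leading linear, dispersive, and nonlinear terms:
m\,\frac{\partial^2 u}{\partial t^2}
  = A a^2\,\frac{\partial^2 u}{\partial x^2}
  + \frac{A a^4}{12}\,\frac{\partial^4 u}{\partial x^4}
  + 2\alpha a^3\,\frac{\partial u}{\partial x}\,\frac{\partial^2 u}{\partial x^2},
% which is Equation (6).
```

The same bookkeeping with the quartic term $\beta\xi^4/4$ gives the nonlinear term $3\beta a^4 (\partial u/\partial x)^2 (\partial^2 u/\partial x^2)$ quoted for the β-FPU model.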
In the particular case of the periodic external potential $w = U(1 - \cos u)$, one obtains the sine-Gordon (SG) equation. Finally, the CA for the DNLS equation (3) reduces to the usual nonlinear Schrödinger PDE

$$i\,\frac{\partial \psi}{\partial t} + Ba^2\frac{\partial^2 \psi}{\partial x^2} + F(|\psi|^2)\psi = 0. \tag{8}$$
Examples (6)–(8) demonstrate that a different number of terms in expansion (4) must be taken into account in different situations. In the case of Equation (1), the dispersion relation is

$$\omega = 2\sqrt{A/m}\,\sin(ak/2), \tag{9}$$

which is of acoustic type with weak dispersion in the limit $k \to 0$. Within the CA, this weak dispersion is governed by the fourth spatial derivative alone in expansion (4), and it is necessary to retain the dispersion term in (6). The dispersion relation for Equation (6) is consistent with the exact formula (9) only in the long-wave limit $ak \ll 1$. Unfortunately, the dispersion term $-(Aa^4/12)\,\partial^4 u/\partial x^4$ in (6) necessitates additional boundary conditions for this equation and results in the appearance of additional nonphysical solutions with small frequencies and $ak \sim 1$. To avoid these side effects, a regularization of the expansion over the discreteness parameter $a$ can be performed (Rosenau, 1987). This corresponds to substituting the relation $\partial^2/\partial x^2 \approx (m/Aa^2)\,\partial^2/\partial t^2$ into the dispersion term in (6) and leads to yet another version of the CA for Equation (1), containing a term with mixed derivatives:

$$m\,\frac{\partial^2 u}{\partial t^2} - Aa^2\frac{\partial^2 u}{\partial x^2} - \frac{ma^2}{12}\frac{\partial^4 u}{\partial x^2\,\partial t^2} - 2\alpha a^3\,\frac{\partial u}{\partial x}\frac{\partial^2 u}{\partial x^2} = 0. \tag{10}$$
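The difference between the two continuum versions shows up already in their linear dispersion relations. The sketch below compares the exact lattice result (9) with the relations obtained by inserting a plane wave $\exp[i(kx - \omega t)]$ into the linear parts of (6) and (10); these two formulas are derived here for illustration and are not quoted from the text, and the function names are hypothetical.

```python
import math


def omega2_chain(ak, A=1.0, m=1.0):
    """Exact lattice dispersion, the square of Eq. (9)."""
    return 4 * (A / m) * math.sin(ak / 2) ** 2


def omega2_boussinesq(ak, A=1.0, m=1.0):
    """Linear part of Eq. (6): m w^2 = A (ak)^2 - A (ak)^4 / 12."""
    return (A / m) * (ak ** 2 - ak ** 4 / 12)


def omega2_regularized(ak, A=1.0, m=1.0):
    """Linear part of Eq. (10): m w^2 (1 + (ak)^2 / 12) = A (ak)^2."""
    return (A / m) * ak ** 2 / (1 + ak ** 2 / 12)


for ak in (0.1, 1.0, 3.0, 4.0):
    print(ak, omega2_chain(ak), omega2_boussinesq(ak), omega2_regularized(ak))
```

For $ak \ll 1$ all three agree. At short wavelengths the squared frequency of (6) passes through zero at $ak = \sqrt{12}$ and becomes negative beyond it, whereas the regularized form (10) stays positive and bounded for all $k$; the value $ak = 4$ lies outside the first Brillouin zone of the lattice and is included only to expose this behaviour of the PDEs.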
The above estimate of the range of validity of the CA ($ak \ll 1$) holds for long-wave, small-amplitude envelope solitons in the nonlinear case as well. In general, however, this condition can differ for solitons of different types. For example, CA descriptions of Boussinesq solitons of the type $u(x, t) = u(x - vt)$ describe the solutions of the FPU model (1) only under the condition $v/v_c - 1 \ll 1$, where $v_c = \sqrt{Aa^2/m}$.

Only the lowest-order terms of expansion (4) have so far been taken into account for discrete systems within the CA. In general, retaining further terms with higher powers of spatial derivatives exceeds the accuracy of the CA. But in some way, such extended versions of the CA also take into account the discreteness of the system and can lead to interesting and important physical results. Retaining the fourth-order derivative in (4) transforms the NKG equation (7) into

$$m\,\frac{\partial^2 u}{\partial t^2} - Aa^2\frac{\partial^2 u}{\partial x^2} - \frac{Aa^4}{12}\frac{\partial^4 u}{\partial x^4} + w'(u) = 0. \tag{11}$$
In the case of a sinusoidal external force, the corresponding generalized equation (dispersive SG equation) has steady-state bound kink solutions (4π-kink solitons) for some particular values of their velocities (Bogdan et al., 2001). This result, obtained within the CA, is in agreement with the numerical result for the corresponding discrete system (2) (Alfimov et al., 1993). The inclusion of yet higher terms of expansion (4) in the nonlinear parts of discrete equations in the CA gives rise to nonlinear dispersion and leads to the existence of exotic solitons such as "compactons" and "peakons."

The CA is not restricted to the long-wave limit. For high-frequency short waves with wave numbers $k \approx \pi/a$ and $\omega - \omega_{\max} \ll \omega_{\max}$ (where $\omega_{\max} = 2\sqrt{A/m}$ for (1) and (2)), the CA for the slowly varying envelope of antiphase oscillations ($u_n = (-1)^n v_n$) in the β-FPU model results in a PDE with a Euclidean differential part:

$$m\,\frac{\partial^2 v}{\partial t^2} + Aa^2\frac{\partial^2 v}{\partial x^2} + 4Av + 16\beta v^3 = 0. \tag{12}$$
The breather solution of this equation (Kosevich & Kovalev, 1975) within the CA describes the “intrinsic modes,” which are currently being widely discussed since the pioneering paper of Sievers and Takeno (1988). In more complicated diatomic chains with the gap in the linear waves spectrum at k = π/2a, the so-called “gap solitons” (breathers with frequencies lying in the gap) can be described in CA for the envelopes of the antiphase oscillations of atoms from two sublattices. To this point, applications of the CA for DDE models of discrete systems were discussed. Often, the opposite approach is used where the corresponding PDEs are investigated numerically in some discrete
schemes as a system of DDEs (Dodd et al., 1982). The finite-difference method is one of the most popular in this case: the function $u(x, t)$ is defined on a rectangular grid in the $(x, t)$ plane, at points $x = hn$ with step $h$ in space and at correspondingly discrete instants in time. The partial derivatives are replaced by the finite differences

$$\frac{\partial u(x, t)}{\partial x} \approx \frac{u_{n+1} - u_n}{h}, \qquad \frac{\partial^2 u(x, t)}{\partial x^2} \approx \frac{u_{n+1} + u_{n-1} - 2u_n}{h^2}, \quad \ldots \tag{13}$$
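Applying the replacements (13) to the spatial derivatives only leaves a system of ordinary differential equations in time. The sketch below does this for the sine-Gordon equation in the dimensionless form $u_{tt} = u_{xx} - \sin u$ (an assumed normalization, chosen for the example), so that the resulting system has the form of the discrete FK chain (2) with $m = 1$, $A = 1/h^2$, and $w = 1 - \cos u$. The initial condition is the standard moving kink, time stepping uses a velocity-Verlet scheme, and the end points are held fixed; all of these are choices made for this illustration.

```python
import math


def simulate_kink(v=0.5, h=0.1, dt=0.05, t_end=8.0, half_length=20.0):
    """Space-sampled sine-Gordon equation; returns the kink centre at t_end."""
    n = int(round(2 * half_length / h)) + 1
    xs = [-half_length + i * h for i in range(n)]
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    # Moving kink u = 4 atan(exp(gamma (x - v t))) and its time derivative at t = 0.
    u = [4.0 * math.atan(math.exp(gamma * x)) for x in xs]
    ut = [-2.0 * gamma * v / math.cosh(gamma * x) for x in xs]

    def accel(field):
        a = [0.0] * n  # zero acceleration at the two end points
        for i in range(1, n - 1):
            # second difference from Eq. (13) minus the on-site force sin(u)
            a[i] = (field[i + 1] + field[i - 1] - 2.0 * field[i]) / h ** 2 \
                - math.sin(field[i])
        return a

    acc = accel(u)
    for _ in range(int(round(t_end / dt))):
        ut = [w + 0.5 * dt * a for w, a in zip(ut, acc)]
        u = [s + dt * w for s, w in zip(u, ut)]
        acc = accel(u)
        ut = [w + 0.5 * dt * a for w, a in zip(ut, acc)]

    # Kink centre: where u crosses pi (linear interpolation between grid points).
    for i in range(n - 1):
        if u[i] < math.pi <= u[i + 1]:
            return xs[i] + h * (math.pi - u[i]) / (u[i + 1] - u[i])
    return None


center = simulate_kink()
print(center)
```

With the default parameters the kink should be found close to $x = v\,t_{\rm end} = 4$, showing that for a grid step well below the kink width the sampled system follows the continuum solution.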
Generally, sampling over all variables is performed, but in some hybrid methods space sampling alone is carried out and the resulting system of ODEs is solved by using standard computer codes. Space sampling is commonly used for complicated biological systems. In order to simulate the behavior of a single neuron, for example, its continuous structure may be sliced into a large number of small segments (compartments). This procedure is called the "compartmental" approach, and within it the continuous PDEs are replaced by sets of ODEs. The advantage of this modeling approach is that it imposes no restrictions on the properties of each compartment and permits great flexibility at the level of resolution. Compartmental methods make it possible to develop realistic models that have a close relationship with the relevant experimental data.

ALEXANDER S. KOVALEV

See also Compartmental models; Delay-differential equations; Discrete nonlinear Schrödinger equations; Dispersion relations; Fermi–Pasta–Ulam oscillator chain; Frenkel–Kontorova model; Partial differential equations, nonlinear; Peierls barrier; Sine-Gordon equation

Further Reading
Alfimov, G., Eleonskii, V., Kulagin, N. & Mitskevich, N. 1993. Dynamics of topological solitons in models with nonlocal interaction. Chaos, 3: 405–414
Bogdan, M., Kosevich, A. & Maugin, G. 2001. Soliton complex dynamics in strongly dispersive medium. Wave Motion, 34: 1–26
Dodd, R., Eilbeck, J., Gibbon, J. & Morris, H. 1982. Solitons and Nonlinear Wave Equations, London: Academic Press
Eisenberg, H., Silberberg, Y., Morandotti, R., Boyd, A. & Aitchison, J. 1998. Discrete spatial optical solitons in waveguide arrays. Physical Review Letters, 81: 3383–3386
Fermi, E., Pasta, J. & Ulam, S. 1955. Studies of nonlinear problems. In Collected Works of E. Fermi, vol. II, Chicago: University of Chicago Press, 1965
Frenkel, J. & Kontorova, T. 1939. On the theory of plastic deformation and twinning. Physical Journal USSR, I: 137–149 (originally published in Russia, 1938)
Kosevich, A. & Kovalev, A. 1975. Self-localization of vibrations in 1D anharmonic chain. Soviet Physics, Journal of Experimental and Theoretical Physics, 40: 891–896
Rosenau, P. 1987. Dynamics of dense lattices. Physical Review, B36: 5868–5876
Sievers, A. & Takeno, S. 1988. Intrinsic localized modes in anharmonic crystals. Physical Review Letters, 61: 970–973
CONTOUR DYNAMICS

A wide variety of fluid dynamical problems involve the material advection of a tracer field $q(\mathbf{x}, t)$, expressed by

$$\frac{Dq}{Dt} \equiv \frac{\partial q}{\partial t} + \mathbf{u}\cdot\nabla q = 0, \tag{1}$$

where $\mathbf{u}(\mathbf{x}, t)$ is the fluid velocity. The value of $q$ thus does not change following an infinitesimal element or particle. That is, for a particle at $\mathbf{x} = \mathbf{X}(\mathbf{a}, t)$, where $\mathbf{a}$ is a vector label (e.g., the initial position of the particle), Equation (1) implies that $q = q(\mathbf{a})$, a constant, and $\partial\mathbf{X}/\partial t = \mathbf{u}(\mathbf{X}, t)$, which is just the statement that the particle moves with the local fluid velocity. The collective effect of this transport is a rearrangement of the tracer field $q$ by the velocity field $\mathbf{u}$. Depending on the nature of $\mathbf{u}$, this may lead to highly intricate distributions of $q$, even starting from simple initial conditions.

Moreover, there are important applications in which $\mathbf{u}$ depends on $q$ itself, often in a nonlocal manner, that is, in which the entire field of $q$ contributes to $\mathbf{u}$ at any given point $\mathbf{x}$. A specific example relevant to the present topic is provided by the two-dimensional Euler equations governing the behavior of an inviscid, incompressible fluid:

$$\frac{D\mathbf{u}}{Dt} = -\frac{\nabla p}{\rho}, \tag{2}$$

$$\nabla\cdot\mathbf{u} = 0, \tag{3}$$

where $p$ is the pressure and $\rho$ is the density (here constant), and where now the velocity field is two dimensional: $\mathbf{u} = (u, v)$. Taking the curl of Equation (2) gives an equation for the scalar vorticity $\zeta \equiv \partial v/\partial x - \partial u/\partial y$:

$$\frac{D\zeta}{Dt} = 0, \tag{4}$$
which is identical to Equation (1) if we take $q = \zeta$. Thus, the vorticity is materially conserved in this system. But it also induces the velocity field $\mathbf{u}$ that transports it. Equation (3) is satisfied generally by considering

$$u = -\partial\psi/\partial y \quad\text{and}\quad v = \partial\psi/\partial x, \tag{5}$$

where $\psi(\mathbf{x}, t)$ is called the streamfunction. Substituting these components into the definition of $\zeta$ leads to a Poisson equation for $\psi$:

$$\nabla^2\psi = \zeta. \tag{6}$$
Given the distribution of ζ , this equation (with suitable boundary conditions) may be inverted to find ψ, whose spatial derivatives provide u and v. The inversion of
this equation can be formally accomplished by using the Green function $G(\mathbf{x}; \mathbf{x}')$ of Laplace's operator $\nabla^2$; in two dimensions,

$$G(\mathbf{x}; \mathbf{x}') = (2\pi)^{-1}\log|\mathbf{x}' - \mathbf{x}|. \tag{7}$$

($G$ is the solution to Equation (6) for $\zeta = \delta(\mathbf{x}' - \mathbf{x})$, a singular delta distribution of vorticity having a unit spatial integral.) Consider henceforth an unbounded two-dimensional fluid. Then, the formal solution to the inversion problem is

$$\psi(\mathbf{x}, t) = \iint G(\mathbf{x}; \mathbf{x}')\,\zeta(\mathbf{x}', t)\,dx'\,dy', \tag{8}$$
which shows explicitly that the flow field at any point depends on the vorticity field at all points. Moreover, the integration over space implies that the field of $\psi$ is generally smoother than that of $\zeta$. The evolution of the flow in this case consists of two basic steps:

• inversion: the recovery of the velocity field $\mathbf{u}$ from the distribution of $\zeta$, and
• advection: the transport of fluid particles to the next instant of time.

Now, a two-dimensional plane is carpeted by an infinite number of such particles, and therefore this view of the evolution may appear to be quite complex. However, the material conservation of $q$ (or $\zeta$) affords an enormous simplification. First, note that if one exchanges two particles labeled $\mathbf{a}$ and $\mathbf{a}'$ having the same value of $q$, this does not alter the distribution of $q$, and as a result the velocity field $\mathbf{u}$ remains unchanged. This is a "particle relabeling" symmetry, and in general it gives rise to an infinite number of globally conserved quantities (the spatial integrals of any functional of $q$). This symmetry implies that contours of fixed $q$ consist of the same set of fluid particles for all time. They are called material contours.

Contour dynamics arises from representing the distribution of $q$ by a finite set of contours $C_k$, $k = 1, 2, \ldots, n$, between which $q$ is spatially uniform, and across which $q$ jumps by $\Delta q_k$ (defined to be the value of $q$ to the left of $C_k$ minus that to the right of $C_k$). The contours here are still material ones: the particles just on either side of a contour retain their distinct values of $q$. Between two contours, any two fluid particles may be exchanged without altering $q$ or $\mathbf{u}$. This implies that only the contours matter for determining the velocity field. Also, since the contours are material, their advection suffices to evolve the entire distribution of $q$. This is the basis of contour dynamics.
To see how this works for the two-dimensional Euler equations, it remains to be shown how one can calculate the velocity field directly from the contours $C_k$. The starting point is Equation (8), in which we consider $\zeta = q$ to be a piecewise-uniform field. For the moment, we need only use the property $G(\mathbf{x}; \mathbf{x}') = g(\mathbf{x}' - \mathbf{x})$ satisfied by the Green function of the Laplace operator. Also, we need only consider one contour at a time and afterwards linearly superpose the results, since the relation between $q$ and $\mathbf{u}$ is linear. We may take $q = 0$ outside of the (closed) contour, so that $q = \Delta q$ inside it (denoted by the region $R$ below). Nonzero exterior $q$ simply gives rise to solid-body rotation, and may be superposed afterwards. Then, from Equation (5), we have

$$(u(\mathbf{x}, t), v(\mathbf{x}, t)) = \left(-\frac{\partial}{\partial y}, \frac{\partial}{\partial x}\right) \Delta q \iint_R g(\mathbf{x}' - \mathbf{x})\,dx'\,dy' \tag{9}$$

$$= -\Delta q \iint_R \left(-\frac{\partial}{\partial y'}, \frac{\partial}{\partial x'}\right) g(\mathbf{x}' - \mathbf{x})\,dx'\,dy' \tag{10}$$

$$= -\Delta q \oint_C g(\mathbf{X} - \mathbf{x})\,(dX, dY), \tag{11}$$
where we have used the symmetry of $g$ with respect to $\mathbf{x}$ and $\mathbf{x}'$ in the second line and Stokes' theorem in the third, and where $\mathbf{X}$ denotes a point on the contour $C$. The velocity field anywhere thus depends only on the shape of $C$. For a set of contours, the velocity field is required only on the contours themselves to evolve $q$. The contours thus form a closed dynamical system, contour dynamics, governed by

$$\frac{d\mathbf{X}_j}{dt} = \mathbf{u}(\mathbf{X}_j) = -\sum_{k=1}^{n} \Delta q_k \oint_{C_k} g(\mathbf{X}_k - \mathbf{X}_j)\,d\mathbf{X}_k \tag{12}$$

for all points $\mathbf{X}_j$ on contours $C_j$, $j = 1, 2, \ldots, n$.

These equations were first derived for the two-dimensional Euler equations by Zabusky et al. (1979), following earlier work by Berk & Roberts (1967), who derived a similar contour-based model for the two-dimensional Vlasov equations in plasma physics. These authors also developed numerical methods for contour dynamics in which the contours were discretized into a finite set of points or nodes, originally connected by straight line segments. A wide variety of numerical methods have since been developed, many of which are summarized in the review articles of Dritschel (1989) and Pullin (1992). They principally differ in terms of the choice of the interpolation between nodes (linear, quadratic, cubic; local and global splines); the method of numerical quadrature used to evaluate the contour integral over the segment connecting adjacent nodes (trapezoidal, Gaussian, explicit); the method of redistributing, inserting, and removing nodes to maintain an accurate representation of the contour shape; and the procedure used, if any, to remove fine-scale structure (e.g., filaments and
thin bridges connecting two separating regions), a procedure coined "contour surgery" (Dritschel, 1988). Contour dynamics has been used to study a wide variety of problems, from the interaction of two vortex patches (having just one contour each), to the filamentation and stripping of nested vortices (having many contours to represent a continuum) (Dritschel, 1989; Pullin, 1992).

The numerical method illustrated next is described in Dritschel (1988, 1989). This method uses local cubic splines between contour nodes, explicit quadrature to first order in the departure of the contour from a straight line between nodes, node redistribution based on maintaining a local node density proportional to the square root of contour curvature, and automatic surgery whenever contours or contour parts get closer than a prescribed cutoff scale δ. This scale and the precise formula for the node density are chosen to balance the errors arising from surgery and node redistribution. A fourth-order Runge–Kutta scheme is used for the time integration.

An example is presented next of the collapse of three vortex patches (see also Rogberg & Dritschel, 2000). The centers of the vortices are initially chosen at points where equivalent delta-distributed point vortices of the same circulation (spatial integral of $q$) are known to collide in finite time. Two of the vortices have $q = +2\pi$ and radii $1$ and $2/\sqrt{5}$, while the third has $q = -2\pi$ and radius $2/3$. The two positive vortices are initially separated by a distance $d = 5$, and the negative vortex is placed at a distance $d\sqrt{17/27}$ and at an angle of 225° relative to the joint center of the two positive vortices. The vortices are then all shifted so that the joint center of all three vortices lies at the origin. Starting from this configuration, the collapse time for the point vortices is 7.70059886...
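Before turning to the results, the basic building block of any such scheme, the contour integral (11) evaluated on a discretized contour, can be illustrated for a single circular patch, for which the answer is known in closed form: solid-body rotation with angular velocity $\Delta q/2$ inside the patch and the field of a point vortex outside. The sketch below is a simplified stand-in and not the method of Dritschel (1988, 1989): it places equally spaced nodes on the exact circle and uses the midpoint rule in the polar angle, with no splines, node redistribution, or surgery. The function name and parameters are hypothetical.

```python
import math


def patch_velocity(x, y, radius=1.0, dq=1.0, nodes=400):
    """Velocity at (x, y) induced by a circular patch, from Eq. (11).

    The contour is traversed counterclockwise, so dq is the vorticity
    inside the patch minus that outside. The point must not lie on the contour.
    """
    u = v = 0.0
    dtheta = 2 * math.pi / nodes
    for k in range(nodes):
        th = (k + 0.5) * dtheta
        X, Y = radius * math.cos(th), radius * math.sin(th)
        dX, dY = -radius * math.sin(th) * dtheta, radius * math.cos(th) * dtheta
        g = math.log(math.hypot(X - x, Y - y)) / (2 * math.pi)  # Green function (7)
        u -= dq * g * dX
        v -= dq * g * dY
    return u, v


u_in, v_in = patch_velocity(0.3, 0.2)    # inside: expect (-dq*y/2, dq*x/2)
u_out, v_out = patch_velocity(2.0, 0.0)  # outside: expect dq*R^2/(2r) in the azimuthal direction
print(u_in, v_in, u_out, v_out)
```

In a full contour dynamics code the same integral is evaluated at every node of every contour, as in Equation (12), and the nodes are then advected with the resulting velocities.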
Figure 1 illustrates the evolution of the vortices— in the upper left-hand frame, the initial conditions are shown, while the remaining frames (to the right and then downwards) are spaced at unit increments in time starting from t = 5 and ending at t = 15. By t = 6, the two positive vortices begin to merge (they are separated by only a thin channel of irrotational fluid). Thereafter, the flow grows rapidly in complexity, as many filaments are generated and small vortices roll up at the tips of some of the filaments. Notably, the negative vortex does not distort significantly, but merely acts to bring the two positive vortices together. The complexity just illustrated is typical of many vortex interactions. An accurate, robust numerical method must be able to capture this generic behavior, at least over time scales when the flow is reasonably predictable. To see how well the current method performs, we next examine how the results vary with spatial resolution. Two additional simulations were performed at half and double the average point spacing used in Figure 1. The results are compared in Figure 2 at the final time, t = 15, when the numbers of nodes in the
Figure 1. The collapse of three vortex patches. The initial condition is shown in the upper left frame. Time proceeds to the right and downwards in increments of one unit, from t = 5 to t = 15. The window of view is − 5.0 < x < 5.0 and − 5.8 < y < 4.2. The negative vortex is rendered with a short-dashed line (with a dash between each node), while the positive vortices are rendered with a continuous solid line.
Figure 2. Comparison, at t = 15, of three contour dynamics simulations of vortex collapse. Resolution increases from left to right, doubling between each frame (the node spacing parameter is µ = 0.12, 0.06, and 0.03, and the large-scale length L = 1 in all cases; consult Dritschel (1989) or Dritschel & Ambaum (1997) for further details). The domain of view is the same as used in the previous figure.
three simulations are, from low to high resolution, 2740, 10,738, and 27,297 (at t = 0, the numbers of nodes are 183, 349, and 682). Note that the cutoff scale is δ = 0.0036, 0.0009, and 0.000225, respectively, in the three simulations, so there is a factor of 4 difference in δ between successive resolutions. The agreement is striking even in the detailed structure. The most visible differences show up in the lengths of the filaments, which are removed more readily at low resolution. These filaments, however, contribute negligibly to the velocity field, and retaining them makes little difference to the evolution of the flow.

Contour dynamics has since been applied in a variety of diverse fields. Its largest growth has occurred in the field of atmospheric and oceanic dynamics, where the potential vorticity plays the role of the materially
conserved tracer $q$, often to a very good approximation (Hoskins et al., 1985). Indeed, its application to this field is on a much sounder footing than it is to the fields it was originally developed for: plasma physics and aeronautics. The two-dimensional approximation is a particularly severe one in aeronautics, since real flows do not preserve two-dimensional symmetry unless constrained in some manner. In the atmosphere and oceans, rotation and stratification serve to constrain the flow to be two dimensional, or more appropriately, layerwise two dimensional, and furthermore one may extend contour dynamics to study such flows (indeed, the equations are formally no different from those given by (12); see Dritschel, 2002).

Finally, the use of contours to carry out tracer advection (the fast and accurate part of contour dynamics) has been combined with more traditional approaches to computing the velocity field (the inversion step) to produce a particularly fast, accurate, and versatile numerical method called the contour-advective semi-Lagrangian (CASL) algorithm (Dritschel & Ambaum, 1997; Dritschel et al., 1999; Dritschel & Viúdez, 2003). This latest development allows the extension of the contour approach to much more realistic sets of equations, and has significantly widened the applicability of the original contour dynamics method.

DAVID DRITSCHEL

See also Chaotic advection; Euler–Lagrange equations; Vortex dynamics of fluids
Further Reading
Berk, H.L. & Roberts, K.V. 1967. The water-bag model. Methods in Computational Physics, 9: 87–134
Dritschel, D.G. 1988. Contour surgery: a topological reconnection scheme for extended integrations using contour dynamics. Journal of Computational Physics, 77: 240–266
Dritschel, D.G. 1989. Contour dynamics and contour surgery: numerical algorithms for extended, high-resolution modelling of vortex dynamics in two-dimensional, inviscid, incompressible flows. Computer Physics Reports, 10: 77–146
Dritschel, D.G. 2002. Vortex merger in rotating stratified flows. Journal of Fluid Mechanics, 455: 83–101
Dritschel, D.G. & Ambaum, M.H.P. 1997. A contour-advective semi-Lagrangian numerical algorithm for simulating fine-scale conservative dynamical fields. Quarterly Journal of the Royal Meteorological Society, 123: 1097–1130
Dritschel, D.G., Polvani, L.M. & Mohebalhojeh, A.R. 1999. The contour-advective semi-Lagrangian algorithm for the shallow water equations. Monthly Weather Review, 127(7): 1551–1565
Dritschel, D.G. & Viúdez, A. 2003. A balanced approach to modelling rotating stably-stratified geophysical flows. Journal of Fluid Mechanics, 488: 123–150. See also: wwwvortex.mcs.st-and.ac.uk.
Hoskins, B.J., McIntyre, M.E. & Robertson, A.W. 1985. On the use and significance of isentropic potential-vorticity maps. Quarterly Journal of the Royal Meteorological Society, 111: 877–946
Pullin, D.I. 1992. Contour dynamics methods. Annual Review of Fluid Mechanics, 24: 89–115
Rogberg, P. & Dritschel, D.G. 2000. Mixing and transport in two-dimensional vortex interactions. Physics of Fluids, 12(12): 3285–3288
Zabusky, N.J., Hughes, M.H. & Roberts, K.V. 1979. Contour dynamics for the Euler equations in two dimensions. Journal of Computational Physics, 30: 96–106
CONTROL PARAMETERS See Bifurcations
CONTROLLING CHAOS

It may seem paradoxical that chaotic systems, which are extremely sensitive to the tiniest fluctuations, can be controlled; yet the earliest reference to this idea appears around 1950, when John von Neumann presaged just that. Nowadays, laboratory demonstrations of the control of chaos have been realized in chemical, fluid, and biological systems, and the intrinsic instability of chaotic celestial orbits is routinely used to advantage by international space agencies, who divert spacecraft to travel vast distances using only modest fuel expenditures.

A variety of techniques for chaos control has been implemented since around 1990, when the first concrete analyses appeared, including traditional feedback and open-loop methods, neural network applications, shooting methods, Lyapunov function approaches, and synchronization to both simple and complex external signals. These techniques resolve the paradox implied by chaos control in different ways, but they all make use of the fact that chaotic systems can be productively controlled if disturbances are countered by small and intelligently applied impulses. Just as an acrobat balances about an unstable position on a tightrope by the application of small correcting movements, a chaotic system can be stabilized about any of an infinite number of unstable states by continuous application of small corrections.

Two characteristics of chaos make the application of control techniques even more fruitful. First, chaotic systems alternately visit small neighborhoods of an infinite number of periodic orbits. The presence of an infinite number of periodic orbits embedded within a chaotic trajectory implies the existence of an enormous variety of different behaviors within a single system. Thus, the control of chaos opens up the potential for tremendous flexibility in operating performance within a single system. As an example, Figure 1 depicts the Lorenz attractor, used to model fluid convection.
Embedded within the gray attractor are innumerable periodic orbits, such as the solid figure-8 orbit and the more complicated dashed one. For practical systems such as chemical reactors or fluidized beds, the presence of multiple
Figure 1. Left: Lorenz attractor with two of its embedded unstable periodic orbits highlighted. Right: unstable points P1 and P2 in the surface of section indicated. The unstable direction is denoted by outgoing arrows, and stable direction is denoted by ingoing arrows.
co-existing states implies that one chaotic system could be operated in multiple different states, thus potentially performing the function of several separate units. A second characteristic of chaos that is important for control applications is the exponential sensitivity of the phenomenon. That is, the fact that the state of a chaotic system can be drastically altered by the application of small perturbations means two things: such a system if uncontrolled can be expected to fluctuate wildly, and if controlled can be directed from one state to a very different one using only very small controls. Traditional feedback control remains among the most widely used methods of control for chaotic systems. To implement feedback control, one waits until a chaotic trajectory by chance lands near a desired periodic point and then applies small variations to an accessible system parameter in order to repeatedly nudge the trajectory closer to that point. As an example, consider the plot to the right in Figure 1, where we depict two periodic points as they appear on a “surface of section” formed in this case by recording every intersection between the Lorenz chaotic attractor and the half plane, Z = 0, X > 0. To control the state to remain near point P2 (so the trajectory stays near the figure-8 trajectory shown to the left), one needs to apply variations in a parameter, p, that directs the state toward P2 along the unstable direction (or directions in more complicated problems) indicated by outgoing arrows in Figure 1. One can establish the direction in which a parametric control moves the chaotic state either experimentally, by varying the parameter and recording the future variation of the system state, or analytically, by determining the Jacobian of the flow or mapping where available. 
Nudging the state closer to P2 amounts to what is termed “pole placement” in the traditional control literature, and numerous alternative strategies for selecting parameter variations have been reported. Strategies include simple pole placement, optimal control, neural network approaches, simple proportional control, periodic forcing, and control dependent on the distance from P2. Most of these strategies have proven successful under appropriate conditions, and the choice of strategy depends principally on details of the control goal and the computational resources available to meet that goal. All of these strategies require that the system state lie close to the desired state in order to achieve control. In such a case, the system dynamics can be linearized, making control calculations rapid and effective. Fortunately, in chaotic systems, one can rely on ergodicity to ensure that the system state will eventually wander arbitrarily close to the desired state. By the same token, if it is desired to switch the system between one accessible state (say P1) and a second (say P2), one can merely release control from P1 and apply a new control algorithm once the system strays close to P2, which it is certain to do by ergodicity.
In higher-dimensional or slowly varying systems, the time taken for the state to move on its own from one state to another can be prohibitive, and for this reason fully nonlinear control strategies have been devised that use chaotic sensitivity to steer the system state from any given initial point to a desired state. Since chaotic systems amplify control impulses exponentially, the time needed to steer such a system can be quite short. These strategies have been demonstrated both in systems in which a large effect is desired using very modest parameter expenditures (energy or fuel) and in systems in which rapid switching between states is needed (computational or communications applications). On the other hand, in both linear and nonlinear control approaches, one needs to re-apply control repeatedly over a time that is short compared with the inverse of the largest growth rate of the system in order to counter the potential amplification of ubiquitous noise. Computational and experimental analyses have demonstrated that this is readily done in typical chaotic systems and that control can be robustly achieved.
Because large but rare noise events can occur, however, controlled states occasionally break free when the system encounters an anomalously large noise event. For this case, bounds have been established for the frequency and duration of these noise-induced excursions.
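The wait-then-nudge feedback strategy described above can be sketched on a much simpler system than the Lorenz flow. The example below is only an illustration under stated assumptions: the logistic map stands in for the surface-of-section map, its unstable fixed point plays the role of P2, and the parameter value, the control bound, and every function name are hypothetical choices rather than anything taken from the entry. The controller linearizes the map about the fixed point and changes the parameter only when the required change is small.

```python
# Sketch: small-parameter feedback control of a chaotic map.
# Assumptions: the logistic map replaces the surface-of-section map,
# and R0, DELTA_MAX, and all names are illustrative.

R0 = 3.9                    # nominal parameter (chaotic regime of the logistic map)
X_STAR = 1.0 - 1.0 / R0     # unstable fixed point of x -> R0 * x * (1 - x)
DELTA_MAX = 0.05            # largest parameter change the controller may apply


def logistic(x, r):
    return r * x * (1.0 - x)


def control_step(x, delta_max=DELTA_MAX):
    """Advance one iterate, nudging r only when x is close enough to X_STAR."""
    slope = R0 * (1.0 - 2.0 * X_STAR)        # df/dx at the fixed point
    gain = X_STAR * (1.0 - X_STAR)           # df/dr at the fixed point
    delta_r = -slope * (x - X_STAR) / gain   # cancels the linearized deviation
    if abs(delta_r) > delta_max:
        delta_r = 0.0                        # too far away: wait and do nothing
    return logistic(x, R0 + delta_r), delta_r


def run(x0, steps, delta_max=DELTA_MAX):
    """Return the visited states and the parameter changes that were applied."""
    xs, deltas = [x0], []
    x = x0
    for _ in range(steps):
        x, d = control_step(x, delta_max)
        xs.append(x)
        deltas.append(d)
    return xs, deltas


if __name__ == "__main__":
    xs, deltas = run(0.3, 2000)
    print("final distance from fixed point:", abs(xs[-1] - X_STAR))
    print("largest control used:", max(abs(d) for d in deltas))
```

Setting `delta_max=0.0` switches the controller off, which gives a convenient uncontrolled reference trajectory for comparison.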
Numerous biological control applications have been proposed since the notion of chaotic control was first introduced. Among the first applications were studies of intrinsic nonlinear control mechanisms involved in autonomic and involuntary functions such as the regulation of internal rhythms and the control of gait and balance. These studies confirm that nontrivial control algorithms are involved in the maintenance of normal physiological function and that provocative insights can be gained into pathological conditions (such as cardiac and breathing arrhythmias and motor tremor). Further work has shown that networks of chaotic devices, under prescribed conditions, can be brought into synchronization, and strong indications have been presented that neuronal signaling may rely on nonlinear synchronization. Additional experimental studies are promising for the control of unwanted fluctuations (e.g., during fibrillation of the heart) or for the so-called “anticontrol” of synchronized periodic signals in focal epilepsy. In both studies, the goal is to use feedback control methods to steer a diseased organ using small electrical stimulation: in the former case, toward a stabilized state, and in the latter, away from a synchronized state.
TROY SHINBROT
See also Chaotic dynamics; Feedback; Lorenz equations
Further Reading
Alekseev, V.V. & Loskutov, A.Y. 1987. Control of a system with a strange attractor through periodic parametric action. Soviet Physics Doklady, 32: 1346–1348
Ditto, W.L. & Pecora, L.M. 1993. Mastering chaos. Scientific American, 78–84
Garfinkel, A., Spano, M.L., Ditto, W.L. & Weiss, J.N. 1992. Controlling cardiac chaos. Science, 257: 1230–1235
Glass, L. & Zeng, W. 1994. Bifurcations in flat-topped maps and the control of cardiac chaos. International Journal of Bifurcation & Chaos, 4: 1061–1067
Hayes, S., Grebogi, C. & Ott, E. 1993. Communicating with chaos. Physical Review Letters, 70: 3031–3034
Hübler, A., Georgii, R., Kuckler, M., Stelzl, W. & Lüscher, E. 1988. Resonant stimulation of nonlinear damped oscillators by Poincaré maps. Helvetica Physica Acta, 61: 897–900
Lima, R. & Pettini, M. 1990. Suppression of chaos by resonant parametric perturbations. Physical Review A, 41: 726–733
Ott, E., Grebogi, C. & Yorke, J.A. 1990. Controlling chaos. Physical Review Letters, 64: 1196–1199
Pecora, L.M. & Carroll, T.L. 1990. Synchronization in chaotic systems. Physical Review Letters, 64: 821–824
Schiff, S.J., Jerger, K., Duong, D.H., Chang, T., Spano, M.L. & Ditto, W.L. 1994. Controlling chaos in the brain. Nature, 370: 615–620
Shinbrot, T., Ott, E., Grebogi, C. & Yorke, J.A. 1993. Using small perturbations to control chaos. Nature, 363: 411–417
COSMOLOGICAL MODELS
Relativistic cosmology, the science of the structure and evolution of the universe, is based on the building and investigation of cosmological models (CMs), which describe the geometrical properties of physical space-time, the matter composition and structure of the universe, and physical processes at different stages of the universe’s evolution. Prominent in cosmology is the hot big bang CM, which is based on solutions of Alexander Friedmann’s cosmological equations for homogeneous isotropic models deduced in the framework of Einstein’s general relativity theory (GR). Because of its large-scale structure (galaxies, clusters of galaxies, etc.), the universe is homogeneous and isotropic only on the largest scales, from about 100 Mpc upward. (The pc or parsec is an astronomical unit of distance equal to 3.2616 light years; thus, an Mpc is 3.2616 million light years.) The most important feature of Friedmann’s CM is its nonstationary character, which was confirmed by Edwin Hubble’s discovery of cosmological expansion in 1929. In this formulation, the geometrical properties of physical space depend on the value of the energy density ρ relative to a critical density
$$\rho_{\rm crit} = \frac{3H_0^2}{8\pi G},$$
where H0 is the expansion rate (Hubble parameter) at the present epoch and G is Newton’s gravitational constant. If Ω = ρ/ρcrit = 1, the 3-space is flat; if Ω > 1, the 3-space possesses positive curvature; and if Ω < 1, the 3-space possesses negative curvature. The corresponding CMs are flat, closed, and open, respectively. All Friedmann CMs have a beginning in time (or cosmological singularity), where the energy density and curvature invariants diverge. Their evolution depends on the properties of matter. In the case of ordinary matter with a positive energy density and a nonnegative pressure, flat and open models expand forever, and closed models recollapse after an expansion stage. The assumption that the temperature was very high at the initial stage of cosmological expansion (hot CM) was confirmed by the discovery of the cosmic microwave background (CMB) radiation in 1965, with a present epoch temperature of about T = 2.7 K. The theory of nucleosynthesis of light elements (hydrogen, helium, deuterium, lithium, etc.) in the first few minutes of cosmological expansion, based on the framework of the hot big bang CM, is in accord with empirical data. Advances in both theory and technology during the last 20 years have launched cosmology into a most exciting period of discovery. By using precise instruments (telescopes, satellites, spectroscopes), several cosmological research programs are being carried out, including investigations of the anisotropy
of the CMB and supernova observations. Cosmological observations have not only strengthened and expanded the hot big bang CM but have also revealed surprises. Recent measurements of the anisotropy of the CMB have provided convincing evidence that the spatial geometry is very close to flat, with Ω = 1.0 ± 0.03. The currently known components of the universe include ordinary baryonic matter, cold dark matter (CDM), massive neutrinos, the CMB and other forms of radiation, and dark energy. The sum of the empirically derived values of these densities is equal to the critical density (to within their margins of error). The largest contributions to the energy density are from two components: CDM and dark energy. About 30% of the total mass-energy is dark matter, composed of particles probably formed early in the universe. Two thirds is in smooth dark energy, whose gravitational effects began causing the expansion of the universe to speed up just a few billion years ago. The remarkable fact that the expansion is accelerating can be accounted for within GR, as the source of gravity is proportional to (ρ + 3p), where the pressure p and energy density ρ describe the bulk properties of “the substance.” A substance whose pressure is more negative than minus one-third of its energy density (p < −ρ/3) has repulsive gravity in GR. Such a situation occurs, for example, for a gravitating vacuum (positive cosmological constant), for which p = −ρ.
In addition to breakthrough empirical observations, creative theoretical ideas are also driving progress in cosmology. The development of cosmology during the last 20 years shows that profound connections exist between the elementary particles on the smallest scales and the universe on the largest. Using unified gauge theories of elementary particles, an inflationary scenario was formulated, which resolves a number of problems of standard Friedmann cosmology: flatness and the horizon problem, among others.
According to inflation, small bits of the universe underwent a burst of expansion when the universe was extremely young, explaining the homogeneity and isotropy of the universe at the initial stages of cosmological expansion. Within the framework of an inflationary CM, quantum fluctuations with a nearly scale-invariant distribution were predicted to appear by the transition to the radiation-dominated era, explaining the large-scale structure of the universe. The inflationary CM, like the others discussed above, is singular, which is an outstanding problem of GR. Assuming that the Planck era (when the universe was sufficiently dense to require a quantum mechanical treatment) existed, some quantum theory of gravitation is necessary to construct a regular CM. At present, superstring theory is a candidate for such a theory. Some regular CMs have been constructed in a “brane world,” under which our universe is thought to exist as a slice (or membrane) through a higher-dimensional
space. By using scalar fields with a negative potential, a solution for an oscillating CM was obtained; thus, in such models a Big Crunch takes place before the Big Bang. Resolving the problem of the cosmological singularity requires that the gravitation theory not only admit regular solutions for CMs but also exclude singular solutions. This suggests gauge theories of gravitation (Poincaré gauge theory or metric-affine gauge theory), leading to regular bouncing solutions for CMs. The building of more realistic CMs requires the resolution of fundamental cosmological problems. According to present knowledge, our universe is flat and 13 Gyr old, and it is expanding at the current rate of H0 = 72 ± 8 km s⁻¹ Mpc⁻¹. Measurements of the past rate reveal that the universe is presently in a period of cosmic acceleration. The contribution of ordinary matter to the overall mass-energy is small, with more than 95% of the universe existing in new and unidentified forms of matter and energy. What is the composition of dark matter (axions, neutralinos, or other exotic particles)? What is the nature of dark energy (quantum vacuum energy or scalar fields)? What is the field that drives inflation? Answers to these and other questions will change the picture presented above.
VIACHASLAV KUVSHINOV AND ALBERT MINKEVICH
See also Black holes; Einstein equations; Galaxies; General relativity
Further Reading
Gasperini, M. & Veneziano, G. 2003. The pre-big bang scenario in string cosmology. Physics Reports, 373: 1–212
Khoury, J., Ovrut, B.A., Seiberg, N., Steinhardt, P.J. & Turok, N. 2002. From big crunch to big bang. Physical Review D, 65: 086007
Kolb, E.W. & Turner, M.S. 1990. The Early Universe, Reading, MA: Addison-Wesley
Linde, A.D. 1990. Particle Physics and Inflationary Cosmology, Chur: Harwood Academic
Steinhardt, P.J. & Turok, N. 2002. A Cyclic Model of the Universe. hep-th/0111030
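As a numerical illustration of two formulas in this entry, the sketch below evaluates the critical density for the quoted expansion rate of 72 km/s/Mpc and checks the sign of the gravitational source term (ρ + 3p). It rests on assumptions that are not part of the entry: the rounded values of Newton's constant and of the megaparsec, the parametrization p = wρ (the symbol w is introduced here only for convenience, in units where the speed of light is 1), and the function names, which are hypothetical. The result is expressed as a mass density.

```python
import math

G = 6.674e-11            # Newton's constant in m^3 kg^-1 s^-2 (rounded, assumed)
MPC_IN_M = 3.0857e22     # one megaparsec in metres (rounded, assumed)


def critical_density(h0_km_s_mpc):
    """rho_crit = 3 H0^2 / (8 pi G), returned as a mass density in kg/m^3."""
    h0 = h0_km_s_mpc * 1.0e3 / MPC_IN_M      # convert km/s/Mpc to 1/s
    return 3.0 * h0 ** 2 / (8.0 * math.pi * G)


def gravity_is_repulsive(w):
    """For p = w * rho with rho > 0, (rho + 3p) is negative when w < -1/3."""
    return 1.0 + 3.0 * w < 0.0


if __name__ == "__main__":
    print("critical density for H0 = 72:", critical_density(72.0), "kg/m^3")
    print("vacuum energy (w = -1) repulsive?", gravity_is_repulsive(-1.0))
    print("pressureless matter (w = 0) repulsive?", gravity_is_repulsive(0.0))
```

With these inputs the critical density comes out a little below 1e-26 kg/m^3, that is, a few hydrogen-atom masses per cubic metre.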
CONVECTION See Fluid dynamics
CONVECTIVE INSTABILITY See Wave stability and instability
CORRELATION DIMENSION See Dimensions
CORRESPONDENCE PRINCIPLE See Quantum nonlinearity
COUPLED MAP LATTICE
Originally introduced in the study of spatiotemporal chaos, the coupled map lattice (CML) can be presented as a dynamical model for the evolution of a spatially extended system in time (Kaneko, 1983). CMLs have been widely used, not only as a tool for the study of spatiotemporal chaos but also for pattern dynamics in physics, chemistry, ecology, biology, brain theory, and information processing. A CML is a dynamical system with discrete time (map), discrete space (lattice), and a continuous state. It consists of dynamical elements on a lattice, which interact (are coupled) with suitably chosen sets of other elements.
The construction of a CML is carried out as follows. First, choose a (set of) field variable(s) on a lattice. This (set of) variable(s) is on a macroscopic, not a microscopic level. Second, decompose the phenomenon of interest into independent units (e.g., convection, reaction, diffusion, and so on). Third, replace each unit by simple parallel dynamics (procedure) on a lattice, where the dynamics consists of a nonlinear transformation of the field variable at each lattice point and/or a coupling term among suitably chosen neighbors. Finally, carry out each unit dynamics (procedure) successively.
As a simple and widely used example, consider a phenomenon that is created by a locally chaotic process and by diffusion, and choose a suitable lattice model on a coarse-grained level for each process. As the simplest choice, we can adopt some one-dimensional map for chaos, and a discrete Laplacian operator for the diffusion. The former process is given by $x'_n(i) = f(x_n(i))$, where $x_n(i)$ is a variable at time n and lattice site i (i = 1, 2, ..., N), whereas $x'_n(i)$ is introduced as the intermediate value. The discrete Laplacian operator for diffusion is given by
$$x_{n+1}(i) = (1-\varepsilon)\,x'_n(i) + \frac{\varepsilon}{2}\{x'_n(i+1) + x'_n(i-1)\}.$$
Combining the above two processes, the CML is given by
$$x_{n+1}(i) = (1-\varepsilon) f(x_n(i)) + \frac{\varepsilon}{2}\{f(x_n(i+1)) + f(x_n(i-1))\}. \qquad (1)$$
The mapping function f(x) is chosen to depend on the type of local chaos. For example, one can choose the logistic map, f(x) = rx(1 − x), as a typical model for chaos. As the map dynamics are well studied, dynamical systems theory can be applied to understand behaviors of the CML. By adopting different procedures, one can construct models for different types of spatially extended dynamical systems. For problems of phase transition
dynamics, it is useful to adopt a map with bistable fixed points (e.g., f (x) = tanh x) as a local dynamics. The choice of a different type of coupling, as well as the extension to a higher-dimensional space is straightforward. By changing the procedures in the CML, one can easily construct a model for dynamical phenomena in space-time. Examples include spinodal decomposition, crystal growth, boiling, convection, and cloud dynamics, among others.
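The two-stage construction of Equation (1), a local map followed by discrete diffusion, can be sketched in a few lines. This is a minimal illustration, assuming periodic boundary conditions (the entry does not specify the boundaries) and the logistic map with r = 3.9 as the local dynamics; the function names and parameter values are hypothetical.

```python
# Sketch of the diffusively coupled lattice of Equation (1).
# Assumptions: periodic boundaries, logistic local map, illustrative parameters.

def logistic(x, r=3.9):
    return r * x * (1.0 - x)


def cml_step(x, eps, f=logistic):
    """One update: apply the local map at every site, then discrete diffusion."""
    n = len(x)
    fx = [f(v) for v in x]                   # intermediate values x'_n(i)
    return [
        (1.0 - eps) * fx[i] + 0.5 * eps * (fx[(i + 1) % n] + fx[(i - 1) % n])
        for i in range(n)
    ]


def iterate(x0, eps, steps, f=logistic):
    """Apply cml_step repeatedly and return the final lattice state."""
    x = list(x0)
    for _ in range(steps):
        x = cml_step(x, eps, f)
    return x


if __name__ == "__main__":
    n_sites = 16
    # deterministic, non-uniform initial condition in [0.1, 0.9)
    x0 = [0.1 + 0.8 * (i * 0.618 % 1.0) for i in range(n_sites)]
    for value in iterate(x0, eps=0.3, steps=100):
        print(round(value, 4))
```

Passing a different function as `f`, or replacing the neighbor sum in `cml_step`, corresponds to the changes of procedure that the text describes.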
Universality Classes of the Phenomena
Phenomena found in one CML are often observed in a wide variety of systems, and they form a universality class common to such systems. CMLs thus work as a tool to predict novel phenomenology forming such qualitative universality classes. In the model of Equation (1), the following phenomena have been discovered: (i) spatial bifurcation and frozen chaos, (ii) spatiotemporal intermittency (STI), (iii) Brownian motion of chaotic defects, and (iv) global traveling wave by local phase slips. These phenomena are observed in a wide variety of systems, including experiments. In particular, STI is now regarded as a universal route to fully developed spatiotemporal chaos. In fully developed spatiotemporal chaos, statistical mechanics theory is developed by taking advantage of the discreteness in space-time.
If one adopts a two-dimensional lattice system, spiral pattern dynamics are often observed. For example, by taking a local map with an excitable state, the formation of spiral waves is studied, including turbulence due to the break-up of a spiral wave pair. Such a model is studied in relation to the pattern dynamics in reaction-diffusion systems as well as wave propagation in cardiac tissue.
Another straightforward extension is a spatially asymmetric coupling. In an open fluid flow, for example, there is coupling from up-flow to down-flow, instead of the diffusion. The CML $x_{n+1}(i) = (1-\varepsilon) f(x_n(i)) + \varepsilon f(x_n(i-1))$ gives a prototype model for such a case. In this open flow system, it is important to distinguish absolute instability from convective instability. If a small perturbation against a reference state grows in a stationary frame, it is called “absolute instability,” while if the perturbation grows only in a frame moving with a specific velocity, it is called “convective instability.” This convective instability leads to spatial bifurcation from a homogeneous state to down-flow convective chaos.
Globally Coupled Maps with Applications to Biology
An extension of CML to global coupling is interesting, and often important for biological problems. Thus a globally coupled map (GCM) was introduced as a
mean-field-type extension of a CML, written as
$$x_{n+1}(i) = (1-\varepsilon) f(x_n(i)) + \frac{\varepsilon}{N}\sum_{j=1}^{N} f(x_n(j)). \qquad (2)$$
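Equation (2) can be iterated directly, since every element is coupled only to the mean of the mapped values. The sketch below is an illustration under assumptions: the logistic map with r = 3.9 is used as f, and the helper that counts groups of coinciding elements, together with its tolerance, is a hypothetical diagnostic rather than a definition taken from the entry.

```python
# Sketch of the globally coupled map of Equation (2).
# Assumptions: logistic local map; count_clusters and its tolerance are illustrative.

def logistic(x, r=3.9):
    return r * x * (1.0 - x)


def gcm_step(x, eps, f=logistic):
    """One update: local map at every element plus coupling to the mean field."""
    fx = [f(v) for v in x]
    mean_field = sum(fx) / len(fx)           # (1/N) * sum_j f(x_n(j))
    return [(1.0 - eps) * v + eps * mean_field for v in fx]


def count_clusters(x, tol=1e-9):
    """Count groups of elements whose sorted values differ by no more than tol."""
    clusters = 0
    last = None
    for v in sorted(x):
        if last is None or v - last > tol:
            clusters += 1
        last = v
    return clusters


if __name__ == "__main__":
    n = 20
    x = [0.1 + 0.8 * (i * 0.618 % 1.0) for i in range(n)]
    for eps in (0.05, 0.2, 0.4):
        state = list(x)
        for _ in range(2000):
            state = gcm_step(state, eps)
        print("eps =", eps, "clusters =", count_clusters(state))
```

Elements that start with identical values remain identical under this update, which is the fully synchronized (single-cluster) state.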
One important notion here is clustering. The elements split into several clusters, within which all the elements oscillate in synchronization. Depending on the number of clusters in the GCM, there are phase transitions among a coherent phase, an ordered phase, a partially ordered phase, and a desynchronized phase, as the parameter describing the nonlinearity in f(x) is increased. In the partially ordered phase, there are many attractors with different numbers of clusters and with a variety of partitions. Dynamically, the system spontaneously switches between ordered states through disordered states, a behavior known as chaotic itinerancy. In the desynchronized phase, nontrivial collective motion is observed with some hidden coherence among elements. This demonstrates the existence of macroscopic chaos different from the microscopic chaos represented by each map $x_{n+1} = f(x_n)$. This observation may shed new light on the origin of collective behavior by an ensemble of cells, such as an electroencephalogram (EEG) in the brain.
Often, a biological system has both internal dynamics and interactions. Chemical dynamics in a cell includes both intra-cellular reactions associated with gene expression and cell-cell interactions. Since a CML or GCM is a model for such intra-inter dynamics, the concepts developed in this area will be relevant to biological problems. For example, clustering leads to differentiation of the states of elements. The theory for cell differentiation and robust developmental process may be based on this dynamic differentiation.
KUNIHIKO KANEKO
See also Cellular automata; Cluster coagulation; Maps
Further Reading
Chaté, H. & Courbage, M. (editors). 1997. Special issue on lattice dynamics. Physica D, 103: 1–612
Kaneko, K. 1986. Collapse of Tori and Genesis of Chaos in Dissipative Systems, Singapore: World Scientific (PhD thesis originally published 1983)
Kaneko, K. (editor). 1992. Chaos focus issue on coupled map lattices. Chaos, 2(3): 279–408
Kaneko, K. (editor). 1993. Theory and Applications of Coupled Map Lattices, Chichester and New York: Wiley
Kaneko, K. & Tsuda, I. 2000. Complex Systems: Chaos and Beyond: A Constructive Approach with Applications in Life Sciences, Berlin and New York: Springer
Kaneko, K. & Tsuda, I. 2003. Chaos focus issue on chaotic itinerancy. Chaos, 13(3): 926–1164
COUPLED OSCILLATORS
The simplest coupled oscillator is a pair of linearly coupled harmonic oscillators, which is used as a model for a wide variety of physical systems (including the interactions of musical instruments and tuning forks, lattice vibrations, electrical resonances, and so on) in which energy tunnels back and forth between two sites at a difference (beat) frequency. If there are many elementary oscillators that are nonlinear, coupled systems exhibit more varied nonlinear phenomena. There are two types of coupled nonlinear oscillators: those described by Hamiltonian (energy-conserving) dynamics, and systems in which energy is not conserved. In addition to coupled pendula, examples of the first kind include the Fermi–Pasta–Ulam model and the Toda lattice. Coupled nonlinear oscillators that do not conserve energy can be viewed as coupled limit cycle oscillators. A limit cycle oscillator (also called a self-sustained oscillator) is described as an attractor in a dissipative dynamical system. A typical dissipative dynamical system that exhibits a limit cycle oscillation is van der Pol’s equation
$$\frac{d^2x}{dt^2} - \varepsilon\,(1 - x^2)\,\frac{dx}{dt} + \omega^2 x = 0, \qquad (1)$$
in which the character of the oscillation varies from sinusoidal and energy-conserving to a strongly dissipative (blocking or relaxation) oscillation through the variation of a parameter (ε) from zero to large values (van der Pol, 1934). Among the varieties of limit cycle oscillators, the behavior of a quasilinear oscillator (small ε) can be expressed by a sinusoidal wave, x(t) = A sin(ωt + φ0). The wave shape of a relaxation oscillator (large ε), on the other hand, is composed of alternating fast and slow motions, similar to the spikes and slow recovery motions in a firing neuron, and stick-slip oscillations in frictional motions.
Although the limit cycle oscillation has a certain natural amplitude and frequency, the phase variable, for example, φ = ωt + φ0 for a quasilinear oscillator, is a neutral mode, sensitively perturbed by an external force. If the external force is periodic with a frequency close to the natural frequency of the limit cycle oscillator, the phase of the limit cycle oscillator tends to approach the phase of the external periodic force. If the external force is sufficiently strong, the phase difference Δφ(t) = φ(t) − φe(t) between the limit cycle oscillator and the external force is fixed. This phenomenon, termed phase or frequency locking, occurs more easily when ε is large, the frequency of the limit cycle oscillator is close to that of the external force, and the coupling (K) is large. Regions in the (ω, ε, K) parameter space where frequency locking is observed are termed “Arnol’d tongues” owing to their peculiar shape. The frequency ratio between the limit cycle and the external force is 1:1 in the above frequency locking. In general, n:m frequency lockings are possible, where n and m are small integers.
Figure 1. Frequency distribution P(ω) (a) in an asynchronous state for K < Kc and (b) in a mutually entrained state for K > Kc.
For a collection of coupled limit cycle oscillators with slightly different natural frequencies, frequency locking (called mutual entrainment) also occurs, as was first observed by Christiaan Huygens in the 17th century. He found that the motions of pendulum clocks suspended from the same wooden beam come to coincide with each other perfectly. Norbert Wiener analyzed such systems in the 1950s, showing that the power spectrum of the waves should have a peak close to 10 Hz, and he inferred that a similar shape of the power spectra of the electroencephalogram (EEG) is due to mutual entrainment in coupled neural oscillators (Wiener, 1958). Buck and Buck reported that rhythmical flashes of South Asian fireflies were mutually synchronized (Buck & Buck, 1976). Mutual entrainment of coupled limit cycle oscillators has been studied by Winfree (2000) and also by Kuramoto, who considered a coupled phase oscillator model, noting the neutrality of phase variables (Kuramoto, 1984). The simplest model with global coupling has the form
$$\dot{\phi}_i = \omega_i + \frac{K}{N}\sum_{j=1}^{N}\sin(\phi_j - \phi_i), \qquad (2)$$
where φi and ωi represent the phase and the natural frequency of the ith oscillator, N is the total number of oscillators, and K is a coupling constant. For K < Kc, the motion of each oscillator is independent and the frequency of the ith oscillator is the same as ωi. However, for K > Kc, collective oscillation appears and a number of oscillators are entrained to the collective oscillation. Figure 1 displays a typical frequency distribution for K < Kc and K > Kc. The δ-function peak in the frequency distribution implies mutual entrainment, and a depression is seen around the entrained frequency for K > Kc. The Josephson junction is a quantum device composed of two weakly coupled superconductors. With the bias current below a critical value, the superconducting current flows without a voltage
drop. If the bias is above the critical current, the phase difference (φ) across the Josephson junction is not constant in time, and the voltage drop (V) across the junction satisfies $\hbar\dot{\phi} = 2eV$. This is called the AC Josephson effect. Thus the Josephson junction behaves as a kind of limit cycle oscillator above the critical current. If microwaves with frequency ω0 are applied to the Josephson junction, n:1 frequency locking occurs, and the voltage becomes $V = n\hbar\omega_0/2e$. With N Josephson junctions coupled in series, the total voltage across the array is given by $V = Nn\hbar\omega_0/2e$. Such series arrays are currently used to establish the international standard of voltage (See Josephson junction arrays).
HIDETSUGU SAKAGUCHI
See also Chaotic dynamics; Phase dynamics; Synchronization; Van der Pol equation
Further Reading
Buck, J. & Buck, E. 1976. Synchronous fireflies. Scientific American, 234: 74–85
Kuramoto, Y. 1984. Chemical Oscillations, Waves, and Turbulence, Berlin: Springer
van der Pol, B. 1934. The nonlinear theory of electric oscillations. Proceedings of the IRE, 22: 1051–1086
Wiener, N. 1958. Nonlinear Problems in Random Theory, Cambridge, MA: MIT Press
Winfree, A.T. 2000. When Time Breaks Down, Berlin and New York: Springer
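As a supplement to the globally coupled phase model of Equation (2) in this entry, the following sketch integrates the model and shows frequency locking for two oscillators. It is a rough illustration under assumptions: forward-Euler time stepping is used only for brevity, the magnitude of the mean phase factor (often called an order parameter) is a common diagnostic that the entry does not define, and all names and parameter values are hypothetical.

```python
import cmath
import math


def phase_model_rhs(phases, omegas, coupling):
    """Right-hand side of the globally coupled phase model, Equation (2)."""
    n = len(phases)
    return [
        omegas[i] + coupling / n * sum(math.sin(pj - phases[i]) for pj in phases)
        for i in range(n)
    ]


def order_parameter(phases):
    """|(1/N) sum_j exp(i phi_j)|: near 0 when incoherent, 1 when all in phase."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def simulate(phases, omegas, coupling, dt=0.01, steps=5000):
    """Forward-Euler integration (an assumed, deliberately simple scheme)."""
    phases = list(phases)
    for _ in range(steps):
        rhs = phase_model_rhs(phases, omegas, coupling)
        phases = [p + dt * r for p, r in zip(phases, rhs)]
    return phases


if __name__ == "__main__":
    n = 20
    omegas = [-0.5 + i / (n - 1) for i in range(n)]      # evenly spread frequencies
    start = [2.0 * math.pi * (i * 0.618 % 1.0) for i in range(n)]
    for k in (0.1, 2.0):
        final = simulate(start, omegas, k, steps=2000)
        print("K =", k, "order parameter =", round(order_parameter(final), 3))
```

For two oscillators the phase difference θ obeys dθ/dt = (ω2 − ω1) − K sin θ, so locking requires |ω2 − ω1| < K; this small case is convenient for checking the code.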
COUPLED SYSTEMS OF PARTIAL DIFFERENTIAL EQUATIONS
Coupled systems of nonlinear partial differential equations (PDEs) are often derived to simplify complicated systems of governing equations in theoretical and applied sciences (Engelbrecht et al., 1988). Nonlinear electromagnetic theory, fluid dynamics, and systems in general relativity are difficult computational problems even with the help of numerical algorithms and the latest computer technologies. Using additional assumptions on properties of nonlinear wave processes in physical systems, however, one can derive coupled systems of nonlinear PDEs from the original governing equations, which simplify the analysis. The main effects of nonlinear waves (such as nonlinearity, dispersion, diffraction, diffusion, damping and driving forces, and resonances) can be described with coupled nonlinear PDEs. Such systems may exhibit simple solutions such as traveling solitary waves and periodic waves, and some can be solved with the inverse scattering transform method. Coupled systems comprise various combinations of nonlinear evolution equations that describe long solitary waves (Korteweg–de Vries and Boussinesq equations), envelope waves (nonlinear Schrödinger equations), kinks
and breathers (sine-Gordon equations), and traveling fronts and pulses (reaction-diffusion systems). Here, we present a few examples.
Long surface water waves occur in oceans, seas, and lakes. The tsunami wave (Bryant, 2001) is an example of a nonlinear surface wave that arises following underwater earthquakes or underwater volcanic eruptions and may reach heights of 20–30 m as it comes ashore. Because tsunamis are tens to hundreds of kilometers long, the ocean can be considered as shallow for such waves. This shallow-water approximation reduces the Euler equations for water waves to the Boussinesq system of coupled PDEs (Whitham, 1974):
$$u_t + u u_x + g\eta_x - \frac{h^2}{3}u_{txx} = 0, \qquad \eta_t + h u_x + \eta u_x + u\eta_x = 0,$$
where η = η(x, t) is the wave surface elevation, u = u(x, t) is the horizontal velocity, h is the water depth, and g is the gravitational acceleration. The linear Boussinesq equation takes the form of the wave equation, $\eta_{tt} - c^2\eta_{xx} = 0$, which exhibits a two-wave solution $\eta = f(x - ct) + g(x + ct)$, where f(x), g(x) are arbitrary functions and $c = \sqrt{gh}$ is the wave speed. When the two waves are separated in space, small nonlinearity and dispersion are captured in the unidirectional Korteweg–de Vries (KdV) equation (Johnson, 1997):
$$\eta_t + c\left(1 + \frac{3\eta}{2h}\right)\eta_x + \frac{c h^2}{6}\,\eta_{xxx} = 0.$$
Different modes of long weakly nonlinear waves may travel with the same speed, exchanging energy by means of wave resonances. Because ocean water is stratified in density and shear flow, gravity waves can propagate along internal interfaces of the ocean stratification. Resonant interaction of internal wave modes in stratified shear flows is described by the system of coupled KdV equations (Grimshaw, 2001):
$$\mathbf{u}_t + A\,\mathbf{u}_x + B(\mathbf{u})\,\mathbf{u}_x + C\,\mathbf{u}_{xxx} = 0,$$
where A, B(u), C are matrices and u = u(x, t) is the vector of amplitudes of different internal wave modes.
Optical pulses may consist of electromagnetic waves in optical fibers, waveguides, and transmission lines. The propagation of optical pulses due to a balance between nonlinearity and dispersion is based on the paraxial approximation of the Maxwell equations with nonlinear refractive indices. This perturbation technique results in the nonlinear Schrödinger (NLS) equation (Newell & Moloney, 1992):
$$i\psi_t + \tfrac{1}{2}\,\omega''(k_0)\,\psi_{xx} + \gamma(k_0)\,|\psi|^2\psi = 0,$$
where ψ = ψ(x, t) is the envelope amplitude of a wave packet with the carrier wave number k0. Depending
on the relative signs of the dispersion coefficient $\omega''(k_0)$ and the nonlinearity coefficient $\gamma(k_0)$, wave perturbations are focused or defocused in the time evolution of the NLS equation. Interactions between waves with two orientations of polarization ($\psi_y$ and $\psi_z$, with propagation in the x-direction) can be represented in a normalized form as
$$i\psi_{y,t} + \psi_{y,xx} + 2(|\psi_y|^2 + |\psi_z|^2)\,\psi_y = 0,$$
$$i\psi_{z,t} + \psi_{z,xx} + 2(|\psi_y|^2 + |\psi_z|^2)\,\psi_z = 0.$$
These are a pair of coupled NLS equations that are integrable by the inverse scattering transform method and display vector solitons (Manakov, 1974). In collisions, the polarization vectors of two vector solitons change.
In wavelength-division-multiplexing optical systems, optical signals are transmitted through parallel channels at different carrier wave numbers (up to 40 channels in the latest communication lines). Incoherent interaction of optical pulses at nonresonant frequencies is described by the system of coupled NLS equations (Akhmediev & Ankiewicz, 1997):
$$i\boldsymbol{\psi}_t + D\,\boldsymbol{\psi}_{xx} + E(|\boldsymbol{\psi}|^2)\,\boldsymbol{\psi} = 0,$$
where D and E(|ψ|²) are matrices and ψ = ψ(x, t) is the vector of optical pulses in different channels. If the coupling between optical pulses is coherent (as in birefringent fibers, waveguide couplers, phase mixers, and resonant optical materials), the system of coupled NLS equations takes a general form:
$$i\boldsymbol{\psi}_t + A\,\boldsymbol{\psi} + iB\,\boldsymbol{\psi}_x + iC(\boldsymbol{\psi})\,\boldsymbol{\psi}_x + D\,\boldsymbol{\psi}_{xx} + E(\boldsymbol{\psi})\,\boldsymbol{\psi} = 0.$$
The coupled NLS equations describe phase-matching resonance in quadratic $\chi^{(2)}$ materials, gap solitons in periodic photonic gratings under Bragg resonance, Alfvén waves in plasmas, and other applications (Newell & Moloney, 1992).
In conservative nonlinear systems, wave dynamics of small amplitudes occur typically in the neighborhood of local minima of potential energy. Wave oscillations in the system of nonlinear massive pendulums are described by the Frenkel–Kontorova dislocation model.
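A chain of pendulums coupled by springs can be sketched numerically to show how a lattice model approaches a continuum equation as the spacing shrinks. The example below rests on assumptions that the entry does not spell out: the discrete equation of motion in the docstring is a standard form of such a chain written in normalized units, the static kink 4 arctan(exp(x/c)) is a well-known solution of the sine-Gordon equation ϕtt − c²ϕxx + sin ϕ = 0, and the function names and lattice spacings are hypothetical. The code evaluates the angular acceleration that the discrete chain assigns to the continuum kink profile; the residual shrinks as the spacing is reduced.

```python
import math


def chain_acceleration(phi, spacing, c=1.0):
    """Angular acceleration of the interior pendulums of a spring-coupled chain.

    Assumed discrete model (normalized units, not written out in the entry):
        phi_n'' = c^2 (phi_{n+1} - 2 phi_n + phi_{n-1}) / a^2 - sin(phi_n)
    """
    a2 = spacing * spacing
    return [
        c * c * (phi[n + 1] - 2.0 * phi[n] + phi[n - 1]) / a2 - math.sin(phi[n])
        for n in range(1, len(phi) - 1)
    ]


def static_kink(x, c=1.0):
    """Static 2*pi kink of the continuum equation, phi = 4 arctan(exp(x / c))."""
    return 4.0 * math.atan(math.exp(x / c))


def kink_residual(spacing, half_width=10.0):
    """Largest |acceleration| when the continuum kink is sampled on the lattice."""
    count = int(round(half_width / spacing))
    xs = [spacing * k for k in range(-count, count + 1)]
    phi = [static_kink(x) for x in xs]
    return max(abs(v) for v in chain_acceleration(phi, spacing))


if __name__ == "__main__":
    for a in (0.4, 0.2, 0.1, 0.05):
        print("spacing", a, "max residual", kink_residual(a))
```

A residual that falls roughly in proportion to the square of the spacing is what one expects from replacing the second derivative by a centered difference.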
In a continuous approximation, the Frenkel–Kontorova lattice model reduces to the sine-Gordon (SG) equation (Braun & Kivshar, 1998):
$$\varphi_{tt} - c^2\varphi_{xx} + \sin\varphi = 0,$$
where $\varphi$ is the angle between a pendulum and the vertical axis in a mechanical model. The nonlinear pendulums swing on a rigid rod under the gravity force and couple to each other with elastic springs. More complicated models of molecular crystals and ferromagnetics in solid state mechanics, stacked Josephson contacts in superconductivity, and strings in the general relativity theory are formulated as coupled systems of sine-Gordon equations (Maugin, 1999). The coupled Klein–Gordon equations take the form
$$\boldsymbol{\varphi}_{tt} - C\boldsymbol{\varphi}_{xx} + \mathbf{f}(\boldsymbol{\varphi}) = 0,$$
where $C$ is a positive-definite matrix and $\mathbf{f}(\boldsymbol{\varphi})$ is the nonlinear vector function of components of the vector $\boldsymbol{\varphi} = \boldsymbol{\varphi}(x, t)$.
In more general systems, the energy of a nonlinear wave changes in time due to active and dissipative forces. The simplest system of this type is the nonlinear heat equation, which models the flame propagation (Zeldovich et al., 1985):
$$u_t = Du_{xx} + f(u),$$
where $u = u(x, t)$ is the temperature and $D$ is the diffusivity constant. A complex form of the nonlinear heat equation (known as the Ginzburg–Landau equation) is derived for the amplitude of the most unstable wave mode (Newell & Moloney, 1992). Active and dissipative systems include typically pairs of coupled activators and inhibitors. Coupled activator-inhibitor equations, known as the reaction-diffusion systems, are derived from the governing equations of thermodynamics in the form (Remoissenet, 1999):
$$\mathbf{u}_t = C\mathbf{u}_x + D\mathbf{u}_{xx} + \mathbf{f}(\mathbf{u}),$$
where $C$ and $D$ are matrices, and $\mathbf{f}(\mathbf{u})$ is a nonlinear vector function of components of the vector $\mathbf{u} = \mathbf{u}(x, t)$. Reaction-diffusion systems exhibit static, traveling, and pulsating nonlinear wave structures such as fronts and impulses. Coupled reaction-diffusion systems include the FitzHugh–Nagumo and Hodgkin–Huxley equations for nerve impulses, ephaptic coupling among nerve impulses, and models of the global dynamics of the heart.
DMITRY PELINOVSKY
See also Ephaptic coupling; Nonlinear optics; Reaction-diffusion systems; Sine-Gordon equation; Water waves
Further Reading
Akhmediev, N. & Ankiewicz, A. 1997. Solitons, Nonlinear Pulses, and Beams, London: Chapman & Hall
Braun, O.M. & Kivshar, Yu.S. 1998. Nonlinear dynamics of the Frenkel–Kontorova model. Physics Reports, 306: 1–109
Bryant, T. 2001.
Tsunami: The Underrated Hazard, Cambridge and New York: Cambridge University Press
Engelbrecht, J.K., Fridman, V.E. & Pelinovski, E.N. 1988. Nonlinear Evolution Equations, London: Longman and New York: Wiley
Grimshaw, R. (editor). 2001. Environmental Stratified Flows, Boston: Kluwer
Johnson, R.S. 1997. A Modern Introduction to the Mathematical Theory of Water Waves, Cambridge and New York: Cambridge University Press
Manakov, S.V. 1974. On the theory of two-dimensional stationary self-focusing of electromagnetic waves. Soviet Physics, JETP, 38: 248–253
Maugin, G.A. 1999. Nonlinear Waves in Elastic Crystals, Oxford and New York: Oxford University Press
Newell, A.C. & Moloney, J.V. 1992. Nonlinear Optics, Redwood City, CA: Addison-Wesley
Remoissenet, M. 1999. Waves Called Solitons, Berlin and New York: Springer
Whitham, G. 1974. Linear and Nonlinear Waves, New York: Wiley
Zeldovich, Ya.B., Barenblatt, G.I., Librovich, V.B. & Makhviladze, G.M. 1985. The Mathematical Theory of Combustion and Explosions, New York: Consultants Bureau
CRITICAL PHENOMENA
The term critical phenomenon is used synonymously with “phase transition,” which involves a change of one system phase to another and occurs at a characteristic temperature (called a transition temperature or a critical temperature, $T_c$). There are several different kinds of phase transitions such as melting, vaporization, and sublimation, as well as solid-solid, conducting-superconducting, and fluid-superfluid transitions. In systems undergoing phase transitions, an emergence of long-range order is seen in which the value of a physical quantity at one arbitrary point in the system is correlated with its value at another point a long distance away. A classification scheme of phase transitions which remains the most popular was originally proposed by Paul Ehrenfest. According to this scheme, a transition for which the first derivative of the free energy with respect to temperature is discontinuous is called a first-order phase transition; thus, the heat capacity, $C_p$, at a first-order transition is infinite. A second-order phase transition is one in which the first derivative of the thermodynamic potential with respect to temperature is continuous, but its second derivative is discontinuous, so the heat capacity is discontinuous but not infinite at the transition. Near a second-order phase transition (due to the reduction of rigidity of the system), critical fluctuations dominate as their amplitudes diverge. A useful concept in analyzing phase transitions is that of a critical exponent. In general, if a physical quantity $Q(T)$ either diverges or tends to a constant value (see Figure 1) as $T$ tends to $T_c$, it can be characterized by defining the reduced temperature $\varepsilon$ as
$$\varepsilon \equiv \frac{T - T_c}{T_c}. \tag{1}$$
The associated critical exponent is
$$\mu = \lim_{\varepsilon \to 0} \frac{\ln Q(\varepsilon)}{\ln \varepsilon}. \tag{2}$$
The most important critical exponents are denoted as $\alpha$, $\beta$, $\gamma$, $\delta$, $\nu$, and $\eta$ and describe the specific heat, order parameter, isothermal susceptibility, response to an external field, the correlation length, and the pair correlation function, respectively. (See Table 1, where the primed exponents are introduced for temperatures below the critical temperature while the unprimed exponents are valid above the critical temperature.)

Figure 1. The four generic behaviors near criticality.

The mean field approximation (Landau theory) describes the physics of phase transitions well except in the immediate vicinity of the critical point where order parameter fluctuations are large. It is assumed that close to $T_c$, the free energy $F$ can be expanded in a Taylor series of the order parameter $\psi$. Introducing the reduced temperature $\varepsilon$ as a control parameter, the simplest such expansion is (see Figure 2)
$$F(T, V, \psi) = F_0 + a\varepsilon\psi^2 + A_4\psi^4, \tag{3}$$
where $a > 0$ and $A_4 > 0$. Solving the equilibrium conditions for $\psi$ yields $\psi = 0$ for $\varepsilon > 0$ and
$$\psi = \pm\left(-\frac{a\varepsilon}{2A_4}\right)^{1/2} \quad \text{for } \varepsilon < 0; \tag{4}$$
thus $\beta = 0.5$ is obtained. Calculating the entropy,
$$S = -\frac{\partial F}{\partial T} = S_0 + \frac{a^2\varepsilon}{2A_4T_c}, \quad \varepsilon \le 0, \tag{5}$$
where for $\varepsilon > 0$, $S = S_0$ is the entropy of the disordered phase, which gives the specific heat as
$$C_v = T\frac{\partial S}{\partial T} = C_0 + T\frac{a^2}{2A_4T_c^2}, \quad \varepsilon \le 0, \tag{6}$$
where for $\varepsilon > 0$, $C_v = C_0$ is the specific heat of the disordered phase. Hence, a discontinuity occurs at $T_c$ (see Figure 3):
$$\Delta C = \frac{a^2}{2A_4T_c}. \tag{7}$$
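As a quick numerical check of the Landau results above, the following minimal Python sketch minimizes the expansion (3) over $\psi$ and estimates $\beta$ from the log-log slope of the equilibrium order parameter, in the spirit of definition (2). The parameter values ($a = 1$, $A_4 = 0.25$), the ternary search, and all function names are illustrative assumptions and not part of the original treatment.

```python
import math

def free_energy(psi, eps, a=1.0, A4=0.25, F0=0.0):
    """Landau expansion F = F0 + a*eps*psi^2 + A4*psi^4 (Equation (3))."""
    return F0 + a * eps * psi**2 + A4 * psi**4

def equilibrium_psi(eps, a=1.0, A4=0.25, psi_max=10.0, iters=200):
    """Non-negative minimizer of F, found by ternary search on [0, psi_max].

    For psi >= 0 the expansion is unimodal (it has a single minimum),
    so ternary search is sufficient for this sketch.
    """
    lo, hi = 0.0, psi_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if free_energy(m1, eps, a, A4) < free_energy(m2, eps, a, A4):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def beta_estimate(eps1=-1e-2, eps2=-1e-3, a=1.0, A4=0.25):
    """Slope of ln(psi) against ln(-eps) between two reduced temperatures."""
    p1 = equilibrium_psi(eps1, a, A4)
    p2 = equilibrium_psi(eps2, a, A4)
    return (math.log(p1) - math.log(p2)) / (math.log(-eps1) - math.log(-eps2))

if __name__ == "__main__":
    print("psi at eps = -0.02:", equilibrium_psi(-0.02))
    print("closed form (4):   ", math.sqrt(0.02 / (2 * 0.25)))
    print("estimated beta:    ", beta_estimate())
```

Under these assumptions the numerical minimizer should agree with Equation (4) below $T_c$, vanish above $T_c$, and give a slope close to 0.5.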
Thus, $\alpha = 0$. Including in $F$ an external field $h$ coupled to $\psi$,
$$F = F_0 + a\varepsilon\psi^2 + A_4\psi^4 - h\psi, \tag{8}$$
and minimizing $F$ with respect to $\psi$ yields an equation of state of the form
$$h = 2\psi(a\varepsilon + 2A_4\psi^2). \tag{9}$$
Because the susceptibility $\chi \equiv \partial\psi/\partial h$, we find
$$\chi = [2a\varepsilon + 12A_4\psi^2]^{-1}.$$
[...] classical critical exponents: $\alpha = 0$, $\beta = 0.5$, $\gamma = 1$, and $\delta = 3$.

Table 1. The definitions of critical exponents for liquid-vapor and magnetic systems. [The primed (unprimed) exponents are for temperatures below (above) $T_c$.]

| Exponent | Definition (liquid-vapor) | Definition (magnetic) |
|---|---|---|
| $\alpha'$ | Specific heat at constant volume: $C_v \sim (-\varepsilon)^{-\alpha'}$ | Specific heat at constant $H$: $C_H \sim (-\varepsilon)^{-\alpha'}$ |
| $\alpha$ | $C_v \sim \varepsilon^{-\alpha}$ | $C_H \sim \varepsilon^{-\alpha}$ |
| $\beta$ | Density difference: $\rho_L - \rho_G \sim (-\varepsilon)^{\beta}$ | Magnetization: $M \sim (-\varepsilon)^{\beta}$ |
| $\gamma'$ | Isothermal compressibility: $\kappa_T \sim (-\varepsilon)^{-\gamma'}$ | Isothermal susceptibility: $\chi_T \sim (-\varepsilon)^{-\gamma'}$ |
| $\gamma$ | $\kappa_T \sim \varepsilon^{-\gamma}$ | $\chi_T \sim \varepsilon^{-\gamma}$ |
| $\delta$ | Pressure-density critical isotherm: $P - P_c \sim |\rho_L - \rho_G|^{\delta}$ $(T = T_c)$ | Magnetic field-magnetization: $H \sim |M|^{\delta}$ $(T = T_c)$ |
| $\nu'$ | Correlation length: $\xi \sim (-\varepsilon)^{-\nu'}$ | Correlation length: $\xi \sim (-\varepsilon)^{-\nu'}$ |
| $\nu$ | $\xi \sim \varepsilon^{-\nu}$ | $\xi \sim \varepsilon^{-\nu}$ |
| $\eta$ | Density–density pair correlation: $\Gamma(r) \sim |r|^{-(d-2+\eta)}$ | Spin-spin pair correlation: $\Gamma(r) \sim |r|^{-(d-2+\eta)}$ |

Figure 3. Plots of $S(T)$ and $C_v(T)$ in the Landau model of a second-order phase transition.

While the Landau theory cannot describe spatial fluctuations, following Ginzburg and Landau’s proposal, it can be extended to consider the free energy to be a functional:
$$F(\psi(\mathbf{r}), T) = \int d^3r\,[A_2\psi^2 + A_4\psi^4 - h\psi + D(\nabla\psi)^2], \tag{12}$$
where $D$ describes the energy due to spatial inhomogeneities. Applying a variational principle to $F$ results in a nonlinear Klein–Gordon equation for the order parameter
$$h = 2A_2\psi + 4A_4\psi^3 - 2D\nabla^2\psi. \tag{13}$$
A linearized solution of Equation (13) in spherical coordinates is
$$\psi = \frac{h_0}{4\pi D}\frac{e^{-r/\xi}}{r}, \tag{14}$$
where $\xi \sim A_2^{-1/2}$ is the correlation length that diverges as $T \to T_c$ so that the critical exponent $\nu = 0.5$. Fourier transforming the order parameter according to
$$\psi(\mathbf{r}) \equiv L^{-d/2}\sum_{\mathbf{k}} \psi_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{15}$$
[...] 0 periods to obtain new capital. The solution of this planning problem is given by
$$\max_{\{c(t)\}} \int_0^{\infty} U(c(t))\,e^{-\rho t}\,dt, \tag{4}$$
subject to the DDE
$$\frac{dk(t)}{dt} = f(k(t - r)) - \delta k(t - r) - c(t), \tag{5}$$
with initial condition $k(t) = \phi(t)$ for all $t \in [-r, 0]$. Here $f(\cdot)$ is the production function; $c(t)$ is consumption, constrained by $0 < c(t) \le f(k(t - r))$; $\delta \in [0, 1]$ is the rate at which capital depreciates; and $k(t)$ is the productive capital stock at time $t$.
Mathematics
If one considers the symmetry reduction of a nonlinear differential-difference equation with respect to a combination of continuous and discrete symmetries, then the initial equation reduces to a DDE. As an
(6)
and assume a reduction with respect to ∂n + a∂t where a is an arbitrary real parameter. Equation (6) then reduces to (7)
where $\eta = t - an$. The equations considered in these examples are all instances of a general DDE, which, in the simple case of a linear first-order equation for just one field, can be written as
$$a_0\frac{du(t)}{dt} + a_1\frac{du(t - \sigma)}{dt} + b_0u(t) + b_1u(t - \sigma) = f(t). \tag{8}$$
An equation of the form (8) is said to be a DDE of retarded type if $a_0 \neq 0$ and $a_1 = 0$; it is said to be of neutral type if $a_0 \neq 0$ and $a_1 \neq 0$; and of advanced type if $a_0 = 0$ and $a_1 \neq 0$. In applications, an equation of retarded type may represent the behavior of a system in which the rate of change of $u(t)$ depends on its past and present values. A neutral equation represents a system in which the present rate of change depends on past rates of change as well as its present and past values. An advanced type equation may represent a system in which its rate of change depends on its present and future values. If $a_0 = a_1 = 0$, Equation (8) is a pure difference equation, while if $a_0 = b_0 = 0$ or $a_1 = b_1 = 0$, it is a pure differential equation. In either case, $f(t)$ is a forcing function. Let us compare the solution techniques for DDEs with those of ordinary differential equations (ODEs) and note some of their peculiar features. For simplicity we limit ourselves to retarded DDEs. For more details, see Bellman & Cooke (1963), Hale (1977), Hale & Verduyn Lunel (1993), Driver (1977), Bainov & Mishev (1991), Kuang (1993), and Gyori & Ladas (1991). Because retarded DDEs depend on previous history, the initial condition at one point is not sufficient to obtain the present time behavior. What one needs depends on the discrete order of the equation. If the equation is a DDE of first order, then the initial solution on a whole delay interval is needed. For constant coefficient DDEs, an algebraic method of solution is provided by the “method of steps,” which also provides a constructive proof of the existence of the solution. To illustrate this method, consider a DDE generalization of the logistic equation
$$\frac{dx(t)}{dt} = -cx(t - 1)[1 + x(t)], \quad t > 0, \tag{9}$$
with the initial condition $x(t) = \phi(t)$ for $t \in [-1, 0]$. To solve Equation (9), we divide the interval $[0, \infty)$ into steps of the size of the delay and solve recursively in each interval. We use the solution obtained in one interval to solve Equation (9) in the next one. For example, the solution in the interval $[0, 1]$ is given by
$$x(t) = [\phi(0) + 1]\exp\left(-c\int_0^t \phi(s - 1)\,ds\right) - 1,$$
which is obtained as a solution of the ODE
$$\frac{dx(t)}{dt} = -c\phi(t - 1)[1 + x(t)]. \tag{10}$$
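The method of steps described above can be sketched numerically. The following Python fragment is a minimal illustration, not a production solver: it integrates Equation (9) with Heun's method on a grid whose step divides the delay, so that the delayed value $x(t - 1)$ is always an already computed grid value. The history $\phi(t) = 1 + t$, the value $c = 1$, the choice of Heun's method, and the function names are assumptions made for the example. For this history the closed form on $[0, 1]$ reduces to $x(t) = 2e^{-ct^2/2} - 1$, which gives a check on the first step.

```python
import math

def solve_delayed_logistic(phi, c, t_end, n=1000):
    """Integrate x'(t) = -c x(t-1) [1 + x(t)] for 0 <= t <= t_end.

    phi   : history function on [-1, 0]
    n     : grid points per unit delay (step h = 1/n)
    Returns the list of values x(0), x(h), x(2h), ...
    """
    h = 1.0 / n
    # Store the history on [-1, 0]; index i corresponds to t = -1 + i*h.
    xs = [phi(-1.0 + i * h) for i in range(n + 1)]
    steps = int(round(t_end * n))
    for i in range(steps):
        k = n + i                    # index of the current time t = i*h
        x_now = xs[k]
        delayed_now = xs[k - n]      # x(t - 1), already known
        delayed_next = xs[k - n + 1] # x(t + h - 1), already known
        slope1 = -c * delayed_now * (1.0 + x_now)
        x_pred = x_now + h * slope1  # Euler predictor
        slope2 = -c * delayed_next * (1.0 + x_pred)
        xs.append(x_now + 0.5 * h * (slope1 + slope2))  # Heun corrector
    return xs[n:]

def first_step_exact(t, c):
    """Closed form on [0, 1] for the assumed history phi(t) = 1 + t."""
    return 2.0 * math.exp(-0.5 * c * t * t) - 1.0

if __name__ == "__main__":
    sol = solve_delayed_logistic(lambda t: 1.0 + t, c=1.0, t_end=2.0)
    print("numerical x(1):", sol[1000])
    print("closed form x(1):", first_step_exact(1.0, 1.0))
```

Beyond $t = 1$ the same loop keeps reusing earlier grid values as the delayed term, which is the recursive structure of the method.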
For linear DDEs we can construct, as in the case of linear ODEs, the characteristic equation, by looking at exponential solutions. In this case, however, the characteristic equation is given by a nonlinear algebraic equation. For example, in the case of Equation (8), with $a_1 = 0$, we have
$$h(\lambda) = a_0\lambda + b_0 + b_1e^{-\lambda\sigma} = 0. \tag{11}$$
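Because Equation (11) is transcendental, its roots are usually found numerically. The sketch below applies Newton's method in the complex plane to $h(\lambda)$; the coefficient values $a_0 = 1$, $b_0 = 0$, $b_1 = 1$, $\sigma = 1$ (that is, $u'(t) = -u(t - 1)$), the starting guess, and the function names are assumptions chosen for illustration. Newton's method finds only the root nearest the starting guess, so it does not enumerate the full infinite set of roots.

```python
import cmath

def char_fn(lam, a0, b0, b1, sigma):
    """h(lambda) = a0*lambda + b0 + b1*exp(-lambda*sigma), Equation (11)."""
    return a0 * lam + b0 + b1 * cmath.exp(-lam * sigma)

def char_root(guess, a0, b0, b1, sigma, tol=1e-14, max_iter=100):
    """Newton iteration for one complex root of h(lambda) = 0."""
    lam = complex(guess)
    for _ in range(max_iter):
        h = char_fn(lam, a0, b0, b1, sigma)
        dh = a0 - sigma * b1 * cmath.exp(-lam * sigma)  # h'(lambda)
        step = h / dh
        lam -= step
        if abs(step) < tol:
            break
    return lam

if __name__ == "__main__":
    root = char_root(-0.3 + 1.3j, a0=1.0, b0=0.0, b1=1.0, sigma=1.0)
    print("root:", root)
    print("residual:", abs(char_fn(root, 1.0, 0.0, 1.0, 1.0)))
```

For these assumed coefficients the root found has a negative real part and a nonzero imaginary part, so the corresponding exponential solution is a decaying oscillation.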
Once the characteristic equation is solved, a particular solution of the DDE is obtained by applying the Laplace transform (Bellman & Cooke, 1963). As we have seen, the nature of the method of solution of a DDE is similar to that of an ODE. Nevertheless, DDEs exhibit more complicated behaviors, even in the linear case. For example, scalar linear first-order homogeneous DDEs with real coefficients can have nontrivial oscillating solutions, unlike ODEs (Kalecki, 1935). Moreover, solutions to DDEs may be discontinuous and, depending on the initial conditions, a solution may also not exist (Winston & Yorke, 1969). As in the case of ODEs, series solutions can be used to approximate solutions to nonlinear DDEs (Bellman & Cooke, 1963); however, the solutions obtained are often complicated and obscure. We can gain a better insight into the solution using qualitative theory and stability analysis to obtain properties of the dynamics of a nonlinear DDE by looking at its linearization. The stability of a fixed point of a DDE is determined by examining the roots of the characteristic equation $h(\lambda) = 0$. Thus, a fixed point of a DDE is stable if all roots of $h(\lambda)$ have negative real parts. As the characteristic equation (11) is transcendental, it has an infinity of roots, and it is not guaranteed that the real parts of all roots are strictly negative or all strictly positive. So fixed points of DDEs will often be saddle points. Moreover, stability may depend crucially on the initial data (Driver, 1977). The stability of homogeneous scalar DDEs of the first order has been studied by Hayes (Bellman & Cooke, 1963). These results can be extended to nonlinear systems by linearizing the DDE around a stable solution and then using a generalization of the Poincaré–Lyapunov theorem. In such a way, one can
show that DDEs often admit periodic solutions after a sequence of Hopf bifurcations. Chaotic orbits may also exist, with the structure of the orbits depending critically on the smoothness of the feedback mechanism.
DECIO LEVI
See also Bifurcations; Equations, nonlinear; Feedback; Hopf bifurcation; Integral transforms; Ordinary differential equations, nonlinear; Poincaré theorems; Quasilinear analysis; Stability; Symmetry: equations vs. solutions
Further Reading
Asea, P.K. & Zak, P.J. 1999. Time-to-build and cycles. Journal of Economic Dynamics & Control, 23: 1155–1175
Bainov, D.D. & Mishev, D.P. 1991. Oscillation Theory for Neutral Differential Equations with Delay, Bristol: Adam Hilger
Bellman, R. & Cooke, K.L. 1963. Differential-Difference Equations, New York: Academic Press
Driver, R.D. 1977. Ordinary and Delay Differential Equations, New York: Springer
Gyori, I. & Ladas, P. 1991. Oscillation Theory of Delay Differential Equations: with Applications, Oxford: Clarendon Press
Hale, J.K. 1977. Theory of Functional Differential Equations, New York: Springer
Hale, J.K. & Verduyn Lunel, S.M. 1993. Introduction to Functional Differential Equations, New York: Springer
Kalecki, M. 1935. A macroeconomic theory of business cycles. Econometrica, 3: 327–344
Kuang, Y. 1993. Delay Differential Equations with Applications in Population Dynamics, Boston: Academic Press
Levi, D. & Winternitz, P. 1993. Symmetries and conditional symmetries of differential-difference equations. Journal of Mathematical Physics, 34: 3713–3730
Ross, R. 1911. The Prevention of Malaria, 2nd edition, London: John Murray
Roussel, M.R. 1996. The use of delay differential equations in chemical kinetics. Journal of Physical Chemistry, 100: 8323–8330
Winston, E. & Yorke, J.A. 1969. Linear delay differential equations whose solutions become identically zero. Académie de la République Populaire Roumaine, 14: 885–887
DENJOY THEORY
The theory developed by Arnaud Denjoy (1884–1974) showed that any sufficiently smooth orientation-preserving diffeomorphism $T$ of the unit circle $S^1$ with an irrational rotation number $\rho$ is topologically equivalent to a linear rotation by the angle $2\pi\rho$ (Denjoy, 1932). Informally, a diffeomorphism is a smooth invertible map whose inverse is also smooth. Circle diffeomorphisms arise naturally in many physical problems. For instance, in the case of Hamiltonian systems with two degrees of freedom, such diffeomorphisms appear as Poincaré first return maps for the two-dimensional invariant tori. When the rotation number is irrational, circle diffeomorphisms represent an important model for quasi-periodic dynamics (see Quasiperiodicity). The Denjoy theory
implies the following important fact: if two smooth circle maps have the same irrational rotation number, then the topological structure of their trajectories is exactly the same. The topological equivalence means that circle diffeomorphisms are conjugated to a linear rotation with the help of a homeomorphic change of variables. Namely, there exists a homeomorphism $\phi$, that is, an invertible map that is continuous together with its inverse, such that $T \circ \phi = \phi \circ T_\rho$, where $T_\rho$ is the linear rotation by the angle $2\pi\rho$ and $\circ$ stands for a composition of two maps. Denjoy’s theorem holds if $T$ is absolutely continuous and $\log T'(x)$ has bounded total variation: $V = \mathrm{Var}_{S^1} \log T'(x) < \infty$. The last condition is satisfied if $T$ is $C^2$-smooth and $T'(x) > 0$. The conjugacy $\phi$ is defined uniquely up to an arbitrary rotation $T_\alpha$. In fact, a mapping $\phi$ of the unit circle $S^1$ that satisfies the condition $T \circ \phi = \phi \circ T_\rho$ can be constructed for any quasi-periodic homeomorphism $T$. This means that any homeomorphism $T$ with irrational rotation number $\rho$ is semiconjugate to $T_\rho$. However, if $T$ is not regular enough, $\phi$ may not be a homeomorphism. To construct $\phi$ it is enough to take two arbitrary points $x_0$ and $y_0$ and define their forward trajectories by $T$ and $T_\rho$, respectively: $x_i = T^i x_0$, $y_i = T_\rho^i y_0$, $i \ge 1$. Now one can define $\phi$ on $\{y_i\}$ by letting $\phi(y_i) = x_i$, $i \ge 0$, and extending $\phi$ by continuity to the whole unit circle. This can be done since any trajectory of a linear rotation by an irrational angle is everywhere dense. It is easy to see that a conjugacy $\phi$ is a homeomorphism if and only if $T$ is transitive; that is, all its trajectories are dense on $S^1$. When the total variation $V$ is bounded, the transitivity of $T$ follows from the Denjoy inequality:
$$\exp(-V) \le \prod_{i=0}^{q_n - 1} T'(x_i) \le \exp(V),$$
where $q_n$ are the denominators of the convergents $p_n/q_n = [k_1, k_2, \ldots, k_n]$, and $\rho = [k_1, k_2, \ldots, k_n, \ldots]$ is the continued fraction expansion for $\rho$. The condition $T \in C^2(S^1)$ that implies topological equivalence is almost sharp. Indeed, Denjoy constructed counterexamples where $T \in C^1(S^1)$ and the derivative $T'(x)$ is a Hölder continuous function with an arbitrary Hölder exponent $0 < \alpha < 1$. In these examples $T$ is not transitive and, hence, is not conjugate to $T_\rho$. An important extension of the Denjoy theory is connected with the problem of smoothness of the conjugacy $\phi$. It is natural to ask when the homeomorphism $\phi$ is at least $C^1$-smooth, which implies not only topological but also asymptotic metrical equivalence between $T$ and $T_\rho$. In this case, the unique probability invariant measure for $T$ is absolutely continuous with respect to the Lebesgue measure. The first progress in this direction was made by Arnol’d (1961), who proved that for analytic diffeomorphisms
$T$ that are close enough to the linear rotation $T_\rho$, a conjugacy $\phi$ is analytic provided the rotation number $\rho$ is Diophantine, that is, $|\rho - p/q| \ge C/q^{2+\delta}$ for some $C > 0$, $\delta > 0$ and all integers $p$ and $q > 0$. Diophantine numbers form a set of positive Lebesgue measure and, hence, are typical in the Lebesgue sense. Arnol’d has also constructed counterexamples in the case of nontypical rotation numbers, which show that the smooth theory cannot be constructed for all irrational rotation numbers. In these counterexamples, $\phi$ is not differentiable, and the invariant measure for $T$ is essentially singular with respect to Lebesgue measure. Arnol’d’s results are of the KAM-type (Kolmogorov–Arnol’d–Moser) and, hence, have a local character. However, as was conjectured by Arnol’d, in the one-dimensional case the local condition of $T$ being close to $T_\rho$ should not be necessary, and the global result should hold for all $T$ smooth enough. Such a global result has been proven by Herman (1979) in the case when $\rho$ satisfies a certain Diophantine condition and $T \in C^3(S^1)$. Later Herman’s results were extended to a wider class of rotation numbers (Yoccoz, 1984) and to diffeomorphisms $T \in C^{2+\varepsilon}(S^1)$ (Khanin & Sinai, 1987; Sinai & Khanin, 1989; Katznelson & Ornstein, 1989). Finally, we mention another extension of the Denjoy theory to the case of diffeomorphisms with singularities. Such mappings appear, for example, in the case of critical invariant tori in Hamiltonian systems with two degrees of freedom. The extension of the Denjoy theory to this case is a subject of the so-called rigidity theory. The main aim is to find conditions which imply that two topologically equivalent homeomorphisms that have the same local structure of their singular points are, in fact, $C^1$-smoothly conjugate to each other. Significant progress in this direction has been made in the last 5 years in the case of mappings with one singular point (de Faria & de Melo, 1999, 2000; Yampolsky, 2001; Khanin & Khmelev, 2003).
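The convergents $p_n/q_n$ and the Diophantine condition that appear earlier in this entry can be explored numerically. The Python sketch below is only an illustration under stated assumptions: it takes the golden mean $\rho = (\sqrt{5} - 1)/2$ as the example rotation number, extracts the first partial quotients in floating point (reliable only for a modest number of terms), builds the convergents with the standard recurrence, and evaluates $q_n^2\,|\rho - p_n/q_n|$. The function names are hypothetical.

```python
from fractions import Fraction
import math

def partial_quotients(rho, n):
    """First n partial quotients k_1..k_n of rho in (0, 1),
    with rho = 1/(k_1 + 1/(k_2 + ...)). Floating point, so keep n small."""
    digits = []
    x = rho
    for _ in range(n):
        if x == 0.0:
            break
        x = 1.0 / x
        k = int(x)      # integer part (x is positive)
        digits.append(k)
        x -= k
    return digits

def convergents(digits):
    """Convergents p_n/q_n from p_n = k_n p_{n-1} + p_{n-2} (same for q_n)."""
    p_prev, q_prev = 1, 0   # p_{-1}, q_{-1}
    p, q = 0, 1             # p_0, q_0
    out = []
    for k in digits:
        p, p_prev = k * p + p_prev, p
        q, q_prev = k * q + q_prev, q
        out.append(Fraction(p, q))
    return out

def scaled_errors(rho, fracs):
    """q^2 * |rho - p/q| for each convergent p/q."""
    return [f.denominator**2 * abs(rho - f.numerator / f.denominator)
            for f in fracs]

if __name__ == "__main__":
    rho = (math.sqrt(5.0) - 1.0) / 2.0
    fracs = convergents(partial_quotients(rho, 10))
    for f, e in zip(fracs, scaled_errors(rho, fracs)):
        print(f, round(e, 4))
```

For this assumed $\rho$ the convergents are ratios of consecutive Fibonacci numbers and the scaled errors stay bounded away from zero, which is the behavior a Diophantine bound of the form $|\rho - p/q| \ge C/q^{2+\delta}$ requires along the convergents.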
Note that the presence of singularities makes rigidity stronger than in the case of smooth diffeomorphisms. The arithmetical properties of the rotation numbers are less important, and one should expect $C^1$-rigidity for all irrational rotation numbers.
KONSTANTIN KHANIN
See also Kolmogorov–Arnol’d–Moser theorem; Maps; Quasiperiodicity
Further Reading
Arnol’d, V.I. 1961. Small denominators. I. Mapping the circle onto itself. Izvestiya Akademii Nauk SSSR, Seriya Matematicheskaya, 25: 21–86
Cornfeld, I.P., Fomin, S.V. & Sinai, Ya.G. 1982. Ergodic Theory, New York: Springer
Denjoy, A. 1932. Sur les courbes définies par les équations différentielles à la surface du tore. Journal de Mathématiques Pures et Appliquées, ser. 9, 11: 333–375
de Faria, E. & de Melo, W. 1999. Rigidity of critical circle mappings. I. Journal of the European Mathematical Society (JEMS), 1: 339–392
de Faria, E. & de Melo, W. 2000. Rigidity of critical circle mappings. II. Journal of the American Mathematical Society, 13: 343–370
Herman, M. 1979. Sur la conjugaison différentiable des difféomorphismes du cercle à des rotations. Publications Mathématiques de l’Institut des Hautes Études Scientifiques, 49: 5–233
Katznelson, Y. & Ornstein, D. 1989. The differentiability of the conjugation of certain diffeomorphisms of the circle. Ergodic Theory & Dynamical Systems, 9: 643–680
Khanin, K.M. & Sinai, Ya.G. 1987. A new proof of M. Herman’s theorem. Communications in Mathematical Physics, 112: 89–101
Khanin, K. & Khmelev, D. 2003. Renormalizations and rigidity theory for circle homeomorphisms with singularities of the break type. Communications in Mathematical Physics, 235: 69–124
Sinai, Ya.G. & Khanin, K.M. 1989. Smoothness of conjugacies of diffeomorphisms of the circle with rotations. Russian Mathematical Surveys, 44: 69–99
Yampolsky, M. 2001. The attractor of renormalization and rigidity of towers of critical circle maps. Communications in Mathematical Physics, 218: 537–568
Yoccoz, J.-C. 1984. Conjugaison différentiable des difféomorphismes du cercle dont le nombre de rotation vérifie une condition diophantienne. Annales Scientifiques de l’École Normale Supérieure, 4(17): 333–359
DERIVATIVE NLS EQUATION See Nonlinear Schrödinger equations
DERRICK–HOBART THEOREM
The Derrick–Hobart scaling argument concerns certain solutions of nonlinear partial differential equations that arise as models for elementary particles; thus they are mostly of the relativistic variety. To appreciate the context in which the argument arose and the way it is used, some introductory remarks on relativistic quantum field theory are in order. There are only a few interacting relativistic quantum field theories that have been solved explicitly, in the sense that physically relevant quantities (particle spectrum, scattering, form factors, and so on) are known in closed form. For all of these models the dimension of space-time equals two. To gain more insight into higher-dimensional models, it has become standard practice to study the field theory first as a classical field theory. The underlying idea is that (via the Feynman path integral) one can use classical findings to obtain nonperturbative information on the quantum version. In particular, the presence of nonconstant, smooth, stable, time-independent, finite-energy, classical solutions is believed to signal the presence of an associated stable quantum particle. The notion of “stability” refers to small fluctuations around such a classical finite-energy solution. To first order, such variations do not change the energy, as expressed by the Euler–Lagrange equation. To second order, however, the energy may become smaller, in which case the corresponding quantum particle is considered to be unstable. (Think of a ball resting on top of a hill. A little push makes it roll down.) In order to study the existence of nonconstant finite-energy solutions (either stable or unstable), an argument due independently to Derrick (1964) and (in a somewhat different form) to Hobart (1963) is often useful. These authors were concerned with three-dimensional space (four-dimensional space-time), but the argument can be extended without difficulty to an arbitrary space dimension $N$. Briefly, the argument is as follows. Assume $\phi(x, t)$, $x \in \mathbb{R}^N$, $t \in \mathbb{R}$, is a scalar field on $(N + 1)$-dimensional space-time, whose dynamics is given by the Lagrangian
$$L = \tfrac{1}{2}\left((\partial_t\phi)^2 - \nabla\phi \cdot \nabla\phi\right) - V(\phi), \tag{1}$$
with $V(y)$ being a potential function. Now let $\phi(x)$ be a (smooth) time-independent nonconstant solution to the Euler–Lagrange equation, with finite energy
$$E = E_{\mathrm{kin}} + E_{\mathrm{pot}}, \quad E_{\mathrm{kin}} = \frac{1}{2}\int_{\mathbb{R}^N} \nabla\phi \cdot \nabla\phi\,dx, \quad E_{\mathrm{pot}} = \int_{\mathbb{R}^N} V(\phi)\,dx. \tag{2}$$
Starting from the above data, Derrick’s key idea is to consider the family of scaled functions
$$\phi_\lambda(x) = \phi(\lambda x). \tag{3}$$
Clearly, the energy associated with $\phi_\lambda$ is given by
$$E_\lambda = \lambda^{2-N}E_{\mathrm{kin}} + \lambda^{-N}E_{\mathrm{pot}}, \tag{4}$$
so that
$$(dE_\lambda/d\lambda)_{\lambda=1} = (2 - N)E_{\mathrm{kin}} - NE_{\mathrm{pot}}, \tag{5}$$
$$(d^2E_\lambda/d\lambda^2)_{\lambda=1} = (2 - N)(1 - N)E_{\mathrm{kin}} + N(N + 1)E_{\mathrm{pot}}. \tag{6}$$
Since $\phi_\lambda$ makes the energy stationary for $\lambda = 1$, we have
$$(dE_\lambda/d\lambda)_{\lambda=1} = 0. \tag{7}$$
Hence (5) yields
$$E_{\mathrm{pot}} = \frac{2 - N}{N}E_{\mathrm{kin}}, \tag{8}$$
which entails
$$(d^2E_\lambda/d\lambda^2)_{\lambda=1} = 2(2 - N)E_{\mathrm{kin}}. \tag{9}$$
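The scaling relations (4) and (8) can be checked numerically in the one case, $N = 1$, where stable static solutions exist. The following Python sketch is an illustration under stated assumptions: it uses the standard sine-Gordon kink $\phi(x) = 4\arctan e^{x}$ with potential $V(\phi) = 1 - \cos\phi$ (a normalization chosen here, not fixed by the text), a plain trapezoidal rule on a truncated interval, and hypothetical function names. For $N = 1$, Equation (8) predicts $E_{\mathrm{pot}} = E_{\mathrm{kin}}$, and Equation (4) predicts $E_\lambda = \lambda E_{\mathrm{kin}} + \lambda^{-1}E_{\mathrm{pot}}$.

```python
import math

def kink(x):
    """Static sine-Gordon kink (assumed normalization)."""
    return 4.0 * math.atan(math.exp(x))

def kink_prime(x):
    """Derivative of the kink: d/dx 4*atan(exp(x)) = 2/cosh(x)."""
    return 2.0 / math.cosh(x)

def integrate(f, a=-40.0, b=40.0, n=8000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def energies(lam=1.0):
    """Kinetic and potential energy of the scaled profile phi(lam * x), N = 1."""
    e_kin = integrate(lambda x: 0.5 * (lam * kink_prime(lam * x)) ** 2)
    e_pot = integrate(lambda x: 1.0 - math.cos(kink(lam * x)))
    return e_kin, e_pot

if __name__ == "__main__":
    e_kin, e_pot = energies(1.0)
    print("E_kin, E_pot at lambda = 1:", e_kin, e_pot)
    for lam in (0.5, 2.0):
        k, p = energies(lam)
        print("lambda =", lam, "direct:", k + p,
              "from (4):", lam * e_kin + e_pot / lam)
```

With these assumptions both energies equal 4 at $\lambda = 1$, so $E_\lambda = 4(\lambda + 1/\lambda)$ has its minimum at $\lambda = 1$, consistent with the kink being stable against rescaling.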
Let us now draw the relevant conclusions from this simple calculation. Since φ(x) is nonconstant, its
kinetic energy Ekin is positive. For N > 2, then, (9) says that the finite-energy solution cannot be stable. This is the first consequence, an instability result for N > 2. It does not involve restrictions on the potential V (y). Assuming from now on that V (y) ≥ 0, far stronger conclusions can be drawn. Indeed, since φλ is a solution for λ = 1, φ1 = φ makes the energy stationary. But since φ is nonconstant, we have Ekin > 0, and since V ≥ 0, we also have Epot ≥ 0. Therefore, the right-hand side of (5) is negative for N > 2, a contradiction. A second consequence, therefore, is the absence of finite-energy nonconstant solutions for V ≥ 0 and N > 2. Retaining the assumption V ≥ 0, one can draw a conclusion for N = 2, too. Indeed, it then follows that Epot = 0, so that φ must satisfy V (φ) = 0; moreover, the second variation (6) vanishes. For N = 1 the variation formulas (5), (6) have no useful consequences. Indeed, in two-dimensional space-time there do exist stable time-independent finite-energy solutions, as exemplified by the one-soliton and one-antisoliton solutions of the sine-Gordon theory. In applications of Derrick’s argument, one usually encounters positive potentials and invokes the latter consequences sketched above. Thus, it is used to the effect that for N ≥ 2, time-independent finite-energy solutions must be constant (the so-called vacuum solutions). Some caveats should be heeded, however. First, it is important to keep track of the above steps in models that are not of the above form, since the reasoning may need to be suitably modified. Second, even when this can be done at face value, it should be observed that the above argument, although convincing at first sight, is not a rigorous proof. Indeed, the scaling variation that is involved has a global character, whereas the Euler–Lagrange equation is derived by considering local variations. More in detail, one needs to control boundary terms that can a priori spoil the above derivation.
(This was already realized in Hobart (1963).) We exemplify these related issues with two models described by Lagrangians that are different from the above, namely a (special) Yang–Mills/Higgs model in physical space (N = 3) and a class of nonlinear σ -models for N ≥ 2. In the first setting, explicit static finite-energy solutions were obtained in Prasad & Sommerfield (1975) and Bogomolnyi (1976). (These are nowadays called BPS monopoles.) The energy of these solutions is manifestly not scale-invariant, contradicting (7) for the case at hand. Inspection of the solution shows that this is due to poor decay at spatial infinity; it entails that the pertinent boundary term cannot be ignored. Turning to O(3) σ -models, one can once more study the issue of finite-energy solutions by adapting Derrick’s scaling argument. For N = 2 (now viewed as Euclidean space-time) this yields no conclusion, since the energy is scale-invariant. In this case, the so-called
instanton and anti-instanton solutions do exist, and they are stable for topological reasons. For N > 2, the scaling argument leads to the absence of finite-energy solutions. In this particular setting, the heuristic reasoning can be corroborated. More specifically, the boundary term can be rigorously controlled. The pertinent result (Garber et al. (1979), Theorem 5.1) has later been used by differential geometers to prove the nonexistence of harmonic maps, which are closely related to the above type of solution.
SIMON RUIJSENAARS
See also Matter, nonlinear theory of; Skyrmions; Virial theorem; Yang–Mills theory
Further Reading
Bogomolnyi, E.B. 1976. The stability of classical solutions. Soviet Journal of Nuclear Physics, 24: 449–454
Derrick, G.H. 1964. Comments on nonlinear wave equations as models for elementary particles. Journal of Mathematical Physics, 5: 1252–1254
Garber, W.-D., Ruijsenaars, S.N.M., Seiler, E. & Burns, D. 1979. On finite-action solutions of the nonlinear σ-model. Annals of Physics, 119: 305–325
Hobart, R.H. 1963. On the instability of a class of unitary field models. Proceedings of the Physical Society, London, 82: 201–203
Prasad, M.K. & Sommerfield, C.M. 1975. Exact classical solution for the ’t Hooft monopole and the Julia-Zee dyon. Physical Review Letters, 35: 760–762
DETAILED BALANCE
This entry provides a qualitative discussion of equilibrium, a more quantitative discussion of principles such as detailed balance (which are needed in the description of equilibrium phenomena), and a brief presentation of the Einstein relation between mobility and diffusion, which can be related to the above topics.
The Problem of Time
One often says that a system has reached an equilibrium state if its physical variables are constant in time. Because of fluctuations that cannot be removed, however, it is better to regard the system as in equilibrium when there are no systematic trends in the time averages of its physical parameters. Here, averages are considered over all the microscopic constituents of the system, whether they are elementary particles, atoms, molecules, or larger objects. Equilibrium can be established among these constituents. Thus, a system that is in equilibrium cannot reveal the time variable among its broad characteristics. In other words, there is no way of telling which way time is running if one’s observations are confined to an equilibrium system. Formulated as a philosophical puzzle about the nature of time, this subject has spawned a library of books and papers, with little agreement among the authors (see Landsberg, 1982; Smith, 1993; Price, 1996; Davies, 1995).
Some Relevant Principles of Statistical Mechanics
Here and below, we shall deal with a number of important principles that are related to each other and that may or may not hold in any given case. To make these matters quantitative, denote by $P_i$ the probability of finding a system of interest in any one of the $i$th group of states, $G_i$ in number. The probability per unit time that a transition occurs from a state of group $i$ to a state of group $j$ is denoted by $A_{ij}$. The transition rate $i \to j$ can be written as

$$R_{ij} = P_i A_{ij} G_j. \tag{1}$$

If there are $W$ available groups of states, the time rate of change of $P_i$ is

$$\dot{P}_i = \sum_{l=1}^{W} (R_{li} - R_{il}) \qquad (i = 1, 2, \ldots, W). \tag{2}$$

To be tractable, the $A_{ij}$ have to be independent of time. The first term in the sum gives the transitions into states $i$ and the second term gives the transitions out of the states $i$. To simplify the picture one can replace a typical group of states $i$ by a single state, i.e., one can put $G_i = 1$. Now some additional general principles can be defined. The existence of the $A_{ij}$ can be deduced from quantum mechanical perturbation theory, but it is then valid only for a restricted time interval. One often finds the symmetry relation

$$A_{ij} = A_{ji} \qquad (\text{all } i, j) \tag{3}$$

as a result of the Hermitian character of the perturbation operator. In statistical mechanics, one also uses the principle of Equation (3). It can then be independent of perturbation theory and is regarded instead as resulting from adequate statistical assumptions. It is then called the principle of microscopic reversibility. Next we have the principle of detailed balance, which asserts that at a certain time $t$ the forward and reverse transition rates between two groups of states are equal; thus,

$$R_{ij} = R_{ji} \qquad (\text{all } i, j). \tag{4}$$

If Equation (4) holds, one sees that $\dot{P}_i$ vanishes for all $i$. In fact, we can define a steady state by

$$\dot{P}_i = 0 \qquad (\text{all } i). \tag{5}$$

Such a state need not be an equilibrium state since the system may, for example, be continuously raised to a high energy state by some external influence and then drop back continuously, for example, by the emission of radiation. Thus, one sees that Equation (4) implies Equation (5), but not conversely. For more details, see Lifschitz & Pitaewski (1981) and Landsberg (1991).

A Simple Example from the Solid State
One can use detailed balance arguments to infer the form of an unknown emission rate from a known absorption rate, as will now be shown by an example (Landsberg, 1991, p. 391). The idea is to obtain an expression for the equilibrium absorption rate per unit volume of photons of frequency $\nu_0$ in a semiconductor of refractive index $\mu$ and, hence, to infer spontaneous emission rates per unit volume. The probability of a single photon of vacuum wavelength $\lambda_0$ being absorbed per unit time per unit volume is

$$P(\lambda_0) = \frac{c\,\alpha(\lambda_0)}{V \mu(\lambda_0)} \qquad (LT^{-1} \cdot L^{-1} \cdot L^{-3}), \tag{6}$$

where the parentheses give the dimensions of the factors $c/\mu$, $\alpha$, and $1/V$. The dimensions are easily verified to be correct. To find the volume rate of excitation in the solid by photons in the vacuum wavelength range $d\lambda_0$, $P(\lambda_0)$ has to be multiplied by the number of relevant photon modes ($8\pi\mu^3\lambda_0^{-4} V\,d\lambda_0$), and also by their equilibrium occupation probability at temperature $T$:

$$\frac{1}{\exp(ch/\lambda_0 kT) - 1}. \tag{7}$$

But not all photons of wavelength $\lambda_0$ will, when absorbed, produce one electron-hole pair. We denote by $\alpha'(\lambda_0)/\alpha(\lambda_0)$ $(\le 1)$ the probability of this happening per absorbed photon. Hence, the equilibrium absorption rate (per unit volume) of photons in the wavelength range $d\lambda_0$ with production of an electron-hole pair is

$$\frac{\alpha'}{\alpha}\,\frac{8\pi\mu^{2}\alpha c\,\lambda_0^{-4}\,d\lambda_0}{\exp(ch/\lambda_0 kT) - 1} \quad \text{or} \quad \frac{8\pi\alpha'\mu^{2}(kT)^{3}}{h^{3}c^{2}}\,\frac{x^{2}\,dx}{\exp x - 1}. \tag{8}$$

Here $x = h\nu_0/kT$, and the second of these expressions is like the first, except that it is in terms of frequencies. According to detailed balance, the new inference is that these expressions give the rate per unit volume of spontaneous radiative recombination of electron-hole pairs with the emission of photons in the range $d\lambda_0$ or $d\nu_0$. Note that we have passed from absorption to emission data. This widely used result was first given by W. van Roosbroeck and W. Shockley in 1954. For other examples of the use of the principle of detailed balance in solid state physics, see Landsberg (1991).
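The equivalence of the two forms of Equation (8) can be checked numerically. The following is a minimal sketch, not part of the original entry: the function names and the input values (wavelength, temperature, refractive index, pair-producing absorption coefficient $\alpha'$) are illustrative assumptions, and the factor $(\alpha'/\alpha)\,\alpha$ of the first form is entered directly as $\alpha'$.

```python
import math

C = 2.99792458e8      # speed of light in vacuum, m/s
H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K

def rate_per_wavelength(lam0, dlam0, alpha_pair, mu, temperature):
    """First form of Equation (8): rate for the vacuum-wavelength range dlam0."""
    x = C * H / (lam0 * K_B * temperature)
    return 8 * math.pi * mu**2 * alpha_pair * C * lam0**-4 * dlam0 / math.expm1(x)

def rate_per_x(x, dx, alpha_pair, mu, temperature):
    """Second form of Equation (8), written in terms of x = h nu0 / kT."""
    prefactor = (8 * math.pi * alpha_pair * mu**2
                 * (K_B * temperature)**3 / (H**3 * C**2))
    return prefactor * x**2 * dx / math.expm1(x)

# Illustrative inputs only (assumed, not taken from the entry).
lam0 = 1.0e-6         # vacuum wavelength, m
temperature = 300.0   # K
mu = 3.5              # refractive index
alpha_pair = 1.0e4    # alpha' in 1/m

x = C * H / (lam0 * K_B * temperature)
dx = 1.0e-3
dlam0 = lam0 * dx / x  # |d lam0| = lam0 dx / x, since lam0 = c h / (x k T)

r_wavelength = rate_per_wavelength(lam0, dlam0, alpha_pair, mu, temperature)
r_x = rate_per_x(x, dx, alpha_pair, mu, temperature)
print(r_wavelength, r_x)
```

The two printed rates should agree to rounding error, because the interval widths are related by $|d\lambda_0| = \lambda_0\,dx/x$.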
The Einstein Relation
The Einstein relation is basic to solid-state physics and rests on the assumption that in a steady state the flux of charged particles due to an electric field must be balanced by diffusion of these particles induced by their density gradients. These two effects are due to well-known and simple forces. The first is a particle flux due to diffusion (with diffusion coefficient $D$, say). It can be written $-D\,dn/dx$ for one-dimensional motion, where $n$ is the density of particles and $dn/dx$ the gradient (“grad $n$” in three dimensions). The minus sign shows that the flux is directed to the left if the concentration $n$ increases to the right. The second force on the charged particles is due to a built-in or externally applied electric field $E$, which is a vector in three dimensions. Here we deal merely with the one-dimensional problem, and note that $E$ can be replaced by $-dV/dx$, where $V$ is the electrostatic potential at the point considered. The flux of particles can be written as $n\mu E = -n\mu\,dV/dx$, where $\mu$ is the so-called mobility of the particles. In order to obtain the Einstein relation in its simplest form, one has to balance the two fluxes,

$$-n\mu\frac{dV}{dx} = D\frac{dn}{dx}, \tag{9}$$

which implies that

$$\frac{d(\ln n)}{dx} = -\frac{\mu}{D}\frac{dV}{dx}, \tag{10}$$

giving the simple result

$$n = n_0 \exp(-\mu V/D). \tag{11}$$

We also know that the stationary state in an electric field at a temperature $T$ is governed by the Boltzmann distribution

$$n = n_0 \exp(-eV/kT), \tag{12}$$

where $n_0$ is a constant and $k$ is Boltzmann’s constant. Comparison yields the Einstein relation

$$\mu = eD/kT. \tag{13}$$
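Equations (9)–(13) can be illustrated with a short numerical sketch. It is not from the original entry: the function names, the mobility value, and the linear potential are assumptions chosen for illustration. The code solves Equation (13) for $D$ and then checks by finite differences that the profile of Equation (11) makes the drift and diffusion fluxes cancel.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C

def diffusion_coefficient(mobility, temperature):
    """Equation (13) solved for D: D = mu k T / e."""
    return mobility * K_B * temperature / E_CHARGE

def net_flux(x, mobility, diffusion, potential, n0=1.0, h=1e-6):
    """Drift flux plus diffusion flux for the profile n = n0 exp(-mu V / D).

    Derivatives are central finite differences with step h
    (a choice made for this sketch).
    """
    def density(s):
        return n0 * math.exp(-mobility * potential(s) / diffusion)

    dv_dx = (potential(x + h) - potential(x - h)) / (2 * h)
    dn_dx = (density(x + h) - density(x - h)) / (2 * h)
    drift = -density(x) * mobility * dv_dx   # n mu E with E = -dV/dx
    diffusive = -diffusion * dn_dx           # -D dn/dx
    return drift + diffusive

def linear_potential(x):
    """Hypothetical potential V(x) = 0.05 x (volts, x in metres)."""
    return 0.05 * x

mobility = 0.135                             # m^2/(V s), illustrative value
D = diffusion_coefficient(mobility, 300.0)   # m^2/s
residual = net_flux(0.5, mobility, D, linear_potential)
print(D, residual)
```

The residual flux should vanish up to discretization and rounding error, which is the balance expressed by Equation (9).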
The Einstein relation (13) connects the mobility of charged particles in a field with their diffusion coefficient. At first sight this seems unexpected because one side of the equation describes the response to an electric field, whereas the other deals with the mechanical characteristic of diffusion. The extension to three dimensions is not the only generalization that can be made. For example, a similar Einstein relation holds for thermal current density, and generalizations have also been made for large departures from equilibrium (Landsberg, 1991). A further variety of special cases arises for different assumptions about the shape of the semiconductor bands that can occur; for example, they can be degenerate or nondegenerate, parabolic or nonparabolic, etc., and the results can be given in a table of formulae. (Einstein’s paper was published in Annalen der Physik und Chemie in 1905, one of three important papers published by him in that year.)
The principle of detailed balance emerged somewhat hesitantly in the 1920s, based on Einstein’s 1917 paper on transition probabilities. It was named by Fowler and Milne following other authors and other names. A brief historical survey is given by ter Haar (1955).
PETER LANDSBERG
See also Diffusion; Stochastic processes
Further Reading
Coveney, P. & Highfield, R. 1991. The Arrow of Time, London: Allen, 1990 and New York: Fawcett Columbine
Davies, P. 1995. About Time: Einstein’s Unfinished Revolution, New York: Simon and Schuster
Einstein, A. 1905. Die von der molekularkinetischen Theorie der Wärme geforderte Bewegung. Annalen der Physik und Chemie, 17: 549–560
Landsberg, P.T. (editor). 1982. The Enigma of Time, Bristol: Adam Hilger
Landsberg, P.T. 1991. Thermodynamics and Statistical Mechanics, Oxford and New York: Oxford University Press
Landsberg, P.T. 1991. Recombination in Semiconductors, Cambridge and New York: Cambridge University Press
Lifschitz, E.M. & Pitaewski, L.P. 1981. Physical Kinetics, Oxford and New York: Pergamon Press
Price, H. 1996. Time’s Arrow and Archimedes’ Point, Oxford and New York: Oxford University Press
Smith, Q. 1993. Language and Time, Oxford and New York: Oxford University Press
ter Haar, D. 1955. Foundations of statistical mechanics. Reviews of Modern Physics, 27: 289
DETERMINISM
Determinism is a philosophical and scientific notion, and discussions about it are as old as philosophy and science themselves. Richard Taylor writes: “Determinism is the general philosophical thesis which states that for everything that ever happens there are conditions such that, given them, nothing else could happen” (Taylor, 1996). This seems to be the most general formulation of determinism. In philosophy, he continues, “There are five theories of determinism to be considered, which can for convenience be called ethical determinism, logical determinism, theological determinism, physical determinism, and psychological determinism.” Here we shall confine ourselves only to physical determinism in the natural sciences, except in the concluding section.
In physics, the deterministic view developed along with the experimental approach to research, in the sense that phenomena are reproducible under the same unchanged external conditions, implying that the same cause leads to the same consequences under the same conditions. The quantitative description of physical reality began with Galileo Galilei, although some early developments are due to Pythagoras. However, Isaac Newton was the first to lay down the complete basis of classical mechanics, which at the time was considered to be the origin of all physical phenomena. His laws of mechanics plus the law of gravitation enabled him to reproduce and mathematically derive the motion of the planets, observations of which were empirically well known by the beginning of the 17th century and formulated in Johannes Kepler’s laws of celestial mechanics.
With the rise and development of classical mechanics the view of determinism developed, with the opinion that all natural laws can be described by dynamical equations, either ordinary differential equations (as, for example, in celestial mechanics) or partial differential equations (as, for example, in the dynamics of fluids). In each case precise knowledge of the initial conditions (all positions and all velocities) completely determines the entire future and entire past of the system. When pushed to its extreme, this view implies complete deterministic evolution of the entire universe, including all its smallest and largest details. The French mathematician Pierre Simon de Laplace, about one century after Newton, wrote (in an often quoted passage):
We ought then to regard the present state of the universe as the effect of its antecedent state and the cause of the state that is to follow. An intelligence knowing, in any instant of time, all forces acting in nature, as well as the momentary positions of all things of which the universe consists, would be able to comprehend the motions of the largest bodies in the world and those of the smallest atoms in one single formula, provided it were sufficiently powerful to subject all data to analysis: to it, nothing would be uncertain, both future and past would be present before its eyes. (Laplace, 1814)
We can comment on Laplace’s statement from our modern perspective. First, to store and process data of infinite precision about the state of the entire universe is problematic, as it would require a computer that would be of comparable size and complexity to the entire universe. Thus, its presence has to be taken into account, since, obeying the same mechanical laws as the rest of the universe, it would itself disturb the universe. Therefore, we can conclude that Laplace’s “intelligence” (sometimes known as Laplace’s daemon) cannot exist, and consequently his idea is fiction.
Second, infinite precision of all the initial conditions (positions and momenta) can never be achieved in practice. And when the precision is finite, the existence of chaos (positive Lyapunov exponents) implies sensitive dependence on initial conditions and exponential divergence of nearby trajectories. In other words, a finite time horizon (the Lyapunov time) exists in general chaotic mechanical systems, beyond which nothing at all can be predicted. Therefore, the modern notion of omnipresent chaotic behavior makes Laplace’s idea impossible to implement, even in principle.
Third, the universe is not described by classical mechanics, but by quantum mechanics, classical mechanics being just a useful or even excellent approximation in observing and describing the motions of sufficiently large bodies. Quantum mechanics tells us, through Heisenberg’s principle of uncertainty, that momenta and positions cannot be measured simultaneously with infinite precision, but we have instead the inequality $\Delta x\,\Delta p_x \ge \hbar/2$, where $\Delta x$ and $\Delta p_x$ are the uncertainties of position $x$ and the conjugate momentum $p_x$. So, Laplace’s initial conditions can never be known to arbitrary precision, even in principle. Quantum mechanics is the correct description of physical reality, with the Schrödinger equation as the starting tool, for nonrelativistic systems. The quantum theory has been further developed by Paul Dirac for relativistic quantum systems and by quantum field theory, up to the unifying field theories, which capture three fundamental interactions (electromagnetic, weak, and strong interactions), but not yet gravity.
The Schrödinger equation is a deterministic equation of motion of the wave function ψ, which contains the complete description of the quantum state of a given system. Importantly, ψ itself is a statistical quantity and thus not deterministic: it gives merely probabilities for the given system to be found (by measurement) in a given state. This is the so-called Copenhagen interpretation of quantum mechanics, initiated by Max Born in 1926 and further developed by Niels Bohr and his colleagues, according to whom there is no determinism in physical reality. This view was strongly opposed by Albert Einstein and colleagues, who accepted the quantum theory as correct but thought that it was an incomplete theory, to be supplemented (through future research) by a more general deterministic theory, uncovering further “hidden variables,” which seem to be ignored in present-day quantum mechanics. Many attempts have been made to find such a classical theory of fields from which to deduce the quantum theory, but without success.
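The second objection above, the finite prediction horizon set by a positive Lyapunov exponent, can be made concrete with a small sketch. It is not part of the original entry: the logistic map $x \mapsto 4x(1-x)$ (whose Lyapunov exponent is $\ln 2$), the function names, and the numerical tolerances are assumptions chosen for illustration. An initial error $\delta_0$ growing like $\delta_0 e^{\lambda t}$ reaches a tolerance $\Delta$ after roughly $t \approx \lambda^{-1}\ln(\Delta/\delta_0)$ steps.

```python
import math

def logistic(x):
    """Fully chaotic logistic map x -> 4x(1 - x); its Lyapunov exponent is ln 2."""
    return 4.0 * x * (1.0 - x)

def predicted_horizon(delta0, tolerance, lyapunov_exponent):
    """Steps after which an error delta0, growing as exp(lambda t), reaches tolerance."""
    return math.log(tolerance / delta0) / lyapunov_exponent

def observed_horizon(x0, delta0, tolerance, max_steps=1000):
    """Iterate two nearby initial conditions until they differ by more than tolerance."""
    x, y = x0, x0 + delta0
    for step in range(1, max_steps + 1):
        x, y = logistic(x), logistic(y)
        if abs(x - y) > tolerance:
            return step
    return None  # no divergence detected within max_steps

estimate = predicted_horizon(1e-12, 0.1, math.log(2.0))
observed = observed_horizon(0.3, 1e-12, 0.1)
print(f"estimated horizon: {estimate:.1f} steps, observed: {observed} steps")
```

Improving the initial precision by a factor of 1000 lengthens the estimated horizon by only about ten steps, which is why finite precision limits prediction so severely in chaotic systems.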
There are also certain predictions, such as Bell’s inequalities, that are the testing ground of whether quantum theory can in principle be an extended classical deterministic field theory. So far the answer is no, at least for a large class of “local hidden variables theories,” and today we do have experimental confirmations in which Bell’s inequalities are violated, meaning that the quantum theory and its predictions for the outcome of such experiments are correct. Therefore, the statistical interpretation of quantum mechanics of Bohr’s Copenhagen school, together with the strongly counter-intuitive notion of nonlocality, is proven to be correct, and these nondeterministic properties of quantum mechanics are being used in technological applications (such as quantum information theory, quantum teleportation, and quantum computing). It is, of course, a philosophical shock to learn that the world is not deterministic, but there seems to be no way out. One of the main causes is the process of quantum measurement, which as a process is not described by the Schrödinger equation and seems to be the primary source of quantum indeterminism. Quantum measurement is the main source for the generally accepted statistical interpretation of quantum mechanics. Still, the potential of a classical nonlinear field theory (including its turbulent solutions) seems largely unexplored as a description of physical reality. Even classical nonlinear dynamics is not deterministic (even in principle) due to the existence of chaos. A nonlinear classical field theory is even richer, for example the complex turbulent solutions of the Navier–Stokes equations.
In a deterministic world, there would be no place for free will in the lives of human beings or other living creatures. Everything would be predetermined by the initial state before our life, even if we do not have information about that, which implies that we cannot be aware of our predestination. Since the world is not deterministic, there is room for free will and free choice. However, it might be that the world is deterministic if we do not observe it, and is not deterministic as soon as we “touch” it. Therefore, determinism can never be proved (in analogy with Kurt Gödel’s famous incompleteness theorem). Thus, our free will may materialize as soon as we interact with the world; otherwise we would be completely predestined, but isolated from the rest of the world, which is of course not possible. The issue of classical and quantum measurement lies at the bottom of such discussions. It leads to the general conclusion that the world ultimately is not deterministic, but determinism might be a good approximation under certain conditions imposed on the measurement process.
MARKO ROBNIK
See also Butterfly effect; Causality; Chaotic dynamics; Lyapunov exponents; Quantum theory; Turbulence
Further Reading
Belavkin, V.P. 2002. Quantum causality, stochastics, trajectories and information. Reports on Progress in Physics, 65: 353–420
Bell, J.S. 1987. Speakable and Unspeakable in Quantum Mechanics, Cambridge and New York: Cambridge University Press
Condon, E.U. 1980. Mechanics, quantum. In The New Encyclopaedia Britannica: Macropaedia, 15th edition, vol. 11, Chicago: Encyclopaedia Britannica: 793
Goetz, P.W. (editor). 1980. Determinism. In The New Encyclopaedia Britannica: Micropaedia, 15th edition, vol. III, Chicago: Encyclopaedia Britannica: 494
Laplace, P.S. 1814. Essai philosophique sur les probabilités, Paris: Courcier, 1814; as Philosophical Essay on Probabilities, Berlin and New York: Springer, 1995
Peres, A. 1995. Quantum Theory: Concepts and Methods, Dordrecht: Kluwer
Taylor, R. 1996. Determinism. In The Encyclopedia of Philosophy, vol. 2, edited by Paul Edwards, New York: Macmillan and London: Simon & Schuster, 359
Wheeler, J.A. & Zurek, W.H. (editors). 1983. Quantum Theory and Measurement, Princeton, NJ: Princeton University Press
DETERMINISTIC WALKS IN RANDOM ENVIRONMENTS
A “deterministic walk in a random environment” (DWRE) is the name given to a system generated by the motion of some object (such as a particle, signal, wave, ant, or read/write head of a Turing machine) on a graph. At each time step, the object hops from a vertex to one of its neighboring vertices. The choice of neighbor is completely determined by the type of deterministic scattering rule, or scatterer, located at the vertex. A random environment is formed by the scatterers, which are assumed to be initially randomly (usually independently) distributed among the vertices. DWREs (in their simplest form and under different names) were introduced in various branches of science (Gunn & Ortuño, 1985; Langton, 1986; Ruijgrok & Cohen, 1988) as paradigms, for example, for propagation of a signal in a random medium, evolutionary dynamics, growth processes, and the computational environment. In the early numerical studies, graphs were regular lattices and usually two types of scatterers were considered in each model. The most studied case was that of the square lattice with left and right rotators, which rotate the particle to the left or to the right by an angle π/2, or left and right mirrors aligned along the two diagonals of the lattice. Two classes of such models have been extensively studied numerically (Cohen, 1992). The first class corresponds to the case when there is no feedback of the moving particle to the environment; that is, a particular type of scatterer is fixed at each site of the lattice forever. Another class is formed by models with flipping scatterers, in which the scatterer at a site changes (deterministically) after every visit of a particle to this site. In statistical physics, these models naturally appear as deterministic Lorentz lattice gases (but with a random distribution of scatterers). The scatterers are not spheres (disks) as in the classical Lorentz gas.
Instead, say in the d-dimensional cubic lattice, there are $(2d)^{2d}$ different types of scatterers because each vertex in this case has 2d incoming and 2d outgoing edges. In theoretical computer science, these models are referred to as many-dimensional Turing machines because the changes of scatterer type at each vertex occur deterministically, according to some program written on an infinite tape divided into commands, for example, to change a given scatterer to some other type (Bunimovich & Khlabystova, 2002a). Although DWREs are reminiscent of random walks, these systems are essentially different. The major difference from random walks is that instead of carrying out a random experiment (like flipping a coin), the particle chooses each step deterministically. Formally, DWREs are deterministic cellular automata, but their behavior reflects a mixture of deterministic dynamics and a random environment. Their dynamics is often counterintuitive (Cohen, 1992) because one’s intuition is essentially based on exactly understood (completely solved) systems and models. There are many such models among purely deterministic and purely stochastic systems; however, there were basically no completely understood systems with a mixture of deterministic and stochastic features. Some subclasses of DWREs provide such exactly solvable models (Bunimovich, 2000). The DWREs closest to stochastic systems are those with fixed environments. This seems counterintuitive, but the evolution of scattering types makes the entire dynamics more deterministic than in the case where an (initially random) distribution of scatterers is frozen. In many cases, DWRE systems are equivalent to various models from percolation theory (Bunimovich & Troubetzkoy, 1992). Not only the structure of the graph (lattice) but also the types of scatterer in the model determine the corresponding percolation problem. For instance, the mirror model on the square lattice is reduced to the percolation problem on the square lattice, while the rotator model is reduced to the percolation problem on some nonplanar graph (Bunimovich & Troubetzkoy, 1992). Perhaps the most widely known DWRE models are Langton’s Ant (Langton, 1986) and the flipping rotators model on the square lattice (Ruijgrok & Cohen, 1988), which are again solvable, with rather counterintuitive results (Bunimovich & Troubetzkoy, 1993). If all vertices are occupied with rotators, then all orbits (particle’s paths) are unbounded. If, on the other hand, one allows vertices to be empty with positive probability (i.e., the third, straight-ahead scatterer is allowed), then the particle’s path becomes bounded with probability one.
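The flipping rotators model on the square lattice can be sketched in a few lines. This is a minimal illustration, not the cited authors' code: the function name, the heading encoding, and the lazy sampling of the environment are assumptions of the sketch. Each rotator is drawn independently the first time its vertex is visited, which is equivalent to drawing the whole environment in advance; with `p_left=0.0` every vertex starts as a right rotator.

```python
import random

# Unit steps for headings 0..3 = east, north, west, south (counterclockwise order).
STEPS = [(1, 0), (0, 1), (-1, 0), (0, -1)]

def flipping_rotator_walk(n_steps, p_left=0.5, seed=0):
    """Deterministic walk among flipping left/right rotators on the square lattice.

    Each vertex carries a rotator, +1 (turn left by pi/2) or -1 (turn right),
    drawn independently on the first visit; p_left is the probability of a
    left rotator. The rotator flips after every visit. Returns the visited
    positions, starting at the origin with heading east.
    """
    rng = random.Random(seed)
    rotators = {}                  # vertex -> +1 or -1, filled in lazily
    x, y, heading = 0, 0, 0
    path = [(x, y)]
    for _ in range(n_steps):
        if (x, y) not in rotators:
            rotators[(x, y)] = 1 if rng.random() < p_left else -1
        turn = rotators[(x, y)]
        heading = (heading + turn) % 4   # deterministic scattering at the vertex
        rotators[(x, y)] = -turn         # flipping environment
        dx, dy = STEPS[heading]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

path = flipping_rotator_walk(2000, p_left=0.5, seed=1)
extent = max(max(abs(px), abs(py)) for px, py in path)
print(f"visited {len(set(path))} distinct vertices, maximal excursion {extent}")
```

Once the initial environment is fixed (here by the seed), the walk involves no further randomness: rerunning with the same seed reproduces the same path.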
The results, both numerical and mathematical, continued to surprise until “Walks in Rigid Environments” (WRE) were introduced and analyzed (Bunimovich, 2000). WREs employ a new integer parameter r, 1 ≤ r ≤ ∞, which is called the rigidity of the environment. The rigidity determines how many times the particle must collide with the given scatterer in order to change its type. In other words, the scatterer at a given vertex changes its type immediately after the rth visit of the particle to this site. Therefore, WREs interpolate between DWREs with fixed environments (where r = ∞) and DWREs with flipping environments (where r = 1). WREs on a one-dimensional lattice Z are completely solved (Bunimovich, 2000; Bunimovich & Khlabystova, 2002b). In this case, there are only four types of scatterers. Two of them (forward scatterer and backscatterer) are symmetric with respect to the reflection of Z, which is the only nontrivial symmetry of the one-dimensional lattice. The other two scatterers, which always send the particle to the right (or to the left), do not respect this symmetry. Therefore, the WRE with the last two types of scatterers has the same behavior for all values of the rigidity r. This model demonstrates a diffusive type of behavior, in which the particle eventually visits all vertices and the mean square displacement of the particle is proportional to t. By contrast, WREs with forward and back scatterers demonstrate totally different behavior depending on the parity of the rigidity r. For even rigidities, the particle again eventually visits all vertices, but its motion is subdiffusive. The most interesting behavior occurs for odd values of the rigidity. In this case the particle, after a short initial period of seemingly irregular motion near the origin, starts to propagate in one direction with random velocity. This phenomenon of (eventual) propagation is reminiscent of “gliders” in Conway’s Game of Life. However, in a WRE this propagation occurs for all initial configurations of environment, while in the Game of Life, gliders appear only as very special solutions. The phenomenon of eventual propagation in one direction is not restricted to one-dimensional WREs. For instance, the same behavior is demonstrated by the model with right and left rotators on the triangular lattice (Grosfils et al., 1999). If the rigidity r
ξT , where the nonlinearity vanishes. As a result, the wave function ψ reads near the collapse point:

$$\psi(r, t) = \frac{e^{i\lambda \int_0^t du/a^2(u)}}{a(t)} \times \phi_c\!\left(\frac{r}{a}, \varepsilon\right) e^{-i\beta r^2/4a^2}$$
0≤r 3 (Kosmatov et al., 1991).
So far, the discussion has remained within the realm of the one-wave component NLS equation with a cubic nonlinearity. It is thus worth underlining the following.
• The previous results can be generalized to a power-law nonlinearity, when the cubic term $|\psi|^2\psi$ of Equation (1) is replaced by $|\psi|^{2n}\psi$ with n > 1 (Rasmussen & Rypdal, 1986; Bergé, 1998). Solutions with D = 2/n follow the route of a strong collapse, while solutions defined for D > 2/n collapse weakly. Superstrong collapses apply to the dimensional configurations D > 2 + 1/n.
• Several NLS equations coupled through their cubic nonlinearities often serve to model the self- and cross-interactions of multiple wave components (or different polarizations) in vector systems. Such systems promote blow-up phenomena, which can be examined by means of the above analytical tools (Bergé, 2001).
• Blow-up may take place in solutions of PDEs other than the NLS equation. For example, investigations of the solutions to the generalized D-dimensional Korteweg–de Vries (KdV) equation

$$q_t + q^n q_x + (\nabla^2 q)_x = 0 \tag{8}$$

suggest that, whereas no collapse occurs for values of the product nD < 4, collapsing states can arise and adopt a self-similar shape provided that nD ≥ 4 (Blaha et al., 1989). The mathematical proof for this statement is presently incomplete.
LUC BERGÉ
See also Filamentation; Kerr effect; Nonlinear Schrödinger equations; Virial theorem
Further Reading
Bergé, L. 1998. Wave collapse in physics: Principles and applications to light and plasma waves. Physics Reports, 303: 259–370
Bergé, L. 2001. Nonlinear wave collapse. In Spatial Solitons, edited by S. Trillo & W. Torruellas, Berlin: Springer, pp. 247–267
Blaha, R., Laedke, E.W. & Spatschek, K.H. 1989. Collapsing states of generalized Korteweg–de Vries equations. Physica D, 40: 249–264
Glassey, R.T. 1977. On the blowing-up of solutions to the Cauchy problem for nonlinear Schrödinger equations. Journal of Mathematical Physics, 18: 1794–1797
Kelley, P.L. 1965. Self-focusing of optical beams. Physical Review Letters, 15: 1005–1008
Kosmatov, N.E., Shvets, V.F. & Zakharov, V.E. 1991. Computer simulation of wave collapses in the nonlinear Schrödinger equation. Physica D, 52: 16–35
Kuznetsov, E.A., Rasmussen, J., Rypdal, K. & Turitsyn, S.K. 1995. Sharper criteria for the wave collapse. Physica D, 87: 273–284
Rasmussen, J. & Rypdal, K. 1986. Blow-up in nonlinear Schrödinger equations-I: a general review. Physica Scripta, 33: 481–497
Weinstein, M.I. 1983. Nonlinear Schrödinger equations and sharp interpolation estimates. Communications in Mathematical Physics, 87: 567–576
Zakharov, V.E. & Kuznetsov, E.A. 1986. Quasiclassical theory of three-dimensional wave collapse. Zhurnal Eksperimental’noi i Teoreticheskoi Fiziki (USSR JETP), 91: 1310–1324 [Trans. in Soviet Physics JETP, 64: 773–780]
DIFFERENTIAL GEOMETRY
chart. If the intersection $U_{\alpha\beta}$ of two domains $U_\alpha$ and $U_\beta$ is nonempty, then the change of coordinates $\phi_\alpha \circ \phi_\beta^{-1}$ is a continuous map from $\phi_\beta(U_{\alpha\beta})$ to $\phi_\alpha(U_{\alpha\beta})$ with a continuous inverse. A differentiable manifold is a topological manifold $M^n$ equipped with charts such that, on the overlaps, $\phi_\alpha \circ \phi_\beta^{-1}$ is differentiable with a differentiable inverse. (Here, “differentiable” or “smooth” functions are those whose partial derivatives exist and are continuous to all orders; however, for $C^k$ manifolds, the changes of coordinates are only required to be differentiable up to order k.) If $x^1, \ldots, x^n$ and $\bar{x}^1, \ldots, \bar{x}^n$ are the coordinate functions for two overlapping charts, then the Jacobian determinant ||∂