High Dielectric Constant Materials: VLSI MOSFET Applications (Springer Series in Advanced Microelectronics)


Pages 723 Page size 336 x 532.8 pts Year 2005


Springer Series in Advanced Microelectronics 16

Series Editors: K. Itoh, T. Lee, T. Sakurai, W. M. C. Sansen, D. Schmitt-Landsiedel

The Springer Series in Advanced Microelectronics provides systematic information on all the topics relevant for the design, processing, and manufacturing of microelectronic devices. The books, each prepared by leading researchers or engineers in their fields, cover the basic and advanced aspects of topics such as wafer processing, materials, device design, device technologies, circuit design, VLSI implementation, and subsystem technology. The series forms a bridge between physics and engineering and the volumes will appeal to practicing engineers as well as research scientists.

1 Cellular Neural Networks: Chaos, Complexity and VLSI Processing. By G. Manganaro, P. Arena, and L. Fortuna
2 Technology of Integrated Circuits. By D. Widmann, H. Mader, and H. Friedrich
3 Ferroelectric Memories. By J.F. Scott
4 Microwave Resonators and Filters for Wireless Communication: Theory, Design and Application. By M. Makimoto and S. Yamashita
5 VLSI Memory Chip Design. By K. Itoh
6 Smart Power ICs: Technologies and Applications. Ed. by B. Murari, R. Bertotti, and G.A. Vignola
7 Noise in Semiconductor Devices: Modeling and Simulation. By F. Bonani and G. Ghione
8 Logic Synthesis for Asynchronous Controllers and Interfaces. By J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno, and A. Yakovlev
9 Low Dielectric Constant Materials for IC Applications. Editors: P.S. Ho, J. Leu, W.W. Lee
10 Lock-in Thermography: Basics and Use for Functional Diagnostics of Electronic Components. By O. Breitenstein and M. Langenkamp
11 High-Frequency Bipolar Transistors: Physics, Modelling, Applications. By M. Reisch
12 Current Sense Amplifiers for Embedded SRAM in High-Performance System-on-a-Chip Designs. By B. Wicht
13 Silicon Optoelectronic Integrated Circuits. By H. Zimmermann
14 Integrated CMOS Circuits for Optical Communications. By M. Ingels and M. Steyaert
15 Gettering Defects in Semiconductors. By V.A. Perevostchikov and V.D. Skoupov
16 High Dielectric Constant Materials: VLSI MOSFET Applications. Editors: H.R. Huff and D.C. Gilmer
17 System-level Test and Validation of Hardware/Software Systems. By M. Sonza Reorda, Z. Peng, and M. Violante

H.R. Huff, D.C. Gilmer (Eds.)

High Dielectric Constant Materials: VLSI MOSFET Applications

With 363 Figures and 31 Tables

Dr. H.R. Huff
International SEMATECH, 2706 Montopolis Drive, Austin, TX 78741, USA
E-mail: [email protected]

Dr. D.C. Gilmer
Motorola, 3501 Ed Bluestein Boulevard, MD-K10, Austin, TX 78721, USA
E-mail: [email protected]

Series Editors:

Dr. Kiyoo Itoh
Hitachi Ltd., Central Research Laboratory, 1-280 Higashi-Koigakubo, Kokubunji-shi, Tokyo 185-8601, Japan

Professor Thomas Lee
Stanford University, Department of Electrical Engineering, 420 Via Palou Mall, CIS-205, Stanford, CA 94305-4070, USA

Professor Takayasu Sakurai
Center for Collaborative Research, University of Tokyo, 7-22-1 Roppongi, Minato-ku, Tokyo 106-8558, Japan

Professor Willy M. C. Sansen
Katholieke Universiteit Leuven, ESAT-MICAS, Kasteelpark Arenberg 10, 3001 Leuven, Belgium

Professor Doris Schmitt-Landsiedel
Technische Universität München, Lehrstuhl für Technische Elektronik, Theresienstrasse 90, Gebäude N3, 80290 München, Germany

ISSN 1437-0387
ISBN 3-540-21081-4 Springer Berlin Heidelberg New York
Library of Congress Control Number: 2004105869

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media. springeronline.com
© Springer-Verlag Berlin Heidelberg 2005
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Data conversion by PTP-Berlin Protago-TeX-Production GmbH using a Springer LaTeX macro package
Final-Processing: PTP-Berlin Protago-TeX-Production GmbH, Germany
Cover concept by eStudio Calmar Steinen using a background picture from Photo Studio “SONO”. Courtesy of Mr. Yukio Sono, 3-18-4 Uchi-Kanda, Chiyoda-ku, Tokyo
Cover design: design & production GmbH, Heidelberg
Printed on acid-free paper

SPIN: 10827381

57/3141/Yu - 5 4 3 2 1 0

Preface

Although the integrated circuit (IC) was invented in 1958 by Jack Kilby of Texas Instruments (mesa process with germanium) and Bob Noyce of Fairchild (planar process with silicon), it was not until the mid-to-late 1960s that the bipolar IC and then, in the early 1970s, the MOSFET IC significantly entered the production scene. Pat Haggerty’s vision at Texas Instruments in the early 1960s of the pervasiveness of the microelectronics revolution, together with the concepts of the learning curve and market elasticity, was of immeasurable significance to the fledgling IC industry. Concurrently, Gordon Moore’s 1965 projection at Fairchild of the memory bits per chip with time, namely that the number of transistors per IC doubles every year (updated by Moore at Intel in 1975 to about 18 months and subsequently re-assessed in 1995), gave the industry confidence that a viable market indeed existed. These business-oriented issues, coupled with Bob Dennard’s one-transistor/one-capacitor dynamic random access memory (DRAM) cell at IBM in 1968 and the related scaling methodology, established the basis for the growth of the MOSFET industry for the next thirty-plus years. The IC industry sustained the above scaling methodology (critical feature reduction), in conjunction with the introduction of complementary metal oxide semiconductor (CMOS) technology and appropriate improvements in layout design, through the end of the last century. We have now reached the point, however, where conventional scaling with silicon dioxide (SiO2) and subsequently silicon oxynitride (SiOxNy) has essentially reached the tolerable power limit for high-performance MOSFET ICs and the tolerable stand-by (direct tunneling) leakage current for low-power applications such as cell phones and notebook computers. An alternative to the conventional silicon oxynitride gate dielectric (which typically is in the range of 0.8–1.5 nm physical thickness) is required to obviate the degradation in IC performance.
This monograph presents a perspective on the gate dielectric and the approaches in progress to rectify the above-noted issues: increasing the physical thickness of the gate dielectric to significantly reduce the power and direct-tunneling current while enabling continued reduction of the electrically active gate dielectric thickness by using high dielectric constant (high-k) materials. The high-k materials facilitate both an increased physical thickness and a reduction in the electrical thickness to maintain the requisite scaling methodology. The latter is generally referred to as the equivalent
oxide thickness (EOT). This monograph, however, is envisioned to be more than just a current view of these alternative high-k gate dielectric approaches. Rather, both previous and present directions related to scaling the gate dielectric and their impact, along with the creative directions and future challenges defining the direction of high-k gate dielectric scaling methodology, will be reviewed.

The monograph is introduced by a comprehensive review of Moore’s law by Dan Hutcheson and then is divided into four parts. The first part reviews the classical regime of SiO2, including a brief historical note by the late Else Kooi, kindly re-assessed by Albert Schmitz. Gene Irene then presents a comprehensive review of SiO2-based MOSFETs, followed by Robin Degraeve’s assessment of SiO2 reliability methodologies.

Part 2 describes the transition to silicon oxynitrides as the gate dielectric. Shih-Hsien Lo and Yuan Taur review the gate dielectric scaling methodologies in the transition from 2.0 towards 1.0 nm as regards their implications for device performance. Thomas Skotnicki and Frederic Boeuf then review “optimal” device scaling methodologies for SiO2 and silicon oxynitride by reminding us that we neither necessarily desire nor are required to scale every feature to its limit for a given technology generation as defined in the International Technology Roadmap for Semiconductors (ITRS). Finally, Hsing-Huang Tseng reviews the state-of-the-art for silicon oxynitride to reduce both gate leakage and boron penetration from a boron-doped polysilicon gate electrode.

Part 3, with eleven articles, addresses the transition to high-k gate dielectrics with an EOT less than 1 nm. Initially, Jon-Paul Maria discusses a variety of criteria for selecting alternative high-k gate dielectrics. Bob Wallace and Glen Wilk then discuss a host of materials issues for high-k gate dielectric selection and their integration with planar CMOS processes. Gregory Parsons reviews the role of interface composition and structure for the high-k gate dielectrics. In a complementary fashion, Gerry Lucovsky and Jerry Whitten discuss the electronic structure of a variety of high-k gate dielectrics and correlations amongst them. Andrei Istratov and Eicke Weber then review a host of physical–chemical properties of selected 4d, 5d and rare-earth metals in silicon. Part 3 continues with a review of the deposition of high-k films by Jane Chang followed by Veena Misra’s review of metal gate electrode materials compatible with the high-k gate dielectrics. Luigi Colombo, Antonio Rotondaro, Mark Visokay and James Chambers discuss CMOS IC fabrication issues for both the high-k gate dielectric and metal gate electrode materials. Alain Diebold and William Chism discuss the characterization and metrology of these high-k gate dielectric materials followed by George Brown’s review of electrical measurements for the high-k gate films. Finally, Yang Yu Fan, Sivakumar Mudanai, Wanqiang Chen, Leonard Register and Sanjay Banerjee discuss the utilization of the high-k materials in IC design and related issues.
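
The relation between physical thickness and EOT invoked in this preface can be made concrete with a short calculation. The sketch below assumes the commonly used definition EOT = t_phys × (k_SiO2 / k_high-k), with k_SiO2 taken as about 3.9. The function name, the 4 nm film thickness, and the value k = 20 are hypothetical choices for illustration only and are not figures from the monograph.

```python
K_SIO2 = 3.9  # relative permittivity of SiO2 (commonly quoted value, assumed here)


def equivalent_oxide_thickness(t_phys_nm: float, k: float) -> float:
    """Return the SiO2 thickness (nm) that gives the same capacitance per unit
    area as a film of physical thickness t_phys_nm and relative permittivity k."""
    if t_phys_nm <= 0 or k <= 0:
        raise ValueError("thickness and permittivity must be positive")
    return t_phys_nm * K_SIO2 / k


# Hypothetical example: a 4 nm film with k = 20 (illustrative values only).
eot = equivalent_oxide_thickness(4.0, 20.0)
print(f"EOT = {eot:.2f} nm")
```

Under these assumed numbers, a physically 4 nm thick film behaves electrically like roughly 0.78 nm of SiO2, which is the sense in which a high-k film can be physically thicker yet electrically thinner.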

Part 4 addresses future directions for advanced technology generations. Fred Walker and Rodney McKee discuss high-k crystalline gate dielectrics from a research perspective. Ravi Droopad, Kurt Eisenbeiser and Alex Demkov then discuss the utilization of high-k crystalline gate dielectrics from an IC manufacturer’s perspective. Finally, a roll-up of the high-k gate dielectrics with a host of related device alternatives is discussed for advanced MOSFET devices by Jeff Bokor, Tsu-Jae King, Jack Hergenrother, Jeff Bude, Dave Muller, Thomas Skotnicki, Stephan Monfray and Greg Timp.

The challenge of high-k gate dielectrics is one of the most critical issues in the evolving ITRS. The vision, experience and wisdom the authors have summarized will help ensure that this monograph is a timely, relevant, interesting and resourceful book, focusing on both the fundamentals and the evolving directions needed for the successful integration of high-k gate dielectrics and metal gate electrodes in future ICs as envisioned in the ITRS.

Austin, April 2004

Howard R. Huff
David C. Gilmer

Contents

1 The Economic Implications of Moore’s Law
G.D. Hutcheson
  1.1 Introduction
  1.2 Moore’s Law: A Description
  1.3 The History of Moore’s Law
  1.4 The Microeconomics of Moore’s Law
  1.5 The Macroeconomics of Moore’s Law
  1.6 Moore’s Law Meets Moore’s Wall: What is Likely to Happen
  1.7 Conclusion
  1.8 Appendix A
  References

Part I Classical Regime for SiO2

2 Brief Notes on the History of Gate Dielectrics in MOS Devices
E. Kooi, A. Schmitz
  2.1 Early Attempts to Make Insulating-Gate Field-Effect Transistors; Surface States
  2.2 Passivation of Silicon Surfaces by Thermal Oxidation; Planar Transistor Technology
  2.3 Positive Oxide Charge and Surface States at the Si–SiO2 Interface
  2.4 Instabilities Due to Ion Drift Effects
  2.5 Phosphate-silicate Glass Helped
  2.6 Other Materials Tried as Gate-Dielectric Layers
  2.7 Thermal Oxidation of Silicon
  2.8 Segregation of Dopants at the Si–SiO2 Interface
  2.9 Other Silicon Oxide Preparation Techniques
  2.10 Thick Field Oxides
  2.11 Breakdown Strength of SiO2, Defect Density, Moore’s Law
  2.12 Weak Oxide Regions in MOS Structures, Kooi Effect
  2.13 Al Gate MOS Devices; PMOS IC’s
  2.14 Silicon Gate MOS Devices, NMOS and CMOS IC’s
  2.15 Decrease of Oxide Thickness Connected with Downscaling of MOS Structure
  References

3 SiO2 Based MOSFETS: Film Growth and Si–SiO2 Interface Properties
E.A. Irene
  3.1 SiO2 Prior to 1970
    3.1.1 Introduction
    3.1.2 A Brief Historical Survey
    3.1.3 What Is a MOSFET?
    3.1.4 How Does a MOSFET Work?
    3.1.5 Interface Electronic States and Charge
    3.1.6 Implications of the Charges on MOSFET Operation
    3.1.7 The Silicon Oxidation Model: Early Studies
  3.2 After 1970: Progress in Understanding
    3.2.1 In situ Real-Time Oxidation Studies: Dry O2, the Effects of Water and Other Impurities
    3.2.2 Arrhenius Behavior and Deviations
    3.2.3 Stress Effects on Oxidation Kinetics
    3.2.4 Orientation Effects on Oxidation Kinetics
    3.2.5 Effects of Light on Oxidation Kinetics
    3.2.6 The Thin Film Regime (< 20 nm)
    3.2.7 The Si–SiO2 Interface: Measurement and Implications
  3.3 Modern Era: The Quest for Thinner SiO2 and Alternatives
    3.3.1 Ultra-thin SiO2 Film Metrology
    3.3.2 Interfacial Roughness at the Si–SiO2 Interface
    3.3.3 Ultra-thin Film SiO2 Films and the Future of Gate Dielectrics
  References

4 Oxide Reliability Issues
R. Degraeve
  4.1 Thin Oxide Layer Degradation Under Electrical Stress
    4.1.1 Interface Trap Creation
    4.1.2 Oxide Charge Trapping
    4.1.3 Hole Fluence
    4.1.4 Neutral Electron Trap Generation
    4.1.5 Stress-Induced Leakage Current
    4.1.6 Trap Generation Mechanism: Discussion
  4.2 Oxide Breakdown
    4.2.1 Breakdown Modeling
    4.2.2 Soft Breakdown
  4.3 Breakdown Acceleration Models
    4.3.1 Voltage or Field Extrapolation
    4.3.2 Temperature Dependence of Breakdown
    4.3.3 Oxide Reliability Predictions
  4.4 Conclusion
  References

Part II Transition to Silicon Oxynitrides

5 Gate Dielectric Scaling to 2.0–1.0 nm: SiO2 and Silicon Oxynitride
S.-H. Lo, Y. Taur
  5.1 Device Requirements on Gate Dielectric Scaling
  5.2 Definition of Gate Dielectric Thickness
    5.2.1 Electron Distribution in Accumulation and Inversion Layers
    5.2.2 Polysilicon Gate Depletion Effect
    5.2.3 Gate Capacitance and Equivalent Oxide Thickness (EOT) Determination
  5.3 Tunneling Current of SiO2
    5.3.1 Modeling Electron Tunneling from Quasi-bound States
    5.3.2 Tunneling Current as a Function of Thickness
  5.4 Tunneling Currents of Silicon Oxynitride
  5.5 Application Dependence of Gate Dielectric Limit
  References

6 Optimal Scaling Methodologies and Transistor Performance
T. Skotnicki, F. Boeuf
  6.1 Introduction
  6.2 Scaling and Device Physics
    6.2.1 MASTAR Model
    6.2.2 Voltage-Doping Transformation
    6.2.3 Short-Channel Effect (SCE)
    6.2.4 Drain-Induced Barrier Lowering (DIBL)
    6.2.5 Junction Depth Effect
    6.2.6 Understanding the “Good Design Rules”
  6.3 Limitations of Conventional Scaling
    6.3.1 Limitations Menacing the Vth/Vdd Scaling
    6.3.2 Limitations Menacing the Tox el/L Scaling
    6.3.3 Limitations Menacing the Xj/L Scaling
    6.3.4 Limitations Menacing the Tdep/L Scaling
    6.3.5 Impact on the Roadmap
  6.4 Extending Validity of Moore’s Law
    6.4.1 Strategies Based on Increased Gate Drive (Vdd–Vth)
    6.4.2 Strategies Based on Even More Aggressive Scaling
    6.4.3 Strategies Based on New Materials
    6.4.4 Strategies Based on Improvements of Device Architecture
    6.4.5 How Far Can We Go and How Much Should We Pay?
  6.5 Conclusions
  References

7 Silicon Oxynitride Gate Dielectric for Reducing Gate Leakage and Boron Penetration Prior to High-k Gate Dielectric Implementation
H.-H. Tseng
  7.1 Introduction
  7.2 Integrated RTCVD Oxynitride (ION) Process
    7.2.1 Experiment
    7.2.2 Results and Discussion
  7.3 JVD Nitride
    7.3.1 Experiment
    7.3.2 Results and Discussion
  7.4 DPN Oxynitride
    7.4.1 Experiment
    7.4.2 Results and Discussion
  7.5 Conclusion
  References

Part III Transition to High-k Gate Dielectrics

8 Alternative Dielectrics for Silicon-Based Transistors: Selection Via Multiple Criteria
J.-P. Maria
  8.1 Introduction
  8.2 Discussion
    8.2.1 Development of Selection Criteria
    8.2.2 Application of the Selection Criteria
  8.3 Conclusions
  References

9 Materials Issues for High-k Gate Dielectric Selection and Integration R.M. Wallace, G.D. Wilk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.1.1 Improved Performance Through Scaling . . . . . . . . . . . . . . . 9.1.2 Leakage Current and Power . . . . . . . . . . . . . . . . . . . . . . . . . 9.2 MIS (Metal-Insulator-Semiconductor) Structures . . . . . . . . . . . . . . 9.2.1 Issues for Interface Engineering . . . . . . . . . . . . . . . . . . . . . .

253 253 254 256 257 257

Contents

9.2.2 High-k Device Modeling and Transport . . . . . . . . . . . . . . . Materials Properties and Integration Considerations . . . . . . . . . . . 9.3.1 Permittivity and Barrier Height . . . . . . . . . . . . . . . . . . . . . . 9.3.2 Thermodynamic Stability on Si . . . . . . . . . . . . . . . . . . . . . . 9.3.3 Interface Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.3.4 Film Morphology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.3.5 Gate Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.3.6 Process Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.3.7 Reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

9.3

10 Designing Interface Composition and Structure in High Dielectric Constant Gate Stacks G.N. Parsons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.2 Thermodynamic Stability of Dielectrics on Silicon . . . . . . . . . . . . . 10.2.1 Silicide Formation and SiO Evolution During Post-deposition Processing . . . . . . . . . . . . . . . . . . . . 10.2.2 Affect of Excess Oxygen on Final State Energetics . . . . . . 10.2.3 Chemical Mechanisms in Silicon Interface Oxidation . . . . 10.3 Kinetic Rate Processes During Metal Oxide Deposition . . . . . . . . . 10.3.1 Driving Forces for Reactions During Metal Oxide Deposition on Clean Silicon . . . . . . . 10.3.2 Role of Surface Pre-treatment and Passivation . . . . . . . . . 10.3.3 Important Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.4 Gate Electrode/Dielectric Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . 10.4.1 Polysilicon/Dielectric Interfaces . . . . . . . . . . . . . . . . . . . . . . 10.4.2 Metal/Dielectric Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . 10.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Electronic Structure of Alternative High-k Dielectrics G. Lucovsky, J.L. Whitten . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 SiO2 and the Si–SiO2 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2.1 Interfacial Transition Regions Between Crystalline Si and Non-crystalline SiO2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
11.2.2 Local Atomic Structure of SiO2 . . . . . . . . . . . . . . . . . . . . . . 11.2.3 Electronic Structure of SiO2 . . . . . . . . . . . . . . . . . . . . . . . . . 11.2.4 Local Atomic Structure of the Si–SiO2 Interface . . . . . . . . 11.3 Alternative Dielectrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.3.1 Classification of High-K Non-crystalline Dielectrics . . . . . 11.4 Electronic Structure of Transition Metal Dielectrics . . . . . . . . . . . .

XIII

260 261 261 266 269 270 272 275 276 277 277

287 287 290 290 292 295 297 297 300 303 304 304 304 305 306

311 311 313 313 315 316 319 322 322 327

XIV

Contents

11.4.1

Empirical Correlations Between Electronic Structure and Atomic d-State Energies
11.4.2 Extension of Ab Initio Calculations to Transition Metal Oxides
11.5 Experimental Studies of Electronic Structure
11.5.1 Valence Band
11.5.2 Anti-bonding Conduction Band States of TM Oxides
11.5.3 TM and RE Alloys
11.5.4 XPS and AES Results for Zr Silicates
11.5.5 Trapping at Transition Metal Atoms in Al2O3–Ta2O5 Alloys
11.6 Interface Electronic Structure Applied to Direct Tunneling in Silicate Alloys
11.7 Conclusion
References

12 Physicochemical Properties of Selected 4d, 5d, and Rare Earth Metals in Silicon
A.A. Istratov, E.R. Weber
12.1 Introduction
12.2 Crystal Lattice Site of 4d, 5d, and Rare Earth Metals in Silicon
12.3 Solubility of 4d, 5d, and Rare Earth Metals in Silicon
12.4 Diffusivity of 4d, 5d, and Rare Earth Elements in Silicon
12.4.1 Diffusivity of Pr, Sr, Ba, Zr, and Hf
12.4.2 Diffusivity of Er, Pm, Yb, Tb, Ho, and Mo in Silicon
12.4.3 Diffusivity of Heavy Metals in Silicon: A Discussion
12.5 Energy Levels in the Band Gap
12.5.1 Energy Levels of Y, Zr, and Hf
12.5.2 Electrical Levels of Mo, Nb, Ta, and W
12.5.3 Electrical Levels of the Rare Earth Elements: Er, Tb, Ho, or Dy
12.6 Effect of 4d, 5d, and Rare Earth Metals on Minority Carrier Recombination Lifetime and Device Performance
12.7 Summarizing Discussion
References

13 High-k Gate Dielectric Deposition Technologies
J.P. Chang
13.1 Atomic Layer Deposition
13.1.1 Technology Description
13.1.2 Chemical Reaction Mechanisms and Precursors
13.1.3 Processing Reactors and Chemical Delivery System
13.1.4 Film Composition, Microstructure, and Electrical Results

13.2 Chemical Vapor Deposition
13.2.1 Technology Description
13.2.2 Chemical Reaction Mechanisms and Kinetics
13.2.3 Processing Reactors and Chemical Delivery System
13.2.4 Film Composition, Microstructure, and Electrical Results
13.3 Plasma-Enhanced Atomic Layer Deposition
13.3.1 Technology Description
13.3.2 Chemical Reaction Mechanisms and Kinetics
13.3.3 Processing Reactors and Chemical Delivery System
13.3.4 Film Composition, Microstructure, and Electrical Results
13.4 Plasma Enhanced Chemical Vapor Deposition
13.4.1 Technology Description
13.4.2 Chemical Reaction Mechanisms and Kinetics
13.4.3 Processing Reactors and Chemical Delivery System
13.4.4 Film Composition, Microstructure, and Electrical Results
13.5 Physical Vapor Deposition
13.5.1 Technology Description
13.5.2 Chemical Reaction Mechanisms and Kinetics
13.5.3 Processing Reactors and Chemical Delivery System
13.5.4 Film Composition, Microstructure, and Electrical Results
13.6 Molecular Beam Epitaxy
13.6.1 Technology Description
13.6.2 Chemical Reaction Mechanisms and Kinetics
13.6.3 Processing Reactors and Chemical Delivery System
13.6.4 Film Composition, Microstructure, and Electrical Results
13.7 Ion Beam Assisted Deposition
13.7.1 Technology Description
13.7.2 Chemical Reaction Mechanisms and Kinetics
13.7.3 Processing Reactors and Chemical Delivery System
13.7.4 Film Composition, Microstructure, and Electrical Results
13.8 Sol-gel Deposition
13.8.1 Technology Description
13.8.2 Chemical Reaction Mechanisms and Kinetics
13.8.3 Processing Reactors and Chemical Delivery System
13.8.4 Film Composition, Microstructure, and Electrical Results
13.9 Summary
References

14 Issues in Metal Gate Electrode Selection for Bulk CMOS Devices
V. Misra
14.1 Background
14.2 Metal Gate Selection Criteria
14.3 Other Challenges with Metal Gates
14.4 Metal Gate Candidates for NMOS Devices
14.4.1 Metal Nitrides
14.4.2 Metal Silicon Nitrides
14.4.3 Binary Metal Alloys
14.5 Metal Candidates for PMOS Devices
14.6 Metals on High-k Dielectrics
14.7 Conclusion
References

15 CMOS IC Fabrication Issues for High-k Gate Dielectric and Alternate Electrode Materials
L. Colombo, A.L.P. Rotondaro, M.R. Visokay, J.J. Chambers
15.1 Introduction
15.2 The “Standard” CMOS Flow
15.2.1 Isolation
15.2.2 Well and Channel Doping
15.2.3 Gate Dielectric/Gate Stack
15.2.4 Source and Drain
15.2.5 Silicide and Contact
15.3 Insertion of High-k Gate Dielectric into the CMOS Flow
15.3.1 High-k Materials as a Substitute for SiON
15.3.2 Interactions with/During the Gate Electrode Deposition
15.3.3 Gate Electrode Etch Concerns – Stopping on High-k
15.3.4 Surface Preparation (Cleans) in the Presence of High-k Materials
15.3.5 Poly Silicon Oxidation
15.3.6 Source and Drain Extension Formation
15.3.7 Spacer Formation
15.3.8 Source and Drain Formation
15.3.9 Silicidation
15.3.10 Contact and Metallization – Low Temperature Processes
15.3.11 Sinter
15.4 Alternative Electrode Materials
15.4.1 The Need for Alternative Electrode Materials
15.4.2 Material Classes Under Consideration as Alternative Electrode Materials
15.4.3 Dual Work Function Gate Stack Implementation

15.5 Integration of High-k Gate Dielectrics and Metal Gates into Advanced Devices
15.5.1 Advanced Planar Integration Schemes
15.5.2 Advanced Non-planar Integration Schemes
15.6 Conclusions
References

16 Characterization and Metrology of Medium Dielectric Constant Gate Dielectric Films
A.C. Diebold, W.W. Chism
16.1 Introduction
16.2 Structural and Chemical Characterization of Medium ε Film Stacks
16.2.1 Characterization Methods
16.2.2 Structure/Function Relationships
16.2.3 Characterization Results for Medium κ
16.3 Optical Models for Medium κ Films
References

17 Electrical Measurement Issues for Alternative Gate Stack Systems
G.A. Brown
17.1 Introduction
17.2 Capacitance–Voltage Measurement
17.2.1 Overview
17.2.2 Background
17.2.3 Definition of Capacitance
17.2.4 Measurement of Capacitance and Its Output in Series or Parallel Mode
17.2.5 More Complex Equivalent Circuits
17.2.6 Additional Capacitance-Related Measurement Topics for High-k Gate Stacks
17.2.7 Practical Capacitance Measurement Issues
17.3 Analysis of Device/Material Parameters from Established C–V Data
17.4 Current–Voltage Measurement
17.4.1 Parasitic Series Resistance
17.4.2 Temperature Dependence
17.4.3 Time Dependence Effects
17.5 Determination of DC Conduction Mechanisms
17.6 Sample Design and Preparation Issues
17.7 Conclusion
References

18 High-k Gate Dielectric Materials Integrated Circuit Device Design Issues
Y.-Y. Fan, S.P. Mudanai, W. Chen, L.F. Register, S.K. Banerjee
18.1 Introduction
18.2 Fundamental Issues on Gate Capacitance and Current Modeling
18.2.1 Models
18.2.2 ZrO2 and HfO2 NMOSCAP Cg, Ig–Vg Analysis
18.2.3 Conclusions for Fundamental Issues on Gate Capacitance and Current Modeling
18.3 Wave Function Penetration Effect Issues
18.3.1 Quantum Transmitting Boundary (QTBM) Method
18.3.2 Effects on Quantization
18.3.3 High-k Tunneling Gate Currents Trend Study
18.3.4 Wave Function Penetration Effects on Gate Capacitance
18.4 Maxwell–Wagner Effects and Power Law Dispersion
18.4.1 Interfacial Polarization in High-k Gate Stacks
18.4.2 Power Law Dispersion and Its Impact on Device Performance
18.4.3 Conclusions for Maxwell–Wagner Effects and Power Law Dispersion
18.5 Conclusions
References

Part IV Future Directions for Ultimate Scaling Technology Generations

19 High-k Crystalline Gate Dielectrics: A Research Perspective
F.J. Walker, R.A. McKee
19.1 Introduction
19.2 The Path to the Perovskites and COS
19.2.1 MBE
19.2.2 Rules for COS
19.3 The Material System of COS
19.3.1 Alkaline Earth Metal Silicide
19.3.2 Alkaline Earth Oxides
19.3.3 Perovskites
19.4 The Implementation of COS
19.4.1 Layer-Sequenced COS Growth
19.4.2 The Importance of the Silicide
19.4.3 Alkaline Earth Metal
19.4.4 Oxide Growth

19.5 Electrical Properties
19.5.1 Band Offset
19.5.2 Interface Traps
19.5.3 Channel Mobility
19.6 Conclusion
References

20 High-k Crystalline Gate Dielectrics: An IC Manufacturer’s Perspective
R. Droopad, K. Eisenbeiser, A.A. Demkov
20.1 Introduction
20.2 Theoretical Overview
20.3 Perovskite Surface
20.4 Oxide Deposition
20.5 Growth Template
20.6 Substrate Preparation
20.7 Initial Nucleation
20.8 Stability of the Interface
20.9 Structural Properties
20.10 Band Discontinuity
20.11 Device Results
20.12 Conclusion
References

21 Advanced MOS-Devices
J. Bokor, T.-J. King, J. Hergenrother, J. Bude, D. Muller, T. Skotnicki, S. Monfray, G. Timp
21.1 Introduction
21.1.1 Prospectus
21.2 The Ballistic Nanotransistor
21.3 Vertical Replacement Gate MOSFET
21.4 The Double-Gate FinFET
21.5 Silicon-On-Nothing MOSFETs
21.6 Conclusion
References

Index

List of Contributors

Banerjee, S.K.
Microelectronics Research Center, The University of Texas at Austin, R9950, Austin, TX 78758, USA

Boeuf, F.
STMicroelectronics, 850 rue Jean Monnet, 38926 Crolles, France

Bokor, J.
University of California, Berkeley, CA, USA

Brown, G.A.
International SEMATECH, Inc., Austin, TX 78741, USA

Bude, J.
Agere Systems, Murray Hill, NJ

Chambers, J.J.
Silicon Technology Development, Texas Instruments Incorporated, Dallas, TX, USA

Chang, J.P.
Department of Chemical Engineering, University of California, Los Angeles, CA 90095, USA

Chen, W.
Microelectronics Research Center, The University of Texas at Austin, R9950, Austin, TX 78758, USA

Chism, W.W.
International SEMATECH, Inc., Austin, TX 78741, USA

Colombo, L.
Silicon Technology Development, Texas Instruments Incorporated, Dallas, TX, USA

Degraeve, R.
IMEC, Kapeldreef 75, B-3001 Leuven, Belgium

Demkov, A.A.
Microelectronics and Physical Sciences Laboratories, Motorola Labs, 2100 E. Elliot Road, Tempe, AZ 85284, USA

Diebold, A.C.
International SEMATECH, Inc., Austin, TX 78741, USA


Droopad, R.
Microelectronics and Physical Sciences Laboratories, Motorola Labs, 2100 E. Elliot Road, Tempe, AZ 85284, USA

Eisenbeiser, K.
Microelectronics and Physical Sciences Laboratories, Motorola Labs, 2100 E. Elliot Road, Tempe, AZ 85284, USA

Fan, Y.-Y.
Microelectronics Research Center, The University of Texas at Austin, R9950, Austin, TX 78758, USA

Hergenrother, J.
Agere Systems, Murray Hill, NJ, USA

Hutcheson, G.D.
VLSI Research Inc., 2880 Lakeside Drive, Suite 350, Santa Clara, CA 95054-2822

Irene, E.A.
Department of Chemistry, CB# 3290, University of North Carolina, Chapel Hill, NC 27599-3290, USA

Istratov, A.A.
Lawrence Berkeley National Laboratory, Building 62, Room 109 (Mail Stop 62R0203), 1 Cyclotron Rd., Berkeley, CA 94720-8253, USA

King, T.-J.
University of California, Berkeley, CA, USA

Kooi, E.†
Formerly Director at Philips Research Laboratories

Lo, S.-H.
IBM Thomas J. Watson Research Center, 1101 Kitchawan Road, Route 134/P.O. Box 218, Yorktown Heights, NY 10598, USA

Lucovsky, G.
Department of Physics, North Carolina State University, Raleigh, North Carolina, USA

Maria, J.-P.
Department of Materials Science and Engineering, North Carolina State University, 1001 Capability Drive, Research Building One, Raleigh, NC 27695, USA

McKee, R.A.
Oak Ridge National Laboratory, Oak Ridge, TN 37831-6118, USA, and University of Tennessee, Knoxville, TN, USA

Misra, V.
Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695, USA


Monfray, S.
STMicroelectronics, 850 rue Jean Monnet, 38926 Crolles, France, and France Telecom CNET Grenoble, Meylan, France

Mudanai, S.P.
Microelectronics Research Center, The University of Texas at Austin, R9950, Austin, TX 78758, USA

Muller, D.
Bell Laboratories, Lucent Technologies, Murray Hill, NJ, USA

Parsons, G.N.
Dept. of Chemical Engineering, NC State University, Raleigh, NC 27695, USA

Register, L.F.
Microelectronics Research Center, The University of Texas at Austin, R9950, Austin, TX 78758, USA

Rotondaro, A.L.P.
Silicon Technology Development, Texas Instruments Incorporated, Dallas, TX, USA

Schmitz, A.
Philips Semiconductors, North America – Retired

Skotnicki, T.
STMicroelectronics, 850 rue Jean Monnet, 38926 Crolles, France

Taur, Y.
Department of Electrical and Computer Engineering, University of California, San Diego, 9500 Gilman Drive, Mail Code 0407, La Jolla, CA 92093, USA

Timp, G.
University of Illinois, Champaign-Urbana, IL, USA

Tseng, H.-H.
Advanced Products Research and Development Laboratory, Digital DNA Laboratories, Motorola, 3501 Ed Bluestein Blvd., Austin, TX 78721, USA

Visokay, M.R.
Silicon Technology Development, Texas Instruments Incorporated, Dallas, TX, USA

Walker, F.J.
University of Tennessee, Knoxville, TN, USA

Wallace, R.M.
Departments of Electrical Engineering and Physics, University of Texas at Dallas, P.O. Box 830688 M/S EC33 (2601 N. Floyd for packages), Richardson, Texas, USA


Weber, E.R.
Department of Materials Science and Engineering, 374 Hearst Mining Building, Berkeley, CA 94720-1760, USA

Whitten, J.L.
Department of Chemistry, North Carolina State University, Raleigh, North Carolina, USA

Wilk, G.D.
ASM America, Phoenix, Arizona, USA

1 The Economic Implications of Moore’s Law
G.D. Hutcheson

1.1 Introduction

High-k dielectrics are a technology landmark so fundamental as to raise important questions about the future economics of the industry. This is not just a material change in the gate dielectric: high-k dielectrics also come with changes to the physical structure of the transistor itself, as well as to the substrate. Not since the introduction of the planar process in the late fifties has the semiconductor industry faced such a fundamental change in process technology. In all likelihood, these and other anticipated process changes will mark the end of the conventional planar, poly process; hence the need to examine Moore’s law. Moore’s law is predicated on shrinking the critical features of the planar process: the smaller these features, the more bits can be packed into a given area. The most critical feature size is the physical gate length, because shrinking it makes the transistor not only smaller but also faster. But we are fast approaching the limits of what can be done by scaling silicon-dioxide gate dielectrics, necessitating the introduction of high-k dielectrics. Are these changes needed? This book examines these changes from a technical standpoint because barriers to Moore’s law have always been solved with new technology. However, these barriers are ultimately expressed economically and have important ramifications far beyond the industry itself. Moore’s law is an expression of a powerful engine for economic growth, not only in the industry but also in the economy as a whole. This chapter reviews Moore’s law and the economic implications that it poses. It shows how the continuation of Moore’s law provides a foundation for future economic growth and, as such, sets the stage for a technical treatise on high-k dielectrics.

1.2 Moore’s Law: A Description

Looking back thirty years after he first published the observations that would become known as Moore’s Law, Gordon E. Moore mused, “The definition of ‘Moore’s Law’ has come to refer to almost anything related to the semiconductor industry that when plotted on semi-log paper approximates a straight line” [1.1]. Indeed, this abuse of the meaning of Moore’s Law has led to a great deal of confusion about what exactly it is. Simply put, Moore’s Law [1.2] postulates that the level of chip complexity that can be manufactured for minimal cost is an exponential function that doubles in a period of time. So for any given period, the optimal component density would be

$$C_t = 2\,C_{t-1}, \tag{1.1}$$

where $C_t$ is the component count in period $t$ and $C_{t-1}$ is the component count in the prior period. This first part would have been of little economic import had Moore not also observed that the minimal cost of manufacturing a chip was decreasing at a rate that was nearly inversely proportional to the increase in the number of components. Thus, the other critical part of Moore’s Law is that the cost of making any given integrated circuit at optimal transistor density levels is essentially constant in time. So the cost per component, or transistor, is cut roughly in half for each tick of Moore’s clock:

$$M_t = \frac{M_{t-1}}{2}, \tag{1.2}$$

where $M_t$ is the manufacturing cost per component in period $t$ and $M_{t-1}$ is the manufacturing cost per component in the prior period. These two functions have proven remarkably resilient over the years, as can be seen in Fig. 1.1.¹ The periodicity, or Moore’s clock cycle, was originally set forth as a doubling every year. In 1975, Moore gave a second paper on the subject. While the plot of data showed that the doubling each year had been met, the integration growth for MOS logic was slowing to a doubling every year-and-a-half [1.3]. So in this paper he predicted that the rate of doubling would further slow to once every two years. He never updated this latter prediction. Between 1975 and 2001, the average rate between MPUs and DRAMs ran right at a doubling every two years.
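The two recurrences (1.1) and (1.2) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the starting values (64 components, a unit cost of 1.0) are hypothetical and chosen only for illustration. It applies each recurrence repeatedly and shows that the product of component count and cost per component, that is, the cost of the chip, stays constant from period to period.

```python
def component_count(c0: float, periods: float) -> float:
    """Apply Eq. (1.1), C_t = 2 * C_{t-1}, `periods` times starting from c0."""
    return c0 * 2 ** periods


def cost_per_component(m0: float, periods: float) -> float:
    """Apply Eq. (1.2), M_t = M_{t-1} / 2, `periods` times starting from m0."""
    return m0 / 2 ** periods


def periods_elapsed(start_year: int, end_year: int, years_per_doubling: float) -> float:
    """Number of ticks of Moore's clock between two years for a given doubling period."""
    return (end_year - start_year) / years_per_doubling


if __name__ == "__main__":
    c0, m0 = 64.0, 1.0  # hypothetical starting component count and unit cost
    for t in range(5):
        c = component_count(c0, t)
        m = cost_per_component(m0, t)
        # The chip cost c * m is the same in every period.
        print(f"period {t}: components={c:.0f}, cost/component={m:.4f}, chip cost={c * m:.1f}")

    # Doubling every two years from 1975 to 2001, as stated in the text.
    ticks = periods_elapsed(1975, 2001, 2.0)
    print(f"{ticks:.0f} doublings, growth factor {component_count(1.0, ticks):.0f}")
```

Under these assumptions, a doubling every two years between 1975 and 2001 corresponds to 13 ticks of Moore's clock, or a growth factor of 2^13 = 8192 in component count, with a matching reduction in cost per component.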

Fig. 1.1. Forty years of Moore’s law [plot of components (K), on a logarithmic scale from 0.001 to 1,000,000, versus year, 1959 to 1999, for early ICs, MPUs, and DRAMs]

¹ The forces behind the law were still strongly in effect when Gordon Moore retired in 2001, leading him to quip to the author that “Moore’s Law had outlived Moore’s career.”

1.3 The History of Moore’s Law

Moore’s law is indelibly linked to the history of our industry and the economic benefits that it has provided over the years. Gordon Moore has tried repeatedly to dismiss the notion that it was a law, insisting instead that it was just an observation. It was actually Carver Mead who first called the relationship “Moore’s Law.” Either way, the term became famous because Moore had proved amazingly perceptive about how technology would drive the industry and indeed the world. Moore’s observations about semiconductor technology are not without precedent. As early as 1887, Karl Marx, in predicting the coming importance of science and technology in the twentieth century, noted that for every question science answered, it created two new ones; and that the answers were generated at minimal cost in proportion to the productivity gains made [1.4]. His observation was one of the times, referring to mechanics, whose importance to the industrial age’s development had been largely questioned by economists up to that point [1.5] (much as the productivity gains of computers in the latter twentieth century are still debated today [1.6]). More important was Marx’s observation that science and engineering had proved to be a reasonably predictable way of advancing productivity. Moreover, investments in science and engineering led to technology, which paid off in a way that grew economies, not just military might. Today, no one questions that science was at the heart of the industrial age, as it led to important inventions like the cotton gin, the steam engine, the internal combustion engine, and the fractional horsepower electric motor, to name a few. Nevertheless, it is the exponential growth of scientific ‘answers’ that led to these, as well as to the invention of the transistor in 1947 and ultimately the integrated circuit in 1958, which led to Moore’s observation that became known as a law and, in turn, launched the information revolution.²

² These observations are often imitated as well. For example, Monsanto’s 1997 annual report proclaimed Monsanto’s Law, which is “the ability to identify and use genetic information is doubling every 12 to 24 months. This exponential growth in biological knowledge is transforming agriculture, nutrition, and health care in the emerging life sciences industry.” Its measure is the number of registered genetic base pairs, which grew from nil to almost 1.2 B between 1982 and 1997. Magnetic memory has seen a similar parallel to Moore’s Law as it shrinks the size of a magnetic pixel. Life sciences gains are a direct result of the increased modeling capability of ever more powerful computers. Magnetic memory’s gains are a direct result of chip manufacturing methodologies being applied to this field. Both are a direct result of the benefits gained from Moore’s Law. Indeed, Paul Allen of Microsoft fame has credited his observation that there would be a need for increasingly powerful software to learning about Moore’s Law. He reasoned that this would be the outcome of ever-more powerful chips and computers and then convinced Bill Gates there was a viable future in software – something no major systems maker ever believed until it was too late.

The progress of science into the twentieth century would ultimately lead to the invention of the transistor, which is really where the story of the information revolution from the semiconductor perspective starts. Like all great inventions, it relied on the prior work of others. The solid-state amplifier was conceptualized by Julius Edgar Lilienfeld. He filed for a patent on October 8, 1926, and it was granted on January 28, 1930 (U.S. Patent No. 1745175). While Lilienfeld didn’t use the term Field Effect Transistor (FET), Oskar Heil did in British Patent No. 439457, dated March 2, 1934. Heil became the first person to use the term semiconductor. While neither ever gained from these patents, these works established the basis of all modern-day MOS technology, even though neither author was cognizant of the concept of an inversion layer [1.7]. That is, the technology was not there to build it. Soon after World War II, figuring out how to make a solid-state switch would become the Holy Grail of research, as vacuum tube and electro-mechanical relay based switching networks and computers were already proving too unreliable. John Bardeen, Walter Brattain, and William Shockley were trying to make workable solid-state devices at Bell Labs in the late nineteen-forties when Bardeen and Brattain invented the point-contact semiconductor amplifier (i.e., the point-contact transistor) on December 16, 1947 [1.7]. It was Brattain and Bardeen who discovered transistor action, not Shockley. Shockley’s contribution was to invent injection and the p–n junction transistor. Bardeen, Brattain, and Shockley, nevertheless, all properly shared the 1956 Nobel Prize in Physics. It was these efforts that would set the stage for the invention of the integrated circuit in 1958 and Moore’s observation seven years later.

The story of the integrated circuit centers on the paths of two independent groups, one at Fairchild and the other at Texas Instruments, who in their collision created the chain reaction that created the modern semiconductor industry. It is more than a story of technology. It is a story about the triumph of human endeavor and the victory of good over bad management. It begins with the “Traitorous Eight” leaving Shockley Transistor in 1957 to start Fairchild Semiconductor (the eight were Julius Blank, Victor Grinich, Jean Hoerni, Eugene Kleiner, Jay Last, Gordon Moore, Robert Noyce, and Sheldon Roberts). They had been frustrated at Shockley because they wanted to move

1 The Economic Implications of Moore’s Law

5

away from the four-layer device (thryristor) that had been developed at Bell Labs, and use lithography and diffusion to build silicon transistors, with what would be called the mesa process. Fairchild was the first company to specialize exclusively in making its transistors out of silicon. Their expertise for pulling this off was a rare balance: Bob Noyce and Jay Last on litho and etch, Gordon Moore and Jean Hoerni on diffusion, Sheldon Roberts on silicon crystal growing, and Gene Kliener on the financial side. The mesa process was named because cross-sectional views of the device revealed the steep sides and flat top of a mesa (it mounted the base on top of the collector). Debuted in 1958, it was the immediate rage throughout the industry, because transistors could be uniformly mass-produced for the first time. But the mesa process would not survive because its transistors were not reliable due to contamination problems. They were also costly due to their labor intensity, as the contacts were hand-painted. It was Jean Hoerni who – in seeking a solution to these problems – came up with the planar process, which diffused the base down into the collector. It was flat and it included an oxide passivation layer. So the interconnect between the base, emitter, and collector could be made by evaporating aluminum (PVD) on oxide and etching it. This was a revolutionary step that, with the exception of the damascene process, is the basis for almost all semiconductor manufacturing today. It is so important that many consider Jean Hoerni the Unknown Soldier whose contributions were the real seed for the IC. This is because the aluminum layer would make it easy to interconnect multiple transistors. The planar process was the basis for Fairchild’s early success and considered so important that it was kept secret until 1960. Its process methodology was not revealed until after the IC had been invented. 
At the time, however, it was only viewed as an important improvement in manufacturing. The first work to actually interconnect transistors to build an ‘integrated circuit’ was actually occurring halfway across the United States. Jack Kilby joined Texas Instruments (TI) in May of 1958 and had to work through its mass vacation in July. A new employee, with no vacation time built-up, he was left alone to ruminate over his charge of working on microminiaturization. It was then that he came up with the idea of integrating transistors, capacitors, and resistors onto a single substrate. It could have been a repeat of what happened at Shockley Semiconductor. Kilby’s bosses were skeptical. But instead of chasing him off, they encouraged him to prove his ideas. TI already had a mesa transistor on the market, which was made on germanium slices (TI used to call die and wafers bar and slices). Jack took one of these slices and cut it up into narrow bars (which may be why TI called chips ‘bar’ versus the more commonly used word ‘die’). He then built an integrated phase-shift oscillator from one bar with a germanium mesa transistor on it and another with a distributed RC network. Both were bonded to a glass substrate and wired together with a gold wire. He then built a flip-flop with multiple mesa transistors wire bonded together, proving the methodology was universal in October of 1958. This was the first integrated


circuit ever made. It was unveiled in March 1959 at the Institute of Radio Engineers show. Back at Fairchild, during January of 1959, Bob Noyce entered in his notebook a series of innocuous ideas about the possibility of integrating circuits using Hoerni’s planar process, by isolating the transistors in silicon with reverse-biased p–n junctions, and wiring them together with the PVD-Litho-Etch process using an adherent layer of Al on the SiO2. This was then put away until word of Jack Kilby’s invention rocked the world later in March 1959. While many derided Kilby’s work as a technology that would never yield, with designs that were fixed and difficult to change, Noyce sat up and took notice. Putting it all together, Noyce and his team at Fairchild would architect what would become the mainstream manufacturing process for fabricating integrated circuits on silicon wafers.³ Transistors, capacitors, and resistors could now be integrated onto a single substrate. The reasons why this method was so important would be codified in Moore’s 1965 paper. In 1964, Electronics Magazine asked Moore, then at Fairchild Semiconductor, to write about what trends he thought would be important in the semiconductor industry over the next ten years for its 35th anniversary issue. He and Fairchild were at the forefront of what would be a revolution with silicon. However, when Moore sat down to write the paper that would become so famous for its law, integrated circuits (ICs) were relatively new, having been commercialized only a year or two earlier. Many designers didn’t see a use for them and, worse, some still argued over whether transistors would replace tubes. A few even saw integrated circuits as a threat: if the system could be integrated into an IC, who would need system designers? Indeed, even Moore may have been skeptical early on. Robert Graham recalled that in 1960, when he was a senior Fairchild sales and marketing executive, Moore had told him,

³ Ironically, Kilby’s method for integrating circuits gets little credit for being what is now widely viewed as one of the most important paths into the future. In practice, his invention was what would be renamed as hybrid circuits, which would then be renamed Multi-Chip-Modules (MCM), then Multi-Chip-Packages (MCP), and now System In a Package (SIP). It is clear today that System-On-a-Chip (SOC) is limited by the constraints of process complexity and cost; and so Jack Kilby’s original integrated circuit is finally becoming a critical mainstream technology. Unfortunately, while he gets credit for the invention of the IC, few give him credit for inventing a method that had to wait forty years to become critical in a new millennium. Most give both Jack Kilby and Bob Noyce credit as co-inventors of the integrated circuit because of these differences, because they both came up with similar ideas independently, and because it was Jack Kilby who prodded Bob Noyce into action. TI would go on to become an industry icon. Fairchild’s early successes would turn to failure under bad management, and the company would suffer a palace revolt, similar to the one at Shockley, in 1967. It was called the Fairchild brain drain and resulted in the founding of 13 start-ups within a year. Noyce and Moore would leave to start up Intel in 1968. But that’s another story.


“Bob, do not oversell the future of integrated circuits. ICs will never be as cheap as the same function implemented using discrete components” [1.8]. In fact, Moore didn’t notice the trend until he was writing the paper [1.9]. Nevertheless, by 1964 Moore’s understanding of the process, together with the growing complexity and lowered cost he was seeing, convinced him that integrated circuits would come to dominate. Fairchild was trying to move the market from transistors to ICs. Moore was also convinced that ICs would play an important role and he was writing the paper that would convince the world. Titled “Cramming more components into integrated circuits,” Moore’s paper was published by Electronics magazine in its April 19, 1965 issue on page 114. Its subtitle was “With unit cost falling as the number of components per circuit rises, by 1975 economics may dictate squeezing as many as 65,000 components on a single chip of silicon.” This issue’s contents exemplify how few really understood the importance of the integrated circuit. Ahead of it was the cover article by RCA’s legendary David Sarnoff who, facing retirement, reminisced about “Electronics’ first 35 years” with a look ahead. Behind this were articles titled “The future of electronics in 1930 – predictions from our first issue” and “A forward look at electronics – looking farther into the future” (both written by the magazine’s authors). Then there appeared an article from Motorola, by Dan Noble, titled “Wonderland for consumers – new electronic products will make life easier.” All these papers were before Moore’s paper.
Indeed, Moore’s paper would have been at the back of the magazine had it not been for what would prove to be mundane papers titled “changing the nature of research for space;” “light on the future of the laser;” “more and faster communications;” and “computers to run electric utility plants.” At the time Electronics magazine was the most respected publication covering the industry and they had assembled the best visionaries possible. Yet, with the exception of Moore’s paper, it was mostly classic forecasting ‘through the rear-view mirror.’ His paper would be the only thing remembered from this issue. In fact, those who entered the industry in the 1990’s wouldn’t even recognize the magazine as it is now defunct, not surviving the Moore’s Law article it contained. Moore’s Law paper proved so long lasting because it was more than just a prediction. The paper provided the basis for understanding how and why integrated circuits would transform the industry. Moore considered user benefits, technology trends, and the economics of manufacturing in his assessment. Thus he had described the basic business model for the semiconductor industry – a business model that lasted through the end of the millennium. From a user perspective, his major points in favor of ICs were that they had proven to be reliable; they lowered system costs; and often improved performance. He concluded, “Thus a foundation has been constructed for integrated electronics to pervade all of electronics.” This was one of the first times the word ‘pervade’ was ever published with respect to semiconductors. During this time frame the word “pervade” was first used by both Moore and


Patrick Haggerty of Texas Instruments. Since then, the theme of increasing pervasiveness has been a feature of almost all semiconductor forecasts.⁴ From a manufacturing perspective, Moore’s major points in favor of ICs were that integration levels could be systematically increased based on continuous improvements in largely existing manufacturing technology. The number of circuits that could be integrated at the same yield had already been systematically increasing for these reasons. He saw no reason to believe that integration levels of 65,000 components would not be achieved by 1975, or that the pace of a doubling each year would not remain constant. He pointed to multilayer metallization and optical lithography as key to achieving these goals. Multilayer metallization meant that single transistors could be wired together to form integrated circuits. But of far greater import was the mention of optical lithography. Prior to the invention of the planar process, the dominant device was known as a mesa transistor. It was made by hand painting black wax over the areas to be protected from etchant. While the tool was incredibly inexpensive (a 10 cent camel’s hair brush), the process was incredibly labor intensive [1.10]. The introduction of optical lithography meant transistors could be made simultaneously by the thousands on a wafer. This dramatically lowered the cost of making transistors, to the degree that, by the mid-sixties, packaging costs swamped the cost of making the transistor itself. From an economics perspective, Moore recognized the business import of these manufacturing trends and wrote, “Reduced cost is one of the big attractions of integrated electronics, and the cost advantage continues to increase as the technology evolves toward the production of larger and larger circuit functions on a single semiconductor substrate. For simple circuits, the cost per component is nearly inversely proportional to the number of components, the result of the equivalent package containing more components.” As important as these concepts proved to be, it was still not clear that the arguments would stand the test of time. Package costs now overwhelmed silicon processing costs. These costs were not subject to Moore’s Law and technical efforts were shifting to lowering them. Fairchild was reducing packaging costs, which were still highly labor intensive, by moving its assembly lines off-shore. Texas Instruments and Motorola, among others, were developing and building automatic assembly equipment. Many, even those at Fairchild, were still struggling with how to make a profitable business out of integrated circuits. While transistors could be integrated, designing and marketing circuits that customers could easily use proved more complicated. The industry had no design standards. Fairchild had developed circuits with

⁴ Pervasiveness is another misused word. Many have used it during boom times to argue that the semiconductor industry would no longer be cyclical and thus would not have a downturn. While semiconductors have been increasingly pervasive since the dawn of the industry, this fact has not been a factor in alleviating the industry’s inherent cyclicality.


Resistor-Transistor-Logic (RTL), but customers were using Diode-Transistor-Logic (DTL). Plus, Fairchild was undergoing major internal turmoil as political battles raged throughout the company. Many start-up companies were spinning out of it as senior executives left. The most famous of these spin-offs was Intel, whose lead founders included no less than Robert Noyce and Gordon Moore. Lacking the packaging infrastructure of Fairchild but having the cream of its research capability, Intel’s strength was in its founders’ ability to build complex circuits and their deep understanding of Moore’s Law. They leveraged this by focusing on memories, which Moore believed had huge market potential and could be more easily integrated, both in terms of putting large numbers of transistors on silicon and in integrating them into customers’ designs [1.11]. He was also convinced that the only way to compete effectively was by making the silicon more valuable than the package, offsetting the natural advantage in packaging that Fairchild, Motorola, and Texas Instruments had. The strategy worked. Intel became the memory IC market leader in the early ’70s. They had started with SRAMs (Static Random Access Memory) and soon invented the DRAM (Dynamic Random Access Memory), which proved to be very integratable and was much more cost effective than the ferrite core memories used by computer makers at the time. It was at this point that Moore’s Law began to morph into the idea that it was bits that were doubling every year or two. Transistors were now old fashioned. Few argued the practicality of integrated circuits. It was important to move on and use Moore’s Law as a way to demonstrate the viability of the emerging memory market. At the time, there was more to the marketing of Moore’s Law than most ever understood.
The strategies taken would become a critical competitive advantage for Intel, enough of an advantage to keep it ahead of Texas Instruments, who also focused on memories and had much more manufacturing infrastructure. Bob Graham, another Intel founder, recalled [1.12] that there was a problem with Moore’s Law: it was too fast. Its cycle called for a doubling every year, but systems designers needed more than a doubling to justify a new design. They typically fielded a new design every three to four years. Graham’s strategy to harness the power of Moore’s law was to match the chip design cycle to the system designer’s. The difference between the nodes of Moore’s clock cycles and these memory nodes would lead to much confusion. But according to Graham, Intel used this confusion to keep competitors at bay when Intel’s early memory strategies were plotted. It was considered highly confidential and a big secret that the real generational nodes were based on a quadrupling, not a doubling. Moore’s Law, in his view, was also a marketing head fake. Intel introduced each new generation of memories with a 4× increase in bits about every three years. Each of its generations was closely matched to customers’ design cycles. Other competitors fell behind because they followed the letter of Moore’s law. They tried to get ahead by introducing new chips

Table 1.1. Average months to double device complexity

Year         Overall   MPU   DRAM
1959–1965    12        –     –
1966–1975    17        33    17
1976–1985    22        22    25
1986–1995    32        38    25
1996–2001    22        31    15
1976–2001    24        27    20

with every 2× increase. But interim generations failed. They failed from the first 64-bit memory introduced by Intel to the 64 M-bit memory. This cycle, with every 4× increase in bits, was not broken until the 128 M-bit DRAM came to market three decades later in the early 2000s. Tax law and capital depreciation also played a critical role in determining the pacing of nodes. The United States’ MACRS (Modified Accelerated Cost Recovery System) tax code for capital depreciation specified that computers be fully depreciated over a six-year length of time. If systems makers had designed new systems every year, there would have been six generations of computers per depreciation cycle, clearly too many for customers. Either customers would have over-bought and had to write off equipment that was not fully depreciated, or the annual market would have been split into a sixth of its potential, or some compromise between the two would have happened. Early on, systems makers paced designs so that at least one half of potential customers would replace their systems every two to three years. It is likely that competitive reasons accounted for the more rapid cycling of system designs in the early sixties. The computer market was still highly competitive then. It consolidated during the latter sixties and IBM emerged as the dominant supplier. There are no technical reasons to explain why the node pacing slowed. But from a market perspective, the pace should have slowed naturally as IBM sought to leverage its dominance to extend the life of its designs, hence having greater amortization of these costs and enhancing profitability. IBM was sued for monopolistic trade practices and one of the claims was that it intentionally slowed technology. Whatever the reason, node pacing slowed to a rate of one design node every three years. Moore’s clock was roughly half that. In 1975, Moore wrote an update to the 1965 paper and revised his predictions.
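The pacing arithmetic in this passage can be checked with a short calculation. The following is a minimal sketch for illustration only: the function names are hypothetical and do not come from the chapter, and it assumes that “64 M-bit” means 64 × 2^20 bits. It converts a growth factor per interval into an equivalent doubling time and counts the 4× generations between the 64-bit and 64 M-bit memories.

```python
import math


def doubling_time_months(growth_factor: float, interval_months: float) -> float:
    """Months needed to double when capacity grows by growth_factor every interval_months."""
    # growth_factor = 2 ** (interval / doubling_time), solved for doubling_time.
    return interval_months / math.log2(growth_factor)


def generations_between(start_bits: int, end_bits: int, factor: int = 4) -> int:
    """Count the factor-fold steps needed to grow from start_bits to at least end_bits."""
    steps = 0
    bits = start_bits
    while bits < end_bits:
        bits *= factor
        steps += 1
    return steps


# A 4x generation every three years, expressed as a doubling time in months.
quadrupling_pace = doubling_time_months(4, 36)

# Generations from a 64-bit memory to a 64 M-bit memory (assumed 64 * 2**20 bits).
generations = generations_between(64, 64 * 2**20)
years_spanned = generations * 3

print(f"4x every 3 years is a doubling every {quadrupling_pace:.0f} months")
print(f"{generations} generations spanning about {years_spanned} years")
```

Under these assumptions, a quadrupling every three years works out to a doubling every 18 months, which is half the three-year node interval, and the ten generations between the two memory sizes span about thirty years, in line with the “three decades” mentioned above.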
While technically, his prediction of 65,000 components had come true, it was based on a 16-Kbit CCD memory – a technology well out of the mainstream. The largest memory in general use at the time – the 16 K-bit DRAM, which contained less than half this amount – would not be in production until 1976. Between 1965 and 1975 the pace had actually slowed to a doubling every 17 months or roughly every year-and-a-half. Later, Moore’s Law was widely quoted by others as being every 18 months. But, despite being widely referenced as the source, this is a periodicity that Moore never


gave. The 1975 paper actually predicted the periodicity would be a doubling every two years [1.3]. This would turn out to be extremely accurate, though seldom quoted with any accuracy. But contrary to what many have thought, the finer points of the accuracy of Moore’s Law never really mattered. The real import of Moore’s Law was that it had proved a predictable business model. It gave confidence in the industry’s future because it was predictable. One could plan to it and invest in it on the basis that integration scale would always rise in a year or two, obsolescing the electronics that was out there and creating new demand because the unobtainable and confusing would become affordable and easy to use. This then fed back to reinforce it, as engineers planned to it and designed more feature-rich products or products that were easier to use. As Moore later put it, Moore’s Law “had become a self-fulfilling prophecy” [1.9]. But as a result, heavy international competitive and technical issues would loom in the future. It was at about this time that Japan seized on Moore’s Law as a planning mechanism. Without it, the industry appeared to go in random directions. But Moore’s Law made it easy to plan and it had been clearly proven by 1975. The DRAM explosion was already in place at TI, AMD, IBM, Intel, Mostek, etc. Moore’s Law made it all understandable. Now believers in Moore’s Law, the Japanese saw that since memory demand and device type were very predictable, it would play to their strengths. Moreover, cost of manufacturing was critical to success, and this was one of their key strategic strengths. Moore’s Law was the basis of the arguments that prompted them to start their government-funded effort called the VLSI program in 1976, whose goal was to build a working 64 K-bit DRAM. They believed that if the VLSI program could do this, their semiconductor industry could leverage the results to build their own.
Japan was already successful in 4 K-bit DRAMs and their potential with 16 K-bit looked promising. One of the keys to their success was that they implemented multiple development teams. Each team worked on the same generation, walking it from research, through development, and into manufacturing. In contrast, the West had highly stratified walls between these three areas. Research did research and threw their results over the wall to development, and so forth into manufacturing. Often, manufacturing used little of these efforts because they were seldom compatible with manufacturing. Instead they built designs coming directly from marketing because they knew they would sell. Japan got the edge because they could get new designs to manufacturing faster and their cost of manufacturing was fundamentally lower. They had lower labor rates and their line workers typically stayed with the company for several times longer. This combined with the Japanese penchant for detail. Texas Instruments had a facility in Japan and this facility consistently yielded higher than its American facilities for these reasons [1.13]. The Japanese also had newer equipment, having invested heavily in the late seventies. Capital was tough to get in the late seventies for American chipmakers. They had cut their investments to low levels, which would ultimately give Japan another source of yield advantage.


But the real shocker came in 1980, when Hewlett-Packard published an internal study comparing quality between American and Japanese made DRAMs. It showed Japan to have higher quality. American chipmakers cried foul, arguing that this was tested-in quality and that Japanese suppliers sent HP more thoroughly tested parts. Indeed, independent studies did later show that Japanese made DRAMs obtained on the open market were of no higher quality than American ones. However, it was too late to change the perception that HP’s announcement had created (a perception that remains to this day). Whether or not the quality was tested-in, the one clear advantage the Japanese had was yield. It was typically 10–15% higher than equivalent American fabs and this gave the Japanese a fundamental cost advantage. When the downturn hit in 1981, these yield differences allowed Japanese companies to undercut American companies on 16 Kb DRAMs. This, combined with the downturn, caused American producers to make further cuts in capital investment, and put them further behind. At the time, the Chairman of NEC noted that they had no fab older than five years. The average American fab was 8 years old. So by the early eighties, Japan came to dominate 64 K-bit memories. By 1985, America’s giants were bleeding heavily. Intel was among the worst when it came to manufacturing memories. It was forced out of memories. The technical issues with the 64 K-bit DRAM proved to be enormous. The most commonly used lithography tool of the day, the projection aligner, would not make it to this generation because it lacked the overlay alignment capability. Something new had to be developed and the industry was not ready. The transition to stepping aligners proved much more traumatic than anyone expected. Steppers had the potential for very high yields. But unless the reticle was perfect and had no particles, yield would be zero because the stepper would repeat these defects.
The result was that there was a three year hiatus in Moore’s law. 64 K-bit DRAMs did not enter volume production until 1982 – a full three years after they should have arrived – taking a full six years to pass from the 16 K-bit node. Another transition occurred in the early eighties that favored Intel. NMOS began to run out of steam and couldn’t scale well below one micron. Some even predicted that we had hit Moore’s Wall. But the industry escaped this by transitioning to CMOS. One of the benefits of CMOS was that performance also scaled easily with shrinks. An engineer in Intel’s research organization observed this and recognized its importance to microprocessors. Moreover, having exited memories it was important that Intel not lose the brand value of Moore’s Law it had, having its discoverer as Chairman of the company. So marketing morphed Moore’s Law a second time. It had started as the number of transistors doubling, then the number of bits, and now it was speed, or more comprehensively performance. This new form would serve Intel and the industry well. In the early nineties, the pace of integration increased again. There were several factors driving this change. It could be argued that the manufacturing


challenges of the early eighties had been overcome. Yet, there was significant fear that new hurdles looming in the near future would be insurmountable. America’s semiconductor industry had just instituted the roadmap process for planning and coordinating semiconductor development. As it turned out, this focused pre-competitive research and development as never before. Instead of hundreds of companies duplicating efforts, it allowed them to start from a common understanding. The result was an increased pace of technology development. At the same time, efforts to reinvigorate competitiveness led companies to adopt time-to-market measures of effectiveness. This naturally accelerated the pace of development. Also, the shift from mainframes and minicomputers to personal computers had dramatically altered the competitive landscape in the ’80s. IBM had quickly come to dominate the PC market in the early eighties. Their dominance might never have been challenged. However, on August 2, 1985, the IBM senior executives who had developed and run its PC business violated a major corporate rule and boarded Delta Air Lines Flight 191 to Dallas, Texas. The flight encountered wind-shear and crashed on landing. IBM’s understanding of how the market was evolving, as well as its leadership capability in the still-emerging PC market, perished with them. Unlike all earlier IBM computers, the PC had been designed with an open architecture, which meant it could be easily cloned. Outside of this small group, IBM really didn’t understand how to respond to the hordes of clone makers who had entered the market. The clone horde’s slash-and-burn strategies applied to pricing and self-obsolescence made IBM’s defenses about as useful as France’s Maginot line in WWII. As IBM lost its leadership, the pace of technical change accelerated again to limits set primarily by technical developments.
At the 1995 SIA (Semiconductor Industry Association) forecast dinner, Gordon Moore gave a retrospective on thirty years of Moore’s Law. He claimed to be more surprised than anyone that the pace of integration had kept up for so long. He had given up on trying to predict its end, but commented that it was an exponential and all exponentials had to end. While it did not stop the standing ovation he received, he concluded that “This can’t go on indefinitely – because by 2050 . . . we’re everything.”

1.4 The Microeconomics of Moore’s Law

The essential economic statement of Moore’s law is that the evolution of technology brings more components and thus greater functionality for the same cost. It is this cost reduction that is largely responsible for the exponential growth in transistor production over the years. Lower cost of production has led to an amazing ability to not only produce transistors on a massive scale, but to consume them as well. According to the Semiconductor Industry Association, production in 2000 alone amounted to 42% of the total

[Figure 1.2 plot: worldwide transistor production in units (logarithmic axis, 1E+05 to 1E+18) versus year (1954 to 2000). Source: Semiconductor Industry Association]

Fig. 1.2. World wide transistor production (merchant producers only)

transistors ever produced. This is not an anomaly of the millennial boom. In any given year since the industry’s birth, SIA data [1.14] shows that production of transistors has averaged 41% of the cumulative total ever produced up until then. So what makes Moore’s Law work? The law itself only describes two variables in the equation: transistor count and cost. Behind these are the fundamental technological underpinnings that drive these variables and make Moore’s Law work. There are three primary technical factors that make Moore’s Law possible: reductions in feature size, increased yield, and increased packing density. The first two are largely driven by improvements in manufacturing and the last largely by improvements in design methodology. Design methodology changes have been significant over the years. They have come as both continuous and step function improvements. The earliest step function improvements were the reduction in transistor counts to store


Table 1.2. Integration scale measures used in the 1970’s

Acronym   Integration Scale               Transistors
SSI       Small Scale Integration         < 100
MSI       Medium Scale Integration        101–1000
LSI       Large Scale Integration         1,001–10,000
VLSI      Very Large Scale Integration    > 100,000

memories. The industry started with 6-transistor memories. In the late sixties, designers quickly figured how to reduce this to 4, then 2, and, finally, the 1-transistor/1-capacitor DRAM cell, developed by R.H. Dennard [1.15]. While this did not affect Moore’s Law as measured in transistors, it did when measured in bits, because a 4-Kbit memory (with a 6 T cell) needed 24-K transistors and could now be made with only 4-K transistors with a 1 T/ 1 capacitor cell. This was an enormous advance. Cost-per-bit plummeted and it further added to the mythical proportions of Moore’s Law, as customers saw little real difference between transistors and bits. What they were most interested in was reductions in cost-per-function and designers had delivered. There were less well known additions as well. The development of ComputerAided-Design (CAD) in the early eighties was a significant turning point. Now EDA (Electronic Design Aids), CAD’s first contribution was to prevent the ending of Moore’s Law. As the industry progressed from MSI to LSI levels of integration, the number of transistors to be wired together was becoming too large for humans to handle. Laying out the circuit diagram and cutting the rubylith5 for wiring 10,000 transistors together, with 3-connections each, by hand had to have been a daunting task. The 64 Kb DRAM loomed large in this picture as the decade turned. With just over 100 K transistors, it would be the first commercial VLSI grade chip produced in volume – and it was a point of hot competition. So being able to automate this process would be a great leap forward. The next step for design was to improve the layout capability of these tools. This improved the packing density. Early layout tools were fast, but humans could layout a circuit manually in 20–30% of the area. Today, no one would manually layout an entire circuit with millions of transistors. Even today, EDA tools do not offer the most efficient packing density. 
Designers who want the smallest die will ‘handcraft’ portions of a circuit. This is still

5 In those days, masks were made by drawing the circuit out on a large piece of paper that sometimes covered the floor of a large room. Then a laminated piece of Mylar called rubylith was used to make the initial mask pattern. Rubylith was made of two Mylar films, one clear and one red. A razor-edged knife was used to cut strips of the red Mylar away. Initially this was done by hand. One had to be careful not to cut the clear Mylar underneath so the red Mylar could be pulled away, leaving the final mask pattern. This pattern was then reduced to circuit dimensions to make a mask master.


G.D. Hutcheson

commonly done when a market is large and the die-size reduction can justify the cost of handcrafting. Performance improvements are another way that design has directly affected Moore’s Law. They are particularly important to the third version of Moore’s Law, which measures the gain in circuit performance over time. Scaling theory states that transistor switching speed increases at a rate that is inversely proportional to the reduction in physical gate length. However, a designer can improve on this by using the switching of transistors more efficiently. These transistors switch with the clock of the circuit. Early processor architectures required several clock cycles per instruction, so a 1-GHz clock might only perform at a rate of 300 MIPS (Millions of Instructions Per Second). Using techniques like parallel processing, pipelining, scalar processing, fractional clocks, etc., designers have systematically improved this so that three to five instructions per clock cycle can be achieved. Thus, a processor with a 1-GHz clock can exhibit run rates of 3000 to 5000 MIPS. Considering that 1 MIPS was state-of-the-art for a circa-1980 mainframe processor, subsequent architectural gains have been quite significant. Design tools are having further impacts today. One is their ability to improve testing. Without these tools, test costs would explode or, worse, the circuits would be untestable, making further gains in integration scale pointless. Another is to automatically lay out the patterns needed to make reticles with optical-proximity-correction and phase-shifting. This substantially reduces feature sizes. But it is important to realize that none of these gains would have been possible without ever-more powerful, cost-effective computers. All of these benefits were made possible by Moore’s Law. Hence, instead of running down, Moore’s Law is a self-fulfilling prophecy that runs up.
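The clock-rate arithmetic in this passage can be checked with a few lines of Python. This is only an illustrative sketch: the function name `mips` is hypothetical, and the figure of 0.3 instructions per clock is an assumption inferred from the 300 MIPS quoted for a 1-GHz clock, not a measurement of any particular processor.

```python
def mips(clock_hz: float, instructions_per_clock: float) -> float:
    """Return millions of instructions per second for a clock rate and an IPC value."""
    return clock_hz * instructions_per_clock / 1e6


ONE_GHZ = 1e9

# Assumed IPC of 0.3: several clock cycles are needed per instruction.
early_rate = mips(ONE_GHZ, 0.3)

# Three to five instructions per clock, as quoted in the text.
improved_low = mips(ONE_GHZ, 3)
improved_high = mips(ONE_GHZ, 5)

print(f"early architecture:    {early_rate:.0f} MIPS")
print(f"improved architecture: {improved_low:.0f} to {improved_high:.0f} MIPS")
```

At a fixed clock, the run rate scales directly with instructions per clock, which is why architectural changes alone can account for a tenfold or larger gain.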
Indeed, many of the manufacturing improvements since the eighties have come only because Moore’s Law had made computing power so cheap that it could be distributed throughout the factory and in the tools, as well as used to design the tools and even to do the engineering and economic analysis needed to make more efficient decisions. Reductions in feature sizes have made the largest contributions by far, accounting for roughly half of the gains since 1976. Feature sizes are reduced by improvements in lithography methods. These enable smaller critical dimensions (CDs, also known as Minimum Feature Size or MFS) to be manufactured. If the dimensions can be made smaller, then transistors can be made smaller and hence more can be packed into a given area. This is so important that Moore’s first paper relied entirely on it to explain the process. Improvements in lithography have been the most significant factor responsible for these gains. They have come from new exposure tools, resist processing tools and materials, and etch tools. The greatest change in etch tools was the transition from wet to dry etching. In etching, most of the older technology is still used today. Wet chemistries used for both etching and cleaning are the most prominent of these. Improvements in resist processing tools

1 The Economic Implications of Moore’s Law


[Figure: Minimum Dimension (nm) on a logarithmic scale (1.E+02 to 1.E+06) versus Year, 1957–2000]

Fig. 1.3. Forty years of critical dimension shrinks (in nanometers)

and materials have generally been incremental. Resist processing tools have remained largely unchanged from a physical perspective since they became automated. The changes have mostly been in incremental details changed to improve uniformity and thickness control. Resist chemistries have changed dramatically, but these are easy to overlook. Moreover etch and resist areas have relatively small effects on cost. Exposure tools have gone through multiple generations that followed the CD reductions. At the same time they have been the most costly tools and so generally garner the most attention when it comes to Moore’s Law. Moreover, without improvements in the exposure tool, improvements elsewhere would not have been needed. Exposure tools were not always the most costly tool in the factory. The camel’s hair brush, first used in 1957 to paint on hot wax for the mesa transistors, cost little more than 10 cents. But since that time prices have escalated rapidly, increasing roughly an order-of-magnitude every decade and a half. By 1974, Perkin–Elmer’s newly introduced projection aligner cost well over $100 K. In 1990, a state of the art i-line stepping aligner cost just over $1 M. At this writing in 2002, 193 nm ArF excimer laser scanning aligners are about to enter manufacturing. They cost on the order of $10 M. Over the decades, these cost increases have been consistently pointed to as a threat to the continuance of Moore’s Law. Yet, the industry has never hesitated to adopt these new technologies. It is testimony to the power of this law that these costs have been absorbed with little effect. Lithography tools have become more productive to offset these increases. Moreover, they are only part of the rising cost picture. The increase in the cost of semiconductor factories had been a recurring theme over the years. In


Table 1.3. Evolution of lithography technology used to manufacture semiconductors

Year first used    CD          Lithography technology                 Etch
in manufacturing   (microns)
1957               254.000     Camel’s Hair Brush, Hand Painting      Wet Etching
1958               127.000     Silk Screen Printer
1959                76.200     Contact Printer W/Emulsion Plates
1964                16.000     Contact Printer W/Chrome Plates
1972                 8.000     Proximity Aligner
1974                 5.000     Projection Aligner                     Barrel Plasma
1982                 2.000     g-line (436 nm) Stepper                Planar Plasma
1984                 1.500                                            Reactive Ion Etching
1988                 1.000                                            High Density Plasma
1990                 0.800     i-line (365 nm) Stepper
1997                 0.250     248 nm Scanner
2003                 0.100     193 nm Scanner

fact it was first noted in 1987 that there was a link between Moore’s Law and wafer fab costs [1.16]. Between 1977 and 1987, wafer fab costs had increased at a rate of 1.7× for every doubling of transistors. In the eighties, the cost of factories was offset primarily by yield increases. So rising tool costs were offset by lower die costs. However, this relationship stopped in the nineties, when the rise in tool prices began to be offset by increases in productivity. So as effective throughputs rose, the unit volumes of tools in a fab needed to produce the same number of wafers declined. Nevertheless, this may change in the next decade. Tool productivity is not rising as significantly, 300 mm fabs are significantly more expensive, and significant technical challenges remain. Moore’s Law governs the real limit to how fast costs can grow. According to the original paper given in 1965, the minimal cost of manufacturing a chip should decrease at a rate that is nearly inversely proportional to the increase in the number of components. So the cost per component, or transistor, should be cut roughly in half for each tick of Moore’s clock [see (1.1) and (1.2) above]. However, since this paper was first given, it has generally been believed that industry growth will not be affected if the cost per function drops by at least 30% for every doubling of transistors. This 30% drop would allow the manufacturing cost per unit area of silicon to rise by 40% per node of Moore’s Law (or by twice the cost-per-function reduction ratio requirement) [Appendix A]. This includes everything from the fab cost to materials and labor. However, it does not take yield or wafer size into account. Thus, if cost per function needs to drop by 30% with each node, wafer costs can also increase by 40%, assuming no yield gains. Yield is a function of die size and so is directly dependent on component counts and CD reductions.
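The link between a 30% drop in cost per function and a 40% allowable rise in cost per unit area follows from one line of arithmetic, sketched below in Python. The function name `allowed_area_cost_growth` is hypothetical, and the sketch assumes, as the text does, that component density exactly doubles per node and that yield and wafer size are unchanged.

```python
def allowed_area_cost_growth(cost_per_function_drop: float,
                             density_gain: float = 2.0) -> float:
    """Fractional rise in cost per unit area of silicon that still delivers the
    required drop in cost per function, given the density gain per node."""
    # New cost per area = (new cost per function) x (functions per unit area).
    return (1.0 - cost_per_function_drop) * density_gain - 1.0


# A 30% drop in cost per function with a doubling of density.
print(f"{allowed_area_cost_growth(0.30):.0%} allowable rise in cost per unit area")

# Halving the cost per component leaves no room for cost per area to grow.
print(f"{allowed_area_cost_growth(0.50):.0%} allowable rise in cost per unit area")
```

The second case corresponds to the 1965 formulation, in which cost per component halves with each doubling and cost per unit area therefore stays flat.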


[Figure: Cost in $B on a logarithmic scale (1 to 10000) versus Year, 1966–2002]

Fig. 1.4. Wafer fab cost trends

There are many equations for calculating yield, the most basic of which is the classic Poisson probability exponential function: $Y = \exp(-ADN)$, where $A$ = Die Area, $D$ = Defect Density per Mask Layer, and $N$ = Number of Masks. Note that this equation also accounts for increased process complexity as component counts rise. It would seem that this effect would be the most significant cost reducing factor. In the early days of the industry it was. In the seventies, yield typically started at 15% when a device was introduced and peaked at 40% as the device matured. Gains of two to three times were typical over the life of a chip, and four to five times was not uncommon for devices that had long lives. Improvement in manufacturing methods pushed this up dramatically during the eighties and nineties. This was primarily due to better equipment and cleanroom technology. For example, the switch to VLSI equipment technology such as steppers and plasma etchers caused initial yields for the 64 K bit DRAM to come in at 40% in 1982. It matured at 75% three years later. Today, devices typically enter production at 80%, rise to 90% within six months, and can achieve close to 100% at maturity. But at best the gain is only a quarter of initial yields. Wafer size has been another cost reducing factor used over the years. Larger wafers have typically cost only 30% more to process and yet have had an areal increase of 50 to 80 percent. The move to 300 mm wafers from 200 mm will yield an areal increase of 125%! Larger wafers have always brought a yield bonus, because of their larger “sweet spot” (yielding a relatively larger number of good chips on the inner regions of the wafer) and the fact that


they require a new generation of higher-performing equipment. Like the sweet spot of a tennis racket, wafers tend to have the lowest defect density at their centers and the highest at their edges, where they are handled most. Also, process chamber uniformity tends to suffer the most at the edges of the wafer. There are also gains in manufacturing efficiency that occur over time. The result is a continued decrease in manufacturing costs per die. However, the continuation of Moore’s Law via reduction in CDs, increased yields, larger wafer sizes, and manufacturing improvements has taken its toll in other areas. Costs have risen significantly over time, as seen in the rise of wafer fab costs. Moreover, the CD reductions have caused a need for increasing levels of technical sophistication, with the resultant costs. For example, the camel’s hair brush used for lithography in the fifties cost only 10 cents; early contact aligners cost $3000 to $5000 in the sixties and $10,000 by the seventies; a projection aligner in the late seventies cost $250 K; and the first steppers cost $500 K in the early eighties. Today, a 193 nm stepper (using an ArF excimer laser) costs $10 M. That is an increase of eight orders of magnitude in 45 years. Moreover, the cost increases are prevalent throughout the fab. Increased speeds have forced a transition from aluminum to copper wiring. Also, silicon dioxide insulation no longer works well when millions of transistors are switching at two gigahertz, necessitating a switch to inter-level dielectrics with lower permittivity. At the gate level, silicon dioxide will no longer be useful as a gate dielectric. Scaling has meant that layers fewer than ten atoms thick will be used, and it will not be long before they fail to work well. The solution is to replace them with high-k dielectrics so that physical thicknesses can be increased, even as the electrical thickness decreases. These new materials are also causing costs to escalate.
An evaporator, which could be bought for a few thousand dollars in the early seventies, now costs four to five million dollars. Even diffusion furnaces cost a million dollars per tube. As costs have risen, so has risk. So there has been a tendency to over-spec requirements to ensure a wide safety margin. This has added to cost escalation. At some point the effect of these technologies translating into high costs will cause Moore’s Law to cease. As Gordon Moore has put it, “I’ve learned to live with the term. But it’s really not a law; it’s a prediction. No exponential runs forever. The key has always been our ability to shrink dimensions and we will soon reach atomic dimensions, which are an absolute limit.” But the question is not if, but when, Moore’s wall will appear. “Who knows? I used to argue that we would never get the gate oxide thickness below 1000 angstroms and then later 100. Now we’re below 10 and we’ve demonstrated 30 nm gate lengths. We can build them in the 1000’s. But the real difficulty will be in figuring out how to uniformly build ten’s of millions of these transistors and wire them together in one chip” [1.17]. Nevertheless, it is more likely that economic barriers will present themselves before technical roadblocks stop progress [1.18].
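The Poisson yield model and the wafer-size arithmetic used earlier in this section can be sketched in Python as follows. The function names are hypothetical, and the die area, defect density, and mask count in the example are illustrative assumptions chosen only to exercise the formula; they are not figures from the text. Only the 200 mm and 300 mm wafer diameters come from the text.

```python
import math


def poisson_yield(die_area: float, defect_density: float, mask_count: int) -> float:
    """Classic Poisson yield, Y = exp(-A * D * N).

    die_area is A, defect_density is D (defects per unit area per mask layer,
    in units consistent with A), and mask_count is N.
    """
    return math.exp(-die_area * defect_density * mask_count)


def wafer_area_gain(old_diameter_mm: float, new_diameter_mm: float) -> float:
    """Fractional increase in wafer area when moving to a larger diameter."""
    return (new_diameter_mm / old_diameter_mm) ** 2 - 1.0


# Illustrative (assumed) values: 1 cm^2 die, 0.01 defects/cm^2 per layer, 20 masks.
print(f"yield: {poisson_yield(1.0, 0.01, 20):.1%}")

# Doubling the die area under the same assumptions lowers the yield.
print(f"yield at twice the die area: {poisson_yield(2.0, 0.01, 20):.1%}")

# Moving from 200 mm to 300 mm wafers, as in the text.
print(f"areal increase: {wafer_area_gain(200, 300):.0%}")
```

Because area scales with the square of the diameter, the 1.5× diameter step gives 2.25× the area, which is the 125% increase quoted above.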


1.5 The Macroeconomics of Moore’s Law

Moore’s Law was more than a forecast of an industry’s ability to improve; it was a statement of the ability of semiconductor technology to contribute to economic growth and even the improvement of mankind in general. This has a far richer history than the development of semiconductors, which, to some extent, explains why Moore’s Law was so readily accepted. This history also explains why there has been an insatiable demand for more powerful computers no matter what people have thought to the contrary. The quest to store, retrieve, and process information is one task that makes humans different from other animals. The matriarch in a herd of elephants may be somewhat similar to the person in early tribes who memorized historical events by song. But no known animal uses tools to store, retrieve, and process information. Moreover, the social and technological progress of the human race can be directly traced to this attribute. More recent writers have pointed to this as a significant driving force in the emergence of Western Europe as the dominant global force in the last millennium [1.19]. Man’s earliest attempts to store, retrieve, and process information date back to prehistoric times, when humans first carved images in stone walls. Then, in ancient times, Sumerian clay tokens developed as a way to track purchases and assets. By 3000 B.C. this early accounting tool had developed into the first complete system of writing on clay tablets. Ironically, these were the first silicon based storage technologies and would be abandoned by 2000 B.C. when the Egyptians developed papyrus based writing materials. It would take almost four millennia before silicon would stage a comeback as the base material, with the main addition being the ability to process stored information. In 105 A.D. a Chinese court official named Ts’ai Lun invented wood-based paper.
But it wasn’t until Johann Gutenberg invented the movable type printing press around 1436 that books could be reproduced cost effectively in volume. The first large book was the Gutenberg Bible, published in 1456. So something akin to Moore’s Law occurred, as Gutenberg went from printing single pages to entire books in 20 years. At the same time, resolution also improved, allowing finer type as well as image storage. Yet, this was primarily a storage mechanism. It would take at least another 400 years before retrieval would be an issue. In 1876, Melvil Dewey published his classification system that enabled libraries to store and retrieve all the books that were being made by that time. Alan Turing’s “Turing Machine”, first described in 1936, was the step that would make the transformation from books to computers. So Moore’s Law can be seen to have a social significance that reaches back more than five millennia. The economic value of Moore’s Law is also understated, because it has been a powerful deflationary force in the world’s macro-economy. Inflation is a measure of price changes without any qualitative change. So if price per function is declining, it is deflationary. Interestingly, this effect has never been accounted for in the national accounts that measure inflation adjusted


[Figure: Units Shipped, Bits, and Sales ($) on a logarithmic scale (1.0E+05 to 1.0E+18) versus Year, 1971–2001]

Fig. 1.5. DRAM market history

gross domestic product (GDP). The main reason is that if it were, it would overwhelm all other economic activity. It would also cause productivity to soar far beyond even the most optimistic beliefs. This is easy to show, because we know how many devices have been manufactured over the years and what revenues have been derived from their sales. Probably the best set of data to use for analyzing the economic impact of Moore’s Law is DRAM production. It is exceptionally good because it can easily be translated into a universal measure of value to a user: memory bits. Transistors are also a good measure, but in economic terms customers don’t really buy them; they buy bits. Moreover, the industry was very efficient in the seventies and eighties at designing bits with fewer transistors. Data for worldwide sales, unit shipments, and the resultant bits delivered are shown in Fig. 1.5 and the DRAM price in Fig. 1.6. These data include both merchant and captive production and so are a complete measure of industry production. Take these data and calculate the price per bit and the results are stunning: back in 1971, when the first DRAM was introduced, the price of a megabit of memory was $12,695. Thirty years later, in 2001, the price for a megabit of memory averaged a mere 26 cents. The DRAM market grew from $130 M to


[Figure: Dollars on a logarithmic scale (1.0E-02 to 1.0E+05) versus Year, 1971–2001]

Fig. 1.6. DRAM price per megabit history

just over $11 B in the same timeframe. So, 2000’s market adjusted for inflation would be $5328.9 T, or just over one-hundred times Gross World Product. Moreover, that doesn’t include the value of all semiconductors! So it is hard to overstate the long term economic impact of the semiconductor industry.
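The pace of the price decline implied by the two DRAM price points quoted above ($12,695 per megabit in 1971 and 26 cents in 2001) can be estimated with a short Python sketch. It assumes a constant compound rate between the two endpoints, which is a simplification; the actual year-to-year path in Fig. 1.6 is not reproduced here, and the function name `compound_annual_rate` is hypothetical.

```python
import math

PRICE_PER_MEGABIT_1971 = 12_695.00  # dollars, from the text
PRICE_PER_MEGABIT_2001 = 0.26       # dollars, from the text
YEARS = 2001 - 1971


def compound_annual_rate(start: float, end: float, years: int) -> float:
    """Constant annual rate of change that takes start to end over the given years."""
    return (end / start) ** (1.0 / years) - 1.0


rate = compound_annual_rate(PRICE_PER_MEGABIT_1971, PRICE_PER_MEGABIT_2001, YEARS)

# Years needed for the price to halve at that constant rate.
halving_years = math.log(0.5) / math.log(1.0 + rate)

print(f"average annual price change: {rate:.1%}")
print(f"price halves about every {halving_years:.1f} years")
```

Under this constant-rate assumption the price per megabit falls by roughly 30% a year, halving about every two years.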

1.6 Moore’s Law Meets Moore’s Wall: What is Likely to Happen

Moore’s Law meets Moore’s Wall and then the show stops, or the contrary belief that there will be unending prosperity in the 21st Century buoyed by Moore’s Law, have been recurring themes in the media and technical community since the mid-seventies. The pessimists are often led by conservative scientists who have the laws of physics to stand behind. The optimists are usually led by those who cling to ‘facts’ generated by linear extrapolation. The problem with the optimists is that the issues that loom are not easily amenable to measurement by conventional analysis. Eventually real barriers emerge to limit growth with any technology. Moreover, as Gordon himself has often quipped, “No exponential goes on forever.” But so far, the optimists have been right. The problem with the pessimists is that they typically rely too much on known facts and do not allow for invention. They don’t fully account


for what they don’t know, leaving out the ‘what they don’t know’ pieces when assembling the information puzzle. Yet it is the scientific community itself that expands the bounds of knowledge and extends Moore’s Law beyond what was thought possible. History is replete with many really good scientists and engineers who have come up with new things to constantly expand the boundaries of our knowledge and as noted above, this is not likely to stop. When anyone asks me about Moore’s Wall, my first response is “Moore’s Wall is in Santa Clara, just outside Intel’s Robert Noyce building. If you look close, you will find the engraved names of people who made career limiting predictions for the end of Moore’s Law.” This has certainly been the case for those who have predicted the coming of Moore’s Wall in a five or ten year span over the years. Yet, Moore himself said in 1995 that the wall should be finished and in place somewhere around 2040, when he poignantly pointed out that otherwise, “we’ll be everything” if things continue at historical growth rates. Herein lies the real dilemma. If our industry continues to grow unbounded, it really will become as large as the global economy in the first half of the 21st Century. This leads to the historical view that as this occurs our industry’s growth will become bounded by macroeconomic growth. However, if you look at history, it dispels this idea. At the beginning of this millennium rapid advances in agricultural techniques did not slow economic growth. Instead, they buoyed it as they freed up human resources to work on other things which, in turn kicked off the High Middle Ages. Ultimately, this made possible the industrial age in the latter part of the millennium. As industry grew to be a larger part of the economy it did not slow to the 1% annual economic growth of agricultural economies. While it did slow, it also pushed economic growth up to an average of about 3%. 
Mechanized transportation allowed centralized manufacturing, so factories could achieve greater economies of scale. This combined with the mechanization of the factory and greatly improved productivity; thus allowing greater non-inflationary growth levels. Since the latter half of the nineties, the United States has been able to achieve regular non-inflationary growth of 4 to 5%. It is non-inflationary because of productivity gains. These gains are made possible by information technology. Another factor driving the non-inflationary growth potential of the economy is that information technology tends to be energy saving as well. One of the real limits to the agricultural age was the fact that the primary fuel was wood. Entire forests were decimated in the Middle East and then Greece and Italy. The industrial age was prompted with the discovery of fossil fuels. This stopped deforestation to a great degree, but from an economic perspective, it also allowed for greater growth potential. Fossil fuels were easier to transport and use, so they too increased productivity. This combined with the ability to transport materials to centralized manufacturing locations and then back out with trains, led to massive improvements in productivity. The information age takes the next step and relies on electricity. More importantly, it replaces the need to transport people, materials, and products with information. For example, video teleconferencing allows people to meet without


traveling great distances. The voice and image information at both ends is digitized into information packets and sent around the world so that people can communicate without being physically near. At the same time, products can be designed in different places around the world and the information sent on, so that products can be produced in low-cost areas or, where transportation costs are high, locally. For example, it is now a daily event for semiconductors designed in the United States, in close cooperation with a customer in Europe, to have their designs sent over the Internet to Texas for the reticles to be made and to California for the test programs, then for the wafers to be made in Taiwan and packaged in Korea, and finally for the chips to be shipped to the customer in Europe. In the case of beer, transporting liquids is far too expensive. So a company in Europe can license its process to brewers in the United States and Japan, where the beer is manufactured locally. Using the Internet, the original brewer can monitor production and quality with little need to leave the home factory. So, the productivity effect seen in the transition from the agricultural to the industrial age is really happening as we move into the information age. It can be argued that macroeconomic growth could rise to as high as 8% while creating a similar growth cap for our industry. What happens when this occurs? It is inevitable that the semiconductor industry’s growth will slow from the fifteen to twenty percent range it has averaged over its history in the last half of the twentieth century. The barriers that will limit it will be economic, not technical, as Moore’s Law is a statement of powerful economic forces [1.18]. The reason is that technology barriers first show up as rising costs that go beyond the bounds of economic sense. Transportation speed limits could exceed the speed of sound. But economic limits make private jet ownership unattainable for all but a very few.
Economic limits make the automobile the most commonly used vehicle in major industrialized countries and the bicycle in others. But even here, the economic limits of building infrastructure limit average speed to less than 20 MPH in industrial countries (which is one reason why the bicycle has become such a popular alternative). If we look to the auto industry for guidance, similar declines in cost during its early years can be found. At the turn of the century, cars were luxury items, which typically sold for $20 K. They were the mainframes of their day, and only the ultra-rich could afford them. Henry Ford revolutionized the auto industry with the invention of the assembly line. Ford’s efforts resulted in a steady reduction in costs, quickly bringing the cost of manufacturing a car to under $1000. But even Ford’s ability to reduce costs had bottomed out by 1918, when the average hit a low of $204.96. While these efforts pale in comparison to gains made in semiconductors, the lesson to be learned is that cost gains made by pushing down one technical river of thought will eventually lead to a bottom, after which costs rise. Science and engineering can only push limits to the boundaries of the laws of physics. Costs begin to escalate as this is done because the easy problems are solved and making the next advance is more difficult. At some point, few gains can be made by taking the next step, but the cost is astronomical. In

[Figure: Dollars per unit sold (0 to 900) versus Year, 1904–1922; series: Average Price, Average Cost, Average Profit]

Fig. 1.7. Ford motor company’s equivalent to Moore’s law (the early years of the auto industry)

the case of autos, the gains were made by the development and improvement of assembly line technology. In the case of semiconductors it has largely been lithography where the gains were made. These are not ‘economies of scale’ as taught in most economics classes, where increased scale drives cost down to a minimum, after which costs rise. Instead, technology is driving cost. These economies of technology are one of the most important underlying factors that make Moore’s Law possible and will ultimately result in its demise when gains can no longer be made. Similar things are happening in semiconductors. Fab equipment prices have risen steadily at annual rates above 10%. This was fine as long as yields rose, giving added economic boost to the cost of steadily shrinking transistors to stay on Moore’s Curve. But yields cannot go up much further, so gains will have to come from productivity improvements. It is important to note that hitting these economic barriers does not mean the end of the semiconductor industry. The industry has lived with Moore’s Law so long that it is almost a matter of faith, as exemplified in the term ‘Show Stopper.’ The term has been used extensively to highlight the importance of potential limits seen in the industry’s ‘road mapping’ of future technologies. Yet it is unlikely that the show will stop when the show stoppers are finally encountered. Just think of the alternatives. Moreover, the


auto industry has been quite healthy in the eight decades since it hit its show stoppers. People did not go back to horses as a means of regular transport. As the gains from automation petered out, auto manufacturers shifted their emphasis from low-cost one-size-fits-all vehicles to many varieties, each with distinct levels of product differentiation. The other hallmarks of the industrial age, trains and planes, also found ways to go on after they hit technical and economic limits. For this to happen in semiconductors, manufacturing will have to be more flexible and design will continue to become more important.

1.7 Conclusion

Moore’s Law has had an amazing run as well as an unmeasured economic impact. While it is virtually certain that we will face its end sometime in this century, it is extremely important that we extend its life as long as possible. However well these barriers may be ultimately expressed economically, barriers to Moore’s Law have always been overcome with new technology. It may take every ounce of creativity from the engineers and scientists who populate this industry to do this, but they have always been up to the task. In fact, this is one reason why this book on high-k gate dielectrics was initiated. Moore’s Law is predicated on shrinking critical features. But we are fast approaching the limits of what can be done by scaling silicon-dioxide gate dielectrics, necessitating the introduction of high-k gate dielectric materials. So what advice would Gordon give us? I had the chance to ask him just that during the process of putting together this chapter. It was on the day he entered retirement [1.20]. One thing he wanted to point out was that he never liked the term Moore’s Law: “I’ve learned to live with the term. But it’s really not a law; it’s a prediction. No exponential runs forever. The key has always been our ability to shrink dimensions and we will soon reach atomic dimensions, which are an absolute limit.” But the question is not if, but when, Moore’s wall will appear. “Who knows? I used to argue that we would never get the gate oxide thickness below 1000 angstroms and then later 100. Now we’re below 10 and we’ve demonstrated 30 nm gate lengths. We can build them in the 1000’s. But the real difficulty will be in figuring out how to uniformly build ten’s of millions of these transistors and wire them together in one chip.” The key is to keep trying. He feels that any solution has to champion manufacturing because “there is no value in developing something that cannot be built in volume.
Back at Fairchild the problem was always in getting something from research into manufacturing. So at Intel we merged the two together.”

“Always look for the technical advantage (in cost). I knew we could continue to shrink dimensions for many years, which would double complexity for the same cost. All we had to do was find a product that had the volume to drive our business. In the early days that was memories. We knew it was time to get out of memories when this advantage was lost. The argument at the time was that you had to be in memories because they were the technology driver. But we saw that DRAMs were going off in a different technical direction because problems in bit storage meant they had to develop all these difficult capacitor structures.”

He also pointed to the need to avoid dependency on specific products. “I’ve never been good at forecasting. I’ve been lucky to be in the right place at the right time and know enough to be able to take advantage of it. I always believed in microprocessors but the market wasn’t big enough in the early days. Ted Hoff showed that microprocessors could be used for calculators and traffic lights and the volume could come in what we now call embedded controllers. I continued to support it despite the fact that for a long time the business was smaller than the development systems we sold to implement them. But just when memories were going out, microprocessors were coming into their own. Success came because we always sought to use silicon in unique ways.”

So what did Gordon have to say about his contribution and the future of our industry? “I helped get the electronics revolution off on the right foot . . . I hope. I think the real benefits of what we have done are yet to come. I sure wish I could be here in a hundred years just to see how it all plays out.”

The day after this discussion with Gordon, I knew it was the first day of a new era, one without Gordon Moore’s oversight. I got up that morning half-wondering if the sun would rise again to shine on Silicon Valley. It did, reflecting Gordon Moore’s ever-present optimism for the future of technology. So, too, has Moore’s Law continued to plug on, delivering benefits to many who will never realize the important contributions of this man and his observation.

1.8 Appendix A

Moore’s law governs the real limit to how fast costs can grow. Starting with the basic equations from above, the optimal component density for any given period is

$$C_t = 2\,C_{t-1},$$

where $C_t$ = component count in period $t$ and $C_{t-1}$ = component count in the prior period. (Please also note that the “−1” here and below is symbolic in nature, marking the prior period, and is not used mathematically.)

According to the original paper given in 1965, the minimal cost of manufacturing a chip should decrease at a rate that is nearly inversely proportional to the increase in the number of components. So the cost per component, or transistor, should be cut roughly in half for each tick of Moore’s clock:

$$M_t = \frac{M_{t-1}}{2} = 0.5\,M_{t-1},$$

where $M_t$ = manufacturing cost per component in period $t$ and $M_{t-1}$ = manufacturing cost per component in the prior period. However, since this paper was first given, it has generally been believed that industry growth will not be affected if the cost per function drops by at least 30% for every doubling of transistors. Thus:

$$M_t = 0.7\,M_{t-1}.$$

Since

$$M_t = \frac{\mathit{Tdc}_t}{C_t} \quad\text{and}\quad M_{t-1} = \frac{\mathit{Tdc}_{t-1}}{C_{t-1}},$$

where $\mathit{Tdc}_t$ = total die cost in period $t$ and $\mathit{Tdc}_{t-1}$ = total die cost in the prior period. Thus,

$$\frac{\mathit{Tdc}_t}{C_t} = \frac{0.7\,\mathit{Tdc}_{t-1}}{C_{t-1}},$$

$$\frac{\mathit{Tdc}_t}{2\,C_{t-1}} = \frac{0.7\,\mathit{Tdc}_{t-1}}{C_{t-1}},$$

$$\mathit{Tdc}_t = \frac{2\,C_{t-1}\cdot 0.7\,\mathit{Tdc}_{t-1}}{C_{t-1}}.$$

Simplified, it reduces to

$$\mathit{Tdc}_t = \frac{2\cdot 0.7\cdot C_{t-1}\,\mathit{Tdc}_{t-1}}{C_{t-1}},$$

$$\mathit{Tdc}_t = 1.4\,\mathit{Tdc}_{t-1}.$$

If the cost-per-function reduction ratio is different from 0.7, then:

$$\mathit{Tdc}_t = 2\,\mathit{Cpfr}\,\mathit{Tdc}_{t-1},$$

where $\mathit{Cpfr}$ = cost-per-function reduction ratio for every node as required by the market. So in general, the manufacturing cost per unit area of silicon can rise by 40% per node of Moore’s law (or by twice the cost-per-function reduction ratio requirement). This includes everything from the fab cost to materials and labor. However, it does not take yield or wafer size into account. Adding these two:

$$\mathit{Twc}_t = 2\,\mathit{Cpfr}\,\mathit{Twc}_{t-1}.$$

So,

$$\mathit{Tdc}_t = \frac{2\,\mathit{Cpfr}\,\mathit{Twc}_{t-1}}{W\,\mathit{Dpw}_{t-1}\,\mathit{Yr}\,Y_{t-1}} = \frac{\mathit{Twc}_t}{\mathit{Dpw}_t\,Y_t},$$

where $\mathit{Twc}_t$ = total wafer cost requirement in period $t$, $\mathit{Twc}_{t-1}$ = total wafer cost in the prior period, $\mathit{Dpw}_t$ = die-per-wafer in period $t$, $Y_t$ = yielded die-per-wafer in period $t$, $W$ = ratio of die added with a wafer size change, $\mathit{Dpw}_{t-1}$ = die-per-wafer in the prior period, $\mathit{Yr}$ = yield reductions due to improvements with time, $Y_{t-1}$ = yielded die-per-wafer in the prior period.
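As a rough numerical illustration of the appendix, the Python sketch below steps the relations Tdc_t = 2 * Cpfr * Tdc_(t-1) and M_t = Tdc_t / C_t through a few nodes. The function names, the starting die cost of 10.0 (arbitrary units), and the starting count of one million components are assumptions made for this example only; they are not taken from the text.

```python
def allowed_die_cost(prev_die_cost: float, cpfr: float = 0.7) -> float:
    """Die cost the market tolerates in the next period.

    Applies Tdc_t = 2 * Cpfr * Tdc_(t-1): the component count doubles while
    the cost per component must fall to `cpfr` times its previous value.
    """
    return 2.0 * cpfr * prev_die_cost


def cost_per_component(die_cost: float, components: float) -> float:
    """M_t = Tdc_t / C_t."""
    return die_cost / components


die_cost = 10.0      # hypothetical starting die cost, arbitrary units
components = 1.0e6   # hypothetical starting component count

for node in range(4):
    m = cost_per_component(die_cost, components)
    print(f"node {node}: die cost {die_cost:.2f}, cost per component {m:.3e}")
    die_cost = allowed_die_cost(die_cost)  # die cost may rise by 40% per node
    components *= 2.0                      # C_t = 2 * C_(t-1)
```

With the default ratio of 0.7, each pass multiplies the die cost by 1.4 while the cost per component falls by 30%, which is the trade-off the appendix describes.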

References

1.1. G.E. Moore, “Lithography and the Future of Moore’s Law,” SPIE Volume 2440, 0-8194-1799, 2/1995
1.2. G.E. Moore, “The Future of Integrated Electronics,” Fairchild Semiconductor, 1965. This was the original internal document from which Electronics magazine would publish “Cramming more components into integrated circuits,” in its April 1965 issue celebrating the 35th anniversary of Electronics.
1.3. G.E. Moore, “Progress in Digital Integrated Electronics,” IEDM, 1975
1.4. K. Marx, “Capital,” Progress Publishers, Moscow 1978, Chap. 15, Sect. 2
1.5. J.S. Mill, “Principles of Political Economy,” London 1848
1.6. G. Ip, “Did Greenspan Push High-Tech Optimism On Growth Too Far?,” The Wall Street Journal, December 28, 2001, pp. A1, A12
1.7. H.R. Huff, “John Bardeen and Transistor Physics,” Characterization and Metrology for ULSI Technology 2000, AIP Conference Proceedings 550, pp. 3–29 (2001)
1.8. G.D. Hutcheson, “The Chip Insider,” VLSI Research Inc., September 17, 1998
1.9. “Scientific American Interview: Gordon Moore,” Scientific American, September 1997
1.10. W.R. Runyan and K.E. Bean, “Semiconductor Integrated Circuit Processing Technology,” Addison-Wesley Publishing Company, 1990, p. 18; also C.E. Sporck, “Spinoff,” Saranac Lake Publishing, 2001, Chap. 3, pp. 53–56
1.11. C.E. Sporck, “Spinoff,” Saranac Lake Publishing, New York 2001, pp. 181–185
1.12. Private conversations with the author
1.13. Conversations with Howard Moss, a Board Member of Texas Instruments at the time, 1985
1.14. Semiconductor Industry Association Data Archives, 2000
1.15. R.H. Dennard, IBM, “Field-Effect Transistor Memory,” US Patent 3,387,286, issued June 4, 1968
1.16. G.D. Hutcheson, “The VLSI Capital Equipment Outlook,” VLSI Research Inc., 1987
1.17. G.D. Hutcheson, “The Chip Insider,” VLSI Research Inc., May 25, 2001
1.18. G.D. Hutcheson and J.D. Hutcheson, “Technology and Economics in the Semiconductor Industry,” Scientific American, January 1996, pp. 54–62
1.19. J. Diamond, “Guns, Germs, and Steel,” W.W. Norton and Company, 1997, Chap. 12
1.20. Personal conversations between Dr. Gordon Moore and the author, May 24, 2001

Part I

Classical Regime for SiO2

2 Brief Notes on the History of Gate Dielectrics in MOS Devices E. Kooi† and A. Schmitz

2.1 Early Attempts to Make Insulating-Gate Field-Effect Transistors; Surface States

The earliest references to field-effect devices resembling today’s MOS transistors may be found in patents issued to Lilienfeld (1926) and Heil (1934). However, the devices described in their patents may never have been reduced to practice, and at the time silicon was certainly not envisioned to be a suitable semiconductor material. The invention of the bipolar transistor at Bell Labs followed attempts to create thin-film semiconductor field-effect devices. These attempts failed due to trapping of induced charge in “surface states,” energy states for electrons between the valence and conduction band at the surface of the semiconducting material. Surface states remained a concern when bipolar transistors were produced, as both germanium and silicon transistors exhibited non-ideal behavior and instabilities due to time-dependent surface phenomena. This led to considerable research activities aimed at understanding and improving germanium and silicon surfaces.

Laboratory experiments included fundamental studies on “clean” surfaces made in high vacuum. In such instances, the surface states were related to the structure of the semiconductor surface, including “dangling bonds” of germanium or silicon. The electrical effects of surface states were considered to be two-fold: (1) trapping of electrons and holes present or induced near the semiconductor surface, and (2) increased speed of recombination of holes and electrons. Many trial-and-error studies were made to find the surface treatments that gave diodes or transistors the best electrical properties and stability. In field-effect experiments, practical surfaces usually exhibited both “fast” and “slow” (times up to hours or longer) surface trapping effects. The fast surface states were presumed to be present directly at the semiconductor surface, while the slow surface states were attributed to surface films, being located within the bulk of the film or on its outer surface.



† Deceased, September 14, 2001


2.2 Passivation of Silicon Surfaces by Thermal Oxidation; Planar Transistor Technology

A breakthrough came with the work of Atalla et al. [2.1], published in 1959, in which “a completely new kind of silicon surface” was established by thermal oxidation of silicon. For the first time a practical semiconductor surface appeared not to exhibit slow surface state effects. PN junction reverse leakage currents were found to be reduced considerably, though interface states were still present. Atalla et al. distinguished acceptor-type states, thought to be due to structural imperfections, and donor-type states, attributed to impurities, such as gold, gettered from the bulk silicon at the silicon surface. Atalla’s work marks the start of innumerable studies of thermally oxidized silicon surfaces, which led to major improvements of the properties and stability of silicon devices and within a few years also to the making of reliable MOS devices.

Early germanium and silicon bipolar transistors were usually made on wafers with (111) crystal planes as surfaces. One reason for this was that many early transistors were made by alloying metals to the semiconductor surface. The alloy process would best produce flat junctions on (111) oriented wafers, as the (111) plane is atomically the densest plane in these crystals. Making transistors by introducing donor and acceptor elements into the silicon surface in high-temperature gaseous environments by the diffusion process was a major step forward in making transistors with well-controlled base widths. In 1957 Frosch and Derick [2.2] of Bell Labs showed that under certain conditions silicon oxide films on the silicon surface could be used to prevent such introduction of dopants. Patterns of masking silicon oxide on the surface thus made it possible to dope certain surface regions selectively, with emitter and base areas defined by openings made in the masking oxide film.

In the “planar technology,” started at Fairchild Semiconductor, the oxide masking technique published by Frosch and Derick is combined with the above-mentioned finding of Atalla et al. that oxide coatings can improve the properties of silicon PN junctions. Fairchild owned important patents on the planar technology; a patent on the basic planar technique was filed by Hoerni [2.3] in 1959, and a patent on using the planar technology to make integrated circuits was filed by Noyce [2.4] in the same year. Initially planar diodes, bipolar transistors, and integrated circuits were made mainly on N-type base material or epitaxial layers. The characteristics of planar structures on P-type material tended to be disturbed by the presence of an N-type inversion layer at the oxidized silicon surface. This caused increased leakage currents, which could be lessened by making highly doped surface regions (guard rings) around the active devices. The tendency of oxidized silicon to show N-type surfaces was the subject of much speculation. The N-type surface skins went away when the oxide was etched off, so the effect had to be related to some donor action of the oxide near the surface, i.e. positive oxide-related charges induced compensating negative charges in the silicon. Arduous research was needed to understand such phenomena. MOS transistors and capacitors soon became important measurement tools for gaining knowledge about the silicon surfaces used in planar devices.

At about the same time functional MOS-type field-effect devices became a reality. The patent of Kahng [2.5] of Bell Labs may be recognized as the first filed invention related to practical MOS transistors. Sah’s [2.6] article gives a comprehensive review of the early evolution of the MOS transistor, including many references.

2.3 Positive Oxide Charge and Surface States at the Si–SiO2 Interface

The presence of positive “oxide charge” caused MOS transistors made on P-type material to be of the “depletion type,” because the inherently occurring N-type inversion layer was already present when zero gate voltage was applied, the inversion layer being depleted by applying a negative voltage to the gate electrode. Conversely, P-channel transistors showed enhancement-type characteristics. A negative voltage beyond a certain threshold level needed to be applied to the gate to create a P-type inversion layer on the surface of the N-type substrate between source and drain. This threshold voltage was larger in magnitude than would be expected on the basis of the doping level of the silicon, the thickness of the oxide, and the work function difference between the metal gate electrode (in early devices practically always aluminum) and the semiconductor substrate. In MOS IC’s, the presence of oxide charge made it possible to keep the field regions between the devices free from unintentionally induced channels by just providing those regions with a thick “field oxide;” the thick oxide combined with the positive oxide charge made the threshold voltage sufficiently high to prevent parasitic MOS transistor effects.

Such transistors and IC’s were made on silicon wafers with a (111) surface orientation. In the mid 1960’s, it became clear that the amount of oxide charge depended markedly on the surface orientation, and that (110) and in particular (100) surfaces exhibited considerably less interface charge. Evidently at such surfaces the amorphous SiO2 structure fitted better to the Si crystal lattice. Further, the oxidation conditions and subsequent heat treatments had a marked effect on the amount of surface charge, with oxidizing conditions increasing the charge and heating in inert gases (argon, nitrogen) reducing it.

In addition to oxide charge, the Si–SiO2 interface also exhibited centers from which electrons or holes could easily be exchanged with the silicon. When such centers are present and a voltage is applied to the gate of an MOS transistor, the charge induced in the silicon may be trapped in these “surface states” rather than being available for MOS transistor action. Such surface states were thought to be related to unsaturated (“dangling”) silicon bonds at the Si–SiO2 interface. In 1965 presentations by Balk [2.7] and publications by Kooi [2.8] it was reported that the number of surface states could be reduced dramatically by treating the thermally oxidized silicon surface in H2 or H2O vapor at relatively low temperatures (400–500 °C). In either case, it was assumed that during such treatments Si–H bonds would be formed, thus removing nearly all of the dangling Si bonds.

Remarkably, several of the early papers on MOS technology do not mention the need for such a low-temperature treatment. The reason for this appeared to be that aluminum, utilized as contact material to the silicon, was also used for the metal electrode of the gates of the MOS devices. An alloy treatment was used to ensure good contacts, and it was concluded that during this treatment, even if not done in hydrogen, hydrogen was formed by reactions in the Al electrode at the Al–SiO2 interface, and would then diffuse through the gate oxide and effectively remove the silicon dangling bonds.

2.4 Instabilities Due to Ion Drift Effects

Though it appeared feasible to make thermally oxidized silicon in such a way that satisfactory Si–SiO2 interface properties were achieved, a major problem was that early planar devices were often unstable. Threshold voltages of MOS devices could shift dramatically, in particular when positive gate voltages were applied. These effects were augmented if the temperature was increased and could best be explained by migration of ions through the oxide, resulting in the accumulation of positive charge near the silicon surface. At early Si–SiO2 interface conferences animated discussions took place about which ions would be responsible for these effects, with oxygen vacancies, protons, and alkali ions being considered as possibilities. A research group at Fairchild [2.9] (Snow, Grove, Deal, Sah) came up with convincing arguments that sodium ions were the main culprits. Stable MOS transistors ought to be possible if one could keep sodium out of the system.

At the time, this was easier said than done. There were several ways in which sodium could inadvertently get into the system. Clean handling of the wafers was a task to be learned. The quartz tubes in the oxidation furnaces could be a source of sodium as well. Last but not least, the aluminum films used for the metal contacts, the gate electrode, and in IC’s also for the interconnection patterns, were usually made via evaporation from sodium-contaminated tungsten filaments loaded with pieces of aluminum wire. The use of HCl as an addition to oxygen during the oxide growth was the most important technology improvement for making stable MOS transistors. Heavy metals and alkali ions are removed when using “chlorinated oxidations.”


2.5 Phosphate-silicate Glass Helped

If contamination with alkali ions could not be completely prevented, means could be sought to prevent their effects. Converting the upper part of the silicon oxide films into phosphate-silicate glass (PSG, a glassy compound of SiO2 and P2O5) was one such method, described in 1964 by Kerr and Young [2.10] of IBM. The conversion process occurred by subjecting the oxidized silicon to vapors of P2O5 in a diffusion furnace used at the time for diffusing phosphorus into silicon, for example for the preparation of emitter regions of NPN bipolar transistors. In fact, the stability of early planar bipolar transistors may be ascribed to the beneficial layer of PSG formed by reaction of P2O5 with the top of the SiO2 layer used as the diffusion mask and as protection of the base-collector and emitter-base PN junctions. The mechanism of the stabilization effect of PSG may have been two-fold: (1) during its formation sodium may have been gettered from the silicon oxide layer, and (2) the PSG layer was thought to have a much denser structure than thermally grown SiO2, thus making migration of alkali ions under the influence of an electric field difficult.

The use of PSG–SiO2 sandwich structures as a composite gate dielectric was not totally without problems. Some instabilities were still observed, and as applying positive and negative gate voltages had opposite electrical effects, Snow and Deal [2.11] attributed those to polarization phenomena in the PSG.

2.6 Other Materials Tried as Gate-Dielectric Layers

Concerns about the ion-drift phenomena in SiO2 films led to searches for other materials. The only dielectric material, other than silicon oxide, that could be made from Si by letting the silicon itself react (nitridation) was silicon nitride (Si3N4). It was known as a ceramic material which could be formed by reaction of silicon with nitrogen or ammonia at very high temperatures. However, those reactions were not suitable to be used at lower temperatures to make films with any appreciable thickness on a silicon surface. In fact, in silicon device making processes at that time, nitrogen was considered an inert gas; today we know that this is not totally correct. But silicon nitride films could also be made by chemical vapor deposition, using gaseous compounds such as NH3 and SiCl4 as sources of nitrogen and silicon. Indeed, in a letter published in 1966, Tombs et al. [2.12] of Sperry Rand reported on “A New Insulated-Gate Silicon Transistor,” made by replacing the conventional thermally grown SiO2 film with a deposited film of silicon nitride and stated to be without ion drift. They also claimed an improved control of the surface state density.

Over time, the superior properties of silicon nitride with regard to its resistance against ion drift effects have been confirmed again and again. Methods such as low-pressure chemical vapor deposition (LPCVD) were developed to deposit films of silicon nitride at relatively low temperatures, so that they could be used to passivate integrated circuits even after the metallization patterns were made. However, the suggestion of Tombs that the “quality,” meaning the stability, of a Si–Si3N4 interface might be better than that of a Si–SiO2 interface has never been substantiated. To this day it appears that the interface properties of oxidized silicon are superior to those of any other silicon-dielectric interface.

Combining the good characteristics of silicon nitride with those of oxidized silicon was possible by first oxidizing the silicon and then depositing a nitride film on top of the oxide layer. Starting around 1970, various MOS devices have been made in this manner. A 1968 publication by Sarace et al. [2.13] may have been the first one referring to such devices. Because Si3N4 has a higher dielectric constant than SiO2 (around 6.5, depending on the deposition conditions, although a value of 8 is often reported in textbooks, compared to 3.9 for SiO2 4.2), metal-nitride-oxide-silicon composites have the advantage of an increased capacitance value compared to metal-oxide-silicon structures with the same physical dielectric thickness. However, it was soon discovered that the oxide film between the nitride and the silicon surface needed to have a thickness of at least about 100 Ångstrom (10 nm) to prevent a peculiar instability effect from occurring. This effect was found to be due to tunneling of electrical charge between the silicon substrate and electron traps in the nitride, occurring when the electric field was high enough. The making of non-volatile memory devices was a useful application of this tunneling principle.

My own failed attempts to duplicate the nitride devices described by Tombs [2.14] led me to an early discovery of the masking effect of silicon nitride films against thermal oxidation of silicon and its many applications in devices made by the LOCOS (LOCal Oxidation of Silicon) concept [2.15]. Several materials other than PSG and Si3N4 were suggested and tried as part of the gate dielectric layer as early as the 1960’s. These included films of Al2O3, Ta2O5 and ferro-electric films, but early on most of them did not find widespread use.
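The capacitance advantage of a nitride-oxide stack mentioned in this section can be checked with a short series-capacitor calculation. The Python sketch below is only an illustration: the dielectric constants 3.9 and 6.5 and the 10 nm minimum oxide thickness come from the text, while the function names, the 40 nm nitride thickness, and the 50 nm total thickness are assumptions chosen for the example.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def capacitance_per_area(k: float, thickness_m: float) -> float:
    """Parallel-plate capacitance per unit area (F/m^2) of a single layer."""
    return k * EPS0 / thickness_m


def stack_capacitance_per_area(layers) -> float:
    """Series combination of dielectric layers given as (k, thickness_m) pairs."""
    return 1.0 / sum(t / (k * EPS0) for k, t in layers)


K_SIO2 = 3.9    # dielectric constant of SiO2, as quoted in the text
K_SI3N4 = 6.5   # dielectric constant of deposited Si3N4, as quoted in the text

T_OXIDE = 10e-9     # 10 nm oxide, the minimum thickness mentioned in the text
T_NITRIDE = 40e-9   # hypothetical nitride thickness
T_TOTAL = T_OXIDE + T_NITRIDE

c_mos = capacitance_per_area(K_SIO2, T_TOTAL)
c_mnos = stack_capacitance_per_area([(K_SIO2, T_OXIDE), (K_SI3N4, T_NITRIDE)])

print(f"oxide only : {c_mos:.3e} F/m^2")
print(f"oxide+nitride stack: {c_mnos:.3e} F/m^2")
print(f"capacitance gain at equal physical thickness: {c_mnos / c_mos:.2f}x")
```

Under these assumed thicknesses the stack gives roughly 1.5 times the capacitance of an all-oxide dielectric of the same physical thickness; the gain shrinks as the oxide takes up a larger share of the stack.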

2.7 Thermal Oxidation of Silicon

In processes to make integrated circuits, oxide films were used first of all for masking selected areas of the silicon surface during high-temperature diffusion of donor and acceptor dopants into the silicon substrate. As a substantial part of the oxide might be consumed during the diffusion process (for example by conversion of the upper region of the SiO2 film into a phosphate-silicate glass), the oxide was usually chosen to be several thousand Ångstroms thick. To keep the oxidation times within reasonable margins, such oxides were grown in water vapor (steam) in the 1000–1200 °C temperature range. Early MOS transistors had gate oxide thicknesses on the order of 1000 Ångstroms (100 nm), and these oxides might be grown in dry oxygen rather than steam. In a 1965 publication [2.16] Deal and Grove of Fairchild Semiconductor presented a linear-parabolic relationship for oxide growth:

$$x_0^2 + A x_0 = B(t + \tau),$$

in which $x_0$ is the oxide thickness, $t$ the oxidation time, and $A$, $B$ and $\tau$ are parameters which depend on the oxidation conditions. For long oxidation times the oxidation rate would be determined by the diffusion of oxidants through the oxide, and the relationship between oxide thickness and oxidation time could be simplified to:

$$x_0^2 = Bt.$$

Soon several publications provided values for $B$ as a function of temperature, and it became possible to determine how long one would have to oxidize at different temperatures to obtain a desired oxide thickness. The value of $B$ was found to be 10 to 100 times larger for oxidation in wet gases compared to dry oxygen. Therefore, steam oxidation has generally been preferred for growing oxide films with a thickness of a couple of thousand Ångstroms or more. Gate oxides, usually less than 1000 Ångstroms thick, were grown in dry oxygen; for such thin oxides the simplified parabolic rate equation would not give accurate predictions. In the initial stages of oxidation, the oxidation reaction at the silicon surface plays a predominant role, and the oxide thickness increases roughly linearly with time. The surface orientation of the silicon has a marked effect on the initial oxidation rate, (100) surfaces oxidizing more slowly than (111) surfaces.

For accurate modeling of initial oxide growth (below about 200 Ångstroms) further refinement of the model has proved necessary. Deal [2.17] reported in 1998 that since 1970 he had counted more than one hundred mechanisms proposed to explain discrepancies in the < 200 Ångstrom (20 nm) oxide thickness range. Considerations about the mechanism have included the effects of hydrogen originating from the oxidizing gas, of the dopants in the silicon being oxidized, and of generation and oxidation of interstitial silicon atoms at the Si–SiO2 boundary.
The method of wafer cleaning before oxidation can also be very relevant when thin oxides are to be prepared.
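To make the linear-parabolic relationship of this section concrete, the Python sketch below solves x0^2 + A*x0 = B*(t + tau) for the oxide thickness and inverts it for the oxidation time. The function names and the numerical values chosen for A, B and tau are illustrative assumptions, not data from this chapter; real values depend on temperature, oxidizing ambient, and surface orientation.

```python
import math


def oxide_thickness(t_hours: float, a_um: float, b_um2_per_h: float,
                    tau_hours: float = 0.0) -> float:
    """Positive root of x0**2 + A*x0 = B*(t + tau), in micrometres."""
    return 0.5 * (-a_um + math.sqrt(a_um ** 2 + 4.0 * b_um2_per_h * (t_hours + tau_hours)))


def oxidation_time(x0_um: float, a_um: float, b_um2_per_h: float,
                   tau_hours: float = 0.0) -> float:
    """Oxidation time in hours needed to reach thickness x0."""
    return (x0_um ** 2 + a_um * x0_um) / b_um2_per_h - tau_hours


# Hypothetical parameters, chosen only to exercise the formulas
A = 0.2     # micrometres
B = 0.01    # square micrometres per hour
TAU = 0.1   # hours

x_one_hour = oxide_thickness(1.0, A, B, TAU)
print(f"thickness after 1 h: {x_one_hour * 1e4:.0f} Angstrom")
print(f"time to reach that thickness: {oxidation_time(x_one_hour, A, B, TAU):.3f} h")

# Limiting cases described in the text
long_t = 1.0e6
print("parabolic limit ratio:", oxide_thickness(long_t, A, B) / math.sqrt(B * long_t))
short_t = 1.0e-4
print("linear limit ratio:", oxide_thickness(short_t, A, B) / ((B / A) * short_t))
```

Both printed ratios approach 1, reflecting the two regimes in the text: growth proportional to the square root of time for long oxidations, and roughly linear growth (at rate B/A in this model) in the initial stage.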

2.8 Segregation of Dopants at the Si–SiO2 Interface

The segregation of donor and acceptor impurities between the silicon and the silicon oxide during thermal oxidation was found to be very important from the very beginning of planar technology. N-type silicon was doped with the donor elements phosphorus, arsenic, or antimony, P-type silicon usually with the acceptor element boron. It was found that when surface regions of silicon were converted into SiO2, the donor elements tended not to be picked up in the oxide under most circumstances, but to pile up in the silicon near the Si–SiO2 interface, thus making the surface region more heavily doped N-type than the original material. Boron dopants tended to be oxidized at the surface and included in the growing oxide, thus making the surface less P-type than the substrate.

Another effect to be taken into account in planar technology was that oxidation of the silicon surface could influence the diffusivity of dopant elements in the silicon, even at considerable distance below the surface. This effect was later explained by the generation of interstitial silicon atoms due to the oxidation process. These interstitials can diffuse rapidly through the silicon lattice. When they interact with dopant atoms, the diffusivity of the dopants may be greatly enhanced (in the case of P, As, B) or decreased (the case for Sb).

2.9 Other Silicon Oxide Preparation Techniques

As thermal oxidation was a high-temperature process, several attempts were made early on to grow silicon oxide at lower temperatures. Anodic oxidation in liquid electrolytes was feasible, but yielded films with high ionic content. Anodization in a gas discharge, investigated by Ligenza [2.18] as early as 1965, was shown to be possible at temperatures as low as 300 °C. Even earlier, in 1962, Ligenza [2.19] had shown that pieces of silicon could be oxidized effectively in high-pressure steam (25–500 atmospheres) in autoclaves at temperatures between 500 and 950 °C. Later, high-pressure furnaces operating at up to about 10 atmospheres and capable of carrying large wafers found practical use in speeding up the silicon oxidation rate, which is approximately proportional to the oxidant pressure.

In 1961 Kallender et al. [2.20] reported on a remarkable low-temperature oxidation method, showing that lead oxide vapor would catalyze the oxidation of silicon, making effective oxidation of silicon possible at temperatures around 650 °C. Confirming Kallender’s findings, Kooi [2.21] found that under such circumstances a lead-containing glass would form with a linear oxidation rate on the order of 10 nm per minute. This oxide had the remarkable ability to produce either N-type or P-type silicon surfaces, depending on the heat treatment.

Low-temperature oxidation methods such as those mentioned above would have the advantage of producing oxidized silicon surfaces without causing much dopant redistribution in the silicon substrate, but none of them has been widely applied. The situation is different for deposited oxide films. Successful chemical vapor deposition (CVD) of SiO2 from gas sources such as SiCl4 and O2, H2O or CO2, and pyrolysis of organo-oxysilanes such as Si(OC2H5)4 at temperatures below 500 °C, were already carried out in the 1960’s [2.22]. Over time, such techniques have been upgraded in great measure, and they are still being used today. However, deposited silicon oxide films were never found to be very good for use as gate dielectric. It appeared that the best quality of MOS systems was always obtained by making sure that at least the silicon-dielectric interface was formed by thermal oxidation of the silicon substrate.


2.10 Thick Field Oxides

The active area of MOS transistors has a “thin” oxide layer. In the early days 1000 Ångstrom was rather typical, and this has been reduced tremendously. Today’s gate oxide thicknesses are below 20 Ångstrom and still going down! Outside the gate areas the oxide thickness needs to be increased quite substantially to prevent parasitic charging effects. One method to grow thick oxides in one area and prevent oxide growth in the active areas-to-be is LOCOS oxidation. The silicon wafers get a thin (100 Ångstrom) oxide and a deposited Si3N4 layer of three to four hundred Ångstrom on top. With a masking step, the active areas-to-be are masked and elsewhere the Si3N4 is removed. Now a thick oxide can be grown. The Si3N4 layer on top of the active areas is oxidized only very slowly. Oxide layers in excess of ten thousand Ångstrom can be grown, while just a few hundred Ångstrom of the Si3N4 is converted. This invention by the author in the early seventies is still in use in the industry, and it improves the packing density of the devices quite substantially.

2.11 Breakdown Strength of SiO2, Defect Density, Moore’s Law

“Properly grown” SiO2 has very good breakdown voltages. A field strength of over 10 MV/cm is achieved. The defect density can also be rather small, as it should be, because nowadays integrated circuits larger than a square centimeter are manufactured with good yields. The manufacturing technology has been improved tremendously over a period of now over forty years. In the early days, silicon wafers with diameters of one inch were used. Modern manufacturing facilities use silicon wafers with diameters of 30 centimeters. The growth in complexity of integrated circuits has been unimaginable. The first integrated circuits had a “complexity of five,” that is to say we were able to combine three transistors and two resistors in one die. Present-day microprocessors have a complexity of over forty million transistors. Moore’s Law [2.23], which dates back to 1965, states that the number of transistors, a measure of the complexity of integrated circuits, doubles every few years. In the microprocessor case, the complexity went from 2250 transistors in the 4004 in 1971 to 42,000,000 transistors in the Pentium 4 in the year 2000.
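Two of the figures quoted in this section lend themselves to a quick check. The Python sketch below is an illustration under stated assumptions: it treats the quoted “over 10 MV/cm” as a round 10 MV/cm to estimate the breakdown voltage of an oxide of given thickness, and it derives the average doubling period implied by the 4004 and Pentium 4 transistor counts given in the text. The function names are hypothetical.

```python
import math

BREAKDOWN_FIELD_V_PER_CM = 10.0e6  # 10 MV/cm, used here as a round figure


def breakdown_voltage(thickness_angstrom: float) -> float:
    """Breakdown voltage (V) of an oxide at the assumed breakdown field."""
    thickness_cm = thickness_angstrom * 1.0e-8  # 1 Angstrom = 1e-8 cm
    return BREAKDOWN_FIELD_V_PER_CM * thickness_cm


def doubling_period_years(count_start: float, count_end: float, years: float) -> float:
    """Average time per doubling for exponential growth between two counts."""
    return years / math.log2(count_end / count_start)


for thickness in (1000.0, 100.0, 20.0):
    print(f"{thickness:6.0f} Angstrom oxide: about {breakdown_voltage(thickness):.0f} V")

period = doubling_period_years(2250, 42_000_000, 2000 - 1971)
print(f"implied doubling period, 4004 to Pentium 4: {period:.2f} years")
```

At the assumed field, an early 1000 Ångstrom gate oxide withstands on the order of 100 V while a 20 Ångstrom oxide withstands only about 2 V, and the two microprocessor data points correspond to a doubling roughly every two years.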

2.12 Weak Oxide Regions in MOS Structures, Kooi Effect

In the earliest LOCOS experiments an unpleasant effect was detected. The SiO2–Si3N4 sandwich was an excellent mask for oxidations. However, after the removal of the SiO2–Si3N4 sandwich, the gate oxide growth ended up in a gate oxide with a reduced thickness at the edges, resulting in much too low breakdown voltages for the devices. Under the microscope a “white ribbon” could be detected around the primary etched zones. At first it was thought to be an optical artifact. After study of the breakdown voltages, the conclusion was that it is a very narrow ribbon resulting from reduced oxidation at the edge of the oxidized area. At the edge a Si3N4-rich zone is formed. By the end of the oxidation that Si3N4-rich ribbon is completely oxidized, but the resulting device characteristics are below expectations because of the reduced oxide thickness at that edge. The solution to this problem was a sacrificial oxidation of a few hundred Å, removal of that sacrificial oxide, and a second oxidation that forms the gate oxide. The white ribbon then disappears completely and the gate oxide thickness is constant.

2.13 Al Gate MOS Devices; PMOS IC’s

Aluminum has been the contact metal of choice since the beginning of the use of the planar technology. At reasonable temperatures (around 450 °C) it will alloy with silicon and form ohmic contacts on P-type zones and on heavily doped N-type zones as well. When working with semiconductors, a major problem is getting ohmic contacts. Most metallic layers will result in “somewhat” rectifying contacts. When you need to use active elements such as transistors, or resistive elements, ohmic contacts are necessary! You can make a long list of the drawbacks of aluminum and try for a better metal or alloy. To this day, however, a better material has not been found. In modern circuits with very narrow lines the electrical resistance of the Al lines is too high, and in multilevel metal systems copper is in use as an interconnect. Because aluminum alloys with silicon, a small amount of silicon is added to the aluminum to reduce the alloying depth and so prevent the alloy from breaking through the very thin junctions of modern devices. The first MOS transistors were made with aluminum gates. Because of the previously mentioned surface charges of the silicon, only PMOS transistors could be made, and so only PMOS Integrated Circuits. Designers were and are inventive; however, the design restrictions when having just one type of transistor are severe. The growth of the IC business took off after complementary MOS transistors (CMOS) could be made on one chip.

2.14 Silicon Gate MOS Devices, NMOS and CMOS IC’s

The introduction of poly silicon as a gate material has improved the reliability and versatility of MOS transistors in a big way. Moreover, the introduction of ion implantation in the gate areas has introduced additional degrees of freedom in choosing the threshold voltages of different MOS transistors. So now we are able to make very well defined complementary N and P MOS transistors on one chip (CMOS). In circuit design, this has given a huge expansion of the flexibility and thus the complexity of the designs.

2.15 Decrease of Oxide Thickness Connected with Downscaling of MOS Structure

When we need higher packing densities for increased complexity of the Integrated Circuits, we need better, meaning smaller, lithographic sizes and tolerances, which enable smaller devices. Devices are scaled not only in surface area but also in depth. We need lower power per device, otherwise the IC’s would be “too hot to handle.” Moreover, the cut-off frequencies of the devices go up as well. So we now have a gate layer thickness under 20 Å. The maximum voltages that these devices can handle are reduced as well, because the maximum field strength the SiO2 can handle is not (significantly) increasing. It should be clear, however, that in circuits with millions of transistors the power per transistor must be greatly reduced to prevent the circuit from overheating.

Acknowledgments. The editors appreciate Albert Schmitz, retired, Philips Semiconductors, North America, completing the original draft of Else Kooi’s manuscript.

References

2.1. M.M. Atalla et al., “Fabrication of Semiconductor Devices having Stable Surface Characteristics,” U.S. Patent 2,899,344. Filed April 30, 1958, granted Aug. 11, 1959
2.2. C.J. Frosch and L. Derrick, Surface protection and selective masking during diffusion in silicon. J. Electrochem. Soc., Vol. 104, No. 5, pp. 547–552, May 1957
2.3. J.A. Hoerni, “Planar silicon transistors and diodes,” presented at the 1960 IRE International Electron Device Meeting, Oct. 27–29, 1960. Technical Article and Paper Series, No. TP-14, 9 pp., 1961
2.4. R.N. Noyce, “Semiconductor device-and-lead structure,” U.S. Patent 2,981,877. Application filed July 30, 1959, granted April 25, 1961
2.5. D. Kahng, “Electric field controlled semiconductor device,” U.S. Patent 3,102,230. Application filed May 31, 1960, granted Aug. 27, 1963
2.6. C.T. Sah, Evolution of the MOS transistor – From conception to VLSI. Proc. IEEE 76, 1280–1320 (1988)
2.7. P. Balk, “Effects of hydrogen annealing on silicon surfaces,” presented at the Electrochemical Society Meeting, San Francisco, CA, May 9–13, 1965, Extended Abstracts of Electronics Division, Vol. 14, No. 1, abstract No. 109, pp. 237–240, May 1965
2.8. E. Kooi, “Effects of low temperature heat treatments on the surface properties of oxidized silicon,” Philips Research Reports 20, pp. 578–594, Oct. 1965
2.9. E.H. Snow, A.S. Grove, B.E. Deal, and C.T. Sah, “Ion Transport Phenomena in Insulating Films,” J. Appl. Phys. 36, 1664 (1965)
2.10. D.R. Kerr and D.R. Young, “Method of improving electrical characteristics of semiconductor devices and products so produced,” U.S. Patent 3,303,059. Filed June 29, 1964, issued Feb. 7, 1967
2.11. E.H. Snow and B.E. Deal, “Polarization phenomena and other properties of phosphosilicate glass films on silicon,” J. Electrochem. Soc. 113(3), 263–9 (1966)
2.12. F.A. Sewell Jr., N.C. Tombs, Semiconductor devices employing silicon nitride as the diffusion masking and junction passivating material, US Patent 19,670,519
2.13. J.C. Sarace, R.E. Kerwin, D.L. Klein, R. Edwards, Metal-nitride-oxide-silicon field-effect transistors, with self-aligned gates. Bell Teleph. Lab., Inc., Murray Hill, NJ, USA. Solid-State Electronics 11(7), 653–60 (1968)
2.14. N.C. Tombs, Deposition of silicon nitride layers on semiconductor substrates. U.S. Patents 19,650,623 and 19,651,027 (1971)
2.15. J.A. Appels, E. Kooi, M.M. Pfaffen and W. Verkuylen, Local Oxidation of Silicon, Philips Research Reports 25, 117 (1970)
2.16. B.E. Deal and A.S. Grove, “General Relationship for the Thermal Oxidation of Silicon,” J. Appl. Phys. 36, 3770 (1965)
2.17. C.R. Helms, B.E. Deal (eds.), Silicon Oxidation Models Based on Parallel Mechanisms, Phys Chem SiO2–Si Interface [Proc. Symp.], Plenum, New York, N.Y.
2.18. J.R. Ligenza, Oxidation of Silicon at 300 °C. J. Appl. Phys. 36, 2703 (1965)
2.19. J.R. Ligenza, Effect of Crystal Orientation on Oxidation Rates in High Pressure Steam. J. Phys. Chem. 65, 2011 (1961)
2.20. D.A. Kallender, S.S. Flaschen, R.J. Gnaedinger and C.M. Lufty, Conference of the Electrochemical Society, Indianapolis, 1961; Electronics Division, abstract 67
2.21. E. Kooi and M.M.J. Schuurmans, Temperature gradient effects during heat treatments of oxidized silicon. Philips Research Reports 20, 315–19 (1965)
2.22. K.H. Maxwell, L.H. Rabouin, Chemical vapor deposition of oxide films from volatile chlorides. I. Silicon dioxide. Philco Corp., Blue Bell, PA, Electrochem Technol 3(1–2), 37–40 (1965)
2.23. G.E. Moore, Cramming more components onto integrated circuits. Electronics 38, No. 8, April 19, 1965

3 SiO2 Based MOSFETS: Film Growth and Si–SiO2 Interface Properties E.A. Irene

3.1 SiO2 Prior to 1970

3.1.1 Introduction

The purpose of this chapter is to follow the course of the development of SiO2 as a gate dielectric in metal oxide semiconductor field effect transistors (MOSFET’s), both in terms of the technology and the science. The literature on this subject is vast, and it must be admitted that only a fraction of it is discussed herein. For more extensive coverage, reviews of the literature prior to 1990 are suggested [3.1, 3.2]. To this day the gate SiO2 films are prepared mostly by the thermal oxidation of single crystal Si surfaces. Hence the emphasis of this chapter is on thermal oxidation, with the ideas and models relating to thermal oxidation presented in some detail. In this way the methods that were and are used are revealed, as well as the significant issues relating to gate dielectrics. Many of these issues are as important today as they were in 1970, and thus the context for the following chapters in this book will be better understood.

3.1.2 A Brief Historical Survey

During the 1960’s electronic device technology shifted emphasis from discrete or individual devices to integrated circuits (IC’s). IC’s are composed of many (presently in the $10^7$ range) devices, typically MOSFET’s and bipolar transistors, that are fabricated adjacent to one another on the same substrate and linked together for a particular purpose such as to provide the functions of memory, logic, or signal processing. MOSFET’s have emerged as the dominant device type and thus the emphasis of this chapter is on this device. The IC industry adopted a planar technology, namely one where the finished device array is flat or planar, rather than mesa-like as with the technologies that preceded the planar trend.
The method used to define the regions of the semiconductor substrate that were to receive the necessary dopant(s), or to be metallized for making contacts, was the growth of an oxide film of SiO2 on the Si surface, followed by a photolithographic step that creates a pattern and requires flatness. Since this SiO2 film grows on the Si surface in air, SiO2 provided a straightforward materials solution that is compatible with planarization technology. Furthermore, in this era it was discovered that SiO2 effected a seemingly magical improvement in the electrical characteristics of the Si surface [3.3]. This feature of the Si surface, or better the Si–SiO2 interface, will be treated below in some detail. Essentially, it would be decidedly advantageous to be able to use only the surface or near-surface region of the semiconductor for electrical conduction. In this manner the planar technology is fully utilized. In addition, the superior electronic performance of Si, in particular of the Si surface that is realized with an SiO2 overlayer (the Si–SiO2 interface), has not been achieved with any other semiconductor-dielectric combination to this day.

3.1.3 What Is a MOSFET?

Modern MOSFET’s are fabricated from Si and related Si materials, with SiO2 in film form as the most prominent material other than Si. Figure 3.1 shows a cross section sketch of a so-called N-channel MOSFET with its three terminals: source, drain and gate. For the purposes of this chapter we need only consider the simple MOSFET depicted in Fig. 3.1, although more complex designs are in practice. It should be noted that presently the most common device configuration is to have an N-channel and a P-channel MOSFET electrically connected to form a complementary MOSFET or CMOS configuration. This device structure affords many advantages and a few are mentioned later.

Fig. 3.1. Pictorial representation of an N-Channel MOSFET. The gate dielectric SiO2 film is located below the metal gate contact and above the channel region of the transistor

In the MOSFET structure in Fig. 3.1 several regions are shown that have SiO2 films, but each film is grown for a different purpose. The oxide labeled “gate oxide” is subject to the most stringent electronic quality requirements and is therefore the focus of this chapter. This gate dielectric must support the electric field that enables inversion of the carrier type at the Si surface (this key aspect of MOSFET operation will be elucidated below) and thus enables conduction from the source to the drain and defines the “on” state of the device. The most important requirement for this gate SiO2 dielectric film is that the density of interface electronic states at the Si–SiO2 interface needs to be well below the $10^{12}$ cm$^{-2}$ eV$^{-1}$ level for MOSFET operation. In fact, since the 1970’s an interface electronic state level of $10^{10}$ cm$^{-2}$ eV$^{-1}$ and below is considered acceptable for reliable device operation. A brief discussion of the nature and origin of the crucial interface electronic states will follow later.

The thicker oxide between adjacent devices is called the field or isolation oxide. This oxide has the function of electrically isolating one device from another. The Si–SiO2 interfacial electronic property requirements for the field oxide are far less critical than for the gate oxide: several orders of magnitude more electronic interface states are tolerable in the isolation regions of the device. It has been found that the best all-around preparation method for achieving the highest electronic quality Si–SiO2 interface is the thermal oxidation of Si. Therefore this process has received considerable attention since the 1960’s and it will be discussed in detail below. Other preparation methods for SiO2 are used for device applications other than the gate oxide. Among the other processes used to prepare SiO2 films are plasma oxidation, chemical vapor deposition (CVD), plasma enhanced CVD (PECVD), sputtering and evaporation. There are cost and process advantages with the use of the alternate methods, as well as the possibility of less exposure of devices to the high temperatures and long process times (the thermal budget) that are required for thermal oxidation to form thick field oxide films.
For ultra-thin (< 5 nm) gate oxide films the thermal budget is not an issue, because these films grow rapidly at modest temperatures. Not illustrated in Fig. 3.1 are oxide films used to delineate the variously doped Si regions or to delineate electrical contacts to the regions. These films are usually removed by etching procedures after they have served their masking purposes. While we choose not to discuss here the photolithographic processes used for these purposes, suffice it to say that the oxide electrical quality is usually not of great concern so long as the film covers the desired regions of the surface. The SiO2 films need only support optically active resist materials, and the oxides are usually entirely removed and are often remote from active device regions. Thus, virtually any process-compatible method can be employed for the preparation of masking oxide films. Indeed, for some isolation requirements and most masking functions it is not even necessary that the films used be SiO2: Si3N4, Al2O3 and various organic and other ceramic films have been used over the years within specific technologies. Thus, while thermal oxidation produces the highest quality SiO2 films for critical device areas such as gate areas, for the other device areas and for masking and isolation other film formation methods can yield acceptable films and even result in substantial benefits in cost and time.


3.1.4 How Does a MOSFET Work?

Typically a transistor can be thought of as an electronic device that can turn a desired current flow on and off by the application of a control potential. Thus a basic transistor is a three terminal device. As is shown in Fig. 3.1 the three terminals are designated as the Source, Drain and Gate. For the N-Channel MOSFET in Fig. 3.1, a positive potential applied to the gate electrode causes the channel region beneath the gate oxide to become electron enriched. Before application of the gate potential (gate bias) this channel region was p-type, in accordance with the Si substrate doping, so that the majority carriers are holes. However, with the application of the positive gate bias, this channel region that was p-type becomes electron enriched or n-type by the attraction of the minority carriers, electrons, to the Si surface. The process that changes the channel region carrier type is termed “inversion”. The inversion layer near the Si–SiO2 interface forms a channel (in this case an N-channel) that connects the n-type source and drain. It is useful to note that the MOSFET operates using only one kind of carrier and that carrier is the minority carrier. A P-channel MOSFET operates similarly under a negative gate potential, where inversion changes the electron rich n-type substrate to a p-type channel that connects p-type source and drain regions. A modern complementary MOSFET (CMOS), as was mentioned above, is constructed using adjacent N and P-channel MOSFET’s. There are significant power and cooling advantages with the CMOS structure, hence its popularity for modern high device density chips: in a typical flip-flop CMOS memory element, for example, device current flows only for a brief time. However, for the purposes of this chapter it is sufficient to limit the discussion to an N-channel device, and the discussion of SiO2 herein applies equally to the more complex MOSFET’s.
3.1.5 Interface Electronic States and Charge

The problem with the Si surface to be addressed, or for that matter with any surface, is that the termination of the regular crystalline lattice at the surface provides a large number of unsatisfied chemical bonds. The number of these so-called dangling bonds is about $10^{15}$/cm$^2$, which is of the order of the number of surface atoms. This simple idea is illustrated in Fig. 3.2. The existence of the dangling bonds was predicted, and the fact that the bonds give rise to states within the forbidden gap of the semiconductor was treated, in the 1930’s [3.4, 3.5]. If the electron states due to the dangling bonds are empty, the surface states act as acceptor states which when filled leave the surface in short supply of electrons and hence p-type. If the states are filled with electrons via a process of charge exchange with the bulk material, the filled states can act as electron donors with a supply of electrons. The early experimental work aimed at confirming these ideas was done on Ge surfaces [3.6, 3.7] and later work on Si surfaces also supported these results [3.3]. Extensive measurements show that both donor and acceptor electronic states are found on the Si surface [3.8–3.10] (and treated extensively in [3.11]).

Fig. 3.2. Pictorial representation of a Si surface showing unsatisfied chemical bonds that give rise to surface and interface electronic states

3.1.6 Implications of the Charges on MOSFET Operation

Because of the existence of the surface electronic states, the electronic properties of devices are profoundly affected. We consider the case of a field effect device as depicted in Fig. 3.1, in which the application of an external voltage (bias) to the gate contact creates an internal electric field across the gate dielectric. As was discussed above, this electric field modifies the semiconductor surface potential and can actually invert the semiconductor type in a thin skin at the surface, viz. change the carrier type by bringing minority carriers to the surface of the semiconductor. Also discussed above is the fact that the inversion layer connects the two similarly doped parts of the device (source and drain) with a conducting channel of similar carrier type. With surface states present, however, part of the charge resulting from the externally applied bias voltage, which creates the electric field that operates the device, will be compensated by the charges in the surface states. The resulting change in the semiconductor surface potential will be smaller than if no surface electronic states were present. In effect the action of the surface states alters the operating gate potential for the device. If sufficient surface state charge is present, this charge may actually prevent inversion of the surface and thus prevent the turn-on of the field effect device. This calamitous situation is termed Fermi level pinning, since as a result of charge exchange with the surface states, the Fermi level of the semiconductor cannot be altered sufficiently relative to the conduction or valence band edges at the surface to change the dominant carrier type at the surface and thereby enable field effect device operation via inversion. The bottom line is that surface states can disrupt and even prevent device operation.

The major issue becomes how to reduce or eliminate surface electronic states from semiconductor surfaces and thereby have the ability to construct a high device density chip. The answer to this problem exists for the case of the Si surface. As discussed above, it has been found [3.3] that although the Si surface intrinsically displays a large number of surface states, as do most clean semiconductor surfaces, when Si is exposed to an oxygen ambient the number of states decreases by orders of magnitude. Chemically, a film of amorphous SiO2 forms on the surface. The SiO2 film apparently ties up most of the dangling bonds that gave rise to the surface states by forming chemical bonds, thereby removing the states from the semiconductor band gap. Thus the answer to the question of why Si is the predominant semiconductor follows from the fact that SiO2 forms on the surface quite naturally in an oxygen containing ambient. The SiO2 film not only reduces the number of the interfering surface states to well below the typical number of current carriers utilized for electronic devices, but also provides a list of bonus properties mentioned above. The ability of SiO2 to significantly reduce the electronic surface states on Si is termed “electronic surface passivation”, and it has not been achieved to the same extent with any other semiconductor and film combination up to the present time. In fact the ability of SiO2 to electronically passivate the Si surface and provide a high quality dielectric is largely responsible for Si emerging as the preeminent semiconductor for IC fabrication.
These two materials, Si and SiO2, are intimately combined through a chemically favored transformation of one into the other upon the surface of the Si. The free energy for this transformation is approximately $\Delta G_f^0 = -200$ kcal/mole, which attests to the reaction’s favored status and to the stability of the interface. SiO2 has other very important uses in Si technology, as was mentioned above. As a wide band gap dielectric (9 eV), SiO2 can support large electric fields without significant leakage current. Consequently, SiO2 is used as a gate dielectric in MOSFET’s both to passivate the Si as defined above and to support the electric field necessary to invert the carrier type in the MOSFET channel and thereby enable the device to turn on, i.e. to enable current to flow from the source to the drain through the inverted channel of the MOSFET. By virtue of its excellent electrical insulating quality, SiO2 is used to isolate one device from another on a chip. It is also used as an overall electrical isolating encapsulant, although modern devices mostly use other dielectrics for that purpose. SiO2 is readily removed using aqueous HF solutions and using plasmas containing halogens. Consequently, it is a desirable masking film for use in lithography to support photoresists. It is true that, except for electronic passivation as defined above, other dielectrics and materials could function equally well and sometimes better than SiO2 for the various applications in Si technology. However, in order to minimize the number of materials used in the technology, and because only SiO2 will electronically passivate Si so thoroughly, SiO2 is typically preferred in applications in which it is acceptable albeit not required. Although SiO2 is used mostly in Si technology, SiO2 films also find wide application within other semiconductor and optical technologies, because SiO2 is an excellent dielectric, is chemically stable, is transparent in the visible spectrum, and its preparation methods are well known.

3.1.7 The Silicon Oxidation Model: Early Studies

Emerging in the early 1960’s were a number of pioneering studies concerning the formation and properties of SiO2 films prepared via the oxidation of Si [3.12–3.16]. These studies were important because they revealed several of the key features of the oxidation of Si to produce high quality films of SiO2. Also, the understanding gained in these early studies, which treated mostly thicker films, provided the basis for subsequent and more accurate studies that extended the understanding of Si oxidation to thin SiO2 films. These early studies agree in the broad picture of Si oxidation. Two studies represent most of the understanding developed in that era. One study [3.13], by Deal and Grove (the same A. Grove who later became a founder of INTEL), is often quoted relative to the model for Si oxidation; in it the thermal oxidation of single crystal (100) and (111) Si was treated using both dry O2 and steam. The resulting Si oxidation model is often called the linear-parabolic (LP) model. The other study [3.12], by Ligenza, is concerned with identifying the reactant species that moves during Si oxidation. Before proposing an analytical model that describes the time evolution of the SiO2 film growth on Si, it is useful to consider the overall behavior of the growth process as the early workers had done. An overall picture can be obtained from Fig.
3.3a that displays SiO2 film thickness versus oxidation time data for several temperatures of oxidation in dry oxygen. Figure 3.3b displays data from a systematic study of H2O additions to oxygen and the effects on the Si oxidation kinetics. The data in Fig. 3.3 were generated in the mid 1970’s using in situ real-time ellipsometry in the author’s laboratory [3.17, 3.18]; however, the general features were recognized in the 1960’s, and more details about these data will be discussed later. For now it is clear from the shape of the data for both dry and wet oxidations that the relationship between the SiO2 film thickness and the oxidation time is neither purely linear nor purely parabolic. If the rate of formation of SiO2 were invariant with film growth, then linear growth kinetics would be expected. On the other hand, if the growth were limited by diffusional transport of reactant(s) through the growing oxide film, then purely parabolic growth would be expected. Furthermore, there have been several studies that indicate that during the oxidation of Si the oxidant species is most likely molecular oxygen (O2) that migrates inward through the growing oxide film to the Si surface, where the oxidation reaction occurs [3.12, 3.19]. The most conclusive study [3.12] first used the naturally occurring isotope mixture of O2, which is mostly $^{16}$O, for the oxidation of Si. After the SiO2 film had grown for some time, the ambient was switched to $^{18}$O enriched O2. After further oxidation in the $^{18}$O enriched O2, the workers found that the new oxide (Si$^{18}$O$_2$) formed after the change of ambient was located at the Si–SiO2 interface rather than at the SiO2 surface. This clearly indicated that the oxidant species migrates inward to the interface, where it reacts with Si.

Fig. 3.3. Data for the thermal oxidation of Si (100): (a) in dry 1 atm O2 at three temperatures in terms of the SiO2 film thickness versus the time for oxidation; (b) at one temperature but with various amounts of H2O added to the O2. The data was obtained using in situ real time ellipsometry at a wavelength of 632.8 nm and 70.00° angle of incidence, and the data was analyzed using a single film optical model

Based on the shape of the data, which indicates that transport alone cannot explain the oxidation data, and on the finding that the oxidant species, primarily uncharged O2, is actually migrating, a very successful oxidation model [3.13] was derived in the 1960’s. Figure 3.4 depicts the essential features of this so-called linear-parabolic (LP) model, and the figure shows that essentially two fluxes are considered. This representation [3.1] is a departure from the original derivation, wherein three fluxes were considered. In the present treatment the flux of oxidant from the gas to the SiO2 surface is ignored, because under the usual conditions for Si oxidation this gaseous flux is fast relative to the other fluxes in the solid phases and is therefore not kinetically significant in a series process. The flux of oxidant across the growing oxide film, $F_1$, is given by Fick’s first law:

$$F_1 = D\,\frac{dC}{dx} \qquad (3.1)$$


Fig. 3.4. Schematic representation of the Si oxidation model showing the transport flux as F1 and the reaction flux as F2 . C1 and C2 are the concentrations of oxidant at the outer and inner interfaces, respectively and L is the SiO2 film thickness

where $D$ is the oxidant species diffusivity, and $dC/dx$ is the spatial concentration gradient of oxidant in one dimension. For a steady state situation the gradient can be approximated as:

$$\frac{dC}{dx} \approx \frac{C_1 - C_2}{L} \qquad (3.2)$$

where $C_1$ is the solubility of oxidant in SiO2, $C_2$ is the smaller oxidant concentration at the Si–SiO2 interface where oxidant is removed by reaction, and $L$ is the SiO2 film thickness. The other flux, $F_2$, is the number of oxide molecules forming per cm$^2$ per second and is expressed kinetically by a first order chemical dependence on the oxidant concentration at the Si–SiO2 interface, $C_2$:

$$F_2 = k_2 C_2 \qquad (3.3)$$

where $k_2$ is the surface reaction rate constant. Of course the oxidation reaction involves both the oxygen and Si concentrations. However, for a given Si orientation the Si atom concentration per unit area is constant and is thus included implicitly in $k_2$. This Si orientation issue will be discussed in the following paragraphs and then again in the context of more modern studies, where the careful application of ellipsometry has enabled further resolution of what is ostensibly a confusing set of results. Imposing the condition of steady state, an equivalence of the fluxes follows:

$$F = F_1 = F_2 \qquad (3.4)$$

which means that the series fluxes are self regulatory, and which permitted the use above of Fick’s first law. A rate equation is then formulated in terms of the rate of formation of the oxide film:

$$\frac{dL}{dt} = \Omega F \qquad (3.5)$$


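where $\Omega$ is the conversion constant between oxidant flux and SiO2 film thickness, i.e. the volume of SiO2 formed per oxidant molecule, which is the reciprocal of the number of oxidant molecules incorporated into a unit volume of SiO2 solid ($2.3 \times 10^{22}$ cm$^{-3}$). $C_2$ can be eliminated and the rate equation rewritten as:

$$\frac{1}{\Omega}\,\frac{dL}{dt} = \frac{k_2 D C_1}{k_2 L + D} \qquad (3.6)$$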
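The elimination of $C_2$ is not written out in the text. A short sketch of the algebra, using only Eqs. (3.1)–(3.5) and the symbols already defined, might run as follows:

```latex
\begin{align*}
D\,\frac{C_1 - C_2}{L} &= k_2 C_2
  && \text{steady state } F_1 = F_2 \text{, using (3.1)--(3.3)} \\
C_2 &= \frac{D\,C_1}{k_2 L + D}
  && \text{solve for } C_2 \\
F = k_2 C_2 &= \frac{k_2 D\,C_1}{k_2 L + D}
  && \text{substitute into (3.3)} \\
\frac{dL}{dt} = \Omega F &= \frac{\Omega\,k_2 D\,C_1}{k_2 L + D}
  && \text{insert into (3.5), which is (3.6)}
\end{align*}
```

Two limits can be read off directly from the last line: for $k_2 L \ll D$ the growth rate is independent of $L$ (reaction-limited, linear growth), while for $k_2 L \gg D$ it falls off as $1/L$ (transport-limited, parabolic growth).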

This equation is readily integrated to yield:

$$A L^2 + B L = t + \text{constant} \qquad (3.7)$$

where $A$ and $B$ are the reciprocals of the parabolic, $k_p$, and linear, $k_l$, rate constants, respectively, and are given as:

$$\frac{1}{A} = k_p = 2 D C_1 \Omega \qquad (3.8)$$

and

$$\frac{1}{B} = k_l = k_2 C_1 \Omega \qquad (3.9)$$

The integration constant is evaluated at the initial time and thickness $t = t_0$, $L = L_0$ so as to be able to shift the coordinate axes to any position in $L$, $t$ space. With this initial condition, we obtain the following:

$$t - t_0 = \frac{L^2 - L_0^2}{k_p} + \frac{L - L_0}{k_l} \qquad (3.10)$$
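Equation (3.10) gives the oxidation time as a function of thickness, whereas in practice one often wants the thickness reached after a given time. A minimal sketch of both directions is shown below. The function names and the numerical rate constants are hypothetical, chosen only so that the arithmetic is easy to follow; they are not measured values from this chapter.

```python
import math


def oxidation_time(L, kp, kl, L0=0.0, t0=0.0):
    """Eq. (3.10): time at which the film reaches thickness L."""
    return t0 + (L**2 - L0**2) / kp + (L - L0) / kl


def oxide_thickness(t, kp, kl, L0=0.0, t0=0.0):
    """Invert Eq. (3.10): thickness reached at time t (positive root)."""
    # Eq. (3.10) rearranged to the quadratic L^2 + (kp/kl)*L - kp*c = 0
    c = (t - t0) + L0**2 / kp + L0 / kl
    b = kp / kl
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * kp * c))


# Hypothetical constants: kp in nm^2/min, kl in nm/min, lengths in nm, times in min.
KP, KL = 1000.0, 1.0

t = oxidation_time(50.0, KP, KL, L0=2.0)
print(t)                                   # time to grow from 2 nm to 50 nm
print(oxide_thickness(t, KP, KL, L0=2.0))  # recovers approximately 50 nm
```

The two regimes of the model are visible in the inverse function: for $L \ll k_p/k_l$ the linear term dominates and $L \approx k_l (t - t_0)$, while for $L \gg k_p/k_l$ the parabolic term dominates and $L \approx \sqrt{k_p t}$.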

The use of the region L0 ,t0 in L, t space is twofold: at t0 = 0, L0 represents an initial oxide thickness either as a native oxide or from some previous processing; and L0 , t0 can be used to define a region of oxidation that does not conform to linear-parabolic kinetics i.e. an offset to the model. For oxidation in dry O2 such a region has been identified extending to several tens of nm. Later this important initial oxidation regime will be discussed further as it has been the subject of a number of recent oxidation models and while in the 1960’s through the 1990’s the initial growth regime has been more of an artifact than technologically important. However, now it is of utmost importance in microelectronics. Of particular relevance and discussed in more detail later is that the initial regime of oxidation comprising SiO2 from 0 to 20 nm thick is difficult to determine using ellipsometry because the real part of the refractive index (n) is both difficult to measure and has been found to vary with thickness. Recently, special procedures involving several techniques including ellipsometry have been used to determine the refractive index for thin films and will be presented later in this chapter. The new results presented later have led to a better understanding of the initial Si oxidation regime, and the ability to determine the film thickness for these ultra-thin SiO2 films. Experimentally obtained oxidation data are often interpreted using the LP oxidation model. The linear, kl , and parabolic, kp , rate constants are obtained

3 SiO2 Based MOSFETS: Film Growth and Si–SiO2 Interface Properties


Table 3.1. Si Surface Atom Densities

Si Orientation   Density Assuming Planar Surfaces   Density Based on Vicinal Planes
                 (10^14 atoms/cm2)                  (10^14 atoms/cm2)
110              9.59                               –
111              7.83                               –
311              10.86                              7.21
511              11.48                              7.05
100              6.78                               –

from data fitting routines, and the variation of the parameters with process variables is explained using the ideas contained within the LP model. Also of significance in the 1960's era was the first report, by Ligenza [3.20], that the kinetics of the oxidation of Si depended significantly on the Si substrate orientation. Specifically, Ligenza showed that for steam at high pressure the Si(111) orientation displayed the fastest oxidation rate, with Si(110) next, followed by the Si(100) surface. According to the LP model this should be interpreted in terms of the number of available Si atoms at the various surfaces, since this model includes the Si concentration. However, as is seen in Table 3.1, the Si(110) orientation has the greatest area density of Si atoms of the three major Si orientations, so a more convoluted model was constructed that depended on the number of available bonds at the Si surface. It was not made clear why the author chose bonds over atoms as being kinetically significant, since kinetic mechanisms typically include concentrations of atoms and molecules as proportional to chemical activity. As was mentioned above, the original LP model contains the orientation dependence implicitly in the linear rate constant, kl (or k2), since the substrate surface ought to dominate the surface reaction. Below, when more modern ellipsometry results are presented for the oxidation of various Si orientations, Table 3.1 is discussed in more detail, and the need for a convoluted chemical bond based model to explain the orientation behavior is obviated.
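The planar-surface densities for the three low index orientations in Table 3.1 can be checked with a short geometric calculation. The sketch below assumes an ideal, unreconstructed diamond-cubic surface and a textbook Si lattice constant of 5.431 Å; neither assumption is stated in the chapter, and the function name is hypothetical.

```python
import math

A_SI_CM = 5.431e-8  # Si lattice constant in cm (assumed textbook value)

def planar_density(orientation, a=A_SI_CM):
    """Atoms per cm^2 on an ideal low index plane of the diamond-cubic lattice."""
    atoms_per_a2 = {
        "100": 2.0,                    # 2 atoms per area a^2
        "110": 4.0 / math.sqrt(2.0),   # 4 atoms per area sqrt(2) * a^2
        "111": 4.0 / math.sqrt(3.0),   # 2 atoms per area (sqrt(3)/2) * a^2
    }
    return atoms_per_a2[orientation] / (a * a)

for hkl in ("110", "111", "100"):
    print(hkl, f"{planar_density(hkl) / 1e14:.2f} x 10^14 atoms/cm^2")
```

Under these assumptions the calculation gives the same ordering as the table, with Si(110) the densest of the three planes, which is the point at issue for the orientation dependence discussed above.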

3.2 After 1970: Progress in Understanding

From the 1970's to the present, both industrial and academic microelectronics science and technology have been dominated by Si technology. Si technology involves numerous films on single crystal Si substrates (see Fig. 3.1) that in combination yield the MOSFET's and bipolar transistors and the associated circuitry that are included in modern integrated circuits (IC's). In order to control the electronic device properties for the myriad of electronic devices on an IC, process control for the process steps was recognized as the strategy


E.A. Irene

to follow. Processes to produce devices and arrays of devices, called IC chips, are essentially a series of individual process steps, many of which involve thin films and include surface preparation and characterization, film formation, film characterization, film removal, and device feature development, which is essentially lithography. In this chapter the focus is on SiO2 films, which are of utmost relevance to microelectronics. Si technology through the 1970's and 1980's was dominated by films thicker than 20 nm. During this period ellipsometry dominated the measurement of dielectric film thickness and film growth dynamics. In the earliest studies in this era single wavelength ellipsometry (SWE) was used extensively for film thickness measurement. In the late 1970's spectroscopic ellipsometry (SE) was shown to be readily implemented [3.21] and superior for most applications. In the 1990's SE, in concert with a wide variety of surface analytical techniques and microscopies, was used to characterize films thinner than 20 nm that are largely interface.

3.2.1 In situ Real-Time Oxidation Studies: Dry O2, the Effects of Water and Other Impurities

The thermal oxidation of Si provides an excellent example of the power of ellipsometry in addressing microelectronics issues and represents one of the first examples of real-time monitoring of a microelectronics industrial process, viz. Si oxidation in 1 atm O2 [3.17, 3.22]. In the area of microelectronics it is usually desirable to measure various film thicknesses and, in particular for the important Si oxidation process, the resulting SiO2 film thickness, L, and the real part of the index of refraction, n. It should be remarked that in general the refractive index N is complex: N = n + ik, where the imaginary part k is the absorption constant. However, for SiO2 in the visible region of the spectrum k = 0, so N = n.
Furthermore, it is necessary to have sufficient process control to obtain specified film thicknesses to within several percent. Single wavelength ellipsometry (SWE) measurements enable the extraction of both desired parameters, L and n, for a single transparent film thicker than 10 nm on a substrate for which the optical properties are known. Spectroscopic ellipsometry (SE) overspecifies this problem. Using a well aligned ellipsometer and near half of an ellipsometric period, n and L are easily obtained within an accuracy of several percent. (For a brief treatment of ellipsometry applications in microelectronics with appropriate references the reader is directed to [3.23].) Fig. 3.3a shows some oxidation results [3.17] from a 1 atm dry thermal O2 oxidation study of Si using SWE, which was performed in flowing O2. From the shape of the thickness versus time data it was possible to deduce and confirm oxidation models for the crucially important SiO2 film growth process. For example, the data were fit to the linear-parabolic model, and values for kp, kl and L0 were obtained. From the kp and kl values versus temperature, activation energies were also obtained (discussed separately later). Thus the


Fig. 3.5. Results from in-situ SWE employed during the Si oxidation process, including dry O2 results (< 1 ppm H2O in O2) and wet results (2000 ppm H2O in O2) as boundaries. The filled circles represent an experiment that started with dry O2 and switched to the wet ambient at 100 min. The data were obtained using 632.8 nm light and a 70.00° angle of incidence, and were analyzed using a single film optical model

oxidation process could be quantitatively described and reproduced using the model parameters from the optical model and the Si oxidation model. Furthermore, better accuracy than required for industrial processes could be obtained from the combination of models using ellipsometry for films thicker than 20 nm. The process advantage of using H2O over dry O2 for Si oxidation is the accelerated rate of oxidation with H2O; hence a process time advantage is gained where a thick SiO2 film is desired. However, a price is paid in the degraded electrical performance of the SiO2 film. The H2O reacts with the SiO2 network, forming two Si–OH groups. These network defects have been associated with an experimentally observed SiO2 bulk charge trapping phenomenon [3.24–3.26]. This internally trapped charge gives rise to internal electric fields that affect device performance. Dry ambient annealing after wet oxidation reduces the trapping somewhat, but never quite to the level of dry O2 grown SiO2 films [3.26]. Therefore, for active MOSFET device areas such as the gate oxide, a dry O2 thermally grown film is usually required. However, for masking and isolation oxides, one can often use the more economical wet O2 or steam oxidation process. The specific kinetic role of H2O in the oxidation ambient has been given considerable attention. It was discovered that even traces of H2O in O2 of about 25 ppm altered the oxidation rate by an unexpected 20% [3.27]. Figure 3.3b displays data from another in situ real-time SWE study [3.18], which includes wet (H2O added) O2 oxidation of Si. Figure 3.5 shows a combination of dry and wet Si oxidation data [3.18] that was obtained using in situ real-time SWE. The upper and lower solid lines act as bounds and show results from H2O in O2 (2000 ppm H2O in O2, labeled wet) and pure O2 (dry) oxidation processes, respectively. For the experiment shown in Fig. 3.5, an initially dry O2 oxidation ambient was switched at some arbitrary time (100 min) to a mixture of O2 with H2O (2000 ppm), whereupon the Si oxidation kinetics abruptly change to those characteristic of wet grown SiO2. The results show that almost immediately upon H2O addition the reaction rate changes to that for H2O in O2. This indicates both a rapid reaction of H2O with the SiO2 network and a rapid change in kinetics with small amounts of H2O in O2. Also, it was found that both the linear and parabolic rate constants, kl and kp, as obtained from fitting the L–P model to the data in Fig. 3.3b, increased with the H2O content in the O2. However, kp increased by a greater percentage than did kl with H2O additions. This was explained by considering that H2O reacts with the SiO2 network yielding OH groups, and thus the reaction effectively breaks up the network of Si–O–Si bonds. When this occurs the diffusivity of O2 and many other atoms and molecules is increased, thereby increasing the rate of transport and increasing the overall oxidation rate through kp. Also, H2O is a more aggressive oxidant for Si, which explains why with H2O addition even the interface reaction rate increases, as evidenced by an increase in kl.
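The rate constants quoted here come from fitting the L–P model to thickness versus time data. One simple way to do such a fit, sketched below, is to rearrange Eq. (3.10) into a straight line and apply ordinary least squares. This is only an illustration under the assumption that the offset (L0, t0) is known; the fitting procedure actually used in [3.18] is not described here and may differ. The function name and the synthetic data are hypothetical.

```python
def fit_lp(times, thicknesses, L0, t0=0.0):
    """Least-squares estimate of (kp, kl) from thickness-time data.

    Eq. (3.10) rearranges to (t - t0)/(L - L0) = (L + L0)/kp + 1/kl,
    a straight line in x = L + L0 with slope 1/kp and intercept 1/kl.
    All thickness values must exceed L0.
    """
    xs = [L + L0 for L in thicknesses]
    ys = [(t - t0) / (L - L0) for t, L in zip(times, thicknesses)]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return 1.0 / slope, 1.0 / intercept

# Synthetic, noise-free data generated from assumed constants (not measured values).
kp_true, kl_true, L0 = 200.0, 1.0, 20.0
Ls = [30.0, 50.0, 80.0, 120.0]
ts = [(L * L - L0 * L0) / kp_true + (L - L0) / kl_true for L in Ls]
kp_fit, kl_fit = fit_lp(ts, Ls, L0)
print(kp_fit, kl_fit)
```

With real data the points nearest L0 carry the largest relative error in (t − t0)/(L − L0), so a weighted or nonlinear fit may be preferable; the linearized form is shown only because it makes the roles of kp and kl transparent.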
In order to test this hypothesis of enhanced diffusion of oxidant in the presence of H2 O, an analysis was performed that considered the independent transport and reaction of two oxidant species: one related to O2 and the other related to H2 O. The resulting expression for L and t contains only the individual kl and kp values obtained from pure O2 and pure H2 O oxidations of the kind discussed above [3.18]. The prediction of this model was then compared with experimental data for oxidation in a controlled ambient. The results shown in Fig. 3.6 show an additional oxidation rate enhancement beyond having two oxidants. Thus the above hypothesis of enhanced transport due to a loosening of the SiO2 network is given some credence. It is also interesting to observe that the effect is reversible. If the ambient is switched from dry to wet and again to dry, the oxidation rate responds. The rate increases upon introduction of H2 O to the O2 and then slows down when the H2 O is turned off (see Fig. 3.5). Typically, Si oxidation is carried out in either a pure O2 or pure H2 O or H2 O in O2 or any of these ambients with HCl or some other Cl containing chemical. The reasons for these variations are mainly technological, however, with the use of these additional Cl containing chemicals considerable complexity results for the Si oxidation mechanism. The use of a Cl containing species in an O2 ambient is for the purpose of performing clean oxidations.


Fig. 3.6. A comparison of a model for parallel diffusion and reaction for two oxidant species for the oxidation of Si and experimental data for two oxidant species (O2 and H2 O) and for one oxidant species (O2 )

Mobile ionic charges (Qm) in SiO2 are usually attributed to Na contamination from the process environment. It was found that the effects of the mobile charges could be substantially reduced through the use of a Cl containing oxidation ambient [3.28–3.31]. In terms of oxidation kinetics, many of the chlorine containing additives contain hydrogen, which reacts with oxygen to form H2O and consequently increases the oxidation rate, as was discussed above. A detailed thermodynamic analysis of the oxidation environment has been made that indicates significant formation of H2O with H containing Cl components [3.32]. It is often necessary to oxidize heavily doped Si substrates (dopant concentration near 10^20 cm^-3). The usual Si dopants include P and As for n-type Si and B for p-type Si. Excerpted from extensive experimental oxidation studies [3.33], Fig. 3.7 shows that at the lower oxidation temperatures P doping accelerates the oxidation rate, but at the highest temperatures, above 1000 °C, B provides the faster oxidizing Si. In order to understand these results, an understanding of the redistribution of the dopants during oxidation is required. From the literature [3.34–3.38] it is shown that the donor impurities (As, P, Sb) accumulate at the Si–SiO2 interface, which comprises the so-called snow-plow effect, while acceptor impurities (Al, B, Ga, In) deplete. These effects are understood by comparing the relative solubilities of the dopants in SiO2 and Si and the diffusivity of the dopant in

Fig. 3.7. The effect of heavy P and B doping (nominally 0.001 Ω-cm) on Si oxidation kinetics in dry O2 at three temperatures (a–c)

SiO2 and Si. Changes in both the kl and kp values were found with dopant concentration. The kl values were increased whenever there was an increase in a dopant species at the Si–SiO2 interface, viz where the oxidation reaction occurs. This was at the low oxidation temperatures for P and the highest temperatures for B. The large effects found for kp were attributed to structural changes in the oxide due to the presence of P and B. Later work on the oxidation of heavily P, As and B doped Si [3.39] indicated that changes in kl with dopant seemed to be at least correlated if not related to the enhanced concentrations of point defects (vacancies) at the Si surface. Given the stress related models to be discussed later that derive from a consideration of the volume requirements for the oxidation reaction, the production of vacancies


by dopants may also have relevance for other modeling efforts. This defect model has not explained the effect of dopants on the parabolic rate constant. Other impurity effects on Si oxidation are far less understood. In one of the earliest (1959) and seminal papers on the subject of Si oxidation [3.3], the effect of impurities that can enter the oxidation process as a result of improperly cleaned Si was recognized. Specifically, these early workers found that impurities left on the Si surface before oxidation affected the reverse current characteristics of fabricated diodes by about seven orders of magnitude. Later it was found [3.40] that at high oxidation temperatures the fused silica furnace tube transpired impurities such as H2 and Na into the oxidation zone. Both of these impurities were found to enhance the oxidation rate. It is generally accepted that metallic impurities cause electrical problems in the Si substrate that are primarily related to carrier lifetimes. Most of the studies indicate that metals enhance the oxidation rate of Si [3.41–3.45], while a few, such as Al, that are network formers decrease the Si oxidation rate [3.46]. The so-called RCA cleaning process [3.47] emerged from work in the 1970's and spurred continuing studies of the effects of impurities and cleaning of Si. The early studies concluded that H2O2 solutions are effective for cleaning Si surfaces, with high pH solutions for the removal of organics and low pH solutions for the removal of metallic impurities. Often the RCA procedure is followed by an HF dip in order to remove the residual oxide and the impurities and particles at that oxide surface. Studies [3.48, 3.49] using SIMS, Auger and TEM analyses have also shown that H2O2 based cleaning solutions are effective at removing contaminants. However, one brief report [3.50] has shown that Si cleaned using an acidic peroxide solution yielded higher Si oxidation rates than Si cleaned using basic peroxide solutions.
Higher quantities of both mobile ions and fixed oxide charge were observed on the basic peroxide cleaned Si, and later this work was substantially confirmed [3.51]. However, while these impurity effects are readily measurable, the specific kinetic roles for specific impurities, with the exception of H2O, are largely unknown.

3.2.2 Arrhenius Behavior and Deviations

The L–P model indicates that a single activation energy should be associated with each rate constant, namely kp for transport and kl for the interface reaction. From the previously derived expression for kp in terms of the diffusion constant D, the transport process should yield the activation energy for diffusion of O2 in SiO2, given the likely assumption that the solubility is only a weak function of temperature. The interface reaction, with the associated rate constant kl, should also display a single activation energy associated with k2, the surface reaction rate constant. Figure 3.8 shows the Arrhenius plots for both kl and kp for the temperature range of 780 °C to 1050 °C [3.17, 3.33, 3.52], assuming an exponential form:

$$k = k_0 e^{-E_a/kT} \qquad (3.11)$$
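To make the use of Eq. (3.11) concrete, the sketch below extracts an apparent activation energy from rate constants at two temperatures, which is the slope of ln k versus 1/T between those two points. The pre-exponential factor and activation energy used to generate the example values are hypothetical and are not taken from Fig. 3.8.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def arrhenius(T_C, k0, Ea):
    """Rate constant from Eq. (3.11) at temperature T_C (degrees Celsius)."""
    return k0 * math.exp(-Ea / (K_B_EV * (T_C + 273.15)))

def activation_energy(T1_C, k1, T2_C, k2):
    """Apparent Ea (eV) from rate constants k1, k2 at two temperatures."""
    T1, T2 = T1_C + 273.15, T2_C + 273.15
    return K_B_EV * math.log(k2 / k1) / (1.0 / T1 - 1.0 / T2)

# Hypothetical parameters for illustration only.
k0, Ea = 1.0e9, 2.0
k_low = arrhenius(780.0, k0, Ea)
k_high = arrhenius(1050.0, k0, Ea)
print(activation_energy(780.0, k_low, 1050.0, k_high))
```

For a process that truly follows Eq. (3.11) the result is independent of the temperature pair chosen. When the Arrhenius plot is curved, the apparent value changes with the pair, which is one simple way to quantify a deviation.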

Fig. 3.8. Arrhenius activation energy plots (a) for the linear kl and (b) parabolic kp rate constants

It is clear that there is curvature in the plots. This curvature can arise from two sources. One is that the process is not represented by Boltzmann-like phenomena; the other is that the process is not an elementary kinetic step. Since there is no evidence to suspect the former, it is concluded that the phenomenological L–P model as derived above does not adequately account for the complexity of the actual Si oxidation process. From the direction of the curvature, i.e. concave up or down, the kind of rate process that is responsible can be deduced [3.52]. The concave upward shape for kl indicates that parallel rate processes are occurring, possibly indicating reaction between different interface oxidant species, while for kp a series process is indicated by the concave downward shape, which could possibly indicate transport both in the bulk and via micropores.

3.2.3 Stress Effects on Oxidation Kinetics

Early studies of SiO2 film stress [3.53, 3.54] were performed at room temperature on films grown on Si using oxidation temperatures greater than 1000 °C, which was appropriate to the technology at that time. These studies concordantly reported that the measured residual room temperature compressive film stress could be explained, both in magnitude and sign (tensile or compressive), by the thermal expansion stress (σth) that develops upon cooling from the oxidation temperature to room temperature (ΔT) as a result of the difference in thermal expansion coefficients, Δα, between SiO2 and Si. σth is proportional to the product of Δα and ΔT:

$$\sigma_{th} \propto \Delta\alpha \cdot \Delta T \qquad (3.12)$$

Since at the oxidation temperature ∆T = 0, the thermal component of the stress, σth , would be 0 during oxidation, and thus the thermal component of stress cannot be implicated in any stress driven oxidation models. However,


Fig. 3.9. Viscous flow model where the growing SiO2 film viscously relaxes into the free direction (z) driven by a compressive in plane (x, y) stress

an early study [3.55] indicated that an intrinsic compressive stress (σi ) exists for oxidation temperatures below about 1000◦ C and that σi increased in magnitude with decreasing oxidation temperatures. Later the existence and temperature variation of σi was confirmed [3.56, 3.57] and a model was proposed [3.56] which not only explained the appearance of the intrinsic stress, but also the simultaneous appearance of an increased SiO2 film density that was also reported [3.58] with the same increase for lower oxidation temperatures as for σi . The model called the “viscous flow model” [3.56] was based on earlier ideas [3.55, 3.59] and is explained with the use of Fig. 3.9. It is useful here to know that there is a 120% increase in molar volume when an atom of Si converts into a molecule of SiO2 . This volume requirement can be met for the SiO2 formation on the Si surface by the expansion of the as-formed SiO2 into the free direction, viz. the direction normal to the Si surface (z in Fig. 3.9). If all of the oxide produced by reaction flows into this direction, then the anisotropic expansion created by the conversion of Si into SiO2 will occur without stress. The viscous flow model assumes that this free direction is “found” by the mechanism of viscous flow at the high oxidation temperatures, above 1000◦ C where the oxide viscosity is sufficiently low. The oxide is constrained by adhesion in the plane of the Si surface and thus can only readily flow into the normal direction. The constraint in the lateral direction and flow in the normal direction can be analogized as the flow of toothpaste from a tube as the tube is compressed normal to the direction of flow. This results in a biaxial compressive stress in the SiO2 film and a stress gradient in this film normal to the Si surface. However, at lower oxidation temperatures, the higher oxide viscosity precludes easy flow within the time frame for oxidation, and an intrinsic stress, σi , develops. 
Since the oxide viscosity increases as the temperature decreases, it follows that the intrinsic stress which develops should also increase with decreasing oxidation temperature, as is observed. Along with the observation of the intrinsic stress and its temperature

Fig. 3.10. (a) Intrinsic SiO2 film stress resulting from the thermal oxidation of various Si substrate orientations as a function of oxidation temperature; (b) SiO2 film density as a function of oxidation temperature as obtained from ellipsometric measurement of n and the Lorentz–Lorenz formula

dependence, is the parallel observation of an increase of the SiO2 film density [3.56, 3.58, 3.60, 3.61] with decreasing oxidation temperature. Using the viscous flow model, the densification of SiO2 can be understood as the accommodation of the SiO2 film growth system to the accumulation of stress, viz. the system attains as small a volume as possible so as to minimize the stress. Although the SiO2 network is quite open, only a small density increase is permitted before large repulsive forces are encountered. Between the oxidation temperatures of 1100◦ C and 700◦ C about 3% density increase is observed. The first report of a density increase with lower oxidation temperatures was by Taft [3.58], and was based on the precise measurement of the refractive index, n, values as a function of oxidation temperature and the application of the Lorentz–Lorenz relation to convert n to density. Later Irene et al. [3.61] found nearly identical refractive indices, but went further and obtained the density directly from measurements of the film volume and mass. While this latter measurement of density is not as precise as the measurement of n, the direct measurement yielded the same temperature dependence as the n derived values and approximately the same absolute values, thereby increasing the confidence with ellipsometrically obtained density values. Figure 3.10 shows the measured film stress (Fig. 3.10a), σi , and density (Fig. 3.10b) as a function of oxidation temperature and obtained from n measurements. The same density change as a function of oxidation temperature was found using infrared spectroscopy (IR) techniques [3.60]. The IR spectra for SiO2 prepared by thermal oxidation at three oxidation temperatures showed a shift towards lower frequency, v, for the 1075 cm−1 band. This band is associated with the Si–O–Si bond angle, Θ, which is the angle between adjacent SiO4 tetrahedra and is a measure of the Si–Si distance that relates directly to the SiO2 density. 
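The conversion from refractive index to density mentioned above can be sketched with the Lorentz–Lorenz relation, in which (n² − 1)/(n² + 2) is proportional to density if the polarizability per SiO2 unit is taken as constant. That constancy is an assumption of this sketch, and the two refractive index values used are illustrative numbers rather than data read from Fig. 3.10.

```python
def lorentz_lorenz(n):
    """Lorentz-Lorenz function (n^2 - 1)/(n^2 + 2), proportional to density
    when the molecular polarizability is taken as constant."""
    return (n * n - 1.0) / (n * n + 2.0)

def relative_density_change(n_ref, n):
    """Fractional density change of a film with index n relative to n_ref."""
    return lorentz_lorenz(n) / lorentz_lorenz(n_ref) - 1.0

# Illustrative refractive indices (assumed, not taken from Fig. 3.10).
change = relative_density_change(1.460, 1.475)
print(f"{100.0 * change:.1f} % density increase")
```

The sketch shows why precise index measurements are needed: for these illustrative values an index change of about 1% corresponds to a density change of a few percent, the same order as the effect reported here.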
The lower v, the smaller is Θ, and hence the smaller is the Si–Si distance and the higher is the film density. From the IR, a 2–3% increase in density is obtained, in substantial agreement with earlier measurements.

3.2.4 Orientation Effects on Oxidation Kinetics

The earliest reports of a Si substrate orientation dependence [3.20] showed that for high pressure steam oxidation in a film thickness regime well beyond L0, the Si(111) surface was the fastest oxidizing, with Si(110) next, followed by the Si(100) surface. According to the L–P model the Si orientation dependence is related to the interface reaction of oxidant with Si at the Si surface, where the oxidation kinetics should scale with the area density of Si atoms on the surface, i.e. the Si atom concentration at the reacting Si surface. However, as is seen in Table 3.1, Si(110) has the greatest area density of Si atoms among these three major orientations. Thus, a more convoluted model was constructed that depended not simply on the area density of Si atoms but rather on the number of available bonds at the Si surface [3.20]. Furthermore, according to the LP model [3.13] the orientation dependence of the oxidation rate should be incorporated in the linear rate constant, kl, since the substrate


Table 3.2. Barrier heights that yield equivalent thermionic electron and experimental oxygen fluxes for different oxidation temperatures, Si orientations and thickness ranges

Oxidation          Si            Oxidation Rate   SiO2 Thickness   Barrier Height
Temperature (°C)   Orientation   (nm/min)         Range (nm)       (eV)
600                100           0.0004           2–3              2.87
                   110           0.0012           2.5–6            2.79
650                100           0.014            2.5–7            2.95
                   110           0.041            2.5–10           2.86
700                100           0.014            2.5              2.92
                                 0.0057           10               3.00
                                 0.0043           20               3.02
                   110           0.033            2.5              2.85
                                 0.014            10               2.92
                                 0.0094           20               2.96
750                100           0.028            5                3.02
                                 0.024            10               3.03
                                 0.019            20               3.06
                   110           0.057            5                2.96
                                 0.035            10               3.00
                                 0.024            20               3.04
1000               100           1.79             7                3.35
                                 0.9              20               3.43
                                 0.27             100              3.56
                   110           3.45             9                3.28
                                 1.74             20               3.35
                                 0.4              100              3.52

surface ought to dominate the surface reaction. Figure 3.11 displays another set of in situ real-time ellipsometry data [3.62] obtained using SWE and a single film optical model, with 632.8 nm light from a He–Ne laser incident at φ = 70°; the oxidations were performed in dry flowing O2. Figure 3.11a clearly shows that for the thinnest films (< L0) on the three major low index planes of Si, the order for the oxidation rate, R, is:

$$R_{110} > R_{111} > R_{100}$$

As the films grow thicker, the order of the Si(110) and Si(111) orientations crosses over (near 15 nm), so that films thicker than 20 nm are in agreement

Fig. 3.11. In-situ real-time Si oxidation data for various orientations of Si obtained using SWE at φ = 70° and λ = 632.8 nm in dry O2: (a) at 800 °C and (b) at 700 °C. The data were analyzed using a single film optical model

with the early studies [3.20] that were performed on thicker films. An explanation for this crossover in the oxidation rates was based on the interface reaction and the effects of stress on Si oxidation [3.63]. The initial oxidation rate scales with the area density of Si atoms. Beyond the crossover, however, another physical mechanism must dominate, one based on the fact that Young's modulus, E, for Si varies with orientation [3.64, 3.65] as:

$$E_{111} > E_{110} > E_{100}$$

It is also important to recall the observation discussed above that an intrinsic stress develops during thermal oxidation. The observed tensile stresses in the plane of the Si surface (compressive in the oxide film) stretch the Si–Si bonds in the surface and thereby increase the Si surface reactivity, resulting in an accelerated surface reaction rate. This stress model [3.63] seems to predict the qualitative aspects of the crossover observations and is even relatively quantitative for the Si(111) and Si(110) orientations. Later [3.57] it was also discovered that the thermal oxidation of Si(111) resulted in a smaller compressive film stress than for the other major orientations. The lower compressive stress for the SiO2 on Si(111) would lead to a higher diffusivity and thus a greater oxidation rate for Si(111) as the diffusion regime is entered as the SiO2 grows thicker. Hence a crossover in growth for Si(111) and Si(110) would be anticipated. In an effort to confirm this model, the in situ ellipsometry data above were extended down to an oxidation temperature of 600 °C, where the intrinsic stresses are larger, and the study included the Si(311) and Si(511) orientations [3.66]. Previously the crossover had been seen in the oxidation data from 1100 °C down to 750 °C. However, for the lower oxidation temperatures the films were too thin to exhibit the crossover. The (311) and (511) orientations are actually stepped vicinal planes of the low index major planes, Si(111) and Si(100) [3.67], with area densities of Si atoms that lie between the (111) and (100) values, as is shown in the right side of Table 3.1. The left side entries would be for flat (311) and (511) planes. The experimentally determined oxidation rate order was found to be:

$$R_{110} > R_{111} > R_{311} > R_{511} > R_{100}$$

This order, shown in Fig. 3.11b for the 700 °C oxidation data, is concordant with the Si atom area densities for these vicinal planes; that is, the initial oxidation rate scales with the Si atom area density.

3.2.5 Effects of Light on Oxidation Kinetics

The attempts to oxidize Si using photons as the source of excitation were tied to the technological desire to reduce the thermal budget for the thermal oxidation process. Early experiments utilized photons emanating from


Fig. 3.12. Electron energy band diagram for the Si–SiO2 system showing the barrier energies for electrons from Si to SiO2 (3.15 eV from the Si conduction band and 4.3 eV from the Si valence band)

both a mercury arc lamp and an iodine vapor lamp, sometimes in combination [3.68], that were focused onto the cleaned and heated Si surface. An enhancement of the oxidation rate was seen for all temperatures from 955 °C to 1215 °C, with the greatest photon enhancement at the lower temperatures. It was reported that the initial oxidation regime was most affected, and it was pointed out that near ultraviolet (uv) photons can elevate electrons over the Si–SiO2 barrier. Several later investigations used laser light as the source of photons [3.69–3.71]. In one study [3.69] the uv light from an Ar laser was shown to enhance the oxidation rate even after the calculated temperature rise was taken into account. Thus both thermal and photonic effects were reported, and Si bond breaking was proposed to account for the effect. Also, a wavelength dependence was reported indicating that for light with energies greater than about 3 eV an additional enhancement was observed. In another study [3.70] a scanning Ar laser was used to heat a large area of the Si surface and thereby increase the oxidation rate. Again both thermal and photonic enhancements were reported. In another study [3.71], in which the laser beam power density was carefully considered, an even greater photonic effect was reported. This work showed a much greater enhancement of the oxidation rate with light above 3 eV and greater effects with p-type Si, and thus strongly suggests that electrons and electron related effects are important for Si oxidation. Electron effects were postulated as causative, possibly through the destabilization of the O2 molecule, and this point is returned to below. Reported photon enhanced oxidation effects indicate the importance of the electron barriers to Si oxidation kinetics and have led to a new model for oxidation [3.72]. From Fig. 3.12, it can be seen that to promote electrons


from Si to the SiO2 conduction band, and thereby provide free electrons for the oxidation reaction, several routes having different energies are available. From the Si valence band a promotion energy of above about 4 eV is required, while from the Si conduction band about 3 eV is necessary. Intermediate in energy are the defect levels (for n and p Si) and the intrinsic Fermi level, at which there are no electron states in a perfect material. To determine which, if any, of these barriers may be oxidation rate limiting, we calculated the electron flux, Jet, from the Richardson–Dushman thermionic emission equation:

$$J_{et} = A T^2 \exp(-\chi_0 / kT) \qquad (3.13)$$

where χ0 is the electron barrier height, T is the absolute temperature and A is the Richardson constant. This electron flux is then compared with the flux of O2, J(O2), which is obtained from the experimental oxidation rates. This comparison assumes that one electron per O2 molecule is required for oxidation, which is justified within the specific proposed mechanism. Table 3.2 shows in the last column the calculated barrier heights for which Jet equals J(O2) at various oxidation temperatures. It is seen that barriers of the order of 3 eV are appropriate for oxidation temperatures below 1000 °C and for thin oxides, and 3 eV is approximately the energy barrier value for the Si conduction band electrons (see Fig. 3.12). A simple calculation [3.72] confirmed that there are sufficient conduction band electrons for oxidation, by a factor of ten or more, at any temperature above room temperature; at room temperature the numbers are marginal. From these results we proposed a mechanism [3.66] in which there exists a flux of O2 to the Si surface on the SiO2 side of the Si–SiO2 interface that is rapid relative to the consumption of O2, and also a rapid flux of electrons on the Si side, with the flux of electrons (e−) over the Si–SiO2 barrier being rate limiting. Once an e− goes over the barrier it attaches by a favored reaction [3.73] to O2, forming O2⁻, which decomposes to O atoms more readily than O2 (by 25% or more). Oxidation then proceeds readily by reaction of Si with O atoms. In parallel, oxidation can also occur, but much more slowly, by reaction with O2. Such a parallel reaction scheme was already suggested for the initial regime [3.74], and the curvature found for Arrhenius plots for linear rate constants could be explained based on a parallel path reaction scheme [3.52].
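The flux comparison described above can be sketched numerically by inverting Eq. (3.13) for the barrier at which the thermionic electron flux matches the O2 flux implied by a measured oxidation rate. The chapter does not state the values of the Richardson constant or the oxide density that were used, so the free-electron Richardson constant (about 120 A cm⁻² K⁻²) and an SiO2 molecular density of 2.2 × 10²² cm⁻³ below are assumptions of this sketch, as are the function names. One O2 consumed per SiO2 unit formed and one electron per O2 are taken from the stated mechanism.

```python
import math

K_B_EV = 8.617333e-5    # Boltzmann constant, eV/K
Q_E = 1.602177e-19      # elementary charge, C
A_RICHARDSON = 120.17   # A cm^-2 K^-2, free-electron value (assumed)
N_SIO2 = 2.2e22         # SiO2 units per cm^3 (assumed, roughly 2.2 g/cm^3)

def o2_flux(rate_nm_per_min):
    """O2 molecules consumed per cm^2 per second, one O2 per SiO2 unit formed."""
    return rate_nm_per_min * 1e-7 / 60.0 * N_SIO2

def barrier_height(T_C, rate_nm_per_min):
    """Barrier (eV) at which the electron flux of Eq. (3.13) equals the O2 flux,
    assuming one electron per O2 molecule."""
    T = T_C + 273.15
    j_required = Q_E * o2_flux(rate_nm_per_min)  # current density in A/cm^2
    return K_B_EV * T * math.log(A_RICHARDSON * T * T / j_required)

# Two rate entries from Table 3.2, both for the Si(100) orientation.
print(round(barrier_height(1000.0, 1.79), 2))
print(round(barrier_height(600.0, 0.0004), 2))
```

Under these assumptions the two results fall close to the corresponding barrier heights listed in Table 3.2. Because the barrier enters through a logarithm, a factor of two uncertainty in the assumed constants shifts the result by only a few hundredths of an eV at these temperatures.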
Finally, this e−-limited mechanism yields insight into the formation of the 1 nm native oxide that forms virtually instantly on a fresh Si surface even at room temperature, yet virtually ceases to grow after about 2 nm unless the temperature is raised. If we consider the approximately $10^{15}$ Si surface electronic states, most of which have eventually captured an e− from the bulk Si, then there are some $10^{15}$ electrons available for Si oxidation. These electrons exist in closely spaced levels and require only a small promotion energy. Thus these electrons are available in parallel to the electrons produced thermionically. The $10^{15}$ electrons, at one e− per O2 molecule, would yield about 1 nm of SiO2, which is the experimentally measured native oxide thickness. Once the
states are removed via oxidation, however, this native oxide can no longer form and the thermionic and/or photonic excited electrons are required for further oxidation.

3.2.6 The Thin Film Regime (< 20 nm)

Presently the microelectronics industry employs SiO2 gate oxides in MOSFET’s that are less than 10 nm thick and less than 2 nm thick for advanced devices. This film thickness regime is difficult to measure accurately. In addition many of the properties of these ultra thin films vary with film thickness. The oxidation kinetics for this regime (the thickness regime < L0) has been in dispute since the 1960’s as has been the nature of the Si–SiO2 interface. This oxidation regime comprises the region of SiO2 film thickness that does not conform to L–P kinetics, i.e. from zero oxide thickness up to L0 that is about 20 nm. For the purposes of discussion, the initial regime is divided into two parts based on film thickness. The first part includes the very initial regime that is from a bare Si surface up to about 1 nm; and this is followed by the regime of about 1 nm to 20 nm. The former regime is very important from a fundamental point of view, since it emphasizes the reaction at the Si surface without a consequential amount of SiO2 being present. The latter regime is of particular practical importance because at this time advanced MOSFET’s require gate SiO2 to be less than 5 nm and even below 2 nm for the most advanced technologies. It is interesting to note that most Si oxidation experiments begin with Si samples that already have a native oxide of about 1 nm. However, it is possible to commence an oxidation experiment with a bare Si surface. This can be done using “brute force” by placing a Si sample in an ultra high vacuum chamber and heating above 700 °C so as to volatilize the oxide layer.
Actually, the oxide itself doesn’t evaporate appreciably at this temperature; rather, with SiO2 in contact with Si, the oxide disproportionates to a volatile component, SiO, as:

SiO2(s) + Si(s) → 2SiO(g)    (3.14)

Another less aggressive way to achieve a bare Si surface is to first dip the oxide coated Si surface in HF just before performing the desired reaction at the Si surface. HF will remove the surface oxide and render the Si surface hydrogen terminated and strongly hydrophobic. This H terminated surface can be stable for hours depending on the ambient and conditions. The H terminated Si surface can be gently heated to several hundred ◦ C to remove the H and leave a bare Si surface. The HF treated H terminated Si surfaces were studied using a combination of ex situ and in situ spectroscopic ellipsometry [3.75]. These workers were easily able to follow the changes at the Si surfaces when the monolayer of H was removed. Of course the bare Si surface from either procedure will react rapidly to reform oxide even at UHV. In discussions of oxidation kinetics presented above, the symbol L0 was used to

denote the upper limit of the initial regime of several tens of nm, and now Ln will be used to denote the upper limit of the very initial oxidation regime that is essentially the native oxide thickness. In this section the progress made with the thin film regime (up to L0) is reviewed up until the early 1990’s. In a later section ultra thin films (up to 10 nm) are again addressed, but in the light of recently developed techniques that can access information about ultra thin films. The study of the ultra thin film regime from 0 to Ln is fraught with experimental difficulties such as the cleanliness of the Si surface, and the vacuum conditions during the experiments. One early study comprising surface energy measurements made in room ambient [3.76] using contact angle techniques showed a steep change in surface energy as the oxide thickness changed from 0 to 3 nm. These experiments strongly suggested that only a change in composition would explain the large observed changes in surface energy. Later Rutherford backscattering measurements on thin oxides appear to be in agreement [3.77]; however, an ambiguity exists as to whether the etch solution (HF in H2O) used to etch SiO2 for the contact angle experiments alters the surface energy. One study [3.78] that paid particular attention to surface cleanliness, by utilizing in situ Auger (AES) and electron energy loss (EELS) spectroscopies in the UHV chamber and surface cleaning at high temperatures, reports the disappearance of surface electronic states upon exposure of bare Si to oxygen as well as the formation of oxide. Initially the oxide appears to be a suboxide, in agreement with previous studies [3.77, 3.79–3.81]. Another similarly careful UHV study utilizing EELS [3.82] reports three different stages of Si oxidation. The earliest stage occurs after oxygen exposure at low temperatures (100 K) and involves adsorption of molecular oxygen, which in the second stage, at higher temperatures, converts to atomic species.
The final stage for higher exposures indicates the formation of SiO2 . Optical absorption of a UHV cleaved Si crystal [3.83] also shows that surface electronic states within the Si band gap disappear upon oxygen exposure; these states are associated with dangling bonds at the Si surface and as was discussed above are implicated in the electronic passivation of the Si surface. The valence band states are observed to be unaffected. Therefore, these studies affirm the intuitive notion that the Si–SiO2 interface has some transition region of graded composition and that the formation of a stoichiometric SiO2 film occurs beyond at least a few atomic layers and at temperatures above room temperature. The next growth regime from Ln to L0 that extends from about 1 nm to several tens of nm is the regime that is now the focus of microelectronics technology. The reason that technology is now demanding thinner high quality SiO2 films is a direct result of the economically motivated demand for greater device densities for manufactured chips. In order to achieve greater device densities on a fixed chip area, it is obvious that smaller device areas are required, and along with this but less obvious is that thinner dielectric films are also required. This latter and presently most germane requirement is best understood by considering MOS devices that are designed to operate

at certain applied electric fields. The operating fields and the electric field distribution in the gate region are determined in part by the capacitance of the gate oxide in the MOS structure which is given as: C = KA/L

(3.15)

where K is the static dielectric constant for SiO2, A is the gate contact area and L is the dielectric thickness (capacitor plate separation), in our case the SiO2 film thickness. Thus it is easily seen from this relationship that in order to scale a device down to a smaller area A while maintaining the designed C, the SiO2 film thickness, L, must also decrease, thereby maintaining the device operational characteristics. Present advanced technologies find the required L below L0, i.e. below 20 nm. For the highest quality SiO2 grown in dry O2, this thickness is within the offset region of the L–P model, i.e. below L0, t0, and hence without analytic description. It will be shown below that new results for the refractive index for interfacial oxides grown on Si have enabled a further elucidation of the Si oxidation kinetics for the thin oxide regime. In terms of the oxidation data representing this regime, the best data are obtained from in-situ ellipsometric experiments, of which there are several published studies [3.17, 3.22, 3.72, 3.84, 3.85]. The early results of Hopper et al. [3.22] show that the shape of the initial regime up to L0 is basically linear-parabolic, as is the thicker film regime, but with different L–P rate constants than those used to describe the thicker film regime. Similarly precise work [3.85], with some data shown in Fig. 3.11, revealed that the shape of the L versus t data is more parabolic when the ambient contains more H2O. This was established based on the fit of the initial regime data to a simple linear equation of the form: t = k1 L + k2

(3.16)

where k1 and k2 are simply the slope and intercept, respectively, and have no other physical significance. The quality of the fit indicates that from 780 to 980 °C the dry O2 data fits best. The parabolic term in the L–P model is derived from the consideration that the formed SiO2 film actually protects the Si surface from further oxidation, i.e. the grown film provides a barrier to further oxidant penetration. With this idea, the parabolic shape of the oxidation data is interpreted as the film’s ability to protect the underlying substrate from further oxidation, and one deduces that the wet grown oxides are more protective. The extensive data of Massoud et al. [3.84] generally agrees with the other data on the appearance of the initially fast oxidation regime.

3.2.7 The Si–SiO2 Interface: Measurement and Implications

An extensive review of the Si–SiO2 interface [3.86] strongly supports the notion that the interfacial region is chemically and physically distinct from

Fig. 3.13. Interface model for the Si–SiO2 interface. The interface layer (Linf ) made up of Si roughness (R) and suboxide (Lso ) on a Si substrate. The bulk SiO2 film has a thickness Lov

either Si or SiO2 . From several published studies [3.87–3.89], it was suggested that there is likely an epitaxial relationship between the first several layers of oxide grown on the Si surface. This is derived from structural compatibility arguments and the minimization of the molar volume difference between the two phases. In terms of an oxidation model [3.87] a two-step process was envisioned in which the first step produced the epitaxial layer of oxide, but with a concentration of interstitial Si atoms. The atoms resulted from the imperfect match across the phase boundary. The second step was the oxidation of the interstitial Si atoms with the concomitant amorphization of the oxide due to the lattice expansion. Studies on the transport of oxidant through a growing oxide [3.90] yielded some evidence that the very initial oxide forms as a result of the motion of O atoms as opposed to the findings for thicker oxides. This new mode of transport for the very thin films suggested the possibility for an ionic transport mechanism. More recently a novel ellipsometric method called spectroscopic immersion ellipsometry [3.91] (SIE) that essentially immerses a film-substrate sample into a transparent liquid (for SiO2 on Si, CCl4 is used) that has optical properties very close to the optical properties of the dielectric overlayer. Hence, this technique “optically” removes the overlayer and thus enhances

Fig. 3.14. The changes in the interface layer thickness with annealing time for several anneal temperatures. The measurements were performed on a 3 nm SiO2 film on Si immersed in CCl4 after anneal at the specified time and cooling from the anneal temperature

the sensitivity to the film-substrate interface by about 10x. This technique is used in conjunction with a working model for the interface between crystalline Si substrate and amorphous SiO2 film that is shown in Fig. 3.13. The transition region has a structure with two major components: the “physical” interface and the “chemical” interface. The “physical” interface can be represented by microroughness or protrusions of Si into the oxide. The “chemical” interface consists of a suboxide, SiOx with 0 < x < 2. For the case of nitridation of Si the interface layer could be a nitride or even an oxynitride. The crystalline silicon protrusions are described as hemispheres with an average radius R, which form a hexagonal network with an average distance D between centers. The protrusions and the region between them are covered by a layer of suboxide assumed to be SiO (i.e. x = 1) with an average thickness of LSiO . An effective interface thickness is given as: Linf = R + LSiO .

(3.17)

The Bruggeman effective medium approximation (BEMA) was used to calculate the effective dielectric function of the interface. The evolution of the Si–SiO2 interface as a function of high temperature annealing (750–1000◦ C) was investigated by SIE. Figure 3.14 shows modeled data in terms of the interface thickness defined above as Linf , which displays the temperature-time dependent shrinkage of the interface with annealing.
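The Bruggeman mixing rule mentioned above has a closed-form solution for two phases, sketched below. This is an illustration only: it uses real, positive dielectric constants (ignoring absorption in Si), and the numerical values and volume fractions are hypothetical rather than taken from the SIE analysis.

```python
import math

def bruggeman_two_phase(eps_a, eps_b, frac_a):
    """Effective dielectric constant of a two-phase mixture (real, positive inputs).

    Solves  f_a*(eps_a - e)/(eps_a + 2e) + f_b*(eps_b - e)/(eps_b + 2e) = 0
    for e, which reduces to 2e^2 - b*e - eps_a*eps_b = 0; the positive root is taken.
    """
    frac_b = 1.0 - frac_a
    b = (3.0 * frac_a - 1.0) * eps_a + (3.0 * frac_b - 1.0) * eps_b
    return (b + math.sqrt(b * b + 8.0 * eps_a * eps_b)) / 4.0

def bruggeman_residual(eps_eff, eps_a, eps_b, frac_a):
    """Left-hand side of the Bruggeman condition; zero at the solution."""
    frac_b = 1.0 - frac_a
    return (frac_a * (eps_a - eps_eff) / (eps_a + 2.0 * eps_eff)
            + frac_b * (eps_b - eps_eff) / (eps_b + 2.0 * eps_eff))

EPS_SI = 15.0     # assumed real dielectric constant of Si in the visible
EPS_SIO2 = 2.15   # assumed dielectric constant of SiO2 (about 1.465 squared)

for si_fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    eps_eff = bruggeman_two_phase(EPS_SI, EPS_SIO2, si_fraction)
    print(f"Si fraction {si_fraction:.2f}: effective dielectric constant {eps_eff:.2f}")
```

A full treatment of measured spectra would use complex dielectric functions at each wavelength, where the choice of root needs more care than in this real-valued sketch.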

Distinct modes of behavior are observed for the evolution of the interface. For short annealing times a rapid change in the interface is observed that correlates with the disappearance of protrusions, followed by a slower change that correlates with the disappearance of the suboxide. At high annealing temperatures it was thought that viscous relaxation dominates (as was discussed above in Sect. 3.2.3), while at low annealing temperatures the suboxide reduction is apparent. With the use of the above optical model, it was reported that the thickness of the SiO layer at the interface, LSiO, for all (100), (110), and (111) Si substrate orientations increased slightly, and the average radius of the crystalline silicon protrusions, R, decreased with the thickening of the SiO2 overlayer. This yields an overall decrease in the interface layer (Linf) as is seen in Fig. 3.14. These results are consistent with the well accepted linear-parabolic, LP, Si oxidation model, which yields an accurate representation of the growth of SiO2 on Si over a wide range of thickness, temperature and oxidant partial pressures. Also, as will be discussed later, these results are concordant with Si surface smoothing from oxidation. For large thicknesses it is straightforward to show that the LP formula reduces to $t \cong L^{2}/k_p$. This formula implies that oxide growth is diffusion controlled. In other words, as the oxide layer gets thicker, the oxidizing species must diffuse through a larger distance to arrive at the Si/SiO2 interface. The reaction thus becomes limited by the rate at which the oxidizing species diffuse through the oxide. The disproportionation reaction above that yields SiO(g) from the reaction of SiO2 with Si can be initiated at active defect sites already present at the Si/SiO2 interface. In the model above the Si protrusions may be considered as defects that could cause the above decomposition, since these sites are thermodynamically active due to the smaller radius of curvature.
This is consistent with the results that show that with the thickening of the SiO2 , the thickness of SiO layer, LSiO , at the interface increases and the average radius of the crystalline protrusions, R, decreases.
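The large-thickness limit of the LP formula quoted above can be checked numerically. The sketch below assumes the common form t = L/k_l + L²/k_p with the offset terms (L0, t0) omitted; the rate constants are hypothetical and chosen only so that the crossover near L = k_p/k_l is easy to see.

```python
def lp_time(thickness_nm, k_lin, k_par):
    """Oxidation time from a linear-parabolic law t = L/k_lin + L**2/k_par
    (offset terms omitted; an assumed form for illustration)."""
    return thickness_nm / k_lin + thickness_nm**2 / k_par

def parabolic_fraction(thickness_nm, k_lin, k_par):
    """Fraction of the total time contributed by the parabolic (diffusion) term."""
    return (thickness_nm**2 / k_par) / lp_time(thickness_nm, k_lin, k_par)

K_LIN = 1.0     # nm/min, hypothetical linear rate constant
K_PAR = 100.0   # nm^2/min, hypothetical parabolic rate constant

for thickness in (10.0, 100.0, 1000.0, 10000.0):
    fraction = parabolic_fraction(thickness, K_LIN, K_PAR)
    print(f"L = {thickness:7.0f} nm  parabolic share of t = {fraction:.3f}")
```

The parabolic share equals 1/(1 + k_p/(k_l L)), so it approaches 1 once L is much larger than k_p/k_l, which is the sense in which growth becomes diffusion controlled.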

3.3 Modern Era: The Quest for Thinner SiO2 and Alternatives

3.3.1 Ultra-thin SiO2 Film Metrology

Ellipsometry is often used to measure film thickness, especially for films less than 200 nm thick. However, in the ultra-thin film regime (less than 10 nm thick) ellipsometry cannot accurately determine film thickness. This is due to both the general unavailability of refractive indices for thin films and the difficulty of measuring refractive indices for ultra-thin films. Since ellipsometry measures the product of refractive index and film thickness (this product is the optical path length or optical thickness), if the index is not known or is inaccurate then the errors in index will result in thickness errors. Although it is often done, it is dangerous to use the bulk film refractive index for a film

Fig. 3.15. Pictorial of crossectional transmission electron microscopy (XTEM) of a rough interface

that is less than 10 nm, since it is known [3.91–3.94] that the optical properties including the refractive index for ultra-thin SiO2 films are different than for thick films. For SiO2 the ultra-thin films have a higher index than the bulk film due to compressive interfacial stresses and suboxide content as was discussed above. Thus, the use of the low bulk refractive index for thin SiO2 films would yield too large a film thickness. Other techniques such as x-ray photoelectron spectroscopy (XPS) and cross-sectional transmission electron microscopy (X-TEM) have commonly been used to measure thin film thickness. However these techniques also have significant errors associated with them. For XPS, for example [3.95], besides problems with adventitious C on the surface and photoelectron diffraction affecting measured intensities, the XPS measurement of film thickness requires the measurement of the attenuation length (A.L.) for the photoelectron. A.L. is typically measured in a separate experiment, and it requires knowledge of the film density that is usually not known accurately, since it can be different from the bulk density. The film thickness is obtained from a formula of the form:

$$L_{ox} = (\mathrm{A.L.}) \sin\theta \, \ln\left(1 + \frac{1}{Q}\right) \qquad (3.18)$$

In this formula Q is a product of two ratios of intensities that are measured in separate experiments. Thus XPS is subject to many potential sources of errors for film thickness measurements. X-TEM accesses the projection of

Fig. 3.16. Fowler–Nordheim (FN) energy barriers. When the applied voltage Vox > the barrier (φM ) FN tunneling occurs and FN tunneling current oscillations (FNCO’s) are possible from the interference of the electron waves

the lattice planes throughout the cross-section sample thickness as depicted in Fig. 3.15. If the sample is rough as shown or is tilted, the accuracy of the film thickness is affected. In particular roughness makes it difficult to find the interfaces, and sample tilt would yield the minimum film thickness due to the projection of all the lattice planes in the cross-section. Thus, the combination of ellipsometry using thick film indexes plus X-TEM can be used to bracket the real film thickness. Without an accurate knowledge of film thickness it is difficult to fully characterize device properties, since film thickness determines the distribution of electric fields. Also, accurate oxidation processes cannot be developed with inaccurate film thicknesses, and the linear–parabolic model that applies to SiO2 thicker than 10 nm cannot be used to model or predict the ultra-thin SiO2 film growth processes via thermal oxidation. In addition device operation at high electric fields can be hampered in the cases where the interface between the SiO2 film and Si substrate is rough. Both high field carrier mobility and film dielectric reliability can be reduced by interfacial roughness. One technique that has been developed to obtain accurate film thicknesses uses Fowler–Nordheim (FN) electron tunneling current oscillations (FNCO’s), a subject that was discussed theoretically in the 1960’s [3.96, 3.97], and in the 1970’s and 1980’s was experimentally developed by Maserjian and colleagues [3.98]. As is depicted in Fig. 3.16, FN tunneling current oscillations result from the interference of incident (injected from the metal contact) and reflected (from the Si–SiO2 interface) electron waves in the SiO2 film conduction band in a MOS structure. As shown by Fig. 3.16 the interference of propagating electron waves occurs when the applied oxide potential Vox is greater than the electron barrier ΦM.
The solution for standing waves in this system for a trapezoidal barrier has the form of Airy functions. Thus, the oxide acts analogously to an optical etalon (a spatial filter) where the

Fig. 3.17. Using refractive indexes from the combined FNCO and ellipsometry method, Si oxidation data has been corrected and the older Massoud et al. [3.84] data is compared with corrected data [3.101]

path through the barrier, the film thickness, is a parameter in the analytic solution. From Airy function solutions and the experimentally determined maxima and minima in the oscillations, the film thickness can be obtained to an accuracy of better than 0.1 nm using a recently developed straightforward procedure [3.99]. In one study [3.100] ellipsometry was performed on the same samples for which the SiO2 film thickness was determined using FNCO’s. Specifically, highly accurate single wavelength ellipsometry was used at 632.8 nm to measure the ellipsometric variables, ∆ and Ψ. Using a single film model, and with the film thickness from FNCO’s as input, the refractive index was calculated, thereby obviating the issue of obtaining accurate indexes from ellipsometry alone. For film thicknesses of about 4.4 nm a refractive index for thermal SiO2 films was found to be about 1.894 as compared with a value of 1.465 for bulk SiO2 films thicker than 20 nm. Using this new value for the index for ultrathin SiO2 films as well as known bulk values, an interpolation formula was obtained for SiO2 refractive indexes as a function of film thickness [3.101]. Using these new indexes for SiO2, our previously published SiO2 film thickness versus Si oxidation time data has been re-evaluated. Figure 3.17 shows that the corrected SiO2 film thickness values (labeled this work) for the ultrathin regime are smaller than previously calculated using the lower bulk SiO2 indexes. From this study [3.101, 3.102] it was confidently concluded that the controversial initial thermal oxidation regime is purely linear, indicative of an interface reaction. Note that in the early Si oxidation studies, because the bulk index was used to evaluate the in situ real time ellipsometry data, a curvature was obtained for the initial regime. That this was the case was only very recently clarified with experimental results.
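The size of the thickness error from using the bulk index can be estimated with the two index values quoted above. This sketch rests on the simplification stated earlier in the section, that ellipsometry fixes the product of index and thickness; a real analysis would invert the full ellipsometric equations for ∆ and Ψ. The function name is hypothetical.

```python
N_BULK = 1.465        # bulk SiO2 index quoted in the text
N_ULTRATHIN = 1.894   # index quoted in the text for a film of about 4.4 nm

def apparent_thickness(true_thickness_nm, n_true, n_assumed):
    """Thickness reported when n_assumed is used in place of n_true,
    under the assumption that the optical thickness n*L is what is measured."""
    return true_thickness_nm * n_true / n_assumed

true_nm = 4.4
apparent_nm = apparent_thickness(true_nm, N_ULTRATHIN, N_BULK)
print(f"true {true_nm:.1f} nm -> apparent {apparent_nm:.2f} nm with the bulk index")
```

Under this simplification a 4.4 nm film would be reported as about 5.7 nm, an overestimate of roughly 30%, which is in the same direction as the correction shown in Fig. 3.17.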

Fig. 3.18. The n results for thin film SiO2 using the algorithm in [3.103] (present results) compared with earlier results ( [3.100, 3.101]) from SWE and FNCO’s

Also recently [3.103], the ultra-thin film SiO2 refractive index data were further refined. Variable angle SE data on SiO2 films ranging in thickness from 2–8 nm were collected and processed using an iterative algorithm, in order to obtain consistent values for the index parameter. The result is a new recursion formula shown in Fig. 3.18 along with a previous formula [3.101] for comparison. These new results have extended the range of thickness for which refractive indices are reported down to about 1.5 nm, and enable the accurate ellipsometric determination of SiO2 film thickness to about 1 nm.

3.3.2 Interfacial Roughness at the Si–SiO2 Interface

For direct roughness measurements atomic force microscopy (AFM) is commonly used to obtain an image of the surface. From the lateral x, y positions the height of a surface feature, z, is obtained with sub nanometer accuracy. This data is typically analyzed to obtain the extent or width of the interface that is reported as the root mean square (RMS) roughness, which is a measure of the average magnitude of the roughness features. However, the RMS is insufficient to describe a rough surface or interface, and insufficient to distinguish one rough surface from another. For that purpose we have developed Fractal analysis, in order to determine the spatial complexity of the roughness [3.104, 3.105]. In Fractal geometry a fractal dimension, DF, is a scaleless non-integer parameter that is a unique descriptor of an object,

Fig. 3.19. Si oxidation induced roughness from (a) AFM and (b) SE (SIE technique)

analogous to the integer dimensions in Euclidean geometry. For rough surfaces DF would be a non-integer between the integers 2 and 3, which are the Euclidean dimensions: 2 < DF < 3. In previous studies we have described a modified variation method algorithm for accurately extracting DF from the AFM data [3.104] and we have discussed the applicability of using DF for Si roughness studies [3.105]. AFM and SIE were used to follow the evolution of Si roughness resulting from both the thermal and electron cyclotron resonance (ECR) plasma oxidation of Si [3.106–3.108]. Purposely roughened Si was used so that the level
of the roughness was well above the lower detection limits of the techniques. Figure 3.19 summarizes some results for thermal oxidation. Figure 3.19a displays AFM results which show that the RMS values for rough surfaces decrease with increasing extent of oxidation while initially smooth surfaces show that the RMS values increase with the extent of oxidation. From this data there appears to be convergence to a limiting roughness of 0.2–0.3 nm RMS. It was found that DF values for both initially rough and smooth Si surfaces monotonically decrease with increasing extent of oxidation indicating that the surface always becomes simpler as oxidation proceeds. Figure 3.19b displays SIE results which show that as oxidation time increases rough surfaces result in a smaller Linf while initially smooth surfaces show an increasing Linf with oxidation extent. Thus, both AFM and SIE are concordant in that initially rough surfaces become smoother, and initially smooth surfaces roughen in terms of the magnitude of the roughness features. However, the smallest features are always removed at a greater rate than large features for both initially rough and smooth surfaces (DF always decreases with oxidation). Convergence is not seen for the SIE data but rather a crossover is observed in the values of Linf . This apparently non-physical result is explained by considering the interface model (Fig. 3.13) that is used to analyze the SIE data. In this model the parameter extracted Linf is the sum of the height of the Si protrusions plus the suboxide layer. The suboxide layer forms during oxidation, and grows as the oxidation rate slows and for lower temperature oxidation. Also, the effect of the suboxide would be greatest for the smooth samples where the Si protrusions are small. The extent of the suboxide then increases Linf , but the suboxide does not affect the AFM RMS topographic measurements. 
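The RMS values quoted above are simply the standard deviation of the AFM heights about their mean. A minimal sketch follows; the height maps are hypothetical and far smaller than a real AFM image. It also illustrates why RMS alone cannot distinguish one rough surface from another: the two maps below have identical RMS but different spatial arrangements, which is the kind of difference DF is meant to capture.

```python
import math

def rms_roughness(heights):
    """Root mean square deviation of heights about their mean (same units as input)."""
    flat = [z for row in heights for z in row]
    mean_z = sum(flat) / len(flat)
    return math.sqrt(sum((z - mean_z) ** 2 for z in flat) / len(flat))

# Two hypothetical height maps in nm: same set of heights, different layout.
checkerboard_nm = [
    [0.0, 0.6, 0.0, 0.6],
    [0.6, 0.0, 0.6, 0.0],
]
single_step_nm = [
    [0.0, 0.0, 0.6, 0.6],
    [0.0, 0.0, 0.6, 0.6],
]

print(f"checkerboard RMS = {rms_roughness(checkerboard_nm):.3f} nm")
print(f"single step RMS  = {rms_roughness(single_step_nm):.3f} nm")
```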
Thus, we believe that the crossover seen in the SIE results is an artifact of the model used and is attributable to the suboxide formation that affects only the optical results. Similar results have also been obtained for electron cyclotron resonance (ECR) plasma oxidation for purposely roughened and initially smooth Si surfaces [3.107]. For ECR plasma oxidation an additional acceleration of smoothening and roughening could be obtained by increasing the oxidation rate using a positive DC sample bias. Both the smoothing and roughening effects can be understood using the Kelvin equation:

$$\Delta G = \frac{2\gamma V}{R} \qquad (3.19)$$

where the change in local free energy, ∆G, is inversely proportional to the radius of curvature (R) for a small feature (γ and V are the surface energy and molar volume, respectively). The Kelvin equation teaches that small sharp features (with small R) present a greater reactivity, and hence oxidize more extensively relative to larger features. With the use of Fig. 3.20 both the smoothing of initially rough surfaces and the roughening of initially smooth Si surfaces can be explained [3.108]. If elliptical protrusions are assumed, then the top of Fig. 3.20 shows that these protrusions can reduce their radius

Fig. 3.20. Model showing how rough and smooth surface change with oxidation

of curvature R and hence the local free energy both by an increase in the width w (squared) and/or a decrease in the height h. For the purposely roughened samples (Fig. 3.20 left side), the presence of closely spaced features requires that a decrease in h is the only way for the free energy to decrease as a result of oxidation. This is the case because, if w also increases for one feature, w would be forced to decrease for an adjacent feature. However, for the initially smooth surfaces (Fig. 3.20 right side) without closely adjacent features, both w and h can change so as to reduce the local free energy. In fact if w can increase to lower the free energy, then an increase in h is permitted since R is proportional to $w^{2}$. This leads to roughening as measured by an increasing RMS. It should also be noticed that for both the initially rough and smooth Si surfaces the DF decreases as a result of oxidation. Also, the fact that both roughening and smoothing can occur as independent mechanisms leads to the prediction of a limiting interfacial roughness, which has now been experimentally verified in Fig. 3.19a to be about 0.3 nm. The possible effects of roughness on interface charges were recently investigated [3.109] in terms of fixed oxide charge, Qf, and interface trapped charge density, Dit. Dit is essentially the charges trapped in the interface electronic states that were discussed earlier as those states most implicated in electronic passivation. Since the interface charges represent an extensive

Fig. 3.21. Interface trapped charge (Dit ) and fixed oxide charge (Qf ) changes with the extent of Si thermal oxidation

property in that the charge level depends on the surface or interface area, an accurate measurement of surface area for the rough surfaces was required. We have found [3.110] that the usual algorithms that are available with commercial AFM’s are not useful for extracting areas when the roughness is in the micro-roughness regime, viz. with roughness features less than 5 nm high. This is because the commercial algorithms use a triangle method where three adjacent AFM (x, y, z) data points form a triangle, and the triangles are summed to obtain surface area. The mathematics involves squares of the lateral (x, y) and height (z) values to form the triangles. If z is much smaller than x or y which is typical for micro-roughness, the z height values when squared will be lost in the round-off error. Thus, the area of micro-rough surfaces obtained from these algorithms is about equal to the projected area, which is far too small. In order to overcome this problem, we developed a new algorithm for extracting the surface area for micro-rough surfaces from AFM images [3.110]. This new algorithm is based on RMS and DF values, and the way these values scale with roughness. When an accurate area for a rough surface has been obtained through the use of our new algorithm applied to AFM data, we have found that both interface electronic states and interface fixed charges scale simply with surface area [3.110]. These results are shown in Fig. 3.21 where it is seen that the charge values normalized using the projected contact areas show high values, which have a relatively strong correlation with the extent of oxidation. The charges decrease with oxidation indicative of smoothening and decreasing area. With the area correction most of the roughness effect is removed. Lastly, a small orientation correction is made to account for the fact that roughness introduces other orientations,

which can have higher charge values. This correction is both small and approximate compared to the area correction. We can conclude that interface charges are extensive, and thus the specific interface charge levels can be accounted for by the increased area associated with the micro-rough interfaces.

3.3.3 Ultra-thin SiO2 Films and the Future of Gate Dielectrics

As was pointed out above, ultra-thin SiO2 films are presently required, and will soon be demanded, by the relentless decrease in device size that drives the microelectronics industry. Based on the newest research results presented in this chapter, it is now understood that the growth of SiO2 films on Si via thermal oxidation occurs by an interface-reaction-controlled mechanism, yielding a linear relationship between film thickness and oxidation time. This recent finding was made possible by new and improved ultra-thin film metrology. Furthermore, with the present availability of virtually perfect Si single crystal wafers that have virtually perfect surfaces and can be processed in ultra-clean environments, there appears to be no reason why 1 nm SiO2 films cannot be manufactured. However, the problem with such ultra-thin films does not stop with the process. It is now clear that quantum mechanical tunneling dominates the conduction mechanism for SiO2 at thicknesses below 2 nm. This results in intolerably large leakage currents in MOSFET devices. Consequently, if IC devices and designs as we now know them are to advance beyond the next 5–10 years, a substitute for SiO2 as a gate dielectric is required. The other chapters of this book discuss the research and development related to potential substitute materials for SiO2 at considerable length. Therefore, no attempt is made here to cover all the issues. Rather, a few key thoughts are presented that derive from the lessons learned researching SiO2, and that serve as a reminder of some of the key issues to be addressed with new materials in future MOSFETs. The first thought deals with the new candidate materials themselves, in particular their reactivity and compatibility with Si; the second deals with metrology, in terms of both materials properties and electronic characterization.

The most exciting new materials being considered are complex oxides that have K values at least several times the value of 3.9 for SiO2 and are therefore attractive, since useable film thicknesses for these high K materials would be at least tens of nm. At thicknesses above 5 nm the low electric field tunneling currents are near zero. Considerable attention has been paid to the thermodynamic stability of the materials relative to SiO2 [3.111], as is prudent. Many if not most of the candidate materials are reactive towards Si and produce intermixed layers with K values intermediate between 3.9 and the value for the high K overlayer. Even the few candidate materials, such as Zr and Hf oxides, that appear stable adjacent to Si based on equilibrium thermodynamics could be problematic at the atomic


scale when forming an interface with Si, once again leading to intermixing at the interface. An intermixed interface layer (or layers) has profound effects. The first effect is that such a layer provides a series capacitor that lowers the effective K of the film stack, and the second is that such a layer may give rise to considerable densities of interface electronic states at the Si-film interface. Also, many of the high K candidate oxides enable the permeation of oxygen, which can lead to subcutaneous oxidation [3.112, 3.113] of the Si substrate in the oxygen-rich deposition ambient. This leads to another low K film and another series capacitor. In terms of interface electronic states, the formation of subcutaneous SiO2 on the Si is a good thing in that it lowers the density of interface electronic states, but in terms of effective K another dielectric is added to the series capacitance, with further lowering of the effective K of the stack. The fact that several layers may be present, of which some are homogeneous and some intermixed, each with a different K and thickness, presents a formidable metrology challenge. Until recently, most researchers appear to have been making “educated guesses” about the presence, K values and/or thicknesses of the films present in a high K stack. Studies are appearing in the literature in which powerful in situ real-time characterization tools are being focused on identifying the nature of the films and interfaces [3.112, 3.114, 3.115], and it is likely that more detailed studies of this kind will be forthcoming. Only when the intermixed films present are known and characterized (with respect to K and thickness) can the K of the high K layer be evaluated. This is demonstrated in a recent study, where it is shown that the relationship among the K values and thicknesses (L) of the many possible films in a gate stack is hyperbolic [3.116]. It was shown that, for certain values of K and L of the films that are present, even small errors in these values can change the K value obtained for the high K overlayer from capacitance versus voltage (CV) measurements by more than several hundred percent. Very little attention has been paid to this problem in the literature, and thus experimental K values for high K films that do appear in the literature can only be taken as crude approximations.

Acknowledgments. The author gratefully acknowledges that the author's research results in this chapter were obtained at the IBM Thomas J. Watson Research Center prior to 1982, and that the later research was supported by the Office of Naval Research (ONR) and the National Science Foundation (NSF) Materials Research Division.
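The sensitivity of the extracted K to the assumed interlayer, discussed in Sect. 3.3.3, can be illustrated with a minimal sketch that treats a two-layer gate stack as ideal parallel-plate capacitors in series. The layer thicknesses, K values and function names below are hypothetical and chosen only for illustration; they are not taken from [3.116].

```python
def stack_k_eff(layers):
    """Effective K of dielectric layers in series.

    layers: list of (K, thickness_nm) tuples. Series capacitors add as
    L_total / K_eff = sum(L_i / K_i).
    """
    total_l = sum(l for _, l in layers)
    return total_l / sum(l / k for k, l in layers)


def high_k_from_measurement(k_eff, l_total, k_int, l_int):
    """Back out K of the high K layer from a measured stack K_eff,
    given an ASSUMED interlayer (k_int, l_int). Lengths in nm."""
    l_high = l_total - l_int
    denom = l_total / k_eff - l_int / k_int   # equals l_high / K_high
    if denom <= 0:
        raise ValueError("assumed interlayer is inconsistent with measured K_eff")
    return l_high / denom


# Hypothetical stack: 1.0 nm interlayer with K = 3.9 under 4.0 nm with K = 25.
true_layers = [(3.9, 1.0), (25.0, 4.0)]
k_eff = stack_k_eff(true_layers)

# Correct interlayer thickness recovers the true K of the overlayer.
k_exact = high_k_from_measurement(k_eff, 5.0, 3.9, 1.0)
# Over-estimating the interlayer thickness by 0.5 nm inflates the result.
k_wrong = high_k_from_measurement(k_eff, 5.0, 3.9, 1.5)

print(f"K_eff = {k_eff:.2f}, K_high (exact) = {k_exact:.1f}, K_high (wrong) = {k_wrong:.1f}")
```

Because the unknown K appears as the reciprocal of a small difference, the denominator approaches zero as the assumed interlayer accounts for more of the measured capacitance. In this hypothetical case a 0.5 nm error in the interlayer thickness more than quadruples the inferred K.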

References

3.1. E.A. Irene. In: J.E. Greene (ed.), Critical Reviews in Solid State and Materials Science 14(2), 175 (1988)
3.2. E.A. Irene, Phil. Mag. B 55, 131 (1987)
3.3. M.M. Atalla, E. Tannenbaum and E.J. Scheiber, Bell System Tech. J. 38, 749 (1959)

3.4. I. Tamm, Phys. Z. Soviet Union 1, 733 (1932)
3.5. W. Shockley, Phys. Rev. 56, 317 (1939)
3.6. J.T. Law and C.G.B. Garrett, J. Appl. Phys. 27, 656 (1956)
3.7. D.R. Palmer and C.E. Davenbough, Bull. Amer. Phys. Soc. 3, 138 (1958)
3.8. C.N. Berglund, IEEE Trans. Electron Dev. ED-13, 701 (1966)
3.9. R. Castagne and A. Vapaille, Surface Sci. 28, 557 (1971)
3.10. P.V. Gray and D.M. Brown, Appl. Phys. Lett. 8, 31 (1966)
3.11. E.H. Nicollian and J. Brews, MOS (Metal Oxide Semiconductor) Physics and Technology, John Wiley and Sons, New York (1982)
3.12. J.R. Ligenza and W.G. Spitzer, J. Phys. Chem. Solids 14, 131 (1960)
3.13. B.E. Deal and A.S. Grove, J. Appl. Phys. 36, 3770 (1965)
3.14. W.A. Pliskin, IBM J. Res. Dev. 10, 198 (1966)
3.15. P.J. Burkhardt and L.V. Gregor, Trans. Metallurgical Soc. AIME 236, 299 (1966)
3.16. A.G. Revesz, K.H. Zaininger and R.J. Evans, Appl. Phys. Lett. 8, 57 (1966)
3.17. E.A. Irene and Y.J. van der Meulen, J. Electrochem. Soc. 123, 1380 (1976)
3.18. E.A. Irene and R. Ghez, J. Electrochem. Soc. 124, 1757 (1977)
3.19. W.A. Pliskin and R.P. Gnall, J. Electrochem. Soc. 111, 872 (1964)
3.20. J.R. Ligenza, J. Phys. Chem. 65, 2011 (1961)
3.21. D.E. Aspnes and A.A. Studna, Appl. Optics 14, 220 (1975)
3.22. M.A. Hopper, R.A. Clarke and L. Young, J. Electrochem. Soc. 122, 1216 (1975)
3.23. E.A. Irene. In: In Situ Real-Time Characterization of Thin Films, O. Auciello and A.R. Krauss (eds.), John Wiley & Sons, New York (2001), pp. 57–104
3.24. E.H. Nicollian, A. Goetzberger and C.N. Berglund, Appl. Phys. Lett. 15, 174 (1969)
3.25. E.H. Nicollian, C.N. Berglund, P.F. Schmidt and J.M. Andrews, J. Appl. Phys. 42, 5654 (1971)
3.26. D.R. Young, E.A. Irene, D.J. DiMaria, R.F. DeKeersmaecher and H.Z. Massoud, J. Appl. Phys. 50, 6366 (1980)
3.27. E.A. Irene, J. Electrochem. Soc. 121, 1613 (1974)
3.28. P.H. Robinson and F.P. Heiman, J. Electrochem. Soc. 118, 141 (1971)
3.29. R.S. Ronen and P.H. Robinson, J. Electrochem. Soc. 119, 747 (1972)
3.30. R.J. Kriegler, Y.C. Cheng and D.R. Colton, J. Electrochem. Soc. 119, 388 (1972)
3.31. K. Hirabayshi and J. Iwamura, J. Electrochem. Soc. 120, 1595 (1973)
3.32. R.E. Tressler, J. Stach and D.M. Metz, J. Electrochem. Soc. 124, 607 (1977)
3.33. E.A. Irene and D. Dong, J. Electrochem. Soc. 125, 1146 (1978)
3.34. M.M. Atalla and E. Tannenbaum, Bell Syst. Tech. J. 39, 933 (1960)
3.35. F. Leuenberger, J. Appl. Phys. 33, 2911 (1962)
3.36. A.S. Grove, O. Leitstiko and C.T. Sah, J. Appl. Phys. 35, 2695 (1964)
3.37. B.E. Deal, A.S. Grove, E.H. Snow and C.T. Sah, J. Electrochem. Soc. 112, 308 (1965)
3.38. B.E. Deal and M. Sklar, J. Electrochem. Soc. 112, 430 (1965)
3.39. C.S. Ho and J. Plummer, J. Electrochem. Soc. 126, 1516 and 1523 (1979)
3.40. A.G. Revesz and R.J. Evans, J. Phys. Chem. Solids 30, 551 (1969)
3.41. A. Cros, J. Physique 44, 707 (1983)
3.42. A. Franciosi, P. Soukiassian, P. Phillip, S. Chang, A. Wall, A. Raisanen and N. Trouiller, Phys. Rev. B 35, 910 (1983)


3.43. M.C. Acensio, E.G. Michel, E.M. Oellig and R. Miranda, Appl. Phys. Lett. 51, 1714 (1987)
3.44. P.J. Moller and J. He, J. Vac. Sci. Technol. A 21, 996 (1987)
3.45. G. Abbati, L. Rossi, L. Calliari, L. Braicovich, I. Lindau and W.E. Spicer, J. Vac. Sci. Technol. 21, 409 (1982)
3.46. J.M. de Larios, D.B. Kao, C.R. Helms and B.E. Deal, Appl. Phys. Lett. 54, 715 (1989)
3.47. W. Kern and D.A. Poutinen, RCA Review 31, 187 (1970)
3.48. B.F. Phillips, D.C. Burkman, W.R. Schmidt and C.A. Petersen, J. Vac. Sci. Technol. A 1, 646 (1983)
3.49. R.C. Henderson, J. Electrochem. Soc. 119, 772 (1972)
3.50. F.N. Schwettmann, K.L. Chiang and W.A. Brown, 153rd Electrochem. Soc. Meeting, Abs. #276, May 1978
3.51. G. Gould and E.A. Irene, J. Electrochem. Soc. 134, 1031 (1987)
3.52. E.A. Irene, Appl. Phys. Lett. 40, 74 (1982)
3.53. R.J. Jaccodine and W.A. Schlegel, J. Appl. Phys. 37, 2429 (1966)
3.54. M.V. Whelan, A.H. Gormans and L.M. Goossens, Appl. Phys. Lett. 10, 262 (1967)
3.55. E.P. EerNisse, Appl. Phys. Lett. 30, 290 (1977); 35, 8 (1979)
3.56. E.A. Irene, E. Tierney and J. Angillelo, J. Electrochem. Soc. 129, 2594 (1982)
3.57. E. Kobeda and E.A. Irene, J. Vac. Sci. Technol. B 4, 720 (1986); J. Vac. Sci. Technol. B 5, 15 (1987)
3.58. E.A. Taft, J. Electrochem. Soc. 125, 968 (1978)
3.59. W.A. Tiller, J. Electrochem. Soc. 127, 619, 625 (1980)
3.60. G. Lucovsky, M.J. Mantini, J.K. Srivastava and E.A. Irene, J. Vac. Sci. Technol. B 5, 530 (1987)
3.61. E.A. Irene, D. Dong and R.J. Zeto, J. Electrochem. Soc. 127, 396 (1980)
3.62. H.Z. Massoud, J. Plummer and E.A. Irene, J. Electrochem. Soc. 132, 1745 (1985)
3.63. E.A. Irene, H.Z. Massoud and E. Tierney, J. Electrochem. Soc. 133, 1253 (1986)
3.64. J.J. Wortman and R.A. Evans, J. Appl. Phys. 36, 153 (1965)
3.65. W.A. Brantley, J. Appl. Phys. 44, 534 (1973)
3.66. E.A. Lewis and E.A. Irene, J. Electrochem. Soc. 134, 2332 (1987)
3.67. K. Ueda and M. Inoue, Surf. Sci. 161, L578 (1985)
3.68. R. Oren and S.K. Ghandi, J. Appl. Phys. 42, 752 (1971)
3.69. S.A. Schafer and S.A. Lyon, J. Vac. Sci. Technol. 19, 494 (1981); S.A. Schafer and S.A. Lyon, J. Vac. Sci. Technol. 21, 422 (1982)
3.70. I.W. Boyd, Appl. Phys. Lett. 42, 728 (1983); I.W. Boyd. In: Surface Studies with Lasers, F.R. Aussenberg, A. Leitner and M.E. Lippitech (eds.), Springer-Verlag, New York (1983), p. 193
3.71. E.M. Young and W.A. Tiller, Appl. Phys. Lett. 42, 63 (1983); E.M. Young and W.A. Tiller, Appl. Phys. Lett. 50, 80 (1987)
3.72. E.A. Irene and E.A. Lewis, Appl. Phys. Lett. 51, 767 (1987)
3.73. L.M. Chanin, A.V. Phelps and M.A. Biondi, Phys. Rev. 128, 219 (1962)
3.74. R. Ghez and Y.J. van der Meulen, J. Electrochem. Soc. 119, 1100 (1972)
3.75. H. Yao, J.A. Woollam and S.A. Alterovitz, Appl. Phys. Lett. 62, 3324 (1993)
3.76. R. Williams and A.M. Goodman, Appl. Phys. Lett. 25, 531 (1974)


3.77. T.W. Sigmon, W.K. Chu, E. Lugujjo and J.W. Mayer, Appl. Phys. Lett. 24, 105 (1974)
3.78. J. Derrien and M. Commandre, Surface Science 118, 32 (1982)
3.79. S.I. Raider and R. Flitsch, J. Vac. Sci. Technol. 13, 58 (1976)
3.80. C.R. Helms, J. Vac. Sci. Technol. 16, 608 (1979)
3.81. F.J. Grunthaner, P.J. Grunthaner, R.P. Varquez, B.F. Lewis, J. Maserjian and A. Madhukar, J. Vac. Sci. Technol. 16, 1443 (1979)
3.82. H. Ibach, H.D. Bruchmann and H. Wagner, Appl. Phys. A 29, 113 (1982)
3.83. P. Chiaradia and S. Nannarone, Surface Science 54, 547 (1976)
3.84. H.Z. Massoud, J. Plummer and E.A. Irene, J. Electrochem. Soc. 132, 1745 (1985)
3.85. E.A. Irene, J. Electrochem. Soc. 125, 1708 (1978)
3.86. F.J. Grunthaner and P.J. Grunthaner, Chemical and Electronic Structure of the SiO2/Si Interface, Materials Science Reports 1, 65 (1987)
3.87. W.A. Tiller, J. Electrochem. Soc. 128, 689 (1981)
3.88. F. Herman, I.P. Batra and R.V. Kasowski. In: The Physics of SiO2 and Its Interfaces, S.T. Pantelides (ed.), Pergamon, N.Y. (1979), p. 333
3.89. B. Agius, S. Rigo, F. Rochet, M. Froment, C. Maillot, H. Roulet and G. Dufour, Appl. Phys. Lett. 44, 48 (1984)
3.90. F. Rochet, B. Agius and S. Rigo, J. Electrochem. Soc. 131, 914 (1984)
3.91. V.A. Yakovlev, Q. Liu and E.A. Irene, J. Vac. Sci. Technol. A 10, 427 (1992)
3.92. E.A. Taft and L. Cordes, J. Electrochem. Soc. 126, 131 (1979)
3.93. A. Kalnitsky, S.P. Tay, J.P. Ellul, S. Chongsawangvirod, J.W. Andrews and E.A. Irene, J. Electrochem. Soc. 137, 234 (1990)
3.94. S. Chongsawangvirod, E.A. Irene, A. Kalnitsky, S.P. Tay and J.P. Ellul, J. Electrochem. Soc. 137, 3536 (1990)
3.95. D.F. Mitchell, K.B. Clark, J.A. Bardwell, W.N. Lennard, G.R. Massoumi and L.V. Mitchell, Surface and Interface Analysis 21, 44 (1994)
3.96. K.H. Gundlach, Solid State Electron 9, 949 (1996)
3.97. M.E. Alferieff and C.B. Duke, J. Chem. Phys. 46, 938 (1967)
3.98. J. Maserjian. In: The Physics and Chemistry of SiO2 and Si–SiO2 Interface, C.R. Helms and B.E. Deal (eds.), Plenum, New York (1988), p. 505
3.99. S. Zafar, Q. Liu and E.A. Irene, J. Vac. Sci. Technol. A 13(1), 47 (1995); S. Zafar, K.C. Conrad, Q. Liu, E.A. Irene, G. Hames, R. Kuehn and J.J. Wortman, Appl. Phys. Lett. 67, 1031 (1995)
3.100. K.J. Hebert, S. Zafar, E.A. Irene, R. Kuehn, T.E. McCarthy and E.K. Demirlioglu, Appl. Phys. Lett. 68, 266 (1996)
3.101. K.J. Hebert, T. Labayen and E.A. Irene. In: Physics and Chemistry of SiO2 and the Si–SiO2 Interface III, H.Z. Massoud, C.R. Helms and E.H. Poindexter (eds.), The Electrochemical Soc. Inc., New Jersey, USA (1996), p. 81
3.102. E.A. Irene, Solid State Electronics 45, 1207 (2001)
3.103. Y. Wang and E.A. Irene, J. Vac. Sci. Technol. B 18(1), 279 (2000)
3.104. L. Spanos and E.A. Irene, J. Vac. Sci. Technol. A 12(5), 2646 (1994)
3.105. L. Spanos, Q. Liu, T. Zettler, B. Hornung, J.J. Wortman and E.A. Irene, J. Vac. Sci. Technol. A 12(5), 2653 (1994)
3.106. Q. Liu, L. Spanos, C. Zhao and E.A. Irene, J. Vac. Sci. Technol. A 13, 1977 (1995)
3.107. C. Zhao, P.R. Lefebvre and E.A. Irene, Thin Solid Films 313-314, 286 (1998)


3.108. L. Lai and E.A. Irene, J. Appl. Phys. 86(3), 1729 (1999)
3.109. L. Lai, K.J. Hebert and E.A. Irene, J. Vac. Sci. Technol. B 17, 53 (1999)
3.110. L. Lai and E.A. Irene, J. Vac. Sci. Technol. B 17, 33 (1999)
3.111. K.J. Hubbard and D.G. Schlom, J. Mater. Res. 11, 2757 (1996)
3.112. Y. Gao, A.H. Mueller, E.A. Irene, O. Auciello, A. Krauss and J.A. Schultz, J. Vac. Sci. Technol. A 17, 1880 (1999)
3.113. Y.M. Sun, J. Lozano, H. Ho, H.J. Park, S. Veldman and J.M. White, Appl. Surface Sci. 161, 115 (2000)
3.114. R.A. McKee, F.J. Walker and M.F. Chisholm, Phys. Rev. Lett. 81, 3014 (1998)
3.115. A.H. Mueller, N.A. Suvorova, E.A. Irene, O. Auciello and J.A. Schultz, Appl. Phys. Lett. 80, 3796 (2002)
3.116. A.H. Mueller, N.A. Suvorova and E.A. Irene, Appl. Phys. Lett. 80, 3596 (2002)

4 Oxide Reliability Issues

R. Degraeve

In this section, we will describe and discuss the electrical reliability of thin oxide layers (range 1.5–15 nm). An overview is presented of the ideas that have been proposed in the past decade to evaluate, model and predict oxide reliability. First, we will describe the oxide wear-out phase, that is, the gradual degradation of the electrical properties of the oxide under electrical stress. Second, the oxide breakdown phenomenon is discussed, with special emphasis on soft breakdown issues in ultra-thin layers. Third, the reliability prediction methodology is addressed.

4.1 Thin Oxide Layer Degradation Under Electrical Stress

The reliability of oxide layers is typically tested by applying either a constant voltage stress (CVS) or a constant current stress (CCS) to a capacitor. The thin SiO2 dielectric gradually degrades until breakdown occurs. We can think of thin oxide layer degradation by electrical stress as the continuous generation of trapping centers in the bulk of the oxide. Breakdown is triggered when the accumulated damage reaches a critical level. During oxide stress, several phenomena can be observed: interface trap creation (Sect. 4.1.1), negative and/or positive charge trapping (Sect. 4.1.2), hole fluence (Sect. 4.1.3), neutral electron trap creation (Sect. 4.1.4), and the generation of a Stress-Induced Leakage Current (SILC) (Sect. 4.1.5). These properties are important monitors during oxide stress, and can help one understand the degradation mechanisms. The trap generation mechanism itself is discussed in Sect. 4.1.6. Several research groups have proposed breakdown models that directly correlate one of these properties with breakdown. The line of thought is common to all models: some damage-related parameter exceeds a critical threshold at the moment breakdown is triggered. An important observation is that a relation between accumulated oxide damage and breakdown is only found for the intrinsic breakdown mode. Extrinsic breakdown is determined by localized, process-related physical phenomena that do not measurably influence the global degradation phenomena.

4.1.1 Interface Trap Creation

During high field oxide stressing, interface traps are created at the substrate/oxide interface [4.38, 4.40]. Their density can be obtained either from CV-measurements [4.74], or, on transistors, from charge-pumping measurements [4.50]. Recently, interface trap-related low voltage Stress-Induced Leakage Current in sub-3.5 nm oxides has been used to quantify the interface trap density [4.49]. It has been claimed that the interface trap density, Dit, reaches a critical density, Dit,crit, at the moment of oxide breakdown. In earlier work, the triggering of breakdown was suggested to be caused by a local interface softening due to accumulation of traps [4.38], but more recently, the critical interface trap density is merely viewed as a monitor for the total density of traps in the oxide. By means of a percolation model [4.29, 4.32, 4.111] (see also Sect. 4.2), the interface trap density can be related to this total bulk trap density [4.43].

4.1.2 Oxide Charge Trapping

In oxides thicker than 4 nm, a typical observation during a high field CCS is the initial decrease of the applied voltage needed to force the required current, followed by a voltage increase which can become larger than the initially applied voltage [4.37, 4.38]. The voltage shifts are caused by charge trapping (initially positive, then negative charge) in the oxide, leading to an oxide field distortion and a subsequent change of the tunnel current density. During a CVS, exactly the opposite current shifts are measured, i.e. an initial increase of the current followed by a decrease. In sub-4 nm oxides, the charge build-up almost completely disappears. Typically, as illustrated in Fig. 4.1, a very small increase of the stress current during CVS is measured, which is attributed to positive charge trapping and the gradual generation of a SILC [4.77].
In some older publications [4.14, 4.16], it is claimed that the positive charge trapping in the oxide is responsible for triggering the breakdown event. During a CCS, for example, locally enhanced charge trapping will not influence the total current density in the capacitor, but will lead to a local current density increase. This results in an increased local stress, which in turn leads to increased positive charge trapping. In this way a positive feedback mechanism is initiated that finally results in breakdown. Two arguments oppose this idea: (i) In ultra-thin sub-4 nm oxides the measurable positive charge trapping is extremely small, and yet the oxides break down. (ii) The possible role of the negative charge in the oxide is


Fig. 4.1. Increase in current as a function of stress time for different gate voltages. Solid lines are fit assuming a single trap plus SILC. From [4.77]

completely ignored, while for tox > 4 nm the net charge trapped at breakdown is negative. Other more recent publications [4.71, 4.124, 4.125] claim that the net negative trapped charge in the oxide exceeds a critical threshold value at breakdown. In [4.125], the authors related their observation to charge trapping kinetics, without assuming any additional trap creation. This is in contradiction with most publications on oxide degradation that clearly demonstrate an increase of the trap density.
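The voltage shifts observed during a CCS (Sect. 4.1.2) can be made concrete with a minimal electrostatic sketch. It assumes, purely for illustration, that the trapped charge is a uniform sheet at a single distance from the cathode and that the injected current is set only by the cathode field. Under these assumptions, keeping the cathode field constant requires the applied voltage magnitude to change by ΔV = -Q (t_ox - x) / ε_ox. The trap density, centroid position and function name are hypothetical.

```python
EPS0 = 8.854e-14   # vacuum permittivity, F/cm
K_SIO2 = 3.9       # relative permittivity of SiO2
Q_E = 1.602e-19    # elementary charge, C


def ccs_voltage_shift(n_trapped_cm2, sign, x_from_cathode_nm, t_ox_nm):
    """Change in applied voltage magnitude (V) that keeps the cathode field,
    and hence the injected current, constant in a sheet-charge approximation.

    sign = +1 for trapped positive charge, -1 for trapped electrons.
    """
    q_sheet = sign * Q_E * n_trapped_cm2               # sheet charge, C/cm^2
    lever_cm = (t_ox_nm - x_from_cathode_nm) * 1e-7    # nm -> cm
    return -q_sheet * lever_cm / (EPS0 * K_SIO2)


# Hypothetical example: 1e12 cm^-2 trapped charges in the middle of a 10 nm oxide.
dv_pos = ccs_voltage_shift(1e12, +1, 5.0, 10.0)   # positive charge: voltage drops
dv_neg = ccs_voltage_shift(1e12, -1, 5.0, 10.0)   # negative charge: voltage rises

print(f"positive trapping: {dv_pos:+.3f} V, negative trapping: {dv_neg:+.3f} V")
```

In this picture, initial positive trapping lowers the voltage needed to force the current, and later negative trapping raises it, which is the sequence described in the text. Charge located close to the anode (x near t_ox) has little effect on the cathode field.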

4.1.3 Hole Fluence

When the gate oxide of an nMOSFET is stressed with a positive gate voltage while source and drain are grounded, the electrons tunneling through the oxide are injected from the transistor channel and supplied by the source and drain. In this configuration, a positive current can be measured at the substrate (charge separation technique) [4.16, 4.128]. The substrate current density has a similar oxide field dependence to the FN-current density, as is shown in Fig. 4.2. It should, however, be remarked that the curves in Fig. 4.2 are not parallel: the ratio between gate and substrate current depends on the oxide field. A well-known and widely accepted explanation for the physical origin of this substrate current is given in [4.15, 4.39, 4.103] and is schematically illustrated in Fig. 4.3. When the injected electrons enter the anode (the poly-Si gate), they lose their energy by creating highly energetic holes, possibly through the excitation of some intermediate state [4.47], and the holes can then be injected back into the oxide. The hole flow reaches the cathode and is measured as a positive substrate current, Jp.


Fig. 4.2. The gate and the substrate current as a function of the oxide field in a carrier separation set-up. The ratio α between the substrate and the gate currents is field dependent: α increases with increasing field (from [4.25])

Fig. 4.3. Schematic illustration of the anode hole injection model. Injected electrons reach the anode with high energy and can generate hot holes that can tunnel back to the cathode, giving rise to the substrate current (from [4.25])

Apart from anode hole injection, another possible explanation for the substrate current is the creation of holes in the cathode by photons generated in the anode [4.13, 4.41, 4.92]. A third origin of the hole current is valence band injection of electrons from the cathode. In ultra-thin oxides, this component dominates in a carrier separation experiment [4.103]. The hole current density can be related to the electron current density, Jn, as follows:

\[ J_p = \alpha(E_{ox})\, J_n \tag{4.1} \]

with α(Eox) the field-dependent hole generation efficiency. In the anode hole injection model, α is interpreted as the probability for a tunneling electron to generate an anode hole that is injected back into the oxide towards the


Fig. 4.4. The charge-to-breakdown QBD and the hole fluence at breakdown as a function of the electric field. On the one hand the hole fluence at breakdown is constant in the measured field range. On the other hand the charge-to-breakdown decreases continuously with the field (from [4.25])

cathode [4.103]. In a CVS, α is found to be almost constant during the whole stress [4.105], while in a CCS, α increases slightly. However, for thin oxides the difference between the initial and final value of α is very small. Therefore, to a good approximation, a relation equivalent to (4.1) also holds for the time-integrated values of Jp and Jn, the hole fluence, Qp, and the electron fluence, Qn, respectively:

\[ Q_p = \alpha(E_{ox})\, Q_n \tag{4.2} \]
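As a minimal sketch of how (4.1) and (4.2) relate, the fluences can be obtained by integrating the measured current densities over the stress time. The stress record, the value of α and the function name below are hypothetical; α is taken as constant during the stress, as the text reports for a CVS.

```python
def fluence(times_s, current_density):
    """Trapezoidal time integral of a current density (A/cm^2),
    giving the fluence in C/cm^2."""
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (current_density[i] + current_density[i - 1]) * dt
    return total


# Hypothetical constant-voltage stress record.
times = [0.0, 10.0, 20.0, 30.0]              # s
j_n = [1.0e-3, 1.1e-3, 1.2e-3, 1.2e-3]       # electron (gate) current density, A/cm^2
alpha = 2.0e-3                               # assumed constant hole generation efficiency

j_p = [alpha * j for j in j_n]               # Eq. (4.1) applied point by point

q_n = fluence(times, j_n)                    # electron fluence
q_p = fluence(times, j_p)                    # hole fluence

print(f"Qn = {q_n:.4f} C/cm^2, Qp = {q_p:.3e} C/cm^2, Qp/Qn = {q_p / q_n:.3e}")
```

With a constant α the ratio of the two fluences equals α exactly, which is the content of (4.2). If α drifts during the stress, as it does slightly in a CCS, the ratio becomes a current-weighted average of α instead.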

In [4.15], it was observed for the first time that the hole fluence reaches a critical value at breakdown: Qp,crit. This result has been confirmed in [4.31, 4.99], and additional experiments with Substrate Hot Hole injection confirm that a critical hole fluence is needed to trigger oxide breakdown [4.62, 4.120]. In Fig. 4.4, Qp,crit and QBD measured over a wide field range from 8 to 14 MV/cm are plotted. Clearly, Qp,crit remains constant in the entire field interval, while QBD decreases with increasing field. A satisfactory physical explanation for the experimentally observed invariance of Qp,crit cannot be found in the literature, but the various suggested possibilities are discussed in detail in Sect. 4.1.6. It should be remarked that in [4.99], Qp,crit was observed to be constant only at 300 K. For lower temperatures, it decreases as a function of the field. This observation is consistent with [4.54], where an increase as a function of the field is found for temperatures above 300 K. These temperature effects indicate that the hole fluence at breakdown is possibly not the factor determining the triggering of breakdown.
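If Qp,crit is field independent and (4.2) holds, the charge-to-breakdown follows as QBD = Qp,crit / α(Eox), so a QBD that falls with field is what one would expect when α rises with field. The sketch below illustrates this with an exponential form for α(Eox); that functional form and all numerical values are assumptions made only for illustration, not data from Fig. 4.4.

```python
import math

Q_P_CRIT = 0.1   # C/cm^2, hypothetical field-independent critical hole fluence


def alpha(e_ox_mv_cm, alpha0=1.0, h_mv_cm=40.0):
    """Illustrative hole generation efficiency that increases with oxide field.
    The exponential form and its parameters are assumptions."""
    return alpha0 * math.exp(-h_mv_cm / e_ox_mv_cm)


def q_bd(e_ox_mv_cm):
    """Charge-to-breakdown implied by Eq. (4.2) with Qp = Qp,crit at breakdown."""
    return Q_P_CRIT / alpha(e_ox_mv_cm)


fields = [8.0, 10.0, 12.0, 14.0]             # MV/cm
q_bd_values = [q_bd(e) for e in fields]

for e, q in zip(fields, q_bd_values):
    print(f"Eox = {e:4.1f} MV/cm  alpha = {alpha(e):.4f}  QBD = {q:6.2f} C/cm^2")
```

The electron fluence needed to reach the same hole fluence shrinks as the field increases, while the product α(Eox) QBD stays fixed at Qp,crit by construction.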


Fig. 4.5. The occupied electron traps at a filling field of 7 MV/cm, as a function of the injected electron fluence measured during oxide stressing at fields between 6.2 and 11.1 MV/cm. Oxide stress was performed either by FN- (open symbols) or SHE-injection (closed symbols). For each stress condition, the mean breakdown value is indicated by an asterisk (from [4.25])

4.1.4 Neutral Electron Trap Generation

During oxide stressing, neutral electron traps are generated in the oxide [4.45, 4.97]. Although many researchers have related this trap creation to the breakdown process, direct measurements of the oxide neutral trap density Dot as a function of the applied stress conditions are rare. In order to measure this degradation phenomenon, the neutral traps have to be made electrically visible. This can be accomplished by periodically interrupting the stress and filling the generated traps with electrons without creating any additional traps. A technique ideally suited for this purpose is uniform Substrate Hot Electron (SHE) injection [4.28, 4.78–4.80]. In [4.28, 4.80] the necessity of a trap filling step is demonstrated. Indeed, immediately after stress, a large part of the available oxide traps is neutral and their occupancy depends on the applied stress field. Therefore, the trapped charge immediately after stress is not a good monitor for the density of neutral traps. SHE-injection can also be used to stress the oxide in a field range below the practically accessible FN-field range [4.32, 4.70]. In Fig. 4.5 [4.32], the increase of the neutral trap density as a function of the injected fluence is shown. Degradation and breakdown in the field range 6–8.5 MV/cm with SHE-injection and 8.5–11 MV/cm with constant voltage FN-injection are presented. The factor p is the fraction of filled traps after the SHE-injection, which is constant for all stress fields [4.32]. In the considered field range, and apart from small statistical fluctuations, breakdown occurs when Dot reaches a critical value. This result supports the idea that a critical electron trap density is necessary to trigger breakdown.
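One hedged way to picture why a critical trap density could trigger breakdown is a percolation-type picture (cf. Sect. 4.2), in which traps are generated at random positions until a connected chain of traps links the two interfaces. The sketch below is a toy site-percolation Monte Carlo on a small cubic lattice. The lattice sizes, the nearest-neighbour connectivity rule and the function names are illustrative assumptions, not the model of [4.32].

```python
import random
from collections import deque

NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def has_path(traps, nz):
    """True if the trap sites form a nearest-neighbour chain from the
    layer z = 0 (one interface) to the layer z = nz - 1 (the other)."""
    start = [site for site in traps if site[2] == 0]
    seen = set(start)
    queue = deque(start)
    while queue:
        x, y, z = queue.popleft()
        if z == nz - 1:
            return True
        for dx, dy, dz in NEIGHBOURS:
            nb = (x + dx, y + dy, z + dz)
            if nb in traps and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return False


def critical_trap_fraction(nx, ny, nz, rng):
    """Generate traps one at a time at random lattice sites and return the
    occupied fraction at which the first spanning chain appears."""
    sites = [(x, y, z) for x in range(nx) for y in range(ny) for z in range(nz)]
    rng.shuffle(sites)
    traps = set()
    for count, site in enumerate(sites, start=1):
        traps.add(site)
        if has_path(traps, nz):
            return count / len(sites)
    return 1.0


rng = random.Random(0)   # fixed seed so repeated runs give the same numbers
thin = [critical_trap_fraction(8, 8, 2, rng) for _ in range(20)]
thick = [critical_trap_fraction(8, 8, 6, rng) for _ in range(20)]

print("mean critical fraction, 2 layers:", sum(thin) / len(thin))
print("mean critical fraction, 6 layers:", sum(thick) / len(thick))
```

Each trial yields a slightly different critical fraction, which mirrors the statistical fluctuations around the critical Dot mentioned above. The lattice spacing would have to be tied to a physical trap size before any quantitative comparison with measured trap densities could be attempted.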


Fig. 4.6. The generated neutral electron traps as a function of the hole fluence for different oxide thicknesses (range 7–14 nm), different stress types (CVS and CCS) and different oxide fields (9.5–12 MV/cm). A unique relation is observed (from [4.32])

A breakdown model based on neutral trap generation has been proposed by several authors [4.6, 4.7, 4.32, 4.45, 4.111, 4.116]. It is assumed that at some place on the capacitor the local trap density becomes sufficiently large to allow the formation of a conductive chain of traps connecting the anode with the cathode interface. This model will be investigated in further detail in Sect. 4.2. In Fig. 4.6, the generated neutral trap density has been plotted versus the hole fluence for different oxide thicknesses and FN-stress conditions. One unique curve is obtained, independent of oxide field and thickness [4.32]. It is concluded that the critical hole fluence, Qp,crit, corresponds to a critical generated density of neutral electron traps, Dot,crit, indicating that both breakdown criteria are equivalent.

4.1.5 Stress-Induced Leakage Current

The fifth phenomenon that occurs during oxide degradation is the generation of a leakage current through the gate. This is the Stress-Induced Leakage Current (SILC). SILC is illustrated in Fig. 4.7, where the Ig–Vg curve is shown for a fresh sample and for stressed samples. The SILC rises continuously with injected fluence, and its Vg-dependence can be empirically fitted with an FN-expression using a barrier height of 1 eV [4.23, 4.46, 4.88, 4.89]. SILC is a major reliability problem for all devices that rely on extremely low leakage current for proper functioning. Non-volatile memory arrays are a typical example. When charge leaks from the floating gate, the threshold voltage of the cell shifts and the stored information will be lost after some time [4.23, 4.57, 4.88, 4.97].
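The empirical FN fit with a 1 eV barrier mentioned above can be put in perspective with the textbook Fowler-Nordheim form J = A E² exp(-B/E), in which the exponent coefficient B scales with the barrier height to the power 3/2. The sketch below computes B only, since the prefactor of an empirical SILC fit is a free parameter. The effective mass ratio of 0.42 and the 3.1 eV reference barrier are commonly used values that are assumed here, not taken from this chapter.

```python
import math

Q_E = 1.602176634e-19     # elementary charge, C
M_0 = 9.1093837015e-31    # free electron mass, kg
H_BAR = 1.054571817e-34   # reduced Planck constant, J s


def fn_exponent_b(barrier_ev, m_rel=0.42):
    """Exponent coefficient B (V/m) of J = A * E^2 * exp(-B / E):
    B = 4 * sqrt(2 m*) * (q * phi)^(3/2) / (3 * q * hbar).
    m_rel is the assumed effective mass ratio in the oxide."""
    phi_j = barrier_ev * Q_E
    return 4.0 * math.sqrt(2.0 * m_rel * M_0) * phi_j ** 1.5 / (3.0 * Q_E * H_BAR)


def fn_plot_slope(barrier_ev, m_rel=0.42):
    """Slope of ln(J / E^2) versus 1/E, which is -B."""
    return -fn_exponent_b(barrier_ev, m_rel)


b_silc = fn_exponent_b(1.0)    # barrier used in the empirical SILC fit
b_ref = fn_exponent_b(3.1)     # assumed reference barrier for a fresh oxide

print(f"B(1.0 eV) = {b_silc / 1e8:.3e} MV/cm")   # 1 MV/cm = 1e8 V/m
print(f"B(3.1 eV) = {b_ref / 1e8:.3e} MV/cm")
print(f"ratio     = {b_silc / b_ref:.3f}")
```

A smaller B means a much weaker field dependence on an FN plot, which is consistent with SILC showing up at low voltages where the FN current of the fresh oxide is negligible. The 1 eV value should be read as a fitting parameter rather than as a physical barrier.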


[Figure: gate current density J (A/cm²) versus oxide voltage Vox (V), for a fresh device and after CCS at 1 mA/cm² to injected fluences of 0.01, 0.1, 1 and 3 C/cm²]

Fig. 4.7. The gate current density vs. the applied oxide voltage for a fresh device (open symbols) and after a high field stress has been applied. A gradually increasing leakage current at low voltage appears. This is the Stress-Induced Leakage Current (from [4.25])

Fig. 4.8. The oxide thickness dependence of the steady-state and the transient component of SILC. As the oxide thickness increases, SILC is transformed from a steady-state current to a current transient. The current is measured at a field of 5.4 MV/cm following a stress at 9.5 MV/cm for 20 to 30 C/cm2 (from [4.69])

When the SILC after a given stress is continuously monitored as a function of time, two components can be distinguished [4.69, 4.97]. Initially, a decaying transient component is observed, relaxing to a steady state SILC after some time. Both components depend on the oxide thickness as illustrated in Fig. 4.8. Thick oxides have a large transient component and low steady state component, very thin oxides have a very small transient component and a large steady state component. The transient component of the SILC results from emptying negatively charged traps immediately after stress. A simple tunneling front model predicts a 1/t-decay of the current [4.45]. In recent publications, however, it has been shown that, apart from the electron component, a hole component is present in the transient current. Consequently, the electron transient current


Fig. 4.9. Post-stress IV curves comparing nMOS and pMOS. In both cases the peak degradation occurs near the flatband voltage. The current increase is primarily observed when the sense voltage is within ±1 V of the flatband voltage (from [4.75])

follows a 1/t^n dependence with n < 1 [4.137]. Substrate Hot Electron [4.137] and Channel Hot Electron injection [4.127] experiments further confirm the role of the hole component in the transient SILC. The steady-state component of the SILC is caused by trap-assisted tunneling from cathode to anode. In thick oxides, the probability of finding a trap that can serve as a “stepping stone” for electron tunneling is very small, resulting in a small steady-state component. In thinner oxides, this probability is significantly higher and therefore the steady-state component will increase [4.95] and dominate over the transient trap discharging current. Detailed modelling of the SILC has been carried out by several groups. Most researchers agree that SILC can be correctly modelled by an inelastic trap-assisted tunneling process [4.56, 4.59, 4.93, 4.119, 4.135]. In all these models, a single trap is needed to create a localized conduction path through the oxide. The possibility exists that two traps are involved in the tunneling process, leading to a significantly more efficient conduction path [4.53, 4.106]. The resulting SILC has been named ‘anomalous’, because a two-trap configuration is very rare. The anomalous SILC can be observed on small capacitors as pre-breakdown current jumps [4.26], and the effect also causes fast loss of the floating gate charge in a small fraction of the cells of a non-volatile memory array [4.27, 4.53, 4.82]. For this reason, anomalous SILC is the major limitation on tunnel oxide thickness scaling in non-volatile memories. In very thin oxides, a new component of SILC is observed which is related to electrons that tunnel directly into interface traps. This leakage current is large in the low voltage range close to the flatband voltage, as illustrated in Fig. 4.9 [4.49, 4.75]. In Fig. 4.10 the close relationship between SILC and the trap density in the oxide is illustrated. To construct this figure, the FN-stress was periodically


R. Degraeve

Fig. 4.10. The steady-state SILC measured at a fixed field E (here 5 MV/cm) plotted as a function of the generated density of neutral traps. Independent of the stress condition, a one-to-one relation is observed (from [4.24])

interrupted to measure the leakage current density JE at a fixed low oxide field E and on the same device the density of neutral oxide traps was determined with the technique explained in [4.32]. Plotted versus one another, a stress-independent one-to-one relation is revealed between the steady-state SILC increase and the generation of neutral traps [4.24]. Several other authors have emphasized the relation between the neutral trap density and the SILC [4.35, 4.89, 4.100], but very recently this relationship has been questioned [4.94]. Despite these recent observations, many authors consider SILC as a measure of the neutral trap density. Consequently, both bulk trap-related and interface trap-related SILC have been used as a degradation monitor and time-to-breakdown predictor [4.11, 4.48, 4.77, 4.84, 4.86, 4.112].

4.1.6 Trap Generation Mechanism: Discussion

From the results shown in the previous sections, it is clear that trap generation is the key factor determining the oxide degradation and breakdown. In this section, several models describing the electron trap generation are compared. In summary, three trap generation models are discussed: the ‘anode hole injection model’, the ‘electro-chemical model’, and the ‘hydrogen release model’.

In the anode hole injection model [4.103], as illustrated in Fig. 4.3, it is assumed that the holes tunnelling back to the cathode can create electron traps in the oxide, probably in conjunction with an electron in the SiO2 conduction band. The physics of the trap creation process is still speculative. Indeed, there have been several studies demonstrating that the interaction of electrons and holes in an oxide results in trap creation [4.17,4.99,4.101,4.122]. However, the precise role of electrons and holes in the trap creation process and the details of the microscopic mechanism of the trap creation are still uncertain. The most important difficulty in studying this effect is the inability of many techniques to separately control the hole and the electron injection.

4 Oxide Reliability Issues


Fig. 4.11. Outline of the three lines of thought on neutral electron trap generation that can be found in literature

Electrons injected in the oxide relax their energy also by light emission [4.13, 4.21]. Experimental evidence has been presented that not anode hole injection but photo-excitation of valence band electrons in the cathode by light generated in the anode is the dominant source of the measured holes in the substrate [4.13, 4.41, 4.92]. If this is correct, or if there exists a voltage limit to anode hole injection [4.41], it could be questioned whether the anode hole injection model would correctly predict the low voltage oxide reliability. Recent modified simulations [4.1, 4.12] of anode hole injection show the existence of an exponentially decaying impact ionization rate at low voltage, which corresponds to the exponentially decaying trap generation rate as measured with SILC [4.112].

According to the anode hole injection model, the observation of a unique relation between hole fluence and neutral electron trap generation (Fig. 4.6) is interpreted as a causal relation, i.e. the holes are necessary to create the traps. This line of thought is outlined in Fig. 4.11a. However, other explanations are possible as well. As is outlined in Fig. 4.11b, the energy release of the incoming electrons at the anode can, besides hole creation, also activate some other mechanism that is responsible for neutral electron trap generation. This line of thought suggests that the hole fluence and the trap creation have a ‘common origin’, i.e. the energy released by the electrons at the anode.

Some authors have even suggested a third possible route, which is illustrated in Fig. 4.11c: the electric field itself induces sufficient energy directly into the oxide to cause electron trap creation. This has been named the ‘electro-chemical’ model [4.45, 4.58, 4.65, 4.67, 4.102, 4.107]. With this interpretation, all processes that are related to the energy release of the injected


electrons at the anode, are independent of the trap generation mechanism. Furthermore, the electron fluence has no impact on the trap generation rate. As a trap creation mechanism, a model based on oxide dipoles interacting with the electric field has been proposed [4.67, 4.102]. A serious candidate for the ‘other mechanism’ of Fig. 4.11b is hydrogen release at the anode. Most older publications [4.36,4.38] on this phenomenon deal with interface trap creation at the cathode, but the model’s basic ideas can easily be extended to bulk oxide traps. In literature, one cannot find, however, experimental evidence supporting this extension. In the hydrogen release model, the electrons tunnel through the oxide potential barrier and reach the anode with sufficient energy to release hydrogen from the anode/oxide interface. This hydrogen is always present in sufficient amounts because of interface annealing applied to reduce the initial interface trap density. The released hydrogen diffuses through the oxide and can generate electron traps. Again, as for the anode hole injection model, the precise physical details of the microscopic trap generation mechanism remain speculative.

4.2 Oxide Breakdown

In the previous section, the oxide degradation phenomena during electrical stress leading to breakdown have been discussed. In this section, the focus is on the breakdown event itself. First, the modeling of the breakdown event is discussed (Sect. 4.2.1). Secondly, the occurrence of so-called soft breakdown in ultra-thin oxides and its relation to device failure will be further discussed (Sect. 4.2.2).

4.2.1 Breakdown Modeling

Already in the beginning of the 1990s, Suñé and co-authors [4.116] presented a ‘weakest link’ breakdown model [4.108, 4.113]. In this model, a capacitor is divided into a large number of small cells. It is assumed that during oxide stressing neutral electron traps are generated at random positions on the capacitor area. The number of traps in each cell is counted, and at the moment that the number of traps in one cell reaches a critical value, breakdown occurs by definition. The point is that in that critical cell the number of traps is sufficiently large to create a conductive path from anode to cathode via these traps. A disadvantage of the model of [4.116] is its two-dimensional nature.

In more recent work a new weakest link model has been proposed that can accurately describe the intrinsic breakdown distribution. This model is based on the principles of percolation theory [4.109]. The use of the percolation concept for modeling of oxide breakdown has been suggested in [4.63] and has been thoroughly elaborated in [4.29, 4.30, 4.32, 4.111]. The percolation model for breakdown exists in two versions: (1) the ‘sphere’ model, where each


Fig. 4.12. The percolation model for oxide breakdown explained step by step. As the density of neutral electron traps increases, conductive clusters of traps are formed ultimately leading to the creation of a conductive breakdown path from anode to cathode (from [4.25])

generated defect in the oxide is characterized by a sphere with radius 0.9 nm, and (2) the ‘cube’ model, where each defect is represented by a cube with size 1.3 nm in a three-dimensional frame. Both models are implementations of the same concept and provide, therefore, similar results. As an example, the ‘sphere’ model of [4.29] is schematically illustrated in Fig. 4.12. It is assumed that electron traps are generated inside the oxide at random positions in space. Around these traps a sphere is defined with a fixed radius r, which is the only parameter of the model (Fig. 4.12a). If the spheres of two neighbouring traps overlap, conduction between these traps becomes possible


Fig. 4.13. Normalized QBD -distributions for different oxide thicknesses. The decrease of the Weibull slope with oxide thickness is clearly experimentally observed (from [4.32])

by definition. Also, the two interfaces are modelled as an infinite set of traps (Fig. 4.12b). This mechanism of trap generation continues until a conducting path is created from one interface to the other, which defines the breakdown condition (Fig. 4.12c). In a computer simulation, the total electron trap density needed to trigger breakdown, Dot,crit, can now be calculated. It is found that the simulated Dot,crit-distribution can be fitted with a Weibull expression. This distribution can be easily associated with the tBD-distribution [4.32].

The percolation model for breakdown is able to explain quantitatively two important experimental observations: (i) as the oxide thickness decreases, the density of oxide traps needed to trigger breakdown decreases [4.29,4.32,4.43], and (ii) as the oxide thickness decreases, the Weibull slope of the breakdown distribution decreases, i.e. a larger spread on the tBD- or QBD-values is observed [4.29, 4.32, 4.90, 4.111]. The latter effect is illustrated in Fig. 4.13.

An important consequence of the decreasing Weibull slope for thinner oxides is the strongly increased area dependence of the time-to-breakdown or charge-to-breakdown [4.32, 4.90]. Indeed, based on the random character of the breakdown position, it has been shown that for the scale factors η1 and η2 of two Weibull distributions (either tBD- or QBD-distributions) of capacitors with identical oxide thickness and Weibull slope β, but area A1 and A2 respectively, the following relationship holds [4.76, 4.132]:

\[ \frac{\eta_1}{\eta_2} = \left( \frac{A_2}{A_1} \right)^{1/\beta} . \tag{4.3} \]
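The origin of Eq. (4.3) can be made explicit with a short weakest-link argument. The sketch below assumes that breakdown spots occur independently and uniformly over the capacitor area; the symbol F_A(t) for the cumulative failure probability of a capacitor of area A is introduced here for illustration and does not appear in the original text.

```latex
% Weibull distribution of a capacitor with area A_i (scale \eta_i, slope \beta)
F_{A_i}(t) = 1 - \exp\!\left[-\left(\frac{t}{\eta_i}\right)^{\beta}\right]

% A capacitor of area A_2 survives only if each of its A_2/A_1
% independent sub-areas of size A_1 survives
1 - F_{A_2}(t) = \left[\,1 - F_{A_1}(t)\,\right]^{A_2/A_1}

% Insert the Weibull form and compare the exponents
\left(\frac{t}{\eta_2}\right)^{\beta} = \frac{A_2}{A_1}\left(\frac{t}{\eta_1}\right)^{\beta}
\quad\Longrightarrow\quad
\frac{\eta_1}{\eta_2} = \left(\frac{A_2}{A_1}\right)^{1/\beta}
```

As a numerical illustration with assumed values: a hundredfold increase in area lowers the scale factor by a factor 100^{1/3} ≈ 4.6 for β = 3, but by a full factor 100 for β = 1.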

As β decreases, the area dependence becomes stronger. This is illustrated in Fig. 4.14. It can be concluded that for thick oxides (tox > 10 nm), the intrinsic QBD -value is with acceptable approximation constant in the range of capacitor areas commonly available for experimental purposes, but for


Fig. 4.14. The 63%-value of the QBD as a function of the area of the test capacitor for different oxide thicknesses (gate injection). As the oxide thickness decreases, a strong area dependence is observed (from [4.32])

thin oxides the QBD can no longer be considered area independent. This has important implications when the QBD-value is used as a figure of merit for the oxidation process, as is commonly done in an industrial environment. It is meaningless to specify a QBD-value without specifying the area, and it is mandatory to specify the area when QBD-values of different oxide thicknesses are compared.

4.2.2 Soft Breakdown

In the previous section, the modelling of breakdown as the creation of a localised conductive path has been presented. For an oxide thickness of more than 5 nm, the creation of this path is followed immediately by the propagation of thermal damage leading to a highly conductive short between anode and cathode. However, it has been known for several years that ultra-thin oxides can have ‘anomalous’ failure [4.46], characterized by the creation of a more resistive breakdown path. Only recently, this so-called soft, quasi, early, non-destructive, electric breakdown or B-mode SILC has gained much attention in sub-5 nm oxide layers [4.2–4.4, 4.8, 4.19, 4.22, 4.34, 4.60, 4.61, 4.81, 4.83, 4.85, 4.91, 4.98, 4.130, 4.134]. The soft breakdown (SBD) can be defined as an oxide breakdown without the lateral propagation of the breakdown spot due to thermal damage [4.34]. It is generally accepted that soft and hard breakdown originate from the same precursor [4.25, 4.115, 4.134], show the same stress current dependence [4.98] and can be described by the same Weibull statistics [4.25, 4.133], although this latter statement has recently been doubted [4.10]. In Fig. 4.15, a typical example is shown of the gate current vs. gate voltage immediately after soft breakdown. A sudden rise compared to SILC is observed, but the SBD is less destructive when compared with the hard

[Plot for Fig. 4.15: gate current on a logarithmic axis (decades from 10^-14 to 10^-4) versus Gate Voltage (V) from 0 to 5 V; curves labelled Fresh, SILC, SBD and HBD]

Fig. 4.15. IG –VG curves for a capacitor with oxide thickness 4.5 nm. A comparison is made between the current in the fresh device, after stress (SILC), after soft breakdown and after hard breakdown (from [4.22])

breakdown. It has been shown that the gate current after soft breakdown is a unique curve as a function of the gate voltage independent of area, proving that soft breakdown is a localized effect [4.22]. This conclusion is confirmed by emission microscopy experiments [4.61]. However, for very small areas, a lower, unstable current is measured [4.22]. This might be explained by the difference in energy available for discharging at the moment of breakdown.

Typically, when the applied voltage is plotted as a function of time during a CCS, SBD is indicated by a very small drop of the voltage (or equivalently a small jump of the current during CVS), followed by a noisy behaviour. However, although often claimed in literature, the small voltage jump during FN-stress is not a characteristic feature of soft breakdown. Indeed, when the test structure area is scaled down, the voltage jump becomes of the order of volts [4.22]. On typical large area test structures, the small voltage or current jumps indicating soft breakdown become very difficult to detect with automated measurement systems. Therefore, other breakdown detection methods have been proposed. The most stable and widely applicable real-time detection technique uses the sudden increase of the noise in the gate current [4.96,4.130].

This noise has been studied carefully [4.8, 4.22, 4.68, 4.81, 4.98, 4.123]. At low gate voltage, two-level and multilevel random telegraph signals are observed with current amplitudes that depend on the applied gate voltage. An example is shown in Fig. 4.16. It has been demonstrated that the current-voltage characteristics of the ON and OFF state are shifted over a constant voltage interval [4.22]. This observation can be explained by electron capture-emission-induced local field fluctuations in the breakdown path [4.8, 4.22]. With this model, the area of the soft breakdown site is estimated to be 2 × 10^-13 cm^2 [4.22].
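The noise-based detection idea can be illustrated with a small sketch. It is one simple way to implement such a trigger, namely a sliding-window standard deviation compared with a baseline, and is not claimed to be the procedure of [4.96, 4.130]. The function names, the window length, the threshold factor and the synthetic current trace (a quiet pre-breakdown current followed by a two-level random telegraph signal) are all assumptions made for illustration.

```python
import random
import statistics


def detect_noise_onset(current, window=20, factor=5.0):
    """Return the index of the first sample at which the standard deviation
    of the trailing `window` samples exceeds `factor` times the standard
    deviation of the first `window` samples; return None if it never does."""
    baseline = statistics.pstdev(current[:window])
    for end in range(window, len(current)):
        recent = current[end - window + 1:end + 1]
        if statistics.pstdev(recent) > factor * baseline:
            return end
    return None


def synthetic_trace(n_pre=200, n_post=200, seed=0):
    """Hypothetical gate-current trace in amperes (not measured data):
    a quiet level with small Gaussian noise, then a two-level telegraph
    signal that toggles every 7 samples after the breakdown instant."""
    rng = random.Random(seed)
    trace = [1.0e-10 + rng.gauss(0.0, 1.0e-12) for _ in range(n_pre)]
    for i in range(n_post):
        level = 1.5e-10 if (i // 7) % 2 == 0 else 1.0e-10
        trace.append(level + rng.gauss(0.0, 1.0e-12))
    return trace


trace = synthetic_trace()
onset = detect_noise_onset(trace)
print("noise onset detected at sample", onset)
```

Because the trigger compares noise levels rather than absolute current, it does not rely on the size of the current jump itself, which is the property the text identifies as useful on large-area structures.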

[Plot for Fig. 4.16: gate current (axis from 0 to 500 × 10^-12) versus Time (s) from 0 to 100 s, measured at Vg = 3.25 V]

Fig. 4.16. Example of two-level random telegraph signal in the gate current of a capacitor after soft breakdown (from [4.22])

The current conduction mechanism through the soft-broken spot in the oxide has been modeled in several ways, and the correct physical picture still remains unclear. Models based on variable range hopping of carriers [4.84,4.85], point contact conduction [4.117, 4.118], energy funnels [4.19], resonant tunneling through strategically placed traps [4.5, 4.72], direct tunneling through a thinned oxide region [4.51, 4.60] and electrode controlled conduction [4.68] have been proposed. It has been claimed that after SBD no significant variations in MOSFET Id–Vg characteristics can be observed and therefore, for some applications, SBD does not necessarily imply device failure [4.130, 4.131]. However, it has been demonstrated that for transistors with very small channel length, soft breakdown no longer occurs. Instead, hard breakdown is immediately observed [4.91, 4.134]. Since the reliability of small geometry devices has to be guaranteed, this means that breakdown might still be a reliability limiting process.

4.3 Breakdown Acceleration Models

In this section, the analysis of tBD data in order to predict the oxide reliability is discussed. First, in Sect. 4.3.1, the choice of the field or voltage extrapolation law is discussed. This is one of the most important issues in a reliability prediction. Indeed, oxide reliability has been pinpointed as one of the possible showstoppers in scaling down the oxide thickness [4.25, 4.33, 4.112], but this prediction is based on time-to-breakdown values measured at very high fields, while the actual device operates at a much lower field. The correctness of the oxide reliability prediction at operating conditions depends completely on the correctness and validity of the extrapolation law.


A second parameter that is sometimes considered in predicting the oxide reliability at operating conditions is the temperature. In Sect. 4.3.2, we will briefly summarize the temperature effect on oxide degradation and breakdown. Finally, in Sect. 4.3.3, a complete reliability specification is provided and illustrated.

4.3.1 Voltage or Field Extrapolation

There have been contradicting opinions on the exact field or voltage dependence of time-to-breakdown. Some research groups claim that, based on the anode injection model, the logarithm of time-to-breakdown scales with 1/Eox, whereas others find better results using an Eox-dependence. Most recently, a Vg-dependence has been proposed for ultra-thin oxides where ballistic transport of electrons through the oxide occurs.

According to the anode hole injection model [4.15, 4.39, 4.103], the field dependence of α, which is the probability to create a hole which can tunnel back into the oxide (see Sect. 4.3.3), can be described as:

\[ \alpha = \alpha_0 \exp\left( -\frac{H}{E_{ox}} \right) \quad \text{and} \quad Q_{BD} = Q_0 \exp\left( \frac{H}{E_{ox}} \right) \tag{4.4} \]

with α0 and H constants (for a fixed oxide thickness) and Q0 = Qp,crit/α0. With this equation and approximating the Fowler–Nordheim current by an exponential dependence on the reciprocal oxide field, the time-to-breakdown becomes:

\[ t_{BD} = \frac{Q_{BD}}{J_{FN}} = \frac{Q_{BD}}{A^{*}} \exp\left( \frac{B^{*}}{E_{ox}} \right) = \frac{Q_0}{A^{*}} \exp\left( \frac{B^{*} + H}{E_{ox}} \right) = \tau_0 \exp\left( \frac{G}{E_{ox}} \right) \tag{4.5} \]

with τ0 a constant. Reported values for G = B* + H vary from 290 to 350 MV/cm depending on oxide thickness and stress type (CVS or CCS). Equation 4.5 expresses the so-called ‘1/E’-model, because the logarithm of tBD depends linearly on the reciprocal oxide field.

The E-model on the other hand predicts a linear relationship between the logarithm of the time-to-breakdown and the oxide field [4.58, 4.65, 4.67]:

\[ t_{BD} = t_0 \exp\left( -\gamma E_{ox} \right) \tag{4.6} \]

with t0 and γ constants. This model has been used long before there existed any physical argumentation to support it. In the literature of the past 15 years publications trying to prove the correctness of either the E or the 1/E-model can be found as illustrated in Fig. 4.17 [4.31, 4.58, 4.66, 4.104, 4.108, 4.114, 4.136]. All models which are used to provide the E-model with a physical base assume that a direct correlation exists between the electric field and the oxide degradation, i.e. all of these models ignore the role of the injected electrons as


Fig. 4.17. In literature, several data sets have been published to support either the E- or the 1/E-model. Two examples are given here: left: the E-model (from [4.114]), right: the 1/E-model (from [4.104])

an intermediate step for generating oxide traps (see Sect. 4.1.6, Fig. 4.11c). Recent experiments clearly demonstrate that the oxide degradation process is fluence-driven [4.64, 4.73]. Furthermore, detailed simulations of anode hole injection including minority carrier ionization [4.1, 4.12] no longer link a 1/E-model directly with the anode hole injection concept. Instead, a mixed model with approximate 1/E-dependence at high voltage and E-dependence at low voltage is found. Other groups have proposed similar ‘unified’ E–1/E-models [4.20, 4.52, 4.87, 4.126].

The E-model vs. 1/E-model discussion is mainly valid for oxides thicker than 5 nm, where the injection of electrons is dominated by non-ballistic FN-injection. The injected electrons enter the conduction band of the silicon dioxide and interact with the SiO2. The oxide field mainly determines the electron energy at the anode and consequently the oxide degradation. Since there exists a unique relationship between the FN current density and oxide field, the charge-to-breakdown should be measured using CCS.

For ultra-thin oxides (< 5 nm) however, the injected electrons travel ballistically through the oxide without interacting with the SiO2 lattice. This can either be by FN-tunnelling above 3.5 V, typically in oxides with thickness between 5 nm and 3.5 nm, or by direct tunnelling below 3.5 V, typically in oxides with thickness below 3.5 nm. The electron energy at the anode is determined by the voltage difference between the cathode and the anode, which corresponds to the applied gate voltage [4.1, 4.42, 4.73]. This means that for an ultra-thin oxide the gate voltage determines the breakdown, and the constant current stress methodology needs to be replaced by constant voltage stress evaluations [4.42, 4.77].
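Why the choice between Eq. (4.5) and Eq. (4.6) matters so much can be seen from a small numerical sketch. The two models are forced to agree in value and in slope at a high stress field and are then extrapolated to a lower field. All numbers are hypothetical illustration values, not data from the cited works; only G is chosen inside the 290–350 MV/cm range quoted above, and the function names are introduced here.

```python
import math


def tbd_inverse_e(e_ox, tau0, g):
    """1/E-model, Eq. (4.5): t_BD = tau0 * exp(G / E_ox)."""
    return tau0 * math.exp(g / e_ox)


def tbd_e(e_ox, t0, gamma):
    """E-model, Eq. (4.6): t_BD = t0 * exp(-gamma * E_ox)."""
    return t0 * math.exp(-gamma * e_ox)


def match_e_model(e_ref, tau0, g):
    """Return (t0, gamma) such that the E-model has the same t_BD and the
    same slope d(ln t_BD)/dE_ox as the 1/E-model at the field e_ref."""
    gamma = g / e_ref ** 2
    t0 = tbd_inverse_e(e_ref, tau0, g) * math.exp(gamma * e_ref)
    return t0, gamma


# Hypothetical illustration values (assumptions, not measurements)
G = 320.0        # MV/cm
TAU0 = 1.0e-11   # s, arbitrary prefactor
E_STRESS = 11.0  # MV/cm, field at which both models are matched
E_OP = 5.0       # MV/cm, lower field to which both are extrapolated

t0, gamma = match_e_model(E_STRESS, TAU0, G)
ratio = tbd_inverse_e(E_OP, TAU0, G) / tbd_e(E_OP, t0, gamma)
print(f"gamma = {gamma:.3f} cm/MV")
print(f"t_BD(1/E-model) / t_BD(E-model) at {E_OP} MV/cm = {ratio:.3e}")
```

With this matching, the logarithm of the ratio equals G (E_stress − E_op)^2 / (E_op E_stress^2), which is never negative: two models that are indistinguishable in the stress window differ by many orders of magnitude at the lower field, and the E-model always gives the more conservative prediction.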


4.3.2 Temperature Dependence of Breakdown

As advanced CMOS applications operate at elevated temperature, the temperature dependence of oxide degradation and breakdown has received considerable attention. Especially in ultra-thin oxides the temperature dependence of time-to-breakdown is very strong as illustrated in Fig. 4.18 [4.33]. In most publications, it is assumed that an Arrhenius law can describe the temperature dependence of tBD and many authors determine the activation energy for their layers. The activation energies depend, however, on the oxide layer thickness [4.18], the voltage or field range [4.136] and the temperature range [4.9,4.71,4.121] of the measurement. From the huge spread of the observed values, it can be concluded that an Arrhenius law is not a well-suited description for the T-dependence of tBD [4.44, 4.54, 4.55]. Both the trap density at breakdown and the trap generation rate depend on temperature [4.44]. Furthermore, the oxide traps created at different temperatures are not equivalent and consequently oxide damage generated during electrical stress at different temperatures is not simply cumulative [4.54]. This greatly complicates any temperature extrapolation of time-to-breakdown data. An obvious way to circumvent this problem is to measure at the operation or specified temperature.
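The dependence of the extracted activation energy on the temperature range can be illustrated with a short sketch. It computes the apparent activation energy between two (temperature, tBD) points under the Arrhenius assumption tBD = C exp(Ea / kT). The three data points are hypothetical values chosen only to mimic a curved Arrhenius plot; the function name is introduced here.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def activation_energy(t1, temp1, t2, temp2):
    """Apparent activation energy in eV between two measurements,
    assuming t_BD = C * exp(Ea / (k_B * T)) on that interval.
    t1, t2 are times-to-breakdown in s; temp1, temp2 are in K."""
    return K_B * math.log(t1 / t2) / (1.0 / temp1 - 1.0 / temp2)


# Hypothetical (temperature in K, t_BD in s) points; illustrative only
points = [(300.0, 1.0e4), (375.0, 2.0e2), (450.0, 1.0e0)]

ea_low = activation_energy(points[0][1], points[0][0],
                           points[1][1], points[1][0])
ea_high = activation_energy(points[1][1], points[1][0],
                            points[2][1], points[2][0])
print(f"apparent Ea, 300-375 K: {ea_low:.2f} eV")
print(f"apparent Ea, 375-450 K: {ea_high:.2f} eV")
```

When the two intervals return different values, as they do for these assumed points, a single activation energy cannot be used to extrapolate tBD outside the measured temperature range.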

[Plot for Fig. 4.18: tBD (s) on a logarithmic axis (10^-2 to 10^2) versus 1/T (K^-1) from 2.2 × 10^-3 to 3.4 × 10^-3; A = 2 × 10^-5 cm^2; legend: 13.8 nm SiO2 @ 10^-1 A cm^-2, 10.8 nm SiO2 @ 10^-1 A cm^-2, 7.3 nm SiO2 @ 10^-1 A cm^-2, 4.1 nm SiO2 @ 10^-1 A cm^-2, ~3.1 nm SiO2 @ 4.7 V, 2.2 nm NO-oxide @ 3.65 V]

Fig. 4.18. Arrhenius plot of the time-to-breakdown reveals that no single activation energy can be determined for the temperature acceleration of breakdown. The temperature dependence of time-to-breakdown increases for ultra-thin oxide layers (from [4.54])


4.3.3 Oxide Reliability Predictions

Accurate predictions of oxide reliability can only be obtained if a correct and complete reliability specification is defined. This specification must contain four elements: the allowed failure rate or percentage failures, a lifetime, the oxide area, and the voltage and temperature conditions. A typical specification reads: 0.01% failures are allowed after 10 years DC operation at an area of 0.1 cm^2 and operating voltage and temperature. With this specification, it has been reported that a reliability limit for oxide thickness scaling is encountered at about 2.2 nm (optical thickness) at room temperature [4.33, 4.112]. Recent work showed that these predictions are too pessimistic and improved accuracy of the voltage acceleration law, combined with process improvement, can still shift this limit further down [4.110, 4.129].
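How the elements of such a specification enter a lifetime projection can be sketched as follows. The code takes a 63% time-to-breakdown that is assumed to have already been extrapolated to operating voltage and temperature, rescales it from the test area to the specified area with Eq. (4.3), and then moves from the 63% point to the allowed failure fraction along the Weibull distribution. The input values and function names are hypothetical and serve only to show the arithmetic.

```python
import math


def scale_area(eta_test, area_test, area_spec, beta):
    """Eq. (4.3): Weibull scale factor at area_spec, given the scale
    factor eta_test measured on capacitors of area area_test."""
    return eta_test * (area_test / area_spec) ** (1.0 / beta)


def time_at_failure_fraction(eta, beta, fraction):
    """Time at which a Weibull distribution F(t) = 1 - exp(-(t/eta)^beta)
    reaches the cumulative failure fraction `fraction`."""
    return eta * (-math.log1p(-fraction)) ** (1.0 / beta)


def projected_lifetime(eta_test, area_test, area_spec, beta, fraction):
    """Combine area scaling and percentile scaling."""
    eta_spec = scale_area(eta_test, area_test, area_spec, beta)
    return time_at_failure_fraction(eta_spec, beta, fraction)


# Hypothetical inputs (assumptions for illustration, not measured values)
ETA_TEST = 1.0e12    # s, 63% t_BD at operating conditions on the test area
AREA_TEST = 1.0e-5   # cm^2
AREA_SPEC = 0.1      # cm^2, area from the specification quoted in the text
BETA = 1.5           # Weibull slope
FRACTION = 1.0e-4    # 0.01% allowed failures
TEN_YEARS = 10 * 365.25 * 24 * 3600.0  # s

lifetime = projected_lifetime(ETA_TEST, AREA_TEST, AREA_SPEC, BETA, FRACTION)
print(f"projected lifetime: {lifetime:.3e} s")
print("meets 10-year specification:", lifetime >= TEN_YEARS)
```

Both scaling steps carry the exponent 1/β, so a small Weibull slope, as found in thin oxides, penalizes the projection twice: once through the area and once through the low failure percentile.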

4.4 Conclusion

We have described and discussed the electrical degradation of thin oxide layers. Oxide wear-out can be described as the continuous creation of electron trapping centers. These traps cause the Stress-Induced Leakage Current, which can be modeled as a trap-assisted tunneling current. The traps can be negatively charged or neutral and they will trigger breakdown as soon as they form a percolating path connecting anode with cathode. The breakdown can be immediately followed by a thermal run-away effect, resulting in hard breakdown, or the percolating path can keep a relatively high resistance and result in soft breakdown. The occurrence of soft breakdown in thin oxide layers changes both the breakdown detection techniques and the impact of breakdown on reliability. Time-to-breakdown results can be extrapolated from high voltage test conditions to low operation voltage, but the correct extrapolation method is still under discussion. Despite all the missing links in our understanding of the electrical properties of ultra-thin SiO2 layers, we still have a quite complete picture of the essential physical phenomena that are taking place in these layers. This is obviously the result of many years of continuous research.

References

4.1. M.A. Alam, J. Bude, A. Ghetti, “Field acceleration for oxide breakdown – Can an accurate anode hole injection model resolve the E vs. 1/E controversy?”, Proc. IRPS, pp. 21–26, 2000 4.2. M.A. Alam, B.E. Weir, P.J. Silverman, “A study of soft and hard breakdown – Part I: Analysis of statistical percolation conductance”, IEEE Trans. Electron Devices 49, no. 2, pp. 232–238, 2002


4.3. M.A. Alam, B.E. Weir, P.J. Silverman, “A study of soft and hard breakdown – Part II: Principles of area, thickness, and voltage scaling”, IEEE Trans. Electron Devices 49, no. 2, pp. 239–2468, 2002 4.4. M.A. Alam, B. Weir, J. Bude, P. Silverman, D. Monroe, “Explanation of soft and hard breakdown and its consequences for area scaling”, IEDM Tech. Dig., pp. 449–452, 1999 4.5. G.B. Alers, B.E. Weir, M.A. Alam, G.L. Timp, T. Sorch, “Trap assisted tunneling as a mechanism of degradation and noise in 2–5 nm oxides”, Proc. IRPS, pp. 76–79, 1998 4.6. P.P. Apte, K.C. Saraswat, “Modeling ultrathin dielectric breakdown on correlation of charge trap-generation to charge-to-breakdown”, Proc. IRPS, pp. 136–142, 1994 4.7. E. Avni, J. Shappir, “A model for silicon-oxide breakdown under high field and current stress”, J. Appl. Phys. 64, no. 2, pp. 743–748, 1988 4.8. O. Briere, J.A. Chroboczek, and G. Ghibaudo, “Random telegraph signal in the quasi-breakdown current of MOS capacitors”, ESSDERC Proc., p. 759, 1996 4.9. O. Bri`ere, A. Halimaoui, G. Ghibaudo, “Breakdown characteristics of ultrathin gate oxides following field and temperature stresses”, Solid-State Electronics 41, no. 7, pp. 981–985, 1997 4.10. S. Bruyere, E. Vincent, G. Ghibaudo, “Quasi-breakdown in ultrathin SiO2 films: occurrence, characterization and reliability assessment methodology”, IRPS Proc., pp. 48–54, 2000 4.11. D.A. Buchanan, S.-H. Lo, “Reliability and integration of ultra-thin gate dielectrics for advanced CMOS”, Microelectronic Engineering 36, no. 1–4, pp. 13–20, 1997 4.12. J.D. Bude, B.E. Weir, P.J. Silverman, “Explanation of stress-induced damage in thin oxides”, IEDM Tech. Dig., pp. 179–182, 1998 4.13. E. Cartier, J.S. Tsang, M.V. Fischetti, D.A. Buchanan, “Light emission during during direct and Fowler-Nordheim tunneling in ultra thin MOS tunnel junctions”, Microelectronic Engineering 36, no. 1–4, pp. 103–106, 1997 4.14. I.C. Chen, S. Holland, C. 
Hu, “A quantitative physical model for timedependent breakdown in SiO2 ”, Proc. IRPS, pp. 24–31, 1985 4.15. I. C. Chen, S. Holland, C. Hu, “Hole trapping and breakdown in thin SiO2 ,” IEEE Electron Device Lett. 7, no. 3, pp. 164–167, 1986 4.16. I.C. Chen, S. Holland, K.K. Young, C. Chang, C. Hu, “Substrate hole current and oxide breakdown”, Appl. Phys. Lett. 49, no. 11, pp. 669–671, 1986 4.17. I. C. Chen, S. Holland, C. Hu, “Electron-trap generation by recombination of electrons and holes in SiO2 ”, J. Appl. Phys. 61, no. 9, pp. 4544–4548, 1987 4.18. C.-C. Chen, C.-Y. Chang, C.-H. Chien, T.-H. Huang, H.-C. Lin, M.-S. Liang, “Temperature-accelerated dielectric breakdown in ultrathin gate oxides”, Appl. Phys. Lett. 74, no. 24, pp. 3708–3710, 1999 4.19. K.P. Cheung, J.I. Colonell, C.P. Chang, W.Y.C. Lai, C.T. Liu, R. Liu, and C.S. Pai, “Energy funnels-a new oxide breakdown model”, Symp. VLSI Technol. Dig., p. 145, 1997 4.20. K.P. Cheung, “A physics-based, unified gate-oxide breakdown model”, IEDM Tech. Dig., 1999


4.21. C.-L. Chiang, N. Khurana, “Imaging and detection of current conduction in dielectric films by emission microscopy”, IEDM Tech. Dig., pp. 672–675, 1986 4.22. F. Crupi, R. Degraeve, G. Groeseneken, T. Nigam, H.E. Maes, “On the properties of the gate and substrate current after soft breakdown in ultrathin oxide layers”, IEEE Trans. Elec. Dev. 45, No. 11, pp. 2329–2334, 1998 4.23. J. De Blauwe, J. Van Houdt, D. Wellekens, R. Degraeve, Ph. Roussel, L. Haspeslagh, L. Deferm, G. Groeseneken, H.E. Maes, “A new quantitative model to predict SILC-related disturb characteristics in Flash E2 PROM devices”, IEDM Tech. Dig., pp. 343–346, 1996 4.24. J. De Blauwe, R. Degraeve, R. Bellens, J. Van Houdt, Ph. Roussel, G. Groeseneken, H.E. Maes, “Study of DC Stress Induced Leakage Current (SILC) and its dependence on oxide nitridation”, Proc. of ESSDERC, pp. 361–364, 1996 4.25. R. Degraeve, B. Kaczer, G. Groeseneken, “Reliability: a possible showstopper for oxide thickness scaling?”, Semiconductor Science and Technology 15, no. 5, pp. 436–444, 2000 4.26. R. Degraeve, B. Kaczer, F. Schuler, M. Lorenzini, D. Wellekens, P. Hendrickx, J. Van Houdt, L. Haspeslagh, G. Tempel, G. Groeseneken, “Statistical model for SILC and pre-breakdown current jumps in ultra-thin oxide layers”, IEDM Techn. Dig., pp. 121–124, 2001 4.27. R. Degraeve, F. Schuler, M. Lorenzini, D. Wellekens, P. Hendrickx, J. Van Houdt, L. Haspeslagh, G. Groeseneken, G. Tempel, “analytical model for failure rate prediction due to anomalous charge loss of flash memories”, IEDM Techn. Dig., pp. 699–702, 2001 4.28. R. Degraeve, G. Groeseneken, I. De Wolf, H.E. Maes, “Oxide and interface degradation and breakdown under medium and high field injection conditions : a correlation study,” Microelectronic Engineering (Proceedings INFOS) 28, no. 1–4, pp. 313–316, 1995 4.29. R. Degraeve, G. Groeseneken, R. Bellens, M. Depas, H.E. 
Maes, “A consistent model for the thickness dependence of intrinsic breakdown in ultra-thin oxides”, IEDM Tech. Dig., pp. 863–866, 1995 4.30. R. Degraeve, Ph. Roussel, G. Groeseneken, H.E. Maes, “A new analytic model for the description of the intrinsic oxide breakdown statistics of ultrathin oxides”, Microelectronics and Reliability (Proc. ESREF) 36, no. 11/12, pp. 1639–1642, 1996 4.31. R. Degraeve, J.L. Ogier, R. Bellens, Ph. Roussel, G. Groeseneken, H.E. Maes, “A new model for the field dependence of intrinsic and extrinsic time-dependent dielectric breakdown”, IEEE Trans. Elec. Dev. 45, No. 2, pp. 472–481, 1998 4.32. R. Degraeve, G. Groeseneken, R. Bellens, J.L. Ogier, M. Depas, Ph. Roussel, H.E. Maes, “New insights in the relation between electron trap generation and the statistical properties of oxide breakdown”, IEEE Trans. Elec. Dev. 45, No. 4, pp. 904–911, 1998 4.33. R. Degraeve, N. Pangon, B. Kaczer, T. Nigam, G. Groeseneken, A. Naem, “Temperature acceleration of oxide breakdown and its impact on ultra-thin gate oxide reliability”, Symposium on VLSI Technology Digest of Technical papers, pp.59–60, 1999 4.34. M. Depas, T. Nigam, and M. Heyns, “Soft breakdown of ultra-thin gate oxide layers”, IEEE Trans. Electron Devices 43, no. 9, p. 1499, 1996



Part II

Transition to Silicon Oxynitrides

5 Gate Dielectric Scaling to 2.0–1.0 nm: SiO2 and Silicon Oxynitride S.-H. Lo and Y. Taur

This chapter deals with gate oxide and oxynitride scaling to 2.0–1.0 nm, a thickness of only a few atomic layers. Section 5.1 reviews the device requirements on gate dielectric scaling from a scale length perspective, summarized in a rule-of-thumb relationship between the gate length and the oxide thickness. Section 5.2 delves into the murky and often controversial issue of how the gate dielectric thickness should be defined and measured. The complications arise mainly from physical effects such as the quantum-mechanical distribution of electrons in the inversion layer and polysilicon gate depletion. Section 5.3 addresses the key obstacle to scaling gate oxides further: quantum-mechanical tunneling currents. Section 5.4 extends the results to composite gate dielectrics, in particular silicon oxynitrides, and quantifies how much they can extend scaling. Section 5.5 discusses the practical scaling limits of gate oxides and oxynitrides and their dependence on the application, e.g., high-performance or low-power CMOS technologies.

5.1 Device Requirements on Gate Dielectric Scaling

When the dimensions of a MOSFET are scaled down, both the voltage level and the gate oxide thickness must also be reduced [5.1]. Since the electron thermal voltage, kT/q, is a constant for room-temperature electronics, the ratio between the operating voltage and the thermal voltage inevitably shrinks. This leads to higher source-to-drain leakage currents stemming from thermal diffusion of electrons. At the same time, the gate oxide has been scaled to only a few atomic layers thick, where quantum mechanical tunneling gives rise to a sharp increase in gate leakage currents [5.2]. Figure 5.1 shows the scaling trend of power supply voltage (Vdd), threshold voltage (Vt), and gate oxide thickness (tox) as a function of CMOS channel length [5.3]. It is seen that the power supply voltage has not been decreasing at a rate proportional to the channel length. This means that the field has been gradually rising over the generations between 1 µm and 0.1 µm channel lengths. Fortunately, thinner oxides are more reliable at high fields, thus allowing operation at the reduced but nonscaled supply voltages.
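The shrinking ratio between supply voltage and thermal voltage can be made concrete with a short calculation. The sketch below evaluates kT/q at 300 K and the ratio Vdd/(kT/q) for a few supply voltages. The supply voltages are illustrative assumptions, not values read from Fig. 5.1, and the function names are hypothetical.

```python
# Thermal voltage kT/q and the ratio of supply voltage to thermal voltage.
# The supply voltages below are illustrative assumptions, not data from Fig. 5.1.

K_B = 1.380649e-23      # Boltzmann constant (J/K)
Q_E = 1.602176634e-19   # elementary charge (C)


def thermal_voltage(temperature_k: float = 300.0) -> float:
    """Return kT/q in volts."""
    return K_B * temperature_k / Q_E


def supply_to_thermal_ratio(vdd: float, temperature_k: float = 300.0) -> float:
    """Return Vdd / (kT/q), a dimensionless measure of voltage headroom."""
    return vdd / thermal_voltage(temperature_k)


if __name__ == "__main__":
    for vdd in (5.0, 2.5, 1.0):  # assumed example supply voltages (V)
        ratio = supply_to_thermal_ratio(vdd)
        print(f"Vdd = {vdd:.1f} V -> Vdd/(kT/q) = {ratio:.0f}")
```

Because kT/q stays near 26 mV at room temperature, every reduction in Vdd lowers this ratio in direct proportion.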

Fig. 5.1. History and trends of power supply voltage (Vdd), threshold voltage (Vt), and gate-oxide thickness (tox) versus channel length for CMOS logic technologies. (Axes: MOSFET channel length (µm); power supply and threshold voltage (V); gate oxide thickness (nm).)

A two-dimensional (2-D) scale length theory is employed to illustrate the device requirements on gate dielectric scaling. Figure 5.2 shows the essential 2-D aspects of a short-channel MOSFET [5.4]. A key parameter is the gate depletion width, Wd, within which the mobile carriers (holes in the case of nMOSFETs) are swept away by the applied gate field. The gate depletion width reaches a maximum, Wdm, at the onset of strong inversion (threshold voltage), when the surface potential (ψs) or band bending is such that the electron concentration at the surface equals the hole concentration in the bulk substrate. This is the conventional ψs = 2ψB condition with ψB = (kT/q) ln(Na/ni), where Na is the substrate doping concentration and ni the intrinsic carrier concentration of silicon. For uniformly doped cases [5.5],

$$W_{dm} = \sqrt{\frac{4\varepsilon_{si}\, kT \ln(N_a/n_i)}{q^2 N_a}}. \tag{5.1}$$

A rectangle is formed by the boundary of the gate depletion region, the gate electrode, and the source and drain regions, as depicted in Fig. 5.2 [5.4]. 2-D effects can be characterized by the aspect ratio of this rectangle. When the horizontal dimension, i.e., the channel length, is at least twice as long as the vertical dimension, the device behaves like a long-channel MOSFET and its threshold voltage is insensitive to channel length and drain bias. For channel lengths shorter than that, 2-D effects become significant and the minimum surface potential (ψs), which determines the threshold voltage, is increasingly controlled by the drain rather than by the gate.

Fig. 5.2. Simplified geometry for analyzing 2-D effects in a short-channel MOSFET. The white area in silicon represents the depletion regions where mobile carriers are swept away by the built-in as well as the applied fields of the gate and between the substrate and the source and drain. The solution to the electrostatic potential is reduced to that of a 2-D boundary-value problem for the rectangle bounded by heavy dark lines

The rectangular box consists of a silicon region of thickness Wdm and an oxide region of thickness tox. At the interface, the vertical fields (Ex) obey the boundary condition εsi Ex,si = εox Ex,ox, where εsi and εox are the permittivities of silicon and oxide, respectively. For lateral fields (Ey) parallel to the interface, the boundary condition is Ey,si = Ey,ox, independent of the dielectric constants. By properly matching the boundary conditions of both components of the electric fields at the silicon-insulator interface, one can derive a scale length λ which is a solution to the following equation [5.6]:

$$\varepsilon_{si}\tan\!\left(\frac{\pi t_{ox}}{\lambda}\right) + \varepsilon_{ox}\tan\!\left(\frac{\pi W_{dm}}{\lambda}\right) = 0. \tag{5.2}$$

λ also goes into the length-dependent term of the maximum potential barrier in a MOSFET of channel length L, i.e., ∆ψs(SCE) ∝ exp(−πL/2λ) [5.4]. The ratio L/λ is a good measure of how strong the 2-D effect is. For the short-channel Vt roll-off and the drain-induced barrier lowering (DIBL) to be acceptable, the above exponential factor must be much less than unity. This means the minimum useful channel length is about 1.5–2.0 times λ [5.5]. Note that the above scale length equation is valid for high-κ gate insulators as well. One simply replaces εox and tox with εi and ti, where εi is the permittivity of that insulator and ti its physical thickness.
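Equations (5.1) and (5.2) are straightforward to evaluate numerically. The following is a minimal sketch, assuming T = 300 K, ni = 10¹⁰ cm⁻³, and relative permittivities of 11.7 (Si) and 3.9 (SiO2); none of these values are stated in the text. The function names are hypothetical, and the bisection on a pole-free form of Eq. (5.2) is simply one convenient way to find the lowest-order (largest) root, not necessarily the method used by the authors.

```python
import math

EPS0 = 8.8541878128e-14   # vacuum permittivity (F/cm)
EPS_SI = 11.7 * EPS0      # silicon permittivity (F/cm); relative value assumed
EPS_OX = 3.9 * EPS0       # SiO2 permittivity (F/cm); relative value assumed
KT_Q = 0.025852           # thermal voltage kT/q at an assumed 300 K (V)
Q_E = 1.602176634e-19     # elementary charge (C)
N_I = 1.0e10              # assumed intrinsic carrier concentration of Si (cm^-3)


def max_depletion_width_cm(n_a: float) -> float:
    """Eq. (5.1) for a uniformly doped substrate; n_a in cm^-3, result in cm."""
    # 4*eps_si*kT*ln(Na/ni)/(q^2*Na), written with kT/q expressed in volts.
    return math.sqrt(4.0 * EPS_SI * KT_Q * math.log(n_a / N_I) / (Q_E * n_a))


def scale_length(t_ox: float, w_dm: float,
                 eps_si: float = EPS_SI, eps_ox: float = EPS_OX) -> float:
    """Largest root lambda of Eq. (5.2); t_ox and w_dm share one length unit."""

    def g(lam: float) -> float:
        # Eq. (5.2) multiplied by cos(pi*t_ox/lam)*cos(pi*w_dm/lam),
        # which removes the poles of the tangents without moving the root.
        a, b = math.pi * t_ox / lam, math.pi * w_dm / lam
        return (eps_si * math.sin(a) * math.cos(b)
                + eps_ox * math.sin(b) * math.cos(a))

    big, small = max(t_ox, w_dm), min(t_ox, w_dm)
    lo, hi = max(big, 2.0 * small), 2.0 * big   # g(lo) <= 0 <= g(hi)
    for _ in range(200):                        # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    w_dm_nm = max_depletion_width_cm(1.0e18) * 1e7
    print(f"Wdm at Na = 1e18 cm^-3: {w_dm_nm:.1f} nm")
    lam = scale_length(1.5, 15.0)               # tox = 1.5 nm, Wdm = 15 nm
    print(f"lambda = {lam:.2f} nm (vertical-field estimate: {15.0 + 3 * 1.5:.2f} nm)")
```

For equal permittivities the root reduces to λ = tox + Wdm, which is a convenient sanity check on the solver.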

Fig. 5.3. Constant scale length λ contours (solid lines, λ = 5, 10, 15, and 20 nm) in a tox–Wdm design plane, assuming SiO2 or εsi/εox = 3. The dotted line (Wdm = 10 tox) marks the boundary on which the ideality factor m equals 1.3. The intercepts represent design points which satisfy both the scale length and the subthreshold slope requirements. (Axes: depletion depth in Si (nm); gate oxide thickness (nm).)

The lowest order solution to Eq. (5.2) is plotted in Fig. 5.3 in the form of constant-λ contours in a tox–Wdm design plane. In addition to the 2-D scale length requirement, the ratio between tox and Wdm must also be kept small in order for the inverse (log) subthreshold current slope,

$$S = m(\ln 10)\frac{kT}{q} = \left(1 + \frac{\varepsilon_{si}\, t_{ox}}{\varepsilon_{ox}\, W_{dm}}\right)(\ln 10)\frac{kT}{q}, \tag{5.3}$$

to be close to the ideal (ln 10)kT/q value or 60 mV/decade [5.5]. Here m is usually referred to as the ideality factor, which measures the gate voltage swing needed per unit change of electron potential (or band bending) at the silicon surface. A reasonable upper limit is tox/Wdm = 0.1, or m = 1.3, as indicated by the dotted line in Fig. 5.3. This gives a long-channel inverse subthreshold slope of ≈ 80 mV/decade. The intercepts of the dotted line with the constant-λ contours lie in a region where the vertical fields dominate (for SiO2), and λ ≈ Wdm + (εsi/εox) tox, obtained by replacing the oxide region with an equivalent silicon region of thickness (εsi/εox) tox [5.4]. The design points, or the intercepts, can then be solved as tox = (1 − 1/m)(εox/εsi)λ. This means that for Lmin ≈ (1.5–2.0) λ and m ≈ 1.3, the oxide thickness required is tox ≈ Lmin/20 to Lmin/25. Based on this rule of thumb, 2.0–1.0 nm thick gate oxides are needed for scaling CMOS channel length to 50–20 nm dimensions [5.7].
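The rule of thumb above follows from a few lines of arithmetic. The sketch below, with hypothetical function names and an assumed temperature of 300 K, evaluates the ideality factor and swing from Eq. (5.3) and the design-point oxide thickness tox = (1 − 1/m)(εox/εsi)λ with λ = Lmin/(1.5–2.0).

```python
import math

KT_Q = 0.025852  # thermal voltage kT/q at an assumed 300 K (V)


def ideality_factor(t_ox: float, w_dm: float, eps_ratio: float = 3.0) -> float:
    """m = 1 + (eps_si/eps_ox) * t_ox / W_dm, from Eq. (5.3)."""
    return 1.0 + eps_ratio * t_ox / w_dm


def subthreshold_swing_mv(m: float) -> float:
    """S = m * ln(10) * kT/q, returned in mV/decade."""
    return 1e3 * m * math.log(10.0) * KT_Q


def design_t_ox(l_min: float, m: float = 1.3, l_over_lambda: float = 2.0,
                eps_ratio: float = 3.0) -> float:
    """t_ox = (1 - 1/m) * (eps_ox/eps_si) * lambda, lambda = l_min / l_over_lambda."""
    lam = l_min / l_over_lambda
    return (1.0 - 1.0 / m) * lam / eps_ratio


if __name__ == "__main__":
    m = ideality_factor(1.0, 10.0)               # t_ox / W_dm = 0.1
    print(f"m = {m:.2f}, S = {subthreshold_swing_mv(m):.1f} mV/decade")
    for l_min in (50.0, 20.0):                   # channel lengths in nm
        thick = design_t_ox(l_min, l_over_lambda=1.5)
        thin = design_t_ox(l_min, l_over_lambda=2.0)
        print(f"Lmin = {l_min:.0f} nm -> tox between {thin:.2f} and {thick:.2f} nm")
```

With m = 1.3 the expression gives tox of roughly Lmin/19.5 for Lmin = 1.5λ and Lmin/26 for Lmin = 2λ, consistent with the rounded Lmin/20 to Lmin/25 range quoted in the text.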


5.2 Definition of Gate Dielectric Thickness

With the quickened pace of device scaling, ultra-thin gate oxides below 3 nm have been widely used in sub-100-nm CMOS devices for logic technologies below the 180 nm node [5.8–5.10]. When the gate oxides are thick, it is straightforward to define the thickness in terms of the measured capacitance at accumulation or inversion, Cox = εox A/tox, where tox is the physical thickness of the oxide, A is the area of the capacitor, and εox is the permittivity of the oxide. The implicit approximation here, that all the charges are located at the interfaces between the oxide and silicon and between the oxide and gate electrode, is reasonably good as long as one is dealing with oxides 10 nm or thicker. For oxides as thin as 2.0–1.0 nm, however, one must carefully consider where the charges are located with respect to the oxide interfaces.

5.2.1 Electron Distribution in Accumulation and Inversion Layers

On the silicon side, carriers in the accumulation or inversion layer are confined in a potential well formed by the oxide barrier and the silicon bands, which bend either upward or downward near the surface due to the applied gate field. Because of the confinement of motion in the direction normal to the surface, mobile carriers must be treated quantum-mechanically as a 2-D gas [5.11–5.14], especially at high normal fields. In other words, carriers close to the silicon/oxide interface are bound in subbands of discrete energies for their quantized motion perpendicular to the interface. Figures 5.4a and 5.4b, respectively, show the electron distribution in the inversion region and the hole distribution in the accumulation region in an n+-gate/p-Si MOS device with a tox of 1.5 nm. The applied gate voltages are 1.2 V and −2.0 V, respectively, in the inversion and the accumulation regions. The distributions are calculated using classical (Maxwell–Boltzmann (MB) and Fermi–Dirac (FD) statistics) and quantum-mechanical (QM) models.
It is clear that the charge centroid from the QM model is farther from the interface than those from the classical models. Since the difference (after dividing by three, the dielectric constant ratio between silicon and oxide) is a significant fraction of the gate oxide thickness in the 2.0–1.0 nm range, quantum-mechanical treatments are necessary for accurate modeling of device capacitances. Based on the quantum-mechanical model, both the effective inversion and accumulation layer thicknesses (i.e., the DC charge centroid) are about 1 nm in silicon, which reduces the gate capacitance by adding about 0.3–0.4 nm to the equivalent oxide thickness. This effective layer thickness is only weakly dependent on the electric field at the interface and therefore does not scale with technology.

Fig. 5.4. (a) Electron and (b) hole distributions in the inversion and the accumulation regimes, respectively, calculated based on two classical models (Maxwell–Boltzmann (MB) and Fermi–Dirac (FD) statistics) and the quantum-mechanical (QM) model. The gate oxide thickness is 1.5 nm and the applied gate voltage is 1.2 V in (a) and −2.0 V in (b). (Axes: depth (nm) from the Si/SiO2 interface; electron or hole concentration (×10¹⁹ cm⁻³).)

5.2.2 Polysilicon Gate Depletion Effect

On the gate side of the oxide, heavily doped polysilicon is commonly used as the gate electrode for ease of fabricating source and drain regions self-aligned to the etched gate. The n+ and p+ doped polysilicon gates also offer the desired work functions for scaled threshold voltages of nMOS and pMOS, respectively [5.15]. However, in practice polysilicon can only be doped to Np = 10¹⁹–10²⁰ cm⁻³, much lower than the free carrier concentration of metals in the 10²² cm⁻³ range. As a result, a depletion width (where the mobile carriers are swept away by the field) of the order of Wp = Qi/qNp is needed to sustain a surface charge density Qi (C/cm²) across the MOS capacitor. For today’s Qi/q (∼ Cox(Vg − Vt)/q) in the 10¹³ cm⁻² range, Wp is of the order of a few nanometers, adding significantly to the equivalent oxide thickness. Note that Wp adds to the oxide thickness when calculating the small-signal capacitance. For the MOSFET current, however, which is proportional to the integrated charge, the charge centroid position, or Wp/2, is the more relevant width. In contrast to the QM effect in silicon discussed above, Wp widens and therefore the polysilicon depletion effect worsens as the gate voltage or inversion charge density increases.

Figure 5.5 shows the energy band diagram and the charge concentration profiles in the inversion region with three different active polysilicon doping concentrations (2 × 10¹⁹, 5 × 10¹⁹, and 1 × 10²⁰ cm⁻³). The same MOS structure simulated above is used. The net positive charge plotted on the polysilicon gate side reflects the un-neutralized ionized donor concentration in the depletion region where the mobile carriers are swept away. It is clear that the charge distribution extends farther from the gate-oxide interface as the poly doping is decreased. The polysilicon depletion width Wp increases from about 1.5 to 6 nm as the active polysilicon doping Np decreases from 10²⁰ to 2 × 10¹⁹ cm⁻³. The polysilicon depletion width is therefore not fixed but is very sensitive to the active polysilicon doping level. At the same time, the electron peak concentration as well as the integrated charge (at the given gate voltage) decrease with decreasing poly doping, indicating the degradation of current and transconductance.

Fig. 5.5. Simulated charge concentration profiles in the inversion region. Three active polysilicon doping concentrations (2 × 10¹⁹, 5 × 10¹⁹, and 1 × 10²⁰ cm⁻³) are studied. The energy band diagram (EC and EV) for the 2 × 10¹⁹ cm⁻³ case is also shown. (tox = 1.5 nm, Vg = 1.2 V. Axes: position (nm); charge density (×10¹⁹ cm⁻³).)

5.2.3 Gate Capacitance and Equivalent Oxide Thickness (EOT) Determination

Figure 5.6 shows the simulated capacitance-voltage (C–V) characteristics for the same three polysilicon dopings studied above. In addition, an ideal metal gate case is simulated to exclude the polysilicon depletion effect and separate out the inversion-layer width effect. Both the polysilicon depletion and the inversion-layer width effects add series capacitances to the oxide capacitance, degrading the total gate capacitance to considerably less than Cox, as shown in Fig. 5.6. On the inversion (positive gate voltage) side, one can define a capacitance equivalent thickness (CET), tinv, in terms of the measured capacitance, Cinv:

$$t_{inv} = \frac{A\,\varepsilon_{ox}}{C_{inv}}. \tag{5.4}$$
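As a quick illustration of Eq. (5.4) and of Cox = εox A/tox, the sketch below converts between capacitance per unit area and thickness. It assumes a relative permittivity of 3.9 for SiO2; the function names and the 1500 nF/cm² example value are hypothetical and are not read from Fig. 5.6.

```python
EPS0 = 8.8541878128e-14  # vacuum permittivity (F/cm)
EPS_OX = 3.9 * EPS0      # SiO2 permittivity (F/cm); relative value assumed


def oxide_capacitance_nf_cm2(t_ox_nm: float) -> float:
    """C_ox / A = eps_ox / t_ox, returned in nF/cm^2 for t_ox given in nm."""
    return EPS_OX / (t_ox_nm * 1e-7) * 1e9


def cet_nm(c_inv_nf_cm2: float) -> float:
    """Eq. (5.4) per unit area: t_inv = eps_ox / (C_inv / A), returned in nm."""
    return EPS_OX / (c_inv_nf_cm2 * 1e-9) * 1e7


if __name__ == "__main__":
    print(f"Cox/A for tox = 1.5 nm: {oxide_capacitance_nf_cm2(1.5):.0f} nF/cm^2")
    # 1500 nF/cm^2 is an assumed example measurement, not data from the chapter.
    print(f"tinv for Cinv/A = 1500 nF/cm^2: {cet_nm(1500.0):.2f} nm")
```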

It is a key parameter for in-line process monitoring as well as for device and circuit modeling. Note that tinv is dependent on the gate voltage applied since poly depletion effect worsens at higher gate voltages. The typical active doping concentrations in today’s leading-edge CMOS logic technologies are in the range of 5 × 1019 − 1020 cm−3 . From Fig. 5.6, the combination of the Cox

[Figure 5.6: total gate capacitance (nF/cm²), 0 to 2500, versus gate voltage (V), −2.0 to 2.0, for tox = 1.5 nm; curves for Cox, a metal gate, and Np = 1 × 10²⁰, 5 × 10¹⁹ and 2 × 10¹⁹ cm⁻³; accumulation and inversion branches labeled.]

Fig. 5.6. Simulated full capacitance-voltage (C–V ) characteristics for the three different active polysilicon doping concentrations and for a perfect metal gate electrode

5 Gate Dielectric Scaling to 2.0–1.0 nm: SiO2 and Silicon Oxynitride


two effects can reduce the total gate capacitance by 42–33% of the oxide capacitance. Even if the gate material is assumed to behave like a metal gate, i.e., with no polysilicon depletion effect, the percentage difference due to the finite width of the inversion layer is still as high as 18%. The percentage degradation will be even more significant with the thinner oxides required for further device scaling. A more extendable rule of thumb is that tinv is typically about 0.7–1.0 nm thicker than tox for a gate voltage of ∼ 1.0 V. On the accumulation (negative gate voltage) side, the n+ polysilicon gate is also accumulated. There is no poly depletion region and the capacitance is insensitive to the poly doping [5.16, 5.17]. Most of the capacitance attenuation comes from the finite width of the accumulation layer (Fig. 5.4b) on the silicon side. The capacitance of the poly gate for all dopings is slightly lower than that of the metal gate because of the finite width (∼ 0.1 nm) of the accumulation layer on the polysilicon side. Since the accumulation capacitance is insensitive to poly doping as well as to substrate doping, it can be used to extract the equivalent oxide thickness (EOT) of the MOS device. Here EOT is defined as the physical thickness of an oxide film that, when incorporated in the correct model, would reproduce the measured C–V characteristics in accumulation. The quantum-mechanical model has been shown to fit a full C–V curve of ultra-thin oxide CMOS devices very well [5.2, 5.18]. In contrast, the classical model fails to match the entire range of C–V characteristics regardless of the oxide thickness used. Because the quantum-mechanical model fits the entire branch of the C–V curve, a well-defined EOT can be extracted from one measured point in the strong accumulation region, e.g., Vg = −1.8 V as adopted in Fig. 5.7.
In this figure, the EOT based on different models is plotted as a function of the corresponding capacitance equivalent thickness (CET) in accumulation, tacc, calculated from the measured accumulation capacitance Cacc using the equation:

$$t_{acc} = \frac{A\,\varepsilon_{ox}}{C_{acc}} \qquad (5.5)$$

This allows the determination of EOT from the measured C–V data of a capacitor with known area. With the quantum model, a first-order fit of the curve yields EOT ≈ tacc − 0.7 nm, where the constant depends slightly on the specific gate voltage at which Cacc is measured. After the oxide thickness is determined, the polysilicon and the substrate doping concentrations, respectively, can be extracted by fitting the measured C–V curve in the depletion and the inversion regions.

[Figure 5.7: equivalent oxide thickness (nm) versus capacitance equivalent thickness (nm) at Vg = −1.8 V, both axes 0.5 to 3.0; n+-gate/p-Si, NP = 2 × 10¹⁹ or 2 × 10²⁰ cm⁻³, NB = 10¹⁶ or 10¹⁸ cm⁻³; curves labeled MB, FD and QM.]

Fig. 5.7. Quantum-mechanically (QM) calculated tox–tacc curves for n+-gate/p-Si MOS devices. Two other groups are based on the classical model with Fermi–Dirac (FD) and Maxwell–Boltzmann (MB) statistics (dashed lines), respectively, for comparison

It should be pointed out that it is very difficult to accurately extract the physical thickness of the gate oxide using the high-resolution transmission electron microscope (HRTEM) cross-sectioning technique. The manipulation of TEM image contrast can easily result in several angstroms of variation. Another major uncertainty comes from the Si–SiO2 interface roughness of the polysilicon/oxide/Si stack, which can easily be several angstroms [5.19]. Furthermore, when one deals with an oxynitride film whose dielectric constant is non-uniform and unknown, the physical film thickness cannot be easily translated into EOT. After all, since it is the “field effect” of the MOSFET that is of interest, the gate insulator thickness is best captured, or indeed defined, by an electrical measurement of the capacitance.

For gate oxides below 2.0 nm, the extraction of reliable capacitance becomes challenging because of the tunneling current and series resistance effects. Several novel capacitance and current techniques have been developed to allow accurate capacitance measurement and thickness determination in the 2.0–1.0 nm regime [5.20–5.24]. The best-designed test structure consists of many narrow stripes of gate fingers, instead of one large gate pad, to minimize the series resistance from the middle of the gate to the source-drain contacts.
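The thickness bookkeeping of Sect. 5.2 can be sketched numerically. The following minimal Python sketch evaluates the polysilicon depletion width Wp = Qi/(qNp), the capacitance equivalent thickness of (5.4) and (5.5), and the first-order fit EOT ≈ tacc − 0.7 nm. The function names, the sheet density of 10¹³ cm⁻² and the use of capacitance per unit area (so that A drops out) are illustrative assumptions, not part of the original analysis.

```python
# Illustrative sketch only; names and sample values are assumptions.
EPS0 = 8.854e-14        # vacuum permittivity, F/cm
EPS_OX = 3.9 * EPS0     # SiO2 permittivity, F/cm
CM_TO_NM = 1e7


def poly_depletion_width_nm(sheet_density_cm2, np_cm3):
    """Wp = Qi/(q*Np), with Qi/q given as a sheet density in cm^-2."""
    return sheet_density_cm2 / np_cm3 * CM_TO_NM


def cet_nm(cap_per_area_f_cm2):
    """Eqs. (5.4)/(5.5) written per unit area: t = eps_ox / (C/A)."""
    return EPS_OX / cap_per_area_f_cm2 * CM_TO_NM


def eot_from_tacc_nm(tacc_nm, offset_nm=0.7):
    """First-order quantum-model fit quoted in the text: EOT ~ tacc - 0.7 nm."""
    return tacc_nm - offset_nm


if __name__ == "__main__":
    for np_cm3 in (2e19, 5e19, 1e20):
        wp = poly_depletion_width_nm(1e13, np_cm3)
        print(f"Np = {np_cm3:.0e} cm^-3 -> Wp ~ {wp:.1f} nm")
    tacc = cet_nm(1.5e-6)  # assumed accumulation capacitance, 1500 nF/cm^2
    print(f"tacc ~ {tacc:.2f} nm, EOT ~ {eot_from_tacc_nm(tacc):.2f} nm")
```

The simple estimate gives depletion widths of the same order as the simulated 1.5 to 6 nm range, which is all the order-of-magnitude expression is meant to show.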

5.3 Tunneling Current of SiO2

The tunneling current through the gate insulator is another quantum-mechanical effect intrinsic to ultra-thin gate dielectrics, and it has become the most constraining limit to CMOS scaling.


5.3.1 Modeling Electron Tunneling from Quasi-bound States

Many models for the current-voltage characteristics of MOS devices in the direct and the F–N (Fowler–Nordheim) tunneling regimes [5.25–5.28] are available in the open literature. For a two-dimensional electron gas, the tunneling probability, which is only applicable to an incident gas of free electrons, may no longer be a valid concept. Weinberg [5.25] proposed a model based on the triangular well approximation with electrons confined within the lowest sub-band. However, the contribution from higher sub-bands was neglected, which might not be entirely correct at higher temperatures. Rana et al. [5.28] calculated the lifetimes associated with the quasi-bound bands using a path integral expansion of the resolvent operator, which gave quite good results for electron tunneling from an accumulation layer. For the calculation of tunneling current from the quasi-bound states, the closed boundary condition at the polysilicon/SiO2 interface assumed in the previous calculation (i.e., no gate tunneling current) is relaxed such that the electron wave function is non-zero in the polysilicon gate region. The conduction band profile and the discrete sheet charge densities previously obtained are assumed to be unchanged, i.e., the inversion layer acts as a reservoir within which the electrons are all in equilibrium with a constant temperature and Fermi level. The Schrödinger equation for the entire MOS structure, including the polysilicon region, is then solved again. The close analogy between the confined electrons in a varying potential and electromagnetic waves in a wave guide with a varying refractive index allows for the utilization of the transverse-resonant method [5.29], which is often used for finding the eigenvalue equation for inhomogeneously filled wave guides and dielectric resonators. The detailed formulation of this method can be found in [5.18].

The energy band structure of SiO2 is modeled using a Franz-type E(k) dispersion [5.26] with an effective mass of 0.55 m0 at the conduction band edge in the inversion regions. The barrier height due to the conduction band discontinuity at the Si/SiO2 interface is taken to be 3.15 eV. The barrier lowering due to the image force is ignored.

5.3.2 Tunneling Current as a Function of Thickness

The gate tunneling current from the inversion layer to the positively biased gate electrode is calculated using the transverse-resonant method. Figure 5.8 shows the energy band diagram and the normalized electron wave function for the lowest subband of the twofold-degenerate valley. Two oxide thicknesses, 2.0 and 1.0 nm, are compared. The oxide field for the 1.0 nm case equals that of the 2.0 nm case, for which a gate voltage of 1.5 V is applied. It is the nonzero tail of the wave function in the polysilicon region that is responsible for the gate leakage current. Compared with the 2.0 nm oxide case, the 1.0 nm oxide case has a roughly 10² times higher tail magnitude in the polysilicon,

[Figure 5.8: normalized wave function ψ0,L(z) (log scale, 10⁻⁹ to 10¹) and conduction and valence band edges EC and EV (eV, −2 to 4) versus position (nm, −10 to 20) across the polysilicon/SiO2/silicon stack, for 2.0 nm and 1.0 nm oxides.]

Fig. 5.8. Energy band diagram and normalized electron wave function for the lowest subband of the twofold-degenerate valley. The magnitude of the wave function is normalized to the peak value in the silicon region. Two oxides, 2.0 and 1.0 nm, are compared

[Figure 5.9: gate current density (A/cm², log scale, 10⁻⁸ to 10⁸) versus gate voltage (V, 0.0 to 3.0) for nFETs; measurement and simulation; oxide thicknesses of 5, 10, 15, 20, 21.9, 25.6, 29.1, 32.2, 35.0 and 36.1 Å.]

Fig. 5.9. Measured and simulated Ig–Vg characteristics under inversion conditions of nFETs with oxides ranging from 0.5 to 3.61 nm. The four dotted lines indicate the 3 × 10⁻⁴, 10, 10³ and 10⁶ A/cm² limits for leakage current as discussed in the text


indicating a 10⁴ times higher gate tunneling current. The lifetime associated with each subband decreases with decreasing oxide thickness, leading to a higher tail amplitude in the polysilicon and therefore a higher current. Figure 5.9 shows the thickness-dependent gate tunneling characteristics for n+-gate/p-Si nMOSFETs. Excellent agreement between the calculated and the measured Ig–Vg characteristics has been obtained for tox ranging from 3.61 to 2.19 nm. In the strong inversion region, the majority of the current density (over 90%) comes from the lowest two subbands associated with the twofold-degenerate and the fourfold-degenerate valleys in the conduction band [5.2]. Ig–Vg characteristics for oxides down to 0.5 nm are also simulated and shown. The tunneling current density calculated for a 1.5 nm oxide agrees reasonably well with the reported experimental data [5.30]. The current density at a gate bias of 1.2 V increases by more than 14 orders of magnitude as the oxide thickness is scaled down from 3.61 nm to 0.5 nm. Different circuit applications, e.g., low-power versus high-performance systems, lead to different criteria for tolerable tunneling currents, which then set the oxide scaling limit for that application. This is further addressed in Sect. 5.5.

5.4 Tunneling Currents of Silicon Oxynitride

To slow down the increase of tunneling current and extend the scaling limit of ultra-thin oxide, gate dielectrics with silicon oxynitride (SiOxNy) and nitride (Si3N4) have been widely proposed [5.31–5.34], where x is the O content coefficient and y the N content coefficient. These nitrided materials can be easily integrated with current CMOS processes and have much lower tunneling current than oxides for the same electrical thickness (i.e., capacitance) owing to their higher dielectric constants. Using the compiled material data for silicon oxynitride [5.35], the dielectric constant and the barrier height (in eV) of SiOxNy can be represented by the following empirical forms:

$$\varepsilon_{ON} = 3.9\left(2 - \frac{x}{2}\right)\varepsilon_0 \qquad (5.6)$$

and

$$\phi_{ON} = 3.15 - 1.05\left(1 - \frac{x}{2}\right). \qquad (5.7)$$

Note that the stoichiometric oxynitride composition satisfies the following relationship:

$$2x + 3y = 4. \qquad (5.8)$$
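As a quick numerical check of the empirical forms (5.6) to (5.8), the following minimal Python sketch evaluates them across the composition range. The function names are hypothetical, the atomic nitrogen fraction is computed under the assumption [N] = y/(1 + x + y) for SiOxNy, and the physical thickness uses the equal-capacitance condition between the oxynitride film and an oxide of thickness EOT.

```python
# Illustrative sketch of Eqs. (5.6)-(5.8); names are assumptions.
K_OX = 3.9  # relative dielectric constant of SiO2


def nitrogen_coeff(x):
    """y from the stoichiometry relation 2x + 3y = 4, Eq. (5.8)."""
    return (4.0 - 2.0 * x) / 3.0


def rel_dielectric_constant(x):
    """Eq. (5.6) in units of eps0: 3.9 * (2 - x/2)."""
    return K_OX * (2.0 - x / 2.0)


def barrier_height_ev(x):
    """Eq. (5.7): 3.15 - 1.05 * (1 - x/2), in eV."""
    return 3.15 - 1.05 * (1.0 - x / 2.0)


def nitrogen_atomic_fraction(x):
    """Assumed definition: N atoms over all atoms in SiOxNy."""
    y = nitrogen_coeff(x)
    return y / (1.0 + x + y)


def physical_thickness_nm(x, eot_nm):
    """Film thickness giving the same capacitance as an oxide of thickness EOT."""
    return eot_nm * rel_dielectric_constant(x) / K_OX


if __name__ == "__main__":
    for half_x in (1.0, 0.5, 0.0):
        x = 2.0 * half_x
        print(
            f"x/2 = {half_x:.1f}: k = {rel_dielectric_constant(x):.2f}, "
            f"phi = {barrier_height_ev(x):.2f} eV, "
            f"[N] = {100 * nitrogen_atomic_fraction(x):.1f}%, "
            f"t(EOT = 2 nm) = {physical_thickness_nm(x, 2.0):.1f} nm"
        )
```

The end points reproduce the oxide values (3.9, 3.15 eV) at x = 2 and a pure nitride with twice the dielectric constant at x = 0.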

Using the same transverse-resonant method, the tunneling current of the SiOxNy gate dielectric is calculated and plotted as a function of film composition x for four different EOTs in Fig. 5.10. Along each curve, the physical film thickness tON is adjusted according to the dielectric constant of that composition such that εON/tON = εox/EOT. The gate voltages used for the four EOTs

[Figure 5.10: gate tunneling current (A/cm², log scale, 10⁻⁷ to 10⁷) versus oxynitride (SiOxNy) dielectric film composition x/2, from 0.0 (Si3N4) to 1.0 (SiO2); curves for EOT = 0.6 nm / Vg = 0.8 V, EOT = 1.0 nm / Vg = 1.0 V, EOT = 1.5 nm / Vg = 1.2 V and EOT = 2.0 nm / Vg = 1.5 V.]

Fig. 5.10. Calculated gate tunneling current as a function of SiOxNy dielectric film composition x for four different EOTs (2.0, 1.5, 1.0 and 0.6 nm). The corresponding gate voltages chosen for the four EOTs are 1.5, 1.2, 1.0 and 0.8 V

are chosen according to the proposed nominal power supply voltages for the current and future logic technologies [5.9]. For the same EOT, the gate tunneling current decreases as x decreases, i.e., with more [N] and less [O] in the dielectric film composition. For a 2.0 nm EOT, the tunneling current of a pure nitride film (i.e., x = 0 and [N] = 57.1%), with a physical thickness of 4.0 nm, is nearly 5 orders of magnitude less than that of an oxide gate (i.e., x/2 = 1 and [N] = 0%). This order of magnitude of leakage reduction agrees well with the reported data for a jet vapor deposited (JVD) nitride film [5.32]. While incorporating a high concentration of nitrogen in silicon oxynitride films allows a smaller EOT for further device scaling, the ratio of leakage reduction for a given composition gradually decreases as the EOT becomes thinner. As shown in Fig. 5.10, for nitride films of 1.5, 1.0 and 0.6 nm EOT, the gate leakage decreases by approximately 4000, 200, and 30 times, respectively, compared with oxide films of the same EOT. This phenomenon can be understood by estimating the tunneling probability ΓON for the oxynitride film using the WKB approximation. Assuming an ideal square potential barrier, the tunneling probability based on the WKB approximation can be expressed as

$$\Gamma_{ON} = e^{-2 k_{ON} t_{ON}} = e^{-2 k_{ON}\,\mathrm{EOT}\,\varepsilon_{ON}/\varepsilon_{ox}} \qquad (5.9)$$

where kON is the imaginary part of the complex wave vector of the tunneling electron, and tON is the physical thickness of the silicon oxynitride dielectric. The current reduction of an oxynitride dielectric over an oxide film for the same EOT is then

$$\frac{J_{ON}}{J_{ox}} = e^{-2\left(k_{ON}\,\varepsilon_{ON}/\varepsilon_{ox} - k_{ox}\right)\mathrm{EOT}}. \qquad (5.10)$$

As EOT decreases, the ratio in (5.10) becomes closer to unity. The magnitude of tunneling current reduction of a nitride gate dielectric therefore decreases for thinner EOTs.
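The exponential dependence in (5.10) can be checked against the leakage reductions quoted above for pure nitride films. The sketch below is a rough consistency check, not a tunneling calculation: it fits a single effective exponent α, standing for 2(kON εON/εox − kox), to the four quoted reductions by least squares through the origin. The value 10⁵ used for the 2.0 nm EOT case approximates the "nearly 5 orders of magnitude" in the text, and all names are illustrative.

```python
import math

# Oxide-to-nitride leakage ratios quoted in the text (1e5 is an approximation).
EOT_NM = [2.0, 1.5, 1.0, 0.6]
REDUCTION = [1e5, 4000.0, 200.0, 30.0]


def fit_exponent_per_nm(eots_nm, ratios):
    """Least-squares slope through the origin of ln(ratio) versus EOT."""
    num = sum(t * math.log(r) for t, r in zip(eots_nm, ratios))
    den = sum(t * t for t in eots_nm)
    return num / den


def predicted_reduction(alpha_per_nm, eot_nm):
    """J_ox / J_ON = exp(alpha * EOT), the inverse of the ratio in Eq. (5.10)."""
    return math.exp(alpha_per_nm * eot_nm)


if __name__ == "__main__":
    alpha = fit_exponent_per_nm(EOT_NM, REDUCTION)
    print(f"effective exponent: {alpha:.2f} per nm")
    for t, r in zip(EOT_NM, REDUCTION):
        print(f"EOT = {t} nm: quoted {r:.0f}, fitted {predicted_reduction(alpha, t):.0f}")
```

A single exponent reproduces all four quoted reductions to within a factor of about 1.5, consistent with the ratio in (5.10) approaching unity as EOT shrinks.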

5.5 Application Dependence of Gate Dielectric Limit

In this section, we discuss the scaling limits of gate oxides and oxynitrides based on the tunneling current density as a function of thickness. The scaling limits of gate oxides depend on the leakage current or power criteria, which in turn depend on the specific application environment the devices are intended for [5.36]. The power supply is assumed to be in the range of 0.5–1.0 V for sub-50 nm CMOS technologies. Since the gate current in the direct tunneling regime is largely insensitive to the applied voltage or field across the oxide, it does not matter what exactly the supply voltage is. The tunneling current versus EOT plot in Fig. 5.11 provides the design guide for translating a gate leakage specification into a thickness limitation. The fact that these curves are not parallel to each other means that the benefit of oxynitride over oxide varies with EOT, ranging from ∼ 0.2 nm on the very thin side to ∼ 0.7 nm on the thick side. For a single transistor or circuit, e.g., an inverter, to function properly in the presence of gate tunneling, the leakage current only has to be small compared with the switching or on-state current of the device at Vg = Vds = Vdd. Typical on-currents of sub-50 nm CMOS devices are on the order of 0.5–1.0 mA/µm [5.34]. This means that the gate current needs to be constrained to below about 0.1 mA/µm so as not to significantly affect the switching performance. For a 10 nm channel length MOSFET, this figure translates into a tunneling current density of 10⁶ A/cm². Therefore, as far as the functionality of a single transistor is concerned, the oxide thickness can be aggressively scaled to 0.5 nm according to Fig. 5.11. This is listed in the first row of Table 5.1. Following the tox ≈ Lmin/20 to Lmin/25 rule from Sect. 5.1, the gate length can be scaled to 10 nm for single transistor switching.
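The conversion from a gate current per unit width to a current density, used for the single-transistor limit above, can be sketched as follows. This is a minimal illustration with hypothetical function names; it only restates the unit arithmetic and the tox ≈ Lmin/20 to Lmin/25 rule.

```python
def gate_current_density_a_per_cm2(i_gate_a_per_um, gate_length_nm):
    """Convert a gate current per unit width (A/um) into a density (A/cm^2)."""
    width_cm = 1e-4                    # 1 um of gate width, in cm
    length_cm = gate_length_nm * 1e-7  # nm -> cm
    return i_gate_a_per_um / (width_cm * length_cm)


def min_gate_length_range_nm(tox_nm):
    """Minimum gate length implied by tox ~ Lmin/20 to Lmin/25."""
    return 20.0 * tox_nm, 25.0 * tox_nm


if __name__ == "__main__":
    # 0.1 mA/um of tolerable gate current in a 10 nm long channel
    j = gate_current_density_a_per_cm2(0.1e-3, 10.0)
    print(f"J = {j:.1e} A/cm^2")
    print("Lmin range for tox = 0.5 nm:", min_gate_length_range_nm(0.5))
```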
Under a VLSI chip environment, however, even though the gate leakage current may be at a level negligible compared with the on-state current of the device, the cumulative effect on the chip standby power can be considerable. Note that the leakage power will be dominated by turned-on nMOSFETs in which electrons tunnel from the silicon inversion layer to the positively-biased

[Figure 5.11: gate tunneling current (A/cm², log scale, 10⁻⁶ to 10⁶) at Vg = 1 V versus equivalent oxide thickness, EOT (nm, 0.5 to 2.5), for an n+-gate/p-silicon structure; one curve per composition from x/2 = 1.0 ([N] = 0.0%, SiO2) to x/2 = 0.0 ([N] = 57.1%, Si3N4) in steps of 0.1.]

Fig. 5.11. Design guide for using oxynitride gate for lower gate tunneling current

gate. Edge tunneling in the gate-to-source or gate-to-drain overlap region of turned-off devices can be controlled by additional oxidation of the polysilicon after gate patterning to build up the corner oxide thickness. pMOSFETs have a much lower leakage than nMOSFETs because there are very few electrons in the p+ poly gate available for tunneling to the substrate, and hole tunneling has a much lower probability. For high performance CMOS processors, the total chip power could reach 100 W, which suggests a 10 W limit for the oxide leakage power. The processor core may consist of 10 million transistors, each about 1 µm wide on average. These figures lead to a gate current limit of 1 µA/µm, which corresponds to a current density of 3 × 10³ A/cm² for a gate length of 30 nm. According to Fig. 5.11, a gate oxide thickness of 1 nm can be tolerated. The minimum gate length for such a tox is 20 nm, as listed in Table 5.1. For low power processors and on-chip SRAM arrays with a larger transistor count, a more stringent limit on the standby power must be imposed. For 100 million transistors with a total standby power limit of, say, 0.1 W, each transistor cannot leak more than 1 nA, or 10 A/cm² for a gate width and length of 100 nm. This sets the oxide limit at 1.5 nm, for which 40 nm channel length devices can be accommodated without excessive short-channel effects. This is the third entry in Table 5.1. The most stringent limit on oxide thickness is with DRAM cells, where the transfer device must have a high threshold voltage, Vt ∼ 0.7–0.8 V, and the gate leakage current cannot exceed 10 fA to ensure a retention time of 1 s

Table 5.1. Gate oxide (SiO2) scaling limits under different circumstances

| Application | Tolerable gate current density (A/cm²) | Gate oxide (SiO2) limit (nm) | Gate length limit (nm) |
|---|---|---|---|
| Single transistor/circuit | 10⁶ | 0.5 | 10 |
| High performance processor/logic | 10³ | 1 | 20 |
| Low power processor and SRAM | 10 | 1.5 | 40 |
| DRAM cell transistor | 3 × 10⁻⁴ | 2.5 | 100 |

for a charge storage of 10 fC per cell. Here we are concerned about electron tunneling from the gate to the positively-biased drain when the MOSFET is off, so the critical length is a fraction of the gate length near the drain. If we take the gate tunneling area to be on the order of 100 nm × 30 nm, where 100 nm is the device width, the corresponding current density is 3 × 10⁻⁴ A/cm². The gate oxide of the DRAM cell transistor therefore cannot be thinner than 2.5 nm [5.36]. Furthermore, in the last row of Table 5.1, a more conservative tox ≈ Lmin/40 rule is adopted to obtain a minimum gate length of 100 nm. This allows for a better subthreshold slope of the source-to-drain current and thus lower off-currents. One approach to get around this limitation is to use vertical transistor cells [5.37], in which the cell size is not constrained by the gate length. The tox numbers listed in Table 5.1 refer to the “equivalent oxide thickness” for gate oxide as defined in Sect. 5.2.3. The capacitance equivalent thickness in inversion (so-called “tinv”), which measures the incremental inversion charge density per gate voltage swing, is however substantially thicker than tox due to the inversion layer quantization and polysilicon-gate depletion effects discussed earlier. The inversion layer thickness effectively adds ∼ 0.4 nm to tinv (weakly dependent on the gate voltage), which is rather fundamental as it stems from the size of the electron orbit in an atom [5.5]. The polysilicon gate depletion effect, on the other hand, can be mitigated by using metal gates. It should be noted that as far as gate length scaling in Table 5.1 is concerned, the effective gate oxide thickness under turned-off conditions, i.e., at Vg = 0, is what matters. This means that the inversion layer thickness does not enter the picture. There might be some polysilicon gate depletion effects at Vg = 0, though not as strong as in the turned-on region, that can add ∼ 0.2 nm to tox.
For thin EOTs (≤ 1 nm), that is about equal to the benefit of oxynitrides in terms of tunneling current reduction, so these two effects tend to offset each other. For thicker EOTs (≥ 2 nm), the use of an oxynitride dielectric can in principle reduce the limiting EOT figures in Table 5.1 by about 0.5 nm. In conclusion, as the gate oxide or oxynitride thickness is scaled to near atomic dimensions, fundamental limitations imposed by quantum mechanics set in. This is manifested as a gate leakage current which adds to the standby power and, if severe, can threaten transistor operation at the individual device level. The consequence is that CMOS scaling will be limited to a minimum gate length between 10 nm and 100 nm, depending on the specific applications and circumstances. An engineering way to cope with the various device limits is to deploy multiple oxide thicknesses on the same chip, e.g., in a merged logic-DRAM technology. That, of course, comes at the expense of process complexity.
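The chip-level leakage budgets of this section reduce to simple arithmetic, sketched below in Python. The function names are hypothetical, and a 1 V supply is assumed when converting a leakage power budget into a current, which is what the quoted per-transistor currents imply.

```python
def current_per_transistor_a(leakage_power_w, n_transistors, vdd_v=1.0):
    """Tolerable gate current per transistor for a given leakage power budget."""
    return leakage_power_w / (vdd_v * n_transistors)


def current_density_a_per_cm2(current_a, width_nm, length_nm):
    """Gate current divided by the tunneling area, in A/cm^2."""
    area_cm2 = (width_nm * 1e-7) * (length_nm * 1e-7)
    return current_a / area_cm2


if __name__ == "__main__":
    # High performance logic: 10 W over 10 million transistors, 1 um x 30 nm gates
    i_hp = current_per_transistor_a(10.0, 10e6)
    print(f"HP:   {i_hp:.1e} A, {current_density_a_per_cm2(i_hp, 1000.0, 30.0):.1e} A/cm^2")

    # Low power logic and SRAM: 0.1 W over 100 million transistors, 100 nm x 100 nm gates
    i_lp = current_per_transistor_a(0.1, 100e6)
    print(f"LP:   {i_lp:.1e} A, {current_density_a_per_cm2(i_lp, 100.0, 100.0):.1e} A/cm^2")

    # DRAM cell: 10 fC stored, 1 s retention -> 10 fA over a 100 nm x 30 nm area
    i_dram = 10e-15 / 1.0
    print(f"DRAM: {i_dram:.1e} A, {current_density_a_per_cm2(i_dram, 100.0, 30.0):.1e} A/cm^2")
```

The three results land on the orders of magnitude used in the text: a few 10³ A/cm², 10 A/cm², and about 3 × 10⁻⁴ A/cm².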

References

5.1. R. H. Dennard, F. H. Gaensslen, H. N. Yu, V. L. Rideout, E. Bassous, and A. R. LeBlanc, “Design of Ion-Implanted MOSFETs with Very Small Physical Dimensions,” IEEE J. Solid-State Circuits SC-9, p. 256 (1974)
5.2. S.-H. Lo, D. A. Buchanan, Y. Taur, and W. Wang, “Quantum-mechanical modeling of electron tunneling current from the inversion layer of ultra-thin-oxide nMOSFET’s,” IEEE Electron. Device Lett. 18, pp. 209–211 (1997)
5.3. Y. Taur, D. A. Buchanan, W. Chen, D. J. Frank, K. E. Ismail, S.-H. Lo, G. A. Sai-Halasz, R. G. Viswanathan, H.-J. C. Wann, S. J. Wind, and H.S. Wong, “CMOS scaling into the nanometer regime,” Proc. IEEE 85, pp. 486–504 (1997)
5.4. T. N. Nguyen, “Small-Geometry MOS Transistors: Physics and Modeling of Surface- and Buried-Channel MOSFETs,” PhD. Thesis, Stanford University, 1984
5.5. Y. Taur and T. H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, New York, 1998
5.6. D. J. Frank, Y. Taur, and H.-S. Wong, “Generalized Scale Length for Two-Dimensional Effects in MOSFET’s,” IEEE Electron. Device Lett. 19, p. 385 (1998)
5.7. Y. Taur, C. H. Wann, and D. J. Frank, “25 nm CMOS Design Considerations,” 1998 IEDM Technical Digest, p. 789
5.8. L. K. Han, S. Biesemans, J. Heidenreich, K. Houlihan, C. Lin, V. McGahay, T. Schiml, A. Schmidt, U. P. Schroeder, M. Stetter, C. Warm, D. Warner, R. Mahnkopf, and B. Chen, “A Modular 0.13 µm Bulk CMOS Technology for High Performance and Low Power Applications,” 2000 Symposium on VLSI Technology Digest of Technology Papers, pp. 12–13
5.9. T. Ghani et al., “Scaling challenges and device design requirements for high performance sub-50 nm gate length planar CMOS transistors,” in 2000 Symposium on VLSI Technology Digest of Technology Papers, pp. 174–175
5.10. M. Mehrotra, J. Wu, A. Jain, T. Laaksonen, K. Kim, W. Bather, R. Koshy, J. Chen, J. Jacobs, V. Ukrainstev, L. Olsen, J. DeLoach, J. Mehigan, R. Agarwal, S. Walsh, D. Sekel, L. Tsung, M. Vaidyanathan, B. Trentman, K. Liu, S. Aur, R. Khamankar, P. Nicollian, Q. Jiang, Y. Xu, B. Campbell, P. Tiner, R. Wise, D. Scott, and M. Rodder, “60 nm gate length dual-Vt CMOS for high performance applications,” 2002 Symposium on VLSI Technology Digest, pp. 124–125
5.11. F. Stern and W. E. Howard, “Properties of semiconductor surface inversion layers in the electric quantum limit,” Phys. Rev. 163, pp. 816–835 (1967)
5.12. F. Stern, “Self-consistent results for n-type Si inversion layers,” Phys. Rev. B 5, pp. 4891–4899 (1972)
5.13. C. Moglestue, “Self-consistent calculation of electron and hole inversion charges at silicon-silicon dioxide interfaces,” J. App. Phys. 59, pp. 3175–3183 (1986)
5.14. J. Suñé, P. Olivo, and B. Riccò, “Quantum-mechanical modeling of accumulation layers in MOS structure,” IEEE Trans. Electron Devices 39, pp. 1732–1739 (1992)
5.15. C. Y. Wong, J. Y.-C. Sun, Y. Taur, C. S. Oh, R. Angelucci, and B. Bavari, “Doping of n+ and p+ polysilicon in a dual-gate process,” 1988 IEDM Tech. Dig., pp. 238–241
5.16. P. Habaš and S. Selberherr, “On the effect of non-degenerate doping of polysilicon gate in thin oxide MOS-devices-analytic modeling,” Solid-State Electron. 33, pp. 1539–1544 (1990)
5.17. R. Rios and N. D. Arora, “Determination of ultra-thin gate oxide thicknesses for CMOS structures using quantum effects,” 1994 IEDM Technical Dig., pp. 613–616
5.18. S.-H. Lo, D. A. Buchanan, and Y. Taur, “Modeling and characterization of quantization, polysilicon depletion, and direct tunneling effects in MOSFETs with ultra-thin oxides,” IBM J. Research and Development 43, pp. 327–337 (1999)
5.19. D. A. Buchanan, “Scaling the gate dielectric: materials, integration, and reliability,” IBM J. Res. Develop. 43, pp. 245–264 (1999)
5.20. W. K. Henson, K. Z. Ahmed, E. M. Vogel, J. R. Hauser, J. J. Wortman, R. D. Venables, M. Xu, and D. Venables, “Estimating oxide thickness of tunnel oxides down to 1.4 nm using conventional capacitance-voltage measurements on MOS capacitors,” IEEE Electron Device Lett. 20, pp. 179–181 (1999)
5.21. K. J. Yang and C. Hu, “MOS capacitance measurements for high leakage thin dielectrics,” IEEE Trans. Electron Devices 46, pp. 1500–1501 (1999)
5.22. M. S. Krishnan, L. Chang, T.-J. King, J. Bokor, and C. Hu, “MOSFETs with 9 to 13 Å,” 1999 IEDM Technical Dig., pp. 241–244
5.23. C.-H. Choi, J.-S. Goo, T.-Y. Oh, Z. Yu, R. W. Dutton, A. Bayoumi, M. Cao, P. Vande Voorde, D. Vook, and C. H. Diaz, “MOS C-V characterization of ultrathin gate oxide thickness (1.3–1.8 nm),” IEEE Electron. Device Lett. 20, pp. 292–294 (1999)
5.24. K. Yang, Y.-C. King, and C. Hu, “Quantum effect in oxide thickness determination from capacitance measurement,” 1999 Symp. VLSI Tech. Digest of Technical Papers, pp. 77–78
5.25. Z. A. Weinberg, “On tunneling in metal-oxide silicon structures,” J. Appl. Phys. 53, pp. 5052–5056 (1982)
5.26. J. Maserjian, “Tunneling in thin MOS structures,” J. Vac. Sci. Technol. 11, pp. 996–1003 (1974)
5.27. J. G. Simmons, “Generalized formula for the electric tunneling effect between similar electrodes separated by a thin insulating film,” J. Appl. Phys. 34, pp. 1793–1803 (1963)
5.28. F. Rana, S. Tiwari, and D. A. Buchanan, “Self-consistent modeling of accumulation layers and tunneling currents through very thin oxides,” Appl. Phys. Lett. 69, pp. 1104–1106 (1996)
5.29. R. E. Collin, Field Theory of Guided Waves, 2nd edn, New York: IEEE Press, 1991
5.30. H. S. Momose, M. Ono, T. Yoshitomi, T. Ohguro, S. Nakamura, M. Saito, and Hiroshi Iwai, “Tunneling gate oxide approach to ultra-high current drive in small-geometry MOSFETs,” 1994 IEDM Technical Dig., pp. 593–596
5.31. M. Rodder, S. Hattangady, N. Yu, W. Shiau, P. Nicollian, T. Laaksonen, C. P. Chao, M. Mehrotra, C. Lee, S. Murtaza, S. Aur, “A 1.2 V, 0.1 µm Gate Length CMOS Technology: Design and Process Issues,” 1998 IEDM Technical Digest, pp. 623–626
5.32. T. P. Ma, “Making silicon nitride film a viable gate dielectric,” IEEE Trans. Electron Devices 45, pp. 680–690 (1998)
5.33. H. Yang and G. Lucovsky, “Integration of ultrathin (1.6 ∼ 2.0 nm) RPECVD oxynitride gate dielectrics into dual poly-Si gate submicron CMOSFETs,” 1999 IEDM Technical Dig., pp. 245–248
5.34. B. Yu, H. Wang, Q. Xiang, J. X. An, J. Jeon, and M.-R. Lin, “Scaling towards 35 nm gate length CMOS,” 2001 Symp. VLSI Technology Digest of Technical Papers, pp. 9–10
5.35. X. Guo and T. P. Ma, “Tunneling leakage current in oxynitride: dependence on Oxygen/Nitrogen content,” IEEE Electron Device Lett. 19, no. 6, pp. 207–209 (1998)
5.36. D. J. Frank, R. H. Dennard, E. Nowak, P. M. Solomon, Y. Taur, and H.S. P. Wong, “Device Scaling Limits of Si MOSFETs and Their Application Dependencies,” Proc. IEEE 89, pp. 259–288 (2001)
5.37. R. Weis et al., “A highly cost efficient 8F² DRAM cell with a double-gate vertical transistor device for 100 nm and beyond,” 2001 IEDM Technical Digest, p. 415

6 Optimal Scaling Methodologies and Transistor Performance

T. Skotnicki and F. Boeuf

6.1 Introduction

Almost 40 years of extensive scaling (guided by the scaling rules established by Dennard in 1974 [6.1]) has led device parameters to approach certain technological limits, device-related limits and fundamental physical limits that were all negligible before. To give a better feeling for what we are talking about, let us give a few examples.

1. Technological limits may arise from a lack of an appropriate technological process, lack of equipment, insufficient control of a process, excessive cost, etc.
2. Device-related limits often arise from the non-scalability of certain parameter values that are not fundamentally invariant but that we refrain from changing for purely practical reasons imposed by technology or circuit type prerogatives. A good example is the threshold voltage, which we are perfectly capable of lowering but refrain from doing so in order to avoid large static power consumption. The resulting invariability of the threshold voltage, in conjunction with a continuing scale-down of the supply voltage, unavoidably leads to trouble: the so-called gate overdrive (gate voltage minus threshold voltage) diminishes, which leads to a reduction in the transistor conductivity. Other examples in this category are short-channel effects, junction breakdown, junction lateral diffusion, contact resistance, junction series resistance, polydepletion, etc.
3. Fundamental physical limits: we put in this category those parameters which are governed by fundamental physics and are basically beyond our influence. A few examples in this category are the oxide tunneling, the finite subthreshold slope, the so-called “inversion layer dark space”, etc.

A good illustration of the increasing weight of these limitations arises from considering the so-called “good-design rules”. These rules, established empirically and confirmed throughout a number of past CMOS generations, prescribe relations between certain crucial device parameters, which should be respected to guarantee good operation of the MOS transistor. Three of them apply to the subthreshold regime, and if fulfilled they ensure a good electrostatic integrity of the device or, in other words, relative


suppression of the SCE (short channel effect) and DIBL (drain induced barrier lowering). These three read as follows: Tox /L ≤ 1/30, Xj /L ≤ 1/3 and Tdep /L ≤ 1/3, where Tox , L, Xj and Tdep stand for gate oxide thickness, transistor length, junction depth, and substrate depletion depth under gate, respectively. The forth “good-design rule” demands the ratio Vth /Vdd ≤ 1/5, and in this way guaranties a correct drive current and noise margin to the device, where Vth and Vdd are threshold and supply voltages, respectively. Providing we started from a well-designed technology, the scaling according to the Denard’s scaling theory [6.1] (summarized in Table 6.1) used to guarantee the respect of the Tox /L and Xj /L good-design rules in the past. To large extend, this was automatic since Tox , L, Xj scaled with the same scaling factor k thus preserving the ratios Tox /L and Xj /L. The Tdep /L was, however, condemned to derive from the 1/3 value thus menacing of relaxation of SCE, DIBL and also of Ioff due to decreasing Vth value (roughly proportional to N 0.5 ×Tox ), since Tdep scales as k 0.5 if N scales as 1/k. This can be of course corrected by faster scaling of N . Its scaling as 1/k 2 leads to Tdep scaling as k and thus to preservation of the Tdep /L “good design” rule, but then causes problems to the Vth /Vdd “good-design” rule. Indeed, scaling Tox as k and N as 1/k 2 implies invariability of Vth that is roughly proportional to N 0.5 × Tox . Then Vth /Vdd ≤= 1/5 is endangered (Vdd converges to Vth – menacing of Ion degradation). In the past we had a large margin (this ratio used to be smaller than 1/5), and the problem could be neglected, but sooner or later it had to emerge as converging Vth and Vdd simply leads to vanishing transistor conductivity. In order to further investigate the consequences of the non-respect of “good-design” rules and do so in a more quantitative manner, we need to develop a better insight into the MOSFET physics. 
This will also allow us to realize that all four “good-design” rules are going to be violated if we remain within the traditional scaling principle. In the following sections, we will attempt to predict some consequences of these limitations for CMOS performance and to analyze various optimization strategies aimed at eliminating or attenuating the limitations, thus enabling a smooth continuation of Moore’s law. Moore’s law, which reflects the incredible dynamism of the semiconductor industry, has relied from the beginning on transistor scaling. Therefore, in order to maintain the traditional improvement rate in circuit performance, the intrinsic frequency (I/CV) has to improve by 17% per year.¹ As the Ion in the ITRS device specifications results from the above criterion, it is important to understand that reaching the Ioff–Ion specifications is essential not only for the current drivability but also, and most of all, for satisfying the 17%/year improvement in speed.

¹ In fact, the 17% improvement per year simply results from the scaling: since I/CV scales as 1/k per generation (see Table 6.1) and each generation is replaced every 2 years, the improvement per year is equal to $k^{-0.5} = 1.19$, meaning 19% per year. The 17% figure is just the result of averaging over a long period of history and of the fact that in the past the generations used to last longer than 2 years.

Table 6.1. CMOS scaling projections based on 0.18 µm CMOS (LP) and extrapolated in both directions according to the theoretical scaling factor k = 0.7

CMOS, µm            Scaling rule   0.5      0.18    0.13     0.09    0.065    0.045    0.032   0.022
L, nm               ×k             500      180     130      90      65       45       32      22
Vdd, V              ×k             5        1.8     1.3      0.9     0.65     0.45     0.32    0.22
Tox, nm             ×k             9.0      3.0     2.1      1.5     1.0      0.7      0.5     0.35
Xj, nm              ×k             120      50      35       25      18       13       9       7
Nchannel, at/cm3    ×1/k           3.4e17   1e18    1.4e18   2e18    2.9e18   4.2e18   6e18    8.5e18
Ion/W, µA/µm        ×1             650      650     750      650     650      650      650     650
(CV/I)^-1, THz      ×1/k           0.07     0.2     0.29     0.41    0.58     0.83     1.2     1.7

6.2 Scaling and Device Physics

For a better understanding of the conflicting design rules and of their consequences for future CMOS generations, and also for analyzing optimal scaling strategies, we need a model. We will use analytical models here (see Table 6.2) rather than numerical simulation, in order to maintain a close link with the device physics. We have of course verified that both lead to the same or sufficiently close results, so as to avoid misinterpretations due to the limited validity of the tool (we will come back to this point throughout the discussion).

6.2.1 MASTAR Model

MASTAR² is an analytical tool for the assessment of CMOS technologies and roadmaps. It allows optimization of, or just predictions on, Ion–Ioff, speed and power. In contrast with its former version [6.4], no CAD compact modeling is targeted with the present simplified version, but some new functionalities are added. The present MASTAR version has been extended to account for such technological “flavors” as Bulk, FD SOI/SON and Double Gate. Each of those can in addition be equipped with so-called technology “boosters” (for definitions see further on): strained-silicon channel, metallic gate, quasi-ballistic transport, and metallic junction. Evaluation of the impact of all

² The code of MASTAR (Model for Assessment of CMOS Technologies And Roadmaps), along with instructions for use and a graphical interface, is available on request (free of charge, courtesy of STMicroelectronics) from: [email protected] or [email protected].


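Table 6.2. MASTAR model equations: Kmob, Kball, Kfield allow modeling of the effects of technology boosters

$$L_{el} = L_g - \Delta L, \qquad \Delta L = 0.8\,X_j$$
$$N_{ch} = N_B + 2N_{poches}\,\frac{\min(L_{el}, L_{poche})}{L_{el}}, \qquad N_{poches} = \frac{1}{2}\,\frac{C_{poches}}{R_p + 2\Delta R_p}$$
$$L_{poche} = (R_p + 2\Delta R_p)\sin\theta + 2\Delta R_l\cos\theta - \frac{\Delta L}{2}$$

$$2\phi_F = 2\,\frac{kT}{q}\ln\frac{N_{ch}}{n_i}, \qquad \phi_d = \frac{kT}{q}\ln\frac{N_{ext}N_{ch}}{n_i^2}$$
$$T_{dep} = \sqrt{\frac{2\varepsilon_s}{qN_{ch}}\,(2\phi_F - V_b)}$$
$$T_{ox\,el} = T_{phys}\,\frac{\varepsilon_{SiO_2}}{\varepsilon_{actual}} + \text{Dark space} + \text{Poly dep}, \qquad EOT = T_{phys}\,\frac{\varepsilon_{SiO_2}}{\varepsilon_{actual}}, \qquad C_{ox\,el} = \frac{\varepsilon_{SiO_2}}{T_{ox\,el}}$$
Dark space ≅ 2–4 Å EOT (electrons) and 3–5 Å EOT (holes)
Poly dep ≅ 4 Å EOT (n+ gate), 6 Å EOT (p+ gate)

$$SCE = 0.64\,\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\frac{T_{dep}}{L_{el}}\,\phi_d$$
$$DIBL = 0.8\,\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\frac{T_{dep}}{L_{el}}\,V_{DS}$$
$$RSCE = \frac{\sqrt{2\varepsilon_{Si}qN_B(2\phi_F - V_{BS})}}{C_{ox}}\left(\sqrt{\frac{N_{ch}}{N_B}} - 1\right)$$
$$NCE = \sigma\,\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\,\frac{T_{ox}T_{dep}}{W^2}$$
$$V_{th\infty} = V_{FB} + 2\phi_F + \frac{1}{C_{ox\,eot}}\sqrt{2\varepsilon_{Si}qN_B(2\phi_F - V_{BS})}$$
$$V_{th,off} = V_{th\infty} + RSCE - SCE - DIBL - NCE$$
$$S = \frac{kT}{q}\ln 10\left(1 + \frac{\varepsilon_s}{\varepsilon_{ox}}\frac{T_{ox\,el}}{T_{dep}}\right)$$
$$\log I_{off} = \log(I_{th}) - \frac{V_{th,off}}{S}, \qquad I_{th} = 5\times10^{-7}\,[\mathrm{A}]\;\frac{W}{L_{el}}\;8\times10^{8}\left(N_{ch}\,[\mathrm{cm^{-3}}]\right)^{-0.4865}$$

Mobilities in cm²/Vs, with Eeff in MV/cm:
$$\text{nMOS: } \mu_{ac} = 330\,E_{eff}^{-0.3}, \quad \mu_{sr} = 1450\,E_{eff}^{-2.9}; \qquad \text{pMOS: } \mu_{ac} = 90\,E_{eff}^{-0.3}, \quad \mu_{sr} = 140\,E_{eff}^{-1}$$
$$\mu_{eff} = K_{mob}\,\frac{\mu_{sr}\,\mu_{ac}}{\mu_{sr} + \mu_{ac}}$$
$$\text{nMOS: } E_{eff} = K_{field}\left[\frac{V_G + V_{th,on}}{6T_{ox\,el}} - 2\,\frac{V_{FB} + 2\phi_F}{6T_{ox\,el}}\right], \qquad \text{pMOS: } E_{eff} = K_{field}\left[\frac{V_G + 2V_{th,on}}{9T_{ox\,el}} - 3\,\frac{V_{FB} + 2\phi_F}{9T_{ox\,el}}\right]$$

$$V_{gt} = V_{gs} - V_{th,on}, \qquad V_{th,on} = V_{th,off} + \Delta$$
Δ = difference due to different definitions and due to quantization in the inversion layer (default value = 30 mV)
$$E_c = K_{ball}\,\frac{2\nu_{sat}}{\mu_{eff}}, \qquad K_B = \frac{qN_{ch}T_{dep}}{C_{ox\,el}\sqrt{2\phi_F - V_b}}, \qquad d = \frac{K_B}{2\sqrt{2\phi_F - V_b}}$$
$$V_{dsat} = \frac{1}{\dfrac{1}{L_{el}E_c} + \dfrac{1+d}{V_{gt}}}, \qquad I_{dsat0} = \frac{1}{2}\,\mu_{eff}\,C_{ox\,el}\,\frac{W}{L_{el}}\,V_{gt}\,V_{dsat}$$
$$I_{dsat} = \frac{I_{dsat0}}{1 + \dfrac{2R_sI_{dsat0}}{V_{gt}} - \dfrac{R_sI_{dsat0}}{V_{gt} + L_{el}E_c(1+d)}}$$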

combinations of these “flavors” and “boosters” is in MASTAR just a matter of a push-button action generating an immediate answer. This is considered the main advantage of MASTAR over 2D TCAD simulations.

[Figure: Ioff (A/µm, logarithmic) versus Ion (µA/µm) for Tox = 1 nm and 1.5 nm and Lg from 50 nm to 100 nm, numerical simulation and MASTAR; Vdd = 1 V, Xj = 33 nm, polydepletion with gate doping 1e20 cm-3]

Fig. 6.1. Comparison between the MASTAR model (Table 6.2) and numerical simulation [6.14]

Fig. 6.2. Literature Ioff–Ion points fall on distinct lines governed by the supply voltage and more or less independent of other parameters. These lines are well immersed within the MASTAR-calculated clouds of points corresponding to the same supply voltages and encompassing the parameter variation ranges of all literature points

The Ion model used in MASTAR is close to a standard analytical solution of the drift-diffusion transport equations that can be found in any textbook (the closest account of these equations may be found in [6.4–6.8]). The validity of these equations has been verified in numerous labs all over the world. The specificity of the Ion model used here is that we have avoided the direct modeling of the dependence of mobility on gate bias, which leads to the introduction of such parameters as µ0 (low-field mobility) and Θg (coefficient of mobility moderation due to surface roughness but also due to series resistance). These parameters do not follow any universal behavior and thus need to be extracted, which impedes and limits the predictive power of such a model. Instead, we have modeled the vertical effective field (Eeff, which results from the Poisson equation) and used the universal mobility curve [6.9,6.10] for calculating the effective mobility [6.5,6.6]. This simple modification proves very efficient in expanding the predictive power of the model. It also makes it possible to decouple the mobility model from the series


resistance, a nightmare of CAD parameter extraction. The Ioff model is our original one [6.4–6.7], calibrated and confirmed experimentally from the CMOS 1.2 µm down to the CMOS 0.1 µm generations, and beyond on research device results down to 15 nm gate length. Its foundation is the Voltage-Doping Transformation (VDT) [6.11,6.12], which allows the validity of 1D transistor models (long-channel case) to be extended to 2D short-channel cases. For more detail see the next section. To demonstrate the reliability of MASTAR, Fig. 6.1 plots Ioff–Ion predictions as modeled and as simulated (2D drift-diffusion) for MOSFETs down to 50 nm length, showing excellent agreement. Comparison with experimental data is no less successful. As shown in Fig. 6.2, all recent literature points are precisely modeled using MASTAR.

6.2.2 Voltage-Doping Transformation

As the subthreshold model used in MASTAR, in addition to being accurate, nicely explains the “good-design” rules, we will now give the basic elements of its development. It is based on the Voltage-Doping Transformation (VDT) [6.11, 6.12], a simple but powerful technique enabling any 2-D electrostatic problem to be reduced to a 1-D one. The key idea comes from the following interpretation of the rearranged 2-D Poisson equation:

$$\frac{\partial^2\Psi}{\partial x^2} = \frac{q}{\varepsilon_{Si}}\left(N_B - \frac{\varepsilon_{Si}}{q}\frac{\partial^2\Psi}{\partial y^2}\right) = \frac{qN_B^*}{\varepsilon_{Si}} \tag{6.1}$$

where the x and y directions are defined in Fig. 6.3. Any 2-D potential distribution in a region containing a charge density NB is equivalent to a 1-D potential distribution in this region containing the charge density NB*. As we will show further on, this simple observation may be of great practical utility, on condition that we are able to calculate NB*. To demonstrate this, let us focus the discussion on a MOSFET structure, as shown in Fig. 6.3. 
In application to this structure, the VDT ensures that the potential along a given vertical axis can be calculated from a 1-D Poisson equation by replacing the real doping NB with a modified (transformed) one, NB*. As the 1-D potential solution in a MOSFET structure is well known, it is sufficient just to formally replace NB with NB* in the well-known expressions. The consequence of this is even more practical, since the validity of all the models (e.g. the threshold voltage model) developed from the 1-D Poisson equation for long-channel transistors may be extended to short-channel transistors by a simple and formal replacement of the real doping by the transformed doping. As the threshold voltage (and many other important quantities) of a short-channel transistor depends on the potential of the so-called virtual cathode, we need to determine the transformed doping along the vertical axis passing through the virtual cathode, as shown in Fig. 6.3. All other regions of the transistor structure are unimportant here, and therefore VDT does not attempt to solve for the potential anywhere else but along the virtual cathode.

[Figure: MOSFET cross-section with source (S) and drain (D), panels (a) and (b); parabolic potential Ψ(y) = ay² + by + c along a field line between the boundary values φd + VSB at the source and φd + VSB + VDS at the drain; potential Ψ(x) along the x-axis]

Fig. 6.3. Potential along the y-lines (a) in the curvilinear system (b) composed of field lines y and of lines x perpendicular to them. The shaded strip is a close vicinity of the virtual cathode, the locus of potential minima along the y-lines

In a short transistor, the geometry of the potential distribution is not Cartesian, but as shown in [6.11,6.12] the Laplacian relevant to the curvilinear system shown in Fig. 6.3 reduces to a Cartesian Laplacian when the domain of the solution is reduced to a very close vicinity of the virtual cathode (shaded strip in Fig. 6.3). Let us suppose a parabolic distribution of the potential Ψ(y) = ay² + by + c in the lateral direction (y-axis), and determine the parameters a, b and c from the following boundary conditions: Ψ(source) = φd + VSB, Ψ(drain) = φd + VSB + VDS, Ψ(y = virtual cathode) = Ψ and Ψ′(y = virtual cathode) = 0, where φd is the junction built-in voltage. After some mathematics this gives (caution: since Ψ(y = virtual cathode) becomes the new solution, in the reduced domain, replacing Ψ(x, y), we use for simplicity the same notation Ψ for both):

$$\frac{\partial^2\Psi}{\partial y^2} = 2a = \frac{2\nu}{L^2} = \frac{2}{L^2}\left[V_{DS} + 2(\phi_d + V_{SB} - \Psi) + 2\sqrt{(\phi_d + V_{SB} - \Psi)(V_{DS} + \phi_d + V_{SB} - \Psi)}\right] \tag{6.2}$$

and consequently

$$N_B^* = N_B - \frac{\varepsilon_{Si}}{q}\frac{\partial^2\Psi}{\partial y^2} = N_B - \frac{2\varepsilon_{Si}}{q}\frac{\nu}{L^2} \tag{6.3}$$

Note that due to the curvilinear geometry of the solution, L is not the transistor channel length but rather the length of the given field line between source and drain. Further on we will find the relationship between L and the transistor channel length.
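As a numerical illustration of (6.2) and (6.3), the following minimal sketch evaluates ν and the transformed doping NB* for a few field-line lengths. The function names, the silicon permittivity (relative value 11.7), and the example values (NB = 1e18 cm-3, φd = 0.8 V, VSB = 0, barrier potential at three quarters of φd) are assumptions chosen for illustration only, not MASTAR code.

```python
import math

Q = 1.602e-19               # elementary charge, C
EPS_SI = 11.7 * 8.854e-14   # silicon permittivity, F/cm (relative value 11.7 assumed)


def nu(v_ds, phi_d, v_sb, psi):
    """Bracket of (6.2), in volts, with A = phi_d + V_SB - psi."""
    a = phi_d + v_sb - psi
    return v_ds + 2.0 * a + 2.0 * math.sqrt(a * (v_ds + a))


def transformed_doping(n_b, length_cm, v_ds, phi_d, v_sb, psi):
    """Transformed doping N_B* of (6.3), in cm^-3, for a field line of length L (cm)."""
    return n_b - 2.0 * EPS_SI * nu(v_ds, phi_d, v_sb, psi) / (Q * length_cm ** 2)


if __name__ == "__main__":
    n_b, phi_d, v_sb = 1e18, 0.8, 0.0      # illustrative values only
    psi = 0.75 * phi_d + v_sb              # assumed potential at the virtual cathode
    for l_nm in (500, 100, 50):
        n_star = transformed_doping(n_b, l_nm * 1e-7, 0.0, phi_d, v_sb, psi)
        print(f"L = {l_nm:3d} nm  ->  N_B* = {n_star:.2e} cm^-3")
```

The shorter the field line, the more the effective doping seen by the 1-D solution drops below NB, which is how the transformation carries the short-channel behavior into the long-channel formulas.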


All long-channel (1-D) transistor models that depend on the potential barrier height remain valid for the short-channel (2-D) case when the real doping NB is replaced in their equations by the transformed doping NB*. Therefore, let us inject the transformed doping into the classical threshold voltage equation:

$$V_{th} = V_{FB} + 2\phi_F + \frac{\sqrt{2\varepsilon_{Si}qN_B^*(\phi_d + V_{SB})}}{C_{ox}} \tag{6.4}$$

A development into a Taylor series, limited to the first-order term, gives:

$$V_{th} = V_{FB} + 2\phi_F + \frac{\sqrt{2\varepsilon_{Si}qN_B(\phi_d + V_{SB})}}{C_{ox}}\left(1 - \frac{\varepsilon_{Si}}{qN_B}\frac{\nu}{L^2}\right) = V_{th,L\to\infty} - \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L^2}\,\nu \tag{6.5}$$

Note that the correction term in 1/L² should be much smaller than 1 to keep the truncation of the Taylor series valid. And

$$T_{dep} = \sqrt{\frac{2\varepsilon_{Si}}{qN_B}(\phi_d + V_{SB})}. \tag{6.6}$$

The complete expression for ν is too complex:

$$\nu = V_{DS} + 2(\phi_d + V_{SB} - \Psi) + 2\sqrt{(\phi_d + V_{SB} - \Psi)(V_{DS} + \phi_d + V_{SB} - \Psi)} \tag{6.7}$$

so we will simplify it.

6.2.3 Short-Channel Effect (SCE)

Let us first consider the case of VDS = 0. Since, for the threshold voltage calculation, the potential value close to the interface matters more than that deep in the bulk, we will suppose an average value of potential Ψ = 3/4 φd + VSB along the y-axis (the choice of the particular value of 3/4 is not critical; many values around 3/4 can give similarly good results, but as 3/4 in addition leads to a significant simplification of the ν expression, we adopt this particular value). With this choice the ν expression reduces to just ν = φd. Then Vth may be rewritten as follows:

$$V_{th} = V_{th,L\to\infty} - SCE \tag{6.8}$$

where

$$SCE = \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}}{L}\frac{T_{dep}}{L}\,\phi_d. \tag{6.9}$$
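The intermediate steps behind (6.5) and behind ν = φd can be written out explicitly. The following is only a sketch of the algebra, using (6.3), (6.6), (6.7) and the usual definition of the oxide capacitance per unit area, Cox = εox/Tox el (assumed here, as it is not restated in the text):

```latex
\begin{align*}
\sqrt{N_B^*} &= \sqrt{N_B}\,\sqrt{1 - \frac{2\varepsilon_{Si}}{qN_B}\frac{\nu}{L^2}}
  \approx \sqrt{N_B}\left(1 - \frac{\varepsilon_{Si}}{qN_B}\frac{\nu}{L^2}\right), \\
\frac{\sqrt{2\varepsilon_{Si}qN_B(\phi_d + V_{SB})}}{C_{ox}}\,\frac{\varepsilon_{Si}}{qN_B}\frac{\nu}{L^2}
  &= \frac{qN_BT_{dep}}{C_{ox}}\,\frac{\varepsilon_{Si}}{qN_B}\frac{\nu}{L^2}
  = \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L^2}\,\nu, \\
\nu\Big|_{V_{DS}=0,\;\Psi=\frac{3}{4}\phi_d+V_{SB}}
  &= 2\cdot\frac{\phi_d}{4} + 2\sqrt{\frac{\phi_d}{4}\cdot\frac{\phi_d}{4}} = \phi_d .
\end{align*}
```

The first line is the first-order Taylor expansion of the square root, the second uses (6.6) in the form $qN_BT_{dep} = \sqrt{2\varepsilon_{Si}qN_B(\phi_d + V_{SB})}$, and the third inserts φd + VSB − Ψ = φd/4 into (6.7).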

Fig. 6.4. Schematic representation of a field line passing from Source to Drain at the depth x for a deep junction (a) and for a shallow junction (b). In the latter case, the length of the field line L* increases, thus leading to better suppression of the SCE

6.2.4 Drain-Induced Barrier Lowering (DIBL)

Now let us consider the case of VDS > 0. The mathematical treatment of this case is similar to the previous one. Let us suppose Ψ = 3/4 φd + VSB and VDS small with respect to 3/4 φd; then the expression for ν becomes ν ≅ φd + VDS. In contrast, if VDS ≫ φd, one ends up with ν ≅ VDS, which for the same reason is also well approximated by ν ≅ φd + VDS. Therefore, whatever the VDS value, ν is correctly approximated by

$$\nu \cong \phi_d + V_{DS}$$

Using this value in the Vth expression (6.5), we obtain, similarly to the case of SCE:

$$V_{th} = V_{th,L\to\infty} - \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L^2}\,\phi_d - \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L^2}\,V_{DS}, \tag{6.10}$$

which can be rewritten as follows:

$$V_{th} = V_{th,L\to\infty} - SCE - DIBL \tag{6.11}$$

where

$$DIBL = \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L^2}\,V_{DS}. \tag{6.12}$$

6.2.5 Junction Depth Effect

Careful analysis of the electric field geometry reveals that the field lines linking Source and Drain are nearly parallel to the interface, and thus shorter, in the case of a deep junction than in the case of a shallow junction. Only with infinitely deep junctions is the length of the carrier transit path³ from Source to Drain equal to the distance L between the Source and Drain junctions.

³ We are speaking of the subthreshold regime; in the above-threshold regime, the carriers are better confined to the inversion layer and thus no substantial difference should remain between the transit path length and the junction-to-junction distance.

In all other cases, we should replace L by Lel + 2δ in all expressions relative to and resulting from the VDT, where 2δ accounts for the curvature of the field lines, see Fig. 6.4. By doing so in the VDT-transformed expression for the threshold voltage, (6.5), one obtains:

$$V_{th} = V_{th,L\to\infty} - \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L^2}\,\nu \;\Rightarrow\; V_{th} = V_{th,L\to\infty} - \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{(L_{el} + 2\delta)^2}\,\nu. \tag{6.13}$$

And supposing 2δ ≪ Lel, we can simplify:

$$V_{th} = V_{th,L\to\infty} - \frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L_{el}^2}\,\nu\left(1 - 4\frac{\delta}{L_{el}}\right). \tag{6.14}$$

Precise calculation of δ is not possible, but making use of the boundary conditions as written in Fig. 6.4, we can write down the following approximate expressions:

$$\delta(x) = \frac{1}{4}\,2\pi x\,\frac{1}{1 + \dfrac{X_j}{L_{el}}} \;\Rightarrow\; \begin{cases} 0 & \text{if } X_j \to \infty \\[4pt] \dfrac{\pi}{2}\,x & \text{if } X_j \to 0. \end{cases}$$

Substituting this expression for δ in (6.14), we will develop the denominator into a Taylor series and neglect 2πx/L as small with respect to 1,⁴ thus obtaining:

$$V_{th} = V_{th,L\to\infty} - \left(1 + 2\pi x\,\frac{X_j}{L^2}\right)\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L^2}\,\nu \tag{6.15}$$

⁴ After development into the Taylor series and truncation of second- and higher-order terms, the Xj-correction term reads (1 − 2πx/L + 2πxXj/L²), which leads to a slightly unphysical behavior at small Xj (a slight increase in Vth with increasing Xj). To remove this anomaly, it is the term 2πx/L rather than 2πxXj/L² (both are smaller than 1) that is dropped. This empirical correction leads to a good comparison with experiment; nevertheless, it suggests that a more sophisticated approach to the Xj-dependence should be developed in future editions of VDT.

The best calibration to experimental data is obtained when supposing 2πx ≅ Xj, which results in:

$$V_{th} = V_{th,L\to\infty} - \left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\frac{T_{ox\,el}T_{dep}}{L_{el}^2}\,(\phi_d + V_{DS}) \tag{6.16}$$

or: Vth = Vth,L→∞ − SCE − DIBL with

$$SCE = 0.64\,\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\frac{T_{dep}}{L_{el}}\,\phi_d \tag{6.17}$$

$$DIBL = 0.80\,\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\frac{T_{dep}}{L_{el}}\,V_{DS}. \tag{6.18}$$
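Expressions (6.17) and (6.18) are simple enough to evaluate directly. The sketch below is one possible transcription, not the MASTAR implementation; the function names are hypothetical, and the permittivity ratio εSi/εox = 3 is an assumed round value. The example geometry (Lel = 90 nm, Tox el = 3 nm, Tdep = Xj = 30 nm) is chosen only because it satisfies Tox/L = 1/30 and Xj/L = Tdep/L = 1/3.

```python
def electrostatic_factor(tox_el, t_dep, x_j, l_el, eps_ratio=3.0):
    """(eps_Si/eps_ox) * (1 + Xj^2/Lel^2) * (Tox_el/Lel) * (Tdep/Lel).

    All lengths must share one unit; eps_ratio = eps_Si/eps_ox (3 is assumed).
    """
    return eps_ratio * (1.0 + (x_j / l_el) ** 2) * (tox_el / l_el) * (t_dep / l_el)


def sce(tox_el, t_dep, x_j, l_el, phi_d, eps_ratio=3.0):
    """Short-channel effect of (6.17), in volts."""
    return 0.64 * electrostatic_factor(tox_el, t_dep, x_j, l_el, eps_ratio) * phi_d


def dibl(tox_el, t_dep, x_j, l_el, v_ds, eps_ratio=3.0):
    """Drain-induced barrier lowering of (6.18), in volts."""
    return 0.80 * electrostatic_factor(tox_el, t_dep, x_j, l_el, eps_ratio) * v_ds


def vth_short(vth_long, tox_el, t_dep, x_j, l_el, phi_d, v_ds, eps_ratio=3.0):
    """Short-channel threshold voltage: Vth(L -> infinity) - SCE - DIBL."""
    return (vth_long
            - sce(tox_el, t_dep, x_j, l_el, phi_d, eps_ratio)
            - dibl(tox_el, t_dep, x_j, l_el, v_ds, eps_ratio))


if __name__ == "__main__":
    geometry = dict(tox_el=3.0, t_dep=30.0, x_j=30.0, l_el=90.0)   # nm, illustrative
    print(f"SCE  = {1e3 * sce(phi_d=0.8, **geometry):.1f} mV")
    print(f"DIBL = {1e3 * dibl(v_ds=1.2, **geometry):.1f} mV")
```

With φd = 0.8 V and VDS = 1.2 V this geometry gives roughly 19 mV of SCE and 36 mV of DIBL.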

The constants 0.64 for SCE and 0.80 for DIBL are empirical corrections resulting from the calibration of (6.17) and (6.18) to all consecutive CMOS generations from 0.7 µm down to 0.1 µm. The most recent and most aggressive research results, on devices as short as 60 nm to 20 nm, are also nicely matched by these expressions.

6.2.6 Understanding the “Good Design Rules”

Now that the subthreshold regime has been modeled, it is interesting to understand where the “good-design” rules come from. All were empirical at their origin, with the possible exception of Vth/Vdd = 1/5, which clearly arises from the expression for the Ion current (Table 6.2), where it implies a reasonable value of Vgt = Vg − Vth, needed to ensure a convenient current level. This same relation also ensures a secure noise margin. The other three rules govern what is sometimes called the “electrostatic integrity” of the device, or in other words its resistance to short-channel effects (SCE and DIBL). Our models for SCE and DIBL given by (6.17) and (6.18) are explicit functions of the ratios Xj/L, Tdep/L and Tox/L, which implies that keeping these ratios constant from one generation to the next guarantees conservation of the values of SCE and DIBL throughout the process of CMOS scaling. This was given the name of “good-design” rules. Once the values of the ratios had been established and respected, they guaranteed good control of parasitic effects throughout many consecutive CMOS generations. A simple estimation of SCE and DIBL can be made using the “good-design” rules in (6.17) and (6.18). Supposing φd = 0.8 V and VDS = 1.2 V, we obtain:

$$SCE = 0.64\,\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\frac{T_{dep}}{L_{el}}\,\phi_d = 0.64\cdot\frac{12}{4}\left(1 + \frac{1}{9}\right)\frac{1}{30}\cdot\frac{1}{3}\cdot 0.8\ \mathrm{V} \approx 20\ \mathrm{mV} \tag{6.19}$$

$$DIBL = 0.80\,\frac{\varepsilon_{Si}}{\varepsilon_{ox}}\left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\frac{T_{dep}}{L_{el}}\,V_{DS} = 0.80\cdot\frac{12}{4}\left(1 + \frac{1}{9}\right)\frac{1}{30}\cdot\frac{1}{3}\cdot 1.2\ \mathrm{V} \approx 35\ \mathrm{mV} \tag{6.20}$$

This is a very good result, but very optimistic compared with the real numbers we measure on current technologies, where SCE and DIBL can each reach as much as 100 mV. This discrepancy is not at all due to incorrect modeling but to violation of the “good-design” rules. We will discuss this point in the


next section, but before doing so, let us note another important feature of the physics described by (6.17) and (6.18). As SCE and DIBL are both products of the ratios Xj/L, Tdep/L and Tox/L, there exists another way of keeping the SCE and DIBL values constant than respecting each of the “good-design” rules separately. In fact, we can compensate a relaxation in one ratio by a more stringent value of another one, provided that their product stays constant. This was already the case in the past, when we used to adjust the channel doping empirically (thus playing with Tdep/L) to keep SCE and DIBL under control, and in this way we, more or less unconsciously, compensated for violations in other ratios. This possibility of mutual compensation will be expressed in the generalized “good-design” rules, which are better adapted for analyzing the limitations and prospects of the CMOS roadmaps. This will be discussed in Sect. 6.3.5, but just to illustrate the principle, let us suppose that from a certain generation on, the Tox scaling saturates. Then we still have the possibility of keeping SCE and DIBL constant if Xj or Tdep, or both of them, scale faster. In other terms, if the Tox/L ratio exceeds 1/30, then at least one of the ratios Xj/L and Tdep/L needs to be smaller than 1/3.
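The compensation principle can be made concrete with a few lines of arithmetic. The sketch below keeps the geometry-dependent part of (6.17) and (6.18) at the value it takes under the nominal rules (Tox/L = 1/30, Xj/L = Tdep/L = 1/3) and solves for the Tdep/L needed when Tox/L is relaxed. The function names and the example of Tox/L = 1/20 are hypothetical and serve only as an illustration.

```python
def geometry_product(tox_over_l, tdep_over_l, xj_over_l):
    """Geometry-dependent part of (6.17)/(6.18): (1 + (Xj/L)^2) * (Tox/L) * (Tdep/L)."""
    return (1.0 + xj_over_l ** 2) * tox_over_l * tdep_over_l


# Value of the product when all three nominal "good-design" rules are just met.
REFERENCE = geometry_product(1.0 / 30.0, 1.0 / 3.0, 1.0 / 3.0)


def required_tdep_over_l(tox_over_l, xj_over_l, target=REFERENCE):
    """Tdep/L that keeps the product at `target` for the given Tox/L and Xj/L."""
    return target / ((1.0 + xj_over_l ** 2) * tox_over_l)


if __name__ == "__main__":
    # Hypothetical case: Tox scaling saturates so that Tox/L relaxes from 1/30 to 1/20.
    needed = required_tdep_over_l(1.0 / 20.0, 1.0 / 3.0)
    print(f"Tdep/L must tighten from {1.0 / 3.0:.3f} to {needed:.3f}")
```

In this example the depletion depth ratio has to tighten from 1/3 to 2/9 to leave SCE and DIBL unchanged, which in practice means a higher channel doping.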

6.3 Limitations of Conventional Scaling

Each limitation impeding the scaling of Xj and Tox in the same proportion as L, as well as each limitation impeding the scaling of Vth in proportion with Vdd, potentially leads to violation of the “good-design” rules. In some cases the situation may be even worse: a given “good-design” rule may be violated even if the scaling laws are respected. In this section we will discuss some such cases.

6.3.1 Limitations Menacing the Vth/Vdd Scaling

Fulfilling the “good-design” rules relative to Xj/L, Tox el/L and Tdep/L seems to imply automatically fulfilling the one related to Vth/Vdd. Supposing that (VFB + 2φF), SCE, DIBL ≈ 0, we obtain according to (6.5):

$$V_{th} = V_{FB} + 2\phi_F + \frac{\sqrt{2\varepsilon_{Si}qN_B(\phi_d + V_{SB})}}{C_{ox}} - SCE - DIBL \approx \frac{qN_B}{\varepsilon_{ox}}\,L_{el}^2\,\frac{T_{dep}}{L_{el}}\frac{T_{ox\,el}}{L_{el}}.$$

This shows that if Tdep/Lel and Tox el/Lel are constant, Vth scales as k, similarly to Vdd (if NB scales as ×1/k and L as ×k), thus keeping the Vth/Vdd ratio constant. This is, however, misleading for two reasons: (i) NB scaling as ×1/k is incompatible with the Tdep/Lel design rule, since Tdep scales as $\times 1/N_B^{0.5}$ [see (6.6)], thus causing Tdep/Lel to scale as $\times 1/k^{0.5}$ instead of staying constant; (ii) the decreasing value of Vth leads to increasing Ioff.
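The conflict between the doping scaling and the design rules can be checked with the proportionalities alone. The following sketch assumes only what the text states: L, Tox and Vdd scale as k, Tdep varies as NB to the power −0.5 (from (6.6) at fixed φd + VSB), and Vth is roughly proportional to the square root of NB times Tox. The function name and the dictionary keys are hypothetical.

```python
def one_generation(k, doping_exponent):
    """Relative change after one generation when N_B scales as k**(-doping_exponent).

    All returned values are ratios to the previous generation.
    """
    n_b = k ** (-doping_exponent)
    length = k                    # L scales as k
    t_ox = k                      # Tox scales as k
    v_dd = k                      # Vdd scales as k
    t_dep = n_b ** -0.5           # Tdep ~ N_B^-0.5, from (6.6)
    v_th = n_b ** 0.5 * t_ox      # Vth ~ N_B^0.5 * Tox (body term only)
    return {"Tdep/L": t_dep / length, "Vth": v_th, "Vth/Vdd": v_th / v_dd}


if __name__ == "__main__":
    for exponent in (1, 2):
        result = one_generation(0.7, exponent)
        print(f"N_B ~ 1/k^{exponent}: "
              + ", ".join(f"{name} x{value:.2f}" for name, value in result.items()))
```

With k = 0.7, doping scaled as 1/k lets Tdep/L grow by about 1.2 per generation while Vth falls, whereas doping scaled as 1/k² holds Tdep/L and Vth fixed at the price of Vth/Vdd growing by 1/k per generation.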


The increase in Ioff had been acceptable as long as the Ioff value was negligible. But the continuous decrease in Vth eventually brought Ioff to very high values, implying the necessity of a faster than ×1/k scaling of the channel doping⁵ (close to $\times 1/k^2$) in order to keep Vth almost constant. This in turn violates the Vth/Vdd “good-design” rule. We are clearly in a conflicting situation; let us understand the physical reason for this. The real physical reason lies in the invariance of the subthreshold slope:

$$S = \frac{kT}{q}\ln 10\left(1 + \frac{C_{dep}}{C_{ox}} + \frac{C_{ss}}{C_{ox}}\right) \approx \frac{kT}{q}\ln 10\left(1 + \frac{\varepsilon_{si}}{\varepsilon_{ox}}\frac{T_{ox\,el}}{T_{dep}}\right) \approx 63\ \mathrm{mV/dec}\ @\ 300\ \mathrm{K}.$$

Indeed, if according to the “good-design” rules the ratios Tox el/L = 1/30 and Tdep/L = 1/3 are constant, then Tox el/Tdep is also constant and equals 1/10, thus making S an invariant of scaling. Consequently, Vth also needs to become an invariant (by choice) in order to prevent an increase in Ioff, since the latter scales with the ratio Vth/S:

$$\log I_{off} \approx \log\left(5\times10^{-7}\,[\mathrm{A}]\,\frac{W_{el}}{L_{el}}\right) - \frac{V_{th}}{S}.$$

Figure 6.5 compares two scaling scenarios for Vdd, Vth and Ioff. As we can see, the Vth scaling has to be levelled off in order to stop the rise in Ioff. Unfortunately, this same levelling-off has dramatic consequences for Ion, which drops due to the decreasing (Vdd − Vth) value. The faster than ×1/k scaling of Nchannel also contributes to the reduction in Ion via mobility degradation; we will discuss this in Sect. 6.4.

6.3.2 Limitations Menacing the Tox el/L Scaling

One of the major issues that scaling encounters nowadays concerns the gate oxide. The thickness of the gate oxides being tested on R&D devices is already of the order of a few atomic layers, raising serious concerns not only about through-the-oxide leakage and oxide reliability but also about the feasibility of any further thinning. The question that may be raised is: why should we scale down Tox so aggressively? The answer comes from the observation that Tox is the only one among the parameters occurring in the “good-design” rules whose reduction leads to an advantageous evolution of both SCE/DIBL and Ion. Therefore, the motivation for aggressive Tox down-scaling is actually justified. In fact, the problem is that in the future an even more aggressive Tox scaling may be needed, due to a decreasing contribution of the physical oxide thickness Tox ph to Tox el.

⁵ In practice this faster scaling of Nchannel is realized by means of tilted implants (called pocket or halo implants) that increase the mean channel doping in short devices without altering the mean channel doping of the long ones.

[Figure: two panels (a) and (b) plotting Vth and Vdd (V, left axis) and Ioff (nA/µm, logarithmic right axis) versus CMOS node, with curves for Vdd (×0.7 per node), Vth (= Vdd/5 in (a), = 0.2 V in (b)), (Vdd − Vth)/L and Ioff]

Fig. 6.5. Comparison of two scaling scenarios: (a) Vth = Vdd/5, leading to a potential preservation of Ion but to an explosion of Ioff; and (b) Vth = constant, leading to a collapse of Ion but to conservation of Ioff. Ion behavior is represented by the variation of (Vdd − Vth)/L, in arbitrary units
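The two scenarios of Fig. 6.5 can be sketched numerically from the off-current expression log Ioff ≈ log(5×10⁻⁷ A × W/L) − Vth/S and the Ion proxy (Vdd − Vth)/L. The code below is an illustration under stated assumptions, not a reproduction of the figure: the function names are hypothetical, the starting node (Vdd = 1.8 V, L = 180 nm) and the width of 1 µm are arbitrary choices, and S defaults to the 63 mV/dec quoted above.

```python
def ioff(vth, w_over_l, s=0.063, i_th0=5e-7):
    """Off-current in amperes: i_th0 * (W/L) * 10**(-Vth/S), with Vth and S in volts."""
    return i_th0 * w_over_l * 10.0 ** (-vth / s)


def scenario(vdd0, l0_nm, n_nodes, k=0.7, fixed_vth=None):
    """Rows of (L in nm, Vdd, Vth, (Vdd - Vth)/L, Ioff per 1 um of width) for each node.

    Vdd and L scale as k per node; Vth is Vdd/5 unless a fixed value is given.
    """
    rows = []
    for i in range(n_nodes):
        vdd = vdd0 * k ** i
        l_nm = l0_nm * k ** i
        vth = fixed_vth if fixed_vth is not None else vdd / 5.0
        rows.append((l_nm, vdd, vth, (vdd - vth) / l_nm, ioff(vth, 1000.0 / l_nm)))
    return rows


if __name__ == "__main__":
    for label, rows in (("(a) Vth = Vdd/5", scenario(1.8, 180.0, 6)),
                        ("(b) Vth = 0.2 V", scenario(1.8, 180.0, 6, fixed_vth=0.2))):
        print(label)
        for l_nm, vdd, vth, ion_proxy, i_off in rows:
            print(f"  L = {l_nm:6.1f} nm  Vdd = {vdd:.2f} V  Vth = {vth:.3f} V  "
                  f"(Vdd-Vth)/L = {ion_proxy:.4f}  Ioff = {i_off:.2e} A/um")
```

In scenario (a) the Ion proxy stays constant while Ioff rises by orders of magnitude; in scenario (b) Ioff grows only through the W/L factor while the Ion proxy collapses as Vdd approaches Vth.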

On the scale of 100 Å of physical oxide thickness, the polydepletion and quantum effects were negligible; now that oxides as thin as 10 Å are being considered, these effects produce an increasing difference between Tox ph and Tox el. As illustrated in Fig. 6.6, the polydepletion effect produces a carrier-depleted layer at the polysilicon-oxide interface, which acts as a gate dielectric adding to the physical oxide thickness. The exact value of the polydepletion thickness can be calculated according to the following formula:

$$T_{dep}^{poly} = \sqrt{\frac{2\varepsilon_{Si}\,\phi_p}{qN_{poly}}} \quad\text{where:}\quad \phi_p = \left[-\frac{k_p}{2} + \sqrt{\frac{k_p^2}{4} + (V_G - V_{FB} - 2\phi_F)}\,\right]^2 \quad\text{and}\quad k_p = \frac{\sqrt{2q\varepsilon_{Si}N_{poly}}}{C_{ox}} \tag{6.21}$$

[Figure: free-electron distributions across an N+ poly-gate / SiO2 / P-type Si substrate stack, in the semi-classical (SC) and quantum-mechanical (QM) pictures, for the gate in accumulation and in depletion and for the channel; the polydepletion region and the dark spaces in the gate and in the channel are marked]

Fig. 6.6. Illustration of free carrier distribution in an N+-Poly/Oxide/P-Substrate system under two hypotheses: SC (semi-classical) and QM (quantum-mechanical)

suggesting an increase in polydepletion when down-scaling the oxide thickness. However, when taking into account that Tox scaling is accompanied by Vdd scaling, an almost constant polydepletion [6.15] is obtained, depending only on the active gate doping level (Fig. 6.7). Supposing the latter is $10^{20}$ at/cm³ for N+ poly and $5 \times 10^{19}$ at/cm³ for P+ poly, we obtain constant polydepletion thicknesses of 12 Å and 18 Å, respectively. These numbers can be recalculated into equivalent oxide thickness (EOT), giving 4 Å and 6 Å of EOT for N+ and P+ gates, respectively. The first-order approach to taking the polydepletion effect into account is to consider these numbers as increments adding to the physical oxide thickness, as is done in the MASTAR model.

On the other side of the oxide, a vanishing probability of finding electrons at the Si-SiO2 interface leads to the creation of a kind of “dark space” or “quantum depletion” region within the channel (Fig. 6.6). Consequently, the inversion-electron distribution (resulting from the Schrödinger/Poisson equations) shows a barycentre at a certain distance from the interface, in contrast with the classical case, which shows a maximum at the interface. The thickness of the dark space varies with channel doping and with gate bias, but 9–12 Å for electrons and 15–18 Å for holes can be considered practical ranges of variation [6.16]. For simplicity, we will recalculate the dark space thicknesses for electrons and holes into additional equivalent oxide thicknesses (EOT). Taking into account the factor of 3 between the dielectric constants of silicon and SiO2, and referring to dark space values relevant to relatively high doping, 3 Å for electrons and 5 Å for holes seem to be representative numbers.

It is quite instructive to analyse the oxide thickness scaling from 10 Å to 7 Å (usual scaling factor of 0.7), see Table 6.3. Due to the invariability of polydepletion and of the dark space, this scaling exercise with scaling factor 0.7 leads to effective scaling factors of 0.82 for electrons and 0.85 for holes. Therefore, if we wished to catch up with the scaling factor 0.7, the physical oxide thickness would have to be smaller than that resulting from ×0.7 scaling. This would require 5 Å for NMOS and 4 Å for PMOS, instead of 7 Å for both.

[Figure: EOT of polydepletion (Å, 0 to 20) versus CMOS-relevant Tox (Å, 0 to 40) for NMOS with Npoly = 1e20 cm-3; curves Tp (EOT) at 1.8 V, 1.3 V, 1.0 V, 0.7 V, 0.5 V, 0.35 V and 0.25 V, and the trajectory for Vdd scaled with Tox]

Fig. 6.7. Polydepletion thickness as a function of gate oxide thickness, recalculated into an equivalent oxide thickening (EOT). This EOT increases with oxide thinning, but decreases when the supply voltage is reduced. The net result of these two effects is an almost constant value of polydepletion EOT, equal to 4 Å for N+ doped gates and 6 Å for P+ doped gates [6.15]

Table 6.3. Physical oxide scaling (×0.7) from 10 Å to 7 Å corresponds to an electrical thickness ratio of merely 0.82 to 0.85 due to polydepletion and dark space

Physical thickness [Å]                                          10       7
Polydepletion [Å]                                               4–6      4–6
Dark space [Å]                                                  3–4      3–4
Total EOT [Å] (NMOS–PMOS)                                       17–20    14–17
Physical scaling ratio                                          0.7
Electrical scaling ratio                                        0.82–0.85
Physical thickness needed for catching up with ×0.7 scaling [Å] 5–4

Such an aggressive physical oxide scaling is both technically extremely difficult (controllability) and electrically very complex, since it involves serious degradation of reliability and isolation properties. As the reliability issues are analysed in depth in Chaps. 4 and 5, here we will focus on the issues relevant to the degraded isolation properties of very thin oxides. As shown in Fig. 6.8, the gate leakage may attain the level of 1000 A/cm² already for oxide thicknesses of

the order of 10 Å.

[Figure: gate current density Ig (A/cm², logarithmic from 1E-07 to 1E+07) versus Ugate (0 to 1.4 V) for oxide thicknesses from 0.4 nm to 2.2 nm; data points and analytical model Ig[A/cm2] = 1.44e5*Exp(-4.02*Ug[V]^2+13.05*Ug[V])*Exp(-1.17*Tox[Å])]

Fig. 6.8. Experimental data from [6.17,6.18] fitted with MASTAR expression (6.22)

The data shown in Fig. 6.8 originate from [6.17,6.18] and can be nicely fitted with the following simple expression, which is implemented in MASTAR:

$$I_{gate}\,[\mathrm{A/cm^2}] = 1.44\times10^{5}\,\exp\!\left(-4.02\,V_{gate}^2\,[\mathrm{V}] + 13.05\,V_{gate}\,[\mathrm{V}]\right)\exp\!\left(-1.17\,T_{ox}\,[\text{Å}]\right). \tag{6.22}$$
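Expression (6.22) and the thickness bookkeeping of Table 6.3 can be combined in a short sketch to show what "catching up" with ×0.7 electrical scaling costs in leakage. The function names are hypothetical; the NMOS increments (4 Å of polydepletion, 3 Å of dark space) and the PMOS increments (6 Å and 4 Å) are the values used in Table 6.3, and the gate bias of 1 V is an arbitrary example.

```python
import math


def gate_leakage(v_gate, tox_angstrom):
    """Gate current density in A/cm^2 from the fit (6.22); Vgate in V, Tox in angstrom."""
    return (1.44e5
            * math.exp(-4.02 * v_gate ** 2 + 13.05 * v_gate)
            * math.exp(-1.17 * tox_angstrom))


def electrical_thickness(t_phys, polydepletion, dark_space):
    """First-order Tox_el in angstrom: physical thickness plus the two EOT increments."""
    return t_phys + polydepletion + dark_space


def physical_thickness_needed(t_phys, polydepletion, dark_space, k=0.7):
    """Physical thickness (angstrom) that scales the electrical thickness by k."""
    target = k * electrical_thickness(t_phys, polydepletion, dark_space)
    return target - polydepletion - dark_space


if __name__ == "__main__":
    for name, poly, dark in (("NMOS", 4.0, 3.0), ("PMOS", 6.0, 4.0)):
        ratio = electrical_thickness(7.0, poly, dark) / electrical_thickness(10.0, poly, dark)
        needed = physical_thickness_needed(10.0, poly, dark)
        print(f"{name}: electrical scaling ratio {ratio:.2f}, "
              f"physical thickness needed {needed:.1f} A")
    for tox in (10.0, 7.0, 5.0):
        print(f"Tox = {tox:4.1f} A  ->  Igate = {gate_leakage(1.0, tox):.2e} A/cm^2 at 1 V")
```

Since the fit is exponential in Tox with a slope of 1.17 per angstrom, every 3 Å removed from the physical oxide multiplies the leakage by about 33 at fixed gate bias.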

Using the above expression we can make a rough prediction of when H–K dielectrics will have to be introduced.⁶ Let us suppose that the criterion for their introduction is the equality of the gate leakage in the off-state and of the source off-state leakage. The main gate leakage in the off-state takes place through the gate-to-drain overlap, the area of which can be estimated as W × 0.4Xj.⁷ Then, supposing that W, Xj, Vg and Tox scale as ×0.7, we can establish a rough estimate of the Igate scaling scenario starting from a given known technological node. In Fig. 6.9, such scenarios are calculated for LSTP (low stand-by power), LOP (low operating power) and HP (high performance) products, taking as the starting point the data relevant to the 130 nm CMOS node. Just to give an example, let us suppose that the tolerable Ioff current is 10 pA/µm, 10 nA/µm and 1 µA/µm for LSTP, LOP and HP products, respectively. We can read from Fig. 6.9 that with the Ioff limits so specified, the introduction of H–K dielectrics⁸ is required between the 90 nm and 65 nm nodes for LSTP, and between the 65 nm and 45 nm nodes for LOP. Concerning the HP products, the doubt resides in the uncertainty about the increase of the gate leakage due to the high temperature of operation, but it may happen that the HP products will be the least dependent on HK dielectrics (high-K, or in other words dielectrics with high dielectric constants).

⁶ These predictions are aimed at illustrating the methodology of gate leakage analysis with MASTAR rather than at exact predictions of the HK introduction timing.

⁷ The factor 0.4 results from fitting to our most recent technological results; slightly different factors may be obtained for each specific technology.

⁸ We suppose here that oxynitrides are able to reduce the gate leakage by up to ×10 compared with pure SiO2, and thus that a reduction in gate leakage greater than ×10 is only possible with H–K dielectrics.

Fig. 6.9. Predictions of gate leakage in the OFF-state (only the gate-to-drain overlap leaking) calculated with geometrical data from ITRS 2001 [6.3] using the MASTAR Igate model (6.22). For LSTP and LOP the calculated Igate has been divided by 10 to account for the improvement due to oxide nitridation techniques. For HP the doubt resides in whether the improvement due to nitridation should be compensated by the degradation due to the high temperature of operation. In any case these predictions are aimed at illustrating the methodology of gate leakage analysis with MASTAR rather than at exact predictions of the HK introduction timing. The maximal tolerable Igate currents are arbitrarily set to 10 pA/µm, 10 nA/µm and 1 µA/µm for LSTP, LOP and HP products, respectively

Coming back to the Tox scaling, the polydepletion and the dark space contribute to violation of the Tox/L “good-design” rule, since:

6 Optimal Scaling Methodologies and Transistor Performance

161

0.18 Tox_e l/ Le l

0.16 0.14 Tox/ L

0.12 0.1

Tox_e l/ Lgate

0.08 0.06

Tox_e l/ Lmask

0.04 0.02

Tox/ Lmask

0 0

50 100 CMOS node

150

Fig. 6.10. Tox /L “good-design” rule plotted according to Table 6.1, that projects transistor scaling down to 25 nm node. The plots may be very different depending on whether physical or electrical values are taken for Tox and L

Tox

el

=

εSiO2 poly εSiO2 T + Tdark εSi εSi dep

space

.

(6.23)
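A short numerical sketch of (6.23) follows. It assumes that the electrical thickness is the physical oxide thickness plus the polydepletion and dark-space layers, each weighted by εSiO2/εSi. The function name and the example thicknesses are hypothetical; the permittivities are the usual textbook values (3.9 for SiO2, 11.7 for Si).

```python
EPS_SIO2 = 3.9   # relative permittivity of SiO2 (textbook value)
EPS_SI = 11.7    # relative permittivity of Si


def electrical_tox(t_ox, t_poly_dep, t_dark_space):
    """Electrical oxide thickness, in the same length unit as the inputs.

    The silicon layers (poly depletion, dark space) are converted to their
    SiO2-equivalent thickness through the permittivity ratio.
    """
    ratio = EPS_SIO2 / EPS_SI
    return t_ox + ratio * t_poly_dep + ratio * t_dark_space


if __name__ == "__main__":
    # Hypothetical example in angstroms: 12 A oxide, 12 A poly depletion,
    # 9 A dark space.
    print(electrical_tox(12.0, 12.0, 9.0))
```

Because the permittivity ratio is about 1/3, a few ångströms of depleted silicon on each side of the oxide add up to a significant fraction of a 10–15 Å oxide.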

It should be noted that this is not the only reason behind the Tox/L increase. In the past we cared little about the difference between Lmask (corresponding to the technology feature size on a nominal device), Lgate and the actual channel length Lel (also called the electrical, effective or metallurgical channel length). Today, however, Lgate is smaller than Lmask (for HP products Lgate ≈ 0.5Lmask), and Lel is always smaller than Lgate:

\[ L_{el} = L_{gate} - 0.8\,X_j = L_{gate} - 0.8 \cdot \tfrac{1}{3} L_{gate} \;\Rightarrow\; L_{el} \approx 0.73\,L_{gate} \approx 0.37\,L_{mask} \]

We assumed here that Xj = Lgate/3, which is common practice in the ITRS Roadmaps, although originally this “good-design” rule used to apply to Lmask. Therefore, the ratio Tox_el/L takes very different values depending on whether it is referenced to Lmask, Lgate or Lel. All of these cases are plotted in Fig. 6.10. Of course, for controlling the SCE and DIBL, only the ratio Tox_el/Lel is significant, meaning that we violate the “good-design” rule considerably, by up to a factor of 5 (last considered node in Fig. 6.10). If no compensation comes from the other ratios (good-design rules), an increase in SCE and DIBL in the same proportion is to be expected.

6.3.3 Limitations Menacing the Xj/L Scaling

Reduction in Xj has a detrimental effect on the series resistance Rs of the junction, see Fig. 6.11. A continuous improvement of this compromise is being accomplished thanks to the introduction of new junction technologies, but in reality the junction depth has for a long time been somewhat relaxed compared to the requirements. This has sometimes been an implicit result of technology optimisation, where the detrimental effect of an overly deep extension could be compensated by somewhat higher channel doping, thus ensuring:

\[ \frac{X_j}{L} \times \frac{T_{dep}}{L} \approx \mathrm{const} \;\Rightarrow\; \mathrm{SCE},\ \mathrm{DIBL} \approx \mathrm{const} \]

The opposite strategy, which consists in allowing a shallower but more resistive extension, leads to a strong penalty in terms of reduced Ion current and cannot be easily compensated. It should be noted, however, that deeper junctions can no longer be tolerated, since the increase in channel doping has been brought to a level that threatens junction breakdown, or at least severe junction leakage.

[Figure 6.11: sheet resistance (ohms/sq, 0 to 2000) versus junction depth at 1E18 B/cm³ (nm, 0 to 100). Plot title: Universal Rs/Xj tradeoff of implanted and thermal annealed boron. Series: B+ RTA, B+ Spike, BF2+ RTA, BF2+ Spike, GILD or LTA, SPE. Specification boxes: 50-nm node, 70-nm node, 100-nm node]

Fig. 6.11. Series resistance versus junction depth trade-off (both for P-type extensions) reported in recent literature [6.19] for some innovative technologies (RTA, spike anneal, LTA, SPE), compared with specifications (boxes) according to ITRS 1999 [6.2]

Another source of deviation from the Xj/L “good-design” value of 1/3 lies in the growing difference between Lmask, Lgate and Lel. All hypotheses are plotted in Fig. 6.12 and, as before, the most unfavourable one (Xj/Lel) is the most relevant if the suppression of SCE and DIBL is considered. Today, the discrepancy is already reduced, since in the ITRS the Xj/L = 1/3 rule is no longer referred to Lmask but rather to Lgate, which nevertheless still leads to Xj/Lel being larger than 1/3 (actually equal to 0.45).
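The ratios quoted in this section follow from simple arithmetic, which the sketch below reproduces. It uses the relations Lel = Lgate − 0.8Xj, Xj = Lgate/3 and Lgate ≈ 0.5Lmask from the text; the function name and the 100 nm mask length are hypothetical.

```python
def channel_length_ratios(l_mask, gate_to_mask=0.5, xj_to_gate=1.0 / 3.0):
    """Return Lgate, Xj, Lel and the ratios discussed in the text.

    All lengths are in the same unit as l_mask.
    """
    l_gate = gate_to_mask * l_mask
    xj = xj_to_gate * l_gate
    l_el = l_gate - 0.8 * xj          # electrical (effective) channel length
    return {
        "l_gate": l_gate,
        "xj": xj,
        "l_el": l_el,
        "l_el_over_l_gate": l_el / l_gate,
        "l_el_over_l_mask": l_el / l_mask,
        "xj_over_l_el": xj / l_el,
    }


if __name__ == "__main__":
    ratios = channel_length_ratios(100.0)  # hypothetical 100 nm mask length
    for name, value in ratios.items():
        print(f"{name}: {value:.3f}")
```

The output confirms Lel ≈ 0.73Lgate ≈ 0.37Lmask and Xj/Lel ≈ 0.45 when the 1/3 rule is referred to Lgate.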

[Figure 6.12: Xj/L (0 to 1.6) versus CMOS node (0 to 150); curves: Xj/Lel, Xj/Lgate, Xj/Lmask = 1/3]

Fig. 6.12. The Xj/L “good-design” rule with L understood as Lmask implies very large values of Xj/Lgate and Xj/Lelec. Nowadays, in the ITRS Roadmaps, this “good-design” rule refers rather to Xj/Lgate, which nevertheless leads to Xj/Lel values largely exceeding 1/3. Note that Xj/Lel is the most relevant for the electrostatic integrity of a device

6.3.4 Limitations Menacing the Tdep/L Scaling

Traditionally, Tdep is the lever used to compensate for insufficient scaling of the other parameters. We have seen in Sects. 6.3.1 and 6.3.3 that scaling of the channel doping faster than ×1/k is necessary for keeping, respectively, Vth and SCE constant. Assuming ×1/k² scaling of the doping leads to Tdep scaling as ×k, and thus seemingly to conservation of the Tdep/L rule. Nevertheless, due to the growing discrepancy between Lmask, Lgate and Lel, this “good-design” rule is also violated, as shown in Fig. 6.13.

6.3.5 Impact on the Roadmap

The MASTAR expressions for SCE and DIBL may be considered as a kind of generalized “good-design” rules. We are accustomed today to tolerating higher values of SCE and DIBL than those that (6.19) and (6.20) yield under the original “good-design” rules. For a technology supplied with 1 V, 50 mV to 100 mV of SCE and as much of DIBL is not unusual. For simplicity, let us assume that the tolerable values of SCE and DIBL should be small fractions, say less than 10%, of Φd and Vds, respectively. This leads to the following conditions (generalized “good-design” rules):

\[ \frac{SCE}{\Phi_d},\ \frac{DIBL}{V_{ds}} \le 10\% \;\Rightarrow\; EI \equiv \left(1 + \frac{X_j^2}{L_{el}^2}\right) \frac{T_{ox,el}}{L_{el}}\, \frac{T_{dep}}{L_{el}} \le \frac{1}{25}; \quad \text{and} \quad \frac{V_{th}}{V_{dd}} \le \frac{1}{5} \qquad (6.24) \]

[Figure 6.13: Tdep/L (0 to 1.2) versus CMOS node (0 to 150); curves: Tdep/Lel, Tdep/Lgate, Tdep/Lmask; reference level 1/3]

Fig. 6.13. Tdep/L “good-design” rule plotted according to the ITRS 2001 [6.3] with L understood as Lmask, Lgate and Lelec

[Figure 6.14: EI, SCE, DIBL (V) (0 to 0.35) versus CMOS node (0 to 140); curves: SCE, EI, DIBL, ideal SCE, ideal EI, ideal DIBL]

Fig. 6.14. Prediction of EI, SCE and DIBL according to conventional scaling for a planar MOSFET. EI is calculated according to the Tox_el/Lel, Xj/Lel and Tdep/Lel ratios, as estimated in Fig. 6.10, Fig. 6.12 and Fig. 6.13

The first advantage of this new formulation is that all three “good-design” rules ensuring good electrostatic integrity (EI) of the device are now combined in a single expression, thus showing explicitly the possibility of mutual compensation. Another advantage is that we now have a direct correspondence between the “good-design” rule for electrostatic integrity (for simplicity we will call this rule “EI”, see (6.24) for the exact definition of EI) and the real values of SCE and DIBL, since according to (6.19) and (6.20):

\[ SCE \approx 2.0 \times \Phi_d \times EI, \qquad DIBL \approx 2.5 \times V_{ds} \times EI \qquad (6.25) \]
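The following sketch evaluates EI and the resulting SCE and DIBL from (6.25). It assumes that (6.24) defines EI as the product (1 + Xj²/Lel²) × (Tox_el/Lel) × (Tdep/Lel); the function names and all device values in the example are hypothetical.

```python
def electrostatic_integrity(xj, tox_el, tdep, l_el):
    """EI, read from (6.24); all lengths in the same unit."""
    return (1.0 + (xj / l_el) ** 2) * (tox_el / l_el) * (tdep / l_el)


def sce_and_dibl(ei, phi_d, v_ds):
    """SCE and DIBL in volts, following (6.25)."""
    return 2.0 * phi_d * ei, 2.5 * v_ds * ei


if __name__ == "__main__":
    # Hypothetical device: Lel = 30 nm, Xj = 15 nm, Tox_el = 2 nm, Tdep = 12 nm.
    ei = electrostatic_integrity(xj=15.0, tox_el=2.0, tdep=12.0, l_el=30.0)
    sce, dibl = sce_and_dibl(ei, phi_d=1.0, v_ds=1.0)  # assumed 1 V for both
    print(f"EI = {ei:.4f} (limit 1/25 = {1 / 25:.4f})")
    print(f"SCE = {sce * 1e3:.1f} mV, DIBL = {dibl * 1e3:.1f} mV")
```

In this form the possibility of mutual compensation is easy to explore: a larger Xj/Lel can be offset by a smaller Tdep/Lel or Tox_el/Lel as long as the product stays below 1/25.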

[Figure 6.15: HP CMOS Roadmaps. Ioff (nA/µm, 1.E-01 to 1.E+05) versus Ion (µA/µm, 0 to 1500). Series: ITRS'99 as required, ITRS'99 as modeled, ITRS'01 as required, ITRS'01 as modeled. Technology Improvement Factors (TIF) marked on the plot: ITRS'99: 70 nm (×1.77), 50 nm (×2.8), 35 nm (×6.5), 25 nm (×70); ITRS'01: 45 nm (×1.26), 32 nm (×1.62), 22 nm (×2.12)]

Fig. 6.15. Impact of the violations of “good-design” rules on the CMOS roadmaps (ITRS 1999 and ITRS 2001 HP branches taken as examples)

Figure 6.14 shows predictions of EI, SCE and DIBL according to conventional scaling for a planar MOSFET. EI is calculated according to the Tox_el/Lel, Xj/Lel and Tdep/Lel ratios, as estimated in Fig. 6.10, Fig. 6.12 and Fig. 6.13. Both Ion and Ioff suffer from the violation of the “good-design” rules. In Fig. 6.15, we have plotted the MASTAR predictions for the HP branch of the ITRS 1999 and ITRS 2001 Roadmaps for Ioff–Ion. The comparison with the Roadmap specifications reveals a large discrepancy, up to a factor of 70 for the ITRS 1999, but no more than a factor of 2.12 for the ITRS 2001. In the next section we will show how these discrepancies can be further reduced and eventually eliminated.

6.4 Extending Validity of Moore’s Law

As long as the “good-design” rules are respected, scaling can be pursued and Moore’s laws have every chance of remaining valid. Let us then analyse solutions for restoring the “good-design” rules.

6.4.1 Strategies Based on Increased Gate Drive (Vdd–Vth)

The converging Vdd and Vth problem is difficult to remedy without causing additional power consumption. We will consider in the following two points two strategies: (i) relaxed Vdd scaling, mainly leading to increased dynamic power dissipation; and (ii) reduced Vth or, in other words, relaxed Ioff scaling, mainly leading to increased static power consumption.

Relaxed Vdd Scaling

[Figure 6.16: (a) Ioff (nA/µm, 1 to 10000) versus Ion (µA/µm, 500 to 2000) for CMOS 130 nm, 90 nm, 65 nm, 45 nm, 32 nm and 22 nm; series: Bulk, Bulk with Vdd 1.5X, ITRS 2001. (b) CV/I (ps, 0 to 2) versus CMOS node (nm, 0 to 150); series: Bulk, Bulk with Vdd 1.5X, 17%/y of improvement]

Fig. 6.16. Examination of the relaxed-Vdd scaling strategy. A large expense in dynamic power is needed to produce a modest improvement in intrinsic speed; this strategy is therefore more beneficial for heavily loaded circuits

As seen in Fig. 6.16a, two regimes are present in the CMOS Ioff–Ion plot: (i) a roughly vertical part, congruent with the ITRS 2001 requirements and covering generations from 130 nm to 65 nm, and (ii) a rapidly diverging part for the 45 nm generation and beyond. This second regime is due to insufficient

gate drive (Vdd–Vth) resulting from the violation of the Vth/Vdd = 1/5 “good-design” rule. Of course, it can be rectified by means of relaxed Vdd scaling. As this simple strategy is, however, expensive in terms of power dissipation (Pdyn ∝ C·Vdd²), it is not likely to become a general remedy, especially in battery-operated devices. As can be inferred from Fig. 6.16a, Pdyn would increase by up to a factor of 3 (last CMOS node) if Vdd were readjusted so as to satisfy the Ion specifications. It is interesting to realize that such a strong Vdd increase, needed to catch up with the ITRS specifications, is due to the adverse effect of Vdd on mobility (through the Vg/(6Tox) term in Eeff, see the equations in Table 6.2), which spoils a part of the benefit of the Vdd increase. It is also interesting to realize that the benefit of this strategy for the intrinsic speed of the device is not huge. This is not surprising, as the increase in Ion is to a certain extent compensated by the increase in Vdd itself. As shown in Fig. 6.16b, the CV/I improvement permits sticking to the traditional 17%/year improvement for no more than one generation longer than with conventional Vdd scaling. In heavily loaded circuits the benefit may, however, be larger, since the speed of such circuits is much more related to the current drivability Ion than to the intrinsic delay CV/I.

Reduced Vth – Relaxed Ioff

Two ways exist to reduce Vth: the so-called Dynamic Vth modulation (DVth), and permanent Vth reduction. DVth consists in shunting the body of the transistor with the gate, thus requiring SOI substrates or a triple-well technology on bulk. When the gate drive voltage goes high, so does the body potential (tied gate and body), thus dynamically reducing the value of the threshold voltage (positive body biasing in an NMOSFET). The amplitude of this modulation is, however, limited by at least two factors. The positive body bias cannot exceed the junction built-in voltage without risking heavy diode leakage. On the other hand, due to the rapid scaling of Vdd (which brings the Vdd values below the junction built-in voltage), the amplitude of the Dynamic Vth modulation is small and alone incapable of rectifying the second regime in the Ioff–Ion plot, see Fig. 6.17. This is confirmed experimentally by the recent result from [6.20], which is in line with the predictions. This strategy becomes more efficient if a slight increase in Vdd is allowed, thus enabling the ITRS (1999) specifications to be matched with less effort in Vdd. To take the example of the 50 nm node, a Vdd no larger than 0.55 V is now enough (see Fig. 6.17), whereas 0.73 V is needed without DVth. This conclusion must, however, be tempered by taking into account that the increase in current is accompanied by an increase in capacitances when forward biasing the bulk; the final gain in speed has been estimated in [6.21] at no more than 25%. The second way to reduce Vth is simply its permanent adjustment at a lower level. Figure 6.18 shows the result of the exercise consisting in adjusting Vth at such a low level as to satisfy the Ion ITRS’01 specifications, regardless of the Ioff level this would imply. We can see that this strategy can be successful on

condition that very large Ioff levels are allowed. Note that at the last node, 22 nm CMOS, an Ioff as large as 1/10 of the Ion current is required. On the other hand, if the tolerable Ioff limit is set at the level of 1/100 of Ion, this strategy does not buy any additional node. Indeed, the 65 nm node is reached without Ioff relaxation and the 45 nm node requires more Ioff than 1/100 of Ion.

[Figure 6.17: Ioff (nA/µm, 1E-2 to 1E+3) versus Ion (µA/µm, 0 to 1000) for nodes from 500 nm to 25 nm. Series: this work predictions, ITRS 1999, DVth alone (X), DVth & Vdd (O), Chang et al. Vdd annotations: 0.25 → 0.4 V, 0.35 → 0.46 V, 0.5 → 0.55 V, 0.7 → 0.7 V]

Fig. 6.17. The decreasing efficiency of DVth is due to Vdd down-scaling. However, DVth coupled with relaxed Vdd permits matching the ITRS (1999) with lower Vdd values compared with a Vdd increase alone

[Figure 6.18: Relaxed Ioff to fit Ion. Ioff (nA/µm, 1.E+00 to 1.E+07) versus Ion (µA/µm, 500 to 1700) for CMOS 130, 90, 65, 45, 32 and 22 nm. Series: Bulk, ITRS01, IoffMAX = Ion/100]

Fig. 6.18. Relaxed Ioff strategy: the entire ITRS can be matched without any effort in Vdd

6.4.2 Strategies Based on Even More Aggressive Scaling

Based on the scaling theory, a natural reflex used to be to scale the MOSFET more aggressively whenever more performance was required. To give an example, let us consider L-scaling: in the past we were used to associating the reduction in L with an automatic increase in speed, or decrease in CV/I, as both I and C used to depend on L as 1/L. We will show below that today this strategy has to be applied with care, since more aggressive scaling may produce beneficial as well as detrimental effects, depending on the degree of extra scaling. Comparing the ITRS 1999 (roughly orthodox scaling) and ITRS 2001 (optimised scaling) Roadmaps, we will also see how careful and well-optimised scaling can pay off with a significant amount of extra performance.

Anomalous Scaling Effects

Within the orthodox scaling scheme, the effective field in the channel should not increase. Indeed, if Vth is 1/5 of Vg (in ON-conditions Vg = VDD) and both scale as ×k, the entire numerator in the Eeff expression scales as ×k, which cancels the same type of scaling in the denominator.

\[ E_{eff} = \frac{V_G + V_{th}}{6\,T_{ox,el}} - 2\,\frac{V_{FB} + 2\phi_F}{6\,T_{ox,el}} \qquad \text{(see footnote 9)} \]
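The cancellation described above can be checked with a small sketch of the Eeff expression. The function name, the unit conversions and the example voltages and thicknesses are illustrative assumptions; the second term defaults to zero, in line with footnote 9.

```python
def effective_field_mv_per_cm(v_g, v_th, t_ox_el_nm, v_fb_plus_2phi_f=0.0):
    """Effective field in MV/cm.

    Eeff = (Vg + Vth) / (6 Tox_el) - 2 (Vfb + 2 phi_F) / (6 Tox_el),
    with voltages in volts and Tox_el in nanometres.
    """
    t_ox_cm = t_ox_el_nm * 1e-7                       # nm -> cm
    field_v_per_cm = (v_g + v_th - 2.0 * v_fb_plus_2phi_f) / (6.0 * t_ox_cm)
    return field_v_per_cm / 1e6                       # V/cm -> MV/cm


if __name__ == "__main__":
    k = 0.7  # scaling factor
    nominal = effective_field_mv_per_cm(1.0, 0.2, 2.0)
    orthodox = effective_field_mv_per_cm(1.0 * k, 0.2 * k, 2.0 * k)
    vth_held = effective_field_mv_per_cm(1.0 * k, 0.2, 2.0 * k)
    print(f"nominal: {nominal:.3f} MV/cm")
    print(f"orthodox scaling (Vg, Vth, Tox all x k): {orthodox:.3f} MV/cm")
    print(f"Vth kept constant: {vth_held:.3f} MV/cm")
```

When Vg, Vth and Tox_el are all scaled by the same factor the field is unchanged, whereas holding Vth constant raises it.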



Footnote 9: The second term in this expression is usually much smaller than the first one and thus negligible.

The problem arises when the Vth scaling is levelled off (to keep Ioff constant) by means of reinforced channel doping, since this clearly leads to an increase in Eeff. Any increase in Eeff inevitably leads to a decrease in the effective mobility µeff according to the so-called universal mobility dependence [6.9], see Fig. 6.19. This mobility reduction moderates the gain in Ion resulting from scaling. More generally, the negative feedback due to mobility reduction by increased doping may be further reinforced by any action requiring an additional increase of the channel doping, such as faster scaling of Tox or L, or slower scaling of Xj within a given CMOS generation (that is, without parallel scaling of the other parameters).

[Figure 6.19: µeff at Vdd = ITRS’99 (cm²/V/s, 10 to 1000) versus Eeff = Vg/6Tox + Vth/6Tox (MV/cm, 0.1 to 10), for CMOS 500 nm, 180 nm, 130 nm, 100 nm, 70 nm, 50 nm, 35 nm and 25 nm; annotations: acoustic phonons, surface roughness, Nch]

Fig. 6.19. Effective field and mobility calculated with the ITRS 1999 data

Let us first consider a more aggressive Tox scaling. Traditionally, such a strategy used to pay off in extra current since Ion ∝ Cox ∝ 1/Tox, which has been the very foundation of the scaling theory and thus of Moore’s laws. As first shown in [6.22], this may no longer be true, since a reduction in Tox (not accompanied by any scaling of Vg (= VDD)) leads to a very rapid increase in Eeff and consequently to a decrease in µeff. Note that not only the term Vg/6Tox increases but also the term Vth/6Tox; if Vth is kept constant, a higher channel doping is required at thinner oxide, see Fig. 6.20. At a certain point, the detrimental effect of mobility reduction prevails over the positive effect of the Cox increase and, paradoxically, further reduction in oxide thickness degrades the current, Fig. 6.21, instead of improving it (see also the µeff × Cox product in Fig. 6.20b). The numerical simulation points [6.14] shown in Fig. 6.21 confirm the analytical results. We should also note that the increase in Cox when Tox decreases is moderated by the “Dark Space” and by the polydepletion. These effects act, however, in opposite directions, causing a slower increase in Cox but also a slower decrease in µeff, and thus are considered secondary in the analysis. Note also that the anomalous decrease in Ion when thinning the gate oxide has been observed experimentally in [6.23], but was there interpreted as being due to gate leakage.

As shown in Fig. 6.22, shorter gates permit a certain increase in Ion, although too short gates may, in contrast, be destructive for performance. This perverse effect stands in clear opposition to the classical MOSFET behavior: traditionally, channel shortening has always led to an Ion improvement as 1/L. This relation has been at the very foundation of the scaling theory and thus at the basis of Moore’s laws. Its failure at very short gate lengths is again due to an adverse effect of increased doping on mobility. Indeed, if the gate length is scaled down more aggressively than the other device parameters, suppression of SCE/DIBL requires very high channel doping that kills mobility. In extremely short devices this negative effect may prevail over the positive effect resulting from the channel shortening itself, thus leading to an overall Ion degradation. In order to sustain our confidence that this novel effect is not due to invalid analytical modeling, in Fig. 6.22 we have also shown 2D simulation points [6.14] that overlap with the analytical points. Note that an experimental observation of the effect is difficult due to the requirement of constant Ioff (a prerequisite for all the anomalous mobility effects to show

up), which would require costly experiments with the channel doping readjusted for each gate length (if the Ion maximum as a function of L is considered).

[Figure 6.20: CMOS 100 nm. (a) µeff (cm²/V/s, 0 to 300) and C (F, 0 to 0.14) versus physical Tox (nm, 0 to 2); curves: µeff(Tox), µeff(Tox + δ), C(Tox), C(Tox + δ). (b) µeff × Cox (F·cm²/V/s, 0 to 6) versus physical Tox (nm, 0 to 2); curves: µeff × Cox(Tox), µeff × Cox(Tox + δ)]

Fig. 6.20. The mobility degradation may prevail over the 1/Tox increase (a), thus leading to the anomalous roll-down in the Ion-on-Tox dependence (b)

The same kind of negative feedback may also show up with respect to insufficient Xj scaling involving degraded mobility. Compensating insufficient Xj scaling by an increase in channel doping has often been an unwitting way of coping with the insufficiencies of junction technologies. In Fig. 6.23

we can see that a strong relaxation in Xj leads to a potential degradation in Ion. In practice, this effect is, however, difficult to observe, since a deeper junction at the same time leads to reduced series resistance, which may compensate for the degradation in Ion due to mobility. Note that halo implants may alleviate the problem of mobility degradation due to excessive channel doping, but only in long devices. In short devices, when the halos from the source and drain sides overlap, the mobility is degraded along the entire channel length.

[Figure 6.21: MASTAR, with ITRS 1999. Ion (µA/µm, 0 to 1000) versus Tox (nm, 0 to 2) for CMOS 130 nm, 100 nm, 70 nm and 50 nm; markers: Nominal, Bazley & Jones 2D simul.]

Fig. 6.21. Excessive thinning of the gate oxide may also be a wrong strategy, due to the increase in doping it implies if one wishes to keep Vth constant (our predictions are confirmed by 2D simulations [6.14] for CMOS 100 nm). Note that the gate leakage current (here neglected) may further accentuate this effect, shifting the Ion maxima towards thicker oxides

[Figure 6.22: ITRS 1999. Ion (µA/µm, 0 to 1000) versus Lgate (nm, 0 to 200) for CMOS 130 nm, 100 nm, 70 nm and 50 nm, at Ioff and Tox nominal for a given node; markers: Nominal, Bazley & Jones 2D]

Fig. 6.22. The nominal gate length equal to the CMOS feature size (ITRS 1999) turns out not to be optimal from the point of view of device performance. Shorter gates permit some increase in Ion, although too short gates may, in contrast, be destructive for performance

[Figure 6.23: ITRS 1999. Ion/Ion nominal at Ioff = nominal and Lelec = nominal (0.96 to 1.01) versus Xj (nm, 0 to 100); curves: CMOS 100 nm, CMOS 70 nm]

Fig. 6.23. For very deep junctions the effect of increased doping (required for conservation of Ioff) leads to a decrease in Ion

[Figure 6.24: HP CMOS ITRS 99 as modeled (MASTAR). CgVd/Ion (ps, 0 to 5) versus Lgate (nm, 0 to 150) for the 130 nm, 100 nm, 70 nm, 50 nm and 35 nm nodes; marker: node nominal]

Fig. 6.24. The L value optimal for CV/I roughly corresponds to that optimal for Ion

The above analysis shows the existence of some optimal values of L and Tox maximizing the Ion–Ioff trade-off. Are these same values optimal for the device speed? Concerning L, the answer is, by and large, yes. Since C decreases monotonically with decreasing L, CV/I improves as long as Ion increases. The minima in CV/I should thus be slightly shifted towards shorter gate lengths compared with the maxima in Ion, but there is a rough correspondence between them, see Fig. 6.22 and Fig. 6.24. The situation is different regarding Tox: the monotonic increase in gate capacitance (with gate oxide thinning) prevails over the resulting (too weak) increase in Ion, and consequently the CV/I plots do not show any minima, Fig. 6.25.
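The argument about Tox can be illustrated with the intrinsic delay CV/I itself. In the sketch below the function name, the per-width units and all numerical values are hypothetical; they only show that a gain in Ion smaller than the accompanying gain in gate capacitance makes the delay worse.

```python
def intrinsic_delay_ps(c_gate_ff_per_um, v_dd, i_on_ua_per_um):
    """Intrinsic delay CV/I in picoseconds.

    fF * V / uA gives nanoseconds (1e-15 / 1e-6 s), hence the factor 1e3.
    """
    return c_gate_ff_per_um * v_dd / i_on_ua_per_um * 1e3


if __name__ == "__main__":
    # Hypothetical reference device.
    reference = intrinsic_delay_ps(1.0, 1.2, 800.0)
    # Hypothetical thinner oxide: capacitance +30%, Ion only +10%.
    thinner_oxide = intrinsic_delay_ps(1.3, 1.2, 880.0)
    print(f"reference: {reference:.2f} ps, thinner oxide: {thinner_oxide:.2f} ps")
```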

[Figure 6.25: ITRS 1999. CV/I (ps, 0 to 8) and Ion (µA/µm, 0 to 800) at Ioff nominal versus Tox (nm, 0 to 2); curves: CMOS100, CMOS70]

Fig. 6.25. Within the ITRS 1999, the monotonic increase in gate capacitance (with gate oxide thinning) prevails over the resulting weak increase in Ion, and consequently the CV/I plots do not show any minima

[Figure 6.26: Ion (µA/µm, 0 to 1600) at Ioff = nominal versus physical Tox (nm, 0 to 2) for CMOS 130 nm, 90 nm, 65 nm and 45 nm; markers: nominal CMOS130, nominal CMOS90, nominal CMOS65, nominal CMOS45]

Fig. 6.26. Regarding the ITRS 2001, due to the larger allowed Ioff values, the anomalous maximum in the Ion-vs-Tox plot shows up only for the CMOS 130 nm generation

Optimal Scaling of Tox and L

In the 2001 edition of the ITRS Roadmap, the values of Tox and Lgate were specified closer to the optimal values, as shown in Fig. 6.26 and Fig. 6.27 for Tox and Lgate, respectively. In addition, the relaxation of the Ioff values for all nodes has led to an attenuation of the anomalous scaling effects, thanks to the lower doping required for matching the Ioff specifications. Consequently, the maxima in the Ion-vs-Tox plots no longer exist for the ITRS 2001,

see Fig. 6.26. All this has made it possible to match the MOSFET Ioff–Ion trade-off with the requirements up to the 65 nm node, which in the 1999 edition was already out of reach, see Fig. 6.15.

[Figure 6.27: Ion (µA/µm, 400 to 1000) at Ioff = nominal versus Lgate (nm, 0 to 120); curves: CMOS 130 nm, CMOS 90 nm, CMOS 65 nm, CMOS 45 nm]

Fig. 6.27. Optimal placement of the CMOS nodes within ITRS 2001

[Figure 6.28: CV/I (ps, 0 to 2) versus gate length (nm, 0 to 120); curves: 130 nm, 90 nm, 65 nm, 45 nm, 32 nm, ITRS req]

Fig. 6.28. Thanks to better optimization of Lgate, Tox and Ioff, the discrepancy between the required speed improvement (broken line, traditional 17% per year measured in (CV/I)⁻¹) and the planar CMOS speed (continuous lines) is largely reduced. Squares and triangles correspond to nominal gate lengths

The result of this optimisation on the

device speed is spectacular: we recovered the historical improvement rate of 17%/year up to the 65 nm node, see Fig. 6.28. For the most distant nodes the discrepancy between the real transistor performance and the requirements (driven by the historical 17%/year improvement) emerges again, but the necessary TIF (Technology Improvement Factor required for matching the specifications) takes more realistic values. In terms of the Ioff–Ion requirements, a TIF of no more than 2.12 (Fig. 6.15) is predicted to suffice for covering the entire Roadmap up to the last considered node, CMOS 22 nm. Note also that the TIF values predicted by MASTAR (26%, 62% and 112% for CMOS 45 nm, 32 nm and 22 nm, respectively) are very close to those specified in the ITRS 2001 (30%, 70% and 100%, respectively, cf. Table 35b in [6.3]). Similar TIF values result from the analysis of the device speed, see Fig. 6.28. Whereas TIFs larger than 10 (ITRS’99) are very tough to realise, TIFs up to 2 (ITRS’01) are considered credible; we will discuss their practical realisation in the next section.

6.4.3 Strategies Based on New Materials

Transport-Efficient Materials

Mobility, which remains a strong lever of transistor performance, is not the strength of Silicon. Materials such as GaAs or InP present much higher mobilities than Si, but their technology is not compatible with that of Si VLSI. A reasonable improvement can, however, be expected with such Silicon-compatible materials as Ge and Strained-Si, see Table 6.4.

Table 6.4. Comparison of the main transport-related parameters of Si, Ge, GaAs and Strained-Si (tensile stress). Units for all materials are the same as for Silicon (first column)

@ 300 K                  Si      Ge      GaAs    Strained-Si
Dielectric constant      11.7    16.3    10.4    = Si ?
Gap                      1.12    0.67    1.43    1.12 (Ec, Ev offset)
Heat conductivity        1.5     0.6     0.8     v. bad in virt. substrates
Electron mobility        1350    3900    8600    up to 2 × Si (µeff)
Electron sat. velocity   1e7     6-8e6   1e7     = Si ?
Hole mobility            480     1900    250     up to 2.4 × Si (µeff)
Hole sat. velocity       8e6     8e6     -       = Si ?

Concerning Ge, a

strong R&D effort towards SiGe epitaxial channels on Si was carried out in the past [6.24–6.28] but was eventually more or less abandoned. The main reasons for that were: (i) the buried-channel conduction in the subthreshold regime, leading to substantial degradation in short devices, (ii) no gain, if not a degradation, in NMOSFETs, (iii) relaxation of the (compressive) strain in SiGe due to the thermal budget, to STI-induced mechanical stress, and to SD implants. Although the two latter points were successfully solved [6.28] by using a multiple-well architecture, the balance of pros and cons has not yet been considered favourable enough to promote this architecture to VLSI production. The recent interest in pure Germanium channels [6.29] should, however, be noted.

Tensile strained-Si (s-Si) seems to be more successful, stimulating a lot of interest, especially after the announcement from Intel about the implementation of s-Si in products [6.30]. Many labs have reported [6.31–6.34] up to a factor of 2 improvement in electron mobility (long channel) in s-Si channels on SiGe virtual substrates with around 20% Ge fraction. An example of s-Si channel implementation in an NMOSFET is shown in Fig. 6.29, and the corresponding electrical characteristics are given in Fig. 6.30. The enthusiasm is, however, slightly tempered by the fact that the hole mobility improvement (although potentially larger than that of electrons) comes much later (in terms of the required Ge fraction) than the equivalent improvement for electrons [6.35].

[Figure 6.29: SEM photo; labels: PECVD, gate oxide, GATE, source, drain, strained Si channel, relaxed SiGe buffer]

Fig. 6.29. SEM photo of a MOS transistor with a strained Si channel; for more detail see [6.32]

Let us evaluate the impact of the ×2 electron mobility improvement on the ITRS 2001 Roadmap. In Fig. 6.31 this has been done by multiplying the effective mobility (long channel) by 2 in the MASTAR equations. The result indicates a potential for a 15% to 25% gain in Ion for the devices of the nominal CMOS nodes. The difference between the 100% improvement in

long devices and the much more modest improvement in nominal devices is, of course, due to the moderation of mobility by the lateral field. In short devices, larger mobility corresponds to a smaller critical field and thus to a quicker onset of the velocity saturation of carriers. This moderates the gain in short devices with respect to long devices, where velocity saturation does not occur.

[Figure 6.30: (a) mobility (cm²/Vs, 0 to 800) versus VG (V, −1 to 3), tox = 5 nm; curves: surface strained Si channel (RTA 900°C, RTA 800°C), reference pure Si. (b) output characteristics]

Fig. 6.30. Comparison of electron mobility between conventional Silicon and strained Silicon channel transistors (a), and between their output characteristics (b). Data taken from [6.32]

Metallic Gate

A metallic gate cancels the polydepletion effect, thus reducing the difference between the physical and electrical (effective) gate oxide thickness by 3–6 Å (see

Fig. 6.31. Impact of strained-Si channel on the HP and LP ITRS 2001 CMOS Roadmaps (Ioff [nA/µm] versus Ion [µA/µm] for the CMOS 130 nm to 22 nm nodes; bulk conventional versus bulk on strained Si). The only parameter that has been changed in the MASTAR equations to mimic the strain effect was the multiplication of the effective mobility (long channel) by 2

Sect. 6.3.2). In the past this was a small fraction of the physical oxide thickness, so we used to neglect it. Today, however, we are already working with oxide thicknesses in the range of 15–10 Å, which is comparable with the polydepletion. The metallic gate therefore becomes a very interesting feature. Figure 6.32 shows the impact of a metallic gate on the Ioff–Ion roadmap (ITRS 2001). The reduction in electrical oxide thickness due to the metallic gate is advantageous since it leads to better electrostatic integrity of the device (smaller SCE and DIBL) and to higher channel conductivity (higher Ion), in the same way as a reduction of the physical oxide thickness but, in contrast to the latter, without producing a large increase in tunnelling current. Unfortunately, regarding the intrinsic device speed (CV/I), the picture is much less positive: the large increase in Ion is accompanied by a similar increase in gate capacitance, which precludes an increase in speed, in the same way as we observed for more aggressive gate oxide scaling, see Fig. 6.25. We will come back to this issue in Sect. 6.4.4 to show that a metallic gate may produce a (limited) improvement in speed if a device architecture other than planar is considered.

Another issue with the metallic gate is that the majority of metals we deal with in microelectronics today are so-called mid-gap materials (Fermi energy at the middle of the Silicon forbidden gap). Consequently, the gate-to-substrate work-function difference increases by 0.56 V, leading to an equivalent shift of the N- and P-MOS threshold voltages towards higher absolute values. The only way of bringing these threshold voltages back to the desired values is through counter-doping channel implants, which however produce

Fig. 6.32. Impact of metal gate on the HP and LP ITRS 2001 CMOS Roadmaps (Ioff [nA/µm] versus Ion [µA/µm] for the CMOS 130 nm to 22 nm nodes; bulk with poly-Si gate versus bulk with metal gate). The only parameter that has been changed in the MASTAR equations to mimic metallic gate was the cancellation of polydepletion

(similarly to the polydepletion) an increase of the electrical equivalent oxide thickness (at least in the subthreshold regime). Therefore, the electrostatic integrity of the device may be even worse with a metallic mid-gap gate than with a poly-Silicon gate [6.15]. Only the combination of a metallic gate with a non-conventional device architecture permits the mid-gap gate to be implemented without the above-mentioned penalty. Of course, an alternative for conventional planar devices exists, namely the implementation of two different metals: one for NMOS devices, with a work-function equivalent to that of N+ doped poly-Silicon, and another for PMOS devices, with a work-function equivalent to that of P+ doped poly-Silicon [6.36]. Both strategies (a single mid-gap gate, and N+-like and P+-like dual metal gates) are extensively studied today.

Metallic Junction

Thanks to the significantly lower sheet resistance of metallic layers as compared with doped semiconductor layers, the concept of the metallic junction has the potential to escape from the Rs–Xj compromise discussed in Sect. 6.3.3, Fig. 6.11. Ideally, the metallic junction can be approximated as a zero-resistance layer (footnote 10), thus leading to a 10–20% improvement in Ion at constant Ioff, see Fig. 6.33. In reality this gain is moderated by a tunnelling resistance

Footnote 10: Junction resistance dropping to zero is an optimistic assumption, intended to give an asymptotic assessment.

Fig. 6.33. Impact of an ideal metallic junction on the HP and LP ITRS 2001 CMOS Roadmaps (Ioff [nA/µm] versus Ion [µA/µm] for the CMOS 130 nm to 22 nm nodes; bulk with implanted junction versus bulk with ideal metal junction, no Schottky barrier). The only parameter that has been changed in the MASTAR equations to mimic metallic junction was the cancellation of junction resistance. In particular, the tunnelling resistance due to the parasitic potential spike has been neglected. This corresponds to an idealisation and thus lower than indicated gain is to be expected in practice

of the spike in the valence band that always appears as a result of the band offset between a metal and a semiconductor. This parasitic spike can be reduced by an appropriate choice of the metal. Considering an N-doped semiconductor (PMOSFET), metals with a Fermi energy closer to the valence band of Silicon produce a much lower spike (Fig. 6.34b) than metals with a Fermi energy closer to the conduction band (Fig. 6.34a). Materials such as PtSi or IrSi may be good candidates for PMOSFET Schottky junctions. For NMOSFETs the opposite is true; here ErSi2, whose Fermi level is closer to the conduction band, is favourable. The technology of these materials is, however, difficult.

For the PMOSFET, results similar to those of a Schottky junction can be obtained in the SiGe–Si system, which is more compatible with Si technology. In Fig. 6.34c we show the P+ doped Ge to N-doped Si junction. Because the forbidden gap is much smaller in Ge than in Si, a band offset appears at the Ge/Si interface, mainly in the valence band. As in the case of Schottky junctions, this band offset plays a positive role when the transistor is OFF, since it adds to the potential barrier that prevents holes from penetrating into the channel. In the ON state, however, this barrier transforms into a spike through which holes must tunnel, which produces an equivalent junction resistance. Therefore, in practice, we should expect the gain due to the metallic junction to be somewhat smaller than that predicted in Fig. 6.33.

Fig. 6.34. Heterostructure type Source-to-channel systems for PMOS transistor (band diagrams in the OFF state and in the ON state): (a) – depleting Schottky junction (channel Fermi energy lower than that of metal); (b) – accumulating Schottky junction (channel Fermi energy higher than that of metal); and (c) – P+ doped Ge-to-N doped Si junction. An additional inconvenience of the depleting Schottky junction is that in the OFF state no barrier for electrons exists. All diagrams are plotted considering a zero drain-to-source bias

6.4.4 Strategies Based on Improvements of Device Architecture

Considerable development effort is currently devoted to device structures other than bulk planar ones (non-bulk). Figure 6.35 summarizes the evolution in device structures as envisaged at STMicroelectronics. It is commonly accepted that the ultimate device structure will be some kind of double-gate (DG) device. Apart from the SON (Silicon On Nothing [6.37]) DG structure [6.38], many other DG structures are studied today, such as FinFET [6.39], OmegaFET [6.40], TriGate [6.41], Vertical [6.42], DeltaFET [6.43], Surrounding Gate [6.44], etc. (for more see Chap. 21). The electrical advantages of these structures are due to the DG operation and are thus more or less independent of the fabrication mode or final geometry. Therefore, although our analysis is based on DG SON structures, similar conclusions basically apply to any other DG structure.

Thin-Body Devices – Recovering the “Healthy” Scaling

The main advantage of thin-body devices resides in their better electrostatic integrity (EI). Since in bulk devices the electrostatic integrity is determined by the ratios Xj/Lel, Tox/Lel and Tdep/Lel [see (6.26)], and we have growing difficulties with their down-scaling (as discussed in Sect. 6.3), the EI diverges more and more from its ideal value, see Fig. 6.14.

\[
\text{BULK:}\quad \mathrm{EI} \equiv \left(1 + \frac{X_j^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\,\frac{T_{dep}}{L_{el}} \le \frac{1}{25}
\;\Rightarrow\; \frac{\mathrm{SCE}}{\phi_d},\ \frac{\mathrm{DIBL}}{V_{ds}} \le 10\%
\tag{6.26}
\]
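As a rough numerical illustration of (6.26), the following Python sketch evaluates the bulk EI expression and compares it with the 1/25 design limit. The function name and all device dimensions are illustrative assumptions chosen only to exercise the formula; they are not values taken from this chapter or from MASTAR.

```python
def bulk_ei(xj_nm, tox_el_nm, tdep_nm, lel_nm):
    """Electrostatic integrity of a bulk device, following the form of (6.26)."""
    return (1.0 + (xj_nm / lel_nm) ** 2) * (tox_el_nm / lel_nm) * (tdep_nm / lel_nm)


EI_LIMIT = 1.0 / 25.0  # "good design" limit quoted in (6.26)

# Hypothetical dimensions in nm (assumptions, not data from the text).
relaxed = bulk_ei(xj_nm=30.0, tox_el_nm=2.0, tdep_nm=30.0, lel_nm=100.0)
aggressive = bulk_ei(xj_nm=20.0, tox_el_nm=1.5, tdep_nm=20.0, lel_nm=25.0)

for name, ei in (("relaxed", relaxed), ("aggressive", aggressive)):
    status = "meets" if ei <= EI_LIMIT else "exceeds"
    print(f"{name}: EI = {ei:.4f} ({status} the 1/25 limit)")
```

With these assumed numbers the long device sits comfortably below the limit, while the short device exceeds it because Xj and Tdep were not scaled in proportion to Lel, which is the difficulty the paragraph above describes.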

Fig. 6.35. Possible device structure chain as envisaged at STMicroelectronics: (a) – Bulk, (b) – partially-depleted (PD) SOI, (c) – fully-depleted (FD) SOI, (d) – fully-depleted SON (Silicon-On-Nothing), note that thanks to the very thin BOX a coupling between channel and bulk (called GP, ground-plane effect) can occur and contribute to stabilisation of the potential in the channel, (e) – double-gate (DG) SON. Especially regarding the double-gate device structure, many alternative realisations are in development, such as: FinFET, OmegaFET, TriGate, Vertical FET, DeltaFET, etc. (for more see Chap. 21)

In thin-body devices, the junction depth as well as the depletion depth are determined by the Si-film thickness, Xj = Tdep = Tsi. In double-gate devices, it is reasonable to go further and assume that Xj = Tdep = Tsi/2. Therefore, the EI of DG devices can be written as:

\[
\text{Double Gate:}\quad \mathrm{EI} \equiv \left(1 + \frac{(T_{si}/2)^2}{L_{el}^2}\right)\frac{T_{ox\,el}}{L_{el}}\,\frac{T_{si}/2}{L_{el}} \le \frac{1}{25}
\;\Rightarrow\; \frac{\mathrm{SCE}}{\phi_d},\ \frac{\mathrm{DIBL}}{V_{ds}} \le 10\%
\tag{6.27}
\]
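Because the left-hand side of (6.27) grows monotonically with Tsi, the largest film thickness that still satisfies the EI limit can be found numerically. The Python sketch below does this by bisection; the function names and the example values of Lel and Tox el are assumptions made for illustration only and do not reproduce the MASTAR calculations.

```python
def dg_ei(tsi_nm, tox_el_nm, lel_nm):
    """EI of a double-gate device per (6.27), with Xj = Tdep = Tsi/2."""
    half = tsi_nm / 2.0
    return (1.0 + (half / lel_nm) ** 2) * (tox_el_nm / lel_nm) * (half / lel_nm)


def max_tsi(tox_el_nm, lel_nm, ei_limit=1.0 / 25.0, tol=1e-6):
    """Largest Tsi (nm) with dg_ei <= ei_limit, found by bisection.

    dg_ei increases monotonically with Tsi, so the root is unique.
    """
    lo, hi = 0.0, 1.0
    while dg_ei(hi, tox_el_nm, lel_nm) <= ei_limit:  # expand until bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dg_ei(mid, tox_el_nm, lel_nm) <= ei_limit:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical example: Lel = 20 nm and Tox_el = 1.2 nm (assumed values).
tsi = max_tsi(tox_el_nm=1.2, lel_nm=20.0)
print(f"Tsi must stay below about {tsi:.1f} nm to keep EI <= 1/25")
```

The `ei_limit` argument can be relaxed (for example to 0.08) to explore how a looser EI target loosens the constraint on the film thickness.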

This new form of the EI (6.27) is fundamental for device scaling, since it means that in this kind of device we can ensure very good electrostatic integrity without heavy channel doping or ultra-shallow junction fabrication techniques, simply by making the Si film sufficiently thin. In the next point we will show how thin it should be, but first let us focus on the consequences of this new situation for the scaling of device performance.

Let us suppose a thin-body device with a channel made of an undoped Si film that is sufficiently thin to fulfil the EI condition (6.27). In contrast to bulk devices (Sect. 6.4.2, point “Anomalous Scaling Effects”), shortening the channel of such a device no longer penalises the channel mobility (footnote 11), as no increase in the channel doping is required to suppress the additional amount of SCE and DIBL. As a result, the traditional increase in Ion with channel shortening is recovered, and no maximum appears any longer, see Fig. 6.36. This is very important, since recovering “healthy” scaling means that Moore’s laws are no longer threatened by the parasitic effects and can continue as before, as far as permitted by the patterning (lithography) capabilities.

Footnote 11: True only to first order; in reality a small difference in Eeff may appear due to the thinning of the Si film (proportional to the channel shortening) that is necessary to preserve a constant value of EI.

Fig. 6.36. Cancellation of SCE and DIBL recovers “healthy” scaling (Ion [µA/µm] at nominal Ioff versus Lgate [nm], with SCE/DIBL = 0, for the CMOS 130 nm, 90 nm, 65 nm and 45 nm nodes) – due to suppression of SCE and DIBL by the geometry of the device rather than by channel doping, the Ion continues rising when shortening the gate length. Calculations made for ITRS 2001 CMOS nodes

This is not the only advantage of thin-body devices. As reported in [6.45], a factor of 2 decrease in the effective field should be expected in DG devices compared with bulk ones. In addition, the body effect coefficient (the d term in the MASTAR model) should also vanish, and an ideal subthreshold slope of 65 mV/dec can be expected. The cumulative effect of these advantages is evaluated in Fig. 6.37 with respect to the ITRS 2001 Roadmap. Note that the current of DG devices is here normalized per unit of the circumference of the inversion layer. In this way the gain over bulk reflects only the larger carrier density and/or mobility (at an equivalent Ioff), and the purely geometrical factor in the current improvement is eliminated. This seems justified since the final goal of our comparison is the intrinsic speed, in which the geometrical factors in the current and capacitance increases would cancel each other.

To complete the list of advantages of DG structures, we should mention: (i) a higher probability of ballistic transport thanks to higher mobility and less scattering on ionised impurities (undoped channel); (ii) easier implementation of a metallic gate thanks to the intrinsically lower threshold voltage in DG devices; and finally (iii) the possibility of so-called “volume inversion” [6.46], which has the potential to produce an additional increase in mobility.

On the other hand, we should also mention the potential disadvantages of DG devices. Here, process complexity, a different layout and mobility degradation are the main potential issues, although at the present stage of development none of them can be considered a fundamental problem. Regarding complexity, simpler and simpler process integration schemes have been proposed recently, e.g. [6.54]. The layout problem has also been partially solved, e.g. in [6.55]. Regarding the mobility, the degradation has been mainly attributed to film-thickness non-uniformity [6.56] and back-interface scattering [6.57], which are both process related and thus potentially remediable factors.
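To make the benefit of the steeper subthreshold slope mentioned above more tangible, the Python sketch below uses the standard exponential subthreshold relation Ioff = Ith · 10^(−Vth/S) to estimate the threshold voltage needed for a given Ioff target. This relation is a textbook approximation brought in here for illustration; the function name, the current Ith at threshold, the Ioff target and the 85 mV/dec bulk slope are all assumed values, not figures from this chapter.

```python
import math


def required_vth(ioff_target, i_th, slope_mv_per_dec):
    """Vth (V) such that Ioff = i_th * 10**(-Vth / S) meets the target.

    ioff_target and i_th share the same unit (e.g. A/um); S is in mV/decade.
    """
    return slope_mv_per_dec * 1e-3 * math.log10(i_th / ioff_target)


I_TH = 1e-7         # assumed drain current at Vg = Vth, A/um
IOFF_TARGET = 1e-9  # assumed leakage target, A/um

vth_bulk = required_vth(IOFF_TARGET, I_TH, slope_mv_per_dec=85.0)  # assumed bulk slope
vth_dg = required_vth(IOFF_TARGET, I_TH, slope_mv_per_dec=65.0)    # DG slope from the text

print(f"bulk (85 mV/dec): Vth = {vth_bulk * 1e3:.0f} mV")
print(f"DG   (65 mV/dec): Vth = {vth_dg * 1e3:.0f} mV")
```

Under these assumptions the steeper slope lets the threshold voltage be lowered at the same Ioff, which leaves more gate overdrive available for Ion.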

Fig. 6.37. Impact of DG on the ITRS 2001 Ioff–Ion Roadmap (Ioff [nA/µm] versus Ion [µA/µm] for the HP and LP CMOS 130 nm to 22 nm nodes; bulk with poly gate versus DG with poly gate). For each CMOS node, equal Tox el and channel lengths are supposed for Bulk and DG devices, meaning that the improvement due to the possibility of using shorter channels in DG devices (with their advantageous effect on Ion) is not taken into account here

Reduced Gate Capacitance

The gate capacitance plays as important a role in the intrinsic transistor speed as the Ion current. Ideally, the gate capacitance should be equal to that part of the oxide capacitance, Cox gb, that is delimited by the junctions. In a real structure, at least three parasitic capacitances add to the picture:

Cgate = Cox gb + Cov gs + Cov gd + Cfringing

The overlap capacitances, gate-to-source Cov gs and gate-to-drain Cov gd, scale down with the overlap, and have therefore constituted an important part (20–50%) of the total. A reduction of the overlap below 20%, and in particular non-overlapped structures, used to be considered a fatal technological error. This is true for conventional devices, since it leads to large series resistances. As shown in [6.47], however, for extremely short devices there exists a new technological window where non-overlap may be a rewarding feature. The 16 nm non-overlapped device shown in Fig. 6.38 has been designed so as to enhance punchthrough via the (low-doped) non-overlapped regions and thus to limit the series resistances. Thanks to that, not only has the overlap capacitance been cancelled, but well-behaved characteristics were also obtained in spite of junctions that are rather inappropriate for such a short transistor. Although the Ion current is slightly lower than in a well-overlapped structure, the gain in Cgate has prevailed, and CV/I for this device lies on the same trend line as for the best overlapped transistors, Fig. 6.38b.
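The trade-off described above, a slightly lower Ion against a smaller Cgate, can be checked with a few lines of Python. The sketch below sums the four capacitance components and evaluates the intrinsic delay CV/I; the function names and every numerical value are hypothetical and serve only to show the arithmetic.

```python
def gate_capacitance(c_ox_gb, c_ov_gs, c_ov_gd, c_fringing):
    """Total gate capacitance per unit width (fF/um): sum of the four terms."""
    return c_ox_gb + c_ov_gs + c_ov_gd + c_fringing


def intrinsic_delay_ps(c_gate_ff_per_um, vdd_v, ion_ua_per_um):
    """CV/I in picoseconds.

    (fF/um * V) / (uA/um) = 1e-15 C / 1e-6 A = 1e-9 s, hence the factor 1e3.
    """
    return c_gate_ff_per_um * vdd_v / ion_ua_per_um * 1e3


VDD = 1.0  # V, assumed supply voltage

# Hypothetical overlapped device.
c_overlapped = gate_capacitance(0.50, 0.15, 0.15, 0.24)
delay_overlapped = intrinsic_delay_ps(c_overlapped, VDD, ion_ua_per_um=900.0)

# Hypothetical non-overlapped device: no overlap capacitance, lower Ion.
c_non_overlapped = gate_capacitance(0.50, 0.0, 0.0, 0.24)
delay_non_overlapped = intrinsic_delay_ps(c_non_overlapped, VDD, ion_ua_per_um=800.0)

print(f"overlapped:     Cgate = {c_overlapped:.2f} fF/um, CV/I = {delay_overlapped:.2f} ps")
print(f"non-overlapped: Cgate = {c_non_overlapped:.2f} fF/um, CV/I = {delay_non_overlapped:.2f} ps")
```

With these assumed numbers the non-overlapped device is faster despite its lower Ion, because the relative reduction in Cgate is larger than the relative loss in current.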

Fig. 6.38. Non-overlapped 16 nm MOSFET [6.47]: (a) – TEM cross-section, (b) – gate delay (CV/I) [ps] versus Lgate [nm] in comparison with published data (overlapped transistors)

Fig. 6.39. Prospective evolution of gate-oxide capacitance (Cox el), plotted as capacitance [fF/µm] versus production year (2000 to 2020). If the fringing capacitance (Cfringing) remains constant (actual value roughly equal to 0.24 fF/µm), the gate capacitance (being the sum of the two) will become dominated by the parasitic fringing capacitance
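The trend summarised in the caption of Fig. 6.39 can be reproduced qualitatively with a short Python sketch that computes the share of a constant fringing capacitance in the total as Cox el shrinks. Only the 0.24 fF/µm value comes from the caption; the function name and the sequence of Cox el values are assumptions for illustration.

```python
C_FRINGING = 0.24  # fF/um, approximate value quoted in the Fig. 6.39 caption


def fringing_share(c_ox_el):
    """Fraction of the total gate capacitance due to fringing."""
    return C_FRINGING / (c_ox_el + C_FRINGING)


# Hypothetical Cox_el values (fF/um) for successive generations.
for c_ox_el in (0.60, 0.40, 0.24, 0.15):
    share = fringing_share(c_ox_el)
    print(f"Cox_el = {c_ox_el:.2f} fF/um -> fringing share = {share:.0%}")
```

The fringing contribution becomes the larger of the two terms as soon as Cox el drops below the fringing value itself.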

The fringing capacitance Cfringing is difficult to reduce. True, the gate height decreases, which should help, but the introduction of nitride spacers (higher dielectric constant) and other technological changes compensate the effect of the gate height, making the fringing capacitance almost constant throughout the CMOS generations. Therefore, as the gate oxide capacitance Cox el (the sum Cox gb + Cov gs + Cov gd) scales down (Lg is scaled down faster than Tox), the contribution from fringing grows and may eventually dominate the system, see Fig. 6.39. An extensive R&D effort is needed to reduce Cfringing by the introduction of new spacer materials, or by any other means.

6.4.5 How Far Can We Go and How Much Should We Pay?

The traditional performance improvement rate is a 17% rise per year in the intrinsic device speed (inverse of CV/I). Due to the physical and technological limitations discussed previously, this improvement rate breaks down if we remain within the conventional scaling scheme. New concepts, materials, device structures, etc. (hereafter called performance boosters) have to be introduced if we wish to maintain the 17% improvement rate. In Table 6.5 we propose a plausible sequence for the introduction of performance boosters. Their translation into appropriate modifications of the MASTAR model is also presented, which permits a quantitative approach. Without claiming absolute accuracy, this approach is nevertheless expected to give reasonable trends and thus to serve as a strategic guideline for CMOS technologies.

Using Table 6.5, we can assess the impact of all these boosters on Ioff, Ion and on the speed 1/(CV/I). As Ioff is a result of the EI (Electrostatic Integrity) of the device and of its threshold voltage, where the latter can be adjusted to the requirements of a given application, we will focus on EI rather than Vth. Using (6.27), we have adjusted the Silicon film thickness so as to respect (roughly) the

Table 6.5. Roadmap of performance boosters (MASTAR allows any values of boosters, but here Kmob = 2, Kfield = 0.5 and Kball = 1.1–1.5 are used)

Nature                  MASTAR translation                                                         Plausible introduction date
Optimised scaling       Lmask × 0.7 & Lg = Lmask/2, etc.                                           CMOS 130 nm
Strained-Si, Ge, etc.   µeff × Kmob (for s-Si Kmob = 2)                                            CMOS 65 nm (90 nm ?)
FDSON / SOI             Eeff × Kfield & S = 75 mV/dec & Xj = Tdep = Tsi (Kfield = 0.5)             CMOS 45 nm
Metal Gate / HK         Tox el – (2 ÷ 4 Å)                                                         CMOS 45 nm
DG                      Eeff × Kfield & S = 65 mV/dec & Xj = Tdep = Tsi/2 & d = 0 (Kfield = 0.5)   CMOS 32 nm
Ballistic               Vsat × Kball (Kball = 1.1–1.5)                                             CMOS 32 nm
Reduced fringing        Cfringe × 0.5                                                              CMOS 32 nm
Metallic junction       Rsd → 0                                                                    CMOS 22 nm

Fig. 6.40. Introduction of thin-body devices permits the EI “good-design” rule [as defined in (6.24)] to be kept to its theoretical value and thus the SCE and DIBL to be under control up to the most advanced CMOS nodes (HP ITRS 2001; EI for Bulk, FDSOI, DG and the ideal case, together with the required Tsi [nm], versus CMOS node [nm])

condition on EI that guarantees ideal SCE and DIBL (footnote 12). The results are shown in Fig. 6.40. It is very reassuring to see that the required scaling of Tsi does not involve unrealistic values. On the contrary, even for the last CMOS node (labelled 22 nm), a Silicon film 5 nm thick suffices to guarantee an almost ideal EI. This is a fortunate finding, since 5 nm films not only seem quite feasible (e.g. by SON technology [6.48]), but also remain within the range of thickness that is supposed to behave classically. As shown for example in [6.49–6.51], below this value QM effects may prevail.

Let us now examine the impact of the performance “boosters” on Ion. In Fig. 6.41 we have plotted the Ioff–Ion roadmaps for the HP branch of ITRS 2001. The new “boosters” are introduced consecutively in the nodes where a lack of Ion shows up. As can be seen, up to the CMOS 65 nm node, the optimised scaling of conventional CMOS suffices to meet the Ioff–Ion specifications. The CMOS 45 nm node cannot be reached with a strained-Si channel alone. The combination of strained Si with thin-body devices and a metallic gate is more than enough for this node, as well as for the CMOS 32 nm node. For the CMOS 22 nm node, however, this configuration of “boosters” is only just sufficient. As we will see further on, being “just enough” in terms of Ion may still be insufficient in terms of 1/(CV/I), since certain “boosters” improve Ion while worsening C at the same time (footnote 13). It is thus valuable

Footnote 12: We have allowed a rise in EI up to 8% instead of the ideal limit value of 4%. This relaxation still preserves tight control of SCE and DIBL, below 20% of Φd and Vdd respectively, which is less than in today’s reality.

Footnote 13: This is the case for a metallic gate, but not for DG devices, since in all our estimations Ion/µm and C/µm of channel circumference are considered for DG devices.

Fig. 6.41. Introduction of the technology boosters not only permits matching the ITRS 2001 HP specifications (in terms of Ioff–Ion) but also exceeding them (Ioff [nA/µm] versus Ion [µA/µm] for the CMOS 130 nm to 22 nm nodes; HP ITRS 2001 Roadmap, bulk as modeled with MASTAR, and the effects of the technology boosters: strained Si, metal gate and UTB, quasi-ballistic transport, metallic junctions)

to see that boosters such as ballistic transport and metallic junctions give us a large margin for further improvement in Ion, well beyond the ITRS 2001 specifications. In addition, both of them improve Ion without worsening C, which is good news for the CV/I scaling.

The speed specifications (guided by the traditional improvement rate of 17% per year in 1/(CV/I)) are the most challenging. As shown in Fig. 6.42, the conventional ×0.7 scaling departed from the 17%/year trend line well before the CMOS 130 nm node. The optimised scaling is able to rectify this failure up to the CMOS 65 nm node. To reach the next node, CMOS 45 nm, the thin body with strained channel and metallic gate is necessary. For the 32 nm node, we need to add the reduced fringing capacitance and to produce quasi-ballistic transport in the channel. The 22 nm node requires all these boosters to be complemented with a metallic junction in order to meet the speed trend curve. All this is very challenging, but the key message from this analysis is definitely positive: no fundamental limitation and no technological roadblock is seen, at least until the end of the Roadmap. We thus have a good prospect for Moore’s laws to continue on the straight line at least until 2020.
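To put the 17% per year target in perspective, the compound growth it implies can be computed directly. The Python sketch below is a simple arithmetic illustration; the function name and the 15-year horizon are arbitrary assumptions, and only the 17% rate comes from the text.

```python
import math

RATE = 0.17  # yearly improvement in intrinsic speed 1/(CV/I), from the text


def speed_gain(years, rate=RATE):
    """Cumulative speed-up factor after a number of years of compound growth."""
    return (1.0 + rate) ** years


doubling_time = math.log(2.0) / math.log(1.0 + RATE)

print(f"doubling time at 17%/year: {doubling_time:.2f} years")
print(f"cumulative gain over 15 years: x{speed_gain(15):.1f}")
```

The doubling time comes out at roughly 4.4 years, which is in line with the 4–5 year performance doubling cycle mentioned in the conclusions of this chapter.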

Fig. 6.42. Introduction of technology boosters permits conservation of the traditional improvement rate of 17% per year till the end of the roadmap (1/(CV/I) [THz] versus CMOS node [nm]; curves for 0.7× bulk scaling, optimised bulk scaling, and the successive addition of strain layer, thin body and metal gate, quasi-ballistic transport, 50% fringing capacitance, and metallic junction, compared with the 17%/year improvement line)

6.5 Conclusions

Enormous reserves have been revealed for boosting transistor performance by smart optimisation and by the introduction of new materials and new device structures. Consequently, Moore’s laws do not seem endangered in principle, although a strong R&D effort will continue to be mandatory. This situation is not very new: one can say that the existence of technological challenges is inherent to the semiconductor industry, and all past challenges have been met. The present challenges are not more difficult than those of the past, and our strength is that we basically know the remedies, we have a large potential for innovation, and we are determined to make use of them. Part of the credit for this good starting point is certainly due to the ITRS community, which works continuously on analysing difficulties and gives hints on potential solution paths.

Nevertheless, by focusing on device performance, we have overlooked many other important issues that will have to be taken very seriously. To give just a few examples, let us mention the problem of matching, which will certainly become a major issue. This can be expected because the down-scaling of the transistor layout inevitably reveals the discrete nature of matter. Suffice it to say that under the gate of a transistor of the last CMOS generation only a few dozen dopants will be present; any fluctuation in their spatial arrangement or number will inevitably lead to fluctuations in threshold voltage [6.52]. Another potential issue resides in power dissipation. If we do not find more efficient ways of dissipating or managing power, the operating temperature of a chip will be out of control. As nicely concluded in [6.53], it may be useless to develop a very dense and fast technology if we are unable to dissipate the power produced. In such a case, in order to stay within the allowable power range, one would have to deliberately reduce the speed (dynamic power is proportional to the switching frequency) or relax the layout density (to reduce the amount of power per unit area). Static power is also becoming an issue, mainly since it drains batteries too rapidly in portable devices.

To complete the picture we should, however, recall that technology is not a single player. CMOS is a team game, where the partner is circuit design, a powerful and creative world able to help overcome difficulties and compensate for the shortcomings of technology. In all the aspects we discussed before, design solutions are already under development, whether for performance, matching, power management, etc. This wonderful complementarity between technology and design is an additional big strength of the semiconductor industry, which justifies thinking that CMOS technology will continue to be successful and competitive not only until the end of the current Roadmap but also well beyond.

Finally, the authors would like to give their personal and somewhat philosophical view on the future of CMOS. Firstly, in our opinion, often-heard statements such as “end of CMOS” or “CMOS facing a brick wall” reflect a climate of sensation (in the positive sense of the word, since it results from the genuine interest of the media in the CMOS industry) and should not be confused with the technical reality. Even if not all the innovations discussed in this chapter reach the production lines, CMOS will not stop but will just slow down its development rate. Secondly, Moore’s laws are in our opinion more a business model of our industry than a technical obligation. Among other merits, they help to reduce the “product-renewal cycle-time” (footnote 14) from 10 years (the typical life-time of our products) to 4–5 years (the performance doubling cycle, which is considered a sufficient motivation for renewing the product), thereby contributing to the dynamism of the Si-semiconductor industry.

Footnote 14: By “product-renewal cycle-time” (our concept), we understand the period of time after which a customer renews his product: buys a new car, new clothes, a new computer, etc. Practically all industries wish the customer to renew products before they fail (reach the end of their life-time); in other words, the industries wish to reduce the “product-renewal cycle-time”.


Acknowledgments. The authors express their thanks to Markus Muller (Philips Crolles, France) for his help with Igate fitting and development of the semi-empirical Igate expression.

References

6.1. R.H. Dennard et al., IEEE J. SSC, pp. 256–268, Sept. 1974
6.2. ITRS (International Technology Roadmap for Semiconductors), 1999 Edition
6.3. ITRS (International Technology Roadmap for Semiconductors), 2001 Edition
6.4. T. Skotnicki et al., A New Analog/Digital CAD model for Sub-Halfmicron MOSFETs, 1994 IEDM Tech. Digest, pp. 165–168
6.5. T. Skotnicki, Rep. French Acad. of Science Tom 1, Series IV, pp. 885–909, Paris 2000
6.6. T. Skotnicki, Heading for decananometer CMOS – Is navigation among icebergs still a viable strategy? Proceedings of ESSDERC 2000, pp. 19–33
6.7. T. Skotnicki, Transistor MOS et sa technologie de fabrication (in French), Encyclopedia Techniques de l’Ingénieur, Traité Electronique, E2 430, Paris 2000
6.8. K. Chen et al., The Impact of Device Scaling and Power Supply Change on CMOS Gate Performance, IEEE Elec. Dev. Lett., pp. 202–204, May 1996
6.9. S. Takagi et al., On the universality of inversion-layer mobility in n- and p-channel MOSFETs, IEDM’88, Tech. Digest, pp. 398–401
6.10. S. Thompson, IEDM’99, Short course
6.11. T. Skotnicki et al., The Voltage-Doping Transformation: A New Approach to the Modeling of MOSFET Short-Channel Effects, Elec. Dev. Lett. 9, No. 3, 1988
6.12. T. Skotnicki et al., A New Punch-through Model based on the Voltage Doping Transformation, IEEE Trans. Elec. Dev, pp. 1067–1086, 1988
6.13. T. Skotnicki et al., Analytical Study of Punchthrough in Buried Channel p-MOSFETs, IEEE Trans. Elec. Dev 36, No. 4, 1989
6.14. D. Bazley and S. Jones, HUNT, EU IST project
6.15. E. Josse et al., Polysilicon Gate with Depletion -or- Metallic Gate with buried Channel: what evil worse? IEDM’99, Tech. Digest, pp. 661–664
6.16. C-Y. Wu et al., Quantization effects in inversion layers of PMOSFETs on Si (100) substrates, IEEE Elec. Dev. Lett. 17, No. 6, June 1996, pp. 276–278
6.17. Osborn et al., Gate leakage simulations with UQANT, NCSU, ITRS Working Group
6.18. Y. Taur and E.J. Nowak, 1997 IEDM, Tech. Digest, pp. 215–218
6.19. D. Lenoble, Proc. Int. Workshop on Junction Technology, pp. 29–34, Tokyo 2001
6.20. S.J. Chang et al., High-Performance and High-Reliability 80-nm gate-length DTMOS with Indium Super Steep Retrograde Channel, Trans. Elec. Dev. Lett. 47, No. 12, pp. 2379–2384 (2000)
6.21. S-F. Huang et al., Carrier mobility enhancement in strained Si-on-insulator fabricated by wafer bonding, Proceedings of 2001 Symp. VLSI Technology, pp. 107–108
6.22. T. Skotnicki, Proceedings Short Course Nanoscale Technologies, ESSDERC 2000


6.23. A. Ono et al., A 70 nm Gate Length CMOS Technology with 1.0 V Operation, Proceedings of 2000 Symp. VLSI Technology, pp. 14–15
6.24. S. Verdonckt-Vanderbroek et al., SiGe Channel Heterojunction pMOSFET’s, IEEE Trans. Elec. Dev. 41, No. 8, 1994, pp. 92–101
6.25. V.P. Kesan et al., High performance 0.25 µm p-MOSFETs with silicon-germanium channels for 300 K and 77 K operation, IEDM’91, Tech. Digest, pp. 25–28
6.26. P. Bouillon et al., Search for the optimal channel architecture for 0.18/0.12 µm bulk CMOS Experimental study, IEDM 1996, Tech. Digest, pp. 559–562
6.27. J. Alieu et al., Optimisation of Si0.7Ge0.3 Channel Heterostructures for 0.15/0.18 µm CMOS Process, Proceedings of ESSDERC’98, pp. 144–147
6.28. J. Alieu et al., Multiple SiGe well: A new channel architecture for improving both NMOS and PMOS performances, Proceedings of 2000 Symp. VLSI Technology, pp. 130–131
6.29. H. Shang et al., High Mobility p-channel Germanium MOSFETs with a thin Ge Oxynitride Gate Dielectric, IEDM 2002, Tech. Digest, pp. 441–444
6.30. S. Thompson et al., A 90 nm Logic Technology Featuring 50 nm Strained Silicon Channel Transistors, 7 layers of Cu Interconnects, Low k ILD, and 1 µm2 SRAM Cell, IEDM 2002, Tech. Digest, pp. 61–62
6.31. K. Rim et al., Transconductance enhancement in deep submicron strained Si n-MOSFETs, IEDM’98, Tech. Digest, pp. 707–710
6.32. M. Jurczak et al., Study on enhanced performance in NMOSFETs on strained Silicon, Proceedings of ESSDERC’99, pp. 304–307
6.33. K. Rim et al., Strained Si NMOSFETs for high performance CMOS technology, Proceedings of 2001 Symp. VLSI Technology, pp. 59–60
6.34. A. Toriumi, FED J. Vol. 3, Suppl. 2, 1993
6.35. R. Oberhuber et al., Mobility enhancement of two-dimensional holes in strained Si/SiGe MOSFETs, Proceedings of ESSDERC’98, pp. 525–527 (1998)
6.36. Q. Lu et al., Dual Metal Gate Technology for Deep-Submicron CMOS Transistors, Proceedings of 2000 Symp. VLSI Technology, pp. 72–73
6.37. M. Jurczak et al., Silicon-On-Nothing (SON), an Innovative Process for Advanced CMOS, SON, IEEE TED 47, No. 11, 2000, pp. 2179–2187
6.38. S. Monfray et al., 50 nm Gate-All-Around (GAA) – Silicon On Nothing (SON) – Devices: A simple way to co-integration of GAA transistors within bulk process, Proceedings of 2002 Symp. VLSI Technology, pp. 108–109
6.39. Y.K. Choi et al., Sub-20 nm CMOS FinFET Technologies, IEDM 2001, Tech. Digest, pp. 421–424
6.40. F.L. Yang et al., 25 nm CMOS Omega FETs, IEDM 2002, Tech. Digest, pp. 255–262
6.41. R. Chau et al., Proceedings of SSDM’02, pp. 68–69
6.42. J.M. Hergenrother et al., The vertical replacement gate (VRG) MOSFET: A 50-nm vertical MOSFET with lithography-independent gate length, IEDM 1999, Tech. Digest, p. 75
6.43. D. Hisamoto et al., A fully Depleted Lean-Channel Transistor (DELTA) – A Novel vertical ultra thin SOI MOSFET, IEDM 1989, Tech. Digest, pp. 833–836
6.44. H. Takato, High performance CMOS Surrounding Gate (SGT) for Ultra High Density LSIs, IEDM 1988, Tech. Digest, pp. 222–225

194

T. Skotnicki and F. Boeuf

6.45. D. Antoniadis, MOSFET Scalability Limites and “new frontier” devices, Proceedings of 2002 Symp. VLSI Technology, pp. 2–3 6.46. F. Ballestra et al., Double Gate Silicon on insulator transistor with volume inversion: A new device with greatly enhanced performances, IEEE, Elec. Dev. Lett. 8, pp. 410–412, 1987 6.47. F. Boeuf et al., 16 nm planar NMOSFET manufacturable within state-of-theart CMOS process thanks to specific design and optimization, IEDM 2001, Tech. Digest, pp. 637–640 6.48. S. Monfray et al., SON p-MOSFET with totally silicided (CoSi2 ) polysilicon on 5 nm-thick Si-films: The simplest way to integration of Metal Gates on thin FD channels, Tech. Digest, IEDM’02, pp. 263–266 6.49. S. Monfray et al., Self consistent Optimization and Performance Analysis of Double Gate MOS Transistor, Proceeding of ESSDERC 2000, pp. 337–339 6.50. K. Uchida et al., Experimental Study on Carrier Transport Mechanism in Ultrathin-Body SOI n- and p- MOSFETs With SOI Thickness less than 5 nm, IEDM 2002, Tech. Digest, pp. 47–50 6.51. D. Esseni et al., Study of Low Field transport in Ultra-Thin Single and Double gate SOI MOSFETs, IEDM 2002, Tech. Digest, pp. 719–722 6.52. A. Asenov et al., Modelling End-of-the-Roadmap Transistors, Proc. ECS Paris 2003, volume 2003-06, pp. 306–321 6.53. R.K Cavin et al., Semiconductor Research Corp., Limit to Binary Logic Switch Scaling – A Gedanken Model, to be published 6.54. S. Harrison et al., Highly performant double gate MOSFET realized with SON process, IEDM 2003 Techn. Digest, pp. 449–452 6.55. T. Park et al., Static noise margin of the full DG-CMOS SRAM Cell using bulk FinFETs (Omega MOSFETs), IEDM 2003 Techn. Digest, pp. 27–30 6.56. K. Uchida et al., Experimental study on carrier transport mechanism in ultrathin-body SOI n- and p-MOSFETs with SOI thickness less than 5 nm, IEDM 2002 Techn. Digest, pp. 47–50 6.57. K. 
Uchida et al., Experimental study on carrier transport mechanism in double- and single-gate ultrathin-body MOSFETs – Coulomb scattering, volume inversion, and δTsoi-induced scattering, IEDM 2003 Techn. Digest, pp. 805–808

7 Silicon Oxynitride Gate Dielectric for Reducing Gate Leakage and Boron Penetration Prior to High-k Gate Dielectric Implementation

H.-H. Tseng

High gate leakage current and severe boron penetration are two major problems for ultra-thin gate oxides, especially in portable applications where chip standby power consumption must be minimized. This chapter discusses three oxynitride approaches that address these two problems: (1) integrated RTCVD oxynitride (ION) fabricated in a commercially available CVD nitride system, (2) Jet Vapor Deposition (JVD) nitride, and (3) Decoupled Plasma Nitridation (DPN) oxynitride. All three reduce gate leakage and significantly increase the resistance to boron penetration. Device performance, gate dielectric reliability, and integration issues of these gate insulators, fabricated with advanced CMOS processes, are discussed. These approaches are attractive alternative gate dielectrics for advanced technologies before high-k gate dielectrics are implemented.

7.1 Introduction

In order to improve device performance, the gate oxide has been scaled aggressively in advanced technologies. Two major challenges arise as gate oxide thickness decreases: (1) the gate leakage current through the gate oxide increases significantly because of direct tunneling, and (2) boron penetration in surface-channel PMOSFETs with P+ gates increases significantly. The former increases standby power consumption. The latter causes threshold voltage shift and degrades reliability. One efficient way to reduce leakage current is to use a gate dielectric with a high dielectric constant, which provides a physically thicker film for the same electrically equivalent SiO2 thickness. Metal oxides such as HfO2 and ZrO2 have been considered as CMOS gate dielectrics to address the above-mentioned problems. Although progress has been made, it is challenging to grow an ultra-thin metal oxide whose electrical and physical properties remain stable through an entire device fabrication process without degrading device performance. Silicon nitride, on the other hand, is an attractive candidate for this purpose because it has a relatively high dielectric constant of 7.8 (about 2× that of SiO2) and has been studied extensively in the microelectronics industry. Further, silicon nitride is an efficient diffusion barrier that can minimize the boron penetration problems encountered in P+ gate integration. However, it is difficult to fabricate a pure silicon nitride film that meets device requirements using a conventional approach because of the high density of traps generated during processing. Incorporating oxygen into a silicon nitride film, thereby forming an oxynitride, is required to improve the film properties.

Oxynitride can be fabricated by the following major approaches: (1) thermal nitridation of Si followed by re-oxidation, (2) annealing of SiO2 in a nitrogen-containing ambient such as nitrous oxide (N2O), nitric oxide (NO), or NH3, (3) growth of SiO2 on nitrogen-implanted Si substrates, (4) low temperature remote plasma nitridation of SiO2, (5) atomic layer deposition (ALD) of nitride on SiO2, (6) chemical vapor deposition (CVD), and (7) Jet Vapor Deposition (JVD). Oxynitride films fabricated either by thermal oxynitridation of a base oxide in nitrous or nitric oxide [7.1–7.5] or by growth of SiO2 on nitrogen-implanted Si substrates [7.6, 7.7] have a low nitrogen concentration and a dielectric constant similar to that of SiO2. Because gate dielectrics with a high nitrogen concentration are our focus, these two approaches will not be discussed here.

Silicon nitride fabricated by direct thermal nitridation of Si in NH3 is one of the earliest methods used to incorporate a high nitrogen concentration into a gate dielectric [7.8]. This approach uses a high temperature (~1000 °C) and results in a relatively high threshold voltage. Thermal nitridation of SiO2 in NH3 was studied later to improve oxynitride film quality [7.9]. Because NH3 nitridation introduces high concentrations of hydrogen into SiO2 films, which can act as electron traps, a post-nitridation anneal in an O2 ambient is required to improve the quality. This approach has generally fallen out of favor for gate dielectrics because of process complexity and scaling difficulty.
Recently, low temperature remote plasma nitridation of SiO2 has demonstrated encouraging results in terms of gate leakage reduction and reliability improvement [7.10–7.12]. ALD nitride deposited on thermal oxide to form a stacked oxynitride has also been studied recently [7.13]. Despite the encouraging results, considerable effort is still required to adapt the tool for gate dielectric applications and to increase throughput.

Finally, high nitrogen concentration oxynitride can be processed by chemical vapor deposition (CVD) and jet vapor deposition (JVD) approaches. LPCVD nitride has a poor interface with silicon and is leaky because of a high trap density in the film, even for thicknesses down to 60 Å. A C–V result measured by Hg probe for a 60 Å LPCVD nitride used as the gate dielectric is shown in Fig. 7.1. A post-deposition anneal at 800 °C in N2 for 30 minutes was implemented to reduce the concentration of bulk traps. There is a 175 mV hysteresis in the C–V trace, indicating device instability. Therefore, conventional LPCVD nitride is not an attractive candidate for a future gate dielectric for ULSI. Recently, high quality oxynitride films with relatively high nitrogen concentration have been demonstrated by rapid thermal CVD (RTCVD) and Jet Vapor Deposition (JVD) methods [7.14–7.19].

Fig. 7.1. C–V for 60 Å LPCVD silicon nitride as gate dielectric

In this chapter, three major approaches to the fabrication of high nitrogen concentration oxynitride will be reviewed: (1) a high quality integrated RTCVD oxynitride (ION) film fabricated in a commercially available CVD nitride system, (2) a nitride deposited by the novel Jet Vapor Deposition (JVD) method, and (3) a decoupled plasma nitridation (DPN™) gate dielectric that can be engineered to reduce gate leakage while retaining excellent device performance. The devices using ION, JVD nitride, and DPN oxynitride as gate dielectrics were fabricated using deep-sub-micron CMOS technologies with polysilicon gate electrodes. Related integration issues encountered in implementing these oxynitride gate dielectrics will also be discussed. To meet the low gate leakage goal for ultra-thin gate dielectrics in advanced technologies, a balance between nitrogen concentration, oxygen addition (which impacts the dielectric constant of the oxynitride film), film thickness, gate dielectric reliability, and device performance will be considered. Physical analysis along with electrical results will be presented.
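The trade-off described in the introduction, a physically thicker film at the same electrically equivalent SiO2 thickness, follows from the parallel-plate relation EOT = t_phys × (k_SiO2 / k). The following is a minimal Python sketch of that relation, assuming ideal single-layer films, the textbook value k = 3.9 for SiO2, and the nitride value of 7.8 quoted above. The function names and the 15 Å target are illustrative and do not come from the chapter.

```python
# Sketch: relation between physical thickness and equivalent oxide thickness
# (EOT) for an ideal single-layer dielectric. Interfacial layers, quantum
# effects, and polysilicon depletion are ignored (assumption).

K_SIO2 = 3.9   # relative permittivity of SiO2 (textbook value, assumption)
K_SI3N4 = 7.8  # relative permittivity of silicon nitride quoted in the text


def physical_thickness(eot_angstrom: float, k: float) -> float:
    """Physical thickness (in angstrom) of a film with permittivity k and the given EOT."""
    return eot_angstrom * k / K_SIO2


def equivalent_oxide_thickness(t_phys_angstrom: float, k: float) -> float:
    """EOT (in angstrom) of a film with permittivity k and the given physical thickness."""
    return t_phys_angstrom * K_SIO2 / k


if __name__ == "__main__":
    eot = 15.0  # illustrative EOT target in angstrom (hypothetical)
    for name, k in (("SiO2", K_SIO2), ("Si3N4", K_SI3N4)):
        t_phys = physical_thickness(eot, k)
        print(f"{name}: k = {k}, physical thickness for {eot} A EOT = {t_phys:.1f} A")
```

Under these assumptions a film with twice the dielectric constant of SiO2 can be twice as thick physically at the same EOT, and it is this extra thickness that suppresses direct tunneling.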

7.2 Integrated RTCVD Oxynitride (ION) Process

7.2.1 Experiment

After thin thermal oxide growth, oxynitride was deposited with a commercial RTCVD silicon nitride process using NH3 and dichlorosilane (DCS) gases with N2O addition. The addition of N2O helped decrease gate leakage, probably because of a reduction of dangling bonds and release of H2 during deposition. Deposition was followed by N2O and nitrogen anneals. A clustered polysilicon deposition was used to form the gate. Although a hydrogen bake is a potentially attractive wafer pretreatment for removing native oxide before ultra-thin gate dielectric growth, we observed that the hydrogen bake resulted in a high interface state density. The charge pumping current for ION with hydrogen bake pre-treatment is more than 3 times higher than that without hydrogen bake, resulting in a significantly increased threshold voltage, as shown in Fig. 7.2. This may be associated with the formation of a terrace structure during the hydrogen bake process, as shown in Fig. 7.3. The devices reported in this section were fabricated without hydrogen bake using a 0.18 micron 1.5 V CMOS technology [7.20].

Fig. 7.2. H2 bake results in high interface state density (charge pumping current Icp (nA) versus Vh (V) for IN with H2 bake, IN without H2 bake, and control SiO2)

Fig. 7.3. Surface roughness measured by AFM. (a) w/o H2 bake (b) w/ H2 bake

7.2.2 Results and Discussion

Gate Leakage Reduction and Conduction Mechanism

The comparison of gate leakage-to-Id ratio and normalized gate leakage current versus Lpoly for 30 Å equivalent oxide thickness (EOT) ION and SiO2 is shown in Fig. 7.4. The ION dielectric shows more than two orders of magnitude reduction of Ig and of the Ig/Id ratio over the full range of Lpoly. A TOFSIMS profile of the ION film is shown in Fig. 7.5. A 200 Å polysilicon layer was deposited on the ION to prevent further oxide growth. The physical thickness of the oxynitride film is ~35 Å. The peak nitrogen concentration is around 4E21 atoms/cm3. In order to study the conduction mechanism for ION, the

Fig. 7.4. (a) Ig/Id ratio reduction (b) Ig reduction for ION at various Lpoly (panel (a): Ig/Id versus Lpoly (µm); panel (b): Ig/Cox (cm2 V/s) versus Lpoly (µm); SiO2 and ION at Vd = 1.5 V and Vg − Vt = 1.5 V)

Fig. 7.5. TOFSIMS profile for ION (concentration versus depth (nm); labeled traces O and SiN)

Fig. 7.6. No temp. dep. of Ig for ION (|Ig| at 5 MV/cm (A) versus temperature (°C) for SiO2 and ION under substrate injection, +Vg, and gate injection, −Vg)

temperature dependence of the gate current under both gate bias polarities was studied over the temperature range of 25 °C to 125 °C, as shown in Fig. 7.6. A similar weak temperature dependence was observed for both ION and SiO2, indicating that the conduction mechanism in the ION dielectric is similar to that in SiO2. This conduction mechanism is quite different from the strongly temperature-dependent trap-assisted hopping conduction observed for conventional LPCVD nitride. To determine the dominant conduction carrier in ION, a

Fig. 7.7. Charge separation measurement results for NMOSFETs (current (A) versus Vg (V) for SiO2 and ION; NMOS Lpoly = 15 µm; curves: Ig (e+h), Id+Is (e), Isub (h))

Fig. 7.8. Charge separation measurement results for PMOSFETs (current (A) versus Vg (V) for SiO2 and ION; PMOS Lpoly = 15 µm; curves: Ig (e+h), Id+Is (h), Isub (e))

carrier separation method was implemented. Electron tunneling conduction is dominant in NMOS for both ION and SiO2, as shown in Fig. 7.7. The lack of hole conduction current in NMOS is a good indication of low bulk trap density, especially in the ION dielectric. For PMOS, Fig. 7.8 shows that the hole current in ION dominates up to −3 V, while the hole current in SiO2 dominates only up to −1.8 V.

Gate Dielectric Reliability

To further compare the bulk trap densities of ION and SiO2, Fig. 7.9 shows that the stress-induced leakage current (SILC) for ION is comparable to that for SiO2 for both injection polarities. Bulk trap density was probed directly by constant-voltage stressing of MOSFETs, as shown in Fig. 7.10. These results show that the initial trap density in ION is comparable to that in SiO2. No significant trap generation during stress is observed for ION or for the oxide control. A Time-To-Breakdown (TTB) comparison is shown in Fig. 7.11. The results of constant voltage stressing show that the TTB for ION is comparable to that for SiO2. The C–V characteristics are compared in Fig. 7.12. The ION eliminates the C–V hysteresis shown by conventional hot-wall LPCVD Si3N4 and demonstrates a high level of stability, as indicated by the negligible flat-band voltage shift after high E-field stressing at 10 MV/cm for 100 seconds. The interface state density of ION is comparable to that of SiO2, as demonstrated in Fig. 7.13 by charge pumping measurements before and after high E-field stressing.

Fig. 7.9. Stress induced leakage current comparison between ION and SiO2 (Ig (A) versus Vg (V) for NMOS, Lpoly = 15 µm, stressed at 10 MV/cm, and −Ig (A) versus Vg (V) for PMOS, Lpoly = 15 µm, stressed at −10 MV/cm; stress time 100 sec, 3 stresses)

Fig. 7.10. Bulk trap comparison under constant voltage stressing

Fig. 7.11. TTB comparison after CVS

Fig. 7.12. C–V stability comparison

Fig. 7.13. Interface state density comparison (Icp (10^-11 A) versus VH (V), fresh and stressed, for SiO2 and ION; NMOS Leff = 0.18 µm; stress condition: 100 sec at 10 MV/cm)
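Charge pumping currents such as those in Fig. 7.2 and Fig. 7.13 are commonly converted to an interface trap density with the first-order relation Icp = q · f · A · Nit. The following is a minimal sketch of that conversion, not the procedure used by the author. The pulse frequency and gate area are hypothetical placeholders, since the chapter does not state them, and the current values are only of the order read from the axis of Fig. 7.2.

```python
# Sketch: first-order conversion of a charge pumping current to an interface
# trap density, N_it = I_cp / (q * f * A_gate). Frequency and gate area are
# hypothetical; they are not given in the chapter.

Q_ELECTRON = 1.602176634e-19  # elementary charge in coulombs


def interface_trap_density(i_cp_amp: float, freq_hz: float, gate_area_cm2: float) -> float:
    """Interface traps per cm^2 sampled by the charge pumping measurement."""
    if freq_hz <= 0 or gate_area_cm2 <= 0:
        raise ValueError("frequency and gate area must be positive")
    return i_cp_amp / (Q_ELECTRON * freq_hz * gate_area_cm2)


if __name__ == "__main__":
    freq_hz = 1.0e6         # hypothetical gate pulse frequency
    gate_area_cm2 = 1.0e-6  # hypothetical gate area
    for label, i_cp in (("lower Icp", 0.2e-9), ("higher Icp", 0.6e-9)):
        n_it = interface_trap_density(i_cp, freq_hz, gate_area_cm2)
        print(f"{label}: Icp = {i_cp:.1e} A -> Nit = {n_it:.2e} cm^-2")
```

At a fixed frequency and gate area the ratio of two charge pumping currents equals the ratio of the corresponding trap densities, which is why Icp curves can be compared directly between dielectrics.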

Boron Penetration Resistance

In order to study the boron penetration resistance of the ION film, an extra boron dose relative to the baseline dose level was implanted into the P+ gates of both ION and SiO2 wafers. Fig. 7.14 shows a large Vt,p shift (~250 mV) for the SiO2 samples but only a ~50 mV shift for ION. The strong boron penetration resistance of ION is also demonstrated by the PMOS DIBL characteristics, as discussed in the next section.

Fig. 7.14. PMOS Vt shift comparison with extra boron implantation dose (ΔVt shift (V) versus Lpoly (µm) for SiO2 and ION with extra boron)

Fig. 7.15. Id–Vd Characteristics for N- and P-MOSFET (ID/Cox (10^4 cm2 V/s) versus VD for ION and SiO2 at several Vg − Vt values; NMOSFET and PMOSFET, Leff = 0.18 µm)

Fig. 7.16. NMOSFET SS and DIBL (ID/Cox (cm2 V/s) versus Vg − Vt for SiO2 and ION at Vd = 1.5 V and Vd = 0.1 V; NMOS Leff = 0.18 µm; SS = 84 mV/decade)

Device Characteristics

The Id for the ION dielectric is comparable to that for thermal oxide for both N- and P-channel MOSFETs, as shown in Fig. 7.15. Well-behaved NMOSFET subthreshold slope and DIBL characteristics for both ION and SiO2 are shown in Fig. 7.16. Well-behaved PMOSFET subthreshold slope and DIBL characteristics for both ION and SiO2 with the standard boron P+ gate implantation are shown in Fig. 7.17. However, the subthreshold and DIBL characteristics for SiO2 with extra boron implantation show the impact of boron penetration, while the impact is significantly reduced for ION, confirming the stronger boron penetration resistance of ION. The power-delay product for ION is close to that for SiO2 for a 401-stage ring oscillator, as shown in Fig. 7.18.

Fig. 7.17. PMOSFET SS/DIBL comparison between ION and SiO2. ION shows a strong boron penetration resistance (−Id/Cox (V cm2/sec) versus (Vg − Vt)/Tox (MV/cm) at Vd = −1.5 V and Vd = −0.1 V for standard and extra boron implants (B II); PMOS, 15x0.25)

Fig. 7.18. Ring oscillator performance comparison between ION and SiO2 (power-delay product (J) versus Leff (µm) at Vdd = 1.5 V)

Device Reliability Under Hot Carrier Injection

The NMOS device lifetime after HCI is shown in Fig. 7.19, based on 10% Gm degradation and 100 mV Vt shift criteria, respectively. The projected lifetime for ION is comparable to that of SiO2.

ION Scaling

A well-behaved C–V for ultra-thin ION with an NMOS inversion Capacitance Equivalent Thickness (CETinv) of 23.8 Å is demonstrated in Fig. 7.20. Compared with furnace oxynitride of similar thickness, the gate leakage current for ION is about 30× lower, as shown in Fig. 7.21. The device performance for ION is comparable to that of furnace oxynitride.
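A capacitance equivalent thickness such as the CETinv quoted above is obtained by treating the measured inversion capacitance as that of an SiO2 parallel-plate capacitor, CET = 3.9 · ε0 · A / Cinv. The sketch below assumes this standard definition. The capacitor area in the usage example is hypothetical and is not the test-structure area used in the chapter.

```python
# Sketch: capacitance equivalent thickness (CET) from an inversion capacitance,
# using the SiO2 parallel-plate definition. The area below is a hypothetical
# placeholder, not a value from the chapter.

EPS0_F_PER_CM = 8.8541878128e-14  # vacuum permittivity in F/cm
K_SIO2 = 3.9                      # relative permittivity of SiO2


def cet_angstrom(c_inv_farad: float, area_cm2: float) -> float:
    """CET in angstrom for a measured inversion capacitance and capacitor area."""
    if c_inv_farad <= 0 or area_cm2 <= 0:
        raise ValueError("capacitance and area must be positive")
    thickness_cm = K_SIO2 * EPS0_F_PER_CM * area_cm2 / c_inv_farad
    return thickness_cm * 1.0e8  # 1 cm = 1e8 angstrom


def capacitance_farad(cet_a: float, area_cm2: float) -> float:
    """Inverse relation: capacitance expected for a given CET (angstrom) and area."""
    if cet_a <= 0 or area_cm2 <= 0:
        raise ValueError("CET and area must be positive")
    return K_SIO2 * EPS0_F_PER_CM * area_cm2 / (cet_a * 1.0e-8)


if __name__ == "__main__":
    area_cm2 = 1.0e-5  # hypothetical capacitor area
    c_inv = capacitance_farad(23.8, area_cm2)
    print(f"Cinv for CET = 23.8 A on {area_cm2:.1e} cm^2: {c_inv:.3e} F")
    print(f"CET recovered from that capacitance: {cet_angstrom(c_inv, area_cm2):.1f} A")
```

A CET measured in inversion is normally larger than the EOT of the same film, because the inversion capacitance also includes quantum-mechanical and polysilicon depletion contributions in series with the dielectric.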

Fig. 7.19. HCI lifetime comparison between ION and SiO2 (lifetime (s) versus 1/Vd (1/V) for the ΔGm/Gm0 = 0.1 and ΔVt = 100 mV criteria, with the 10-year level marked; NMOS Leff = 0.18 µm)

Fig. 7.20. Well behaved C–V for ultra-thin ION (capacitance versus gate voltage (V); ION: CET,inv = 23.8 Å at 1.2 V, EOT = 19.2 Å; furnace (FN) oxynitride: CET,inv = 23.2 Å, EOT = 18.6 Å)

Fig. 7.21. ION shows 30× Ig reduction (gate current versus gate voltage (V) for FN oxynitride and ION)
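Lifetime projections of the kind plotted in Fig. 7.19 are usually made by fitting the logarithm of the lifetime against 1/Vd at accelerated stress voltages and extrapolating to the 10-year level. The following is a minimal sketch of such an extrapolation, assuming the empirical form log10(lifetime) = a + b/Vd. The stress data in the usage example are synthetic placeholders, not measurements from the chapter, and the function names are illustrative.

```python
# Sketch: extrapolating hot-carrier lifetime versus 1/Vd to a 10-year target.
# Assumes log10(lifetime) = a + b / Vd; all data below are synthetic.
import math

TEN_YEARS_S = 10 * 365.25 * 24 * 3600  # ten years in seconds


def fit_log_lifetime(inv_vd, lifetime_s):
    """Least-squares fit of log10(lifetime) = a + b * (1/Vd); returns (a, b)."""
    if len(inv_vd) != len(lifetime_s) or len(inv_vd) < 2:
        raise ValueError("need at least two matching data points")
    y = [math.log10(t) for t in lifetime_s]
    n = len(inv_vd)
    mean_x = sum(inv_vd) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in inv_vd)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(inv_vd, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def max_vd_for_lifetime(a, b, target_s=TEN_YEARS_S):
    """Largest drain voltage whose fitted lifetime still reaches target_s."""
    denom = math.log10(target_s) - a
    if b <= 0 or denom <= 0:
        raise ValueError("fit does not allow extrapolation to the target")
    return b / denom


if __name__ == "__main__":
    inv_vd = [0.30, 0.40, 0.50]                     # synthetic 1/Vd points (1/V)
    lifetimes = [10 ** 2.5, 10 ** 4.0, 10 ** 5.5]   # synthetic lifetimes (s)
    a, b = fit_log_lifetime(inv_vd, lifetimes)
    print(f"fit: log10(lifetime) = {a:.2f} + {b:.2f} / Vd")
    print(f"Vd for a 10-year lifetime: {max_vd_for_lifetime(a, b):.2f} V")
```

Two dielectrics can then be compared by the drain voltage at which each fitted line crosses the 10-year level.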

Fig. 7.22. Schematic of the jet vapor deposition system (labels: He/SiH4, He/N2, plasma, jet, scanning substrate)

7.3 JVD Nitride

7.3.1 Experiment

Jet Vapor Deposition (JVD) nitride was developed at Yale University [7.15–7.17]. The nitride film is deposited in a vacuum chamber using a supersonic remote plasma jet. A diagram of the jet nozzle is shown in Fig. 7.22. It consists of two concentric quartz tubes. For a constant flow rate, the diameter of the outer nozzle defines the jet velocity. It is the supersonic velocity of the gases that provides the unique characteristics of the JVD process. For the same flow rate, larger diameter tubing reduces the jet velocity. The source gases for the nitride deposition flow separately through these nozzles, with the outer nozzle carrying the nitrogen source in the form of nitrogen gas mixed with a helium carrier gas in a 4.6% mixture, and the inner nozzle carrying silane with a He carrier gas in a 0.0004% mixture. A microwave cavity mounted around the outer nozzle is used to generate the plasma. Devices were built using a 0.35 micron technology [7.21] with shallow trench isolation, N+/P+ gates, Ti-salicide, and shallow transistor extensions with a 1400 Å nitride spacer. Wafers were delivered to Yale University for JVD nitride deposition. After the wafers were returned from Yale, an 800 °C post-deposition anneal was performed prior to depositing the polysilicon gate and completing wafer processing.

7.3.2 Results and Discussion

Gate Leakage Reduction

The physical thickness of the JVD nitride sample studied here, obtained from the TEM cross-section (Fig. 7.23), is 50 Å, while the equivalent oxide thickness (EOT) is 31 Å. The increased physical thickness, due to the high dielectric constant of nitride, resulted in 100× lower leakage current than SiO2 at

Fig. 7.23. TEM cross-section of JVD nitride (50 Å JVD nitride on Si substrate; equivalent oxide thickness = 31 Å)

Fig. 7.24. J–E comparison for JVD nitride and SiO2 (J (A/cm2) versus Eox (MV/cm) for JVD nitride and thermal oxide)

Fig. 7.25. C–V shift comparison after stressing (capacitance (F) versus Vgate (V) before and after stress; stress condition: J = −1.0 mA/cm2, time = 500 s; ΔVfb = 6.0 mV for thermal oxide and 7.0 mV for JVD nitride)

Fig. 7.26. SIMS analysis for post nitride deposition anneal (hydrogen concentration (arbitrary units) versus depth (Å), with and without post anneal)

5 MV/cm electric field, as shown in Fig. 7.24. In order to study the film stability, polysilicon gate capacitors of JVD nitride and thermal oxide fabricated with the full CMOS process were stressed using constant current injection to a 0.5 C/cm2 fluence. There is no hysteresis in the C–V measurement, and the flat-band voltage shift for JVD nitride after stressing is comparable to that of thermal oxide, as shown in Fig. 7.25. These results demonstrate that the interface stability of JVD nitride compares favorably with that of thermal oxide. Post nitride deposition annealing plays an important role in achieving a stable JVD nitride film. The results of SIMS analysis for a 55 Å thick JVD nitride film before and after an 800 °C post-deposition anneal in nitrogen are shown in Fig. 7.26. The anneal process significantly reduces both the surface and bulk hydrogen concentrations.

Boron Penetration Reduction

To investigate the resistance to boron penetration with JVD nitride, we built capacitors with standard boron or BF2 implantation for the P+ gate. The normalized high frequency C–V results from the capacitors are shown in Fig. 7.27. N+ and P+ polysilicon gate capacitors were fabricated on an N-type substrate. For the N+ gate case, JVD nitride shows a negative shift compared with thermal oxide because of the larger fixed charge contained in JVD nitride. It is well known that boron penetration causes a flat-band voltage increase for P+ gate processing. For the B-implanted P+ gate case, the flat-band voltage for thermal oxide is about 0.7 V larger than that for JVD nitride. The flat-band voltage difference is even larger for the BF2-implanted P+ gate because of fluorine-enhanced boron penetration. After compensating for the 0.2 V flat-band shift caused by the fixed charge in JVD nitride, as observed from the N+ gate, the thermal oxide still shows a much larger flat-band voltage than JVD nitride for the P+ gate case because of boron penetration. Furthermore, the negligible C–V difference between B and BF2 implanted P+ gate capacitors using JVD nitride suggests that JVD nitride has a strong resistance to fluorine-enhanced boron penetration.

Fig. 7.27. Boron penetration resistance comparison (normalized C/Cox versus VG (V) for JVD nitride and thermal oxide with N+ gate, B implant (BII = 5E15; 15 keV), and BF2 implant (BF2II = 5E15; 15 keV))

Fig. 7.28. Threshold voltage comparison (threshold voltage (V) of JVD nitride and thermal oxide for Vtn, normal Vtp, and Vtp with a high non-standard BF2 dose)

To examine the boron penetration resistance of JVD nitride in the transistor lot, the threshold voltages for thermal oxide and JVD nitride are compared in Fig. 7.28. For the N-ch MOSFET, JVD nitride shows a slightly lower Vt,n than thermal oxide because of the influence of the extra fixed charge. For the P-ch MOSFET, because a very high non-standard BF2 dose is implanted into the P+ gate, thermal oxide shows severe boron penetration effects that shift the threshold voltage from the expected value (−0.5 V) to a slightly positive value. The Vt,p for JVD nitride is strongly negative, partly because of the strong resistance to boron penetration and partly because of the non-optimal Vt,p adjustment implantation in this experiment.

MOSFET Characteristics and Stability

The N-ch MOSFET I–V characteristics are shown in Fig. 7.29, where the normalized Id of JVD nitride approaches that of thermal oxide. JVD nitride provides a well-behaved P-ch MOSFET I–V characteristic, as demonstrated in Fig. 7.30. The current drive is relatively low, which we postulate to be due to the high Vt,p caused by fixed charge and Dit. However, this problem can be addressed by in-situ N2O or NO plasma pre-treatment before JVD nitride deposition. In order to demonstrate the stability of the device, transistor I–V characteristics before and after constant E-field (10 MV/cm) stressing were measured for the N-ch MOSFET, as shown in Fig. 7.31. The JVD nitride sample shows a small I–V shift comparable to that of thermal oxide, indicating excellent transistor stability. The product of power and stage delay for a CMOS ring oscillator with 125 inverter stages using JVD nitride is compared with that of a CMOS ring oscillator using thermal oxide in Fig. 7.32. The fact that a functional ring oscillator is achieved with JVD nitride suggests that there is no fundamental obstacle to the use of JVD nitride in advanced technologies.

Fig. 7.29. NMOSFET I–V characteristics (Idn/Cox (10^4 cm2 V/s) versus Vdn (V) for JVD nitride and thermal oxide at several Vg − Vt values; W/L = 25/0.4)

Fig. 7.30. PMOSFET I–V characteristics (Idp/Cox (10^4 cm2 V/s) versus Vdp at several Vg − Vt values; W/L = 25/0.4)

Fig. 7.31. NMOSFET stability of (a) thermal oxide and (b) JVD nitride (drain current (mA) versus drain voltage (V) before and after stress at VG − VT = 0.5, 1.0, 1.5, and 2.0; stress condition: 10 MV/cm for 500 sec)

Fig. 7.32. CMOS ring oscillator results for JVD nitride (power × stage delay (fJ) for JVD nitride and thermal oxide; Vdd = 1.8 V, 125 inverter stages)
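The ring-oscillator figure of merit used in this chapter, power times stage delay, can be derived from a measured oscillation frequency and supply power. The following is a minimal sketch, assuming the usual relation t_stage = 1/(2 · N · f) for a ring of N inverting stages and taking "power" as the average power per stage. The chapter does not define the quantity in that detail, so this is an assumption, and the frequency and power values are hypothetical.

```python
# Sketch: stage delay and power-delay product of an N-stage ring oscillator.
# Assumes t_stage = 1 / (2 * N * f_osc) and per-stage power = total power / N.
# The numbers in the usage example are hypothetical, not data from the chapter.


def stage_delay_s(n_stages: int, f_osc_hz: float) -> float:
    """Average propagation delay per stage of a ring oscillator."""
    if n_stages < 1 or n_stages % 2 == 0:
        raise ValueError("a ring of inverters needs an odd number of stages")
    if f_osc_hz <= 0:
        raise ValueError("oscillation frequency must be positive")
    return 1.0 / (2.0 * n_stages * f_osc_hz)


def power_delay_product_j(total_power_w: float, n_stages: int, f_osc_hz: float) -> float:
    """Per-stage power multiplied by stage delay, in joules."""
    return (total_power_w / n_stages) * stage_delay_s(n_stages, f_osc_hz)


if __name__ == "__main__":
    n_stages = 125       # stage count quoted for the JVD nitride oscillator
    f_osc_hz = 100.0e6   # hypothetical oscillation frequency
    total_power_w = 0.1  # hypothetical total supply power
    delay = stage_delay_s(n_stages, f_osc_hz)
    pdp = power_delay_product_j(total_power_w, n_stages, f_osc_hz)
    print(f"stage delay = {delay * 1e12:.1f} ps, power x delay = {pdp * 1e15:.1f} fJ")
```

Because the product scales with the switched capacitance and the square of the supply voltage, comparable values for two gate dielectrics at the same Vdd suggest comparable capacitive loading and drive.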

7.4 DPN Oxynitride

7.4.1 Experiment

A high quality ultra-thin DPN™ (decoupled plasma nitridation) film was fabricated in a commercially available nitride system that is compatible with standard CMOS processing technology. The key features of the DPN™ chamber are illustrated in Fig. 7.33. Although both DPN and RPN (Remote Plasma Nitridation) use plasma nitridation to nitridize the bottom oxide, DPN results in better across-wafer nitrogen uniformity and is more scalable than the RPN approach. After thin thermal oxide growth, DPN was implemented with different processing conditions, followed by a post anneal. The ultra-thin DPN with 15 Å EOT (23 Å NMOS CETinv) was compared with a high quality oxynitride processed by rapid thermal oxidation followed by an NO gas anneal (RTONO). The EOT of the RTONO is 15 Å. The devices reported in this section were fabricated using a 0.13 micron 1.2 V high performance CMOS technology [7.22]. The DPN films for TOFSIMS analyses were capped in situ with a 10 nm poly layer in order to evaluate the N concentration levels among different processes. The TOFSIMS depth profiles were acquired with a 25 keV Ga analytical ion pulse interleaved with a 1 keV Cs sputter ion pulse. The quantification was calibrated with an in-house oxynitride standard.

7.4.2 Results and Discussion

Impact of Process Conditions on Vt Shift and Peak Gm

It is not difficult to reduce gate leakage by adding nitrogen to the gate dielectric. It is difficult, however, to balance gate leakage and device performance if too much nitrogen is incorporated into the gate dielectric. A 50 to 100× gate leakage reduction using oxynitride usually results in severe mobility degradation and Vt shift. In this section, an optimal nitrogen concentration for DPN processing will be discussed that shows minimum mobility degradation and Vt shift while retaining the gate leakage reduction advantage. The C–V comparison for RTONO and different DPN processes, measured on 10 × 10 transistors for N- and P-MOSFETs, is shown in Fig. 7.34a and Fig. 7.34b, respectively. These devices have a capacitance equivalent thickness (CET,inv) of around 23 Å (corresponding to a 15 Å equivalent oxide thickness), measured on N-channel transistors in inversion mode at Vdd. The reference oxynitride used here is grown by a rapid thermal process followed by a nitric oxide anneal (RTONO). The RTONO oxynitride was used to fabricate high performance devices, as shown in [7.6]. The C–Vs for oxynitride and high pressure DPN films shown in Fig. 7.34a are almost identical. However, a Vt shift is observed for DPN processes with lower pressure. The process with low pressure and shorter time falls in between. The impact of DPN pressure on PMOSFET Vt is even more pronounced than on NMOSFET Vt, as shown in Fig. 7.34b. The Vt

RF Match

Wafer Quasi-Remote Plasma Source

Throttle/Gate Valve

Cathode Turbo Pump

Fig. 7.33. DPN chamber description

7 Silicon Oxynitride Gate Dielectric

Fig. 7.34. (a) NMOSFET C–V comparison, (b) PMOSFET C–V comparison. [Both panels: capacitance (F) versus gate voltage (V), −1.2 V to 1.2 V; curves: oxynitride, DPN 70 mTorr 35", DPN 20 mTorr 22", DPN 20 mTorr 35"; area = 2.8 × 10⁻⁵ cm² as labeled on panel b]

Fig. 7.35. (a) NMOSFET Gm comparison, (b) PMOSFET Gm comparison. [Both panels: horizontal axis (Vg − Vt)/CET,inv; curves: RTONO, DPN 70 mTorr 35", DPN 20 mTorr 22", DPN 20 mTorr 35"]

shift relative to oxynitride for the high-pressure DPN process is 50% smaller than that for the low-pressure process. The comparison of normalized peak Gm between the high-pressure DPN process and RTONO for N- and P-MOSFETs is shown in Fig. 7.35a and Fig. 7.35b, respectively. The peak Gm for DPN is close to that of RTONO, while the Gm at high E-field is higher for DPN for both N- and P-MOS devices,

Fig. 7.36. TOFSIMS analysis of different DPN processes. [Overlay of N depth profiles: concentration (atoms/cm3), 10^18 to 10^22, versus depth (nm); curves: DPN 20 mTorr 22", DPN 20 mTorr 35", DPN 70 mTorr 35"; regions labeled Poly-Si and DPN]

which is not achieved by low-pressure DPN. The sharp reduction of Gm for RTONO at high E-field is partially due to high gate leakage that causes inversion charge loss. The C–V and peak Gm results suggest that there is an optimal nitrogen concentration in the film at which the peak Gm degradation and Vt shift are minimized. The optimal concentration can be adjusted by DPN pressure. TOFSIMS results shown in Fig. 7.36 demonstrate that the nitrogen concentration for high-pressure DPN is 50% lower than that for the low-pressure DPN process, which is consistent with the Vt shift trends observed. The collision probability and recombination rate of active nitrogen are much higher at a high DPN process pressure than at a low one. The resulting lower active nitrogen plasma density leads to a lower nitrogen concentration in the gate dielectric. This is a possible explanation consistent with the electrical and physical results.

Device Characteristics and Gate Leakage Reduction of Optimal DPN

An excellent NMOSFET subthreshold swing (SS) for high-pressure DPN is shown in Fig. 7.37; it is very close to that of RTONO. A 5× gate leakage reduction at 1 V Vox (= Vg − Vt) for the DPN oxynitride with the optimal nitrogen concentration, as compared to RTONO, is demonstrated for N- and P-MOSFETs in Fig. 7.38a and Fig. 7.38b, respectively. The gate leakage current distribution for both injection polarities with a large sample size is extremely tight, indicating very high dielectric thickness uniformity. The

Fig. 7.37. NMOSFET SS characteristics. [Drain current (A), 10^-9 to 10^-1, versus gate voltage (V), −0.5 V to 2.5 V, at Vd = 0.1 V and Vd = 1.2 V; solid: DPN, dashed: oxynitride; S.S. ~77 mV/dec for DPN]

Fig. 7.38. Tight Ig distribution of DPN. 5× Ig reduction advantage observed for DPN while retaining minimum Gm peak reduction and Vt shift

I–V curves were measured on transistors instead of capacitors to include potential process-induced damage effects typically incurred during CMOS transistor formation.

Reliability of Optimal DPN

Negligible I–V and C–V shifts for both high-pressure DPN and RTONO oxynitride under the same stressing conditions are shown in Figs. 7.39 and 7.40. I–V curves measured at temperatures ranging from 25 °C to 105 °C for high-pressure DPN for both substrate and gate injection polarities are shown in Fig. 7.41. The negligible shift indicates a small trap density in the film and that the conduction mechanism is F–N tunneling. The reliability of thin gate oxide is a big challenge, especially for PMOS devices, where boron incorporation in oxynitride is problematic. Although

Fig. 7.39. Negligible I–V shift after CVS. [Gate current (A) versus gate voltage (V), 0 to 3 V; NMOSFET; curves: oxynitride, DPN; stress condition: Vg = 2.3 V, T = 105 °C]

Fig. 7.40. Negligible C–V shift after CVS. [Capacitance (F) versus gate voltage (V), −1.2 V to 1.2 V; NMOSFET; curves: oxynitride and DPN; stress condition: Vg = 2.3 V, T = 105 °C]

the nitrogen concentration for high-pressure DPN is lower than that for low-pressure DPN, the nitrogen profile shows a peak close to the poly-Si/DPN interface (see Fig. 7.36), which can block boron penetration into the DPN film. Pile-up of nitrogen at the top interface is more effective at minimizing boron diffusion into the gate dielectric than pile-up at the bottom interface. To illustrate this point, Fig. 7.42 shows the SIMS profiles for high-pressure DPN and RTONO devices that were processed to the metal-one level. The SIMS analysis was done after de-processing the devices. The boron concentration at the poly-Si/DPN interface is higher than that at the poly-Si/RTONO interface, indicating a stronger boron diffusion resistance for DPN as a result of nitrogen pile-up at the top interface. In addition, the peak nitrogen of the RTONO oxynitride is shown to be close to the RTONO/Si bottom interface. Consequently, the boron penetration into the Si substrate for DPN is less than that for RTONO. These results suggest less boron incorporation in the DPN film compared to RTONO. Note that the N peaks and the B pile-up spike are broad as a result of unevenness of the sample surfaces after de-processing. The comparison of mean-time-to-failure (MTTF) of time-to-breakdown (TTB)

Fig. 7.41. Tunneling conduction mechanism demonstrated for DPN. [Two panels, NMOSFET (gate voltage 0 to 4 V) and PMOSFET (gate voltage 0 to −4 V); curves at 25 °C and 125 °C; logarithmic vertical axis from 10^-12 to 0.01; area = 2.8 × 10⁻⁵ cm²]

Fig. 7.42. SIMS analysis for high-pressure DPN and RTONO devices that were processed to metal one. SIMS analysis of boron and nitrogen profiles was done after de-processing the devices. Note that the N peaks and the B pile-up spike are broad as a result of unevenness of the sample surfaces after de-processing. [Atomic concentration (atoms/cm3) and intensity (counts/second) versus depth (nm), 80 to 180 nm; B and N profiles for RTONO and DPN across the poly-Si, gate dielectric, and Si regions; B penetration tail marked]

Fig. 7.43. Mean-time-to-failure (MTTF) of time to breakdown (TBD) comparison for packaged PMOSFET devices between high pressure DPN and RTONO oxynitride. [Lifetime (h), 1 to 10^4, versus voltage, 2.1 to 2.5; PMOS inversion; curves: DPN 70 mTorr 35", RTONO]

for packaged PMOSFET devices under constant-voltage stressing in inversion mode at 105 °C is shown in Fig. 7.43. The high-pressure DPN gate dielectric results in a significantly longer TTB (∼30×) than RTONO. The same trend is also observed for constant-voltage stressing in accumulation mode. It is speculated that the TTB improvement is attributable to fewer boron-induced traps in the DPN film, as compared to RTONO.
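As a rough consistency check on the numbers quoted in this section, the sketch below converts the 23 Å CET,inv into an inversion capacitance for the 2.8 × 10⁻⁵ cm² device area printed on the C–V plots, and compares the ∼77 mV/dec subthreshold swing of Fig. 7.37 with the thermal limit ln(10)·kT/q. The SiO2 relative permittivity of 3.9, the 300 K temperature, and all function names are assumptions made for illustration; they are not taken from the chapter.

```python
import math

EPS0_F_PER_CM = 8.8541878128e-14  # vacuum permittivity, F/cm
K_SIO2 = 3.9                      # assumed relative permittivity of SiO2
K_BOLTZMANN = 1.380649e-23        # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19      # elementary charge, C


def inversion_capacitance(cet_angstrom, area_cm2):
    """Capacitance (F) of an SiO2-equivalent layer of thickness CET over the given area."""
    thickness_cm = cet_angstrom * 1e-8  # 1 Å = 1e-8 cm
    return EPS0_F_PER_CM * K_SIO2 / thickness_cm * area_cm2


def thermal_swing_mv_per_decade(temperature_k=300.0):
    """Ideal subthreshold swing ln(10)*kT/q in mV/decade (300 K is an assumption)."""
    return math.log(10.0) * K_BOLTZMANN * temperature_k / Q_ELECTRON * 1e3


c_inv = inversion_capacitance(23.0, 2.8e-5)
ideal_ss = thermal_swing_mv_per_decade()
swing_ratio = 77.0 / ideal_ss  # quoted swing divided by the thermal limit

print(f"C_inv ~ {c_inv:.2e} F")
print(f"thermal limit ~ {ideal_ss:.1f} mV/dec, swing ratio ~ {swing_ratio:.2f}")
```

With these assumptions the capacitance comes out near 4 × 10⁻¹¹ F, in line with the vertical scale of Fig. 7.34, and the quoted swing is roughly 1.3 times the thermal limit.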

7.5 Conclusion

Integrated oxynitride (ION), JVD nitride, and DPN oxynitride can address the two major challenges faced by aggressive gate oxide scaling for advanced technology. The gate leakage current can be reduced significantly for the same equivalent oxide thickness compared to thermally grown SiO2 or conventional oxynitride. Strong resistance to boron penetration, a robust interface, and well-behaved transistor characteristics are demonstrated. Reliability results are encouraging. These three approaches are attractive alternative gate dielectrics for advanced technology before high-k gate dielectrics are implemented.

Acknowledgments. The author would like to thank P.J. Tobin, J. Mogab, V. Wang, D. O’Meara, Y.J. Jeon, P. Abramowitz, J.J. Lee, J. Jiang, L. Hebert, R. Cotton, J. Conner, M. Moosa, T.-Y. Luo, J. Alvis, R. Hegde, G. Yeap, T. P. Ma*, T. C. Chua**, A. Hegedus**, G. Miner**, J. Jeon***, and A. Sultan*** for discussions. The managerial support from D. Sieloff, S. Anderson, and B. Melnick and the wafer processing by the APRDL Pilot Line are appreciated. (* Yale University, ** Applied Materials, *** AMD)


References

7.1. Hwang H; Ting W; Kwong D; Lee J (1990) Electrical and reliability characteristics of ultrathin oxynitride gate dielectric prepared by rapid thermal processing in N2O. IEDM Technical Digest, pp. 421–424
7.2. Hayashi T; Ohno M; Uchiyama A; Fukuda H; Iwabuchi T; Ohno S (1991) Effectiveness of N2O-nitrided gate oxide for high performance CMOSFETs. IEEE Transactions on Electron Devices 38, p. 2711
7.3. Tseng HH; Tobin PJ (1993) Thin CVD stacked gate dielectric for ULSI technology. IEDM Technical Digest, pp. 321–324
7.4. Okada Y; Tobin PJ; Rushbrook P; DeHart WL (1994) The performance and reliability of 0.4 micron MOSFET’s with gate oxynitrides grown by rapid thermal processing using mixtures of N2O and O2. IEEE Transactions on Electron Devices 41, Issue 2, pp. 191–197
7.5. Maiti B; Tobin PJ; Misra V; Hegde R; Reid KG; Gelatos C (1997) High performance 20 Å NO oxynitride for gate dielectric in deep sub-quarter micron CMOS technology. IEDM Technical Digest, pp. 651–654
7.6. Liu CT; Lloyd EJ; Ma Y; Du M; Opila RL; Hillenius SJ (1996) High performance 0.2 µm CMOS with 25 Å gate oxide grown on nitrogen implanted Si substrates. IEDM Technical Digest, pp. 499–502
7.7. Han LK; Crowder S; Hargrove M; Wu E; Lo SH; Guarin F; Crabb E; Su L (1997) Electrical characteristics and reliability of sub-3 nm gate oxides grown on nitrogen implanted silicon substrates. IEDM Technical Digest, pp. 643–646
7.8. Ito T; Nozaki T; Arakawa H; Shinoda M (1979) Thermally grown silicon nitride films for high-performance MNS devices. Appl Phys Lett 32, p. 330
7.9. Gross BJ; Krisch KS; Sodini CG (1991) An optimized 850 degrees C low-pressure-furnace reoxidized nitrided oxide (ROXNOX) process. IEEE Transactions on Electron Devices 38, pp. 2036–2041
7.10. Yang H; Lucovsky G (1999) Integration of ultrathin (1.6 ∼ 2.0 nm) RPECVD oxynitride gate dielectrics into dual poly-Si gate submicron CMOSFETs. IEDM Technical Digest, pp. 245–248
7.11. Rodder M; Hattangady S; Yu N; Shiau W; Nicollian P; Laaksonen T; Chao C; Mehrotra M; Lee C; Murtaza S; Aur S (1998) A 1.2 V, 0.1 micron gate length CMOS technology: design and process issues. IEDM Technical Digest, pp. 623–626
7.12. Nicollian PE; Baldwin GC; Eason KN; Grider DT; Hattangady SV; Hu JC; Hunter WR; Rodder M; Rotondaro ALP (2000) Extending the reliability scaling limit of SiO2 through plasma nitridation. IEDM Technical Digest, pp. 545–548
7.13. Nakajima A; Khosru QDM; Yoshimoto T; Kidera T; Yokoyama S (2001) Soft breakdown free atomic-layer-deposited silicon-nitride/SiO2 stack gate dielectrics. IEDM Technical Digest, pp. 133–136
7.14. Tseng HH; O’Meara D; Tobin PJ; Wang V; Guo X; Hegde R; Yang I; Gilbert P; Cotton R; Hebert L (1998) Reduced gate leakage current and boron penetration of 0.18 µm 1.5 V MOSFETs using integrated RTCVD oxynitride gate dielectric. IEDM Technical Digest, pp. 793–796
7.15. Wang D; Ma TP; Golz J; Halpern B; Schmitt JJ (1992) High quality MNS capacitors prepared by jet vapor deposition at room temperature. IEEE Electron Device Lett, pp. 482–484
7.16. Wang XW; Shi Y; Ma TP; Cui GJ; Tamagawa T; Golz J; Halpern BL; Schmitt JJ (1995) Extending gate dielectric scaling limit by use of nitride or oxynitride. VLSI Technology Digest of Technical Papers, pp. 109–110
7.17. Ma TP (1998) Making silicon nitride film a viable gate dielectric. IEEE Transactions on Electron Devices 45, Issue 3, pp. 680–690
7.18. Tseng HH; Tsui PGY; Tobin PJ; Mogab J; Khare M; Wang XW; Ma TP; Hegde R; Hobbs C; Veteran J; Hartig M; Kenig G; Wang V; Blumenthal R; Cotton R; Kaushik V; Tamagawa T; Halpern BL; Cui GJ; Schmitt JJ (1997) Application of JVD nitride gate dielectric to a 0.35 micron CMOS process for reduction of gate leakage current and boron penetration. IEDM Technical Digest, pp. 647–650
7.19. Song SC; Luan HF; Chen YY; Gardner M; Fulford J; Allen M; Kwong DL (1998) Ultra thin (< 20 Å) CVD Si3N4 gate dielectric for deep-sub-micron CMOS devices. IEDM Technical Digest, pp. 373–376
7.20. Yang IY; Gilbert P; Pettinato C; Anderson SGH; Woodruff R; Misra V; Bhat N; Reid K; Lii T; Yuan C; Dyer D; O’Meara D; Collins S; De H; Veeraraghavan S (1998) Optimization of a 0.18 µm 1.5 V CMOS technology to achieve 15 ps gate delay. VLSI Technology Digest of Technical Papers, pp. 148–149
7.21. Tsui PGY; Tseng HH; Orlowski M; Sun SW; Tobin PJ; Reid K; Taylor WJ (1994) Suppression of MOSFET reverse short channel effect by N2O gate poly reoxidation process. IEDM Technical Digest, pp. 501–504
7.22. Perera AH; Smith B; Cave N; Sureddin M; Chheda S; Islam R; Chang J; Song SC; Sultan A; Croen S; Kolagunta V; Shah S; Celik M; Wu D; Yu KC; Fox R; Park S; Simpson C; Eades D; Gonzalea S; Thomas C; Sturtevant J (2000) A versatile 0.13 µm CMOS platform technology supporting high performance and low power applications. IEDM Technical Digest, pp. 571–574

Part III

Transition to High-k Gate Dielectrics

8 Alternative Dielectrics for Silicon-Based Transistors: Selection Via Multiple Criteria J.-P. Maria

To follow forecasts of the National Technology Roadmap for Semiconductors regarding the performance of integrated circuits (both DRAM and logic) for the year 2011, it appears likely that dielectrics other than SiO2 will be required [8.1, 8.2]. These dielectrics must provide the capacitance density equivalent to 10 Å of SiO2, a thickness at which quantum mechanical tunneling yields an unacceptably large leakage current density. The urgency and importance of this issue demand that candidate materials be chosen through a systematic approach that considers potentially important criteria, so that experimental efforts are not spent on inappropriate selections. We present here a list of attractive gate-oxide compositions for this application. To accomplish this, a hierarchical set of selection criteria was established and used to estimate the appropriateness of available materials. In general, thermodynamic, dielectric, electronic, and technological issues were evaluated, and a set of materials small enough to be experimentally screened was constructed. This approach and the subsequent list are necessary because insufficient data (or theoretical understanding) exist to make an outright singular selection. This list will, in turn, be useful for directing ongoing experimental efforts towards a gate-oxide replacement.

8.1 Introduction

Since 1992, the Semiconductor Industry Association (SIA) has evaluated the industry’s state of the art and established a “roadmap” to follow so that future performance demands and expectations could be predicted and satisfied. The roadmap traditionally considers device capabilities, incorporated materials, design criteria, and the industry’s associated hardware. The forecasts are derived, in general, such that the growth rate observed during previous years is maintained [8.1]. The most recent roadmaps have been unique in that certain performance goals appear unachievable given the current complement of materials and technologies, i.e., through continued evolutionary process refinements. One approach to meeting these goals will involve an unprecedented addition of new chemistries and processing methods. The focus of this discussion is the 65 nm generation (with physical gate lengths of 25 nm), tentatively scheduled for the year 2007 [8.2]. In general, the goals are driven by the need for reduced manufacturing cost, reduced size, and increased processing power. Historically, these criteria could be satisfied by reducing the transistor dimensions while simultaneously increasing the starting size of silicon wafers. The 2010 45 nm technology milestones include, most importantly for this study, transistors with 18 nm physical gate lengths. To maintain correct transistor operation, a specific level of capacitance density is required in this tiny gate capacitor. Achieving this capacitance density (for carrier accumulation and inversion in the transistor gate) when limited to these small dimensions requires that the SiO2 layer thickness be no greater than 10 Å. At this thickness, quantum mechanical electron tunneling is prevalent, and leakage currents cannot be maintained below the proposed maximum limit of 1 A/cm2 (for high performance processors) [8.3–8.5]. This limitation appears to be of theoretical origin (i.e., it cannot be relieved by continued process improvement); thus the solution demands the introduction of a new higher-permittivity gate dielectric composition, or possibly a transistor with an alternative architecture. One must keep in mind that the strongest motivations for continued device scaling are economic. Thus, the solutions for scaling issues must be cost effective and appropriate for a manufacturing environment. If such criteria cannot be met, other approaches such as alternative designs may prove more attractive. In addition to the requirements on gate oxides, these smaller transistor dimensions have implications for nearly all other aspects of device fabrication and characterization. For example, the improved transistor operation may only be achieved in the presence of metallic gate electrodes (highly doped poly-Si is currently in use), which will not become depleted under bias [8.6].
Additionally, new device patterning techniques will be required; using through-the-lens optical methods, dimensions of approximately 130 nm may be the lower limit, since there are no readily available materials that are sufficiently transparent to radiation far below 150 nm in wavelength. Moreover, new and improved methods and instrumentation must be developed for accurate device and process characterization. With these reduced dimensions, the size of contaminant particles leading to device failure falls below diameters that can be detected optically, which is one method currently used for contaminant particle detection [8.1]. Finally, at these smaller dimensions, clock speeds will surpass the low GHz regime; thus dielectric characterization will need to be performed at microwave frequencies. Similar additional requirements will exist for packaging and the associated discrete components. Impedance matching of this system and its power supplies will be a non-trivial task. The challenges associated with alternative gate dielectrics are currently viewed as the most difficult to overcome; however, it is important to remember that any of the other issues could become the rate-limiting step to further device scaling. As such, it is necessary to propose solutions to any single problem within the context of the related integration process.

Fig. 8.1. A schematic illustration of expected issues which will be encountered during investigations of alternative gate dielectrics. Potential issues (indicated by numbers) include 1) oxygen transport from the gate dielectric to the interface, 2) diffusion of Si from the substrate into the gate dielectric, 3) oxygen transport from the atmosphere, through the gate dielectric, into the Si transistor, and 4) diffusion of metal cations from the gate dielectric into Si. Also pictured are interfacial difficulties such as achieving the appropriate band line-up and accommodation of unsatisfied bonds at the Si/dielectric interface. [Schematic labels: MOx, Si, Si (001), O, Mn+, Sr2+, Si4+, CB, VB, band line-up?, Si unsatisfied bonds?, defect site formation, R = 1.1 Å, R = 0.26 Å]

In 45 nm technology, the replacement gate oxide must exhibit a capacitance density equivalent to or greater than that of a 10 Å layer of SiO2 and a leakage current density below 1 A/cm2. The gate oxide and its associated process must be compatible with existing fabrication methods, and cannot interact with underlying semiconductor junctions. Other researchers have addressed the problem, most commonly leveraging the knowledge base developed for Gbit embedded DRAM development. The most prominent difference between these systems is the capacitor configuration: DRAM operates in the MIM (metal-insulator-metal) configuration, while gate oxides require the MIS (metal-insulator-semiconductor) arrangement. That the capacitor must sit directly over, and in intimate contact with, the Si transistor presents multiple difficulties involving the capacitor and the underlying transistor gate. With these concepts in mind, selecting a material with a high permittivity alone will be


insufficient; thus information leveraged from high-K DRAM investigations may be of limited utility. The ultimate goal of this investigation is to produce a satisfactory replacement gate dielectric for 10 Å tox (where tox refers to the SiO2 thickness that provides the same capacitance density as the dielectric at its physical thickness) and beyond. The first stage in this process is, however, to identify which candidate materials possess the greatest possibility for success. To accomplish this we have established a set of criteria through which attractive choices can be selected and unsuitable choices excluded. Other authors have addressed these ideas; thus several discussions are founded on these previous reports. We present an extension of these previous works in which a greater range of criteria is considered and the selections are tailored to the specific application of an MIS capacitor. Figure 8.1 graphically illustrates an alternative gate dielectric on Si. Several potentially serious issues associated with this geometry are identified. In general, the selection process is designed to provide materials for which these problems can be minimized or avoided. The selection criteria are therefore based upon the materials science fundamentals which relate most strongly to the issues indicated in Fig. 8.1. The areas of concern discussed below include thermodynamically evaluated compatibility, process flow considerations, dielectric properties, electronic structure, and defect chemistry. The appropriateness of alternative gate dielectrics is ultimately dependent upon the boundary conditions established by transistor synthesis conditions (especially temperature excursions in the back-end process). Two sets of selection criteria will therefore be required, so that transistor fabrication using both standard and replacement gate-stack process flows can be accommodated.

8.2 Discussion

8.2.1 Development of Selection Criteria

Thermodynamic Evaluation

Gate oxides must sit directly on Si; thus reaction, mixing, or interlayer formation during deposition or subsequent processing at elevated temperatures is of primary concern. Our initial consideration, therefore, involves the thermodynamic stability of oxides in contact with silicon. In general, the ideal gate dielectric would be thermodynamically stable while in contact with silicon; this being the case, driving forces for the issues depicted in Fig. 8.1 would not exist. The approach to determining which materials satisfy this requirement is based primarily on the numerical investigations of Hubbard and Schlom, Schlom et al., and Fork et al. [8.7–8.9]. In the most comprehensive study for the model system of a single-component oxide in contact with Si at 1000 K, Hubbard and Schlom evaluated the free energies of reactions producing known oxide or silicide products. Using this approach, metal oxides were


Table 8.1. Elements expected to produce single-component oxides with thermodynamic stability in contact with silicon at 1000 K. The table also includes reference data pertinent to the selection criteria

Element  Atomic  Common   Oxide   Ionic                Calculated    Calculated     Comments
         number  valence          polarizability (Å3)  permittivity  band gap (eV)
Li       3       1+       Li2O    1.2                  11            –
Be       4       2+       BeO     0.19                 8             –
Mg       12      2+       MgO     1.32                 10            6.3            epi-growth [8.53, 8.54]
Al       13      3+       Al2O3   0.79                 10            5.8            epi-growth [8.40, 8.55]
Si       14      4+       SiO2    0.87                 –             4.7
Ca       20      2+       CaO     3.16                 12            6.6            epi-growth
Sc       21      3+       Sc2O3   2.81                 12            6.6
Sr       38      2+       SrO     4.24                 11            6.1            epi-growth [8.30, 8.56]
Y        39      3+       Y2O3    3.81                 14            6.7            epi-growth [8.39, 8.42–8.44]
Zr       40      4+       ZrO2    3.25                 20            5.7            epi-growth [8.44, 8.54, 8.57]
Ba       56      2+       BaO     6.40                 16            5.7            epi-growth [8.18, 8.29]
La       57      3+       La2O3   6.07                 38            6.2
Hf       72      4+       HfO2    3.25*                25            5.8            epi-growth [8.57]
Pr       59      3+       Pr2O3   5.32                 27            6.3            epi-growth [8.36, 8.58]
Sm       62      3+       Sm2O3   4.74                 15            –
Eu       63      3+       Eu2O3   4.53                 15            6.0
Gd       64      3+       Gd2O3   4.37                 19            6.2            epi-growth [8.9]
Tb       65      3+       Tb2O3   4.25                 12            6.6
Dy       66      3+       Dy2O3   4.07                 14            –
Ho       67      3+       Ho2O3   3.97                 12            –
Yb       70      3+       Yb2O3   3.58                 13            –

* The polarizability value for Hf was assumed equal to that of Zr. This assumption was made following Shannon’s r3 law, where ionic polarizability tracks with the cube of the ionic radius. Zr and Hf have nearly identical ionic radii [8.59].

characterized as thermodynamically stable or unstable in contact with silicon. Table 8.1 lists the abbreviated Hubbard-Schlom results as they pertain to this investigation.¹ This list includes the elements producing oxides with calculated stability, experimentally demonstrated stability, and predicted stability in the absence of the entire complement of thermodynamic reference data. In considering these data, it is necessary to remember that reference thermodynamic data (for each metal cation) for all possible reactions are not available, and the available data are of finite accuracy. In addition, the thermodynamic calculations correspond to equilibrium conditions. Because of this uncertainty, not all materials that fail to satisfy the thermodynamic criteria will be excluded from consideration, especially those which offer particularly attractive dielectric properties and which are energetically very close to predicted stability. Similarly, it will not be unexpected for materials with predicted stability to exhibit uncontrollable reactions in practice. From a chemical compatibility standpoint, the identified metal cations are expected

¹ It should be noted that the list presented here is one interpretation of the Hubbard-Schlom investigation appropriate when considered within the confines of this study. It is recommended that the original reference be reviewed for the most accurate representation.


to produce satisfactory gate oxides. Note that this list contains materials for which reactions with Si are not expected. There are other dielectrics, TiO2 for instance, which have higher permittivity values but react strongly with Si. Though not considered in this study, these materials may have a role in future technology generations if processing conditions can be modified to yield kinetic conditions limiting unwanted reactions. This approach is designed to select a material resistant to reactions producing appreciable quantities of low-permittivity silicates, silica, or conducting silicides during deposition or post-deposition heat treatments. For completeness, the crystal structure of these materials must be considered. If certain oxides are deposited epitaxially, their structural match to Si (001) is potentially important. Precise atomic registry across the Si/gate-oxide interface can help to stabilize the system against nucleation of reaction products; thus materials satisfying this criterion may be especially attractive. Table 8.1 also indicates the unary oxides for which epitaxial deposition on silicon has been demonstrated. In view of these results, we conclude that high-K gate oxides will most likely contain one or more elements in this list, which will provide chemical boundary conditions for the selection process.

Process-Flow Options

The selection process is tailored foremost towards compatibility with the standard transistor process flow. A consequence of this process is the high temperature exposure during the back-end fabrication steps. After gate-oxide deposition, the transistor stack must be exposed to N2 or mildly oxidizing anneals between 900 °C and 1000 °C (typically 1000 °C today) to achieve the appropriate dopant profile in the transistor source and drain. The gate oxide must be able to survive this treatment without formation of low-dielectric-constant interfaces or metal silicides, or changes in the high-K microstructure.
For example, if the gate oxide is deposited amorphous, and intended for use as such, it must be able to resist devitrification during these thermal cycles. Assuming that the dielectric constant of an amorphous oxide is similar to that of its crystalline analog (an assumption which is true for many, but not all, oxides presented), the non-crystalline solid will always be preferable given the absence of microstructural features like facets and grain boundaries. This high temperature back-end processing step strongly limits the number of alternative metal oxides. The replacement gate-stack process flow is an alternative fabrication route in which high temperature dopant diffusion anneals are performed prior to high-K deposition. Consequently, the high temperature exposure of the gate dielectric may be limited to ∼550 °C; these lower temperatures may allow the introduction of additional metal oxides [8.10, 8.11]. The thermodynamic stability calculations performed by Hubbard and Schlom simulated 1000 K processing; thus one can expect that under lower temperature conditions, additional oxides may be stable in contact with Si. Moreover, the slower oxidation kinetics at 500 °C may allow less stable gate oxides to be used, since acceptably thin interfacial layers (several formula units in thickness) may be achievable in practice. The reverse gate-stack process itself requires additional research and optimization; however, considering the fundamental nature of gate-oxide replacement, such an undertaking is not completely unrealistic, and consideration of these possibilities is within the scope of this selection process. With these concepts in mind, two lists of attractive candidates will be developed: those most appropriate for incorporation into the standard transistor fabrication process, and those most appropriate for transistors fabricated by the replacement gate-stack flow. In general, candidates for the standard process method will ideally (but not in most cases) be deposited as lattice-matched epitaxial crystals, while those for the replacement process flow will be thin amorphous layers. Several of the amorphous layers will be considered for both approaches.

Dielectric Properties

The ideal gate oxide will have a permittivity large enough that its capacitance density is equal to, or larger than, that achieved with 10 Å of SiO2 (∼32 fF/µm2). Ideally, the larger dielectric constant will allow the thickness to be increased such that a tolerable level of tunneling leakage can be maintained. (As a first approximation, this assumes that the thickness dependence of the leakage current of alternative oxides is similar to, or at least as good as, that of SiO2.) Though replacing SiO2 with any dielectric having K > 4 would be theoretically satisfactory, the possibility of higher leakage must be considered, especially since very few materials possess similarly large band gaps and similarly favorable band line-up. Knowing this, physically realistic permittivity requirements were established.
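The capacitance-density target described above follows from the parallel-plate relation C/A = ε0·K/t. The sketch below evaluates it for 10 Å of SiO2 and then finds the physical thickness at which a higher-K film gives the same capacitance density. The value K = 3.9 for SiO2 and the helper names are assumptions for illustration; with that K the result is about 34.5 fF/µm², of the same order as the ∼32 figure quoted in the text.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_SIO2 = 3.9             # assumed relative permittivity of SiO2


def capacitance_density_ff_per_um2(k, thickness_angstrom):
    """Parallel-plate capacitance per unit area, returned in fF/um^2."""
    thickness_m = thickness_angstrom * 1e-10
    c_per_m2 = EPS0 * k / thickness_m  # F/m^2
    return c_per_m2 * 1e3              # 1 F/m^2 equals 1e3 fF/um^2


def thickness_for_equal_capacitance(k, sio2_thickness_angstrom=10.0):
    """Physical thickness (Å) of a film with permittivity k matching the SiO2 capacitance."""
    return sio2_thickness_angstrom * k / K_SIO2


target = capacitance_density_ff_per_um2(K_SIO2, 10.0)
print(f"10 Å SiO2: {target:.1f} fF/um^2")
for k in (10.0, 15.0, 30.0):
    t = thickness_for_equal_capacitance(k)
    print(f"K = {k:>4.0f}: {t:.1f} Å gives {capacitance_density_ff_per_um2(k, t):.1f} fF/um^2")
```

Because the required thickness scales linearly with K, a film with K = 10 can be roughly 2.5 times thicker than the SiO2 it replaces while storing the same charge per unit area.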
When considering the exact geometry of the proposed gate oxides, two distinct microstructures are envisioned. The first microstructure is an epitaxial film in which the composition and atomic registry are chosen to provide a sub-monolayer interface that corresponds to Si-oxide bonding governed by lattice registry. No amorphous oxide layer would be present, but some quantity of Si–O or Si-cation bonds will occur. In this case, a reasonable minimum dielectric constant would be 10. This value should provide a 10 Å equivalent oxide thickness (EOT) at a physical thickness of 25 Å. At this thickness, tolerable tunneling currents are expected (see footnote 2). Figure 8.2 illustrates the capacitance densities achievable in a high-K oxide deposited on Si as a function of thickness.

The second microstructure is a smooth and homogeneous amorphous gate dielectric with a thickness-variable SiO2 layer. In the absence of a heteroepitaxial interface, avoiding any a-SiO2 may not be possible; at least ∼2.5 Å of SiO2 (that dimension corresponding to the approximate bond length of an Si–O bond) should form at the surface. This thin layer will dilute the capacitance of the structure, thus materials with a larger permittivity may be required. As such, only materials with dielectric constants ≥ 15 will be considered. In this case the 10 Å equivalent oxide thickness can be achieved in a ∼35 Å physical thickness containing 5 Å of SiO2 and 30 Å of gate oxide. Again, this represents a thickness where tunneling-induced leakage should be tolerable. Figures 8.3, 8.4, and 8.5 show simulations of the capacitance density for materials of different permittivities as a function of SiO2 interface layer thickness. These figures effectively demonstrate the importance of minimizing the interfacial oxide thickness. In addition, the practical difficulty of achieving a 10 Å tox in a dielectric with a permittivity < 15 is illustrated. Though not covered in these simulations, the incorporation of a non-SiO2 interface remains a strong possibility. Using a nitrided or partially nitrided interface, for example, may relax the physical thickness or permittivity requirements, but the device characteristics require assessment to ensure compatibility.

[Figure 8.2: capacitance density (fF/µm²) versus dielectric thickness (Å) for K = 10, 15, and 30, with no SiO2 interface]

Fig. 8.2. Capacitance density plotted versus dielectric thickness for hypothetical dielectrics with permittivities of 10, 15, and 30. For this calculation, it is assumed that the dielectrics are deposited epitaxially on Si in the absence of any low-permittivity SiO2 interface. The horizontal line at ∼32 fF/µm² indicates the capacitance density achieved in 10 Å of SiO2

Footnote 2. It can be argued that at a thickness of 25 Å tolerable tunneling leakage currents would be expected in many well-prepared alternative gate oxides. Predicting tunneling-induced leakage currents requires, in general, knowledge of a material's band gap, thickness, permittivity (in order to arrive at a similar gate capacitance), and effective electron mass. If, in the absence of reliable reference data, the effective mass of electrons is held constant between materials, the tunneling probability can be estimated as being proportional to $\exp[-(m\,\phi_b^{1/2}\,\varepsilon\,d)]$. Consequently (with respect to SiO2), as long as the alternative gate-oxide permittivity increases faster than $\phi_b^{1/2}$ decreases, one can expect that tunneling probabilities will not increase. Certainly, sufficiently insulating SiO2 films can be made at thicknesses less than 25 Å; this thickness was chosen to reflect that alternative dielectrics may not exhibit the same “quality” and may incorporate greater numbers of trap sites which reduce the effective tunneling barrier height.
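The dilution caused by an interfacial SiO2 layer can be sketched with an ideal series-capacitor ("slab") model, the same idealization named in the captions of Figs. 8.3 to 8.5. The SiO2 permittivity of 3.9 and all function names below are assumptions made for illustration.

```python
EPS0_F_PER_M = 8.854e-12  # vacuum permittivity
K_SIO2 = 3.9              # assumed relative permittivity of the SiO2 interface layer


def stack_eot(k_high, t_high, t_sio2):
    """Equivalent oxide thickness (angstrom) of a high-K layer in series with SiO2."""
    return t_sio2 + t_high * K_SIO2 / k_high


def stack_capacitance_density(k_high, t_high, t_sio2):
    """Capacitance density (fF/um^2) of the two-layer stack."""
    eot_m = stack_eot(k_high, t_high, t_sio2) * 1e-10
    return EPS0_F_PER_M * K_SIO2 / eot_m * 1e3


def required_permittivity(t_high, t_sio2, target_eot):
    """Permittivity the high-K layer needs so that the stack reaches target_eot."""
    if target_eot <= t_sio2:
        raise ValueError("interface layer alone uses up the target EOT")
    return K_SIO2 * t_high / (target_eot - t_sio2)


if __name__ == "__main__":
    print(round(stack_eot(15.0, 30.0, 5.0), 1))              # 30 A of K = 15 on 5 A SiO2
    print(round(required_permittivity(30.0, 5.0, 10.0), 1))  # K needed for 10 A EOT
```

Under these idealized assumptions, a 30 Å layer with K = 15 on 5 Å of SiO2 gives an EOT near 12.8 Å, and the same geometry reaches 10 Å only for a permittivity near 23, which illustrates how strongly even a thin interfacial layer dilutes the stack capacitance.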

[Figure 8.3: capacitance density (fF/µm²) versus total film thickness (Å) for a permittivity of 10, with curves for 2 Å, 5 Å, and 10 Å SiO2 interfaces]

Fig. 8.3. Capacitance density plotted as a function of total dielectric thickness for a dielectric film (permittivity of 10) with a 2 Å, 5 Å, and 10 Å SiO2 interface. For these calculations, composite permittivities were calculated assuming an ideal “slab” series arrangement of SiO2 and high-K layers. The horizontal line at ∼32 fF/µm² indicates the capacitance density achieved in 10 Å of SiO2

[Figure 8.4: capacitance density (fF/µm²) versus total film thickness (Å) for a permittivity of 15, with curves for 2 Å, 5 Å, and 10 Å SiO2 interfaces]

Fig. 8.4. Capacitance density plotted as a function of total dielectric thickness for a dielectric film (permittivity of 15) with a 2 Å, 5 Å, and 10 Å SiO2 interface. For these calculations, composite permittivities were calculated assuming an ideal “slab” series arrangement of SiO2 and high-K layers. The horizontal line at ∼32 fF/µm² indicates the capacitance density achieved in 10 Å of SiO2

Ideally, gate-dielectric selection would merely involve taking the list of acceptable elements, reviewing all available compounds made from them, and choosing the material with the largest permittivity. However, the available reference dielectric constants are most often reported for large-grained ceramics or single crystals, and in many cases these values may not be achievable in thin-film geometry. The reasons for this may include composition control, defect density, interface effects, and residual stress. Moreover, the necessity of having an asperity-free and defect-free microstructure makes many amorphous solids more attractive than their crystalline forms. Unfortunately, reliable reports of non-crystalline dielectric constants are less common.

To fill gaps in the available reference data, the method developed by Shannon [8.12] for permittivity calculation was applied to the systems for which crystallographic and ionic polarizability data can be found (see footnote 3). Shannon's method considers a summation of a solid's ionic polarizability normalized with respect to molar volume. This treatment is most appropriate for crystals but provides a reasonable reference point for amorphous solids, as it does not consider the bonding configuration or longer-range order. (Provided the theoretical densities of amorphous solids and crystals of the same composition are not appreciably different, and the crystal does not contain large dipolar contributions to the polarizability, the calculated numbers offer a reasonable baseline for comparison.) Crystal structure information is readily obtained from the powder diffraction file [8.13], while ionic polarizability values (for metal cations coordinated to O) have been calculated by Shannon [8.12]. Following these arguments, the ideal amorphous materials would be closely packed solids containing high-polarizability cations. In Table 8.1, the list of stable cations is presented along with the ionic polarizability values determined by Shannon and the calculated permittivities.

[Figure 8.5: capacitance density (fF/µm²) versus total film thickness (Å) for a permittivity of 30, with curves for 2 Å, 5 Å, and 10 Å SiO2 interfaces]

Fig. 8.5. Capacitance density plotted as a function of total dielectric thickness for a dielectric film (permittivity of 30) with a 2 Å, 5 Å, and 10 Å SiO2 interface. For these calculations, composite permittivities were calculated assuming an ideal “slab” series arrangement of SiO2 and high-K layers. The horizontal line at ∼32 fF/µm² indicates the capacitance density achieved in 10 Å of SiO2

Footnote 3. Note that significant propagating mathematical errors can be associated with this method. As indicated by Billman et al., the errors (in select cases) can be quite large, especially for materials with dielectric constants in excess of 10 [8.7]. Consequently, the calculated values must be used with caution, especially in the absence of any related reference measurements.
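The polarizability-summation approach described above can be sketched with the Clausius–Mossotti relation, which is the form this kind of additivity calculation is usually given; the chapter does not print the formula, so treat it as an assumption. The MgO polarizabilities and lattice parameter below are illustrative inputs that should be checked against [8.12] and [8.13], and the function name is hypothetical.

```python
import math

# Illustrative inputs; verify against [8.12] and [8.13] before relying on them.
ALPHA_MG = 1.32   # ionic polarizability of Mg2+, cubic angstrom (assumed value)
ALPHA_O = 2.01    # ionic polarizability of O2-, cubic angstrom (assumed value)
A_MGO = 4.212     # cubic lattice parameter of MgO, angstrom (assumed value)
Z_MGO = 4         # formula units per rocksalt unit cell


def clausius_mossotti_permittivity(alpha_total, molar_volume):
    """Permittivity from summed polarizability and volume per formula unit (angstrom^3)."""
    x = 4.0 * math.pi * alpha_total / (3.0 * molar_volume)
    if x >= 1.0:
        raise ValueError("polarizability too large for this volume: relation diverges")
    return (1.0 + 2.0 * x) / (1.0 - x)


if __name__ == "__main__":
    volume = A_MGO ** 3 / Z_MGO
    print(round(clausius_mossotti_permittivity(ALPHA_MG + ALPHA_O, volume), 1))
```

Because the denominator approaches zero as the summed polarizability grows relative to the molar volume, small input errors are strongly amplified at high permittivity, in line with the caution expressed in footnote 3.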


When attempting to predict the dielectric constants of epitaxial films, the specific crystal structure must be taken into consideration. Structures containing “rattling ions”, like ferroelectrics, may appear attractive due to large dipolar contributions to the permittivity. However, as sample dimensions (i.e., film thickness) become very small, these contributions will be reduced by scaling effects [8.14–8.16]. This reduction appears to be unavoidable and thus must be considered. Reports of thin epitaxial films indicate, however, that though reduced, the dielectric constants of very thin perovskite films are still considerably larger than what can be achieved in normal dielectrics, and that ferroelectricity can be maintained in perovskite films only 4 unit cells thick [8.17, 8.18]. As such, the possibility of a ferroelectric or paraelectric gate oxide cannot be completely eliminated.

A final consideration must be made concerning dielectric constant optimization. When selecting a model composition for either proposed microstructure, it may be advantageous to incorporate more than one cation from the “stable” list. In these cases, the thermodynamic stability predicted for two separate single-component oxides in contact with Si may not exist when the three components are intimately mixed. An example of this may be SrZrO3. Both SrO and ZrO2 are included on the “stable” list. Combining the two materials to form the perovskite SrZrO3 is certainly advantageous with regard to dielectric properties; however, it is difficult to predict thermodynamic stability for this more complicated system. For this selection process, such multicomponent oxides will be assumed stable given the known stability of the composition's endmembers.
Electronic Structure

Upon consideration of the previous arguments alone, it would be tempting to select either the “stable” amorphous composition containing the highest polarizability cations, or the highest dielectric constant crystal which forms an epitaxial structure on (001) Si. Other electronic characteristics must, however, be considered prior to completion of the ideal candidate list. These include band gap, band alignment to Si, valence, and coordination.

A dielectric's band gap is an influential parameter for determining leakage properties. It is important that the band gap be appreciably large such that electronic transitions between valence and conduction states are minimized. As the dielectric thickness becomes very small, the capacitor begins to resemble a tunneling barrier whose magnitude is determined, in part, by the band gap. If the band gap of a dielectric is sufficiently small, even if the number of intrinsic electronic carriers is low, the material will not be resistive enough to provide charge storage over acceptable time periods. This criterion makes otherwise desirable cations (in reference to their large polarizability) like Ti and Nb inappropriate [8.19, 8.20]. Oxides of these materials are prone to incomplete oxidation and the existence of multiple valence states. The presence of these reduced-valence cations will produce defect states between which lower energy transitions can be promoted; this condition will effectively reduce the band gap. In the reducing conditions required for many thin film processes, multiple lower cation valence states may be difficult to avoid in practice. Furthermore, these same factors can promote dielectric breakdown at the relatively large fields encountered. For example, at 0.25 V applied, an electric field of 1.25 MV/cm will be established across a 2 nm gate oxide. Consequently, cations with more stable valences like Mg and Zr may be preferred, since cation-cation electronic transitions are much less likely to occur in these larger band-gap systems. Discussions and investigations detailing this behavior have been presented by Harrop and Nakane et al. [8.20, 8.21].

In the absence of reliable reference data (tabulated band-gap values are difficult to find in all cases), band gaps can be roughly estimated for unary oxides following the method of Vijh. In this approach, band gaps are estimated as being equivalent to twice the heat of formation [8.22]. Tabulated thermodynamic data are readily available for most single-component oxides, thus estimations can be made for all of the compounds considered. Results of these calculations are listed in Table 8.1.

A second concern closely related to band gap and leakage current is the band lineup at the silicon/gate-oxide interface. In general, a design rule for transistors is that a blocking, or Schottky, contact exist for both positive and negative electronic carriers with a barrier height to conduction of ≥ 1 eV. In the limiting case, a gate dielectric has an energy gap of 2 eV, the semiconductor Fermi level lines up in the center of the gate-oxide forbidden energy range, and sufficiently large conduction barriers are expected for both carrier polarities.
If, however, this is not the case (i.e., the semiconductor Fermi level does not line up with the insulator's mid-gap), then the contact will be prone to greater leakage via one majority carrier type. A recent publication by Robertson and Chen suggests that for many oxide materials in contact with Si this latter situation may indeed be the case [8.23]. The argument suggests that the band lineup between a gate dielectric (treated as a high energy gap semiconductor) and silicon will be given by the following equation predicting the Schottky barrier between two semiconductors:

$$\phi_n = (\chi_{\mathrm{Si}} - \phi_{\mathrm{CNL,Si}}) - (\chi_{\text{G-O}} - \phi_{\mathrm{CNL},\text{G-O}}) + S\,(\phi_{\mathrm{CNL,Si}} - \phi_{\mathrm{CNL},\text{G-O}}) \qquad (8.1)$$

where $\chi_i$ is the electron affinity and $\phi_{\mathrm{CNL},i}$ is the charge neutrality level, i.e., the energy level below vacuum where the influential interface states lie. From the Mönch [8.24] approximation:

$$S = \left[1 + 0.1\,(\varepsilon_\infty - 1)^2\right]^{-1} \qquad (8.2)$$

The important parameters in this relationship are $\phi_{\mathrm{CNL},i}$ and $S$. These quantities will effectively determine conduction barriers for each interface under consideration. In general, a large S-value will reduce the barrier height (as it indicates charge transfer between interface states), while a $\phi_{\mathrm{CNL}}$ at an energy other than mid-gap predicts an asymmetric band lineup.
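Equations (8.1) and (8.2) are simple enough to evaluate directly, and the sketch below does so. The function names are hypothetical and the numerical inputs in the usage lines are placeholders chosen only to exercise the formulas; they are not values from [8.23] or [8.24].

```python
def pinning_factor(eps_inf):
    """S from Eq. (8.2), given the high-frequency permittivity of the oxide."""
    return 1.0 / (1.0 + 0.1 * (eps_inf - 1.0) ** 2)


def barrier_height(chi_si, cnl_si, chi_go, cnl_go, s):
    """Barrier phi_n from Eq. (8.1); all energies in eV, measured below vacuum."""
    return (chi_si - cnl_si) - (chi_go - cnl_go) + s * (cnl_si - cnl_go)


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    s = pinning_factor(6.0)
    print(round(s, 3))
    print(round(barrier_height(4.0, 4.9, 3.0, 5.5, s), 2))
```

Two limits are useful sanity checks: for S = 1 the charge neutrality levels cancel and the barrier reduces to the electron affinity difference, while for S = 0 the barrier is set entirely by the position of each charge neutrality level relative to its own band edge.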


Table 8.2. Schottky barrier heights calculated for several high-K materials with Si electrodes. Calculations taken from Robertson and Chen [8.23]

                                          Schottky barrier height (eV)
electrode material   work function (eV)   SrTiO3   Pb(Zr,Ti)O3   Ta2O5   SrBi2Ta2O9
Pt                   5.3                   0.89    1.45          1.42    1.19
Si (CB)              4.0                  −0.14    0.45          0.36    0.15
Si (VB)              5.1                   2.3     1.75          2.9     2.85
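The 1 eV design rule for blocking contacts can be applied mechanically to the Si rows of Table 8.2. The sketch below is one way to do that; the dictionary layout and function names are assumptions made for illustration, while the barrier values are those tabulated above.

```python
MIN_BARRIER_EV = 1.0  # design rule: at least 1 eV for both carrier types

# Barrier heights to Si from Table 8.2, in eV.
SI_BARRIERS_EV = {
    "SrTiO3": {"Si (CB)": -0.14, "Si (VB)": 2.3},
    "Pb(Zr,Ti)O3": {"Si (CB)": 0.45, "Si (VB)": 1.75},
    "Ta2O5": {"Si (CB)": 0.36, "Si (VB)": 2.9},
    "SrBi2Ta2O9": {"Si (CB)": 0.15, "Si (VB)": 2.85},
}


def limiting_barrier(material):
    """Smaller of the conduction-band and valence-band barriers."""
    return min(SI_BARRIERS_EV[material].values())


def is_blocking(material):
    """True when both barriers satisfy the 1 eV rule."""
    return limiting_barrier(material) >= MIN_BARRIER_EV


if __name__ == "__main__":
    for name in SI_BARRIERS_EV:
        print(name, limiting_barrier(name), is_blocking(name))
```

In every case the conduction-band barrier is the limiting one, so the valence-band barriers, although large, do not rescue any of the four materials under this rule.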

Robertson's calculated CNL values for a variety of high-K materials of current interest as alternative gate dielectrics were asymmetric (i.e., $\phi_{\mathrm{CNL}} \neq E_g/2$) in all cases. Table 8.2 shows the conduction barrier heights to Si and Pt calculated by Robertson for several materials. This result suggests that none of these materials would exhibit appropriately blocking contacts for a functional device. Recent literature reports for MIS structures containing Ta2O5 or SrTiO3 dielectrics, however, indicate acceptable leakage currents, in contrast to this model [8.25]. This discrepancy can be explained by considering that in most cases a thin layer of SiO2-rich material occupies the Si interface and consequently dominates the electrical properties. A potentially serious implication is that difficulties associated with asymmetric band lineup will be exacerbated by improved sample quality: as the interfacial oxide thickness between the high-K material and Si is reduced, the properties will no longer be governed by the thin SiO2 skin. Depending upon the material, the electronic interface properties may be entirely inappropriate for gate oxide application.

Given the difficulty in CNL and S-factor determination, it is not straightforward to translate this model into a simple selection criterion. The following general comments can, however, be made. A material is most desirable if its CNL is located in the center of the band gap, or if its band gap is wide enough that even in the presence of a large CNL shift, a 1 eV conduction barrier is maintained for both charge carrier types. In a very general sense, the CNL energy will tend towards mid-gap when the cation and anion share a common valence [8.23]. For our purpose, this suggests that oxides with a 2+ cation valence and a large band gap are most attractive. Since this is a surface/interface phenomenon, it is possible that a sparingly thin skin of material can overcome the potential difficulties.
It is unclear, however, at this point, exactly how much interfacial material is required to dominate the interface properties, and how much material can be tolerated while still maintaining the appropriate capacitance density.

In order to obtain the optimal transistor properties, the number of point defects at the gate-oxide interface must be minimized, since their presence may result in reduced carrier populations and effective mobilities. These point defects are most often associated with the termination of the Si crystal and the dangling bonds which result [8.26]. At the interface between Si (001) and a-SiO2 in a conventional transistor, constraints posed by ion size and coordination do not allow accommodation of all bonds. If higher-K dielectrics are substituted, a similar condition will be encountered. To minimize the resulting number of dangling bonds, two general interfacial criteria need to be satisfied, and can be envisioned in the context of joining two dissimilar structures. (i) Each dangling bond of the substrate interface needs to become neutral by losing/sharing the same number of electrons as each dangling oxygen bond of the oxide needs to gain. For example, in Si–SiO2, each Si on the (001) surface needs to give up/share one electron, while on average, each O dangling bond in amorphous SiO2 requires one electron. (ii) The cation size and its oxygen coordination should provide an interface bond density similar to that of the Si (001) surface, thus minimizing the number of dangling bonds. If these conditions cannot be satisfied, the result will be either residual unsatisfied bonds, or bonding between cations and Si, i.e., thin regions of silicide.

When we substitute amorphous higher-K dielectrics for SiO2, inevitably the cations become larger and, as predicted by Pauling's rules, coordinated to a greater number of oxygen atoms [8.19]. As such, it is expected that even greater numbers of dangling bonds will accompany these interfaces. Unfortunately, no cations exist which have sufficiently large polarizabilities to provide dielectric constants appreciably greater than silica while having sufficiently small size such that oxygen atoms are two-coordinated. This suggests that Si/high-K oxide interfaces will inescapably have greater numbers of defects and charge traps. One possibility to ameliorate this problem is to incorporate a thin layer of SiO2. (It should be noted that incorporating a thin SiO2 layer may be useful in reducing interface trap state densities at the substrate. However, the resulting SiO2/high-K interface may exhibit high levels of traps within tunneling distance of the gate.)
Due to the complicated and random nature of amorphous materials, these effects are difficult to predict, thus experimental verification appears necessary. Moreover, the resulting SiO2 /gate-dielectric interface may be heavily defective as well. For the special case of epitaxial deposition, if the Si/high K interface is commensurate, all Si atoms can be geometrically accommodated by an oxygen sublattice site. The charge exchange for a rocksalt cation (assuming ionic bonding) is +1/3 e− (i.e., 6-coordinated 2+ cation) while the charge exchange/sharing for a Si–O bond is 1 e− . As a result, the system cannot be charge balanced on a formula unit scale if the expected coordinations for each structure are maintained. Consequently, a large number of trap sites may be encountered at the interface to maintain charge neutrality. Furthermore, the possibility exists that the cation sublattice, rather than the oxygen sublattice will epitaxially register to the silicon. In this case, a very thin layer of silicide will result – about 1/4 monolayer. It is unclear if this type of interface will provide for acceptable electrical behavior. Detailed results have been presented for this system by McKee et al. and Schlom et al., as well as previous reports by others; however, at this point, there remains some debate regarding the exact interfacial atomic arrangements. Additional efforts are clearly warranted in these interesting systems [8.18, 8.27–8.30].


In a state-of-the-art interface, approximately 10¹⁰ trap states/cm² are present. As the number of interface states increases, the mobility of transistor carriers and the effectiveness of a higher-K oxide are reduced. Such low defect density interfaces have only been achieved by careful hydrogen anneals or other more sophisticated procedures after the gate dielectric preparation. These anneals effectively “tie up” all Si dangling bonds, preventing them from behaving as electronic trap states. It is likely that these types of anneals will be necessary for alternative gate dielectrics. As such, incorporating a material with resistance to hydrogen reduction may be an important factor and may constitute an additional selection criterion.

Defect Chemistry

A final materials consideration involves the point defect chemistry of various metal oxides. It is well known that at temperatures above absolute zero, a formation free-energy balance predicts an equilibrium population of point defects. The specific defect type, as well as the population, will depend strongly on constituent cation characteristics as well as the particular crystal structure. It is possible for these defects to act as trap sites, which could influence the dielectric leakage currents and the transistor operation. The latter may be especially true when the oxides are very thin, since trap sites spread throughout the volume may then influence the interface. This information can be determined best by conductivity measurements on slightly doped bulk ceramics or single crystals at elevated temperatures and in highly reducing or strongly oxidizing atmospheres. These conductivity trends can be used to determine equilibrium defect reaction constants, which can, in turn, be used to estimate cation or anion vacancy concentrations in a solid.
The ambient and oxygen pressure dependence of these measurements may be especially useful in estimating the behavior of thin film materials, which are often synthesized in either strongly oxidizing or reducing conditions (strongly reducing conditions predominate in most oxide depositions in the presence of Si; see footnote 4). Predicting specific defect concentrations is extremely difficult, as great sensitivity results from specific sample history, composition, and trace impurity concentrations. This approach can, however, be used to judge which materials may be more resilient to excessive defect populations. Moreover, this type of information can be a useful guide if attempts are made to reduce concentrations of point defects like oxygen vacancies (or at least immobilize them) through impurity doping.

Footnote 4. In this case, reducing atmospheres refer to oxygen pressures between 10⁻² and 10⁻⁶ Torr. This corresponds to a pressure range common to most oxide thin film PVD techniques. Though the upper end of this range corresponds to relatively large oxygen pressures, in the context of bulk ceramic processing (from which most point defect reference data was developed) these pressures are 4 decades lower than the atmospheric firing conditions commonly used.


Table 8.3. A comparison of the preferred point defects and their predicted concentrations in MgO and TiO2. Defect chemical reactions are given using the Kröger–Vink notation. The calculations were performed assuming a temperature of 1000 °C and an oxygen pressure of 10⁻⁷ Torr

material   reaction                                  majority defect   #/cm³         #/cm²
MgO        ½ O2 → V_Mg″ + O_O + 2h•                  V_Mg″             2.5 × 10⁹     250
TiO2       Ti_Ti + 2O_O → Ti_i••• + 4e⁻ + O2 (g)     Ti_i•••           1.6 × 10¹⁸    1.6 × 10¹¹

To approximate the number of defects per square cm, the number of defects expected in a 10 Å thick slab was calculated assuming a uniform spatial distribution.
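The conversion from volumetric to areal defect density described in the table note is a one-line calculation, sketched below with the Table 8.3 concentrations as inputs. The function names are hypothetical.

```python
import math

SLAB_THICKNESS_CM = 10e-8  # 10 angstrom expressed in cm


def areal_density(volumetric_per_cm3, thickness_cm=SLAB_THICKNESS_CM):
    """Defects per cm^2 in a uniform slab of the given thickness."""
    return volumetric_per_cm3 * thickness_cm


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio of two concentrations."""
    return math.log10(larger / smaller)


if __name__ == "__main__":
    mgo_per_cm3 = 2.5e9    # Mg vacancies in MgO, Table 8.3
    tio2_per_cm3 = 1.6e18  # Ti interstitials in TiO2, Table 8.3
    print(areal_density(mgo_per_cm3), areal_density(tio2_per_cm3))
    print(round(orders_of_magnitude(tio2_per_cm3, mgo_per_cm3), 1))
```

The ratio of the two volumetric concentrations is about 6.4 × 10⁸, i.e., close to nine orders of magnitude.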

As an example, the behavior of MgO and TiO2 is compared. Substantial work has been performed on these systems to determine both the favored defect reactions and their equilibrium coefficients. Specifically, Sempolinski et al. and Baumard et al. studied the defect chemistry of MgO and TiO2 over a range of atmospheres and temperatures [8.31, 8.32]. Their studies allow the concentrations of majority defects to be estimated for both systems under experimental conditions similar to those used during thin film growth. Table 8.3 gives the result of the calculations, as well as the defect reaction (written using the Kröger–Vink notation; see [8.19], Chap. 4, for an explanation of the notation) and majority species for each case. As indicated in Table 8.3, Mg vacancies are the expected defect in MgO while Ti interstitials are the majority defect in TiO2. As indicated by the defect density estimations, for similar processing conditions, TiO2 is expected to have 9 orders of magnitude more defects per cubic cm than MgO. Again, the deviation from equilibrium conditions common to thin film processing cannot be reflected in these calculations, thus absolute defect concentrations cannot be predicted quantitatively. However, the comparison of concentrations between materials is valuable in the context of selecting structures most resistant to electronic defects.

Practical/Technological Considerations

The final criterion used to judge the most appropriate gate-oxide candidates involves compatibility with the current semiconductor-fabrication infrastructure. Traditionally, only a select assortment of elements, which may be referred to as the CMOS elements, has been allowed into the fabrication line. It seems apparent that the goals of the ITRS roadmap will not be met if the material selections are limited to this traditional list. For alternative gate oxides, the only high-polarizability cation in the CMOS list is Ta, but several difficulties thwart its application, most notably the rapid reactivity between Ta2O5 and Si. As such, it is apparent that one or more additional elements must be introduced into the process. Knowing this, it is appropriate to define which elements are fundamentally unacceptable, with those that remain representing plausible candidates.


Because of the proximity of the gate oxide to the transistor, elements which can diffuse rapidly from the oxide may not be used. This effectively excludes elements from the alkali family. Along similar lines, those elements which dissolve easily into silicon, forming trap states deep in the band gap, should be avoided. Many transition metals like Ni and Cu fall into this category. A series of high-polarizability cations are available which are predicted to have high dielectric constant oxides; however, these materials may possess a permanent magnetic moment and should also be avoided. (Examples include Sm, Co, and Mn.) Because of potential difficulties with controlling composition and contamination, volatile metals and metals producing volatile oxides are inappropriate. Examples include Pb, Bi, Mo, and W. Finally, those ions which are known alpha-particle emitters cannot be used. Pb, Th, as well as several actinides and lanthanides fall into this category. Locating a potential particle emitter in such close proximity to the transistor gate can result in an unacceptably large frequency of random errors and upset events. This section is not meant to define a specific list of permanently barred metal cations; rather, it is meant to point out that certain attractive elements may be excluded due to the stringent controls required by semiconductor processing methods.

8.2.2 Application of the Selection Criteria

With consideration of the criteria established, lists of optimal compositions can now be introduced and discussed. Again, two lists are presented, one appropriate for transistors processed according to the standard flow, and the other appropriate for the replacement gate-stack process. Considering compatibility with the standard process flow, the high temperatures involved will require that thermodynamic stability have the greatest importance. As such, only the materials from the “stable” list are considered.
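When applying the criteria, the practical exclusion rules listed under the technological considerations can be collected into a simple lookup. The sketch below uses only the example elements named in the text, with the alkali family expanded to its common members as an assumption; the category labels and function name are hypothetical, and the lookup is not a complete or authoritative list.

```python
# Example elements named in the text, grouped by the reason for exclusion.
EXCLUSIONS = {
    "fast diffuser (alkali family)": {"Li", "Na", "K", "Rb", "Cs"},  # family members assumed
    "deep trap states in Si": {"Ni", "Cu"},
    "permanent magnetic moment": {"Sm", "Co", "Mn"},
    "volatile metal or volatile oxide": {"Pb", "Bi", "Mo", "W"},
    "alpha-particle emitter": {"Pb", "Th"},
}


def exclusion_reasons(element):
    """Sorted list of reasons an element is flagged; empty if none apply."""
    return sorted(reason for reason, members in EXCLUSIONS.items() if element in members)


if __name__ == "__main__":
    for symbol in ("Pb", "Cu", "Zr"):
        print(symbol, exclusion_reasons(symbol))
```

An element absent from the lookup, such as Zr here, is simply not flagged by these examples; that is not the same as being cleared for use.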
Materials for a Standard Process Flow: Epitaxial Dielectrics

For epitaxial deposition, the best atomic registry to (001) silicon from the available materials is obtained in a solid solution of BaO and SrO. In this solid solution, the room temperature lattice constants of (001) Si can be duplicated identically [8.29]. We consider the optimal case to be a perfect lattice match for strain minimization at room temperature. High temperature lattice match is viewed as less critical since the proposed films are thin enough to ensure commensurate matching, thus misfit dislocations are not expected due to limited volume strain energies [8.33, 8.34]. Given a template of several rocksalt monolayers, a variety of perovskites could be deposited subsequently. Possibly, a high-K perovskite like BaZrO3 could be deposited directly on Si; however, stability of a multicomponent oxide cannot be assured by the thermodynamic calculations. Conceivably, the rocksalt BaO could be used alone; however, this material is extremely unstable in the presence of H2, which will likely be required in post-deposition anneals for interface trap density minimization.

The optimal structure may be a composite of (Ba,Sr)O and perovskite layers. (Ba,Sr)ZrO3 is chosen as the most appropriate perovskite for the top layers. The BaZrO3 endmember of this compositional series boasts the highest dielectric constant (approximately 50); however, the epitaxial strain to Si is ∼9%. The opposite endmember of the system has a lower dielectric constant of ∼35, but the lattice mismatch to Si is reduced to ∼6%. It is difficult at this point to predict how strongly epitaxial strain will affect transistor performance; thus, provided that both compositions can be suitably deposited, experimenting on a range of solid solutions is recommended. In either case, the dielectric constants should provide a 10 Å oxide equivalent thickness. It should also be noted that the possibility exists that (Ba,Sr)ZrO3 can be grown on Si (001) in the absence of a rocksalt intermediate. Unlike the (Ba,Sr)O–SrTiO3 system investigated by McKee et al., both cations in the proposed zirconate promise thermodynamic compatibility with Si [8.8, 8.18]. This may render additional simplicity to the system, a potentially important prospect from a process feasibility standpoint.

Finally, the proposed (Ba,Sr)O interface should provide a promising band lineup with Si. (Ba,Sr)O has a suggested band gap between 4.5 and 6 eV, while the divalent state of both cation and anion suggests that the CNL lies near the band gap midpoint. As such, this material is expected to exhibit relatively symmetrical leakage currents with respect to opposite majority carrier type. If reducing the epitaxial strain is found to be important, Ti can be substituted into the zirconates. This will reduce the lattice strain to a minimum value of 1.6% in pure SrTiO3. As mentioned, however, this substitution comes at the expense of increased leakage given the variability of the Ti valence.
An attractive alternative to the titanate family is LaAlO3. LaAlO3 has a pseudo-cubic lattice constant of 3.79 Å, corresponding to a lattice mismatch to Si (001) of ∼1%. Like SrZrO3, LaAlO3 comprises two cations with predicted stability against Si at 1000 K; thus it is possible that LaAlO3 may be deposited directly on Si in the absence of intermediate rocksalt layers. Table 8.4 gives a prioritized list of epitaxial gate-oxides with their expected dielectric constants and lattice mismatch to (001) Si. It should be noted that although the dielectric constants of LaAlO3 and SrZrO3 are smaller than those of titanate-based perovskites, their permittivity values are more stable with respect to temperature fluctuations [8.35]. Materials from the (Ba,Sr)TiO3 solid solution can have large temperature coefficients of capacitance, which may be unsuitable considering the temperature excursions experienced in an integrated circuit [8.16]. A substantial number of reports can be found for the deposition of epitaxial single-component oxides on silicon. Examples of these oxides include Al2O3, HfO2, ZrO2, Pr2O3, CeO2, Gd2O3, BaO, Y2O3, SrO, and MgO. In many of these cases, motivations included buffer layer templates for subsequent growth of oxides which react strongly with Si, or the development of

8 Alternative Dielectrics for Silicon-Based Transistors


Table 8.4. Prioritized list of high-permittivity alternative gate-oxides appropriate for transistors fabricated via the standard process flow (i.e., subject to a high temperature post-deposition anneal). These oxides would be deposited epitaxially on (001) Si

Rank  Material (structure)  Lattice parameter (Å)  Mismatch to (001) Si  K (measured)        K (calculated)  Comments
1     SrZrO3 (perovskite)   4.10                   6.4%                  20–70 [8.60, 8.61]  15              predicted stability at 1000 K
2     LaAlO3 (perovskite)   3.795                  1.1%                  25 [8.35]           250             predicted stability at 1000 K
3     CaZrO3 (perovskite)   4.00                   4.0%                  20–60 [8.61]        15              predicted stability at 1000 K
4     SrTiO3 (perovskite)   3.902                  1.6%                  < 100 [8.62]        –               Ti reduction
5     Y2O3 (bixbyite)       10.60                  8.0%                  12                  14              low permittivity

reliable SOI-based devices [8.9, 8.18, 8.29, 8.30, 8.36–8.44]. Even though epitaxial growth was demonstrated in these and other systems, a full understanding of the interface issues was not developed in each case. From this list, the most promising choices for gate-oxides include Y2O3 and Gd2O3. Numerous examples of Y2O3 deposition have been demonstrated in the literature, justifying pursuit of this material for alternative gate-dielectric applications. Results have not, however, been published for films in the thickness range of interest. As a result, it remains unclear whether or not the Y2O3/Si interface can exhibit the necessary stability [8.39, 8.42, 8.43]. The simplicity of this single-cation system, the predicted chemical stability, the estimated band gap of ∼7 eV, and reported permittivities in excess of 15 warrant its inclusion on the attractive candidate list.

Materials for a Standard Process Flow – Amorphous Dielectrics

Amorphous films may also be compatible with the standard process flow. Again, because of the expected high-temperature excursions, thermodynamic stability against silicon is the primary consideration. Ideally, an amorphous oxide of the highest-permittivity cation would be used; however, this choice must be weighed against the propensity for crystallization. The two highest-permittivity cations expected to be stable against Si are La and Ba; both have polarizabilities greater than 6 Å³ [8.3], and the dielectric constants of their oxides are estimated to be near 30. La2O3, however, is expected to have greater resistance to devitrification, as it prefers a more complicated crystal structure and melts at a considerably higher temperature. BaO is expected to have a large band gap and a favorable band lineup to Si; however, its relative instability in H2-containing atmospheres and its stronger tendency to crystallize make it a less desirable choice. Moreover, the known reactivity with H2O and CO2 implies that these materials will require atmospheric isolation. Other cations suitable for this approach include Gd, Hf, and Zr. These materials also have large cation polarizabilities (see Table 8.1) and resist reaction with Si. Moreover, all have high melting temperatures. It is difficult to predict the behavior of an amorphous oxide-Si (001) interface in terms of band alignment, dangling bonds, and trap densities. With this in mind, it seems appropriate that the oxides in question be alloyed with SiO2 in an attempt to simulate the Si–SiO2 contact. We expect that the interfacial properties will be improved; however, this will come at the expense of reduced permittivity. An additional benefit of the SiO2 alloy approach is a reduced driving force for devitrification, which, upon review of the limited literature reports of single-component oxide deposition, appears to be a potentially problematic issue [8.37, 8.45–8.51]. As such, the cations with the largest polarizability become more attractive. Table 8.5 lists the prioritized amorphous compositions.

Table 8.5. Prioritized list of high-permittivity alternative gate-oxides appropriate for transistors fabricated via the standard process flow. These oxides would remain amorphous throughout all processing steps

Rank  Material (structure)                      K (measured)  K (calculated)  Comments
1     a-2La2O3·3SiO2 (lanthanum orthosilicate)  –             18
2     a-(Hf,Zr)O2·SiO2 (zircon-hafnon)          12            12              ionic conduction
3     a-Gd2O3·SiO2                              4 < K < 18    –               small K

For amorphous oxide materials intended for high temperature process flows, one additional selection criterion can be established. This criterion is based upon what is known regarding the silicates of candidate dielectrics, principally the existence of a silicate and its melting behavior. By melting behavior, we refer to whether the silicate melts congruently, i.e., the first melt composition is identical to the melting solid composition, or incongruently, i.e., the first melt composition is not identical to the solid composition. The implications are predicted to be as follows. Crystallization of amorphous SiO2-high K alloys will occur in a manner analogous to the melting of compositionally similar silicates.
For instance, an amorphous alloy between ZrO2 and SiO2 is expected to respond to temperature exposure as would ZrSiO4 (the stoichiometric silicate). In this case, the stoichiometric silicate zircon melts incongruently, and a ZrO2-rich solid phase and a SiO2-rich liquid phase initially evolve during the melting process. The implication is that, upon crystallization from the amorphous state, these materials are prone to phase separation. In contrast, one may consider the combination of Y2O3 and SiO2. This compositional system contains several silicates that melt congruently; that is, upon heating, a direct phase transition between solid and liquid Y2SiO5 phases will be observed.


Fig. 8.6. The periodic classification of the silicates. In this modified periodic chart, the characteristics of silicates associated with unary metal oxides are indicated. The legend gives symbols indicating the existence, melting behavior, and thermodynamic stability of each elemental oxide


Consequently, the amorphous thin film material will not phase separate, but will crystallize to the stoichiometric silicate directly. The result is that high K–silica alloys of an incongruent system will in general have lower crystallization temperatures, because the tendency to phase separate induces crystallization. However, the kinetics of reaction with SiO2 are effectively nil under the process conditions of interest in these highly refractory systems. As such, layers of high K materials with incongruent silicates deposited on thin SiO2 tend to be very stable with temperature. Conversely, high K materials with congruent silicates will react easily with SiO2; thus such layers deposited on thin SiO2 will tend to form silicate-like interfaces. We have surveyed the available phase diagrams of metal oxide–SiO2 systems and compiled a periodic classification of silicates, as shown in Fig. 8.6. Note that all phase diagram information corresponds to thermodynamic equilibrium conditions; thus kinetic effects will not change the ultimately stable phases. This information is critical for dielectric selection, as it fundamentally influences the interface structure of oxides on silicon. In principle, epitaxial compositions may be preferred as gate-oxides processed following the standard flow. Since Si must be prevented from reacting with the gate-oxide, and the oxygen from the dielectric must be prevented from diffusing into the Si, it appears unlikely that an amorphous oxide-Si interface will survive an 800–1000 °C temperature cycle without any additional silica formation. If low permittivity layers greater than several formula units in thickness are present, it is unlikely that equivalent oxide thicknesses of 10 Å and lower will be achieved with an amorphous, and potentially SiO2-alloyed, dielectric. This perspective, however, clearly contradicts the desire to maintain a straightforward non-UHV process flow.
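The penalty imposed by a low-permittivity interfacial layer can be made concrete by treating the gate stack as capacitors in series, so that the equivalent oxide thicknesses of the individual layers simply add. The following is a minimal sketch under that assumption; the layer thicknesses and the permittivity of 18 are hypothetical values chosen for illustration, and 3.9 is taken as the dielectric constant of SiO2.

```python
KAPPA_SIO2 = 3.9  # dielectric constant of SiO2


def stack_eot(layers, kappa_ox=KAPPA_SIO2):
    """Equivalent oxide thickness of a series stack of dielectric layers.

    `layers` is a sequence of (physical_thickness, kappa) pairs. For
    capacitors in series, 1/C adds, so each layer contributes
    thickness * kappa_ox / kappa to the total. The result is in the
    same length unit as the input thicknesses.
    """
    return sum(t * kappa_ox / kappa for t, kappa in layers)


if __name__ == "__main__":
    # Hypothetical stack: 5 angstrom of interfacial SiO2 under
    # 30 angstrom of a dielectric with kappa = 18.
    with_interlayer = stack_eot([(5.0, 3.9), (30.0, 18.0)])
    without_interlayer = stack_eot([(30.0, 18.0)])
    print(f"EOT with SiO2 interlayer:    {with_interlayer:.1f} A")
    print(f"EOT without SiO2 interlayer: {without_interlayer:.1f} A")
```

In this illustration the 5 Å silica layer alone consumes half of a 10 Å budget, which is the sense in which interfacial silica formation threatens the target.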

Materials for a Replacement Process Flow

For compatibility with the replacement process flow, exclusively amorphous films appear most desirable. In this case, as previously, the ideal material will consist of densely packed, high-polarizability cations. Again, as before, the ideal interface will exhibit monolayer oxygen coverage in the absence of metal-silicon bonding. Assuming that the temperature excursions to which the gate-oxide will be exposed are limited to approximately 600 °C, a greater number of compositions should be available. In the absence of subsequent high temperature anneals, it should be possible to precisely control oxide growth at the Si interface, and to prevent gate dielectric devitrification. This may alleviate the need for the SiO2-alloy approach, which will increase the gate-oxide permittivity. The attractive cations in question again include all of the high permittivity selections (i.e., most transition metals and metals in the rare earth series with predicted thermodynamic stability). As before, the band gap and band alignment are not well known for all cases; however, if the associated issues prove problematic, the SiO2 interface


strategy can be introduced. Following this idea, each material identified on this list will also be considered when alloyed with SiO2. In an attempt to minimize uncertainties in property estimations, a literature search was performed to help determine the preferred gate-oxide choices. In most cases, published data for amorphous thin films were used. Table 8.6 lists the results of this search, including permittivity values, band gaps, and breakdown strengths where available. The best choice appears to be a-La2O3. This material should have high permittivity, good resistance to devitrification, and an appreciably large band gap. The calculated permittivity of La2O3 (hexagonal) is approximately 38, while the measured values range between 20 and 30. Reports on La2O3 indicate a band gap greater than 5.5 eV and thin film breakdown strengths as large as 8 MV/cm. As shown in Tables 8.1 and 8.6, yttrium, europium, praseodymium, dysprosium, gadolinium, and ytterbium have oxides with similar structures and dielectric properties, and thus may be of experimental interest. One difficulty with all of these single-component oxides is their affinity for hydroxide formation; for example, it is known that ceramic La2O3 is unstable in the presence of water vapor and CO2. This condition may be of considerable consequence during exposure to H2 anneals, which will likely be required for interface trap state minimization. Though all of these elements will, to some extent, be susceptible to degradation in an H2-containing atmosphere, results have shown that the resulting rare-earth or transition metal hydroxides are unstable much above 350 °C [8.47, 8.52]. At greater temperatures, conversion to oxide is observed. Consequently, problems may be avoided provided such exposures occur in the appropriate temperature envelope.
Though exhibiting lower permittivities, oxides of Y and of the periodic series from Sm to Lu are regarded as less prone to hydroxide formation, and thus may be increasingly attractive in light of the necessity for hydrogen anneals [8.47, 8.52]. Of these 'more stable' choices, Gd and Y provide the largest dielectric constant values and thus occur below La2O3 on the prioritized list. The second group on the prioritized list is HfO2 and ZrO2. Well-prepared samples should exhibit dielectric constants approaching 20, and are expected to survive H2 exposure with minimal irreversible damage. A potential difficulty exists with these compositions: in crystalline form both zirconia and hafnia are fast oxygen anion conductors (an illustration of this is given in Fig. 8.1, mechanism #1). Evidence of this behavior is well documented in reports detailing stabilized epitaxial ZrO2 for Si buffer layer applications [8.44]. It is unknown how this behavior will manifest in a thin amorphous layer; however, difficulties with oxygen transport through the dielectric into the Si gate are not unexpected. The prioritization of these dielectrics may change as experimental results indicate which criteria are most crucial to satisfy. As mentioned before, insufficient data exist to determine the absolute best material; thus the prioritized list is determined in order to address as many of these issues as possible using the minimum number of experiments. Alloying high K oxides with SiO2 may


Table 8.6. Summary of several literature reports detailing investigations of transition metal and rare-earth single-component oxide thin films material

structure

permittivity

loss tangent

Al2 O3 CeO2

amorphous MIM amorphous MIM epitaxial MIS amorphous MIM amorphous MIM amorphous MIM µ-xtalline MIM

10,10 15 26 7 12 14 24

0.055,– 0.2 – 0.016 – 0.014 .015

breakdown field (MV/cm) 3, – 0.8 2.5 – – .006 1

12 8–20 25 22 42 8 13 12 20 30 – 19 10 41 38 10 – 17 µb = 20 m2 Vs−1 13 11 20 25 26 42 12 16 14 12 12 13 humidity-stable 13 25 30–40 16 22 20 32 14 30

.01 humidity sensitive 0.006 – – 0.01 .007 – 0.016 .005 – – – – 0.20 0.15 – – Nt ∼ 3e15 /m2

3 5 2 – – 4 3 – 2 2 8 – – – – 2 6 – –

0.01–0.05 0.05 – 0.01 0.01 – 0.003 – 0.0025 0.004 .006 – TFT w/CdSe – 0.01 0.01 0.02 0.005 0.01 – – 0.01

– 3 – 1.5

Dy2 O3 Er2 O3 Eu2 O3 78% Eu3+ Gd2 O3

amorphous MIM amorphous MOS amorphous MIM HfO2 amorphous ? anodic oxide amorphous MIM Ho2 O3 amorphous MIM amorphous MIM La2 O3 amorphous MIM amorphous? amorphous MIM amorphous MIM MgO amorphous MIM Nb2 O5 anodic oxide amorphous amorphous MIM Nd2 O3 TFT w/CdSe amorphous MIM amorphous MIM µ-xtalline SIS Pr2 O3

Sb2 O3 Ta2 O5 WO3 Y2 O3

Yb2 O3 ZrO2

ZrTiO4

amorphous MIM amorphous MIM anodic oxide amorphous MIM amorphous MIM anodic oxide amorphous MIM epitaxial MIS amorphous MIM amorphous MIM amorphous MIM amorphous MIM µ−-xtalline amorphous MIM monoclinic (bulk) tetragonal (bulk) µ−-xtalline MIM amorphous MIM µ−-xtalline MIM x-talline MIM µ−-xtalline MIS amorphous MIM

4 – 3 4 4.5 – – – – – – 1.5 – 6 7 *

Eg (eV) reference measured 10, 7 5.1 4.5 5.2 4.9 5.4 –

4.7

5.5, 5.6

8.5 6 4.0

– 4 4.5 – –

5.3 8



[8.63] [8.64] [8.41] [8.65,8.66] [8.67,8.68] [8.69] [8.21] [8.64] [8.70] [8.71] [8.20] [8.71] [8.64] [8.69] [8.67,8.68] [8.72] [8.71] [8.67,8.68] [8.20] [8.71] [8.71] [8.64] [8.67,8.68] [8.67,8.68] [8.73] [8.74] [8.64] [8.71] [8.63] [8.71] [8.63] [8.75] [8.76,8.77] [8.71] [8.69] [8.67,8.68] [8.73] [8.67,8.68] [8.78] [8.78] [8.37] [8.20] [8.45] [8.50] [8.50] [8.79]


Table 8.7. Prioritized list of high-permittivity alternative gate-oxides appropriate for transistors fabricated via the replacement gate-stack process flow (i.e., subject to post-deposition heat treatments at temperatures below ∼ 600◦ C). These oxides would remain amorphous throughout all processing steps Rank 1 2

Material (structure) a-La2 O3

3

a-HfO2 a-ZrO2 a-Gd2 O3

4

a-Y2 O3

K (measured) K (calculated) 20–30 35 15–40 11–25 8–20 18 15 12

Band gap (eV) 5.5 (meas.) ∼6 (calc.&meas.) ∼6 (calc.) ∼7 (calc.)

Comments H2 resistance? ionic conduction H2 resistance? moderate K value

overcome several of the expected difficulties; however, this inevitably occurs at the expense of reduced permittivity. The efforts towards materials compatible with the replacement gate-stack process may be appropriate for a wider range of devices than is currently appreciated. In addition to the reduction of capacitor and transistor areas, the Si dopants must be concomitantly confined in ultra-shallow profiles. One method of accomplishing this is to reduce thermal budgets. For this reason, it is important to investigate the potential utility of lower-temperature gate dielectric depositions. Conceivably, a process for the high-temperature deposition of gate-oxides will be developed which may be poorly suited for compliance with previous front-end or subsequent back-end processing steps.

8.3 Conclusions

We have undertaken the problem of finding the optimal replacement gate dielectric for SiO2 to be used in applications requiring the capacitance density equivalent of a 10 Å layer. Prior to experimental investigations, a selection process was developed through which the optimal candidate dielectrics could be identified. The complicated nature of this problem and the urgency of a solution warranted this approach. In general, this exercise is intended to avoid serial investigations of multiple compositions. Rather, a select set of materials would be constructed, their relative strengths and weaknesses identified, and parallel experimental investigations would be used to make the final decision. To begin this process, a set of selection criteria was developed. The criteria were based on thermodynamic, dielectric, electronic, and technological compatibility issues. By examining each in detail, potentially appropriate and inappropriate material characteristics could be identified. The selection criteria were then applied to the available compositions, and prioritized lists of candidates for experimental screening were assembled. Two lists were constructed reflecting the two distinct microstructures of alternative gate oxides: epitaxial and amorphous. An insufficient amount of reference data is available to make a singular outright choice, thus necessitating the list approach. In general, the lists were constructed such that these unknowns could be addressed. Hopefully, these lists will be useful to guide experimental screening and verification to determine the most appropriate alternative gate dielectric. The top gate dielectric choices were determined to be epitaxial SrZrO3 and amorphous La2O3 for the standard CMOS process flow, and La2O3 for the replacement gate approach.

Acknowledgments. The authors would like to acknowledge the SRC/SEMATECH Center for Research in Front End Processes for funding this research. In addition, the authors would like to express their appreciation for technical discussions with D. Schlom, A.I. Kingon, V. Misra, J. Hauser, G. Parsons, G. Luckovsky, C. Osburn, R. Amos, D. Cann, J. Robertson, R. McKee, F. Walker, G. Bai, G. Wilk, G. Brown, and C. Tracey.

References 8.1. S.I. Association, The National Technology Roadmap for Semiconductors (Sematech, Austin, 1999) (1999) 8.2. S.I. Association, The International Technology Roadmap for Semiconductors (Sematech, Austin, 2001), (2001) 8.3. D.A. Muller, T. Sorsch, S. Moccio, F.H. Baumann, K. Evans-Lutterodt, and G. Timp, Nature 399, 758–762 (1999) 8.4. H.S. Momose, M. Ono, T. Yoshitomi, T. Oghmo, S. Nakamma, M. Saito, and H. Iwai, IEDM Tech. Dig., 593–618 (1994) 8.5. Y. Taur, D.A. Buchanan, W. Chen, D. J. Frank, K.E. Ismail, S.-H. Lo, G.A. Sai-Halong, R.C. Viswanathan, H.-J. C. Wann, S.J. Wind, and H.-S. Wong, IEDM Tech. Dig., 593–612 (1994) 8.6. J.R. Hauser and W.T. Lynch, available at Semiconductor Research Corporation website, www.src.org (1998) 8.7. D.G. Schlom, J.H. Haeni, MRS Bull. 27 (3): 198–204, Marc. 2002 8.8. K.J. Hubbard and D.G. Schlom, J. Mater. Res. 11 (11), 2757–2776 (1996) 8.9. R.J. Tarsa, K.L. McCormick, and J S. Speck, Mater. Res. Soc. Symp. Proc. 341, 73–85 (1994) 8.10. R. Amos, Sematech Assignee, Personal communication (1999) 8.11. J.M. Hergenrother, S.H. Oh, T. Nigam, D. Monroe F.P. Klemens, A. Kornblit, Solid State Electronics 46 (7), 939–950 (2002) 8.12. R.D. Shannon, J. Appl. Phys. 73 (1), 348–366 (1993) 8.13. Powder Diffraction File: Inorganic Phases (Swarthmore, PA, 1992) 8.14. C. Zhou and D.M. Newns, J. Appl. Phys. 82 (6), 3081–3088 (1997) 8.15. D. McCauley, R.E. Newnham, and C.A. Randall, J. Amer. Ceram. Soc. 81 (4), 979–987 (1998) 8.16. C. Bas¸ceri, S.K. Streiffer, A.I. Kingon, and R. Waser, J. Appl. Phys. 82 (5), 2497–2504 (1997)


8.17. T. Tybell, C.H. Ahn, and J.-M. Triscone, Appl. Phys. Lett. 72 (12), 1454– 1456 (1998) 8.18. R.A. McKee, F.J. Walker, and M.F. Chisholm, Phys. Rev. Lett. 81 (14), 3014–3017 (1998) 8.19. W.D. Kingery, H.K. Bowen, and D.R. Uhlman, Introduction to Ceramics, 2nd edn (John Wiley & Sons, New York, NY, 1997) 8.20. P.J. Harrop and D.S. Campbell, Thin Solid Films 2, 273–292 (1968) 8.21. H. Nakane, A. Noya, S. Kuriki, and G. Matsumoto, Thin Solid Films 59, 291–293 (1979) 8.22. A.K. Vijh, J. Mater. Sci. 5, 379–382 (1970) 8.23. J. Robertson and C.W. Chen, Appl. Phys. Lett. 74 (9), 1168–1170 (1999). J. Robertson, MRS Bull. 27 (3), 217–221 (2002) 8.24. W. Monch, J. Appl. Phys. 80, 5076–5081 (1996) 8.25. A. Chatterjee, R.A. Chapman, K. Joyner, M. Otobe, S. Hattangady, M. Bevan, G.A. Broen, H. Yang, Q. He, D. Rogers, S.J. Flang, R. Kraft, A.L. P. Rotondaro, M. Terry, K. Brennan, S.-W. Aur, J.C. Hu, H.-L. Tsai, P. Jones, G. Wilk, M. Aoki, M. Rodder, and I.-C. Chen, IEDM-98, 777–780 (1998) 8.26. S. M. Sze, Physics of Semiconductor Devices, 2nd edn, (Wiley, New York, 1981 8.27. R. Droopad, Z.Y Yu, J. Ramdan, L. Hilt, J. Curles, C. Overgaard, J.L. Edwards, J. Finder, K. Eisenbeiser , J. Wang, V. Kaushik, B.Y. Ngyuen, B.J. Ooms, J. Crys. Growth 227, 936–943 (2001) 8.28. Y. Kado and Y. Arita, Extended abstract of 21st Int Conf on Solid State Devices and Mater., 45–48 (1989) 8.29. J. Lettieri , J.H. Haeni, and D.G. Schlom, J. Vac. Sci. Tech. A 20 (4), 1332– 1340 (2002) 8.30. H. Ishiwara, H. Mori, K. Jyokyu, and S. Ueno, Mater. Res. Soc. Symp. Proc. 220, 595–601 (1991) 8.31. J.F. Baumard and E. Tani, J. Chem. Phys. 67 (3), 857–860 (1977) 8.32. D.R. Sempolinski, W.D. Kingery, and H.L. Tuller, J. Amer. Ceram. Soc. 63 (11–12), 669–675 (1980) 8.33. W.D. Nix, Mett. Trans. A 20A, 1989–2217 (1988) 8.34. D.L. Smith, Thin Film Deposition (McGraw-Hill, Inc., N.Y., 1995) 8.35. G.A. Samara, J. Appl. Phys. 68 (8), 4214–4219 (1990) 8.36. E.J. Tarsa and J.S. Speck, Appl. Phys. Lett. 
63 (4), 539–541 (1993) 8.37. Y. Miyahara, J. Appl. Phys. 5 (1), 2309–2314 (1992) 8.38. S. Miura, T. Yoshitake, S. Matsubara, Y. Miyasaka, N. Shohata, and T. Sato, Appl. Phys. Lett. 53 (20), 1967–1969 (1988) 8.39. T. Matth´ee, J. Wecker, H. Behner, G. Freidl, O. Eible, and K. Samwer, Appl. Phys. Lett. 61 (10), 1240–1242 (1992) 8.40. M. Ishida, K. Sawada, S. Yamaguchi, and T. Nakamura, Appl. Phys. Lett. 55 (6), 556–558 (1989) 8.41. T. Inoue, Y. Yamamoto, M. Satoh, and T. Ohsuna, Mater. Res. Soc. Symp. Proc. 341, 101–106 (1994) 8.42. R.L. Goettler, J.-P. Maria, and D.G. Schlom, Mater. Res. Soc. Symp. Proc. 474, 333–337 (1997) 8.43. H. Fukumoto, T. Imura, and Y. Osaka, Appl. Phys. Lett. 55 (4), 360–362 (1989) 8.44. A. Bardal, M. Zwerger, O. Eible, J. Wecker, and T. Matth´ee, Appl. Phys. Lett. 61 (10), 1243–1246 (1992)


8.45. F. Tcheliebou, M. Boulouz, and A. Boyer, Mater. Sci. and Eng. B 38, 90–95 (1996) 8.46. G.V. Samsonov, I.Y. Gil’man, and A.F. Andreeva, Inorganic Materials 10 (9), 1417–1419 (1974) 8.47. M. Gasgnier, phys. stat. sol. (a) 114 (11), 11–71 (1989) 8.48. Y.-M. Gao, P. Wu, K. Dwight, and A. Wold, J. Sol. State Chem. 90, 228–233 (1991) 8.49. R.B. van Dover, R.M. Fleming, L.F. Schneemeyer, G.B. Alers, and J.W. D, Proc. IEEE Int. Electron Dev. Meet. Technical Digest, 823–826 (1998) 8.50. E.S. Ramakrishnan, K.D. Cornett, G.H. Shapiro, and W.-Y. Howng, J. Electrochem. Soc. 145 (1), 358–362 (1998) 8.51. W. Heitmann, Appl. Opt. 12 (2), 394–397 (1973) 8.52. M. Gasgnier, phys. stat. solidi (a) 57 (11), 11–55 (1980) 8.53. D.K. Fork, F.A. Ponce, J.C. Tramonata, and T.H. Geballe, Appl. Phys. Lett. 58 (20), 2294–2296 (1991) 8.54. P. Tiwari, S. Sharan, and J. Narayan, J. Appl. Phys. 69 (12), 8358–8362 (1991) 8.55. K. Sawada, M. Ishida, and T. Nakamura, Appl. Phys. Lett. 52 (20), 1672– 1674 (1998) 8.56. Y. Kado and Y. Arita, J. Appl. Phys. 61 (6), 2398–2400 (1987) 8.57. R. de Reus, F.W. Saris, G.J. van der Kolk, C. Witmer, B. Dam, D.H.A. Blank, D.J. Adelerhof, and J. Flokstra, Mater. Sci. Eng. B 7 (1–2), 135–147 (1990) 8.58. D.K. Fork, D.B. Fenner, R.W. Barton, and T.H. Geballe, J. Appl. Phys. 68 (8), 4316–4318 (1990) 8.59. R.D. Shannon, Acta Cryst. A32, 751–767 (1976) 8.60. A. Yamada, Y. Utsumi, H. Watarai, and K. Sato, Jpn. J. Appl. Phys. 31 Part 1 (9B), 3148–3151 (1992) 8.61. N.F. Federov, L.V. Kozhevnikova, and N.M. Lunina, Inorganic Materials 9 (10), 1579–1582 (1973) 8.62. H.-C. Li, W. Si, A.D. West, and X.X. Xi, Appl. Phys. Lett. 73 (2), 190–192 (1998) 8.63. V. Mikhaelashvili, Y. Betzer, I. Prudnikov, D. Ritter, and G. Eisenstein, J. Appl. Phys. 84 (12), 6747–6752 (1998) 8.64. A.T. Fromhold Jr. and W.D. Foster, Electrocomponent Sci. and Tech. 3, 51–62 (1976) 8.65. A. Goswami and R.R. Varma, Thin Solid Films 22, 52–54 (1974) 8.66. A. Goswami and R.R. 
Varma, Thin Solid Films 28, 157–165 (1975) 8.67. D.I. Chernobrovkin, V.S. Ten’Gushev, and V.V. Bakhtinov, Radio Eng. and Electronic Phys. 17 (2), 334–336 (1972) 8.68. D.I. Chernobrovkin, V.V. Bakhtiov, and Y.G. Sakharov, Inst. and Exp. Tech. 14 (3) pt. 2, 839–840 (1971) 8.69. K.A. Osipov, V.G. Krasov, I.I. Orlov, and D. Khromov, Inorganic Materials 12 (1), 108–110 (1976) 8.70. A.I. Petrov and V.A. Rozhkov, J. of Comm. Tech. and Electron. 38 (8), 57–62 (1993) 8.71. D. Gerstenberg, in Handbook of Thin Film Technology, ed. by L.I. Maissel and R. Glang (McGraw-Hill Book Company, New York, 1970), p. 19–1 to 19–36


8.72. T. Mahalingham, M. Radhakrishnan, and C. Balasubramanian, Thin Solid Films 78, 229–233 (1981) 8.73. P. Singh and B. Baishya, Thin Solid Films 147, 25–32 (1987) 8.74. A. Goswami and A.P. Goswami, Thin Solid Films 20, S3–S6 (1974) 8.75. J. Hudner, H. Ohls´en, and E. Fredricksson, Mater. Res. Soc. Symp. Proc. 341, 95–100 (1994) 8.76. C.K. Campbell, Thin Solid Films 6, 197–202 (1970) 8.77. C.K. Campbell and M. Thewalt, Thin Solid Films 13, 195–198 (1972) 8.78. D.P. Thompson and A.M. Dickins, J. Mater. Sci .27, 2267–2271 (1992) 8.79. Y. Ishizuka, D. Wickaksana, J.-P. Maria, and A.I. Kingon, Symp. Proc. Elec. Chem. Soc. (1999)

9 Materials Issues for High-k Gate Dielectric Selection and Integration R.M. Wallace and G.D. Wilk

9.1 Introduction

The key material enabling Si-based metal-oxide-semiconductor field effect transistor (MOSFET) technology is silicon dioxide. The use of amorphous, thermally grown SiO2 as a gate dielectric offers several important materials (and electrical) properties that are exploited in complementary metal-oxide-semiconductor (CMOS) technology, including a stable (thermodynamically and electrically), high-quality Si–SiO2 interface as well as superior electrical isolation properties. In modern CMOS processing, defect charge densities are on the order of 10¹⁰/cm², mid-gap interface state densities are ∼10¹⁰/cm² eV, and hard breakdown fields in excess of 10 MV/cm are routinely obtained and are therefore expected regardless of the device dimensions. These outstanding electrical properties clearly present a significant challenge for any alternative gate dielectric candidate. Many materials systems are currently under consideration as potential replacements for SiO2 as the gate dielectric material for sub-0.1 µm CMOS technology. A systematic consideration of the required properties of gate dielectrics indicates that the key guidelines for selecting an alternative gate dielectric are (a) permittivity, band gap and band alignment to silicon, (b) thermodynamic stability, (c) film morphology, (d) interface quality, (e) compatibility with the current or expected materials to be used in processing for CMOS devices, (f) process compatibility, and (g) reliability [9.109, 9.115]. Many dielectric materials appear favorable in some of these areas, but very few materials are promising with respect to all of these guidelines. This article provides a summary of the materials issues that any viable dielectric alternative to SiO2 must address. Many of these materials have been examined over the last 20 years for capacitor applications, for example.
In the last 2–3 years, however, attention has turned toward understanding the physical and electrical properties of various ultrathin (≤ 5 nm) alternate dielectric candidates. Moreover, it is also clear that much research remains, as any material that is to replace SiO2 as the gate dielectric within the next 2–5 years (depending upon the application) faces a formidable challenge [9.96]. The requirements for process integration compatibility are remarkably demanding, and any serious candidates will emerge only through continued, intensive research. As such, it is not yet possible to write the closing chapter on this


topic. Significant advances in understanding should be expected over the next decade.

9.1.1 Improved Performance Through Scaling

The improved performance associated with the reduction (scaling) of logic device dimensions can be seen by considering a simple model for the drive current associated with a field effect transistor [9.34]. Using the gradual channel approximation, the drive current is given by:

$$I_D = \frac{W}{L}\,\mu\,C_{inv}\left(V_G - V_T - \frac{V_D}{2}\right)V_D \tag{9.1}$$

where W is the width of the transistor channel, L is the channel length, µ is the channel carrier mobility (assumed constant here), C_inv is the capacitance density associated with the gate dielectric when the underlying channel is in the inverted state, V_G and V_D are the voltages applied to the transistor gate and drain, respectively, and the threshold voltage is given by V_T. In this approximation, the drain current is proportional to the average charge across the channel (with a potential V_D/2) and the average electric field (V_D/L) along the channel direction. For low applied fields, I_D increases linearly with V_D and then eventually saturates to a maximum when V_D,sat = V_G − V_T to yield:

$$I_{D,sat} = \frac{W}{L}\,\mu\,C_{inv}\,\frac{(V_G - V_T)^2}{2} \tag{9.2}$$
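The behavior described by (9.1) and (9.2) can be explored numerically. The sketch below is a minimal illustration, not a device model: it evaluates (9.1) below saturation and clamps the drain voltage at V_G − V_T above it, which reproduces (9.2). The function name and the parameter values in the usage lines are hypothetical, and subthreshold conduction is ignored as a simplifying assumption.

```python
def drain_current(w, l, mu, c_inv, v_g, v_t, v_d):
    """Drain current in the gradual channel approximation.

    w, l   : channel width and length (same length unit; only w/l matters)
    mu     : channel carrier mobility, cm^2/(V s)
    c_inv  : inversion capacitance density, F/cm^2
    v_g, v_t, v_d : gate, threshold, and drain voltages, V
    Returns the current in amperes.
    """
    v_ov = v_g - v_t  # gate overdrive, V_G - V_T
    if v_ov <= 0.0 or v_d <= 0.0:
        return 0.0  # simplification: no current below threshold
    # Beyond V_D,sat = V_G - V_T the current stays at its saturation value.
    v_eff = min(v_d, v_ov)
    return (w / l) * mu * c_inv * (v_ov - v_eff / 2.0) * v_eff


if __name__ == "__main__":
    # Hypothetical values: W/L = 10, mobility 300 cm^2/(V s),
    # C_inv = 3.45e-6 F/cm^2 (equal to 34.5 fF/um^2).
    for v_d in (0.1, 0.5, 0.9, 1.5):
        i_d = drain_current(10.0, 1.0, 300.0, 3.45e-6, 1.2, 0.3, v_d)
        print(f"V_D = {v_d:.1f} V  ->  I_D = {i_d * 1e3:.3f} mA")
```

Because the saturation current scales with C_inv and with 1/L, doubling the capacitance density has the same effect on I_D,sat in this model as halving the channel length.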

It is important to note that the term (V_G − V_T) is limited in range due to reliability and device operation at (or above) room temperature. A large gate voltage will create an undesirable, high electric field across the gate insulator, thereby reducing reliability. Moreover, the threshold voltage cannot easily be reduced below an engineering margin of about 8kT ∼ 200 mV, as devices must operate within a wide range of temperatures. Thus, to increase I_D,sat, a reduction in the channel length or an increase in the gate dielectric capacitance must be accomplished. For the gate capacitance, consider a parallel plate capacitor,

$$C = \frac{\kappa\,\varepsilon_0\,A}{t} \tag{9.3}$$

where κ is the dielectric constant (also referred to as the relative permittivity in this paper) of the material, ε0 is the permittivity of free space (= 8.85 × 10⁻³ fF/µm), A is the area of the capacitor, and t is the thickness of the dielectric. Note that the relative permittivity of a material is often given by ε or εr, such as with the expression C = εε0A/t. The relation between κ and ε varies depending on the choice of units (e.g. when ε0 = 1 as in cgs units), but since it is always the case that κ ∝ ε, we simply set κ = ε.
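As a quick unit check on (9.3), the capacitance per unit area can be evaluated directly with ε0 expressed in fF/µm. The helper below is an illustrative sketch; its name and the choice of nanometers for the thickness argument are assumptions made here for convenience.

```python
EPS0_FF_PER_UM = 8.854e-3  # permittivity of free space, fF/um


def capacitance_density(kappa, thickness_nm):
    """Parallel-plate capacitance per unit area, C/A = kappa * eps0 / t.

    kappa        : relative permittivity of the dielectric
    thickness_nm : dielectric thickness in nanometers
    Returns C/A in fF/um^2.
    """
    thickness_um = thickness_nm * 1e-3  # convert nm to um
    return kappa * EPS0_FF_PER_UM / thickness_um


if __name__ == "__main__":
    # 1 nm (10 angstrom) of SiO2, kappa = 3.9
    print(f"{capacitance_density(3.9, 1.0):.1f} fF/um^2")
```

For 10 Å of SiO2 this evaluates to roughly 34.5 fF/µm², the capacitance density that serves as the scaling target throughout the chapter.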

9 Materials Issues for High-k Gate Dielectric Selection and Integration


This expression for C can be rewritten in terms of teq (i.e. equivalent oxide thickness (EOT)) and κox (dielectric constant of SiO2) of the capacitor. The term teq represents the theoretical thickness of SiO2 that would be required to achieve the same capacitance density as the dielectric (ignoring issues such as leakage current and reliability). For example, if the capacitor dielectric is SiO2, teq = 3.9 εo (A/C), and so a capacitance density of C/A = 34.5 fF/µm² corresponds to teq = 10 Å. Thus, the physical thickness of an alternative dielectric employed to achieve the equivalent capacitance density of teq = 10 Å can be obtained from the expression:

$$t_{high\text{-}k} = \frac{\kappa_{high\text{-}k}}{\kappa_{ox}}\,t_{eq} = \frac{\kappa_{high\text{-}k}}{3.9}\,t_{eq} \qquad (9.4)$$

Obtaining teq = 10 Å with a relative permittivity of 16 therefore requires a physical thickness of ∼ 40 Å.

The actual capacitance of a CMOS gate stack for ULSI devices does not scale simply with 1/t due to parasitics, quantum mechanical confinement of carriers, and dopant depletion effects. Indeed, consideration of these important effects can result in much confusion on the definition of dielectric thickness as extracted from electrical measurements. Parasitic resistances and capacitances associated with various portions of the transistor structure can result in an overall degradation of performance as defined by a delay time figure of merit [9.34, 9.115]. There exist several materials issues associated with the control of such parasitics including source/drain dopant profile control, gate/contact sheet resistance minimization, etc., which are beyond the scope of this article. However, it is important to realize that the rapid integration of any gate dielectric in CMOS requires compatibility with the fabrication processes associated with these materials issues and must therefore be considered at an early stage.
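The arithmetic behind (9.3) and (9.4) can be checked with a short Python sketch. The helper names and the choice of ångström and fF/µm² units are illustrative assumptions; only the constants 3.9 and 8.85 × 10⁻³ fF/µm come from the text.

```python
EPS0_FF_PER_UM = 8.85e-3   # permittivity of free space in fF/um, as quoted in the text
KAPPA_SIO2 = 3.9           # dielectric constant of SiO2

def capacitance_density(kappa, t_angstrom):
    """Parallel-plate capacitance density C/A in fF/um^2, from Eq. (9.3)."""
    t_um = t_angstrom * 1e-4          # 1 Angstrom = 1e-4 um
    return kappa * EPS0_FF_PER_UM / t_um

def eot(kappa, t_angstrom):
    """Equivalent oxide thickness t_eq (Angstrom) of a single dielectric layer."""
    return t_angstrom * KAPPA_SIO2 / kappa

def physical_thickness(kappa, t_eq_angstrom):
    """Physical thickness (Angstrom) needed to reach a target t_eq, from Eq. (9.4)."""
    return t_eq_angstrom * kappa / KAPPA_SIO2
```

For instance, `capacitance_density(3.9, 10)` reproduces the roughly 34.5 fF/µm² quoted for 10 Å of SiO2, and `physical_thickness(16, 10)` gives about 41 Å, consistent with the "∼ 40 Å" estimate.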
Quantum mechanical confinement effects on carriers in the channel region occur as a result of the large electrical fields in the vicinity of the Si substrate surface. These fields quantize the available energy levels resulting in the displacement of the charge centroid from the interface into the Si and at energies above the Si conduction band edge. The extent of the penetration of the charge centroid is dependent upon the biasing conditions employed for the MIS structure and, for accurate estimates of transistor drain current performance, the inversion capacitance measurement provides the accurate determination of the equivalent oxide thickness [9.44, 9.87, 9.118]. The depletion effect in the poly-Si gate electrode associated with the gate stack is a result of the decrease in the density of electrically active dopants near the poly-Si/dielectric interface resulting in an increased depletion width [9.117]. As a result, the capacitance under strong inversion is significantly lower than that obtained under accumulation. As the dielectric thickness is reduced in the scaling process, this effect becomes a serious limitation for transistor drive current. To solve this problem, a return to metal


R.M. Wallace and G.D. Wilk

gates has been proposed and thus the interaction of any alternate dielectric with various metal gate candidates must be examined. When a capacitance-voltage curve is properly corrected for these quantum mechanical and dopant depletion effects, one can extract the equivalent oxide thickness, teq . This thickness definition is in contrast to that derived directly from the raw data in a C–V measurement, which is termed the “capacitance equivalent thickness” (CET>EOT), or the “physical thickness,” which can be determined by non-electrical characterization methods such as ellipsometry or high-resolution transmission electron microscopy.

9.1.2 Leakage Current and Power

An equally important aspect of performance in modern electronics, however, is the minimization of (both operational and stand-by) power consumption, particularly for portable electronics [9.96]. At 100 °C, current product roadmaps predict the need for a leakage current limited to 300–700 pA/µm of gate width in the 2 to 5 year time frame for "low-power" devices such as those used for microprocessors in portable notebook computer applications. For other handheld portable devices, a value of 1 pA/µm is required under standby power conditions. Both of these requirements stem from the preservation of the associated portable power supply life.¹ In contrast, similar predictions for leakage in high performance microprocessor and application specific applications place the limit at 100–1000 nA/µm under operational conditions. Unlike low power devices, the primary concern in these devices is minimal gate delay, τ, rather than leakage current [9.12]. Targets for high-performance devices call for an average increase of 17% in 1/τ per year, maintaining the historical performance trend.² However, this scaling can be maintained only if the sub-threshold source-drain leakage (which is kept greater than the required gate current leakage) is permitted to increase substantially, and therefore an increase in chip power dissipation will be required. Nevertheless, it has recently been predicted that conventional dielectrics (such as nitrided silicon-oxynitride) will continue to be employed to meet leakage requirements for high performance devices, as well as mixed low-power/high-power devices and creative circuit techniques [9.96].

To achieve small gate leakage (tunneling) currents, quantum mechanics indicates that the gate dielectric must be sufficiently thick and have a sufficient energy barrier (band offset).
This can be seen by considering the expression for the direct tunneling current, where electron transport occurs through a trapezoidal energy barrier [9.34]:

$$J_{DT} = \frac{A}{t_{diel}^2}\left\{\left(\Phi_B - \frac{V_{diel}}{2}\right)\exp\left[-B\,t_{diel}\sqrt{\Phi_B - \frac{V_{diel}}{2}}\right] - \left(\Phi_B + \frac{V_{diel}}{2}\right)\exp\left[-B\,t_{diel}\sqrt{\Phi_B + \frac{V_{diel}}{2}}\right]\right\} \qquad (9.5)$$

¹ The most stringent projected conditions for leakage current are, of course, for memory devices (1.4 fA/µm) where charge storage concerns are paramount.
² In contrast, low power devices are predicted to yield 14% increase in 1/τ per year.

Here A is a constant, B = (4π/h)(2m*q)^{1/2}, ΦB is the barrier height, tdiel is the physical thickness of the dielectric, Vdiel is the voltage drop across the dielectric, and m* is the electron effective mass in the dielectric. From (9.5), one observes that the tunneling (leakage) current increases exponentially with decreasing barrier height and thickness for electron direct tunneling transport. The resultant implications for materials will be discussed further below. With the leakage current as a key driving force, alternate gate dielectric materials appear to be required for low power CMOS device technologies in the near future. Moreover, unlike previous generations, the 2001 industry roadmap suggests a bifurcation in the gate dielectric materials requirements. In the past, the scaling technology transfer process between high performance and low power applications was accomplished conservatively without drastically changing core transistor materials, such as the gate dielectric. Apparently, the cost associated with adopting the (very likely) different fabrication processes is currently considered within the realm of possibilities.
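The exponential sensitivity described here can be explored with a small Python sketch. It assumes the standard trapezoidal-barrier form of the direct tunneling expression, with the barrier height and voltage drop in volts and the thickness in metres. The prefactor `a` is left at 1 because the text gives no value for A, and the effective mass ratio of 0.5 is a hypothetical placeholder, so only relative trends are meaningful.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
Q = 1.602176634e-19       # elementary charge, C
M0 = 9.1093837015e-31     # electron rest mass, kg

def direct_tunneling_j(t_diel_m, phi_b, v_diel, m_eff_ratio=0.5, a=1.0):
    """Relative direct-tunneling current density in the form of Eq. (9.5).

    phi_b and v_diel are in volts, t_diel_m in metres. m_eff_ratio and a
    are illustrative assumptions, not values taken from the text.
    """
    b = (4 * math.pi / H) * math.sqrt(2 * m_eff_ratio * M0 * Q)
    low = phi_b - v_diel / 2
    high = phi_b + v_diel / 2
    if t_diel_m <= 0 or low <= 0:
        raise ValueError("requires t_diel > 0 and v_diel < 2 * phi_b")

    def term(phi):
        # One bracketed term of Eq. (9.5): phi * exp(-B t sqrt(phi))
        return phi * math.exp(-b * t_diel_m * math.sqrt(phi))

    return (a / t_diel_m ** 2) * (term(low) - term(high))
```

Comparing `direct_tunneling_j(1e-9, 3.0, 1.0)` with `direct_tunneling_j(2e-9, 3.0, 1.0)`, or a 3.0 V barrier with a 1.5 V barrier at fixed thickness, shows the orders-of-magnitude changes that motivate physically thicker, higher-κ films with adequate band offsets.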

9.2 MIS (Metal-Insulator-Semiconductor) Structures

9.2.1 Issues for Interface Engineering

The interfacial regions between the gate electrode, dielectric and channel (in totality termed the "gate stack") require careful attention, as they are particularly important in regard to device performance. These regions, less than ∼ 5 Å thick, obviously serve as a transition between the atoms associated with the materials in the gate electrode, gate dielectric and Si channel, and can alter the overall capacitance of the gate stack, particularly if they have a thickness which is substantial relative to the gate dielectric. Additionally, these interfacial regions can be exploited to obtain desirable properties. The upper interface, for example, can be engineered in order to block boron out-diffusion from the poly-Si gate. The lower interface, which is in direct contact with the CMOS channel region, must be engineered to provide low interface trap densities (e.g. dangling bonds) and minimize carrier scattering (maximize channel carrier mobility) in order to obtain reliable, high performance devices. Mobility degradation, relative to that obtained using SiO2 (or SiON) gate dielectrics, associated with the incorporation of high-k dielectrics is an important issue currently under investigation and discussed further in Sect. 9.3.1.


It is clear that the industry roadmap presents a major challenge for the core transistor gate dielectric as predictions call for a much thinner effective thickness for future alternative gate dielectrics: teq ≤ 10 Å. Reactions at either of these interfaces during the device fabrication process result in the presence of a significant interfacial layer that will compromise the desired gate stack capacitance. Additionally, any suitable interfacial layer near the channel must result in defect state densities comparable to SiO2 and avoid degradation of carrier mobility in the near surface channel region. The reduced capacitance can be seen by noting that when the structure contains several dielectrics in series, the lowest capacitance layer will dominate the overall capacitance and also set a limit on the minimum achievable teq value. For example, the total capacitance of two dielectrics in series is given by 1/Ctot = 1/C1 + 1/C2, where C1 and C2 are the capacitances of the two layers, respectively. If one considers a dielectric stack structure such that the bottom layer (layer 1) of the stack is SiO2, and the top layer (layer 2) is the high-κ alternative gate dielectric, (9.4) is expanded (assuming equal areas) to:

$$t_{eq} = t_{SiO_2} + \frac{\kappa_{SiO_2}}{\kappa_{high\text{-}k}}\,t_{high\text{-}k} \qquad (9.6)$$

It is clear from (9.6) that the minimum achievable equivalent oxide thickness is limited by that of the lower-κ (in this case pure SiO2) layer. Therefore, much of the expected increase in the gate capacitance associated with the high-κ dielectric is compromised. Alternatively, the superior interface characteristics associated with SiO2 on Si naturally lead one to consider a thin SiO2 layer (∼ 5 Å thick) coupled with a high-κ dielectric. Such an extremely thin SiO2 layer is very difficult to obtain with high quality. Recent work, for example, has determined that such thin layers (< 7 Å thick) do not exhibit bulk SiO2 behavior [9.63, 9.100]. Additionally, the resulting voltage drop across the oxide could also lead to significant charge trapping in the film, especially since the interface between such stacked dielectrics may contain a large density of traps. Furthermore, a thin interface layer most likely will not prevent reaction between the substrate and any high-κ material which is not thermodynamically stable to SiO2 formation on Si, under standard thermal processing required for CMOS. To illustrate this point, an example for obtaining a dielectric stack with teq = 10 Å is considered in Fig. 9.1. One way to achieve this would be to use 5 Å of SiO2 (teq = 5 Å) as the lower (first) layer, at the Si interface, and 30 Å of a dielectric with κ = 25 (teq,high-k ∼ 5 Å) as the upper (second) layer. Even for a low applied voltage, the thin interfacial layer will have a large enough electric field to create a significant amount of charge trapping. In addition, an oxide layer this thin will allow a large amount of direct electron tunneling into the high-κ dielectric, likely causing further deleterious effects to the electrical performance of the stack.
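The stack example can be verified numerically. The sketch below generalizes (9.6) to any number of series layers, using the fact that for capacitors in series the equivalent oxide thicknesses simply add; the function name and the list-of-tuples layout are illustrative assumptions.

```python
KAPPA_SIO2 = 3.9   # dielectric constant of SiO2

def stack_eot(layers):
    """Equivalent oxide thickness (Angstrom) of dielectric layers in series.

    layers: iterable of (kappa, physical_thickness_angstrom) pairs.
    Equal areas are assumed, as in Eq. (9.6).
    """
    return sum(t * KAPPA_SIO2 / kappa for kappa, t in layers)

# Stack from the text: 5 Angstrom SiO2 under 30 Angstrom of a kappa = 25 dielectric.
example_stack = [(3.9, 5.0), (25.0, 30.0)]
example_eot = stack_eot(example_stack)   # 5 + 30 * 3.9 / 25 Angstrom
```

Because every layer contributes a positive term, the result can never fall below the 5 Å contributed by the SiO2 layer, however large the permittivity of the upper layer.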


Fig. 9.1. Comparison of stacked and single layer gate dielectrics in a hypothetical transistor gate stack. Either structure results in the same overall gate stack capacitance or equivalent oxide thickness, teq = 10 Å. After [9.115]

It is important to note, however, that if a single layer dielectric could be used, then teq = 10 Å can be achieved with 40 Å (physical thickness) of a material which only has a moderate permittivity of κ = 16 (see Fig. 9.1). This physical thickness is greater than the total physical thickness of the stack in the above example (40 > 35 Å), even though the permittivity of the single layer gate dielectric (κ = 16) is much lower than that of the alternate dielectric in the two-layer stack (κ = 25). In addition, any potential charge trapping at a dielectric-dielectric interface might, in theory, be avoided. However, as indicated previously, replacing a SiO2-like interfacial layer with a metal-oxide may introduce more defects at the substrate-dielectric interface, close to the CMOS channel. Of course, in practice, the detailed structure of any interface plays a dominant role in the electrical performance. These considerations for the choice of the best high-κ materials will be covered in more detail below.

The presence of fixed charge and traps in dielectrics (bulk or at dielectric stack interfaces) is another important issue currently under investigation [9.115]. The fixed charge, typically positive and sometimes negative, results in a shift of the flat-band voltage, VFB, which can be very large (∼ 1 V in some cases). This shift, in turn, results in challenges in transistor design through, for example, implantation or interface engineering (usually at a compromise to the gate stack capacitance) to compensate for the flat-band or threshold voltage shift [9.26, 9.116]. The presence of this charge can also cause Coulombic scattering of carriers in the channel, resulting in channel mobility degradation (from that achieved with SiO2) as well. Recent investigations of charge traps associated with candidate dielectrics have indicated that a dynamic threshold voltage shift is observed with injected charge (stress) [9.26, 9.46].
Moreover, sensitivity to the injection polarity is noted and attributed to the presence of interfacial layers (and their resultant potential barriers to charge transport).


9.2.2 High-k Device Modeling and Transport

The incorporation of stacked structures has been investigated by a number of researchers in recent years. As discussed previously, the layer within the stack that has the lower dielectric constant will limit the overall capacitance of the stack. However, the minimization of interface states may require suitable interfacial layers that serve as a transition region between the Si substrate and the dielectric. One can also envision a graded composition throughout the dielectric permitting control of interfacial state formation that preserves, to some extent, the high-κ properties sought for the alternative gate dielectric in the gate stack. Vogel et al. [9.107] considered such effects in a model of potential gate stack materials. As acknowledged in the work, the model does not incorporate trap-assisted tunneling mechanisms but does provide an indication of the trends associated with stacked layers and scaling. The effect of electron injection through an interfacial layer from either the gate or the substrate was examined. In addition to the expected reduction in the overall tunneling current, it was seen that the tunneling current changes substantially depending on the dielectric layer first encountered by the electron.

In an attempt to predict the effect of high-κ gate dielectrics on transistor performance, Frank et al. [9.20] modeled gate dielectrics with various permittivities in a planar, bulk CMOS structure. It was reported that the upper limit of permittivity would be limited to κ ∼ 20 due to fringing field-induced barrier lowering (FIBL) at the drain region of the device. This phenomenon is a large concern, because a significant fringing field from the edge of a high-κ dielectric could lower the barrier for transport into the drain enough to seriously degrade the on/off characteristics of the device. Cheng et al. [9.15] reported similar modeling results for high-κ gate dielectrics, but also claimed that a dielectric stack with SiO2 at the channel interface could reduce any barrier-lowering effects from the high-k fringing fields.

Perhaps even more important is the issue of field penetration into the Si channel region [9.20]. The inversion charge in the channel experiences an increasing electric field with increasing gate capacitance, regardless of the gate dielectric material. At a high enough electric field penetration through the gate dielectric, channel carriers will undergo increased scattering, ultimately leading to a decrease in mobility and drive current. Additionally, this inversion layer will have an associated capacitance in series with the gate stack and will also eventually limit the ultimate gate stack capacitance for any high-k dielectric [9.37]. This effect was first reported by Timp et al. [9.102] using pure SiO2, as 30 nm gate length transistors showed that for teq < 13 Å, the drive current actually decreased.


9.3 Materials Properties and Integration Considerations

All of the materials systems discussed above must meet a set of criteria to perform as a successful gate dielectric. We now consider a summary of the appropriate materials properties for the selection of materials for gate dielectric applications.

9.3.1 Permittivity and Barrier Height

Selecting a gate dielectric with a higher permittivity than that of SiO2 is clearly essential. For many simple oxides, permittivities have been measured on bulk samples and in some cases even on thin films, but for the more complex materials (more elemental constituents), the dielectric constants may not be as well known. Shannon used a method involving the Clausius–Mossotti equation for calculating ionic polarizabilities as a means to make predictions of permittivities for many dielectrics. Good agreement has been found for some materials, but there are also many cases of poor agreement between calculated and measured values. This discrepancy between calculation and measurement can be attributed to many factors, including film thickness, method of film deposition, and local electronic structure within the dielectrics. It is clear that much more experimental data is needed for measurements of dielectric constant for these high-κ dielectrics, particularly below the 100 Å thickness regime.

For non-metallic (insulating) solids, there are two main contributions of interest to the dielectric constant which give rise to the polarizability: electronic and ionic dipoles. Figure 9.2 illustrates the frequency ranges where each contribution dominates. In general, atoms with a large ionic radius (e.g. high atomic number) exhibit more electron dipole response to an external electric field, because there are more electrons to respond to the field (electron screening effects also play a role in this response). This electronic contribution tends to increase the permittivity for higher atomic number atoms.
The ionic contribution to the permittivity can be much larger than the electronic portion in cases such as perovskite crystals of (Ba,Sr)TiO3 and (Pb,Zr)TiO3 (which exhibit ferroelectric behavior below the Curie point). In these cases, Ti ions in unit cells throughout the crystal are uniformly displaced in response to an applied electric field (for the case of ferroelectric materials, the Ti ions reside in one of two stable, non-isosymmetric positions about the center of the Ti–O octahedra). This displacement of Ti ions causes an enormous polarization in the material, and thus can give rise to very large dielectric constants of 2000–3000. Since ions respond more slowly than electrons to an applied field, the ionic contribution begins to decrease at very high frequencies, in the infrared range of ∼ 10¹² Hz, as shown in Fig. 9.2. In contrast to the general trend of increasing permittivity with increasing atomic number for a given cation in a metal oxide, the band gap EG of the


Fig. 9.2. The frequency dependence of the real (εr′) and imaginary (εr″) parts of the dielectric permittivity. In CMOS devices, ionic and electronic contributions are present. After [9.40]

metal oxides tends to decrease with increasing atomic number, particularly within a particular group in the periodic table. An intuitive explanation for this phenomenon is that the corresponding bonding and anti-bonding orbitals of the metal-oxygen atoms form a valence band and a conduction band, respectively. For the case of SiO2 , the σ bonds formed by the sp hybrid orbitals (which arise from Si s, p and O p orbitals) have a σ bonding orbital energy level and a higher σ∗ anti-bonding orbital energy level. The energy separation between these levels defines a band gap, but this may or may not be the minimum band gap in the material. For even the simple case of SiO2 , where there are only s and p electron orbitals that are all filled during bonding, the oxygen electron lone pair energy level actually defines the valence band maximum (rather than the σ bonding energy level). The result is the often-reported band gap of SiO2 , EG ∼ 9 eV. If the σ and σ∗ bonds defined the valence band maximum and conduction band minimum, respectively, then the band gap of SiO2 would be larger. (See Chap. 11 for further discussion). Thus for the transition metal oxides, which all have five d electron orbitals and other non-bonding p orbitals, the band gaps of these oxides can be significantly decreased by the presence of partially-filled d orbitals, which have available states for electron occupancy. These orbital energy levels tend to lie within the gap defined by the σ and σ∗ orbitals. The d orbital levels which lie within the gap defined by σ and σ∗ levels are all therefore available for electron conduction at significantly lower energy levels than would be expected from σ and σ∗ alone. It is important to note that these partially filled and non-bonding levels are not the result of defects within the material, but rather are intrinsic to such atomic constituents where many orbitals are available for electron conduction. This general band gap reduction for


Table 9.1. Comparison of relevant properties for some high-κ candidates

Material   Dielectric constant (κ)   Band gap EG (eV)   ∆EC (eV) to Si   Crystal structure(s)
SiO2       3.9                       8.9                3.2              amorphous
Si3N4      7                         5.1                2                amorphous
Al2O3      9                         8.7                2.1 (a)          amorphous
Y2O3       15                        5.6                2.3 (b)          cubic
ZrO2       25                        5.8                1.2 (a)          mono., tetrag., cubic
HfO2       25                        5.7                1.5 (b)          mono., tetrag., cubic
La2O3      30                        4.3                2.3 (b)          hexagonal, cubic
Ta2O5      26                        4.5                0.5 (a)          orthorhombic
TiO2       80                        3.5                1.2              tetrag. (rutile, anatase)

(a) [9.62]
(b) calculated values [9.89, 9.90]
mono. = monoclinic; tetrag. = tetragonal

higher-κ materials is a limitation that must be realized and expected when selecting a suitable high-κ gate dielectric. In the cases of Ta2O5 and TiO2, both materials have small EG values and correspondingly small ∆EC values. These small ∆EC values directly correlate with high leakage currents for both materials, making pure Ta2O5 and TiO2 unlikely choices for gate dielectric applications. Table 9.1 also shows, however, that ZrO2, HfO2 and La2O3 offer relatively high values for both κ and EG. It is important to note that all of the high-κ metal oxides listed in Table 9.1, which includes those studied for gate dielectric applications, are crystalline at relatively low temperature (except Al2O3).

In the case of many of the alternate gate dielectric materials (mainly metal oxides) under consideration, the polarizability of the metal-oxygen ("ionic") bond is responsible for the observed low-frequency permittivity enhancement. Such highly polarizable bonds are described as "soft" relative to the less polarizable, "stiff" Si–O bonds associated with SiO2. Unlike the stiff Si–O bond, the polarization frequency dependence of the M–O bond is predicted to result in an enhanced scattering coupling strength for electrons with the associated low-energy and surface optical phonons [9.19]. This scattering mechanism can therefore degrade the electron mobility in the inversion layer associated with MOSFET devices. A thorough examination of this effect is provided by Fischetti et al. [9.19], where calculations of the magnitude of the effect indicate that pure metal-oxide systems, such as ZrO2 and HfO2, suffer the worst degradation, whereas materials which incorporate Si–O bonds, such as silicates, fare better. In that work, it is also noted that the presence of a thin SiO2 interfacial layer between the Si substrate and the high-κ dielectric can help boost the resultant mobility by screening this effect, although the maximum attainable effective

Fig. 9.3. MIS energy barrier structure for an n-type semiconductor–dielectric–metal system (diagram labels: vacuum level, qΦM, qχ, qΦB, Eg, EF, qΨB, EC, Ei, EV; regions: metal, insulator, semiconductor)

mobility is still below that for the ideal SiO2/Si system. These researchers further suggest that the effect is also minimized by the incorporation of Si–O in the dielectric, as in the case of pseudo-binary systems such as silicates. There are other possible explanations for mobility degradation that are currently under investigation. For example, it has also been suggested by Torii et al. [9.103] that fixed-charge-induced Coulomb scattering (at the poly-Si/Al2O3 interface), rather than soft-phonon coupling (with the Si substrate), is the cause for mobility degradation. It was also previously suggested that interdiffusion of dielectric constituents with the Si channel could lead to mobility degradation through ionized impurity scattering [9.114, 9.115]. Interdiffusion of Zr and Hf from silicate dielectrics has recently been examined, where Hf silicates appear to offer superior stability in this regard [9.81, 9.82, 9.85]. Recent work by Guha et al. [9.28] appears to support a correlation of mobility degradation and the interdiffusion of Al with Si from Al2O3.

For gate dielectric applications, however, the required permittivity must be balanced against the barrier height for the tunneling process, since electron tunneling directly relates to leakage current. Consider the band diagram shown in Fig. 9.3 where the electron affinity (χ) and gate electrode workfunction (ΦM) are defined. For electrons traveling from the Si substrate to the gate, this barrier is the conduction band offset, ∆EC ≅ q[χ − (ΦM − ΦB)]; for electrons traveling from the gate to the Si substrate, this barrier is ΦB. For highly defective films which have electron trap energy levels in the insulator band gap, electron transport will instead be governed by a trap-assisted mechanism such as Frenkel–Poole emission or hopping conduction, as described by (9.7) and (9.8), respectively.

$$J_{FP} = E \exp\left[-\frac{q}{kT}\left(\Phi_B - \sqrt{\frac{qE}{\pi\varepsilon_i}}\right)\right] \qquad (9.7)$$

$$J_{hop} = \frac{q^2 l^2 n^* \Gamma}{kT}\,E \qquad (9.8)$$

Here E is the electric field across the dielectric, l is the interval of separation between adjacent hopping sites, n* is the density of free electrons in the dielectric, and Γ is the mean hopping frequency. Leakage current also depends on other factors, including the morphology of the layer, discussed below.

In order to obtain low leakage currents, it is desirable to find a gate dielectric that has a large ∆EC value to Si and to other gate metals that may be required. Reported values of ∆EC for most dielectric-Si systems are scarce in the literature, but calculations indicate that some of the metal oxide and complex oxide materials, such as Ta2O5 and SrTiO3, will have ∆EC < 0.5 eV on Si [9.88]. For several high-κ dielectrics that are currently under consideration, Robertson found that ∆EC ∼ 2.8 eV for Al2O3, ∆EC ∼ 2.3 eV for La2O3 and Y2O3, and ∆EC ∼ 1.5 eV for ZrO2 and ZrSiO4 [9.89]. Recently, photoelectron spectroscopy measurements have examined the Ta2O5, Al2O3 and ZrO2 materials systems [9.62]. Good agreement with the calculations was observed for thin films of Ta2O5 (2.8 nm: 0.45 eV), Al2O3 (5.3 nm: 2.08 eV) and ZrO2 (3.2 nm: 1.23 eV), although values lower than 1.5 eV raise concern about an insufficient energy barrier height to gate leakage (tunneling). It is also noted by Miyazaki that consideration of the film thickness dependence of the associated photoelectron energy loss features is required in the course of these measurements, particularly when comparing thin film and bulk results. Other similar photoelectron measurements for the ZrO2 (2.6 nm)/Zr-silicate (0.9 nm)/Si MIS structure (produced by pulsed laser-ablation and sputter deposition methods) indicate somewhat lower barriers (∼ 0.8–1.0 eV) [9.119]. Very recent internal photoemission work indicates that the conduction band offset for HfO2 is 2.0 eV (regardless of the interfacial layer present) [9.1], in contrast with X-ray photoemission results (1.2 eV) [9.93]. It is noted in these reports, however, that the uncertainty in the bandgap of HfO2 plays an important role in the extraction of these barrier measurements.

These recent calculations and experimental measurements provide important insight into relevant barrier height (such as ∆EC) values for many candidate dielectrics. If the experimental ∆EC values for these oxides are even much less than 1.0 eV, it will likely preclude using these oxides in gate dielectric applications, since electron transport (from enhanced Schottky emission, thermal emission or tunneling) would lead to unacceptably high leakage currents. A large band gap EG generally corresponds to a large ∆EC (see Table 9.1), but the band structure for some materials has a large valence band offset ∆EV which constitutes most of the band gap of the dielectric. As noted above, it is extremely difficult to achieve the juxtaposition of these high-κ dielectrics on Si, as an SiO2-like interface usually forms. This


interface layer will of course alter the ∆EC value of the system, and must be taken into consideration when comparing measured and calculated results. Referring back to Table 9.1, a list of several metal oxide (and nitride) systems, some of which have been investigated as gate dielectrics, compares values for κ and EG, along with ∆EC values on Si where available. For these high-κ materials, Table 9.1 indicates that EG will be somewhat limited, since it can be seen that the dielectric constant and band gap of a given material generally exhibit an inverse relationship (although some materials have significant departures from this trend).
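The trend read from Table 9.1 can be examined directly by encoding the table as data. The sketch below is illustrative only: the `Dielectric` record, the `screen` helper and its thresholds are assumptions introduced here, and the 1.0 eV cut-off simply mirrors the concern raised above about conduction band offsets much below 1.0 eV.

```python
from typing import NamedTuple

class Dielectric(NamedTuple):
    name: str
    kappa: float        # dielectric constant
    e_gap_ev: float     # band gap EG, eV
    delta_ec_ev: float  # conduction band offset to Si, eV

# Values transcribed from Table 9.1.
TABLE_9_1 = [
    Dielectric("SiO2", 3.9, 8.9, 3.2),
    Dielectric("Si3N4", 7, 5.1, 2.0),
    Dielectric("Al2O3", 9, 8.7, 2.1),
    Dielectric("Y2O3", 15, 5.6, 2.3),
    Dielectric("ZrO2", 25, 5.8, 1.2),
    Dielectric("HfO2", 25, 5.7, 1.5),
    Dielectric("La2O3", 30, 4.3, 2.3),
    Dielectric("Ta2O5", 26, 4.5, 0.5),
    Dielectric("TiO2", 80, 3.5, 1.2),
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def screen(table, min_kappa=3.9, min_delta_ec_ev=1.0):
    """Names of materials with kappa above min_kappa and an adequate band offset."""
    return [d.name for d in table
            if d.kappa > min_kappa and d.delta_ec_ev >= min_delta_ec_ev]

kappa_gap_corr = pearson([d.kappa for d in TABLE_9_1],
                         [d.e_gap_ev for d in TABLE_9_1])
```

A negative `kappa_gap_corr` reflects the inverse relationship noted in the text, while `screen(TABLE_9_1)` drops Ta2O5 because of its small ∆EC. Such a screen considers only two of the selection criteria and says nothing about thermodynamic stability or interface quality.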

9.3.2 Thermodynamic Stability on Si

For all thin gate dielectrics, the interface with Si plays a key role, and in most cases is the dominant factor in determining the overall electrical properties. Most of the high-κ metal oxide systems investigated thus far have unstable interfaces with Si: that is, they react with Si under equilibrium conditions to form an undesirable interfacial layer. These materials therefore require an interfacial reaction barrier, as mentioned previously. Any ultrathin interfacial reaction barrier with teq < 20 Å will have the same quality, uniformity and reliability concerns as SiO2 does in this thickness regime. This is especially true when the interface plays a determining role in the resulting electrical properties.

It is important to understand the thermodynamics of these systems, and thereby attempt to control the interface with Si. Ternary phase diagrams provide an important approach toward predicting and understanding the relative stability of a particular three-component system for device applications [9.4, 9.36, 9.94]. An analysis of the Gibbs free energies governing the relevant chemical reactions for the Ta–O–Si and Ti–O–Si ternary systems, as shown in Fig. 9.4, indicates that Ta2O5 and TiO2 (or mixtures with Si), respectively, are not stable against SiO2 formation when placed next to Si. Rather, the tie lines in Fig. 9.4 show that Ta2O5 and TiO2 on Si will tend to phase separate into SiO2 and metal oxide (MxOy, M = metal), and possibly silicide (MxSiy) phases. This instability, which results in SiO2 formation, has been observed experimentally for both of these metal oxides, and it necessitates an additional thin interfacial barrier layer to minimize the undesirable reactions with the silicon substrate.

In contrast to the Ta and Ti systems, the tie lines in the phase diagram for the Zr–O–Si system, shown in Fig. 9.5, indicate that, under equilibrium conditions, the metal oxide ZrO2 and the compound silicate ZrSiO4 will both be stable in direct contact with Si up to 950 °C [9.110]. The gray shaded area denotes a large phase field of (ZrO2)x(SiO2)1−x compositions which are expected to be stable on Si up to high temperatures [9.115]. Other (ZrO2)x(SiO2)1−x compositions outside of the gray area are also stable on Si, but since it is desirable to prevent any Zr–Si (silicide) bonding, film compositions within the gray area may be preferable.
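The free-energy argument above can be illustrated with a minimal sketch. It balances only the simplest candidate reaction, Si + a MxOy → b M + SiO2, and then applies the usual criterion that an oxide is stable on Si only if no candidate reaction is downhill (ΔG < 0). The function names are hypothetical, and the free energies must be supplied by the reader from thermochemical tables; a fuller analysis would also list silicide- and silicate-forming reactions.

```python
from fractions import Fraction


def si_reduction_coefficients(x, y):
    """Balance Si + a*MxOy -> b*M + SiO2 for a metal oxide MxOy.

    Returns the stoichiometric coefficients per mole of Si consumed.
    """
    oxide = Fraction(2, y)   # a: supplies the two O atoms needed per SiO2
    metal = oxide * x        # b: metal atoms released by a formula units
    return {"Si": Fraction(1), "MxOy": oxide, "M": metal, "SiO2": Fraction(1)}


def stable_on_si(reaction_free_energies):
    """Stable only if every candidate reaction with Si has dG > 0.

    reaction_free_energies: iterable of dG values (any consistent unit),
    one per candidate reaction, supplied by the caller.
    """
    return all(dg > 0 for dg in reaction_free_energies)


# Ta2O5 (x=2, y=5): Si + 2/5 Ta2O5 -> 4/5 Ta + SiO2
ta = si_reduction_coefficients(2, 5)
print(ta["MxOy"], ta["M"])
```

The stability test is deliberately agnostic about where the ΔG values come from, so the same check applies to any of the ternary systems discussed here once the relevant reactions are enumerated.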

9 Materials Issues for High-k Gate Dielectric Selection and Integration

Fig. 9.4. Ternary phase diagrams for (a) Ta–O–Si and (b) Ti–O–Si systems. The phase diagrams are shown for temperatures of 700–900 °C, but these relations are also valid at much lower temperatures. After [9.4]

Fig. 9.5. Ternary phase diagram for the Zr–O–Si system (vertices O, Zr and Si; labeled phases ZrO2, ZrSiO4, SiO2, Zr2Si, ZrSi and ZrSi2). After [9.110]

Care must be taken in selecting the appropriate composition, however, because (ZrO2)x(SiO2)1−x phase separation into ZrO2 and SiO2 has been reported to occur at T < 900 °C for x > 0.25 [9.67]. This suggests that compositions in the upper right region of the gray shaded area may be preferable for obtaining a stable film which maintains a high-quality interface to Si. Similar behavior is expected for the Hf–O–Si system based on coordination chemistry arguments. Although the thermodynamic information is incomplete for the Hf–O–Si system, previous work suggests that HfO2 and HfSiO4, as well as a large range of (HfO2)x(SiO2)1−x compositions, will be stable in direct contact with Si up to high temperatures [9.3, 9.65, 9.72]. This fundamental difference from the Ta and Ti systems is extremely important, because it suggests that there is potential to control the dielectric-Si interface. A review of thermodynamic considerations in regard to gate dielectric selection has been provided by Schlom and Haeni [9.94].

More recent work has shown, however, that deposited films of metal oxides predicted to be thermodynamically stable, such as Al2O3 [9.7, 9.68], Y2O3 [9.7, 9.10], La2O3 [9.59] and ZrO2 [9.7, 9.11], do result in the formation of an interfacial layer, most likely


R.M. Wallace and G.D. Wilk

a silicate or SiO2. Recent work has demonstrated that even for the case of Zr-silicate deposited directly on Si, an extremely thin 3.5 Å SiO2 layer still forms at both Si interfaces [9.64]. For the cases of Zr- and Hf-silicates, a range of dielectric constants from 5 to 15 has been reported for a range of compositions (metal content from 3 to 27 at.%) using sputtering and CVD techniques [9.5, 9.8, 9.67, 9.80, 9.112–9.114, 9.120]. Clearly, details of the surface preparation and deposition conditions have a strong impact on the resulting quality and dielectric constant of the films.

Post-deposition exposure of uncapped films to oxygen can also result in interfacial layer formation, as has been reported for Y2O3 [9.6], La2O3 [9.59], ZrO2 [9.111] and HfO2 [9.116]. Recently, the role of OH species in the context of interfacial reactions has also been examined [9.23]. This emphasizes that the deposition and post-deposition processes have kinetic components that must be controlled for an optimized gate stack. Such work also demonstrates the need to focus on the integration requirements (as-processed device structure, thermal cycling, annealing ambient, O2 partial pressure, etc.) associated with conventional CMOS process flows rather than only on achieving a low teq value. For example, very recent work by Quevedo-Lopez et al. [9.81] demonstrated a fundamental stability issue for Zr-based dielectric materials: Zr interdiffusion with the Si substrate is observed for annealed Zr-silicate films. In contrast, Hf-silicate instability due to Hf interdiffusion with the Si substrate under similar annealing conditions was not observed within the depth resolution of the time-of-flight secondary ion mass spectrometry technique (i.e. < 1 nm) [9.82].
Recent surface science (scanning tunneling microscopy) investigations of uncapped Hf-silicate films (produced from Hf/SiO2 layers subsequently annealed in ultra-high vacuum) have identified the formation of Hf-silicides on the Si surface [9.50]. The reported reaction of Hf with the Si substrate appears to be limited to depths of < 1 nm, consistent with that reported by Quevedo-Lopez et al. [9.82].

Even in the case of process modifications such as a replacement gate flow, which allows the high-temperature processing to be done before deposition of the final gate dielectric, the continued thermal cycling afterward is sufficient to result in poor electrical properties for materials such as Ta2O5 [9.13]. It is therefore important to select a materials system in which the desired final state is stable. As many of the material candidates examined to date result in mixed interfacial alloys upon deposition or further processing, the use of materials such as silicates may allow for control of the Si interface composition, which may solve a key problem for the high-κ gate dielectric materials approaches. The κ values of materials such as (HfO2)x(SiO2)1−x are substantially lower than those of their pure metal oxide counterparts (in this case HfO2), but this tradeoff for interfacial control and relatively higher mobility will be acceptable as long as the resulting leakage currents are low enough.
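The κ tradeoff for silicates can be made concrete with a short sketch. A single layer of dielectric constant κ matches a given teq when its physical thickness is teq·κ/3.9, where 3.9 is the dielectric constant of SiO2. The 10 Å target below is a hypothetical value chosen only for illustration; the κ values of 5 and 15 span the silicate range quoted above, and the function name is an assumption.

```python
K_SIO2 = 3.9  # dielectric constant of SiO2


def physical_thickness(t_eq_angstrom, kappa):
    """Physical thickness (in Å) of one layer with dielectric constant
    kappa that gives the same capacitance as t_eq_angstrom of SiO2."""
    return t_eq_angstrom * kappa / K_SIO2


TARGET_T_EQ = 10.0  # hypothetical target, in Å

for kappa in (5, 15):  # range reported for Zr- and Hf-silicates
    print(kappa, round(physical_thickness(TARGET_T_EQ, kappa), 1))
```

Under these assumptions the film can be roughly 13 Å thick at κ = 5 and roughly 38 Å thick at κ = 15 for the same capacitance, which is why even a moderate κ may be acceptable if the thicker film keeps leakage low enough.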


9.3.3 Interface Quality

It is difficult to imagine any material creating a better interface to Si than that of SiO2, since typical production-quality SiO2 gate dielectrics have a midgap interface state density Dit ∼ 2 × 10^10 states/cm^2 eV. Most of the high-κ materials reviewed here show Dit ∼ 10^11–10^12 states/cm^2 eV, and in addition exhibit a substantial flatband voltage shift ∆VFB > 300 mV (possibly from fixed charge density > 10^12 /cm^2 in the film or at the interface). It is crucial to understand the origin of the interface properties of any high-κ gate dielectric, so that an optimal high-κ–Si interface may be obtained.

Lucovsky et al. have shown that bonding constraints, which arise from bond-stretching and bond-bending forces among the atoms associated with the interface, must also be considered at the Si-dielectric interface [9.55–9.57, 9.76]. Empirically, the interface defect density increases proportionally once the average number of bonds per atom Nav > 3, with a corresponding degradation in device performance. Thus, metal oxides which contain elements with a high coordination (such as Ta and Ti) will have a high (> 3.5) Nav, and form an over-constrained interface with Si with a concomitant degradation in leakage current and electron channel mobility. Similarly, cations with low coordination (e.g. Ba, Ca) compared to that of Si lead to under-constrained systems in the corresponding metal oxides. Such systems (metal oxides, ternary alloys, etc.) that are either over- or under-constrained with respect to SiO2 lead to the formation of a high density of electrical defects near the Si-dielectric interface, resulting in poor electrical properties. These bonding arguments can be extended to silicide formation in the gate dielectric, or even to any M–Si bonding (not necessarily a full silicide phase).
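As a rough illustration of the bond-counting argument, the sketch below computes Nav as a composition-weighted average of coordination numbers. The coordination numbers used (Si 4, O 2, N 3) are the standard values for SiO2 and Si3N4 and are assumptions of this example rather than data from the text; the function names are hypothetical.

```python
def average_bonds_per_atom(composition):
    """composition: list of (atoms_per_formula_unit, coordination_number)."""
    total_bonds = sum(n * c for n, c in composition)
    total_atoms = sum(n for n, _ in composition)
    return total_bonds / total_atoms


def is_over_constrained(n_av, threshold=3.0):
    """Apply the empirical Nav > 3 criterion quoted in the text."""
    return n_av > threshold


sio2 = average_bonds_per_atom([(1, 4), (2, 2)])   # Si 4-fold, O 2-fold
si3n4 = average_bonds_per_atom([(3, 4), (4, 3)])  # Si 4-fold, N 3-fold

print(f"SiO2:  Nav = {sio2:.2f}, over-constrained: {is_over_constrained(sio2)}")
print(f"Si3N4: Nav = {si3n4:.2f}, over-constrained: {is_over_constrained(si3n4)}")
```

With these assumed coordinations SiO2 gives Nav = 8/3 ≈ 2.67, below the threshold, while Si3N4 gives 24/7 ≈ 3.43, above it. The same function can be applied to a metal oxide once the coordination of the metal in that oxide is known.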
Any silicide bonding which forms near the channel interface (as may result when a metal oxide is placed in direct contact with, or near, Si) will tend to produce unfavorable bonding conditions, leading to poor leakage current and electron channel mobilities. In order to maintain a high-quality interface and channel mobility, it is expected to be important to have no metal oxide or silicide phases present at or near the channel interface. Adding a third component to an alloy or network may favorably affect the material in terms of thermal stability, bonding constraints and morphology.

Metal oxides such as ZrO2 and HfO2 are well known to exhibit high oxygen diffusivities [9.45], which is a serious concern in regard to the control of interfacial layer formation. Annealing treatments with excess oxygen present (e.g. from the ambient or from a sidewall oxide) will lead to rapid oxygen diffusion through the oxides, resulting in SiO2 or SiO2-containing interface layers. Although SiO2 forms an ideal interface with Si, an uncontrolled amount of SiO2 formation at the interface will severely compromise the capacitance gain from any high-κ layers in the gate stack. Thus resistance to oxygen diffusion from annealing ambients should be characterized in assessing the interface stability of high-κ dielectrics.
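The capacitance penalty from interfacial SiO2 can be estimated with a minimal sketch. Treating the gate stack as capacitors in series, each layer contributes its physical thickness scaled by 3.9/κ to the overall teq. The 40 Å layer with κ = 20 is a hypothetical high-κ film chosen only for illustration, and the function name is an assumption.

```python
K_SIO2 = 3.9  # dielectric constant of SiO2


def equivalent_oxide_thickness(layers):
    """Total t_eq (Å) of dielectric layers in series.

    layers: list of (physical_thickness_angstrom, dielectric_constant).
    """
    return sum(thickness * K_SIO2 / kappa for thickness, kappa in layers)


high_k = (40.0, 20.0)  # hypothetical high-k layer: 40 Å, kappa = 20

for sio2_angstrom in (0.0, 5.0, 10.0):
    stack = [(sio2_angstrom, K_SIO2), high_k]
    t_eq = equivalent_oxide_thickness(stack)
    print(f"{sio2_angstrom:4.1f} Å SiO2 -> t_eq = {t_eq:.1f} Å")
```

Because SiO2 contributes to teq one-for-one, every ångström of uncontrolled interfacial growth adds directly to the total, whereas the assumed high-κ layer contributes only about a fifth of its physical thickness.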


Many high-κ dielectrics are also reduced in the presence of H2 (the Ti-containing perovskites are all severely reduced by even low-temperature anneals in forming gas). High-κ gate dielectrics therefore need to be characterized with respect to the effect of anneals in reducing ambients such as forming gas (typically 90% N2 : 10% H2), which is a standard final anneal in the CMOS process and is believed to passivate interfacial traps (dangling bonds) with hydrogen.

For all dielectrics examined to date, a substantial amount of (positive or negative) fixed charge has been observed [9.115]. Such defects result in large flatband voltage shifts, and in some cases dynamic threshold shifts due to injected charge are observed. Recent work by Wilk et al. [9.116] and Gusev et al. [9.26] has examined the control of fixed charge by engineering the interfacial layer, albeit at the expense of gate capacitance. It has also been suggested that Coulomb scattering of carriers by the fixed charge present, as well as dynamic charge trapping of channel carriers at or near the channel interface, are major causes of the observed mobility degradation in transistors with alternate gate dielectrics such as HfO2 [9.26], Y2O3 [9.86] and Al2O3 [9.103]. Although this charge can typically only be measured by fabricating devices and making electrical measurements, it has been shown that the amount and sign of charge density in high-κ stacks can also be quantified by measuring the shift of the core-level binding energy of the Si peaks in X-ray photoelectron spectroscopy [9.71]. Recent work provides a more detailed discussion of the implications of these effects for device performance [9.41, 9.73].

The ideal gate dielectric stack may well turn out to have an “engineered” interface at the channel consisting of several monolayers of Si–O (and possibly N) containing material, such as a nitrided silicate [9.108].
This layer could serve to preserve the critical, high-quality nature of the SiO2 interface while providing a higher-κ value for that thin layer. The same pseudo-binary material could also extend beyond the interface, or a different high-κ material could be used on top of the interfacial layer. The detailed composition of the gate stack structure will likely be tailored to accommodate scaling performance requirements. Examples may include composition gradients of the various constituents, such as the N content, throughout the dielectric layer [9.115].

9.3.4 Film Morphology

Many of the alternate gate dielectrics studied to date are either polycrystalline or single-crystal films. As shown in Table 9.1, nearly all bulk metal oxides of interest, with the exception of Al2O3, will form a polycrystalline film during deposition or upon modest thermal treatments: HfO2 and ZrO2 are no exceptions. It is important to note, however, that the phases listed in Table 9.1 are bulk properties, and there will certainly be some suppression of crystallization for very thin films such as gate dielectrics, at temperatures where crystallization would otherwise be expected to occur. The extent of crystallization suppression for a given oxide will depend on composition (metal content), the annealing ambient (capped and uncapped films) as well as the thermal budget (i.e. temperature and time) associated with processing (i.e. kinetic factors). For example, uncapped Al2O3 films appear to crystallize upon annealing above T = 650 °C, whereas capped films apparently do not [9.74]. Previous work by Wilk and Wallace indicates that, for low concentrations of Hf [9.112] and Zr [9.113, 9.114], Hf- and Zr-silicates appear to remain amorphous upon post-deposition annealing to 1050 °C for 20 s. Films with somewhat higher concentrations of Hf are expected to exhibit some crystallization, and this effect has been reported [9.106].

Polycrystalline gate dielectrics could be problematic because grain boundaries may serve as high-leakage paths, and this may lead to the need for an amorphous interfacial layer to reduce leakage current. In addition, grain size and orientation changes throughout a polycrystalline film can cause significant variations in κ, leading to irreproducible properties, especially if gate dimensions approach the dielectric grain size. We note, however, that several recent studies appear to offer counterexamples, as encouraging electrical properties are reported for ZrO2 [9.35] deposited by atomic layer deposition (ALD) and for HfO2 [9.26, 9.30, 9.33, 9.42, 9.116] deposited by ALD and metallo-organic chemical vapor deposition (MOCVD). It should be noted, however, that these studies also used a gate dielectric stack, with the dielectric film on top of an amorphous SiO2 layer. It is unclear at this point to what extent the amorphous SiO2 layer affords the encouraging electrical properties, but this issue will become important, as the SiO2 layer presents a limit to the minimum achievable teq value for these structures.
Recent work on the crystallization kinetics of ALD HfO2 has determined that grain growth is thermally activated, as the average grain size is primarily determined by anneal temperature, while anneal time has little effect [9.32]. Interestingly, it was also found in this study that HfO2 films grown on a 10 Å SiO2 layer grown by dry thermal oxidation are small-grain polycrystalline, containing monoclinic and either tetragonal or orthorhombic phases, whereas HfO2 films deposited under equivalent conditions on a 10 Å SiO2 layer grown by wet chemical oxidation appear completely amorphous by bright-field, high-resolution transmission electron microscopy (TEM). Further investigation in Z-contrast scanning TEM using fluctuation electron microscopy, however, revealed strong signatures of order at a sub-nanometer level. The morphologies for these two cases become indistinguishable after a 700 °C anneal, as only a polycrystalline monoclinic phase is observed for both [9.32].

Perhaps a more significant issue is impurity (dopant) diffusion through the dielectric, typically enhanced by the presence of grain boundaries, resulting in significant threshold and flatband voltage shifts. Recent work has demonstrated B penetration from poly-Si capped amorphous Al2O3 [9.26, 9.74] and polycrystalline ZrO2 thin films into the Si substrate [9.75]. Phosphorus penetration was also reported for Al2O3 [9.48] and ZrO2 [9.51]. Similar results have been reported for HfO2 films as well [9.69, 9.70]. For pseudo-binary systems, such as Hf-silicate, recent studies indicate that films with sufficient Hf content to become polycrystalline exhibit enhanced dopant diffusion upon annealing [9.83, 9.84]. Suppression of dopant diffusion through the incorporation of N (either within the film or as an interfacial barrier layer) has previously been employed for SiO2 films [9.24], and considered for alternate gate dielectrics [9.108]. Recent work has suggested that N incorporation in high-κ materials such as Hf-silicate [9.91, 9.106], ZrO2 [9.38, 9.75], Al2O3 [9.74, 9.101] and HfO2 [9.39, 9.69, 9.70] appears promising. N incorporation also appears to provide sufficient modification of the bonding network in Hf-silicate so as to preserve an amorphous structure even with high Hf content [9.106].

Given the concerns regarding polycrystalline and single-crystal films, it appears that an amorphous film structure remains the ideal one for the gate dielectric, particularly in the near term. The prospects for longer-term single-crystal dielectrics remain dependent upon (1) adequate manufacturing tools that can address the large-volume throughput associated with mainstream Si technology and (2) a modified CMOS process flow that incorporates metal gates in a replacement-gate scheme. See the section on “future directions” in this book for details.

9.3.5 Gate Compatibility

Si-based gate electrodes have dominated CMOS technology because precise control of dopant implant energy and flux can accurately tune the desired threshold voltage, VT, for both nMOS and pMOS. Additionally, the process integration components (annealing, etching, contact silicidation, etc.) are well established in industry. It is therefore still desirable to employ a gate dielectric which will be compatible in direct contact with Si-based gates.
We also note that other efforts are focused on using poly-Si1−xGex gates for achieving higher boron activation levels and therefore better performance in pMOS devices, and potentially better performance in nMOS devices as well [9.43, 9.54, 9.105]. The pseudo-binary dielectrics appear to offer the required stability when in contact with Si-based gates as well as metal gates. Control of dopant penetration through any gate dielectric will remain an important issue, since doped poly-Si is the incumbent gate electrode material.

For CMOS scaling in the longer term, however, current roadmap predictions indicate that poly-Si gate technology will likely be phased out beyond the 70 nm technology node, after which a metal gate substitute appears to be required [9.96]. It is therefore also desirable to focus efforts on dielectric materials systems that are compatible with potential metal gate materials. Specifically, metal gates such as TiN and Pt have been used with most of the high-κ gate dielectrics mentioned above for material evaluation purposes, owing to their expected stability against adverse reactions with various dielectrics.


Fig. 9.6. Energy diagrams of threshold voltages for nMOS and pMOS devices using (a) midgap metal gates and (b) dual metal gates

Metal gates are very desirable for eliminating dopant depletion effects and sheet resistance/performance constraints for scaled CMOS [9.96]. In addition, use of metal gates in a replacement gate process can lower the required thermal budget by eliminating the gate dopant activation anneal that a poly-Si electrode requires [9.12, 9.58]. A key issue for gate electrode materials research will be the control of the gate electrode work function (Fermi level) at the dielectric interface after CMOS processing. Moreover, many of the potential advanced gate dielectrics investigated require metal gates, again due to their instability when in direct contact with Si.

There are two basic approaches toward achieving successful insertion of metal electrodes: a single midgap metal or two separate metals (which may include two compositions of the same alloy system). The energy diagrams associated with these two approaches are shown in Fig. 9.6.

The first approach is to use a metal (such as TiN) whose work function places its Fermi level at the midgap of the Si substrate, as shown in Fig. 9.6a. These are generally referred to as midgap metals. The main advantage of employing a midgap metal is a symmetrical VT value for both nMOS and pMOS, because by definition the same energy difference exists between the metal Fermi level and the conduction and valence bands of Si. This affords a simpler CMOS processing scheme, since only one mask and one metal would be required for the gate electrode (no ion implantation step would be required for the gate electrode). For the case of bulk CMOS devices, however, a major drawback is that the band gap of Si is fixed at 1.1 eV, so the threshold voltage for any midgap metal on Si will be ∼ 0.5 V for both nMOS and pMOS. Since voltage supplies are expected to be ∼ 1.0 V for sub-0.13 µm CMOS technology, VT ∼ 0.5 V is much too large, as it would be difficult to turn on the device.
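The symmetry argument for a midgap metal can be checked with a few lines of arithmetic. The sketch below places the Si conduction band edge at an assumed electron affinity of 4.05 eV below vacuum and uses the 1.1 eV band gap quoted above; the function name is hypothetical, and the result is an energy offset, not a full threshold voltage calculation.

```python
SI_BAND_GAP_EV = 1.1            # band gap quoted in the text
SI_ELECTRON_AFFINITY_EV = 4.05  # assumed position of E_C below vacuum


def band_edge_offsets(work_function_ev):
    """Return (offset from E_C, offset from E_V) in eV for a gate
    whose Fermi level lies work_function_ev below the vacuum level."""
    e_c = SI_ELECTRON_AFFINITY_EV
    e_v = e_c + SI_BAND_GAP_EV
    return work_function_ev - e_c, e_v - work_function_ev


# A midgap metal sits halfway between the two band edges.
midgap_work_function = SI_ELECTRON_AFFINITY_EV + SI_BAND_GAP_EV / 2
to_conduction, to_valence = band_edge_offsets(midgap_work_function)
print(round(to_conduction, 2), round(to_valence, 2))
```

Both offsets come out at about 0.55 eV, which is the origin of the symmetric ∼ 0.5 V threshold for nMOS and pMOS, and also of the difficulty: that value is about half of a ∼ 1.0 V supply.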
Typical threshold voltages for such scaled devices are expected to be 0.2–0.3 V. Compensation implants can be made in the channel to lower VT, but other concerns then arise regarding increased impurity ion scattering, which would degrade the electron channel mobility. Furthermore, midgap work function metal gate systems have been predicted not to provide a performance


improvement worthy of the added process complexity of replacing Si-based gates [9.18, 9.66]. Recent reports have examined TiN (and TiSiN) metal gates in conjunction with HfO2 gate dielectrics on n- and p-MOSFETs [9.92] as well as TaN gates [9.47]. Recently, TaN gates have been examined with Hf-silicate dielectrics as well [9.22].

The second approach toward metal electrodes involves two separate metals, one for pMOS and one for nMOS devices. As shown in Fig. 9.6b, two metals could be chosen by their work functions, ΦM, such that their Fermi levels line up favorably with the conduction and valence bands of Si, respectively. In the ideal case depicted in Fig. 9.6b, the ΦM value of Al could achieve VT ∼ 0.2 V for nMOS, while the higher ΦM value of Pt could achieve VT ∼ 0.2 V for pMOS. In practice, Al is not a feasible electrode metal because it will reduce nearly any oxide gate dielectric to form an Al2O3-containing interface layer. Other metals with relatively low work functions, such as Ta and TaN, however, are feasible gate metals for nMOS. Similarly for pMOS, Pt is not a practical choice for the gate metal, since it is not easily processed, does not adhere well to most dielectrics, and is costly. Other elemental metals with high ΦM values such as Au are also not practical, for the same reasons as for Pt. It has been suggested that the work functions for each metal should be within 0.2 eV of the conduction and valence band edges of Si [9.18]. However, metals such as Al, Ta, Ti, Hf and Zr have been demonstrated to reduce dielectrics when in direct contact [9.60]. Recently, the difference between vacuum-based work function values and those obtained from stack structures [9.52] has been examined by employing dipole theory at the metal/dielectric interface [9.121, 9.122]. Comparisons to a small set of metal oxide dielectric materials were performed.
From that work, it is suggested that the metal gate materials should exhibit a vacuum work function less than 4.05 eV for nMOS and greater than 5.17 eV for pMOS.

As an alternative to elemental metals, conducting metal oxides such as IrO2 and RuO2, which have been studied and used for years in DRAM applications [9.34], can provide high ΦM values in addition to affording the use of standard etching and processing techniques. Alloys of these and similar conducting oxides can also potentially be fabricated to achieve a desired work function. Regarding potential gate electrodes for pMOS devices, Zhong et al. [9.123, 9.124] made initial measurements of the important properties of RuO2, including thermal stability up to 800 °C, a low resistivity of 65 µΩ-cm, and a measured RuO2 work function of ΦM = 5.1 eV. Recently, metal alloys of TaN and TaSiN, which have been employed as diffusion barriers for Cu in “backend” portions of the CMOS process flow, have also been examined for nMOS applications [9.98, 9.99].

The concept of a “tunable” work function metal gate electrode would, of course, be appealing, as this is one of the key successes associated with poly-Si gate technology. Misra et al. [9.61] have recently examined the feasibility of the deposited Ta-Ru alloy system to accomplish this, with Ru as a pMOS gate and Ru0.6Ta0.4 as an nMOS gate [9.61, 9.125]. Alteration of a metal through


implantation, as in the case of poly-Si gates, is obviously appealing as well. Recently, the effects of N implantation in Mo on the Mo work function have been examined [9.52, 9.53]. Metal interdiffusion has also been suggested as a process-efficient method of controlling metal work functions [9.78].

9.3.6 Process Compatibility

A crucial factor in determining the final film quality and properties is the method by which the dielectrics are deposited in a fabrication process. The deposition process for the dielectric must be compatible with current or expected CMOS processing, cost and throughput. Since all of the feasible deposition techniques available occur under non-equilibrium conditions, it is certainly possible to obtain properties different from those expected under equilibrium conditions. It is therefore important to consider the various methods for depositing the gate dielectrics, and the following techniques will be discussed here: physical vapor deposition (PVD, e.g. sputtering and evaporation), chemical vapor deposition (CVD), atomic layer CVD (ALCVD), and molecular beam epitaxy (MBE).

PVD methods have provided a convenient means to evaluate materials systems for alternate dielectric applications. A sputter PVD process, however, inherently causes surface damage and thereby typically creates unwanted interfacial states. Additionally, the device morphology inherent to the scaling process generally rules out such line-of-sight PVD deposition approaches. CVD methods, by contrast, have proven to be quite successful in providing uniform coverage over complicated device topologies. Interestingly, many encouraging results in recent gate dielectric research have come from PVD films, which may be speculated to result from lower impurity levels than in CVD films, where the CVD precursors can lead to impurity incorporation. Some researchers have suggested that such PVD films should not be ruled out based on such results [9.106].
The reaction kinetics associated with CVD film deposition require careful attention in order to control interfacial layer formation. The precursor employed in the deposition process must also be tailored to avoid unwanted impurities in the film as well as permit useful final compositions in the dielectric film. Indeed, a graded composition for dielectric films may be a key requirement in order to control interface state formation to a level comparable to SiO2.

The recent application of ALCVD® (also referred to as ALD) methods for depositing gate dielectrics appears to provide much promise, where self-limiting chemistries are employed to control film formation in a layer-by-layer fashion, through the delivery of alternating chemical precursor pulses in a laminar flow of N2 carrier gas [9.104]. This process affords outstanding conformality, even over very demanding device structures (see Chap. 21). As discussed above, the surface preparation and the resultant chemistry must be carefully considered. A recent, comprehensive review on the ALD


process, reactors, precursors and wide-ranging applications has been written by Ritala and Leskelä [9.79]. The application of the ALD method for capacitor dielectrics has been recently reviewed by Sneh et al. [9.97], where impressive conformality is noted for dielectric films associated with high aspect-ratio capacitor trenches [9.95, 9.97]. Improvements in wafer throughput are now under consideration to address manufacturing economics issues as well [9.95]. Recent work has indicated that ALD HfO2 gate dielectrics coupled with controlled interfacial chemistry and post-deposition annealing provide a viable pathway for MOS transistors and alternative (e.g. vertical) transistor designs [9.77, 9.116]. The technique also permits the formation of laminated structures such as HfO2-Al2O3 multilayered films [9.16, 9.49], Hf-aluminates [9.31] and Zr-aluminates [9.14].

To better understand the fundamental ALD growth process, the growth kinetics in the critical initial stages have been investigated for ALD HfO2 [9.25] and ALD Al2O3 [9.21] on Si and SiO2. An analytical model describing the different growth modes which result from ALD HfO2 on various starting surfaces (e.g. H-terminated Si, SiO2 grown by dry thermal oxidation and SiO2 grown by wet chemical oxidation) has also been constructed [9.2], and should serve as a basis for advancing the understanding of film morphologies resulting from the ALD process.
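A back-of-envelope consequence of self-limiting, layer-by-layer growth is that film thickness is set by the number of precursor cycles. The sketch below assumes an idealized constant growth per cycle; the 0.8 Å/cycle figure and the function name are hypothetical, and the model deliberately ignores the surface-dependent initial growth regime discussed above.

```python
import math


def ald_cycles_for_thickness(target_angstrom, growth_per_cycle_angstrom):
    """Smallest whole number of ALD cycles reaching the target thickness,
    assuming a constant growth per cycle (idealized linear growth)."""
    if growth_per_cycle_angstrom <= 0:
        raise ValueError("growth per cycle must be positive")
    return math.ceil(target_angstrom / growth_per_cycle_angstrom)


# Hypothetical process: 30 Å film at an assumed 0.8 Å per cycle.
print(ald_cycles_for_thickness(30.0, 0.8))
```

In practice the first cycles on H-terminated Si or on different SiO2 surfaces may grow at a different rate, so a measured growth curve would replace the constant used here.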

9.3.7 Reliability

The electrical reliability of a new gate dielectric must also be considered critical for application in CMOS technology. Determining whether or not a high-κ dielectric satisfies the strict reliability criteria requires a well-characterized materials system, which is not yet available for the alternate dielectric materials considered here. The nuances of the dependence of voltage acceleration extrapolation on dielectric thickness, and the improvement of reliability projection arising from improved oxide thickness uniformity, have only recently become understood, despite decades of research on SiO2. This further emphasizes the importance and urgency of investigating the reliability characteristics of alternative dielectrics, as these materials are sure to exhibit subtleties in reliability that differ from those of SiO2. However, some preliminary projections for reliability, as determined by stress-induced leakage current (SILC), time-dependent dielectric breakdown (TDDB) and mean time to failure (MTTF) measurements, appear to be encouraging for many candidates (see the review by Wilk et al. [9.115] for a synopsis of previous preliminary work). See Chaps. 4 and 17 for further discussion on the status and issues of high-κ gate dielectric reliability. Recently, the role of Si interdiffusion with the gate dielectric and the Si substrate [9.81, 9.82] has been examined in view of reliability constraints [9.29]. The susceptibility to hot-carrier charge trapping has also been recently investigated [9.46].


9.4 Conclusions

We have presented several key materials issues that candidate gate dielectrics must address: (a) Permittivity and Barrier Height, (b) Thermodynamic Stability on Si, (c) Interface Quality, (d) Film Morphology, (e) Gate Compatibility, (f) Process Compatibility and (g) Reliability. All potential alternative dielectric materials must also address several fundamental performance and integration concerns, such as fixed charge, dopant depletion in the poly-Si gate electrode, and an increasing electric field in the channel region, which decreases device performance. Furthermore, dopant diffusion characteristics, failure mechanisms and reliability of any potential high-κ dielectric need to be better understood.

The semiconductor industry has enjoyed the excellent reliability characteristics of SiO2 for many years, but any new material will certainly exhibit different behavior, which may or may not have deleterious effects on device performance. The stringent requirements for 10-year reliability of CMOS devices will be a harsh test for any high-κ materials candidate. An even greater challenge is the adoption of a new candidate in the time frame required by the industry roadmap (∼ 4–5 years) in order to maintain cost/performance trends. The industry has enjoyed the fruits of over 30 years of research and development on the SiO2/Si materials system, a fact not always recognized in technology development planning. A new generation of scientists and engineers will be challenged not only by integrating these new materials in a timely manner, but also by avoiding the mistakes of the past. Opportunities like these to revolutionize an industry are indeed rare!

References

9.1. Afanas’ev VV, Stesmans A, Chen F, Shi X, Campbell SA (2002) “Internal photoemission of electrons and holes from (100)Si into HfO2,” Applied Physics Letters 81:1053–5
9.2. Alam MA, Green ML (2003) “A mathematical description of atomic layer deposition, and its application to the nucleation and growth of HfO2 gate dielectric layers,” unpublished
9.3. Barin I, Knacke O (1973) “Thermochemical Properties of Inorganic Substances” (Springer-Verlag, Berlin)
9.4. Beyers R (1984) “Thermodynamic considerations in refractory metal-silicon-oxygen systems,” J. Applied Physics 56:147–52
9.5. Bevan MJ, Visokay MR, Chambers JJ, Rotondaro ALP, Bu H, Shanware A, Mercer DE, Laaksonen RT, Colombo L (2001) “Comparative Study of High-K CVD Films of Hf and Zr Silicate for CMOS Devices,” as discussed at the IEEE Semiconductor Interface Specialists Conference, Washington D.C.
9.6. Busch BW, Kwo J, Hong M, Mannaerts JP, Sapjeta BJ, Schulte WH, Garfunkel E, Gustafson T (2001) “Interface reactions of high-κ Y2O3 gate oxides with Si,” Applied Physics Letters 79:2447–9


R.M. Wallace and G.D. Wilk

9.7. Busch BW, Pluchery O, Chabal YJ, Muller DA, Opila RL, Kwo J, Garfunkel E (2002) “Materials Characterization of Alternative Gate Dielectrics,” Materials Research Society Bulletin 27:206–211
9.8. Callegari A, Cartier E, Gribelyuk M, Okorn-Schmidt H, Zabel T (2001) “Physical and electrical characterization of Hafnium oxide and Hafnium silicate sputtered films,” Journal of Applied Physics 90:6466
9.9. Cartier E (2002) “Emerging challenges in the development of high-e gate dielectrics for CMOS applications,” Proceedings of the AVS 3rd International Conference on Microelectronics and Interfaces, February 11-14, Santa Clara, CA, pp. 119–22
9.10. Chambers JJ, Parsons GN (2000) “Yttrium silicate formation on silicon: Effect of silicon preoxidation and nitridation on interface reaction kinetics,” Applied Physics Letters 77:2385–7
9.11. Chang JP, Lin YS, Berger S, Kepton A, Bloom R, Levy S (2001) “Ultrathin zirconium oxide films as alternative gate dielectrics,” J. Vacuum Science and Technology B19:2137–43
9.12. Chatterjee A, Rodder M, Chen I-C (1998) “A Transistor Performance Figure-of-Merit Including the Effect of Gate Resistance and its Application to Scaling to Sub-0.25-µm CMOS Logic Technologies,” IEEE Transactions on Electron Devices 45:1246–52
9.13. Chatterjee A, Chapman RA, Joyner K, Otobe M, Hattangady S, Bevan M, Brown GA, Yang H, He Q, Rogers D, Fang SJ, Kraft R, Rotondaro ALP, Terry M, Brennan K, Aur SW, Hu JC, Tsai H-L, Jones P, Wilk G, Aoki M, Rodder M, Chen I-C (1998) “CMOS Metal Replacement Gate Transistors using Tantalum Pentoxide Gate Insulator,” Technical Digest of the International Electron Devices Meeting, pp. 777–80
9.14. Chen PJ, Cartier E, Carter RJ, Kauerauf T, Zhao C, Petry J, Cosnier V, Xu Z, Kerber A, Tsai W, Young E, Kubicek S, Caymax M, Vandervorst W, DeGendt S, Heyns M, Copel M, Besling W, Bajolet P, Maes J (2002) “Thermal Stability and Scalability of Zr-Aluminate-Based High-K Gate Stacks,” Symposium on VLSI Technology Technical Digest of Papers, pp. 192–3
9.15. Cheng B, Cao M, Rao R, Inani A, Voorde P, Greene W, Stork J, Yu Z, Zeitzoff P, Woo J (1999) “The Impact of High-κ Gate Dielectrics and Metal Gate Electrodes on Sub-100 nm MOSFET’s,” IEEE Transactions on Electron Devices 46:1537–44
9.16. Cho MH, Roh YS, Whang CN, Jeong K, Choi HJ, Nam SW, Ko DH, Lee JH, Lee NI, Fujihara K (2002) “Dielectric characteristics of Al2O3–HfO2 nanolaminates on Si(100),” Applied Physics Letters 81:1071–3
9.17. Copel M, Cartier E, Ross FM (2001) “Formation of a stratified lanthanum silicate dielectric by reaction with Si(001),” Applied Physics Letters 78:1607–9
9.18. De I, Johri D, Srivastava A, Osburn CM (2000) “Impact of gate workfunction on device performance at the 50 nm technology node,” Solid-State Electronics 44, no. 6, pp. 1077–80
9.19. Fischetti MV, Neumayer DA, Cartier EA (2001) “Effective electron mobility in Si inversion layers in metal–oxide–semiconductor systems with a high-k insulator: The role of remote phonon scattering,” Journal of Applied Physics 90:4587–4608
9.20. Frank D, Taur Y, Wong H-S P (1998) “Generalized scale length for two-dimensional effects in MOSFETs,” IEEE Electron Device Letters 19:385–7


9.21. Frank MM, Chabal YJ, Wilk GD (2003) “Nucleation and interface formation mechanisms in Al2O3 atomic layer deposition,” unpublished
9.22. Gopalan S, Onishi K, Nieh R, Kang CS, Choi R, Cho HJ, Krishnan S, Lee JC (2002) “Electrical and physical characteristics of ultrathin hafnium silicate films with polycrystalline silicon and TaN gates,” Applied Physics Letters 80:4416–8
9.23. Gougousi T, Jason Kelly M, Parsons GN (2002) “The role of the OH species in high-k polycrystalline silicon gate electrode interface reactions,” Applied Physics Letters 80:4419–21
9.24. Green ML, Gusev EP, Degraeve R, Garfunkel EL (2001) “Ultrathin (

10–15. In particular, this has focused attention on i) group IIIB and IVB transition metals, and ii) lanthanide series rare earth oxides in which the dielectric constants are in the range from at least 15 to 25. Since the gate oxide capacitance scales directly with the dielectric constant, these substitutions give increases by factors of four to six in physical thickness for the same EOTs as thermally-grown SiO2. However, since the tunneling transmission probability also depends on the height of the tunneling barrier, e.g., the conduction band offset energy, EB, and the effective electron tunneling mass, meff, reductions in these parameters are expected to mitigate some of the gains associated with increased physical thickness [11.1,11.2]. This mandates an increased understanding of the fundamental differences between the electronic structure of SiO2 and the transition metal and rare earth alternative dielectrics, since electronic structure plays a significant role in determining EB and meff. This chapter addresses the chemical bonding and fundamental electronic structure of transition metal and lanthanide rare earth gate dielectrics.
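The trade-off described above, a larger physical thickness at fixed EOT against a lower barrier height and tunneling mass, can be illustrated with a short numerical sketch. It assumes a simple rectangular-barrier WKB estimate of the transmission, T ≈ exp(−2 t √(2 meff EB)/ħ), which is a textbook simplification and not the tunneling model of [11.1,11.2]. The function names and the numerical values chosen for EB and meff are illustrative assumptions only; the dielectric constants are those quoted in the text.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
M0 = 9.1093837015e-31       # free-electron mass, kg
EV = 1.602176634e-19        # joules per electron volt
K_SIO2 = 3.9                # dielectric constant of SiO2, as quoted in the text


def physical_thickness_nm(eot_nm, k_dielectric, k_sio2=K_SIO2):
    """Physical thickness that gives the same capacitance as eot_nm of SiO2."""
    return eot_nm * k_dielectric / k_sio2


def wkb_transmission(thickness_nm, barrier_ev, m_eff_ratio):
    """Rectangular-barrier WKB estimate, T = exp(-2 * kappa * t)."""
    kappa = math.sqrt(2.0 * m_eff_ratio * M0 * barrier_ev * EV) / HBAR  # 1/m
    return math.exp(-2.0 * kappa * thickness_nm * 1e-9)


if __name__ == "__main__":
    eot = 1.5  # nm
    # (label, k, barrier in eV, m_eff/m0); barrier and mass are assumed values
    cases = [("SiO2", 3.9, 3.1, 0.5),
             ("high-k, k=15", 15.0, 1.5, 0.2),
             ("high-k, k=25", 25.0, 1.5, 0.2)]
    for label, k, e_b, m_eff in cases:
        t = physical_thickness_nm(eot, k)
        print(f"{label:14s} t = {t:5.2f} nm  T ~ {wkb_transmission(t, e_b, m_eff):.2e}")
```

With dielectric constants of 15 to 25 and 3.9 for SiO2, the thickness ratio comes out between about 3.8 and 6.4, in line with the factors of four to six quoted above. At a fixed thickness, lowering the barrier or the mass raises the estimated transmission, which is the mitigating effect the text refers to.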
From this point forward, no distinction will be made between rare earth lanthanides and group IIIB transition metals, since they have similar electronic structures with respect to the atomic states with d-symmetry that contribute to the lowest conduction bands. For the past twenty to thirty years, thermally-grown SiO2 gate dielectrics and Si–SiO2 interfaces were the gate stack constituents for Si integrated circuits in which the level of integration increased more than a thousand fold and the thickness dimensions were reduced proportionally. The bulk properties of relatively thick SiO2 films in the range of approximately 10 to 1000 nm defined the performance and reliability metrics for device scaling.

G. Lucovsky and J.L. Whitten

The properties of the Si–SiO2 interface, including the densities of interface traps, Dit, and fixed charge, Qf, defined additional metrics for the eventual replacement of SiO2 or nitrided SiO2 by a qualitatively different deposited alternative thin film dielectric. Possible replacements for thermally-grown SiO2, including deposited thin films of silicon nitride, Si3N4, aluminum oxide, Al2O3, and zirconium oxide, ZrO2, received attention at the research and advanced development levels during the 1970’s and 1980’s. These materials have dielectric constants that are larger than that of SiO2 (∼3.8 to 3.9) by factors of approximately 2 for Si3N4, 2.5 for Al2O3, and more than 4 for ZrO2. However, other properties, primarily Dit and/or Qf, were significantly increased and too large to meet the empirically defined scaling metrics. Additionally, neither direct nor Fowler–Nordheim tunneling was a significant issue due to the thickness range of interest at that time. In addition to reducing direct tunneling for EOTs extending below about 1.5 nm, devices with these dielectrics must preserve all of the other properties of Si–SiO2 CMOS devices, including current drive, which is proportional to the respective electron and hole channel mobilities, threshold voltage control, and as-deposited/processed and electrically stressed defect densities. Since many of these properties are controlled by the bonding arrangements in the immediate vicinity (< 1 nm) of the metallurgical Si-dielectric interface, the replacement of SiO2 or Si oxynitride alloys by Al2O3 or transition metal or rare earth oxides, silicates and aluminates must not degrade the interface between these replacement gate dielectrics and the Si substrate, and must also be compatible with the processing associated with polycrystalline Si gate electrodes, and eventually with dual metal gate electrodes.
The next section of this chapter addresses the unique bulk and interface properties of SiO2 and the Si–SiO2 interface, respectively, at a fundamental level, in terms of chemical bonding, electronic structure, and strain energy. This approach provides a background for understanding the qualitative and quantitative differences in bulk and interface properties associated with the introduction of deposited alternative dielectrics. The sections that follow correlate chemical bonding and fundamental electronic structure of alternative dielectrics, defining fundamental limitations for the replacement of thermally-grown SiO2, and the intrinsic factors that will contribute to the inevitable end of the road, and of the road maps, for the aggressive scaling of bulk CMOS integrated circuits.

The key material enabling Si-based metal-oxide-semiconductor field effect transistor (MOSFET) technology is silicon dioxide. The use of amorphous, thermally grown SiO2 as a gate dielectric offers several important material and electrical properties that are exploited in complementary metal-oxide-semiconductor (CMOS) technology, including a kinetically-stable, device-quality Si–SiO2 interface with excellent electrical isolation properties. In state-of-the-art CMOS processing, defect charge densities are on the order of 10^10/cm^2, mid-gap interface state densities are ∼10^10/cm^2 eV, and hard breakdown fields in excess of 10 MV/cm

11 Electronic Structure of Alternative High-k Dielectrics


are routinely obtained and are therefore expected regardless of the device dimensions. These outstanding kinetic-stability and electrical properties clearly present a significant challenge for any alternative gate dielectric candidate.

11.2 SiO2 and the Si–SiO2 Interface

11.2.1 Interfacial Transition Regions Between Crystalline Si and Non-crystalline SiO2

Before beginning a discussion of the local atomic and electronic structure of SiO2, it is important to understand the inherent complexity of the Si–SiO2 interface. It is well established that Si–SiO2 interfaces in metal oxide semiconductor field effect transistor (MOSFET) devices are not atomically abrupt, but instead contain i) a transition region ∼0.5 nm thick in which the distribution of local bonding arrangements has an average SiO composition, as well as ii) a strained or defective region in the Si substrate that is of similar spatial extent and that likely contains Si dangling bonds [11.3–11.5]. Similar interfacial transition regions have been found at interfaces between Si and high-K dielectrics [11.6]. It is therefore critically important to understand i) the basic physical and chemical forces that drive the creation of these interfacial regions, ii) the bonding within these regions, and iii) the effects that these regions have on device performance and reliability. These issues have been addressed by combining new and important insights into the nature of self-organized regions that develop at compositionally-defined interfaces between unconstrained and heavily constrained materials, i.e., the Si and SiO2 components of the Si–SiO2 interface [11.7]. Three factors contribute to the formation of these transition regions: i) differences in Si–Si inter-atomic distances in the Si substrate and the SiO2 dielectric result in intrinsic compressive stress in the oxide and tensile stress in the substrate, ii) differences in the linear thermal expansion coefficients of the oxide and the substrate add a thermally-induced stress component, and iii) differences in the average number of bonds per atom, 4 in the substrate and 2.7 in the oxide, contribute to bond-strain at the atomic scale.
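The bond counting in item iii) is simple arithmetic. As a minimal sketch, assuming that the average number of bonds per atom is the composition-weighted mean of the atomic coordinations (4 for Si, 2 for O), the helper below reproduces the value of 4 for the Si substrate and about 2.7 for SiO2. The function name is hypothetical and is introduced here for illustration only.

```python
def average_bonds_per_atom(composition):
    """Composition-weighted mean coordination.

    composition maps an element label to a tuple of
    (atoms per formula unit, coordination of that atom).
    """
    total_atoms = sum(count for count, _ in composition.values())
    total_bonds = sum(count * coord for count, coord in composition.values())
    return total_bonds / total_atoms


# Crystalline Si: every atom is four-fold coordinated.
si = average_bonds_per_atom({"Si": (1, 4)})
# SiO2: one four-fold Si and two two-fold O atoms per formula unit.
sio2 = average_bonds_per_atom({"Si": (1, 4), "O": (2, 2)})

if __name__ == "__main__":
    print(f"Si:   {si:.2f}")          # 4.00
    print(f"SiO2: {sio2:.2f}")        # 2.67
    print(f"step: {si - sio2:.2f}")   # 1.33
```

The step of about 1.3 bonds per atom across the interface is the discontinuity that the transition region has to accommodate.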
Atomic structure at Si–SiO2 interfaces at the monolayer scale plays a significant role in determining the performance and reliability of state-of-the-art and projected, aggressively scaled advanced microelectronic devices. The contributions to local bond strain and macroscopic strain identified above have been shown in [11.7] to promote formation of Si suboxide, SiOx, x < 2, interfacial transition regions. Building on an increased understanding of self-organization in binary glass alloys such as GexSe1−x [11.8,11.9], this has identified a novel approach for understanding the importance of self-organization within these transition regions that results in the formation of monolayer strain-free suboxide bonding. The bonding within these transition regions plays a crucial role in defining the performance and reliability metrics that have been applied to road map metrics for alternative high-K dielectrics as well, and this is one of the most significant challenges in the introduction of alternative dielectrics that meet the demands of aggressive scaling. The precursor bonding arrangements for this self-organization are a natural consequence of the high temperature, 800–1000 °C, conventional thermal oxidation process and a 900 °C relaxation, but can also be achieved by low temperature, 300 °C, remote plasma-assisted oxidation, and other more empirical approaches. Stated differently, aggressive scaling metrics have been indexed to the properties of the thermally-grown interfaces, without serious consideration of the differences in electronic structure and bond-ionicity that are inherent to the most promising transition metal and rare earth alternative dielectrics. Additionally, it has been shown that the intentional formation of non-optimal and physically thicker interfacial transition regions by low temperature remote plasma-assisted oxidation inhibits self-organization, and results in degraded performance and reliability [11.7]. Additional factors, such as bond-ionicity or equivalent inherent heterovalency, contribute to the formation and properties of these transition regions when alternative high-K dielectrics are substituted for SiO2. Finally, deposited replacement dielectrics introduce internal dielectric interfaces between interfacial SiO2 transition regions and the bulk of the high-K film. These interfaces are non-ideal in two ways. First, the number of bonds/atom and related presence or absence of intrinsic bond-strain will generally be different for the two dielectric materials involved, and this will contribute to interfacial defect formation [11.10].
Second, the bond ionicities, discussed in more detail later on in this chapter, of the two interface components will generally be different, and this can contribute to heterovalent interfacial bonding in which the number of electrons available for two-electron pair bonds is not balanced by the nuclear charges of the constituent atoms on each side of the interface. This is a characteristic property of bonding at interfaces between SiO2 , and more ionic elemental oxides, Al2 O3 , ZrO2 , etc., and their respective silicate and aluminate alloys. The formation of defects at these interfaces, their relaxation, and their ultimate effect on device properties are beyond the intended scope of this chapter, and are addressed in other publications [11.10]. The basic idea is that defects at semiconductor-dielectric interfaces, and internal dielectric interfaces scale as the sum of the discontinuities in the absolute values of the difference in the number of bonds/atom or average coordination, and the differences in the bond ionicities at these interfaces. This approach has been applied at the Si-SiO2 interface accounting for the number of Si-atom dangling bonds/cm2 before hydrogen passivation, and at internal dielectric interfaces between SiO2 and as-deposited i) Si3 N4 and Si oxynitride alloys, Al2 O3 , and ii) transition metal oxides accounting for the fixed charge/cm2 . The scaling extends over more than a factor of fifty in interfacial defects.


11.2.2 Local Atomic Structure of SiO2

Non-crystalline SiO2 is a very special material, and its interface with Si is unique among semiconductor-native oxide interfaces as well. The local atomic structure of bulk SiO2, or fused silica, and of thin film SiO2 has been the focus of many studies over the past forty to fifty years. Bulk SiO2 has been identified as the prototypical non-crystalline solid with a continuous random network (CRN) structure ([11.11] and references therein). It has been the testing and proving ground for many of the x-ray, electron and neutron diffraction techniques that were used to deduce information about the local atomic structure, including bond-lengths and bond-angles, and intermediate range order characterized by dihedral angles and rings of bonded atoms. These diffraction methods were complemented by studies of vibrational properties by infrared and Raman techniques [11.12]. This era of research was highlighted by the ball and stick models of Bell and Dean [11.13]. Calculations based on these models were in excellent agreement with radial distribution functions deduced from diffraction studies, providing an identification of the local bonding geometry, including statistical distributions of bond-lengths and bond-angles. Calculations of vibrational properties identified empirical force constants that eventually provided important information relative to glass formation and thin film annealing [11.14, 11.15]. However, it is only recently that ab initio calculations, based on the local atomic structure deduced from analysis of diffraction studies, were able to provide a quantitative understanding of the mechanical and vibrational properties based on fundamental electronic structure [11.16]. SiO2 is the prototypical network amorphous solid [11.11], and has a CRN structural morphology in which the Si-atoms are four-fold coordinated in a tetrahedral geometry, and the O-atoms are two-fold coordinated.
The ideal network is chemically ordered, containing only Si–O bonds. Defect bonding configurations have been proposed that are associated with broken bonds, or wrong bonds, Si–Si, or equivalently O-atom vacancies [11.17]. The random aspects of the structure, which provide the configurational entropy necessary for glass formation and metastability of the disordered or amorphous morphology, include i) the large distribution of Si–O–Si bond angles, ∼ 150 degrees ±10 degrees [11.18, 11.19], and ii) the randomness of the dihedral angle distribution, or four-body correlation functions. There are several important aspects of the local vibrational properties that have been correlated with the fundamental electronic structure [11.16, 11.20]. These include i) the very small bond-bending force constant at the O-atom sites, i.e., the body bond-bending valence force field constant for Si–O–Si, kθ ∼ 104 dynes/cm, and ii) large differences between the infrared effective charges for the three normal mode vibrations as referenced to the two-fold coordinated O-atoms: (a) the asymmetric bond-stretching vibration, νas, (b) the symmetric bond-stretching or bending vibration, νss, and (c) the out-of-plane bond-rocking vibration, νr.


11.2.3 Electronic Structure of SiO2

There was an upsurge of interest in the electronic structure of SiO2 and the Si–SiO2 interface that began in the early and mid 1970’s as integrated circuits began to emerge. Two different approaches were used in these early calculations: i) semi-empirical tight binding calculations performed on cluster Bethe lattices (CBL) [11.21, 11.22], and ii) periodic structures with the local atomic structure of different crystalline structures of SiO2, e.g., cristobalite and α-quartz [11.23,11.24]. Another approach was based on the application of ab initio quantum chemistry calculations to molecules that include the local bonding arrangements of Si and O [11.25, 11.26]. This approach has recently been refined and used to revisit several important aspects of the electronic structure of SiO2 that have also been extended to high-K dielectrics, and is discussed below [11.16]. The electronic structure calculations of [11.16] are ab-initio in character, and employ variational methods in which an exact Hamiltonian is used. No core potential or exchange approximations are assumed [11.27, 11.28]. The Hamiltonian, H, is given in (11.1),

H = \sum_i \left( -\tfrac{1}{2}\nabla_i^2 - \sum_k \frac{Z_k}{r_{ik}} \right) + \sum_{i<j} \frac{1}{r_{ij}} \qquad (11.1)

2, and a Pauling bond ionicity of greater than about 67%. This group includes transition metal oxides that are deposited by low temperature techniques including plasma deposition and sputtering with post-deposition oxidation [11.1]. The coordination of the oxygen atoms in these RCP structures is typically four or more. The values of Ib, calculated from (11.5), that separate these three groups are approximately 47% and 67%.
The use of other electronegativity scales, e.g., Sanderson [11.52], and different definitions of bond ionicity would change the values of bond ionicity that establish the boundaries, but do not change the separation of oxides and oxide alloys into the same three classes, and therefore do not modify any of the qualitative aspects of the proposed classification scheme. The coordination of oxygen atoms scales monotonically with increasing bond-ionicity. This suggests a fundamental relationship between charge localization on the oxygen atom and bonding coordination that has been confirmed by spectroscopic studies of Zr silicate alloys in which the coordination varies linearly with alloy composition [11.49].


Table 11.1. Electronegativity difference, ∆X, average bond ionicity, Ib, and metal and oxygen coordination for SiO2 and high-k alternative dielectrics

DIELECTRIC                 ∆X      Ib      metal/silicon coordination   oxygen coordination

CRNS
SiO2                       1.54    0.45    4                            2.0

MCRNS
Al2O3                      1.84    0.57    4 and 6 (3:1)                3.0
Ta2O5                      1.94    0.61    6 and 8 (1:1)                2.8
TiO2                       1.90    0.59    6                            3.0
(ZrO2)0.1(SiO2)0.9         1.61    0.48    8 and 4                      2.2
(ZrO2)0.23(SiO2)0.77       1.70    0.51    8 and 4                      2.46
(ZrO2)0.5(SiO2)0.5         1.88    0.59    8 and 4                      3.0
(TiO2)0.5(SiO2)0.5         1.72    0.52    6 and 4                      2.5
(Y2O3)1(SiO2)2             1.88    0.59    6 and 4                      2.86
(Y2O3)2(SiO2)3             1.93    0.61    6 and 4                      3.0
(Y2O3)1(SiO2)1             1.99    0.63    6 and 4                      3.11
(Al2O3)4(ZrO2)1            2.02    0.64    4 and 8                      3.0
(Al2O3)3(Y2O3)1            1.97    0.62    4 and 6                      3.0

RANDOM IONS
HfO2                       2.14    0.68    8                            4.0
ZrO2                       2.22    0.71    8                            4.0
(La2O3)2(SiO2)1            2.18    0.70    6 and 4                      3.5∗
Y2O3                       2.2     0.7     6                            4.0
La2O3                      2.34    0.75    6                            4.0

∗ The (La2O3)2(SiO2)1 alloy contains both O ions and silicate groups, and as such the O atoms have local coordinations of 4 and 3, respectively.
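The Ib column of Table 11.1 can be checked against the ∆X column. The sketch below assumes that (11.5), which is not reproduced in this part of the text, is Pauling's empirical relation Ib = 1 − exp(−0.25 ∆X²). That form is an assumption, although it agrees with the tabulated entries evaluated here to two decimal places. The function names and the class labels are illustrative; the 47% and 67% boundaries are the ones quoted in the text.

```python
import math


def bond_ionicity(delta_x):
    """Pauling-type bond ionicity from an electronegativity difference."""
    return 1.0 - math.exp(-0.25 * delta_x ** 2)


def classify(ionicity, lower=0.47, upper=0.67):
    """Assign one of the three structural classes used in Table 11.1."""
    if ionicity < lower:
        return "CRN"
    if ionicity <= upper:
        return "modified CRN"
    return "random close-packed ions"


# (dielectric, delta X) pairs taken from Table 11.1
TABLE = [("SiO2", 1.54), ("Al2O3", 1.84), ("TiO2", 1.90),
         ("HfO2", 2.14), ("ZrO2", 2.22), ("La2O3", 2.34)]

if __name__ == "__main__":
    for name, dx in TABLE:
        ib = bond_ionicity(dx)
        print(f"{name:6s} dX = {dx:.2f}  Ib = {ib:.2f}  {classify(ib)}")
```

A different electronegativity scale would shift the computed Ib values and hence the numerical boundaries, but, as noted in the text, not the grouping of the oxides into the same three classes.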

11.4 Electronic Structure of Transition Metal Dielectrics

11.4.1 Empirical Correlations Between Electronic Structure and Atomic d-State Energies

Figure 11.6 is a schematic molecular orbital energy level diagram for a group IV transition metal, e.g., Ti, in an octahedral bonding arrangement with six oxygen neighbors [11.53–11.55]. Each oxygen atom is assumed to provide one σ and two π 2p-electrons for potential bonding with the neutral group IVB atom that contributes four additional electrons. The symmetries and π or σ character of the calculated molecular orbitals are determined by the symmetry character of the group IVB and oxygen atomic states. The energies of the bonding and anti-bonding states are determined in the usual manner, e.g., by calculation of the coulomb integrals [11.53, 11.54]. The top of the valence band is associated with non-bonding π orbitals of oxygen atom 2p-states, and the first two conduction bands are associated with transition metal 4d-states. In order of increasing energy these conduction bands have the symmetries t2g(π∗) and eg(σ∗), which are a direct consequence of the octahedral bonding arrangement. The next conduction band is derived from transition metal 5s-states with a1g(σ∗) character. The energy separation between the top of the valence band and the a1g(σ∗) band edge defines a band gap with essentially the same energy as that of non-transition metal insulating oxides such as MgO or Al2O3, ∼ 8–9 eV. The symmetry of the Ti orbital that contributes to this band gap is anti-bonding s∗, essentially the same symmetry as the lowest conduction band in SiO2. In all of the transition metal and rare earth oxides, and silicate and aluminate alloys, the lowest conduction band states are associated with d∗-orbitals of the respective transition metal and rare earth atoms. The ordering of the lowest d∗-state conduction bands in crystalline TiO2 has been verified by electron energy loss and X-ray spectroscopies, which confirm the relative sharpness of the t2g(π∗) and eg(σ∗) bands, and the increased width of the a1g(σ∗) band [11.55, 11.56]. The experiments demonstrate that the spatial localization of the transition metal atomic d∗-states results in a solid state broadening that is significantly less than their energy separation relative to transition metal s∗-states, and significantly less than the separation of the anti-bonding states derived from the t2g and eg d-states [11.55, 11.56].

Fig. 11.6. Molecular orbital energy level diagram for a group IV transition metal, e.g., Ti, in an octahedral bonding arrangement with six oxygen neighbors. Labels in the diagram: atomic states TM n+1 p, TM n+1 s, TM n d and O 2p (σ, π); molecular orbitals, from top to bottom, t1u(σ∗, π∗) 6, a1g(σ∗) 2, eg(σ∗) 4, t2g(π∗) 6, t1g + t2u 12, t1u(σ, π) 6, t2g(π) 6, eg(σ) 4, t1u(σ, π) 6, a1g(σ) 2; gaps Eg and E′g


Fig. 11.7. (a) Lowest optical band gap energy as a function of the energy difference between the transition metal n d atomic state energy and the O 2p atomic state energy. (b) Conduction band offset energy as a function of the difference in energy between the atomic n+1 s and n d states. Both are for representative transition metal oxides

There are several aspects of the energy band scheme in Fig. 11.6 that are important for band gap and conduction band offset scaling: i) the symmetry character of the highest valence bonding states, non-bonding O 2p π-states with an orbital energy approximately equal to the energy of the atomic O 2p state, ii) the weak π-bonding of the transition metal atoms, which establishes that the lowest anti-bonding state is close in energy to the atomic n d state of the transition metal atom, and iii) the energy separation between the n d and n+1 s derived anti-bonding states, which is correlated with the difference between the atomic n d and n+1 s states. These aspects of the energy band scheme in Fig. 11.6 have been verified in ab initio calculations on small neutral clusters which include the transition metal atom and O-atom neighbors [11.57]. Figure 11.7 contains plots of (a) the lowest optical band gap and (b) the conduction band offset energies, both from the papers of Robertson [11.58, 11.59], versus the absolute value of the energy of the transition metal atomic n d state in the s^2 d^(γ−2) configuration appropriate to insulators. γ = 3 for the group IIIB transition metals, Sc, Y and Lu(La), and the rare earth lanthanides, and γ = 4 for the group IVB transition metals Ti, Zr and Hf. The linearity of these plots supports the qualitative universality of the energy band scheme of Fig. 11.6. The band gap scaling displays a slope of approximately one between Ti and Y, indicating quantitative agreement with the energy band scheme of Fig. 11.6, and more importantly with the ab initio calculations of [11.60] that are discussed in the next section. The band offset energy between the conduction band of Si and the empty anti-bonding or conduction band states of a high-k gate dielectric is important in metal-oxide-semiconductor (MOS) device performance and reliability. It defines the barrier for direct tunneling, and/or thermal emission of electrons from an n+ Si substrate into a transition metal oxide.
In alloys such as Al2O3–Ta2O5, or SiO2–ZrO2, it also defines the energy of localized transition metal trapping states relative to the Si conduction band [11.62–11.64]. The next sub-section extends the ab initio calculations of SiO2 to local bonding arrangements of Zr atoms in ZrO2 and Zr silicate alloys, (ZrO2)x(SiO2)1−x.

11.4.2 Extension of Ab Initio Calculations to Transition Metal Oxides

Following the methods of previous studies that have addressed the electronic structure of transition metal ion complexes [11.53–11.55], the ab-initio results presented in this chapter have been based on relatively small clusters with at least two shells of near-neighbors. The approach is essentially the same as that applied to SiO2, comprised of calculations at the self-consistent field (SCF) Hartree-Fock level followed by a configuration interaction (CI) refinement. These calculations have focused on the splittings between the TM d-states, ∆(d∗1,d∗2), that comprise the lowest energy conduction band states, and the energy separation between the lower TM d-state, d∗1, and the spectral peak of the next conduction band that is associated with TM s∗-states, ∆(d∗1,s∗), as indicated in Figs. 11.8 and 11.9 [11.60]. These calculations have been applied to the four transitions indicated in Fig. 11.9 for ZrO2. One of these transitions, the Zr M2,3 transition between Zr 3p1/2 and 3p3/2 core states and the Zr 4d∗-doublet and the Zr 5s∗-state, is dipole allowed and essentially intra-atomic in character [11.64, 11.65]. The cluster used to calculate the ∆(d∗1,d∗2) splitting and ∆(d∗1,s∗) energy separation for this transition is centered on a Zr-atom, and includes eight O-atom neighbors terminated by H-atoms. This Zr-centered cluster has also been used to calculate ∆(d∗1,d∗2) and ∆(d∗1,s∗) for the Zr K1 transition between the Zr 1s core state and the Zr 4d∗-doublet and the Zr 5s∗-state.
This transition is not dipole-allowed, and involves a mixing with O 2p∗ conduction band states and characterizing it as inter-atomic in character. The remaining two transitions that have been studied experimentally, the O K1 edge, and the optical band gap, involve inter-atomic transitions from O 2p π non-bonding states the top of the valence to conduction band final states that are mixture O 2p∗ -states with the Zr 4d∗ -doublet and the Zr 5s∗ -conduction band states [11.64, 11.65]. The cluster used to calculate the ∆(d∗1 ,d∗2 ) splitting and ∆(d∗1 ,s∗ ) energy separation for these transitions is centered on an O-atom that has four Zr-neighbors. These Zr-atoms have four O-atom neighbors that are terminated by Zr-pseudo-atoms following a procedure that replicates the one used to terminate the SiO2 cluster in Fig. 11.2 with Si∗ -pseudo-atoms. The details of these calculations have been addressed in [11.60]. These calculations also identify the valence band states. This calculation gives a value for the Zr-O bond length in excellent agreement with experiment. The valence band states are referenced to the O 2p π non-bonding states that comprise the top of the valence band. The upper most valence band has two distinct

11 Electronic Structure of Alternative High-k Dielectrics

331

Fig. 11.8. Schematic representation of valence and conduction structure of transition metal oxides

Fig. 11.9. Schematic representation of Zr M2,3 , Zr K1 , O K1 and band edge transitions to the Zr 4d∗ doublet the Zr 5s∗ band

features: one associated with Zr 4d π-states, and the second with Zr 4d σ-states. There is also a much deeper valence band associated with the Zr 4s bonding and anti-bonding states. Figure 11.8 is a schematic representation of the valence band structure and conduction band structure that identifies the energies of the valence band states relative to the O 2p π non-bonding states, and the ∆(d∗1,d∗2) splitting and ∆(d∗1,s∗) energy separation.

G. Lucovsky and J.L. Whitten

Table 11.2. Comparisons of d-state splittings and d-s energy separations between ab initio calculations and experimental results

                 ab initio calculations (±0.2 eV)   experimental results (±0.3 eV)
spectrum         ∆(d∗1,d∗2)    ∆(d∗1,s∗)            ∆(d∗1,d∗2)    ∆(d∗1,s∗)
Zr K1            2.7           13.7                 ∼3            ∼13
Zr M2            2.7           11.3                 2.3           11.7
Zr M3            2.7           11.3                 2.3           11.9
O K1 (Zr)        4.3           10.6                 3.2           10.1
band edge        1.3           7.2                  1.4           not measured

d-state widths
                 ab initio calculations (±0.2 eV)   experimental results (±0.2 eV)
spectrum         ∆d∗1          ∆d∗2                 ∆d∗1          ∆d∗2
band edge        0.3           1.3                  0.4           1.5
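The comparison in Table 11.2 can be made explicit with a short script. The following is a minimal sketch, not part of the original analysis: it transcribes the table entries, treats the approximate entries "∼3" and "∼13" as 3.0 and 13.0, and (as a labeled assumption) uses the simple sum of the quoted uncertainties, 0.2 eV + 0.3 eV, as the threshold for flagging a difference.

```python
# Values transcribed from Table 11.2 (eV). None marks "not measured".
# Assumption: the approximate entries "~3" and "~13" are taken as 3.0 and 13.0.
TABLE_11_2 = {
    "Zr K1":     {"calc": (2.7, 13.7), "expt": (3.0, 13.0)},
    "Zr M2":     {"calc": (2.7, 11.3), "expt": (2.3, 11.7)},
    "Zr M3":     {"calc": (2.7, 11.3), "expt": (2.3, 11.9)},
    "O K1 (Zr)": {"calc": (4.3, 10.6), "expt": (3.2, 10.1)},
    "band edge": {"calc": (1.3, 7.2),  "expt": (1.4, None)},
}

CALC_UNCERTAINTY = 0.2  # eV, quoted for the ab initio values
EXPT_UNCERTAINTY = 0.3  # eV, quoted for the experimental values
# Assumption: flag a difference only if it exceeds the sum of both uncertainties.
THRESHOLD = CALC_UNCERTAINTY + EXPT_UNCERTAINTY


def deviations(table):
    """Return calc - expt for (d-d splitting, d-s separation) per spectrum."""
    result = {}
    for spectrum, row in table.items():
        result[spectrum] = tuple(
            None if expt is None else round(calc - expt, 2)
            for calc, expt in zip(row["calc"], row["expt"])
        )
    return result


def flagged(table, threshold=THRESHOLD):
    """List spectra in which any measured deviation exceeds the threshold."""
    return [
        spectrum
        for spectrum, devs in deviations(table).items()
        if any(d is not None and abs(d) > threshold for d in devs)
    ]


if __name__ == "__main__":
    for spectrum, devs in deviations(TABLE_11_2).items():
        print(f"{spectrum:10s} d-d: {devs[0]}  d-s: {devs[1]}")
    print("exceeding threshold:", flagged(TABLE_11_2))
```

A different threshold, for example the uncertainties added in quadrature, can be passed to `flagged` without changing the rest of the sketch.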

The ab initio calculations have also been applied to HfO2 and TiO2, where preliminary results also yield good agreement with experiment. These calculations are being extended to complex oxides, here defined as binary alloys between TM and RE oxides, or between two different TM or RE oxides, in which there are equal numbers of TM and/or RE atom pairs. This ensures that the electronic states of these atoms are coupled through bonding to the same O-atom, as in GdScO3, HfTiO4 and LaLuO3. Table 11.2 summarizes the results of these calculations. It includes the ∆(d∗1,d∗2) splittings and ∆(d∗1,s∗) energy separations for the four transitions that will be addressed in Sect. 11.5, as well as the energies of the valence band features, which will also be addressed in the same section. In addition, the table includes the experimentally determined ∆(d∗1,d∗2) splittings and ∆(d∗1,s∗) energy separations, as well as the energy separation of the 4d π- and 4d σ-features in the upper valence band. Each of these is mixed with O 2p states. This chapter includes results of the calculations which have addressed: i) the valence band electronic structure of TiO2 and ZrO2, ii) transitions between 3p core states and anti-bonding 4d∗ and 5s∗ states in ZrO2, as well as the corresponding p to d∗, s∗ transitions for TiO2 and HfO2, iii) the O K1 edge transitions in TiO2, ZrO2 and HfO2, and iv) band edge transitions in the same three group IVB metal oxides. The transitions in ii) are intra-atomic in character, whilst those in iii) and iv) include inter-atomic contributions as well, i.e., the final states are O 2p∗ states that are mixed with the d∗ and s∗ states of the respective group IVB transition metal atoms. Figure 11.8 summarizes the results of the ground state calculations for the TM valence and conduction bands, and Fig. 11.9 compares the intra- and inter-atomic transitions for ZrO2. In the next section of the paper these calculations are compared with experimental results.

Fig. 11.10. Valence band spectra for ZrO2 and HfO2 . The dashed line at ∼ −3.8 eV is the top of the valence band, and the other two dashed lines indicate the approximate energies of the spectral features associated with 4(5) d π-bonding states, and 4(5) d σ-bonding states for Zr(Hf)O2

11.5 Experimental Studies of Electronic Structure

11.5.1 Valence Band

Figure 11.10 includes the valence band spectra for ZrO2 and HfO2 as determined by UPS [11.66]. The dashed lines in this figure indicate the position of the band edge relative to the Fermi level of the spectrometer. The dashed line at approximately −3.8 eV is the valence band edge associated with O 2p π non-bonding states. The next two dashed lines are assigned, on the basis of the ab initio calculations discussed above, to Zr(Hf) 4d(5d) π states and Zr(Hf) 4d(5d) σ states that overlap the respective O 2p π and σ states. The experimental energy differences of approximately 3.5 ± 0.2 eV for ∆1, and 5.0 ± 0.2 eV for ∆2, are in excellent agreement with the respective calculated differences of 3.4 eV and 4.6 eV. The agreement between the ab initio calculations and experiment for the valence band states of HfO2 is essentially the same, to within an experimental uncertainty of about ±0.2 eV.

11.5.2 Anti-bonding Conduction Band States of TM Oxides

Figure 11.11a, b, c, and d present, respectively, the Zr M2,3, Zr K1 and O K1 spectra and the band edge optical absorption constant for crystallized thin film ZrO2. The Zr M2,3 spectrum in Fig. 11.11a is an intra-atomic spectrum. The energy difference between the spectral peaks of the two d∗-states, ∆(d∗1,d∗2), and that between the spectral peak of the first d∗ state and the s∗ state, ∆(d∗1,s∗), are in good agreement (±0.3 eV) with ab initio calculations based on small clusters with central Zr atoms and two shells of atomic neighbors, and are included in Table 11.2. The relative intensities of the d∗ and s∗ features are markedly different, and are

Fig. 11.11. Electronic structure for ZrO2 from XAS measurements: (a) Zr M2,3 , (b) Zr K1 , and (c) O K1 spectra, and from vacuum uv spectroscopic ellipsometry measurements: (d) energy dependence of the band edge optical absorption constant. The XAS spectra are plotted as relative absorption versus the x-ray photon energy

consistent with the radial wave functions of the initial and final states in this transition. The Zr K1 and O K1 spectra in Fig. 11.11b and c are inter-atomic, with final states reflecting the mixing of O 2p∗ states with Zr 4d∗ and 5s∗ states. Agreement with ab initio calculations is again approximately ±0.3 eV, and is indicated in Table 11.2. The Zr K1 spectrum for ZrO2 is, in principle, also an intra-atomic spectrum; however, since transitions from the Zr 1s-state to Zr 4d∗ and 5s∗ states are not dipole-allowed, the Zr K1 edge spectrum is qualitatively similar to the O K1 edge spectrum, in which the final states are a mixture of i) Zr 4d∗ and 5s∗ states, and ii) O 2p∗ states. The doublet 4d∗ features are not resolved in Fig. 11.11c, but the average d∗-s∗ energy difference is ∼10 eV, again in agreement with the calculations. Figure 11.11d presents the band edge absorption constant, determined from spectroscopic ellipsometry data, for

ZrO2. The energy difference between the two d∗ features, and their relative spectral widths, are in excellent agreement with the calculations, as indicated in Table 11.2. As noted above, the calculations for the O K1 edge and band edge final states utilized clusters centered on O-atoms. Qualitatively similar spectra have been obtained for the p-state absorption in TiO2 and HfO2, the Ti L2,3 and Hf N2,3 edges, and for the respective O K1 edges as well. The relative intensities of features within the individual spectra, and between these spectra and the corresponding M2,3 edge of Zr, are all consistent with the intra-atomic character of these transitions. In contrast, the O K1 spectra show differences in the relative amplitudes of the respective d∗ features consistent with the 6-fold coordination of Ti, and the 8-fold coordination of Zr and Hf. In addition, the differences in energy between the first d∗ peaks are the same, to within an experimental uncertainty of ∼0.3 eV, as the differences in the experimentally determined band gaps [11.4]. There is also a direct correlation between the features in the O K1 spectra of TiO2, ZrO2 and HfO2 and the energy of the atomic states for the electronic configuration appropriate to oxides, nd2(n+1)s2, where n = 3, 4 and 5, respectively, for Ti, Zr and Hf. As discussed below, this has important implications for the scaling of optical band gaps, Eg, and conduction band offset energies with respect to Si, EB.

11.5.3 TM and RE Alloys

There are two different types of alloys that have been considered as replacement high-k dielectrics. The first are pseudo-binary TM or trivalent RE silicate and aluminate alloys, which are composed of TM or RE oxides and either SiO2 or Al2O3. In this section, only the silicate alloys are addressed. The second group of alloys are complex oxides, defined above as binary alloys between TM and RE oxides, or between two different TM or RE oxides, in which there are equal numbers of TM and/or RE atom pairs.
Representative examples include GdScO3, HfTiO4 and LaLuO3. These two classes of alloys are differentiated by the nature of the conduction band states. In the TM and RE silicate and aluminate alloys the lowest conduction band states comprise a TM or RE d∗-state doublet, as in the respective TM or RE oxides, followed by two conduction bands with s∗-symmetry, one associated with the TM or RE atom, and the second with the Si or Al atom. As in the TM and RE oxides, all TM and RE conduction band states are mixed with O 2p∗-states. The lowest Si and Al conduction band states are predominantly 3s∗ in character, with reduced mixing with O anti-bonding states. Figure 11.12a displays the O K1 edge in a Zr silicate alloy as deposited, and after a 900 °C anneal in which the homogeneous as-deposited non-crystalline alloy separates into ZrO2 nano-crystallites encapsulated by non-crystalline SiO2 [11.60]. Consider first the phase-separated silicate. Based on comparisons with the O K1 edge spectra for ZrO2, and arguments based on a molecular orbital model for Zr silicate alloys, the first two features in these silicate spectra are associated, in order of increasing energy, with O 2p∗ states

Fig. 11.12. (a) Comparison between O K1 spectra for a Zr silicate alloy with 60% ZrO2 (x = 0.6) as-deposited and annealed at 900◦ C. (b) Hf silicate O K1 spectra for as-deposited non-crystalline films with varying ratios of SiO2 to HfO2

coupled to i) Zr 4d∗-states and ii) Si 3s∗-states. The molecular orbital state associated with O 2p σ-Si 3s σ-bonding in the valence band is much deeper in energy than the O 2p π-Zr 4d π valence band state, and the symmetries of these orbitals, π versus σ, are sufficiently different that the Zr 4d∗- and Si 3s∗-states do not mix. The first two dashed lines indicate transitions that terminate in Zr 4d∗-states, and the next dashed line indicates transitions that terminate in the Si 3s∗ band states. These Si 3s∗ band states, as well as the Zr 5s∗ band states, are individually mixed with anti-bonding O 2p∗-states. These assignments, based on spectra of Zr silicate alloys with x ranging from 0.3 to 0.6, indicate that i) the difference in energy between the localized Zr atom 4d∗ features and the more extended Si 3s∗ states is not dependent

on the alloy composition, whereas ii) the relative amplitudes of the Zr and Si features scale with alloy composition. This observation is consistent with the results of X-ray photoelectron spectroscopy (XPS) studies presented later in this chapter. The spectra in Fig. 11.12b are for non-crystalline Hf silicate alloys, and are qualitatively similar to the Zr silicate spectrum for the as-deposited non-crystalline state in Fig. 11.12a. The Hf silicate spectra demonstrate that the energy difference between the Hf 5d∗ states and the onset of transitions involving Si 3s∗ states is independent of alloy composition. Additionally, differences between Hf silicate and aluminate spectra are consistent with an approximately 1 eV red shift of the Al2O3 conduction band states relative to those of SiO2, as determined from band offset calculations and measurements. The conduction and valence band offset energies of Zr silicate alloys are discussed in Sect. 11.5. For TM and RE silicate and aluminate alloys, the conduction band offset energies are essentially the same as those of the respective end-member TM and RE oxides. Figure 11.13a and b are the O K1 edge spectra of two crystalline complex oxides: HfTiO4, prepared by reactive evaporation of the constituent TM atoms, Ti and Hf, and DyScO3, respectively. Figure 11.13c, d and e are energy level diagrams that can be applied to d-state mixing in complex oxides. A mean-field model was initially proposed for complex oxides. This model of d-state coupling predicts that band gaps and conduction band offset energies are averages of those of the constituent oxides. This model, shown in Fig. 11.13c, requires that the triad of atoms be collinear, and that coupled pairs of d-states are relatively close in energy. This model is not supported by high-resolution studies of the O K1 edges of HfTiO4, DyScO3 and GdScO3 in Fig. 11.13a and b.
Mean field theory predicts two pairs of d-states, one doubly degenerate with e-symmetry, and one triply degenerate with t-symmetry, whereas the spectra of the complex oxides in Fig. 11.13a and b display three strong d∗-state features. The mean field model does not take into account differences in local coordination and symmetry at the two metal atom sites. The multiplicity of final d-states in the O K1 edge is the same as at the conduction band edge, establishing that the O K1 edge spectra, and the theory/model that explains their scaling with atomic d-state energies, are also applicable to band edge properties including conduction band offset energies. Deficiencies in mean-field theory are remedied in the energy level diagrams of Fig. 11.13d and e. This approach includes differences in the atomic coordination of the two atoms of the complex oxide, e.g., 8-fold for Hf and 6-fold for Ti, and 12-fold for Dy or Gd and 6-fold for Sc in Fig. 11.13d, with distortions from octahedral symmetry, as at Sc atom sites, added in Fig. 11.13e. Relative energies of uncoupled e and t d∗-states have been approximated from the XAS O K1 spectra of the elemental oxides. The model in Fig. 11.13d predicts four states: two “t” states separated by > 10 eV, and two closely spaced “e” states that are expected to contribute to only one feature in the XAS spectra. A splitting of the triply degenerate “t”-states, into

Fig. 11.13. XAS O K1 edge spectra for (a) HfTiO4, and (b) DyScO3. Empirical energy level schemes for d-state coupling in complex oxides: (c) mean-field, virtual crystal model, (d) model that includes different atomic coordinations only, and applies to HfTiO4, and (e) model that includes different atomic coordinations and a strong distortion at one atomic site, Sc, and applies to DyScO3 and GdScO3

a doubly degenerate “e” state and a non-degenerate “a” state, which can result from a ferroelectric distortion at the 6-fold coordinated sites, introduces two additional states and is shown in Fig. 11.13e. The XAS O K1 edge for HfTiO4 displays three features. As applied to this HfTiO4 spectrum, the model in Fig. 11.13d predicts four features, two of which are closely spaced and not observable as a resolvable doublet in the XAS spectrum. Reduced local symmetry at the Ti-atom site can lead to a splitting of the “t” states and to band edge tails; however, these have not been reported. When applied to O K1 spectra for GdScO3 and DyScO3, the model in Fig. 11.13e accounts for the three strong features, at approximately 531 eV, 537 eV and 541 eV, as well as the substantially weaker feature at ∼539.5 eV. The lowest energy a-state of this model is not detected in the O K1 spectrum, but is detected in GdScO3, another complex oxide that has the same crystalline symmetry as DyScO3, by optical transmission, and by spectroscopic ellipsometry as well. This state is also responsible for band-tail effects in internal photoemission (IPE) and photoconductivity (PC) studies. O K1 edges for non-crystalline and crystalline LaAlO3, in which only the La atom has valence bonding d-states, are qualitatively different. In general, non-crystalline complex oxides with stoichiometric compositions in which both atoms have valence band d-states display qualitatively different O K1 edge spectra, without the strong d-state coupling of their crystalline counterparts. Off-stoichiometry non-crystalline compositions are dominated by the oxide with the smaller band gap, leading to markedly different properties for TiO2-rich and HfO2-rich alloys of the HfO2-TiO2 alloy system.
In summary, non-crystalline complex oxides yield only marginal improvements with respect to their constituent elemental oxides, and crystalline complex oxides are expected to show increased trapping due to symmetry reductions and the formation of deep trapping states.

11.5.4 XPS and AES Results for Zr Silicates

A detailed and comprehensive study of XPS and AES measurements is presented in [11.49] for Zr silicate alloys, (ZrO2)x(SiO2)1−x. Figure 11.14 summarizes the results of XPS measurements of O 1s, Si 2p, and Zr 3d core level binding energies for the end-member oxides, SiO2 and ZrO2, and for thirteen pseudo-binary oxide alloy compositions distributed approximately equally over the entire alloy composition range. These are for as-deposited thin films. Studies of films annealed at 500 °C in Ar display essentially the same spectra, whereas films annealed at 900 °C show evidence for chemical phase separation into SiO2 and ZrO2, independent of whether the phase separation is accompanied by crystallization [11.67]. Figure 11.14a indicates the compositional dependence of the O 1s binding energy. The sigmoidal character of the plot is a manifestation of mixed coordination for O-atoms, as anticipated by the discussion above relative to the classification scheme for oxides based on bond ionicity. The coordination of

Fig. 11.14. XPS chemical shifts of (a) O 1s, (b) Si 2p and (c) Zr 3d5/2 core levels from as-deposited (300◦ C) (ZrO2 )x (SiO2 )1−x alloys as a function of composition, x

oxygen increases from 2 in SiO2 to 3 for the 50% ZrO2 chemically-ordered alloy that defines the stoichiometric silicate composition, ZrSiO4. Derivative XPS spectra, displayed in [11.41], confirm that the sigmoidal dependence is due to mixed coordination. Finally, the total shift in the O 1s core level binding energy between SiO2 and ZrO2 is 2.45 ± 0.1 eV. Figures 11.14b and 11.14c display similar plots for the Si 2p and Zr 3d5/2 core levels. The Si 2p data in Fig. 11.14b show a linear dependence consistent with a single atomic coordination of four, and a total shift of 1.85 ± 0.1 eV between the end member elemental oxides, SiO2 and ZrO2. Note that these core level shifts are in the same direction, with the values at the SiO2 end of the alloy regime being more negative. As discussed in [11.41], this is consistent with partial charges calculated on the basis of electronegativity equalization [11.44]. The data for the compositional dependence of the Zr 3d5/2 core level show some additional structure for low values of x. Consider first the total change in binding energy across the alloy system. This is 1.85 ± 0.1 eV, essentially the same as for the Si 2p level. This means that the slopes of the plots in Figs. 11.14b and 11.14c in the linear regime are the same as well. The equality of these slopes is also consistent with the principle of electronegativity equalization [11.52]. More importantly, the equivalence of the slopes is also consistent with the XAS data for Zr silicate alloys. Parallel shifts in core level spectra are equivalent to the 4d∗ anti-bonding states of Zr and the 3s∗ band peak of Si maintaining a constant energy separation as a function of alloy composition, as shown in Fig. 11.12. Finally, the departure from linearity for x < 0.4 in Fig. 11.14c has been assigned to the change in the nature of the chemical bonding at the Zr site as a function of alloy composition [11.52].
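The increase in oxygen coordination with composition follows from simple bond counting. The sketch below is an idealized illustration, not the authors' calculation: it assumes four-fold coordinated Si and eight-fold coordinated Zr at all compositions (the latter being the assumption the chapter itself makes for Zr), counts every metal-oxygen bond equally, and splits the resulting average between the two nearest integer coordinations. The function names are hypothetical.

```python
def mean_oxygen_coordination(x, zr_coord=8, si_coord=4):
    """Average number of metal neighbours per O atom in (ZrO2)x(SiO2)1-x.

    Assumption: every metal-oxygen bond is counted once, whatever its character.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in the interval [0, 1]")
    metal_oxygen_bonds = zr_coord * x + si_coord * (1.0 - x)  # per formula unit
    oxygen_atoms = 2.0  # ZrO2 and SiO2 each contribute two O atoms
    return metal_oxygen_bonds / oxygen_atoms


def coordination_fractions(x):
    """Split the mean O coordination into the two nearest integer coordinations."""
    mean = mean_oxygen_coordination(x)
    lower = min(int(mean), 3)   # 2 for x < 0.5, 3 for x >= 0.5
    upper_fraction = mean - lower
    return {lower: 1.0 - upper_fraction, lower + 1: upper_fraction}


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(x, mean_oxygen_coordination(x), coordination_fractions(x))
```

With these assumptions the mean coordination is 2 + 2x, giving 2 for SiO2, 3 at the ZrSiO4 composition (x = 0.5) and 4 for ZrO2.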
The coordination of Zr has been assumed to be eight, independent of alloy composition; however, these eight oxygen atoms are not all equivalent with respect to bonding neighbor coordination and electronic structure. The number of ionic Zr–O bonds associated with network disruption increases from four to eight with increasing x for alloys in the SiO2-rich bonding regime. In this alloy regime, each O-atom makes at least one Zr–O bond with a bond order of one in a Si–O–Zr arrangement, and there must be at least four of these arrangements. The remainder of the eight-fold coordination is made up of donor-acceptor pair electrostatic bonds with bridging O-atoms of the non-disrupted portion of the SiO2 continuous random network. These weaker bonds have been modeled in ab initio calculations as components of a dipolar electrostatic field, and can alternatively, and equivalently, be described as donor-acceptor pair or dative bonds. The donor-acceptor bonds are replaced by Si–O–Zr ionic bonding arrangements as x increases and the network disruption increases. At a composition of x = 0.5, network disruption is essentially complete, the O-atom coordination is three, and the bond order of the Zr atoms is formally one-half, with all bonds between eight-fold coordinated Zr4+ ions and terminal O-atoms of

silicate ions, (SiO4)4−. Each of the terminal O atoms of a silicate ion makes bonds with two Zr4+ ions. Ab initio calculations discussed in [11.52] have been used to identify the effects of the donor-acceptor pair bonds on the Zr core level shifts. In this model calculation, the Zr-atom has four OH-groups in a tetrahedral arrangement to emulate the ionic bonds, and four tetrahedrally-grouped water molecules, with the O-atom non-bonding p-electron pair aligned in the direction of the Zr-atom, to emulate the donor-acceptor pair bonding interaction. The calculations indicated that bonding is optimized at an effective inter-atomic spacing of ∼0.26 to 0.28 nm between the Zr-atoms and the bridging O-atoms of the network. The minimum is broad and shallow, opening up the possibility of a spread in inter-atomic spacing, where bond-strain and configurational entropy are likely to also be contributing factors in determining a statistical distribution of these bonding arrangements in a non-crystalline solid. The calculations indicate a positive shift in the Zr 1s binding energy as a function of the inter-atomic spacing between Zr- and bridging O-atoms. The calculations also indicate that the effects of the donor-acceptor pair bond on Zr core levels are equivalent to those of a dipole field. The effect of the donor-acceptor pair bonds, or dipole fields, is to reduce the binding energy of the Zr 1s core state. Since all of the core states move rigidly with respect to the Zr 1s state, this calculation explains the direction of the non-linearity of the Zr 3d5/2 core state in Fig. 11.14c. A more detailed calculation, which substitutes OH-terminated Si–O–Si groups for the water molecules, is in progress [11.54] and will be used to make more quantitative comparisons between shifts in binding energies in Fig. 11.14c and ab initio calculations.
However, scaling the values for core shifts in the H–O–H model with the ratio of dielectric constants and relative electronegativities of Si–O–Si groups predicts shifts of the order of 0.15 to 0.2 eV, comparable to what has been found in the analysis of the XPS results. AES measurements on the as-deposited films were performed on-line immediately following film deposition. AES chemical shifts of OKVV and ZrMVV transitions as a function of composition for derivative spectra are shown in Fig. 11.15. They show nearly identical non-linear behaviors that are qualitatively different from, and therefore complementary to, the XPS chemical shifts of the O 1s and Zr 3d5/2 core level binding energies shown in Figs. 11.14a and 11.14c, respectively. The compositional dependence of the AES peak kinetic energy values displays a marked sigmoidal non-linearity. Finally, due to spectral overlap between the ZrMVV and SiLVV features in the AES spectra, it was not possible to track the compositional dependence of the AES SiLVV feature. The chemical shifts of the Auger electron kinetic energies for OKVV and ZrMVV transitions in the as-deposited films are consistent with changes in the calculated partial charges and their effects on the O and Zr core state energies, i.e., the kinetic energies of the Auger electrons increase with increasing x, reflecting the decreases in the negative XPS binding energies, i.e., shifts

Fig. 11.15. AES chemical shifts of (a) OKVV and (b) ZrMVV kinetic energies in as-deposited (ZrO2)x(SiO2)1−x alloys as a function of composition. The plots in (a) and (b) are for the highest energy peaks in the respective AES derivative spectra. The solid lines are polynomial fits that are intended to emphasize the sigmoidal character of the compositional dependence

to less negative values. The differences between the XPS and AES spectral features derive from differences between the XPS and AES processes. Following [11.58], the AES electrons of Fig. 11.15 originate in the valence band, whereas the XPS electrons of Figs. 11.14a and 11.14b originate in the respective core states with no valence band participation. This is addressed below, where the non-linear behavior of the AES features is shown to reflect systematic shifts in valence band energy with increasing O-atom coordination.

The XPS and AES results are combined with determinations of valence band offset energies for SiO2 and ZrO2 [11.68] to generate an empirical model for the compositional variation of valence band offset energies with respect to Si. The OKVV transition in amorphous SiO2 has been investigated theoretically, and it has been shown that the highest kinetic energy AES feature is associated with two electrons being released from the non-bonding O 2p π states at the top of the valence band; one of these is the AES electron, and the second fills the O 1s core hole generated by electron beam excitation [11.69]. Based on this mechanism, the XPS and AES results of this study have been integrated into a model that provides an estimate of valence band offsets with respect to Si as a function of alloy composition. For an ijk AES A-atom transition, the kinetic energy of the AES electron, EK(A, ijk), is related to the XPS binding energies, EB(A, i), EB(A, j), and EB(A, k), and a term Ω(A) that includes all final state effects:

$$E_K(A, ijk) = E_B(A, i) - E_B(A, j) - E_B(A, k) - \Omega(A). \tag{11.6}$$
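As a worked special case, offered as a sketch rather than as the authors' own derivation: suppose both final-state holes lie in the same valence level v (j = k = v), and read the third term of (11.6) as the binding energy of level k, as the sentence introducing the equation describes. Then (11.6) can be solved for the binding energy of the valence level. The label v is introduced here only for illustration.

```latex
\begin{align}
E_K(A, ivv) &= E_B(A, i) - 2\,E_B(A, v) - \Omega(A), \\
E_B(A, v)   &= \tfrac{1}{2}\left[\,E_B(A, i) - E_K(A, ivv) - \Omega(A)\,\right].
\end{align}
```

Under this reading, half the difference between a core level XPS binding energy and the corresponding AES kinetic energy tracks the valence level energy, up to the final state term Ω(A).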

Applied to the OKVV transition, A = O, i = K (O 1s), and j, k = L (O 2p π non-bonding). Equation (11.6) is the basis for an empirical model for the energy of the Zr silicate valence band edge with respect to vacuum, and then with respect to Si, both as functions of the alloy composition. If EB(O 1s) is the XPS binding energy, and EK(OKVV) is the average kinetic energy of the Auger electron with respect to the top of the valence band edge, then the offset energy, VOFFSET(x), is given by

$$V_{\mathrm{OFFSET}}(x) \approx -A \times 0.5 \times \left(E_B(\mathrm{O\,1s}) - E_K(\mathrm{O}_{KVV})\right) + B, \tag{11.7}$$

where x is the alloy composition, and A and B are determined from the experimental valence band offsets of 4.6 eV for SiO2 and 3.1 eV for ZrO2 [11.68, 11.70]. This model is presented in Fig. 11.16, and the sigmoidal shape is determined by the relative compositional dependencies of the XPS (O 1s) and AES (OKVV) results in Figs. 11.14a and 11.15a. The analysis has also been applied to the ZrMVV AES and Zr 3d5/2 XPS results of Figs. 11.14c and 11.15b, and gives essentially the same compositional dependence as is displayed in Fig. 11.16, but with different empirical constants, A′ and B′. The weakly sigmoidal dependence is a manifestation of the discreteness of the O-atom coordinations as a function of the alloy composition: a mixture of 2-fold and 3-fold for x < 0.5, and of 3-fold and 4-fold for x > 0.5. Figure 11.17 contains plots of the average conduction and valence band offset energies of Zr silicate alloys. These plots combine the XAS results of Fig. 11.12 with the model of (11.7), and the experimentally determined band gaps for SiO2, ∼9 eV, and ZrO2, ∼5.6 eV. This approach demonstrates that essentially all of the band gap variation occurs in the valence band offsets, so that the offset energies of the respective Zr 4d∗ states and Si 3s∗ states are constant to < ±0.2 eV with respect to the conduction band edge of Si. The contributions of these Zr and Si states to the conduction band density of

Fig. 11.16. Calculated values of the valence band offset energies relative to the valence band of crystalline Si at ∼ −5.2 eV, as calculated from the two parameter empirical model of (11.7). The plots are derived from O atom XPS and AES data. The sigmoidal dependence results from differences between the compositional dependencies of the respective XPS and AES results used as input, and not from the empirical constants
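The two-parameter calibration behind (11.7) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the constants A and B are fixed by requiring the model to reproduce the quoted end-member offsets of 4.6 eV (SiO2) and 3.1 eV (ZrO2), and the XPS and AES input energies used in the example are hypothetical placeholders, not measured values from this chapter. All function names are introduced here for illustration only.

```python
def valence_level(e_b_o1s, e_k_okvv):
    """Half the difference of XPS binding and AES kinetic energy, as in (11.7)."""
    return 0.5 * (e_b_o1s - e_k_okvv)


def calibrate(level_sio2, level_zro2, offset_sio2=4.6, offset_zro2=3.1):
    """Solve offset = -A * level + B at the two end-member compositions."""
    if level_sio2 == level_zro2:
        raise ValueError("end-member levels must differ to determine A and B")
    a = -(offset_sio2 - offset_zro2) / (level_sio2 - level_zro2)
    b = offset_sio2 + a * level_sio2
    return a, b


def valence_band_offset(e_b_o1s, e_k_okvv, a, b):
    """Evaluate the empirical model of (11.7) for one alloy composition."""
    return -a * valence_level(e_b_o1s, e_k_okvv) + b


if __name__ == "__main__":
    # HYPOTHETICAL placeholder energies in eV (not data from the chapter).
    sio2 = {"e_b": 533.00, "e_k": 507.0}
    zro2 = {"e_b": 530.55, "e_k": 510.0}
    a, b = calibrate(
        valence_level(sio2["e_b"], sio2["e_k"]),
        valence_level(zro2["e_b"], zro2["e_k"]),
    )
    print("A =", a, "B =", b)
    print("SiO2 offset:", valence_band_offset(sio2["e_b"], sio2["e_k"], a, b))
    print("ZrO2 offset:", valence_band_offset(zro2["e_b"], zro2["e_k"], a, b))
```

Because A and B are fixed at the end members, any curvature at intermediate compositions comes only from the composition dependence of the XPS and AES inputs, which is the point made in the caption of Fig. 11.16.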

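[Figure 11.17: band diagram for x = 0.0 (SiO2), x ∼ 0.5 and x = 1.0 (ZrO2), showing the Zr 5s∗, Si 3s∗ and Zr 4d∗ conduction band states, the Si bandgap and the O 2p nb π valence band states; conduction band offset labels 3.2 eV and 1.4 eV; valence band offset labels 4.6(4.4) eV, 3.8(3.9) eV and 3.1(3.3) eV]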

Fig. 11.17. Band edge electronic structure for SiO2 , an x = 0.5 Zr silicate alloy and ZrO2

[Figure 11.18: plot for Hf silicates; vertical axis: epsilon1 and E05 bandgaps (eV), 6 to 11; horizontal axis: HfO2 composition, x, 0 to 1; curves labeled "eps1 gap" and "E05 gap"]

Fig. 11.18. Compositional dependence of the peak in ε1, the real part of the complex dielectric constant, and of the E05 band gap, the photon energy at which the absorption constant, α, equals 1 × 10^5 cm−1, extracted from spectroscopic ellipsometry data for Hf silicate alloys, SiO2 and HfO2

states are proportional to their alloy concentrations, with qualitative differences in these states playing a significant role in determining direct tunneling currents. Qualitatively similar results have been obtained for Hf silicate alloys using the same XPS and AES approach [11.71]. The results presented in Fig. 11.17, and those presented in [11.49], indicate that the variation in the respective band gaps is reflected entirely in the valence band offset energy with respect to Si. Figure 11.18 is a plot of the effective band gap for Hf silicate alloys obtained from the analysis of vacuum UV spectroscopic studies. It contains plots of the E05 band gap, defined as the photon energy corresponding to an absorption constant α = 10^5 cm−1, and of the spectral peak of the real part of the complex dielectric constant, ε1 [11.70]. The non-linearity of the plot reflects the complex nature of the band edge states, localized Hf 5d∗ and extended Si 3s∗ states, whose relative amplitudes change systematically as a function of alloy composition. In [11.73], absorption edge measurements were analysed for Zr and Hf silicate alloys using a Tauc band edge representation in which

$$\alpha = (h\nu - E_{\mathrm{opt}})^2 \tag{11.8}$$

where hν is the photon energy, and Eopt is an effective band gap. The results displayed in [11.73] give essentially the same alloy dependence as shown in Fig. 11.18.
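A Tauc-type analysis of the form (11.8) is commonly carried out by plotting the square root of α against photon energy and extrapolating the linear region to zero. The sketch below illustrates that procedure under stated assumptions: it allows a constant prefactor c, so that α = [c (hν − Eopt)]², which (11.8) as written does not include, and it uses synthetic data rather than measurements from [11.73]. The function names are hypothetical.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching (x, y) points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tauc_gap(photon_energies_ev, alphas):
    """Estimate E_opt by fitting sqrt(alpha) = c * (h*nu - E_opt).

    Returns (E_opt, c). Only points in the linear regime above the edge
    should be supplied; selecting that regime is left to the user.
    """
    roots = [math.sqrt(a) for a in alphas]
    slope, intercept = linear_fit(photon_energies_ev, roots)
    return -intercept / slope, slope


if __name__ == "__main__":
    # Synthetic example: E_opt = 5.7 eV and c = 300 are arbitrary test values.
    energies = [6.0, 6.2, 6.4, 6.6, 6.8]
    alphas = [(300.0 * (e - 5.7)) ** 2 for e in energies]
    e_opt, c = tauc_gap(energies, alphas)
    print("E_opt (eV):", e_opt, "prefactor:", c)
```

With real data the fitted value of Eopt depends on which energy window is treated as linear, so the window should be reported alongside the result.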

11.5.5 Trapping at Transition Metal Atoms in Al2O3–Ta2O5 Alloys

Studies of the electrical properties of Ta aluminate alloys have provided another example of fixed energy differences between transition metal atom 5d∗-states and s∗ conduction band states of the aluminum oxide alloy constituent [11.62, 11.63]. Alloys of Al2O3 and Ta2O5 with different concentrations of Ta2O5 have been prepared by remote plasma assisted deposition onto hydrogen terminated Si(100) substrates, and then incorporated into MOS capacitors. Figure 11.20 indicates the leakage current as a function of inverse temperature. Flat band voltage shifts, and hysteresis in capacitance-voltage, C–V, traces in the low temperature regime, are consistent with electron trapping, whereas C–V data in the high temperature regime are consistent with emission out of these trapping states, i.e., a Poole–Frenkel bulk transport mechanism. The activation energy in the low temperature regime in Fig. 11.20, ∼0.3 eV, for electron trapping is assigned to the difference in energy between the Si conduction band and the lowest lying t2g(π∗) states of six-fold coordinated Ta atoms, and the activation energy at higher temperatures, ∼1.5 eV, is assigned to emission out of these Ta d-state traps into the conduction band states of the Al2O3 matrix. These states have a predominantly s-like character. The two activation energies are consistent with the band offset energy of Ta2O5 with respect to Si, ∼0.3 eV, as determined from X-ray photoelectron spectroscopy, XPS, and the energy difference between the band offset energies of Al2O3 and Ta2O5 with respect to Si, ∼1.7 eV, as measured by XPS [11.68].
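Activation energies of this kind are usually extracted from the slope of ln J against 1/T. The following is a minimal sketch of that step, assuming simple Arrhenius behavior, J = J0 exp(−Ea/kT), within each temperature regime; it ignores any field dependence of a Poole–Frenkel mechanism, and the data in the example are synthetic, not read from Fig. 11.20. The function name is hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def activation_energy(temperatures_k, current_densities):
    """Return E_a (eV) from a least-squares fit of ln(J) versus 1/T.

    Assumes J = J0 * exp(-E_a / (k_B * T)) over the supplied points.
    """
    xs = [1.0 / t for t in temperatures_k]
    ys = [math.log(j) for j in current_densities]
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching (T, J) points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx          # equals -E_a / k_B for Arrhenius behaviour
    return -slope * K_B_EV_PER_K


if __name__ == "__main__":
    # Synthetic low-temperature branch generated with E_a = 0.31 eV.
    temps = [300.0, 320.0, 340.0, 360.0]
    currents = [1.0e3 * math.exp(-0.31 / (K_B_EV_PER_K * t)) for t in temps]
    print("E_a (eV):", activation_energy(temps, currents))
```

In practice the low and high temperature branches would be fitted separately, since each regime is assigned to a different process.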

[Figure 11.19: C/A (µF/cm²) versus VG − VFB (V); traces at 50 °C, 100 °C, 150 °C and 200 °C]

Fig. 11.19. Temperature dependent capacitance-voltage (C–V) plots for a Ta2O5–Al2O3 alloy dielectric, x ∼ 0.4. The direction of the hysteresis for the 50 °C trace is maintained up to 150 °C, but decreases in magnitude, and is reversed for the 200 °C trace, consistent with trapping and trap release and with the activation plots in Fig. 11.20

G. Lucovsky and J.L. Whitten

[Figure 11.20: J (A/cm²) versus 1000/T (K⁻¹); the two branches of the plot are labelled 1.44 eV and 0.31 eV]

Fig. 11.20. Tunneling current density versus inverse temperature, 1/T, for a capacitor with a Ta2O5–Al2O3 alloy dielectric with 40% Ta2O5
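Activation energies such as those quoted for Fig. 11.20 are obtained from the slope of ln J plotted against 1/T. The sketch below illustrates that step on synthetic data generated with an assumed activation energy of 0.31 eV; the prefactor J0, the temperature grid and the function name are hypothetical and do not reproduce the measured data.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def activation_energy_ev(temps_k, currents):
    """Fit ln(J) against 1/T by least squares and return E_a = -slope * k_B in eV."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(j) for j in currents]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -(sxy / sxx) * K_B_EV

# Synthetic single-branch data: J = J0 * exp(-E_a / (k_B * T)), with assumed J0.
E_A_TRUE = 0.31
J0 = 1.0e-2
temps = [300.0, 320.0, 340.0, 360.0, 380.0]
currents = [J0 * math.exp(-E_A_TRUE / (K_B_EV * t)) for t in temps]

e_a_fit = activation_energy_ev(temps, currents)
print(f"fitted activation energy = {e_a_fit:.3f} eV")
```

When two regimes are present, as in Fig. 11.20, each branch would be fitted separately over its own temperature window.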

11.6 Interface Electronic Structure Applied to Direct Tunneling in Silicate Alloys

In order to reduce direct tunneling in MOS devices with equivalent oxide thickness, EOT, less than 1.5 nm, and extending below 1 nm, there has been a search for alternative dielectrics with significantly increased dielectric constants, K, allowing increases in physical thickness proportional to K, and thereby significantly reducing direct tunneling. However, significant increases in K to values of 15 to 25 in transition metal and rare earth oxides are generally accompanied by decreases in the conduction band offset energy with respect to Si, EB, and in the effective electron tunneling mass, meff. This tradeoff between increases in K and decreases in EB and meff is quantified by the introduction of a figure of merit, Φm, given by

$$\Phi_m = K\,(E_B \cdot m_{\mathrm{eff}})^{0.5} \qquad (11.9)$$

where K, EB and m∗eff are respectively, the dielectric constant, the conduction band offset energy, and the effective electron tunneling mass [11.2, 11.60]. The expectation was that increased values of K, which permit the use of physically thicker films for the same EOT as SiO2 , would provide significant reductions in direct tunneling allowing scaling to continue to at least an EOT of 1.0 nm, and hopefully to values of EOT approaching 0.5 nm. The discussion presented above has demonstrated that conduction band offset energies for high-K dielectrics containing transition metal atoms are at most 1 eV less than SiO2 for group IIIB and rare earth oxides, silicates and aluminates, and in group IVB materials they are reduced further to at least 1.5 eV. These limitations assume that EB must be greater than at least 1 eV. Since the

[Figure 11.21: tunneling mass, m*eff (mo), versus conduction band offset (eV); data points for vacuum, SiO2, Si3N4, Y2O3 and HfO2]

Fig. 11.21. Electron tunneling mass versus conduction band offset as determined from the Franz two-band model. The solid line is for dielectrics with extended s∗ conduction bands, and the dashed line is for localized d∗-state band edges

tunneling figure of merit includes a dependence on meff as well, it is necessary to understand the relationship between EB and meff. Based on a new approach for experimental determination of EB meff products, as discussed in [11.74], EB meff has been determined for HfO2 to be 0.23 ± 0.01 mo eV. Based on the spectroscopic studies for ZrO2, and their extensions to other transition metal and rare earth oxides, a value of EB = 1.5 eV has been inferred for HfO2, and for Hf silicate and aluminate alloys as well. Using this value of EB ∼ 1.5 eV for the Si–HfO2 conduction band offset energy gives meff = 0.15 ± 0.02 mo, in good agreement with other analyses of tunneling through HfO2 films [11.74]. Next, it is important to comment on the low value of the tunneling mass for HfO2, and its impact on direct tunneling in Hf silicate alloys. This mass is significantly smaller than the mass of ∼ 0.55 mo for SiO2, and it is important to first understand the microscopic origin of this difference, and then determine its effect on tunneling in Hf silicate alloys. Figure 11.21 contains a plot of tunneling mass versus band offset energy that is consistent with the Franz two-band model of [11.75] and [11.76]. The masses for vacuum, SiO2, Al2O3 and Si3N4 dielectrics fall on a straight line, along with the extrapolated mass for Y2O3; however, the mass for HfO2 does not. The Franz two-band model is an effective mass approximation that works best when the conduction and valence band states are extended and free electron like. This is the case for SiO2 and Al2O3, where the lowest conduction band states are 3s∗ anti-bonding states, but not for transition metal oxides with d∗-state bands; however, the overlap of these states with transition metal s∗ states differs and is proportional to the difference between the atomic nd and (n+1)s states of the transition metal; n is the principal quantum number

equal to 5 for Hf and 4 for Y. The point for Y2O3 falls on the plot for the oxides with extended free electron like conduction band states, and the point for HfO2 is well removed from this fit to the data, due primarily to differences in s-d overlap, which is greater in Y2O3. Finally, it has been shown in [11.74] that the low value of meff = 0.15 mo coupled with an EB of ∼ 1.5 eV gives a minimum tunneling current for a given EOT in the middle of the silicate alloy regime, whereas for Y silicates, the higher values of both meff ∼ 0.25 mo and EB ∼ 2.3 eV give a minimum tunneling current at the Y2O3 composition. Fig. 11.22a displays this result for the compositional dependence of tunneling current density at an oxide bias of one volt as calculated using the WKB approximation [11.77–11.79]. The plots in Fig. 11.22b provide the important connection between the tunneling figure of merit, Φm, and the calculated tunneling currents. The plots in Fig. 11.22b are for the figure of merit, Φm, where K, EB and meff have been computed for Si oxynitride alloys, Hf silicate alloys and Zr silicate alloys using compositionally averaged values of K, EB and meff. A plot for the Si oxynitride alloy system is shown for reference. The values of K, EB and meff for SiO2 for this model calculation are, respectively, 3.8, 3.15 eV, and 0.55 mo, and the corresponding values for Si3N4 are 7.6, 2.15 eV, and 0.25 mo. The corresponding values for HfO2 and Y2O3 have been included in Fig. 11.22b. The plot for Si oxynitrides shows a relatively small variation across the alloy system, and accounts for the relatively small decreases in direct tunneling that are obtained on alloying. However, these are still significant for device scaling, as discussed below. More important are the differences between the compositional dependence for Y silicate and Hf silicate alloys.
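As a rough illustration of how the figure of merit behaves across an alloy system, the sketch below evaluates it for Si oxynitride using the SiO2 and Si3N4 end-member values quoted above. It reads Eq. (11.9) as Φm = K (EB meff)^0.5, the form suggested by the EB meff products discussed in the text, and it assumes that "compositionally averaged" means a linear average in the alloy fraction x; both the averaging rule and the function names are assumptions made here for illustration.

```python
import math

def figure_of_merit(k, e_b_ev, m_eff):
    """Phi_m = K * sqrt(E_B * m_eff), with E_B in eV and m_eff in units of m_o."""
    return k * math.sqrt(e_b_ev * m_eff)

def mix(a, b, x):
    """Linear compositional average between end members a (x = 0) and b (x = 1)."""
    return (1.0 - x) * a + x * b

# End-member values quoted in the text: (K, E_B in eV, m_eff in m_o).
SIO2 = (3.8, 3.15, 0.55)
SI3N4 = (7.6, 2.15, 0.25)

def oxynitride_phi(x):
    """Phi_m for an assumed (SiO2)_(1-x)(Si3N4)_x alloy with linearly averaged parameters."""
    k, e_b, m = (mix(a, b, x) for a, b in zip(SIO2, SI3N4))
    return figure_of_merit(k, e_b, m)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f}  Phi_m = {oxynitride_phi(x):.2f}")
```

Under these assumptions the figure of merit changes by less than 20% across the oxynitride system and is largest at intermediate compositions, in line with the qualitative description of the Si oxynitride plot.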
The monotonically increasing function for Y silicates predicts that tunneling with respect to SiO2 will be reduced over the entire alloy range, whereas the qualitatively different behavior for Hf silicates predicts that the tunneling current in Hf silicate alloys will display a minimum in the middle of the alloy system. The plots in Fig. 11.22a are for the direct tunneling current in n+-dielectric-N+-poly-Si structures at a bias in excess of one volt above flat band for substrate accumulation. The calculation takes into account the potential drops across the poly-Si and the channel region, and there is a potential drop of 1 volt across the dielectric for the gate potential used in the evaluation of the current density. The doping concentration in the substrate was 2.5 × 10¹⁷ cm⁻³, and in the poly-Si, 9 × 10¹⁹ cm⁻³. The values of the computed tunneling current density are independent of these values for n+ and N+ because of the corrections made for the potential drops in the poly-Si and channel regions of the dielectric stack. The differences between the calculated compositional variations of direct tunneling in Hf and Y silicate alloys underscore the importance of determining the (EB)(meff) product for high-K dielectrics, which can be accomplished through the novel approach identified in this article. The correlation between the tunneling figures of merit in Fig. 11.22b and the calculated tunneling currents in Fig. 11.22a is evident in the complementary nature of the plots. In particular, the plot for Hf silicate alloys, which applies to Zr silicate as well, indicates that the tunneling current is a minimum in the middle of the alloy, paralleling the behavior for Si oxynitride alloys. These behaviors are in agreement with experimental results for both the Si oxynitride and the Hf silicate alloys. The behavior for Y (and other group IIIB and trivalent RE silicates) is qualitatively different, with the minimum in tunneling occurring at the elemental oxide composition. 
Figures 11.23 and 11.24 compare experimental results for MOSCAPs with Si oxynitride alloys [11.80] and Hf silicate alloys [11.81], respectively. The agreement between the calculated compositional dependence and the experimental points is generally very good. There are issues that relate to the

Fig. 11.22. (a) Compositional dependence of calculated tunneling at 1 volt oxide bias for alloys with EOT = 1.2 nm. (b) Compositional dependence of tunneling figure of merit for alloys in (a)

[Figure 11.23: tunneling current (A/cm²) versus alloy composition, x, for Si oxynitride alloys; EOT ∼ 2.0 nm (2.0 ± 0.2 nm)]

Fig. 11.23. Comparison between calculated tunneling current and experiment for Si oxynitride alloys

[Figure 11.24: tunneling current (A/cm²) versus alloy composition, x, for Hf silicate alloys; the legend gives EOT ∼ 1.0 nm]

Fig. 11.24. Comparison between calculated tunneling current and experiment for Hf silicate alloys

interfacial oxide/silicate layers, and to the way they are included in the models, that must be addressed in comparisons between experiments and theoretical models. However, the point to be emphasised here is not so much the comparison of the magnitudes of the direct tunneling currents, but rather the compositions at which the minimum tunneling occurs. The experimental data clearly indicate that these minima are in the mid alloy composition range for both the Si oxynitride and Hf silicate alloys, providing direct experimental evidence in excellent agreement with the tunneling calculations presented in Fig. 11.22a. The effects of these transition regions on EOT have been identified in [11.2], and the relevant plot is presented in Fig. 11.25. This is a plot of

[Figure 11.25: equivalent oxide thickness (nm) versus physical thickness (nm); curves for k = 7.6, 12 and 20]

Fig. 11.25. EOT versus physical thickness for dielectrics with k = 7.6, 12 and 20 with and without an interfacial transition region, EOTtr = 0.35 nm

EOT versus physical thickness of the high-k dielectric for abrupt interfacial transitions, and for the inclusion of a nitrided Si oxide transition region that contributes about 0.35 nm to EOT. The primary effect of the transition region is to reduce the physical thickness of the high-k dielectric by an amount, ∆tphys, given by

$$\Delta t_{\mathrm{phys}} = \mathrm{EOT}_{\mathrm{tr}}\,\frac{K(j)}{K(\mathrm{SiO_2})} \qquad (11.10)$$

where EOTtr is the transition region contribution to EOT, ∼ 0.35 nm, K(j) is the dielectric constant of the alternative dielectric, and K(SiO2) is the dielectric constant of SiO2, ∼ 3.8–3.9, so that ∆tphys ∼ 0.092 K(j) in units of nm. If this correction is applied, then the very low direct tunneling reported for Pr2O3 may be taken as evidence for the validity of this approach to rare earth lanthanide oxides, and in particular to the compositional dependence shown in Fig. 11.24 [11.82].
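The thickness penalty of Eq. (11.10) can be checked numerically for the three dielectric constants plotted in Fig. 11.25. The sketch below assumes the usual series relation EOT = EOTtr + tphys K(SiO2)/K, which is not written out explicitly in the text, and takes K(SiO2) = 3.8; the function names are hypothetical.

```python
K_SIO2 = 3.8      # dielectric constant of SiO2 (the text quotes 3.8-3.9)
EOT_TR_NM = 0.35  # transition-region contribution to EOT, in nm

def eot_nm(t_phys_nm, k, eot_tr_nm=0.0):
    """EOT of a high-k layer of physical thickness t_phys plus an optional transition region."""
    return eot_tr_nm + t_phys_nm * K_SIO2 / k

def delta_t_phys_nm(k, eot_tr_nm=EOT_TR_NM):
    """Physical thickness given up to the transition region at fixed EOT, Eq. (11.10)."""
    return eot_tr_nm * k / K_SIO2

for k in (7.6, 12.0, 20.0):
    print(f"k = {k:4.1f}  delta_t_phys = {delta_t_phys_nm(k):.2f} nm")
```

For k = 7.6, 12 and 20 this gives roughly 0.7, 1.1 and 1.8 nm, so the higher the dielectric constant, the more physical thickness is lost to a fixed 0.35 nm transition region.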

11.7 Conclusion

This chapter has addressed two important aspects of high-K dielectrics that lead to limitations on their application in aggressively scaled devices, independent of other issues related to reliability, process integration and manufacturing cost. The first limitation derives from their fundamental electronic

structure in which the lowest conduction band states are localized d∗-states. The energies of these states relative to the top of the valence band in oxides, and in silicate and aluminate alloys, are significantly less than those of the extended s∗-state conduction bands in SiO2 and other non-transition metal and rare earth oxides such as Al2O3 and MgO. The theoretical calculations and spectroscopic studies of this article indicate that transition metal and rare earth oxide-based dielectrics also have significantly reduced conduction band offset energies relative to SiO2. Based on these intrinsic electronic structure properties, the only oxides, and silicate and aluminate alloys, remaining as viable candidates for replacement dielectrics are those of Hf, Zr, Y, La and the lanthanide trivalent rare earths. This list may be extended to include complex mixed oxides composed of mixtures of transition metal and rare earth oxides in which d-state mixing promotes conduction band offset energies > 1 eV, and preferably greater than 1.5 eV. Low band offset energies are also generally accompanied by low tunneling electron masses, and the combination of these partly cancels the benefit of the increases in physical thickness associated with increased K. Other limitations for high-K alternative dielectrics, not specifically discussed in this chapter, derive from their ionic bonding, which results in i) increased bond induced stress in silicate and aluminate alloys, ii) high values of interface fixed charge, iii) oxygen atom or ion transport in a number of oxides including ZrO2, HfO2, Y2O3 and La2O3, and finally, iv) the challenge of replacing polycrystalline Si gate electrodes with dual metal gate electrodes that yield the symmetric threshold voltages required in CMOS devices. These issues are under active study by several research groups, and important research results should be forthcoming in the next year or two.
The path to finding a high-K replacement for SiO2 and Si oxynitride gate dielectrics is fraught with many intrinsic obstacles that derive from the qualitative differences of the transition metal and rare earth high-K candidate dielectrics. As discussed in this article, several important differences are intrinsic in nature, and derive from i) an increased ionic character, and ii) differences in conduction band edge states: localized d∗-states for the transition metal and rare earth dielectrics, in contrast with extended s∗-states for SiO2 and Si oxynitride alloys.

Acknowledgments. Research supported by the Office of Naval Research, the Air Force Office of Scientific Research, the SEMATECH/Semiconductor Research Corporation Front End Processes Center, and the Semiconductor Research Corporation. Finally, one of the authors (GL) acknowledges the contributions of his graduate students and post doctoral fellows to much of the research described in this chapter, and documented in the references.

References

11.1. G. Wilk, R.W. Wallace and J.M. Anthony, J. Appl. Phys. 89, 5243 (2001)
11.2. G. Lucovsky, in Extended Abstracts of the 6th Workshop on Formation, Characterization, and Reliability of Ultrathin Silicon Oxides, January 26-27, 2001, Atagawa Heights, Japan, p. 5
11.3. D.E. Aspnes and J.B. Theeten, J. Electrochem. Soc. 127, 1359 (1980)
11.4. D.E. Aspnes et al., Phys. Rev. Lett. 43, 1046 (1979)
11.5. J.W. Keister, J.E. Rowe, J.J. Kolodziej, H. Niimi, H.S. Tao, T.E. Madey and G. Lucovsky, J. Vac. Sci. Technol. A 17, 1250 (1999)
11.6. M.D. Ulrich, J.G. Hong, J.E. Rowe, G. Lucovsky, A.-Y. Chan and T.E. Madey, J. Vac. Sci. Technol. B 21, 1777 (2003)
11.7. G. Lucovsky and J.C. Phillips, Appl. Phys. A – Materials & Processing 78, 453 (2004)
11.8. P. Boolchand, in Insulating and Semiconducting Glasses (World Scientific, Singapore, 2000), p. 191
11.9. P. Boolchand, D.G. Georgiev and M. Micoulaut, J. Optoelectronics and Adv. Mater. 4, 823 (2002)
11.10. G. Lucovsky and J.C. Phillips, J. Vac. Sci. Technol. B 22 (2004) (in press); G. Lucovsky, J.P. Maria and J.C. Phillips, J. Vac. Sci. Technol. B 22 (2004) (in press)
11.11. R. Zallen, The Physics of Amorphous Solids (John Wiley and Sons, New York, 1983), Chap. 2
11.12. F.L. Galeener and G. Lucovsky, Phys. Rev. Lett. 37, 1474 (1976)
11.13. R.J. Bell and P. Dean, Discuss. Faraday Soc. 50, 55 (1970); in Amorphous Materials, edited by R.W. Douglas (Wiley-Interscience, London, 1972), p. 443
11.14. J.C. Phillips, J. Non-Cryst. Solids 34, 153 (1979)
11.15. J.C. Phillips, J. Non-Cryst. Solids 43, 37 (1981)
11.16. J.L. Whitten, Y. Zhang, M. Menon and G. Lucovsky, J. Vac. Sci. Technol. B 20, 1710 (2002)
11.17. D.L. Griscom, in The Physics of SiO2 and Its Interfaces, edited by S.T. Pantelides (Pergamon Press, New York, 1978), p. 232
11.18. R.L. Mozzi and B.E. Warren, J. Appl. Cryst. 2, 164 (1969)
11.19. L. Robertson and S. Moss, J. Non-Cryst. Solids 106, 330 (1988)
11.20. G. Lucovsky, L.S. Sremaniak, T. Mowrer and J.L. Whitten, J. Non-Cryst. Solids 326, 1 (2003)
11.21. R.B. Laughlin and J.D. Joannopoulos, Phys. Rev. B 16, 2942 (1977)
11.22. D.J. Chadi, R.B. Laughlin and J.D. Joannopoulos, in [11.7], p. 55
11.23. I.P. Batra, in [11.17], p. 65
11.24. S.T. Pantelides and W.A. Harrison, Phys. Rev. B 13, 2667 (1976)
11.25. A.G. Revesz and G.V. Gibbs, in The Physics of MOS Insulators, edited by G. Lucovsky, S.T. Pantelides and F.L. Galeener (Pergamon Press, New York, 1980), p. 92
11.26. G. Lucovsky and H. Yang, J. Vac. Sci. Technol. A 15, 836 (1997)
11.27. J.L. Whitten and H. Yang, Int. J. Quantum Chem., Quantum Chem. Symp. 29, 41 (1995)
11.28. J.L. Whitten and H. Yang, Surf. Sci. Rep. 24, 55 (1996)
11.29. J.L. Whitten and M. Hackmeyer, J. Chem. Phys. 51, 5584 (1969)
11.30. J. Neufeind and K.-D. Liss, Bur. Bunsen Phys. Chem. 100, 1341 (1996)
11.31. H.R. Philipp, Solid State Commun. 44, 73 (1966)
11.32. S.T. Pantelides, in [11.17], p. 80
11.33. C. Senemaud and M.T. Costa Lima, in [11.9], p. 85
11.34. T.H. DiStefano, in [11.9], p. 362
11.35. G. Lucovsky, IBM J. Res. and Develop. 43, 301 (1999), and references therein
11.36. G. Lucovsky, H. Yang, H. Niimi, J.W. Keister, J.E. Rowe, M.F. Thorpe and J.C. Phillips, J. Vac. Sci. Technol. B 18, 1742 (2000)
11.37. F.J. Himpsel, F.R. McFeely, A. Taleb-Ibrahimi, Y.A. Yarmoff and G. Hollinger, Phys. Rev. B 38, 6084 (1988)
11.38. J.W. Keister, J.E. Rowe, Y.-M. Lee, H. Niimi, G. Lucovsky and G.J. Lapeyre, unpublished
11.39. H. Yang, H. Niimi, J.W. Keister, G. Lucovsky and J.E. Rowe, IEEE Electron Device Lett. 21, 76 (2000)
11.40. D.R. Lee, G. Lucovsky, M.R. Denker and C. Magee, J. Vac. Sci. Technol. A 13, 1671 (1995)
11.41. H. Niimi and G. Lucovsky, J. Vac. Sci. Technol. A 17, 3185 (1999); B 17, 2610 (1999)
11.42. M. Weldon, K.T. Queeny, Y.J. Chabal, B. Stefanov and K. Raghavachai, J. Vac. Sci. Technol. B 17, 1795 (1999)
11.43. D.E. Muller et al., Nature 399, 758 (1999)
11.44. L.C. Feldman, I. Stensgaard, P.J. Silverman and T.E. Jackman, in [11.17], p. 339
11.45. G. Lucovsky and J.C. Phillips, J. Non-Cryst. Solids 227, 1221 (1998)
11.46. G. Lucovsky, Y. Wu, H. Niimi, V. Misra and J.C. Phillips, Appl. Phys. Lett. 74, 2005 (1999)
11.47. G. Lucovsky, A. Rozaj-Brvar and R.F. Davis, in The Structure of Non-Crystalline Materials 1982, edited by P.H. Gaskell, J.M. Parker and E.A. Davis (Taylor and Francis, London, 1983), p. 193
11.48. B. Rayner, H. Niimi, R. Johnson, R. Therrien, G. Lucovsky and F.L. Galeener, AIP Conf. Proc. 550, 149 (2001)
11.49. G.B. Rayner Jr., D. Kang, Y. Zhang and G. Lucovsky, J. Vac. Sci. Technol. B 20, 1748 (2002)
11.50. J.C. Phillips and X. Kerner, Solid State Commun. 117, 47 (2001)
11.51. L. Pauling, The Nature of the Chemical Bond, 3rd edn (Cornell University Press, Ithaca, NY, 1936)
11.52. R.T. Sanderson, Chemical Bonds and Bond Energy (Academic Press, New York, 1971)
11.53. H.B. Gray, Electrons and Chemical Bonding (W.A. Benjamin, New York, 1962), Chap. 9
11.54. C.J. Ballhausen and H.B. Gray, Molecular Orbital Theory (W.A. Benjamin, New York, 1964), Chap. 8
11.55. P.A. Cox, Transition Metal Oxides (Oxford Science Publications, Oxford, 1992)
11.56. L.A. Grunes, R.D. Leapman, C.D. Walker, R. Hoffman and A.B. Kunz, Phys. Rev. B 25, 7157 (1982)
11.57. Y. Zhang, unpublished
11.58. J. Robertson and C.W. Chen, Appl. Phys. Lett. 74, 1164 (1999)
11.59. J. Robertson, J. Vac. Sci. Technol. B 18, 1785 (2000)
11.60. G. Lucovsky, Microelectronic Reliability 43, 1417 (2003)
11.61. R.S. Johnson, G. Lucovsky and J. Hong, J. Vac. Sci. Technol. A 19, 1353 (2001)
11.62. R.S. Johnson, G. Lucovsky and J. Hong, J. Vac. Sci. Technol. B 19, 1606 (2001)
11.63. R.S. Johnson, G. Lucovsky and J.G. Hong, Microelectronic Engineering 59, 385 (2001)
11.64. G. Lucovsky, G.B. Rayner Jr., D. Kang, G. Appel, R.S. Johnson, Y. Zhang, D.E. Sayers, H. Ade and J.L. Whitten, Appl. Phys. Lett. 79, 1775 (2001)
11.65. G. Lucovsky, Y. Zhang, G.B. Rayner Jr. et al., J. Vac. Sci. Technol. B 20, 1739 (2002)
11.66. C.C. Fulton, unpublished
11.67. G.B. Rayner, D. Kang and G. Lucovsky, J. Vac. Sci. Technol. B 21, 1783 (2003)
11.68. S. Miyazaki, M. Narasak, M. Ogasawaga and M. Hirose, Microelectronic Engineering 59, 373 (2001)
11.69. D.E. Raymaker, J.S. Murday, N.H. Turner, C. Moore and M.G. Legally, in [11.9], p. 99
11.70. S. Miyazaki and M. Hirose, AIP Conf. Proc. 550, 89 (2000)
11.71. J.G. Hong, PhD Dissertation, North Carolina State University (2004)
11.72. N.A. Stoute, G. Lucovsky and D.E. Aspnes, unpublished
11.73. H. Sato, T. Nango, T. Miyagawa, T. Katagiri, K.S. Seol and Y. Ohki, J. Appl. Phys. 92, 1106 (2002)
11.74. C.L. Hinkle, J.G. Hong and G. Lucovsky, Microelectronic Engineering 72, 257 (2004)
11.75. W. Franz, Handbuch der Physik, Vol. XVIII, edited by S. Flugge (Springer, Berlin, 1965), p. 155
11.76. J. Maserjian, J. Vac. Sci. Technol. 11, 996 (1974)
11.77. K.F. Schuegraf, C.C. King and C.-M. Hu, 1992 VLSI Symposium
11.78. H.-Y. Yang, H. Niimi and G. Lucovsky, J. Appl. Phys. 83, 2327 (1998)
11.79. E.M. Vogel et al., IEEE Trans. Elec. Dev. 45, 1350 (1998)
11.80. H. Yang and G. Lucovsky, IEEE-IEDM Digest, p. 245 (1999)
11.81. I. Kim, S. Han, J.G. Hong, C. Osburn and G. Lucovsky, unpublished
11.82. H.H. Osten et al., in Extended Abstracts International Workshop on Gate Insulator, Nov. 1-2, 2001, Tokyo, p. 100

12 Physicochemical Properties of Selected 4d, 5d, and Rare Earth Metals in Silicon

A.A. Istratov and E.R. Weber

12.1 Introduction

The continuous downscaling of the integrated circuit node size requires thinning of the SiO2 gate oxide to less than 1.5 nm. At this thickness, silicon dioxide starts losing its dielectric properties since direct tunneling of carriers through the dielectric dominates over its bulk properties. Intensive research is being conducted in the area of alternative gate dielectric materials with high dielectric constants (k) in order to utilize substantially thicker dielectric layers for the same gate capacitance, thus preventing the leakage current and reliability problems. Besides a high dielectric constant (10–20 or higher), the new dielectric materials should have a sufficiently large conduction band offset, ∆EC, to prevent tunneling, should form a high-quality interface to Si and have good film morphology, be compatible with the gate material, semiconductor processing temperatures and operating conditions, be stable in direct contact with silicon, and must produce reliable MOSFETs [12.1–12.4]. A variety of heavy metal oxides and silicates, including ZrO2, HfO2, ZrSiO4, HfSiO4, Al2O3, (Ba,Sr)TiO3 (BST), SiOxNy, SrTiO3, CeO2, Pr2O3, PrO2, Y2O3, La2O3, Ta2O5, and TiO2, have been suggested as potential candidates to replace SiO2 in the gate stacks for the next generations of integrated circuits (see, e.g., [12.1–12.3, 12.5–12.14]). Most of these materials are either oxides or silicates of 4d or 5d transition metals or rare earth elements. The possibility of thermal instability and reactions of the high-k dielectrics with silicon is a serious concern. Aside from the changes in the thickness or dielectric permittivity of the gate dielectric, instability of heavy metal oxides/silicates may lead to diffusion of these metals into the silicon substrate, which may detrimentally affect the device performance.
In fact, even a metal diffusion length of several hundred nanometers from the wafer surface may be sufficient to alter the properties of MOS devices. Unfortunately, it is very difficult to evaluate the likelihood of device degradation by small amounts of heavy metals because their electrical properties, solubilities, and diffusion coefficients in silicon are poorly investigated or unknown. In this chapter, we review the current state of knowledge of fundamental physical properties in silicon of the heavy metals which are considered for fabrication of high-k dielectrics: Zr, Hf, Ba, Sr, Ti, Ce, Pr, Y, La, and Ta. Since the data on the properties of these elements in silicon are scarce, the scope of this review

was intentionally made broader to include not only these elements, but also their neighbors in the periodic table which belong to the 4d, 5d, and rare earth groups of elements and, due to their similar atomic radius and electron structure, may be expected to have similar properties.

12.2 Crystal Lattice Site of 4d, 5d, and Rare Earth Metals in Silicon

The crystal site of a metal impurity in silicon determines both the electrical properties of the impurity and its diffusion coefficient. Substitutional impurities typically form a donor and an acceptor state (or even multiple acceptor states) in the silicon band gap, and diffuse very slowly. Interstitial impurities as a rule diffuse faster, and usually form donor levels. In most cases, metal impurities dissolve in silicon on both the interstitial and substitutional sites. The interstitial fraction of the metal can vary in a wide range depending on the individual physicochemical properties of the metal and concentration of the silicon native defects, such as vacancies or interstitials, which can be injected in the wafer by a high temperature anneal in an appropriate ambient. For example, light 3d transition metals such as iron or copper are predominantly interstitial (with typically less than 1% of the total dissolved concentration on substitutional sites), whereas heavy metals such as gold prefer substitutional sites, using interstitial sites mainly for diffusion [12.15–12.17]. There are almost no systematic experimental or theoretical data on the preferred lattice sites of 4d, 5d, and rare earth metals in silicon, besides Au, Ag, Pd and Pt. For the vertical series in the periodic table (Cu, Ag, Au) a clear trend towards increasing substitutional fraction for increasing atom size can be observed [12.18]. Beeler and Scheffler [12.19] performed theoretical calculations for 4d transition elements from Zr to Ag and concluded that these metals are more stable in the substitutional rather than in the interstitial position, in agreement with the general trend that heavy metals with large atomic radius do not easily fit in the interstitial sites. However, their calculations [12.19] indicated that the interstitial site might be partly occupied as well.
To the best of our knowledge, direct evidence that even heavy metals may occupy interstitial sites was obtained only for erbium (Er), which was proposed to be predominantly (about 90%) located in the interstitial position [12.20]. This agrees with the theoretical predictions for Er³⁺ [12.21]. A strong donor activity of a number of impurities, such as Tb [12.22], Yb [12.23], Er [12.24–12.26], Dy and Ho [12.27–12.29], can also be considered as an indication of the interstitial site of a fraction of these impurities. These observations give reasons to expect that 4d, 5d, and rare earth metals are likely to co-exist in interstitial and substitutional states.

[Figure 12.1: metal solubility (cm⁻³) versus 1000/T (K⁻¹), with a temperature scale (°C) on the top axis; curves and data points for Yb, Er, Pm, Mo, Zr and Hf (substitutional and interstitial), Cu, Fe and Cr]

Fig. 12.1. Solubility of Yb (thick solid line) [12.23], Mo [12.30], Er [12.31], Pm [12.32], and Zr and Hf [12.33]. The data for Mo, Er, Pm, Zr, and Hf are approximate values, reported by the authors of [12.30–12.33] as estimates or lower limits. The solubilities of Cu, Fe, and Cr are shown as a reference (after Weber [12.18]). The arrows next to the data points for Mo, and for interstitial Zr and Hf, indicate that the plotted data are lower and upper solubility limits, respectively

12.3 Solubility of 4d, 5d, and Rare Earth Metals in Silicon

There are very few data on the solubility of 4d, 5d, and rare earth metals in silicon. The data known to us are plotted in Fig. 12.1 and are as follows. Bakhadyrkhanov et al. [12.23] reported that the solubility of Yb in the temperature range from 1000 °C to 1250 °C varied from 3.7 × 10¹⁶ cm⁻³ to 5 × 10¹⁷ cm⁻³ and was described by the expression

$$S_{\mathrm{Yb}} = 2.2 \times 10^{23} \exp\left(-\frac{(1.70 \pm 0.05)\ \mathrm{eV}}{k_B T}\right)\ \mathrm{cm^{-3}}.$$

Ren et al. [12.31] reported that their experiments indicated that the equilibrium solubility of Er in Si was approximately 10¹⁶ cm⁻³ at 1300 °C. Graff [12.34] reported the value of 3.6 × 10¹³ cm⁻³ as a lower limit for the solubility of molybdenum at 900 °C. This value is similar to the solubility of iron, manganese and cobalt in the same temperature range. The solubility of ¹⁴⁷Pm in silicon was on the order of 6 × 10¹³ cm⁻³ at 1200 °C, as reported by Ferrin et al. [12.32]. Vyvenko et al. [12.33] reported that the analysis of the diffusion profiles and comparison of the SIMS and spreading resistance data give an indirect

indication that the interstitial solubility of Zr and Hf in silicon at 1000 °C may be less than the SIMS detection limit (approximately 10¹⁵ cm⁻³). No data were reported for the substitutional solubility of these impurities. The few data points which are available for 4d, 5d, and rare earth metals are spread from 3 × 10¹³ cm⁻³ to 10¹⁸ cm⁻³ in the temperature range from 900 °C to 1300 °C (Fig. 12.1). For comparison, we also plotted in Fig. 12.1 the solubilities of three common impurities in silicon, Cu, Fe, and Cr. The data for Zr, Hf, and Mo in Fig. 12.1 are close to the iron solubility in silicon. It will be shown in the next section that the accurate determination of the solubility of heavy metals is hindered by their low diffusion coefficients, i.e., extremely long annealing times may be required to saturate a sample with the heavy metal impurity.
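The Arrhenius expression quoted above for the Yb solubility can be evaluated at the ends of the reported temperature range as a quick consistency check. The sketch below uses the central value of the activation energy, 1.70 eV, and ignores the ± 0.05 eV uncertainty; the function name and default arguments are illustrative choices rather than anything defined in [12.23].

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def yb_solubility_cm3(temp_c, prefactor=2.2e23, e_a_ev=1.70):
    """Evaluate S_Yb = prefactor * exp(-E_a / (k_B * T)) in cm^-3 for a temperature in deg C."""
    temp_k = temp_c + 273.15
    return prefactor * math.exp(-e_a_ev / (K_B_EV * temp_k))

for temp_c in (1000.0, 1250.0):
    print(f"T = {temp_c:.0f} C  S_Yb = {yb_solubility_cm3(temp_c):.2e} cm^-3")
```

The expression evaluates to roughly 4 × 10¹⁶ cm⁻³ at 1000 °C and 5 × 10¹⁷ cm⁻³ at 1250 °C, close to the end points of the range reported for Yb.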

12.4 Diffusivity of 4d, 5d, and Rare Earth Elements in Silicon

The database on the diffusivity of 4d, 5d, and rare earth metals in silicon is more extensive than that on their solubility. Nonetheless, the available data are fragmentary and often inconclusive. Therefore, in order to establish the range in which the diffusivity value of a heavy metal is likely to be, we reviewed the data on both the metals which are considered for high-k dielectric applications (Sect. 12.4.1), and their neighbors in the periodic table (Sect. 12.4.2). In the third part, we summarize the data and determine typical ranges of diffusivity values which one may expect for a heavy metal.

12.4.1 Diffusivity of Pr, Sr, Ba, Zr, and Hf

Diffusion of praseodymium (Pr) in silicon was studied by Nazyrov et al. [12.35] using a radiotracer technique. They deposited radioactive praseodymium chloride on the polished surfaces of silicon samples and annealed the samples in air at temperatures from 1100 °C to 1300 °C for 22–57 hours. The penetration depth of Pr in silicon after the diffusion anneals did not exceed 3–4 µm. From the analysis of the diffusion profiles, they concluded that the diffusion coefficient of Pr was 1.15 × 10⁻¹³ cm²/s at 1100 °C, 1.5 × 10⁻¹³ cm²/s at 1150 °C, 2.2 × 10⁻¹³ cm²/s at 1200 °C, 4.2 × 10⁻¹³ cm²/s at 1250 °C, and 6.3 × 10⁻¹³ cm²/s at 1300 °C. An Arrhenius plot of these data yields the following expression for the diffusion coefficient:

$$D_{\mathrm{Pr}} = 2.4 \times 10^{-7} \exp\left(-\frac{1.76\ \mathrm{eV}}{kT}\right)\ \mathrm{cm^2/s}.$$

These data and the fit are plotted in Fig. 12.2.

Diffusion of strontium (Sr) in silicon was studied by Yamamichi et al. [12.8]. Strontium was deposited onto wafers by spin-coating using 0.1, 1, 10, and 100 ppm solutions. The wafers were annealed at 950 °C in nitrogen for 4 hours, and were subsequently analyzed by SIMS. The diffusion coefficient

12 Physicochemical Properties of Selected 4d, 5d, and Rare Earth Metals

[Figure: Arrhenius plot of the Pr diffusivity (cm$^2$/s) versus 1000/T (K$^{-1}$), with a secondary temperature axis in °C.]

Fig. 12.2. Diffusivity of praseodymium in silicon (after Nazyrov et al. [12.35])
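The Arrhenius analysis of the praseodymium data quoted in Sect. 12.4.1 can be sketched as an ordinary least-squares fit of ln D against 1/(k_B T). The sketch below is only an illustration: the function name `arrhenius_fit` and the unweighted fit are assumptions, not the procedure used by the original authors, so the fitted prefactor and activation energy need not coincide exactly with the quoted expression.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant, eV/K

# (temperature in deg C, diffusivity in cm^2/s) as quoted for Pr in Sect. 12.4.1
PR_DATA = [
    (1100, 1.15e-13),
    (1150, 1.5e-13),
    (1200, 2.2e-13),
    (1250, 4.2e-13),
    (1300, 6.3e-13),
]


def arrhenius_fit(data):
    """Unweighted least-squares fit of ln D = ln D0 - Ea / (k_B T).

    Hypothetical helper for illustration. Returns (D0 in cm^2/s, Ea in eV).
    """
    xs = [1.0 / (K_B * (t_c + 273.15)) for t_c, _ in data]  # 1/(k_B T) in 1/eV
    ys = [math.log(d) for _, d in data]                     # ln D
    n = len(data)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals -Ea
    intercept = y_mean - slope * x_mean    # equals ln D0
    return math.exp(intercept), -slope


d0_fit, ea_fit = arrhenius_fit(PR_DATA)
print(f"D0 = {d0_fit:.2e} cm^2/s, Ea = {ea_fit:.2f} eV")
```

With only five points over a 200 °C window, the prefactor and the activation energy are strongly correlated, so modest differences in fitting procedure shift both values noticeably.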

[Figure: Arrhenius plot of diffusivity (cm$^2$/s) versus 1000/T (K$^{-1}$), with a secondary temperature axis in °C. Data series: Hf (fast) and Hf (slow), Vyvenko et al.; Zr (fast) and Zr (slow), Vyvenko et al.; Zr, Quevedo-Lopez et al.; Ba, Boubekeur et al.; Sr, Yamamichi et al.]

Fig. 12.3. Diffusivity of barium and strontium in silicon (after Yamamichi et al. [12.8] and Boubekeur et al. [12.36]) and diffusivity of Hf and Zr in silicon (after Vyvenko et al. [12.33] and Quevedo-Lopez et al. [12.10])

A.A. Istratov and E.R. Weber

of Sr in Si at 950 °C was calculated using the Sr depth profile in Si for a depth of 8 to 10 nm, and the value obtained was $2\times10^{-17}$ cm$^2$/s (see Fig. 12.3).

Diffusion of barium (Ba) in silicon was studied by Boubekeur et al. [12.36], who spin-coated Si wafers with a Ba-containing solution and annealed the wafers at 800 °C for 60 min in a nitrogen or oxygen atmosphere. SIMS analysis revealed penetration of Ba into silicon to a depth of 30–100 nm. The diffusion coefficient of Ba at 800 °C obtained from the diffusion profile was $5\times10^{-16}$ cm$^2$/s [12.36] (see Fig. 12.3).

Diffusion of hafnium (Hf) and zirconium (Zr) in silicon was studied by Vyvenko et al. [12.33]. They found that the diffusion coefficient of Hf was (1–3) $\times 10^{-16}$ cm$^2$/s at 1000 °C and (0.3–1) $\times 10^{-15}$ cm$^2$/s at 1100 °C for the slow diffusing component (tentatively substitutional Hf), and $3\times10^{-13}$ cm$^2$/s at 1000 °C and $1.2\times10^{-12}$ cm$^2$/s at 1100 °C for the fast diffusing component (tentatively interstitial Hf). For Zr, the diffusion coefficient at 1100 °C was determined to be $1.7\times10^{-12}$ cm$^2$/s for the fast diffusing component and $5\times10^{-14}$ cm$^2$/s for the slow diffusing component. Taking into account the difference in the annealing conditions, the Zr diffusion coefficient of $5\times10^{-14}$ cm$^2$/s at 1100 °C seems to agree reasonably well with the estimated diffusion coefficient of $2\times10^{-15}$ cm$^2$/s reported by Quevedo-Lopez et al. [12.10] for a 30 s drive-in RTA anneal at 1050 °C (see Fig. 12.3).

12.4.2 Diffusivity of Er, Pm, Yb, Tb, Ho, and Mo in Silicon

Besides the data on the diffusivity of Zr, Hf, Ba, Sr, and Pr, which are considered for use in high-k dielectrics, diffusivities of several other 4d, 5d, and rare earth elements have been reported. These elements include erbium (Er), promethium (Pm), ytterbium (Yb), terbium (Tb), holmium (Ho), and molybdenum (Mo).
Although these elements have not found applications in gate stacks, they belong to the same group of elements as the metals considered for high-k dielectrics and are therefore likely to have similar diffusivities. The data on the diffusivity of Er in silicon are presented in Fig. 12.4, and the data for promethium in Fig. 12.5. The data for the rest of the metals discussed in this section are plotted in the master plot, Fig. 12.6.

The diffusivity of erbium in silicon was studied in [12.25, 12.26, 12.31, 12.37–12.39] (see Fig. 12.4). Zainabidinov et al. [12.26] measured three points in the temperature range from 1150 to 1250 °C and reported the equation $D = 5\times10^{-3}\exp(-3\ \mathrm{eV}/k_BT)$ cm$^2$/s. Alexandrov et al. [12.25] reported an Er diffusivity of $D = 1.3\times10^{-12}$ cm$^2$/s at 1200 °C. Nazyrov et al. [12.37] reported that the diffusion coefficient of Er in silicon is given by the expression $D_{\mathrm{Er}} = 2\times10^{-3}\exp(-2.9\ \mathrm{eV}/k_BT)$ cm$^2$/s (1100–1250 °C). Roberts et al. [12.39] found that the diffusion coefficient of Er in Si at 1315 °C lies between $1\times10^{-16}$ and $3\times10^{-16}$ cm$^2$/s. Finally, Ren et al. [12.31] reported that the diffusivity of Er in silicon is about $10^{-12}$ cm$^2$/s at 1300 °C and approximately $10^{-15}$ cm$^2$/s at 900 °C, and is described by a diffusion enthalpy of about 4.6 eV. Plotting all data reported in [12.25, 12.26, 12.31, 12.37–12.39] in Arrhenius coordinates,

Fig. 12.4. Diffusivity of erbium in silicon (after Aleksandrov et al. [12.25], Zainabidinov et al. [12.26], Nazyrov et al. [12.37], and Ren et al. [12.31])

[Figure: Arrhenius plot of diffusivity (cm$^2$/s) versus 1000/T (K$^{-1}$), with a secondary temperature axis in °C. Data series: Pm, Nazyrov et al.; Pm, Ferrin et al.]

Fig. 12.5. Diffusivity of promethium (Pm) in silicon, after Ferrin et al. [12.32] and Nazyrov et al. [12.40]

we obtained the following equation for the best fit to all reported data for erbium diffusivity:

$$D_{\mathrm{Er}} = 2.5\times10^{-3}\exp(-2.95\ \mathrm{eV}/k_BT)\ \mathrm{cm^2/s}.$$

Ferrin et al. [12.32] used the radiotracer technique to study the diffusivity of a radioactive isotope of promethium, $^{147}$Pm, in p-type silicon crystals. Diffusion anneals were performed at temperatures between 730 °C and 1270 °C for a period of about 300 hours. The penetration depth of Pm at these temperatures

[Figure: Arrhenius plot of diffusion coefficient (cm$^2$/s) versus 1000/T (K$^{-1}$), with a secondary temperature axis in °C. Data series: Ba, Er, Hf, Ho, Mo, Pm, Pr, Sr, Zr, Tb, Yb, and Si$_i$.]

Fig. 12.6. Master plot of the reported data on diffusivity of heavy metals in silicon. Diffusivity of silicon self-interstitials (solid line, [12.45]) is shown for comparison. The diffusion coefficients of 4d, 5d, and rare earth elements discussed in this chapter can be expected to be within the shaded area
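To get a feel for how the individual Arrhenius expressions quoted in Sect. 12.4 compare with one another, they can be evaluated at a common temperature. The sketch below is illustrative only: the choice of 1100 °C is an assumption made for convenience, the dictionary and function names are hypothetical, and the Mo expression was measured between 680 °C and 1000 °C, so its value at 1100 °C is an extrapolation.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant, eV/K

# (prefactor D0 in cm^2/s, activation energy in eV) as quoted in Sect. 12.4
ARRHENIUS = {
    "Pr": (2.4e-7, 1.76),
    "Er (fit to all data)": (2.5e-3, 2.95),
    "Pm (fit to all data)": (1.12e-8, 1.29),
    "Tb": (1.4e-3, 4.0),
    "Ho": (4.9e-3, 4.1),
    "Mo": (0.26, 2.2),  # measured at 680-1000 C; extrapolated below
}


def diffusivity(d0, ea, temp_c):
    """Evaluate D = D0 * exp(-Ea / (k_B T)) in cm^2/s for a temperature in deg C."""
    return d0 * math.exp(-ea / (K_B * (temp_c + 273.15)))


TEMP_C = 1100.0  # comparison temperature, chosen here for illustration
table = {name: diffusivity(d0, ea, TEMP_C) for name, (d0, ea) in ARRHENIUS.items()}

# Print from slowest to fastest diffuser
for name, d in sorted(table.items(), key=lambda kv: kv[1]):
    print(f"{name:22s} D({TEMP_C:.0f} C) = {d:.1e} cm^2/s")
```

In this comparison the Tb and Ho expressions give values several orders of magnitude below those of the other metals, which is the feature singled out as questionable in Sect. 12.4.3.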

was about 50 microns. The penetration curves were best described by double exponentials: a shallow diffused layer a few microns thick was followed by a considerably deeper one, which extended to about 50 microns. However, the temperature dependence of the fast component had an unreasonably low activation energy of 0.07 eV, indicating that the fast component might have been an artifact of sample preparation, such as re-deposition of Pm on the sample surface during sequential polishing or etching (as is usually done in radiotracer measurements). Therefore, we did not plot the data for the fast component in Fig. 12.5. The diffusivity of the slow component was given by

$$D_{\mathrm{Pm}} = 5.7\times10^{-9}\exp(-1.2\ \mathrm{eV}/k_BT)\ \mathrm{cm^2/s}.$$

Another study of the diffusivity of promethium in silicon, also made using the radioactive isotope $^{147}$Pm, was reported by Nazyrov et al. [12.40]. They deposited Pm from an alcohol solution of the chloride of $^{147}$Pm and performed diffusion anneals at temperatures from 1100 to 1250 °C for 8 to 48 hours. The diffusion depth of promethium at these temperatures was found to be less than 10 microns. The diffusion coefficient of Pm followed the Arrhenius law given by the equation

$$D_{\mathrm{Pm}} = 5\times10^{-3}\exp(-3.3\ \mathrm{eV}/k_BT)\ \mathrm{cm^2/s}.$$

It is interesting that when the data from [12.32, 12.40] are plotted in the same plot, they agree quite well with each other despite a substantial difference in the values of the diffusion enthalpies obtained in [12.32, 12.40]. A least-squares fit to both sets of data in Arrhenius coordinates (Fig. 12.5) gives the following expression for the diffusivity of promethium in silicon:

$$D_{\mathrm{Pm}}\ (\text{fit to all data}) = 1.12\times10^{-8}\exp(-1.29\ \mathrm{eV}/k_BT)\ \mathrm{cm^2/s}.$$

The diffusivity of ytterbium (Yb) in silicon was studied by Fu et al. [12.41]. A fit of the SIMS depth profile of the Yb distribution with an error function enabled them to estimate the diffusion coefficient of Yb in Si as (4–14) $\times 10^{-13}$ cm$^2$/s at 1050 °C.

The diffusivities of terbium (Tb) and holmium (Ho) in silicon were studied by Uskov et al. [12.42]. They used neutron activation analysis to obtain diffusion profiles of Tb and Ho in silicon after anneals at temperatures of 1000–1270 °C in a mixture of dry argon and oxygen, and reported the following equations for the diffusivity of Tb and Ho, respectively:

$$D_{\mathrm{Tb}} = 1.4\times10^{-3}\exp(-4.0\ \mathrm{eV}/k_BT)\ \mathrm{cm^2/s},$$

$$D_{\mathrm{Ho}} = 4.9\times10^{-3}\exp(-4.1\ \mathrm{eV}/k_BT)\ \mathrm{cm^2/s}.$$

The diffusion coefficient of molybdenum (Mo) in silicon was studied in [12.43, 12.44]. Hamaguchi et al. [12.43] obtained a diffusion coefficient of Mo in Si of $2\times10^{-10}$ cm$^2$/s at 1000 °C. Benton et al. [12.44] performed diffusivity measurements at four temperatures between 680 °C and 1000 °C and reported the following equation for the diffusion coefficient of Mo in Si: $D_{\mathrm{Mo}} = 0.26\exp(-2.2\ \mathrm{eV}/k_BT)$ cm$^2$/s. Benton et al. [12.44] pointed out that the diffusion of Mo in Si is defect mediated and dramatically affected by the presence of implantation-related damage.

12.4.3 Diffusivity of Heavy Metals in Silicon: A Discussion

A master plot of the diffusivity data reported for heavy metals in silicon is presented in Fig. 12.6. We did not attempt to identify the source of each data point in the caption of this busy graph.
The sources of the data points for the metals considered for high-k dielectrics were given in Figs. 12.2–12.5 above, whereas the data for the other heavy metals (see the book of Graff [12.30] and references therein) are essential only in establishing trends and ranges of the diffusivity values. With the exception of Ho and Tb, whose diffusion coefficients lie below the silicon self-diffusion curve (solid line) and are therefore questionable, the data points for the other metals correspond to very slow metal diffusion. The majority of the data points in Fig. 12.6 are located between $D = 3\times10^{-14}$ cm$^2$/s and $10^{-12}$ cm$^2$/s in the temperature range between 1000 °C and 1300 °C. The diffusion coefficient of $3\times10^{-14}$ cm$^2$/s corresponds to the

average diffusion length of about 27 nm during a 60 second RTP anneal at 1000 °C. While this diffusion length is extremely short, it is of the same order of magnitude as a typical device size and may be sufficient to alter the device properties. On the other hand, at this low diffusion coefficient even a 24 hour anneal at 1000 °C does not allow the impurity to travel more than about 1 micron in the crystal. Consequently, it is very difficult to saturate the sample with the diffusing metal in order to measure its solubility, or to apply experimental techniques other than SIMS to determine the diffusivity of the impurity. As a matter of fact, even diffusivities determined from SIMS depth profiles may be inaccurate due to the high reactivity of the heavy metals and the complexity of the processes involved in diffusion. For example, in their recent study, Francois-Saint-Cyr et al. [12.46] presented SIMS profiles of 17 heavy metals including Be, Ti, V, Cr, Mn, Mo, and others, taken after implantation and various anneals. None of these profiles could be described by a simple model in which the implantation profile broadens by diffusion alone. Some of the metals showed strong outdiffusion to the surface, and some were apparently trapped at implantation-induced lattice defects. These factors suggest that significant experimental and theoretical efforts will be required in the future in order to obtain accurate expressions for the diffusion coefficients of heavy metals in silicon.
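The two length estimates quoted above can be checked with a few lines of arithmetic. The sketch below assumes that the diffusion length is taken as 2√(Dt); this convention is inferred from the quoted numbers rather than stated in the text, and the function name is hypothetical.

```python
import math


def diffusion_length_cm(d_cm2_s, time_s):
    """Characteristic diffusion length 2*sqrt(D*t) in cm (assumed convention)."""
    return 2.0 * math.sqrt(d_cm2_s * time_s)


D_SLOW = 3e-14  # cm^2/s, lower bound of the typical range at 1000 C quoted above

# 60 s RTP anneal, converted from cm to nm
rtp_nm = diffusion_length_cm(D_SLOW, 60.0) * 1e7

# 24 h furnace anneal, converted from cm to micrometers
furnace_um = diffusion_length_cm(D_SLOW, 24 * 3600.0) * 1e4

print(f"60 s anneal:  {rtp_nm:.1f} nm")
print(f"24 h anneal:  {furnace_um:.2f} um")
```

Because the length scales with the square root of time, a 1440-fold longer anneal extends the diffusion length by only a factor of about 38, which is why saturating a wafer of ordinary thickness is impractical at these diffusivities.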

12.5 Energy Levels in the Band Gap

From the long list of heavy metals considered for high-k dielectric application, only three (Y, Zr, and Hf) were studied using electrical characterization techniques. The samples for analysis were either doped with these metals during growth or thermally diffused. In this section, we first discuss the data reported for Y, Zr, and Hf, and then review the reports available for the other 4d, 5d, and rare earth metals, such as Mo, Nb, Ta, W, Er, Tb, Dy, and Ho. Since the position of the energy levels and the number of defect states formed in the band gap can vary significantly from one element to another within the same group of the periodic table, it is extremely difficult to establish clear trends in the position of the energy levels in the silicon band gap that could be extrapolated to the metals considered for high-k dielectrics. However, a qualitative prediction of the electrical activity of these metals can be derived from this analysis.

12.5.1 Energy Levels of Y, Zr, and Hf

Lemke [12.47] studied electrical levels in silicon float zone crystals doped with Zr and Hf during growth. For doping concentrations near $1\times10^{20}$ cm$^{-3}$ in the melt, electrically active concentrations of about $1\times10^{12}$ cm$^{-3}$ were achieved for n- and p-type crystals. Since segregation coefficients (the ratio between the amount of impurity incorporated in the growing crystal and the

amount dissolved in the melt) of heavy impurities are usually low, on the order of $10^{-6}$–$10^{-7}$, Lemke’s data indicate that a significant fraction of the incorporated Zr and Hf forms electrically active defects in silicon. Three levels in the band gap were detected for each of the two elements: $E_C - 0.14$ eV, $E_C - 0.41$ eV, and $E_V + 0.32$ eV for Zr, and $E_C - 0.10$ eV, $E_C - 0.39$ eV, and $E_V + 0.32$ eV for Hf [12.47]. These levels were tentatively associated with multiple energy levels of substitutional zirconium/hafnium.

Vyvenko et al. [12.33] performed DLTS measurements on Zr- and Hf-implanted and annealed samples, as well as on surface-diffused samples, and confirmed the levels reported by Lemke. In addition, two new levels at $E_C - 0.32$ eV and $E_C - 0.21$ eV were observed in Hf-contaminated n-type samples, and another two new levels, $E_C - 0.18$ eV and $E_C - 0.54$ eV, were observed in Zr-contaminated n-type samples [12.33].

Voronkov et al. [12.48] used the Hall effect to study crystals doped with Zr during growth. An additional impurity-related energy level in concentrations above the detection limit of the Hall measurement was found only in crystals grown from the melt doped with the highest concentration of Zr ($10^{20}$ cm$^{-3}$). Voronkov et al. [12.48] determined the position of the Zr-related level as $E_C - 0.07$ eV; its concentration depended on the initial Zr concentration in the melt and was up to $10^{16}$ cm$^{-3}$. The authors argued that this level could be associated with interstitial zirconium, which becomes stable in the crystal when the total concentration of Zr exceeds the equilibrium vacancy concentration at the growth temperature (about $10^{16}$ cm$^{-3}$). This model assumes that all Zr at concentrations lower than $10^{16}$ cm$^{-3}$ is trapped by vacancies and forms substitutional zirconium. However, Voronkov et al. [12.48] did not observe any energy levels of substitutional Zr in their samples.

Lebedev et al.
[12.49] studied the photoelectric properties of yttrium (Y)-doped silicon and found that the introduction of yttrium by thermal diffusion at 1200–1250 °C for 3–12 h gave rise to several deep levels with ionization energies of $E_C - 0.29$, $E_C - 0.4$, $E_C - 0.5$, $E_C - 0.55$, and $E_V + 0.45$ eV. The total density of the deep levels formed on introduction of yttrium was on the order of (4–8) $\times 10^{13}$ cm$^{-3}$ [12.49]. The control samples subjected to the same heat treatment without yttrium had deep levels in concentrations not exceeding $2\times10^{12}$ cm$^{-3}$.

12.5.2 Electrical Levels of Mo, Nb, Ta, and W

Molybdenum (Mo) in Si was investigated primarily because it is a common impurity in epi-layers of p/p$^+$ epitaxial wafers [12.44, 12.50]. Hopkins, Davis, Rohatgi et al. [12.51–12.54] found from a comparison of neutron activation analysis and DLTS measurements that all the molybdenum introduced into silicon during crystal growth was electrically active. They reported one energy level detected by DLTS at $E_V + 0.30$ eV [12.53]. This level is thought to be the major electrical level of Mo in silicon, and was

also observed in [12.43, 12.44, 12.50, 12.55–12.62]. In addition, several authors suggested that Mo also forms levels in the upper half of the band gap: at $E_C - 0.27$, $E_C - 0.34$, and $E_C - 0.58$ eV as reported by Hamaguchi et al. [12.43], at $E_C - 0.30$ eV as reported by Schulz et al. [12.61], and at $E_C - 0.53$ eV as reported by Zhou Jie et al. [12.62].

Niobium (Nb) in silicon was also shown to be electrically active. Schulz et al. [12.61] used the capacitance-voltage technique to determine the position of Nb-related levels in ion-implanted silicon and found two donor levels, $E_C - 0.26$ eV and $E_V + 0.4$ eV. Davis et al. [12.52] performed DLTS measurements on samples contaminated with Nb during growth and reported one electrical level at $E_V + 0.12$ eV ($\sigma_n = 3.6\times10^{-14}$ cm$^2$). The concentration of electrically active centers was 19% of the total Nb concentration measured with neutron activation analysis and spark source mass spectrometry. Finally, Pettersson et al. [12.63] implanted Nb in p–n structures and reported three energy levels at $E_C - 0.293$ eV, $E_C - 0.583$ eV, and $E_V + 0.163$ eV.

Tantalum (Ta) was studied, to our knowledge, by only two groups. Busta et al. [12.64] reported one Ta-related level at $E_C - 0.21$ eV. Milnes [12.65] in his earlier review reported two levels, at $E_C - 0.14$ eV and $E_C - 0.43$ eV.

Tungsten (W) in silicon was studied more extensively. The first results concerning the electrical activity of W in Si were obtained by Zibuts et al. [12.66, 12.67] in the 1960s. From photoconductivity results obtained on silicon samples intentionally contaminated with W, they concluded that tungsten introduces three electron traps and two hole traps in the silicon band gap, at $E_C - 0.22$ eV, $E_C - 0.30$ eV, $E_C - 0.37$ eV, $E_V + 0.31$ eV, and $E_V + 0.34$ eV, respectively. Fujisaki et al. [12.68] found two W-related deep levels in the band gap, $E_V + 0.41$ eV and $E_C - 0.22$ eV. Boughaba et al.
[12.69] reported three defect centers, with levels at $E_V + 0.22$ eV ($\sigma_p = 9.4\times10^{-15}$ cm$^2$), $E_V + 0.33$ eV ($\sigma_p = 1.2\times10^{-14}$ cm$^2$), and $E_C - 0.59$ eV ($\sigma_p = 6.6\times10^{-17}$ cm$^2$, $\sigma_n = 1.7\times10^{-14}$ cm$^2$). Schmalz, Pettersson et al. [12.59, 12.60] reported a W-related level at $E_V + 0.379$ eV. Busta et al. [12.64] reported two W-related levels at $E_C - 0.25$ eV and $E_C - 0.28$ eV. Ando et al. [12.70] found one hole trap at $E_V + 0.41$ eV in samples doped with W from the melt during crystal growth. Thus, although it seems that W introduces multiple energy levels in the silicon band gap, there is no consensus on their positions. This may imply either that W is very reactive and forms various complexes, or that unintentional contamination played an adverse role.

12.5.3 Electrical Levels of the Rare Earth Elements: Er, Tb, Ho, or Dy

Emtsev et al. [12.27, 12.28] reported that initially p-type samples (20 and 50 Ω·cm) became n-type after implantation of Er, Dy, or Ho in doses over $5\times10^{11}$ cm$^{-2}$ and subsequent annealing of the samples at 700 °C and 900 °C for 30 min in a chlorine-containing ambient to activate the implants. This conductivity type inversion was caused by the formation of relatively shallow donor levels in the energy ranges $E_C$ − (60–70) meV and $E_C$ − (100–120) meV. They also pointed out that a high concentration of implantation-induced intrinsic point defects facilitated heterogeneous nucleation of oxygen with the formation of thermal donors, which in turn formed an additional shallow level at $E_C - 0.04$ eV [12.27, 12.28].

Benton et al. [12.24] reported eight defect levels related to implanted erbium in silicon, located 0.06, 0.09, 0.14, 0.18, 0.27, 0.31, 0.32, and 0.48 eV below the conduction band edge. Capacitance-voltage profiling, Hall effect data, and DLTS all showed that erbium doping introduces excess donors.

In addition to numerous reports that erbium acts as a donor in silicon, some groups reported acceptor action. Alexandrov et al. [12.71] reported that diffusion of Er in Si at high temperature (1100–1250 °C) results in the formation of shallow acceptor centers with an energy of $E_V + 0.045$ eV. The concentration of these centers depended on the heat treatment conditions and, in particular, on the type and concentration of the excess intrinsic point defects that participate in the diffusion process. Supersaturation of Si with vacancies is accompanied by a significant increase in the concentration of acceptor centers, which are likely to be associated with substitutional Er. Supersaturation of Si with intrinsic interstitial atoms results in a decrease in the concentration of these acceptor centers, which can be explained by the transfer of some of the substitutional Er atoms to interstitial sites [12.71].

Libertino et al. [12.22] implanted p–n junctions with 5 MeV Tb ions using fluences in the range of $6\times10^{11}$–$6\times10^{12}$ cm$^{-2}$ with subsequent annealing at temperatures between 800 and 1000 °C for 5 s to 30 min. It was found that in high-purity epitaxial Si, Tb introduces several donor levels at energies between $E_C - 0.15$ and $E_C - 0.53$ eV.
On the other hand, in the presence of oxygen, which was co-implanted with Tb to achieve an oxygen concentration of about $10^{18}$ cm$^{-3}$ in the crystal region where Tb sits, the concentration of deep levels was reduced by more than one order of magnitude, and shallower levels with energies in the range 0.07–0.16 eV from the conduction band dominated the spectrum. It was suggested that these modifications in the deep level spectra were produced by the formation of Tb–O complexes.

Aleksandrov et al. [12.29] reported that implantation of Ho and Dy followed by post-implantation anneals in the temperature range between 600 and 900 °C in a chlorine-containing ambient caused the formation of donor centers and conversion of the conductivity type of the 20 Ω·cm p-type samples to n-type. However, the energy level of the donor centers was not determined.

Bakhadyrkhanov et al. [12.23] reported that diffusion of Yb in Si introduces donor centers, and that only about 1% of the total Yb concentration was electrically active. Hall effect and photoconductivity studies enabled them to determine the position of these levels at $E_C - 0.23$ eV and $E_C - 0.31$ eV. Fu et al. [12.41] performed thermal diffusion of Yb in Si for 30 min at 1050 °C. They found two hole traps in Yb-diffused p-type Si,

$E_V + 0.38$ eV and $E_V + 0.49$ eV, and one electron trap in p-type Yb-doped silicon at $E_C - 0.33$ eV. The concentration of all these traps was about (2–5) $\times 10^{13}$ cm$^{-3}$. This value is much less than the atomic concentration of Yb determined by SIMS ($4\times10^{18}$ cm$^{-3}$ at a depth of 1.7 µm).

Voronkova et al. [12.72] used the Hall effect to study energy levels in silicon doped with samarium (Sm) during crystal growth. Two new acceptor levels, at $E_C - 0.28$ eV and $E_V + 0.45$ eV, were observed after annealing at 800–1100 °C for 30 min terminated by quenching in liquid nitrogen. The same heat treatment of samarium-doped silicon with higher resistivity resulted in n-to-p conductivity type conversion, which confirmed that samarium introduced acceptor levels in the band gap. The formation of acceptor levels after anneals was explained by dissolution of samarium precipitates formed during crystal growth.

12.6 Effect of 4d, 5d, and Rare Earth Metals on Minority Carrier Recombination Lifetime and Device Performance

The majority of studies presented in this section were performed in the 1980s in order to determine the impact of metal contamination on solar cell efficiency. One should bear in mind when evaluating these old data that the quality of crystalline silicon has significantly improved during the last 20 years, and so has its sensitivity to metal contaminants. A typical minority carrier diffusion length in crystalline silicon of 100–200 µm in the 1980s should be compared to over 1000 µm now. Therefore, the threshold values for metal contamination levels that are detrimental to the minority carrier lifetime, as determined in the early 1980s, should be decreased by a factor proportional to the ratio of lifetimes (or the square of the ratio of diffusion lengths in uncontaminated wafers) [12.51], i.e., typically by a factor of 25 to 100, in order to account for the improved minority carrier lifetime in as-grown crystals. For example, compare the threshold concentration of iron of $3\times10^{13}$ cm$^{-3}$ reported in 1986 by Pizzini et al. [12.73], or the earlier value of $3\times10^{14}$ cm$^{-3}$ reported in 1980 [12.74], with the recent data of Reiss et al. [12.75], who showed that $5\times10^{11}$ cm$^{-3}$ of iron in the form of FeB pairs may decrease the efficiency of CZ solar cells by 3–4%.

From the list of metals considered for use in high-k dielectrics, we were able to find experimental data only for the impact of zirconium on minority carrier recombination lifetime. Hopkins et al. [12.51] and Davis [12.52] reported that the 8% degradation threshold contamination level for solar cell efficiency is about $10^{12}$ cm$^{-3}$ for Zr. Hill et al. [12.76] reported that the critical Zr contamination level for a 10% efficient cell (which was considered a good efficiency in 1976) was $4\times10^{13}$ cm$^{-3}$, which should be compared with the critical level of $1\times10^{15}$ cm$^{-3}$ reported by the same authors for iron.
These data suggest that in modern silicon, with a typical minority carrier diffusion length on the order of one millimeter, the critical contamination level of Zr would be around $10^{10}$ cm$^{-3}$ or less.

The other heavy metals whose impact on minority carrier diffusion length was studied in the literature include Nb, Mo, Ta, and W. The impact of niobium on minority carrier diffusion length was studied by Hopkins et al. [12.51] and Davis [12.52], who reported that the Nb threshold contamination level for 8% degradation of solar cell efficiency is about $10^{12}$ cm$^{-3}$. We would like to re-emphasize that extrapolation of these data to modern IC-quality silicon would result in critical contamination levels in the range $10^{10}$–$10^{11}$ cm$^{-3}$ (this estimate was made using the factor of 25–100 explained above).

To our knowledge, barium was studied only by Boubekeur et al. [12.36], who used the ELYMAT technique to investigate the impact of barium on minority carrier lifetime; however, they did not observe any effect. We would like to point out that the data from [12.36] are hardly conclusive, since Ba contamination was performed using diffusion from the surface and the penetration depth of Ba was less than 100 nm. Since a minority carrier diffusion length measurement is a bulk technique, interpretation of experimental data taken on a wafer that was contaminated only in a thin near-surface layer is not straightforward, and the fact that no effect was observed does not necessarily imply that Ba is not recombination active. In the same article, Boubekeur et al. [12.36] attempted to analyze the impact of Ba on p–n junction leakage by spin-coating patterned Si wafers and annealing them at 800 °C for 60 min. No impact of Ba concentration was observed. However, the authors did not consider the fact that the distance Ba had to diffuse through the n$^+$ polysilicon electrode and the n$^+$ implanted area to reach the p–n junction was over 300 nm, whereas the characteristic diffusion length of Ba for the annealing temperature and time used in this study was only 27 nm.
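The mismatch between the 27 nm diffusion length and the 300 nm junction depth can be made quantitative with a rough sketch. The code below assumes a constant surface concentration, so that the profile follows a complementary error function, and it assumes that the bulk-silicon Ba diffusivity of Sect. 12.4.1 also applies to the n+ polysilicon and implanted layers. Both are simplifying assumptions for illustration, not part of the original analysis.

```python
import math

D_BA = 5e-16               # cm^2/s, Ba in Si at 800 C as quoted in Sect. 12.4.1
ANNEAL_S = 60 * 60.0       # 60 min anneal, in seconds
JUNCTION_DEPTH_CM = 300e-7  # 300 nm expressed in cm

# Characteristic diffusion length 2*sqrt(D*t), in cm
length_cm = 2.0 * math.sqrt(D_BA * ANNEAL_S)
length_nm = length_cm * 1e7


def relative_concentration(depth_cm):
    """C(x)/C_surface for diffusion from a constant surface source (assumed model)."""
    return math.erfc(depth_cm / length_cm)


at_junction = relative_concentration(JUNCTION_DEPTH_CM)
print(f"diffusion length: {length_nm:.1f} nm")
print(f"C(300 nm)/C(surface): {at_junction:.1e}")
```

Under these assumptions the junction lies more than ten diffusion lengths from the surface, so the fraction of the surface concentration reaching it is vanishingly small, consistent with the argument that the absence of a leakage effect is inconclusive.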
The contamination level of molybdenum critical for solar cell efficiency was, according to Hopkins, Davis, Rohatgi et al. [12.51–12.53], about $5\times10^{11}$ cm$^{-3}$. Their report agrees with the findings of Aoki et al. [12.50], who reported that the effect of Mo on minority carrier lifetime in contaminated epi-layers was substantial when its concentration was about $3\times10^{11}$ cm$^{-3}$ (the lowest contamination level in their experiments), and of Hamaguchi et al. [12.43], who also pointed out that Mo is an efficient recombination center that affects the lifetime at concentrations around $5\times10^{11}$ cm$^{-3}$. Polignano et al. [12.55] reported that $2\times10^{12}$ cm$^{-3}$ of electrically active Mo can reduce the minority carrier diffusion length, as measured by SPV, from about 350 µm in as-grown CZ wafers to approximately 55 µm in Mo-contaminated wafers. Benton et al. [12.44] reported that molybdenum affects the minority carrier lifetime to a somewhat lesser degree than iron, although the effect of Mo contamination was detectable even for Mo concentrations of about $10^{10}$ cm$^{-3}$. It should be emphasized, however, that their data may not be applicable, as they were measured on implanted samples after a 30 min diffusion anneal at 825 °C. Such an anneal redistributes Mo only in about

a 3 µm-thick layer near the surface. This implies that the minority carrier lifetime measured by the authors may not accurately describe the actual recombination activity of the metal impurities, as discussed above for barium in silicon.

Concerning tantalum and tungsten in silicon, Hopkins et al. [12.51] and Davis [12.52] reported that the Ta degradation threshold contamination level for solar cell efficiency is about $1.5\times10^{11}$ cm$^{-3}$. Extrapolation of these data to modern IC-quality silicon using the factor of 25–100 described above would result in critical contamination levels in the range $10^{9}$–$10^{10}$ cm$^{-3}$. The same authors [12.51, 12.52] found that the W degradation threshold contamination level for solar cell efficiency is about $1.5\times10^{12}$ cm$^{-3}$, which, if extrapolated to the minority carrier lifetime typical of modern IC-quality silicon, would result in critical tungsten contamination levels in the range of $10^{10}$–$10^{11}$ cm$^{-3}$.

To our knowledge, there are no data on the generation lifetime of 4d, 5d, and rare earth metals in silicon.
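The extrapolation used throughout this section can be sketched as follows. The scaling factor is the square of the ratio of minority carrier diffusion lengths (100–200 µm in the 1980s versus about 1000 µm now), and the 1980s thresholds are the values quoted above for Zr, Nb, Ta, and W. The variable and function names are hypothetical, and the result is only an order-of-magnitude estimate.

```python
def lifetime_ratio(l_old_um, l_new_um):
    """Ratio of lifetimes, taken as the square of the diffusion length ratio."""
    return (l_new_um / l_old_um) ** 2


# Diffusion lengths quoted in Sect. 12.6: 100-200 um (1980s) vs. about 1000 um (modern)
FACTOR_LOW = lifetime_ratio(200.0, 1000.0)
FACTOR_HIGH = lifetime_ratio(100.0, 1000.0)

# Degradation thresholds from the 1980s solar cell studies, in cm^-3
THRESHOLDS_1980S = {"Zr": 1e12, "Nb": 1e12, "Ta": 1.5e11, "W": 1.5e12}

# (lower bound, upper bound) of the extrapolated critical concentration, in cm^-3
scaled = {
    metal: (conc / FACTOR_HIGH, conc / FACTOR_LOW)
    for metal, conc in THRESHOLDS_1980S.items()
}

for metal, (low, high) in scaled.items():
    print(f"{metal}: {low:.1e} to {high:.1e} cm^-3")
```

The scaled ranges fall within the decades quoted in the text, which is as much agreement as an estimate of this kind can be expected to give.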

12.7 Summarizing Discussion

Our literature survey revealed that very little is known about the physicochemical properties in silicon of the metals considered for fabrication of high-k dielectrics (Zr, Hf, Ba, Sr, Ti, Ce, Pr, Y, La, and Ta). A somewhat broader approach, which consisted of analyzing a wider range of 4d, 5d, and rare earth elements with electronic structure similar to that of the metals of interest, enabled us to determine the range of values in which the solubilities and diffusivities of these metals can be expected to lie.

The diffusivity of the metals of interest proved to be quite low, ranging from values close to the silicon self-diffusion coefficient (approximately $3\times10^{-14}$ cm$^2$/s at 1200 °C, $10^{-16}$ cm$^2$/s at 1000 °C, and less than $10^{-20}$ cm$^2$/s at 800 °C) to the upper limit of the range, which is approximately $10^{-7}$ cm$^2$/s at 1200 °C, $10^{-9}$ cm$^2$/s at 1000 °C, and $10^{-11}$ cm$^2$/s at 800 °C. While this range is quite wide, it allows one to estimate that the metals will diffuse a distance of approximately 1.5 nm to 5 µm during a 60 s anneal at 1000 °C, or 12 nm to 38 µm during a one-hour anneal at 1000 °C. This diffusion distance may be sufficient for the metals to penetrate the device layer and alter the device properties, unless the bonding of metals in high-k dielectrics is sufficiently strong to prevent penetration of metals into silicon in measurable quantities. Gettering techniques, which are based on diffusion of metals over a distance from several microns to several tens of microns, will generally be inefficient, since the expected diffusion length of the metals during an anneal reaches the distance between the devices and the gettering sites only in the most favorable cases.

The low diffusion coefficients of 4d, 5d, and rare earth metals make it very difficult to accurately determine their solubility in silicon. Indeed, to saturate the sample and reach the equilibrium state, the diffusion length of the metal

should be several times greater than the sample thickness. This requires extremely long annealing times. Although the data on the solubility of 4d, 5d, and rare earth elements in silicon are scarce, they indicate that the solubility of these metals may be comparable to that of iron and manganese in the same temperature range, i.e., may reach about $10^{14}$ cm$^{-3}$ at 1000 °C and $10^{16}$ cm$^{-3}$ at 1300 °C.

Energy levels of the metals considered for use in high-k dielectrics cannot be quantitatively predicted from the positions of the levels of other 4d or 5d elements or rare earth elements. However, there is a clearly observed general trend of strong electrical activity of heavy metals in silicon, which form multiple donor and acceptor levels in the band gap.

The strong electrical activity of these metals leads to their severe impact on the minority carrier lifetime. Indeed, comparative studies of Davis, Hopkins, Rohatgi et al. [12.51, 12.54] indicated that the recombination activity (degradation threshold of solar cell efficiency) of transition metals increases towards the lower left corner of the periodic table. Thus, one can expect that the detrimental effects of the heavy metals on minority carrier lifetime will also be significant in integrated circuit applications, and stronger than the effect of iron. It is important to note that strong recombination centers usually also act as strong generation centers, which may affect the performance of IC devices even more strongly than their recombination properties.

Acknowledgments. The authors would like to acknowledge helpful discussions with H. Huff and members of the SiWEDS industry/university cooperative research center, and financial support from International Sematech.

References

12.1. R.M. Wallace and G.D. Wilk, Semicond. Int. 24, 227 (2001)
12.2. G.D. Wilk, R.M. Wallace, and J.M. Anthony, J. Appl. Phys. 89, 5243 (2001)
12.3. R.M. Wallace and G.D. Wilk, Semicond. Int. 24, 153 (2001)
12.4. H.R. Huff, A. Agarwal, Y. Kim, L. Perrymore, D. Riley, J. Barnett, C. Sparks, M. Freiler, G. Gebara, B. Bowers, P.J. Chen, P. Lysaght, B. Nguyen, J.E. Lim, S. Lim, G. Bersuker, P. Zeitzoff, G.A. Brown, C. Young, B. Foran, F. Shaapur, A. Hou, C. Lim, H. Alshareef, S. Borthakur, D.J. Derro, R. Bergmann, L.A. Larson, M.I. Gardner, J. Gutt, R.W. Murto, K. Torres, and M.D. Jackson, Extended Abstracts of International Workshop on Gate Insulator, IWGI 2001 (IEEE Cat. No.01EX537), Japan Soc. Appl. Phys., 2 (2001)
12.5. S.A. Campbell, B. He, R. Smith, T. Ma, N. Hoilien, C. Taylor, and W.L. Gladfelter, in Chemical Processing of Dielectrics, Insulators and Electronic Ceramics, ed. by A.C. Jones, J. Veteran, D. Mullin, R. Cooper, and S. Kaushal, Mater. Res. Soc., Warrendale (2000), p. 23
12.6. R.K. Sharma, A. Kumar, and J.M. Anthony, JOM (USA) 53, 53 (2001)
12.7. Y. Lee, M. Park, and J. Song, Semicond. Int. 24, 267 (2001)


A.A. Istratov and E.R. Weber

12.8. S. Yamamichi, Y. Muramatsu, P.Y. Lesaicherre, and H. Ono, Jpn. J. Appl. Phys. 34, 5188 (1995)
12.9. H.J. Osten, J.P. Liu, H.J. Mussig, and P. Zaumseil, Microelectr. Reliab. 41, 991 (2001)
12.10. M. Quevedo-Lopez, M. El-Bouanani, S. Addepalli, J.L. Duggan, B.E. Gnade, R.M. Wallace, M.R. Visokay, M. Douglas, M.J. Bevan, and L. Colombo, Appl. Phys. Lett. 79, 2958 (2001)
12.11. W.-J. Qi, R. Nieh, B.H. Lee, L. Kang, Y. Jeon, and J.C. Lee, Appl. Phys. Lett. 77, 3269 (2000)
12.12. G.D. Wilk and R.M. Wallace, Appl. Phys. Lett. 74, 2854 (1999)
12.13. G.D. Wilk, R.M. Wallace, and J.M. Anthony, J. Appl. Phys. 87, 484 (2000)
12.14. G.D. Wilk and R.M. Wallace, Appl. Phys. Lett. 76, 112 (2000)
12.15. H. Bracht and H. Overhof, phys. stat. sol. (a) 158, 47 (1996)
12.16. N.A. Stolwijk, B. Schuster, and J. Holzl, Appl. Phys. A 33, 133 (1984)
12.17. U. Gösele, W. Frank, and A. Seeger, Appl. Phys. 23, 361 (1980)
12.18. E.R. Weber, Appl. Phys. A 30, 1 (1983)
12.19. F. Beeler and M. Scheffler, Mater. Sci. Forum 38–41, 257 (1989)
12.20. U. Wahl, A. Vantomme, J. De Wachter, R. Moons, G. Langouche, J.G. Marques, and J.G. Correia, Phys. Rev. Lett. 79, 2069 (1997)
12.21. M. Needels, M. Schluter, and M. Lannoo, Phys. Rev. B 47, 15533 (1993)
12.22. S. Libertino, S. Coffa, R. Mosca, and E. Gombia, J. Appl. Phys. 85, 2093 (1999)
12.23. M.K. Bakhdyrkhanov, F.M. Talipov, U.S. Dzhurabekov, S.B. Sultanova, and U. Egamov, Elektronnaya Technika – Materialy 5 (190), 79 (1984)
12.24. J.L. Benton, J. Michel, L.C. Kimerling, D.C. Jacobson, Y.H. Xie, D.J. Eaglesham, E.A. Fitzgerald, and J.M. Poate, J. Appl. Phys. 70, 2667 (1991)
12.25. O.V. Alexandrov, N.A. Sobolev, and E.I. Shek, Semicond. Sci. Technol. 10, 948 (1995)
12.26. S. Zainabidinov, D.E. Nazirov, A.Z. Akbarov, A.A. Iminov, and T.M. Toshtemirov, Tech. Phys. Letters 24, 71 (1998)
12.27. V.V. Emtsev, V.V. Emtsev Jr., D.S. Poloskin, E.I. Shek, and N.A. Sobolev, J. of Luminescence 80, 391 (1998)
12.28. V.V. Emtsev, V.V. Emtsev Jr., D.S. Poloskin, N.A. Sobolev, E.I. Shek, J. Michel, and L.C. Kimerling, Semiconductors 33, 603 (1999)
12.29. O.V. Aleksandrov, A.O. Zahhar’in, N.A. Sobolev, E.I. Shek, M.I. Makoviichuk, and E.O. Parshin, Sov. Phys. Semicond. 32, 921 (1998)
12.30. K. Graff, Metal Impurities in Silicon-Device Fabrication, 2nd edn, Springer, Berlin (2001)
12.31. F.Y.G. Ren, J. Michel, Q. Sun-Paduano, B. Zheng, H. Kitagawa, D.C. Jacobson, J.M. Poate, and L.C. Kimerling, in Rare Earth Doped Semiconductors, ed. by G.S. Pomrenke, P.B. Klein, and D.W. Langer, Mater. Res. Soc., Pittsburgh (1993), p. 87
12.32. I. Ferrin, G. Bemski, and W. Parker, Phys. Lett. A 32, 65 (1970)
12.33. O.F. Vyvenko, R. Sachdeva, A.A. Istratov, E.R. Weber, P.N.K. Deenapanray, C. Jagadish, Y. Gao, and H.R. Huff, in Semiconductor Silicon-2002, ed. by H.R. Huff, L. Fabry, and S. Kishino, The Electrochemical Society, Pennington (2002), p. 440
12.34. K. Graff, Metal Impurities in Silicon-Device Fabrication, Springer, Berlin (1995)


12.35. D.E. Nazyrov, V.P. Usacheva, G.S. Kulikov, and R.S. Malkovich, Pis’ma Zh. Tekh. Fiz. (USSR) 14, 483 (1988)
12.36. H. Boubekeur, J. Hopfner, T. Mikolajick, C. Dehm, L. Frey, and H. Ryssel, J. Electrochem. Soc. 147, 4297 (2000)
12.37. D.E. Nazyrov, G.S. Kulikov, and R.S. Malkovich, Sov. Phys. Semicond. 25, 997 (1991)
12.38. V.V. Ageev, N.S. Aksenova, V.N. Kokovina, and E.P. Troshina, Izvestia Leningradskogo Electrotechnicheskogo Instituta 211, 80 (1977)
12.39. S. Roberts and G. Parker, Mater. Lett. 24, 307 (1995)
12.40. D.E. Nazyrov, G.S. Kulikov, and R.S. Malkovich, Tech. Phys. Letters 23, 68 (1997)
12.41. C. Fu and Y. Lu, Chin. Phys. (USA) 5, 527 (1985)
12.42. V.A. Uskov, A.I. Rodionov, G.T. Vlasenko, and A.B. Fedotov, in Doped Semiconductors, ed. by N.Kh. Abrikosov and V.S. Zemskov, Nauka, Moscow (1985), p. 80
12.43. T. Hamaguchi and Y. Hayamizu, Jpn. J. Appl. Phys. (Letters) 30, L1837 (1991)
12.44. J.L. Benton, D.C. Jacobson, B. Jackson, J.A. Johnson, T. Boone, D.J. Eaglesham, F.A. Stevie, and J. Becerro, J. Electrochem. Soc. 146, 1929 (1999)
12.45. H. Bracht, E.E. Haller, and R. Clark-Phelps, Phys. Rev. Lett. 81, 393 (1998)
12.46. H. Francois-Saint-Cyr, E. Anoshkina, F. Stevie, L. Chow, K. Richardson, and D. Zhou, J. Vac. Sci. Technol. B 19, 1769 (2001)
12.47. H. Lemke, phys. stat. sol. (a) 122, 617 (1990)
12.48. V.V. Voronkov, G.I. Voronkova, M.I. Iglitsyn, and A.G. Salmanov, Sov. Phys. Semicond. 8, 1277 (1974)
12.49. A.A. Lebedev, N.A. Sultanov, and P. Yusupov, Sov. Phys. Semicond. 14, 342 (1980)
12.50. M. Aoki, T. Itakura, and N. Sasaki, Jpn. J. Appl. Phys. 34, 712 (1995)
12.51. R.H. Hopkins and A. Rohatgi, J. Cryst. Growth 75, 67 (1985)
12.52. J.R. Davis, A. Rohatgi, R.H. Hopkins, P.D. Blais, P. Rai-Choudhury, J.R. McCormic, and H.C. Mollenkopf, IEEE Trans. Electron. Dev. ED–27, 677 (1980)
12.53. A. Rohatgi, R.H. Hopkins, J.R. Davis, R.B. Campbell, and H.C. Mollenkopf, Sol. St. Electron. 23, 1185 (1980)
12.54. A. Rohatgi, J.R. Davis, R.H. Hopkins, and P.G. McMullin, Sol. St. Electron. 26, 1039 (1983)
12.55. M.L. Polignano, C. Bresolin, G. Pavia, V. Soncini, F. Zanderigo, G. Queirolo, and M. di Dio, Mater. Sci. Eng. B 53, 300 (1998)
12.56. J.P. Kalejs, B.R. Bathey, J.T. Borenstein, and R.W. Stormont, in Twenty Third IEEE Photovoltaic Specialists Conference, IEEE, Louisville, KY (1993), p. 184
12.57. J.T. Borenstein, B.R. Bathey, J.P. Kalejs, J.I. Hanoka, and N.O. Pearce, in Twenty Second IEEE Photovoltaic Specialists Conference, IEEE, Las Vegas, NV, USA (1991), p. 1006
12.58. A. Sandhu, T. Ogikubo, H. Goto, V. Csapo, and T. Pavelka, J. Cryst. Growth 210, 116 (2000)
12.59. K. Schmalz, H.G. Grimmeiss, H. Pettersson, and L. Tilly, in Defect Engineering in Semiconductor Growth, Processing and Device Technology, ed. by S. Ashok, J. Chevallier, K. Sumino, and E. Weber, Mater. Res. Soc., Pittsburgh (1992), p. 489


12.60. H. Pettersson, H.G. Grimmeiss, L. Tilly, K. Schmalz, K. Tittelbach, and H. Kerkow, Semicond. Sci. Technol. 6, 237 (1991)
12.61. M. Schulz, Appl. Phys. 4, 225 (1974)
12.62. J. Zhou, X. Ji, S. Li, J. Wu, J. Gao, and Z. Han, Mater. Sci. Forum 38–41, 457 (1989)
12.63. H. Pettersson, H.G. Grimmeiss, L. Tilly, K. Schmalz, and H. Kerkow, Semicond. Sci. Technol. 8, 1247 (1993)
12.64. H.H. Busta and H.A. Waggener, J. Electrochem. Soc. 124, 1424 (1977)
12.65. A.G. Milnes, Deep Impurities in Semiconductors, Wiley-Interscience, Chichester, Sussex, UK (1973)
12.66. Y.A. Zibuts, L.G. Paritskii, and S.M. Ryvkin, Sov. Phys. Solid State 5, 2416 (1964)
12.67. Y.A. Zibuts, L.G. Paritskii, S.M. Ryvkin, and Z.G. Dokholyan, Sov. Phys. Solid State 8, 2041 (1967)
12.68. Y. Fujisaki, T. Ando, H. Kozuka, and Y. Takano, J. Appl. Phys. 63, 2304 (1988)
12.69. S. Boughaba and D. Mathiot, J. Appl. Phys. 69, 278 (1991)
12.70. T. Ando, S. Isomae, and C. Munakata, J. Appl. Phys. 70, 5401 (1991)
12.71. O.V. Aleksandrov, V.V. Emtsev, D.S. Poloskin, N.A. Sobolev, and E.I. Shek, Semiconductors 28, 1126 (1994)
12.72. G.I. Voronkova, M.I. Iglitsyn, and A.R. Salmanov, Sov. Phys. Semicond. 9, 328 (1975)
12.73. S. Pizzini, L. Bigoni, M. Beghi, and C. Chemelli, J. Electrochem. Soc. 133, 2363 (1986)
12.74. A. Rohatgi, J.R. Davis, R.H. Hopkins, P. Rai-Choudhury, and P.G. McMullin, Sol. St. Electron. 23, 415 (1980)
12.75. J.H. Reiss, R.R. King, and K.W. Mitchell, Appl. Phys. Lett. 68, 3302 (1996)
12.76. D.E. Hill, H.W. Gutsche, M.S. Wang, K.P. Gupta, W.F. Tucker, J.D. Dowdy, and R.J. Crepin, in Twelfth IEEE Photovoltaic Specialists Conference, Baton Rouge, LA, USA (1976), p. 112

13 High-k Gate Dielectric Deposition Technologies J.P. Chang

The need to replace silicon dioxide and silicon oxynitride with a thicker dielectric layer of higher permittivity is addressed throughout this book. The transition to a high-k dielectric material represents a fundamental change in chemical processing towards deposited dielectrics and away from dielectrics that can be thermally grown on crystalline silicon. To ensure good electrical performance of the resulting devices, the deposited dielectrics must have excellent thickness uniformity and superior interfacial and bulk properties. To accomplish these goals, a careful selection and a thorough understanding of both the chemically reactive precursors and the processing reactors are critical. This chapter discusses the various technologies used for depositing high-k dielectrics on silicon, with an emphasis on those techniques thought to be the most manufacturable:

1. Atomic Layer Deposition
2. Chemical Vapor Deposition
3. Plasma Enhanced Atomic Layer Deposition
4. Plasma Enhanced Chemical Vapor Deposition
5. Physical Vapor Deposition
6. Molecular Beam Epitaxy
7. Ion Beam Assisted Deposition
8. Sol-gel Deposition

This list is by no means comprehensive, but it reflects the key technologies widely used for exploring the deposition of high-k dielectric materials. Each of these techniques has its own unique features, advantages, and disadvantages. In this chapter, the fundamental surface reaction mechanisms underlying each deposition technique are illustrated with examples of various precursor chemistries. Manufacturability is assessed through the reactor design, chemical delivery systems, and the processing throughput. The focus is placed on simple metal oxides, but alloyed metal oxides and doped metal oxides will also be discussed. Three other important areas are not detailed in this chapter because of its limited length. The first is the integrated processing concept that combines in-situ pre-deposition surface preparation [13.1, 13.2], high-k deposition, post-deposition annealing [13.3], and the deposition of gate electrodes [13.4]. Second, an effective etching chemistry needs to be devised


in parallel to pattern these high-k dielectric materials while avoiding undercutting or leaving residual metal oxides at the source/drain regions during the fabrication of the metal-oxide-semiconductor field effect transistors (MOSFET) [13.5–13.8]. Finally, although the focus of this chapter is to address the applications of these high-k dielectric materials as a replacement for SiO2 in MOSFET applications, many of the techniques are also viable solutions for depositing highly conformal dielectric layers on trench capacitors with high aspect ratios that are greatly needed for the dynamic random access memory (DRAM) devices in the gigabyte regime [13.9, 13.10].

13.1 Atomic Layer Deposition

13.1.1 Technology Description

Atomic layer deposition (ALD), also known as atomic layer epitaxy (ALE) [13.11, 13.12], typically refers to a binary chemical reaction sequence in which the surface reaction of each chemical precursor is self-limiting [13.13–13.16]. Two main features distinguish ALD from other chemical processes:

– The deposition process consists of a sequence of discrete self-limiting process steps.
– Each self-limiting step must be dominant and lead to monolayer saturation.

Specifically, in each half reaction, a gas phase chemical precursor reacts with a surface functional group to generate a volatile product, and the reaction proceeds until all the surface functional groups are consumed and replaced with the second chemical functional group. Since each half reaction is self-limiting, growth of the thin film beyond one monolayer is not possible once all the surface functional groups have reacted. Thin films are thus deposited by repetitive application of a single-layer deposition sequence (or less than a layer, due to the steric hindrance of the precursor molecules). The reaction kinetics for each sequence involve several gas-surface interactions: a) adsorption of the precursor molecules onto the substrate, b) surface reaction, c) desorption of the adsorbed molecules, and d) desorption of the gaseous reaction products or by-products. Typically, low-temperature and mildly oxidizing processes are required for ALD, in order to avoid pyrolyzing the precursor or oxidizing the interface between these high-k materials and the silicon substrate. Thus, an exothermic chemical reaction with a small activation energy is essential for ALD processing. Since the reaction rate constant, k, is related to the collision cross-sections between the gas-phase molecules and the surface, a thermally activated surface reaction can be characterized as:

\[
k = \frac{k_B T}{h} \exp\left(\frac{-\Delta G}{RT}\right)
\]

where kB is the Boltzmann constant, h is the Planck constant, T is the temperature, R is the gas constant, and the ratio kB T/h is the frequency factor. Since the change in Gibbs free energy and the activation energy, Ea, can be related to the enthalpy and the entropy by ∆G = ∆H − T∆S and Ea ≈ ∆H + RT, the rate constant can be rearranged into the Arrhenius form:

\[
k = \frac{k_B T}{h} \exp\left(\frac{\Delta S}{R}\right) \exp\left(\frac{-\Delta H}{RT}\right) \rightarrow A \exp\left(\frac{-E_a}{RT}\right)
\]

where A is a lumped pre-exponential constant. An accurate measurement of the reaction activation energy would aid the determination of the reaction pathways.

Because of its ability to deposit extremely smooth and conformal thin films [13.17], ALD has been used to deposit metals [13.18, 13.19], metal oxides [13.20–13.22], metal nitrides [13.23–13.25], semiconductors [13.26, 13.27], transparent conductive oxides [13.28, 13.29], and ferroelectric materials [13.30, 13.31]. ALD also enables the engineering of binary and mixed metal oxides, which have more complex dielectric properties as potential high-k materials. Before we discuss the detailed chemical reactions in ALD, it is important to note that it is also possible to achieve atomic layer deposition from a simple self-limiting chemisorption process. In this case, the precursor molecules chemisorb and saturate the surface, and subsequently exchange their ligands with another precursor molecule to start forming the film. In this chapter, however, we focus on ALD reactions that are dominated and governed by sequential surface chemical reactions.

13.1.2 Chemical Reaction Mechanisms and Precursors

To illustrate the ALD of high-k dielectric materials, which are typically metal oxides, we use MLx to denote a metal-containing precursor and H2O as the oxidizing agent to describe the deposition of a high-k dielectric material, MOx/2, on a substrate. These chemicals are introduced into the reactor one at a time, with an inert gas purge in between.
The overall reaction of these two reactants results in the formation of metal oxides and volatile reaction products, LH:

\[
\mathrm{ML}_x + \frac{x}{2}\,\mathrm{H_2O} \rightarrow \mathrm{MO}_{x/2} + x\,\mathrm{LH}
\]

Note that M represents a metal, and L is a molecular ligand that makes MLx volatile. To control the ALD process, the starting silicon surface needs to be prepared to have the desired functional group for reaction. This process is sometimes called surface activation, and it is designed to introduce the appropriate surface functional groups to initiate ALD. The ability to initiate the

Fig. 13.1. Schematic of atomic layer deposition with alternating MLx and H2O exposures (pressure versus time for the metal-precursor and oxidant pulses, with times t1 to t4 marked)

surface allows the first layer to be deposited and the layer-by-layer growth to follow, leading to a smooth transition from the substrate to the bulk of the thin film. Since H2O is commonly used as an oxidant in ALD, it is often desirable to activate the surface with H2O, generating hydroxyl groups (–OH) on the surface. These surface hydroxyl groups serve as reactive sites for the subsequent chemical reactions. This initialization step can take a few chemical pulses to complete, due to the steric hindrance effect and the activation energy required for the reaction to occur. It is sometimes observed as an “incubation” time before the layer-by-layer growth begins. Alternatively, chemical oxides, in which abundant –OH groups are present, have been used as an initialization layer [13.2]. Once the surface is activated, excess H2O precursor is removed and the reactive surface is exposed to MLx, as shown in Fig. 13.1. The subsequent surface reaction should proceed quickly and self-saturate:

\[
\text{–OH} + \mathrm{ML}_x \rightarrow \text{–O–ML}_{x-1} + \mathrm{LH}
\]

Once all the –OH groups are consumed and the surface is terminated with the L-ligands, no more surface reaction occurs. H2O is then introduced to react with these L-ligands, forming LH as the volatile product, regenerating the surface hydroxyl groups, and forming the –O–M–O– bonds that are the building blocks of high-k metal oxide materials:

\[
\text{–O–ML}_{x-1} + (x-1)\,\mathrm{H_2O} \rightarrow \text{–O–M–(OH)}_{x-1} + (x-1)\,\mathrm{LH}
\]

Note that the two chemical pulses (H2O and MLx) can be varied independently in terms of their duration and concentrations (pressures). An important observation is that, at this time, the surface is “restored” to almost its initial state after the surface activation (Fig. 13.1). This sequence


of surface reactions that restores the surface to its starting point is called one ALD deposition cycle. This restored surface activity allows the sequential reactions to occur and complete the subsequent ALD cycles. Since more than one –OH could be formed per metal atom at this stage, the restored surface is not exactly the same as the initially activated surface. Thus the transition region from the substrate into the MOx/2 film may have a slightly different density and deposition rate per cycle compared to the bulk MOx/2 film. Once complete surface coverage of the initial atomic layer is achieved, the deposition thickness scales linearly with the number of deposition cycles. It is also important to note that not all the hydroxyl groups serve as subsequent reaction sites with the metal-containing precursors, due to the steric hindrance effect. Some of these –OH groups might react to eliminate water, and this dehydroxylation process can reduce the density of available –OH groups. In addition, the loss of excess –OH ligands by dehydroxylation is responsible for cross-linking the film to achieve densely packed, higher quality films.

If the surface is not properly activated, nucleation can occur on defect sites, and nonvolatile reaction products can re-deposit at these seeding sites to form islands. As the deposition proceeds, these islands grow into grains, which finally coalesce into films that are usually neither uniform nor continuous. It has been shown that without surface activation, ALD films grow inconsistently with significant texture and roughness.

For high-k metal oxide deposition, there are two major families of precursors: the metal-containing precursors and the oxidizing agents. The selection criteria for ALD precursors include:

– High volatility
– Aggressive surface reactivity
– High purity
– Minimal self-decomposition
– No etching of the deposited films
– No dissolution into the deposited films
– Inexpensive
– Easy to handle
– Non-toxic
– Environmentally benign
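The self-saturating half-reactions and the linear thickness scaling described earlier in this section can be illustrated with a toy model. This is only a sketch under stated assumptions: the first-order (Langmuir-type) saturation law, the zero-growth treatment of incubation cycles, and all numbers are illustrative choices, not taken from the chapter.

```python
import math

def site_coverage(exposure: float, saturation_exposure: float) -> float:
    """Fraction of surface sites reacted after one precursor pulse.

    Assumes first-order (Langmuir-type) kinetics, an illustrative choice.
    """
    return 1.0 - math.exp(-exposure / saturation_exposure)

def film_thickness(cycles: int, saturated_growth: float,
                   coverage: float = 1.0, incubation_cycles: int = 0) -> float:
    """Thickness after `cycles`, with no growth during incubation (idealized)."""
    effective_cycles = max(0, cycles - incubation_cycles)
    return effective_cycles * saturated_growth * coverage

# Coverage approaches 1 once the exposure exceeds the saturation requirement.
for exposure_ratio in (0.5, 1.0, 5.0, 10.0):
    theta = site_coverage(exposure_ratio, 1.0)
    print(f"exposure = {exposure_ratio:4.1f} x reference: coverage = {theta:.3f}")

# Hypothetical numbers: 1 Angstrom per cycle at saturation, 2 incubation cycles.
for n in (0, 2, 10, 20):
    print(n, "cycles ->", film_thickness(n, 1.0, incubation_cycles=2), "Angstrom")
```

In this picture, lengthening a pulse beyond saturation adds nothing to the growth per cycle, so film thickness is set by the cycle count alone.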

The metal-containing precursors in particular must be volatile and thermally stable, whether they are gaseous, liquid, or solid. They must also have high reactivity in order to complete the ALD cycle in a short time and ensure the deposition throughput. Therefore, precursors resulting in a larger negative change in the free energy of reaction (∆G) are favored. The non-metal precursors are usually NH3 or oxidants (O2, O3, H2O, H2O2) that react with the metal and/or silicon precursors to form metal oxides, metal


oxynitrides, or metal silicates. Table 13.1 summarizes the chemicals widely used for ALD of high-k metal oxide materials. The interaction between these three families of chemicals should lead to protonation, combustion, or elimination of the ligands, with minimal side reactions such as etching of the deposited film or redeposition of the reaction by-products. Combinations of these precursors have led to the deposition of a wide variety of dielectric materials, such as Si3N4 [13.32], Al2O3 [13.33], ZrO2 [13.34], HfO2 [13.35], TiO2 [13.36], Ta2O5 [13.37], ZrSnxTiyOz [13.38], and BaSrTiO3 [13.39].

In the selection of oxidizing agents, water is the most popular oxygen source, and its role is clearly illustrated in the earlier example. Hydrogen peroxide is also considered a viable oxidizing agent, though it is not always desirable due to its very high reactivity [13.40]. In the case of H2O2 reacting with surface halides, both hydrogen halide and halogen molecules can be generated [13.36]. Molecular oxygen has also been used as an oxygen precursor to deposit TiO2 with titanium iodide [13.41], likely due to the reaction between the slightly unstable metal precursors and oxygen. Ozone is another reactive oxidizing agent that has been used to deposit Al2O3 with trimethyl aluminum [13.42]. The major concern with the strong oxidizing character of some oxidants is the oxidation of the underlying silicon, resulting in the formation of an unwanted SiO2 interfacial layer.

Metal halides, especially chlorides [13.43–13.45], are one of the most widely used families of ALD precursors for depositing dielectric thin films, including SiO2 and high-k materials. For depositing SiO2 and HfO2, the overall surface reactions and the two half-reactions are:

\[
\begin{aligned}
\mathrm{SiCl_4} + 2\,\mathrm{H_2O} &\rightarrow \mathrm{SiO_2} + 4\,\mathrm{HCl} && \text{(overall)} \\
\text{–OH} + \mathrm{SiCl_4} &\rightarrow \text{–O–Si–Cl}_3 + \mathrm{HCl} \\
\text{–O–Si–Cl}_3 + 3\,\mathrm{H_2O} &\rightarrow \text{–O–Si–(OH)}_3 + 3\,\mathrm{HCl}
\end{aligned}
\]

\[
\begin{aligned}
\mathrm{HfCl_4} + 2\,\mathrm{H_2O} &\rightarrow \mathrm{HfO_2} + 4\,\mathrm{HCl} && \text{(overall)} \\
\text{–OH} + \mathrm{HfCl_4} &\rightarrow \text{–O–Hf–Cl}_3 + \mathrm{HCl} \\
\text{–O–Hf–Cl}_3 + 3\,\mathrm{H_2O} &\rightarrow \text{–O–Hf–(OH)}_3 + 3\,\mathrm{HCl}
\end{aligned}
\]

Table 13.1.
Chemical precursors for ALD of high-k dielectric materials

(a) Oxidizing chemical agents:
– H2O
– H2O2
– O2
– O3

(b) Silicon-containing chemical precursors:
– SiH4
– SiCl4
– Si(OC2H5)4
– Si(OC(CH3)3)3OH

(c) Metal-containing chemical precursors (molecular structure drawings not reproduced):

Chemicals              | Terms                           | Examples
Metal halides          | MXy                             | ZrCl4, HfCl4, WF6, TiI4
Metal alkyls           | MRy                             | Al(CH3)3
Metal alkoxides        | M(OR)y                          | Zr[OC(CH3)3]4, Hf[OC(CH3)3]4
Metal alkylamides      | M(NR2)y                         | Zr[N(C2H5)2]4, Hf[N(CH3)2]4, Hf[N(CH3)(C2H5)]4
Metal nitrates         | M(NO3)y                         | Zr(NO3)4, Hf(NO3)4
Metal β-diketonates    | M(thd)y, M(acac)y, M(thd)y(OR)z | Zr(O2C11H19)4, Hf(O2C5H7)4
Metal cyclopentadienyl | M(CpRx)y, M(CpRx)yRz            | Zr(Cp)2(CH3)2, Hf(Cp)2(CH3)2, Ba(Cp-(CH3)5)2

Note: thd = tris-tetramethylheptanedione; acac = acetylacetonate; Cp = C5H5.
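The overall halide reactions given before Table 13.1 can be checked for atom balance with a few lines of code. The following is a minimal sketch, not part of the original chapter: the helper names are hypothetical, and the parser handles element symbols, integer subscripts, and round parentheses only (the square-bracket formulas in the table would need to be rewritten with parentheses).

```python
import re
from collections import Counter

def count_atoms(formula: str) -> Counter:
    """Count atoms in a formula such as 'Si(OC2H5)4' (no leading coefficient)."""
    tokens = re.findall(r"[A-Z][a-z]?|\(|\)|\d+", formula)
    stack = [Counter()]
    i = 0
    while i < len(tokens):
        token = tokens[i]
        i += 1
        if token == "(":
            stack.append(Counter())
            continue
        # An element symbol or ")" may be followed by an integer multiplier.
        multiplier = 1
        if i < len(tokens) and tokens[i].isdigit():
            multiplier = int(tokens[i])
            i += 1
        if token == ")":
            group = stack.pop()
            for element, n in group.items():
                stack[-1][element] += n * multiplier
        else:
            stack[-1][token] += multiplier
    return stack[0]

def side_total(side) -> Counter:
    """Sum atoms over (coefficient, formula) pairs on one side of a reaction."""
    total = Counter()
    for coefficient, formula in side:
        for element, n in count_atoms(formula).items():
            total[element] += coefficient * n
    return total

def is_balanced(reactants, products) -> bool:
    return side_total(reactants) == side_total(products)

REACTIONS = {
    "SiCl4 + 2 H2O -> SiO2 + 4 HCl": ([(1, "SiCl4"), (2, "H2O")], [(1, "SiO2"), (4, "HCl")]),
    "HfCl4 + 2 H2O -> HfO2 + 4 HCl": ([(1, "HfCl4"), (2, "H2O")], [(1, "HfO2"), (4, "HCl")]),
}
for name, (lhs, rhs) in REACTIONS.items():
    print(name, "balanced:", is_balanced(lhs, rhs))
```

The same check can be applied to any other overall reaction by adding its coefficient and formula pairs to the dictionary.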


To confirm these reaction steps and reduce the processing temperature, strong Lewis bases, such as pyridine (C5H5N) and ammonia, have been used to catalyze the surface reactions [13.15, 13.46]. These catalysts are shown to initiate immediate SiO2 film growth and to reduce the required reaction temperatures from > 300 °C to 27 °C and the saturation reactant exposures from ∼10⁹ L to ∼10⁴–10⁵ L (L = langmuir). The catalytic activation is proposed to take place through the strong interaction of Si–OH surface species with a strong Lewis base. The hydrogen bonding in such an interaction weakens the SiO–H bond and increases the nucleophilicity of the O atom, which in turn effectively attacks the electron-deficient Si in SiCl4. Pyridine also accelerates the reaction by hydrogen bonding with the H2O reactant, again increasing the nucleophilicity of the O atom, which in turn effectively attacks the electron-deficient Si on the surface.

Fluorides [13.47] and iodides [13.48] have also been used in ALD of oxides; for example, tungsten oxofluorides are used to deposit tungsten oxide, WO3 [13.11, 13.49], while zirconium tetraiodide and H2O2 are used to deposit zirconium oxide [13.40].

Metal alkyls are also common ALD precursors, especially for II–VI compound semiconductors (ZnS, CdS) [13.50–13.52]. The most widely studied metal alkyl precursor for depositing high-k materials is trimethyl aluminum, which is used with H2O to deposit Al2O3 [13.53, 13.54]:

\[
\begin{aligned}
2\,\mathrm{Al(CH_3)_3} + 3\,\mathrm{H_2O} &\rightarrow \mathrm{Al_2O_3} + 6\,\mathrm{CH_4} && \text{(overall)} \\
\text{–OH} + \mathrm{Al(CH_3)_3} &\rightarrow \text{–O–Al–(CH}_3)_2 + \mathrm{CH_4} \\
\text{–O–Al–(CH}_3)_2 + 2\,\mathrm{H_2O} &\rightarrow \text{–O–Al–(OH)}_2 + 2\,\mathrm{CH_4}
\end{aligned}
\]

Metal alkoxides form an important family of precursors that have been successfully applied in depositing a wide variety of high-k dielectric materials. Note that metal alkoxide precursors can serve as both metal and oxygen sources. For example, it is found that there is a threshold temperature for ZrO2 deposition using zirconium t-butoxide as the precursor.
At the onset of the chemisorption of the precursor molecule, ∼330–350 °C, the reaction was thermally activated with an activation energy of 29 kcal/mol, consistent with a β-hydride elimination mechanism [13.55, 13.56]. Once the precursor adsorbed on the surface via a hydroxyl group and t-butanol was eliminated, the remaining t-butoxyl species underwent β-hydride elimination to produce isobutylene and generate new hydroxyl groups on the ZrO2 surface. Additional zirconium t-butoxide can thus adsorb on these newly generated hydroxyl groups:

\[
\begin{aligned}
\text{–OH} + \mathrm{Zr(OC_4H_9)_4} &\rightarrow \text{–O–Zr(OH)}_3 + \mathrm{C_4H_9OH} + 3\,\mathrm{(CH_3)_2C{=}CH_2} && \text{(overall)} \\
\text{–OH} + \mathrm{Zr(OC_4H_9)_4} &\rightarrow \text{–O–Zr–(OC}_4\text{H}_9)_3 + \mathrm{C_4H_9OH} \\
\text{–O–Zr–(OC}_4\text{H}_9)_3 &\rightarrow \text{–O–Zr–(OH)}_3 + 3\,\mathrm{(CH_3)_2C{=}CH_2}
\end{aligned}
\]

Thus oxygen in the precursor is the oxygen source in this reaction. Additional molecular oxygen can be introduced in between the metal-containing precursor pulses to fully oxidize the film and remove the re-deposited hydrocarbon impurities. It is important to note that the conventional CVD process occurs at elevated temperatures when the metal alkoxide precursors thermally decompose.

Metal alkoxide precursors also serve as the oxygen source in depositing mixed metal oxides, when combined sequentially with the metal chloride chemistry. For example, hafnium aluminate can be deposited using hafnium tetrachloride and aluminum tri-ethoxide:

\[
\begin{aligned}
3\,\mathrm{HfCl_4} + 4\,\mathrm{Al(OC_2H_5)_3} &\rightarrow \mathrm{Hf_3Al_4O_{12}} + 12\,\mathrm{C_2H_5Cl} && \text{(overall)} \\
\text{–Cl} + \mathrm{Al(OC_2H_5)_3} &\rightarrow \text{–O–Al–(OC}_2\text{H}_5)_2 + \mathrm{C_2H_5Cl} \\
\text{–O–Al–(OC}_2\text{H}_5)_2 + 2\,\mathrm{HfCl_4} &\rightarrow \text{–O–Al–(O–HfCl}_3)_2 + 2\,\mathrm{C_2H_5Cl}
\end{aligned}
\]

Similarly, metal silicates can be deposited using silicon alkoxides and metal halides [13.57, 13.58]. It is found that the reaction threshold temperature is lower when the alkyl group in the alkoxide is more branched. The deposition temperatures need to be kept low enough that thermal self-decomposition of the alkoxides does not occur. Since metal and oxygen form strong bonds with short bond lengths in transition metal alkoxides, likely due to the π-bonding of the oxygen p-orbital to the metal d-orbital, alkoxides are less oxidizing than the other oxidants mentioned above.

Metal alkyl amides have been widely used as CVD precursors for depositing nitride thin films and employed in ALD to deposit metal nitrides as barrier layers for back-end integration. When they are combined sequentially with H2O or O3, metal oxides can be deposited [13.59, 13.60]:

\[
\begin{aligned}
\mathrm{Zr(N(C_2H_5)_2)_4} + 2\,\mathrm{H_2O} &\rightarrow \mathrm{ZrO_2} + 4\,\mathrm{HN(C_2H_5)_2} && \text{(overall)} \\
2(\text{–OH}) + \mathrm{Zr(N(C_2H_5)_2)_4} &\rightarrow (\text{–O–})_2\mathrm{Zr}\text{–}\mathrm{(N(C_2H_5)_2)_2} + 2\,\mathrm{HN(C_2H_5)_2} \\
(\text{–O–})_2\mathrm{Zr}\text{–}\mathrm{(N(C_2H_5)_2)_2} + 2\,\mathrm{H_2O} &\rightarrow (\text{–O–})_2\mathrm{Zr}\text{–}\mathrm{(OH)_2} + 2\,\mathrm{HN(C_2H_5)_2}
\end{aligned}
\]

Other chemistries under investigation include β-diketonates and metal cyclopentadienyls. β-diketonates are usually used to deposit metal compounds, but can also be combined with an oxidant to deposit high-k dielectric materials. For example, La(thd)3 and ozone are used to deposit La2O3 [13.61] as a potential high-k dielectric material, while Ba(Cp(CH3)5)2, Ti(OCH(CH3)2)4, and H2O are used to deposit BaTiO3 [13.29].
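An activation energy such as the 29 kcal/mol quoted above for zirconium t-butoxide fixes how steeply the rate constant varies with temperature through the Arrhenius form k = A exp(−Ea/RT) of Sect. 13.1.1. The sketch below assumes a temperature-independent pre-exponential factor A, so that it cancels in the ratio; applying it across the 330–350 °C window is an illustration only, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol K)

def arrhenius_ratio(ea_kcal_mol: float, t1_celsius: float, t2_celsius: float) -> float:
    """Ratio k(T2)/k(T1) for k = A*exp(-Ea/(R*T)) with A assumed constant."""
    t1 = t1_celsius + 273.15
    t2 = t2_celsius + 273.15
    return math.exp(-ea_kcal_mol / R_KCAL_PER_MOL_K * (1.0 / t2 - 1.0 / t1))

# Activation energy quoted in the text; temperature window chosen for illustration.
ratio = arrhenius_ratio(29.0, 330.0, 350.0)
print(f"k(350 C) / k(330 C) = {ratio:.2f}")
```

Under these assumptions the rate constant roughly doubles over the 20 °C window, which shows why a thermally activated process with this barrier has a fairly sharp onset temperature.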
To facilitate a better understanding of the ALD processes, quantum chemical studies have enabled the determination of the formation energies and the activation energies in the half reactions of an ALD cycle [13.62]. These calculations can help design novel high-k dielectric materials that can be deposited at lower temperatures. Feature scale models have also been constructed using both the Boltzmann transport equation and chemistry models to help optimize the reaction throughput [13.63].

13.1.3 Processing Reactors and Chemical Delivery System

An ideal ALD reactor should have a high throughput and minimal contamination, and be simple, reliable, inexpensive, and user-friendly. The reactor flow


design for ALD processing is simpler than for other CVD processes because ALD operates in a surface reaction control mode. Therefore the gas flow pattern does not play an important role except for optimizing the wafer throughput. However, the deposition rate per cycle of most ALD processes is temperature-dependent, and thus uniform temperature control could still be critical. In addition to the reactor design, processing diagnostics are required to better understand and optimize the ALD deposition. Spectroscopic techniques are often used in characterizing the composition, crystallinity, and the thermal stability of thin films, and some of them have only recently been used to characterize these ultra-thin films with success:

– X-ray photoelectron spectroscopy (XPS)
– Rutherford backscattering spectroscopy (RBS)
– Electron energy loss spectroscopy (EELS) [13.64]
– X-ray diffraction (XRD)
– Medium energy ion scattering (MEIS) [13.65]
– Low energy ion scattering spectroscopy (ISS) [13.66]
– Temperature programmed desorption (TPD)
– High resolution transmission electron microscopy (HRTEM)

Real-time monitoring of the reactions also assists the understanding of the ALD processes, and two common chemical diagnostic tools for ALD are:

– Quartz crystal microbalance (QCM)
– Quadrupole mass spectrometry (QMS)

For chemical delivery, depending on whether the precursor is a liquid or a solid, a bubbler setup or an evaporator can be used to introduce the chemical vapor into the reactor. An inert carrier gas flows from the reactant lines and the flow tube into the pumping system. Reactive precursors can be alternately pulsed into the reactor with an inert gas purge in between. The precursor pulse duration is typically set above the saturation requirement, and the purge time is optimized to effectively remove the chemical precursors before the next chemical pulse. Typically, the deposition is carried out at a pressure of 200 mTorr to 2 Torr and a substrate temperature of 100–400 °C, and the resulting deposition rate is 0.5–3 Å/cycle. One of the popular designs of an ALD reactor is a hot wall reactor, as shown schematically in Fig. 13.2. The reactor flow tube is resistively heated to the desired deposition temperature. Wafers inside the flow tube are heated by convection and radiation from the hot wall. This simple design allows multiple wafers to be processed simultaneously. Figure 13.2 also shows that the atomic layer deposition can be accurately monitored by quartz crystal microbalance and quadrupole mass spectrometry. Alternatively, the heating can be provided through an array of infrared (IR) lamps to precisely control the reaction temperature on the wafer, as shown in Fig. 13.3. The IR irradiates through a quartz window to provide

[Fig. 13.2 graphic: (a) reactor schematic labeled with heaters, orifice, substrates, sources, QMS, QCM, reaction chamber, and pump; (b) QCM weight change (a.u.) and QMS pressure at m/z = 17 (a.u.) versus time (s) over alternating Al(CH3)3 and D2O pulses separated by purges.]

Fig. 13.2. Schematic of (a) an atomic layer deposition reactor employing resistive heating, and (b) deposition response measured by quadrupole mass spectrometry and quartz crystal microbalance. (Rahtu, Alaranta, and Ritala [13.53])

heating of the wafer, and an optical pyrometer monitors the sample temperature during the deposition; its reading is fed back to correct temperature drift. This design limits the processing to one wafer at a time, but the concept of integrated chemical processing of a single wafer minimizes ambient contamination and enables the integration of multiple processes. Again, Fig. 13.3 shows that the deposited thickness scales linearly with the number of ALD cycles. By setting the reactor temperature, chemical pulse time, chemical purge time, and the total number of ALD cycles, the desired thin film thickness can be deposited.

[Fig. 13.3 graphic: (left) reactor schematic labeled with ALD chamber, Ar supply, IR lamps, surface preparation chamber, wafer handler, load lock, and bubbler precursor source; (right) deposition thickness (Å, 0–15) versus ALD cycles (0–8).]

Fig. 13.3. Schematic of (left) an atomic layer deposition reactor employing IR heating, and (right) linearly scaled deposition thicknesses versus ALD cycles. (Chang, Lin, and Chu [13.56])

The typical sticking probability of the precursor molecules is between 0.0001 and 0.1 [13.55, 13.56, 13.59]. Judging from a typical ALD deposition rate of ∼0.5–3 Å/cycle and a typical desired high-k dielectric layer thickness of ∼1–5 nm, it takes from a few ALD cycles (1 nm at 3 Å/cycle) up to about 100 ALD cycles (5 nm at 0.5 Å/cycle) to deposit the desired high-k dielectric thin film. The commercialization of ALD equipment for semiconductor processing is currently at the alpha and beta stages. The alpha stage indicates that the equipment is assembled, tested, and shipped to customers for testing. The beta stage, which allows actual wafer runs but not revenue production, means the equipment is moving toward production readiness. As mentioned before, since ALD operates in a surface-reaction-controlled mode and flow uniformity is less important, the reactor can easily be scaled up to accommodate 300 mm wafers. In a commercial system, to ensure high precision in metering the precursor dose, new chemical delivery systems and control electronics are being developed to control the gas injection with an accuracy and resolution of tens to hundreds of milliseconds. The overall throughput determines the viability of an ALD process for high-k deposition. To meet the throughput target of > 10 wafers per hour [13.67], approximately 3–5 minutes are available for each wafer, including wafer loading and unloading. Thus, very short ALD cycle and purge times (hundreds of milliseconds) are required, which remains one of the major engineering challenges. To understand the effect of gas flow in ALD reactors, a calculation model was constructed to account for the re-adsorption of volatile reaction products in a low-pressure channel-type reactor with many parallel substrates [13.68, 13.69]. The calculations are based on the continuity equation and kinetic equations for surface coverage.
The formation of a steady-state adsorption wave between the substrates during a precursor pulse depends strongly on the sticking and diffusion coefficients, the mean flow rate of the carrier gas, and the reactor temperature. The film thickness is found to decrease in the gas-flow direction.
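The cycle count and the per-wafer time budget discussed above follow from simple arithmetic. The sketch below is only an illustration: the function names and the one-minute loading/unloading overhead are assumptions introduced here, and pairing the extremes of the quoted rate and thickness ranges gives bounds rather than typical values.

```python
import math


def cycles_needed(thickness_nm: float, growth_per_cycle_A: float) -> int:
    """Whole ALD cycles needed to reach a target thickness."""
    thickness_A = thickness_nm * 10.0  # 1 nm = 10 Å
    return math.ceil(thickness_A / growth_per_cycle_A)


def max_cycle_time_s(budget_min: float, overhead_min: float, n_cycles: int) -> float:
    """Longest allowed cycle time once the handling overhead is removed."""
    return (budget_min - overhead_min) * 60.0 / n_cycles


# Bounds from the quoted ranges: 1-5 nm films at 0.5-3 Å/cycle
fewest = cycles_needed(1.0, 3.0)  # thinnest film, fastest growth
most = cycles_needed(5.0, 0.5)    # thickest film, slowest growth
print(fewest, most)

# Hypothetical split: 3 min per wafer, 1 min of it for loading/unloading
print(max_cycle_time_s(3.0, 1.0, most))
```

Under these assumptions the slowest-growing, thickest film leaves on the order of one second per complete cycle, which is why pulse and purge times of hundreds of milliseconds are needed.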


13.1.4 Film Composition, Microstructure, and Electrical Results

The advantages of atomic layer deposition include accurate film thickness control, interface control, uniformity over large areas, excellent conformality, good reproducibility, mixed oxide processing capability, and superior film quality at relatively low processing temperatures. For simple metal oxides, stoichiometric, uniform, and amorphous high-k dielectric thin films can be achieved by atomic layer deposition with a very thin or no interfacial layer. Electrically, the metal-oxide-semiconductor devices showed low leakage current, small hysteresis, low interface state density, and reasonably high electron mobility with an equivalent oxide thickness of ∼1–3 nm and a dielectric constant > 20 [13.70]. Some electrical results reported in the literature for ALD dielectric thin films are summarized in Table 13.3, in comparison with those obtained by other deposition techniques. For ALD-deposited high-k dielectric thin films, the conduction mechanism is identified as Schottky emission at low electric fields and as Poole–Frenkel emission at high electric fields [13.21, 13.71–13.73]. Note that Schottky emission refers to thermionic emission across a metal-dielectric interface, while Poole–Frenkel emission is a frequently observed and well-characterized mechanism for deposited dielectrics that contain a high density of structural defects. These structural defects introduce additional energy states and traps near the band edge, which restrict the current flow through capture and emission processes. If metal oxides are mixed or doped with another metal oxide, special electrical and material properties can be obtained, such as an increased dielectric constant [13.74] and an increased re-crystallization temperature [13.75].
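The two conduction mechanisms named above differ in how strongly the applied field lowers the relevant barrier. In the standard textbook expressions (not taken from this chapter), the Schottky barrier is lowered by sqrt(qE/(4πε0εr)) and the Poole–Frenkel trap barrier by sqrt(qE/(πε0εr)), where εr is the high-frequency dielectric constant. The sketch below evaluates both; the field of 1 MV/cm and εr = 4 are illustrative assumptions, not values from the cited work.

```python
import math

Q = 1.602176634e-19      # elementary charge (C)
EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)


def schottky_lowering_eV(field_V_per_m: float, eps_r: float) -> float:
    """Image-force barrier lowering at a metal-dielectric interface (eV)."""
    return math.sqrt(Q * field_V_per_m / (4.0 * math.pi * EPS0 * eps_r))


def poole_frenkel_lowering_eV(field_V_per_m: float, eps_r: float) -> float:
    """Field-induced lowering of a Coulombic trap barrier (eV)."""
    return math.sqrt(Q * field_V_per_m / (math.pi * EPS0 * eps_r))


# Assumed conditions: 1 MV/cm (1e8 V/m) and a high-frequency eps_r of 4
field = 1.0e8
print(f"Schottky:      {schottky_lowering_eV(field, 4.0):.3f} eV")
print(f"Poole-Frenkel: {poole_frenkel_lowering_eV(field, 4.0):.3f} eV")
```

Both lowerings scale with the square root of the field, and the Poole–Frenkel term is exactly twice the Schottky term. This is why plots of ln J versus sqrt(E) and of ln(J/E) versus sqrt(E) are commonly used to tell the two mechanisms apart.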

13.2 Chemical Vapor Deposition

13.2.1 Technology Description

Chemical vapor deposition, especially at low pressures (LPCVD), is widely used for thin film deposition [13.76]. The transport and reaction processes underlying a CVD process include: a) the introduction of a precursor into a reaction chamber, b) gas phase collisions between precursor molecules, c) transport of precursors to the substrate, d) adsorption of the precursors onto the substrate, e) adatom migration and film-forming chemical reactions on the substrate, f) desorption of the adsorbed molecules, g) surface nucleation, and h) desorption of the gaseous by-products of the reaction. Since the reaction is thermally activated, the reaction rate increases with temperature until it exceeds the rate at which reactant species arrive at the surface. Beyond that point, the reaction cannot proceed any more rapidly than the rate at which reactant gases are supplied to the substrate by mass transport. On the other hand, at lower temperatures, the surface reaction rate is reduced, and the arrival rate


of reactants exceeds the rate at which they are consumed by the surface reaction process. Therefore, a CVD process can be divided into two major reaction regimes:

– Mass-transport limited regime at high temperatures
– Surface reaction limited regime at low temperatures

For CVD of high-k materials, the metal-containing precursors, with or without oxidizing agents, are directed to a heated surface, where they decompose and deposit the high-k dielectric material. Basically all the chemicals used in ALD can be used for CVD processing. However, metal halides are normally avoided in CVD because of their higher decomposition temperature, and O2 is normally used as the oxidant. It is important to note that the reactor geometry and temperature gradient can lead to a wide variety of flow structures, affecting the deposition rate and composition uniformity [13.76].

13.2.2 Chemical Reaction Mechanisms and Kinetics

Metal alkoxides [13.77], β-diketonates [13.78], metal alkyl amides [13.79, 13.80], and metal nitrates are common candidates for CVD of high-k materials. Metal nitrate is a promising new precursor, since it contains neither hydrogen nor carbon atoms and could lead to hydrocarbon-free deposition of high-k dielectric materials [13.81, 13.82]. The N–O bond in the metal nitrate precursor is the weakest in the molecule and can break easily, resulting in high-k metal oxide deposition:

$$\mathrm{Zr(NO_3)_4 \rightarrow ZrO_2 + 4\,NO_2 + O_2}$$

In other words, the nitrate ligand serves as an intrinsic oxidant. NO2 is assumed to be one major volatile product, since NO, NO2, and O2 have been shown to evolve from pre-adsorbed NO3⁻ on solid surfaces [13.83, 13.84].

13.2.3 Processing Reactors and Chemical Delivery System

The commonly used CVD reactors include horizontal and vertical furnaces, in which multiple wafers can be processed simultaneously. Alternatively, liquid source misted chemical deposition has also been used effectively to deposit high-k materials [13.85].
For chemical delivery, a bubbler or an evaporator can be used to introduce the chemical vapor into the reactor. The precursors are usually mixed with the oxidant at an injection point near the reactor. Low-pressure processing enhances the deposition uniformity because the larger diffusion coefficient leads to surface-reaction limited deposition. Since the reactor design is critical to the deposition uniformity, detailed reactor-scale modeling combined with surface deposition simulation is often required to design CVD reactors [13.86]. The typical deposition pressures are between


0.3 and 8 Torr, and the substrate temperatures are between 275 and 800 °C [13.87]. The deposition rates range from 70 to 500 Å/min, depending upon the type of precursors used. Multiple wafers can be processed at the same time to improve the processing throughput.

13.2.4 Film Composition, Microstructure, and Electrical Results

The composition and microstructure of CVD-deposited films depend largely on the deposition conditions and the purity of the precursors. The films typically incorporate some carbon impurities, which can be reduced by increasing the O2 flow rate. The deposited films are likely polycrystalline, with an interfacial layer on silicon. The metal-oxide-semiconductor devices showed low leakage current, small hysteresis, and low interface state density with an equivalent oxide thickness of ∼1–2 nm and a dielectric constant > 17 [13.88, 13.89]. The interfacial layer thickness is about 1–2 nm with a dielectric constant of 5–6 [13.90]. For CVD high-k dielectric thin films, the conduction mechanism is identified as Fowler–Nordheim tunneling [13.87], which has been shown to compete with Poole–Frenkel emission and to dominate the conduction, especially for thick oxides. The basic principle of quantum mechanical tunneling is that carriers can tunnel from the adjacent conductor into the insulator and move freely within the valence or conduction band of the insulator.
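Returning to the two reaction regimes of Sect. 13.2.1, a common way to picture the crossover is to treat gas-phase transport and the surface reaction as two steps in series. The sketch below follows that generic series model; it is not taken from this chapter. The Arrhenius prefactor, the activation energy of 1 eV, and the temperature-independent transport coefficient are hypothetical values, chosen only so that the crossover falls inside the quoted CVD temperature window.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant (eV/K)


def surface_rate_constant(temp_K: float, k0: float = 1.0e7, ea_eV: float = 1.0) -> float:
    """Arrhenius surface rate constant (arbitrary units, hypothetical k0 and Ea)."""
    return k0 * math.exp(-ea_eV / (K_B_EV * temp_K))


def effective_rate_constant(temp_K: float, h_g: float = 1.0) -> float:
    """Series combination of surface reaction (k_s) and mass transport (h_g)."""
    k_s = surface_rate_constant(temp_K)
    return k_s * h_g / (k_s + h_g)


def regime(temp_K: float, h_g: float = 1.0) -> str:
    """Name the slower, and therefore rate-limiting, step."""
    k_s = surface_rate_constant(temp_K)
    return "surface-reaction limited" if k_s < h_g else "mass-transport limited"


for temp_C in (300, 450, 700):
    temp_K = temp_C + 273.15
    print(temp_C, regime(temp_K), f"{effective_rate_constant(temp_K):.3g}")
```

At low temperature the effective rate tracks the exponentially varying surface term, and at high temperature it saturates at the transport coefficient, which mirrors the qualitative description given earlier in this section.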

13.3 Plasma-Enhanced Atomic Layer Deposition

13.3.1 Technology Description

Plasma-enhanced atomic layer deposition (PEALD) refers to a binary chemical reaction sequence in which the deposition of a precursor is self-limiting and is followed by a ligand extraction or surface activation step with radicals (or ions) produced from a plasma. In the first half reaction, a gas phase chemical precursor reacts with a surface functional group, and the reaction proceeds until all the surface functional groups have reacted and been replaced. The following step involves reactive radicals and ions from a plasma, which remove the surface ligands by forming volatile species and leave behind the desired surface layer. Since the first half reaction is self-limiting, growth of one atomic layer per cycle is expected, provided the radicals do not etch the deposited precursors. PEALD has been used for depositing inert refractory metal and metal nitride thin films as diffusion barrier, seed, and adhesion layers in semiconductor interconnect applications [13.91–13.94]. Recently, it has also proven effective in depositing high-k dielectric materials [13.95, 13.96]. PEALD processing has two unique features:


– A deposition process comprises one self-limiting process step.
– Surface activation or ligand removal is accomplished by radicals from the plasma.

Since the radicals can react with the surface ligands with minimal to no activation energy barrier, PEALD generally results in an increased reaction rate and improved removal of volatile products at lower temperatures compared to thermal ALD processes. PEALD is also capable of engineering binary and mixed metal oxides through the use of various chemical precursors.

13.3.2 Chemical Reaction Mechanisms and Kinetics

In PEALD, the reactants are introduced into the reactor one at a time, typically with a pump-down period in between. The effect of ions is largely on the densification and crystallization of the deposited film, while the effect of radicals is on the activation of surface reactions. The most widely studied PEALD reaction is hydrogen abstraction of a ligand from a metal precursor to form a metal or a metal nitride film. Since the deposition of metallic films is not the main focus of this chapter, no details are provided beyond the few reactions listed below. However, due to the increased interest in implementing metal gate electrodes in future MOSFET devices, ALD of metal thin films can be used to engineer the composition and work function of various promising metal electrode materials.

Metal:

$$\mathrm{ML}_x + x\,\mathrm{H}{\cdot} \rightarrow \mathrm{M} + x\,\mathrm{LH}$$

Metal nitride:

$$\mathrm{M(NL}_x)_y + xy\,\mathrm{H}{\cdot} \rightarrow \mathrm{MN}_y + xy\,\mathrm{LH}$$

$$\mathrm{ML}_x + x\,\mathrm{NH}{\cdot} \rightarrow \mathrm{MN}_x + x\,\mathrm{LH}$$

For PEALD of high-k dielectric materials, two routes are possible for generating the metal oxides (MOx/2) from MLx. First, a metal-containing precursor is adsorbed on the surface, a hydrogen plasma is used to abstract the ligands and form a metal layer, and an oxygen plasma then oxidizes the metal to form the metal oxide [13.96], as shown in Fig. 13.4:

$$\mathrm{ML}_x + x\,\mathrm{H}{\cdot} + \tfrac{x}{2}\,\mathrm{O}{\cdot} \rightarrow \mathrm{MO}_{x/2} + x\,\mathrm{LH} \qquad \begin{cases} \mathrm{ML}_x + x\,\mathrm{H}{\cdot} \rightarrow \mathrm{M} + x\,\mathrm{LH} \\ \mathrm{M} + \tfrac{x}{2}\,\mathrm{O}{\cdot} \rightarrow \mathrm{MO}_{x/2} \end{cases}$$

For high-k metal oxide deposition, metal-containing precursors similar to those shown in Table 13.1 are used in PEALD. H2 and O2 plasmas are commonly used as the reducing and oxidizing chemistries, respectively. The key to this reaction, which applies to many similar metal precursors, is atomic hydrogen scavenging the ligand from the precursor to form volatile LH. This radical-assisted reaction can occur at near room temperature, whereas the conventional CVD equivalent process, using H2, requires temperatures above 700–800 °C.
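The radical bookkeeping in the two-step route can be written out explicitly. The helper below is a minimal sketch of the generic scheme above, per precursor molecule MLx; the function name and dictionary keys are introduced here for illustration only.

```python
from fractions import Fraction


def peald_oxide_stoichiometry(x: int) -> dict:
    """Radicals consumed and products formed per ML_x precursor molecule."""
    return {
        "H_radicals": x,               # ML_x + x H· -> M + x LH
        "LH_released": x,
        "O_radicals": Fraction(x, 2),  # M + (x/2) O· -> MO_{x/2}
        "oxide_O_per_M": Fraction(x, 2),
    }


# Illustrative case: x = 3, as for a trivalent metal such as Al
print(peald_oxide_stoichiometry(3))
```

For x = 3 the oxide is MO3/2, that is M2O3, and for x = 4 it is MO2, consistent with oxides such as Al2O3 and ZrO2.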

[Fig. 13.4 graphic: pressure versus time, showing sequential metal-precursor, reducing-agent, and oxidant pulses with time marks t1 to t6.]

Fig. 13.4. Schematic of plasma enhanced atomic layer deposition with MLx and H and O radicals from plasma dissociation

Second, if the metal-containing precursor is relatively labile and the L-ligand is relatively volatile, the hydrogen plasma can be eliminated, and direct oxidation of the precursor can form the metal oxide. Combinations of the chemical precursors and various radicals generated by plasma have led to the deposition of a wide variety of high-k dielectric materials, such as Al2O3 [13.96], ZrO2 [13.95], HfO2 [13.97], Y2O3 [13.98], and SrTa2O6 [13.99].

13.3.3 Processing Reactors and Chemical Delivery System

The PEALD process requires a high vacuum environment (a base pressure of 10⁻⁷ Torr), which makes the reactor design a little more complex than that of ALD tools. A typical deposition system is shown in Fig. 13.5. The plasma source is a multiple-turn coil powered by an rf source at 13.56 MHz and 100–500 W, but other types can be used. During the deposition, the chamber pressure is controlled at 30–100 mTorr, and the substrate temperature is typically held at 50–200 °C. A typical deposition rate is 1–3 Å/cycle. PEALD operates in a single-wafer processing mode, so the processing throughput can only be improved through the fast surface reactions with the radicals. For chemical delivery, a bubbler or an evaporator is commonly used to introduce the chemical vapor into the reactor. The precursor exposure time is typically set above the saturation requirement, and the pump down

[Fig. 13.5 graphic: (left) system schematic labeled with H2 supply, pulsed-flow controller, wafer, pump, and precursor source; (right) Ti surface coverage (10^15/cm^2, 0–30) versus number of cycles (0–60).]

Fig. 13.5. Schematic of (left) a plasma enhanced atomic layer deposition system, and (right) linearly scaled deposition rate of PEALD (Kim and Rossnagel [13.91], Rossnagel, Sherman, and Turner [13.94])

time is optimized to effectively remove the remaining chemical precursors. The radicals are typically generated by dissociating the precursor molecules in a high-density plasma.

13.3.4 Film Composition, Microstructure, and Electrical Results

Due to the reactive radicals and ions interacting with the surface, PEALD films are typically stoichiometric with minimal carbon incorporation, but polycrystalline. The metal-oxide-semiconductor devices showed low leakage current and low interface state density with an equivalent oxide thickness of 3–4 nm and a smaller dielectric constant of about 12, due to the formation of interfacial SiOx. The interfacial layer thickness is about 2–3 nm, thicker than that obtained in ALD, likely because the oxygen radical more effectively oxidizes the underlying silicon [13.95].

13.4 Plasma Enhanced Chemical Vapor Deposition

13.4.1 Technology Description

For PECVD of high-k materials, the metal-containing precursor and the oxidant, typically molecular oxygen, are dissociated and ionized in the gas phase. The key to this reaction is to generate metal oxide precursors that deposit on the surface to form high-k metal oxides. This ion- and radical-assisted deposition can occur at near room temperature. Thus PECVD processing has two unique features:

– A high deposition rate due to the effective breakdown of the gas phase precursors.
– The flexibility in introducing the metal-containing precursors in the plasma or downstream [13.100] of a plasma to control the film composition and morphology.

13.4.2 Chemical Reaction Mechanisms and Kinetics

For high-k metal oxide deposition in PECVD, metal-containing precursors similar to those shown in Table 13.1 and O2 are commonly used. Deposition of high-k dielectric thin films inside or downstream of a plasma with metal alkoxide or alkyl amide precursors has been explored [13.101]. For example, high-k dielectric materials such as ZrO2 [13.102] and yttria-stabilized ZrO2 [13.103] have been deposited using zirconium t-butoxide and oxygen [13.104]. Since the O–C bond is weaker than the Zr–O bond, the hydrocarbon ligands are liberated into the plasma. The atomic oxygen dissociated from O2 oxidizes the hydrocarbons to form H2O and CO2, minimizing surface hydrocarbon contamination. Using ZrO2 deposition as an example, through a serial dissociation process, Zr+, ZrO+, ZrO2H+, ZrO3H3+, and ZrO4H5+ and their neutral counterparts are generated and serve as the MO precursor for high-k dielectric thin film deposition.

$$\mathrm{Zr(OC_4H_9)_4} \xrightarrow{e^-} \mathrm{ZrO_4} \xrightarrow{4\mathrm{H}} \mathrm{ZrO_4H_4} \cdots \rightarrow \mathrm{ZrO_3H_3} \xrightarrow[\mathrm{O},\,2\mathrm{H}]{e^-} \mathrm{ZrO_2H} \xrightarrow[\mathrm{O},\,\mathrm{H}]{e^-} \mathrm{ZrO} \xrightarrow[\mathrm{O}]{e^-} \mathrm{Zr}$$

$$\mathrm{O_2} \xrightarrow{e^-} \mathrm{O} + \mathrm{O}$$

$$\mathrm{C}_x\mathrm{H}_y + \left(2x + \tfrac{y}{2}\right)\mathrm{O} \xrightarrow{e^-} \tfrac{y}{2}\,\mathrm{H_2O} + x\,\mathrm{CO_2}$$

As the fraction of O2 increases, serial oxidation in the plasma gas phase shifts ZrOxHy+ to higher oxidation states [13.105]. The gas phase reaction mechanisms have a direct impact on the material and electrical characteristics of the PECVD films.

13.4.3 Processing Reactors and Chemical Delivery System

The PECVD processing system is similar to that for PEALD, and the plasma source can be an inductively coupled plasma reactor, an electron cyclotron resonance reactor, or a remote plasma source/jet [13.100, 13.106]. During deposition, the chamber pressure is controlled at 100 mTorr–2 Torr, and the substrate temperature can be controlled between 25 and 150 °C, as shown in Fig. 13.6. The deposition rate can be controlled at 10–100 Å/min, depending upon the plasma density and the precursors used. PECVD also operates in a single-wafer processing mode, but the processing throughput

[Fig. 13.6 graphic: (a) system schematic labeled with µ-wave (2.45 GHz) source, electromagnets (2.5 kW and 5 kW), O2 oxidant, Ar carrier gas, and MO precursor; (b) mass spectra, count rate normalized to Ar flow rate (cts/s sccm) versus mass-to-charge ratio (90–170), for oxidant/precursor ratios of 0, 0.5, and 4, with peaks labeled Zr+, ZrO+, ZrO2H+, ZrO3H3+, and ZrO4H5+.]

Fig. 13.6. Schematic of (a) a plasma enhanced chemical vapor deposition system, and (b) dominant reactive species in a PECVD depositing plasma (Cho, Wang, Sha, and Chang [13.104])


can be improved, compared to many CVD and ALD processes, through the more effective breakdown of the precursors. For chemical delivery, a bubbler setup or an evaporator can be used to introduce the chemical vapor into the reactor. The precursor is mixed with the oxygen and then injected into the reactor to generate a plasma, and the deposition can take place in the plasma or downstream of the plasma.

13.4.4 Film Composition, Microstructure, and Electrical Results

The physical and material properties of PECVD-deposited high-k thin films are known to depend strongly on the plasma gas phase chemistry. For example, in depositing ZrO2 with zirconium t-butoxide and O2, the deposited film is stoichiometric, but its surface morphology and electrical properties are strongly affected by the flow rate ratio of O2 to precursor, which controls the hydrocarbon incorporation in the films and the interfacial layer formation [13.102]. Using root-mean-square (RMS) roughness as a measure of the surface morphology, very smooth ZrO2 films (RMS < 2 Å) are deposited at high O2 fractions in the plasma, where ZrO2H+ is the dominant ionic species, while much rougher films (RMS > 10 Å) are deposited at low O2 fractions in the plasma, where ZrO+ is the dominant ionic species. These distinct material properties also have a significant impact on the electrical performance of these high-k materials. The metal-oxide-semiconductor devices built with PECVD high-k thin films deposited in an oxygen-rich plasma showed low leakage current and low interface state density with an equivalent oxide thickness of 2–3 nm and a dielectric constant > 15. The interfacial layer is thicker than that obtained in ALD, again due to the oxygen radical oxidizing the underlying silicon [13.104].
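The ionic species named in this section can be located on a mass spectrum from their nominal masses. The short sketch below is an illustration only: it uses integer masses and the most abundant zirconium isotope, 90Zr, and the parser handles only simple formulas of the form shown.

```python
import re

# Nominal (integer) masses; 90Zr is the most abundant zirconium isotope
NOMINAL_MASS = {"Zr": 90, "O": 16, "H": 1}


def nominal_mz(formula: str, charge: int = 1) -> float:
    """Nominal mass-to-charge ratio of an ion written like 'ZrO2H'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL_MASS[element] * (int(count) if count else 1)
    return total / charge


for ion in ["Zr", "ZrO", "ZrO2H", "ZrO3H3", "ZrO4H5"]:
    print(f"{ion}+: m/z = {nominal_mz(ion):.0f}")
```

All five singly charged ions fall between m/z 90 and 160, within the mass window plotted in Fig. 13.6b. Ions containing the heavier naturally occurring zirconium isotopes appear a few mass units above each of these values.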

13.5 Physical Vapor Deposition

13.5.1 Technology Description

The concept of ALD was first introduced as a variant of physical vapor deposition (PVD), in which atomic elements were evaporated from two separate sources to grow compound films. The vapors of each element were alternately pulsed onto the substrate by using shutters to block the physical vapor. The substrate temperature was adjusted to be higher than the sublimation temperatures of the single atomic elements. Compound materials could be deposited if the sublimation temperature of the compound was higher than the substrate temperature. The unique aspects of PVD processing are:

– A versatile and robust family of techniques is available.
– Deposition of high-k dielectric materials is not limited by the need to synthesize volatile and stable gas-phase metal-containing precursors.
– The substrate temperature can range from near room temperature to very high temperatures.

13.5.2 Chemical Reaction Mechanisms and Kinetics

In physical vapor deposition, vacuum evaporation [13.107, 13.108], sputtering deposition [13.109–13.112], oxidation of metals [13.113], and laser-assisted deposition [13.114] have all been used to deposit high-k dielectric materials.

Reactive Sputtering Deposition

The sputtering deposition of high-k dielectric materials is based on a metal oxide target being bombarded by an inert plasma, or a metal target being bombarded by an oxidizing discharge, while the sputtered products are collected on the substrate [13.115–13.117]. For example, deposition of ZrO2 can be achieved by reactive sputtering of a zirconium metal target in an oxygen plasma or by directly sputtering a ZrO2 target with a noble Ar plasma. Moreover, the deposition of multi-component metal oxides or silicates can be easily accomplished by physical sputtering of multiple targets or by using multiple chemical precursors. For example, silane can be added to the sputtering chemistry, or a metal silicide target (e.g., HfSi) can be used in reactive sputtering, to deposit metal silicates [13.118]. Because of the flexibility and robustness of reactive ion sputtering, it can be used as a combinatorial synthesis tool to create and test a broad spectrum of complex materials in parallel. This is achieved by using multiple targets in the same reactor for simultaneous deposition of multi-component metal oxides [13.75, 13.119, 13.120]. This approach creates a compositionally graded thin film in a single deposition, which is equivalent to forming thousands of materials of different compositions.

Evaporation

High-k dielectric materials can be deposited by evaporation [13.121]. For example, a ZrO2 target can be used as the source material, and a high energy electron gun (3–4 kW) can be used to evaporate the metal oxide target and deposit the metal oxide high-k thin films [13.122]. Powder-packed ceramic Gd2O3 and Y2O3 sources have also been used for electron beam evaporation to deposit Gd2O3 and Y2O3 high-k thin films [13.123]. To ensure a fully oxidized film, oxygen is typically introduced into the vacuum system during deposition, which usually leads to excess oxygen in these films.

Metal Sputtering Deposition and Oxidation

High-k dielectric materials can be formed by first depositing a thin layer of metal by the above-mentioned physical vapor deposition processes, followed


by the oxidation of the PVD metal thin films [13.113]. For example, a general scheme of elementary reactions for the formation of a metal silicate can be written as follows:

$$\mathrm{M} + x\,\mathrm{Si} \rightarrow \mathrm{MSi}_x$$

$$\mathrm{MSi}_x + \tfrac{x}{2}\,\mathrm{O_2} \rightarrow \mathrm{MO}_x\mathrm{Si}_x$$

Upon deposition of the metal, metal silicide is formed through the diffusion of silicon from the substrate. In an oxidizing ambient and at elevated temperatures, oxygen dissociates and reacts to oxidize the metal silicide through insertion of oxygen atoms into the M–Si bonds.

Pulsed Laser Deposition

Laser ablation can also be used to deposit high-k dielectric materials. For example, a KrF laser beam with an energy density of 2–3 J/cm² at a wavelength of ∼248 nm can be pulsed at 30–60 Hz, with a pulse duration of 10–20 ns, onto a metal target in an oxidizing ambient. Silicon oxide and oxynitride thin films have been deposited by pulsed laser ablation of a silicon target in O2 and in an O2/N2 mixture, respectively [13.114]. Titanium silicate thin films have been deposited by reactive pulsed-laser deposition from a combined Ti–Si target [13.124].

13.5.3 Processing Reactors and Chemical Delivery System

The PVD processing system often requires a vacuum system with metal or metal oxide targets powered by an energy source. A schematic of a magnetron reactive sputtering system is shown in Fig. 13.7. A metal or metal oxide target is used as a sputtering source and is driven by either a dc or a 13.56 MHz rf power supply [13.115, 13.125]. Normally, the chamber pressure is 10–40 mTorr, the power ranges from 100 to 400 W, and the magnetron produces a dense plasma ring adjacent to the metal target. Either a noble or an oxidizing gas is used to generate the plasma for sputtering the different types of targets. It is noteworthy that the sputtering rate of an oxidized target is much lower than that of an unoxidized target. The substrate is typically placed in line-of-sight of the sputtering source, and its temperature can be controlled at 25–450 °C. Though the wafers are placed in line-of-sight of the source, PVD typically leads to poor conformality due to the large sticking coefficient of the reactive species generated by the plasma sputtering process. PVD processing can accommodate 1–5 wafers with a deposition rate of 50–250 Å/min, and no complex chemical delivery system is needed.

13.5.4 Film Composition, Microstructure, and Electrical Results

The PVD high-k films are amorphous to polycrystalline, with a very thin interfacial layer. Electrically, the metal-oxide-semiconductor devices showed


Fig. 13.7. Schematic of (left) a generic physical vapor deposition system, and (right) the TEM cross-section of sputtering deposited hafnium silicate (a) before and (b) after annealing (Wilk, Wallace, and Anthony [13.126])

very low leakage current, small hysteresis, and low interface state density with an equivalent oxide thickness of ∼ 1–2 nm and a dielectric constant > 20 [13.126, 13.127]. The interfacial layer thickness could be minimized to 0–1 nm, and the electron mobility is reasonably high, as summarized in Table 13.2 and Table 13.3 [13.128].

Table 13.2. Summary of operating conditions for various high-k deposition processes

| Deposition Technique | Pressure (Torr) | Temperature (°C) | Deposition Rate | # Wafers Processed |
|---|---|---|---|---|
| ALD | 10⁻⁴–2 | 100–400 | 0.5–3 Å/cycle | 1–5 |
| CVD | 0.3–8 | 275–800 | 70–500 Å/min | 25 |
| PEALD | 0.03–0.1 | 50–200 | 1–3 Å/cycle | 1 |
| PECVD | 0.1–2 | 25–150 | 10–100 Å/min | 1 |
| PVD | 0.01–0.04 | 25–450 | 50–250 Å/min | 1–5 |
| MBE | 10⁻⁶–10⁻⁴ | 25–800 | 20 Å/min | 1 |
| IBAD | 1–3 × 10⁻⁴ | 25–100 | 60–200 Å/min | 1 |
| Sol-gel | 760 | 300–400 | 100–1000 Å/min | 1 |
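The rate column of Table 13.2 translates directly into deposition times for a film of given thickness. The sketch below covers only the techniques whose rates are quoted per minute; the dictionary layout, the function name, and the 5 nm target thickness are illustrative choices made here.

```python
# (lowest, highest) deposition rate in Å/min, taken from Table 13.2
RATES_A_PER_MIN = {
    "CVD": (70, 500),
    "PECVD": (10, 100),
    "PVD": (50, 250),
    "MBE": (20, 20),
    "IBAD": (60, 200),
    "Sol-gel": (100, 1000),
}


def deposition_time_s(thickness_nm: float, technique: str) -> tuple:
    """(shortest, longest) time in seconds to grow the given thickness."""
    lowest, highest = RATES_A_PER_MIN[technique]
    thickness_A = thickness_nm * 10.0  # 1 nm = 10 Å
    return (thickness_A / highest * 60.0, thickness_A / lowest * 60.0)


for name in RATES_A_PER_MIN:
    fast, slow = deposition_time_s(5.0, name)
    print(f"{name:8s} {fast:6.1f} s to {slow:6.1f} s")
```

For a 5 nm film the fastest CVD rate corresponds to only a few seconds of deposition, whereas MBE at 20 Å/min takes 150 s.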


Table 13.3. Summary of electrical performance of high-k thin films

| Deposition Technique | High-k Material | k | tIL (nm) | EOT (nm) | Dit (10¹¹/cm² eV) | J @ 1 V (µA/cm²) | Hys. (mV) | µe, peak (cm²/V·s) |
|---|---|---|---|---|---|---|---|---|
| ALD [13.21, 13.70–13.73] | ZrO2, HfO2 | > 20 | 0–1 | 1–3 | 0.4–1 | 0.1–1 | 8–10 | 270–320 |
| CVD [13.87–13.90] | ZrO2, HfO2 | > 17 | 1–2 | 1–2 | 10 | 0.1–10 | 10 | 180–380 |
| PEALD [13.95] | ZrO2 | > 12 | 2–3 | 3–4 | 2–50 | 0.03–0.4 | | |
| PECVD [13.102, 13.104] | ZrO2 | > 15 | 1–2 | 2–3 | 1 | 0.1–3 | | |
| PVD [13.126–13.128] | HfO2, HfSiO4 | > 20 | 0–1 | 1–2 | 1–5 | 2–4 | 10 | 230–380 |
| MBE [13.129, 13.130] | Al2O3, Y2O3 | > 15 | 0–1 | 1–3 | 1–3 | 0.02–0.04 | | 70–210 |
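The tIL and EOT columns of Table 13.3 are linked through the usual series-capacitor picture of the gate stack, in which each layer contributes its physical thickness scaled by 3.9/k, with 3.9 the dielectric constant of SiO2. The sketch below applies that standard relation to a hypothetical stack; the layer thicknesses and dielectric constants are illustrative values, not entries from the table.

```python
K_SIO2 = 3.9  # dielectric constant of SiO2


def eot_nm(layers) -> float:
    """EOT of a stack given (thickness_nm, dielectric_constant) pairs."""
    return sum(t * K_SIO2 / k for t, k in layers)


# Hypothetical stack: 1 nm interfacial layer (k = 5.5) under 4 nm of k = 20
stack = [(1.0, 5.5), (4.0, 20.0)]
print(f"EOT = {eot_nm(stack):.2f} nm")
```

In this example the 1 nm interfacial layer accounts for nearly half of the total EOT of about 1.5 nm, which illustrates why the thicker interfacial layers of the plasma-based processes raise the EOT.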

13.6 Molecular Beam Epitaxy

13.6.1 Technology Description

Molecular beam epitaxy (MBE) [13.129–13.131] has been used to deposit lattice-matched epitaxial thin films under ultrahigh vacuum. The starting substrate surface needs to be carefully engineered to serve as a template for thin film growth; for example, a stable submonolayer silicide at the interface can allow certain metal oxides to grow epitaxially. Other metal oxides such as La2O3 [13.132] and single-crystalline perovskite oxides such as SrTiO3 have also been deposited on silicon [13.133]. Researchers very recently reported that GaAs can be grown epitaxially on Si if a very thin layer of SrTiO3 is used as a buffer to match the lattice constants [13.134, 13.135]. The unique aspects of MBE processing are:

– Epitaxial growth on the silicon substrate is possible.
– Complex metal oxides can be engineered without the use of many chemical precursors.

13.6.2 Chemical Reaction Mechanisms and Kinetics

Metal oxide thin films can be deposited by molecular beam epitaxy using molecular oxygen and thermally evaporated metal. Prior to oxide growth, the Si surface is typically heated to ∼700 °C to desorb the surface hydrogen. Additionally, a thin (∼10 nm) epitaxial silicon buffer layer is sometimes grown in situ. This results in a smooth and reproducible starting Si surface exhibiting a (7 × 7) surface reconstruction, as determined by reflection


high-energy electron diffraction. Growth of the metal oxide film by MBE can take place at a typical chamber pressure of ∼2 × 10⁻⁵ Torr in one of two ways: exposure of the Si surface to ∼1 or more monolayer(s) of metal flux followed by exposure to oxygen, or exposure to O2 during the metal evaporation. Although some metals require high temperatures (∼800 °C) to fully oxidize, many can be deposited even at room temperature. The deposition rate is typically 20 Å/min.

13.6.3 Processing Reactors and Chemical Delivery System

The MBE system is often an ultra-high vacuum reactor with electron-beam evaporation solid sources, effusion cell solid sources, a molecular oxygen source, and in-situ optical and high-energy electron diffraction probes for real-time monitoring of the metal fluxes and the surface structure and morphology [13.123]. MBE processing is also limited to a single wafer, and no complex chemical delivery system is needed.

13.6.4 Film Composition, Microstructure, and Electrical Results

The MBE-deposited high-k thin films, including Al2O3, Y2O3, and SrTiO3, are typically crystalline, stoichiometric, and atomically smooth. They have very low leakage current density and interface state density [13.131]. The effective oxide thickness is about 1–3 nm with minimal interfacial layer thickness and is not sensitive to post-deposition annealing in O2. Very high dielectric constants have been observed for crystalline perovskite oxides.
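The chamber pressure quoted in Sect. 13.6.2 can be converted into an oxygen arrival rate at the surface with the Hertz–Knudsen expression from kinetic gas theory, flux = p / sqrt(2π m kB T). The sketch below is a rough estimate that assumes the chamber pressure is entirely O2 at a gas temperature of 300 K; both assumptions are made here for illustration and are not stated in the text.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant (J/K)
AMU = 1.66053906660e-27  # atomic mass unit (kg)
TORR_TO_PA = 133.322368  # 1 Torr in Pa


def impingement_flux_cm2_s(pressure_torr: float, mass_amu: float,
                           temp_K: float = 300.0) -> float:
    """Hertz-Knudsen flux of gas molecules striking a surface (cm^-2 s^-1)."""
    p = pressure_torr * TORR_TO_PA
    m = mass_amu * AMU
    flux_m2 = p / math.sqrt(2.0 * math.pi * m * K_B * temp_K)
    return flux_m2 * 1.0e-4  # convert m^-2 to cm^-2


# Assumed: pure O2 (32 amu) at 2e-5 Torr and 300 K
flux = impingement_flux_cm2_s(2e-5, 32.0)
print(f"O2 flux ~ {flux:.2e} cm^-2 s^-1")
```

With a surface site density of order 10^15 cm^-2 (a typical assumed value), this corresponds to several monolayer-equivalents of O2 arriving per second.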

13.7 Ion Beam Assisted Deposition

13.7.1 Technology Description

Ion beam assisted deposition (IBAD) uses energetic ions to stimulate the surface reactions of precursor atoms that are generated with an electron-beam evaporator and deposited on a substrate. Ion bombardment is the key factor controlling the film properties in the IBAD process. IBAD is a viable process for depositing high-k dielectric materials because:

– The ions transfer momentum and energy to the precursor molecules on the surface and produce a graded material interface.
– Complex metal oxides can be engineered without the use of many chemical precursors.

13.7.2 Chemical Reaction Mechanisms and Kinetics

For ion beam assisted deposition of high-k dielectric materials, energetic oxygen ions in a vacuum system are used to convert deposited surface metal atoms into metal oxides for thin film deposition. This technique has been applied to deposit zirconium oxide and hafnium oxide thin films [13.136–13.139].

13.7.3 Processing Reactors and Chemical Delivery System

A commonly used ion beam source extracts ions from an oxygen plasma created under the electron cyclotron resonance condition at a microwave frequency of 2.45 GHz and a pressure of 1–3 × 10⁻⁴ Torr to form an O ion beam with an energy of 1–20 keV and a current density of 30–120 µA/cm². Metal is typically evaporated by an electron beam and condensed on a Si substrate at a temperature of 25–100 °C, while oxygen ions are simultaneously directed onto the substrate. The deposition rate is about 60–200 Å/min, and the deposition is limited to a single wafer.

13.7.4 Film Composition, Microstructure, and Electrical Results

The stoichiometry of high-k materials deposited by ion-assisted deposition is affected by the diffusion of oxygen on the substrate surface. Fine-grained cubic zirconia has been deposited with the (111) orientation [13.136]. Low-energy ion bombardment of metal oxide films during the reactive ion-beam process has been shown to induce a substantial relaxation of the residual stress as a result of structural modification by the increased mobility of adatoms. The ion bombardment also controls the amounts of Zrⁿ⁺ ions. The relative amount of the predominant Zr⁴⁺ ions contributes systematically to variations in the (111) interplanar distance and the degree of crystallinity in highly preferentially oriented films.
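The beam parameters quoted in Section 13.7.3 can be converted into an ion arrival rate and a beam power density with simple arithmetic. The sketch below does this under the assumption that the oxygen ions are singly charged. The chapter does not state the charge state, so `charge_state=1` is a hypothetical default.

```python
E_CHARGE = 1.602176634e-19  # elementary charge, C

def ion_flux_cm2(current_density_ua_cm2, charge_state=1):
    """Ion arrival rate in ions cm^-2 s^-1 for a given beam current density."""
    current_a_cm2 = current_density_ua_cm2 * 1.0e-6
    return current_a_cm2 / (charge_state * E_CHARGE)

def beam_power_w_cm2(current_density_ua_cm2, ion_energy_ev, charge_state=1):
    """Power delivered to the substrate in W/cm^2.

    Each ion carries ion_energy_ev; the number of ions per second follows
    from the current density, so the elementary charge cancels out.
    """
    current_a_cm2 = current_density_ua_cm2 * 1.0e-6
    return current_a_cm2 * ion_energy_ev / charge_state

# Range quoted in the text: 30-120 uA/cm^2 and 1-20 keV.
flux_low = ion_flux_cm2(30.0)
flux_high = ion_flux_cm2(120.0)
power_low = beam_power_w_cm2(30.0, 1.0e3)
power_high = beam_power_w_cm2(120.0, 20.0e3)

print(f"ion flux: {flux_low:.2e} to {flux_high:.2e} ions cm^-2 s^-1")
print(f"beam power density: {power_low:.3f} to {power_high:.3f} W/cm^2")
```

Comparing the ion flux with the metal arrival rate implied by the 60–200 Å/min deposition rate would give the ion-to-atom arrival ratio, but that step needs a film density, which the text does not supply.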

13.8 Sol-gel Deposition

13.8.1 Technology Description

Sol-gel processing is a wet chemical process for synthesizing a colloidal suspension of solid particles that are formed through hydrolysis and condensation of chemical precursors using catalysts. These nanoparticles are suspended in an organic or aqueous solvent. Upon removal of the solvent, the wet gel converts into a xerogel through ambient-pressure drying or into an aerogel through supercritical drying. Thin, uniform, and crack-free films can be readily formed on various substrates by dipping, spinning, or spray-coating. The advantages of sol-gel processing are:

– No vacuum system or chemical delivery system is needed.
– Inexpensive, low-temperature processing for preparing complex oxide compositions with high homogeneity.


13.8.2 Chemical Reaction Mechanisms and Kinetics

Reactions of metal halides and alkoxides to form metal oxides and alkyl halides are well known for sol-gel processes under nonhydrolytic conditions. The reaction mechanism involves coordination of an oxygen atom of the alkoxy group to the metal center of the halide precursor, followed by a nucleophilic cleavage of the O–R bond. Therefore, the cationic character of the alkyl group determines the reactivity: the more branched the organic ligand R, the more readily the reaction proceeds. High-k materials such as zirconium oxide have been deposited by sol-gel processing [13.140, 13.141].

13.8.3 Processing Reactors and Chemical Delivery System

Sol-gel processing requires no vacuum systems, making it inexpensive and easy to master. A complete system typically includes a spin or dip coater, a hot plate or a thermal oven, an ultrasonic cleaner, and a UV curing system. Precursors such as metal alkoxides, metal β-diketonates, metal alkylamides, metal carboxylates, and metal organo-polymers are all commercially available and can be used for deposition at temperatures between 300 and 400 °C. Sol-gel processing is a simple and inexpensive single-wafer process, making it attractive from the viewpoint of processing and manufacturing cost.

13.8.4 Film Composition, Microstructure, and Electrical Results

Sol-gel deposited high-k thin films are typically non-stoichiometric. Both oxygen deficiency and trivalent zirconium have been observed [13.142], and these sol-gel formed materials tend to crystallize at 400–500 °C [13.143, 13.144].
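The nonhydrolytic condensation described in Section 13.8.2 can be written schematically as follows. This is a generic sketch rather than a scheme taken from the chapter: M stands for the metal center, X for the halide, and R for the alkyl group, and the second line assumes a tetravalent metal (such as Zr) purely to show a balanced overall reaction.

```latex
\begin{align*}
\mathrm{M{-}X} + \mathrm{M{-}OR} &\longrightarrow \mathrm{M{-}O{-}M} + \mathrm{R{-}X} \\
\mathrm{MX_4} + \mathrm{M(OR)_4} &\longrightarrow 2\,\mathrm{MO_2} + 4\,\mathrm{RX}
\end{align*}
```

The alkyl halide R–X leaves as the by-product, which is why the ease of cleaving the O–R bond, and hence the branching of R, controls the rate.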

13.9 Summary

Many techniques are being explored for depositing high-k dielectric materials, and each has its own advantages and limitations. To engineer superior high-k dielectric materials, the surface reaction mechanisms need to be investigated so that the composition, chemical coordination, microstructure, and electrical properties of the dielectric thin films can be manipulated. The operating conditions of the deposition techniques discussed in this chapter are summarized in Table 13.2. It is evident that the pressure and temperature ranges of these techniques are quite different and can be individually optimized for the desired dielectric properties. The deposition rate and the number of wafers that can be processed at the same time limit the throughput of the deposition techniques and are critical to their manufacturability.
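The dependence of throughput on deposition rate and wafers per run can be illustrated with a back-of-the-envelope estimate. In the sketch below, the rates are the typical values quoted earlier in the chapter for MBE (20 Å/min) and IBAD (lower bound, 60 Å/min). The 40 Å film thickness and the 5-minute per-run overhead are hypothetical numbers chosen only for illustration.

```python
def wafers_per_hour(rate_a_per_min, thickness_a, wafers_per_run=1, overhead_min=0.0):
    """Estimate throughput from deposition time plus a fixed per-run overhead."""
    run_time_min = thickness_a / rate_a_per_min + overhead_min
    return 60.0 * wafers_per_run / run_time_min

FILM_THICKNESS_A = 40.0   # hypothetical target thickness
OVERHEAD_MIN = 5.0        # hypothetical load, pump-down and unload time

for name, rate in [("MBE", 20.0), ("IBAD", 60.0)]:
    ideal = wafers_per_hour(rate, FILM_THICKNESS_A)
    realistic = wafers_per_hour(rate, FILM_THICKNESS_A, overhead_min=OVERHEAD_MIN)
    print(f"{name}: {ideal:.1f} wafers/h without overhead, {realistic:.1f} with overhead")
```

For films only a few nanometres thick, the fixed overhead quickly dominates the run time in this simple model, so processing several wafers per run can matter more than the raw deposition rate.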


Since electrical performance is the ultimate test of the success of a deposition technique, metal-oxide-semiconductor capacitors and transistors are typically fabricated to assess the electrical performance of the dielectric thin films. To aid the evaluation of these techniques, some available literature results are summarized in Table 13.3 for comparison. These results include six major techniques and a variety of high-k materials such as HfO2, ZrO2, HfSiO4, Al2O3, and Y2O3. Table 13.3 summarizes the dielectric constant (k), the interfacial layer thickness (tIL), the effective oxide thickness (EOT), the interfacial state density (Dit), the leakage current density at 1 V (J), the hysteresis (Hys.), and the peak electron mobility (µe) of the high-k dielectric thin films in MOS devices. It is important to note that, in order to compare the efficacy of these techniques, each technique should be optimized and used to deposit the same material on similar substrates with the same gate electrode material. This is obviously hard to attain through the available literature; the table therefore serves only as a guide to the attainable electrical performance, and caution should be taken in comparing the listed values.

Finally, engineering a lattice-matched metal oxide thin film on silicon is especially challenging. Therefore, in-situ surface analytical techniques and gas-phase diagnostics are often needed, and solid-state electronic devices need to be fabricated to characterize the electrical properties of the dielectric and its interface with silicon. In the search for the ideal high-k material for future device generations, the combinatorial approach is a powerful experimental tool for screening and examining a large number of promising mixed metal oxides and silicates. On the theoretical side, first-principles calculations [13.145–13.147] can be used to understand the local atomic and electronic structure of the dielectric/silicon interface and to extend the predictive power of such calculations to other promising but less studied metal oxide/silicon interfaces [13.148]. However, until a high-k candidate emerges that can meet all the desired properties (described in this book), the optimum deposition technique and/or deposition tool configuration cannot be completely decided upon.

References

13.1. J.P. Chang, J. Eng Jr., J. Sapjeta, R.L. Opila, P. Cox, and P. Pianetta, Cleaning Technology in Semiconductor Device Manufacturing, Proceedings of the Sixth International Symposium, Electrochemical Society Proceedings, 99–36, 129, xiii+614 (2000)
13.2. A. Delabie, M. Caymax, B. Brijs, E. Cartier, T. Conard, L. Geenen, W. Vandervorst, S. De Gendt, M.M. Heyns, P. Bajolet, J.W. Maes, and W. Tsai, Fourth International Conference on Microelectronics and Interfaces, Santa Clara, California, March 2003
13.3. Y. Wu, G. Lucovsky, Y.M. Lee, IEEE Transactions on Electron Devices 47 (7), 1361 (2000)


13.4. Y.-C. Yeo, P. Ranade, T.-J. King, C. Hu, IEEE Electron Device Letters 23 (6), 342 (2002)
13.5. L. Sha, B.-O. Cho, and J.P. Chang, J. Vac. Sci. Technol. A 20 (5), 1525–1531 (2002)
13.6. L. Sha and J.P. Chang, J. Vac. Sci. Technol. A 21 (6), 1915–1922 (2003)
13.7. L. Sha and J.P. Chang, J. Vac. Sci. Technol. B 21 (6), 2420–2427 (2003)
13.8. L. Sha and J.P. Chang, J. Vac. Sci. Technol. A 22 (1), 88–95 (2004)
13.9. K. Kukli, M. Ritala, R. Matero, M. Leskela, Journal of Crystal Growth 212 (3-4), 459 (2000)
13.10. J.P. Chang and Y.-S. Lin, J. Appl. Phys. 90 (6), 2964 (2001)
13.11. T. Suntola, Appl. Surf. Sci. 100-101, 391 (1996)
13.12. H. Kattelus, M. Ylilammi, J. Saarilahti, J. Antson, and S. Lindfors, Thin Solid Films 225 (1-2), 296 (1993)
13.13. M. Leskela, M. Ritala, Thin Solid Films 409 (1), 138 (2002)
13.14. O. Sneh, R.B. Clark-Phelps, A.R. Londergan, J. Winkler, T.E. Seidel, Thin Solid Films 402 (1-2), 248 (2002)
13.15. M. Ritala and M. Leskela, Handbook of Thin Film Materials, edited by H.S. Nalwa, Vol. 1, Academic Press (2002), Chap. 2: Atomic Layer Deposition
13.16. J.W. Klaus, O. Sneh, S.M. George, Science 278, 1934 (1997)
13.17. R.G. Gordon, J. Becker, D. Hausmann, and S. Suh, Mater. Res. Soc., Gate Stack and Silicide Issues in Silicon Processing II Symposium Proceedings 670, K.2.4.1 (2002)
13.18. J.W. Klaus, S.J. Ferro, S.M. George, Thin Solid Films 360 (1-2), 145 (2000)
13.19. R. Solanki, B. Pathangey, Electrochemical and Solid-State Letters 3 (10), 479 (2000)
13.20. J.P. Chang, Y.-S. Lin, S. Berger, A. Kepten, R. Bloom, and S. Levy, J. Vac. Sci. Technol. B 19 (6), 2137 (2001)
13.21. J.P. Chang and Y.-S. Lin, Appl. Phys. Lett. 79 (22), 3666–3668 (2001)
13.22. J.W. Klaus, S.J. Ferro, S.M. George, J. Electrochemical Soc. 147 (3), 1175 (2000)
13.23. M. Juppo, M. Ritala, M. Leskela, J. Electrochemical Soc. 147 (9), 3377 (2000)
13.24. P. Alen, M. Juppo, M. Ritala, T. Sajavaara, J. Keinonen, M. Leskela, J. Electrochemical Soc. 148 (10), G566 (2001)
13.25. R.L. Puurunen, A. Root, P. Sarv, S. Haukka, E.I. Iiskola, M. Lindblad, A.O.I. Krause, Appl. Surf. Sci. 165 (2-3), 193 (2000)
13.26. H. Akazawa, Journal of Crystal Growth 173 (3-4), 343 (1997)
13.27. L.P. Colletti, J.L. Stickney, J. Electrochemical Soc. 145 (10), 3594 (1998)
13.28. B. Sang, A. Yamada, M. Konagai, Jpn. J. Appl. Phys. 37 (2B), Part 2 (Letters), L206 (1998)
13.29. M. Vehkamaki, T. Hatanpaa, T. Hanninen, H. Ritala, M. Leskela, Electrochemical and Solid-State Letters 2 (10), 504 (1999)
13.30. Z. Wang, S. Oda, J. Electrochemical Soc. 147 (12), 4615 (2000)
13.31. Z. Wang, S. Oda, M. Karlsteen, U. Sodervall, M. Willander, Jpn. J. Appl. Phys., Part 1 39 (7A), 4164 (2000)
13.32. H. Goto, K. Shitahara, and S. Yokoyama, Appl. Phys. Lett. 68 (23), 3257 (1996)
13.33. D.-G. Park, H.-J. Cho, K.-Y. Lim, C. Lam, I.-S. Yeo, J.-S. Roh, J.W. Park, J. Appl. Phys. 89 (11, pt.1-2), 6275 (2001)


13.34. J.P. Chang and Y.-S. Lin, Appl. Phys. Lett. 79 (23), 3824 (2001)
13.35. J. Aarik, A. Aidla, H. Mandar, T. Uustare, K. Kukli, M. Schuisky, Appl. Surf. Sci. 173 (1-2), 15 (2001)
13.36. K. Kukli, A. Aidla, J. Aarik, M. Schuisky, A. Harsta, M. Ritala, M. Leskela, Langmuir 16 (21), 8122 (2000)
13.37. K. Kukli, M. Ritala, M. Leskela, T. Sajavaara, J. Keinonen, D. Gilmer, S. Bagchi, and L. Prabhu, J. Non-Crystalline Solids 303, 35 (2002)
13.38. Y. Senzaki, G.B. Alers, A.K. Hochberg, D.A. Roberts, J.T. Norman, R.M. Fleming, and H. Krautter, Electrochemical and Solid-State Letters 3 (9), 435 (2000)
13.39. M. Kiyatoshi, S. Yamaxaki, J. Nakahira, K. Egushi, K. Heida, H. Yamamoto, T. Umehara, K. Hasebe, T. Asano, K. Nakao, T. Arikado, and K. Okumura, The Ninth International Symposium on Semiconductor Manufacturing, IV b-6, 110 (2000)
13.40. K. Kukli, K. Forsgren, J. Aarik, T. Uustare, A. Aidla, A. Niskanen, M. Ritala, M. Leskela, A. Harsta, J. Crystal Growth 231 (1-2), 262 (2001)
13.41. M. Schuisky, K. Kukli, J. Aarik, J. Lu, A. Harsta, J. Crystal Growth 235 (1-4), 293 (2002)
13.42. K.H. Hwang, S.J. Choi, J.D. Lee, Y.S. You, Y.K. Kim, H.S. Kim, C.L. Song, S.I. Lee, ALD 2001 Topical Conference, AVS, Monterey, CA (2001)
13.43. J. Aarik, A. Aidla, H. Mandar, T. Uustare, V. Sammelselg, Thin Solid Films 408 (1-2), 97 (2002)
13.44. K. Kukli, M. Ritala, T. Uustare, J. Aarik, K. Forsgren, T. Sajavaara, M. Leskela, A. Harsta, Thin Solid Films 410 (1-2), 53 (2002)
13.45. A.W. Ott, K.C. McCarley, J.W. Klaus, J.D. Way, S.M. George, Applied Surface Science 107, 128 (1996)
13.46. J.W. Klaus, S.M. George, Surf. Sci. 447 (1-3), 81 (2000)
13.47. J.W. Klaus, S.J. Ferro, S.M. George, Thin Solid Films 360 (1-2), 145 (2000)
13.48. K. Kukli, K. Forsgren, M. Ritala, M. Leskela, J. Aarik, A. Harsta, J. Electrochemical Soc. 148 (12), F227 (2001)
13.49. P. Tagstrom, P. Martensson, U. Jannson, J.O. Carlsson, J. Electrochem. Soc. 146, 3139 (1999)
13.50. C.H. Liu, M. Yokoyama, Y.K. Su, N.C. Lee, Jpn. J. Appl. Phys., Part 1 (Regular Papers, Short Notes & Review Papers) 35 (5A), 2749 (1996)
13.51. M. Innocenti, G. Pezzatini, F. Forni, M.L. Foresti, J. Electrochemical Soc. 148 (5), C357 (2001)
13.52. M.L. Foresti, G. Pezzatini, M. Cavallini, G. Aloisi, M. Innocenti, R. Guidelli, J. Phys. Chem. B 102 (38), 7413 (1998)
13.53. A. Rahtu, T. Alaranta, M. Ritala, Langmuir 17 (21), 6506 (2001)
13.54. A. Paranjpe, S. Gopinath, T. Omstead, R. Bubber, J. Electrochemical Soc. 148 (9), G465 (2001)
13.55. M.A. Cameron and S.M. George, Thin Solid Films 348, 90 (1999)
13.56. J.P. Chang, Y. Lin, and K. Chu, J. Vac. Sci. Technol. B 19 (5), 1782–1787 (2001)
13.57. M. Ritala, K. Kukli, A. Rahtu, P. Raisanen, M. Leskela, T. Sajavaara, J. Keinonen, Science 288 (5464), 319 (2000)
13.58. R. Gordon, J. Becker, D. Hausmann, and S. Suh, Chem. Mater. 13, 2463 (2001)


13.59. D. Hausmann, E. Kim, J. Becker, and R. Gordon, Chem. Mater. 2002 (in press)
13.60. M. Suzuki, T. Magara, T. Takahashi, and H. Shinriki, ALD 2002 Topical Conference, Seoul, Korea (2002), p. 5
13.61. M. Nieminen, M. Putkonen, J. Niinisto, Appl. Surf. Sci. 174 (2), 155 (2001)
13.62. Y. Widjaja, C.B. Musgrave, Appl. Phys. Lett. 80 (18), 3304 (2002)
13.63. M.K. Gobbert, V. Prasad, T.S. Cale, Thin Solid Films 410 (1-2), 129 (2002)
13.64. S. Ramanathan, D.A. Muller, G.D. Wilk, C.M. Park, and P.C. McIntyre, Appl. Phys. Lett. 79 (20), 3311 (2001)
13.65. T. Gustafsson, H.C. Lu, B.W. Busch, W.H. Schulte, E. Garfunkel, Nuclear Instruments & Methods in Physics Research, Section B 183 (1-2), 146 (2001)
13.66. R.P. Pezzi, C. Krug, E.B.O. da Rosa, J. Morais, L. Miotti, I.J.R. Baumvol, Nuclear Instruments & Methods in Physics Research, Section B 190, 510 (2002)
13.67. J. Chappell, Electronic News, July (2002)
13.68. H. Siimon, J. Aarik, J. Phys. D (Applied Physics) 30 (12), 1725 (1997)
13.69. H. Siimon, J. Aarik, J. de Physique IV, Colloque C5, supplement au Journal de Physique II, 5, 245 (1995)
13.70. B. Guillaumot, X. Garros, F. Lime, K. Oshima, B. Tavel, J.A. Chroboczek, P. Masson, R. Truche, A.M. Papon, F. Martin, J.F. Damlencourt, S. Maitrejean, M. Rivoire, C. Leroux, S. Cristoloveanu, G. Ghibaudo, J.L. Autran, T. Skotnicki, S. Deleonibus, IEDM Technical Digest, 355 (2002)
13.71. C.M. Perkins, B.B. Triplett, P.C. McIntyre, K.C. Saraswat, S. Haukka, and M. Tuominen, Appl. Phys. Lett. 78 (16), 2357 (2001)
13.72. E.P. Gusev, E. Cartier, D.A. Buchanan, M. Gribelyuk, M. Copel, H. Okorn-Schmidt, and C. D’Emic, Microelectronic Engineering 59, 341 (2001)
13.73. Z. Xu, M. Houssa, S. de Gendt, and M. Heyns, Appl. Phys. Lett. 80 (11), 1975 (2002)
13.74. M. Stromme, G.A. Niklasson, M. Ritala, M. Leskela, K. Kukli, J. Appl. Phys. 90 (9), 4532 (2001)
13.75. R.B. van Dover, D.V. Lang, M.L. Green, L. Manchanda, J. Vac. Sci. Technol. A 19 (6), 2779 (2001)
13.76. K.F. Jensen, Chemical Vapor Deposition, Academic Press (1993), Chap. 2: Fundamentals of Chemical Vapor Deposition
13.77. C. Chaneliere, J.L. Autran, R.A.B. Devine, and B. Balland, Materials Science and Engineering R22, 269 (1998)
13.78. H.W. Chen, D. Landheer, X. Wu, S. Moisa, G.I. Sproule, T.S. Chao, and T.Y. Huang, J. Vac. Sci. Technol. A 20 (3), 1145 (2002)
13.79. B.C. Hendrix, A.S. Borovik, C. Xu, J.F. Roeder, T.H. Baum, M.J. Bevan, M.R. Visokay, J.J. Chambers, A.L.P. Rotondaro, H. Bu, and L. Colombo, Appl. Phys. Lett. 80 (13), 2362 (2002)
13.80. Y. Ohshita, A. Ogura, A. Hoshino, S. Hirro, H. Machida, J. Crystal Growth 233, 292 (2001)
13.81. D.G. Colombo, D.C. Gilmer, V.G. Young Jr., S.A. Campbell, and W.L. Gladfelter, Chem. Vap. Deposition 4 (6), 220 (1998)


13.82. R.C. Smith, T. Ma, N. Hoilien, L.Y. Tsung, M.J. Bevan, L. Colombo, J. Roberts, S.A. Campbell, and W.L. Gladfelter, Adv. Mater. Opt. Electron 10, 105 (2000)
13.83. R.I. Bickley, R.K.M. Jaynaty, J.A. Navio, C. Real, and M. Macias, Surf. Sci. 251/252, 1052 (1991)
13.84. D.C. Gilmer, D.G. Colombo, C.J. Taylor, J. Roberts, G. Haugstad, S.A. Campbell, H.S. Kim, G.D. Wilk, M.A. Gribelyuk, W.L. Gladfelter, Chem. Vap. Deposition 4, 9 (1998)
13.85. D.-O. Lee, P. Roman, C.-T. Wu, W. Mahoney, M. Horn, P. Mumbauer, M. Brubaker, R. Grant, J. Ruzyllo, Microelectronic Engineering 59 (1-4), 405 (2001)
13.86. K.F. Jensen and W. Kern, Thin Film Processes II, edited by J. Vossen and W. Kern, Academic Press, 1991, p. 283
13.87. B. He, N. Hoilien, R. Smith, T. Ma, C. Taylor, I. St. Omer, S.A. Campbell, W.L. Gladfelter, M. Gribelyuk, D. Buchanan, Proceedings of the Thirteenth Biennial University/Government/Industry Microelectronics Symposium (Cat. No.99CH36301), IEEE, 33 viii+224 (1999)
13.88. S.J. Lee, H.F. Luan, W.P. Bai, C.H. Lee, T.S. Jeon, Y. Senzaki, D. Roberts, and D.L. Kwong, International Electronic Devices Meeting, IEEE, 00-31, 2.4.1 (2000)
13.89. S.B. Samavedam, L.B. La, J. Smith, S. Dakshina-Murthy, E. Luckowski, J. Schaeffer, M. Zavala, R. Martin, V. Dhandapani, D. Triyoso, H.H. Tseng, P.J. Tobin, D.C. Gilmer, C. Hobbs, W.J. Taylor, J.M. Grant, R.I. Hegde, J. Mogab, C. Thomas, P. Abramowitz, M. Moosa, J. Conner, J. Jiang, V. Arunachalam, M. Sadd, B.-Y. Nguyen and B. White, IEDM Technical Digest, 433 (2002)
13.90. B.K. Park, J. Park, M. Cho, C.S. Hwang, K. Oh, Y. Han and D.Y. Yang, Appl. Phys. Lett. 80 (13), 2368 (2002)
13.91. H. Kim, S.M. Rossnagel, J. Vac. Sci. Technol. A 20 (3), 802 (2002)
13.92. J.-S. Min, H.-S. Park, S.-W. Kang, Appl. Phys. Lett. 75 (11), 1521 (1999)
13.93. J.-S. Park, H.-S. Park, S.-W. Kang, J. Electrochemical Soc. 149 (1), C28 (2002)
13.94. S.M. Rossnagel, A. Sherman, F. Turner, J. Vac. Sci. Technol. B 18 (4), 2016 (2000)
13.95. J. Koo, Y. Kim, and H. Jeon, Jpn. J. Appl. Phys. 41, Pt. 1 (5A), 3043 (2002)
13.96. C.W. Jrong, B. Lee, S.K. Joo, Mater. Sci. and Eng. C 16, 59 (2001)
13.97. S.X. Lao and J.P. Chang, 5th International Conference on Microelectronics and Interfaces, Santa Clara, CA, March 2004, p. 89
13.98. T.T. Van and J.P. Chang, Material Research Society, San Francisco, CA, April 2004, pp. E1.5, E1.6
13.99. W.-J. Lee, I.K. You, S.O. Ryu, B.G. Yu, K.I. Cho, S.G. Yoon, C.S. Lee, Jpn. J. Appl. Phys. 40, 6941 (2001)
13.100. G. Lucovsky, G.N. Rayner, and R.S. Johnson, Microelectronics Reliability 41, 937 (2001)
13.101. T.D. Abatemarco and G. Parsons, ALD 2002 Topical Conference, Seoul, Korea (2002), p. 4
13.102. B.O. Cho, S. Lao, and J.P. Chang, J. Vac. Sci. Technol. A 19 (6), 2751 (2001)


13.103. H. Holzschuh and H. Suhr, Appl. Phys. Lett. 59 (4), 470 (1991)
13.104. B.-O. Cho, J.-J. Wang, L. Sha, and J.P. Chang, Appl. Phys. Lett. 80 (6), 1052 (2002)
13.105. B.O. Cho and J.P. Chang, J. Appl. Phys. 92 (8), 4238–4244 (2002)
13.106. H.H. Tseng, J. Veteran, P.J. Tobin, J. Mogab, P.G.Y. Tsui, V. Wang, M. Khare, X.W. Wang, T.P. Ma, C. Hobbs, R. Hegde, M. Hartig, G. Kenig, R. Blumenthal, R. Cotton, V. Kaushik, T. Tamagawa, B.L. Halpern, G.J. Cui, J.J. Schmitt, Materials Science in Semiconductor Processing 3 (3), 173 (2000)
13.107. M.G. Krishna, K.N. Rao, and S. Mohan, Appl. Phys. Lett. 57, 557 (1990)
13.108. V.V. Klechkovskaya, V.I. Khitrova, S.I. Sagitov, and S.A. Semiletov, Sov. Phys. Crystallogr. 25, 636 (1980)
13.109. F. Jones, J. Vac. Sci. Technol. A 6, 3088 (1988)
13.110. W.J. Qi et al., Appl. Phys. Lett. 77 (11), 1704 (2000)
13.111. T. Kim et al., Appl. Phys. Lett. 76 (21), 3043 (2000)
13.112. P. Gao, L. Meng, M. Santos, V. Teixeira, and M. Andritschky, Vacuum 56, 143 (2000)
13.113. D. Niu, R.W. Ashcraft, G.N. Parsons, Appl. Phys. Lett. 80 (19), 3575 (2002)
13.114. E. Disbiens, R. Dolbec, and M.A. Khakani, J. Vac. Sci. Technol. A 20 (3), 1157 (2002)
13.115. K. Chu, J.P. Chang, M.L. Steigerwald, R.M. Fleming, R.L. Opila, D.V. Lang, R.B. Van Dover, and C.D.W. Jones, J. Appl. Phys. 91 (1), 308–316 (2002)
13.116. G.D. Wilk and R.M. Wallace, Appl. Phys. Lett. 76 (1), 112 (2000)
13.117. A. Callegari, E. Cartier, M. Gribelyuk, H.F. Okorn-Schmidt, and T. Zabel, J. Appl. Phys. 90 (12), 6466 (2001)
13.118. M.R. Visokay, J.J. Chambers, A.L.P. Rotondaro, A. Shanware, and L. Colombo, Appl. Phys. Lett. 80 (17), 3183 (2002)
13.119. L. Manchanda, M.L. Green, R.B. van Dover, M.D. Morris, A. Kerber, Y. Hu, J.-P. Han, P.J. Silverman, T.W. Sorsch, G. Weber, V. Donnelly, K. Pelhos, F. Klemens, N.A. Ciampa, A. Kornblit, Y.O. Kim, J.E. Bower, D. Barr, E. Ferry, D. Jacobson, J. Eng, B. Busch, H. Schulte, International Electron Devices Meeting, IEDM (Cat. No.00CH37138), 23 (2000)
13.120. R.J. Cava and J.J. Krajewski, J. Appl. Phys. 83 (3), 1613 (1998)
13.121. K.M. Ghanashy, K. Narashima, S. Mchan, Thin Solid Films 193, 690 (1990)
13.122. E.E. Khawaja, F. Bouamrane, A.B. Hallak, M.A. Daous, and M.A. Salim, J. Vac. Sci. Technol. A 11, 580 (1993)
13.123. J. Kwo, M. Hong, A.R. Kortan, K.L. Queeney, Y.J. Chabal, R.L. Opila, D.A. Muller, S.N.G. Chu, B.J. Sapjeta, T.S. Lay, J.P. Mannaerts, T. Boone, H.W. Krautter, J.J. Krajewski, A.M. Sergnt, J.M. Rosamilia, J. Appl. Phys. 89 (7), 3920 (2001)
13.124. D.K. Sarkar, E. Disbiens, and M.A. Khakani, Appl. Phys. Lett. 80 (2), 294 (2002)
13.125. S.W. Nam, J.H. Yoo, H.Y. Kim, S.K. Kang, D.H. Ko, C.W. Yang, H.J. Lee, M.H. Cho, and J.H. Ku, J. Vac. Sci. Technol. A 19 (4), 1720 (2001)
13.126. G.D. Wilk, R.M. Wallace, and J.M. Anthony, J. Appl. Phys. 87 (1), 484 (2000)


13.127. W.-J. Qi, R. Nieh, E. Dharmarajan, B.H. Lee, Y. Jeon, L. Kang, K. Onishi, and J.C. Lee, Appl. Phys. Lett. 77 (11), 1704 (2000)
13.128. R. Choi, K. Onishi, C.S. Kang, S. Gopalan, R. Nieh, Y.H. Kim, J.H. Han, S. Krishnan, H. Cho, A. Shahriar, and J.C. Lee, IEDM Technical Digest, 613 (2002)
13.129. M. Shahjahan, N. Takahashi, K. Sawada, and M. Ishida, IEEE International Workshop on Gate Insulator (IWGI), 7.15, 160 (2001)
13.130. L.A. Ragnarsson, S. Guha, M. Copel, E. Cartier, N.A. Bojarczuk, J. Karasinski, Appl. Phys. Lett. 78 (26), 4169 (2001)
13.131. R.A. McKee, F.J. Walker, and M.F. Chisholm, Phys. Rev. Lett. 81, 3014–3017 (1998)
13.132. S. Guha, N.A. Bojarczuk, V. Narayanan, Appl. Phys. Lett. 80 (5), 766 (2002)
13.133. M. Ishida, K. Sawada, S. Yamaguchim, T. Nakamura, T. Suzaki, Appl. Phys. Lett. 55, 56 (1989)
13.134. P. Singer, Semiconductor International 24 (12), 36 (2001)
13.135. B.S. Meyerson, K.J. Uram, and K.F. LeGoues, Appl. Phys. Lett. 53, 2555 (1998)
13.136. M. Matsuoka, S. Isotani, J.F.D. Chubaci, S. Miyake, Y. Setsuhara, K. Ogata and N. Kuratani, J. Appl. Phys. 88 (6), 3773 (2000)
13.137. A.S. Kao, G.L. Gorman, J. Appl. Phys. 67, 3826 (1990)
13.138. S. Miyahe, I. Shimizu, R.R. Manory, T. Mori, and G. Kimmel, Surf. Coatings Technol. 146-147, 237 (2001)
13.139. M. Yoshitake, K. Takiguchi, Y. Suzuki, and S. Ogawa, J. Vac. Sci. Technol. A 6, 2326 (1988)
13.140. E. Celik, J. Schwartz, E. Avci, and Y.S. Hascicek, IEEE Transactions on Applied Superconductivity 19, Pt. 2 (2), 1916 (1999)
13.141. L. Yang and J. Cheng, J. Non-Crystalline Solids 112, 422 (1989)
13.142. S. Jana and P.K. Biswas, Materials Letters 30, 53 (1997)
13.143. R. Corriou, D. Leclercq, P. Lefèvre, P.H. Mutin, A. Vioux, Chem. Mater. 4, 961 (1992)
13.144. A. Vioux, Chem. Mater. 9, 2292 (1997)
13.145. R.M. Rignanese, F. Detraux, X. Gonze, and A. Pasquarello, Phys. Rev. B 64 (13), 134301 (2001)
13.146. K. Cho, Computational Materials Science 23 (1-4), 43 (2002)
13.147. R. Puthenkovilakam, E. Carter, and J.P. Chang, accepted for publication in Physical Review B 69 (11), March 2004
13.148. V.V. Brodskii, E.A. Rykova, A.A. Bagatur’yants, A.A. Korkin, Computational Materials Science 24 (1-2), 278 (2002)

14 Issues in Metal Gate Electrode Selection for Bulk CMOS Devices

V. Misra

14.1 Background

In the early stages of microelectronics development, doped polysilicon gates replaced metal gates due to their high-temperature compatibility with SiO2, which enabled the formation of self-aligned MOSFETs. In addition, the incorporation of n+ or p+ dopants into polysilicon via ion implantation provided low and high work functions, which were desirable for achieving low threshold voltages for CMOS devices. Since then, the semiconductor industry has been aggressively scaling the transistor dimensions. Specifically, the scaling of minimum feature sizes, such as the channel length, has been the major driving force for improving circuit speed, reducing power dissipation and increasing packing density. According to projections made in the ITRS [14.1], sub-1 nm gate oxides are required for IC devices at 50 nm and beyond to maximize gate control and improve short channel effects. Unfortunately, the scaling of SiO2 below 1.5 nm is plagued by several key challenges, such as increased gate leakage current, dopant penetration from the gate electrode and an increased effect of polysilicon depletion. In order to reduce gate leakage while maintaining capacitance, high-K dielectrics are being pursued. The leading candidates include HfO2, ZrO2, Zr- and Hf-based silicates and Al2O3, all of which have dielectric constants > 10 and can therefore be made physically thicker than SiO2, resulting in significant reductions in tunneling currents while still producing low equivalent oxide thickness values.

Although polysilicon gates have been the workhorse of transistor gate technology, their usefulness in scaled devices is diminishing. This is attributed to the gate depletion problem that affects polysilicon gates on ultrathin dielectrics. The origin of this gate depletion is the lower concentration of dopant atoms/carriers near the gate electrode/dielectric interface (less than 10²¹ carriers per cm³), which results in a depletion layer under gate inversion conditions. This depletion layer (∼3–4 Å in thickness) introduces an additional capacitance that is in series with the gate dielectric capacitance. This additional capacitance reduces the net gate capacitance and degrades short channel effects. Furthermore, since the impact of this gate depletion capacitance increases as the gate dielectric capacitance increases, it becomes necessary to consider metal gates for ultra-thin dielectrics. In addition to the gate depletion problem, reactions at the polysilicon gate/high-K dielectric interface can produce silicides, which compromise the work function and dielectric integrity of the gate stack. Recently, it has also been reported that polysilicon gates suffer from Fermi level pinning on HfO2-based dielectrics, which limits their usefulness on high-K gate stacks [14.2]. These reasons warrant the investigation of metals and metallic compounds as gate electrodes for CMOS devices.
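Two of the quantitative points above lend themselves to a short numerical sketch: the equivalent oxide thickness of a physically thicker high-K layer, and the growing relative penalty of a series depletion layer as the dielectric is thinned. The code below assumes that the ∼3–4 Å depletion layer can be treated as an SiO2-equivalent thickness of 0.35 nm added in series. That interpretation, the example permittivity k = 20 and the function names are illustrative assumptions, not values from the chapter.

```python
K_SIO2 = 3.9  # relative permittivity of SiO2

def eot_nm(physical_thickness_nm, k):
    """SiO2-equivalent thickness of a dielectric layer with relative permittivity k."""
    return physical_thickness_nm * K_SIO2 / k

def capacitance_loss_fraction(dielectric_eot_nm, depletion_eot_nm=0.35):
    """Fraction of gate capacitance lost to a series depletion layer.

    Capacitance per unit area scales as 1/EOT, and series capacitors add
    as thicknesses once both layers are expressed as SiO2 equivalents.
    """
    total_eot_nm = dielectric_eot_nm + depletion_eot_nm
    return 1.0 - dielectric_eot_nm / total_eot_nm

# A 4 nm film with an assumed k = 20 behaves electrically like a much thinner SiO2 layer.
print(f"EOT of 4 nm, k=20 film: {eot_nm(4.0, 20.0):.2f} nm")

# The same depletion layer costs proportionally more as the dielectric EOT shrinks.
for eot in (3.0, 2.0, 1.0):
    loss = capacitance_loss_fraction(eot)
    print(f"dielectric EOT {eot:.1f} nm: {100.0 * loss:.0f}% of the capacitance lost")
```

In this simple series model, the loss rises from roughly a tenth to roughly a quarter of the gate capacitance as the dielectric EOT goes from 3 nm to 1 nm. This is the trend that motivates metal gates for ultra-thin dielectrics.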

14.2 Metal Gate Selection Criteria

In order for metals to replace n+ and p+ polysilicon gate electrodes, metal gate candidates must have i) compatible work functions (Φm), ii) thermal/chemical interface stability with the underlying dielectric and iii) high carrier concentration. To replace n+ and p+ poly-Si and maintain scaled performance for bulk CMOS devices, i.e., low threshold voltages and good short channel characteristics, it is necessary to identify pairs of metals with work functions that are near the conduction band and valence band edges of Si [14.3], i.e., gate electrodes for NMOS and PMOS devices must have work functions near 4.1 eV and 5.2 eV, respectively. It should be noted that mid-gap work function metals (e.g., TiN and W) are inadequate for advanced bulk silicon CMOS devices due to a) threshold voltages that are too large for low-voltage operation or b) severely degraded short channel characteristics due to low substrate doping density [14.4]. Further studies are required to verify whether or not mid-gap work function metals are suitable for fully depleted SOI devices. Some general trends in anticipated properties that can be drawn from the extensive studies of metal-silicon contacts include: a) work functions of metals increase as the electronegativity of the metal or metal compound increases; b) since Si has a higher electronegativity than many transition metals, metal silicide work functions are generally in the lower half of the band gap of Si; and c) the work functions of metal oxides and nitrides are in general larger than those of their corresponding elemental metals. In addition to the appropriate work function requirements, several material properties such as free energy of oxide formation, oxygen solubility, diffusion barrier properties and film microstructure can be used as predictors of thermal stability.
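The work function targets stated above (near 4.1 eV for NMOS and near 5.2 eV for PMOS) can be expressed as a simple screening check against the silicon band edges. The sketch below uses standard textbook values for the Si electron affinity and band gap. The 0.2 eV acceptance window and the function name are hypothetical choices for illustration only.

```python
# Silicon band edges measured from the vacuum level (standard textbook values).
SI_ELECTRON_AFFINITY_EV = 4.05   # conduction band edge
SI_BAND_GAP_EV = 1.12            # room-temperature band gap

def classify_gate_work_function(work_function_ev, window_ev=0.2):
    """Label a work function by the Si band edge it lies closest to.

    window_ev is a hypothetical acceptance window, not a value from the text.
    """
    conduction_edge = SI_ELECTRON_AFFINITY_EV
    valence_edge = SI_ELECTRON_AFFINITY_EV + SI_BAND_GAP_EV
    if abs(work_function_ev - conduction_edge) <= window_ev:
        return "NMOS candidate"
    if abs(work_function_ev - valence_edge) <= window_ev:
        return "PMOS candidate"
    return "neither band edge"

for phi in (4.1, 4.6, 5.2):
    print(f"{phi:.1f} eV: {classify_gate_work_function(phi)}")
```

A value near 4.6 eV falls close to the middle of the gap and is rejected by both tests, consistent with the remark that mid-gap metals are inadequate for bulk CMOS.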
Since electronegativity is generally inversely related to free energy of oxide formation, and is proportional to work function, elemental metals with lower work functions (i.e. NMOS compatible) will exhibit problems with stability. This implies that most low work function NMOS compatible elemental metals are expected to suffer from high temperature instability and reactions resulting in a degraded interface with the underlying dielectric [14.5]. The reaction layers that form can either be insulating and/or conducting and can correspondingly affect the equivalent oxide thickness (EOT) and the flatband voltage (VFB ) shift of the device. For example, metals such as Ta, Ti, Al, Zr, Hf, which have work functions less than 4.3 eV are considered inadequate for gate application under conventional process

14 Issues in Metal Gate Electrode Selection for Bulk CMOS Devices

417

flows since they react with the underlying dielectric. It has been found that low work function metals, such as Zr, Hf, and Ti, tend to reduce underlying dielectrics resulting in poor insulating properties [14.5]. Metal oxide semiconductor capacitors have been used to study the interaction of Hf and Zr gate electrodes on SiO2 , ZrSix Oy and ZrO2 . A large reduction in the SiO2 equivalent oxide thickness (EOT) accompanied by an increase in the leakage current was observed after Hf electrodes were subjected to anneal temperatures as low as 400◦ C. The reduction in electrical thickness as observed from the capacitance-voltage measurements was attributed to the combination of a) physical thinning of the SiO2 and b) formation of a high-K layer. Severe instability of Zr and Hf electrodes was also observed on ZrSix Oy and ZrO2 dielectrics. This behavior of Zr and Hf gates was attributed to high electronegativity and high oxygen solubility resulting in the reduction of the gate dielectric and subsequent oxygen diffusion to the gate electrode. Elemental metals with larger work functions, on the other hand, provide intrinsic stability due to their expected low free energy of metal oxide formation. However the challenge here lies in selecting metals that exhibit good adhesion on the underlying dielectrics. Adhesion is impacted by the formation of chemical bonding between the metal gate and dielectric interface. High work function metals, such as Pt, Au, do not react with SiO2 and therefore do not form a good adhesion layer [14.6]. Films with poor adhesion properties are also more susceptible to stress related failure such as cracking or buckling. In addition, thermal stresses in thin films can cause poor adhesion, cracking or strain the underlying dielectric, and since metals often have larger thermal expansion coefficient than the silicon substrate, the process induced stress incorporation due to varying thermal expansion coefficients must be minimized. 
It is also important to mention the techniques that can be used to measure the work function of electrodes on dielectrics. In order to decouple the effect of charge in the dielectric from the true work function, measurements of the flatband voltage (VFB) can be made on dielectrics of varying thickness. If the charge in the dielectric is minimal, then VFB plotted against oxide thickness, or against equivalent oxide thickness (EOT), follows VFB = Φm − Φs − Qf tox/εox. This plot is a straight line whose y-intercept gives Φm − Φs, from which the metal work function follows once the semiconductor work function Φs is known. It should be noted, however, that the accuracy of this methodology is only as good as the linear fit of the VFB vs. EOT plot. If charge is present in the bulk of the dielectric, which will likely be the case for high-K dielectrics, then this method will be error-prone. Other techniques, such as those based on tunneling current measurements [14.7], may be needed to corroborate the work function. The work function and thermal stability behavior of metal gates on high-K dielectrics may raise additional issues that need to be considered. It has been reported that the work function of metals is expected to be different on high-K dielectrics as compared to SiO2 due to interface-induced gap states. These gap states are believed to originate from exponential decay of the wave function tails of the metal electrons into the


V. Misra

dielectric and promote charge transfer at the gate-dielectric interface until the net charge is zero [14.8–14.10]. This charge transfer shifts the metal Fermi level away from its vacuum work function level. The amount of shift depends, among other factors, on the electronic component of the dielectric constant. This implies that dielectrics with a larger dielectric constant will produce a larger degree of metal Fermi level pinning and hence a larger change in the effective work function values. In addition to the above intrinsic mechanism, extrinsic mechanisms such as reaction layers can also affect the work function of metals on high-K. Since, in the absence of Fermi pinning, the work function is determined by the Debye length of the metal [14.11], even a reaction layer of 5 Å is enough to change the work function. Therefore, in order to identify the origin of work function shifts, it is imperative to measure the work function accurately. As mentioned above, this is especially critical when gate dielectrics contain high levels of charge located either at internal interfaces or in their bulk. In such a case, a simple VFB vs. EOT fit will fail to yield the accurate work function, and the bulk charge must be carefully decoupled from the work function to obtain an accurate value. This decoupling may require the use of varying thicknesses of the high-K dielectric stacks. Finally, metal gates under consideration must have a high carrier concentration to avoid gate depletion. Carrier concentrations exceeding 10^23 electrons/cm^3 are routine for elemental metals; however, metal nitrides and metal silicon nitrides may have lower carrier concentrations due to the presence of non-metallic bonding [14.12]. This can potentially create gate depletion issues, and therefore the carrier concentrations of metal nitrides and metal silicon nitrides must be measured as part of the selection process. 
The resistivity of the gate electrode, which is a function of grain size and impurity levels (e.g. oxygen), should also be minimized. However, a high resistivity gate electrode may still be practical if a thin film is deposited to set the work function and a lower resistivity metal is deposited on top to reduce the sheet resistance of the gate electrode stack.
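The VFB vs. EOT extraction described in this section can be sketched in a few lines of code. The following is a minimal illustration, assuming the simple interface-charge model VFB = Φm − Φs − Qf tox/εox; the function names, the substrate work function value, and the data points are hypothetical and are not taken from the chapter.

```python
# Sketch: extract a metal work function from flatband voltage vs. EOT data.
# Assumed model (from the text): VFB = (Phi_m - Phi_s) - Qf * t_ox / eps_ox
# All names and numbers below are illustrative assumptions.

EPS_OX = 3.9 * 8.854e-14  # permittivity of SiO2 in F/cm


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x; also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return intercept, slope, r2


def extract_work_function(eot_nm, vfb_v, phi_s_ev):
    """Return (Phi_m in eV, Qf in C/cm^2, R^2 of the linear fit)."""
    intercept, slope, r2 = fit_line(eot_nm, vfb_v)
    phi_m = intercept + phi_s_ev      # the intercept is Phi_m - Phi_s
    slope_v_per_cm = slope * 1e7      # convert V/nm to V/cm
    qf = -slope_v_per_cm * EPS_OX     # slope is -Qf / eps_ox
    return phi_m, qf, r2


if __name__ == "__main__":
    # Hypothetical measurements: EOT in nm, VFB in V.
    eot = [3.0, 5.0, 8.0, 12.0, 20.0]
    vfb = [-0.664, -0.673, -0.687, -0.706, -0.743]
    phi_m, qf, r2 = extract_work_function(eot, vfb, phi_s_ev=4.95)
    print(f"Phi_m = {phi_m:.2f} eV, Qf = {qf:.2e} C/cm^2, R^2 = {r2:.4f}")
```

A low R² is the practical warning sign mentioned above: when the points do not fall on a straight line, the intercept no longer isolates Φm − Φs and the extracted value should be treated only as an effective work function.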

14.3 Other Challenges with Metal Gates

The implementation of metal gates may not come without some additional costs. Although the replacement of SiO2 with a high-K dielectric generally does not change the integration flow, the incorporation of dual work function metals is expected to greatly increase integration complexity. This is due to the fact that unlike in dual polysilicon gate technologies where a single polysilicon layer is deposited and then implanted, dual metal gates will most likely require two metal deposition steps. Since damage to the underlying dielectric may occur when the first metal is removed from one of the device regions, a redeposition of the dielectric may be required prior to the deposition of the second metal. Next, another non-critical photo step will be needed


to pattern the second metal and the gate dielectric (alternatively, CMP can be used to planarize and remove the dielectric). This results in a net of two additional non-critical photo steps. There are alternate integration schemes; for example, if a metal can be implanted with a species that alters its work function to the desired value, then a process flow similar to that of polysilicon may be obtained. Recently, it has been demonstrated that Mo can be implanted with N to produce a higher work function film [14.13]. Another integration route may involve the deposition of a metal stack consisting of metals A and B, followed by the removal of the top metal B from one of the well regions. A controlled reaction between the two metals, either diffusion [14.14] or intermixing [14.15], may then produce a work function that differs from that of the region in which only metal A exists. This methodology would require only one additional non-critical photo step. Some approaches to tackle the integration challenges will be discussed later in this chapter.

14.4 Metal Gate Candidates for NMOS Devices

As discussed above, a major challenge for metal gates lies in finding low work function metals for NMOS which also offer good thermal stability. One option is to introduce N and/or Si into low work function metals to improve their stability. Metal alloys such as TaN and TaSixNy films have been studied as barrier layers for Cu interconnects and have shown superior diffusion barrier properties [14.12]. The presence of N is believed to stuff the grain boundaries, effectively retard reaction rates, provide better diffusion barrier properties and result in smoother microstructures. These materials may be potential NMOS gate electrode candidates provided the presence of N and Si does not increase the work function.

14.4.1 Metal Nitrides

Refractory metal nitrides have been shown to increase the critical temperature for reactions of the metal with materials such as crystalline silicon [14.16]. The structure of a metal nitride often consists of close-packed metal atoms with nitrogen occupying octahedral interstitial positions. The metal nitrides are referred to as interstitial alloys because of the interstitial locations of the nitrogen atoms. Bonding in metal nitrides is due to the interaction of nitrogen 2s and 2p orbitals with the metal d orbitals [14.16]. The direction of charge transfer is from nitrogen to metal. The bonds in metal nitrides have covalent and metallic components. Ideal stoichiometry is not often realized for refractory metal nitrides. The phases exist over a wide range of compositions, with large fractions of the non-metal lattice sites vacant and a smaller fraction of the metal lattice sites vacant. The vacancy concentration has a significant impact on the thermodynamic, electrical and mechanical properties of these films. Different deposition techniques tend to result in different


Fig. 14.1. Capacitance-voltage showing a positive shift in the flatband voltage for a range of nitrogen partial pressures during sputtering

defect structures, which explains why the properties of metal nitride films vary so greatly in the literature. Of the refractory metal nitrides, tantalum nitride (TaN) is considered promising for NMOS gate electrode applications because of its excellent diffusion barrier properties and high melting point. TaN (and WN) have also been found to block the diffusion of oxygen better than TiN [14.17]. However, the reported work function values of transition metal nitrides have varied widely in the literature. This variation may be attributed to the various techniques used to deposit the films, and to the different characterization procedures that have been used to determine the work function value. In some cases, the reported work function values are extracted from insufficient data or measured using methods that are fundamentally flawed. For instance, the work function of the gate electrode cannot be calculated from a single C–V curve since the flatband voltage includes the effect of charges within the dielectric. The work function of TaN has been reported to range from 4.2 eV to 4.9 eV. The role of N in the transition metal nitride is not yet clear since it may create charge in the dielectric and/or change the work function of the gate. Wakabayashi et al. [14.18] concluded that the work function of TiNx decreases as the concentration of nitrogen is increased. In addition, nitridation of the gate oxide during gate deposition has been observed in the WNx/SiO2 system [14.19]. Thus, when determining the work function, a method that takes into account the charges in the dielectric should be used. Experimental data on MOS capacitors with SiO2 gate dielectric were obtained for TaNx gate electrodes formed via reactive sputtering of Ta films in varying partial pressures of N. As shown in Fig. 14.1, a positive shift of nearly


Fig. 14.2. Flatband voltage (V) as a function of dielectric thickness, EOT (nm), for Ta1−xNx (5, 15 and 71%) on SiO2 following forming gas anneal at 400 °C for 30 min

250 mV in flatband voltage is observed for all gate electrodes containing nitrogen as compared to Ta gates. Work function extraction via VFB vs. EOT analysis indicated a work function of nearly 4.1 eV for Ta films. However, the VFB vs. EOT plot for Ta1−xNx films deviated from linearity in the thin oxide regime (less than 35 Å), which prevented a simple work function extraction (Fig. 14.2). This deviation was found to be a strong function of the N2 flow rate during sputtering, suggesting that N plays a role in the VFB deviation towards more negative values. Nitrogen tends to contribute a positive charge to the dielectric, which shifts VFB to more negative values, and this shift becomes exaggerated for thin oxides, suggesting non-linear charge incorporation in the dielectric. The dielectric thickness at which the rollover sets in increases with increasing N2 flow rate, confirming the role of nitrogen. Both conductance measurements of interface state densities and electron energy loss spectra have confirmed the incorporation of N into the SiO2 film during the deposition process. Due to the non-linearity of the VFB vs. EOT curves, an accurate extraction of the work function is not possible. If the thin oxide data are removed, then a work function value of ∼4.55 eV is obtained. Barrier height values obtained from Fowler–Nordheim tunneling currents were near 3.7 eV, which agrees well with the above CV extraction. However, the assumption that the effective mass and tunneling characteristics are the same as in the poly-Si/SiO2 system may not be accurate and needs to be validated by further studies. In addition, the use of barrier heights to extract the work function is highly sensitive to the physical and chemical nature of the gate electrode/dielectric interface, since changes at the interface will affect the tunneling currents and hence the barrier heights.
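The consistency between the tunneling and CV results quoted above can be checked with a one-line estimate. The following is a rough sketch that assumes an ideal interface, where the barrier height for electron injection from the gate equals the metal work function minus the SiO2 electron affinity. The value χ ≈ 0.9 eV is a commonly quoted figure and is an assumption here, not a number taken from this chapter.

```latex
\Phi_m \approx \Phi_B + \chi_{\mathrm{SiO_2}}
       \approx 3.7\ \mathrm{eV} + 0.9\ \mathrm{eV}
       = 4.6\ \mathrm{eV}
```

Under these assumptions, the measured barrier height corresponds to a work function within about 0.05 eV of the ∼4.55 eV obtained from the thick-oxide VFB vs. EOT data.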


Fig. 14.3. (a) High resolution transmission electron micrograph of Ta1−xNx (5% flow)/SiO2/Si after a 400 °C anneal, (b) High resolution transmission electron micrograph of Ta1−xNx (5% flow)/SiO2/Si after a 1000 °C anneal

Fig. 14.4. (a) Auger Electron Spectroscopy Depth Profile of as-deposited W/TaN(5%)/SiO2/Si, (b) Auger Electron Spectroscopy Depth Profile of W/TaN(5%)/SiO2/Si annealed at 1000 °C for 15 s in Argon


Thermal stability studies of TaN have revealed that for all but the most N-rich TaN electrodes, a non-homogeneous reaction layer is formed between the TaN and the SiO2 dielectric at anneal temperatures exceeding 900 °C. It is believed that the reaction is initiated by SiO2 reduction and the subsequent incorporation of oxygen into a Ta–N–Si matrix. As shown in Figs. 14.3 and 14.4, high-resolution transmission electron micrographs and Auger depth profiling have revealed that the reaction layer forms within the region that was previously SiO2, as evidenced by the oxygen redistribution from within the SiO2 layer into the Ta1−xNx region. Based on these results, it is concluded that the reaction layer primarily consists of a combination of Ta, O, Si and N. This reaction layer has an RC component, which contributes to the capacitance changes that are observed. Although the very N-rich TaN gates did not suffer from reaction layer formation, they were plagued by other issues such as enhanced N diffusion into the dielectric and a higher film resistivity. In addition, work function determination remains complicated for ultrathin dielectrics due to non-linearity in VFB vs. EOT, resulting in “effective” work function values that differ from the accurate work function values.

14.4.2 Metal Silicon Nitrides

Due to the problems associated with TaN listed above, other approaches have also been evaluated. It was found that the TaNx alloy can be further stabilized by the addition of silicon, which not only improves the diffusion barrier properties but also retards grain growth compared to TaNx films. Furthermore, an amorphous or nanocrystalline film can have intrinsically better interface stability since grain boundaries can act as oxygen scavenging sites. TaSiN films have been studied both as a barrier layer for Cu interconnects and as a bottom electrode for dynamic random access memories. They have shown low resistivity and excellent thermal/chemical stability against high-temperature processing. We have demonstrated that TaSixNy films on SiO2 provide good thermal stability and result in a minimal change of equivalent oxide thickness and stable leakage currents [14.20]. After a 1000 °C anneal, the reaction of Ta53Si47 with SiO2 increased the interfacial roughness and oxide thickness in the TaSix/SiO2/p-Si structures. With added nitrogen, by contrast, the chemical structure of Ta22Si29N49 was stable up to 1000 °C and exhibited lower leakage current than the TaSix gate. The amorphous nature of the high-N-content TaSixNy films, which retarded the formation of the interface layer under high temperature annealing, is attributed to the presence of Si–N bonds. This excellent stability of TaSixNy films may enable their use as electrodes with high-K dielectrics for advanced CMOS devices. However, we found that the composition of the TaSixNy film was critical in achieving work function stability, thermal stability and prevention of N diffusion. We have found that TaSixNy alloys can result in low work function values (4.4 eV) provided the Ta/Si ratio in the films is < 1 [14.21]. The presence of Si prevents


Fig. 14.5. Flatband voltage (V) as a function of dielectric thickness, EOT (nm), for TaSiN with varying N flow rates (0, 5, 10, 15 and 25%) on SiO2 after forming gas anneal (furnace, 400 °C, 30 min)

Fig. 14.6. The nitrogen profile, N concentration (atoms/cc) vs. depth (Å), for nitrogen diffusion from gate electrode through SiO2 at 1000 °C for 15 sec. (top) TaNx/SiO2/Si and (bottom) TaSixNy/SiO2/Si structures

N diffusion to the underlying Si-dielectric interface, both for thin dielectrics and under high temperature processing, as is evidenced by the linearity of the VFB vs. EOT curves (Fig. 14.5). Furthermore, SIMS analysis, shown in Fig. 14.6, did not reveal any N in the gate dielectric, indicating that the presence of Si was effective in blocking N diffusion. However, the Si content must be carefully optimized since excess Si can result in high resistivity and in the formation of TaSi2 at the gate dielectric/gate electrode interface, resulting in a high work function film (∼4.8 eV). This effect can be mitigated by reducing the Si content and/or adding N to the film. It was found that reducing the Si content alone did


not prevent the TaSi2 formation during high temperature processing, while removing Si completely, i.e. TaN, resulted in work functions that were too high [14.22]. The N content was also optimized, since too little did not prevent the TaSi2 formation and too much resulted in N diffusion. Therefore, the presence of both Si and N was deemed necessary, and their content was critical in obtaining optimized TaSixNy gates that are suitable for NMOS devices. The optimized TaSixNy films exhibit excellent thermal stability of EOT and VFB. It is believed that the presence of Si–N bonds in the film improves the thermal stability of the gate-dielectric interface while maintaining an appropriate work function. Recently, MOSFETs with TaSiN have also been demonstrated with good electrical results [14.23].

14.4.3 Binary Metal Alloys

An alternate route to obtaining low work function films is the alloying of elemental metals. For example, metals with low and high work functions can be alloyed to provide desired work functions with good thermal stability. This approach avoids the use of N, which can diffuse under high thermal budgets, or Si, which can lead to the formation of di-silicides which have large work functions. Work function modulation using binary alloys has been studied before with systems like AuAg, PtRh, etc. [14.24, 14.25]. However, the work function range of these alloys was limited, extending only from 4.5 to 5 eV. This is understandable since the two metals involved in each of the above alloys were separated by only 0.5 eV in work function. In addition to work function tuning, binary alloys may also offer thermodynamic stability owing to strong metal-metal bonding. For example, Al is very reactive even at low temperature, but the Ni–Al compound is substantially more stable than either pure Ni or Al, although the alloy is still not stable enough to withstand the typical post-implantation heat treatment at temperatures near 1000 °C for ∼10 seconds [14.26]. Pt–Al alloy films with compositions between Pt1Al1 and PtAl2 were found to satisfy the stability requirements of GaAs self-aligned gate processing [14.27]. Studies of Mo–Al alloy Schottky contacts to n-GaAs indicated that Mo–Al alloys with compositions between Mo3Al and Mo3Al8 are stable after rapid thermal annealing up to 900 °C [14.28]. For an AxB1−x alloy, it is natural to consider that the work function φ(x) changes linearly with x as φ(x) = xφA + (1 − x)φB, where φA and φB represent the work functions of the pure elements. The Ni–Cu alloy system was found to follow a nearly linear relation to its component work functions [14.25]. However, this is not necessarily the case for all binary alloys. Fain and McDavid [14.24] measured the work function of AgAu alloys and found a nonlinear composition dependence. Also, for PtRh alloys, the work function falls below the linear interpolation [14.25]. The relationship between charge transfer in alloys and electronegativity and work function, resulting from the modified electron-electron Coulomb interaction, has been used to explain the non-linear relationship between the work function and alloy composition [14.29]. For binary alloy systems, a


Fig. 14.7. EOT change (Å) under high temperature annealing (800 °C, 900 °C and 1000 °C) vs. Ta percent in the Ru–Ta alloy (%), for Ru–Ta alloys with various Ta percent power conditions on 25 Å SiO2 gate dielectrics

Fig. 14.8. Work function (eV) extracted from CV curves vs. Ta percent power (%), after forming gas anneal (FGA) and after anneals at 800 °C, 900 °C and 1000 °C. The alloys with Ta percent power less than 60% remain stable under high temperature anneals

nonzero difference in electron-per-atom ratio for the constituents will lead to a transfer of charge in the alloy, which is related to electronegativity and work function. Although many binary and even higher order metal alloy systems exist, the selection of the appropriate metal alloy system may be guided by the presence of stable single phases over a large composition range. It is expected that a single phase will ensure better work function uniformity since only one phase of the structure is present. Our recent work has indicated that binary alloys of Ta-Ru with a Ta content between 40 and 54 at.% produce a single Ru1Ta1 phase, which has a work function of 4.3 eV with excellent thermal stability of EOT and VFB, making it a viable candidate for NMOS devices [14.30, 14.31] (Fig. 14.7). The high Ru solubility in Ta makes it feasible to obtain a low work function alloy film with good thermal stability. A non-linear relationship between work function and Ru–Ta alloy


Fig. 14.9. Integration approach for dual metal gate CMOS. (a) shows the process using a high Φm layer on the bottom and a low Φm layer on top, whereas (b) uses the opposite approach. Panel labels: (a) low temperature: low Φm (Ta) on high Φm (Ru); high temperature: low Φm (Ru-Ta alloy) on NMOS, high Φm (Ru) on PMOS. (b) low temperature: high Φm (Ru) on low Φm (Ta or alloy); high temperature: low Φm (Ta or alloy) on NMOS, high Φm alloy on PMOS

composition was measured (Fig. 14.8). The values ranged from 4.2 eV to 5 eV, making it possible for this alloy to be used as a gate electrode for both N- and P-MOSFET devices depending on stoichiometry. Binary metal alloys offer the additional advantage of integration simplification provided they are formed by vertical intermixing of the two metal layers. Figure 14.9 shows two possible configurations for integrating metal gates for both NMOS and PMOS devices. The proposed scheme (Fig. 14.9a) starts with deposition of Ru/Ta stacks. Next, the top Ta layer is removed from the PMOS regions via a non-critical photo step and etch. High temperature anneals are then performed to produce a Ru-Ta alloy in the regions with vertical stacks [14.15]. An alternate scheme is shown in Fig. 14.9b. The resulting φm is expected to be a strong function of the thickness and composition of the two layers. Although other metal gate integration schemes have been proposed [14.13, 14.14], this approach offers the following advantages: a) a range of φm from 4.2 eV to 5.0 eV, b) no need for N implantation and c) use of thermally stable metals that do not react with the underlying dielectric. As shown in Fig. 14.10, the final φm of various gate stacks depends strongly on the thickness of the bottom and top layers and the anneal condition. Single layer Ru gates maintain their work function around 5 eV after anneal. However, as the bottom Ru layer under a Ta overlayer is made thinner, the φm starts decreasing and saturates at ∼4.6 eV at ∼900 °C, regardless of the Ru thickness. This is attributed to fast intermixing and alloying at high temperatures. Intermixing is a key indicator of alloy formation, as reported for Ru–Sm and Ru–Ag bilayers [14.36]. Auger depth analysis, as shown in Fig. 14.11, revealed significant levels of intermixing and


[Plot: work function (eV) and barrier height (eV) vs. sample condition 1–6, after 400 °C forming gas anneal]

Sample   Bottom Metal   Top Metal   Capping Layer
1        Ru 700 Å       ---         ---
2        Ru 100 Å       Ta 200 Å    W 500 Å
3        Ru 30 Å        Ta 200 Å    W 500 Å
4        Ta 30 Å        Ru 500 Å    ---
5        Ta 100 Å       Ru 500 Å    ---
6        Ta 500 Å       ---         W 500 Å

Fig. 14.10. Work function extracted from C–V curves for various conditions of Ru and Ta stacks. Even after 400 °C, evidence of intermixing and alloy formation is observed as indicated by the change in work function

Fig. 14.11. (a) Auger Depth Profile, atomic concentration (%) vs. sputter depth (Å), of the as-deposited Si/SiO2/Ru 100 Å/Ta 200 Å/W 500 Å gate stack. Negligible interdiffusion between the metal layers is observed in the as-deposited state, (b) Auger Depth Profile of the Si/SiO2/Ru 100 Å/Ta 200 Å/W 500 Å gate stack annealed at 900 °C for 30 sec. Significant interdiffusion is observed between Ta and Ru layers


[Plot: work function (eV) vs. sample condition 1–4, after 800 °C, 10 min anneal]

Sample Set B   Bottom Metal      Top Metal
1              Ru50Ta50 500 Å    ---
2              Ru50Ta50 100 Å    Ru 500 Å
3              Ru50Ta50 70 Å     Ru 500 Å
4              Ru50Ta50 30 Å     Ru 500 Å

Fig. 14.12. Work function vs. sample condition of Ru50Ta50/Ru stacks after an 800 °C anneal. The underlying thicknesses correspond to samples in the table. A work function change of 0.8 eV is observed between Ru50Ta50/Ru stacks and the 500 Å Ru50Ta50 single layer

interdiffusion between the two layers after anneal, whereas negligible interdiffusion between the metal layers is observed in the as-deposited state. The Ru/Ta approach results in a φm decrease of ∼0.4 eV between Ru and Ru/Ta stacks. The Ta/Ru stacks are unattractive due to the reactivity of Ta with the underlying dielectric during the intermixing anneal [14.32]. Although the Ru/Ta stacks show a 0.4 eV φm shift from pure Ru, this shift is insufficient to meet the bulk CMOS requirements; it may, however, be sufficient for non-bulk silicon devices (i.e., SOI). To obtain an even larger change in work function, an alternate approach utilizing Ru50Ta50/Ru stacks


was investigated. A significant positive shift in φm is observed compared to the single layer Ru50Ta50 gate electrodes after an 800 °C anneal, making this an attractive approach for bulk CMOS (Fig. 14.12). The resulting φm value can be further tuned by controlling the composition of the two layers.
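The linear interpolation φ(x) = xφA + (1 − x)φB discussed earlier in this section gives a quick first estimate of an alloy work function, and comparing it with a measured value shows how far a real system departs from it. The sketch below is illustrative only: the function names are hypothetical, and the end-point values (about 4.2 eV on the Ta-rich side and about 5.0 eV for Ru) are taken loosely from the range quoted in the text rather than from tabulated data.

```python
# Sketch: linear (composition-weighted) estimate of a binary alloy work function.
# Function names and end-point values are illustrative assumptions.


def linear_alloy_work_function(x_a, phi_a, phi_b):
    """Return x_a * phi_a + (1 - x_a) * phi_b for an A_x B_(1-x) alloy."""
    if not 0.0 <= x_a <= 1.0:
        raise ValueError("x_a must lie between 0 and 1")
    return x_a * phi_a + (1.0 - x_a) * phi_b


def composition_for_target(phi_target, phi_a, phi_b):
    """Invert the linear rule: fraction of A that would give phi_target."""
    if phi_a == phi_b:
        raise ValueError("end-point work functions must differ")
    x_a = (phi_target - phi_b) / (phi_a - phi_b)
    if not 0.0 <= x_a <= 1.0:
        raise ValueError("target lies outside the range spanned by A and B")
    return x_a


def deviation_from_linear(x_a, phi_measured, phi_a, phi_b):
    """Measured value minus the linear estimate (negative means below the line)."""
    return phi_measured - linear_alloy_work_function(x_a, phi_a, phi_b)


if __name__ == "__main__":
    PHI_TA = 4.2  # eV, assumed Ta-rich end point
    PHI_RU = 5.0  # eV, assumed Ru end point
    estimate = linear_alloy_work_function(0.5, PHI_TA, PHI_RU)
    # 4.3 eV is the value the text reports for the single Ru1Ta1 phase.
    shift = deviation_from_linear(0.5, 4.3, PHI_TA, PHI_RU)
    print(f"linear estimate at 50% Ta: {estimate:.2f} eV, deviation: {shift:+.2f} eV")
```

With these assumed end points, the reported Ru1Ta1 value sits below the straight line, which is consistent with the non-linear composition dependence described for the Ru–Ta system. The linear rule is therefore best treated as a starting guess rather than a design equation.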

14.5 Metal Candidates for PMOS Devices

As discussed above, the larger work function metals have inherent stability since they are thermodynamically predicted to be stable on many gate dielectrics of interest. Elements such as Ru, Pt, Ir, Ni and Co are all stable on most dielectrics under consideration today. Ru gates may offer advantages over Pt and Ni gates due to better adhesion and low oxygen conductivity. In addition, Ru offers the attractive feature that its oxide is also an intrinsic conductor with low resistivity, high carrier concentration and excellent diffusion barrier properties. Its investigation as the bottom electrode in high-K DRAM research lends strength to its potential. We have investigated Ru and RuO2 on SiO2 and high-K films and have obtained excellent stability of the EOT and VFB on SiO2, ZrSiO4 and ZrO2 up to 950 °C anneals [14.33, 14.34]. However, conductive metal oxides such as RuO2 suffer from a propensity to release oxygen, which can be incorporated into and diffuse through the dielectric and lead to an EOT increase during the deposition or subsequent high temperature steps.

14.6 Metals on High-k Dielectrics

This section discusses electrical properties of candidate metal electrodes on HfO2 thin films formed via RF sputtering from a Hf target in an argon ambient followed by subsequent oxidation. Various metals and metallic alloys have recently been investigated on high-K dielectrics [14.36–14.40]. In our work, we have investigated both Ru-based and Ta-based compounds on physical vapor deposited (PVD) HfO2 dielectrics. In addition, we have also investigated the high temperature stability and work function extraction issues [14.40]. The CV results of various Ru- and Ta-based compounds on HfO2 indicated that the VFB shifts among the various gates on HfO2 were similar to those observed on SiO2, suggesting that the φm differences between the studied gates are preserved even on HfO2. EOT values down to 9.4 Å were observed with these electrodes. As discussed above, accurate work function determination requires evaluating HfO2 films of varying thicknesses to decouple the effect of charge on the work function values. However, this is difficult to achieve via PVD owing to the variability in composition that results from oxidizing metal films of different thicknesses. Therefore, we used an alternate approach which consisted of forming HfO2/SiO2 stacks wherein the total EOT was varied by varying only the interfacial SiO2 layer, i.e., the HfO2 layer was kept constant


and only the interfacial SiO2 layer thickness was changed. Charge analysis calculations of the above stacks reveal that the slope of the VFB vs. EOT curves of the HfO2/SiO2 stacks is proportional to the SiO2 charge, whereas the intercept is now a combination of the φms and the charge in the high-K layer [14.41]. It was found that for a given gate, the HfO2/SiO2 stack resulted in a positive shift in the intercept value of the VFB vs. EOT plots as compared to SiO2-only dielectrics. Although the φms values differ between SiO2 and HfO2/SiO2, the difference in intercept between two gates on a given dielectric is the same, suggesting that the work function difference is preserved even on HfO2 and that Fermi level pinning is either not occurring with these metals on HfO2 or is minimal [14.40]. This is a key advantage of metal gates over polysilicon gates, which have recently been shown to suffer from Fermi level pinning [14.2]. An accurate determination of the charge value of HfO2 requires an alternate deposition technique, such as atomic layer deposition or chemical vapor deposition, or comparison to an ideal curve. All the electrodes studied suffered from an increase in EOT after high temperature post-gate annealing. This is attributed to the incorporation of oxygen and/or H2O in the HfO2 prior to gate deposition. A larger EOT increase has been observed with RuO2 gates compared to Ru gates due to the oxygen present in the RuO2 deposition [14.33]. To prevent this EOT increase after metal deposition, either an in-situ gate deposition or composition tuning of the dielectric to make it a better diffusion barrier must be performed.
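The charge analysis described above can be written out under simplifying assumptions. The sketch below is not reproduced from [14.41]. It assumes a sheet charge Q_f at the Si/SiO2 interface, a sheet charge Q_hk at the HfO2/SiO2 interface, no distributed bulk charge, and a fixed HfO2 contribution EOT_hk to the total EOT. The symbols Q_hk and EOT_hk are introduced here for illustration only.

```latex
\begin{align}
\mathrm{EOT} &= t_{\mathrm{SiO_2}} + \mathrm{EOT}_{hk} \\
V_{FB} &= \phi_{ms}
          - \frac{Q_{hk}\,\mathrm{EOT}_{hk}}{\varepsilon_{ox}}
          - \frac{Q_f\,\mathrm{EOT}}{\varepsilon_{ox}} \\
\frac{dV_{FB}}{d\,\mathrm{EOT}} &= -\frac{Q_f}{\varepsilon_{ox}}, \qquad
V_{FB}\big|_{\mathrm{EOT}\to 0} = \phi_{ms} - \frac{Q_{hk}\,\mathrm{EOT}_{hk}}{\varepsilon_{ox}} \\
V_{FB}^{(1)}\big|_{\mathrm{EOT}\to 0} - V_{FB}^{(2)}\big|_{\mathrm{EOT}\to 0}
  &= \Phi_{m}^{(1)} - \Phi_{m}^{(2)}
\end{align}
```

In this picture the slope depends only on the SiO2 interface charge, the intercept mixes φms with the high-K charge term, and the difference in intercept between two gates on the same stack cancels the charge term, provided the gate deposition does not itself change the charge.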

14.7 Conclusion

Metal selection on advanced gate dielectrics must be based on the appropriate work function values, thermal stability of EOT and VFB values, and process compatibility. Accurate extraction of metal work functions on high-K dielectrics requires careful analysis of the charges in the dielectric stack. Many promising metal gate electrode systems have been proposed, including elemental metals, metal nitrides, metal silicon nitrides, metal silicides and binary metal alloys. The work functions of the above systems vary from near mid-gap to near band-edge values, and as such they may be suitable for bulk and/or non-bulk devices. However, further investigation is critically needed for all of the above systems to evaluate issues such as reliability and mobility and so assess their feasibility in CMOS applications.


V. Misra

References

14.1. International Technology Roadmap for Semiconductors Home Page, http://public.itrs.net
14.2. C. Hobbs, L. Fonseca, V. Dhandapani, S. Samavedam, B. Taylor, J. Grant, L. Dip, D. Triyoso, R. Hegde, D. Gilmer, R. Garcia, D. Roan, L. Lovejoy, R. Rai, L. Hebert, H. Tseng, B. White and P. Tobin, “Fermi Level Pinning at the PolySi/Metal Oxide Interface,” IEEE Symp. on VLSI Technology Tech. Dig., 2-1, 2003
14.3. I. De, D. Johri, A. Srivastava, C.M. Osburn, “Impact of gate workfunction on device performance at the 50 nm technology node,” Solid-State Electronics 44, no. 6, pp. 1077–80, 2000
14.4. J.R. Hauser and W.T. Lynch, “Critical front materials and processes for 50 nm and beyond IC devices,” SRC working paper, 1997
14.5. V. Misra, G.P. Heuss, and H. Zhong, “Use of metal-oxide-semiconductor capacitors to detect interactions of Hf and Zr gate electrodes with SiO2 and ZrO2,” Appl. Phys. Lett. 78, p. 4166, 2001
14.6. S.P. Murarka, Metallization: Theory and Practice for VLSI and ULSI, Boston, Butterworth-Heinemann, 1993
14.7. S. Zafar, C. Cabral, R. Amos and A. Callegari, “A method for measuring barrier heights, metal work functions and fixed charge densities in metal/SiO2/Si capacitors,” Appl. Phys. Lett. 80 (25), pp. 4858–60, 2002
14.8. Y.-C. Yeo, P. Ranade, T.-J. King and C. Hu, “Effects of high-k gate dielectric materials on metal and silicon gate workfunctions,” IEEE Electron Device Letters 23, No. 6, pp. 342–344, 2002
14.9. W. Mönch, “Electronic properties of ideal and interface-modified metal semiconductor interfaces,” J. Vac. Sci. Technol. 14, pp. 2985–2993, Jul./Aug. 1996
14.10. J. Robertson, “Band offsets of wide-band-gap oxides and implications for future electronic devices,” J. Vac. Sci. Technol. 18, pp. 1785–1791, May/Jun. 2000
14.11. W.A. Harrison, Electronic Structure and the Properties of Solids: The Physics of the Chemical Bond, Freeman, San Francisco, 1980
14.12. M.A. Nicolet, “Ternary amorphous metallic thin-films as diffusion-barriers for Cu metallization,” Appl. Surf. Sci. 91, p. 269, 1995
14.13. Q. Lu, R. Lin, P. Ranade, T.-J. King, and C. Hu, “Metal gate work function adjustment for future CMOS technology,” IEEE Symp. on VLSI Technology Tech. Dig., p. 49, 2001
14.14. I. Polishchuk, P. Ranade, T.-J. King, and C. Hu, “Dual work function metal gate CMOS technology using metal interdiffusion,” IEEE Electron Device Lett. 22, p. 444, 2001
14.15. J. Lee, H. Zhong, Y.-S. Suh, G. Heuss, J. Gurganus, B. Bei, V. Misra, “Tunable work function dual metal gate technology for bulk and non-bulk CMOS,” IEEE International Electron Devices Meeting Digest, p. 359, 2002
14.16. M. Takeyama, A. Noya, T. Sase, and A. Ohta, “Properties of TaNx films as diffusion barriers in the thermally stable Cu/Si contact systems,” Journal of Vacuum Science and Technology 14, pp. 674–678, 1996
14.17. J.P. Chang, M.L. Steigerwald, R.M. Fleming, R.L. Opila, and G.B. Alers, “Thermal Stability of Ta2O5 in Metal-Oxide-Metal Capacitor Structures,” Appl. Phys. Lett. 74, pp. 3705–3707, 1999

14 Issues in Metal Gate Electrode Selection for Bulk CMOS Devices


14.18. H. Wakabayashi, Y. Saito, K. Takeuchi, T. Mogami, and T. Kunio, “A Dual-Metal Gate CMOS Technology Using Nitrogen-Concentration-Controlled TiNx Film,” IEEE Transactions on Electron Devices 48, pp. 2363–2369, 2001
14.19. M. Moriwaki, T. Yamada, Y. Harada, S. Fujii, M. Yamanaka, J. Shibata, and Y. Mori, “Improved Metal Gate Process by Simultaneous Gate-Oxide Nitridation during W/WNx Gate Formation,” Japanese Journal of Applied Physics 39, pp. 2177–2180, 2000
14.20. Y.S. Suh, G.P. Heuss, H. Zhong, and V. Misra, “Electrical Characteristics of TaSixNy Gate Electrodes for Dual Gate Si-CMOS Devices,” IEEE Symp. on VLSI Technology Tech. Dig., p. 47, 2001
14.21. Y-S. Suh, G.P. Heuss, V. Misra, D.G. Park and K.Y. Lim, “Thermal stability of TaSixNy films deposited by reactive sputtering on SiO2,” Journal of the Electrochemical Society 150 (5), pp. 79–82, May 2003
14.22. Y-S. Suh, G.P. Heuss, J.H. Lee and V. Misra, “Effect of the composition on the electrical properties of TaSixNy metal gate electrodes,” IEEE Electron Device Letters 24 (7), pp. 439–41, July 2003
14.23. S.B. Samavedam et al., “Dual-metal gate CMOS with HfO2 gate dielectric,” IEEE International Electron Device Meeting Technical Digest, pp. 433–6, 2002
14.24. S.C. Fain, J.M. McDavid, “Work-function variation with alloy composition: Ag-Au,” Phys. Rev. B 9, p. 5099, 1974
14.25. R. Ishii, K. Matsumura, A. Sakai, T. Sakata, “Work function of binary alloys,” Appl. Surf. Sci. 169-170, p. 658, 2001
14.26. T. Sands, W.K. Chan, C.C. Chang, E.W. Chase, and V.K. Keramidas, “NiAl/n-GaAs Schottky diodes: barrier height enhancement by high-temperature annealing,” Appl. Phys. Lett. 52, p. 1388, 1988
14.27. B. Blanpain, G.D. Wilk, J.O. Olowolafe, and J.W. Mayer, “Thermal stability of coevaporated Al-Pt thin films on GaAs substrate,” Appl. Phys. Lett. 57, p. 392, 1990
14.28. T.S. Huang, J.G. Peng, and C.C. Lin, “Thermal stability of Mo-Al Schottky metallization on n-GaAs,” J. Vac. Sci. Technol. B 11, p. 756, 1993
14.29. C.D. Gelatt, and H. Ehrenreich, “Charge transfer in alloys: AgAu,” Phys. Rev. B 10, p. 398, 1974
14.30. H. Zhong, S.N. Hong, Y.-S. Suh, H. Lazar, G. Heuss, and V. Misra, “Properties of Ru-Ta Alloys as Gate Electrodes For NMOS and PMOS Silicon Devices,” in IEEE Int. Electron Devices Meet. Tech. Dig., p. 467, 2001
14.31. V. Misra, H. Zhong and H. Lazar, “Electrical properties of Ru-based alloy gate electrodes for dual metal gate Si-CMOS,” IEEE Electron Device Letters 23 (6), pp. 354–6, 2002
14.32. R. Beyers, “Thermodynamic considerations in refractory metal-silicon-oxygen systems,” Journal of Applied Physics 56, pp. 147–152, 1984
14.33. H. Zhong, G.P. Heuss, Y-S. Suh, S.N. Hong and V. Misra, “Electrical properties of Ru and RuO2 gate electrodes for Si-PMOSFET with ZrO2 and Zr-silicate dielectrics,” Journal of Electronic Materials 30 (12), pp. 1493–8, 2001
14.34. H. Zhong, G.P. Heuss, V. Misra, L. Hongfa, H.L. Choong and D.L. Kwong, “Characterization of RuO2 electrodes on Zr silicate and ZrO2 dielectrics,” Applied Physics Letters 78 (8), pp. 1134–6, 19 Feb. 2001


14.35. J.E. Chung, P.K. Ko, and C. Hu, “A model for hot-electron-induced MOSFET linear current degradation based on mobility reduction due to interface-state generation,” IEEE Trans. Electron Devices ED-38, pp. 1362–1370, 1991
14.36. A. Vandooren, A. Barr, L. Mathew, T.R. White, S. Egley, D. Pham, M. Zavala, S. Samavedam, J. Schaeffer, J. Conner, B.Y. Nguyen, B.E. White Jr., M.K. Orlowski, and J. Mogab, “Fully-depleted SOI devices with TaSiN gate, HfO2 gate dielectric, and elevated source/drain extensions,” IEEE Electron Device Letters 24 (5), pp. 342–4, May 2003
14.37. S.B. Samavedam et al., “Fermi Level Pinning with Sub-monolayer MeOx and Metal Gates,” IEEE International Electron Device Meeting Technical Digest, pp. 307–310, 2003
14.38. H.Y. Yu, J.F. Kang, J.D. Chen, C. Ren, Y.T. Hou, S.J. Wang, M.F. Li, D.S.H. Chan, K.L. Bera, C.H. Tung, A. Du and D.L. Kwong, “Thermally Robust High Quality HfN/HfO2 Gate Stack for Advanced CMOS Devices,” IEEE International Electron Device Meeting Technical Digest, pp. 99–102, 2003
14.39. M. Akbar, S. Gopalan, H.-J. Cho, K. Onishi, R. Choi, R. Nieh, C.S. Kang, Y.H. Kim, J. Han, S. Krishnan, and J.C. Lee, “High-performance TaN/HfSiON/Si metal-oxide-semiconductor structures prepared by NH3 post-deposition anneal,” Applied Physics Letters 82, No. 11, 17 March 2003
14.40. J. Lee, Y.-S. Suh, H. Lazar, R. Jha, J. Gurganus, Y. Lin, V. Misra, “Compatibility of Dual Metal Gate Electrodes with High-K Dielectrics for CMOS,” IEEE International Electron Device Meeting Technical Digest, pp. 323–6, 2003
14.41. R. Jha, J. Gurganus, Y.H. Kim, R. Choi, J. Lee and V. Misra, “A Capacitance Based Methodology for Extracting Work Function of Metal Electrodes on High-K Dielectrics,” submitted to IEEE Electron Device Letters, Jan 2004

15 CMOS IC Fabrication Issues for High-k Gate Dielectric and Alternate Electrode Materials

L. Colombo, A.L.P. Rotondaro, M.R. Visokay, and J.J. Chambers

15.1 Introduction

Silicon dioxide based dielectrics such as SiO2 and nitrided SiO2 (SiON) are reaching the limit of their usefulness in complementary metal oxide semiconductor (CMOS) devices, principally because of high tunnel currents. The semiconductor industry has adopted SiON at the 130 nm node, where equivalent oxide thicknesses less than 2 nm are typically used for the high-performance devices. But SiON is also nearing its limits for both low-power and high-performance applications as the gate dielectric is scaled beyond about 1.5 nm for low-power and beyond 1 nm for high-performance applications. A multitude of high-k gate dielectric materials have been investigated to date, and the community is slowly converging upon a few options [15.89, 15.103]. Today most of the published reports on high-k gate dielectrics focus on Hf-based compounds, such as HfO2, HfO2/SiN, and HfSiON, and there are a few reports on HfAlO and HfAlON. The details of these materials are described in Chap. 10 of this book. The challenge that we face as a semiconductor community as we attempt to replace SiON is not only finding a suitable new gate dielectric material that can be scaled and has lower gate leakage than SiON, but also integrating such a material into the conventional CMOS flow. Manufacturability is a key issue whenever a new material is adopted. Introduction of a new gate dielectric and new gate electrode materials will probably be the most disruptive changes that the silicon semiconductor industry will be facing in the near future. The primary objective of the industry at first was to replace SiON with an appropriate high-k gate dielectric while still using poly-Si electrodes. This would be a preferred approach at least for the initial introduction of the new gate dielectric. Much effort has been devoted to achieving this goal and many papers have been written on the subject.
However, a number of problems have been encountered with most of the high-k gate dielectrics when trying to integrate them with poly-Si gate electrodes and it is not clear yet whether poly-Si will be used in conjunction with high-k materials for high performance transistors. Some of these problems will be discussed in this chapter while others are discussed in other chapters of this book. The objective of this chapter is to review the challenges of integrating high-k gate dielectrics in the conventional CMOS flow with polycrystalline


silicon (poly-Si) gate electrodes and to review the integration of metal gates in CMOS devices. The chapter is divided into four sections. The first section describes the conventional CMOS flow utilizing SiO2 and SiON gate dielectrics. The second section introduces the integration of high-k gate dielectrics into the flow. Some of the key issues encountered in implementing new materials in the production line will be reviewed and the integration issues of these materials will be discussed. The third section reviews metal gate integration, covering various approaches used when adopting single metal, midgap metals, dual metals, and work function setting techniques using dopants, as in the case of fully silicided gate electrodes. The fourth section reviews potential approaches for scaling CMOS devices and the use of high-k and metal gates in these new device structures, a new and exciting field for device scaling.

15.2 The “Standard” CMOS Flow

The sequence of operations that a silicon wafer undergoes to build integrated circuits on it is called the process flow. Not only are the process conditions of each individual operation important, but the sequence in which the operations are used is also critical. The semiconductor industry is based on process flows that generate Complementary Metal Oxide Semiconductor Field Effect Transistors (CMOSFETs) as the backbone of its circuits. Each facility that fabricates semiconductor devices, from university laboratories to production plants, uses process flows that are customized for a particular application and the process tool availability at the location. Although differing in the details, the majority of the process flows currently in use follow the same sequence of macro steps or modules, which has been described by Wolf [15.105] for a 3 micron technology. A typical list of modules for the fabrication of CMOSFET devices is: 1) isolation; 2) well and channel doping; 3) gate dielectric/gate stack formation; 4) source and drain formation; 5) silicidation/contact fabrication; and 6) metallization. Figure 15.1 depicts the basic sequence of modules that are needed to create advanced CMOSFETs on bulk silicon wafers. An overview of each module is presented in the subsequent sub-sections.

15.2.1 Isolation

Initially, the incoming silicon wafers are inspected for defects and particles. The wafers are then laser marked and wet cleaned. After these initial steps, the wafers move to the isolation module. Shallow trench isolation (STI) has replaced Local Oxidation of Silicon (LOCOS) since the mid-nineties as the preferred technology for isolating devices of advanced nodes [15.80]. Shallow trench isolation consists of patterning and etching shallow trenches into the silicon substrates in the region between the devices. Usually, a nitride/oxide


[Figure: flow chart of modules in order: Initialization (wafer selection and labelling); Shallow Trench Isolation; Well and Channel Doping; Gate Stack formation; Source and Drain formation; Metallization]

Fig. 15.1. Basic sequence of modules needed to fabricate 130 nm technology CMOSFETs on bulk silicon wafers

hard mask is used to protect the active regions during the trench etch. Subsequently, the trenches receive a liner of grown oxide and are filled with deposited oxide. The surface of the wafer is planarized by chemical mechanical polishing (CMP) and the nitride/oxide mask stack is removed from the active regions, usually by wet etch [15.67]. Figure 15.2 schematically shows a wafer cross section after STI formation.


Fig. 15.2. Schematic cross section of a silicon wafer after the shallow trench isolation (STI) has been formed

15.2.2 Well and Channel Doping

After the definition of the active regions by STI, wells of p-type and n-type regions are created on the silicon substrate to accommodate NMOSFET and PMOSFET devices, respectively. This is usually accomplished by ion implantation with energies of the order of several hundred keV. After the well implants, the surface of the active regions is doped to create dopant profiles that would result in the desired threshold voltage for each device. In advanced technologies the dopant concentration in the channel region is of the order of 10^17–10^18 at/cm^3 [15.95, 15.106]. The wafer also receives anti-punch through and anti-latchup implants. After this sequence of implantations the wafer is annealed to diffuse and activate the dopants, and also to “recrystallize” the substrate regions that have been damaged/amorphized by the implantation processes. Figure 15.3 schematically shows a cross-section of a wafer after the sequence of implantations that define the substrate and channel doping for NMOS and PMOS devices.

Fig. 15.3. Schematic cross-section of a silicon wafer after the sequence of dopant implantations that define the substrate and channel doping for NMOS and PMOS transistors
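To give a feel for how the channel doping level sets the threshold voltage, the following is a minimal sketch using the standard long-channel expression for a uniformly doped p-type substrate, VT = VFB + 2φF + sqrt(4·εSi·q·NA·φF)/Cox. It is an illustration under stated assumptions, not the device model used by the authors: the flat-band voltage is held at a hypothetical fixed value (in reality it also depends on doping and gate material), the intrinsic carrier density is an assumed room-temperature value, and real channel profiles are not uniform.

```python
import math

Q_E = 1.602e-19        # elementary charge, C
EPS0 = 8.854e-14       # vacuum permittivity, F/cm
EPS_SI = 11.7 * EPS0   # silicon permittivity, F/cm
EPS_SIO2 = 3.9 * EPS0  # SiO2 permittivity, F/cm
KT_Q = 0.02585         # thermal voltage at 300 K, V
N_I = 1.0e10           # intrinsic carrier density of Si at 300 K, cm^-3 (assumed)


def fermi_potential(n_a):
    """Bulk Fermi potential phi_F (V) of p-type Si with acceptor density n_a (cm^-3)."""
    return KT_Q * math.log(n_a / N_I)


def nmos_threshold(n_a, eot_nm, v_fb):
    """Long-channel NMOS threshold voltage (V) for uniform substrate doping."""
    phi_f = fermi_potential(n_a)
    c_ox = EPS_SIO2 / (eot_nm * 1e-7)                     # gate capacitance, F/cm^2
    q_dep = math.sqrt(4.0 * EPS_SI * Q_E * n_a * phi_f)   # max depletion charge, C/cm^2
    return v_fb + 2.0 * phi_f + q_dep / c_ox


V_FB = -1.0  # hypothetical flat-band voltage for an n+ poly-Si gate, V
for n_a in (1e17, 3e17, 1e18):
    vt = nmos_threshold(n_a, eot_nm=2.0, v_fb=V_FB)
    print(f"N_A = {n_a:.0e} cm^-3 -> VT = {vt:.3f} V")
```

Within this simplified model, raising the channel doping from 10^17 to 10^18 at/cm^3 at fixed EOT raises the threshold voltage, which is why the surface implants are tuned per device type.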

15.2.3 Gate Dielectric/Gate Stack

In preparation for the gate oxide growth, a sacrificial oxide can be grown and wet etched to remove the top surface of the silicon substrate at the active region, aiming for the elimination of surface defects. This is a legacy from the LOCOS isolation days where the sacrificial oxide was used to eliminate the Kooi effect [15.25, 15.45, 15.93]. The Kooi effect (also called White Ribbon effect) results from the interaction of the silicon nitride mask with the wet


oxidation ambient during the growth of the field oxide in the LOCOS isolation scheme. The field oxidation step in the LOCOS isolation scheme is a long (2–4 hours), high temperature (∼1000 °C) wet oxidation process targeted to grow an oxide several hundred nanometers thick [15.105]. During this process the wet oxidizing ambient interacts with the silicon nitride hard mask, resulting in the formation of silicon nitride spots or ribbons at the pad oxide–silicon substrate interface in the vicinity of the isolation edge. These silicon nitride regions cannot be removed during the pad oxide and silicon nitride mask wet strip process. Their presence on the active regions impedes the gate oxidation, resulting in local thinning of the gate oxide. This reduces the gate oxide breakdown voltage, with consequent reliability degradation. The growth and removal of a sacrificial oxide after the pad oxide and silicon nitride mask strip eliminates the silicon nitride regions, allowing a high quality gate oxide to be grown.

In advanced technologies, input/output (IO) and core devices have different requirements. The split gate method is a way to achieve IO and core devices operating at different voltages by growing gate oxides of different thickness. The split gate process consists of first growing a thick gate oxide over the wafer. This thick gate oxide is patterned to expose the core device regions, where it is removed by wet etching. The thin gate oxide for the core devices is then grown on the exposed active regions and both thick IO and thin core gate oxides are nitrided [15.32].

After the gate oxide formation, the wafers receive polycrystalline silicon (poly-Si) deposition as the gate electrode. The poly-Si can be doped to create n+ poly-Si and p+ poly-Si electrodes for NMOS and PMOS devices, respectively. A hard mask is then deposited over the poly-Si. The hard mask is usually composed of silicon nitride or silicon oxynitride and acts as an anti-reflective layer during gate patterning. After gate patterning, the poly-Si electrodes are anisotropically etched to form gates. The wafers are cleaned to remove the remaining photo-resist and the polymers that were formed on the sides of the poly-Si lines during the gate electrode etch. The hard mask is then removed by wet etch in H3PO4 [15.88] or by other means. The gate stack after gate electrode definition is shown schematically in Fig. 15.4.

Fig. 15.4. Schematic cross section of a silicon wafer after the gate stack definition


Fig. 15.5. Schematic cross section of a silicon wafer after the formation of the extensions, spacer and source and drain regions

15.2.4 Source and Drain

The poly-Si lines are subjected to a short oxidation that results in corner rounding at the bottom of the lines. This process tends to improve the gate oxide integrity at these regions. An offset spacer can then be defined prior to the implantation of the source and drain extensions. Pocket implants can be added to confine and improve the sharpness of the dopant profile of the extensions. Annealing temperatures of the order of 900 °C can be used to improve dopant activation. Spike annealing is typically used, where the temperature is rapidly ramped to the anneal temperature and the wafers are held at the target temperature for only one second or less. The wafers are then rapidly cooled. After the extension implant and anneal, the source and drain spacer is defined. Many approaches are currently being used in the industry. We describe the one that fabricates L-shaped spacers. In this case, a stack of a deposited oxide (8–20 nm), deposited nitride (10–45 nm) and thick deposited oxide (25–60 nm) is used to define the spacer. After etching the structure, the nitride layer has an L-shape defined by the oxide layers as shown in Fig. 15.5. Source and drain dopants are introduced to the wafers at the desired regions by patterning and ion implantation. The wafers are spike annealed at temperatures of about 1000 °C to activate the dopants and to eliminate defects generated during the implantation. Figure 15.5 shows a schematic cross-section of a wafer after the formation of the extension, spacer and source and drain regions.

15.2.5 Silicide and Contact

Silicidation of poly-Si lines and source drain regions is used on advanced technologies in order to reduce the sheet resistance of the poly-Si lines and also to improve contact resistance at the source and drain regions. First, the silicon on the top of the poly-Si lines and at the source drain regions needs to be exposed.
This is commonly accomplished by a combination of a plasma etch followed by wet etch in hydrofluoric acid. Metal is then deposited directly on the


Fig. 15.6. Schematic cross section of a silicon wafer after the silicidation of the gate and source and drain regions

Fig. 15.7. TEM cross-section of a 90 nm technology node MOSFET showing an L-shaped spacer and the silicide on top of the polysilicon gate electrode. After [15.32]

exposed silicon by physical vapor deposition (PVD) methods like sputtering. The wafers are annealed in order to react the metal with the poly-Si to form the desired silicide. Currently, most 130 nm technologies use cobalt disilicide (CoSi2) but most of the 65 nm technologies envision replacing the cobalt silicide with nickel silicide (NiSi). Cobalt silicide has been shown to agglomerate when formed on narrow poly-Si lines while nickel silicide does not suffer from this drawback [15.44, 15.53]. After silicide formation, the excess metal is removed from the wafer surface by a wet etch that is selective to metal silicide. Figure 15.6 shows a schematic cross-section of a wafer after silicidation and Fig. 15.7 shows a cross sectional TEM of a 90 nm technology node MOSFET. A pre-metal dielectric (PMD) liner is then deposited on top of the wafer. The PMD liner is usually a silicon nitride film that is deposited at a temperature that does not degrade the metal silicide. The PMD liner is followed by a PMD layer deposition that is a phosphorus silicate glass for technologies beyond 180 nm. This film is deposited and subsequently planarized by CMP. Contact holes are then patterned and etched, and the contacts are filled, typically with tungsten, and subsequently planarized by CMP. Copper metallization is then used to create the interconnects on the wafer [15.3, 15.68]. Advanced technologies use several layers of metal interconnect. Low-k dielectrics are used to reduce the capacitance between the lines and between the metal levels, aiming to improve circuit speed by reducing parasitics.
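Because the order of the modules matters as much as the modules themselves, the macro sequence of Sect. 15.2 can be written down as an ordered structure and checked. The sketch below is purely illustrative; the names `CMOS_MODULES` and `follows_standard_order` are hypothetical and only encode the six-module list given in the text.

```python
# The six macro modules of the "standard" CMOS flow, in the order listed in the text.
CMOS_MODULES = (
    "isolation",
    "well and channel doping",
    "gate dielectric/gate stack formation",
    "source and drain formation",
    "silicidation/contact fabrication",
    "metallization",
)


def follows_standard_order(steps):
    """Return True if the given modules appear in the standard relative order.

    Raises ValueError if a step is not one of the six known modules.
    """
    positions = [CMOS_MODULES.index(step) for step in steps]
    return all(a < b for a, b in zip(positions, positions[1:]))


print(follows_standard_order(CMOS_MODULES))
print(follows_standard_order(["source and drain formation",
                              "gate dielectric/gate stack formation"]))
```

A representation of this kind makes explicit where a new material enters: the high-k dielectric changes module 3, but every later module must then be re-examined, which is the subject of the next section.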

15.3 Insertion of High-k Gate Dielectric into the CMOS Flow

This section uses the process flow described in the previous section, aimed at fabricating logic devices with 130 nm technology on bulk silicon, as the basis for discussing the fabrication issues when high-k gate dielectric materials are used. Replacing the well-known silicon dioxide and nitrided silicon dioxide gate dielectrics by a new class of materials poses many challenges to fabrication. The issues the silicon technology industry faces in integrating high-k gate dielectrics in CMOS devices are discussed in the subsequent sections.

15.3.1 High-k Materials as a Substitute for SiON

Cross Contamination

New materials, among them copper for interconnect and nickel for silicidation, are being introduced at a faster pace than ever in the silicon process flow. A key aspect that allowed the successful scaling of the SiO2 gate dielectric was the improvement in contamination control during its fabrication. Advanced gate dielectrics that are used in production are practically free of metallic and particulate contamination. Many metallic contaminants have been found to be yield killers in the manufacturing line [15.26, 15.34, 15.35, 15.59, 15.86, 15.87]. However, not all metallic elements are equal, and the use of metallic elements to increase the dielectric constant of the gate dielectric is becoming a necessity for scaling planar CMOS devices. From the fab manager's point of view, these new elements should not threaten other processes; that is, the new materials should not cause any cross contamination during thermal or wet treatments. Of the many new high-k gate dielectrics investigated over the past decade or so, hafnium based dielectrics seem to be the most promising ones. Hafnium from a HfSiO source has been found to have low diffusivity through a thin SiO2 interfacial layer and into Si [15.78]. In addition, hafnium metal has a very low vapor pressure (see Table 15.1 for the melting point of some elements of interest), and one would not expect it to evaporate during thermal treatments at typical temperatures used for silicon processing (< 1200 °C). It turns out that these refractory oxides are very stable and pose the opposite problem, i.e., they are difficult to etch [15.6]. The chemical state of the metal is critical for the application of these new materials to gate dielectrics.


Table 15.1. Melting point of some elements and compounds of interest for high-k gate dielectric application [15.102]

Element / Compound    Melting Point (°C)
Si                    1410
SiO2                  1723 (cub), 1703 (rhomb), 1610 (hex)
Hf                    2227
HfO2                  2758
Al                    660
Al2O3                 2072 (hex), 2015 (rhomb)

For example, aluminum in Al2O3 form should be very stable (Table 15.1). However, recent data indicate that the aluminum diffuses into the channel, thus causing mobility degradation [15.4]. This suggests that not all of the aluminum is oxidized. Further, this “free aluminum” might contaminate other tools during subsequent processing.

Manufacturability of a Deposited Gate Dielectric

Control of the gate oxide thickness is a critical parameter for manufacturing. The expectation is that the gate dielectric for advanced nodes will have at least the same thickness uniformity and reproducibility as that of the previous node. This means that the deposited high-k gate dielectrics are expected to have a thickness control that is similar to that of grown SiO2. This problem may be somewhat alleviated by the expected higher physical thickness of the high-k material. Agnello [15.1] discusses this effect and shows that typical high-k dielectric deposition methods that are currently under consideration, like chemical vapor deposition (CVD) and atomic layer deposition (ALD), will be able to meet the uniformity requirements for nodes beyond 65 nm. There are many other considerations that need to be evaluated in deciding which material and which process will eventually be introduced into the flow. For example, SiO2 gate dielectric growth requires no chamber clean. Deposition processes, on the other hand, do require a chamber clean, and as mentioned earlier some of the high-k gate dielectrics are refractory and very difficult to etch with standard chemistries.

Pre Gate Clean – More Stringent Requirements for Deposited Versus Grown Gate Dielectrics

The interface between the silicon substrate and the gate dielectric has a significant effect on the channel mobility. It has been shown that the presence of an SiO2-like interface significantly improves the channel mobility on MOSFETs fabricated with high-k gate dielectrics [15.90].
For grown gate dielectrics the final interface between the gate dielectric and the substrate is formed during the growth process. For deposited gate dielectrics the final interface might be mostly formed during the pre-deposition surface preparation (cleans). This poses additional requirements to the gate cleans that precede the gate dielectric deposition.


During deposition it is possible that the interface with the substrate gets oxidized. In fact, most of the high-k deposition processes under consideration result in some degree of interface oxidation. This might bring some of the advantages of the interface of grown dielectrics but at the expense of an increase of the final EOT of the gate dielectric. It is still not clear if the presence of a monolayer of grown SiO2 at the interface between the gate dielectric and the substrate is sufficient to overcome the perceived mobility degradation of high-k gate dielectrics in comparison to SiO2 or SiON gate dielectrics. Moreover, the task of controlling the interface composition at the atomic level is extremely challenging. Suppression of the substrate oxidation during deposition has been pursued for ultimate EOT scaling. This has been done by: 1) controlling the deposition ambient (i.e. by removing the oxidizers from the deposition ambient) [15.28], and 2) passivating the substrate prior to the deposition (i.e. by nitridation of the substrate) [15.103]. The control of the evolution of the interface between the gate dielectric and the substrate during the deposition of the high-k film is a key aspect for the successful integration of high-k gate dielectrics in advanced devices. The use of a post-deposition anneal (PDA) is an important aspect in the high-k gate dielectric integration. Tailoring the PDA conditions allows for the optimization of the high-k film characteristics, including the interface between the gate dielectric and the substrate, without inflicting a penalty on the EOT of the gate dielectric stack [15.104].
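The EOT penalty of an interfacial oxide discussed above follows from adding the SiO2-equivalent thickness of each layer in the stack, EOT = Σ t_i·(3.9/k_i). The sketch below illustrates this arithmetic; the dielectric constant of 20 for the high-k film and the layer thicknesses are assumed illustrative values, not data from this chapter.

```python
K_SIO2 = 3.9  # relative dielectric constant of SiO2


def layer_eot(thickness_nm, k):
    """Equivalent oxide thickness (nm) of a single layer with dielectric constant k."""
    return thickness_nm * K_SIO2 / k


def stack_eot(layers):
    """EOT (nm) of a stack given as (physical thickness in nm, dielectric constant) pairs."""
    return sum(layer_eot(t, k) for t, k in layers)


K_HIGH_K = 20.0  # assumed dielectric constant of the high-k film (illustrative)

eot_no_interface = stack_eot([(3.0, K_HIGH_K)])
eot_with_interface = stack_eot([(0.5, K_SIO2), (3.0, K_HIGH_K)])

print(f"3 nm high-k only:            EOT = {eot_no_interface:.3f} nm")
print(f"0.5 nm SiO2 + 3 nm high-k:   EOT = {eot_with_interface:.3f} nm")
```

Under these assumptions even half a nanometer of interfacial SiO2 nearly doubles the EOT of the stack, which is why suppression of substrate oxidation is pursued for ultimate EOT scaling.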

15.3.2 Interactions with/During the Gate Electrode Deposition

A critical aspect for successful high-k gate dielectric integration is the stability of the interfaces of the high-k with the silicon substrate and with the deposited poly-Si gate electrode or potentially metal gate electrodes. Some high-k materials, notably the zirconium compounds, have been shown [15.27, 15.43, 15.46, 15.66, 15.73] to interact with the deposited poly-Si film. This interaction severely degrades the device performance. Aluminum oxide has also been found to interact with silicon [15.50]. It is of utmost importance that the high-k gate dielectric does not interact with the gate electrode. The threshold voltage of MOSFET devices fabricated with all tested high-k gate dielectrics shows some degree of offset from the ideal SiO2 threshold voltage. This points to the fact that there could be interaction between the poly-Si electrode and the high-k gate dielectric. In the case of the group IVB oxides or their silicates, the offset has been found to be accentuated in pMOS devices [15.27]. In the case of Al2O3 and IVB aluminate gate dielectrics, this effect is larger for nMOS devices [15.27]. The threshold voltage offset observed to date is leading many groups to evaluate the use of alternate gate electrodes rather than poly-Si. Alternate gate electrode materials and their integration into the CMOS fabrication flow are discussed in Sect. 15.4.


15.3.3 Gate Electrode Etch Concerns – Stopping on High-k The use of a physically thicker gate dielectric alleviates the selectivity requirement of the gate electrode etch, as it renders the task of stopping on the gate dielectric to prevent silicon substrate trenching easier. The use of high-k gate dielectrics naturally brings this advantage, because they are physically thicker. Moreover, it has been observed that most of the high-k gate dielectrics under consideration are more difficult to etch than the conventional gate dielectrics, SiO2 and SiON [15.6, 15.43, 15.74]. This brings an additional advantage during gate electrode etch as their selectivity towards etch should be higher than the one of conventional gate dielectrics. Even though the low etch rate is an advantage during gate electrode etch, the difficulty in etching the high-k gate dielectrics might cause some integration challenges. These will be discussed in subsequent sections. 15.3.4 Surface Preparation (Cleans) in the Presence of High-k Materials Under any circumstances, the high-k film cannot leach metal during wet processing. This would cause the degradation of the film properties and more importantly, would be a cross contamination risk for the fab. Most of the high-k gate dielectrics under consideration are fairly resistant to chemistries that are commonly used in the fab. Although many of the high-k gate dielectrics have shown to be resistant to many wet etch chemistries, care needs to be exercised when exposing high-k gate dielectrics to wet clean chemistries [15.6, 15.79]. For instance, hydrofluoric acid (HF) attacks most of the high-k dielectrics, and the etch rate of the high-k film is a function of the film composition and its thermal history. It has been shown that the etch rate of hafnium and zirconium silicates increases as the silicon content of the film increases [15.6,15.79]. 
It was also demonstrated that annealed films etch more slowly than as-deposited films [15.6]. The removal of the gate hard mask in H3PO4 might pose an additional challenge to the high-k gate dielectric candidates: the high-k film must not be undercut during the H3PO4 treatment. However, this step could be advantageously used to remove the high-k film from the source and drain and isolation regions prior to the extension formation.

15.3.5 Poly Silicon Oxidation

Poly-Si oxidation has been used to round the corners at the bottom edges of poly-Si lines, with the aim of improving gate oxide integrity. In the case of high-k gate dielectrics, it has been shown that poly-Si oxidation degrades the high-k dielectric, as the oxidation front moves laterally into the film and causes an EOT increase [15.31]. Therefore it is likely that flows that implement high-k gate dielectrics will not include poly-Si oxidation.
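The EOT penalty of oxide regrowth can be illustrated with the standard series-capacitor relation, EOT = t_SiO2 + t_high-k × (3.9 / k). The following is only a sketch: the dielectric constant (k = 20), the layer thicknesses and the amount of regrown oxide are assumed values chosen for illustration, not data from [15.31].

```python
K_SIO2 = 3.9  # relative permittivity of SiO2 (standard textbook value)


def eot_nm(t_highk_nm, k_highk, t_sio2_nm=0.0):
    """Equivalent oxide thickness of a high-k layer in series with an SiO2 layer."""
    return t_sio2_nm + t_highk_nm * K_SIO2 / k_highk


# Assumed example stack: 4 nm of a k = 20 film on 0.5 nm of interfacial oxide.
before = eot_nm(4.0, 20.0, t_sio2_nm=0.5)

# Assumption: oxidation grows a further 0.3 nm of SiO2 at the interface.
after = eot_nm(4.0, 20.0, t_sio2_nm=0.8)

print(f"EOT before oxidation: {before:.2f} nm")
print(f"EOT after oxidation:  {after:.2f} nm")
```

Because SiO2 has the lowest permittivity in the stack, regrown oxide adds to the EOT one-for-one, whereas the same physical thickness of the assumed high-k film would add only about a fifth as much.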

L. Colombo et al.

15.3.6 Source and Drain Extension Formation

High-k Removal or Implantation Through the High-k

Most of the high-k gate dielectrics, and in particular the hafnium-based materials, are very difficult to etch selectively with respect to the materials that are usually exposed on the wafers, such as silicon dioxide, silicon nitride and silicon (both crystalline and polycrystalline). It has been shown that CMOS devices can be fabricated while leaving the high-k gate dielectric on the wafers after the gate electrode definition. In this case, the source and drain extensions and pockets are implanted through the high-k film [15.43, 15.74]. The implant conditions need to be adjusted to account for the stopping power of the high-k film. Simulation work [15.8] suggests that keeping the high-k gate dielectric on the extension regions might be advantageous in reducing gate induced drain leakage (GIDL), at the expense of higher fringing capacitance between the gate and the source and drain. Some groups [15.43] have seen enhanced oxidation under the high-k dielectric that is left on the extension regions. It is not clear at this moment whether this oxide formation has a deleterious effect on the device performance, but it will recess the extension regions relative to the gate channel and can potentially enhance dopant loss, in particular of boron.

The removal of the high-k film from the source and drain and isolation regions prior to the extension formation eliminates these problems. Removal of the high-k would also allow the process conditions, in particular the implants, to be similar to the ones used in the conventional flow.

High-k Thermal Stability During the Extensions Anneal

After performing the implants to form the source and drain extensions and pockets, an activation anneal is commonly used.
The thermal budget for this process is not as high as the one used for the source and drain anneal; nevertheless, temperatures of the order of 900 °C to 1000 °C can be used. Spike annealing is typically used. The properties of the gate dielectric should not be degraded by the activation anneal process. This is discussed further under “High Temperature Activation Anneal” in Sect. 15.3.8, which covers the source and drain anneal. A possible issue with having the high-k film present on the source and drain and isolation regions during the annealing is that most high-k films are more difficult to etch after annealing at high temperatures [15.6]. This might cause integration issues, as discussed in Sect. 15.3.9.

15.3.7 Spacer Formation

After doping the extensions the main spacer materials are deposited. Advanced technologies use layers of silicon nitride and silicon dioxide to fabricate the source and drain spacers. A common approach uses a thin layer of
silicon nitride as an etch stop for the silicon dioxide spacer etch process. The resulting structure shows an L-shaped silicon nitride layer at the spacer of the transistors. If the high-k gate dielectric has not been removed prior to the spacer formation, the spacer etch can be used to remove the high-k gate dielectric from the source and drain regions [15.43, 15.74].

15.3.8 Source and Drain Formation

High Temperature Activation Anneal

Dopant implantation is performed to create highly doped source and drain regions. At this stage it is possible to have the high-k film in place. However, the implant conditions need to be optimized to take into account the stopping power of the high-k material. After source and drain implantation the wafers need to be annealed to remove the damage that results from the high dose implants and also to activate the dopants. Usually temperatures in the range of 1000 °C to 1100 °C are used. A high-k gate dielectric that is stable after high temperature processing is extremely desirable for integration into a conventional CMOS flow.

It has been reported [15.103] that the metal oxides, silicates and aluminates considered for high-k gate dielectric application crystallize at the temperatures of interest. The silicon-containing high-k dielectrics, or silicates, have shown improved thermal stability compared to the metal oxides [15.103], but they still suffer from crystallization and phase separation in the required anneal temperature range. Aluminates have also been shown to crystallize at temperatures lower than typical activation anneal temperatures. Recently, it has been shown that high-k materials that have nitrogen added to their composition present excellent thermal stability [15.47, 15.100, 15.101]. It has been demonstrated, for instance, that HfSiON remains amorphous and electrically stable up to at least 1100 °C [15.100].
It has been shown [15.112] that crystallization of the high-k gate dielectric has a deleterious impact on device mobility and gate leakage. Also, a recent reliability study has indicated that crystallization and phase separation of HfSiO cause a significant reduction of the Weibull slope [15.113]. The suppression of the high-k gate dielectric crystallization will be a key factor for obtaining high performance and reliable devices.

Flash/Laser Annealing

A high-k gate dielectric that remains stable under high temperature treatments will be key when extremely high temperature treatments are incorporated into the process flow to obtain sharper dopant profiles. Temperatures higher than 1100 °C have been explored with flash or laser anneal [15.52, 15.58, 15.71]. The duration of these processes is very short (milliseconds for flash and nanoseconds for laser processing) and this might ease
the requirements on the high-k stability. As CMOS technologies scale, the thermal stability of the high-k gate dielectric materials might face even more demanding requirements than today.

15.3.9 Silicidation

The silicidation of the source and drain regions together with the poly-Si gate lines is a key process for the fabrication of high performance devices. This step reduces the series resistance of the poly-Si lines (which can be used as local interconnects), shunts the junction formed at the interface of the p+ doped and n+ doped poly-Si, reduces the series resistance of the source and drain regions, and reduces the contact resistance. An efficient and reliable silicidation process is critical to the operation of advanced devices.

In order to form silicide, metal needs to be deposited in direct contact with the silicon at the source and drain regions and at the poly-Si lines. If the high-k gate dielectric is still present on the wafers, it has to be removed from the source and drain regions prior to the metal deposition. This may be a significant challenge at this point of the flow, as the high-k film has been subjected to the high temperature anneals from the extensions and source and drain formation and it might be very difficult to etch [15.6]. After the high-k removal the silicon surfaces are deglazed and metal is deposited on the wafers. The wafers are annealed at temperatures that depend on which silicide is used. Typically the temperatures are in the range of 300 °C to 800 °C. After silicide formation, excess metal is removed and the wafers may receive a second anneal to drive silicidation to completion (i.e., achieve the desired silicide phase and composition).

15.3.10 Contact and Metallization – Low Temperature Processes

After silicidation the wafers receive Back End of the Line (BEOL) processing, which consists of contact formation and interconnect metallization.
All thermal processes have temperatures of less than 600 °C, and they should not affect the high-k gate dielectric. Issues related to plasma damage resulting from antenna effects still have to be taken into account when considering the high-k gate dielectric reliability.

15.3.11 Sinter

Usually the wafers receive a sinter anneal in forming gas, a mixture of nitrogen and 4–8% hydrogen. This was historically used to anneal interface defects of the silicon dioxide gate dielectric. Devices fabricated with HfO2 gate dielectric showed improved characteristics when annealed in forming gas at temperatures as high as 600 °C prior to contact formation [15.70]. It is still not clear at this time whether high-k gate dielectrics will benefit from sinter annealing after metallization.

15.4 Alternative Electrode Materials

15.4.1 The Need for Alternative Electrode Materials

Poly-Si has been used as the gate electrode material of choice in the IC industry for the last two decades [15.68, 15.91]. One of the primary reasons for this is that the work function of the poly-Si layer can be readily adjusted by doping, and this in turn allows threshold voltage (Vt) setting. To attain low Vt for NMOS devices the electrode work function should be near 4 eV, while for PMOS it should be near 5 eV. The ability to selectively and locally adjust the work function of poly-Si offered a key advantage in CMOS devices, where it is desired to independently optimize the Vt of both PMOS and NMOS devices within the same die. Poly-Si is also a refractory material, which makes it compatible with MOS thermal budgets, and is readily etched in a variety of chemistries. Finally, it would be expected that a good quality interface between the poly-Si gate electrode and the gate dielectric could be formed, as is the case for the single-crystal Si substrate.

The use of poly-Si as the gate electrode material presents several disadvantages, however. Due to the limited carrier concentration available even in heavily doped Si, the resistivity of the electrode is too high and leads to gate sheet resistance values that negatively impact device performance. To address this, low resistivity cladding layers (predominantly metal silicides) have been used successfully for many years. The scalability of these cladding layers has been severely tested in recent device generations, where the formation of the silicides on narrow gate lines has been found to be very demanding. For this reason the industry has progressively moved from TiSi2 to CoSi2 and is currently undergoing a transition to NiSi, with each evolutionary step being required in order to attain low gate sheet resistances on shrinking poly-Si gates and source/drains.
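The benefit of a silicide cladding layer can be sketched with the sheet resistance relation R_s = ρ / t, treating the poly-Si and the silicide as two sheets conducting in parallel. The resistivities and thicknesses below are round, assumed numbers chosen only to show the order of magnitude; they are not values from the cited references.

```python
def sheet_resistance(resistivity_uohm_cm, thickness_nm):
    """Sheet resistance in ohm/sq from R_s = rho / t."""
    rho_ohm_cm = resistivity_uohm_cm * 1e-6   # micro-ohm cm -> ohm cm
    thickness_cm = thickness_nm * 1e-7        # nm -> cm
    return rho_ohm_cm / thickness_cm


def parallel(*sheet_resistances):
    """Combined sheet resistance of stacked layers conducting in parallel."""
    return 1.0 / sum(1.0 / r for r in sheet_resistances)


# Assumed values: heavily doped poly-Si at 1000 micro-ohm cm, 100 nm thick,
# clad with 30 nm of a silicide at 20 micro-ohm cm.
poly = sheet_resistance(1000.0, 100.0)
silicide = sheet_resistance(20.0, 30.0)
stack = parallel(poly, silicide)

print(f"poly-Si alone:  {poly:.1f} ohm/sq")
print(f"silicide alone: {silicide:.1f} ohm/sq")
print(f"clad gate:      {stack:.2f} ohm/sq")
```

With numbers of this kind the silicide carries almost all of the current, which is why its formation on narrow lines, rather than the poly-Si doping, controls the gate sheet resistance.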
Additionally, and perhaps more importantly today, the low carrier density also results in a significant depletion layer within the poly-Si when the devices are in operation. This depletion layer acts as a capacitance, Cdep, in series with that of the gate dielectric, which decreases the overall effective capacitance of the gate stack and therefore the resulting drive current. In the past, the decrease due to Cdep has been small relative to the oxide capacitance and thus could be neglected. Today, and increasingly in the future, the capacitance arising from poly-Si depletion is a large fraction of the total capacitance. Two factors have contributed to making the poly-Si depletion a significant issue: 1) gate capacitance scaling and 2) gate length scaling. As the gate dielectric thickness is scaled in order to increase Cox, the poly-Si depletion layer capacitance (which nominally remains constant) will play a larger and larger role, to the point where scaling of the gate dielectric reaches diminishing returns. As the gate length is decreased, it has been found that dopant loss from the poly-Si layer itself can also lead to increased electrical depletion and device degradation [15.16, 15.17].
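The diminishing returns follow directly from the series combination C = Cox·Cdep / (Cox + Cdep). Since capacitance per unit area scales as 1/thickness, the depletion layer can be treated as a fixed addition to the EOT. The sketch below assumes an oxide-equivalent depletion thickness of 0.4 nm purely for illustration; the actual value depends on poly-Si doping and bias.

```python
def series_capacitance(c_ox, c_dep):
    """Two capacitors in series (any consistent units)."""
    return c_ox * c_dep / (c_ox + c_dep)


def capacitance_loss_fraction(eot_nm, depletion_equiv_nm):
    """Fraction of Cox lost to poly depletion.

    With C proportional to 1/thickness, capacitors in series simply add
    their oxide-equivalent thicknesses.
    """
    return depletion_equiv_nm / (eot_nm + depletion_equiv_nm)


DEPLETION_EQUIV_NM = 0.4  # assumed oxide-equivalent poly depletion thickness

for eot in (3.0, 2.0, 1.0):
    loss = capacitance_loss_fraction(eot, DEPLETION_EQUIV_NM)
    print(f"EOT {eot:.1f} nm: {loss:.0%} of Cox lost to poly depletion")
```

Under this assumption the same depletion layer that costs roughly a tenth of the gate capacitance at 3 nm EOT costs more than a quarter of it at 1 nm.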

Finally, poly-Si has been found to interact in negative ways with some of the high-k materials under consideration. One clear example is that of Zr-based high-k materials such as ZrO2, Zr silicates and Zr aluminates. While bulk thermodynamics predicts stability of Si in contact with these materials, numerous groups have seen significant device degradation due to reduction of the dielectric material, resulting in catastrophically high leakage currents [15.5, 15.21, 15.37, 15.73]. Several groups have also identified process schemes that minimize these interactions, but it is not clear that they can eliminate them altogether [15.5, 15.73]. Other materials such as Hf-based oxides, silicates and aluminates (including nitrided versions of these) have been shown to be significantly more robust when integrated with poly-Si [15.89, 15.100, 15.101]. What is not known, however, is whether there are more subtle interactions taking place that may limit the scalability and ultimate integrability of these high-k layers with poly-Si electrodes.

15.4.2 Material Classes Under Consideration as Alternative Electrode Materials

Metals, rather than semiconductors, are the primary materials under consideration as alternatives to the electrodes used in the past. In this context, “metals” encompasses a broad range of materials such as metals, metal alloys, and metal compounds such as metal nitrides (for example TiN), oxides (for example RuO2), silicides (for example NiSi) and borides (for example TiB2). For these materials the depletion layer associated with poly-Si is essentially eliminated, so that the effective gate capacitance approaches Cox, leading to an improvement in device performance. However, these materials generally do not offer straightforward means to make significant local selective changes in the work function, as can be done by doping poly-Si. To first order, the work function of a given metal is fixed.
Therefore implementing CMOS devices with two independently optimized Vt values using metals is a significant challenge. This is not necessarily an issue for some device architectures (such as FD–SOI) and there are a number of approaches to resolve this from a gate stack integration point of view. Achieving the desired work function(s) can be done by either using two distinct materials or finding innovative means to locally adjust the work functions of a single base material. Several of these approaches will be discussed below. It should be noted that the use of metal gate electrodes refers to the case where the metal is in direct contact with the gate dielectric, with no intervening layer of poly-Si. A stack with metal or metal silicide overlying a poly-Si layer has been used for many years in both logic and memory products, and is not of interest for this discussion.

Fabrication Issues for Metal Gate Electrodes

Use of a metal as the gate electrode represents a significant departure from the standard CMOS process flow that has been in place for the last two decades. Use of metal gate electrodes can be broadly broken down into two categories, the first being that of a single material with one work function and the second being the use of one or more materials that yield two work functions, one adjusted for PMOS and the other for NMOS devices (“dual work function”). Both schemes present significant challenges to an industry that has been comfortable with and successful in using poly-Si as the gate electrode material for such a long time. While some aspects of implementation, e.g. basic gate stack process steps, are common to both schemes, there are also a number of areas where the approaches differ, e.g. in the channel engineering for a single work function or the added integration difficulties presented by using dual work functions. For this reason, the two cases will be covered separately. The focus of this discussion will be the challenges in implementing these gate stacks with a traditional or high-k gate dielectric rather than the specific material selection or property requirements, since these have been discussed in detail elsewhere in this book.

Single Work Function Gate Stack Implementation

Single work function implies the use of a single metal gate electrode material on both NMOS and PMOS devices, which can simplify the gate stack processing significantly relative to a dual work function metals scheme. Typically a metal is chosen with a work function around 4.65 eV, which falls about mid-way between the conduction and valence band edges of silicon. This simpler gate stack processing comes at a cost elsewhere in the device manufacture, however.
For bulk CMOS, use of a single midgap metal electrode yields Vt values that are higher than desired for both NMOS and PMOS devices, resulting in degraded performance for a given operating voltage. This is especially true for high performance, low Vt devices. Several groups have addressed this by pursuing aggressive channel engineering in order to adjust the Vt with the help of electrode work function engineering [15.15, 15.56, 15.69]. This method places a significant burden upon the channel doping from the points of view of both initial dopant placement (i.e., ion implantation) and diffusion, and may severely limit the thermal budget that can be tolerated. As a result of dopant diffusion-driven thermal budget constraints, implementation of a single work function approach in bulk CMOS may require a low thermal budget after doping, like that offered by the replacement gate or damascene schemes [15.33]. For the case of partially and fully depleted silicon-on-insulator (PDSOI and FDSOI, respectively), the situation with respect to Vt is significantly improved, so a single work function electrode with a work function near midgap may be feasible [15.13, 15.15, 15.83].
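The size of the Vt penalty can be estimated to first order: with channel doping and dielectric charge held fixed, the flat-band voltage (and hence Vt) shifts one-for-one with the gate work function. The sketch below compares a 4.65 eV gate with gates at the silicon band edges. The electron affinity (4.05 eV) and band gap (1.12 eV) are standard textbook values for Si, used here as assumptions in place of the rounded targets of about 4 eV and 5 eV quoted earlier; the function name is hypothetical.

```python
CHI_SI_EV = 4.05  # Si electron affinity, i.e. conduction band edge (textbook value)
EG_SI_EV = 1.12   # Si band gap at room temperature (textbook value)


def vt_magnitude_increase(gate_wf_ev, device):
    """First-order increase in |Vt| (volts) relative to a band-edge gate.

    Assumes everything except the gate work function is unchanged, so the
    shift equals the work function difference divided by q.
    """
    if device == "nmos":
        return gate_wf_ev - CHI_SI_EV
    if device == "pmos":
        return (CHI_SI_EV + EG_SI_EV) - gate_wf_ev
    raise ValueError("device must be 'nmos' or 'pmos'")


MIDGAP_GATE_EV = 4.65  # work function quoted in the text for a midgap metal

for device in ("nmos", "pmos"):
    shift = vt_magnitude_increase(MIDGAP_GATE_EV, device)
    print(f"{device.upper()}: |Vt| higher by about {shift:.2f} V")
```

A shift of roughly half a volt for each device type is what channel engineering would have to recover in bulk CMOS, which indicates why that route is so demanding at low supply voltages.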

While the channel engineering associated with use of a single work function midgap metal gate electrode may be difficult (at least for bulk CMOS), the actual gate stack fabrication also presents a number of challenges. There are three basic routes to the gate stack formation, each with positive and negative aspects. The first is a subtractive process of patterning and etching the gate stack, similar to that currently used in production with poly-Si gate electrodes. The second is an additive process, along the lines of a damascene or replacement gate flow, where the metal layer fills an opening in a dielectric layer. Finally, silicide reaction of the entire gate stack has also been proposed.

Subtractive Gate Stack Fabrication

The subtractive gate stack definition process follows a similar route to that of the conventional poly-Si gate electrode flow previously discussed. The gate stack is built up from the gate dielectric by depositing the desired metal electrode material onto an essentially smooth surface. While there is some topography, like that associated with non-planarity in the isolation process, there are no high-aspect ratio or narrow geometry features to be concerned about. The choice of deposition technique is a significant one, however. Since a conformal method is not required, there is no a priori restriction on the choice of method, and PVD, CVD and ALD processes have all been proposed. Reactively sputtered and CVD TiN have been compared in the past, with comparable results in some cases [15.115], while other groups more recently have found CVD to yield better results, particularly with respect to channel mobility [15.111]. It seems likely that a process like thermal CVD or ALD will be preferable over a plasma-based process like sputtering in order to minimize potential gate dielectric damage.
Typically, after deposition of the desired gate electrode material (that is, the material that is to be in contact with the gate dielectric) a second layer is deposited onto the electrode material. This cladding layer is usually of significantly lower resistivity than the main electrode layer and serves the purpose of reducing the gate sheet resistance. Tungsten [15.9, 15.23, 15.33, 15.110] is a common metal used for this purpose. Use of a low resistivity cladding layer allows more flexibility in the choice of the layer that sets the work function, since the work function setting layer would not be required to also provide low gate sheet resistance.

Once the gate stack is deposited, it is then patterned and etched. For technology nodes where metal gates are being considered for implementation (no sooner than the 65 nm node), the physical gate lengths are on the order of 25 nm and smaller according to the ITRS roadmap [15.36]. This is a major challenge even for poly-Si gate electrodes, which have been the subject of extensive etch development for many years. Since metal gate electrodes will be used for the most advanced devices, they need to be etched to smaller dimensions than the current poly-Si. In addition to the geometries involved, the gate electrode etch process must also be compatible with (i.e. selective
to) very aggressively scaled gate dielectrics. Ideally the gate etch would stop perfectly on the gate dielectric and not remove any of that layer or the underlying Si substrate. Considering how thin a traditional gate dielectric would be (on the order of 1 nm), use of high-k gate dielectrics might be needed for this approach to be successful as mentioned in Sect. 15.3.3. The high-k gate dielectrics currently under investigation have two significant advantages over traditional SiON-based materials in this context. They will be physically thicker for a given EOT due to the increased dielectric constant and they tend to be more difficult to etch in most etching chemistries. Therefore high-k materials are more likely to yield good selectivity during a metal electrode etch. While the difficulty in etching of the high-k layer may present difficulties later in the flow when removal from the source-drain regions is needed, at gate etch this may be a helpful feature. After gate-etch, the metal gate stack would move through the flow in a similar manner to the standard poly-Si gate stack. This now points out several new areas of difficulty for implementing this stack. Fab compatibility/acceptance is a potentially significant problem since the metal gate stack would need to run through process steps that, in a logic flow, would never have been shared between metal-containing and non-metal containing wafers. The extent to which this is a problem depends upon the metals of interest and the fab protocols of an individual manufacturer. While this is not a fundamental restriction, it may be a significant practical one. Two more fundamental potential issues are those of the thermal stability of the metal gate stack and also its chemical compatibility with subsequent processing. The gate stack formed in this manner would need to remain physically and electrically intact during the entire FEOL process flow, which usually involves several high-temperature dopant activation anneals. 
Clearly, low melting temperature metals such as aluminum or NiSi would not be appropriate for this approach since the anneals are generally in excess of 1000 °C. Materials proposed for this application have typically been refractory in nature, such as W or TiN, both of which have melting temperatures in excess of 1500 °C. Aside from the issue of simple melting, all of the layers in the metal stack of choice also need to be stable in contact with each other and any other layers on the gate structure (such as spacer materials and the gate dielectric itself) during high temperature processing. This is a very significant criterion and can be used to immediately eliminate a number of materials from consideration. Ti, for example, readily reacts with SiO2 gate dielectrics at typical CMOS process temperatures and would therefore not be appropriate. Clearly, materials that react with SiO2 may still be applicable to nitrided oxides or high-k gate dielectrics, but this needs to be examined on a case by case basis, particularly if the high-k film contains Si, as in HfSiOx or HfSiON.

The metal gate stack of choice also needs to be compatible with any processing chemistries to which it is exposed. Wet and dry-cleans in particular
need to be examined carefully for compatibility with materials of interest. In the course of the flow, the metal stack sidewalls will become encapsulated with dielectric (SiO2 and Si3N4) layers which will shield the metals from chemical attack. Before the encapsulation and even afterward (if there are any pinholes in the layers) some metals would be attacked readily by common surface preparation chemistries routinely used in the front end of line. Also, depending upon the stack used, metal layers (e.g. cladding layers) may also be exposed on the top surface, again leading to potential incompatibility issues. In this case, an extra encapsulating surface layer could be included in the overall gate stack. This layer would then need to be accounted for in the overall gate stack integration scheme, but would allow greater flexibility in the gate stack materials choices.

Additive Gate Stack Formation – “Replacement Gate”

The additive gate stack approach can be used to eliminate a number of problems associated with the subtractive approach, though this approach is not free of problems itself. The general additive approach (also called “replacement gate” or “damascene gate”) [15.9, 15.10, 15.23, 15.110, 15.111] follows the standard flow, including gate formation with poly-Si electrode and junction formation, up to the point where the device is fully formed (i.e., after the source-drain activation). At this point, the so-called “dummy” or sacrificial poly-Si layer is removed, and the underlying gate dielectric may also be removed. The actual gate electrode (and possibly gate dielectric) is then formed in the openings left by removing the “dummy” poly-Si. If needed, the gate dielectric layer is first deposited or grown, after which the metal gate electrode is deposited to refill the gate opening. As in the subtractive approach, multiple metal layers can be used and the one in contact with the dielectric sets the work function and eliminates poly-Si depletion.
Subsequent layers can be used to reduce the gate sheet resistance if the first layer is too resistive. Once the metal layers are deposited, they are either etched or planarized by CMP in order to re-define the gate structures. Planarization by CMP would be the preferred approach since no patterning and etch of the metal layer would be required.

A replacement gate approach clearly has several advantages over a subtractive scheme. Since the metal portion of the processing is moved to later in the flow (essentially at the end of the MOSFET front end of the line fabrication), fab compatibility concerns would be significantly reduced. Since the metal layers are not introduced until later in the flow, the same basic processes and toolsets currently used in the FEOL can still be used for the dummy (sacrificial) gate stack fabrication. The gate stack etch (which must stop on the ultrathin gate dielectric) would be of poly-Si, which has been established and well-studied for many years. Finally, the electrode is introduced after the high-temperature process steps in the FEOL have been completed, so the thermal stability requirements are significantly reduced. Materials that
would not be acceptable in a subtractive approach may be sufficiently robust for use in a replacement gate flow. While there are many clear advantages to a replacement gate flow, there are also some significant difficulties. It is important to consider the implications of the dummy gate removal. If the gate dielectric is not removed and reformed, then it is critical that the Si removal selectivity be essentially infinite. Additionally, the “dummy” Si removal may degrade the quality of the dielectric layer from both a basic film property as well as reliability point of view. This may be more of a problem for traditional gate dielectrics than for high-k materials but would still need to be examined carefully. If the dielectric is stripped and then reformed after dummy gate removal, this would eliminate the concerns mentioned above but lead to several different ones. If a high-k material were used as the gate dielectric, then the deposition process would need to provide sufficient bottom coverage and no corner effects (for example cracks or local thinning or thickening) at the gate edges. Such defects would likely destroy the device, possibly immediately but could also cause severe reliability degradation. A deposition technique such as CVD or ALD would be able to provide the needed bottom coverage, but the presence or absence of defects would need to be established. Another potential issue with a deposited gate dielectric is that a conformal film would also take up space on the gate sidewall, decreasing the size of the opening. While the high-k layers will be fairly thin, a 4 nm film will still occupy 8 nm of the gate opening. Considering the ITRS roadmap projected Lg of 25 nm for the 65 nm node the high-k layer will account for a significant fraction of the gate cross-section. 
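The geometry argument above is simple arithmetic, sketched here for convenience. The 25 nm opening and 4 nm film are the figures quoted in the text; the 100 nm gate stack height used for the aspect ratio is an assumed value for illustration only.

```python
def remaining_opening_nm(opening_nm, film_nm):
    """Width left for metal fill after a conformal film coats both sidewalls."""
    remaining = opening_nm - 2.0 * film_nm
    if remaining <= 0:
        raise ValueError("film closes the opening completely")
    return remaining


def sidewall_fraction(opening_nm, film_nm):
    """Fraction of the original opening width consumed by the sidewall film."""
    return 2.0 * film_nm / opening_nm


def aspect_ratio(height_nm, opening_nm):
    """Height-to-width ratio of the trench to be filled."""
    return height_nm / opening_nm


GATE_OPENING_NM = 25.0   # projected Lg quoted in the text
HIGHK_FILM_NM = 4.0      # film thickness quoted in the text
STACK_HEIGHT_NM = 100.0  # assumed gate stack height (illustrative)

width = remaining_opening_nm(GATE_OPENING_NM, HIGHK_FILM_NM)
print(f"opening after high-k deposition: {width:.0f} nm")
print(f"fraction taken by sidewalls: {sidewall_fraction(GATE_OPENING_NM, HIGHK_FILM_NM):.0%}")
print(f"aspect ratio before: {aspect_ratio(STACK_HEIGHT_NM, GATE_OPENING_NM):.1f}")
print(f"aspect ratio after:  {aspect_ratio(STACK_HEIGHT_NM, width):.1f}")
```

The narrowing matters twice: it reduces the cross-section available to the gate metals and raises the aspect ratio that the metal deposition has to fill.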
If a traditional SiO2 or SiON dielectric were to be re-grown, then the flow may have a significant problem with thermal budget, especially if the S/D silicide is in place during the replacement gate processing. Having this silicide in place before the metal gate formation would be very desirable from a process flow complexity point of view, so the gate dielectric thermal budget would be limited to that for which the silicide is stable. As the industry moves to NiSi, this will be a significant limitation since NiSi has poor thermal stability. Clearly, one way around this problem would be to re-expose the source-drain areas after the final gate processing for silicidation, but this would likely be quite difficult to accomplish. In addition to issues with the gate dielectric, deposition of the metal layers will also likely be quite difficult due to the size of the gate openings that will need to be filled (∼ 25 nm for the 65 nm node, and even less if a deposited dielectric is used). While the gate stack height will likely be scaled to keep the gate opening aspect ratio relatively low, deposition into such small openings is still not trivial. Furthermore, if CMP is used to planarize the layer then it will be important to minimize or eliminate any seam in the trenches, just as it is important to do so at the contact level for tungsten. While one of the advantages of the damascene approach is that the gate metals do not need to be etched, they still need to be planarized by CMP.

It is necessary, therefore, to develop appropriate CMP processes to remove all of the layers in the gate stack. Depending upon which gate metals are used, this may be quite difficult. The process would also need to stop on the underlying dielectric robustly in order to avoid removing the gate stack. Any variability in the gate stack height due to the CMP process would be seen immediately as variability in the gate sheet resistance, which directly impacts device performance.

Silicide Reaction of the Gate Stack – “Full Silicidation”

The use of silicides as potential gate electrode materials has been investigated for many years [15.38, 15.39, 15.61, 15.85]. In the 1980s and also more recently, refractory silicides such as MoSi2 and WSi2 were investigated extensively. Indeed, the DRAM industry has used WSi2 as the gate metallization layer for a number of generations. In this case, however, the WSi2 was deposited onto a poly-Si layer, so it was not a metal gate in the true sense. Two significant benefits of these materials are their etchability and thermal stability, which lend them to the subtractive process discussed above.

More recently an alternative fabrication methodology (termed “full silicidation”) has been demonstrated to allow implementation of a significantly broader range of silicide materials (namely silicides that are not readily dry etched and have relatively poor thermal stability) [15.77, 15.96]. This scheme is similar to the “replacement gate” process flow mentioned above in that a sacrificial Si gate electrode is fabricated in the traditional manner. In this case, rather than actually removing the sacrificial Si and then re-depositing the final metal electrode, the Si is fully reacted with an appropriate metal to form a silicide layer in contact with the gate dielectric. The silicide layer in contact with the gate dielectric then sets the work function and eliminates depletion.
One significant benefit over a “replacement gate” process flow is that the gate dielectric is never exposed and does not need to be re-grown. One significant problem associated with this approach is excessive silicidation of the S/D regions during full silicidation of the gate if both regions are reacted simultaneously [15.96]. One route to avoid excessive silicidation is to perform the S/D and gate silicidations separately, using a dielectric capping layer over the S/D regions to prevent further silicidation during gate silicidation. While this requires a well-controlled dielectric etch-back or CMP planarization, Tavel et al. [15.96] indicate that such a process is feasible. Use of SOI substrates will also prevent this problem since there is only a finite thickness of Si in the source-drain regions available for silicidation. The ability to uniformly and fully silicide all gate structures across an entire wafer is also a significant risk with this approach. This requirement may severely limit the material choices available for implementation. Silicide formation in narrow lines has been driving the industry to incorporate new silicide materials over time, evolving from TiSi2 to CoSi2 and now NiSi. Early indications are that NiSi shows significant promise for application in this approach, but it is not clear that it will be successful from a manufacturability standpoint [15.42].

15.4.3 Dual Work Function Gate Stack Implementation

While the use of the same metal gate stack on both PMOS and NMOS transistors may be acceptable for device architectures such as SOI, it will be difficult at best to implement this successfully on traditional bulk CMOS devices since the Vt will be too high for both NMOS and PMOS devices. Preferably, different work functions would be used on NMOS and PMOS transistors, either by using more than one gate electrode metal or by locally adjusting the work function of a single material. For the most part, the basic issues that were discussed in terms of single metal gate stack fabrication also apply to the dual work function case, but the need to simultaneously attain two work functions on the same die adds significantly to the complexity and risk. Several of the processing schemes that have been proposed to accomplish this will be discussed in this section.

Pattern and Etch NMOS and PMOS Separately

The most straightforward means to create a dual work function CMOS device is to simply deposit two materials with the desired work functions sequentially, with an etch between the depositions to remove the first layer from the complementary device type. The basic processing flow is shown in Fig. 15.8 [15.92]. In this particular case, TiN was used for the PMOS electrodes and TaSiN was used for the NMOS electrodes; this demonstrates the general concept, which is not limited to these materials. The TiN was first deposited onto the entire wafer, then patterned and etched from the NMOS regions. TaSiN and a poly-Si cladding layer were then deposited, followed by pattern

Fig. 15.8. Schematic of the dual-metal gate process flow with corresponding SEM images of an SRAM showing the different layers covering the NMOS and PMOS regions of the die. After [15.92]

and etch to define the gates. While conceptually simple, this process scheme has several significant risks. The first is that the gate dielectric is exposed to etch and clean chemistries as well as the fab ambient during the removal of the first layer (TiN in the case shown) from the NMOS regions. In a standard flow, the gate dielectric above the channel is never exposed after the electrode deposition. In addition to the possibility that the dielectric would be thinned in these regions during etching and cleans, it is possible that this processing would degrade the overall properties of the dielectric layer, in particular from a reliability point of view. In the cited work [15.92], the authors found that HfO2 stood up well to the TiN etch and clean chemistries, and preliminary reliability data looked encouraging. Clearly, the degree to which this is an issue strongly depends upon the gate dielectric and electrode materials involved as well as the etch and clean chemistries used for the removal of the first layer and any etch or clean chemistry residues. The gate stack etch and cleans for this approach are also significantly more difficult than for the single work function case, since there is an extra material involved and there are also different stacks that need to be etched simultaneously for the PMOS and NMOS devices. In terms of gate etch, the stack thicknesses are different and the addition of the extra layer means that an extra etch chemistry may be needed. For ultra-thin SiON dielectrics, it is unlikely that a gate etch of this type can be made to reliably stop on the thin dielectric layer and not etch into the underlying Si substrate, so this may be a situation where a thicker and more etch-resistant high-k gate dielectric is needed. For post gate cleaning, the chemistries used must be compatible with two metal layers and any cladding layers simultaneously.

Metal Interdiffused Gates

One method that has been proposed to alleviate the need to remove the gate material from one of the device types during processing involves the use of metal inter-diffusion to adjust the work function [15.76]. In this case a first metal (or metal alloy) is deposited, followed by a second metal (or alloy) deposition. The layers are patterned and etched so that the second layer is removed from one device type, but the first layer is left intact. The first layer has a work function suited to one of the device types. An anneal treatment is then performed to intermix the first and second layers on the complementary device type, followed by a pattern and etch to define the gates. This process is shown schematically in Fig. 15.9. At the interface with the gate dielectric, the composition of the intermixed layer is changed so as to also change the work function to a value suited to that of the complementary device type. Using Ti and Ni as the two materials (with Ni as the second material), the Ni was found to segregate to the interface with the dielectric so the complementary devices had Ti and Ni (essentially pure) setting the work functions and a flat band (Vfb) difference of 1.4 eV was obtained [15.76] as shown in Fig. 15.9. While Ni and Ti are far too reactive to

Fig. 15.9. Schematic illustration of the process flow. (a) CMOS structure after second metal has been removed from the n-MOS side. (b) CMOS structure after annealing shows that the metals on the p-MOS side have interdiffused, and second metal has segregated to the dielectric interface. After [15.76]

be seriously considered as candidate materials, this work did show that the general concept is valid. Subsequent work with Ta/Ru alloys (among others) has expanded upon this concept to include materials that have better thermal stability and are more likely to be valid options to explore [15.60, 15.118]. While the metal interdiffused approach clearly has some significant advantages over the basic dual work function scheme (in particular, the gate dielectric over the channel is never exposed after the deposition of the first metal layer), there is still a significant risk in terms of the gate etch and cleans. The stacks on the complementary device types are again different, with different constituents and thicknesses. Indeed, this case may be even more difficult because there is no guarantee that the composition of the films (and therefore their etch properties) will be constant through the thickness of the film. One possible way to simplify this somewhat would be to pattern and etch the gates before annealing to intermix the layers, but this does not alleviate the more fundamental problem of having significantly different stacks on the complementary device types.

Local Nitridation

Another approach to local work function modification has been demonstrated by using local nitridation to adjust the work function. For this approach, a metal with a work function appropriate to either PMOS or NMOS is deposited. Use of Mo for this method has been discussed extensively in the literature [15.51, 15.54, 15.81, 15.82]. Subsequently, the Mo is patterned and nitrogen is introduced by ion implantation or other means. The implantation may be followed by an annealing treatment to redistribute the nitrogen, and then the gates are patterned and etched.

Regions that are masked during nitrogen ion implant retain the high work function, but regions that have added nitrogen yield reduced work functions as MoNx is formed. Depending upon the ion implantation and post-implant annealing conditions, work function separations as large as 0.6 eV were obtained [15.81]. As with the Ni/Ti interdiffusion work, the nitrogen was found to segregate to the interface with the gate dielectric so the majority of the film remained nitrogen deficient. This method has a number of beneficial aspects to it and even the gate etch problems associated with other approaches are mitigated somewhat by the fact that there is only one base material involved. The addition of N has potential to change the etch characteristics and the degree to which this occurs would need to be characterized. Using this methodology, the gate stack thicknesses are the same on both NMOS and PMOS devices. However, if the N segregates to the gate dielectric interface (as is the case for N ion implantation), the bulk of the film can be denuded of N and therefore only a thin layer has potential to etch differently between NMOS and PMOS devices.

Full Silicidation with Differentiated Work Functions

As mentioned earlier in the context of single work function implementation, the full silicidation approach can be used to yield multiple work functions by at least two means. The first is to simply use two silicides with different work functions, and to form them either independently or simultaneously during gate silicidation. In a manner similar to the basic dual metal gate approach, a metal that yields a silicide with appropriate work function for one device type is deposited and then patterned and removed from the complementary device type. At this point, a silicide reaction can be performed if desired. Subsequently, the second metal (chosen to yield a silicide with the appropriate work function for the complementary device type) is deposited.
It may be patterned and removed from the complementary device type, but this may not be needed. This layer is then annealed to react, and any residual metal is then removed. In this manner, two different silicides are formed on the two device types. Clearly this approach presents many difficulties in terms of the silicidation process. Generally, optimizing the deposition, anneal and clean processes for just one silicide requires a significant effort. Performing this optimization for two silicides simultaneously, with the requirement of uniformly reacting lines on the order of 25 nm wide, is expected to be very difficult. Significantly simpler, the other means to differentiate the NMOS and PMOS work functions involves only one base silicide material [15.40, 15.41, 15.57, 15.94, 15.109]. Before silicidation, the poly-Si gates for PMOS and NMOS devices are doped independently, with either P or As (for NMOS) or B (for PMOS). When the silicidation is performed, the dopants redistribute and in some cases segregate to the gate dielectric interface [15.41, 15.57]. This segregation has been found to result in a significant change in the work function,

which is not surprising since the material at this interface is more accurately described as a ternary compound rather than a silicide per se. Early indications of this approach applied to NiSi, for example, are quite positive, but a significant amount of work will be needed to understand and control the mechanism responsible for the work function modification. In addition, the manufacturability issues mentioned previously need to be addressed [15.42].

15.5 Integration of High-k Gate Dielectrics and Metal Gates into Advanced Devices

15.5.1 Advanced Planar Integration Schemes

As conventional bulk MOSFETs shrink to gate lengths approaching 10 nm, the need for improved control over the channel region in order to suppress short channel effects (SCEs) becomes increasingly critical. The SCEs, such as drain-induced-barrier lowering (DIBL), threshold voltage (Vt) roll-off, punchthrough, high off-state leakage and parasitic capacitances and resistances, tend to degrade the transistor characteristics at today’s gate lengths and only stand to worsen upon future scaling. Integration elements, including heavily doped drain extensions and pocket or halo implants, have been implemented to combat SCEs in bulk MOSFETs; however, these doping levels have begun to approach the 10^19 cm^-3 regime and carrier mobilities may be degraded. In order to relax the doping constraints on SCE control, several device structures to suppress SCEs by changing the device geometry are either currently implemented, such as partially-depleted silicon-on-insulator (PD–SOI), or have already been proposed, such as fully-depleted SOI (FD–SOI) and non-planar multiple-gated devices (e.g. Fin–FETs). However, with the continued scaling of gate lengths, and subsequently gate dielectric thickness, these devices will eventually also require high-k gate dielectrics and metal gate electrodes to improve gate leakage and dopant depletion, respectively, as outlined in the previous sections for bulk MOSFETs. The process integration of PD–SOI MOSFETs is very similar to that of bulk MOSFETs [15.116]. Heavily doped drain extensions and pocket implants are used to control SCEs and separate implants are used to adjust threshold voltages. The difficulty in implementing PD–SOI devices is not in the process integration, but in the strain that device non-idealities, such as floating-body effects, put on circuit designers [15.20].
These factors, plus the minimal performance gains PD–SOI offers over bulk MOSFETs [15.11], lessen the possibility that high-k gate dielectrics and metal gate electrodes will be integrated in PD–SOI for commercial production. If high-k dielectrics and metal electrodes are not integrated into PD–SOI devices, then planar double-gated SOI devices and FD–SOI devices, and high mobility substrates are the other main planar integration options remaining. Planar double-gate SOI devices employ a bottom gate to prevent the drain electric field lines from penetrating the channel

region and thereby increasing SCEs. Even if the technology existed to build transistors with a functional bottom gate in a manufacturing environment, the alignment of the top and bottom gates with each other and with the source and drain is both critical and difficult [15.108]. While continual improvements in process technology may enable such devices, introduction of high-k and metal gates into planar double-gate devices only increases the integration challenges. To realize high-k dielectrics and metal gates in planar double-gated devices, the high-k dielectric and metal gate would need to withstand the thermal budget of a wafer bonding process and the metal gate would need to be thermodynamically compatible with the buried oxide layer [15.108]. The reduction of the floating body effect makes FD–SOI devices an attractive alternative to PD–SOI [15.24] and the simplification of the integration makes FD–SOI an attractive alternative to planar double-gated SOI devices. In order to enable fully depleted operation when the channel doping is high, FD–SOI devices are built on thin (approximately one-third of the gate length) silicon bodies. The control of Vt in these devices is highly dependent on the silicon body thickness [15.107], and devices built on a silicon body thickness less than ∼ 10.0 nm [15.84] experienced a significant degradation in the mobility. Furthermore, FD–SOI devices require the implementation of elevated source-drain (ESD) to prevent silicidation down to the buried oxide and underneath the sidewall spacer and to reduce source/drain resistance [15.2, 15.22]. FD–SOI devices present similar integration issues for high-k gate dielectrics and metal gates as outlined in the previous sections for standard bulk CMOS. Integration of metal gates into FD–SOI may not only improve the device characteristics due to the reduction in inversion dielectric thickness, but may also enable lower channel doping.
Since the gate potential exercises a significant control over the channel potential in FD–SOI devices, the improvement in SCEs may allow a decrease in the channel doping and a commensurate increase in the channel mobility due to the reduction in impurity scattering. In order to maintain sufficiently small Vt (∼ ±0.2 to ±0.3 V) for high performance operation, an electrode with a mid-gap work function may be desired in this circumstance [15.15]. This presents an opportunity for mid-gap metals, such as TiN, W and metal silicides, to be integrated into these devices. Metal gate FD–SOI devices have been demonstrated using a PVD TiN process to deposit the gate electrode on a thermal SiO2 gate dielectric [15.2, 15.13, 15.55]. Control of Vt from ±0.3 V to ±0.5 V and control of within wafer Vt variation (∼ 5 mV) due to within wafer silicon body thickness variation were demonstrated through the utilization of the mid-gap metal work function and light channel doping. Due to the light channel doping, Vt roll-off was problematic in some cases, but was controllable by thinning the silicon body thickness to ∼ 30 nm.

Another approach for integrating mid-gap metals into FD–SOI is the full silicidation of a poly-Si gate electrode. Full silicidation has been demonstrated using nickel silicide as the gate electrode on a thermal SiO2 gate dielectric [15.48]. Figure 15.10 illustrates the cross-section of a FD–SOI device formed by the full silicidation of poly-Si. This process yielded FD–SOI devices with Vt ∼ ±0.4 V. The nickel silicide gated FD–SOI devices showed impressive reduction in SCEs with sub-threshold characteristics nearly ideal and low DIBL (from 40 to 100 mV/V). The integration of nickel silicide with the SiO2 gate dielectric showed an order of magnitude decrease in the gate leakage over poly-Si gates and no degradation in breakdown characteristics versus poly-Si gates. Nickel silicide gate electrodes formed by full silicidation may suffer from gate length dependent silicidation [15.42]. Gate length dependent formation of a nickel-rich phase tends to arrest silicidation at formation temperatures below 450 °C. Figure 15.11 presents SEM cross-sections of multiple gates that display incomplete silicidation. Furthermore, the nickel-rich phase has a 130 mV higher work function than the mono-silicide, raising concerns about Vt control.

Fig. 15.10. TEM of a NiSi FD–SOI device formed by full silicidation of poly-Si. After [15.48]

Fig. 15.11. SEM of NiSi gates formed by the full silicidation of poly-Si indicating incomplete silicidation for some gates. After [15.42]

A final option to integrate a mid-gap metal into FD–SOI devices utilizes a replacement gate approach for a tungsten damascene gate electrode with SiO2 and SiON gate dielectrics [15.22]. This approach has been used to

successfully fabricate devices with gate lengths down to 8 nm. The FD–SOI structure notwithstanding, at these extremely scaled gate lengths, pocket implants were implemented to control short channel effects and set the Vt. Long channel Vt of ∼ −0.4 V for pMOS and ∼ 0.5 V for nMOS were obtained. These devices exhibited short channel sub-threshold slope of < 83 mV/dec, indicating good control of SCEs. The main open question in this FD–SOI approach using a mid-gap metal gate and an undoped or lightly-doped channel is whether the gate can provide sufficient control over the channel region when the drain extension and deep drain are so close (on the order of 10 nm) to the source. It is likely that this situation will necessitate the use of pocket implants to control SCEs, which in turn necessitates the use of Vt implants. These higher doping levels in the channel may spur the use of dual work function gate electrodes to obtain sufficiently low Vt [15.7]. Under these circumstances, dual work functions at the endpoints from band edge to band edge or at a range of 4.2–5.0 eV are desired. The ability to obtain dual work function metals for FD–SOI by manufacturable means has been separately demonstrated by selectively implanting either Ar or N into Mo [15.82] and selectively incorporating dopants (B and As) into the silicide [15.57]. The capability to shift the work function of the metal gate in the range of 4.9 to 4.5 eV was achieved by each technique. The implementation of high-k gate dielectrics in FD–SOI devices may present additional integration issues not experienced when SiO2 or SiON is used as the gate dielectric due to an increase in SCEs associated with fringing induced barrier lowering (FIBL) [15.117]. FIBL is the result of enhanced coupling between the channel and the source/drain through the high-k gate dielectric.
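
The role of physical thickness in FIBL can be made concrete with a rough sketch. At a fixed equivalent oxide thickness (EOT), a film with dielectric constant k is physically thicker than SiO2 by the factor k/3.9. The Python snippet below evaluates this relation; the k values for Al2O3 and HfO2, the 1 nm EOT and the 25 nm gate length are illustrative assumptions and are not taken from the cited work.

```python
# Sketch only: physical thickness of a gate dielectric at fixed EOT.
# The k values for Al2O3 and HfO2, the EOT and the gate length below are
# illustrative assumptions, not data from the chapter.

K_SIO2 = 3.9  # relative permittivity of SiO2


def physical_thickness_nm(eot_nm: float, k: float) -> float:
    """Physical thickness giving the same capacitance as eot_nm of SiO2."""
    return eot_nm * k / K_SIO2


def thickness_to_gate_length(eot_nm: float, k: float, gate_length_nm: float) -> float:
    """Ratio of physical dielectric thickness to gate length (dimensionless)."""
    return physical_thickness_nm(eot_nm, k) / gate_length_nm


ASSUMED_K = {"SiO2": 3.9, "Al2O3": 9.0, "HfO2": 20.0}  # assumed values
EOT_NM = 1.0            # assumed target EOT
GATE_LENGTH_NM = 25.0   # assumed gate length

for name, k in ASSUMED_K.items():
    t_nm = physical_thickness_nm(EOT_NM, k)
    ratio = thickness_to_gate_length(EOT_NM, k, GATE_LENGTH_NM)
    print(f"{name}: k = {k:4.1f}, thickness = {t_nm:5.2f} nm, thickness/Lg = {ratio:.3f}")
```

The thickness-to-gate-length ratio printed here is only a geometric indicator; it says nothing about the two-dimensional field distribution, which requires device simulation.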
The increased thickness of a high-k dielectric over a SiO2 -based gate dielectric causes the termination of a greater percentage of field lines in the source/drain region. This termination of field lines in the source/drain region increases the electric field from the drain to the channel, thereby increasing DIBL and decreasing the control of the gate over the channel. If a high-k gate dielectric is integrated with a lightly doped channel in a FD–SOI scheme, then the increase in SCEs due to FIBL may encourage the use of channel engineering to decrease the effects of SCEs. Achieving low Vt when high doping levels are used in the channel of FD–SOI devices with a high-k gate dielectric would be difficult by using a single work function metal gate electrode. It is then likely that FD–SOI with a high-k gate dielectric would require the use of dual work function gate electrodes, thus increasing the complexity of the integration. Two caveats to the above scenario should be noted. First, simulations have indicated that SCEs may be insensitive to channel engineering when a high-k gate dielectric is employed [15.63]. This is most likely a result of the tight coupling of the source/drain to the channel through the high-k gate dielectric and not through the channel. It may be possible to mitigate this effect by integrating a low-k (k < 3.9) spacer material when using a high-k gate dielectric [15.62]. A low-k spacer can confine the electric field lines to

Fig. 15.12. TEM image of a FD–SOI device with PVD TaSiN metal electrode and MOCVD HfO2 high-k gate dielectric. Inset is a HR-TEM image of the gate stack and silicon channel. After [15.98]

the channel region, effectively reducing the effects of FIBL. No known devices have been demonstrated to date with a high-k gate dielectric and a low-k spacer. The second caveat to the above scenario is that the effects of FIBL on SCEs do not become substantial until the dielectric constant is around 35 or alternatively for a dielectric thickness to gate length ratio of ∼ 1 to 5 [15.14]. Most high-k gate dielectric stacks under consideration today have effective dielectric constants less than the critical value of 35 and it is unclear whether or not FIBL will be a major effect. Several instances of integration of both high-k gate dielectrics and metal gate electrodes into FD–SOI devices exist in the literature. A gate first approach has been used to integrate a MOCVD HfO2 as the gate dielectric and reactively sputtered PVD TaSiN as the gate electrode [15.97–15.99]. The cross-section of a FD–SOI device with a PVD TaSiN gate electrode and a MOCVD HfO2 high-k gate dielectric is exhibited in Fig. 15.12. The Vt for nMOS and pMOS was found to be 0.26 V and −0.51 V, respectively. These asymmetric threshold voltages are the result of the TaSiN having a work function of 4.4 eV, close to the value needed for nMOS. Clearly, these devices would require dual work function metal gate electrodes to obtain the required low Vt for high performance device operation. These devices suffered from severe mobility degradation (peak mobility of 250 cm^2/V/s and 20 cm^2/V/s for nMOS and pMOS, respectively). Based on the quality of the bottom SOI interface (near ideal sub-threshold slope and high mobility), the high-k gate dielectric was identified as the source of the mobility degradation. Good control of short channel effects was obtained with these devices with a sub-threshold slope of ∼ 70 mV/dec and DIBL at 50 mV/V. A replacement gate approach has been used to integrate a tungsten damascene gate electrode and an HfO2 gate dielectric [15.22]. The tungsten gate has a

work function close to mid-gap and the threshold voltages were 0.4 V for nMOS and −0.3 V for pMOS long channel transistors. However, a large Vt offset (+200 mV) was observed for short channel pMOS devices. Fixed charge in the gate stack resulting from the replacement gate process was hypothesized to cause the Vt offset. These devices showed immunity to short channel effects with sub-threshold slope of 77 mV/dec.

15.5.2 Advanced Non-planar Integration Schemes

FD–SOI devices offer many gains in control of SCEs. However, the ability to fabricate SOI substrates with controlled silicon thickness uniformity in a manufacturing environment becomes quite difficult, since the silicon body thickness must scale with the gate length to achieve fully depleted operation. In order to relax the constraints on the silicon body thickness compared to planar FD–SOI, the third dimension is now being explored. The addition of the third dimension enables the implementation of multiple gates surrounding the silicon channel. Multiple gates enable additional control over the channel versus single gated devices. This extra control over the gate region relaxes the constraints on the minimum dimension of the silicon thickness, enabling a wide range of device architectures. Research organizations have recently proposed double-gate (Fin–FETs) [15.19], triple-gate [15.12], triple-gate plus (e.g. Pi-gate [15.72] and Omega-gate [15.114]) and quadruple-gate (VRG) [15.29] and silicon on nothing (SON) [15.65] devices as means to obtain devices that offer enhanced channel control over planar single-gated devices. Figures 15.13–15.15 illustrate the structure of several multiple-gated MOSFETs. The additional control over the channel region afforded by the double gates of a Fin–FET allows the width of the silicon fin to be approximately two-thirds the gate length. This represents an improvement over planar FD–SOI devices where the silicon thickness is approximately one-third the gate length.
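
The two rules of thumb quoted in the text (silicon body thickness of about one-third of the gate length for planar FD–SOI, fin width of about two-thirds of the gate length for Fin–FETs) can be tabulated with a short sketch. The gate lengths chosen below are hypothetical, and the ratios are the approximate values stated in the text rather than design rules.

```python
# Sketch only: approximate silicon body dimensions implied by the
# rule-of-thumb ratios quoted in the text. Gate lengths are hypothetical.
from fractions import Fraction

# Approximate ratio of the critical silicon dimension to the gate length.
BODY_TO_GATE_LENGTH = {
    "planar FD-SOI body thickness": Fraction(1, 3),
    "Fin-FET fin width": Fraction(2, 3),
}


def body_dimension_nm(gate_length_nm: float, ratio: Fraction) -> float:
    """Approximate silicon body dimension (nm) for a given gate length (nm)."""
    return float(Fraction(gate_length_nm) * ratio)


for gate_length_nm in (30, 20, 10):  # hypothetical gate lengths
    for label, ratio in BODY_TO_GATE_LENGTH.items():
        dim = body_dimension_nm(gate_length_nm, ratio)
        print(f"Lg = {gate_length_nm:2d} nm: {label} ~ {dim:.1f} nm")
```

For the same gate length the fin may be roughly twice as wide as the planar body is thick, which is the relaxation of the silicon dimension that the text describes.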
However, the fin width must be patterned by lithography and etch. In planar MOSFETs, the smallest dimension patterned by lithography and etch is the gate length. In Fin–FETs, the smallest dimension patterned by lithography and etch is the silicon fin width, which would put a greater strain on the lithography and etch scaling. One possible solution is to use a spacer patterning technique, but it is unclear if this technique can be controlled in a manufacturing environment. Fin–FETs typically have a high aspect ratio, but triple-gated devices allow for a shorter fin height and a thicker silicon width due to their greater ability to shield the channel from the source/drain regions. Pi-gate and Omega-gate devices (triple-gate plus) further improve the shielding of the channel from electric field lines emanating from the source/drain by extending the gate electrode into the buried oxide. These triple- and triple plus gated devices allow the dimensions of the silicon body to be approximately 1:1:1 (height:width:length). Due to their enhanced symmetry, triple- and triple plus gate devices present attractive integration options over Fin–FETs. Quadruple-gate devices, such as the vertical replacement-gate (VRG) and the silicon-on-nothing (SON), offer the ultimate in channel control, but require difficult integration steps, such as epitaxially grown channels. All of these devices can be integrated with SiO2 or SiON gate dielectrics and dual work function poly-Si gate electrodes. All of these devices also enable the use of a thicker equivalent electrical thickness due to their enhanced control of SCEs over their single-gated counterparts. However, the exact timing of the first implementation of multiple-gated devices is unclear, and with current scaling trends, it is not too soon to begin investigating the integration of high-k gate dielectrics and metal gate electrodes into multiple-gated devices.

Fig. 15.13. Schematic diagram illustrating the layout of Fin–FET transistors. After [15.19]

Fig. 15.14. Schematic diagram illustrating various SOI devices: 1) single-gate; 2) double-gate; 3) triple gate; 4) quadruple-gate; 5) Pi-gate. After [15.72]

Fig. 15.15. TEM illustrating the cross-section of an Ω-gate device. After [15.114]

The common threads that tie multiple-gated devices together from a high-k gate dielectric and metal gate electrode integration perspective are their requirement for deposition techniques with extremely high conformality and low defect density surfaces on the multifaceted etched silicon body. High conformality around the three-dimensional structure of the silicon body of a multiple-gated device is critical for the performance of the gate dielectric. Gate dielectric thinning at the corners or vertical edges of the silicon body could create high leakage and breakdown sites that could compromise dielectric reliability. Similar non-conformality of a metal gate electrode in these areas could possibly cause areas of high electric field that could also compromise dielectric reliability. These issues make conformal deposition techniques such as ALD and CVD promising candidates for high-k gate dielectric and metal gate electrode integration and make non-conformal techniques such as PVD unlikely candidates. Since the silicon body for most multiple gate devices is formed by lithography and etch, at least two of the device channels will be on etched surfaces.
The ability to form low defectivity interfaces on silicon surfaces with line edge roughness and possible sidewall damage resulting from the pattern and etch has yet to be fully demonstrated. Hydrogen annealing the silicon body after etch but before gate dielectric formation can increase the surface diffusivity of silicon atoms and enable smoothing of these etched surfaces. In fact, post-silicon body etch hydrogen annealing has been shown to improve the mobility of Fin–FET devices [15.18]. Since these multiple-gated devices have silicon channels that are multiple silicon planes (generally (100) and (110)), techniques must be developed to create low defectivity interfaces with high mobility on multiple silicon surfaces in order to achieve devices with high drive current and acceptable device characteristics. Even with these integration challenges, several research groups have already demonstrated high-k gate dielectrics or metal gate electrodes integrated into multiple-gated structures. At least two separate research groups have fabricated Fin–FETs with metal gate electrodes. Both approaches are targeted at obtaining metal gate electrodes with dual work functions. A shift in the Vt of Mo-gated Fin–FETs has been achieved by implanting N into the Mo gate [15.18]. The Vt for an as-deposited (i.e. no N implant) Mo gated pMOS Fin–FET was found to be −0.2 V. Nitrogen implantation shifted the Vt by ∼ 0.3 V toward that needed for nMOS. Metal-gated Fin–FETs have also been

15 CMOS IC Fabrication Issues for High-k Gate Dielectric

Fig. 15.16. TEM image illustrating the cross-section of a mesa isolated FD–SOI device. After [15.49]

Fig. 15.17. TEM image illustrating the cross-section of a SON MOSFET. After [15.65]

fabricated using complete silicidation of the poly-Si gate electrode to form NiSi [15.40]. Selectively implanting either P and As or B dopants into the n- and p-poly-Si before silicidation resulted in a work function difference between nMOS and pMOS of 0.45 eV. Complete conversion of the gate poly-Si to nickel silicide has also been demonstrated for FD–SOI devices with mesa isolation [15.49]. An FD–SOI device with mesa isolation is essentially a planar FD–SOI device with some gate wrapping around the sides of the channel, creating a "one-gate-plus" structure as illustrated in Fig. 15.16. Wrapping the gate around the channel increased the inversion charge at the channel sidewalls, and the resulting stress in the channel due to the Ni-rich silicide electrode increased electron mobility by up to 22%. Heavy extension implants were implemented to reduce pMOS Vt from mid-gap, but the nMOS devices had a high Vt (∼0.35 V) that degraded device performance. Complete silicidation of the poly-Si gate electrode to form CoSi2 has also been performed to produce metal-gated SON transistors [15.65]. The metallurgical contact of the source/drain regions to the silicon substrate enables complete silicidation of the poly-Si gate without overrunning the channel or the source/drain implants. Figure 15.17 is a TEM image of a SON

L. Colombo et al.

Fig. 15.18. HR-TEM image illustrating the cross-section at the gate edge of a VRG device with a HfO2 high-k gate dielectric. After [15.30]

MOSFET. These CoSi2-gated SON devices exhibited reasonable immunity to SCEs (DIBL of 60 mV and sub-threshold slope of 73 mV/dec). Integration of high-k gate dielectrics into multiple-gated devices is reported less often in the literature than integration of metal gate electrodes. The integration of ALD HfO2 and Al2O3 into VRG devices is one instance of high-k gate dielectrics implemented in quadruple-gated devices [15.30]. Although these devices were not specifically designed for fully-depleted operation, the concept can clearly be extended to this end. The VRG device utilizes an epitaxially grown channel surrounded by a gate that is defined by the selective removal of a thin film. This device is perhaps the most demanding from a film conformality point of view, since the gate dielectric must be deposited into severe reentrant corners as illustrated in Fig. 15.18. This sole demonstration of high-k gate dielectrics in VRG devices suffered from high threshold voltages, likely arising from high fixed charge (on the order of 10¹² cm⁻²) in the high-k dielectrics.
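
To give a feel for why a fixed charge on the order of 10¹² cm⁻² matters, the following is a minimal sketch of the textbook charge-sheet estimate |ΔVt| = q·N_f / C_ox. It rests on assumptions that are not taken from [15.30]: the charge is treated as a sheet located at the dielectric/silicon interface, and the equivalent oxide thickness of 1.5 nm is a hypothetical value chosen only for illustration. The function name is likewise illustrative.

```python
# Sketch: magnitude of the threshold-voltage shift caused by a sheet of
# fixed charge in the gate stack, |dVt| = q * N_f / C_ox.
# Assumptions (not from the source): the charge sits at the dielectric/Si
# interface, and the stack has a hypothetical EOT of 1.5 nm.

Q_E = 1.602176634e-19      # elementary charge, C
EPS_0 = 8.8541878128e-14   # vacuum permittivity, F/cm
K_SIO2 = 3.9               # relative permittivity of SiO2


def vt_shift_from_fixed_charge(n_fixed_cm2: float, eot_nm: float) -> float:
    """Return |dVt| in volts for a fixed-charge density given in cm^-2."""
    eot_cm = eot_nm * 1e-7                 # nm -> cm
    c_ox = K_SIO2 * EPS_0 / eot_cm         # gate capacitance per area, F/cm^2
    return Q_E * n_fixed_cm2 / c_ox


if __name__ == "__main__":
    shift_v = vt_shift_from_fixed_charge(1e12, 1.5)
    print(f"|dVt| is about {shift_v * 1e3:.0f} mV")
```

Under these assumptions the shift comes out at several tens of millivolts, and it scales linearly with both the charge density and the EOT. The sign of the shift depends on the polarity of the charge, which the estimate above deliberately leaves out.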

15.6 Conclusions

The replacement of nitrided silicon dioxide by a high-k gate dielectric is proving to be as difficult as originally expected. The semiconductor industry has made significant progress in selecting a high-k gate dielectric that is stable on Si, but at this time there are no products with high-k gate dielectrics. High-k gate dielectrics have been physically integrated into the bulk "conventional" CMOS flow. However, issues such as the flat-band offset, channel mobility degradation (especially for the simple binary oxides), and excessive poly-Si depletion for pMOS devices make it difficult for the devices to simply meet device performance expectations. It is becoming clear that, in order to achieve the performance goals using bulk Si CMOS devices, the semiconductor industry will have to implement some kind of metal gate. Metal gate development is slowly ramping up as the industry realizes the need to achieve further scaling. However, a number of significant issues remain to be addressed, from material selection to integration, both for midgap metal electrodes and especially for dual work function metals.

References

15.1. Agnello PD (2002) “Process Requirements for Continuing Scaling of CMOS – The Need and Prospects for Atomic Level Manipulation.” IBM Journal of Research and Development 46, 2/3: 317–338
15.2. Bagchi S, Grant JM, Chen J, Samavedam S, Huang F, Tobin P, Conner J, Prabhu L, and Tiner M, (2000) in Fully depleted SOI devices with TiN gate and elevated source-drain structures. Technical Digest of IEEE International Electron Device Meeting, pp. 56–57
15.3. Bao TI, Ko CC, Song JY, Li LP, Lu HH, Lu YC, Chen YH, Jang SM, and Liang MS, (2002) in 90 nm Generation Cu/CVD Low-k (k < 2.5) Interconnect Technology. Technical Digest of IEEE International Electron Device Meeting, pp. 583–586
15.4. Buchanan DA, Gusev EP, Cartier E, Okorn-Schmidt H, Rim K, Gribelyuk MA, Mocuta A, Ajmera A, Copel M, Guha S, Bojarczuk N, Callegari A, D’Emic C, Kozlowski P, Chan K, Fleming RJ, Jamison PC, Brown J, and Arndt R, (2000) in 80 nm poly-silicon gated n-FETs with ultrathin Al2O3 gate dielectric for ULSI applications. Technical Digest of IEEE International Electron Device Meeting, pp. 223–226
15.5. Callegari A, Gousev E, Zabel T, Lacey D, Gribelyuk M, and Jamison P (2002) “Thermal stability of polycrystalline silicon/metal oxide interfaces.” Applied Physics Letters 81, 22: 4157–4158
15.6. Chambers JJ, Rotondaro ALP, Bevan MJ, Visokay MR, and Colombo L (2002) “Effect of Composition and Post-deposition Annealing on the Etch Rate of Hafnium and Zirconium Silicates in Dilute HF.” Proceedings of the Electrochemical Society, Cleaning Technology in Semiconductor Device Technology VII, Fall 2001 PV-26, 359–363
15.7. Chang L, Tang S, King T-J, Bokor J, and Hu C, (2000) in Gate length scaling and threshold voltage control of double-gate MOSFETs. Technical Digest of IEEE International Electron Device Meeting, pp. 719–722
15.8. Chang S, Lee J, and Shin H (2002) “Gate Induced Drain Leakage Currents in Metal Oxide Semiconductor Field Effect Transistors with High-k Dielectric.” Japanese Journal of Applied Physics 41, 7A: 4432–4435
15.9. Chatterjee A, Chapman RA, Dixit G, Kuehne J, Hattangady S, Yang H, Brown GA, Aggarwal R, Erdogan U, He Q, Hanratty M, Rogers D, Murtaza S, Fang SJ, Kraft R, Rotondaro ALP, Hu JC, Terry M, Lee W, Fernando C, Konecni A, Wells G, Frystak D, Bowen C, Rodder M, and Chen I-C, (1997) in Sub-100 nm Gate Length Metal Gate NMOS Transistors Fabricated by a Replacement Gate Process. Technical Digest of IEEE International Electron Device Meeting, pp. 821–824

15.10. Chatterjee A, Chapman RA, Joyner K, Otobe M, Hattangady S, Bevan M, Brown GA, Yang H, He Q, Rogers D, Fang SJ, Kraft R, Rotondaro ALP, Terry M, Brennan K, Aur S-W, Hu JC, Tsai H-L, Jones P, Wil G, Aoki M, Rodder M, and Chen I-C, (1998) in CMOS Metal Replacement Gate Transistors using Tantalum Pentoxide Gate Insulator. Technical Digest of IEEE International Electron Device Meeting, pp. 777–780
15.11. Chau R, Kavalieros J, Doyle B, Murthy A, Paulsen N, Lionberger D, Barlage D, Arghavani R, Roberds B, and Doczy M, (2001) in A 50 nm depleted-substrate CMOS transistor (DST). Technical Digest of IEEE International Electron Device Meeting, pp. 231–234
15.12. Chau R, Doyle B, Kavalieros J, Barlage D, Murthy A, Doczy M, Rios R, Linton T, Arghavani R, Jin B, Datta S, and Hareland S, (2002) in Advanced Depleted-Substrate Transistors: Single-Gate, Double-Gate and Tri-Gate. International Conference on Solid State Devices and Materials, pp. 68–69
15.13. Chen J, Maiti B, Connelly D, Mendocino M, Huang F, Adetutu O, Yu Y, Weddington D, Wu W, Candelaria J, Dow D, Tobin P, and Mogab J, (1999) in 0.18 µm Metal Gate Fully-Depleted SOI MOSFETs for Advanced CMOS Applications. Symposium on VLSI Technology Digest of Technical Papers, pp. 25–26
15.14. Cheng B, Cao M, Rao R, Inani A, Vande Voorde P, Greene WM, Stork JMC, Yu Z, Zeitzoff PM, and Woo JCS (1999) “The impact of high-k gate dielectrics and metal gate electrodes on sub-100 nm MOSFETs.” IEEE Transactions on Electron Devices 46, 7: 1537–1544
15.15. Cheng B, Maiti B, Samavedam S, Grant J, Taylor B, Tobin P, and Mogab J (2001) in Metal Gates for Advanced Sub-80nm SOI CMOS Technology, 2001 IEEE International Conference, pp. 91–92
15.16. Choi C-H, Chidambaram PR, Khamankar R, Machala CF, Yu Z, and Dutton RW (2002) “Dopant Profile and Gate Geometric Effects on Polysilicon Gate Depletion in Scaled MOS.” IEEE Electron Device Letters 49, 7: 1227–1231
15.17. Choi C-H, Chidambaram PR, Khamankar R, Machala CF, Yu Z, and Dutton RW (2002) “Gate Length Dependent Polysilicon Depletion Effects.” IEEE Electron Device Letters 23, 4: 224–226
15.18. Choi Y-K, Chang L, Ranade P, Lee J-S, Ha D, Balasubramanian S, Agarwal A, Ameen M, King T-J, and Bokor J, (2002) in FinFET process refinements for improved mobility and gate work function engineering. Technical Digest of IEEE International Electron Device Meeting, pp. 259–262
15.19. Choi Y-K, King T-J, and Hu C (2002) “Nanoscale CMOS spacer FinFET for the terabit era.” IEEE Electron Device Letters 23, 1: 25–27
15.20. Colinge JP, Park JT, and Colinge CA (2002) “SOI devices for sub-0.1 µm gate lengths.” Microelectronics 1, 109–113
15.21. Copel M, Gribelyuk M, and Gusev E (2000) “Structure and stability of ultrathin zirconium oxide layers on Si (001).” Applied Physics Letters 76, 4: 436–438
15.22. Doris B, Ieong M, Zhu H, Zhang Y, Steen M, Natzle W, Callegari S, Narayanan V, Cai J, Ku SH, Jamison P, Li Y, Ren Z, Ku V, Boyd D, Kanarsky T, D’Emic C, Newport M, Dobuzinsky D, Deshpande S, Petrus J, Jammy R, and Haensch W, (2003) in Device Design Considerations for Ultra-Thin SOI MOSFETS. Technical Digest of IEEE International Electron Device Meeting, pp. 631–635

15.23. Ducroquet F, Achard H, Coudert F, Prévitali B, Lugand J-F, Ulmer L, Farjot T, Gobil Y, Heitzmann M, Tedesco S, Nier M-E, and Deleonibus S (2001) “Full CMP Integration of CVD TiN Damascene Sub-0.1-µm Metal Gate Devices For ULSI Applications.” IEEE Transactions on Electron Devices 48, 8: 1816–1821
15.24. Fung SKH, Zamdmer N, Oldiges PJ, Sleight J, Mocuta A, Sherony M, Lo S-H, Joshi R, Chuang CT, Yang I, Crowder S, Chen TC, Assaderaghi F, and Shahidi G, (2000) in Controlling floating-body effects for 0.13 µm and 0.10 µm SOI CMOS. Technical Digest of IEEE International Electron Device Meeting, pp. 231–234
15.25. Goodwin CA and Brossman JW (1982) “MOS Gate Oxide Defects Related to Treatment of Silicon Nitride Coated Wafers Prior to Local Oxidation.” Journal of the Electrochemical Society 129, 5: 1066–1070
15.26. Graff K, (1995) Metal Impurities in Silicon-Device Fabrication (Springer-Verlag, Heidelberg, Germany, 1995)
15.27. Gusev E, Buchanan DA, Cartier E, Kummar A, DiMaria D, Guha S, Callegari A, Zafar S, Jamison PC, Neumayer DA, Copel M, Gribelyuk M, Okorn-Schmidt H, D’Emic C, Kozlowski P, Chan K, Bojarczuk N, Ragnarson L-Å, Ronsheim P, Rim K, Fleming RJ, Mocuta A, and Ajmera A, (2001) in Ultrathin High-k Gate Stacks for Advanced CMOS Devices. Technical Digest of IEEE International Electron Device Meeting, pp. 451–454
15.28. Hendrix BC, Borovik AS, Xu C, Roeder JF, Baum TH, Bevan MJ, Visokay MR, Chambers JJ, Rotondaro ALP, Bu H, and Colombo L (2002) “Composition control of Hf1−xSixO2 films deposited on Si by chemical-vapor deposition using amide precursors.” Applied Physics Letters 80, 13: 2362–2364
15.29. Hergenrother JM, Monroe D, Klemens FP, Kornblit A, Weber GR, Mansfield WM, Baker MR, Baumann FH, Bolan KJ, Bower JE, Ciampa NA, Cirelli RA, Colonell JI, Eaglesham, Frackoviak DJ, Gossmann J, Green HJ, and Hill ML, (1999) in The Vertical Replacement-Gate (VRG) MOSFET: a 50-nm vertical MOSFET with lithography-independent gate length. Technical Digest of IEEE International Electron Device Meeting, pp. 75–78
15.30. Hergenrother JM, Wilk GD, Nigam T, Klemens FP, Monroe D, Silverman PJ, Sorsch TW, Busch B, Green ML, Baker MR, Boone T, Bude MK, Ciampa NA, Ferry EJ, Fiory AT, Hillenius SJ, Jacobson DC, Johnson RW, and Kalava, (2001) in 50 nm vertical replacement-gate (VRG) nMOSFETs with ALD HfO2 and Al2O3 gate dielectrics. Technical Digest of IEEE International Electron Device Meeting, pp. 3.1.1–3.1.4
15.31. Hobbs C, Tseng H, Reid K, Taylor B, Dip L, Hebert L, Garcia R, Hegde R, Grant J, Gilmer D, Franke A, Dhandapani V, Azrak M, Prabhu L, Rai R, Bagchi S, Conner J, Backer S, Dumbuya F, Nguyen B, and Tobin P, (2001) in 80 nm Poly-Si Gate CMOS with HfO2 Gate Dielectric. Technical Digest of IEEE International Electron Device Meeting, pp. 651–654

15.32. Hornung B, Khamankar R, Niimi H, Goodwin M, Robertson L, Miles D, Kirkpatrick B, AlShareef H, Varghese A, Bevan MJ, Nicollian P, Chidambaram PR, Chakravarthi S, Gurba A, Zhang X, Blatchford J, Smith B, Lu JP, Deloach J, Rathsack B, Bowen C, Thakar G, Machala C, and Grider T, (2003) in A High-Performance 90 nm Logic Technology with a 37 nm Gate Length, Dual Plasma Nitrided Gate Dielectric and Differential Offset Spacer. Symposium on VLSI Technology Digest of Technical Papers, pp. 85–86
15.33. Hu JC, Yang H, Kraft R, Rotondaro ALP, Hattangady S, Lee WW, Chapman RA, Chao C-P, Chatterjee A, Hanratty M, Rodder M, and Chen I-C, (1997) in Feasibility of Using W/TiN as Metal Gate for Conventional 0.13 µm CMOS Technology and Beyond. Technical Digest of IEEE International Electron Device Meeting, pp. 825–828
15.34. Istratov AA, Hieslmair H, and Weber ER (2000) “Iron Contamination in Silicon Technology.” Applied Physics A 70, 489–534
15.35. Istratov AA and Weber ER (2002) “Physics of Copper in Silicon.” Journal of the Electrochemical Society 149, 1: G21–G30
15.36. ITRS, (2003) in International Technology Roadmap for Semiconductors. Semiconductor Industry Association, http://public.itrs.net, 2003 updated edition, (181 Metro Drive, Suite 450, San Jose, CA, 95510)
15.37. Jeon TS, White JM, and Kwong DL (2001) “Thermal stability of ultrathin ZrO2 films prepared by chemical vapor deposition on Si(100).” Applied Physics Letters 78, 3: 368–370
15.38. Kakumu M and Hashimoto K, (1984) in Work Function Controlled Silicide Technology. Symposium on VLSI Technology Digest of Technical Papers, pp. 30–31
15.39. Kakumu M and Matsunaga Ji, (1985) in Lightly Impurity Doped (LD) Mo Silicide Gate Technology. Technical Digest of IEEE International Electron Device Meeting, pp. 415–418
15.40. Kedzierski J, Nowak E, Kanarsky T, Zhang Y, Boyd D, Carruthers R, Cabral C, Amos R, Lavoie C, Roy R, Newbury J, Sullivan E, Benedict J, Saunders P, Wong K, Canaperi D, Krishnan M, Lee K-L, Rainey BA, Fried D, Cottrell P, Wong H-SP, Ieong M, and Haensch W, (2002) in Metal-gate FinFET and fully-depleted SOI devices using total gate silicidation. Technical Digest of IEEE International Electron Device Meeting, pp. 247–250
15.41. Kedzierski J, Boyd D, Ronsheim P, Zafar S, Newbury J, Ott J, Jr. CC, Ieong M, and Haensch W, (2003) in Threshold voltage control in NiSi-gated MOSFETs through silicidation induced impurity segregation (SIIS). Technical Digest of IEEE International Electron Device Meeting, pp. 315–318
15.42. Kedzierski J, Boyd D, Zhang Y, Steen M, Jamin FF, Benedict J, Ieong M, and Haensch W (2003) “Issues in NiSi-gated FDSOI device integration.” Technical Digest of IEEE International Electron Device Meeting, pp. 441–444

15.43. Kim Y, Gebara G, Freiler M, Barnett J, Riley D, Chen J, Torres K, Lim J, Foran B, Shaapur F, Agarwal A, Lysaght P, Brown GA, C. Y, Borthakur S, Li H-J, Nguyen B, Zeitzoff P, Bersuker G, Derro D, Bergmann R, Murto RW, Hou A, Huff HR, Shero E, Pomarede C, Givens M, Mazanec M, and Werkhoven C, (2001) in Conventional n-Channel MOSFET Devices Using Single Layer HfO2 and ZrO2 as High-k Gate Dielectrics with Polysilicon Gate Electrode. Technical Digest of IEEE International Electron Device Meeting, pp. 455–458
15.44. Kittl JA, Lauwers A, Chamirian O, Van Dal M, Akheyar A, Richard O, Lisoni JG, De Potter M, Lindsay R, and Maex K (2003) in Silicides for 65 nm CMOS and Beyond, CMOS Front-End Materials and Process Technology (to be published)
15.45. Kooi E, Van Lierop JG, and Appels JA (1976) “Formation of Silicon Nitride at a Si-SiO2 Interface During Local Oxidation of Silicon and During Heat-Treatment of Oxidized Silicon in NH3 Gas.” Journal of the Electrochemical Society 123, 7: 1117–1120
15.46. Koyama M, Suguro K, Yoshiki M, Kamimuta Y, Koike M, Ohse M, Hongo C, and Nishiyama A (2001) in Thermally Stable Ultra-Thin Nitrogen Incorporated ZrO2 Gate Dielectric Prepared by Low Temperature Oxidation of ZrN. Technical Digest of IEEE International Electron Device Meeting, pp. 459–462
15.47. Koyama M, Kaneko A, Ino T, Koike M, Kamata Y, Iijima R, Kamimuta Y, Takashima A, Suzuki M, Hongo C, Inumiya S, Takayanagi M, and Nishiyama A, (2002) in Effects of Nitrogen in HfSiON Gate Dielectric on the Electrical and Thermal Characteristics. Technical Digest of IEEE International Electron Device Meeting, pp. 849–852
15.48. Krivokapic Z, Maszara W, Achutan K, King P, Gray J, Sidorow M, Zhao E, Zhang J, Chan J, Marathe A, and Lin M-R, (2002) in Nickel silicide metal gate FDSOI devices with improved gate oxide leakage. Technical Digest of IEEE International Electron Device Meeting, pp. 271–274
15.49. Krivokapic Z, Moroz V, Maszara W, and Lin M-R, (2003) in Locally Strained Ultra-Thin Channel 25 nm Narrow FDSOI Devices with Metal Gate and Mesa Isolation. Technical Digest of IEEE International Electron Device Meeting, pp. 445–448
15.50. Krug C, Da Rosa EBO, De Almeida RMC, Morais J, Baumvol IJR, Salgado TDM, and Stedile FC (2000) “Atomic Transport and Chemical Stability During Annealing of Ultrathin Al2O3 Films on Si.” Physical Review Letters 85, 19: 4120–4123
15.51. Lin R, Lu Q, Ranade P, King T-J, and Hu C (2002) “An Adjustable Work Function Technology Using Mo Gate for CMOS Devices.” IEEE Electron Device Letters 23, 1: 49–51
15.52. Lindsay R, Pawlak B, Kittl JA, Henson K, Torregiani C, Giangrandi S, Surdeanu R, Vandervorst W, Mayur A, Ross J, McCoy SP, Gelpey J, Elliott K, Pages X, Satta A, Lauwers A, Stolk P, and Maex K (2003) in A Comparison of Spike, Flash, SPER and Laser Annealing for 45 nm CMOS, CMOS Front-End Materials and Process Technology (to be published)

15.53. Lu JP, Miles D, Zhao J, Gurba A, Xu Y, Lin C, Hewson M, Ruan J, Tsung L, Kuan R, Grider T, Mercer D, and Montgomery C, (2002) in A Novel Nickel SALICIDE Process Technology for CMOS Devices with sub-40 nm Physical Gate Length. Technical Digest of IEEE International Electron Device Meeting, pp. 371–374
15.54. Lu Q, Lin R, Ranade P, King T-J, and Hu C, (2001) in Metal Gate Work Function Adjustment for Future CMOS Technology. Symposium on VLSI Technology Digest of Technical Papers, pp. 45–46
15.55. Maiti B, Tobin PJ, Hobbs C, Hegde RI, Huang F, O’Meara DL, Jovanovic D, Mendicino M, Chen J, Connelly D, Adetutu O, Mogab J, Candelaria J, and La LB, (1998) in PVD TiN metal gate MOSFETs on bulk silicon and fully depleted silicon-on-insulator (FDSOI) substrates for deep subquarter micron CMOS technology. Technical Digest of IEEE International Electron Device Meeting, pp. 781–784
15.56. Maitra K and Misra V (2003) “A Simulation Study to Evaluate the Feasibility of Midgap Workfunction Metal Gates in 25 nm Bulk CMOS.” IEEE Electron Device Letters 24, 11: 707–709
15.57. Maszara WP, Krivokapic Z, King P, Goo J-S, and Lin M-R, (2002) in Transistors with Dual Work Function Metal Gates by Single Full Silicidation (FUSI) of Polysilicon. Technical Digest of IEEE International Electron Device Meeting, pp. 367–370
15.58. McCoy SP, Gelpey J, Elliott K, and Gable KA (2003) in Flash-Assist RTP USJ Source Drain Extension Junction Formation and Characterization, Seventh International Workshop on: Fabrication, Characterization, and Modeling of Ultra-Shallow Doping Profiles in Semiconductors, pp. 104–110
15.59. Mertens PW, Meuris M, Schmidt HF, Verhaverbeke S, Heyns MM, Carr P, Graf D, Schnegg A, Kubota M, Dillenbeck K, and De Blanc R (1993) in Critical Aspects of Wafer Cleaning and Gate Oxide Integrity, Proceedings of the Electrochemical Society, Crystalline Defects and Contamination: Their Impact and Control in Device Manufacturing, pp. 87–102
15.60. Misra V, Zhong H, and Lazar H (2002) “Electrical Properties of Ru-Based Alloy Gate Electrodes for Dual Metal Gate Si-CMOS.” IEEE Electron Device Letters 23, 6: 354–356
15.61. Mohammadi F and Saraswat KC (1981) “N-Channel MOSFETs with WSi2 Gate.” IEEE Electron Device Letters 2, 2: 24–25
15.62. Mohapatra NR, Desai MP, Narendra SG, and Rao VR (2002) “The effect of high-K gate dielectrics on deep submicrometer CMOS device and circuit performance.” IEEE Transactions on Electron Devices 49, 5: 826–831
15.63. Mohapatra NR, Desai MP, and Rao VR, (2003) in Detailed analysis of FIBL in MOS transistors with high-k gate dielectrics. International Conference on VLSI Design, pp. 99–104
15.64. Monfray S, Skotnicki T, Morand Y, Descombes S, Coronel P, Mazoyer P, Harrison S, Ribot P, Talbot A, Dutartre D, Haond M, Palla R, Le Friec Y, Leverd F, Nier ME, Vizioz C, and Louis D, (2002) in 50 nm - gate all around (gaa) - silicon on nothing (son) - devices: a simple way to cointegration of gaa transistors within bulk mosfet process. Symposium on VLSI Technology Digest of Technical Papers, pp. 108–109

15.65. Monfray S, Skotnicki T, Tavel B, Morand Y, Descombes S, Talbot A, Dutartre D, Jenny C, Mazoyer P, Palla R, Leverd F, Le Friec Y, Pantel R, Haond M, Charbuillet C, Vizioz C, Louis D, and Buffet N, (2002) in SON (Silicon-On-Nothing) P-MOSFETs with totally silicided (CoSi2) polysilicon on 5 nm-thick Si-films: the simplest way to integration of metal gates on thin FD channels. Technical Digest of IEEE International Electron Device Meeting, pp. 263–266
15.66. Morais J, Da Rosa EBO, Miotti L, Pezzi RP, Baumvol IJR, Rotondaro ALP, Bevan MJ, and Colombo L (2001) “Stability of Zirconium Silicate Films on Si Under Vacuum and O2 Annealing.” Applied Physics Letters 78, 17: 2446–2448
15.67. Nandakumar M, Chatterjee A, Sridhar S, Joyner K, Rodder M, and Chen I-C, (1998) in Shallow Trench Isolation for Advanced ULSI CMOS Technologies. Technical Digest of IEEE International Electron Device Meeting, pp. 133–136
15.68. Nishi Y and Doering R, (2000) Handbook of Semiconductor Manufacturing Technology (Marcel Dekker Inc, New York, NY, USA, 2000)
15.69. Nishinohara KT, Akasaka Y, Saito T, Yagishita A, Murakoshi A, Suguro K, and Arikado T (2001) “Surface Channel Metal Gate Complementary MOS with Light Counter Doping and Single Work Function Gate Electrode.” Japanese Journal of Applied Physics 40, 2603–2606
15.70. Onishi K, Kang CS, Choi R, Cho H, Gopalan S, Nieh R, Krishnan S, and Lee JC, (2002) in Effects of High Temperature Forming Gas Anneal on HfO2 MOSFET Performance. Symposium on VLSI Technology Digest of Technical Papers, pp. 22–23
15.71. Park C, Kim S, Wang Y, Talwar S, and Woo JCS, (2001) in 50 nm SOI CMOS Transistors with Ultra Shallow Junction Using Laser Annealing and Pre-amorphization Implantation. Symposium on VLSI Technology Digest of Technical Papers, pp. 69–70
15.72. Park JT, Colinge JP, and Diaz CH (2001) “Pi-gate SOI MOSFET.” IEEE Electron Device Letters 22, 8: 405–406
15.73. Perkins CM, Triplett BB, McIntyre PC, Saraswat KC, and Shero E (2002) “Thermal stability of polycrystalline silicon electrodes on ZrO2 gate dielectrics.” Applied Physics Letters 81, 8: 1417–1419
15.74. Pidin S, Morisaki Y, Sugita Y, Aoyama T, Irino K, Nakamura T, and Sugii T, (2002) in Low Standby Power CMOS with HfO2 Gate Oxide for 100-nm Generation. Symposium on VLSI Technology Digest of Technical Papers, pp. 28–29
15.75. Polishchuk I and Hu C, (2001) in Electron wavefunction penetration into gate dielectric and interface scattering – An alternative to surface scattering model. Symposium on VLSI Technology Digest of Technical Papers, pp. 51–52
15.76. Polishchuk I, Ranade P, King T-J, and Hu C (2001) “Dual Work Function Metal Gate CMOS Transistors by Ni–Ti Interdiffusion.” IEEE Electron Device Letters 22, 9: 444–446
15.77. Qin M, Poon VMC, and Ho SCH (2001) “Investigation of Polycrystalline Nickel Silicide Films as a Gate Material.” Journal of the Electrochemical Society 148, 5: G271–G274

15.78. Quevedo-Lopez M, El-Bouanani M, Addepalli S, Duggan JL, Gnade BE, Wallace RM, Visokay MR, Douglas M, and Colombo L (2001) “Hafnium Interdiffusion Studies from Hafnium Silicate into Silicon.” Applied Physics Letters 79, 25: 4192–4194
15.79. Quevedo-Lopez MA, El-Bouanani M, Wallace RM, and Gnade BE (2002) “Wet chemical etching studies of Zr and Hf-silicate gate dielectrics.” Journal of Vacuum Science and Technology A 20, 6: 1891–1897
15.80. Quirk M and Serda J, (2001) Semiconductor Manufacturing Technology (Prentice-Hall Inc, Upper Saddle River, NJ, USA, 2001)
15.81. Ranade P, Takeuchi H, King T-J, and Hu C (2001) “Work Function Engineering of Molybdenum Gate Electrodes by Nitrogen Implantation.” Electrochemical and Solid State Letters 4, 11: G85–G87
15.82. Ranade P, Choi Y-K, Ha D, Agarwal A, Ameen M, and King T-J, (2002) in Tunable Work Function Molybdenum Gate Technology for FDSOI-CMOS. Technical Digest of IEEE International Electron Device Meeting, pp. 363–366
15.83. Ranade P, Choi Y-K, Ha D, Takeuchi H, and King T-J, (2003) in Metal Gate Technology for Fully Depleted SOI CMOS. 2003 AVS 4th International Conference on Microelectronics and Interfaces, pp. 131–133
15.84. Ren Z, Solomon PM, Kanarsky T, Doris B, Dokumaci O, Oldiges P, Roy RA, Jones EC, Ieong M, Miller RJ, Haensch W, and Wong H-SP, (2002) in Examination of hole mobility in ultra-thin body SOI MOSFETs. Technical Digest of IEEE International Electron Device Meeting, pp. 51–54
15.85. Roh K, Youn S, Yang S, and Roh Y (2001) “Tungsten silicide for the alternate gate metal in metal-oxide-semiconductor devices.” Journal of Vacuum Science and Technology A 19, 4: 1562–1565
15.86. Rotondaro ALP, Vandamme E, Vanhellemont J, Simoen E, Heyns MM, and Claeys C (1995) “The Impact of Fe and Cu Contamination in the 10¹² at/cm² Range on the Performance of Junction Diodes.” Solid State Phenomena 47–48, 397–402
15.87. Rotondaro ALP, Hurd TQ, Kaniava A, Vanhellemont J, Simoen E, Heyns MM, Claeys C, and Brown GA (1996) “Impact of Fe and Cu Contamination on the Minority Carrier Lifetime of Silicon Substrates.” Journal of the Electrochemical Society 143, 9: 3014–3019
15.88. Rotondaro ALP, Hames GA, and Yocum T (1999) in Use of H2SO4 for Etch Rate and Selectivity Control of Boiling H3PO4, Proceedings of the Electrochemical Society, Cleaning Technology in Semiconductor Device Technology VI, Fall 1999, pp. 385–390
15.89. Rotondaro ALP, Visokay MR, Chambers JJ, Shanware A, Khamankar R, Bu H, Laaksonen RT, Tsung L, Douglas M, Kuan R, Bevan MJ, Grider T, McPherson J, and Colombo L, (2002) in Advanced CMOS transistors with a novel HfSiON gate dielectric. Symposium on VLSI Technology Digest of Technical Papers (Honolulu, HI), pp. 148–149
15.90. Rotondaro ALP, Visokay MR, Shanware A, Chambers JJ, and Colombo L (2002) “Carrier Mobility in MOSFETs Fabricated with Hf-Si-O-N Gate Dielectric, Polysilicon Gate Electrode, and Self-Aligned Source and Drain.” IEEE Electron Device Letters 23, 10: 603–605
15.91. Runyan WR and Bean KE, (1990) Semiconductor Integrated Circuit Processing Technology (Addison-Wesley Publishing Company, Inc, New York, NY USA, 1990)

15.92. Samavedam SB, La LB, Smith J, Dakshina-Murthy S, Luckowski E, Schaeffer J, Zavala M, Martin R, Dhandapani V, Triyoso D, Tseng HH, Tobin PJ, Gilmer DC, Hobbs C, Taylor WJ, Grant JM, Hegde RI, Mogab J, Thomas C, Abramowitz P, Moosa M, Conner J, Jiang J, Arunachalam M, Sadd M, Nguyen B-Y, and White B, (2002) in Dual-Metal Gate CMOS with HfO2 Gate Dielectric. Technical Digest of IEEE International Electron Device Meeting, pp. 433–436
15.93. Shankoff TA, Sheng TT, Haszko SE, Marcus RB, and Smith TE (1980) “Bird’s Beak Configuration and Elimination of Gate Oxide Thinning Produced During Selective Oxidation.” Journal of the Electrochemical Society 127, 1: 216–222
15.94. Sim JH, Wen HC, Lu JP, and Kwong DL (2003) “Dual Work Function Metal Gates Using Full Nickel Silicidation of Doped Poly-Si.” IEEE Electron Device Letters 24, 10: 631–633
15.95. Taur Y and Ning TH, (1998) Fundamentals of Modern VLSI Devices (Cambridge University Press, Cambridge, UK, 1998)
15.96. Tavel B, Skotnicki T, Pares G, Carrière N, Rivoire M, Leverd F, Julien C, Torres J, and Pantel R, (2001) in Totally Silicided (CoSi2) Polysilicon: a novel approach to very low-resistive gate (∼2 Ω/□) without metal CMP nor etching. Technical Digest of IEEE International Electron Device Meeting, pp. 37.5.1–37.5.4
15.97. Vandooren A, Egley S, Zavala M, Franke A, Barr A, White T, Samavedam S, Mathew L, Schaeffer J, Pham D, Conner J, Dakshina-Murthy S, Nguyen B-Y, White B, Orlowski M, and Mogab J, (2002) in Ultra-thin body fully-depleted SOI devices with metal gate (TaSiN) gate, high K (HfO2) dielectric and elevated source/drain extensions. IEEE International SOI Conference, pp. 205–206
15.98. Vandooren A, Barr A, Mathew L, White TR, Egley S, Pham D, Zavala M, Samavedam S, Schaeffer J, Conner J, Nguyen B-Y, White BE, Jr., Orlowski MK, and Mogab J (2003) “Fully-depleted SOI devices with TaSiN gate, HfO2 gate dielectric, and elevated source/drain extensions.” IEEE Electron Device Letters 24, 5: 342–344
15.99. Vandooren A, Thean AVY, Y. Du IT, Hughes J, Stephens T, Huang M, Egley S, Zavala M, Sphabmixay K, Barr A, White T, Samavedam S, Mathew L, Schaeffer J, Triyoso D, Rossow M, Roan D, Pham D, Rai R, Nguyen B-Y, White B, Orlowski M, Duvallet A, Dao T, and Mogab J, (2003) in Mixed-signal performance of Sub-100nm fully-depleted SOI devices with metal gate, high-k (HfO2) dielectric and elevated Source/Drain extensions. Technical Digest of IEEE International Electron Device Meeting, pp. 975–977
15.100. Visokay MR, Chambers JJ, Rotondaro ALP, and Colombo L (2002) “Application of HfSiON as a Gate Dielectric Material.” Applied Physics Letters 80, 18: 3183–3185
15.101. Visokay MR, Chambers JJ, Rotondaro ALP, Kuan R, Tsung L, Douglas M, Bevan MJ, Bu H, Shanware A, and Colombo L, (2002) in Properties of Hf-based oxide and oxynitride thin films. Proceedings of the AVS Third International Conference on Microelectronics and Interfaces (AVS, Santa Clara, CA, USA), pp. 127–129

15.102. Weast RC, (1978) CRC Handbook of Chemistry and Physics, 58th edn (CRC Press Inc., Cleveland, OH, USA, 1978)
15.103. Wilk GD, Wallace RM, and Anthony JM (2001) “High-k gate dielectrics: Current status and materials properties considerations.” Journal of Applied Physics 89, 10: 5243–5275
15.104. Wilk GD, Green ML, Ho M-Y, Busch BW, Sorsch TW, Klemens FP, Brijs B, van Dover RB, Kornblit A, Gustafsson T, Garfunkel E, Hillenius S, Monroe D, Kalavade P, and Hergenrother JM, (2002) in Improved Film Growth and Flatband Voltage Control of ALD HfO2 and Hf-Al-O with n+ poly-Si Gates using Chemical Oxides and Optimized Post-Annealing. Symposium on VLSI Technology Digest of Technical Papers, pp. 88–99
15.105. Wolf S, (1990) Silicon Processing for the VLSI Era - Process Integration, Vol. 2 (Lattice Press, Sunset Beach, CA, USA, 1990)
15.106. Wolf S, (1995) Silicon Processing for the VLSI Era - The Submicron MOSFET, Vol. 3 (Lattice Press, Sunset Beach, CA, USA, 1995)
15.107. Wong H-SP, Frank DJ, and Solomon PM (1998) in Device design considerations for double-gate, ground-plane, and single-gated ultra-thin SOI MOSFET’s at the 25 nm channel length generation, Electron Devices Meeting, pp. 407–410
15.108. Wong H-SP (2002) “Beyond the conventional transistor,” IBM Journal of Research and Development 46, 2/3: 133–168
15.109. Xuan P and Bokor J (2003) “Investigation of NiSi and TiSi as CMOS Gate Materials.” IEEE Electron Device Letters 24, 10: 634–636
15.110. Yagishita A, Saito T, Nakajima K, Inumiya S, Akasaka Y, Ozawa Y, Hieda K, Tsunashima Y, Suguro K, Arikado T, and Okumura K (2000) “High Performance Damascene Metal Gate MOSFET’s for 0.1 µm Regime.” IEEE Transactions on Electron Devices 47, 5: 1028–1034
15.111. Yagishita A, Saito T, Nakajima K, Inumiya S, Matsuo K, Shibata T, Tsunashima Y, Suguro K, and Arikado T (2001) “Improvement of Threshold Voltage Deviation in Damascene Metal Gate Transistors.” IEEE Transactions on Electron Devices 48, 8: 1604–1611
15.112. Yamaguchi T, Iijima R, Ino T, Nishiyama A, Satake H, and Fukushima N, (2002) in Additional Scattering Effects for Mobility Degradation in Hf-silicate Gate MISFETs. Technical Digest of IEEE International Electron Device Meeting, pp. 621–624
15.113. Yamaguchi T, Ino T, Satake H, and Fukushima N (2003) in Novel Dielectric Breakdown Model of Hf-Silicate with High Temperature Annealing, IEEE International Reliability Physics Symposium, pp. 34–40
15.114. Yang F-L, Chen H-Y, Chen F-C, Huang C-C, Chang C-Y, Chiu H-K, Lee C-C, Chen C-C, Huang H-T, Chen C-J, Tao H-J, Yeo Y-C, Liang M-S, and Hu C, (2002) in 25 nm CMOS Omega FETs. Technical Digest of IEEE International Electron Device Meeting, pp. 255–258
15.115. Yang H, Brown GA, Hu JC, Lu JP, Kraft R, Rotondaro ALP, Hattangady SV, Chen I-C, Luttmer JD, Chapman RA, Tsai HL, Amirhekmat B, and Magel LK, (1997) in A Comparison of TiN Processes for CVD W/TiN Gate Electrode on 3 nm Gate Oxide. Technical Digest of IEEE International Electron Device Meeting, pp. 459–462

15 CMOS IC Fabrication Issues for High-k Gate Dielectric

481

15.116. Yang IY, Chen K, Smeys P, Sleight J, Lin L, Leong M, Nowak E, Fung S, Maciejewski E, Varekamp P, Chu W, Park H, Agnello P, Crowder S, Assaderaghi F, and Su L, (1999) in Sub-60 nm physical gate length SOI CMOS. Technical Digest of IEEE International Electron Device Meeting, pp. 431–434 15.117. Yeap GC-F, Krishnan S, and Lin M-R (1998) “Fringing-induced barrier lowering (FIBL) in sub-100 nm MOSFETs with high-K gate dielectrics.” Electronics Letters 34, 11: 1150–1152 15.118. Zhong H, Hong S-N, Suh Y-S, Lazar H, Heuss G, and Misra V, (2001) in Properties of Ru-Ta Alloys as Gate Electrodes For NMOS and PMOS Silicon Devices. Technical Digest of IEEE International Electron Device Meeting, pp. 467–470

16 Characterization and Metrology of Medium Dielectric Constant Gate Dielectric Films

A.C. Diebold and W.W. Chism

The characterization and optical metrology of the physical properties of thin dielectric films and film stacks composed of metal oxides with static dielectric constants from ε ≅ 4 to ε ≅ 25 are reviewed. These dielectric films are designed to mimic the behavior of silicon dioxide gate dielectrics of 1 to 2 nm thickness, with the elevated dielectric constant allowing a gate dielectric physical thickness sufficient to suppress tunneling. The materials used to fabricate these films include metal oxides of zirconium and hafnium, and metal oxide alloys with silicon oxide or aluminum oxide, also known as “silicates” or “aluminates”, respectively. Whenever possible, we distinguish which films are appropriate for use in transistors manufactured at high volume. We review recent work aimed at determining the structure of these films, particularly through transmission electron microscopy and associated electron spectroscopies such as electron energy loss spectroscopy. The presence of crystalline structure or structures in these films is generally expected under high temperature processing, and has a significant impact on their electrical and optical properties. Available characterization data are presented and correlated with the optical frequency dielectric function determined by spectroscopic ellipsometry. In addition to presenting known structure-function relationships, metrology topics are also covered. In-line physical measurement of film thickness relies on the optical models used for fitting the dielectric function. The best optical models for use in process control are described and reviewed.

16.1 Introduction

The practical aspects of the dielectric function of silicon dioxide have been well understood for decades. Modern transistors operate in the 10⁶ to 10⁹ Hz frequency range, where the dielectric function of many materials, including silicon dioxide, is relatively constant [16.1]. Thus the value of the dielectric function below 1 GHz is typically referred to as the dielectric “constant” ε (≅ 3.92 for SiO2). As described in the other chapters in this volume, scaling transistors to gate lengths well below 100 nm would require the thickness of silicon dioxide gate dielectric films to shrink below 1.5 nm. At 1 nm, silicon dioxide is only a few atomic layers thick, and electrons easily tunnel between the gate electrode and the channel region of the transistor. The need

to suppress tunneling currents has led to the use of silicon oxynitride films and the projected need for elevated dielectric constant films. One view of the “ideal” gate dielectric is that it is an amorphous film of uniform composition with a defect-free interface [16.2]. One can match the capacitance of a given thickness of silicon dioxide gate dielectric with a thicker layer of the higher κ material, thereby reducing tunneling leakage current. If the higher dielectric constant layer is deposited directly on silicon so that there is no interfacial layer, the higher κ film thickness is determined simply by multiplying the required silicon dioxide thickness by the ratio of the high κ film’s static dielectric constant to that of silicon dioxide. For example, a tantalum pentoxide film (ε ∼ 25) of thickness 6.4 nm would give a capacitance equivalent to that of a silicon dioxide film 1 nm thick [16.2]. Equivalently, a 6.4 nm tantalum pentoxide film has an equivalent oxide thickness (EOT) of 1 nm. The dielectric constant of the high κ material is typically a sensitive function of process conditions.

In reality, the replacement gate stack is also expected to require a silicon dioxide interfacial layer between the silicon substrate and the higher κ film. The capacitance of the interface film might comprise half or more of the overall capacitance, so control of this interfacial thickness is a critical requirement. Ideally, a manufacturing-ready physical metrology would directly measure EOT. Electrical metrology would measure capacitive equivalent thickness (CET), which is slightly different from EOT. In-line physical metrology is accomplished with ellipsometry, which does not directly measure EOT. The static dielectric constant cannot be determined from the dielectric constant at optical frequencies. In addition, determination of the dielectric constant of very thin films at optical frequencies is quite difficult.
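The thickness scaling described above can be illustrated with a short sketch. It is not taken from the chapter: the function names are hypothetical, the SiO2 dielectric constant is the value of about 3.92 quoted in the introduction, and the interfacial layer is treated as an ideal capacitor in series with the high κ layer, which is an assumption.

```python
# Minimal sketch of equivalent oxide thickness (EOT) bookkeeping.
# All names are illustrative; layers are modeled as ideal series capacitors.

EPS_SIO2 = 3.92  # static dielectric constant of SiO2 quoted in the text


def physical_thickness_for_eot(eot_nm, eps_high_k, eps_ox=EPS_SIO2):
    """High-k thickness (nm) with the same capacitance as eot_nm of SiO2."""
    return eot_nm * eps_high_k / eps_ox


def eot_of_stack(t_high_k_nm, eps_high_k, t_interface_nm=0.0,
                 eps_interface=EPS_SIO2, eps_ox=EPS_SIO2):
    """EOT (nm) of a high-k layer in series with an optional interfacial layer."""
    eot_high_k = t_high_k_nm * eps_ox / eps_high_k
    eot_interface = t_interface_nm * eps_ox / eps_interface
    return eot_high_k + eot_interface


def interface_eot_fraction(t_high_k_nm, eps_high_k, t_interface_nm,
                           eps_interface=EPS_SIO2, eps_ox=EPS_SIO2):
    """Fraction of the total EOT contributed by the interfacial layer."""
    total = eot_of_stack(t_high_k_nm, eps_high_k, t_interface_nm,
                         eps_interface, eps_ox)
    return (t_interface_nm * eps_ox / eps_interface) / total


if __name__ == "__main__":
    # Tantalum pentoxide example from the text: eps ~ 25, target EOT = 1 nm.
    print(round(physical_thickness_for_eot(1.0, 25.0), 1), "nm of Ta2O5")
    # Hypothetical stack: 4 nm of an eps = 20 film on 0.5 nm of SiO2.
    print(round(eot_of_stack(4.0, 20.0, t_interface_nm=0.5), 3), "nm EOT")
```

In the hypothetical stack, the 0.5 nm interfacial layer contributes a substantial share of the total EOT despite being the thinner layer, which is why the text stresses control of the interfacial thickness.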
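Ellipsometry measures film thickness and optical response, which may be correlated with the static dielectric constant. Thus, the need to control EOT in production has been transformed from the need to determine the physical thickness of the silicon dioxide gate dielectric into the need to determine the physical thickness and the optical response of the high κ gate dielectric film simultaneously. One approach would be to determine the dielectric constant and physical thickness with sufficient precision to provide EOT values directly. The EOT of the higher κ film is calculated as follows: $\mathrm{EOT} = (\varepsilon_{\mathrm{OX}} / \varepsilon_{\text{HI-K}}) \times T_{\mathrm{PHYS}}$, where $\varepsilon_{\text{HI-K}} = 1 + 4\pi\kappa_{\text{HI-K}}$. The precision of the EOT value for high κ films is given by propagation of the dielectric constant measurement precision and the physical thickness precision through the equation for EOT: $\sigma_{\mathrm{EOT}} = (\varepsilon_{\mathrm{OX}} / \varepsilon_{\text{HI-K}}) \times T_{\mathrm{PHYS}} \times (\sigma_{\text{HI-K}} / \varepsilon_{\text{HI-K}} + \sigma_{\mathrm{PHYS}} / T_{\mathrm{PHYS}})$. This is just the EOT multiplied by the sum of the relative uncertainties in dielectric constant and physical thickness. Now, from the requirement that the precision-to-tolerance ratio $P/T = 1/10 \equiv 6\sigma / (\mathrm{UL} - \mathrm{LL})$, where UL and LL are the upper and lower process limits, and taking the process range UL − LL to be 10% of the EOT (the assumption implied by the factor $(0.1)^2$), we see that we must have $\sigma_{\mathrm{EOT}} \le (0.1)^2 \times \mathrm{EOT} / 6$, or $\sigma_{\mathrm{EOT}} \le \mathrm{EOT} / 600$. Thus, we will need $(\sigma_{\text{HI-K}} / \varepsilon_{\text{HI-K}} + \sigma_{\mathrm{PHYS}} / T_{\mathrm{PHYS}}) \le 1/600$. As mentioned, the gate dielectric may also have a stacked structure, such as a very thin silicon dioxide or silicon oxynitride interfacial layer. In this case the equation for EOT becomes $\mathrm{EOT} = (\varepsilon_{\mathrm{OX}} / \varepsilon_{\text{HI-K}}) \times T_{\mathrm{PHYS}} + T_{\mathrm{INT}}$. The interfacial layer capacitance will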

typically comprise a large portion of the total gate capacitance although its physical thickness may be less than 0.5 nm. Thus, control of the thickness of this interfacial layer below the high κ film is an additional key metrology challenge. Due to the issues discussed above, process control will be done on the basis of the physical thickness of the interface and high κ films.

A principal advantage of using silicon dioxide is that the interfacial region between the crystalline silicon and the silicon dioxide can be produced with very few defects [16.3]. Such defects trap electrical charge and greatly alter the properties of a transistor. Decades of experience have gone into understanding how to process silicon dioxide on silicon with reduced defect densities. To date, interfacial defects have been an issue for the higher dielectric constant films due to the lack of good matching between the metal oxide and silicon. An interfacial layer of silicon oxynitride has been used to provide low defect densities and possibly stress relief. This chapter will attempt to present structure-function relationships for stacked films of higher κ materials on oxynitride interfacial layers.

The difficulties associated with interpreting the interfacial structure of Si/SiO2 observed using transmission electron microscopy (TEM) and related microscopy methods and associated spectroscopies, such as high angle annular dark field (HA–ADF) scanning transmission electron microscopy (STEM) with electron energy loss spectroscopy (EELS), have been reviewed [16.4–16.6]. Muller has described the use of HA–ADF STEM with EELS for understanding the transition from silicon to silicon dioxide and has applied this methodology to the interfacial characteristics of higher dielectric constant films [16.5, 16.6]. Muller and others have carefully evaluated the impact of interfacial micro-roughness on the interpretation of high-resolution TEM images.
Using this background, the available data on interfacial layers between higher κ films and silicon are summarized. Knowledge of the physical structure of these films is essential for determining the true optical properties (the dielectric function, or dispersion, from the UV to the IR) of the higher κ films.

Single wavelength ellipsometry and spectroscopic ellipsometry are the fundamental technologies used inside semiconductor fabrication clean rooms to characterize the thickness and uniformity of silicon dioxide and silicon oxynitride films. This technology will remain the primary means of characterizing gate dielectric thickness as IC manufacturers transition to the medium κ metal oxides. Ellipsometry determines film thickness and optical properties by regression analysis using film stack models. These models include the layer structure, the layer thicknesses, and the dielectric function of each layer. Each layer’s thickness and dispersion are adjusted to provide the best match to the experimental data. The optical properties of each layer are contained in its dielectric function, which may be written $\varepsilon(\lambda) = \varepsilon_1 + i\varepsilon_2 \equiv \tilde{n}^2 = (n^2 - k^2) + 2ink$. This equation defines the relationship between the real and imaginary parts of the dielectric function and, equivalently, the real and imaginary parts of the refractive index, n and k [16.1]. The absorption coefficient is given by

$\alpha = 4\pi k/\lambda = 2\pi\varepsilon_2/(n\lambda)$, where the second form follows from $\varepsilon_2 = 2nk$. The inverse of this quantity is the penetration depth. Silicon dioxide is transparent from the near-infrared (NIR) to the ultraviolet (UV), so ε2 = 0 throughout this wavelength range. Al2O3 is also transparent in the NIR-UV. Thus, dispersions for these films are relatively simple and may be accurately represented with the well-known Cauchy or Sellmeier dispersion forms. However, for ZrO2, HfO2, and their alloys with either Al2O3 or SiO2, significant optical absorption occurs in the UV. Therefore, these films require the use of dispersion parameterizations that account for non-zero absorption. The optimum dispersion models will correctly describe the material and have a minimum number of parameters that are fit to the data.

Although medium κ processes are typically designed to deposit amorphous layers, many films are polycrystalline or a mixture of amorphous and crystalline regions. It is also possible to initiate crystallization during the thermal cycles the films undergo as a routine part of chip fabrication. In general, there are significant differences between the dielectric functions of the crystalline and amorphous states, and these can be tracked through changes in the optical properties of the film. However, to be useful in a production environment, the dispersion for the film of interest must be known over the process window.

The measurement of interfacial layer thickness and/or composition below higher dielectric constant materials represents another challenging task. In particular, sensitivity to the thickness and nitrogen content of the oxynitride layer below the high κ film is difficult to obtain with adequate precision. This is due to factors such as light attenuation in the high κ film and loss of precision as the number of fitting parameters is increased. This loss of precision generally occurs when fit parameters become highly correlated.
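The relations between n, k, the dielectric function, and the absorption coefficient given above can be sketched numerically. The function names and sample values below are hypothetical. The Cauchy form shown, n(λ) = A + B/λ² + C/λ⁴ with λ in micrometres, is one common convention for a transparent film and is assumed here rather than taken from the chapter.

```python
import math


def dielectric_from_nk(n, k):
    """Return (eps1, eps2) from the complex index n + ik: eps = (n + ik)**2."""
    return n * n - k * k, 2.0 * n * k


def absorption_coefficient(k, wavelength_nm):
    """Absorption coefficient alpha = 4*pi*k/lambda, in 1/nm."""
    return 4.0 * math.pi * k / wavelength_nm


def penetration_depth_nm(k, wavelength_nm):
    """Penetration depth 1/alpha in nm; infinite for a transparent film (k = 0)."""
    if k == 0:
        return math.inf
    return 1.0 / absorption_coefficient(k, wavelength_nm)


def cauchy_index(wavelength_nm, a, b=0.0, c=0.0):
    """Cauchy dispersion n = A + B/lam^2 + C/lam^4 (lam in micrometres, k = 0)."""
    lam_um = wavelength_nm / 1000.0
    return a + b / lam_um**2 + c / lam_um**4


if __name__ == "__main__":
    # Hypothetical absorbing film in the UV: n = 2.0, k = 0.1 at 200 nm.
    eps1, eps2 = dielectric_from_nk(2.0, 0.1)
    alpha = absorption_coefficient(0.1, 200.0)
    # Both forms of alpha agree: 4*pi*k/lam equals 2*pi*eps2/(n*lam).
    assert math.isclose(alpha, 2.0 * math.pi * eps2 / (2.0 * 200.0))
    print(eps1, eps2, penetration_depth_nm(0.1, 200.0))
```

A transparent film such as SiO2 needs only the real Cauchy index, whereas an absorbing film needs a parameterization that also returns a non-zero k.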
One of the advantages of using ellipsometry and capacitance-voltage measurements in-line is that they average over large areas (relative to atomic dimensions) and are interpreted in terms of slab layers [16.4]. The film stacks typically have ∼ 3 to 5 nm of higher κ material over 1 nm of oxynitride. It is interesting to note that, when compared to silicon dioxide films of equivalent capacitance, the thicker high κ part of the stack is easier to measure optically and more difficult to measure electrically [16.2]. In the first section of this chapter, the characterization of structural and chemical properties of the higher κ film stacks is reviewed. In the second section, the optical modeling of higher dielectric constant films is reviewed. The last section of the chapter describes in-line optical and electrical metrology for these films.

16.2 Structural and Chemical Characterization of Medium ε Film Stacks

This section is divided into subsections covering characterization methods, structure-function relationships, and a review of results of medium κ characterization.

16.2.1 Characterization Methods

Transmission Electron Microscopy and Scanning TEM

The width and crystalline structure of very thin films are typically characterized by transmission electron microscopy (TEM) and complementary methods such as high angle annular dark field scanning transmission electron microscopy (HA–ADF STEM). In order to fully understand the existing high magnification characterization of medium ε dielectric films, it is useful to describe the fundamental differences between the imaging modes. Next, the difficulty in determining the exact film thickness is reviewed. The issues include the impact of imperfections in the TEM itself on film thickness metrology as well as the effect of interfacial roughness on locating the interface itself [16.7].

Three different TEM imaging modes have been used in the characterization of thin oxide films [16.4–16.6]. The electron lens system of a state-of-the-art TEM can be used to obtain bright field/dark field, phase contrast (or high resolution TEM, i.e., HRTEM), and high angle annular dark field scanning TEM (HA–ADF–STEM) images. In bright field (BF) mode, images are obtained by passing the electrons directly through the sample. The contrast in a BF image comes from diffraction and/or scattering of the transmitted electron beam by either local density or mass (Z) differences [16.8]. A dark field (DF) image is formed by selecting electrons that have been diffracted upon passage through the sample [16.8]. Several DF images can be formed by selecting one or more diffracted beams using specific diffraction conditions (e.g., 001 or 011, etc.). HRTEM (phase contrast) images are formed when two or more of the diffracted beams interfere to form an image [16.8]. Thus one is not seeing atoms, but the interference of the directly transmitted beam with diffracted beams to produce intensity minima at the atom positions. Many TEM images published in the literature, such as transistor cross-sections, use this mode.
HRTEM images obtained as “on-axis lattice fringe images” from the silicon substrate can be used as a high-resolution calibration of the device dimensions. The term on-axis lattice fringe image refers to a phase contrast image formed when the sample is tilted so that the lattice planes used to form the diffraction beams are parallel to the electron optics axis. Image formation in Scanning TEM (STEM) is different from the high resolution, phase contrast images discussed above. STEM images are formed by monitoring the transmitted electron intensity as a finely focused electron beam is scanned across a very thin sample as in TEM. Both BF and DF images can be formed in STEM. One advantage of STEM is that materials characterization maps or line scans by EDS and EELS can be done at high resolution [16.8]. TEM/STEM systems equipped with thermal field emission sources have the required brightness and current stability for materials characterization. The latest generation STEM systems are equipped with high angle annular dark field (HA–ADF) detectors that provide very high resolution images [16.5, 16.6]. The HA–ADF STEM image is an intensity image and not a phase contrast image. There is a one

to one correspondence between the atom-like spots in the HA–ADF–STEM image and the positions of atoms. The atom-like spots are due to columns of atoms [16.4–16.6]. In addition, the form of the image does not vary strongly with defocus, thickness, or lens conditions. Although the contrast in these images is a result of atomic number differences, image interpretation requires consideration of subtle physics [16.4–16.6]. The objective is to separate the elastically scattered electrons from the diffracted beams, which interfere with the elastically scattered electrons and make image interpretation difficult. Diffraction effects are reduced in ADF imaging because the annular detector averages over a wide range of scattering angles. The HA–ADF STEM image is a map of the atomic scattering factor, which is a strong function of the atomic number [16.4, 16.5]. Comparison of the spatial resolution of HRTEM and HA–ADF–STEM shows that the HA–ADF mode can resolve smaller features than conventional HRTEM if only the Scherzer resolution limit is considered and focal-series reconstruction is neglected [16.5]. Careful studies of silicon dioxide based transistor gate dielectric thickness and interfacial layer thickness have used HA–ADF [16.5, 16.6].

Although the dimensions in HRTEM images can be calibrated using the lattice-like image, new work suggests that the accuracy must be improved. Taylor et al. have simulated images of thin silicon dioxide films between crystalline silicon using different sample thicknesses, defocus values, tilts, and spherical aberrations of the TEM lens for film thicknesses of 1.056 nm and 1.629 nm [16.9]. This work concludes that thickness errors of 10% are common and that only a TEM with no spherical lens aberration could measure film thickness exactly if single lattice images are considered [16.10]. Taylor et al. also simulated the ability of TEM to locate structural defects.
They conclude that very thin samples (< 10 nm) are required for detection of even large defects (> 4 nm) [16.10]. Preparation of extremely thin samples does not seem to be compatible with the time constraints of routine TEM sample preparation in the semiconductor industry.

There are two approaches to reducing or even removing the effect of lens aberration that are worthy of further discussion. HRTEM imaging can be improved either through new lens design or by a method known as focal series reconstruction. New lens technology that actively corrects for spherical aberration has recently been introduced, and improved resolution has been reported using this technology [16.11]. When this is combined with an electron monochromator, TEM and STEM may come close to their ultimate resolution. However, the effect of a reduced energy spread is different for HR–TEM and ADF–STEM. Pennycook has indicated that TEM and STEM have important differences in their sensitivity to energy spread near the limit of resolution that deserve mention here [16.12]. The important beams in STEM travel at equal and opposite angles through the lens and undergo the same phase change due to instability (energy spread). In HR–TEM, the beams forming the high-resolution image travel at angles and then interfere with the direct beam to form the image [16.12]. The beams traveling at angles

Fig. 16.1. High Resolution TEM image resolution can be improved using Focal Series Reconstruction. Figure courtesy Michael O’Keefe [16.11]

have a different phase change than the direct beam. The STEM images do not have the exponential damping envelope that HR–TEM images do when approaching the limits of resolution [16.12].

Focal series reconstruction produces images of the electron exit wave with greatly reduced spherical aberration when compared to a single image recorded at Scherzer defocus [16.13, 16.14]. Focal series reconstruction is now a reliable process that starts with a series of images of the same location taken under different focus conditions [16.11], and then combines them into a single phase and amplitude image of the electron exit wave. The process is shown in Fig. 16.1. HRTEM combined with focal-series reconstruction can be used to correct for the effects of spherical aberration and provide sub-Ångstrom resolution down to about 0.085 nm and possibly below [16.14, 16.15]. We also note that focal series reconstruction is an important breakthrough that greatly improves HRTEM imaging. Although one might expect that focal series reconstruction HRTEM, Cs-corrected STEM, and HA–ADF STEM will give identical dielectric thickness information on the same sample, agreement requires careful attention to sample thickness and experimental setup. Before discussing this point further, the impact of interfacial roughness and sample thickness (not film thickness) is presented below.

Fig. 16.2. Lattice fringes in HRTEM appear in microrough interfaces thus making rough interfaces appear smooth. Figure courtesy Ed Principe and used with permission of Cambridge University Press [16.22]

Interfacial analysis is a critical aspect of medium ε film studies. Therefore, it is important to understand what is actually seen in the image. Baumann et al. have simulated the effect of interfacial roughness on single HRTEM images to illustrate the complications in interpretation [16.7]. It is important to note that the physics of HRTEM image formation is significantly different from that of HA–ADF–STEM, as discussed more extensively below. In HRTEM, lattice fringes are observed in micro-rough interfaces, which can make the interface appear to be smooth, as shown in Fig. 16.2. Note that even from single lattice images one could deduce the proper structure of the interface if thickness and defocus are chosen suitably and one relates the image by theory to the crystal structure [16.16]. However, there is usually a lack of uniqueness involved in the process. This work illustrates the impact of the sample thickness and defocus on the image of the interfacial region, and supports the conclusions of other studies [16.7]. Baumann reported that the impact of the TEM sample thickness variation includes an apparent shift in the HRTEM-determined oxide thickness of up to 0.3 nm for a surface roughness having a magnitude of 1.1 nm peak to peak and a single period of 7.6 nm [16.7]. Baumann et al.’s simulation results are shown in Fig. 16.3. More recent investigation of the effect of interfacial roughness on thickness determination is discussed below. Another possible situation is that the oxide thickness remains constant but the surface roughness is mimicked at both interfaces [16.7]. This situation is shown in Fig. 16.4. The point is that even the smoothest interfaces will be imaged in HRTEM as an average of the roughness that is dependent on the thickness of the TEM sample. Muller points out [16.5] that averaging of the roughness over the sample thickness still occurs in ADF imaging. Ross et al.
studied the oxidation of single-crystal silicon and the thickness of the interface before the widespread availability of HA–ADF STEM and focal series reconstruction [16.17]. The Fresnel method was used to study interfacial properties. Fresnel fringes are seen at the interface of every crystal with an amorphous layer. A series of images is obtained at different defocus values. These studies concluded that a 0.5 nm layer with intermediate stoichiometry is seen at the silicon-silicon dioxide interface for both wet and dry oxidation methods [16.17]. Stemmer has indicated the importance of Ross’s discussion of the impact of sample tilt on the observed interfacial thickness [16.18]. For example, for a 10 nm sample thickness, a tilt of 1 degree will result in a 0.17 nm observed extension of the crystalline region into the amorphous region [16.17].

Fig. 16.3. The effect of interfacial microroughness on a single interface is illustrated by the simulation of Baumann et al. [16.7]. Figure courtesy David Muller and used with permission of MRS

Fig. 16.4. The effect of interfacial microroughness at both interfaces is shown. Adapted from figures used by David Muller

A fundamental part of thickness determination by ADF–STEM and HR–TEM is determining how many atoms and how much crystalline order are required for each method to detect an ordered column of atoms. Recently, Kisielowski et al. compared the sensitivity of HA–ADF–STEM with that of HR–TEM at the National Center for Electron Microscopy [16.16]. Under ideal conditions (absence of an amorphous layer), the sensitivity of the One Angstrom HRTEM allows for the detection of single Si (Z = 14) atoms, while the sensitivity of HA–ADF–STEM was demonstrated to be good enough to detect single gold (Z = 79) atoms in thin samples with certainty. In the presence of correlated noise from amorphous SiO2, however, the HRTEM sensitivity is reduced, and an extrapolation of these data indicates that both methods may have comparable detection limits in this case. One could expect that roughly 5 Si atoms aligned in a column are required to generate a detectable signal above the noise level, but the atoms need not be next to each other in the sample. Muller has simulated HR–TEM and HA–ADF–STEM images of a thin sample ∼ 6 nm thick having interfacial roughness of the type where the atoms on the crystalline side of the interface were ordered [16.19]. Both methods observed the interface at the same location. Previous simulations had shown that HRTEM could underestimate silicon dioxide thickness by 0.3 to 0.6 nm [16.6,16.7].
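The tilt estimate quoted earlier (a 0.17 nm apparent extension for a 10 nm thick sample tilted by 1 degree) is consistent with a simple projection argument. The sketch below assumes, as a geometric simplification not stated in the chapter, that a flat interface viewed through a sample of thickness t and tilted by an angle θ is smeared laterally by t·tan θ. The function name is hypothetical.

```python
import math


def tilt_projection_nm(sample_thickness_nm, tilt_deg):
    """Apparent lateral smearing (nm) of a flat interface seen in projection.

    Assumes a simple geometric projection: extension = t * tan(tilt).
    """
    return sample_thickness_nm * math.tan(math.radians(tilt_deg))


if __name__ == "__main__":
    # Case quoted in the text: 10 nm thick cross-section, 1 degree tilt.
    print(f"{tilt_projection_nm(10.0, 1.0):.2f} nm")
    # Hypothetical thicker cross-section with the same tilt.
    print(f"{tilt_projection_nm(30.0, 1.0):.2f} nm")
```

Under this assumption the smearing scales linearly with sample thickness, which is one reason thin cross-sections matter for locating an interface to within a few tenths of a nanometer.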
Kisielowski’s and Muller’s most recent work now shows that, in the absence of disorder at the interface, HR–TEM and ADF–STEM should observe the same film thickness for a well prepared thin sample of Si/SiO2/poly Si. However, both agree that HA–ADF–STEM will be more sensitive to chemical roughness, such as diffusion of oxygen atoms into a column of Si atoms, and to disorder in the atomic columns [16.16,16.19]. When this type of disorder is present, the two methods will observe the interface at different locations, as shown in Fig. 16.5. Muller points out that STEM images are very sensitive to sample tilt when imaging atomic columns. Thus disorder in a column should make its observation more difficult. Below, we discuss the difficulties that interfacial roughness imposes on thickness determination and the results of a recent experimental comparison of these methods for thin silicon oxynitride films.

Fig. 16.5. The origin of the difference in thickness determination for HRTEM and HA–ADF–STEM is shown. In (a), the interface appears at the same location when the atomic columns are well ordered. In (b), the HA–ADF STEM image will show the interface at the mid-point of the rough area, while the lattice fringes in HRTEM images will make the interface appear to be at the edge of the roughness when the atomic columns are not well ordered or chemical disorder is present

Muller’s work has shown the advantages of using HA–ADF–STEM for interfacial chemical analysis and micro-roughness characterization [16.5, 16.6]. The localized nature of the STEM probe produces EELS spectra with spot sizes under 0.2 nm, resulting in the ability to study chemical changes across the interface. As mentioned above, when the sample is oriented along a crystallographic axis, HA–ADF STEM images columns of atoms. Although rough interfaces result in incomplete columns of atoms and blur the interface, the location of the interface is more readily observed in HA–ADF–STEM [16.5,16.6] than in conventional HRTEM when thicker samples are used (> 15 nm). The ability to use thicker samples is a useful attribute for routine interfacial characterization. Focal-series-reconstructed HRTEM also provides an excellent method of observing where the interface is located [16.20]. ADF mode also allows

Fig. 16.6. Thinner cross-sectional samples result in better observation of the interface for high resolution TEM images. Figure courtesy Ed Principe

for measurement of the interfacial roughness itself, but only if the samples are thin [16.6]. Thus, both HA–ADF STEM and HRTEM require thin samples to assess the interfacial roughness. In the HA–ADF STEM mode, electron energy loss spectroscopy (EELS) provides the ability to chemically map elements across interfacial regions. Muller has provided dramatic HA–ADF STEM images of the Si/SiO2 interface with point-by-point data of the EELS intensity for the O–K edge [16.5, 16.6]. X-ray photoelectron spectroscopy (XPS) spectra are mostly influenced by nearest neighbors [16.5, 16.6]. Muller has also measured the interfacial state of Al in Al2O3 on Si and of Zr in Zr-doped SiO2. It is important to note that EELS spectra at an interface might show both SiO2 and sub-oxide properties, which is a further indication of sample roughness.

Recently, Principe et al. [16.21] used HA–ADF STEM, HRTEM with focal series image reconstruction, and Cs-corrected STEM to study the thickness of silicon oxynitride. In addition, XPS was used to determine both thickness and nitrogen concentration. TEM data were obtained using extremely thin samples in order to minimize issues associated with interfacial roughness. HRTEM images of TEM samples ranging from ∼ 30 nm to 6 nm in thickness, prepared from the same oxynitride, are shown in Fig. 16.6. Preparation of 6 nm thick

Fig. 16.7. Comparison of HR–TEM and HA–ADF STEM images of the same silicon oxynitride sample. The HA–ADF STEM images indicate a thicker film than the HR–TEM image does. The EELS data show an interfacial region between the poly (crystalline)-silicon and the silicon oxynitride. These data represent the first comparison of gate dielectric thickness determination between both TEM imaging modes. The explanation of this unexpected difference is the focus of considerable discussion [16.22]. Figure courtesy Ed Principe

cross-sections is difficult and time consuming. Although the HA–ADF images initially showed a consistently greater dielectric layer thickness than the HR–TEM images obtained by focal series reconstruction [16.22], ADF–STEM and HR–TEM measure the same thickness when the appropriate aperture is selected [16.19]. Image reconstruction and HRTEM equipped with aberration correction gave the same dielectric thickness. It is interesting to note that the EELS data indicated that the interface between the poly-silicon and the silicon oxynitride is not sharp, because the nitrogen and oxygen signals extend past the start of the poly-silicon layer observed by HRTEM. This is illustrated in Fig. 16.7. The discrepancy observed between the HRTEM and HA–ADF–STEM determinations of dielectric thickness is probably due to the physical disorder (poorly aligned atomic columns) and chemical disorder [16.6,16.7,16.18–16.23] discussed above.

XPS determination of the thickness of silicon dioxide is based on knowledge of the mean free path of photoelectrons in that material [16.22]. Extension to silicon oxynitride is based on the approximation that the nitrogen concentration is less than 15% in all the films examined in that study [16.23]. The XPS thickness measurement matched the HRTEM thickness measurement. This is probably because a small amount of moisture can add a few tenths of a nanometer in thickness when amorphous or poly silicon is deposited. This discrepancy could be due to electron mean free path issues. Use of XPS for determination of nitrogen concentration is discussed below.

MEIS, RBS, and XPS

Medium energy ion scattering (MEIS), Rutherford backscattering (RBS), and x-ray photoelectron spectroscopy (XPS) have all been used to characterize medium κ dielectric films. Garfunkel recently reviewed the application of MEIS and XPS to thin gate oxide films [16.4]. In MEIS, ∼100 keV protons are scattered off an oriented sample and detected with high angular and energy resolution [16.4]. The use of a high energy resolution detector and 100 keV energies (instead of MeV energies) greatly increases surface sensitivity [16.4]. The physics of backscattering is well understood, and a depth profile of elemental constituents in the sample can be calculated from MEIS data. The use of protons as a probe means that elements heavier than hydrogen can be characterized in the sample [16.4]. XPS is used to determine elemental composition, and when ion sputtering is alternated with XPS analysis, depth profiles of elemental composition can be obtained. In XPS, a beam of x-rays ionizes the atoms in a sample and photoelectrons are emitted. The photoelectrons are energy analyzed to yield information about elemental composition and in some cases local chemical bonding. Because the mean free path of low energy electrons (10–1000 eV) is quite short, photoelectrons generated in XPS that arrive at the electron detector are predominantly emitted from atoms residing within the top ∼25 Å of a material; hence the surface sensitivity. From the known photon energy and the measured kinetic energy of a photoelectron, one can roughly determine the initial “binding energy” of the electron. Because each element has a unique set of electron binding energies, the electron energy spectrum of a material can be used to quantitatively determine its elemental composition. Furthermore, if the mean free path is known for electrons of a certain energy in a given material, the thickness of a layered film of known density can be determined. 
A closer analysis of the peak positions and shapes can be used to determine the oxidation state (chemical environment) of the atom. Angularly resolved photoelectron spectroscopy can be used to determine the valence band electronic structure of some crystalline solid materials and overlayers, and can also be used to determine a depth/compositional profile somewhat analogous to what MEIS and SIMS offer. Principe et al. have demonstrated that XPS can be used to determine nitrogen concentration with the precision required for manufacturing control [16.23]. The reported precision would allow for control of ∼0.1% nitrogen for films having 5 to 10% nitrogen.
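The two XPS relations described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used in the cited studies: the spectrometer work-function term, the single-exponential attenuation model for a buried layer's signal, the function names, and the numerical inputs are all assumptions introduced here.

```python
import math

# Al K-alpha photon energy in eV, a common laboratory XPS source.
AL_K_ALPHA_EV = 1486.6


def binding_energy_ev(photon_ev, kinetic_ev, work_function_ev=0.0):
    """Binding energy from energy conservation: BE = h*nu - KE - phi.

    work_function_ev (phi) is an assumed spectrometer correction.
    """
    return photon_ev - kinetic_ev - work_function_ev


def overlayer_thickness_nm(i_clean, i_covered, mfp_nm, emission_deg=0.0):
    """Overlayer thickness from attenuation of a substrate peak.

    Assumes I_covered = I_clean * exp(-d / (mfp * cos(theta))), with theta
    the emission angle measured from the surface normal.
    """
    if i_clean <= 0 or i_covered <= 0 or i_covered > i_clean:
        raise ValueError("need 0 < i_covered <= i_clean")
    cos_theta = math.cos(math.radians(emission_deg))
    return mfp_nm * cos_theta * math.log(i_clean / i_covered)


if __name__ == "__main__":
    # Illustrative numbers only, not measured values.
    print(binding_energy_ev(AL_K_ALPHA_EV, 1387.3))
    print(overlayer_thickness_nm(1.0, 0.30, mfp_nm=2.5))
```

The logarithm makes the thickness estimate increasingly sensitive to intensity noise as the substrate signal fades, which is consistent with the surface sensitivity noted in the text.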

Fig. 16.8. Grazing incidence x-ray reflectivity can be used to determine the thickness of multilayer samples. The XRR data was taken by R. Matyi for P. Lysaght

Grazing Incidence X-Ray Reflectivity (GI-XRR)

X-ray reflectivity is a powerful means of determining the thickness of thin films and their interfaces. GI-XRR can be used to determine the thickness of a single layer or of each film in a multi-layer sample. In laboratory-based GI-XRR, a well-collimated, monochromatic x-ray beam is reflected off a flat sample over a range of incident angles [16.4, 16.24–16.26]. The intensity behavior of the specular reflection is used for film thickness determination. Non-specular x-ray scattering can also be measured, and film and interface roughness analyzed. The use of this information is a critical part of careful analysis of the interfacial layer of higher κ films. Interference patterns in the form of intensity oscillations are formed when GI-XRR is done on single and multi-layer thin film samples. Reflectivity from a homogeneous substrate, a single layer, and a two-layer film is shown in Fig. 16.8. As the angle at which the X-rays strike the sample moves past the critical angle of total reflection, some X-rays will transmit through the first layer and reflect from the underlying interface. The reflected X-rays will constructively and destructively interfere as the angle changes. The thickness of a single-layer film can be determined from the angular difference between the peaks of subsequent intensity oscillations. The angle of these interference fringes is related to the sample thickness as follows for the mth interference maximum (reference):

$$\theta_m^2 = \theta_C^2 + m^2\,\frac{\lambda^2}{4d^2} \quad\text{and}\quad \theta_{m+1}^2 - \theta_m^2 = (2m+1)\,\frac{\lambda^2}{4d^2} \qquad (16.1)$$

Because the wavelength λ of monochromatic X-rays is accurately known (to 5 to 7 significant figures), the film thickness d can be accurately determined. θC is the critical angle below which X-rays are totally reflected [16.24]. Film stack measurement can also be done. Two periods of intensity oscillation are present when analyzing a two-layer film. The observed periods correspond to the cumulative paths from the top surface to successive interfaces [16.24]. Deslattes and Matyi indicate that, using these path lengths, the component layer thicknesses are determined as pairwise differences between the observed angular frequencies, beginning with that corresponding to the largest path. It is important to note that for film thickness, GI-XRR is independent of film composition. Density measurement is more difficult. Film density affects x-ray reflectivity in several ways. X-rays reflect from the electron density of a material. Each material has a critical angle below which x-rays completely reflect from a flat surface. Reflectivity decreases at angles larger than the critical angle due to x-ray absorption. The value of the critical angle depends on the surface electron density, which may not be identical to that of the bulk film. The decrease in reflectivity after the critical angle is due to the bulk density, and the critical angles are observed for each film in a film stack. Using dynamical x-ray theory, the overall decrease in reflectivity over the entire range of angles can be modeled and the bulk film density determined. The density determined in this way can be accurate to roughly 5%. Other properties can be determined, such as surface and interface roughness as discussed above. Non-specular x-ray scattering can provide critical information about the roughness of the interface layer. 
The dominant mechanism for diffuse scattering is microroughness at the interface between layers of different density. The other mechanisms include thermal diffuse scattering and Compton scattering [16.24]. The separation of diffuse or non-specular scattering from specular scattering requires the use of a well-collimated beam and appropriate optics. X-ray optics that illuminate the sample from many angles simultaneously and collect data from many angles simultaneously do not separate the diffuse scattering from the specular reflection. This type of optics is nevertheless very useful when data must be collected rapidly for in-line metrology. Diffuse scattering is that observed by rocking the sample while keeping the 2Θ angle fixed [16.24]. By measuring the diffuse scattering at a number of 2Θ angles, the interfacial roughness can be characterized over a wide range of spatial frequencies. The range of spatial wavelengths to which the x-ray scattering is sensitive depends on the angle of incidence and angle of scatter of the x-rays [16.21, 16.23]. In this way, the interfacial roughness can be compared to that obtained by other methods such as atomic force microscopy and optical scattering. The RMS microroughness measured from diffuse scattering provides an appropriate value for use in building interfacial models for specular reflectivity [16.24–16.26].
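As a worked illustration of Eq. (16.1), the sketch below recovers a single-layer thickness from fringe-maximum angles by fitting θm² against m² with a straight line, whose slope is λ²/(4d²) and whose intercept is θC². It assumes the fringe orders m are known, that angles are small and expressed in radians, and it uses synthetic data; the function names, the Cu Kα1 wavelength, and the critical-angle value are illustrative choices, not taken from the chapter.

```python
import math

CU_K_ALPHA1_NM = 0.15406  # Cu K-alpha1 wavelength in nm (assumed source)


def fringe_angles_rad(d_nm, theta_c_rad, orders, wavelength_nm=CU_K_ALPHA1_NM):
    """Synthetic fringe maxima from Eq. (16.1): theta_m^2 = theta_C^2 + m^2 lambda^2 / (4 d^2)."""
    step = wavelength_nm / (2.0 * d_nm)
    return [math.sqrt(theta_c_rad ** 2 + (m * step) ** 2) for m in orders]


def fit_thickness_nm(theta_rad, orders, wavelength_nm=CU_K_ALPHA1_NM):
    """Least-squares line through (m^2, theta_m^2); returns (d in nm, theta_C in rad)."""
    x = [m * m for m in orders]
    y = [t * t for t in theta_rad]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals lambda^2 / (4 d^2)
    intercept = mean_y - slope * mean_x  # equals theta_C^2
    d_nm = wavelength_nm / (2.0 * math.sqrt(slope))
    theta_c = math.sqrt(max(intercept, 0.0))
    return d_nm, theta_c


if __name__ == "__main__":
    orders = [1, 2, 3, 4, 5]
    angles = fringe_angles_rad(4.0, math.radians(0.22), orders)
    d_nm, theta_c = fit_thickness_nm(angles, orders)
    print(d_nm, math.degrees(theta_c))
```

Fitting all fringes at once, rather than differencing one adjacent pair, averages out errors in individual peak positions. Real data would still require the full reflectivity modeling described in the text to handle roughness and multilayer stacks.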

Fig. 16.9. Diffuse or non-specular scattering provides critical information on the roughness of interfacial layers. This information can be used to build a model for interpretation of GI-XRR data

GI-XRR is able to observe a deposited oxynitride film several angstroms thick, or a reactive layer that forms between the silicon substrate and the medium κ material. HfO2 forms a silicate interfacial layer while ZrO2 does not [16.24]. The interpretation of the nature of the interface layer is often controversial. Therefore, it is useful to illustrate the importance of including diffuse scattering information when building a model for higher κ materials. In Fig. 16.9, we show XRR data for a HfO2 film fit to a model with several interfaces.

16.2.2 Structure/Function Relationships

Metrology of medium κ materials requires models that predict optical and electrical properties. Therefore, understanding the dielectric function is a prerequisite for measurement of film thickness and composition. Both the dielectric properties and the propensity of amorphous medium κ materials to crystallize can be understood in terms of structure–function relationships. We begin this section by describing methods for estimating the dielectric constant in terms of crystallinity and composition. Then a classification scheme for amorphous films is presented as a means of predicting when a film might be prone to phase separating or crystallizing. Finally, a theoretical description of the network structure of glasses is used to understand the thermodynamic stability of silicates. A more detailed discussion of the material properties of high κ dielectrics and their application to capacitor and transistor structures is available in the outstanding review of Wilk, Wallace, and Anthony [16.27]. The crystalline structure of bulk samples of some of the crystalline medium κ materials is well known, and oxides from the same column of the periodic table are expected to have similar structures. Bulk amorphous samples of the same materials are expected to have lower static dielectric constants due to the enhanced ionic polarizability of crystalline samples [16.28]. The dielectric constant of Al2O3 changes from ∼9 to ∼11 upon crystallization [16.28]. The static dielectric constant is ε = 25 for ZrO2 and ε = 25–40 for HfO2. Bulk, crystalline ZrSiO4 is tetragonal with ε = 12.6 [16.28–16.32]. The crystal can be considered to have units of ZrO2 and SiO2. Each Zr and Si is bonded to four O atoms, forming chains of Zr–2O–Si–2O–Zr–2O–. Wilk et al. predict that Hf silicate should have the same structure and ε = 25–40 [16.27, 16.29, 16.30]. The above analysis predicts a linear change in dielectric constant as the medium κ material is mixed with the lower κ SiO2. 
Lucovsky and Rayner have discussed the unexpected enhancement of dielectric constant for amorphous (MO2)x(SiO2)y for M being Zr and Hf [16.31, 16.32]. The change in κ was found to be a function of Si–O–Zr vibrational mode concentration. The contribution of Si–O–Zr vibrational modes decreases with increase in Zr coordination (the amount of ZrO2). It is possible for (MO2)x(SiO2)y films to separate into regions of MO2 and SiO2 during annealing. The regions can be either amorphous, as reported by Rayner et al. for zirconium silicate with 23% zirconium [16.32], or have one or both regions crystalline. The FT–IR spectra and the dielectric constant are both expected to change. Lucovsky and Rayner have proposed a classification scheme for amorphous medium κ oxides and silicates [16.28]. The concept behind this scheme is that the propensity to crystallize is much higher for ionic oxides. The scheme uses Pauling's bond ionicity, fi = 1 − exp(−0.25(∆X)²), where ∆X is the Pauling electronegativity difference between oxygen and the metal atom, as a measure of oxide structure. Materials such as SiO2 are continuous random networks (crn), while ZrO2 and HfO2 are at the other extreme, being random close packed ionic structures (rcpis). Al2O3, Ta2O5, and Zr silicates

Table 16.1. Classification scheme for amorphous oxides (from Rayner [16.28])

Class                   Material                 ∆X      fi
CRN (∆X < 1.6)          GeO2                     1.43    0.40
                        SiO2                     1.54    0.45
MCRN (∆X = 1.6–2.1)     Al2O3                    1.84    0.57
                        Ta2O5                    1.94    0.61
                        Zr silicate, 10% ZrO2    1.61    0.48
                        Zr silicate, 25% ZrO2    1.71    0.52
                        Zr silicate, 35% ZrO2    1.78    0.55
                        Zr silicate, 50% ZrO2    1.88    0.59
RCPIC (∆X > 2.1)        ZrO2                     2.22    0.71
                        HfO2                     2.14    0.68
                        Y2O3                     2.22    0.71
                        La2O3                    2.34    0.75

have more ionicity than CRN oxides but not the ionicity of Zr and Hf oxides. These oxides are classified as modified CRN (mcrn). Rayner et al.’s classification scheme is listed in Table 16.1. FT–IR spectra are clearly affected by the structure of the oxide, with crystalline oxides having shifted and sharper absorption peaks. Wilk et al. point out that certain silicate compositions are thermodynamically stable, and thus less prone to phase separation and crystallization [16.27]. Phillips has developed a constraint theory approach to predicting the stability of ZrO2–SiO2 mixtures [16.33, 16.34]. Phillips describes the constraint theory of glasses in terms of a hierarchical arrangement of valence forces: bond stretching, bond bending, and then longer distance forces. Radial distribution functions from diffraction studies are used to determine when bond lengths or angles are fixed throughout a glass and thus remain intact. If this theory is applied to Zr silicate, the optimum composition of a thin film should be 3 to 5 atomic % Zr or 9 to 15 mol % ZrO2 [16.33].

16.2.3 Characterization Results for Medium κ

HRTEM has been used to characterize the microcrystalline nature of the medium κ thin films. The results show that surface conditions, such as the presence or absence of an oxide layer, play a critical role in the structural properties of the medium κ layer. Other non-optical characterization methods such as Medium Energy Ion Scattering Spectroscopy (MEIS), grazing

Fig. 16.10. HR–TEM image illustrating the stability of Hafnium and Zirconium Silicate after annealing to 850 °C. Figures courtesy Bob Wallace [16.29, 16.30] and used with permission of AIP

incidence x-ray reflectivity (GI-XRR), and x-ray diffraction (XRD) have also been used to characterize medium κ films [16.27–16.30, 16.35–16.40]. The non-optical characterization results are summarized below.

ZrO2

Copel et al. have shown that ZrO2 deposited by ALCVD on HF stripped Si(001) is amorphous with crystallite regions, while ZrO2 on oxidized Si(100) is polycrystalline [16.35]. Similar results have been reported by Tuominen [16.36]. Ramanathan et al. have reported that ZrO2 grown on silicon dioxide by a UV-assisted process has better electrical properties than thermally grown ZrO2 on silicon dioxide [16.37]. An EELS study of the thermally grown ZrO2 shows an oxygen deficiency [16.37].

HfO2

Gusev et al. observed amorphous films when HfO2 was deposited on interfacial layers of SiO2 by ALCVD [16.38, 16.39]. Poor nucleation resulted in incomplete coverage when HfO2 was deposited on HF stripped Si(001) [16.38].

ZrSiO4, HfSiO4, and (Zr or Hf)xSiyO4

Both Zr and Hf silicates are expected to be thermodynamically stable in certain composition ranges. The available data reflect this, and provide good

Fig. 16.11. HR–TEM image illustrating the stability of Hafnium Silicate after annealing to 1050 °C. Figures courtesy Bob Wallace [16.29, 16.30] and used with permission of AIP

examples of how TEM can provide important information about separation of the metal oxide and silicon dioxide. Wilk and Wallace have sputter deposited amorphous HfSiO4 on HF stripped Si(001) [16.29, 16.30]. The stoichiometry was found to vary with deposition conditions, and a 5 nm film of Hf6Si29O65 with an equivalent oxide thickness of 1.8 nm had ε ∼ 11. TEM micrographs are shown in Figs. 16.10 and 16.11. XPS and RBS were used to determine the stoichiometry. Rayner et al. have made SiO2-rich Zr silicate. Although they do not provide a value for ε, they do show the annealing-induced separation of Zr silicate into regions of amorphous ZrO2 and SiO2 [16.28]. Higher temperature anneals result in crystallization of the ZrO2 into a tetragonal phase. More recently, Muller and Wilk have found a Zr-free region at the silicon and poly-silicon interfaces of zirconium silicate with low levels of zirconium [16.40]. This film and the Zr-free interfacial regions were found to be stable when annealed to 1050 °C [16.40]. HA–ADF–STEM was used to characterize the interfacial region of the films.
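The tendency toward phase separation and crystallization summarized in this subsection ties back to the bond-ionicity classification of Table 16.1. The following minimal sketch evaluates the Pauling ionicity expression quoted in Sect. 16.2.2 and applies the ∆X thresholds from the table. The function names are hypothetical, and treating ∆X = 1.6 and ∆X = 2.1 as belonging to the middle class is an assumption, since the table does not say how boundary values are assigned.

```python
import math


def bond_ionicity(delta_x):
    """Pauling bond ionicity, f_i = 1 - exp(-0.25 * delta_x**2)."""
    return 1.0 - math.exp(-0.25 * delta_x ** 2)


def classify_oxide(delta_x):
    """Class label using the Delta-X ranges listed in Table 16.1."""
    if delta_x < 1.6:
        return "CRN"    # continuous random network
    if delta_x <= 2.1:
        return "MCRN"   # modified continuous random network
    return "RCPIC"      # random close packed ionic class, as labeled in the table


if __name__ == "__main__":
    # Delta-X values copied from Table 16.1.
    for name, dx in [("SiO2", 1.54), ("Al2O3", 1.84), ("HfO2", 2.14)]:
        print(name, round(bond_ionicity(dx), 2), classify_oxide(dx))
```

Rounded to two decimals, the computed ionicities reproduce the fi column of Table 16.1 for the ∆X values listed there.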

16.3 Optical Models for Medium κ Films

As mentioned in the introduction, optical models for medium κ films require dielectric functions with non-zero imaginary part in the UV. A number of

dielectric function parameterizations have been used to represent the dispersions of medium κ dielectrics. These include the Cauchy, Sellmeier, Lorentz (or harmonic) oscillator, Tauc-Lorentz, and effective medium approximations [16.41–16.45]. Jellison provides an excellent discussion of these in his review of optical metrology [16.42]. The most commonly encountered model incorporating absorption is the Lorentz oscillator (LO) model, which possesses an imaginary dielectric component with Lorentzian lineshape. The Lorentzian lineshape describes the macroscopic response of a homogeneously broadened ensemble of two-level atoms. Thus, “relatively” isolated direct absorptions, such as optical phonons, are well described by the Lorentzian lineshape. This is also true for excitonic state transitions. However, the Lorentzian form for the absorption does not have a distinct band edge, or a distinct band edge scaling, so it cannot accurately represent the optical response near the onset of optical absorption. Choice of the dispersion functional form is properly guided by the physics of absorption near the band edge, since the onset of absorption just begins to occur within the experimental data range for medium κ films. The functional form of the absorption near the band edge is strongly dependent on i) the type of transition in which the photon is absorbed, i.e., “direct” or “indirect”, and ii) the density of final states [16.46]. Tauc has shown “indirect” band edge scaling is expected in amorphous semiconducting materials [16.47]. At photon energies just below the band edge, there may also be absorption features due to excitonic states, which scale as direct transitions [16.46]. These may conceal the band edge and make determination of the band edge problematic. An appropriate optical model for a medium κ gate dielectric will need to incorporate the “indirect” dependence of the optical absorption near the band edge, using a minimum number of free parameters. 
This is essentially the idea behind the Tauc-Lorentz parameterized optical model, which consists of an imaginary part of the dielectric function formed by multiplying the Tauc (indirect) band edge scaling by the imaginary part of the Lorentz oscillator dielectric function [16.41–16.43], with the real part given by Kramers-Kronig integration. The Kramers-Kronig relations result from the principle of causality and express the real or imaginary component of the dielectric function in terms of the other (over all frequencies). In this model, the Tauc scaling is guaranteed right at the band edge. However, at some distance (in photon energy) above the band edge, the Lorentzian function will dominate, so the Tauc scaling is no longer preserved. In this section, we apply the Tauc-Lorentz optical model to medium κ gate dielectrics over wavelength ranges extending into the UV. We compare fits obtained with this model to fits obtained using more commonly encountered models such as the Lorentz oscillator. Before discussing the application of optical models, it is useful to review the expected band edge scaling for direct and indirect absorptions. The fundamental expression for the absorption is a sum over delta function quantum transitions. Near the band edge, this sum can be transformed into an integral through the introduction of the density of states ρ(E) =

dN/dE. For a “direct” optical band gap, energy is conserved; therefore the final states obey ħω = ħ²k²/2m* + Eg, where m* is the electron reduced mass, Eg is the band gap, and ħ²k²/2m* is the conduction band energy. Thus, dE/dk = ħ²k/m*. Writing the density of states as dN/dk × (dE/dk)⁻¹, and assuming the standard three-dimensional “bulk” form for dN/dk, it is easy to show that the density of states just above the band gap is proportional to (ħω − Eg)^(1/2) [16.46]. This leads to a dependence of ε2 on photon energy just above the band gap of ω²ε2 ∼ (ħω − Eg)^(1/2) [16.46–16.49]. However, for an indirect transition, a phonon is also either created or destroyed, so the quantum mechanical expression for the absorption becomes a double sum. Thus, the final density of states becomes proportional to (ħω − Eg)^(1/2) × Ev^(1/2), where Ev is the valence band energy [16.46]. Performing the double integration over the valence and conduction band states (with a single delta function) leads to the “indirect” absorption band edge scaling of ω²ε2 ∼ (ħω − Eg)² [16.46–16.49]. The Lorentz oscillator parametric dispersion form is shown below:

$$\varepsilon_{LO} = \varepsilon_\infty \left[ 1 + \sum_i \frac{A_i}{\left(E^2 - E_i^2\right)^2 + C_i^2 E_i^2} \right] \qquad (16.2)$$

There are four parameters [A, E0, C, ε1(∞)] that must be determined in a dispersion fit using a single LO. The parameter A is proportional to the transition amplitude, E0 is the absorption resonance energy, C is the damping, and ε1(∞) is the high frequency dielectric constant. In the energy range below the resonant absorption, the real part of the Lorentz oscillator dielectric function is formally equivalent to the Sellmeier dispersion. Thus, the optical response of SiO2 and Al2O3 is well described by the Lorentz oscillator model with a narrow resonant absorption located suitably above the UV [16.50]. As mentioned, this model works relatively well when applied to direct optical transitions. However, there are typically several such allowed transitions in the Brillouin zone of a crystalline material, necessitating the use of multiple oscillators to accurately represent the optical response. Of course, this introduces additional fit parameters and the potential problems associated with an excess number of fit parameters, such as high parameter correlation and increased parameter variation (reduced precision). It is expected that manufacturable, amorphous high κ dielectric films will not possess direct optical transitions. Thus, neither the single nor the multiple Lorentz oscillator dispersion model is expected to exhibit satisfactory characteristics for medium κ gate dielectrics. The Tauc-Lorentz (TL) dispersion parameterization was developed by Jellison and Modine to describe amorphous materials, and has been successfully applied to several amorphous thin films in the NIR-UV range [16.41–16.44]. The TL dielectric function is a smooth function of energy and is characterized by a single broad peak. In the TL model, the imaginary part of the dielectric function, ε2, is obtained by multiplying the Tauc band edge scaling for amorphous materials with the standard Lorentz expression for ε2. The real

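part ε1 is then obtained by a Kramers–Kronig integration. The equations for εTL are shown below.

$$\varepsilon_2^{TL}(E) = \begin{cases} \dfrac{A E_0 C E}{\left(E^2 - E_0^2\right)^2 + C^2 E^2}\,\dfrac{(E - E_g)^2}{E^2} & \text{for } E > E_g, \\[2ex] 0 & \text{for } E \le E_g, \end{cases} \qquad (16.3)$$

and

$$\varepsilon_1^{TL}(E) = \varepsilon_1(\infty) + \frac{2}{\pi}\,P\!\int_{E_g}^{\infty} \frac{E'\,\varepsilon_2^{TL}(E')}{E'^2 - E^2}\,dE' \qquad (16.4)$$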

A detailed expression for ε1 can be found in [16.43]. There are five parameters [A, E0, C, Eg, ε1(∞)] that must be determined in a TL dispersion fit. Determination of the TL parameters for very thin (sub 3 nm) films requires judicious constraint on parameters due to increasing parameter correlation. The standard physical interpretation of A as a direct transition amplitude, E0 as the direct absorption resonance, C as the broadening term, ε1(∞) as the high frequency dielectric constant, and Eg as the band gap must be used with caution. The goal of depositing purely amorphous films has proven difficult to achieve. Thus, optical models characterizing crystalline phases of medium κ films are also necessary. It is useful to recall the form of the dielectric function of crystalline silicon in the visible wavelength range, as shown in Fig. 16.12. Despite the indirect nature of the fundamental band edge, the dielectric function of crystalline silicon exhibits sharp features resulting from direct optical absorptions occurring above the fundamental band edge. This optical structure is not seen for silicon in the amorphous state. In order to model the crystalline state of the medium κ oxides, multiple damped harmonic oscillator models may be used to fit more complicated structure. In Fig. 16.13, we show the dielectric function of such a microcrystalline ZrO2 high κ gate dielectric film. The imaginary part of the dielectric function is nonzero in the UV and clear absorption structures exist, indicating the presence of crystalline structure. HRTEM indicates the presence of a zirconium rich layer about 1.5 nm in thickness below the ZrO2. We also show the optical response of the Zr rich layer. The response is dramatically different from that of the dielectric, with a broad absorption peaked near 3.2 eV, more typical of a highly doped semiconductor. 
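To make the Tauc–Lorentz form concrete, the sketch below evaluates the imaginary part ε2 from Eq. (16.3) and the quantity E·√ε2 that a Tauc plot displays against photon energy. It is an illustration only: the function names are hypothetical, the real part from Eq. (16.4) is not computed, and the parameter values are simply those listed for the “medium” temperature film in Table 16.2.

```python
import math


def tl_eps2(energy_ev, amp, e0, c, eg):
    """Imaginary part of the Tauc-Lorentz dielectric function, Eq. (16.3)."""
    if energy_ev <= eg:
        return 0.0  # no absorption at or below the band edge
    lorentz = amp * e0 * c * energy_ev / (
        (energy_ev ** 2 - e0 ** 2) ** 2 + c ** 2 * energy_ev ** 2
    )
    tauc = (energy_ev - eg) ** 2 / energy_ev ** 2
    return lorentz * tauc


def tauc_ordinate(energy_ev, amp, e0, c, eg):
    """E * sqrt(eps2); close to linear in (E - Eg) just above the band edge."""
    return energy_ev * math.sqrt(tl_eps2(energy_ev, amp, e0, c, eg))


# Parameters as listed for the "medium" film in Table 16.2.
MEDIUM_FILM = dict(amp=62.0, e0=7.02, c=2.81, eg=4.63)

if __name__ == "__main__":
    for e in (4.0, 4.7, 5.0, 5.5, 6.0, 6.55):
        print(e, tl_eps2(e, **MEDIUM_FILM), tauc_ordinate(e, **MEDIUM_FILM))
```

Because ε2 vanishes identically below Eg and grows quadratically just above it, extrapolating the linear part of E·√ε2 to zero returns the fitted Eg, which is how the band edge scaling described earlier is preserved in the model.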
We have conducted optical studies on i) MOCVD zirconium dioxide, ii) MOCVD zirconium silicate, iii) ALCVD hafnium dioxide, and iv) ALCVD hafnium aluminate high κ gate dielectric films. (ALCVD refers to atomic layer chemical vapor deposition and MOCVD refers to metal organic chemical vapor deposition.) Both in-line and research grade spectroscopic ellipsometers have been used to independently confirm gate dielectric thickness and material dispersions. A variety of optical models have been analyzed on these high κ dielectric films. These include the single Lorentz oscillator model, a

Fig. 16.12. Index and extinction for amorphous and crystalline silicon

Fig. 16.13. Real and imaginary parts of the dielectric function of a crystalline ZrO2 film, showing absorption structure in the UV

single Lorentz oscillator multiplied by an imaginary phase exp(iφ), multiple Lorentz oscillators, and the Tauc–Lorentz model [16.41–16.45]. Compositional changes and the amount of crystallinity are modeled using the effective medium approximation (EMA), which gives an optical model based on the optical properties (models) of the constituent materials [16.42]. For example, the relative amounts of zirconium oxide and silicon dioxide in zirconium

silicate may be determined using an EMA. A critical issue with any optical model is the correlation of fit parameters with thickness. In this chapter, we illustrate accurate extraction of material dispersion and thickness for thin Zr silicate and Hf aluminate film sets created under a range of process conditions.

Zirconium Silicates

In this section, we discuss the optical properties of zirconium silicate films processed over a range of growth temperatures using MOCVD. Ellipsometric data were acquired from each wafer using the Woollam VASE™ at 72°, 75°, and 78° angles of incidence, over the energy range 1.55–6.55 eV. The TL dispersion model was applied to each film and found to provide excellent fits. In Fig. 16.14, we show typical data and the TL fit, and in Fig. 16.15, we show the extracted TL dispersions in the range 1.55–6.55 eV. The extracted film thickness and dispersion parameters are shown in Table 16.2. For each fit parameter, the relative fit uncertainty may be estimated in the form of a figure of merit. The figure of merit is given by 1.65 × √MSE × √Cii, where MSE is the “mean square error” and Cii is that parameter’s diagonal covariance matrix element [16.50]. Small relative uncertainty was exhibited across the set of extracted parameters, indicative of parameters that are not highly correlated. The most highly correlated parameter is the TL amplitude A, which increases in relative uncertainty from ∼1/6, to ∼1/5, and to ∼1/4 as the film thickness decreases from ∼4.0 nm, to ∼3.5 nm, and to ∼3.0 nm. This onset of parameter correlation as film physical thickness decreases is typical. The data in Table 16.2 show a large range of “amplitude” A values, while the other dispersion parameters exhibit smaller relative variations. This is

Fig. 16.14. Typical Woollam VASE™ data and Tauc–Lorentz fit for MOCVD Zr silicate films

Fig. 16.15. Extracted Tauc–Lorentz dispersions for MOCVD Zr silicate films in the range 1.55–6.55 eV

Table 16.2. Summary of Tauc–Lorentz parameter fit to Zr silicate sample set

Temp        Thick [Å]   A [eV²]   Eo [eV]   C [eV]   Eg [eV]   ε1(E = ∞)   MSE
“low”       39.92       126       6.86      3.73     4.85      1.97        4.066
“medium”    34.67       62        7.02      2.81     4.63      2.44        4.689
“high”      30.91       40        7.12      2.41     4.46      2.55        4.755

Table 16.3. Tauc–Lorentz A parameter and thickness fit to Zr silicate sample set

Temp        Thickness [Å]   A [eV²]   MSE
“low”       39.50           79.1      4.299
“medium”    34.67           61.9      4.673
“high”      31.10           50.8      4.787

reasonable since the amplitude is proportional to the density of oscillators, and suggests that accurate dispersion and thickness might be extracted by fitting only A and thickness, holding all other parameters fixed. This could provide an acceptable approach to measurement of EOT using variable angle spectroscopic ellipsometry, provided the parameter A is correlated to the dielectric constant. Fixing the parameters Eo, C, Eg, and ε1(E = ∞) to the values found for the Zr silicate film processed at “medium” temperature, we find the results shown in Table 16.3. Table 16.3 lists the extracted film thicknesses, “amplitude” values A, and MSE values for fits accomplished with the parameters Eo, C, Eg, and ε1(E = ∞) held fixed. For the films processed at “medium”

Fig. 16.16. Comparison of the dispersion extracted using the TL amplitude-parameter-only fit with the dispersion determined in the full fit, for the “low” temp Zr silicate film

and “high” temp, the extracted fit values agree with the values determined in the full fit (Table 16.2). However, the returned thickness for the film processed at “low” temp is slightly lower, and the MSE is increased, with respect to the full fit values. To see how the “A only” dispersion fitting approach breaks down, it is instructive to look at the actual extracted dispersion curves for the “low” temp film. This we show in Fig. 16.16. The imaginary part of the dielectric function is essentially identical for both fits, but the real part of the “A only” fit is larger than that of the full fit dispersion. This increases the refractive index, requiring a thinner returned thickness to keep the optical thickness n × t constant. Thus, the incorrect dispersion incorrectly reduces the thickness. From this we can conclude that an “A only” fit will not provide accurate dispersions over the process window, or that we need to adjust the fixed parameters to provide improved fitting over all three films. Ultimately, in any production situation, we want to fit the smallest number of dispersion parameters while maintaining maximum accuracy across the process window. A minimally subjective approach which meets these criteria is to first characterize the dispersions at the expected extremes of the (hopefully small) process window using a full fit, then combine these known dispersions as constituents of an effective medium approximation (EMA) model, and finally restrict the fit to thickness and EMA constituent fraction. There are several well-known EMA types, for example the Bruggeman and Maxwell–Garnett EMA models [16.42], each based on slightly different assumptions about the film structure. For a typical gate dielectric film, the assumptions upon which the EMA is based will probably be false. Nevertheless, the EMA is a physically reasonable way to combine relatively “nearby” dispersions,


Table 16.4. EMA (with TL constituents) fraction and thickness fit to Zr silicate sample set

Temp        Thickness [Å]   EMA [%]   MSE
“low”       39.92           0.0       4.053
“medium”    34.73           59.8      4.688
“high”      30.92           100.0     4.739

Fig. 16.17. Comparison of extracted dispersion using the EMA (with TL constituents) fraction fit with the dispersion determined in the full fit, for the “medium” temp Zr silicate film

and allows relatively robust fitting for optical response in medium κ thin gate dielectrics. The results of this approach as applied to this film set are shown in Table 16.4. Comparison of Table 16.4 with Table 16.2 indicates agreement for all values of thickness and MSE. However, the estimated uncertainty in thickness is much smaller in Table 16.4, with maximum relative uncertainty ∼ 1/300. This represents about twice the required relative uncertainty budget. The absolute value of the estimated EMA fraction uncertainty is less than 4% in all cases. Recalling our EOT relative precision target of 1/600, we will need an uncertainty in EMA fraction approximately two orders of magnitude smaller to have viable EOT production measurement capability using spectroscopic ellipsometry. Nevertheless, what we have demonstrated is a minimally subjective approach which provides optimal precision and accuracy. In Fig. 16.17, we illustrate the agreement between the extracted dispersion using this approach and the dispersion determined in the full fit, for the “medium” temp film.
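As a rough illustration of the EMA step, the following Python sketch mixes two end-point dielectric values with the two-constituent Bruggeman formula and returns the effective value for a given constituent fraction. The function name, the root-selection rule, and the numerical dielectric values are assumptions made for this illustration; they are not taken from the Zr silicate fits. Only the 59.8% fraction echoes the “medium” entry of Table 16.4.

```python
import cmath

def bruggeman_two_phase(eps_a, eps_b, frac_b):
    """Effective dielectric value of a two-constituent Bruggeman EMA.

    Solves (1 - f)(eps_a - e)/(eps_a + 2e) + f(eps_b - e)/(eps_b + 2e) = 0
    for e, which reduces to the quadratic 2 e^2 - b e - eps_a * eps_b = 0.
    """
    frac_a = 1.0 - frac_b
    b = (3.0 * frac_a - 1.0) * eps_a + (3.0 * frac_b - 1.0) * eps_b
    root = cmath.sqrt(b * b + 8.0 * eps_a * eps_b)
    candidates = [(b + root) / 4.0, (b - root) / 4.0]
    # Assumed selection rule: keep the absorbing (Im >= 0) solution,
    # breaking ties with the larger real part.
    return max(candidates, key=lambda e: (e.imag >= -1e-12, e.real))

# Hypothetical end-point dielectric values at a single photon energy.
eps_low = complex(4.2, 0.05)
eps_high = complex(4.8, 0.20)

for frac in (0.0, 0.598, 1.0):
    eps_mix = bruggeman_two_phase(eps_low, eps_high, frac)
    print(f"fraction {frac:5.3f}: eps = {eps_mix.real:.4f} + {eps_mix.imag:.4f}j")
```

In a production-style fit, only the film thickness and the fraction passed to such a mixing function would be floated, with the two end-point dispersions held fixed at every photon energy.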


Fig. 16.18. Typical Woollam VASE™ data and Tauc–Lorentz fit for ALCVD Hf aluminate films

Hafnium Aluminates

The optical properties of hafnium aluminate gate dielectric films grown by ALCVD are discussed in this section. A typical hafnium aluminate ALCVD growth cycle consists of deposition of approximately one atomic layer of hafnium/aluminum oxide alloy. This sequence is then repeated a number of times, resulting in an effectively uniform high κ film of targeted thickness. The films are labeled “A”, “B”, “C”, and “D”. Ellipsometric data were acquired on the Woollam VASE™ at 72°, 75°, and 78° angle of incidence, over the energy range 0.75–6.55 eV. The TL dispersion model was applied to each film, and found to provide excellent fits. In Fig. 16.18, we show typical data and the TL fit, and in Fig. 16.19, we show the extracted TL dispersions in the range 0.75–6.55 eV. In Fig. 16.20, we show the extracted band edge scaling using the standard “Tauc” plot, ω√ε2 vs. ω [16.47]. Films “A”, “B”, and “C” clearly exhibit a linear dependence, confirming the expected “Tauc” band edge scaling, while film “D” exhibits a slight deviation from linearity. The extracted film thicknesses and dispersion parameters, with thickness uncertainties estimated in the form of a figure of merit, are shown in Table 16.5.

There are several important trends evident in Table 16.5. The first is the reduction in MSE and thickness figure of merit (FOM) when we compare the five-parameter Lorentz oscillator “with phase” (LP) fit to the four-parameter Lorentz oscillator (LO) fit. We also see a slight reduction in the returned thickness across the wafer set. This trend of improved MSE and FOM continues as we compare the five-parameter Tauc–Lorentz (TL) fit to the LP fit. However, the thickness value returned by the Tauc–Lorentz fit increases, matching the thickness obtained with the Lorentz oscillator fit. To understand the nature of these changes, it is again useful to examine the extracted dispersion for each model. This is


Fig. 16.19. Extracted Tauc–Lorentz dispersions for ALCVD Hf aluminate films in the range 0.75–6.55 eV

Fig. 16.20. “Tauc” plot showing extracted band edge scaling for the Hf aluminate films

shown in Fig. 16.21. The solid line is the Tauc–Lorentz fit. The LP fit is the short dashed line, which is seen to closely match the TL imaginary dielectric function in the ∼ 5.5–6.55 eV range. The real part of the LP dielectric function is also very close to the TL fit over the entire data range. However, the imaginary part of the LP dielectric function becomes negative in the NIR–VIS range, which is not physical. This is the cause of the reduced thickness value for this model. The LO model, which does not fit as well as either the LP or TL, more or less splits the difference in mismatched real and imaginary components, returning close to the correct thickness over the entire wafer set. Adding an overall phase to


Table 16.5. Comparison of five dispersion models applied to the Hf aluminate (“medium” K alloy) sample set

                          Film A                      Film B
Model                     Thickness [Å]     MSE       Thickness [Å]     MSE
Lorentz Osc.              69.75 ± 0.23      6.554     83.09 ± 0.25      10.55
Lorentz w/phase           68.87 ± 0.16      5.018     81.79 ± 0.18      6.822
Tauc–Lorentz              69.71 ± 0.15      4.32      83.06 ± 0.12      4.494
Tauc–Lorentz w/phase      69.83 ± 0.23      4.316     83.06 ± 0.15      4.497
Two Lorentz Osc.          69.50 ± 0.23      4.606     82.96 ± 0.15      5.025

                          Film C                      Film D
Model                     Thickness [Å]     MSE       Thickness [Å]     MSE
Lorentz Osc.              94.82 ± 0.29      12.85     123.83 ± 0.66     31.72
Lorentz w/phase           92.95 ± 0.21      7.889     118.63 ± 0.35     13.17
Tauc–Lorentz              94.33 ± 0.11      4.702     121.84 ± 0.17     7.132
Tauc–Lorentz w/phase      94.28 ± 0.12      4.714     121.12 ± 0.18     6.914
Two Lorentz Osc.          94.44 ± 0.18      5.033     122.13 ± 0.20     7.397

Fig. 16.21. Comparison of extracted dispersions using the Lorentz oscillator, the Lorentz oscillator with phase, and the TL optical models, for the “D” Hf aluminate film

the TL model and full fit procedure, the MSE is seen to remain essentially unchanged across the wafer set, while the FOM is seen to increase. This is characteristic of the onset of parameter correlation as the number of fit parameters increases. Other six-parameter models have also been investigated on a limited basis [16.51] and may provide reduced MSE values without parameter correlation for medium K films such as these. However, as a practical


Fig. 16.22. Comparison of extracted dispersions using the TL and two Lorentz oscillator models, for the “D” Hf aluminate film

matter, the parameter correlation in a multi-parameter fit will be very high, and will not provide satisfactory precision in a production mode. Increased FOM values are also exhibited in the case of two Lorentz oscillators, which has a total of seven fit parameters. The MSE is slightly larger than the TL, or “TL with phase”, fit, but the FOM parameters are increased by ∼ 1/2, indicating increased parameter correlation. In Fig. 16.22, we show the close agreement between the extracted TL and two Lorentz oscillator models.

We have shown that the Tauc–Lorentz model provides superior dispersion characterization for high K films in the near-IR to UV (0.75–6.55 eV). It clearly provides improved fitting over the single Lorentz oscillator, the single Lorentz oscillator “with phase”, or two Lorentz oscillators. However, this model is not available on typical production ellipsometers, so it is useful to have a more common dispersion parameterization. As illustrated, one can use multiple Lorentz oscillators to produce a dispersion curve very close to “correct”. However, the high degree of parameter correlation complicates extraction of dispersion in production using a multiple Lorentz oscillator fit. Practically speaking, only one fit parameter is allowable, if any. As for the case of Zr silicates, dispersions known to accurately represent the extremes of the process window may be used as constituents in an EMA to extract material composition. This approach offers the advantage of simplicity in implementation and interpretation, and also robust fitting, since the endpoints are controlled. In Fig. 16.23 we show the excellent overall agreement between the TL dispersion fit and the fit obtained using the EMA approach for the Hf aluminate “C” alloy film. We also note that for the thinnest (∼ 7.0 nm) film, the estimated relative uncertainty in thickness is ∼ 1/700, while the estimated relative uncertainty in EMA fraction is ∼ 1/25. Production recipes using this approach were developed for each of the above mentioned films. However, high K dispersions must be determined on a process specific basis, i.e., choice and calibration of fit parameters must be made, and the final production recipe must be qualified in gauge studies, so the recipes cannot be considered universal. Although no capability study was performed on this wafer set, the estimated uncertainties for thickness obtained from the figure of merit agree with precision values previously established. The EMA fraction repeatability is expected to be less than 4% (absolute).

Fig. 16.23. Comparison of extracted dispersions using TL and EMA optical models, for the “C” Hf aluminate film

The medium κ gate dielectric metrology challenge will be control of EOT value using current production spectroscopic ellipsometers and optical models. For production implementation, high κ dispersions must be determined on a process specific basis, choice and calibration of fit parameters must be made, and the final production recipe must be qualified in gauge studies. From Table 16.4, we have σEMA ≈ 4%, and σT ≈ 0.01 nm for simultaneous determination of EMA fraction and physical thickness using variable angle spectroscopic ellipsometry. If the high κ physical thickness is 3.0 nm, then σT/TPHYS = 1/300, which is about twice the required precision. Now, if the EMA fraction is 50%, then σEMA/EMA = 1/12.5, which is about two orders of magnitude too large. Note that if the high κ process were so well controlled that an EMA fraction measurement is not required, then a “thickness only” fit to extract physical thickness will provide satisfactorily small thickness precision. However, it remains to be seen whether this is the case, or if optical metrology for EOT does require determination of material composition. Simultaneous thickness and EMA fraction measurements with spectroscopic ellipsometry on medium κ gate dielectrics will require approximately two orders of magnitude improvement in precision to become


viable for production EOT metrology. Since in-line metrology is done at high throughput, it is likely that process control will be done for thickness uniformity assuming a robust high κ process that results in constant dielectric properties.

Acknowledgments. ACD thanks Bob Wallace for providing TEM micrographs and suggestions about structure function relationships. One of the authors (ACD) gratefully acknowledges the contributions of Dave Muller, David Joy, Steve Pennycook, Christian Kisielowski, Michael O’Keefe, and Brendan Foran to his understanding of HRTEM and HA–ADF STEM as well as the intricacies of the art of microscopy. ACD also thanks Rick Garfunkel for his contributions to ACD’s understanding of MEIS. We also thank Ed Principe for discussions in his study of oxynitrides. WWC thanks G.E. Jellison, Dave Aspnes, Nhan Nguyen, and Rob Collins for useful discussions.

References

16.1. H.R. Philipp, Silicon Dioxide (SiO2) (Glass), In: Handbook of Optical Constants, ed. by E.D. Palik, (Academic Press, San Diego, 1985), pp. 749–763
16.2. C.A. Richter, N.V. Nguyen, E.P. Gusev, T.H. Zabel, and G.B. Alers, Optical and Electrical Thickness Measurements of Alternative Gate Dielectrics, In: Characterization and Metrology for ULSI Technology 2000, ed. by D.G. Seiler, A.C. Diebold, T.J. Shaffner, R. McDonald, W.M. Bullis, P.J. Smith, and E.M. Secula, AIP Conference Proceedings 550, pp. 134–139
16.3. D.K. Schroder, Oxide and Interface Trapped Charge, In: Semiconductor Material and Device Characterization, (Wiley, New York, 1990), pp. 244–296
16.4. A.C. Diebold, D. Venables, Y. Chabal, D. Muller, M. Welden, and E. Garfunkel, Characterization and Production Metrology of Thin Gate Oxide and Oxy-nitride Films, (review in) Materials Science in Semiconductor Processing 2, pp. 103–147 (1999)
16.5. D.A. Muller, Gate Dielectric Metrology Using Advanced TEM Measurements, In: Characterization and Metrology for ULSI Technology 2000, ed. by D.G. Seiler, A.C. Diebold, T.J. Shaffner, R. McDonald, W.M. Bullis, P.J. Smith, and E.M. Secula, AIP Conference Proceedings 550, pp. 500–505
16.6. D.A. Muller and JD. Neaton, Evolution of the Interfacial Electronic Structure During Thermal Oxidation, In: Fundamental Aspect of Silicon Oxidation, ed. by Y. Chabal, (Springer, New York, 2001), pp. 219–246; and D.A. Muller, T. Sorsch, S. Moccio, F.H. Baumann, K. Evans-Lutterodt, and G. Timp, “The electronic structure at the atomic scale of ultrathin gate oxides,” Nature 399, pp. 758–761 (1999)
16.7. F.H. Baumann, C.-P. Chang, J.L. Grazul, A. Kamgar, C.T. Liu, and D.A. Muller, “A Closer Look at Modern Gate Oxides,” Mater. Res. Soc. Symp. 611, pp. C4.1.1–C4.1.12 (2000)
16.8. A.C. Diebold, Electron microscopy based measurement of feature thickness and calibration of reference materials, In: Handbook of Silicon Semiconductor Metrology, ed. by A.C. Diebold, (Dekker, New York, 2001), pp. 851–863


16.9. S. Taylor, J. Mardinly, M.A. O’Keefe, and R. Gronsky, HRTEM Image Simulations for Gate Oxide Metrology, In: Characterization and Metrology for ULSI Technology 2000, ed. by D.G. Seiler, A.C. Diebold, T.J. Shaffner, R. McDonald, W.M. Bullis, P.J. Smith, and E.M. Secula, AIP Conference Proceedings 550, pp. 130–133
16.10. S. Taylor, J. Mardinly, M.A. O’Keefe, and R. Gronsky, HRTEM Image Simulations of Structural Defects in Gate Oxides, In: Characterization and Metrology for ULSI Technology 2000, ed. by D.G. Seiler, A.C. Diebold, T.J. Shaffner, R. McDonald, W.M. Bullis, P.J. Smith, and E.M. Secula, AIP Conference Proceedings 550, pp. 125–129
16.11. M.A. O’Keefe, C.J.D. Hetherington, Y.C. Wang, E.C. Nelson, J.H. Turner, C. Kisielowski, J.-O. Malm, R. Mueller, J. Ringnalda, M. Pan, A. Thust, Sub-Angstrom High Resolution Transmission Electron Microscopy at 300 keV, Ultramicroscopy (accepted)
16.12. S. Pennycook, private communication; and P.D. Nellist and S.J. Pennycook, “Subangstrom resolution by underfocused incoherent transmission electron microscopy,” Physical Review Letters 81, pp. 4156–4159 (1998)
16.13. D. Van Dyck and M. Op de Beeck, “Direct Structural Retrieval from high-resolution electron micrographs”, In: Computer Simulation of Electron Microscope Diffraction and Images, A TMS Publication, ed. by W. Krakow and M.A. O’Keefe (1989), pp. 265–271; and W.M.J. Coene, A. Thust, M. Op de Beeck, D. Van Dyck, Ultramicroscopy 64, 109 (1996); and A. Thust, W.M.J. Coene, M. Op de Beeck, D. Van Dyck, Ultramicroscopy 64, 211 (1996)
16.14. M.A. O’Keefe, E.C. Nelson, Y.C. Wang, and A. Thust, Sub-Ångstrom resolution of atomistic structures below 0.8 Å, Philosophical Magazine B 81(11), 1861–1878 (2001)
16.15. C. Kisielowski, C.J.D. Hetherington, Y.C. Wang, R. Kilaas, M.A. O’Keefe, A. Thust, “Imaging columns of the light elements C, N, and O with sub-Angstrom resolution,” Ultramicroscopy 89(4), 243–263 (2001)
16.16. C. Kisielowski, private communication and paper in progress
16.17. F.M. Ross and W.M. Stobbs, “A study of the initial stages of the oxidation of silicon using the Fresnel Method”, Philosophical Mag. A 63, pp. 1–36 (1991)
16.18. S. Stemmer, private communication
16.19. D. Muller, private communication; and A.C. Diebold, B. Foran, C. Kisielowski, D. Muller, S. Pennycook, E. Principe, S. Stemmer, Thin Dielectric Film Thickness Determination by Advanced Transmission Electron Microscopy, Microscopy and Microanalysis, in press
16.20. C. Kisielowski, E. Principe, B. Freitag, D. Hubert, “Benefits of microscopy with super resolution”, Physica B, International Conference on Defects in Semiconductors, Giessen, Germany, July 16–20, 2001
16.21. E. Principe, A. Hegedus, T.C. Chua, C. Olson, Hyper Thin Nitrided Gate Oxide Characterization Methodology, Quantitative Surface Analysis
16.22. C. Powell
16.23. E. Principe, A. Hegedus, C. Kisielowski, C. Song, B. Freitag, D. Hubert, T. Fliervoet, J. Gibson, J. Moulder, and D. Watson, “Pushing The Limits Of Nitrogen Doped Silicon Oxide Gate Dielectric Materials: The Materials Characterization Role of TEM/STEM, PEELS and XPS”, AVS 48th International Symposium, San Francisco, Oct. 28, 2001


16.24. D. Deslattes and R.J. Matyi, Analysis of thin layer structures by X-ray Reflectometry, In: Handbook of Silicon Semiconductor Metrology, ed. by A.C. Diebold, (Dekker, New York, 2001)
16.25. C.H. Russell, R.D. Deslattes, A.C. Diebold, and J. Cline, A study of tantalum pentoxide thin dielectric films using grazing incidence x-ray reflectivity and powder diffraction, In: Characterization and Metrology for ULSI Technology, ed. by D.G. Seiler, A.C. Diebold, M. Bullis, T.J. Shaffner, R. McDonald, (AIP Press, New York, 2000/2001)
16.26. D. Deslattes and R. Matyi, private communications to W. Chism and A.C. Diebold and to P. Lysaght
16.27. G.D. Wilk, R.M. Wallace, and J.M. Anthony, J. Appl. Phys. 89, 5243–5275 (2001)
16.28. G. Lucovsky and B. Rayner, A microscopic model for enhanced dielectric constants in low Zr concentration SiO2 rich non crystalline Zr and Hf silicate alloys, Appl. Phys. Lett. 77, 2912 (2000)
16.29. G.D. Wilk and R.M. Wallace, Appl. Phys. Lett. 74, 2854–2856 (1999)
16.30. G.D. Wilk, R.M. Wallace, and J.M. Anthony, J. Appl. Phys. 87, 484–492 (2000)
16.31. W.B. Blumenthal, The Chemical Behavior of Zirconium, (van Nostrand, Princeton, 1958), pp. 201–219
16.32. B. Rayner, H. Niimi, R. Johnson, B. Therrien, and G. Lucovsky, Spectroscopic Evidence for a Network Structure in Plasma-Deposited Ta2O5 Films for Microelectronics Applications, In: Characterization and Metrology for ULSI Technology 2000, ed. by D.G. Seiler, A.C. Diebold, T.J. Shaffner, M. Bullis, R. McDonald, P.J. Smith, and E.M. Secula, AIP Conference Proceedings 550, (AIP, New York, 2000), pp. 149–153
16.33. J.C. Philips, Stress and defects in silicate films and glasses, J. Vac. Sci. Technol. B 18, 1749–1751 (2000)
16.34. J.C. Philips, J. Non-Cryst. Solids 47, 203 (1979)
16.35. M. Copel, M. Gribelyuk, and E. Gusev, Appl. Phys. Lett. 76, pp. 436–438 (2000)
16.36. M. Tuominen, T. Kanniainen, and S. Haukka, ECS
16.37. S. Ramanathan, D.A. Muller, G.D. Wilk, C.M. Park, and P.C. McIntyre, Effect of oxygen stoichiometry on the electrical properties of zirconia gate dielectrics, Appl. Phys. Lett. 79, 3311–3313 (2001)
16.38. E.P. Gusev, E. Cartier, M. Copel, M. Gribelyuk, D.A. Buchanan, M. Tuominen, M. Jussila, and S. Haukka, IEDM 2000
16.39. E.P. Gusev, M. Copel, E. Cartier, I.J.R. Baumvol, C. Krug, and M.A. Gribelyuk, Appl. Phys. Lett. 76, 176–178 (2000)
16.40. D.A. Muller and G.D. Wilk, Atomic scale measurements of the interfacial electronic structure and chemistry of zirconium silicate dielectrics, Appl. Phys. Lett. 79, 1–4 (2001)
16.41. G.E. Jellison Jr. and F.A. Modine, Appl. Phys. Lett. 69, 371 (1996); G.E. Jellison Jr. and F.A. Modine, Appl. Phys. Lett. 69, 2137 (1996)
16.42. G.E. Jellison, Physics of Optical Metrology of Silicon-based Semiconductor Devices, In: The Handbook of Silicon Semiconductor Metrology, ed. by A.C. Diebold, (Dekker, 2001)
16.43. J. Leng et al., Thin Solid Films 313–314, 132 (1998)


16.44. A.C. Diebold, J. Canterbury, W. Chism, C.A. Richter, N.V. Nguyen, J.R. Ehrstein, and C. Weintraub, “Characterization and production Metrology of Gate Dielectric films: Optical Models for oxynitrides and high dielectric constant films”, Proceedings of the 2000 European Materials Research Society Meeting (E-MRS), Materials Science in Semiconductor Processing 4, pp. 3–8 (2001)
16.45. W. Chism, A.C. Diebold, J. Canterbury, and C. Richter, Characterization and Production Metrology of Thin Transistor Gate Dielectric Films, In: Proceedings of the Fifth International Conference on Ultra-Clean Processing of Silicon Surfaces, UCPSS 2000, ed. by M. Heyns, P. Mertens, and M. Meuris, (Scitec Publications, Zuerich, 2000), pp. 177–180
16.46. P.Y. Yu and M. Cardona, Fundamentals of Semiconductors (Springer-Verlag, Heidelberg, 1996)
16.47. J. Tauc et al., Phys. Stat. Sol. 15, 627 (1966)
16.48. C.F. Klingshirn, Semiconductor Optics (Springer-Verlag, Berlin, 1997)
16.49. H. Ibach and H. Luth, Solid-State Physics (Springer-Verlag, Berlin, 1995)
16.50. C.M. Herzinger, B. Johs, W.A. McGahan, J.A. Woollam, and W. Paulson, “Ellipsometric determination of optical constants for silicon and thermally grown silicon dioxide via a multi-sample, multi-wavelength, multi-angle investigation”, J. Appl. Phys. 83, pp. 3323–3336 (1998)
16.51. N.V. Nguyen, C.A. Richter, Y.J. Cho, G.B. Alers, and L.A. Stirling, Appl. Phys. Lett. 77, 3012–3014 (2000)

17 Electrical Measurement Issues for Alternative Gate Stack Systems G.A. Brown

17.1 Introduction

The downward scaling in dimensions of silicon metal-oxide-silicon integrated circuits (MOSIC’s) has proceeded with the mathematical precision implied by Moore’s Law since the early days of the technology [17.1]. While there have been refinements in the techniques used for the electrical characterization of these devices throughout the period, it has only been in the last several years that scaling-related changes in the materials and structure of the gate dielectrics in these devices have necessitated closer examination of the basic measurement procedures used in electrical characterization, particularly in the widely-used MOS capacitance-voltage (C–V) analysis. These changes have arisen from the need for increasing MOS transistor drive current in the face of decreasing overall device dimensions. Since the electric charge for this drive current comes from charging the capacitance of the gate dielectric, and this capacitance is inversely related to film thickness, this has forced the thickness of the conventional SiO2 gate dielectric downward until first its leakage and finally its physical realizability have made it no longer a viable material for these highly scaled devices. This has led to the search for new gate insulators with higher relative dielectric constants (k or εr), which will permit induction of the higher charge density required for advanced devices with film thicknesses great enough to limit the excessively high tunneling currents of the very thin SiO2 films. Still, even with the new materials, the trend is to scale their thickness down to the point where their leakage approaches limits of acceptability in order to achieve the lowest possible electrical equivalent oxide thickness (EOT) for the devices. So it is this increased level of dc conduction and dielectric loss that is at the heart of the increased complexity in the electrical characterization of these new materials.
It is the purpose of this chapter to highlight salient features of the electrical characterization of high-k and ultra-thin oxide structures that may differ from those of the thicker SiO2 or silicon oxynitride films used in earlier MOSIC’s. A general treatment of MOS electrical characterization is well beyond the scope of this chapter, and in fact has been the subject of several excellent texts, two of which are referenced here [17.2,17.3] and recommended to the reader for a more complete background. Going further, it is our goal to present this material at a basic level and with sufficient clarity to meet the


needs of researchers whose backgrounds may not have included basic electrical engineering concepts. In the early days of the technology, a good majority of workers came from electrical engineering or engineering physics disciplines, but the increasing complexity of the chemistry and materials science underlying MOSIC process technology has attracted large numbers of scientists with these skills into the field. As with the rest of us, they must depend upon valid electrical characterization of their materials and processes to measure their progress.

While the higher dc conduction and dielectric loss of today’s ultra-thin oxide/high-k films are a central cause of electrical measurement difficulties, other material properties of high-k gate stacks require special attention. Among these are the decreased electrical band gap and associated contact barrier heights found in these materials compared to SiO2. This results from the established relationship between decreasing band gap and increasing dielectric constant for these materials, and will affect the dc conduction properties of the films. Other material properties that will cause potential device operation problems include the crystallization of some of the high-k films, which may be associated with non-uniform conduction at grain boundaries and spatial dielectric nonuniformities as well as providing sites for charge trapping. The very name “gate stack” implies the existence of interfacial layers that are notable sites for charge trapping. All of these potential problems remind us of our good fortune in use of the SiO2-Si interface with its inherent properties of wide band gap, high barrier heights, very low conductivity (in reasonable thicknesses), and relatively low levels of charge trapping.

17.2 Capacitance–Voltage Measurement

17.2.1 Overview

In general terms, electrical characterization of dielectric films includes primarily MOS capacitance-voltage (C–V) and dc conduction (I–V) measurements. While both types of measurements have been affected by the special properties of ultra-thin oxide and high-k films, the more pronounced and obvious impact is seen for the C–V measurements. In the past, it has been possible simply to connect an MOS capacitor to some form of capacitance meter and analyze the resulting C–V plot in terms of a well-established model, extracting the parameters of interest. In this section we will see that these two operations, measurement and analysis, have both become more complex as a result of the physical and material changes inherent in these more advanced structures. In the measurement area, the task now is to extract valid values for the insulator and insulator-silicon interface capacitance from impedance or admittance measurements of a structure with an equivalent circuit containing additional elements including parallel conductors, series resistors, possibly


inductors, and RC networks representing interface states. One can no longer expect to be able to equate the ‘capacitance’ output of the measurement tool with the physical capacitance of interest. Once a valid representation of the insulator/interface C–V characteristic is obtained, extraction of primary parameters including equivalent oxide thickness (EOT), silicon surface doping density, and flat band voltage requires a much more complex analysis than previously, including quantum containment and polysilicon gate electrode depletion effects that must be deducted because their magnitude is now not negligible with regard to the EOT. Of course in some instances, the relevant thickness parameter is one that includes all these components. In particular, operating characteristics of MOSFET’s that are measured in the inversion regime of the gate stack, such as drive current and transconductance, depend upon the total capacitive thickness of the gate structure. This parameter is called CET, the Capacitance Equivalent Thickness, and is defined simply as the dielectric constant of the gate stack times the area of the gate divided by the gate capacitance measured at some appropriate specified voltage. In this section, we will discuss aspects of both the measurement and analysis of MOS C–V curves of ultra-thin oxide/high-k gate dielectrics, and to put them in perspective, we begin with a description of the development of the techniques as MOSIC technology evolved.

17.2.2 Background

MOS C–V measurements have been an integral part of MOS device development since the inception of the technology in the early 1960’s. At that time, many measurements were made manually, point-by-point, using a four-arm, Wheatstone-type impedance bridge, usually of the Schering type [17.4]. The voltage was set and the bridge balanced to obtain the real and imaginary parts of the impedance, values were written down, and the voltage set to a new value. Data plotting was done manually, and data analysis was done with charts derived from simple calculations. These early devices were relatively simple to measure, in that the gate oxide thicknesses were in the range 80–120 nm, dc conduction through the films was immeasurably low at device operating voltages, and the Q, or quality factor, for the capacitors was high, well above 100. While one had to obtain the dielectric loss or conductance of the sample to get the bridge to balance, there seemed little point in recording it as the values remained small and were not needed for conventional C–V analysis. Indeed one of the great strengths of the conductance-frequency technique for obtaining oxide-silicon interface state density proposed by Nicollian and Goetzberger [17.5] was the fact that these interface states were the only significant source of dielectric loss in the structures. Manually operated bridges gave way to a variety of impedance or capacitance meters specially designed for this application (or similar ones such as pn junction C–V), but many of these had only a single meter output. It was


possible to switch this meter from indicating capacitance to conductance, but in general this was not needed for MOS C–V analysis. Only the capacitance was needed, and the choice of series or parallel equivalent circuits was largely irrelevant as the two capacitance values were equivalent for the high-Q structures. Although device scaling was progressing steadily, gate oxide thicknesses remained 20 nm or thicker for the next two decades, into the 1980’s. In this regime the underlying assumptions of the measurement remained much the same, so that it was easy to become lulled into ignoring the finer points of MOS C–V measurement. As SiO2 gate thicknesses moved below 10 nm and work began on alternative gate dielectric materials, it became clear that we need to re-examine the underlying principles of C–V measurement and analysis, comprehending effects that must be taken into account in order to yield valid interpretations of the results.

17.2.3 Definition of Capacitance

First let’s define capacitance. Capacitance is the defining feature of one of three lumped passive circuit elements that can be used to describe the characteristics of electrical circuits. The other two defining features are resistance and inductance. Capacitors and inductors, the elements exhibiting capacitance and inductance, are energy storage elements defined with no energy loss, while resistors, the elements exhibiting resistance, are purely energy dissipation elements with no ability to store energy. The energy in a capacitor is stored in the electric field between its electrodes; the simplest physical realization of a capacitor is two parallel plate electrodes separated by free space. For such a structure the value of the capacitance, C, is given in terms of its geometry,

C = ε0 ∗ A/d    (17.1)

where A = area of the parallel plates, d = the separation of the plates, and ε0 = dielectric constant of free space; essentially a constant to get the units right.

The value of the capacitance of this structure may be varied by inserting a dielectric medium having a relative dielectric constant, k, between the plates. The value of the capacitance of the structure then becomes

C = k ∗ ε0 ∗ A/d.    (17.2)
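As a minimal numerical sketch of Eqs. (17.1) and (17.2), the Python fragment below evaluates the parallel-plate capacitance and shows that a thicker film with a higher k can give the same capacitance as a thin SiO2 film, which is the idea behind an equivalent oxide thickness. The function name, the capacitor area, the 2 nm SiO2 thickness, and the high-k value of 20 are assumptions chosen for illustration; k = 3.9 is the commonly quoted value for SiO2.

```python
EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def parallel_plate_capacitance(area_m2, separation_m, k=1.0):
    """Capacitance from Eq. (17.2); k = 1 recovers the free-space case, Eq. (17.1)."""
    return k * EPS0 * area_m2 / separation_m

area = (100e-6) ** 2            # hypothetical 100 um x 100 um capacitor
k_sio2, t_sio2 = 3.9, 2.0e-9    # SiO2 film, assumed 2 nm thick
k_high = 20.0                   # hypothetical high-k dielectric
t_high = t_sio2 * k_high / k_sio2   # physical thickness giving equal capacitance

c_sio2 = parallel_plate_capacitance(area, t_sio2, k_sio2)
c_high = parallel_plate_capacitance(area, t_high, k_high)

print(f"SiO2:   t = {t_sio2 * 1e9:.2f} nm, C = {c_sio2 * 1e12:.1f} pF")
print(f"high-k: t = {t_high * 1e9:.2f} nm, C = {c_high * 1e12:.1f} pF")
```

The two capacitances agree because only the ratio k/d enters Eq. (17.2), so the high-k film can be physically about five times thicker here while storing the same charge per volt.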

The relative dielectric constant of free space is unity. Our three types of circuit elements are most commonly defined and measured in terms of their response to sinusoidal electrical excitation. This type of excitation has the interesting property that if it is applied to an electrical circuit made up of the linear passive elements; resistors, capacitors, and inductors, all currents and voltages measured within this circuit will also be sinusoids of the same frequency, differing only in amplitude and phase. In particular, for a resistor, R, the current in response to a sinusoidal applied

17 Electrical Measurement Issues for Alternative Gate Stack Systems

Fig. 17.1. Complex impedance plane with capacitive impedance. The series combination of Rs and Cs is drawn as the vector Z = Rs − j(1/ωCs) at phase angle Θ, with R on the abscissa and jX on the ordinate

voltage, V , will be exactly in phase with the voltage, and will have an amplitude given by I = V /R. The current in an inductor in response to the same excitation will appear to lag behind the voltage by a time equal to 1/4 of the period of the sinusoid, while current in a capacitor appears to lead the applied voltage by the same amount. This will be illustrated in diagrams to be described below. Of course in the real world there are no ideal circuit elements. All capacitors and inductors have some degree of energy loss associated with them, and resistors will have some small degree of energy storage capability. These relationships are illustrated by considering the values of the elements as vectors on a Complex Impedance Plane. An example of such a plane is given in Fig. 17.1. The need for complex variables here can be thought of as coming from the intrinsically different nature of the idealized circuit elements; capacitors and inductors being lossless energy storage elements while resistors are purely energy dissipation elements. These effects must be accounted for separately. Thus in Fig. 17.1, the total impedance of the series combination of resistor Rs and capacitor Cs is the vector sum Z, shown with the associated phase angle θ. The imaginary part of the impedance is called reactance, and the capacitive reactance, −j1/(ωCs ), is shown as a negative value along the ordinate of the plot. Inductive reactances, jωL, would be indicated as positive values on the ordinate. Since capacitors are our primary interest here, particularly those with high conductivity, the use of a Complex Admittance plane for describing the parallel combination of capacitance and conductance may be more appropriate. Such a plane, along with the equivalent circuit mentioned, is shown in


G.A. Brown

Fig. 17.2. Complex admittance plane with capacitive admittance. The parallel combination of Gp and Cp is drawn as the vector Y = Gp + jωCp, with G on the abscissa and the angles Θ and δ marked

Fig. 17.2. As in the complex impedance plane, the lossy component of the admittance is shown as a vector along the abscissa, while the capacitive portion of the admittance, called susceptance, is shown as a vector along the ordinate. The vector sum of these components, the admittance Y, is indicated on the plot. The phase angle between the two components, θ, is shown as before, but its complementary angle, δ, is of greater significance. This is because δ, or more properly the tangent of the angle δ, is a direct measure of the relative lossiness of the parallel combination of Cp and Gp. For this reason the quantity tan δ is called the dissipation factor, and given the symbol D. The quantity D is often used in the measurement of complex admittances and impedances, as we shall see. Since D is a measure of the amount of energy dissipation associated with our capacitive admittance, it is reasonable that its inverse should be a measure of the ideality of capacitance, or its freedom from loss. This is in fact the case, and the inverse of the dissipation factor D, 1/D, is defined as the quality factor of an impedance or admittance, Q.

To integrate all the concepts described in the paragraphs above, we introduce the idea of a phasor diagram, shown in the left portion of Fig. 17.3. This diagram looks exactly like the complex admittance plane of Fig. 17.2, but a new aspect is added. This is the fact that if you rotate a vector in this complex admittance plane counterclockwise about the origin with a constant angular velocity, the projection of the end of the vector onto the ordinate will trace out a sine wave in time. This is illustrated in the right hand portion of Fig. 17.3. If at t = 0 one takes the excitation voltage as a vector of unity magnitude lying along the abscissa, its ordinate projection is zero, as shown in the applied voltage curve. As it rotates counterclockwise about the origin, its ordinate projection reaches 1 after 90°, or 1/4 of its period.
Proceeding onward, its intercept again reaches zero at 1/2 its period, swings negative, and returns to zero as its period is completed. If one now considers all the vectors in the phasor diagram to be rotating together, they trace out the full set of curves in the right-hand plot, showing the amplitude and phase relationships of the

Fig. 17.3. Admittance phasor diagram with waveforms. Left: phasor diagram of Y = Gp + jωCp at t = 0, with the angles Θ and δ marked. Right: waveforms versus time (ωt) for the applied voltage, pure capacitor response, lossy capacitor response, and pure resistor response

complex admittance and its real (resistive) and imaginary (capacitive) components. The significance of the complementary phase angle, δ, as the phase shift by which the capacitive current response appears to lead the applied voltage, is clearly seen. The primary value of this set of waveforms for us is that it represents what is actually being done in the act of measuring our capacitance-voltage characteristics. Because of the complex nature of the impedance or admittance, two parameters must be determined to specify it; parameters representing the real and imaginary components of the quantity. From the diagram of Fig. 17.3, it is seen that these are the magnitude and phase shift of the response of the complex impedance or admittance to sinusoidal excitation. Through algebraic manipulation, these components may be expressed in other terms that may appear more applicable to the needs of the user, but their source and meaning must not be overlooked. With this background, it is appropriate to re-consider our simple definition of dielectric constant presented in (17.1) and (17.2). Since we see that the response of a real dielectric will in general contain real (in-phase) and imaginary (90° out-of-phase) components, we can generalize our definition of dielectric constant by making it a complex quantity,

\[ \varepsilon = \varepsilon' - j\varepsilon'' \tag{17.3} \]

Since the dielectric constant is defined as the coefficient of the imaginary portion of the admittance, its real component, ε′, relates to the imaginary part of the admittance, what we have referred to as the dielectric constant, while the imaginary part, ε″, refers to the energy dissipative part, and is called the dielectric loss. It follows from the relationships discussed above that these two components are interrelated by the dissipation factor D;

\[ D = \tan\delta = \varepsilon'' / \varepsilon' \tag{17.4} \]

Mechanisms of energy loss in dielectrics have been largely overlooked for SiO2 because of its near-ideal properties when formed as a CMOS gate dielectric. With the advent of the new high-k materials being developed, the possibility of the existence and the source of these loss mechanisms at significant levels must be considered [17.6, 17.7]. In general, the highest frequency absorption processes in a dielectric arise from polarization of the electron cloud surrounding the ions. This contributes a component to the dielectric constant equal to the square of the index of refraction. At lower frequency are ionic relaxation losses that occur when ions are able to jump back and forth between neighboring sites in the structure. Similar effects occur for electronic charge carriers hopping between potential minima, trapping centers, or crystalline or film interfaces, including the electrodes. These effects are normally characterized by a temperature-dependent hopping frequency that is a function of the species and its environment. This gives rise to structure in the dielectric constant/loss parameters as a function of frequency, where the loss peaks are associated with the natural frequencies of the processes. For the very thin films contemplated for advanced gate dielectrics, non-relaxation dc conduction arising from direct tunneling or other mechanisms will respond to alternating current excitation. This can result in a low frequency loss spectrum that decreases monotonically with frequency, and can dominate the other loss mechanisms in the film, as we shall see in an example below. In any event it is important to make capacitance-voltage measurements over as wide a range of frequency and temperature as possible to identify any loss mechanisms that might affect the viability of the candidate materials. 
17.2.4 Measurement of Capacitance and Its Output in Series or Parallel Mode

Earlier we mentioned the Schering bridge, a manually operated instrument in which an unknown impedance was balanced against known, variable standards to achieve a null signal indicating that both real and imaginary portions of the impedance had been identified. Such a tool and procedure are totally unrealistic given today’s high volume, high speed data acquisition requirements, so other procedures have been adopted. Besides high accuracy, one great advantage of the Schering bridge was that the user knew what was being measured, and how. With today’s modern tools, one simply connects one’s sample and a variety of numbers appear, making it easy to obtain data without an adequate understanding of its origin or meaning. There are a variety of measurement techniques that have been adapted to high-speed, automated measurement of electrical impedance [17.8, 17.9].


The choice of technique depends upon the nature of the impedance to be measured, and the frequency range of the desired measurement. For the lower frequency (< 100 MHz) component measurements that are typically used for MOS C–V characterization, the auto balancing bridge method is commonly used. This method is understandable in terms of the discussion of Fig. 17.3, above. There are three steps in the measurement; first, a reference sinusoidal signal is generated and applied to the device under test (DUT) and a reference resistor. The difference, or error voltage, is sent through phase detectors that separate it into in-phase and 90◦ out-of-phase components. These are applied to a modulator, amplified, and fed back to the reference resistor until a null is achieved. A vector ratio detector then measures the voltages across the DUT and the known reference resistor, generating outputs that can be displayed in a number of formats. The available choice of these pairs of output parameters can vary widely with the instrument used, but may include |Z| − Θ, |Y | − Θ, Cs − Rs , Cs − D, Cp − Gp , Cp − D, parameters defined in Figs. 17.1–17.3, and others. An important fact to remember is that all these parameter pairs are derived from the same basic measurement, and the values of any of them can be calculated from the others from known interrelationships. The choice of the parameter pair for any given measurement may be critical, however, depending upon the way the output data will be analyzed. Two of the most commonly used output pairs are series resistance/capacitance and parallel conductance/capacitance. These are interrelated to each other and to the dissipation factor, D, by the expression D = ωRs Cs = 1/(ωRp Cp ) = 1/Q

(17.5)

where Rp = 1/Gp . Given these interrelationships, it would be easy to suppose that it would make little difference which ones were chosen as outputs for a particular measurement. In a sense this is true, but there are situations in which a proper choice will greatly enhance the utility of the data. The underlying reason for this comes from the fact that given the usual set of caveats about linear passive elements, the impedance of an arbitrarily complex network of resistors, inductors, and capacitors can be represented by the combination of a single resistor and reactive element. Turning this relationship about, it follows that the series or parallel capacitance that is the output of a capacitance measurement may bear little or no direct relationship to any of the individual capacitors within the network being measured. This is the basis for the concern expressed earlier about our past experience with thicker SiO2 films lulling us into the assumption that the capacitance value we read on the face of our LCR meter is the value of the oxide capacitance we seek. This is true only if the effective equivalent circuit of our sample is the same as the equivalent circuit we have chosen for our measurement output, or if the dissipation


factor of our sample is very low. In this latter case, it can be shown that the series and parallel equivalent capacitances are essentially equal, with a correction given in terms of the dissipation factor, D. The relevant and very useful relationship here is

\[ C_s = C_p (1 + D^2). \tag{17.6} \]
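The interrelationships in (17.5) and (17.6) can be checked numerically by treating a single measurement as a complex admittance and reading it out in both modes. The sketch below is illustrative only: the function name and the sample values (100 pF, 300 kHz, and the three conductances) are assumptions, not data from the text.

```python
import math


def parallel_to_series(cp, gp, freq_hz):
    """Convert a parallel-mode (Cp, Gp) reading to the series pair (Cs, Rs)."""
    w = 2 * math.pi * freq_hz
    y = complex(gp, w * cp)   # Y = Gp + j*w*Cp
    z = 1 / y                 # the same device viewed as Z = Rs - j/(w*Cs)
    rs = z.real
    cs = -1 / (w * z.imag)
    d = gp / (w * cp)         # dissipation factor, D = tan(delta)
    return cs, rs, d


CP, FREQ = 100e-12, 300e3     # assumed sample: 100 pF measured at 300 kHz
for gp in (1e-7, 1e-4, 2e-4): # assumed parallel conductances, in siemens
    cs, rs, d = parallel_to_series(CP, gp, FREQ)
    w = 2 * math.pi * FREQ
    print(f"D = {d:.3g}: Cs/Cp = {cs / CP:.4f}, w*Rs*Cs = {w * rs * cs:.3g}")
```

For small D the two capacitances agree closely; as D approaches unity the series value becomes roughly twice the parallel value, so the choice of output mode is no longer a matter of convenience.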

If D, which is tan δ (see Fig. 17.2), is very small, D² will be smaller still, and Cs ≈ Cp. This is the reason why few people were concerned about choice of a measurement mode when relatively thick, high-Q capacitor samples were being characterized. As gate oxides thinned into the 5–20 nm range in the 1980s and 1990s, the increased oxide capacitance for a typical test structure reduced its reactance to a point where it became comparable to the series resistance of the test structure, arising from substrate spreading and contact resistances. Since in this situation the parallel conductance associated with the relatively thick oxide was still negligible in comparison to the susceptance of the oxide, one could match the equivalent circuit of the sample, the series combination of the oxide/interface capacitance and substrate series resistance, with that of the selected measurement mode, series capacitance/series resistance. An advantage of the series mode output for these samples is that if the resistance changes with applied voltage, it does not necessarily affect the measured series capacitance, which continues to reflect the oxide/interface capacitance of the sample. Thus we can achieve an accurate capacitance measurement even if the dissipation factor is relatively high. This is illustrated in Fig. 17.4. As the value of the series resistance changes, the length of the impedance vector changes, but its extension along the ordinate, −j1/(ωCs), remains the same. Moving into the present age of ultra-thin oxides and oxynitrides as well as high-k gate stacks, we find that for the oxides direct tunneling begins to dominate the dc conduction characteristics and that the same situation may apply to highly scaled high-k films. In these cases, while the capacitance increases only in inverse proportion to the film thickness or EOT, the conductance increases exponentially because of the strong thickness dependence of the tunneling mechanism.
When the two components become comparable, a parallel mode capacitance measurement output becomes the better choice, for the same

Fig. 17.4. Impedance plane with varying series resistance (vectors Rs, −j1/(ωCs), and Z)

reason that the series mode was preferable above; the physical structure being measured better fits the equivalent circuit of the output. So long as this equivalence is valid, good correlation will be obtained between measured Cp and Gp values and thin oxide/high-k capacitance and conductance. In practice, there are relatively few cases in which this correlation can be found. The reason for this is that to maintain this equivalence, the series conductance of the substrate/contacts/test leads must remain significantly higher (resistance lower) than the conductance and susceptance of the film being studied. Maintaining low series substrate resistance in MOS test structures has not been a design priority, because for earlier devices with thicker oxides it was not an important factor. For characterization of present ultra-thin oxide/high-k films, it has become of prime importance, not only for these capacitance measurements but also for meaningful characterization of dc conduction properties, to be treated below.

17.2.5 More Complex Equivalent Circuits

As capacitive and resistive impedances of ultra-thin oxide/high-k films shrink as a result of continuing scaling, eventually we reach a situation where they and the inherent series resistance of their test structures become comparable in magnitude, and no choice of a two-element equivalent circuit can provide an adequate description of the sample. In this situation it is incorrect and dangerous to attempt to associate the capacitance output reading of an LCR meter with the physical capacitance of the high-k film or the film/silicon interface. What are the consequences of making such an association? Consider the MOS C–V characteristic shown in Fig. 17.5a. It is measured on a 2.4 nm SiO2 film on an n-type substrate. The gate electrode is p+ polysilicon. The C–V curves are those that would result from parallel mode and series mode C–V measurement at 300 kHz.
Agreement between the two curves is good in the regime from the flat band voltage through depletion, but the curves depart significantly as they go into accumulation. Referring to the dissipation factor–voltage characteristics of the sample shown in Fig. 17.5b, we see that the discrepancy arises when the value of the dissipation factor approaches unity. It is a consequence of the relationship in (17.6), above. Since the equivalent oxide thickness is one of the primary parameters needed in the analysis of MOS C–V characteristics, and this parameter is most commonly extracted from the accumulation regime of the C–V curve, the situation presented to us in Fig. 17.5a is a critical shortcoming of the analysis. We need a more adequate model for our capacitor. Intuitively, we are led first to a three-element equivalent circuit for our test structure, including a parallel capacitance-resistance structure to represent the conductive high-k/thin oxide film, and a resistance in series with this combination to represent the substrate and contacts. A diagram of this equivalent circuit is shown in Fig. 17.6. Even though the insulator capacitance CO is the only capacitor in

Fig. 17.5. (a) Series and parallel mode MOS C–V characteristics of a conductive capacitor, showing C–V turnaround behavior. (b) Dissipation factor–voltage characteristic of the capacitor of (a). Both panels: area = 7.85E-5 cm², f = 300 kHz; capacitance [pF] in (a) and dissipation factor in (b) are plotted versus voltage [V]

the equivalent circuit, it is a mistake to assume that either the parallel or series equivalent capacitance indicated by the LCR meter is equal to CO. If one sets the LCR meter output to parallel mode, Henson [17.10] showed that the measured capacitance, Cm, is related to the insulator capacitance and conductance, CO and GO, and the series resistance Rs by the relationship

\[ C_m = \frac{C_O \left[ G_O^2 + (\omega C_O)^2 \right]}{\left\{ G_O + R_s \left[ G_O^2 + (\omega C_O)^2 \right] \right\}^2 + (\omega C_O)^2} \tag{17.7} \]

This rather forbidding expression can be simplified somewhat if we deal purely with conductances in the equivalent circuit, that is, converting the series resistance Rs to its equivalent conductance value, Gs. In these terms, the relationship to the parallel equivalent measured capacitance becomes

\[ C_m = \frac{G_s^2 C_O}{(G_s + G_O)^2 + (\omega C_O)^2}. \tag{17.8} \]


Fig. 17.6. Three-element equivalent circuit of a conductive film on a resistive substrate

The form here is slightly less complex, but it can be further simplified in many cases of practical interest, where it is found that the quantity (ωCO)² is considerably smaller than the square of the sum of the conductances, (Gs + GO)². Where this is true, the denominator of (17.8) can be simplified, yielding

\[ C_m \approx \frac{G_s^2}{(G_s + G_O)^2} C_O. \tag{17.9} \]
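Equations (17.8) and (17.9) can be compared against a direct evaluation of the admittance of the three-element circuit of Fig. 17.6. The sketch below is a minimal check under assumed values (CO = 100 pF, Rs = 1 kΩ, 100 kHz, and three oxide conductances); the function names are hypothetical.

```python
import math


def cm_exact(c_o, g_o, r_s, freq_hz):
    """Parallel-mode capacitance of Rs in series with (C_O parallel G_O)."""
    w = 2 * math.pi * freq_hz
    z = r_s + 1 / complex(g_o, w * c_o)   # total impedance of the network
    return (1 / z).imag / w               # Cm = Im(Y) / w


def cm_eq_17_8(c_o, g_o, r_s, freq_hz):
    """Eq. (17.8), written with Gs = 1 / Rs."""
    w = 2 * math.pi * freq_hz
    g_s = 1 / r_s
    return g_s**2 * c_o / ((g_s + g_o) ** 2 + (w * c_o) ** 2)


def cm_eq_17_9(c_o, g_o, r_s):
    """Eq. (17.9), valid when (w*C_O)^2 << (Gs + G_O)^2."""
    g_s = 1 / r_s
    return g_s**2 * c_o / (g_s + g_o) ** 2


# Assumed illustrative values: C_O = 100 pF, Rs = 1 kOhm, f = 100 kHz
C_O, R_S, F = 100e-12, 1000.0, 100e3
for g_o in (1e-7, 1e-4, 1e-3):            # G_O = 1e-3 S equals Gs here
    print(f"G_O = {g_o:.0e} S: exact {cm_exact(C_O, g_o, R_S, F) / C_O:.4f}, "
          f"(17.8) {cm_eq_17_8(C_O, g_o, R_S, F) / C_O:.4f}, "
          f"(17.9) {cm_eq_17_9(C_O, g_o, R_S) / C_O:.4f}")
```

With these assumed values the direct calculation and (17.8) coincide, and (17.9) gives exactly one quarter of CO when GO equals Gs.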

In this limiting case, the equation becomes simple enough to provide some insight into the nature of the dependence. First, of course, in the case of thick oxides where the oxide conductance is negligibly small, (17.9) reduces to Cm = CO, the familiar relationship from the past. As the oxide conductance increases, we see that the measured parallel capacitance begins to fall off, in a modified sort of “conductance divider,” such that when the oxide conductance becomes equal to the conductance associated with the series resistance, the measured parallel capacitance becomes 1/4 the value of the oxide capacitance. This is the origin of the “capacitance turnaround” widely reported for conductive films, and seen in the parallel capacitance-voltage characteristics of Fig. 17.5a. For all the simplification that we have been able to achieve in (17.9), our effort to obtain the capacitance of the insulating film, CO, is thwarted by the presence of three unknowns in that equation. We have seen that a capacitance measurement, or more properly an impedance or admittance measurement, yields us values of only two unknowns, associated with the real and imaginary parts of the impedance or admittance. In general, we will need another known relationship to allow us to solve for the third unknown element. There are several ways to address this problem. It has been shown [17.11] that for values of the series resistance having a specific relationship to the static conductance of the entire sample, the series resistance may be determined by iterative solutions using a comparison with modeled or simulated C–V dependences. For cases where Rs can be determined in this way, the other equivalent circuit elements Cc and GT (CO and GO in Fig. 17.6) can be determined algebraically using the relationships derived in [17.11]. A more general approach to obtaining additional parameters to allow us to solve for the additional unknown in the relationship, first demonstrated by


Kevin Yang and Chenming Hu [17.12], was to make an additional admittance measurement at a second frequency. We now have four parameters, more than enough for us to be able to solve for the three unknowns in (17.7)–(17.9). We should be aware at the outset that this approach has some inherent difficulties, arising from the underlying assumptions of electrical circuit analysis, requiring constant, frequency-invariant circuit elements. Our series resistance or conductance is often dominated by spreading resistance in the silicon substrate, an effect that can be non-linear, and insulator conductances are in general known to be frequency-dependent, so we should not be surprised if this approach is successful over only a limited range of parameter variations. Following the Yang–Hu approach, if admittance measurements are made at two frequencies, f1 and f2, with outputs of equivalent parallel capacitance and dissipation factor C1, C2 and D1, D2 respectively, we obtain the desired insulator capacitance CO as

\[ C_O = \frac{f_1^2 C_1 \left( 1 + D_1^2 \right) - f_2^2 C_2 \left( 1 + D_2^2 \right)}{f_1^2 - f_2^2} \tag{17.10} \]

An example of a set of measurements in which the Yang–Hu approach was successful in determining the values of all three circuit elements in our three-element equivalent circuit is given in Figs. 17.7–17.9. In Fig. 17.7 we have the MOS C–V curves of a sample measured first at 10 kHz and then at 100 kHz. Also shown is the CO calculation made using (17.10) above. The three curves very nearly overlie one another, suggesting that this is not a very aggressive sample (about 2.6 nm oxide), but it serves to demonstrate the approach. The dc current-voltage characteristic of this sample is shown in Fig. 17.8, along with the dc conductance, obtained by taking the derivative of this I–V curve numerically. For samples of this level of dc conductance, this effect dominates ac conductance mechanisms (e.g. well-to-well hopping) that may be present in the films.
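The two-frequency relationship (17.10) can be exercised on synthetic data. The sketch below simulates the parallel-mode readings (Cp, D) of an ideal three-element circuit at 10 kHz and 100 kHz, then applies (17.10) to recover CO. The circuit values and function names are assumptions made for illustration, and the circuit elements are taken to be strictly frequency-independent, which is precisely the idealization the text cautions about.

```python
import math


def parallel_reading(c_o, g_o, r_s, freq_hz):
    """Simulated parallel-mode output (Cp, D) for the three-element circuit."""
    w = 2 * math.pi * freq_hz
    y = 1 / (r_s + 1 / complex(g_o, w * c_o))   # admittance seen by the meter
    cp = y.imag / w
    d = y.real / y.imag                         # D = Gp / (w * Cp)
    return cp, d


def two_frequency_capacitance(f1, c1, d1, f2, c2, d2):
    """Insulator capacitance from two (Cp, D) readings, Eq. (17.10)."""
    num = f1**2 * c1 * (1 + d1**2) - f2**2 * c2 * (1 + d2**2)
    return num / (f1**2 - f2**2)


# Assumed "true" circuit: C_O = 100 pF, G_O = 10 uS, Rs = 1 kOhm
C_TRUE, G_TRUE, R_TRUE = 100e-12, 1e-5, 1000.0
F1, F2 = 10e3, 100e3

c1, d1 = parallel_reading(C_TRUE, G_TRUE, R_TRUE, F1)
c2, d2 = parallel_reading(C_TRUE, G_TRUE, R_TRUE, F2)
c_rec = two_frequency_capacitance(F1, c1, d1, F2, c2, d2)

print(f"Cp at f1 = {c1 * 1e12:.2f} pF (D = {d1:.3f})")
print(f"Cp at f2 = {c2 * 1e12:.2f} pF (D = {d2:.3f})")
print(f"recovered C_O = {c_rec * 1e12:.2f} pF")
```

Neither single-frequency reading equals CO, but their combination does in this ideal case, because Cp(1 + D²) is the series capacitance and the series resistance drops out of the imaginary part of the impedance.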
The series and parallel resistances of the three-element model, Rs = 1/Gs and Rp = 1/GO, derived from the Yang–Hu formalism are plotted versus voltage in Fig. 17.9, along with Rdc, the inverse of the dc conductance Gdc derived from the I–V plot of Fig. 17.8. While the Yang–Hu analysis does not calculate meaningful values in the depletion and inversion regimes of the C–V dependence, in accumulation a well-behaved voltage-independent series resistance of about 1000 ohms is derived, and a parallel oxide resistance decreasing exponentially from 10⁷ to 10⁵ ohms with increasing voltage is observed. The excellent agreement between this latter parameter, which approximates the total resistance of the sample at dc conditions, and the Rdc characteristic derived from the dc conductance lends credibility to the overall analysis. While the Yang–Hu analysis provided satisfying results in the example above, there are many instances when it does not. Some reasons for this have been stated above. In order to improve the usefulness of this approach, Nara et al. [17.13] have suggested guidelines for improving the performance of the dual frequency technique. Among the suggestions are to maximize the frequency difference of the two frequencies; make f1 0 across the whole dielectric. These mechanisms are all considered elastic as the total energy and k|| are conserved during transport through the dielectric. The image force is lower in higher-k materials and reduced by the quantum mechanical repulsion of both bound and unbound carriers from the interface. Thus, its effect on barrier lowering is not considered. Finally, the total gate current density, Jg, is calculated as

\[ J_g = J_{\mathrm{conduction}} + J_{\mathrm{valence}}, \tag{18.4} \]

where Jconduction is from the conduction-band longitudinal valleys. With k|| conservation, the effective barrier for transverse valleys is higher and tunneling from them is negligible. Each component Ji, where i designates the band, is calculated separately for unbound states and bound states. For unbound states in the conduction band

\[ J_{i,\mathrm{unbound}} = \frac{2q}{(2\pi)^3 \hbar} \int_{E_{z,\min,c}}^{\infty} dE_z \int_{k_\parallel = k_{\parallel,\min}}^{\infty} dk_x \, dk_y \, T(E, k_\parallel) \left( f_{FD,\mathrm{gate}} - f_{FD,\mathrm{Si}} \right), \tag{18.5} \]

and for unbound states in the valence band

\[ J_{i,\mathrm{unbound}} = \frac{2q}{(2\pi)^3 \hbar} \int_{-\infty}^{E_{z,\max,v}} dE_z \int_{k_\parallel = k_{\parallel,\min}}^{\infty} dk_x \, dk_y \, T(E, k_\parallel) \left( f_{FD,\mathrm{gate}} - f_{FD,\mathrm{Si}} \right), \tag{18.6} \]

where fFD,gate and fFD,Si are the Fermi–Dirac distribution functions for the gate and silicon substrate, respectively; Ez,min,c (Ez,max,v) is the lowest (highest) unbound state in the conduction (valence) band; q is the electron charge. For bound states,


Y.-Y. Fan et al.

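\[ J_{i,\mathrm{bound}} = \frac{2q}{(2\pi)^2} \sum_{\mu} \tau_\mu^{-1} \int_{k_\parallel = k_{\parallel,\min}}^{\infty} dk_x \, dk_y \, T(E, k_\parallel) \left( f_{FD,\mathrm{gate}} - f_{FD,\mathrm{Si}} \right), \tag{18.7} \]

where τµ⁻¹ is approximated by the surface impact frequency for a 2D subband µ [18.35],

\[ \tau_\mu = \hbar \oint \frac{\partial k_z^{(\mu)}}{\partial E_z} \, dz, \tag{18.8} \]

with

\[ k_z^{(\mu)} = \frac{1}{\hbar} \sqrt{2 m_z \left[ E_\mu - E_{\mathrm{band\,min}}(z) \right]}, \tag{18.9} \]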

where the contour integral is around the quantum well formed in the channel; Eµ is the minimum energy of the µth subband, Eband min the band edge in which the subbands formed, and mz the effective mass along the tunneling direction z. mz = 0.2me and mz = 0.9me are used for the conduction-band longitudinal and transverse valleys, respectively, where me is the free electron mass. Because the effective mass approximation is not very accurate for the valence band, mz then is treated as a subband-independent adjustable parameter. From comparison of gate current simulation with experiment for a SiO2–PMOSFET in inversion, it is found that mz = 0.10–0.25 me gives good agreement (Fig. 18.3). The SiO2 thickness is extracted by comparison of Cg–Vg simulation with experiment in accumulation (Fig. 18.3). Similar agreement is also achieved for a SiO2–NMOSFET, which was fabricated on Si(100) (Fig. 18.4). The gate current simulation agrees well with experiment both in inversion and accumulation. The band gap of SiO2 and its conduction band offset with silicon are 9.0 eV and 3.15 eV, respectively. The band-edge effective masses are adjustable parameters but remain the same for the silicon conduction band and valence band components, respectively, whether for substrate-injected or gate-injected currents.

Gate Current Versus Voltage/Temperature Behavior

Near the flat-band region, Vg ∼ −0.8 V in Fig. 18.4, the quantum well is wide and shallow. Quantization effects due to the quantum well formed by the band edge become less important. Thus, a clear division between the bound and unbound states can cause significant errors in gate current calculation, which result in abnormal Ig–Vg behavior in simulation. The error may originate from either the impact frequency or the supply of carriers. Below flat-band, only energetic carriers can tunnel elastically, greatly reducing the calculated current. In practice, other mechanisms may dominate.
A possible mechanism might be related to the interface states, which is not considered in the model. Near flat-band, the carrier density is much lower than in accumulation or

18 High-k Gate Dielectric Materials

Fig. 18.3. Comparison of gate current and gate capacitance simulation with the experimental data for a SiO2–PMOSFET: EOT ∼ 20 Å is found from Cg–Vg simulation. Area = 10⁻⁴ cm². Panels plot Cg (F/cm²) and Jg (A/cm²) versus Vg (V); symbols: experiment, lines: simulation (mz = 0.25me and mz = 0.10me)

Fig. 18.4. Comparison of gate current and gate capacitance simulation with the experimental data for a SiO2–NMOSFET: EOT ∼ 20 Å is found from Cg–Vg simulation. Area = 10⁻⁴ cm². Panels plot Cg (F/cm²) and Jg (A/cm²) versus Vg (V); symbols: experiment, lines: simulation

inversion. The trapped charges can contribute appreciably to the total gate currents; empty traps may assist the carrier tunneling into/out of the silicon substrate. Contributions from the overlap regions with the source and drain may also be important in that region.

Gate current characteristics at different temperatures are shown in Fig. 18.5. NMOSFETs with single-layer dielectric of ε = 3.9 with different thicknesses and barrier heights are simulated. The flat band voltages of these devices are ∼ −0.8 V. When the potential barrier of the dielectric is high and/or thin, direct tunneling prevails. However, when the potential barrier is low, a transition from direct to FN tunneling is seen. With a potential barrier 1.0 eV high and 40 Å wide, such a transition occurs around Vg = −2 V in accumulation and Vg = 0.8 V in inversion. More energetic carriers see a relatively small barrier. Therefore, the hotter the carrier distribution, the greater the tunneling. This field-assisted thermionic effect is particularly significant when the barrier is low and/or thick (Fig. 18.5). Polarity effects are also seen in the temperature dependence of gate current, which is stronger in accumulation than in inversion. Such effects are due to charge supply and asymmetry of the gate stack.

Fig. 18.5. Simulated Ig–Vg at 25 °C and 125 °C for n-channel MOSFETs with different dielectric thicknesses. Single layer dielectric with dielectric constant of 3.9 is assumed for the devices. Curves of Jg (A/cm²) versus Vg (V) are shown for tox = 40 Å with ∆Ec = 0.5, 1.0, and 2.0 eV, and for tox = 20 Å with ∆Ec = 2.0 and 3.2 eV

18.2.2 ZrO2 and HfO2 NMOSCAP Cg, Ig–Vg Analysis

Different materials such as ZrO2 and HfO2 have shown promise for transistor applications [18.12, 18.20, 18.30]. Owing to the complex fabrication process of transistors, most studies have been conducted on MOSCAPs. The uncertainty of the material properties, especially for the thin films, imposes another difficulty in terms of understanding the experimental results. In this work, through Cg–Vg and Ig–Vg simulations, the material properties of ZrO2 metal-gate NMOSCAPs were extracted in the accumulation region.

Cg Versus Vg

Two samples of different thicknesses of ZrO2 were fabricated, labeled as A and B. According to TEM analysis, an interfacial Zr-silicate layer exists between the ZrO2 layer and the silicon substrate, as reported in [18.39]. The physical thickness of each layer was found as follows. For sample A, tZrO2 = 31–39 Å and tint = 6–10 Å, and for sample B, tZrO2 = 36–40 Å and tint = 7–9 Å [18.6], where tZrO2 and tint are physical thicknesses of the

Fig. 18.6. Gate currents of Sample A at different temperatures, with tZrO2 + tint = 39.6 Å. Also shown is the gate current measured at 25°C after electrical and thermal stresses. [Plot: Jg (A/cm²) versus Vg (V); curves for 25°C (fresh), 50°C, 75°C, 100°C, 125°C, and 25°C (stressed); flat band marked]

ZrO2 and interfacial silicate layers, respectively. Because of the thickness variations between devices, the experimental data were compared with simulation for the same device. The EOT and other device structure parameters, such as substrate doping concentration and metal work function, are obtained by comparing the Cg–Vg simulation with experimental data. If abrupt interfaces are assumed, the EOT of the dielectric stack can be calculated by

$$ \mathrm{EOT} = t_{\mathrm{ZrO_2}}\,\frac{\varepsilon_{\mathrm{SiO_2}}}{\varepsilon_{\mathrm{ZrO_2}}} + t_{\mathrm{int}}\,\frac{\varepsilon_{\mathrm{SiO_2}}}{\varepsilon_{\mathrm{int}}} , \tag{18.10} $$

where εSiO2, εZrO2, and εint are the dielectric constants of SiO2, ZrO2, and Zr-silicate, respectively; εSiO2 = 3.9 and εZrO2 = 20. From the ranges of dielectric-layer thicknesses, and to satisfy (18.10) for both samples A and B, εint is found to be 5.5. The physical thickness of each layer is determined as tZrO2 = 33.6 Å, tint = 6.0 Å for sample A, and tZrO2 = 38.5 Å, tint = 8.6 Å for sample B. These parameters, along with the ones obtained from Cg–Vg simulation, are used later for gate current simulation.

Ig Versus Vg/Temperature

The temperature dependence of the gate currents is shown in Figs. 18.6 and 18.7. Sample A, with tZrO2 + tint = 39.6 Å, shows little temperature dependence in accumulation but significant temperature dependence in inversion. Sample B, with tZrO2 + tint = 47.1 Å, shows a weak temperature dependence in accumulation. In principle, the temperature dependence could arise from the transport mechanism, from the charge distribution in the system, or

Fig. 18.7. Gate currents of Sample B at different temperatures, with tZrO2 + tint = 47.1 Å. [Plot: Jg (A/cm²) versus Vg (V); curves for 25°C, 50°C, 75°C, and 100°C; flat band marked]

both. In an NMOSCAP, electrons in the conduction band are minority carriers in the silicon channel. Without a source/drain as an external supply, their concentration cannot be maintained at the thermodynamic equilibrium value. As the temperature is increased, substantially more minority carriers are available and, as a result, the gate currents show strong temperature dependence in inversion. This strong temperature dependence contrasts markedly with the much weaker temperature dependence of the field-assisted thermionic current in accumulation shown in Fig. 18.5. Thus it is attributed to the temperature dependence of the charge distribution. The strong temperature dependence in inversion also indicates that the gate current is primarily from the conduction band. Yamaguchi et al. reported that the band gaps of the ZrO2 and Zr-silicate layers are 5.7 eV and 4.5 eV, respectively, from XPS analysis of sputter-deposited film [18.39]. Using these values in the Franz dispersion, the gate current simulation agrees well with experiment for both samples (Fig. 18.8). The conduction band offsets of ZrO2 and Zr-silicate are found to be 1.45 eV and 1.0 eV, respectively. The band-edge effective masses are mc = mv = 0.35me for both ZrO2 and Zr-silicate. Yamaguchi et al. assumed Frenkel–Poole conduction through the dielectric stack and reported the conduction band offsets with silicon of the ZrO2 and Zr-silicate layers as 1.5 eV and 1.0 eV, respectively. In Frenkel–Poole conduction [18.11], the conducting electrons are thermally emitted out of traps in the dielectric. The trapping potential barrier is reduced if an electric field is applied. As a result, the current is increased by a factor proportional to exp(∆Ei/kBT), where kBT is the thermal energy and ∆Ei is the decrease of the trapping potential barrier due to the electric field. Thus, a stronger field/voltage dependence of the current is to be expected at low temperatures.
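As a rough numerical illustration of the Frenkel–Poole argument, the sketch below evaluates the enhancement factor exp(∆Ei/kBT) at two temperatures. The barrier lowering of 0.1 eV is an arbitrary value chosen for illustration, not one extracted from the samples, and the function name is hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def fp_enhancement(delta_e_ev, temp_c):
    """Frenkel-Poole factor exp(dE_i / (k_B T)) for a barrier lowering dE_i in eV."""
    temp_k = temp_c + 273.15
    return math.exp(delta_e_ev / (K_B_EV * temp_k))


# Assumed, illustrative barrier lowering; not a value taken from the text.
DELTA_E = 0.1  # eV

for t in (25.0, 125.0):
    print(f"T = {t:5.1f} C: factor = {fp_enhancement(DELTA_E, t):.1f}")
```

For the same barrier lowering the factor is larger at 25°C than at 125°C, which is the stronger low-temperature field dependence expected for this mechanism.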
Such an enhanced field dependence at low temperature was not observed for the gate currents of ZrO2 NMOSCAPs in our experiments. We believe, therefore, that the temperature dependence of transport observed in sample B is not primarily due to the Frenkel–Poole mechanism.

Fig. 18.8. Comparison of gate current and gate capacitance simulations with the experimental data for samples A and B. [Plot: Jg (A/cm²) and Cg (F/cm²) versus Vg (V); symbols: experiment, lines: simulation; flat band marked]

Although simulation and experiment agree quite well, it is not conclusive that direct tunneling or FN tunneling, the temperature-independent conduction mechanisms modeled in this work, are the primary transport mechanisms. This is due to uncertainties in the device structural information, especially that of the dielectric stack. Other temperature-insensitive or weakly temperature-dependent transport mechanisms such as trap-assisted tunneling remain possibilities. A comparison between simulation and experiment of gate currents at different temperatures is shown in Fig. 18.9. In simulation, the gate current of sample B has a stronger temperature dependence than that of sample A, consistent with the model for thicker barriers (Fig. 18.5). However, the temperature dependence predicted by the model is weaker than that seen in experiment. Thus, additional temperature-dependent mechanisms may be involved, but their influence is weak. Such thermal processes can be related either to the transport through the dielectric or to the supply of carriers. If they are related to the transport, they should be weakly dependent on the electric field in the dielectric. At low gate voltages, oscillations of the gate currents were observed near flat band and in the depletion region. Kinks in Cg–Vg were also observed. These might be caused by interface states between regions (Gate/Dielectric or Dielectric/Silicon). A high density of interface states might significantly affect the gate capacitance and gate current by trapping or de-trapping electrons in the absence of inversion or accumulation layers.

Fig. 18.9. Comparison of simulation and experiment for gate currents at different temperatures for samples A and B. [Plot: Jg (A/cm²) versus Vg (V); experiment and simulation at 25°C and 100°C]

Fig. 18.10. Comparison of gate current simulation with experimental data for a TaN/HfO2/p-Si capacitor. [Plot: Jg (A/cm²) versus Vg (V); symbols: experiment, line: simulation; flat band marked]

The comparison of gate current simulation with experiment for a TaN/HfO2/p-Si capacitor is shown in Fig. 18.10. The simulation agrees well with experiment in accumulation. The EOT, obtained from Cg–Vg simulation, is ∼ 10.5 Å. Eg = 5.7 eV, ∆Ec = 1.6 eV, and mc = mv = 0.32me are used in the Franz dispersion for the HfO2 layer, and Eg = 4.5 eV, ∆Ec = 1.2 eV, and mc = mv = 0.5me for the interfacial layer.


18.2.3 Conclusions for Fundamental Issues on Gate Capacitance and Current Modeling

A gate current model has been developed, taking into account the quantum confinement effects in the silicon channel, direct and Fowler–Nordheim tunneling, and thermionic emission transport through the gate dielectric. Both gate-injected and substrate-injected currents are modeled for the silicon conduction-band and valence-band components. The gate current is determined by the energy dispersion relation in each region of the Gate-Dielectric-Silicon system and by the available carriers and empty states on either side of the dielectric. The energy dispersion in the dielectric band gap is approximated by the Franz dispersion. The subband structures and the carrier distribution in energy and position in the silicon channel are obtained by solving the Schrödinger and Poisson equations self-consistently, both for gate capacitance and gate current calculations. A self-consistent Cg, Ig–Vg model is thus established. This model was validated using SiO2 dielectric devices, and good agreement with experiment was achieved.

Device structural information for ZrO2 NMOSCAPs is extracted in accumulation using Cg–Vg and Ig–Vg simulation. The simulation agrees well with experiment for different thicknesses of the dielectric stack, using consistent parameters. The extracted band gaps and band offsets with silicon are comparable with those that have been reported. The gate current simulation of an HfO2 NMOSCAP also agrees well with experiment.

Characteristics of the gate current transport mechanism of ZrO2 NMOSCAPs were studied. The temperature-dependence study shows that the gate current comes primarily from the silicon conduction band, and that tunneling is the most likely primary transport mechanism. However, other tunneling processes such as trap-assisted tunneling may be possible. Other temperature-dependent processes are also not completely excluded, but their effects are weak.

Interface states between different regions might significantly affect the gate capacitance and gate current at low voltages. Kinks in the gate capacitance-voltage characteristics and oscillations of the gate currents were observed. The gate current does not change much after the device is stressed electrically and thermally. This indicates good film quality and that few charges or traps are created in the dielectric stack as a result of the stress.
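The layer thicknesses and dielectric constants quoted for samples A and B in Sect. 18.2.2 can be inserted into (18.10) as a quick consistency check. The following is a minimal sketch: the function name and data layout are illustrative assumptions, and only the numerical inputs come from the text.

```python
EPS_SIO2 = 3.9   # dielectric constant of SiO2 (from the text)
EPS_ZRO2 = 20.0  # dielectric constant of ZrO2 (from the text)
EPS_INT = 5.5    # extracted dielectric constant of the Zr-silicate layer


def stack_eot(layers, eps_ref=EPS_SIO2):
    """EOT of a layered dielectric per (18.10), assuming abrupt interfaces.

    layers: iterable of (physical thickness in angstrom, dielectric constant).
    """
    return sum(t * eps_ref / eps for t, eps in layers)


# (ZrO2 layer, interfacial layer) for each sample, thicknesses in angstrom.
samples = {
    "A": [(33.6, EPS_ZRO2), (6.0, EPS_INT)],
    "B": [(38.5, EPS_ZRO2), (8.6, EPS_INT)],
}

for name, layers in samples.items():
    physical = sum(t for t, _ in layers)
    print(f"Sample {name}: physical = {physical:.1f} A, EOT = {stack_eot(layers):.2f} A")
```

With these inputs the physical thicknesses are 39.6 Å and 47.1 Å, matching the values quoted with Figs. 18.6 and 18.7, and the corresponding EOTs are roughly 10.8 Å and 13.6 Å.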

18.3 Wave Function Penetration Effect Issues

As device dimensions are reduced aggressively, new physics or effects invariably become relevant. When the oxide thickness is scaled below ∼ 1.5 nm, the effect of wave function penetration into the gate dielectric becomes important. In this section, a study of the gate capacitance is performed accounting for wave function penetration into the gate dielectric [18.28, 18.29]. Such effects will be discussed in the context of scaling, the introduction of high-K material, and poly-depletion effects. A qualitative trend study of the high-K gate current, which considers the wave function penetration effects, is also included.

18.3.1 Quantum Transmitting Boundary Method (QTBM)

A graphical description of the problem being solved is shown in Fig. 18.11, which shows the energy band diagram of the general tunneling problem along with the electron wave function penetrating into the gate electrode. The wave function must go to zero at some point deep inside the substrate, while on the gate electrode side the wave function extends all the way. Thus the solution of a quasi-bound system is sought. (A quasi-bound system is one in which the wave function does not go to zero at one of the boundaries.) To handle the open boundary condition at the gate dielectric/gate electrode interface, the quantum transmitting boundary method (QTBM) [18.7, 18.21] is used to connect the wave function on either side. For an isolated system, the bound states satisfy the time-independent Schrödinger equation. In discretised numerical form, the Schrödinger equation becomes a matrix eigenvalue equation. Because the wave function is zero at the boundaries, the Hamiltonian matrix H is Hermitian, and the system has bound states only. The eigenvalues of this system are real. If the system is quasi-bound, however, the Hamiltonian matrix H of the truncated problem is no longer Hermitian, and the system has quasi-bound states with complex eigenenergies. Strictly, the Hamiltonian of the full system is Hermitian, but for a quasi-bound system it is not possible to write the Hamiltonian over the infinite region. In such cases, techniques such as QTBM are used to solve the problem over a part of the domain. This introduces complex matrix elements, which result in complex eigenenergies. The physical meaning of the imaginary part of

Fig. 18.11. Energy band diagram for a general tunneling problem along with the electron wave function. [Diagram: gate electrode, Si substrate, and electron wave function, with regions I, II, and III labeled]

the eigenenergy can be understood as the decay of the probability of finding an electron inside the domain of description. Thus the real part of the complex eigenenergy gives the energy of resonance [18.19], and its imaginary part Γ is related to the lifetime as τ = ħ/(2Γ) [18.1]. A more detailed description of the theoretical background for this calculation can be found in [18.34]. The Schrödinger equation was discretised using a finite difference technique. The finite difference form of the Schrödinger equation can be written as

$$ -\frac{1}{m_{j+1/2}\,h_j\,(h_j + h_{j-1})}\,\Psi_{j+1} + \frac{m_{j-1/2}\,h_{j-1} + m_{j+1/2}\,h_j}{m_{j+1/2}\,m_{j-1/2}\,h_j\,h_{j-1}\,(h_j + h_{j-1})}\,\Psi_j - \frac{1}{m_{j-1/2}\,h_{j-1}\,(h_j + h_{j-1})}\,\Psi_{j-1} = (-V_j + E)\,\Psi_j , \tag{18.11} $$

where hj is the grid spacing between xj+1 and xj; mj+1/2 is the electron effective mass averaged over hj; Vj and Ψj are the conduction band edge and wave function values at node j; and E is the eigenenergy. To make the above infinite matrix solvable, it is necessary to truncate it by introducing appropriate boundary conditions at the left and right boundaries. At the left boundary (near the oxide dielectric/gate electrode interface), the general formalism of QTBM is used to connect the general plane wave solution in region I and the solution to the wave equation in region II (Fig. 18.11). At the right boundary the wave function is assumed to go to zero. The solution of the Schrödinger equation in the asymptotic region can be written as

$$ \Psi = \alpha(E)\,e^{ikx} + e^{-ikx}, \qquad x < 0, \tag{18.12} $$

where |α(E)|⁻² gives the reflection coefficient of the barrier/quantum-well system. For real E, |α(E)| = 1. For an electron that attempts to escape from the quantum well, however, the physical boundary condition at x < 0 is given by the left-going plane wave. Therefore, the asymptotic wave function has α = 0. Mathematically, a complex boundary condition usually leads to complex, but discrete, eigenenergies Ei. The physical meaning of Im(Ei), the imaginary part of these eigenenergies, is the decay with time of the probability that electrons can be found in the quantum well. Thus, the tunneling probability of each subband can be obtained once Im(Ei) is known. To incorporate the complex boundary condition in the Schrödinger equation, the following transformation scheme, using a nonuniform mesh, was adopted:

$$ \tilde{\Psi}_j = \sqrt{h_j + h_{j-1}}\;\Psi_j , \tag{18.13} $$

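and the original equation becomes

$$ \sum_k \tilde{H}_{jk}\,\tilde{\Psi}_k = E\,\tilde{\Psi}_j , \tag{18.14} $$

where

$$ \tilde{H}_{jk} = \tilde{H}_{kj} = \begin{cases} -\dfrac{1}{m_{j+1/2}\,h_j\,(h_j + h_{j-1})^{1/2}\,(h_j + h_{j+1})^{1/2}} , & k = j+1, \\[2ex] \dfrac{m_{j-1/2}\,h_{j-1} + m_{j+1/2}\,h_j}{m_{j+1/2}\,m_{j-1/2}\,h_j\,h_{j-1}\,(h_j + h_{j-1})} + V_j , & k = j, \\[2ex] -\dfrac{1}{m_{j-1/2}\,h_{j-1}\,(h_j + h_{j-1})^{1/2}\,(h_{j-1} + h_{j-2})^{1/2}} , & k = j-1, \\[2ex] 0 , & \text{otherwise}. \end{cases} \tag{18.15} $$

Since Ψ ∝ e^{−ikx} for x < 0, we have

$$ \tilde{\Psi}_1 = \sqrt{\frac{h_0 + h_1}{h_1 + h_2}}\;e^{ikh_1}\,\tilde{\Psi}_2 , \tag{18.16} $$

and

$$ \left( \sqrt{\frac{h_0 + h_1}{h_1 + h_2}}\;e^{ikh_1}\,\tilde{H}_{21} + \tilde{H}_{22} \right) \tilde{\Psi}_2 + \tilde{H}_{23}\,\tilde{\Psi}_3 = E\,\tilde{\Psi}_2 . \tag{18.17} $$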
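The change of variables (18.13) can be checked numerically: applying it to the nonsymmetric finite-difference operator of (18.11) should give the symmetric matrix of (18.15). The sketch below does this in pure Python for interior nodes with closed (Ψ = 0) boundaries on both sides, so the open-boundary term is not included. The function names, the mesh, and the numerical values are assumptions made for illustration, and the units follow (18.11), in which ħ does not appear explicitly.

```python
import math


def fd_operator(h, m, v):
    """Finite-difference operator of (18.11) on interior nodes 1..n.

    h[j]: spacing between nodes j and j+1 (length n+1)
    m[j]: effective mass on that interval, i.e. m_{j+1/2} (length n+1)
    v[j]: band edge at node j (length n+2; the two end nodes are unused)
    Returns an n x n list of lists; row/column r corresponds to node r+1.
    """
    n = len(h) - 1
    a = [[0.0] * n for _ in range(n)]
    for j in range(1, n + 1):
        span = h[j] + h[j - 1]
        up = 1.0 / (m[j] * h[j] * span)            # coupling to node j+1
        down = 1.0 / (m[j - 1] * h[j - 1] * span)  # coupling to node j-1
        a[j - 1][j - 1] = up + down + v[j]
        if j < n:
            a[j - 1][j] = -up
        if j > 1:
            a[j - 1][j - 2] = -down
    return a


def symmetrize(a, h):
    """Apply (18.13): H~ = S A S^-1 with S = diag(sqrt(h_j + h_{j-1}))."""
    n = len(a)
    s = [math.sqrt(h[j] + h[j - 1]) for j in range(1, n + 1)]
    return [[s[r] * a[r][c] / s[c] for c in range(n)] for r in range(n)]


def max_asymmetry(mat):
    """Largest |M[r][c] - M[c][r]| over the matrix."""
    n = len(mat)
    return max(abs(mat[r][c] - mat[c][r]) for r in range(n) for c in range(n))


# Illustrative nonuniform mesh with a mass step (arbitrary values).
h = [0.5, 0.7, 1.0, 1.3, 0.9]
m = [0.5, 0.5, 0.2, 0.2, 0.2]
v = [0.0, 3.0, 3.0, 0.0, 0.0, 0.0]

a = fd_operator(h, m, v)
h_tilde = symmetrize(a, h)
print("asymmetry before transformation:", max_asymmetry(a))
print("asymmetry after transformation: ", max_asymmetry(h_tilde))
```

On a uniform mesh the scaling factors are all equal and the two matrices coincide; the transformation only matters when the grid spacing varies.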

The complex boundary condition was thus introduced by truncating the infinite matrix (18.14) at Ψ̃2 while adding to H̃22 the first term in (18.17):

$$ -\frac{e^{ikh_1}}{m_{3/2}\,h_1\,(h_1 + h_2)} \;\xrightarrow{\;h_1 \to 0\;}\; -\frac{1}{m_{3/2}\,h_1\,h_2} - \frac{ik}{m_{3/2}\,h_2} . \tag{18.18} $$

While the first term on the right in the above equation is of the same order of magnitude (O(h⁻²)) as the rest of the diagonal terms in H̃, the second term is only of order O(h⁻¹) and can be treated as a perturbation. The imaginary part of the i-th eigenenergy is, to first order,

$$ \mathrm{Im}(E_i) = \tilde{\Psi}_2^{(i)}\;\frac{-\sin(kh_1)}{m_{3/2}\,h_1\,(h_1 + h_2)}\;\tilde{\Psi}_2^{(i)} , \tag{18.19} $$

where Ψ̃(i) is the i-th eigenvector of (18.14) with H̃22 replaced by

$$ \tilde{H}_{22} + \frac{-\cos(kh_1)}{m_{3/2}\,h_1\,(h_1 + h_2)} , \tag{18.20} $$

and k is the wave vector corresponding to Ei. Thus, the imaginary part of the eigenenergy is calculated as a perturbation. Once the real and imaginary parts of the eigenenergy are calculated, the number of electrons in each subband and their respective lifetimes can be calculated. The total tunneling current can then be calculated using the following equation:

$$ J = \frac{q}{\hbar} \sum_i n_i\,\Gamma_i , \tag{18.21} $$

where the sum runs over the subbands i, and ni is the total number of electrons in the ith subband. Fermi–Dirac statistics were used in computing ni.
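Equation (18.21) and the lifetime relation τ = ħ/(2Γ) can be evaluated directly once the subband populations and widths are known. The sketch below does so for made-up inputs: the subband densities and Γ values are placeholders, not results from this chapter, and the function names are hypothetical.

```python
HBAR = 1.054571817e-34  # reduced Planck constant, J s
Q = 1.602176634e-19     # elementary charge in C (also joules per eV)


def lifetime(gamma_ev):
    """Quasi-bound state lifetime tau = hbar / (2 Gamma); Gamma in eV, tau in s."""
    return HBAR / (2.0 * gamma_ev * Q)


def tunneling_current(n_per_cm2, gamma_ev):
    """J = (q / hbar) * sum_i n_i Gamma_i per (18.21), in A/cm^2.

    n_per_cm2: subband sheet densities (cm^-2)
    gamma_ev:  imaginary parts of the subband eigenenergies (eV)
    """
    return (Q / HBAR) * sum(n * g * Q for n, g in zip(n_per_cm2, gamma_ev))


# Placeholder inputs for two subbands (illustrative only).
n_i = [8.0e12, 2.0e12]        # cm^-2
gamma_i = [1.0e-10, 5.0e-10]  # eV

print("lifetimes (s):", [lifetime(g) for g in gamma_i])
print("J (A/cm^2):   ", tunneling_current(n_i, gamma_i))
```

Because J is linear in each Γi, a subband with a small population can still dominate the current if its state is short-lived.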

18.3.2 Effects on Quantization

The eigenenergy of the lowest subband in the substrate was calculated with and without wave function penetration into the gate dielectric. The eigenenergies were then compared by varying the oxide thickness at a fixed gate bias, as shown in Fig. 18.12. There is a significant difference when the oxide thickness is small. This is because the capacitance is higher, owing to the shift of the charge centroid closer to the interface, when the wave function is allowed to penetrate into the gate electrode. Consequently the induced charge is different for the two cases being considered, and hence so are the eigenenergies. A more detailed discussion of the effects of wave function penetration on the gate capacitance is given later in this section.

Fig. 18.12. Comparison of the eigenenergies calculated with open and closed boundary conditions. [Plot: difference in eigenenergy (meV) versus EOT (Å)]

Fig. 18.13. Oscillations of gate current can be seen for the gate stack dielectric, but not for the pure oxide case. [Plot: gate current (A/cm²) versus Vg (V); curves for tox = 2 nm and for a stack with εhighk = 23.4, thighk = 1.8 nm, tox = 1.7 nm]
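The stack in the legend of Fig. 18.13 can be related to the 2 nm pure-oxide curve through the usual EOT scaling by the ratio of dielectric constants. The sketch below is a hypothetical helper that returns the physical high-K thickness needed to reach a target EOT on top of a given oxide layer; it assumes abrupt interfaces and εSiO2 = 3.9.

```python
EPS_SIO2 = 3.9  # assumed dielectric constant of SiO2


def highk_thickness_for_eot(eot_target, t_oxide, eps_highk, eps_ref=EPS_SIO2):
    """Physical high-K thickness t_hk with t_oxide + t_hk * eps_ref / eps_hk = EOT.

    All thicknesses share one unit (nm in the example below).
    """
    if eot_target < t_oxide:
        raise ValueError("oxide layer alone already exceeds the target EOT")
    return (eot_target - t_oxide) * eps_highk / eps_ref


# Values from the legend of Fig. 18.13: 1.7 nm oxide, eps_highk = 23.4.
t_hk = highk_thickness_for_eot(eot_target=2.0, t_oxide=1.7, eps_highk=23.4)
print(f"high-K thickness for 2.0 nm EOT: {t_hk:.2f} nm")
```

The result reproduces the 1.8 nm high-K thickness in the legend, so the stack is physically 3.5 nm thick while matching the 2 nm oxide in EOT.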

18.3.3 High-k Tunneling Gate Currents Trend Study

In Fig. 18.13, the tunneling currents through an oxide of thickness 20 Å and a gate stack of EOT 20 Å, comprising 17 Å of oxide and 18 Å of high-K, are compared, assuming a simple parabolic band structure. The tunneling current through the oxide dielectric is much higher than that through the stacked dielectric at low voltages. At higher gate biases, however, the tunneling current through the gate stack shows oscillations. This phenomenon occurs when the tunneling electron emerges into the conduction band of the dielectric and undergoes partial reflection at a potential step, which functions as a reflecting interface. In this case the reflection occurs at the interface between the high-K layer and the gate electrode. Similar oscillations have been observed even for a pure oxide dielectric [18.23, 18.25]. In reality, these oscillations are hard to observe, since phase coherence is required between the electrons at the interface [18.8]. The phase coherence is usually destroyed by phonon collisions while the electron travels in the conduction band of the dielectric and by surface roughness of the electrode at the interface. Hence a simulation that accounts for scattering in the material is needed to understand such ballistic transport and quantum interference effects realistically. A simulation of the currents through various dielectrics was performed at 1 V for an EOT of 1 nm. The simulation could be performed only for those di-

[Plot: gate current (A/cm²) for 1.0 nm EOT of single-layer dielectric and for 0.5 nm oxide + 0.5 nm EOT of high-K dielectric; legend: SiO2, φb = 3.1 eV; Si3N4, φb = 2.0 eV; Al2O3, φb = 3.5 eV; ZrO2, 1.2 …]

… 20) and band gaps (4–5 eV) of BaO and SrO tabulated in Table 19.1 are, by themselves, large enough to meet the long-range challenges for MOS capacitors with equivalent oxide thicknesses less than 10 Å [19.1]. The alkaline earth oxides easily form crystals with stable (001) faces. It is difficult to create an amorphous layer of the simple rock salt structure characteristic of the alkaline earth oxides. Alkaline earth metal silicates, however, easily form amorphous structures. For example, amorphous barium metasilicate required a 24 h anneal at 700°C to crystallize [19.27]. MgO, CaO, SrO, and BaO crystallize in the rock salt structure with lattice parameters of 4.22, 4.82, 5.14, and 5.54 Å, respectively (see Table 19.1). The (001) faces are inherently stable, as any student can observe by noting the cubic crystal shape of rock salt. The stability of the (001) face derives from the fact that any given (001) layer is charge neutral and has a higher density than any other plane. This is in contrast to other cubic oxides that, while lattice matched to silicon, have charged (001) faces. Examples of such oxides include the fluorite-structure ZrO2, HfO2, and CeO2. The (111) faces of the fluorite structure are charge neutral, but lack 4-fold symmetry and, therefore, are not compatible with the Si(001) surface. The stability of


Table 19.1. Properties of the oxides that can be grown on silicon

Oxide or        Lattice           Dielectric           Bandgap (eV)       Notes
semiconductor   parameter (Å)     constant
Silicon         5.43/3.84         -                    1.12               -
Germanium       5.66/4.00         -                    0.60               -
BaO             5.52              34 [19.30]           4–4.8 [19.30]      -
SrO             5.14              13.3 [19.30]         5.3–6.7 [19.30]    -
CaO             4.8               11.1–11.8 [19.30]    7.1–7.7 [19.30]    -
MgO             4.22              9.65–9.8 [19.30]     7.3–8.7 [19.30]    -
SrTiO3          3.905 [19.31]     330 [19.32]          3.2 [19.33]        -
BaTiO3          3.996 [19.34]     2300/80 [19.33]      3.2 [19.33]        -
PbZrTiO3        -                 -                    -                  -
SrZrO3          4.10 [19.35]      -                    -                  orthorhombic
BaZrO3          4.199 [19.36]     43 [19.32]           -                  -

the interface and surface is crucial to the growth of superlattices as described in [19.23]. Using MBE, each of the neighboring components may be alloyed to obtain an intermediate lattice parameter. Codepositing barium and strontium to form an alloyed oxide with a composition of 72% BaO and 28% SrO results in a rock salt structure exactly lattice matched to silicon. This lattice matching is predicted from a linear combination of the lattice parameters shown in Table 19.1 for BaO and SrO.

19.3.3 Perovskites

In [19.10], SrTiO3 was used as an example to show how a perovskite can be grown epitaxially on silicon. The structure of the cubic perovskite (see Fig. 19.7) is ideally suited to growth on the alkaline earth oxides. The first atomic layer of the perovskite is identical in structure to the surface of the alkaline earth oxide, as shown in Fig. 19.7. Once the transition from silicon to alkaline earth oxide is complete, the first layer of the perovskite is complete (ABO3 with A = Ba, Sr, Ca and B = Zr or Ti). The cubic perovskite structure is stable if the face-centered cubic arrangement of the A-O sublattice is maintained. This is true only for cations with an ionic size similar to that of oxygen [19.28]. The list of cations that satisfy this condition includes the alkaline earths Ca, Sr, and Ba, the rare earths La and Eu, and others such as Pb. A total valence of 6+ sets the conditions for the cations that can go in the B site once the A-site cation is chosen. So for Ca, Sr, Ba, and Pb with a valence of 2+, we require a valence of 4+ for the B-site cation: Ti, Zr, Hf, Sn, or Ce. We will call these 2–4 perovskites. For La or Eu with a valence of 3+, we require the B-site cation to have a valence


Fig. 19.7. Cubic Perovskite structure of ABO3 . The AO plane in the perovskite is identical in symmetry to the (001) planes of the rock salt structures of BaO or SrO. The FCC sublattice formed by A and O implies that the sizes of A and O must be similar to retain the cubic lattice. The size restriction on B is that it must be small enough to fit inside the FCC, AO sublattice

of 3+: Al, Sc, V, Cr, Mn, Fe, Co, Ga. We will call these 3–3 perovskites [19.28]. For the example in [19.10], SrTiO3 is special in that it is a high dielectric constant oxide compatible with the alkaline earth oxides grown on silicon, and its capacitor function was the first demonstration of a sub-10 Å equivalent oxide thickness [19.11]. But the properties of the perovskites as a class are diverse, from ferroelectric to superconducting. The ferroelectrics display nonlinear dielectric and optical behavior. The prototype ferroelectric perovskite is BaTiO3. It becomes ferroelectric below 120°C in bulk and has an extremely large electro-optic coefficient. Other ferroelectric perovskites include the PZT (PbZrTiO3) series of solid solutions. These ferroelectrics have higher Curie temperatures than BaTiO3 and can also be semiconducting. Nanometer ferroelectric domains were written on PZT grown on silicon using a buffer layer of SrTiO3 on Si by A. Lin et al. [19.29].

Once the alkaline earth oxide is grown epitaxially on silicon, the transition to perovskites appears straightforward. However, a major consideration for the formation of sharp perovskite interfaces is, as pointed out above, the stability of the {001} faces. The faces are stable only for those structures with charge-neutral layers. For the 2–4 perovskites, these faces are charge-neutral. The layer containing the A sites in the perovskite structure has one oxygen (see Fig. 19.7). The 2+ A cation of the 2–4 perovskites balances the 2− of this oxygen, making this plane charge-neutral. A similar argument holds for the layer containing the B cation. For the 3–3 perovskites, the {001} faces are charged. The 3+ of the cation and the 2− of oxygen combine to give a charge of 1+ for the AO layer. The BO2 layer has a charge of 1−. The inherent instability of the (001) face of the 3–3 perovskites complicates the epitaxy and the simple superlattice structure of the MOS capacitor shown in Fig. 19.1. The instability may drive a reaction with the underlying alkaline earth oxide or faceting of the surface.
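The charge-counting argument for the {001} layers can be written out explicitly. The sketch below assumes purely ionic formal charges (oxygen as 2−) and one formula unit per layer; the function name and the compound chosen to represent the 3–3 case are illustrative assumptions.

```python
O_CHARGE = -2  # formal charge of oxygen


def perovskite_layer_charges(a_valence, b_valence):
    """Formal charge per formula unit of the AO and BO2 (001) layers of ABO3."""
    ao = a_valence + O_CHARGE
    bo2 = b_valence + 2 * O_CHARGE
    return ao, bo2


# (A-site valence, B-site valence); LaAlO3 is picked only as a 3-3 example.
examples = {
    "SrTiO3 (2-4)": (2, 4),
    "BaTiO3 (2-4)": (2, 4),
    "LaAlO3 (3-3)": (3, 3),
}

for name, (qa, qb) in examples.items():
    ao, bo2 = perovskite_layer_charges(qa, qb)
    print(f"{name}: AO layer {ao:+d}, BO2 layer {bo2:+d}")
```

Both layers come out neutral for the 2–4 case and carry charges of 1+ and 1− for the 3–3 case, in line with the argument in the text.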


19.4 The Implementation of COS

To show how we have applied these principles to the growth of crystalline oxides on silicon, we describe the details of our approach to epitaxial growth. Many of these details are structural and derived from RHEED observations at different growth conditions. This approach was first outlined in 1991 by McKee et al. for the growth of BaTiO3 on silicon [19.37]. The key to the growth of COS is the deposition of a silicide before oxide growth is attempted. In this way the reaction between BaO and silicon is controlled and confined to a thickness of less than 1 ML.

19.4.1 Layer-Sequenced COS Growth

We start with clean silicon prepared by standard wet chemical cleaning that ends with a UV-ozone-produced oxide. The UV-produced oxide is easily sublimed at 850°C in UHV to produce a clean 2 × 1 Si(001) surface. It is important that SiC is not observed in the RHEED after cleaning, because subsequent growth would be disrupted if even small amounts are present. We begin with deposition of Sr metal and proceed along the Si–Sr phase field in Fig. 19.3 and Fig. 19.8 to produce an intermediate silicide. To synthesize a crystalline, ordered layer that does not react uncontrollably, 1/4 ML is deposited at temperatures between 550°C and 700°C onto a clean 2 × 1 Si(001) surface. The RHEED for this progression is shown in Fig. 19.9. The 1/4 ML structure covers the whole surface and is highly ordered, as is evident from the RHEED.

Fig. 19.8. The growth sequence for COS. The first step is the 1/4 ML deposition of metal (in this case, Sr) and formation of silicide, 1. The second step, 2, is the deposition of metal (in this case, Sr). The third step, 3, is the oxidation of the metal to the alkaline earth oxide. The oxidation step takes place by simultaneously consuming the metal at 2, while forming the oxide at 3. This approach establishes equilibrium between silicon and the silicide at 1 and the silicide and the oxide at 3


Fig. 19.9. Progression of RHEED from clean silicon. Flat, ordered, single phase structures are observed at 1/6th ML and 1/4 ML. All photographs looking down the 110 direction of silicon

Fig. 19.10. RHEED intensity variations and surface phase diagram derived from RHEED. The solid line represents the intensity within the oval centered on the third order rod of the RHEED while the circles represent the intensity in the oval centered around the half-order rod. The RHEED is looking down the 110 direction

The progression of the RHEED can be followed quantitatively, as first shown in [19.26]. The graph of the RHEED intensities at various places in reciprocal space as the coverage progresses on a substrate held at 600°C is reproduced in Fig. 19.10. The fractional coverage of the 1/6th ML structure is roughly proportional to the intensity of the third-order rod, while the coverage of the 1/4 ML structure is roughly proportional to the intensity of the half-order rod [19.26]. Figure 19.10 clearly shows the third-order rod peaking at 1/6th ML and the half-order rod peaking at 1/4 ML. As was pointed out in [19.26], this diffraction is consistent with chains of alkaline earth metal atoms ordering along 110 directions with, first, triple the 110 Si lattice spacing, a structure that is fully replaced at 1/4 ML by alkaline earth metal atoms at double the 110 lattice spacing. At 1/4 ML, the first step in the growth of COS is completed and we are at step 1 in Fig. 19.8. Why we believe this step to be important will be outlined in the following section.


Fig. 19.11. RHEED progression during oxidation of Sr metal – the 3×1 progresses to a 2 × 1 pattern. The observation of the 1/2 order rod after oxidation (right hand photograph) indicates that the silicide at the interface formed during the 1/4 ML step is still present between the oxide and silicon. Both photographs are looking down the 110 direction of silicon

The next step involves forming an ordered metal surface that can subsequently be oxidized. As pointed out in the rules for heteroepitaxial growth, this layer must uniformly cover the surface. We found that an ordered 3 × 1 metal surface, flat and commensurate with the silicon, forms at 3/8 ML of additional alkaline earth metal coverage. Because the surface mobility of the metal is high, we perform this deposition at substrate temperatures between 100 and 200°C. Its RHEED pattern is shown in Fig. 19.11 and clearly indicates that it is highly ordered and flat. At this point we are at step 2 in Fig. 19.8. To get to point 3 in Fig. 19.8 we oxidize the 3/8 ML alkaline earth metal layer by leaking molecular oxygen into the chamber with the substrate at 100°C. If the arrival rate of oxygen is kept below 1 Langmuir/s, this oxidation can be controlled by timing the oxygen flow at about 10 s. The RHEED pattern that results is shown to the right of the 3/8 ML pattern in Fig. 19.11. Like that of the 1/4 ML silicide, it is a 2 × 1 pattern, highly ordered and flat. The RHEED is consistent with an ordered alkaline earth oxide layer on top of the silicide. After the oxidation step, we have established the tie line in Fig. 19.8 between the oxide at step 3 and the silicide at step 1. Continued deposition of alkaline earth metal in the presence of oxygen results in highly ordered and flat alkaline earth oxide (i.e., BaxSr1−xO). At this point the difficult steps of heteroepitaxy have been completed and a transition to homoepitaxial growth is required. Because we want atomically abrupt surfaces and interfaces, we need to establish conditions (including oxygen pressure and substrate temperature) for flat growth. As pointed out by Flynn [19.17], flat growth occurs at temperatures above that at which RHEED oscillations cease to be observed. RHEED oscillations come from the formation of islands on top of flat terraces and indicate growth conditions resulting in rough films. Growth of the film from ledges does not give rise to RHEED oscillations. We have empirically determined the growth conditions for flat growth of BaO and graphically represent them in
Growth of the film from ledges does not give rise to RHEED oscillations. We have empirically determined the growth conditions for flat growth of BaO and graphically represent them in


F.J. Walker and R.A. McKee

Fig. 19.12. Temperature and pressure diagram describing conditions for ledge growth of BaO. Flat, homoepitaxial growth is realized for conditions to the right of the solid boundary in the graph

Fig. 19.12. The surface had ledges spaced approximately 300 Å apart, and the growth rate was 1 ML of BaO every 5 seconds. Above a temperature of 275 °C, BaO grows flat regardless of oxygen pressure. For oxygen pressures below 3 × 10⁻⁷ Torr, the substrate temperature can be reduced to 175 °C.

With the completion of the alkaline earth oxide, we have followed the layer sequencing indicated in bold in Fig. 19.8. This route avoids the formation of silicates. We have shown this by measuring the photoelectron spectra for silicon coming from a buried silicon-oxide interface. This can be done for films up to 50 Å thick, at which point the electrons from the Si–2p peak no longer escape from the film. We compare in Fig. 19.13 the Si–2p peak spectra taken from the 1/4 ML silicide structure with that from a 6 ML thick BaₓSr₁₋ₓO oxide on silicon. The comparison shows a shoulder shifted by less than 0.5 eV from the Si–2p peak of the substrate toward higher binding energies. There is no evidence for silicon in a silicate or in silicon dioxide. Silicon in silicates or SiO₂ would have peaks at 102.5 eV and 104 eV, respectively [19.38, 19.39]. The shoulder most likely arises from silicon at the interface that bonds to the oxide across an atomically sharp interface.

For the growth of the perovskite, the transition from homoepitaxy to heteroepitaxy must be made in a different growth regime. The Ti-O bond in perovskites like SrTiO₃ has a covalent part. Growth temperatures for flat, crystalline SrTiO₃ are greater than 550 °C [19.40, 19.41]. We observe that the diffraction shown in Fig. 19.13 from the surface of alkaline earth oxide films thinner than 10 ML deteriorates above 400 °C. We do not attribute this observation to a thermodynamically driven reaction with the oxide because, as shown in Fig. 19.2, the alkaline earth oxide is still

19 High-k Crystalline Gate Dielectrics: A Research Perspective


Fig. 19.13. X-ray photoelectron spectra (XPS) and RHEED from a BaₓSr₁₋ₓO film. XPS of the Si–2p core level after the 1/4 ML Sr deposition (dark line) and after the deposition of 7 ML Ba₀.₇₂Sr₀.₂₈O (light line), plotted against binding energy. The XPS after 7 ML was scaled to aid comparison. Transitions characteristic of silicon dioxide (peak at 104 eV) or silicate (peak at 102.5 eV) are not evident in the spectra for BaₓSr₁₋ₓO grown on silicon. The RHEED is for 3 ML BaₓSr₁₋ₓO looking down the 110 direction of silicon and Ba₀.₇₂Sr₀.₂₈O and is indicative of high crystalline quality

observed even after a 700 °C anneal, as was done for the structure shown in the micrograph. Moreover, films thicker than approximately 50 Å are stable at temperatures greater than 700 °C.

The growth of crystalline perovskite thin films requires growth temperatures above 550 °C [19.40, 19.41]. Since we cannot use such high growth temperatures, we deposit the perovskite in a layer-by-layer fashion at low temperature, Tg = 200 °C, and subsequently anneal in vacuum to recrystallize the amorphous deposit and obtain an epitaxial, single-crystal film of perovskite on an AO template. The deposition of the elemental metals is done in 10⁻⁶ Torr oxygen to form the oxide. The total thickness of the deposition can be anywhere from 3 to 10 unit cells. The resulting RHEED pattern is that of an amorphous film. Recrystallization of the amorphous film takes place at 550 °C after about 60 seconds. The resulting SrTiO₃ shows a sharp diffraction pattern that, for films less than 50 Å thick, is commensurate with the silicon with a +1.5% strain. This is determined by comparing the in-plane lattice parameter of the diffraction with that for clean silicon. Large critical thicknesses on the order of 80 Å have been observed for BaTiO₃ growth on SrTiO₃, a +2.3% misfit [19.40, 19.41].

The completion of SrTiO₃ marks the completion of COS with the general formula (AO)ₙ(A′BO₃)ₘ. Variations in our COS approach may be possible, but we believe that any process for the heteroepitaxial transition from silicon to crystalline oxide can be understood within the framework we have described above.


Fig. 19.14. RHEED intensity variations and surface phase diagram derived from RHEED. The points on the phase diagram (right) are determined from the Sr coverage at which a sharp peak is observed in the RHEED intensity variation (left). Although these phases are observed to form at growth temperatures above 500 °C, they remain stable as they are cooled. Below 500 °C (the dashed lines in the phase diagram), these phases are not observed because of the silicon surface diffusion required to form the 1/6th and 1/4 ML phases and the related Si dimers. The temperature of the eutectic has not yet been determined

We make this assertion because the COS approach describes heteroepitaxy in terms of thermodynamic stability, which all epitaxially grown, crystalline alkaline earth oxide structures on silicon must satisfy. As discussed earlier, a number of transition metals, Zr for example, can also replace titanium as the ‘B’ cation for increased flexibility in material choice. Additionally, many combinations of n and m are possible; the case of n = 6 and m = 3, for example, is shown in Fig. 19.2.

Variations have been described by Droopad et al. [19.42] and Ishiwara [19.43]. These processes result in an oxide on silicon by chemically reducing an approximately 20 Å thick silicon dioxide film with Sr metal to SiO at temperatures high enough to drive off SiO. These temperatures are lower than those required to drive off SiO₂ and so may be more amenable to production. If excess Sr is supplied to the initial SiO₂ film, then heating in vacuum will desorb the silicon dioxide and some of the excess Sr [19.42, 19.43]. This process leaves a coverage of Sr on the surface that depends critically on the thermal history of the substrate [19.44]. The resulting mixture of surface structures is dictated by the thermodynamics described in the phase diagram of Fig. 19.14. After this de-oxidation process is complete [19.42], the oxide growth continues within the framework we describe here at some point between 1 and 2 on the diagram of Fig. 19.8.


19.4.2 The Importance of the Silicide

The importance of the silicide at 1/4 ML coverage is that it establishes the tie line of Fig. 19.8 between a Sr-Si interface and the alkaline earth oxide. As described above, it is stable against oxidation and remains under the oxide. Moreover, if our goal is the synthesis of a MOS capacitor, the presence of the silicide serves to promote epitaxial growth and passivate the silicon dangling bonds, similar to the role of sub-stoichiometric silicon oxides and hydrogen in the silicon dioxide system. While this silicide is also a potential source of interface states, its contribution has been shown to be negligible [19.11]. How the silicide fulfills this role as an interface is found in its structure, which we describe now.

If we follow the RHEED during deposition (see Fig. 19.14), we observe ordered structures of submonolayer silicides that are flat, single-phase, crystalline layers. We have observed ordered structures at 1/6th and 1/4 ML. For growth at higher temperatures, others report a 3 × 2 phase at 1/3rd ML. To understand why the 1/4 ML structure works so well as a template, we will fill in the Sr–Si phase field. Hu et al. [19.45] have studied the structures of barium after deposition at both high and low temperatures using STM. For coverages less than 1/2 ML deposited at room temperature, both barium and strontium are observed to condense into isolated chains on the dimerized 2 × 1 Si(001) surface. These chains do not destroy the dimers. These observations imply that the phase diagram at low temperatures consists of equilibrium between the dimerized 2 × 1 Si(001) surface and islands of alkaline earth metal, see Fig. 19.14. Above 500 °C, ordered structures condense on the surface, and Hu et al. have proposed a 3 × 2 structure, shown in Fig. 19.15, that completely covers the surface at 0.16 ML [19.45]. The salient feature of the structure is the tripling of the Si(001) periodicity in the 110 direction.
We also observe this tripling at the 110 zone axis in RHEED during deposition of Sr above 500 °C, shown in the right-hand panel of Fig. 19.15. For deposition stopped at 1/6th of a ML, this 3 × 2 structure is stable down to room temperature. The fact that the 1/6th ML structure forms at high temperatures, while isolated metallic chains form at low temperature, indicates that the chains are metastable and kinetically favored. Above 500 °C, there is sufficient thermal activation to form the 3 × 2 structure. There are a number of possible sources for this thermal barrier. One is that the structure of [19.45] requires significant movement of silicon. Silicon is also replaced or added in the structure proposed by McKee et al. [19.10], which may require high temperatures to aid the diffusion of silicon and/or its detachment from the terrace edges. These temperatures are consistent with the epitaxy of silicon. For silicon, with an absolute melting temperature of 1687 K, we expect flat growth at Tg = 0.55 Tm ≈ 654 °C [19.21]. If silicon diffusion is required for the formation of the silicide, then our observed minimum growth temperature of 500 °C is consistent with the universal curve of surface diffusivity for silicon (see Fig. 19.4).
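The flat-growth estimate for silicon quoted above is a simple scaling of the absolute melting temperature, followed by a conversion from kelvin to degrees Celsius. A minimal sketch of that arithmetic, using only the values given in the text (the function name is an illustrative choice):

```python
def growth_temperature_c(t_melt_k: float, fraction: float) -> float:
    """Return fraction * Tm, converted from kelvin to degrees Celsius."""
    return fraction * t_melt_k - 273.15


# Silicon: Tm = 1687 K and Tg = 0.55 Tm, as quoted in the text.
tg_si = growth_temperature_c(1687.0, 0.55)
print(f"Si flat-growth estimate: {tg_si:.1f} C")
```

The scaling has to be applied to the absolute temperature; applying the same fraction to a melting point expressed in degrees Celsius would give a different and incorrect estimate.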


Fig. 19.15. Real-space structure as determined by STM [19.45] and diffraction from a saturated phase of 3 × 2 Sr on Si(001). The tripling of the silicon 110 lattice spacing shows up in the RHEED as the third-order rods. The Sr replaces the Si dimers

If we assume that thermal activation is required to form the 3 × 2 structure, we can draw the phase diagram as shown in Fig. 19.14. The “liquid” phase reported by Hu et al. [19.45] below coverages of 0.04 ML may be a lattice gas on 2 × 1 Si(001) sites and does not appear in the RHEED as extra rods. In Fig. 19.14, the 3 × 2 begins to condense where the (0, n/3) rods begin to appear at 0.05 ML.

At 1/4 ML, the 3 × 2 structure has been replaced by a 2 × 1 structure. The RHEED intensity of the third-order rods has disappeared (Fig. 19.14) and the scattering at the 1/2 order rod has reached a maximum. This structure is also stable upon cooling to room temperature and is therefore a single phase, as indicated in the phase diagram of Fig. 19.14. The structure of this phase is an interpolation between the 3 × 2 structure at 1/6th ML and the 3 × 2 structure observed at 1/3rd ML [19.45, 19.46]. At both 1/6th ML and 1/3rd ML, the Sr is observed to occupy the rows previously occupied by the Si dimers of the 2 × 1 Si(001) surface. Because these rows are spaced at twice the Si(001) spacing, 1/6th of a ML occupying every other row gives rise to an average Sr spacing of 3a, where a is the lattice spacing of the Si(001) surface, 3.84 Å. For 1/3rd of a ML of Sr, if every other row is occupied then the average spacing is 1.5a. Such an average spacing on discrete lattice sites, namely the hollow sites as observed by Goodner et al. [19.47], means that for 1/3rd of a ML one obtains the structure shown in Fig. 19.16, after Hu et al. [19.45], where the Sr spacing alternates between a and 2a. The interpolation for 1/4 ML means that if the Sr is confined to every other row, then the average spacing for the 2 × 1 structure is 2a. The simplest ordered structures are either a 2 × 2 structure or a c(4 × 2) structure. Neither 2 × 2 nor c(4 × 2) is observed in RHEED or in LEED. The solution to this puzzle is that the average structure is a random alloy of 2 × 2 and c(4 × 2) structures. This


Fig. 19.16. 3 × 2 structure observed at 1/3rd ML. The structure was deduced by Hu et al. from STM measurements [19.45]. The larger-diameter circles are Ba atoms that are buckled away from the surface. As described in the text, half of the buckled alkaline earth metal atoms in this figure are oxidized and incorporated into the oxide while the 1/4 ML silicide remains. Based on the similarity observed in RHEED for Ba and Sr silicide, we believe the two structures to be similar. (Adapted from [19.45])

disorder may be thermally excited if the energy difference between the 2 × 2 and c(4 × 2) structures is small enough. This possibility has been investigated by first-principles calculations of the energetics of both structures. The energy difference, expressed as a temperature, is found to be 700 K [19.48]. The second source of the disorder is the random nucleation of chains. In Fig. 19.17 we have reproduced a figure from [19.45] that shows some 3 × 2 Ba on Si(001) forming before the structure completely covers the surface at 1/6th ML. The disorder that arises from random nucleation does not condense into a higher-symmetry phase because of the small energy difference and the large barrier to ordering.

The completion of the 1/4 ML silicide marks the completion of the first step in the growth of COS. The structure that forms satisfies our rules: it is observed by RHEED to be atomically flat and so satisfies rule 1; the disappearance of the 1/3rd order rods indicates that it is single phase and satisfies rule 2; the RHEED shows sharp rods, so the phase is crystalline and satisfies rule 3; and finally, the RHEED is not observed to change as a function of time, so the phase is in equilibrium and satisfies rule 5. Rule 4 may or may not be broken; it is not possible with RHEED to say whether the phase has a single orientation. In the STM image of Fig. 19.17, a single orientation is observed on individual terraces, but the registry can be translated by one lattice site, which constitutes a defect in the structure.
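The spacing argument used above for the 1/6, 1/4 and 1/3 ML structures follows from simple counting: if all the alkaline earth atoms sit in rows separated by two surface lattice spacings, a coverage θ (in ML) corresponds to an average spacing of 1/(2θ) lattice spacings along a row. The sketch below reproduces the values quoted in the text; the function name and the exact-fraction representation are illustrative choices, not part of the original analysis.

```python
from fractions import Fraction

A_SI_001 = 3.84  # Si(001) surface lattice spacing in angstrom, as given in the text


def mean_spacing_in_a(coverage_ml: Fraction, row_pitch_in_a: int = 2) -> Fraction:
    """Average adatom spacing along a row, in units of the lattice spacing a.

    Assumes every adatom sits in rows separated by row_pitch_in_a lattice
    spacings, so that coverage = 1 / (row_pitch * spacing).
    """
    return 1 / (coverage_ml * row_pitch_in_a)


for theta in (Fraction(1, 6), Fraction(1, 4), Fraction(1, 3)):
    s = mean_spacing_in_a(theta)
    print(f"{theta} ML -> {s} a = {float(s) * A_SI_001:.2f} angstrom")
```

The half-integer result at 1/3 ML cannot be realized with uniform spacing on discrete sites, which is consistent with the alternation between a and 2a described in the text.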


Fig. 19.17. STM micrograph of the Si(001) surface with less than 1/6th ML deposition at 800 °C. Rows A and B in the figure are shifted relative to each other by one lattice spacing along the 110 direction parallel to the Si dimer rows. The oval and arrow at the top of the micrograph indicate atoms that have moved during imaging. (Taken with permission from [19.45])

19.4.3 Alkaline Earth Metal

The next step continues on the Sr–Si phase field and consists of the deposition of 3/8 ML of Sr metal onto the 1/4 ML silicide at substrate temperatures between 100 and 200 °C. The phase that forms is an ordered 3 × 1 structure that completely covers the surface at 3/8 ML. The RHEED of this phase is shown in the first panel of Fig. 19.11. We infer its metallic nature from the low temperature required for its formation and therefore believe that the addition or removal of silicon is not involved in its formation. This growth temperature is consistent with the global behavior for the growth of Sr metal surface layers (see Fig. 19.4). The surface diffusivity of Sr (Tm = 1042 K) is fast enough for layer-by-layer growth at temperatures above 117 °C (Tg = 3/8 Tm) [19.21]. Our choice of growth temperature is somewhat arbitrary. The same structure, as observed by RHEED, can be grown immediately after the growth of the 1/4 ML phase at the same elevated temperature. By growing it at lower temperature, we minimize possible diffusion of Sr into the silicon. If, by error, more than 3/8 ML is deposited, islands of silicide can form. By depositing at low temperature, we reduce the rate at which these islands form before the oxidation step that comes next.

19.4.4 Oxide Growth

Immediately following the oxidation (i.e. oxidation of the 3/8 ML alkaline earth metal described earlier), lattice-matched BaₓSr₁₋ₓO with x = 0.72 is deposited by codepositing the constituent metals in the presence of molecular


oxygen. Excellent-quality oxide with sharp diffraction can be grown from 1 ML to any thickness. Once the heteroepitaxial transition has been accomplished, we must switch to a different growth regime to continue with homoepitaxial growth. For flat, homoepitaxial growth, we require a growth temperature at which RHEED oscillations are not observed. The global growth characteristic for alkali halides, or ionically bonded crystals, gives Tg = 0.1 Tm [19.21]. Ba₀.₇₂Sr₀.₂₈O melts at 2100 °C, giving −34 °C as a growth temperature. We find, however, that the alkaline earth oxides require a higher growth temperature to avoid RHEED oscillations that are indicative of the formation of islands. This behavior is summarized in Fig. 19.12, where the growth pressure is plotted as a function of growth temperature. BaO and SrO are more covalently bonded than the alkali halides, and this may explain why their surface mobilities are lower than those of the alkali halides [19.49].

The stability of the 1/4 ML silicide after oxide deposition is directly evident in the Z-contrast micrographs of [19.10]. The silicide, with its characteristic doubling of the silicon lattice parameter in the 110 direction, is observed underneath the SrTiO₃ thin film. Diffraction evidence is also presented after oxidation in Fig. 19.11, where the doubling of the silicon lattice parameter is indicated by the 2 × 1 RHEED. This 2 × 1 pattern persists after a complete monolayer of oxide and gradually disappears during the second monolayer.

A possible mechanism for the consumption of the alkaline earth metal that forms the 3 × 1 structure is suggested by the structures deduced from STM observations in [19.45]. The structure for 1/3rd ML is a 3 × 1 structure where, as mentioned above, the alkaline earth (AE) metal fills in every other Si 110 row. The STM observations indicate that, for the 1/3rd ML structure, neighboring AE metal atoms in a row buckle.
This implies that the AE metal atom that is furthest from the surface, the larger-diameter circle in Fig. 19.16, is less strongly bound and more susceptible to oxidation. As these AE metal atoms are consumed, we proceed backward from point 2 to point 1 in Fig. 19.8 and are left with a configuration where the average spacing of AE metal atoms in a row is two times the silicon lattice spacing. The AE metal coverage for this structure is 1/4 ML. This mechanism explains the observation of Lettieri et al. [19.50], who report success in AO growth for 1/2 ML depositions at high temperatures as well as for 1/4 ML. The AE metal in excess of 1/4 ML is subsequently oxidized and included in the growing AO film.
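The composition x = 0.72 quoted for lattice-matched BaₓSr₁₋ₓO in this section can be rationalized with a linear (Vegard's law) interpolation between the end-member lattice parameters. The sketch below is an illustration, not the authors' procedure: the lattice parameters of BaO, SrO and Si are approximate room-temperature literature values assumed here, a cube-on-cube match to the silicon cubic cell is assumed, and thermal expansion at the growth temperature is ignored.

```python
A_BAO = 5.539  # angstrom, assumed literature value for rock-salt BaO
A_SRO = 5.160  # angstrom, assumed literature value for rock-salt SrO
A_SI = 5.431   # angstrom, assumed cubic lattice parameter of silicon


def vegard_lattice_parameter(x: float) -> float:
    """Linear interpolation of the lattice parameter of Ba(x)Sr(1-x)O."""
    return x * A_BAO + (1.0 - x) * A_SRO


def matching_composition(a_target: float) -> float:
    """Ba fraction x at which the interpolated lattice parameter equals a_target."""
    return (a_target - A_SRO) / (A_BAO - A_SRO)


x_match = matching_composition(A_SI)
print(f"lattice-matched Ba fraction x ~ {x_match:.2f}")
```

With these assumed inputs the interpolation lands close to the x = 0.72 used in the text; small changes in the assumed end-member values shift the result by a few hundredths.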

19.5 Electrical Properties

The development of crystalline oxides on silicon allows us to revisit the fundamentals of the MOS capacitor as well as the scattering of carriers at a semiconductor-insulator interface. Some of the issues addressed for the amorphous silicon dioxide/silicon interface will be the same for a crystalline oxide/silicon interface. These include those properties of the device that depend on


the bulk properties of silicon, such as the electron-phonon interaction and its influence on carrier mobilities. We expect, however, those properties that depend on details of the interface to be quite different. These include the band offset, interface trap state density and energy, interfacial strain and the effect of interface scattering on mobilities. Some of the unresolved issues of the Si–SiO₂ system can be uniquely investigated with crystalline oxides on silicon.

Recently, the SiO₂–Si interface has been investigated with the goal of understanding the effect of the atomic structure of the interface on electrical properties. Lucovsky and Phillips [19.51] have outlined general principles for the engineering of amorphous high-k structures. They identified three major issues, summarized here and discussed in more detail in Lucovsky’s article in this book: “... i) electron/nuclear charge balance in interface bonds, ii) physical constraints related to average interfacial bonding coordination and iii) reduced band-offsets in the transition metal oxides derived from ionic bonding and the d-character of the conduction band states ...” [19.51]. For crystalline oxides on silicon, the first issue is resolved by deposition of charge-neutral layers: strontium silicide and then alkaline earth oxide. The second issue is analogous to the lattice mismatch between silicon and the crystalline oxide; for BaSrO alloys on silicon and BaTiO₃ on germanium, the lattice match can be exact. Both issues impact interface trap states and the electrical perfection of the interface. We show below that our system of crystalline oxides on semiconductors results in low interface trap state densities. The third issue is more delicate for crystalline oxides and will be dealt with in the next section.
19.5.1 Band Offset

The bandgap of the titanates of strontium and barium is small compared to that of silicon dioxide, and so the exact nature of the interface is critical for the development of band offset. By our layer sequencing, we can control the structure and composition of the interface and the band offset. The flexibility in the COS system of (AO)ₙ(A′BO₃)ₘ allows us to modify the band offset using the alkaline earth oxide as a buffer. The charge neutrality levels derived by Robertson and Chen [19.8] are an approximation, and there are additional considerations that depend on structural details of the interface [19.9]. For the issue of the band offset, the full quantum mechanical view of the interface illustrated in Fig. 19.1 becomes important. While we leave such a theoretical treatment to a future paper, we can explore the consequences of our structure series experimentally. As pointed out by McKee et al. [19.11], the band offset for BaTiO₃ on Ge is expected to be small based on charge neutrality levels (CNL). We have estimated the band offset for BaO assuming a mid-gap CNL for BaO and clearly show that the band offsets, both conduction band (CB) and valence band (VB), are substantially positive for germanium and for silicon. Therefore, if two or more BaO planes were


Fig. 19.18. Leakage current for BaTiO₃ deposited on Ge. The dramatic reduction in leakage current is a reflection of an increased band offset brought about by the insertion of 6 ML BaO. (Taken from [19.11])

inserted between a perovskite and germanium or silicon, then the asymmetry of the band structure would adjust and support dielectric displacement across the junction. This is a simple and striking prediction for our heteroepitaxial approach. A collection of leakage current data obtained for BaTiO₃ on germanium is presented in Fig. 19.18 for two values of the AO repeat, n = 1 and n = 6. The open circles show the n = 1 data for BaTiO₃ grown directly on germanium with the BaTiO₃ 250 Å thick. The data show that BaTiO₃ directly on germanium is not an effective barrier to electron transfer from gate to germanium. However, if we modify the MOS capacitor structure by inserting 6 atomic planes of BaO between the BaTiO₃ and the germanium, then the leakage current (open squares) drops by 6 orders of magnitude. The relationship between leakage current and band offset is being studied using photoelectron spectroscopy of the interface region and its progression as more or less BaO is inserted. Our preliminary results show a valence band offset of 2.8 eV and a conduction band offset of 0.05 eV for n = 1 and m = 2. Within the context of our structure series, (AO)ₙ(A′BO₃)ₘ, we expect that the electronic structure can be adjusted for many of the perovskite and transition metal oxides.

19.5.2 Interface Traps

We will use capacitance data to prove our thesis that a crystalline interface can be used to tie up the dangling bonds at the silicon surface.


High-frequency (1 MHz) and low-frequency (10 Hz) capacitance (C) data were taken from a 250 Å thick BaTiO₃ film on p-type Ge (Fig. 19.19a). An expanded view of the data in the bias range where the germanium surface potential varies from zero to its value at the Fermi level is shown in Fig. 19.19b. We can extract the density of interface charge (our measure of interface perfection) from the difference ΔC = CLF − CHF in this bias range [19.52]. Because the interface charge contributes a capacitance in parallel with that of the depletion layer, its contribution is additive. At high enough frequencies (∼10⁶ Hz), it has been observed that interface traps do not contribute to the measured capacitance. Therefore, Dit is proportional to ΔC in accumulation where the fast-responding majority carriers contribute to the depletion layer capacitance. The data for the determination of the interface trap density, Dit, are provided in Fig. 19.19b; Dit ∝ ΔC. As these data show, ΔC, and hence the trapped charge, is negligible for our commensurate interface. To within our measurement sensitivity, this is an electrically perfect interface. The flat band shift indicates a fixed positive oxide charge of 10¹²/cm². This value is consistent with a 50 ppm error in the Ba/Ti ratio in the structure. This estimate assumes a uniform distribution of Ba deficiencies throughout the film and that every deficient Ba ion contributes a charge of +1.

The above two examples demonstrate the flexibility of this structure series. We can manipulate the physical and electrical structure at the atomic level. Moreover, this approach is applicable to silicon or germanium and any silicon-germanium alloy.
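The relation Dit ∝ ΔC used above can be made concrete with the standard high-low frequency expression, Dit = (1/q)[(1/CLF − 1/Cox)⁻¹ − (1/CHF − 1/Cox)⁻¹], with all capacitances per unit area. The sketch below is a hedged illustration of that textbook relation and is not necessarily the exact procedure of [19.52]; the function name and all numerical capacitance values are hypothetical and are not data from Fig. 19.19.

```python
Q_E = 1.602176634e-19  # elementary charge in coulomb


def dit_high_low(c_lf: float, c_hf: float, c_ox: float) -> float:
    """Interface trap density in cm^-2 eV^-1 from capacitances in F/cm^2.

    Removes the series oxide capacitance from each curve, then attributes the
    remaining difference to interface traps that follow the low-frequency
    signal but not the high-frequency one.
    """
    if not (0.0 < c_hf <= c_lf < c_ox):
        raise ValueError("expected 0 < C_HF <= C_LF < C_ox")
    c_semi_plus_traps = 1.0 / (1.0 / c_lf - 1.0 / c_ox)
    c_semi = 1.0 / (1.0 / c_hf - 1.0 / c_ox)
    return (c_semi_plus_traps - c_semi) / Q_E


# Hypothetical values chosen only to show the order of magnitude.
example_dit = dit_high_low(c_lf=4.01e-7, c_hf=4.00e-7, c_ox=1.0e-6)
print(f"Dit ~ {example_dit:.2e} per cm^2 eV")
```

In this picture, identical high- and low-frequency curves give Dit = 0, which is the sense in which a negligible ΔC indicates an electrically clean interface.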

Fig. 19.19. High-frequency and low-frequency capacitance data for BaTiO₃/Ge. Data taken with aluminum top and bottom electrodes. The doping is p-type, 10¹⁷/cm³. The measured flat band voltage is −0.8 volts. With an interface state density of 10¹⁰/cm², the flat band shift indicates a fixed positive charge of 10¹²/cm². (Taken from [19.11])


19.5.3 Channel Mobility

With the development of a perfectly commensurate, crystalline interface, we can revisit electron or hole scattering at a semiconductor-crystal interface. With such a system, we can develop a fundamental understanding of how inversion layer charge interacts with strain, lattice disorder and polarization. In Fig. 19.20, we show the universal mobility curve for SiO₂ on silicon plotted as a function of the field at the interface and normal to it. For low fields, less than 5 × 10⁵ V/cm, the mobility is dominated by electron-phonon scattering, where the mobility has an E^−0.3 functional dependence with a strong temperature dependence [19.53]. Because the low-field scattering is a bulk effect, the only way to affect it is to change the semiconductor substrate. Silicon can, for example, be alloyed with germanium to obtain a larger mobility, as suggested by McKee et al. [19.11]. For higher fields, the µ ∝ E^−2 field dependence is attributed to surface scattering. Based on considerations of interface uniformity, we expect that a MOSFET based on commensurate crystalline oxides may improve the mobility for fields above 5 × 10⁵ V/cm relative to amorphous silicon dioxide. We have found that, for a transistor fabricated from an oxide with 2 ML of lattice-matched Ba₀.₇₅Sr₀.₂₅O and 52 unit cells of SrTiO₃, the mobility rivals that for SiO₂. The peak mobility that we extract, 321 cm²/V·s, is

[Fig. 19.20 plot area: µ (cm²/V sec) versus Eeff (V/cm) on logarithmic axes; curves a, b and c are SiO₂/Si, curve d is SrTiO₃/Si]

Fig. 19.20. Mobility of a MOSFET using SrTiO₃ grown epitaxially on silicon. The n-channel mobility is shown as a function of field, evaluated self-consistently with 8.6 fF/µm² as the gate capacitance in a 210 Å film. SiO₂/Si values are (a) H₂ processed, Dit = 3 × 10¹⁰/cm²·eV; (b) H₂ processed, Dit = 3 × 10¹¹/cm²·eV; and (c) dry oxidized with Dit = 1 × 10¹³/cm²·eV. The data for SrTiO₃/Si were determined with the Pao–Sah model [19.53] and Ids–Vgs data at Vds = 50 mV. The high-field mobility of this oxide rivals that of SiO₂ and is a result of the crystalline interface. (Taken from [19.11])


the highest mobility value that has been reported for any alternative gate FET [19.11]. This value and its field dependence can be explained by fluctuations in inversion charge density. Epitaxial strain, 2% for SrTiO₃ on silicon (see Table 19.1), and the associated misfit dislocations lead to a local variation in the silicon bandgap near each dislocation. This fluctuation leads to a decrease in the mobility [19.54]. Such a mechanism is intrinsic to the crystalline interface and will require further theoretical analysis and experimental investigation to elucidate the fundamentals. From the low-field mobility we conclude that a COSGATE™ FET, where strain is absorbed in the silicon, is not an ideal device. Alloying SrTiO₃ with Ca, however, reduces the perovskite lattice parameter to exactly lattice match silicon, thereby eliminating the adverse strain effect [19.55].
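The gate capacitance quoted in the caption of Fig. 19.20 (8.6 fF/µm² for a 210 Å film) can be translated into an effective dielectric constant and an equivalent SiO₂ thickness using the parallel-plate relation C/A = ε₀k/t. The sketch below is a rough illustration only: it treats the whole gate stack as a single uniform dielectric, ignores the inversion-layer contribution that the self-consistent evaluation in the source accounts for, and assumes k = 3.9 for SiO₂.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
K_SIO2 = 3.9             # assumed relative permittivity of SiO2


def effective_k(c_per_area: float, thickness_m: float) -> float:
    """Effective relative permittivity from capacitance per area (F/m^2)."""
    return c_per_area * thickness_m / EPS0


def equivalent_oxide_thickness(c_per_area: float) -> float:
    """Thickness in metres of SiO2 giving the same capacitance per area."""
    return K_SIO2 * EPS0 / c_per_area


c_gate = 8.6e-15 / 1.0e-12  # 8.6 fF/um^2 expressed in F/m^2
t_film = 210e-10            # 210 angstrom expressed in metres

k_eff = effective_k(c_gate, t_film)
eot = equivalent_oxide_thickness(c_gate)
print(f"effective k ~ {k_eff:.1f}, EOT ~ {eot * 1e10:.0f} angstrom")
```

Under these simplifying assumptions the stack behaves like a dielectric with k of roughly 20 and an equivalent oxide thickness of roughly 40 Å; the numbers are indicative, not values reported by the authors.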

19.6 Conclusion

We have applied MBE methods developed for metal superlattices and extended them to a reactive oxide-semiconductor system. This approach is a natural consequence of the view presented in Fig. 19.1 of the MOS capacitor as a superlattice. This structure becomes an entirely new physical system with the advent of crystalline components having atomically abrupt interfaces. Our implementation of this structure using the alkaline earth silicides, oxides and perovskites can be exploited to investigate the technological limitations and scientific opportunities of a semiconductor/oxide MOS capacitor as well as to provide a platform for new functionality. The classic example of new functionality is derived from the polarization of ferroelectric perovskites like BaTiO₃ or PZT coupling to the inversion charge of the semiconductor [19.29]. This device, first envisioned by Bell researchers in 1957 [19.55], is not the goal but the starting point for a device physics revolution based on crystalline oxides on semiconductors.

Acknowledgments. We would like to acknowledge the careful reading of this manuscript by H. Hwang and G.E. Ice. We would also like to acknowledge the technical assistance of Curt Billman with some of the RHEED photographs. Research sponsored by the Division of Materials Sciences and Engineering, Office of Basic Energy Sciences, U.S. Department of Energy at Oak Ridge National Laboratory under contract DE-AC05-00OR22725 with UT-Battelle, LLC and at the University of Tennessee under contract DE-FG02-01ER45937.


References

19.1. (2001) International Technology Roadmap for Semiconductors, 2001 Edition, International Technology Roadmap for Semiconductors
19.2. Nicollian EH, Brews JR (1982) MOS (Metal Oxide Semiconductor) Physics and Technology, J Wiley and Sons, New York: Chapter 2
19.3. Yang N, Henson WK, Hauser JR, Wortman JJ (1999) Modeling study of ultrathin gate oxides using direct tunneling current and capacitance-voltage measurements in MOS devices, IEEE Trans. El. Dev. 46(7):1464
19.4. Stern F (1972) Self-consistent results for n-type Si inversion layers, Phys. Rev. B 5:4891
19.5. Demkov AA, Liu R, Zhang X, Loechelt H (2000) Theoretical and experimental investigation of ultrathin oxynitrides and the role of nitrogen at the Si–SiO₂ interface, J. Vac. Sci. Technol. B 18(5):2388
19.6. Lo SH, Buchanan DA, Taur Y (1999) Modeling and characterization of quantization, polysilicon depletion, and direct tunneling effects in MOSFETs with ultrathin oxides, IBM J. Res. and Dev. 43(3):327
19.7. Tersoff J (1984) Phys. Rev. Lett. 52:465
19.8. Robertson J, Chen CW (1999) Appl. Phys. Lett. 74:1168; Robertson J (2000) J. Vac. Sci. Technol. B 18:1785
19.9. Tung RT (2000) Chemical bonding and Fermi level pinning at metal-semiconductor interfaces, Phys. Rev. Lett. 84(26):6078; Tung RT (2001) Formation of an electric dipole at metal-semiconductor interfaces, Phys. Rev. B 64(20):205310
19.10. McKee RA, Walker FJ, Chisholm MF (1998) Crystalline Oxides on Silicon: The First Five Monolayers, Phys. Rev. Lett. 81(14):3114
19.11. McKee RA, Walker FJ, Chisholm MF (2001) Physical Structure and Inversion Charge at a Semiconductor Interface with a Crystalline Oxide, Science 293:468
19.12. Hubbard KJ, Schlom DG (1996) J. Mater. Res. 11:2757
19.13. Tsao JY (1993) Materials Fundamentals of Molecular Beam Epitaxy, Academic Press, Boston
19.14. Eckstein JN, Bozovic I, Schlom DG, Harris JS (1991) Growth of superconducting Bi₂Sr₂Caₙ₋₁CuₙOₓ, J. Cryst. Growth 111:973
19.15. Locquet JP, Machler E (1992) Characterization of radio-frequency plasma source for molecular-beam epitaxial-growth of high-Tc superconductor films, J. Vac. Sci. and Tech. A 10(5):3100
19.16. Joyce BA, Neave JH, Dobson PJ, Larsen PK (1984) Analysis of reflection high energy electron diffraction data from reconstructed semiconductor surfaces, Phys. Rev. B 29:814
19.17. Flynn CP (1988) Constraints on the growth of metallic superlattices, J. Phys. F 18(9):L195
19.18. Gilmer GH, Grabow MH (1987) Models of Thin Film Growth, Journal of Metals 39(6):19
19.19. Bruinsma R, Zangwill A (1987) Morphological Transitions in Solid Epitaxial Overlayers, Europhys. Lett. 4(6):729
19.20. Grabow MH, Gilmer GH (1988) Thin-film Growth modes, wetting and cluster nucleation, Surf. Sci. 194(3):333-346
19.21. Yang MH, Flynn CP (1989) Growth of alkali halides from molecular beams: Global growth characteristics, Phys. Rev. Lett. 62(21):2476

636

F.J. Walker and R.A. McKee

19.22. Flynn CP, Eades JA (2001) Structural variants in heteroepitaxial growth, Thin Solid Films 389:116 19.23. Walker FJ, McKee RA (1992) High-temperature stability of molecular beam epitaxy-grown multi-layer ceramic composites: TiO/Ti2 O3 , J. of Cryst. Growth 116:235 19.24. Fang XM, McCann PJ, Liu WK (1996) Growth studies of CaF2 and BaF2 /CaF2 on (100) silicon using RHEED and SEM, Thin Solid Films 272:87 19.25. Pfiefer L, Phillips JM, Smith TP, Augustyriak WM, West KW (1985) Use of a rapid anneal to improve CaF2 :Si(100) epitaxy, Appl. Phys. Lett. 46 (10):948 19.26. McKee RA, Walker FJ, Conner JR, Raj R (1993) BaSi2 and thin film alkaline earth silicides on silicon, Appl. Phys. Lett. 63 (20):2818 19.27. Frantz JD, Mysen BO (1995) Raman spectra and structure of BaO-SiO2 , SrO-SiO2 and CaO-SiO2 melts to 1600◦ C, Chemical Geology 121:155 19.28. Krebs H, Walter PHL (1968) Fundamentals of Inorganic Crystal Chemistry, McGraw-Hill, London:Chapter 23 19.29. Lin A, Hong X, Wood V, Verevkin AA, Ahn CH, McKee RA, Walker FJ, Specht ED (2001) Epitaxial growth of Pb(Zr0.2 Ti 0.8 )O3 on Si and its nanoscale piezoelectric properties, Appl. Phys. Lett. 78 (14):2034 19.30. Stoneham AM and Dhote J (2002) A compilation of crystal data for halides and oxides, http://www.cmmp.ucl.ac.uk/∼ahh/research/crystal/homepage.htm, University College London, London: and references contained therein 19.31. Megaw HD (1946) Crystal tructure of double oxides of the perovskite type, Proc. Of the Phys. Soc. London 58:133 19.32. Lide David R (1995) CRC Handbook of Chemistry and Physics, 75th edn, CRC Press 19.33. Jellison GE, Boatner LA, Lowndes DH, McKee RA, Godbole M (1994) Optical functions and transparent thin-films of SrTiO3 , BaTiO3 , and SiOx determined by spectroscopic ellipsometry, Appl. Optics 33 (25):6053 19.34. Kwei GH, Lawson AC, Billinge SJL, Cheong S-W (1993) Structures of the ferroelectric phases of barium titanate, J. 
Phys Chem 97, 2368; and Harada J, Pedersen T, Barnea Z (1970) X-ray and Neutron Diffraction Study of Tetragonal Barium Titanate, Acta Crystallogr. 26:336 19.35. Kennedy BJ, Howard CJ, Chakoumakos BC (1999) High-temperature phase transistions in SrZrO3 , Phys. Rev. B 59 (6):4023 19.36. Hinatsu Y (1996) Electron paramagnetic resonance spectra of Pr4+ in BaCeO3 , BaArO3 , BaSnO3 and their solid solutions, J. Solid State Chemistry 122:384 19.37. McKee RA, Walker FJ, J. R. Conner JR , Specht ED, Zelmon DE (1991) Molecular-beam eptiaxy growth of epitaxial Barium silicide, Barium Oxide, and Barium-Titanate on silicon, Appl. Phys. Lett. 59 (7):782 19.38. Wagner CD, Riggs WM, Davis LE, Moulder JF, Muilenberg GE (1979) Handbook of X-Ray Photoelectron Spectroscopy, Perkin-Elmer Corp, Eden Prarie, MN 19.39. Chambers SA, Liang Y, Yu Z, Droopad R, Ramdani J (2001) Band offset and structure of SrTiO3 /Si(001) heterojunctions, J. Vac. Sci. A 19 (3):934

19 High-k Crystalline Gate Dielectrics: A Research Perspective

637

19.40. Lee GH, Shin BC, Kim IS (2001) Critical thickness of BaTiO3 film on SrTiO3 (001) evaluated by reflection high-energy electron diffraction, Materials Letters 50:134 19.41. Tabata H, Tanaka H, Kawai T (1994) Formation of artificial BaTiO3 /SrTiO3 superlattices using pulsed laser deposition and their dielectric properties, Appl. Phys. Lett. 65 (15):1970 19.42. Droopad R, Yu ZY, Ramdani J, Hilt L, Curless J, Overgaard C, Edwards JL, Finder J, Eisenbeiser K, Wang J, Kaushik V, Ngyuen BY, Ooms B (2001) Epitaxial oxides on silicon grown by molecular beam epitaxy, J. Cryst. Growth 227:936 19.43. Mori H and Ishiwara H (1991) Jpn. J. Appl. Phys. 30:L1415 19.44. Herrera-Gomez A, Aguirre-Tostado FS, Sun Y, Pianetta P, Yu Z, Marshall D, Droopad R, Spicer WE (2001) Photoemission from the Sr/Si(001) interface, J. Appl. Phys. 90 (12):6070 19.45. Hu X, Yao X, Peterson CA, Sarid D, Yu Z, Wang J, Marshall DS, Droopad R, Hallmark JA, Ooms WJ (2000) The (3x2) phase of Ba adsorption on Si(001)-2x1, Surf. Sci. 44:256 19.46. Bakhtizin RZ, Kishimoto J, Hashizume T, Sakurai T (1996) STM study of Sr adsorption on Si(100) surface, Appl. Surf. Sci. 94/95:478 19.47. Goodner DM, Marasco DL, Escuardo AA, Cao L, Tinkham BP, Bedzyk MJ (2003) X-ray standing wave study of the Sr/Si(001) 2x3 surface, Surface Science 547:19 19.48. Stocks GM, Shelton WA, private communication 19.49. Tasker PW, Colbourn EA, Mackrodt WC (1985) Segregation of isovalent impurity cations at the surfaces of MgO and CaO, J. Am. Ceram. Soc. 68 (2):74 19.50. Lettieri J, Haeni JH, Schlom DG (2002) Critical issues in the heteroepitaxial growth of alkaline-earth oxides on silicon, J. Vac. Sci. Technol. A 20 (4):1332 19.51. Lucovsky G, Phillips JC (2000) Limitations for aggressively scaled CMOS Si devices due to bond coordination constraints and reduced band offset energies at Si-high-k dielectric interfaces, Appl. Surf. Sci. 166:497 19.52. 
An excellent treatment of MOS dielectric theory and field effect phenomena in such a device can be found in Nicollian and Brews (see pg. 332 for discussion of Dit and ∆C); Nicollian EH, Brews JR (1982) MOS(Metal Oxide Semiconductor) Physics and Technology, John Wiley & Sons, New York:332 19.53. Arora Narain (1993) MOSFET Models for VLSI Circuit Simulation, Springer Verlag, Wien, New York:Chapter 6 19.54. Brews JR (1975) Theory of carrier-density fluctuations in an IGFET near threshold, J Appl Phys 46:2181; and Brews JR (1975) Carrier-density fluctuations and IGFET mobility near threshold, J. Appl. Phys. 46:2193 19.55. McKee RA, Walker FJ (1998) CaTiO3 interfacial template structure on semiconductor-based material and the growth of electro ceramic thin-films in the perovskite class, US Patent No. 5,830,270

20 High-k Crystalline Gate Dielectrics: An IC Manufacturer's Perspective

R. Droopad, K. Eisenbeiser, and A.A. Demkov

20.1 Introduction

Scaling of complementary metal oxide semiconductor (CMOS) devices has been a driving force in the digital age. One component of this scaling is the reduction of the gate dielectric thickness in order to increase the capacitance in the device. This thickness reduction has also led to increased gate currents due to quantum mechanical tunneling through these thin gate dielectrics. An alternative way to increase capacitance is to maintain the thickness and use a material with a dielectric constant higher than that of the gate dielectrics currently in use. Crystalline oxides are one of the classes of high dielectric constant (high-k) materials being investigated for future gate dielectric applications. Like all potential gate dielectrics, crystalline oxides must meet several requirements to be successfully implemented: (a) the material must have a dielectric constant higher than that of SiO2 (approximately four), (b) the new material should be stable in contact with Si (that is, it should not form silicides or silicates) and with the gate electrode material at the temperatures used during device processing, (c) the density of interfacial traps should be on par with that of SiO2, and the trap density within the dielectric should be low, (d) the conduction and valence band offsets should each be at least 1 eV (3.2 eV at the Si–SiO2 interface) to provide the energy barrier necessary for gate action at the operational bias, (e) the electrical properties of the dielectric should be stable over a wide range of operating temperatures and operating frequencies, (f) the process used to deposit the gate dielectric must meet the standards for uniformity and reproducibility set by current gate dielectric deposition techniques, and must do so in a cost-effective manner.
Crystalline dielectrics may hold advantages over amorphous films for high-k applications in two areas: higher dielectric constants and the possibility of almost defect-free bulk and interfaces. The higher dielectric constant can be understood from the mechanisms that contribute to polarization in a material. Within the adiabatic approximation the dielectric constant of a solid has two major contributions: the electronic and lattice susceptibilities. The orientational term, important in polar liquids and molecular solids, may be neglected for a covalent or ionic material. The electronic susceptibility scales approximately as the square of the ratio of the plasma frequency ωp (a measure of the electron density) to the so-called Penn gap EPG (an empirical parameter measuring the spectral average energy gap of the electronic spectrum rather than the actual bandgap) [20.1]:

$$\varepsilon = 1 + \left( \frac{\omega_p}{E_{PG}} \right)^2 \tag{20.1}$$
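As a quick numerical illustration of (20.1), the following sketch evaluates the electronic dielectric constant at the corners of a representative parameter window: a plasma energy of 15–17 eV and a Penn gap of 4–6 eV. The function name and the choice to express ωp in energy units (eV) are assumptions made for this example only.

```python
def penn_dielectric_constant(plasma_energy_ev: float, penn_gap_ev: float) -> float:
    """Electronic dielectric constant from (20.1): eps = 1 + (omega_p / E_PG)**2.

    Both arguments are energies in eV (omega_p is taken as hbar * omega_p).
    """
    if penn_gap_ev <= 0:
        raise ValueError("Penn gap must be positive")
    return 1.0 + (plasma_energy_ev / penn_gap_ev) ** 2


# Scan the corners of the assumed parameter window.
for plasma_energy in (15.0, 17.0):
    for penn_gap in (4.0, 6.0):
        eps = penn_dielectric_constant(plasma_energy, penn_gap)
        print(f"omega_p = {plasma_energy:4.1f} eV, E_PG = {penn_gap:3.1f} eV -> eps = {eps:5.2f}")
```

Across this window the estimate stays between roughly 7 and 19, which is why a purely electronic response is not expected to push the dielectric constant far beyond the 15–20 range.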

With EPG being about 4–6 eV for most semiconductors, and ωp being in the range of 15–17 eV, the electronic component alone is unlikely to yield a dielectric constant much higher than approximately 15. For this reason truly amorphous materials, which rely solely on electronic polarizability, do not have dielectric constants higher than 15–20. The frequency-dependent lattice polarizability, or dielectric susceptibility tensor, is given by [20.2]

$$\alpha_{ij}(\omega) = \frac{1}{V} \sum_{k} \left( \omega_k^2 - \omega^2 \right)^{-1} \left[ \sum_{l} \frac{q_l \, \xi_i^{*}(l,k)}{\sqrt{m_l}} \right] \left[ \sum_{l'} \frac{q_{l'} \, \xi_j^{*}(l',k)}{\sqrt{m_{l'}}} \right], \tag{20.2}$$

where ωk is the phonon mode frequency, V is the unit cell volume, ml and ql are the mass and charge of the lth ion, and ξ(l, k) is its displacement vector in the kth eigenmode. It is readily seen that the static dielectric constant (ε(0) = 1 + 4πα(0)) can be arbitrarily large in the presence of so-called soft (very low or zero frequency) phonon modes. That is why the vast majority of novel gate dielectrics proposed to date, e.g. SrTiO3 [20.3–20.5], ZrO2 [20.6], HfO2 [20.7], TiO2 [20.8], or Ta2O5 [20.9], are insulators that display high lattice polarizability in the crystalline form due to such phonon modes. This crystalline form can be monocrystalline, polycrystalline or nanocrystalline. While a dielectric constant higher than that of SiO2 allows a gate dielectric to have a larger physical thickness for the same capacitance, and so reduces tunneling current through the dielectric, several performance factors place an upper bound on the desired dielectric constant. The first factor, noted above, is band offsets. The band offsets are the energy barriers between the silicon channel and the gate dielectric for either electrons (conduction band offset) or holes (valence band offset). These energy barriers are needed to reduce the injection of hot electrons or holes into the dielectric, where they may contribute to effects such as threshold shifts and gate leakage. At normal operating temperatures, offsets of greater than one electron volt are thought to be sufficient. This implies that the overall band gap of the dielectric must be greater than the band gap of Si plus two electron volts, or ∼ 3.1 eV. In reality the conduction band and valence band offsets are seldom equal, so larger band gaps are desirable. Since the Penn gap is related to the band gap, (20.1) above shows that as the electronic contribution to the dielectric constant is increased the band gap is decreased.
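The two design numbers used in this argument, the physical thickness that a high-k layer may have for a given equivalent oxide thickness (EOT) and the minimum band gap implied by 1 eV offsets, follow from simple arithmetic. The sketch below is illustrative only; the value 3.9 for the SiO2 dielectric constant, the 1.1 eV Si band gap, and all function names are assumptions of this example.

```python
K_SIO2 = 3.9          # assumed dielectric constant of SiO2 ("approximately four")
SI_BAND_GAP_EV = 1.1  # assumed Si band gap, consistent with the ~3.1 eV quoted in the text


def physical_thickness(eot_angstrom: float, k: float) -> float:
    """Physical thickness (Å) of a single layer with dielectric constant k
    that gives the same capacitance as eot_angstrom of SiO2."""
    return eot_angstrom * k / K_SIO2


def minimum_band_gap(min_offset_ev: float = 1.0) -> float:
    """Smallest dielectric band gap (eV) that leaves min_offset_ev on both the
    conduction and the valence band side, assuming symmetric offsets."""
    return SI_BAND_GAP_EV + 2.0 * min_offset_ev


for k in (3.9, 10.0, 20.0, 40.0):
    print(f"k = {k:4.1f}: 10 Å EOT corresponds to {physical_thickness(10.0, k):6.1f} Å physical")
print(f"minimum band gap for 1 eV offsets: {minimum_band_gap():.1f} eV")
```

A layer with k = 20 can therefore be about five times thicker than SiO2 at the same capacitance, which is the origin of the reduced tunneling current.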
Fig. 20.1. Estimation of gate leakage current (A/cm2) as a function of dielectric constant for a high-k gate material. Curves show this leakage for different EOT values (5, 10, and 15 Å). The shaded region on the right, where Tgate dielectric/Lgate > 0.1, shows where short channel effects may start to degrade device performance

This trend has been seen in empirical data, where the band gap drops off with increased dielectric constant in commonly used materials. The reduction in band gap slows at higher dielectric constants (> 25), as lattice polarization becomes dominant. Use of a high dielectric constant material may then lead to issues with charge injection due to the reduced band offsets. This effect may be less of a problem with crystalline materials, since they can rely on lattice polarization, which is not strongly related to band gap. The other factor that places an upper bound on the desired dielectric constant is short channel effects. As discussed above, use of a high dielectric constant material allows constant capacitance to be maintained with thicker layers than a lower dielectric constant material would permit. While this reduces tunneling current through the dielectric layer, short channel effects can become an issue. Studies have shown that as the physical thickness of the gate dielectric approaches 0.1 to 0.2 of the length of the gate, short channel effects will degrade transistor performance [20.10]. Since the capacitance in the device is generally maintained from generation to generation, the limit on the physical thickness of the gate dielectric effectively puts an upper limit on the dielectric constant. A model has been developed to balance these effects. This model uses a fit to empirical data to define the relationship between bandgap and dielectric constant. The conduction and valence band offsets are assumed to be equal. This band structure information is then used to estimate gate leakage current for a desired capacitance and dielectric constant. The gate leakage current includes contributions from both current over the barrier (i.e. thermionic
emission) and current through the barrier (i.e. tunneling current). The gate length from the ITRS roadmap is then used to estimate when short channel effects may limit the dielectric constant. A plot of these data for the case where there are no interfacial layers is shown in Fig. 20.1. This figure shows that for each generation the leakage current can be significantly reduced through the use of materials with increased dielectric constant. However, since the gate length in the ITRS roadmap actually scales faster than the capacitance, short channel effects become a limiting factor. Even for the smallest equivalent oxide thickness shown, dielectric constants above 20 are probably undesirable since they may lead to short channel effects. This suggests that the higher dielectric constants possible with crystalline materials compared to amorphous materials are not needed if low dielectric constant interfacial layers can be avoided. While dielectric layers with no interfacial layers may be the target of most research into high dielectric constant materials, they are very seldom achieved. Much more common in the literature are dielectric stacks with an interfacial layer between the Si and the high dielectric constant material. This interfacial layer usually has a lower dielectric constant and may play an important role in the performance of the device. For instance, if the interfacial layer is SiO2, some benefits may be present. The SiO2/Si interface is well understood and can be well controlled with known manufacturing processes. This would remove the requirement of a good interface with Si from the high-k material and may make selection of an appropriate high-k material easier. The model described above can be used to look at the case where the interfacial layer is undoped SiO2. These results are shown in Fig. 20.2. In this case, with a 4 Å SiO2 interfacial layer, higher dielectric constant materials are needed.
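The reason an interfacial layer pushes the required dielectric constant upward can be seen from the series-capacitor bookkeeping of a two-layer stack, combined with the rule of thumb that the physical dielectric thickness should stay below about 0.1 of the gate length. The sketch below is a minimal illustration and not the leakage model used for the figures; the SiO2 dielectric constant of 3.9, the 25 nm gate length, and the function names are hypothetical choices for this example.

```python
K_SIO2 = 3.9  # assumed dielectric constant of SiO2


def stack_eot(t_il: float, k_il: float, t_hk: float, k_hk: float) -> float:
    """EOT (Å) of an interfacial layer plus high-k layer treated as series capacitors."""
    return t_il * K_SIO2 / k_il + t_hk * K_SIO2 / k_hk


def highk_thickness_for_target(eot_target: float, t_il: float, k_hk: float,
                               k_il: float = K_SIO2) -> float:
    """Physical high-k thickness (Å) that reaches eot_target once the
    interfacial layer has consumed its share of the EOT budget."""
    remaining = eot_target - t_il * K_SIO2 / k_il
    if remaining <= 0:
        raise ValueError("interfacial layer alone exceeds the EOT target")
    return remaining * k_hk / K_SIO2


def exceeds_short_channel_limit(t_total: float, gate_length: float,
                                ratio_limit: float = 0.1) -> bool:
    """True if the total dielectric thickness exceeds ratio_limit * gate length."""
    return t_total / gate_length > ratio_limit


GATE_LENGTH_A = 250.0  # hypothetical 25 nm gate, in Å
for k in (10.0, 20.0, 40.0):
    t_hk = highk_thickness_for_target(10.0, 4.0, k)
    total = 4.0 + t_hk
    flag = exceeds_short_channel_limit(total, GATE_LENGTH_A)
    print(f"k = {k:4.1f}: high-k {t_hk:5.1f} Å, stack {total:5.1f} Å, short-channel risk: {flag}")
```

With a 4 Å SiO2 layer, only 6 Å of the 10 Å EOT budget is left for the high-k film, so a larger dielectric constant is needed to keep that film thick enough to suppress tunneling, while the short-channel check caps how thick it may become.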
Fig. 20.2. Estimation of gate leakage current (A/cm2) as a function of dielectric constant for a high-k gate material with a 4 Å SiO2 interfacial layer. Curves show this leakage for different EOT values (5, 10, and 15 Å). The shaded region in the lower right, where Tgate dielectric/Lgate > 0.1, shows where short channel effects may start to degrade device performance

From this figure it can be seen that for technology nodes requiring an equivalent oxide thickness less than 10 Å, dielectric constants greater than 20 may be desirable and would require some crystalline dielectric. One issue with such a low-k/high-k stack is that an extra interface is added to the system. Since this interface is close to the channel, carriers can be injected into the interface or interact with charges at the interface, resulting in performance issues. The other possible advantage of crystalline dielectrics, as mentioned above, is the absence of defects in the bulk or at the interface in an ideal heteroepitaxial system. In the SiO2–Si system dangling bonds at the interface are neutralized with hydrogen to create an interface with very few defects. This type of interface is necessary to maintain good performance in a MOSFET. In an ideal heteroepitaxial system there are no dangling bonds, so the interface properties can meet or exceed those of the SiO2–Si interface with no passivation. If such an interface can be achieved in a high-k/Si system, it will have significant performance advantages compared to a passivated amorphous high-k/Si interface. The bulk trap density of the ideal crystalline high-k material could also be superior to that of many of the "amorphous" high-k materials. An ideal crystalline material is free of the defects and inhomogeneities that cause trap states and enhance diffusion rates, and that may be present in many of the poly- or nanocrystalline forms that the amorphous high-k materials take upon exposure to high temperatures. Of course, even the best crystalline high-k materials are not ideal and may have point defects as well as grain boundaries that reduce these performance advantages in practice.

Complex oxide materials, many of which belong to the class of perovskites (cubic crystals with the chemical formula ABO3) [20.11–20.17], are the crystalline high-k materials that have been most studied to date. One perovskite material that has been demonstrated for MOSFET applications is SrTiO3. The development work carried out at Motorola Labs in pursuing crystalline oxides for gate dielectric applications will be described. The Si–SrTiO3 system will be discussed as a prototypical semiconductor-crystalline oxide system. The understanding of the atomic structure at the silicon/oxide interface, even though it is still incomplete, will be important in the development of growth processes for other crystalline oxide/silicon systems such as LaAlO3, LaLuO3, BaZrO3, etc. The development of the Si–SrTiO3 materials system has highlighted some of the advantages as well as the challenges of using a crystalline dielectric. The rest of the chapter is organized as follows: first, a brief outline of the theoretical work used in support of the materials development is given, followed by a discussion of the growth process details, including surface preparation. The chapter concludes with a discussion of the device results obtained for MOSFETs with the epitaxial SrTiO3 gate dielectric and a summary of the advantages and challenges of crystalline high-k materials.


20.2 Theoretical Overview

Over the past two decades, developments in electronic structure theory and the rapid growth of commercially available computational power have made density functional theory (DFT) an indispensable tool in materials development. In the past few years DFT within the local density approximation (LDA) has been successfully used to analyze various properties of the Si–SrTiO3 (STO) system that are important for high-quality material growth and ultimately device fabrication. Among the issues considered theoretically is the formation of a stable Sr template on the Si (001) surface, which is necessary to facilitate the epitaxial growth of STO on Si [20.13]. The thermodynamic stability of the Si–STO interface [20.14] has been estimated, and wetting of Si by the oxide has been investigated [20.15]. Several detailed studies of the structure and thermodynamics of the flat and stepped STO surface have also been conducted [20.15–20.19]. The band discontinuity at the Si–STO interface has been calculated for various possible structures [20.15–20.20]. Instead of summarizing the results of these studies right away, we will discuss them in the context of the actual growth process where appropriate. We start with a brief description of the STO surface, which, as will become apparent later, is important for STO film growth on a substrate.

20.3 Perovskite Surface

The surfaces of cubic perovskites have recently been studied theoretically [20.15–20.18, 20.20] and experimentally [20.21]. The interest is primarily in understanding the fundamental relationship between the surface and various bulk properties of these materials, such as piezoelectricity and ferroelectricity. The surface relaxation and relative energies for the two possible surface terminations of the cubic form of SrTiO3 are investigated as follows. First, the equilibrium lattice constant and band structure of the bulk cubic phase of SrTiO3 are computed, giving 3.855 Å and 1.80 eV for the lattice constant and band gap, respectively. The lattice constant is slightly (1.3%) smaller than the experimental value of 3.905 Å, and the band gap is about half of the experimental value [20.22], which is typical of LDA-DFT methods. Next, the geometry of the relaxed surfaces was examined. Calculations are done for periodic, symmetrically terminated slabs containing 17 atoms (SrO terminated) and 18 atoms (TiO2 terminated). Periodic boundary conditions in the lateral directions are applied across (1 × 1) unit cells of the theoretical lattice constant. The z-axis is taken as normal to the surface. The slabs are three lattice constants thick and are separated by a vacuum region two lattice constants thick in the z direction. To estimate the relative stability of the SrO and TiO2 terminations of the SrTiO3 surface, the grand thermodynamic potential was calculated following the method of [20.18]. To simplify the analysis, slabs of SrO and TiO2 units were used to evaluate the grand thermodynamic potential as a function of the appropriate chemical potential. The chemical potentials of SrO and TiO2 are not independent, since the system is assumed to be in equilibrium with a reservoir of bulk SrTiO3. Therefore, referencing the chemical potentials to the corresponding bulk materials (e.g. μ̃SrO = μSrO + ESrO, where ESrO is the energy per formula unit computed for the bulk rock salt structure), and introducing the formation energy of the perovskite phase Ef (−Ef = ESrTiO3 − ESrO − ETiO2), gives

$$\mu_{SrO} + \mu_{TiO_2} = -E_f . \tag{20.3}$$
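The step leading to (20.3) can be written out explicitly. The sketch below assumes that equilibrium with the bulk SrTiO3 reservoir means that the unreferenced chemical potentials (written with tildes) of one SrO and one TiO2 unit add up to the bulk energy of one SrTiO3 formula unit; this reading of the equilibrium condition is an interpretation and is not spelled out in the text.

```latex
\begin{aligned}
\tilde{\mu}_{SrO} + \tilde{\mu}_{TiO_2} &= E_{SrTiO_3}
  && \text{(equilibrium with bulk SrTiO}_3\text{)} \\
(\mu_{SrO} + E_{SrO}) + (\mu_{TiO_2} + E_{TiO_2}) &= E_{SrTiO_3}
  && \text{(referencing to the bulk binary oxides)} \\
\mu_{SrO} + \mu_{TiO_2} &= E_{SrTiO_3} - E_{SrO} - E_{TiO_2} = -E_f
  && \text{(definition of } E_f\text{)}
\end{aligned}
```

In this form the limits are easy to read off: setting μTiO2 = 0 forces μSrO = −Ef, and setting μTiO2 = −Ef forces μSrO = 0.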

The formation energy of SrTiO3 with respect to the binary oxides was estimated from available thermodynamic data. The CRC Handbook of Chemistry and Physics gives the enthalpies of formation with respect to separate atoms of −141.5 kcal/mol, −224.6 kcal/mol, and −399.7 kcal/mol for SrO, TiO2 and SrTiO3, respectively [20.23]. This results in a SrTiO3 formation energy with respect to these oxides of 34 kcal/mol or 1.462 eV. First principles calculations give 1.429 eV, in good agreement with experimental observations. The SrTiO3 energy is calculated for a relaxed bulk cubic structure, that of TiO2 for the relaxed rutile structure, and that of SrO for the relaxed rock salt structure. For comparison, values of 1.642 eV or 38 kcal/mol were obtained for BaTiO3. Since, according to (20.3), only one of the chemical potentials is an independent variable, μTiO2 was chosen and allowed to vary between zero and −Ef. The end points of this range correspond to equilibrium with TiO2 and SrO, respectively. For values lower than −Ef bulk SrO can form, and for those above zero the formation of bulk TiO2 becomes possible. The grand thermodynamic potential F is given by [20.18]

$$F = \frac{1}{2} \left[ E_{slab} - N_{TiO_2} \left( \mu_{TiO_2} + E_{TiO_2} \right) - N_{SrO} \left( \mu_{SrO} + E_{SrO} \right) \right], \tag{20.4}$$

where the energy is given per surface unit cell, and the factor of 1/2 is due to having two surfaces in a slab calculation. The energy of the slab Eslab is taken for the relaxed slabs described previously. The energy gain due to the relaxation is 210 erg/cm2 and 231 erg/cm2 for the TiO2-terminated and SrO-terminated slabs, respectively, and therefore accounts for more than 10% of the surface energy. The free energy of the surface for both terminations as a function of the TiO2 chemical potential is shown in Fig. 20.3. Following the lowest energy termination (thick black line), it is clear that for the majority of experimental conditions the SrO-terminated surface is thermodynamically more stable. However, that does not necessarily mean that the SrO termination should occur more often, and the energy of desorption is another factor to be considered [20.19]. The lowest surface energy of 801 erg/cm2 and the highest surface energy of 2127 erg/cm2 are achieved for the SrO- and TiO2-terminated surfaces, respectively, both occurring in Sr-rich growth conditions. It is worth noting that the energy of the SrO-terminated STO surface under Sr-rich conditions is similar to that of bulk SrO (800 erg/cm2) [20.24]. The quantity that is independent of the chemical potential is the average surface energy, estimated to be 1.359 eV per (1 × 1) surface cell, or about 1460 erg/cm2. To put this in perspective, the cleavage energy of Si along the (110) plane is 1900 erg/cm2, and that of GaAs along the (110) plane is 860 erg/cm2 [20.25].

Fig. 20.3. Grand thermodynamic potential of the STO surface (surface free energy, erg/cm2) plotted as a function of the TiO2 chemical potential (eV). The dashed and dotted lines correspond to SrO- and TiO2-terminated surfaces, respectively. The thick line traces the lowest energy termination

The most important implication of these results for MBE growth is the relatively large energy difference between the SrO and TiO2 terminations in the Sr-rich regime. For example, the most natural choice of chemical environment for STO growth would be one in which the energies of the two terminations are equal. Figure 20.3 suggests a Ti-rich regime. However, if the growth uses shuttering (sequential deposition of Sr and Ti), the SrO layer is deposited under Sr-rich conditions and the TiO2 layer under Ti-rich conditions. This puts restrictions on the growth temperature, since the underlying layer is severely disadvantaged thermodynamically and may form islands if the mobility is high enough, or even desorb. If co-deposition is used, a better quality film is again expected in a Ti-rich regime, and if a Sr-rich environment is sought, the film should be terminated with an SrO layer. The surface morphology is generally consistent with this prediction [20.26]. For films grown under an excess of Sr the surface layers are not "complete" and a large number of half unit cell steps are present. The energetics of steps on the STO surface has recently been considered by Zhang and Demkov, who used a novel thermodynamic approach to estimate the step energy for the SrTiO3 surface with a half unit cell high step [20.16]. The energy of the step edge varies between 0.06 eV/Å for the stoichiometric terrace and 0.3 eV/Å for the non-stoichiometric (metallic) terrace. It was found that the stepped SrTiO3 surface prefers an O step edge under extremely oxygen-rich conditions and a Sr or TiO edge under extremely oxygen-deficient conditions. However, under the majority of chemical environments the edge termination is mixed, and it is the stoichiometry that drives the terrace termination.
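The thermodynamic bookkeeping of this section can be checked with a few lines of arithmetic. The sketch below (i) derives the formation energy of SrTiO3 from the three tabulated enthalpies, (ii) implements (20.3) and (20.4), and (iii) converts a surface energy from eV per (1 × 1) cell to erg/cm2. The conversion factors are standard values; the function names are illustrative assumptions, and no slab energies are supplied because none are quoted in the text.

```python
KCAL_PER_MOL_IN_EV = 0.0433641  # 1 kcal/mol expressed in eV per formula unit
EV_IN_ERG = 1.602177e-12        # 1 eV in erg
ANGSTROM_IN_CM = 1.0e-8         # 1 Å in cm


def formation_energy_ev(h_sro: float, h_tio2: float, h_srtio3: float) -> float:
    """E_f (eV) of SrTiO3 relative to SrO + TiO2, from enthalpies in kcal/mol."""
    return (h_sro + h_tio2 - h_srtio3) * KCAL_PER_MOL_IN_EV


def mu_sro(mu_tio2: float, e_f: float) -> float:
    """Dependent chemical potential from (20.3): mu_SrO = -E_f - mu_TiO2."""
    return -e_f - mu_tio2


def grand_potential(e_slab: float, n_tio2: int, n_sro: int, mu_tio2: float,
                    e_f: float, e_tio2: float, e_sro: float) -> float:
    """Grand thermodynamic potential per surface cell, equation (20.4)."""
    return 0.5 * (e_slab
                  - n_tio2 * (mu_tio2 + e_tio2)
                  - n_sro * (mu_sro(mu_tio2, e_f) + e_sro))


def ev_per_cell_to_erg_per_cm2(energy_ev: float, a_angstrom: float) -> float:
    """Convert an energy per (1 x 1) surface cell of side a to erg/cm^2."""
    area_cm2 = (a_angstrom * ANGSTROM_IN_CM) ** 2
    return energy_ev * EV_IN_ERG / area_cm2


e_f = formation_energy_ev(-141.5, -224.6, -399.7)
print(f"E_f from tabulated enthalpies: {e_f:.3f} eV")
print(f"1.359 eV per cell (a = 3.855 Å): {ev_per_cell_to_erg_per_cm2(1.359, 3.855):.0f} erg/cm^2")
```

The tabulated enthalpies give roughly 1.46 eV, and 1.359 eV per cell at the theoretical lattice constant converts to roughly 1.46 × 10^3 erg/cm2, which is also the mean of the 801 and 2127 erg/cm2 end values. The 17-atom and 18-atom slabs correspond to 4 SrO + 3 TiO2 and 3 SrO + 4 TiO2 units, so the sum of their grand potentials contains seven complete SrTiO3 units and does not depend on the chemical potential.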

20.4 Oxide Deposition

Molecular beam epitaxy (MBE) is the method of choice when tight control over the growth conditions and layer/interface properties is required. The growth of oxides on Si by this method has been useful in developing an understanding of the materials fundamentals and issues of integration, as well as in demonstrating device performance. MBE is an evaporation technique that can be used to deposit thin films of high-quality single crystal materials onto a crystalline substrate. In this technique, neutral atomic and molecular beams, generated either thermally from Knudsen effusion cells or by e-beam evaporation, are directed onto a heated substrate under ultrahigh vacuum (UHV) conditions. One of the most significant advantages of MBE is its ability to incorporate UHV-associated surface analytical equipment, either directly in the growth chamber or connected to it. Such equipment allows the chemical and structural properties of the so-called epilayer to be monitored throughout the growth process. The growth rates in MBE are typically < 5 Å/s and are controlled by the sticking coefficients of the constituent atoms or molecules. As a result, the layer thickness, composition and interface roughness can be controlled to within atomic dimensions. Because of the UHV conditions, impurities within the film and at heterointerfaces are also kept to a minimum. Growing oxide materials on Si is a challenging task. The main difficulty is that the kinetic and thermodynamic conditions necessary for high-quality film growth tend to be mutually exclusive. The growth temperature must be high enough to achieve high crystallinity, the partial oxygen pressure needs to be sufficient to fully oxidize the metals, and at the same time the oxidation of the substrate must be suppressed. Clearly, a rather delicate balance between the oxidation kinetics and thermodynamics must be observed.

Considering the lattice constants, Si has a diamond structure with a cubic lattice constant of 5.43 Å, while SrTiO3 is a cubic perovskite with a lattice constant of 3.9 Å. The simple cubic cell of SrTiO3 contains Sr atoms at the cube's corners, one Ti atom at the body center, and oxygen atoms at the centers of all faces. The matching of the perovskite cell to silicon's diamond cell is often described as a 45° rotation of the perovskite with respect to the conventional cubic cell of Si. Indeed, a (1 × 1) unit cell of the unreconstructed Si (001) surface has a lattice vector of 3.84 Å (a/√2, a = 5.43 Å), which is only 1.7% smaller than the 3.905 Å of SrTiO3. Several groups were recently able to grow thin STO films on Si using MBE [20.3–20.5, 20.11, 20.12].
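The lattice-matching argument reduces to a one-line calculation, sketched below under the assumption of room-temperature lattice constants of 5.43 Å for Si and 3.905 Å for SrTiO3; the function names are illustrative.

```python
import math

A_SI = 5.43    # cubic lattice constant of Si, Å
A_STO = 3.905  # cubic lattice constant of SrTiO3, Å


def si_001_surface_vector(a_si: float) -> float:
    """Lattice vector (Å) of the unreconstructed Si(001) (1 x 1) cell, a / sqrt(2)."""
    return a_si / math.sqrt(2.0)


def mismatch_percent(a_film: float, a_substrate_cell: float) -> float:
    """Percentage by which the substrate surface cell is smaller than the film cell."""
    return 100.0 * (a_film - a_substrate_cell) / a_film


surface_vector = si_001_surface_vector(A_SI)
print(f"Si(001) surface lattice vector: {surface_vector:.3f} Å")
print(f"mismatch to SrTiO3: {mismatch_percent(A_STO, surface_vector):.2f} %")
```

The Si surface cell comes out at about 3.84 Å, so an SrTiO3 cell rotated by 45° is compressed in-plane by roughly 1.7% when it is strained to the substrate.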


20.5 Growth Template

The growth of crystalline oxides on Si(001) was shown to depend strongly on the phase diagram of a sub-monolayer of alkaline-earth metal on the Si(001) surface. Experimentally, it was found that (i) a Sr coverage of 1/6 ML–1/3 ML causes a 3 × 2 reconstruction of the Sr/Si surface; (ii) a Sr coverage of 1/3 ML–1/2 ML causes a 2 × 1 reconstruction; (iii) a 5 × 1 reconstruction occurs at approximately 0.7 ML coverage. For ∼ 0.8 ML coverage a 7 × 1 reconstruction occurs, while at very low temperature a 3 × 1 reconstruction is observed at a higher coverage [20.27]. Theoretically, Wang et al. [20.28] investigated possible bonding sites for Ba on a Si(001) surface. It was found that the favorable bonding site of the adatom is the fourfold site located in the trough between two Si dimer rows. However, little is known about the forces driving the surface to such ordered structures. We studied the physics behind the various reconstructions of the Sr/Si surface by means of ab initio DFT calculations. By comparing the surface energy at various levels of coverage using thermodynamic considerations, two distinct mechanisms for the Sr-surface interaction were identified. At low coverage (below 1 ML) the charge transfer between Sr and the Si surface drives the reconstruction. In addition, it was found that at high coverage (1 ML and higher) the size mismatch between the Sr atoms and the Si substrate induces a series of 3×, 5×, and 7× reconstructions. For Sr coverage below 1/2 ML, the major mechanism by which the system reduces its energy is charge transfer between Sr and Si atoms, resulting in "untilting" of the Si dimers. The most important case is that of 1/2 ML coverage. Due to the charge transfer from Sr to Si, all surface dimers are symmetric at this point, resulting in a (2 × 1) reconstruction. This also corresponds to the surface π* band being fully occupied.
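The special role of the 1/2 ML coverage can be made plausible with a simple electron count. The sketch below is a back-of-the-envelope illustration, not a result of the DFT calculations: it assumes that 1 ML means one adatom per top-layer Si atom, that each Sr adatom donates its two valence electrons, and that the empty π* state of a Si dimer accommodates two electrons. All names are hypothetical.

```python
from fractions import Fraction

SR_VALENCE_ELECTRONS = 2        # assumption: each Sr adatom donates both 5s electrons
SI_ATOMS_PER_DIMER = 2          # a surface dimer consists of two top-layer Si atoms
PI_STAR_CAPACITY_PER_DIMER = 2  # assumption: the empty pi* state holds two electrons per dimer


def electrons_per_dimer(coverage_ml: Fraction) -> Fraction:
    """Electrons donated to each Si dimer at a given Sr coverage (in ML)."""
    return coverage_ml * SI_ATOMS_PER_DIMER * SR_VALENCE_ELECTRONS


def pi_star_filling(coverage_ml: Fraction) -> Fraction:
    """Fraction of the pi* capacity filled by the donated electrons."""
    return electrons_per_dimer(coverage_ml) / PI_STAR_CAPACITY_PER_DIMER


for coverage in (Fraction(1, 6), Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    print(f"{coverage} ML: {electrons_per_dimer(coverage)} electrons per dimer, "
          f"pi* filling {pi_star_filling(coverage)}")
```

Under these assumptions the π* state is exactly full at 1/2 ML, one Sr atom per dimer, which is consistent with all dimers becoming symmetric at that coverage.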
Note that for 1/2 ML coverage, all the best bonding sites (trough positions) are filled and the stoichiometry of the topmost layer is SrSi2. This stoichiometry is the same as that of a stable Sr silicide. Therefore, 1/2 ML coverage is a rather special case during the Sr deposition on the Si(001) surface. In crystalline SrSi2 the silicon atoms are arranged in a three-coordinated net. This Si net structure is very different from the four-coordinated diamond net of bulk Si, and is made possible by Sr. At 1/2 ML coverage the top layer’s stoichiometry is equivalent to that of the bulk silicide; however, the geometry is rather different. Most intriguing is the fact that the surface is semi-insulating, while the silicide is metallic. It is believed that oxidation of this “silicide” layer is indeed key for the subsequent STO growth. The formation of Si–O bridges in the presence of Sr, without the oxidation of Si below the topmost surface layer, ensures the epitaxial registry between this “template” and the subsequently grown perovskite.

The alkaline earth metals used during the growth of (Sr,Ba)TiO3 are evaporated from low-temperature effusion cells with pyrolytic boron nitride crucibles, and the Ti from a high-temperature cell with a Ta crucible. The epitaxial oxide growth on silicon is achieved using a co-deposition process in which both the alkaline earth metal and the transition element shutters are opened


in a controlled oxygen environment. Since, under these growth conditions, the sticking coefficient of the individual elements is unity, careful calibration of the fluxes is required for stoichiometric oxide films. The Sr (or Ba) and the Ti fluxes are first balanced by determining the time the oxide surface takes to convert from Ti-rich to Sr-rich (or Ba-rich) and vice versa, and adjusting the effusion cell temperatures to equalize the two times. Stoichiometric STO displays a (1 × 1) surface reconstruction, with a 2× reconstruction along the [110] azimuth for a Sr-rich surface and a 2× reconstruction along the [100] azimuth for a Ti-rich surface. The use of RHEED is also critical during the growth of the film, as it enables real-time correction of the stoichiometry of the layers. The growth rate is determined by measuring the period of the RHEED oscillations during the oxide layer growth. The typical growth rate used during this study is approximately 2 Å/min.

20.6 Substrate Preparation

The successful deposition of an epitaxial layer depends on the crystalline quality and cleanliness of the substrate. Epitaxial deposition requires a well-ordered monocrystalline surface that provides nucleation sites for the adatoms of the growing layer. The as-received (100)-oriented silicon substrate is exposed to ozone generated by a commercial ultraviolet ozone generator prior to loading into the MBE deposition system. This step is critical for eliminating carbon-containing species on the Si surface, which can otherwise lead to surface hillocks and non-ideal nucleation of the oxide. The wafers are heated to 150°C in the fast entry load lock to remove any volatile contaminants before transfer into the deposition chamber. The native oxide is removed by depositing 1–2 monolayers of Sr metal on the surface at temperatures between 400 and 600°C, followed by an anneal at a temperature ≥ 750°C in ultrahigh vacuum. The resulting surface is a well-ordered, clean (2 × 1) reconstructed Si surface [20.29]. Figure 20.4 shows an STM image and a LEED pattern of such a surface.

Fig. 20.4. STM image along with a LEED pattern of a Si (2 × 1) reconstructed surface after cleaning using a Sr-assisted desorption procedure prior to the oxide growth (image label: 300 Å)

Fig. 20.5. XPS spectrum of the Si-2p peaks taken from a Si substrate immediately after the Sr-assisted desorption procedure, clearly showing the absence of silicon oxide peaks. The experimental data can be fitted with two curves for the Si 2p1/2 and Si 2p3/2 peaks (axes: intensity in arbitrary units versus binding energy in eV; labeled peaks: Si 2p3/2 and Si 2p1/2)

The images clearly show a stepped Si(100) surface with a single step height of 1.4 Å. In this case, the STM images show a (2 × 1) reconstruction over most of the area, which confirms the findings by RHEED and LEED. In-situ XPS and RHEED measurements also confirm the absence of any carbon on the surface. Figure 20.5 shows the XPS spectrum of the Si 2p peak, clearly indicating the absence of any SiOx. One advantage of this process is that the silicon oxide can be removed at a lower temperature than that needed for a thermal oxide removal process, thereby causing less thermal stress on the silicon wafer and lowering the overall thermal budget of the process. This is an important requirement for the integration of oxide-based electronics with Si CMOS devices, since any high-temperature exposure of Si CMOS devices can affect their performance.

20.7 Initial Nucleation

There are three different growth modes for heteroepitaxy [20.30]. Sometimes the deposited layer appears to be a collection of droplets, much like those of oil on a frying pan. This is the three-dimensional (3D) or island growth of Volmer–Weber. In contrast, some films demonstrate the layer-by-layer 2D or Frank–Van der Merwe (FVdM) growth. Other films exhibit a mixture of the two modes described above: 2D growth persists for the first few monolayers and is then followed by island growth. This so-called Stranski–Krastanov growth, typical for films with a significant lattice mismatch, is what one would expect in the Si-STO system if the STO film “wets” the substrate. The wetting condition, or the condition for 2D growth, is as follows:

$$\gamma_{\mathrm{Si}} > \gamma_{\mathrm{STO}} + E_{\mathrm{interface}} . \tag{20.5}$$


The left-hand side is the surface energy of the substrate, which should exceed the sum of the surface energy of the epitaxial layer and the energy of the interface. Therefore, to examine the possibility of Si “wetting” by STO, the surface energies of Si and STO and an estimate of the energy of the interface between the two have to be determined. The surface energy of Si is easy to compute. Using the theoretical lattice constant (0.54 nm) and then performing a slab calculation for a (2 × 2) cell to account properly for the (2 × 1) reconstructions of both surfaces, a value of 1710 erg/cm² for the Si surface energy is obtained, in good agreement with experiment. The surface energy of STO has been discussed previously and may vary from 860 to 2400 erg/cm² depending on the termination and the growth conditions, as captured by the chemical potential of TiO2. The lowest possible surface energy of 800 erg/cm² is achieved under SrO-rich conditions for the SrO-terminated surface. This sets the stage for the search for an interfacial structure that would result in wetting. Any interface with an energy cost higher than 900 erg/cm² means, according to (20.5), 3D growth. An interfacial structure resulting in a low interface energy of 574 erg/cm² was found to exist under Sr-rich conditions. Therefore the sum of the interface energy and the surface energy of the STO film is now only 1433 erg/cm². Comparing 1710 erg/cm² with 1433 erg/cm², it can be concluded that STO should wet Si. Note that finding at least one wetting interface structure means that two-dimensional nucleation is thermodynamically possible.

The deposition of any oxide on silicon surfaces needs to take into account the kinetics of silicon oxide formation when a clean silicon surface is exposed to oxygen. Once an amorphous oxide forms on the silicon surface, epitaxy is lost for subsequent deposition.
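The wetting test of Eq. (20.5) amounts to a single inequality, sketched below with the energies quoted in the text. The STO surface energy of 859 erg/cm² is an assumption inferred here from the quoted sum (1433 minus 574 erg/cm²); the function names are hypothetical and serve only as illustration.

```python
GAMMA_SI = 1710.0     # erg/cm^2, computed Si surface energy (from the text)
E_INTERFACE = 574.0   # erg/cm^2, Sr-rich interface structure (from the text)
GAMMA_STO = 859.0     # erg/cm^2, ASSUMPTION: inferred as 1433 - 574 from the quoted sum

def wets(gamma_substrate, gamma_film, e_interface):
    """Wetting (2D growth) condition: the substrate surface energy must
    exceed the film surface energy plus the interface energy."""
    return gamma_substrate > gamma_film + e_interface

def interface_budget(gamma_substrate, gamma_film):
    """Largest interface energy that still satisfies the wetting condition."""
    return gamma_substrate - gamma_film

total = GAMMA_STO + E_INTERFACE
print(f"film + interface energy: {total:.0f} erg/cm^2")
print(f"wetting predicted: {wets(GAMMA_SI, GAMMA_STO, E_INTERFACE)}")
print(f"margin: {GAMMA_SI - total:.0f} erg/cm^2")
```

Under these assumptions the condition holds with a margin of roughly 280 erg/cm², while an interface costing 900 erg/cm² or more would fail it and lead to 3D growth.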
After the cleaning of the silicon substrate, the substrate is cooled to the oxide growth temperature, typically in the range of 200–500°C. At this point the RHEED displays a (3 × 2) surface reconstruction, determined to be due to excess Sr on the surface. Additional Sr, up to 1/2 monolayer, is added to the surface until the RHEED displays a (2 × 1) reconstruction. This surface acts as a template for the epitaxial growth of STO. Experimentally, the (3 × 2) surface reconstruction was found to correspond to a Sr coverage of 1/3 monolayer, while the (2 × 1) reconstruction was determined to correspond to approximately 1/2 monolayer [20.27]. From the point of view of growth kinetics, to achieve two-dimensional nucleation of oxide on silicon, oxidation of the silicon surface, leading to the formation of SiO2, has to be suppressed. By controlling the growth temperature and oxygen partial pressure, high-quality two-dimensional growth of oxides on silicon can be achieved [20.31].

Fig. 20.6. RHEED intensity profile of the specular beam at the start of the nucleation and growth of SrTiO3 on silicon. The arrow represents the point at which the Sr and Ti shutters are opened. The oscillations shown represent the growth of approximately 25 Å (axes: RHEED intensity in arbitrary units versus time from 0 to 20 min)

Fig. 20.7. RHEED images after the deposition of a 40 Å SrTiO3 layer on Si along the [100] and the [110] azimuths (panels labeled STO[100] and STO[110])

Strontium titanate growth is initiated by co-deposition of Sr and Ti metal in the presence of oxygen at a low substrate temperature, typically in the range of 200–400°C, with an oxygen partial pressure of 10⁻⁸–10⁻⁷ mbar. Oxygen is first introduced into the MBE chamber until the partial pressure is in the range of 10⁻⁸–10⁻⁷ mbar, at which time the shutters are opened. Figure 20.6 shows the RHEED intensity during the nucleation and growth of STO on silicon at approximately 300°C. Oscillations in the intensity of RHEED patterns are considered to be related to 2D layer-by-layer growth and can be used to study the dynamics of epitaxial growth [20.32]. The initial decrease in the intensity is probably due to a decrease in electron emission from the RHEED gun and/or electron scattering in the presence of oxygen. The initial low amplitude of the oscillations is probably the result of the transition from the Si to the oxide crystal structure. The arrow in the figure represents the opening of the Sr and Ti shutters.


Under these growth conditions, the oscillations observed in the RHEED features suggest that the oxide growth proceeds via a two-dimensional layer-by-layer growth mode, which can persist for thicknesses > 50 Å. The period of the oscillations was determined to be the time required for the deposition of one unit cell of SrTiO3, or 3.9 Å. The oxide surface is also monitored throughout the growth, and any deviation in stoichiometry, as determined by the appearance of a twofold reconstruction in either the [110] or the [100] azimuth, can be corrected by interrupting the respective flux. The RHEED pattern during deposition remained streaky and sharp, indicating that the growth front is well ordered and crystalline. Figure 20.7 shows the RHEED image of an STO surface after the growth of approximately 40 Å, indicating the epitaxial nature and the high degree of crystallinity of the film.
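Because one RHEED oscillation corresponds to one SrTiO3 unit cell (3.9 Å), the growth rate follows directly from the oscillation period. The sketch below illustrates that conversion; the function names are hypothetical, and the 2 Å/min rate and 25 Å thickness are the values quoted in the text and in the caption of Fig. 20.6.

```python
UNIT_CELL_STO = 3.9  # angstrom deposited per RHEED oscillation (from the text)

def growth_rate(period_min, layer_thickness=UNIT_CELL_STO):
    """Growth rate in angstrom/min from the RHEED oscillation period in minutes."""
    if period_min <= 0:
        raise ValueError("oscillation period must be positive")
    return layer_thickness / period_min

def oscillation_period(rate, layer_thickness=UNIT_CELL_STO):
    """Expected oscillation period in minutes for a growth rate in angstrom/min."""
    if rate <= 0:
        raise ValueError("growth rate must be positive")
    return layer_thickness / rate

period = oscillation_period(2.0)       # typical rate quoted in the text
n_oscillations = 25.0 / UNIT_CELL_STO  # thickness quoted in the caption of Fig. 20.6
print(f"period at 2 angstrom/min: {period:.2f} min")
print(f"oscillations in 25 angstrom: {n_oscillations:.1f}")
```

At the typical rate of about 2 Å/min, each oscillation should take just under two minutes, so roughly six oscillations cover the 25 Å of growth shown in Fig. 20.6.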

20.8 Stability of the Interface

It is well known that the stability of STO in contact with Si with respect to the formation of silicides and silicates is a problem; that is, the SrTiO3–Si interface is thermodynamically unstable [20.33]. For example, at 1000 K the reaction 3Si + SrTiO3 = SrSiO3 + TiSi2 reduces the free energy by 19.3 kcal/mol. Assuming that SrO is stable in contact with Si, Mori and Ishiwara proposed using a SrO epitaxial buffer layer to stabilize the perovskite film on Si [20.34]. McKee et al. [20.3] pointed out that a stable ASi2/AO (A = alkaline earth metal, i.e. Ba, Sr, etc.) interface is necessary for perovskite heteroepitaxy on Si. They reported an alkaline earth silicide forming between the SrO layer of the perovskite and the Si substrate. From the dielectric constant point of view, it is highly undesirable to have an interfacial layer (oxide or silicate) between a high-k dielectric and Si, because it adds a lower capacitance connected in series. Ab-initio calculations using the DFT code CASTEP [20.15] were used to investigate the stability of a binary oxide mixture (SrO + SiO2) with respect to the formation of two silicate phases. The code employs standard LDA-DFT, a plane wave basis, and ultra-soft pseudopotentials. Energy differences are converged to within 1 kcal/mol. First, the reaction Si + 2SrO = SiO2 + 2Sr was considered, to confirm that Sr oxide is indeed stable in contact with Si, by 27 kcal/mol, in agreement with the argument of Mori and Ishiwara [20.34]. The calculated energy of crystalline silica (α-quartz) was used to estimate the SiO2 contribution; SiO2 glass has an enthalpy only 1.5 kcal/mol higher than this ground-state crystalline form [20.35].
The stability of the SrO + SiO2 mixture with respect to the formation of two crystalline silicates, the orthorhombic orthosilicate Sr2SiO4 (chrysoberyl structure, space group Pnma) and the monoclinic α-metasilicate SrSiO3 (pseudowollastonite structure, space group C2) [20.36], was then considered. The structure of Sr orthosilicate consists of isolated SiO4 tetrahedra and Sr atoms arranged in such a way that each Sr occupies an octahedron of six oxygen atoms. In


α-metasilicate isolated rings of three SiO4 tetrahedra are present, while in β-metasilicate these tetrahedra form corner-sharing chains. A high-pressure perovskite-type silicate, ε-SrSiO3, was also considered, but was found to be too high in energy to be important in the following analysis. Both silicate formation reactions were found to be exothermic, with reaction energies of −35 kcal/mol and −59 kcal/mol for the metasilicate and orthosilicate, respectively. Both numbers are in good agreement with experiment [20.15]. This indicates that the presence of the native Si oxide during the formation of the SrO buffer layer may be critical.

20.9 Structural Properties

To determine the interfacial structures between the silicon and the overlying oxide layers and their dependence on deposition conditions, in-situ x-ray photoelectron spectroscopy (XPS) has been used [20.37]. In this case oxide films were deposited at three different substrate temperatures with a constant oxygen partial pressure profile and transferred in vacuo to an analysis chamber equipped with a number of in-situ analysis techniques. Figure 20.8 shows two typical XPS survey spectra: one taken for a SrTiO3 thin film about 40 Å thick, grown at 400°C on Si(100) (top), and one taken for a clean SrTiO3 bulk sample (bottom). As the thickness of the SrTiO3 film is comparable to the inelastic mean free path of the escaping photoelectrons emitted from the silicon substrate, a clear Si–2p peak can be identified at a binding energy of about 99.3 eV in the top spectrum. Although some differences between the two spectra can be identified, the chemical states of Ti, Sr, and oxygen in the grown film are similar to those in bulk SrTiO3, where the Ti has been fully oxidized to Ti⁴⁺. One advantage of co-deposition for oxide growth is the complete oxidation of the transition metal, which is less reactive than the alkaline earth metals. Figure 20.9 shows the XPS spectrum of an STO film deposited on silicon, in which it can be seen that all the Ti atoms have been fully oxidized to Ti⁴⁺. It is possible that Sr acts as a catalyst for the oxidation of Ti during co-deposition [20.12]. A direct comparison between the two spectra in Fig. 20.8 also reveals that the stoichiometry of the film is close to that of the bulk sample. A series of detailed scans for different samples in the Si–2p region is shown in Fig. 20.10, with all the spectra aligned on the Si–2p signals from the substrates.
As a reference, spectrum (a) was taken for a Si(100) sample with native oxide (SiO2); the Si–2p bulk peak appears at about 99.3 eV and the SiO2 peak is centered at about 103.3 eV, in agreement with the literature [20.38]. Spectrum (b), taken for a SrTiO3/Si sample grown at 500°C, shows a peak centered around 102.4 eV, indicating possible silicates at the interface ([20.38] and references therein). In spectrum (c), which was taken for a SrTiO3/Si sample grown at 400°C, the majority of the signal arises from the silicon substrate, with a small bump centered around 101.2 eV, an indication

Fig. 20.8. XPS survey spectra of a thin (about 40 Å) SrTiO3 film on Si(100) (top) and a clean bulk SrTiO3 sample (bottom). The Si–2p signal can be clearly seen for the SrTiO3/Si sample

Fig. 20.9. XPS spectrum of the Ti–2p peaks taken from a SrTiO3/Si film grown by MBE (axes: intensity in arbitrary units versus binding energy in eV; labeled peaks: Ti 2p3/2 4+ and Ti 2p1/2 4+)

of Si–O bonding [20.39] at the interface. Spectrum (d) is for a sample grown at about 300°C; besides its similarities to spectrum (c), the SiOx region is broader, indicating possible multiple configurations of Si–O bonding, which can be revealed by detailed curve-fitting studies. Figure 20.11 shows a high-resolution cross-sectional TEM image of a 40 Å thick STO film deposited on silicon at 300°C. It can be seen that the interface between the STO and Si is clean and free of any amorphous SiO2 layer. The transition from Si to STO is abrupt, and the uniform contrast and clear lattice fringes suggest that the oxide layer results from high-quality two-dimensional growth.
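The peak assignments in Fig. 20.10 rest on the chemical shift of each feature relative to the bulk Si–2p line. The sketch below tabulates those shifts from the binding energies quoted in the text. The linear mapping from shift to a nominal oxidation state (4+ for the SiO2 shift) is a rough assumption introduced here for illustration only and is not taken from the text; all names are hypothetical.

```python
SI_2P_BULK = 99.3  # eV, Si-2p substrate peak (from the text)

# Peak positions in eV quoted in the text for the spectra of Fig. 20.10
PEAKS = {
    "(a) native SiO2": 103.3,
    "(b) grown at 500 C": 102.4,
    "(c) grown at 400 C": 101.2,
}

def chemical_shift(peak_ev, bulk_ev=SI_2P_BULK):
    """Shift of an oxidized-Si feature relative to the bulk Si-2p peak, in eV."""
    return round(peak_ev - bulk_ev, 2)

# ASSUMPTION: shift scales linearly with oxidation state, anchored at Si4+ in SiO2
SHIFT_PER_STATE = chemical_shift(PEAKS["(a) native SiO2"]) / 4.0

def nominal_oxidation_state(peak_ev):
    """Rough nominal Si oxidation state under the linear-shift assumption."""
    return chemical_shift(peak_ev) / SHIFT_PER_STATE

for label, energy in PEAKS.items():
    print(f"{label}: shift {chemical_shift(energy):.1f} eV, "
          f"nominal state about {nominal_oxidation_state(energy):.1f}")
```

Under this crude scaling the 400°C feature sits near a nominal state of +2, while the 500°C feature lies between +3 and +4, in line with the silicate assignment for the higher growth temperature.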

Fig. 20.10. Comparison of XPS spectra in the Si–2p region of SrTiO3/Si samples grown at different substrate temperatures with a sample of a thin SiO2 layer on silicon

Fig. 20.11. High-resolution cross-sectional TEM image of a 40 Å thick STO/Si film showing a crystalline transition across the interface (regions labeled SrTiO3 and Si)

Fig. 20.12. XRD theta-2theta scan of a 50 Å thick SrTiO3 film grown on silicon showing a single-crystal (100)-oriented oxide layer (axes: X-ray intensity in arbitrary units, logarithmic scale, versus 2 theta in degrees; labeled peaks: STO(001), STO(002), Si(002))

This study, together with HRTEM, suggests that the interface that forms during the epitaxial growth of SrTiO3 on Si(100) contains Sr in the oxidation state +2, O in the oxidation state −2, and Si in an oxidation state no higher than +2, which is consistent with the presence of Si–O bonds at the interface. Under relatively low-temperature growth conditions (300–400°C), the interface mainly consists of Si–O bonding, while O–Si–O–Si–O bonding forms near silicon surface step edges. Under relatively high-temperature growth conditions (e.g. > 500°C), oxygen diffusion through the SrTiO3 film increases and the silicon atoms can be more easily oxidized, leading to the formation of Sr silicates, SiOx, or even SiO2. Furthermore, the initial oxidation of silicon may also have been promoted by the Sr atoms at the interface [20.40].

In addition to the in-situ analyses, the structural properties were also characterized ex situ using x-ray diffraction and high-resolution transmission electron microscopy (HRTEM). A theta-2theta scan of a 50 Å SrTiO3 layer grown on silicon is shown in Fig. 20.12. These data show that a (001)-oriented epitaxial SrTiO3 thin film has been grown on the Si(001) substrate. In this case the STO film shows a strong (002) diffraction peak and a weaker (001) peak in addition to the silicon substrate peaks, with no extra diffraction features from other crystalline orientations of the oxide film in the measured theta-2theta range. Because of the large lattice mismatch between STO (3.9 Å) and Si (5.43 Å), the oxide layer rotates by 45° with respect to the Si (001) surface during the deposition. This in-plane epitaxial relationship is confirmed by x-ray psi scan measurements, as shown in Fig. 20.13, in which the STO (220) family of planes is observed to be oriented at 45° with respect to the Si (220)

Fig. 20.13. X-ray psi scan of the STO and Si (220) planes showing the in-plane epitaxial relationship of a SrTiO3 film grown on silicon (100). The substrate peaks are denoted by *. This confirms the relationship SrTiO3[100]//Si[110] (axes: X-ray intensity in counts/s, logarithmic scale, versus psi from 0 to 350 degrees)

planes (denoted by *). These results confirm that the perovskite oxide grows with the orientation STO(001)//Si(001) and STO[100]//Si[110].

20.10 Band Discontinuity

We now discuss the band discontinuity at the Si-perovskite interface. A first-order estimate of the offset can be obtained within a commonly used [20.41] model initially proposed by Tejedor–Flores–Tersoff (TFT) [20.42, 20.43]. In this model the conduction band offset is given by

$$\phi_n = (\chi_a - \Phi_{S,a}) - (\chi_b - \Phi_{S,b}) + S\,(\Phi_{S,a} - \Phi_{S,b}) . \tag{20.6}$$

Here χ is the electron affinity, Φ is the charge neutrality level measured from the vacuum level, and S is an empirical pinning parameter describing the screening by the interfacial states; subscripts a and b refer to Si and STO, respectively. If S = 0 we get the strong pinning or Bardeen limit, and if S = 1 we have no pinning, the Schottky limit. The electron affinities of Si and STO are 4.0 and 3.9 eV, respectively. The estimated charge neutrality level is 4.9 eV for Si and between 5.8 and 6.4 eV for STO (both are given with respect to the vacuum level). Our number for STO is different from that reported by Robertson and Chen [20.41]. Their result is obtained if, when searching for the zero of the Green’s function [20.43], the integration range

Fig. 20.14. The electron density obtained by integrating over the states within a 1 eV window below the Fermi level. (a) The (2 × 1) structure with 1/2 ML of Sr at the interface. The states localized in the plane of the interface are clearly seen. The localized states pin the Fermi level, in agreement with Bardeen’s picture. (b) The (2 × 1) structure with 1 ML of Sr at the interface. No localized interface states are observed in the gap region; this results in the unpinned Schottky interface

is restricted to roughly six conduction bands, corresponding to a minimum-basis tight-binding calculation. The semi-converged estimate (the integration range exceeds 60 eV above the conduction band bottom; the expression used by Robertson has a logarithmic divergence) results in 6.1 eV. However, there is no reason to believe that Tersoff’s technique should work for transition metal oxides at all. To get a better feel for the branch point’s position, we computed the complex band structure of STO and found the branch point 0.73 eV above the valence band top. Given the uncertainty of the LDA band gap, this gives the charge neutrality level in the range from 5.9 to 6.7 eV with respect to the vacuum level. Thus, within this simple theory, we expect a 1.0–1.6 eV conduction band offset in the Bardeen limit, and a small 0.1 eV offset in the Schottky limit.

We now describe a more realistic approach to the band offset estimation. First we use a direct density of states analysis technique [20.44] to compute the valence band offset, and infer the conduction band offset using the experimental band gap values (1.17 eV and 3.2 eV for Si and STO, respectively). We calculate the total valence band density of states for a 4.5 nm thick 2 × 1 Si-STO slab in vacuum. Then site-projected densities are computed separately for the Si and STO atoms in the slab. The valence band discontinuity is then readily obtained. With this technique, we obtain conduction band offsets of 0.87 eV and 0.23 eV for the two different models shown in Fig. 20.14. Alternatively, one can use the reference potential method of Van de Walle and Martin [20.45]. We use the electrostatic potential across the slab as a reference, and place the valence bands with respect to it using two additional bulk

Fig. 20.15. High-resolution transmission electron microscopy image of epitaxial SrTiO3 on Si. An amorphous interface layer of about 7 Å is present between the two single-crystal layers (regions labeled SrTiO3, 7 Å interfacial layer, and Si)

calculations. This method gives conduction band offsets of 0.8 and 0.01 eV for the same models. The agreement between the two methods is fair, taking into account the fast oscillation of the reference potential. In the spirit of the simple TFT theory, one would conclude that the structure with the larger offset is closer to the Bardeen limit, with the S value ranging between 0.1 and 0.47 (an empirical estimate gives 0.28 [20.41]), while the other structure corresponds to the Schottky limit. This picture is indeed correct. In Fig. 20.14 we show the electron density obtained by integrating over the states within a 1 eV window below the Fermi level. In the case of structure (a), states localized on Si dimers are clearly seen, while no localized charge is observed at the interface for structure (b). The localized states of structure (a) fall into the STO gap. The origin of these states can be explained as follows. Note that the interface layer has the SrSi2 stoichiometry corresponding to a half monolayer of Sr deposited on the Si (001) 2 × 1 reconstructed surface at the template stage. The top of the valence band for such a template is precisely the dimer-localized surface state [20.14]. Another way to think of these states is offered by the Metal Induced Gap States (MIGS) model [20.46]: Si here plays the role of a metal, and electrons from its valence band occupy the evanescent states of STO up to the charge neutrality level. The Fermi level pinning we find in the case of structure (a) would count against using STO as a gate dielectric, even though large and almost symmetric offsets are predicted for both bands. Experimentally, Chambers and co-workers examined the band discontinuity at the Si–STO interface [20.47]. They reported a small conduction band offset for an n-type, and a negligible one for a p-type, Si substrate. This would agree with our results for the interface structure (b), which happens to be thermodynamically more stable.
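The two limiting cases of the TFT estimate, Eq. (20.6), can be reproduced numerically. The sketch below evaluates the expression with the electron affinities and charge neutrality levels quoted in the text (4.0 and 4.9 eV for Si, 3.9 eV and 5.8–6.4 eV for STO); the function name and argument names are hypothetical.

```python
def tft_conduction_band_offset(chi_a, phi_a, chi_b, phi_b, s):
    """Conduction band offset of Eq. (20.6), in eV.

    chi_*: electron affinities; phi_*: charge neutrality levels measured
    from the vacuum level; s: pinning parameter (0 = Bardeen, 1 = Schottky).
    Subscript a is Si, subscript b is STO.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("pinning parameter S must lie between 0 and 1")
    return (chi_a - phi_a) - (chi_b - phi_b) + s * (phi_a - phi_b)

CHI_SI, PHI_SI = 4.0, 4.9      # eV (from the text)
CHI_STO = 3.9                  # eV (from the text)
PHI_STO_RANGE = (5.8, 6.4)     # eV, estimated range for STO (from the text)

for phi_sto in PHI_STO_RANGE:
    bardeen = tft_conduction_band_offset(CHI_SI, PHI_SI, CHI_STO, phi_sto, 0.0)
    schottky = tft_conduction_band_offset(CHI_SI, PHI_SI, CHI_STO, phi_sto, 1.0)
    print(f"Phi_STO = {phi_sto:.1f} eV: Bardeen {bardeen:.1f} eV, "
          f"Schottky {schottky:.1f} eV")
```

With S = 0 the offset depends only on the two charge neutrality levels and the affinities, giving 1.0–1.6 eV across the quoted STO range; with S = 1 the charge neutrality levels cancel and the offset reduces to the affinity difference of 0.1 eV.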


McKee suggested fixing the negligible Si–STO band offset problem by inserting several layers of SrO [20.3]. Indeed, SrO has a large band gap with a large +2.7 eV offset to the Si conduction band [20.48] and a −2.2 eV conduction band offset to STO, thus resulting in an overall offset of about 0.5 eV from Si to STO [20.49]. In addition, the formation of a thin amorphous SiO2 layer at the oxide/Si interface appears to shift the conduction band of the STO with respect to Si, allowing for both p- and n-channel MOSFETs [20.50]. Moreover, as explained above, the use of a low-k/high-k dielectric stack can overcome some challenges for material integration, such as band offsets and short channel effects [20.10]. The amorphous layer, determined to be SiO2, can be formed either during the oxide deposition or after the deposition via a post-growth anneal in the presence of oxygen. In both cases, the amorphous layer is formed through a reaction of the silicon substrate with oxygen that has diffused through the crystalline oxide layer. Increasing the growth temperature and/or the oxygen partial pressure can increase the thickness of this layer. Figure 20.15 shows a TEM image of an oxide/amorphous layer/Si structure in which the amorphous layer has been formed during the growth of the oxide film.

20.11 Device Results

The oxide/amorphous layer/Si stack has been used to demonstrate device performance. Conventional cross-sectional high-resolution transmission electron microscopy was used to examine the crystalline quality of the SrTiO3 films as well as the SrTiO3/Si interface. An extremely thin amorphous interfacial layer is observed at the interface. As shown in Fig. 20.15, the amorphous interfacial layer is approximately 7 Å thick, with good single-crystal SrTiO3 above it. Tests on samples with different SrTiO3 thicknesses show that the interface layer has a lower dielectric constant than the crystalline SrTiO3: approximately 175 for the SrTiO3 and approximately 4 for the interface layer. This stack is the low dielectric/high dielectric constant combination that has been proposed to reduce short channel effects in MOSFETs with very high dielectric constant gate dielectrics [20.10]. MOSFETs using the STO/SiO2 gate dielectric were fabricated by first forming n and p wells by implantation. Another implantation and activation anneal were then used to form the source and drain. After the STO/SiO2 gate dielectric, having a physical thickness of 110 Å, was deposited using the process described above, the MOSFET was completed by defining a TaN/a–Si gate electrode using reactive ion etching, capping the structure with plasma-enhanced chemical vapor deposited SiO2, opening vias to the source and drain regions, and finally creating ohmic contacts to the source and drain regions with Al–2wt%Si. The devices were then annealed by RTA at 450°C in forming gas to alloy the ohmic contacts. This non-self-aligned process is used because of instabilities of the STO-Si interface at normal activation anneal temperatures.
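The effect of the low-k interfacial layer can be illustrated with a simple series-capacitor estimate of the equivalent oxide thickness (EOT). The sketch below assumes that the 110 Å stack consists of a 7 Å interfacial layer (k about 4) and 103 Å of SrTiO3 (k about 175); this split is an assumption made here, since the text gives only the total thickness, and the estimate ignores quantum mechanical corrections. The SiO2 reference dielectric constant of 3.9 and all function names are likewise introduced for illustration.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m
K_SIO2 = 3.9      # reference dielectric constant of SiO2 (assumed standard value)

def eot_angstrom(layers, k_ref=K_SIO2):
    """Equivalent oxide thickness of capacitors in series.

    layers: iterable of (thickness_in_angstrom, dielectric_constant).
    """
    return sum(k_ref * thickness / k for thickness, k in layers)

def capacitance_pf_per_um2(eot_a, k_ref=K_SIO2):
    """Areal capacitance for a given EOT; 1 F/m^2 equals 1 pF/um^2."""
    return EPS0 * k_ref / (eot_a * 1e-10)

# ASSUMED split of the 110 angstrom stack: interfacial layer + SrTiO3
stack = [(7.0, 4.0), (103.0, 175.0)]

eot = eot_angstrom(stack)
print(f"EOT of the stack: {eot:.1f} angstrom")
print(f"capacitance: {capacitance_pf_per_um2(eot):.3f} pF/um^2")
print(f"interfacial layer share of EOT: {100 * K_SIO2 * 7.0 / 4.0 / eot:.0f} %")
```

Under these assumptions the thin interfacial layer accounts for about three quarters of the total EOT of roughly 9 Å, even though it is only 7 Å of the 110 Å physical thickness, which is why a low-k interlayer dominates the series capacitance.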

Fig. 20.16. The capacitance-voltage relationship extracted from a 110 Å SrTiO3/SiO2 thin film dielectric stack on p-type silicon with a boron concentration of 2 × 10¹⁵ cm⁻³ using a three-element model (axes: gate capacitance in pF/µm², from 0 to 0.035, versus gate voltage in V, from −2.5 to 0.5; legend: Co, Hauser Fit)

Capacitance-voltage (C–V) measurements were made to characterize the dielectric properties of the STO/SiO2 stack. The capacitance, shown in Fig. 20.16, was extracted using a three-element model and multiple-frequency C–V measurements [20.5] and then corrected for quantum mechanical effects. The results show that the 110 Å SrTiO3/SiO2 stack exhibits a capacitance comparable to that of an 8 Å SiO2 gate insulator. In addition to determining the oxide capacitance, a method proposed by Terman [20.51] was used to estimate the interface trap density. By comparing the theoretical C–V curve for an 8 Å SiO2 gate insulator on silicon with our measured C–V curve at 100 kHz, an interface trap density of 6.4 × 10¹⁰ states/cm²·eV at midgap and an oxide trap charge of 1.2 × 10¹¹ states/cm² were calculated. Finally, the hysteresis of the SrTiO3 capacitor was measured using a dual C–V sweep from −5 to 5 V at 100 kHz. The dual C–V sweep (counterclockwise from accumulation to inversion and retrace) shows only a 10 mV shift in gate voltage. Both n-channel and p-channel MOSFET drain current–drain voltage (I–V) curves are shown in Fig. 20.17. These devices have effective channel lengths of 1.2 µm. The threshold voltages of these devices are shifted toward negative values because of the low work function, 4.2 eV, of the TaN gate and also because of fixed charges in the films. The peak transconductance at 1 V drain bias is 269 µS/µm for the n-channel device and 90 µS/µm for the p-channel device. Low-field transconductance measurements on 10 µm × 10 µm MOSFETs were

Fig. 20.17. Drain current-drain voltage curves for n- and p-channel SrTiO3 gate insulator MOSFETs with gates 2 µm long and 10 µm wide (panels: PMOS with Vg = −1.0, −1.5, and −2.0 V; NMOS with Vg = 0.0, 0.5, and 1.0 V; horizontal axis: drain voltage in V, from −2 to 2)

used to calculate inversion layer mobilities. From these measurements and the measured inversion capacitance, the electron peak mobility is calculated as 221 cm²/V·s and the hole mobility is 62 cm²/V·s. The n-channel devices have a subthreshold slope of 103 mV/decade and the p-channel devices have a slope of 95 mV/decade. The p-channel device gate leakage at −1 V is 3 mA/cm², and the n-channel device leakage current at 1 V is 0.14 A/cm². At a gate voltage 1 V into inversion the gate leakage current is 40 mA/cm² and 15 mA/cm² for the p-channel and n-channel devices, respectively. Both types of device have higher leakage for positive gate bias than for negative. This asymmetry of the curves suggests that the leakage current is injection limited. In inversion the p-channel device then has a leakage current three orders of magnitude lower than that of a 10 Å SiO2 film, and the n-channel device has a leakage current two orders of magnitude lower than that of a 10 Å SiO2 film [20.52].
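As a rough check on the subthreshold slopes quoted above, the measured values can be compared with the thermal limit (kT/q) ln 10. The following Python sketch assumes room temperature (300 K), which the text does not state; the function names are illustrative only and are not part of the original analysis.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def ideal_swing_mv_per_decade(temperature_k=300.0):
    """Thermal limit of the subthreshold slope, (kT/q) * ln(10), in mV/decade.

    The default temperature of 300 K is an assumption, not a value from the text.
    """
    return 1e3 * K_B * temperature_k / Q_E * math.log(10.0)


def swing_ratio(measured_mv_per_decade, temperature_k=300.0):
    """Ratio of a measured subthreshold slope to the thermal limit."""
    return measured_mv_per_decade / ideal_swing_mv_per_decade(temperature_k)


if __name__ == "__main__":
    # Slopes quoted in the text for the SrTiO3 gate insulator MOSFETs.
    for label, slope in (("n-channel", 103.0), ("p-channel", 95.0)):
        print(f"{label}: {slope:.0f} mV/decade, ratio to thermal limit "
              f"{swing_ratio(slope):.2f}")
```

Under this assumption the 103 mV/decade and 95 mV/decade slopes correspond to roughly 1.7 and 1.6 times the thermal limit, respectively.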

20.12 Conclusion

These results suggest that a crystalline dielectric has the potential to enable MOSFETs with high capacitance and low leakage. They also show some of the difficulties in meeting all of the necessary criteria for a commercially viable high-k gate dielectric. The STO-Si system presented above shows that heteroepitaxy of a crystalline high-k material on silicon can be achieved with a high degree of perfection at the interface. This material system provides a very high dielectric constant, in fact too large for properly scaled devices.


R. Droopad, K. Eisenbeiser and A.A. Demkov

The STO-Si system also has several issues which make it impractical for commercial CMOS applications. The conduction band offset of the STO-Si interface is insufficient for a functional MOSFET. This offset can be improved through the insertion of an intermediate layer such as a SiO2 layer; however, the band gap of STO (3.3 eV) is too small to allow much asymmetry in band offset and still meet the 1 eV requirement that is especially important for high temperature operation. Even though the STO shows a high degree of crystalline perfection, there are still defects in the bulk that lead to fixed charges in the film. The thermal stability of the films at high temperature is another issue. The deposition process used in this demonstration, MBE, also presents throughput challenges in a manufacturing environment.

These results suggest that the difficult issues of heteroepitaxy on Si are solvable; however, a high-k material other than STO must be found to satisfy all of the relevant conditions. A crystalline dielectric must have a crystal structure compatible with the underlying silicon, whereas an amorphous dielectric film does not have to meet this criterion. Finding a crystalline dielectric that can produce commercially viable MOSFETs is therefore significantly more difficult than finding an amorphous dielectric for this purpose. If an amorphous dielectric with an interface layer of sufficiently high dielectric constant can be found which meets the basic requirements outlined in the introduction to this chapter, crystalline films will probably not be needed or desirable; however, if a low-k interface layer is required, crystalline oxides offer advantages in scaling due to the higher dielectric constants that can be achieved. Similarly, if adequate interface properties cannot be achieved with amorphous materials, crystalline materials may offer an opportunity for a nearly ideal interface if their crystal structure is properly matched to silicon. The additional constraints imposed by this need to make the dielectric material's structure, growth and thermal properties compatible with silicon, however, mean that developing this crystalline material is a great challenge.

Acknowledgments. The authors would like to acknowledge the members of the Advanced Materials Research and Advanced Processing and Materials Characterization Laboratories, without whose tremendously hard work the data presented in this chapter would not have been possible.

References

20.1. P. Yu and M. Cardona, Fundamentals of Semiconductors (Springer-Verlag, New York 1995) p. 325
20.2. M. Born and K. Huang, Dynamical Theory of Crystal Lattices (Clarendon Press, Oxford 1988) p. 336
20.3. R.A. McKee, F.J. Walker, and M.F. Chisholm, Phys. Rev. Lett. 81, 3014 (1998)


20.4. R. Droopad, Z. Yu, J. Ramdani, L. Hilt, J. Curless, C. Overgaard, J. Edwards, J. Finder, K. Eisenbeiser, J. Wang, V. Kaushik, B.-Y. Nguyen, B. Ooms, J. Cryst. Growth 227-228, 936 (2001)
20.5. K. Eisenbeiser, J. Finder, Z. Yu, J. Ramdani, J. Curless, J. Hallmark, R. Droopad, W. Ooms, L. Salem, S. Bradshaw, C.D. Overgaard, Appl. Phys. Lett. 76, 1324 (2000)
20.6. M. Copel, M. Gribelyuk, E. Gusev, Appl. Phys. Lett. 76, 436 (2000)
20.7. G.D. Wilk and R.M. Wallace, Appl. Phys. Lett. 74, 2854 (1999)
20.8. H.S. Kim, D.C. Gilmer, S.A. Campbell, D.L. Polla, Appl. Phys. Lett. 69, 3860 (1996)
20.9. G.B. Alers, D.J. Werder, Y. Chabal, H.C. Lu, E.P. Gusev, E. Garfunkel, T. Gustafsson, R.S. Urdahl, Appl. Phys. Lett. 73, 1517 (1998)
20.10. B. Cheng, M. Cao, R. Rao, A. Inani, P. Vande Voorde, W. Greene, J. Stork, Z. Yu, P. Zeitzoff, J. Woo, IEEE Trans. Electron Devices 46, 1537 (1999)
20.11. R.A. McKee, F.J. Walker and M.F. Chisholm, Science 293, 468 (2001)
20.12. J. Lattieri, J. Heini, D. Schlom, J. Vac. Sci. Technol. A 20, 1332 (2002)
20.13. X. Zhang and A.A. Demkov, unpublished
20.14. X. Zhang, A.A. Demkov, H. Li, X. Hu, Y. Wei, and J. Kulik, Phys. Rev. B 68, 125323 (2003)
20.15. A.A. Demkov, phys. stat. sol. (b) 226, 57 (2001)
20.16. X. Zhang and A.A. Demkov, J. Vac. Sci. Technol. B 20, 1664 (2002)
20.17. R.E. Cohen, J. Phys. Chem. Solids 57, 1393 (1996); Ferroelectrics 194, 323 (1997)
20.18. J. Padilla and D. Vanderbilt, Surf. Sci. 418, 64 (1998)
20.19. C. Cheng, K. Kunc, and M.N. Lee, Phys. Rev. B 62, 10409 (2000)
20.20. J. Robertson, J. Vac. Sci. Technol. B 18, 1785 (2000)
20.21. G. Koster, G. Rijnders, D.D.A. Blank, H. Rogalla, Physica C 339, 215 (2000)
20.22. S. Zollner, A.A. Demkov, R. Liu, P. Fejes, R.B. Gregory, P. Allury, J.A. Curless, Z. Yu, J. Ramdani, R. Droopad, T.E. Tiwald, J.N. Hilfiker, and J.A. Woollam, J. Vac. Sci. Technol. B 18, 2242 (2000)
20.23. CRC Handbook of Chemistry and Physics, 67th edn (CRC Press, Inc., Boca Raton 1987)
20.24. J. Goniakowski and C. Noguera, Surface Science 319, 68 (1994)
20.25. M.A. Berding, S. Krisnamurthy, A. Sher, J. Appl. Phys. 67, 6175 (1990)
20.26. Yi Wei, 2002, Private Communication
20.27. X. Hu, Z. Yu, J. Curless, R. Droopad, K. Eisenbeiser, J. Edwards, W. Ooms, D. Sarid, Appl. Surf. Sci. 181, 103 (2001)
20.28. J. Wang, J.A. Hallmark, D.S. Marshall, W.J. Ooms, P. Ordejon, J. Junquera, D. Sanchez-Portal, E. Artacho, J.M. Soler, Phys. Rev. B 60, 4968 (1999)
20.29. Y. Wei, X. Hu, Y. Liang, D. Jordan, B. Craigo, R. Droopad, J. Yu, A. Demkov, J. Edwards, K. Moore, W. Ooms, J. Vac. Sci. Technol. B 20, 1402 (2002)
20.30. J.Y. Tsao, Materials Fundamentals of Molecular Beam Epitaxy (Academic Press, San Diego, CA, 1993)
20.31. H. Li, X. Hu, Y. Wei, J. Yu, X. Zhang, R. Droopad, A. Demkov, J. Edwards, K. Moore, W. Ooms, J. Kulik, P. Fejes, J. Appl. Phys., to be published
20.32. B.A. Joyce, J.H. Neave, J. Zhang, P.J. Dobson, Reflection High Energy Electron Imaging of Surfaces (Plenum Publishing Corp, 1988) p. 397


20.33. K.J. Hubbard, D.G. Schlom, J. Mater. Res. 11, 2757 (1996)
20.34. H. Mori and H. Ishiwara, Jpn. J. Appl. Phys. 30, L1415 (1991)
20.35. I. Petrovic, A. Navrotsky, M. Davis, and S. Zones, Chem. Mater. 5, 1805 (1993)
20.36. K. Machida et al., Acta Crystallogr. B 38, 387 (1982)
20.37. X. Hu, H. Li, Y. Liang, Y. Wei, Z. Yu, D. Marshall, J. Edwards, R. Droopad, X. Zhang, A. Demkov, K. Moore, Appl. Phys. Lett. 82, 203 (2003)
20.38. J.F. Moulder, W.F. Stickle, P.E. Sobol, K.D. Bonhen, Handbook of X-Ray Photoelectron Spectroscopy (Physical Electronics Inc., 1995)
20.39. B.V. Crist, Handbook of Monochromatic XPS Spectra, Vol. 1, The Elements and Native Oxides (XPS International Inc., 1999)
20.40. A. Mesarwi, W.C. Fan, A. Ignatiev, J. Appl. Phys. 68, 3609 (1990)
20.41. J. Robertson and C.W. Chen, J. Vac. Sci. Technol. B 18, 1785 (2000)
20.42. C. Tejedor, F. Flores, and E. Louis, J. Phys. C 10, 2163 (1977)
20.43. J. Tersoff, Phys. Rev. B 30, 4874 (1984)
20.44. A.A. Demkov, R. Liu, X. Zhang, and H. Loechelt, J. Vac. Sci. Technol. B 18, 2388 (2000)
20.45. C.G. Van de Walle and R.M. Martin, Phys. Rev. B 39, 1871 (1989)
20.46. V. Heine, Phys. Rev. 138, A1689 (1965)
20.47. S. Chambers, Y. Liang, Z. Yu, R. Droopad, J. Ramdani and K. Eisenbeiser, Appl. Phys. Lett. 77, 1662 (2000)
20.48. R.A. McKee, F.J. Walker, M. Buongiorno Nardelli, W.A. Shelton and G.M. Stocks, Science 300, 1726 (2003)
20.49. J. Junquera, arXiv:cond-mat/0210666 v1 (2002)
20.50. J. Wang, unpublished
20.51. L.M. Terman, Solid State Electron. 5, 285 (1962)
20.52. C.-H. Choi, J.-S. Goo, T.-Y. Oh, Z. Yu, R. Dutton, A. Bayoumi, M. Cao, P. Vande Voorde, D. Vook, C. Diaz, IEEE Electron Device Lett. 20, 292 (1999)

21 Advanced MOS-Devices

J. Bokor, T.-J. King, J. Hergenrother, J. Bude, D. Muller, T. Skotnicki, S. Monfray, and G. Timp

21.1 Introduction

Gordon Moore has astutely observed that the complexity of an integrated circuit (IC), measured by the number of transistors incorporated into it, doubles about every 18 months with unerring regularity [21.1]. Following Moore’s law, since 1965 there has been more than a 1,000,000-fold improvement in the complexity of an IC without a corresponding increase in the manufacturing cost. About half of this improvement is due to miniaturization of the wires and transistors that are included in the circuit. Right now, the semiconductor industry can manufacture logic that incorporates more than 40 million MOSFETs (metal-oxide-semiconductor field effect transistors) into a single circuit. Supposedly, within the next ten years at the same cost the semiconductor industry will manufacture logic chips with more than a billion nanometer-scale MOSFETs (nano-transistors). These nano-transistors, such as the one illustrated in Fig. 21.1, are expected to have a gate or control electrode shorter than 50 nm and a gate dielectric, which separates the control electrode from the current-carrying channel, with an effective thickness less than tox,eq = 0.6 nm of SiO2.

This extrapolation into the future is based on the ITRS Roadmap [21.2], which is the current blueprint for the semiconductor industry. Implicit in the extrapolation is the assumption that the economics that currently drives integration will continue unabated, and that the physics governing transistor operation and its incorporation into an IC will permit it. Up to now, higher levels of integration have been physically accessible through miniaturization. In particular, the accuracy of the ITRS forecast relies on projected gains in transistor performance and on the reduction in power dissipation expected from miniaturization. However, the ability to recover the projected gains by scaling the critical dimensions of the wires and transistors beyond the 50 nm gate length technology node is threatened by a confluence of limitations imposed by circuits, devices, materials and even the size of atoms.

Two of the most stringent limitations on further integration are imposed by an upper bound on the operating temperature and by the lifetime of the battery. Both of these limits translate into a bound on the maximum tolerable power dissipation. To mitigate the effect of power dissipation, MOS technology uses complementary pairs of n-MOSFETs and p-MOSFETs together as


Fig. 21.1. (a) An atomic force micrograph of a sub-50 nm MOSFET, i.e. a nanotransistor. The cross-section through the gate electrode of the nanotransistor, indicated schematically by the dashed lines on the left, is shown on the right. (b) A high-resolution transmission electron micrograph through a cross-section of a nominally 25 nm gate length MOSFET. (c) The red inset shows the nominally 1.4 nm thick gate oxide

switches. A complementary-MOS (CMOS) switch is configured to dissipate active power only during a switching transient; otherwise it is off. Since the switching transients occur only during a narrow window within the clock cycle, a CMOS switch is off most of the time. Accordingly, two figures-of-merit for CMOS are: (1) the active power dissipation, which is proportional to the product of the drive current (the drain current in saturation) and the power supply voltage; and (2) the stand-by power dissipation, which is proportional to the leakage current and the supply voltage. Thus, the viability of sub-50 nm gate length CMOS technology is contingent upon the drive current performance and leakage current specification. Improvements in the drive performance can be used to reduce the power supply voltage, thereby reducing power dissipation while improving reliability at the same time.

The drive current performance of a MOSFET is dictated by both the thickness of the SiO2 gate dielectric and by carrier scattering in the channel. With miniaturization, the drive current performance improves because the thickness of the gate oxide and the gate length are reduced. Reducing the channel length diminishes carrier scattering in the channel, and thinning the gate oxide increases the gate capacitance (and correspondingly increases the inversion layer carrier density for the same power supply) and so the drain current improves proportionally. However, the SiO2 gate dielectric is already the smallest feature in an IC, and reducing it beyond 0.7 nm seems untenable (especially for low power applications) because quantum mechanical tunneling through the oxide [21.3] results in a leakage current in excess of 300 A/cm². At this level, the gate leakage current has a detrimental effect on the stand-by power. And even with the discovery of a viable alternative to SiO2 as a gate dielectric that reduces the leakage current without compromising the drive, the increase in subthreshold leakage current between the


source and drain contacts in a MOSFET threatens to make stand-by power dissipation intolerable (i.e. > 10 W/cm²). The subthreshold leakage current of a nanometer-scale MOSFET comprises three components: band-to-band tunneling between the body and drain; direct quantum-mechanical tunneling between the source and drain; and thermionic emission over the source-to-channel potential barrier. Independently calculating and summing these components determines the off-state leakage for a transistor design. Band-to-band tunneling occurs when the electric field across a p-n junction is large enough to induce tunneling of electrons from the valence band of the p-region into the conduction band of the n-region. It can be reduced by lowering the power-supply voltage or by using graded source/drain (S/D) doping profiles, at the expense of drive current (due to increased parasitic resistance). Direct tunneling of carriers from the source to the drain is expected to occur at very short gate lengths (∼ 5 nm) when the channel potential barrier width is very small. Thermionic emission is primarily controlled by the source-to-channel potential barrier height, which in turn is affected by the short-channel effects. Ideally, the gate voltage controls the barrier. Under “off” conditions, when the gate voltage is below the threshold voltage, the particle flux injected from the source cannot surmount the barrier, so the current collected at the drain (subthreshold current) is minuscule. However, in a short channel MOSFET the drain voltage compromises the control exercised by the gate. The source and drain fields penetrate deeply into the channel, lowering the barrier and causing an increase in the subthreshold current. This drain-induced barrier lowering (DIBL) and the concomitant increase in subthreshold current are detrimental to the stand-by power.

Thinning the silicon body is one of the strategies being explored to reduce DIBL and eliminate leakage paths removed from the gate electrodes. Following this strategy, two classes of alternatives to the conventional, planar bulk-Si MOSFET structure have been proposed that use an extremely thin (nanometer-scale) silicon body either (1) on a buried oxide (Silicon-On-Insulator), or (2) sandwiched between two gate electrodes (double-gate or gate-all-around structures). Various fabrication technologies such as SIMOX, BESOI, Smart Cut, ELTRAN, ELO, etc. are being developed to produce silicon films as thin as 5–20 nm on a buried oxide, but none are commercially available yet. Research on nanometer-scale Silicon-On-Insulator (SOI) MOSFETs, using an ultra-thin body, has already demonstrated superior subthreshold performance [21.5], which in theory can be used to improve the drive current by lowering the threshold voltage. On the other hand, the double-gate or gate-all-around MOSFET structures are supposed to minimize short-channel effects and allow for even more aggressive device scaling, as compared with both bulk-Si MOSFET and fully-depleted SOI MOSFET structures [21.6]. Consequently, these structures have been the subject of intensive research. Figure 21.2 is an illustration of a gate-all-around structure in which the gate electrode encompasses a nanometer-scale silicon body.
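The sensitivity of the thermionic (subthreshold) leakage component to DIBL can be illustrated with a simple exponential subthreshold model. The Python sketch below is not taken from the chapter: the model form, the function name, and every numerical value are assumptions chosen only for illustration, and the two tunneling components of the off-state leakage are not included.

```python
def thermionic_off_current(i_at_threshold, v_t_long, dibl, v_ds, swing):
    """Subthreshold drain current at zero gate bias for a simple exponential model.

    i_at_threshold : drain current when the gate voltage equals the threshold (A/um)
    v_t_long       : threshold voltage without drain-induced barrier lowering (V)
    dibl           : DIBL coefficient, threshold shift per volt of drain bias (V/V)
    v_ds           : drain-to-source voltage (V)
    swing          : subthreshold swing (V/decade)
    All of these parameters are hypothetical inputs, not values from the text.
    """
    v_t_effective = v_t_long - dibl * v_ds  # the drain bias lowers the barrier
    return i_at_threshold * 10.0 ** (-v_t_effective / swing)


if __name__ == "__main__":
    # Illustrative numbers only: 100 mV/decade swing, 0.3 V threshold, 1 V supply.
    no_dibl = thermionic_off_current(1e-7, 0.3, 0.0, 1.0, 0.1)
    with_dibl = thermionic_off_current(1e-7, 0.3, 0.1, 1.0, 0.1)
    print(f"off-current without DIBL: {no_dibl:.2e} A/um")
    print(f"off-current with 100 mV/V DIBL: {with_dibl:.2e} A/um")
    print(f"ratio: {with_dibl / no_dibl:.1f}")
```

With these assumed values, a DIBL coefficient of 100 mV/V at 1 V drain bias shifts the effective threshold by one subthreshold-swing interval and therefore raises the off-current by about a factor of ten, which is the mechanism the paragraph above describes qualitatively.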


Fig. 21.2. A Gate-All-Around Double-Gate structure implemented in Silicon-On-Nothing

Recent theoretical studies suggest that double-gate transistors can meet the ITRS [21.2] performance specifications down to 10 nm gate length [21.7], although the scaling limit of the MOSFET, i.e. the minimum gate length Lmin, will depend ultimately on the particular circuit application since the IOFF specifications differ for low-power vs. high-performance [21.4]. To satisfy the performance specifications in a fully-depleted double-gate MOSFET, however, the body thickness, tbody, must be about 1.7 to 2.0 times smaller than the gate length to effectively control the subthreshold leakage. Moreover, the front and back gates must be perfectly aligned [21.8] for optimal drive current performance. Some representative results from a simulation-based study [21.9] are illustrated in Fig. 21.3 for an equivalent gate-oxide thickness, tox,eq = 1 nm. Figure 21.3 shows that for tbody = 5 nm and tox,eq = 1 nm, the ITRS ION and IOFF targets can be met for gate lengths down to ∼ 10 nm. It has been postulated [21.10] that the minimum acceptable body thickness will be ∼ 5 nm due to series resistance and VT-variation concerns. It should be noted that the scaling limit is strongly dependent not only on the body thickness, but also on the power-supply voltage (Vdd) and the S/D lateral doping-concentration gradient [21.6]. Lmin decreases with Vdd and is larger for a more abrupt S/D doping profile. So, a reduction in the gate length results in increased leakage current due to DIBL, but thinning the body to eliminate leakage paths far away from the gate electrode can effectively reduce the subthreshold leakage by maintaining the control the gate has over the surface potential in the channel.

While the double-gate MOSFET promises to be the most scalable transistor structure [21.6], a practical, manufacturable, double-gate CMOS technology hinges on the formation of an ultra-thin gate oxide, an ultra-thin silicon body, and perfectly aligned double gates all at once. And even with an alternative gate dielectric that mitigates the gate leakage current, implemented in a scalable MOS structure, the drive current performance that is so vital for scaling may still be severely constrained by carrier scattering in the channel and series resistance.
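The body-thickness rule quoted above (gate length about 1.7 to 2.0 times the body thickness) can be turned into a quick estimate of the shortest gate length supported by a given body. The Python sketch below only restates that ratio; the function names are hypothetical, and treating the rule as a sharp bound is a simplification.

```python
def gate_length_window_nm(t_body_nm, ratio_low=1.7, ratio_high=2.0):
    """Range of minimum gate lengths (nm) implied by a body thickness.

    Uses the rule of thumb from the text that the body must be about
    1.7 to 2.0 times thinner than the gate is long.
    """
    return (ratio_low * t_body_nm, ratio_high * t_body_nm)


def max_body_thickness_nm(gate_length_nm, ratio=2.0):
    """Largest body thickness (nm) compatible with a gate length for a given ratio."""
    return gate_length_nm / ratio


if __name__ == "__main__":
    # 5 nm is the minimum acceptable body thickness postulated in the text.
    low, high = gate_length_window_nm(5.0)
    print(f"5 nm body: minimum gate length between {low:.1f} and {high:.1f} nm")
    print(f"10 nm gate: body no thicker than {max_body_thickness_nm(10.0):.1f} nm")
```

For the 5 nm body thickness discussed above, this gives a minimum gate length of 8.5 to 10 nm, in line with the ∼ 10 nm limit read from Fig. 21.3.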


Fig. 21.3. Dependence of FinFET leakage current (Ioff ) and drive current (Ion ) on gate length LGate and body thickness Tbody = Wfin , from computer (MEDICI) simulations. Ioff decreases as Tbody is reduced, due to reduced drain-induced barrier lowering. For an Ioff specification of 160 nA/µm and 5 nm body thickness, Lmin ∼ 10 nm. For a fixed Ioff , Ion decreases slightly as Tbody decreases, due to increased parasitic series resistance

The effective mobility, µeff, is one measure of scattering in the channel. For a thick silicon body, the inversion layer mobility has been found to degrade universally according to a µeff ∼ E⊥^(−α) law [21.11] with α > 1 for transverse fields in the semiconductor E⊥ > 0.5 MV/cm, corresponding to present operating conditions. When the transverse field in the semiconductor exceeds 0.5 MV/cm, the scattering in the channel increases faster than the carrier density so that (at least in a long channel MOSFET where the drain current is proportional to the mobility) the performance gain diminishes. While Coulomb and phonon scattering still contribute, this dependence has been attributed ostensibly to interface roughness scattering. Yet, the correspondence between the roughness of the SiO2/Si interface and the mobility has never been established unequivocally. When the film thickness of the silicon body becomes comparable to the electron thermal wavelength (∼ 5–10 nm), the electronic structure of the inversion layer and the scattering rates change. It has been suggested that a double-gate structure may yield a slight improvement in the roughness-limited mobility [21.12].

Finally, the real crucible for testing the viability of a new technology lies in the economics of the IC design and manufacturing process. Despite the reduction in design rules by a factor of 50, and an increase in the die area by 150, the cost of a 1 cm² chip has not really changed much over the last twenty years. The price is about $4/cm², and the electronics industry wants it to remain the same, independent of the technology. To produce more transistors and wires at ever decreasing cost, the factories themselves


are becoming more complex and expensive. The industry extrapolations have the factory costs increasing along with the scaling or miniaturization of the IC technology; the current generation of ICs requires factories that cost over $1 billion. One of the main reasons for the escalating cost is associated with the stringent controls that must be maintained over the process used to produce ICs with high yield. Yield in IC manufacturing is plagued by process fluctuations and contamination from impurities and particles that occur at every processing step, which together give rise to statistical variations from device to device. So, there is a chance that a few devices will not meet a design specification because of statistical variations, and there is a small probability that enough device failures will occur to detrimentally affect circuit performance. To illustrate the effect of process control and the number of processing steps on the yield, we use a modified form of the Price yield law, denoting the average defect density in the ith processing step as d0i and the probability of a fatal defect for that step as qi. Following Glaser [21.13], we find that the yield, Y, after k processing steps is:

\[
Y = \prod_{i=1}^{k} \frac{1}{1 + q_i d_{0i}} \approx \frac{C}{\prod_{i=1}^{k} \left( 1 + q_i(s/s_0)\, d_{0i} \right)}
\]

where the parameter qi(s/s0) represents the probability of a defect for a design rule with a minimum spacing, s, and an empirical threshold value of s0, and C represents the yield lost through other defects. This idealized law clearly shows that increasing the number of processing steps has a detrimental effect on yield because each factor of 1/(1 + qi d0i) […]

[…] 1000 °C) and is completed before the gate dielectric deposition. This replacement-gate approach enables low-temperature processing of the gate stack. This has two important consequences: 1) it minimizes possible dopant penetration through the high-κ layer; and 2) it minimizes any potential reaction of the high-κ layer with the poly-Si gate electrodes. After the undoped oxide sacrificial gate layer is removed selectively, a thin sacrificial thermal oxide is grown on the exposed portion of the c-Si body and then removed in dilute HF. An ultrathin (0.3–0.7 nm) thermal oxy-nitride underlayer is then grown. This is followed by high-κ deposition using the highly uniform and precisely controllable ALD process [21.38]. The pulse and purge cycles of the ALD process are tuned for maximum conformality. After the deposition, the high-κ films are annealed at low temperature. An in-situ doped α-Si gate is deposited and activated with a short 800 °C rapid thermal anneal. This anneal is the highest temperature


seen by the high-κ films in this replacement-gate process. The gates are patterned, contact windows are opened, and back-end metallization is completed.

The TEM images of Fig. 21.14 demonstrate that the ALD process provides outstanding conformality and is compatible with the demanding VRG geometry. These images show a completed 50 nm VRG nMOSFET that incorporates a HfO2 gate dielectric with a tox,eq of 1.5 nm (4.5 nm physical HfO2 on a 0.7 nm oxy-nitride underlayer). In Fig. 21.14a, the HfO2 film is found to be conformal even in the gate “cavity” created by the removal of the sacrificial gate layer. Careful TEM measurements of the HfO2 in the active area of the device as well as far away from the gate produce essentially identical physical thicknesses, indicating excellent film uniformity. Figure 21.14b is a magnified view of the active region in which the black layer is the conformal 4.5 nm (physical) HfO2 film. This image clearly shows that the HfO2 nucleates and deposits uniformly on both the oxy-nitride underlayer and the nitride offset spacers, which separate the gate from the phospho-silicate glass (PSG) dopant sources. The atomic-resolution image of Fig. 21.14c shows the rounded top corner of the gate, the intentional 0.7 nm oxy-nitride underlayer, and the lattice planes of the Si in the channel region. The lattice planes of a HfO2 grain which turns this corner are also clearly visible. The ALD HfO2 film does not thin at the gate edges or on the nitride offset spacers that sandwich the gate. Convergent beam diffraction confirms that the HfO2 film is poly-crystalline, with a grain size that is larger than the film thickness, in the active area as well as far away from the gate.

Note that the tox,eq values are determined using physical thicknesses (measured by ellipsometry on planar films and calibrated by high-resolution TEM in the VRG geometry) with an assumed permittivity of κ = 22 for HfO2. These tEOT values include the intentional 0.7 nm oxy-nitride underlayer. As a result of this definition, tox,eq is thought of as an equivalent physical oxide thickness. The quoted tox,eq values are consistent with the electrical oxide thickness inferred from capacitance versus voltage (C–V) measurements, taking into account that in inversion, quantum mechanical and poly-Si depletion effects typically add 0.8–1.0 nm to the equivalent physical thickness.

The overall trend of gate leakage current density JG vs. tox,eq for an oxynitride/HfO2 gate dielectric stack is illustrated in Fig. 21.15a, where the gate leakage was measured in inversion at VG − VT = 0.6 V on very wide, multi-finger transistors with total coded width WC = 3312 µm. Extracting JG at 0.6 V above threshold in inversion is appropriate since this characterizes what an actual nMOSFET sees in approximately 1 V operation. Referencing the gate voltage to threshold voltage compensates the effect on JG of the electric field shift due to fixed and trapped charge. As shown in Fig. 21.15a, the observed gate leakage current densities are extremely small down to the thinnest dielectric stacks with tox,eq = 1.3 nm. The 1.5 nm tox,eq HfO2 devices show JG ∼ 10⁻⁷ A/cm² at VG − VT = 0.6 V. This is more than five orders of magnitude smaller than the gate leakage measured under the same conditions for devices with a 1.6 nm SiO2 gate dielectric.
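The equivalent oxide thickness quoted above for the oxynitride/HfO2 stack can be reproduced with the usual series-capacitor relation, in which each layer contributes its physical thickness scaled by κ(SiO2)/κ(layer). The Python sketch below is illustrative only. It assumes κ = 3.9 for SiO2 and, as a simplification that the text does not state explicitly, counts the 0.7 nm oxy-nitride underlayer at its physical thickness (i.e. with the SiO2 permittivity); the function names are hypothetical.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_SIO2 = 3.9             # relative permittivity of SiO2 (assumed standard value)


def equivalent_oxide_thickness_nm(layers):
    """Equivalent SiO2 thickness (nm) of a stack of dielectric layers in series.

    layers: iterable of (physical thickness in nm, relative permittivity) pairs.
    """
    return sum(thickness * K_SIO2 / kappa for thickness, kappa in layers)


def capacitance_density_uf_per_cm2(eot_nm):
    """Parallel-plate capacitance per unit area (uF/cm^2) for a given EOT."""
    farads_per_m2 = EPS0 * K_SIO2 / (eot_nm * 1e-9)
    return farads_per_m2 * 100.0  # 1 F/m^2 equals 100 uF/cm^2


if __name__ == "__main__":
    # Stack from the text: 4.5 nm HfO2 (kappa = 22) on a 0.7 nm underlayer.
    stack = [(0.7, K_SIO2), (4.5, 22.0)]
    eot = equivalent_oxide_thickness_nm(stack)
    print(f"EOT = {eot:.2f} nm")
    print(f"C/A = {capacitance_density_uf_per_cm2(eot):.2f} uF/cm^2")
```

Under these assumptions the stack evaluates to about 1.5 nm, matching the tox,eq quoted for the device in Fig. 21.14. The capacitance helper ignores the quantum mechanical and poly-Si depletion contributions mentioned above, so it overestimates the capacitance measured in inversion.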

[Figure 21.15: (a) gate current density (A/cm²) at VG − VT = 0.6 V versus target EOT (Å) for SiO2 and HfO2; (b) subthreshold characteristics]

Fig. 21.15. (a) Gate leakage (GL) for HfO2 (on a 0.7 nm oxynitride underlayer) and SiO2 measured in inversion at 0.6 V above threshold. The GL for the oxynitride/HfO2 stack is more than five orders of magnitude lower than that of SiO2. The HfO2 trend suggests that in the VRG structure, this oxynitride/HfO2 stack can scale to an Equivalent Oxide Thickness (EOT) ≤ 1 nm while maintaining a gate current density below 1 A/cm². (b) Subthreshold characteristics for a 50 nm PD–VRG nMOSFET with a 1.5 nm tEOT oxynitride/HfO2 gate dielectric stack. Devices from the same wafer are shown in the TEM images of Fig. 21.14

Extrapolation of the measured trend (which spans many orders of magnitude in JG) suggests that this oxynitride/HfO2 stack should be scalable to tEOT ≤ 1.0 nm, using a conservative 1 A/cm² upper limit for the current density. The magnitudes of these HfO2 gate leakage currents are especially encouraging considering that they have been measured in a VRG structure, where the gate dielectric-channel interface has not been optimized. Moreover, the extremely low gate leakage of the poly-crystalline HfO2 indicates that amorphous gate dielectrics may not be necessary to meet gate leakage requirements, at least when there is an ultrathin (≈ 0.7 nm) amorphous underlayer in the gate dielectric stack. Further work is required to evaluate the suitability of this amorphous oxynitride/poly-crystalline HfO2 gate dielectric stack in terms of fixed charge, reliability, yield, and possible dopant penetration.

Figure 21.15b shows the subthreshold characteristics for a 50 nm VRG nMOSFET with a 1.5 nm tox,eq oxynitride/HfO2 gate dielectric stack. This figure shows a subthreshold slope of S = 97 mV/dec. This particular device has a drive current of ID/WC = 490 µA/µm (245 µA/µm intrinsic) for 1 V operation (VDS = 1 V with gate overdrive VGS − VT = 0.6 V). The off-current of this device at 0.4 V below threshold is a reasonable 80 nA/µm. Although the performance shown in Fig. 21.15b is well below the 1 V ITRS target, it is encouraging, given that this is the first time that VRG MOSFETs have been


built with high-κ gate dielectrics. Significant improvements to this performance are expected from further development of the oxy-nitride/HfO2 gate dielectric stack and through the optimization of various aspects of the VRG process, for example: 1) straightforward minimization of extrinsic series resistance in the top and bottom source/drain and contacts, 2) improvements in the SDE depths and profiles and possibly dopant type (e.g. arsenic), 3) reduction of the SDE lengths, 4) optimization of the nitride offset spacer thicknesses and therefore the gate overlaps, and 5) improvements in the pregate-dielectric processing.

Finally, the VRG process also provides a new route to the fabrication of fully-depleted double-gate and surrounding-gate MOSFETs with self-aligned gates and well-controlled parasitics. As in other novel MOSFET structures like the FinFET, one could provide a very thin body through aggressive lithography and etch as well. Following this strategy, the filling of a very narrow, high-aspect ratio trench would be straightforward with selective epitaxial growth. In fully-depleted double-gate devices, the body thickness must be about half the gate length, so tbody is clearly a critical device dimension. The precise gate length control afforded by the VRG process is of little value, though, in a fully-depleted double-gate device if the body thickness is determined directly through lithography.

21.4 The Double-Gate FinFET

In 1998, researchers at UC-Berkeley introduced the FinFET, a vertical double-gate SOI MOSFET structure adapted from the DELTA MOSFET structure developed by Hitachi Ltd. [21.39] with an ultra-thin body defined by lithography. This structure is especially well-suited to the formation of perfectly aligned gates that are so critical to optimal drain current performance in a double gate structure. N-channel FinFETs with excellent short-channel behavior down to sub-30 nm gate length have been demonstrated using a “gate-last” fabrication process [21.40]. Sub-50 nm p-channel FinFETs with excellent drive current (ION) and leakage current (IOFF) were subsequently demonstrated using a similar process in 1999 [21.41].

The FinFET uses a single gate material (e.g. poly-Si) deposited conformally over an SOI fin and patterned to form perfectly aligned, connected gates straddling the fin (see the schematic illustration in Fig. 21.16). Due to the use of a relatively thick (> 10 nm) SiO2 hard mask on top of the SOI film, the top of the fin does not form a conducting channel. Current flows laterally (parallel to the surface of the wafer) between the source and drain regions, along the vertical sidewalls of the fin. The fin width (Wfin) corresponds to the FinFET body thickness (tbody), while the fin height (thickness of the SOI film) corresponds to one-half the effective channel width. Multiple fins can be used to achieve higher values of ION [21.39] so that, in the general case, the effective channel width is 2 × (fin height) × (number of fins). An

21 Advanced MOS-Devices

689

Fig. 21.16. Schematic diagram of the quasi-planar FinFET structure. The siliconon-insulator (SOI) film is patterned to form a narrow fin, connected at each end to wider source/drain contact regions. Afterwards, the gate dielectric is formed and the gate material is conformally deposited. The gate material is then patterned using lithography and a dry-etch step which must clear away the gate material in the exposed regions, all the way to the bottom oxide (BOX). (The SiO2 layer on top of the SOI film serves as a hard mask to protect the top surface of the Si from being removed during the long gate etch.) The patterned gate electrode straddles the Si fin, forming perfectly aligned, connected front and back gates. Ion implantation is subsequently employed to form source/drain regions which are self-aligned to the gates

Fig. 21.17. Top-view scanning electron micrograph of a FinFET. The inset shows a tilted view of the same device. Although the gate length as seen in the top view is 20 nm, the actual gate length along the fin sidewalls is 15 nm due to undercutting which occurs during the gate over-etch step, resulting in a T-shaped gate profile

improved, “gate-first” FinFET fabrication process similar to a conventional CMOS process was recently developed in order to provide better circuit performance and improved manufacturability [21.42]. The resultant quasi-planar FinFET structure is similar in layout to a conventional planar MOSFET, and has been demonstrated to exhibit excellent short-channel behavior down to 15 nm gate length (Figs. 21.17 and 21.18) [21.43].

J. Bokor et al.

[Fig. 21.18 plots. Panel (a): drain current Id [mA/mm] vs. drain voltage Vd [V] (−1.5 V to 1.5 V) for NMOS and PMOS; annotations: |Vg − Vt| = 1.2 V, voltage step 0.2 V. Panel (b): drain current Id [A/µm] (logarithmic scale, 10^-12 to 10^-3) vs. gate voltage Vg [V]; NMOS curves at Vd = 0.05 V and 1.0 V, PMOS curves at Vd = −0.05 V and −1.0 V; annotations: P+ Si0.6Ge0.4 gate, N-body (2×10^18 cm^-3).]

Fig. 21.18. Current vs. voltage characteristics of CMOS FinFETs with Lg = 15 nm, Wfin = 10 nm, and Tox = 2.1 nm. Current values are normalized by 2×(fin height). (a) Output Id –Vd characteristics, (b) Subthreshold characteristics

Details of the fabrication process for the quasi-planar FinFET structure are provided in [21.42,21.43]. Electron-beam lithography was used to pattern fins as narrow as 10 nm. A novel “spacer” lithography technique for forming even narrower fins with improved control and uniformity was also introduced. A typical FinFET design uses a fin height in the range from 50 nm to 100 nm. An example of a FinFET with 50 nm fin height (W = 100 nm), 10 nm fin width and 15 nm gate length Lg is shown in Fig. 21.17. Although the gate length as seen from the top is 20 nm, the effective gate length along the fin sidewalls is actually 15 nm, due to undercutting which occurred during the gate over-etch step, resulting in a T-shaped gate profile.

Because the sub-surface leakage current is effectively eliminated with the use of a very thin body, the gate-dielectric thickness does not need to be scaled down as aggressively for thin-body MOSFETs as for the classical bulk-Si MOSFET in order to control short-channel effects. The 15 nm quasi-planar FinFETs reported in [21.43] employed a relatively thick SiO2 gate dielectric (tox = 2.1 nm). For symmetric (identical work function) front and back gates, the average electric field in the inversion layer of a double-gate MOSFET is lower than that in a bulk-Si MOSFET. This not only provides for higher effective carrier mobilities but also allows for more aggressive scaling of the gate-dielectric thickness, for a given gate-leakage current specification [21.44]. Hence, the double-gate MOSFET potentially offers significantly enhanced intrinsic drive current as compared to the bulk-Si MOSFET.

The drive currents achieved with gate drive |Vg − Vt| = 1 V and drain bias Vd = 1 V were 365 µA/µm and 270 µA/µm for the 15 nm n-channel and p-channel FinFETs, respectively (Fig. 21.18a). (Note: Drive currents are normalized using twice the fin height, which is a conservative definition of channel width for the double-gate structure.) These relatively low drive currents are due in part to parasitic series resistance associated with the thin-body source/drain regions, and can be improved with the use of a raised S/D structure [21.45] or Schottky-S/D technology [21.46], and also with a thinner gate oxide. The off-state n-channel and p-channel leakage currents intersect at 70 nA/µm for Vd = 1 V (Fig. 21.18b), indicating that ITRS high-performance transistor leakage current specifications can be met with a single near-midgap work-function gate material.

Fig. 21.19. Scanning electron micrograph of a FinFET with multiple fins for increased effective channel width (higher ION). Dependence of FinFET leakage current (IOFF) and drive current (ION) on gate length LGate and body thickness tbody = Wfin, from computer (MEDICI) simulations. IOFF decreases as tbody is reduced, due to reduced drain-induced barrier lowering. For an IOFF specification of 160 nA/µm and 5 nm body thickness, Lmin ∼ 10 nm. For a fixed IOFF, ION decreases slightly as tbody decreases, due to increased parasitic series resistance.

To avoid threshold-voltage (VT) variations due to statistical dopant fluctuations, the channel doping concentration in thin-body MOSFETs should be low (less than $10^{18}$ cm$^{-3}$) so as to exert no influence on VT. The desired value of VT is then achieved by setting the gate work function to the appropriate value. For low (∼ 0.2 V) and symmetric values of |VT|, the required gate work functions are ∼ 4.5 eV for the n-channel FinFET and ∼ 4.9 eV for the p-channel FinFET [21.7]. Ideally, the gate work function should be adjustable, to allow for tailoring of the trade-off between high current drive and low leakage current, i.e. a multiple-VT technology is desirable. The work function of molybdenum (Mo) can be selectively adjusted between 4.5 eV and 4.9 eV by masked implantation of nitrogen [21.47], so that it is an attractive metal gate material for thin-body SOI transistor structures including the FinFET.

As mentioned previously, the effective width of a FinFET is increased by using multiple fins [21.40,21.41]. A multiple-fin device is shown in Fig. 21.19. The layout-area efficiency of a multiple-fin FET can exceed that of traditional planar FETs if the fin height is greater than half the fin pitch. For example, given a fin height of 50 nm, the FinFET area efficiency is advantageous for fin pitches below 100 nm. Fin aspect ratio (height/width) limitations impose an upper limit on the fin height.

In summary, the double-gate MOSFET is a promising transistor structure for scaling CMOS technology into the 10 nm regime provided the gates are aligned to optimize the drain current performance. The quasi-planar FinFET structure is well suited to the formation of self-aligned gates, using a conventional fabrication process flow and layout. Experimental results confirm the scalability of the FinFET, and have prompted the industry to investigate this structure for future CMOS technologies [21.48].
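The width bookkeeping used in this section (effective channel width, current normalization, and the fin-pitch criterion) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the top of the fin is taken as non-conducting (as with the hard mask described above), and layout efficiency is approximated simply as effective channel width per unit of fin pitch, with a planar FET counted as 1.0.

```python
def effective_width_nm(fin_height_nm: float, n_fins: int = 1) -> float:
    """Effective channel width: two conducting sidewalls per fin (top not counted)."""
    if fin_height_nm <= 0 or n_fins < 1:
        raise ValueError("fin height must be positive and n_fins at least 1")
    return 2.0 * fin_height_nm * n_fins


def normalized_current_uA_per_um(current_uA: float, fin_height_nm: float,
                                 n_fins: int = 1) -> float:
    """Drain current divided by 2 x (fin height) x (number of fins), in uA/um."""
    width_um = effective_width_nm(fin_height_nm, n_fins) / 1000.0
    return current_uA / width_um


def width_per_pitch(fin_height_nm: float, fin_pitch_nm: float) -> float:
    """Effective channel width per unit layout width (assumed figure of merit).

    Values above 1.0 mean the multi-fin layout packs more channel width
    into the same footprint than a planar FET would.
    """
    if fin_pitch_nm <= 0:
        raise ValueError("fin pitch must be positive")
    return 2.0 * fin_height_nm / fin_pitch_nm


if __name__ == "__main__":
    # Single 50 nm tall fin, as in the Fig. 21.17 example.
    print("W_eff [nm]:", effective_width_nm(50.0))
    # Break-even pitch for a 50 nm fin height is 100 nm.
    for pitch in (80.0, 100.0, 120.0):
        print(f"pitch {pitch:.0f} nm -> width per pitch {width_per_pitch(50.0, pitch):.2f}")
```

With a 50 nm fin height the ratio crosses 1.0 at a 100 nm pitch, which matches the example given in the text.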

21.5 Silicon-On-Nothing MOSFETs

As illustrated above, nanometer-scale crystalline silicon films on an insulator are widely recognized [21.49–21.64] for their potential as the foundation for end-of-roadmap CMOS transistors. However, films as thin as 5–20 nm are not yet commercially available. Since the thickness and roughness of the film must be stringently controlled to produce nanometer-scale MOSFETs with high yield, further refinement of the processes developed for conventional SOI, which typically has tbody > 100 nm, will be required. The Silicon-On-Nothing (SON) process [21.50, 21.51] represents an alternative strategy for the production of an ultra-thin body where the silicon film and buried insulator are defined with exquisite precision using epitaxy on a bulk silicon substrate. The SON process offers extremely thin films while affording the precise thickness control of the epitaxial growth process (less than 1 nm) at the same time. In addition, only a bulk wafer is needed to fabricate the SON layers, eliminating the need for an SOI starting substrate.

The SON process begins after conventional shallow trench isolation on bulk wafers. The SON-dedicated active areas are exposed to SiGe (around 30%) epitaxy followed by a Si-body epitaxy as illustrated in Fig. 21.20a. The SiGe layer is used to transfer the crystalline lattice structure of the bulk substrate into the thin silicon body. After the epitaxy, the process merges with a conventional CMOS flow up to the formation of the nitride spacer. At that point, a second TEOS spacer is formed on SON transistors. Using the gate and the double Si3N4 and TEOS spacers as a mask, the S/D regions of the SON transistors are etched, followed by the selective lateral SiGe etch as illustrated in Fig. 21.20b. (Only the active areas attached to the SON transistors are etched; all the other transistors are masked when etching the second TEOS spacer.) The selective elimination of the SiGe layer from underneath the Si-body produces a thin Si-body layer suspended over the substrate. The gap or tunnel between the Si-body and the substrate may be backfilled with a dielectric or remain empty, depending on application. Shrinking dimensions facilitates both the selective SiGe removal and the mechanical stability of the SON structures. Shorter gate length transistors and small active areas lead to an easier etch and shorter bridge.

The gap or tunnel between the silicon body layer and the substrate is subsequently backfilled with a thin rapid thermal oxidation/high temperature oxide (RTO/HTO) liner and a Si3N4 CVD layer. The RTO liner passivates the inner walls of the gap, whereas the HTO is dedicated to serve as an etch stop against nitride etching. Next, the Si3N4 is etched out in an isotropic plasma process everywhere except inside the gap. The remaining HTO, which serves as Si3N4 etch stop, is then removed from S/D open areas with HF.

Fig. 21.20. The key steps in the SON process flow for a single-gate MOSFET


Fig. 21.21. TEM micrographs illustrating the perfection of the morphology of a SON MOSFET. (a) A TEM cross-section through an 80 nm gate MOSFET; (b) A magnified view of the cross-section demonstrating the crystalline perfection of the Si-body and the BOX-body interface

This same HF clean is also intended to remove the TEOS spacers on the SON-dedicated active areas. The removal of the second TEOS spacer is key to the subsequent S/D selective epitaxy since it exposes the channel extremities, as illustrated in Fig. 21.20d. The epitaxial growth process begins at the channel extremities and from the bottom of the S/D trenches, ensuring reliable reunification between the channel and the S/D areas. This SON process guarantees high-performance transistors. Figure 21.21 shows a TEM micrograph illustrating the perfect morphology of the SON devices and the fidelity of the channel-to-S/D reunification. The crystalline perfection of the channel layer is worth emphasizing. The crystallinity of the silicon cap and the smooth bottom interface of the channel can be appreciated in the high-resolution cross-section taken from Fig. 21.21a. The smoothness of the bottom interface reflects the very high selectivity of the SiGe etching process and the abruptness of the SiGe/Si interface.

Figure 21.22 contrasts schematics of the architecture of an SON device with a typical bulk device and SOI device. Notice that an SON device possesses the advantages of bulk and SOI devices without their respective weaknesses. For example, unlike bulk technology, both SOI and SON offer control over the thicknesses of the Si film and the buried insulator, which are important for suppressing short-channel effects (SCE). Reducing the thickness of the BOX helps to suppress the SCE, but the effect is smaller than that of thinning the Si film. The role of the BOX thickness becomes clear when comparing potential distributions in SOI devices with thick and thin BOX layers. Compare the plots for BOX = 10 nm and BOX = 70 nm in Fig. 21.23 for the same tbody = 10 nm. The potential contours in the case of a very thin BOX (10 nm thick) resemble those of a long-channel MOSFET, whereas in the case of a thick BOX (70 nm) a “saddle-shape” develops indicative of the field penetration associated with short-channel operation. This is due to two effects: 1. only a limited number of drain-field lines are guided through the thin BOX without screening (the majority of the lines are screened by dopants in the substrate below the BOX); and 2. if the bulk doping is sufficiently high, a “ground-plane” effect comes into play, reinforcing the electrostatic integrity of the device.

And, unlike SOI, the BOX layer in SON devices can be thinner and localized beneath the channel, permitting very shallow extensions and “ground-plane” operation along with deep heavily doped junctions just like in bulk devices. This latter advantage keeps the series resistance of an SON MOSFET low and is also essential for conventional silicidation, which is otherwise a major problem in thin-film devices.

Fig. 21.22. Schematic of three device architectures: (a) bulk; (b) SOI; and (c) SON

Fig. 21.23. Iso-potential contours in two SOI devices. In (a) the BOX thickness is 70 nm, while in (b) the BOX is only 10 nm thick; otherwise the devices are identical

The electrical results obtained with the SON process have already been discussed elsewhere [21.65]. Here, we will illustrate a few key features of these first 80 nm SON nMOSFETs with a 3 nm physical gate oxide. The improvement in ION–IOFF found in SON devices, illustrated in Fig. 21.24, is attributed to the intrinsic high performance of this architecture. As illustrated by Fig. 21.24a, it is remarkable that SON shows a factor of 3 reduction in DIBL. The difference observed in Fig. 21.24 between VT in the bulk nMOS and the SON transistors was unintentional. Although the same channel implants were done to both devices, the etching of the SON gap removes part of the channel doping. The corresponding change in threshold can be estimated from the removed charge per unit area, $\Delta Q_b = q\,T_{tunnel}\,N_{channel} = 4.8 \times 10^{-7}$ C/cm$^2$, which correlates well with the observed difference, $\Delta Q_b / C_{ox} = 400$ mV, shown in Fig. 21.24a. Despite the reduced doping, DIBL is suppressed. We attribute the improvement in DIBL to the superior electrostatics inherent in the SON architecture.

Fig. 21.24. Comparison of an 80 nm gate length SON and bulk nMOSFET with a 3 nm gate oxide. (a) Subthreshold characteristics; (b) Output characteristics
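The threshold-shift estimate above can be checked numerically. The sketch below is illustrative only: it assumes the 3 nm gate oxide is pure SiO2 with a relative permittivity of 3.9, treats the quoted 4.8 × 10^-7 as a charge per unit area in C/cm², and uses hypothetical function names. The individual values of Ttunnel and Nchannel are not given in the text, so only their product is recovered.

```python
EPS0_F_PER_CM = 8.854e-14   # vacuum permittivity in F/cm
K_SIO2 = 3.9                # relative permittivity of SiO2 (assumed for the gate oxide)
Q_ELECTRON_C = 1.602e-19    # elementary charge in C


def oxide_capacitance_F_per_cm2(tox_nm: float) -> float:
    """Parallel-plate gate capacitance per unit area, Cox = k * eps0 / tox."""
    if tox_nm <= 0:
        raise ValueError("oxide thickness must be positive")
    tox_cm = tox_nm * 1e-7  # 1 nm = 1e-7 cm
    return K_SIO2 * EPS0_F_PER_CM / tox_cm


def threshold_shift_V(delta_qb_C_per_cm2: float, tox_nm: float) -> float:
    """Threshold-voltage change caused by a change in depletion charge, dQb / Cox."""
    return delta_qb_C_per_cm2 / oxide_capacitance_F_per_cm2(tox_nm)


def removed_dose_per_cm2(delta_qb_C_per_cm2: float) -> float:
    """Areal dopant density T_tunnel * N_channel implied by the removed charge."""
    return delta_qb_C_per_cm2 / Q_ELECTRON_C


if __name__ == "__main__":
    delta_qb = 4.8e-7  # C/cm^2, value quoted in the text
    tox = 3.0          # nm, physical gate oxide quoted in the text
    print(f"Cox     = {oxide_capacitance_F_per_cm2(tox):.3e} F/cm^2")
    print(f"dVT     = {threshold_shift_V(delta_qb, tox) * 1e3:.0f} mV")
    print(f"T*N     = {removed_dose_per_cm2(delta_qb):.2e} cm^-2")
```

Under these assumptions the shift comes out at roughly 0.42 V, in line with the approximately 400 mV difference quoted for Fig. 21.24a.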


Concerning the output characteristic, an improvement of as much as 30% in the drain current measured at high voltage (VG − VT = 1.8 V) and as much as 130% at low voltage (VG − VT = 0.6 V) has been observed as illustrated in Fig. 21.24b. In our analysis of this improvement, two phenomena are being investigated: 1) the reduction of the transverse field Eeff in the SON channel, which should lead to higher effective mobility, and 2) the physical confinement of the dopants within the SON extensions. The transverse field in the semiconductor may be reduced by the removal of dopants when etching the gap. Alternatively, the confinement of the dopants in the layer sandwiched between the upper interface and the BOX may also affect the lateral diffusion and electrical activation, leading to a shorter electrical channel and lower series resistances. In absolute numbers, a drain current as high as ID = 750 µA/µm at VG − VT = 1.8 V (consistent with the relatively thick gate oxide) is measured for 25 nA/µm IOFF current and 100 mV of DIBL. Bulk reference transistors fabricated with the same CMOS process show merely 600 µA/µm of ION at VG − VT = 1.8 V and 360 mV of DIBL, which is representative of the performance of a state-of-the-art bulk device with equivalent channel length and oxide thickness. We do not expect to obtain a similar improvement in the bulk devices by just lowering the channel doping to adjust the VT, however. A reduction in doping would have a deleterious effect on the already poor DIBL.

The SON process can also be easily adapted to fabrication of double-gate or gate-all-around (GAA) devices, which are reputed for their particularly strong immunity to short-channel effects. Some of the main issues were already outlined above, i.e. the thickness control of the Si-body, the surface roughness, and misalignment between the gates. The GAA SON-based process copes effectively with these issues as shown in Fig. 21.25.
The process starts with Selective Epitaxial Growth (SEG) of a thin SiGe layer and a thin Si layer on wafers with shallow trench isolation (STI) as illustrated schematically in Fig. 21.25a. The SiGe/Si stack is next patterned by reactive ion etching (RIE) into a strip extending over the active area (future transistor channel) and overlapping the STI (future S/D). Subsequently, the underlying SiGe is selectively etched out, as in the conventional SON process, and the entire bridge-like structure is covered again with dielectric, as illustrated in Fig. 21.25b. To ensure the quality of the dielectric, we used thermal oxidation in our application. Thus, through the application of the SON process, problems associated with the thickness and roughness of the films are resolved without appealing to lithography. Next, the structure is covered with poly-silicon that is patterned into a gate in a masked RIE step. The residual poly-silicon remaining in the corners under the bridge, shown in Fig. 21.25c, does not affect the static operation of the device, but does contribute to overlap capacitance, necessitating the use of tight design rules to reduce the gate-to-poly spacing. The significance of this increase in the parasitic overlap capacitance should be weighed against the reduced junction capacitance in the structure. Note also that the top and bottom gates, although not of the same size, are self-aligned in the sense that they are both formed in the same etching process (their centers are well aligned but not their borders). The structure is completed by ion implantation of S/D junctions, silicidation and other conventional back-end processing.

Fig. 21.25. Key steps of the Double Gate (or GAA) SON process (gate oxide, spacers, silicidation and back-end steps not visualized): (a) after selective epi growth of SiGe and non-selective epi growth of Si, (b) after patterning of the transistor channel, selective etch-out of SiGe, and thermal re-oxidation of the channel and of the walls of the tunnel, (c) after poly deposition, gate patterning, S/D implantation and contacting electrodes

This process presents at least three important advantages over prior implementations of the SON process: 1) the SiGe is not subject to any thermal steps (it is etched away just after deposition and channel patterning), thus limiting the danger of relaxation; 2) the continuity between the channel and S/D is never broken, thus removing any necessity of reunification of these regions by SEG of Si; and 3) the S/D junctions lie on a field oxide (STI), which promises reduced junction capacitance and leakage. The GAA device shown in Fig. 21.2 demonstrates the feasibility of the GAA SON process. It is worth noticing that the nature of the silicon within the bridge-like structure changes from a single crystal above the active area to polycrystalline above the STI. This is not expected to be a problem as long as the effective channel region is taken in the central, single-crystalline part of the bridge. As previously shown, the boundary between the mono- and polycrystalline regions is very sharp and passes close to the STI edge. The extremities of the bridge lying on STI are polycrystalline, but this should not be considered a disadvantage since they constitute the body of the junction. On the contrary, the Si epitaxy conditions may be adjusted in such a way so as to produce a polysilicon layer that is thicker than the crystalline Si layer within the same epi step. Taking advantage of this feature permits us to overcome one of the shortcomings of thin-film SOI devices, i.e. the series resistance and silicidation. Note that our bridge-like silicon strip never contacts the Si substrate. Consequently, it can be very highly doped and activated without the danger of junction diffusion as in the case of bulk devices.

Gate-All-Around transistors have been realized with a minimal gate length down to 50 nm. The silicon body thickness is 20 nm. In order to test the resistance of the GAA structure to SCE/DIBL effects, no pocket implants were used. The measured characteristics illustrated in Fig. 21.26 show no more than 10 mV of DIBL for 90 nm GAA devices, while bulk devices measured on the same chip exhibit much larger DIBL (600 mV for a 90 nm device) owing to the lack of pockets. It is remarkable that the bulk devices are also functional on the same chip even though no effort has been made to separate processing of GAA and bulk devices or to accommodate their integration. The maximum on-current measured did not exceed 170 µA/µm on GAA devices (see Fig. 21.26b), far below results obtained on bulk devices.

Fig. 21.26. (a) The subthreshold characteristics of a 90 nm Gate-All-Around nMOSFET with a 2 nm thick gate oxide (red), contrasted with a bulk nMOSFET with the same gate oxide thickness and same gate length, illustrating the superior performance achieved with a thin (20 nm) silicon body. (b) Simulations indicate that a series resistance of ∼ 1.5 kΩ limits the drive current in the GAA nMOSFET

To understand the origin of the degradation, the GAA structures have been simulated with a 2D simulator (ISE) and the geometrical properties have been strictly reproduced: gate oxide 3 nm, 20 nm-thick conduction Si channel, no silicide and the distance between the contacts and the transistor gate equal to 1 µm. Evidently, this large distance between the gate and S/D is one of the root causes of the degradation in the drain current performance. According to the simulation, a potential drop as large as 0.4 V occurs along the extensions because of the large series resistance. In addition, to fit the measured characteristics, we have had to include a contact resistance as large as 1500 Ω per contact.
Finally, to examine the potential of the GAA transistor, the calibrated simulator was used to predict performance assuming salicided junctions (9 Ω/square) and improved contact resistance (50 Ω per contact). It is remarkable that even with extensions as long as 1 µm, the predicted current is as large as 1.85 mA/µm. Of course, with adapted design rules the extension length should not exceed 3λ (where λ is the technology feature size), thus justifying the hope of transistor drivability in excess of 2 mA/µm.

In sum, a new SON process allowing fabrication of thin crystalline silicon films on thin buried dielectric layers has been proposed and demonstrated. The SON process has been successfully integrated into a bulk CMOS process, targeting the fabrication of very thin film SOI-like transistors on bulk wafers. SON belongs to the rare family of processes with a technological window that improves when scaling to smaller dimensions. Both etching and filling of the gap become easier for a 50 nm feature size than for longer features. It should be emphasized that SON is accessible to conventional CMOS manufacturing without any special technology changes and investments. Thus, SON provides a viable strategy for producing high-performance single- or double-gated devices, extending MOS technology to nanometer-scale dimensions.
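To give a feel for why contact and extension resistance matter so much in the GAA results discussed in this section, the following Python sketch computes simple ohmic (I·R) voltage drops. It is a back-of-the-envelope illustration, not a reproduction of the 2D simulation: the device width of 1 µm is an assumption made here so that per-µm currents can be treated as absolute currents, the helper names are hypothetical, and the extension is modeled as a uniform sheet resistor.

```python
ASSUMED_WIDTH_UM = 1.0  # hypothetical device width; not stated in the text


def extension_resistance_ohm(sheet_ohm_per_sq: float, length_um: float,
                             width_um: float) -> float:
    """Resistance of a uniform sheet resistor: R = Rsheet * (length / width)."""
    if width_um <= 0:
        raise ValueError("width must be positive")
    return sheet_ohm_per_sq * (length_um / width_um)


def ir_drop_V(current_A: float, resistance_ohm: float) -> float:
    """Ohmic voltage drop across a series resistance."""
    return current_A * resistance_ohm


if __name__ == "__main__":
    # Measured case: 170 uA/um on-current, 1500 ohm fitted contact resistance.
    i_meas = 170e-6 * ASSUMED_WIDTH_UM
    print(f"drop per 1500-ohm contact: {ir_drop_V(i_meas, 1500.0):.3f} V")

    # Projected case: 9 ohm/sq salicided extension, 1 um long, 50 ohm contact.
    r_side = 50.0 + extension_resistance_ohm(9.0, 1.0, ASSUMED_WIDTH_UM)
    i_proj = 1.85e-3 * ASSUMED_WIDTH_UM
    print(f"series resistance per side: {r_side:.0f} ohm")
    print(f"drop per side at projected current: {ir_drop_V(i_proj, r_side):.3f} V")
```

For the assumed 1 µm width, each 1500 Ω contact alone would drop about a quarter of a volt at the measured current, whereas the projected 59 Ω per side drops only about 0.1 V even at a roughly tenfold higher current. A different device width would change these numbers proportionally.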


21.6 Conclusion

Following Moore’s law, we expect continued improvement in the complexity of an IC through miniaturization of the wires and transistors that are included in the circuit, inexorably scaling the MOSFET structure toward 10 nm gate lengths using a gate dielectric that separates the control electrode from the current-carrying channel with an effective thickness less than tox,eq = 0.6 nm of SiO2. This expectation is physically reasonable and based on advanced device designs currently under research. However, such a dramatic reduction in the gate oxide thickness will result in an exponential increase in gate leakage current unless alternative gate dielectrics are developed. One likely candidate that suppresses gate leakage current is HfO2. The dramatic reduction in the gate length will result in increased leakage current due to DIBL, but thinning the body to eliminate leakage paths far away from the gate electrode can effectively reduce the subthreshold leakage by maintaining the control the gate has over the surface potential in the channel. While the double-gate MOSFET promises to be the most scalable transistor structure [21.5], a practical, manufacturable, double-gate CMOS technology hinges on the formation of an ultra-thin gate oxide, an ultra-thin silicon body, and perfectly aligned double gates all at once with a minimum number of processing steps in order to maintain yield. The variants of the double-gate MOSFET examined in this chapter, i.e. the vertical replacement gate MOSFET, the FinFET and the gate-all-around MOSFET implemented using an SON process, address the shortcomings inherent to the manufacture of double gates. But, even with an alternative gate dielectric implemented in a scalable MOS structure, the drive current performance that is so vital for scaling the MOSFET may still be severely constrained by carrier scattering in the channel and series resistance.
Ballistic transport may be used to improve the drive current performance and speed of the transistor through shorter gate lengths, tighter control over the roughness at the silicon-gate dielectric interface, and lower transverse fields in the semiconductor. On the other hand, the drain current will become more sensitive to variations in the gate length, and the structure of the interface between the Si-body and the gate dielectric will have to be controlled with atomic precision to recover the required performance gains. While the alternatives presented here may represent prospective solutions to the physical and manufacturing limitations that develop as the conventional planar MOSFET is scaled to nanometer-scale gate lengths, none of them provides for continued unrestricted scaling beyond 10 nm. Moreover, these alternatives have yet to demonstrate drive and leakage current performance that is superior to a well-designed planar MOSFET with the same gate length using the thinnest practical SiO2 gate dielectric for the same price [21.18].


References 21.1. G.E. Moore, IEDM Tech. Dig., 11–13 (1975) 21.2. International Technology Roadmap for Semiconductors, Semiconductor Industry Association, 2000 update 21.3. D.A. Muller, T. Sorsch, S. Moccio, F.H. Baumann, K. Evans-Lutterodt and G. Timp, The Electronic Structure at the atomic scale of ultra-thin gate oxides, Nature 399, pp. 758–761 (June 24, 1999) 21.4. D.J. Frank, R.H. Dennard, E. Nowak, P.M. Solomon, Y. Taur and H.-S.P. Wong, “Device scaling limits of Si MOSFETs and their application dependencies,” Proceedings of the IEEE 89, p. 259 (2001) 21.5. R. Chau et al., “A 50nm Depleted-Substrate CMOS Transistor (DST),” IEDM Tech Dig, pp. 621–624 (2001); T. Matsumoto et al., 70 nm SOICMOS of 135 GHz fmax with Dual Offset-implanted Source-Drain Extension Structure for RF/Analog and Logic Applications,” IEDM Tech. Digest, pp. 219–222 (2001) 21.6. H.-S.P Wong, D.J. Frank and P.M. Solomon, “Device Design Considerations for Double-Gate, Ground-Plane, and Single-Gated Ultra-Thin SOI MOSFET’s at the 25 nm Channel Length Generation,” IEDM Tech. Digest, p. 407 (1998) 21.7. L. Chang, S. Tang, T.-J. King, J. Bokor and C. Hu, “Gate Length Scaling and Threshold Voltage Control of Double-Gate MOSFETs,” IEDM Technical Digest, p. 719 (2000) 21.8. H.-S. Wong, D.J. Frank, Y. Taur and J.M.C. Stork, “Design and performance considerations for sub-0.1 µm double-gate SOI MOSFET’s,” IEDM Technical Digest, p. 747 (1994) 21.9. L. Chang and C. Hu, “MOSFET scaling into the 10 nm regime,” Superlattices and Microstructures 28, p. 351 (2000) 21.10. D.J. Frank, S.E. Laux and M.V. Fischetti, “Monte Carlo Simulation of a 30 nm Dual-Gate MOSFET: How Short Can Si Go?” IEDM Technical Digest, p. 553 (1992) 21.11. A.G. Sabnis and J.T. Clemens, IEDM Tech. Digest, p. 18 (1979); S. Takagi et al., IEEE Trans. Electron Dev. 41, 2357 (1994) 21.12. D. Essenic et al., “An Experimental Study of Low field electron mobility in Double-gate, ultra-thin SOI MOSFETs,” IEDM Tech. Digest, pp 445– 448 (2001); F. 
Balestra et al., IEEE Electron Dev. Lett., p. 410 (1987); S. Venkatesan et al., IEEE Electron Dev. Lett., p. 44 (1992) 21.13. A.B. Glaser and G.E. Subak-Sharpe, Integrated Circuit Engineering, Addison-Wesley, Reading MA, May 1979, p. 786; D.L. Miller, J.X. Przybysz, and J.H. Kang, IEEE Trans. Applied Superconductivity 3, 2728 (1993) 21.14. D.M. Tennant et al., “Progress toward a 30nm silicon MOS gate technology,” J. Vac. Sci. Technol. B 17 (6), pp. 3158–3163 (Nov/Dec 1999) 21.15. M. Lundstrum, IEDM Technical Digest, p. 387 (1996); F. Assad, Z. Ren, D. Vasileska, S. Datta and M. Lundstrom, IEEE Trans. Electron Dev. 47, pp. 232–240 (2000) 21.16. G.Timp et al., “The Ballistic Nanotransistor,” IEDM Technical Digest, p. 55 (1999) 21.17. J.D. Bude, “MOSFET Modeling into the Ballistic Regime,” 2000 International Conference on Simulation of Semiconductor Processes and Devices 2000, SISPAD 2000, pp. 23–26 (2000)

21 Advanced MOS-Devices

703

21.18. B. Yu, H. Wang, A. Joshi et al., “15 nm Gate Length Planar CMOS Transistor,” IEDM Tech. Digest, pp. 937–939 (2001)
21.19. F.H. Baumann et al., Gate stack and silicide issues in silicon processing, Ed. L.A. Clevenger, S.A. Campbell, P.R. Besser, S.B. Herner, J. Kittl, MRS Proceedings 611, C4.1.1–C4.1.12 (2000)
21.20. H. Akatsu and I. Ohdomari, Appl. Surf. Science 41/42, p. 357 (1989)
21.21. T. Yamanaka et al., IEEE Electron Dev. Lett. 17 (4), p. 178 (1996)
21.22. G. Mazzoni et al., “On Surface Roughness-limited Mobility in Highly Doped nMOSFETs,” IEEE Trans. Electron Dev. 46 (7), pp. 1423–1427 (July 1999)
21.23. S.M. Goodnick et al., Phys. Rev. B 32 (12), p. 8171 (1985)
21.24. J. Yu et al., “The role of interface roughness scattering in inversion layer mobility,” submitted to the 2002 Silicon Nanoelectronics Workshop
21.25. J.M. Hergenrother, D. Monroe, F.P. Klemens, A. Kornblit, G.R. Weber et al., “The Vertical Replacement-Gate (VRG) MOSFET: A 50-nm vertical MOSFET with lithography-independent gate length,” IEDM Tech. Digest, p. 75 (1999)
21.26. S.-H. Oh, J.M. Hergenrother, T. Nigam, D. Monroe, F.P. Klemens et al., “50 nm Vertical Replacement-Gate (VRG) pMOSFETs,” IEDM Tech. Digest, p. 65 (2000)
21.27. D. Monroe and J. Hergenrother, “The Vertical Replacement-Gate (VRG) process for scalable general-purpose complementary logic,” ISSCC Tech. Digest, p. 134 (2000)
21.28. J.M. Hergenrother, G.D. Wilk, T. Nigam, F.P. Klemens, D. Monroe et al., “50 nm Vertical Replacement-Gate (VRG) nMOSFETs with ALD HfO2 and Al2O3 Gate Dielectrics,” IEDM Tech. Digest, p. 51 (2001)
21.29. P. Kalavade, J.M. Hergenrother, T.W. Sorsch, S. Aravamudhan, M.K. Bude et al., “The Ultrathin-Body Vertical Replacement-Gate MOSFET: A Highly-Scalable, Fully-Depleted MOSFET with a Deposition-Defined Ultrathin (< 15 nm) Silicon Body,” submitted to the 2002 Silicon Nanoelectronics Workshop
21.30. L. Kang, K. Onishi, T. Jeon, B.-H. Lee, C. Kang et al., “MOSFET devices with poly-silicon on single-layer HfO2 high-κ dielectrics,” IEDM Tech. Digest, p. 35 (2000)
21.31. D. Barlage, R. Arghavani, G. Dewey, M. Doczy, B. Doyle et al., “High-frequency response of 100 nm integrated CMOS transistors with high-k gate dielectrics,” IEDM Tech. Digest, p. 231 (2001)
21.32. E.P. Gusev, D.A. Buchanan, E. Cartier, A. Kumar, D. DiMaria et al., “Ultrathin high-k gate stacks for advanced CMOS devices,” IEDM Tech. Digest, p. 451 (2001)
21.33. C. Hobbs, H. Tseng, K. Reid, B. Taylor, L. Dip et al., “80 nm poly-Si gate CMOS with HfO2 gate dielectric,” IEDM Tech. Digest, p. 651 (2001)
21.34. K. Onishi, L. Kang, R. Choi, E. Dharmarajan, S. Gopalan et al., “Dopant penetration effects on poly-silicon gate HfO2 MOSFETs,” VLSI Symp. Tech. Digest, p. 131 (2001)
21.35. S.J. Lee, H.F. Luan, C.H. Lee, T.S. Jeon, W.P. Bai, Y. Senzaki, D. Roberts and D.L. Kwong, “Performance and reliability of ultra thin CVD HfO2 gate dielectrics with dual poly-Si gate electrodes,” VLSI Symp. Tech. Digest, p. 133 (2001)



21.36. R. Choi, C.S. Kang, B.H. Lee, K. Onishi, R. Nieh, S. Gopalan, E. Dharmarajan and J.C. Lee, “High-quality ultra-thin HfO2 gate dielectric MOSFETs with TaN electrode and nitridation surface preparation,” VLSI Symp. Tech. Digest, p. 15 (2001)
21.37. D.A. Buchanan, E.P. Gusev, E. Cartier, H. Okorn-Schmidt, K. Rim et al., “80 nm poly-silicon gated n-FETs with ultra-thin Al2O3 gate dielectrics for ULSI applications,” IEDM Tech. Digest, p. 223 (2000)
21.38. T. Suntola, “Atomic layer epitaxy,” Material Science Reports 4, p. 261 (1989)
21.39. D. Hisamoto, T. Kaga, Y. Kawamoto and E. Takeda, “A fully depleted lean-channel transistor (DELTA) – a novel vertical ultra thin SOI MOSFET,” IEDM Technical Digest, p. 833 (1989)
21.40. D. Hisamoto, W.-C. Lee, J. Kedzierski, E. Anderson, H. Takeuchi, K. Asano, T.-J. King, J. Bokor and C. Hu, “A Folded-Channel MOSFET for Deep-Sub-Tenth Micron Era,” IEDM Technical Digest, p. 1032 (1998)
21.41. X. Huang, W.-C. Lee, C. Kuo, D. Hisamoto, L. Chang, J. Kedzierski, E. Anderson, H. Takeuchi, Y.-K. Choi, K. Asano, V. Subramanian, T.-J. King, J. Bokor and C. Hu, “Sub 50-nm FinFET: PMOS,” IEDM Technical Digest, p. 67 (1999)
21.42. N. Lindert, Y.-K. Choi, L. Chang, E. Anderson, W. Lee, T.-J. King, J. Bokor and C. Hu, “Quasi-planar NMOS FinFETs with sub-100 nm gate lengths,” 59th Device Research Conference, p. 26 (2001)
21.43. Y.-K. Choi, N. Lindert, P. Xuan, S. Tang, D. Ha, E. Anderson, T.-J. King, J. Bokor and C. Hu, “Sub-20 nm CMOS FinFET Technologies,” IEDM Technical Digest, p. 421 (2001)
21.44. L. Chang, K.J. Yang, Y.-C. Yeo, Y.-K. Choi, T.-J. King and C. Hu, “Reduction of direct-tunneling gate leakage current in double-gate and ultra-thin body MOSFETs,” IEDM Technical Digest, pp. 99–102 (2001)
21.45. N. Lindert, Y.-K. Choi, L. Chang, E. Anderson, W.-C. Lee, T.-J. King, J. Bokor and C. Hu, “Quasi-planar FinFETs with selectively grown germanium raised source/drain,” 2001 IEEE International SOI Conference Proceedings, p. 111 (2001)
21.46. J. Kedzierski, P. Xuan, E.H. Anderson, J. Bokor, T.-J. King and C. Hu, “Complementary silicide source/drain thin-body MOSFETs for the 20 nm gate length regime,” IEDM Technical Digest, p. 57 (2000)
21.47. P. Ranade, H. Takeuchi, T.-J. King and C. Hu, “Work function engineering of molybdenum gate electrodes by nitrogen implantation,” Electrochemical and Solid-State Letters 4, p. G85 (2001)
21.48. J. Kedzierski, D.M. Fried, E.J. Nowak, T. Kanarsky, J.H. Rankin, H. Hanafi, W. Natzle, D. Boyd, Y. Zhang, R.A. Roy, J. Newbury, C. Yu, Q. Yang, P. Saunders, C.P. Willets, A. Johnson, S.P. Cole, H.E. Young, N. Carpenter, D. Rakowski, B.A. Rainey, P.E. Cottrell, M. Ieong and H.-S.P. Wong, “High-performance symmetric-gate and CMOS-compatible Vt asymmetric-gate FinFET devices,” IEDM Technical Digest, pp. 437–440 (2001)
21.49. C. Fiegna, H. Iwai, T. Wada, T. Saito, E. Sangiorgi, B. Ricco, “A new scaling methodology for the 0.1–0.025 µm MOSFET,” Symp. VLSI Techn. Dig., pp. 33–34 (1993)



21.50. M. Jurczak, T. Skotnicki, M. Paoli, B. Tormen, J.-L. Regolini, C. Morin, A. Schiltz, J. Martins, R. Pantel, J. Galvier, “SON (Silicon On Nothing) – a new device architecture for the ULSI Era,” Symp. VLSI Techn. Dig., pp. 29–30 (1999)
21.51. M. Jurczak, T. Skotnicki, M. Paoli, B. Tormen, J. Martins, J.-L. Regolini, D. Dutartre, P. Ribot, D. Lenoble, R. Pantel and S. Monfray, “SON (Silicon On Nothing) – an Innovative Process for Advanced CMOS,” IEEE Trans. Electron Devices, pp. 2179–2187 (Nov. 2000)
21.52. C.H. Wann, K. Noda, T. Tanaka, M. Yoshida, C. Hu, “Comparative Study of Advanced MOSFET Concepts,” IEEE Trans. Electron Devices 43, no. 10, pp. 1742–1753 (1996)
21.53. R.H. Yan, A. Ourmazd, K.F. Lee, “Scaling the Si MOSFET: From Bulk to SOI to Bulk,” IEEE Trans. Electron Devices 39, no. 7, pp. 1704–1710 (1992)
21.54. L.T. Su, J.B. Jacobs, J.E. Chung, D.A. Antoniadis, “Deep-Submicrometer Channel Design in Silicon-on-Insulator (SOI) MOSFET’s,” IEEE Electron Dev. Lett. 15, no. 9, pp. 366–369 (1994)
21.55. Y. Omura, S. Nakashima, K. Izumi, T. Ishii, “0.1 µm gate ultrathin film CMOS devices using SIMOX substrate with 80 nm thick buried oxide layer,” IEDM Tech. Dig., pp. 675–678 (1991)
21.56. E. Suzuki, K. Ishii, S. Kanemaru, T. Maeda, T. Tsutsumi, T. Sekigawa, K. Nagai, H. Hiroshima, “Highly Suppressed Short-Channel Effects in Ultrathin SOI n-MOSFET’s,” IEEE Trans. Electron Devices 47, no. 2, pp. 354–359 (2000)
21.57. H.-O. Joachim, Y. Yamaguchi, T. Fujino, T. Kato, Y. Inoue, T. Hirao, “Comparison of Standard and Low-Dose SIMOX Substrates for 0.15 µm SOI MOSFET Applications,” SSDM, Osaka, pp. 854–856 (1995)
21.58. R. Koh, “Buried Layer Engineering to Reduce the Drain-Induced Barrier Lowering of Sub-0.05 µm SOI-MOSFET,” Jpn. J. Appl. Phys. 38, pp. 2294–2299 (1999)
21.59. O. Faynot, B. Giffard, “High performance ultrathin SOI MOSFETs obtained by localized oxidation,” IEEE Electron Device Lett. 15, p. 175 (1994)
21.60. J.P. Colinge, “Subthreshold slope of thin-film SOI MOSFETs,” IEEE Electron Device Lett. EDL-7, no. 4, pp. 244–246 (1986)
21.61. C. Raynaud, O. Faynot, B. Giffard, J. Gautier and J.-L. Pelloie, “High performance submicron SOI devices with silicon film thickness below 50 nm,” IEEE Int. SOI Conference, pp. 55–56 (1994)
21.62. D.J. Godbey, J. Electrochem. Soc. 139, pp. 2943–2947 (October 1992)
21.63. J.P. Colinge, M.H. Gao, A. Romano-Rodriguez, H. Maes, C. Claeys, “Silicon On Insulator Gate All Around Device,” IEDM Tech. Dig., pp. 595–598 (1990)
21.64. J.-H. Lee, G. Taraschi, A. Wei, T.A. Langdo, E.A. Fitzgerald and D.A. Antoniadis, “Superself-aligned double-gate (SSDG) MOSFETs utilizing oxidation rate difference and selective epitaxy,” IEDM Technical Digest, pp. 71–74 (1999)
21.65. S. Monfray, T. Skotnicki et al., “First 80 nm SON (Silicon on Nothing) MOSFET with perfect morphology and high electrical performance,” IEDM Tech. Digest, pp. 645–648 (2001)

Index

AFM 81, 82, 84, 198
ALD 196, 271, 275, 276, 299, 303, 380–384, 386–392, 394–396, 399, 452, 455, 468, 470, 684–686
Alkaline earth metal silicides 615
Alkaline earth oxides 616, 617
Alternate gate dielectrics 223, 227, 237, 246, 270, 272, 322
Alternate substrates 176
Aluminium gate electrode 42
Ballistic nanotransistor 674
Barrier heights – silicon dioxide 66
Binary metal alloys 425, 427, 431
Boron penetration 195, 203–205, 209, 210, 216, 218
Breakdown models 91, 107
C–V analysis 523, 524, 547, 551
Capacitance equivalent thickness (CET) 130, 131, 205, 212, 256, 484, 523, 550
Carrier mobility 78, 254, 257, 258, 591
Charge neutrality level (CNL) 234, 609, 658, 659
CMOS 12, 42, 46, 48, 110, 123, 124, 126, 127, 130–132, 135, 137, 138, 140, 143–145, 153, 154, 159, 165–169, 172, 174–177, 179–181, 184–189, 191, 195, 197, 198, 209–211, 215, 238, 248, 253, 255, 257–260, 262, 268, 270, 272–277, 311, 312, 354, 415, 416, 423, 427, 429–431, 435, 436, 442, 444, 446–453, 457, 459, 462, 470, 471, 528, 538, 650, 664, 668, 670, 674, 682, 683, 689, 690, 692, 693, 697, 700, 701

Conduction band offset 264, 265, 311, 329, 335, 337, 348, 349, 354, 359, 572, 640, 658–661, 664
Conduction band offset and valence band offset 631
Constant current stress (CCS) 91, 92, 95, 97, 106, 108, 109
Constant voltage stress (CVS) 91, 92, 95, 97, 106, 108, 109, 202, 216, 539
Cross contamination issues 442
CVD 47, 195–197, 268, 275, 295, 297, 300, 302, 303, 386–388, 391–394, 399, 452, 455, 468, 693
Deal–Grove model 39, 531
Decoupled plasma nitridation (DPN) 195, 197, 211–218
Design rules 145, 671, 697, 700
Device architecture 179, 180, 182
DIBL 144, 153–155, 161–165, 170, 179, 182–184, 188, 203–205, 463–465, 669, 670, 696, 697, 700, 701
Direct tunneling 107, 137, 195, 311, 312, 348, 350, 352, 353, 359, 528, 541, 558, 567, 568, 571, 573, 577, 592, 593, 669
Dopant segregation at oxide interface 39
Double gate 145, 183, 673, 688, 697, 698
Dry oxidation 56, 492
Ellipsometry 51–57, 66, 68, 71, 74, 76, 78, 79, 256, 334, 339, 346, 483–486, 509, 511, 516, 674, 686
ELYMAT 373
Energy band alignment 233



Equivalent oxide thickness (EOT) 127, 130–132, 135–137, 139, 157, 158, 180, 198, 207, 211, 229, 230, 255, 256, 258, 259, 297, 300, 302, 311, 312, 348, 350–353, 393, 396, 399, 402, 403, 407, 416–418, 421, 423–426, 430, 431, 444, 445, 453, 484, 509, 511, 516, 517, 521, 523, 530, 531, 539, 541, 546–550, 556, 558, 562, 568, 570, 573, 575, 578, 584–589, 594, 596, 601, 641–643, 673, 687
Fermi pinning 418
Fick’s law 52, 53
FinFET 182, 183, 671, 673, 688–692, 701
Flat-band voltage 209, 259
Fowler–Nordheim tunneling 78, 133, 256, 312, 393, 421, 568, 569, 579
Full silicidation (FUSI) 456, 460, 462, 463
Gate all-around structure 669, 673
Gate dielectric properties 229
Gate dielectrics 1, 27, 33, 85, 123, 132, 135, 195, 197, 218, 224, 225, 237, 253, 257, 260, 266, 269–276, 311, 312, 323, 326, 354, 418, 430, 431, 483, 504, 511, 521, 528, 537, 538, 541, 556, 607, 639, 640, 673, 684, 701
Grazing incidence x-ray reflectivity (XRR) 497–499, 502
High-k trapped charge 347
High-k amorphous dielectrics 241, 322
High-k crystalline oxide interface stability 653
High-k crystalline oxide nucleation 650
High-k crystalline oxides 609, 639
High-k crystalline oxides electrical properties 629
High-k defect chemistry 237
High-k deposition 297, 379, 402, 444
High-k deposition techniques 379
High-k device modeling 260
High-k diffusivity in silicon 362

High-k electronic structure 327
High-k energy gap 234, 262
High-k epitaxial dielectrics 239
High-k film morphology 270
High-k gate circuit design issues 567
High-k gate dielectric cleans 445
High-k gate dielectric interfaces 287, 304
High-k gate dielectric process integration 436
High-k gate dielectrics 435, 436, 442–446, 448, 453, 461, 462, 464, 465, 468, 470
High-k gate stack charge trapping 538
High-k gate stack models 568
High-k metrology 483, 521
High-k optical models 503
High-k reliability 276
Hole fluence 91, 93, 95, 97, 101
Integrated circuit 2–4, 6, 7, 38, 240, 562, 567, 667
Integrated circuit yield 18
Interface charges 35, 83, 568, 593, 607
Interface engineering 257, 259
Interface state density 197, 198, 203, 269, 391, 393, 396, 399, 402, 523, 540, 542, 544, 594, 632
Interface traps 92, 99, 312, 319, 591, 631, 632
Ion drift effect 36
Ion-beam assisted deposition 404
Junction resistance 180, 181
JVD 197, 207–211, 218
Kinetic rate processes 297
Kooi effect (white ribbon effect) 41, 438

Lithography 5, 8, 12, 16–18, 20, 26, 50, 56, 466, 468, 672, 673, 681, 688–690, 697
LOCOS 38, 41, 436, 438, 439
Macroeconomics 21
MASTAR 145–148, 157, 159, 160, 163, 165, 176, 177, 179–181, 184, 187
Maxwell–Wagner effects 591, 596, 602, 603

MBE 403, 404, 610–612, 617, 634, 646, 647, 649, 651, 652, 655, 664
MEIS 302, 322, 496, 517
Mesa process 5
Metal gate electrode 35, 274, 302, 415, 431, 452, 454, 464, 468
Metal induced gap states (MIGS) 660
Metal nitride gate electrode 419
Metal silicon nitride gate electrode 423
Microeconomics 13
Minority-carrier recombination lifetime 372
Moore’s Law 1–4, 7–18, 20–29, 41, 144, 165, 169, 183, 189–191, 521, 667, 701
MOSFET 46–50, 57, 85, 107, 123–125, 129, 132, 139, 144, 148, 169, 170, 175, 186, 210, 263, 380, 394, 444, 454, 537, 601, 602, 607, 609, 633, 643, 661, 662, 668–677, 680, 682, 683, 688, 690–692, 695, 701
NMOS FET 42, 93, 177, 204, 210–215, 427, 438, 674–676, 678, 680, 685–687, 696, 699
Non-linear silicon dioxide oxidation 71
Non-planar integration processes 466
Oxidation kinetics 51, 58–60, 62, 65, 68, 69, 71, 73, 229, 647
Oxidation stress effects 62
Oxide charge trapping 92
PECVD 47, 396–399, 402, 403
Permittivity 20, 125, 127, 224, 225, 228–233, 240–242, 244, 245, 247, 253–255, 259–264, 277, 287, 290, 292, 359, 379, 396, 557, 591, 594, 597, 598, 603, 686
Perovskites 239, 240, 270, 610, 614, 617, 618, 622, 634, 643, 644
Planar integration processes 461
Planar process 1, 5, 6, 8
Plasma enhanced ALD (PEALD) 393–397, 402, 403
Plasma enhanced CVD (PECVD) 47, 396


PMOS FET 42, 181, 195, 204, 205, 210, 212–214, 218, 427, 430, 438, 572, 675, 676, 678, 683, 684
Polysilicon gate electrode 197, 523, 542
PSG passivation 37
PVD 399
Rare-earth properties in silicon 359
RBS 496, 503
Replacement gate process 273, 466
Scaling 1, 16, 20, 27, 43, 99, 107, 111, 123, 124, 126, 127, 131, 132, 135–137, 139, 140, 143–145, 153–159, 161, 163–167, 169–171, 174, 179, 182–184, 187–189, 196, 205, 218, 224, 233, 254–257, 260, 270, 272, 275, 311, 312, 314, 325, 329, 335, 337, 342, 348, 350, 415, 436, 442, 444, 449, 461, 466, 468, 471, 483, 504, 505, 512, 513, 521, 524, 531, 548, 562, 567, 579, 603, 607, 610, 639, 664, 667, 669, 670, 672–675, 691, 692, 700, 701
Schottky barrier 234, 235
Shallow trench isolation 436, 438, 697
Short channel effect (SCE) 125, 144, 150, 151, 153–155, 161–165, 170, 179, 183, 184, 188, 461–466, 468, 470, 673, 694, 700
SILC 92, 93, 97–100, 105
Silicon dioxide 109, 253, 312, 359, 379, 435, 442, 446–448, 470, 483–486, 488, 502, 507, 591, 607, 609, 610, 623–625, 633
Silicon dioxide – electronic structure 316
Silicon dioxide – local atomic structure 315
Silicon dioxide breakdown 102
Silicon dioxide reliability 91, 201
Silicon dioxide soft breakdown 105
Silicon interface oxidation 295
Silicon nitride 37, 38, 195–197, 312, 319, 438, 439, 441, 446, 447, 560, 683
Silicon orientation effects 65



Silicon oxynitride 123, 135, 195, 379, 439, 484, 485, 494, 495, 521
Silicon oxynitride reliability 201, 215
Silicon sub-oxides 75
Silicon surface passivation 34
Silicon thermal oxidation 38
Silicon–silicon dioxide interface 313
Silicon–silicon dioxide interface roughness 74, 80
SOI 167, 183, 241, 416, 429, 456, 457, 461, 462, 465–467, 669, 673, 688, 689, 692, 694, 695, 700
Sol-gel deposition 379, 405
SON 182, 183, 188, 469, 470, 673, 692–698, 700, 701
STEM 256, 485, 487–490, 492–495, 517, 676, 677
Strained silicon 178
Surface passivation 302
Surface pre-treatment 300
Surface states 33–36, 48–50
Thermionic emission 70, 319, 391, 569–571, 579, 642, 669
Thermodynamic stability 85, 226, 227, 233, 239, 241, 243, 244, 253, 266, 277, 290, 292, 616
Threshold voltage 35, 97, 123, 124, 138, 143, 148, 150, 152, 167, 184, 187, 191, 195, 196, 198, 209, 210, 254, 259, 272, 304, 312, 438, 444, 461, 594, 602, 669, 673, 675, 680, 686
Threshold voltage shift 212
Transistor performance 143, 176, 240, 260, 641
Transition metal diffusivity in silicon 364
Transition metal properties in silicon 359, 368
Ultra-thin silicon body 669

Vdd scaling 157, 166
Vertical replacement gate (VRG) 681, 701
Viscous flow model 63, 65
VLSI 11, 15, 19, 137, 176, 177

Wafer diameter 19
Wet oxidation 439
Work function 235, 273, 274, 394, 416–421, 423–431, 436, 449–452, 454, 456–466, 468, 469, 471, 575, 662, 691, 692
XPS 77, 295, 301, 302, 320, 322, 339–347, 494–496, 503, 576, 623, 650, 654–656
XTEM 77, 487