Preface

Over the last three decades, accomplishments in microlithographic technology have resulted in tremendous advances in the development of semiconductor integrated circuits (ICs) and microelectromechanical systems (MEMS). As a direct result, devices have become both faster and smaller, can handle an ever increasing amount of information, and are used in applications from the purely scientific to those of everyday life. With the shrinking of device patterns approaching the nanometer scale, the wavelength of the exposing radiation has been reduced from the blue-UV wavelength of the mercury g-line (436 nm) to the mercury i-line (365 nm), deep UV (DUV), vacuum UV (VUV), and extreme UV (EUV). The krypton fluoride (KrF) excimer laser at 248 nm was adopted as an exposure source in the DUV region and has been used in volume manufacturing since 1988. Since the first edition of this book, advances in 193-nm argon fluoride (ArF) excimer laser lithography have allowed for the pursuit of sub-90-nm device fabrication and, when combined with high-NA technology, polarized illumination, and immersion imaging, may be capable of imaging for device generations at 45 nm and beyond. The next generation of lithographic systems for 32-nm device technology will likely come from candidates including F2 excimer laser (157 nm) lithography, EUV (13.5 nm) lithography, electron projection lithography (EPL), nanoimprint lithography (NIL), and maskless lithography (ML2). Among these candidates, ML2 approaches such as electron-beam direct-write systems have been used for small-volume device production with quick turnaround time (QTAT) because a mask is not necessary. Factors that will determine the ultimate course for high-volume device production include cost, throughput, resolution, and extendibility to finer resolution.

The second edition of this volume is written not only as an introduction to the science and technology of microlithography, but also as a reference for those with more experience, so that they may obtain a wider knowledge and a deeper understanding of the field. The purpose of this update remains consistent with the first edition, published in 1998 and edited by Dr. James R. Sheats and Dr. Bruce W. Smith. New advances in lithography have required that we update the coverage of microlithography systems and approaches, as well as resist materials, processes, and metrology techniques. The contributors were organized and revision work started in 2003.

Additional content and description have been added regarding immersion lithography, 157-nm lithography, and EPL in Chapter 1 (System Overview of Optical Steppers and Scanners), Chapter 3 (Optics for Photolithography), Chapter 4 (Excimer Laser for Advanced Microlithography), and Chapter 6 (Electron Beam Lithography Systems). Because the topics of EUV and imprint lithography were not addressed in the first edition, Chapter 8 and Chapter 9 have been added to discuss them. A detailed explanation of scatterometry has been incorporated into Chapter 14 (Critical-Dimensional Metrology). Chapter 15 (Electron Beam Nanolithography) has also been widely revised. In order to maintain the continuity that proved so valuable in the first edition, these topics and others that may be less obvious, but no less significant, have been tied into the other corresponding chapters as necessary.

As a result, we are certain that this second edition of Microlithography: Science and Technology will remain a valuable textbook for students, engineers, and researchers and will be a useful resource well into the future.

Kazuaki Suzuki
Bruce W. Smith
Editors
Kazuaki Suzuki is a project manager of Next Generation Lithography Tool Development at the Nikon Corporation. He has participated in several projects for new-concept exposure tools, including the first-generation KrF excimer laser stepper, the first-generation KrF excimer laser scanner, the electron-beam projection lithography system, and the full-field EUV scanner. He has authored and coauthored many papers in the field of exposure tools and related technologies. He also holds numerous patents in the areas of projection lens control systems, dosage control systems, focusing control systems, and evaluation methods for image quality. For the last several years, he has been a member of program committees for SPIE Microlithography and other international conferences. He is an associate editor of the Journal of Micro/Nanolithography, MEMS, and MOEMS (JM3). Kazuaki Suzuki received his BS degree in plasma physics (1981) and his MS degree in x-ray astronomy (1983) from Tokyo University, Japan. He left a doctoral course in x-ray astronomy and joined the Nikon Corporation in 1984.

Bruce W. Smith is a professor of microelectronic engineering and the director of the Center for Nanolithography Research at the Rochester Institute of Technology. He is involved in research in the fields of DUV and VUV lithography, photoresist materials, resolution enhancement technology, aberration theory, optical thin film materials, illumination design, immersion lithography, and evanescent wave imaging. He has authored numerous scientific publications and holds several patents. Dr. Smith is a widely known educator in the field of optical microlithography. He received his MS degree and doctorate in imaging science from the Rochester Institute of Technology. He is a member of SPIE (the International Society for Optical Engineering), the Optical Society of America (OSA), and the Institute of Electrical and Electronics Engineers (IEEE).
Contributors
Mike Adel, KLA-Tencor, Israel
Robert D. Allen, IBM Almaden Research Center, San Jose, California
Zvonimir Z. Bandić, Hitachi San Jose Research Center, San Jose, California
Palash Das, Cymer, Inc., San Diego, California
Elizabeth A. Dobisz, Hitachi San Jose Research Center, San Jose, California
Gregg M. Gallatin, IBM Thomas J. Watson Research Center, Yorktown Heights, New York (Current Affiliation: Applied Math Solutions, LLC, Newton, Connecticut)
Charles Gwyn, Intel Corporation (Retired)
Maureen Hanratty, Texas Instruments, Dallas, Texas
Michael S. Hibbs, IBM Microelectronic Division, Essex Junction, Vermont
Roderick R. Kunz, Massachusetts Institute of Technology, Lexington, Massachusetts
Gian Lorusso, IMEC, Leuven, Belgium
Chris A. Mack, KLA-Tencor FINLE Division, Austin, Texas (Retired, Currently Gentleman Scientist)
Herschel M. Marchman, KLA-Tencor, San Jose, California (Current Affiliation: Howard Hughes Medical Institute, Ashburn, Virginia)
Martin C. Peckerar, University of Maryland, College Park, Maryland
Douglas J. Resnick, Motorola, Tempe, Arizona (Current Affiliation: Molecular Imprints, Austin, Texas)
Bruce W. Smith, Rochester Institute of Technology, Rochester, New York
Kazuaki Suzuki, Nikon Corporation, Saitama, Japan
Takumi Ueno, Hitachi Chemical Electronic Materials R&D Center, Ibaraki, Japan
Stefan Wurm, International SEMATECH (Qimonda assignee), Austin, Texas
Sanjay Yedur, Timbre Technologies Inc., a division of Tokyo Electron Limited, Santa Clara, California
Contents
Part I  Exposure System
1. System Overview of Optical Steppers and Scanners (Michael S. Hibbs) ... 3
2. Optical Lithography Modeling (Chris A. Mack) ... 97
3. Optics for Photolithography (Bruce W. Smith) ... 149
4. Excimer Laser for Advanced Microlithography (Palash Das) ... 243
5. Alignment and Overlay (Gregg M. Gallatin) ... 287
6. Electron Beam Lithography Systems (Kazuaki Suzuki) ... 329
7. X-ray Lithography (Takumi Ueno) ... 361
8. EUV Lithography (Stefan Wurm and Charles Gwyn) ... 383
9. Imprint Lithography (Douglas J. Resnick) ... 465

Part II  Resists and Processing
10. Chemistry of Photoresist Materials (Takumi Ueno and Robert D. Allen) ... 503
11. Resist Processing (Bruce W. Smith) ... 587
12. Multilayer Resist Technology (Bruce W. Smith and Maureen Hanratty) ... 637
13. Dry Etching of Photoresists (Roderick R. Kunz) ... 675

Part III  Metrology and Nanolithography
14. Critical-Dimensional Metrology for Integrated-Circuit Technology (Herschel M. Marchman, Gian Lorusso, Mike Adel, and Sanjay Yedur) ... 701
15. Electron Beam Nanolithography (Elizabeth A. Dobisz, Zvonimir Z. Bandić, and Martin C. Peckerar) ... 799
1
System Overview of Optical Steppers and Scanners
Michael S. Hibbs
CONTENTS
1.1 Introduction ... 5
    1.1.1 Moore's Law ... 6
1.2 The Lithographic Exposure System ... 7
    1.2.1 The Lithographic Projection Lens ... 7
    1.2.2 The Illumination Subsystem ... 8
    1.2.3 The Wafer Positioning Subsystem ... 9
1.3 Variations on a Theme ... 10
    1.3.1 Optical Contact Printing and Proximity Printing ... 10
    1.3.2 X-ray Proximity Lithography ... 11
    1.3.3 Ebeam Proximity Lithography ... 12
    1.3.4 Imprint Lithography ... 12
    1.3.5 1× Scanners ... 13
    1.3.6 Reduction Steppers ... 14
    1.3.7 1× Steppers ... 15
    1.3.8 Step-and-Scan ... 16
    1.3.9 Immersion Lithography ... 17
    1.3.10 Serial Direct Writing ... 18
    1.3.11 Parallel Direct Writing/Maskless Lithography ... 19
    1.3.12 Extreme Ultraviolet Lithography ... 20
    1.3.13 Masked Particle Beam Lithography ... 21
1.4 Lithographic Light Sources ... 23
    1.4.1 Requirements ... 23
    1.4.2 Radiance ... 23
    1.4.3 Mercury–Xenon Arc Lamps ... 23
    1.4.4 The Arc-Lamp Illumination System ... 24
    1.4.5 Excimer Lasers ... 25
    1.4.6 157 nm F2 Lasers ... 27
    1.4.7 Other Laser Light Sources ... 28
    1.4.8 Polarization ... 28
    1.4.9 Nonoptical Illumination Sources ... 29
1.5 Optical Considerations ... 31
    1.5.1 Requirements ... 31
    1.5.2 Lens Control ... 31
    1.5.3 Lens Defects ... 32
    1.5.4 Coherence ... 32
    1.5.5 k-Factor and the Diffraction Limit ... 32
    1.5.6 Proximity Effects ... 34
1.6 Latent Image Formation ... 34
    1.6.1 Photoresist ... 34
    1.6.2 Thin-Film Interference and the Swing Curve ... 36
    1.6.3 Mask Reflectivity ... 37
    1.6.4 Wafer Topography ... 38
    1.6.5 Control of Standing Wave Effects ... 38
    1.6.6 Control of Topographic Effects ... 40
    1.6.7 Latent Image Stability ... 40
1.7 The Resist Image ... 40
    1.7.1 Resist Development ... 40
    1.7.2 Etch Masking ... 41
    1.7.3 Multilayer Resist Process ... 43
    1.7.4 Top-Surface Imaging ... 43
    1.7.5 Deposition Masking and the Liftoff Process ... 44
    1.7.6 Directly Patterned Insulators ... 45
    1.7.7 Resist Stripping ... 46
1.8 Alignment and Overlay ... 46
    1.8.1 Definitions ... 46
    1.8.2 Alignment Methodology ... 47
    1.8.3 Global Mapping Alignment ... 48
    1.8.4 Site-by-Site Alignment ... 49
    1.8.5 Alignment Sequence ... 50
    1.8.6 Distortion Matching ... 51
    1.8.7 Off-Axis Alignment ... 52
    1.8.8 Through-the-Lens Alignment ... 53
    1.8.9 Alignment Mark Design ... 54
    1.8.10 Alignment Mark Detection ... 55
1.9 Mechanical Considerations ... 55
    1.9.1 The Laser Heterodyne Interferometer ... 55
    1.9.2 Atmospheric Effects ... 57
    1.9.3 Wafer Stage Design ... 58
    1.9.4 The Wafer Chuck ... 59
    1.9.5 Automatic Focus Systems ... 59
    1.9.6 Automatic Leveling Systems ... 61
    1.9.7 Wafer Prealignment ... 62
    1.9.8 The Wafer Transport System ... 63
    1.9.9 Vibration ... 64
    1.9.10 Mask Handlers ... 65
    1.9.11 Integrated Photo Cluster ... 66
    1.9.12 Cost of Ownership and Throughput Modeling ... 66
1.10 Temperature and Environmental Control ... 68
    1.10.1 The Environmental Chamber ... 68
    1.10.2 Chemical Filtration ... 69
    1.10.3 Effects of Temperature, Pressure, and Humidity ... 69
    1.10.4 Compensation for Barometric and Thermal Effects ... 70
1.11 Mask Issues ... 71
    1.11.1 Mask Fabrication ... 71
    1.11.2 Feature Size Tolerances ... 72
    1.11.3 Mask Error Factor ... 73
    1.11.4 Feature Placement Tolerance ... 73
    1.11.5 Mask Flatness ... 74
    1.11.6 Inspection and Repair ... 74
    1.11.7 Particulate Contamination and Pellicles ... 75
    1.11.8 Hard Pellicles for 157 nm ... 77
    1.11.9 Field-Defining Blades ... 77
1.12 Control of the Lithographic Exposure System ... 78
    1.12.1 Microprocessor Control of Subsystems ... 78
    1.12.2 Photocluster Control ... 79
    1.12.3 Communication Links ... 79
    1.12.4 Stepper Self-Metrology ... 79
    1.12.5 Stepper Operating Procedures ... 80
1.13 Optical Enhancement Techniques ... 81
    1.13.1 Optical Proximity Corrections ... 82
    1.13.2 Mask Transmission Modification ... 83
    1.13.3 Phase-Shifting Masks ... 84
    1.13.4 Off-Axis Illumination ... 87
    1.13.5 Pupil Plane Filtration ... 90
1.14 Lithographic Tricks ... 90
    1.14.1 Multiple Exposures through Focus (FLEX) ... 90
    1.14.2 Lateral Image Displacement ... 92
    1.14.3 Resist Image Modifications ... 93
    1.14.4 Sidewall Image Transfer ... 93
    1.14.5 Field Stitching ... 94
References ... 95
1.1 Introduction
Microlithography is a manufacturing process for producing highly accurate, microscopic, 2-dimensional patterns in a photosensitive resist material. These patterns are optically projected replicas of a master pattern on a durable photomask, which is typically made of a thin patterned layer of chromium on a transparent glass plate. At the end of the lithographic process, the patterned photoresist is used to create a useful structure in the device that is being built. For example, trenches can be etched into an insulator, or a uniform coating of metal can be etched to leave a network of electrical wiring on the surface of a semiconductor chip. Microlithography is used at every stage of the semiconductor manufacturing process. An advanced chip design can have 50 or more masking levels, and approximately 1/3 of the total cost of semiconductor manufacture can be attributed to microlithographic processing.
The progress of microlithography has been measured by the ever smaller sizes of the images that can be printed. There is a strong economic incentive for improving lithographic resolution. A decrease in minimum image size by a factor of two leads to a factor of four increase in the number of circuits that can be built on a given area of the semiconductor chip, as well as significant increases in switching speeds. It has been traditional to define a decrease in minimum image size by a factor of 1/√2 as a new lithographic generation. Over the last two decades, these lithographic generations have been roughly coincident with generations of dynamic random-access memory (DRAM) chips, which are defined by an increase in memory storage by a factor of four. Table 1.1 shows the correspondence of lithographic and DRAM generations.
TABLE 1.1
Seven Lithographic and Dynamic Random-Access Memory (DRAM) Generations

DRAM storage (Megabits):    1     4     16    64    256   1024  4096
Minimum image size (μm):    1.00  0.70  0.50  0.35  0.25  0.18  0.13
About half of the 4× increase per generation in DRAM capacity is due to the reduced lithographic image size, and the remaining increase is accomplished by advances in design techniques and by increasing the physical dimensions of the DRAM. Historically, there have been about three years between lithographic generations, with leading-edge manufacturing at 0.35 μm starting in 1995.

1.1.1 Moore's Law
The historical trend of exponential increase in integrated circuit complexity with time was recognized by Gordon Moore very early in the history of the semiconductor industry. Moore published an article in 1965 [1] that summarized the increase of integrated circuit complexity between 1959 and 1965. He found that the number of discrete devices per integrated circuit had roughly doubled every year throughout that period, reaching the inspiring total of 50–60 devices per chip by 1965. Moore predicted that semiconductor complexity would continue to increase at the same rate for at least 10 years, and he predicted that chips would be built with 65,000 components in 1975. This prediction has become known in the semiconductor industry as Moore's Law. Forty years later, it remains surprisingly accurate. Although the doubling time for devices per chip has varied slightly and probably averages closer to 18 months than one year, the exponential trend has been maintained.
Of course, Moore's Law is not a fundamental law of nature. In many ways, it functions as a sort of self-fulfilling prophecy. The economics of the semiconductor industry have become dependent on the exponential growth rate of semiconductor complexity, which makes products become obsolete quickly and guarantees a market for their replacement every few years. Program managers have used the expectation of exponential growth in their planning, and exponential rates of improvement have been built into industry roadmaps such as the International Technology Roadmap for Semiconductors (ITRS) [2]. An excerpt from this roadmap is shown in Table 1.2.

TABLE 1.2
Six More Lithographic Generations

Projected first use (year):   2001  2004  2007  2010  2013  2016
Minimum image size (nm):      130   90    65    45    32    22

Notice that the minimum image size decreases by a factor of 2 every six years, the same rate of improvement that drove the 18-month doubling time of semiconductor complexity from the 1970s to the 1990s. The progression of minimum image sizes that started in Table 1.1 continues in Table 1.2, with each lithographic generation having an image size reduced by a factor of 1/√2. The connection between lithographic image size and DRAM generation used as an illustration in Table 1.1 is not continued in Table 1.2. Memory chip capacity no longer represents as much of a limitation on computer power as it did in the 1980s and 1990s, and the emphasis
has changed from increasing the storage capacity of DRAM chips to reducing their sizes and increasing access speed. Today, 1-Gbit (1024-Mbit) memory chips are being built with 100–130 nm lithography instead of with the 180 nm image sizes predicted by Table 1.1.
The dates assigned by the ITRS must be treated with caution. The real dates that unfold will be affected by variable world economic conditions and the uncertain pace of new inventions needed to maintain this schedule. The Roadmap is intended by those who compiled it to reflect an industry consensus of expected progress. Many industrial program managers see it as the roadmap for their competitors, and they privately instruct their own scientists and engineers to plan for a schedule moved forward by a year.
The pressure to match or exceed the roadmap dates has led to a sort of inflation in the meaning of minimum image size. When DRAM chips defined the leading-edge technology for semiconductors, density requirements in DRAM design forced the minimum image size to be approximately 1/2 of the minimum pitch (defined as the minimum center-to-center spacing between two adjacent lines). Because of this, lithographic technology has traditionally considered the minimum half-pitch to be synonymous with the minimum image size. Table 1.1 and Table 1.2 follow this convention. Today, logic chips have begun to replace DRAM chips as the first types of device to be manufactured at each lithographic node. Logic chip design makes extreme demands on image size, but it does not require the same level of circuit density as DRAM. There are a number of ways, discussed later in this chapter, of biasing the lithography to produce small lines on a relaxed pitch. Therefore, if a manufacturer is building 90 nm lines on a 260 nm pitch, there is an overwhelming temptation to report that the 90 nm technology node has been reached, even though a strict half-pitch definition would call this 130 nm technology. Variability in definitions, coupled with the natural desire of every manufacturer to be seen as the leader at each new technology node, makes it increasingly hard to say exactly when a particular generation of lithography is first used in production.
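The 1/√2-per-generation scaling described above can be made concrete with a short sketch. The starting point and number of generations below are taken from Table 1.1 and Table 1.2; the code itself is purely illustrative and not part of the original text.

```python
# Illustrative sketch: generate the lithographic node progression by shrinking
# the minimum image size by 1/sqrt(2) per generation (Section 1.1).
import math

def node_progression(start_um, generations):
    """Return successive minimum image sizes, each 1/sqrt(2) of the previous one."""
    sizes = [start_um]
    for _ in range(generations):
        sizes.append(sizes[-1] / math.sqrt(2))
    return sizes

# Starting from the 1.00 um generation of Table 1.1:
for size_um in node_progression(1.00, 11):
    print(f"{size_um * 1000:6.0f} nm")
# Prints roughly 1000, 707, 500, 354, 250, 177, 125, 88, 62, 44, 31, 22 nm, which,
# after rounding, track the 0.70/0.50/0.35/0.25/0.18/0.13 um generations of
# Table 1.1 and the 90/65/45/32/22 nm nodes of Table 1.2.  Each 1/sqrt(2) shrink
# in linear image size doubles the number of features per unit area.
```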
1.2 The Lithographic Exposure System
At the heart of the microlithographic process is the exposure system. This complex piece of machinery projects the image of a desired photomask pattern onto the surface of the semiconductor device being fabricated on a silicon wafer. The image is captured in a thin layer of a resist material and transformed into a permanent part of the device by a series of chemical etch or deposition processes. The accuracy with which the pattern must be formed is astonishing: lines a small fraction of a micron in width must be produced with dimensional tolerances of a few nanometers, and the pattern must be aligned with underlying layers of patterns to better than one-fourth of the minimum line width. All of these tolerances must be met throughout an exposure field of several square centimeters. A lithographic exposure system filling an enclosure the size of a small office and costing several million dollars is used to meet these severe requirements.
An exposure system for optical microlithography consists of three parts: a lithographic lens, an illumination system, and a wafer positioning system. A typical exposure system will be described in detail, followed by an expanded description of the many possible variations on the typical design.

1.2.1 The Lithographic Projection Lens
The lithographic lens is a physically large, compound lens. It is made up of over thirty simple lens elements, mounted in a massive, rigid barrel. The total assembly can weigh up
to 1000 pounds. The large number of elements is needed to correct optical aberrations to a very high degree over a 30 mm or larger circular field of exposure. The lens is designed to produce an optical image of a photomask, reduced by a demagnification of 4×. A silicon wafer, containing hundreds of partially fabricated integrated circuits, is exposed to this image. The image is captured by a layer of photosensitive resist, and this latent image will eventually be chemically developed to leave the desired resist pattern.
Every aspect of the lens design has extremely tight tolerances. In order to produce the smallest possible images, the resolution of the lens must be limited only by fundamental diffraction effects (Figure 1.1). In practice, this means that the total wavefront aberration at every point in the exposure field must be less than 1/10 of the optical wavelength. The focal plane of the lens must not deviate from planarity by more than a few tens of nanometers over the entire usable exposure field, and the maximum transverse geometrical distortion cannot be more than a few nanometers. The lens is designed for use over a narrow range of wavelengths centered on the illumination wavelength, which may be 365, 248, or 193 nm.

1.2.2 The Illumination Subsystem
The illumination source for the exposure system may be a high-pressure mercury arc lamp or a high-powered laser. The light is sent through a series of relay optics and uniformizing optics, and it is then projected through the photomask. Nonuniformity of the illumination intensity at the photomask must be less than 1%. The light continues through the photomask to form an image of the effective illumination source in the entrance pupil of the lithographic lens. The fraction of the pupil filled by the illumination source's image determines the degree of coherence in the lithographic lens's image formation. The light traversing the entire chain of illuminator and lithographic lens optics forms an image with an intensity of a few hundred mW/cm².
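As a rough illustration of the dose arithmetic implied by these numbers, the sketch below multiplies an assumed image-plane intensity within the "few hundred mW/cm²" range by an assumed exposure time within the "few tenths of a second" range; neither value is a specification from the text.

```python
# Back-of-envelope sketch of the delivered resist dose (assumed round values,
# not a tool specification): dose = intensity x exposure time.
intensity_mw_cm2 = 300.0   # assumed, within "a few hundred mW/cm^2"
exposure_time_s = 0.2      # assumed, within "a few tenths of a second"

dose_mj_cm2 = intensity_mw_cm2 * exposure_time_s
print(f"nominal dose: {dose_mj_cm2:.0f} mJ/cm^2")                      # 60 mJ/cm^2
print(f"1% repeatability band: +/- {0.01 * dose_mj_cm2:.1f} mJ/cm^2")  # the spec quoted below
```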
FIGURE 1.1 (a) Optical layout of a small-field, experimental lithographic lens. This lens was designed in 1985, and it has 11 elements. (b) A modern full-field lithographic lens, reproduced at approximately the same scale as the 1985 lens. This lens has more than twice the resolution and nearly three times the field size of the older lens. (Figure 1.1b provided by courtesy of Nikon.)
The illuminator assembly sends a controlled burst of light to expose the photoresist to the image for a few tenths of a second (Figure 1.2). The integrated energy of each exposure must be repeatable to within 1%. Although the tolerances of the illuminator are not as tight as those of the lithographic lens, its optical quality must be surprisingly high. Severe aberrations in the illumination optics will produce a variety of problems in the final image even if there are no aberrations in the lithographic lens.

1.2.3 The Wafer Positioning Subsystem
The wafer positioning system is one of the most precise mechanical systems used in any technology today. A silicon wafer, typically 200–300 mm in diameter, may contain several hundred semiconductor devices, informally called chips. Each chip, in its turn, must be physically aligned to the image projected by the lithographic lens, and it must be held in alignment with a tolerance of a few tens of nanometers during the exposure. To expose all the chips on a wafer sequentially, the wafer is held by a vacuum chuck on an ultraprecision x–y stage. The stage position is determined by laser interferometry to an accuracy of a few nanometers. It takes less than one second for the stage to move between successive exposure sites and settle to within the alignment tolerance before the next exposure begins. This sequence of stepping from one exposure to the next has led this type of system to be called a step-and-repeat lithographic system, or, more informally, a stepper.
Prior to exposure, the position of the wafer must be determined as accurately as possible with an automatic alignment system. This system looks for standardized alignment marks that were printed on the wafer during previous levels of lithography. The position of these marks is determined by one of a variety of optical detection techniques.
FIGURE 1.2 (a) A rather simple, experimental illuminator. Laser light is randomized in a light tunnel then projected through a series of five lenses and two folding mirrors onto the photomask. This illuminator was used with the lithographic lens in Figure 1.1a. (b) A modern illuminator using a fly’s eye randomizer and a rotating aperture assembly to allow variable illumination conditions. (Figure 1.2b provided by courtesy of Nikon.)
A number of different alignment strategies can be used, but at minimum, the within-plane rotation error of the wafer and its x- and y-translation errors must be determined relative to the projected image. The positioning system must reduce these errors to within the alignment tolerance before each exposure begins. The stepper must also automatically detect the surface of the resist and position this surface at the correct height to match the exact focal plane of the stepper lens within a tolerance of about 200 nm. In order to meet this tolerance over a large exposure field, it is also necessary to detect and correct tilt errors along two orthogonal axes. The wafer surface is not flat enough to guarantee that the focus tolerance will be satisfied everywhere on the wafer simultaneously, so the automated focus procedure is repeated at every exposure site on the wafer.
During the entire process of loading a wafer, aligning, stepping, focusing, exposing, and unloading, speed is of utmost importance. A stepper that can expose 100 wafers in an hour can pay back its huge capital cost twice as fast as a stepper that can only manage 50 wafers per hour (wph).
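A crude throughput model makes the economics of this stepping sequence concrete. All of the numbers in the sketch below are assumed round figures (field count, per-field times, wafer-exchange overhead), not values given in the chapter; only the "less than one second" step-and-settle time and the "few tenths of a second" exposure time echo the text.

```python
# Rough stepper throughput sketch: wafer time = load/align overhead plus a
# step-settle-focus-expose cycle for every exposure field (all values assumed).
fields_per_wafer = 90        # assumed number of exposure fields on a wafer
step_and_settle_s = 0.8      # "less than one second" per stage move
focus_and_level_s = 0.3      # assumed per-field autofocus/leveling time
exposure_s = 0.2             # "a few tenths of a second" per exposure
wafer_exchange_s = 15.0      # assumed load, prealign, and global-align overhead

per_wafer_s = wafer_exchange_s + fields_per_wafer * (step_and_settle_s + focus_and_level_s + exposure_s)
print(f"{3600.0 / per_wafer_s:.0f} wafers per hour")   # ~27 wph with these assumptions
# Shaving a few tenths of a second from every per-field operation is what
# separates a 50-wph tool from a 100-wph tool.
```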
1.3 Variations on a Theme
The typical stepper outlined in the previous section has been in common use for semiconductor microlithography for the past 20 years, but a number of other styles of equipment are used as well. Some of these other variations were the historical predecessors of the stepper described in Section 1.2. Many of them are still in use today, earning their keep by providing low-cost lithography for low-density semiconductor designs. Other variations on the basic design have become the new standard for leading-edge semiconductor lithography, and new improvements are continuously being made as optical lithography pushes harder and harder against the fundamental limits of the technology.

1.3.1 Optical Contact Printing and Proximity Printing
The earliest exposure systems were contact printers and proximity printers. In these systems, a chrome-on-glass mask is held in close proximity to or in actual contact with a photoresist-covered wafer. The resist is exposed through the back side of the mask by a flood exposure source. The mask pattern covers the entire wafer, and it is necessarily designed with a magnification of 1×. Alignment is accomplished by an operator manipulating a mechanical stage to superimpose two previously printed alignment marks on the wafer with corresponding alignment marks on the mask. Alignment of the two pairs of marks is verified by the operator through a split-field microscope that can simultaneously view opposite sides of the wafer. The wafer and mask can be aligned with respect to rotation and displacement on two orthogonal axes.
Contact printing provides higher resolution than proximity printing, but at the cost of enormous wear and tear on the masks. No matter how scrupulous the attention to cleanliness may be, particles of dirt are eventually ground into the surfaces of the wafer and the mask during the exposure. A frequent source of contamination is fragments of photoresist that adhere to the surface of the mask when it makes contact with the wafer. Masks have to be cleaned frequently and eventually replaced as they wear out. This technology is not currently used in mainstream semiconductor manufacture (Figure 1.3).
FIGURE 1.3 In optical proximity printing, light is blocked from the photosensitive resist layer by chromium patterns on a photomask. The gap between the mask and the resist must be as small as possible to minimize diffractive blurring at the edges of the patterns.
Proximity printing is kinder to the masks, but in many ways it is a more demanding technology [3]. The proximity gap has to be as small as possible to avoid loss of resolution from optical diffraction. The resolution limit for a proximity printer is proportional to √(λd), where λ is the exposure wavelength and d is the proximity gap. When optical or near-ultraviolet exposure wavelengths are used, the minimum image sizes that can be practically achieved are around 2 or 3 μm. This limits optical proximity printing to the most undemanding applications of semiconductor lithography.

1.3.2 X-ray Proximity Lithography
A more modern variation of optical proximity printing is x-ray proximity lithography. The diffractive effects that limit resolution are greatly reduced by the very short wavelengths of the x-rays used, typically around 1.0–1.5 nm, corresponding to a 1 keV x-ray energy. This represents a wavelength decrease of a factor of 300 relative to optical proximity lithography, or an improvement in resolution by a factor of about 15. X-ray proximity lithography is capable of excellent resolution, but it has been held back from large-scale manufacturing by a variety of technical and financial hurdles [4]. The electron synchrotron used as the x-ray source is very expensive, and it must support a very high volume of wafer production to make it affordable. Because a single electron synchrotron will act as the illumination source for a dozen wafer aligners or more, a failure of the synchrotron could halt production on an entire manufacturing line. Compact plasma sources of x-rays have also been developed, but they do not provide as high-quality collimation as synchrotron x-rays. Each x-ray mask alignment system requires a helium atmosphere to prevent absorption and scattering of the x-rays. This complicates the transfer of wafers and masks to and from the exposure system.
The most challenging feature of x-ray proximity lithography is the difficulty of producing the 1× membrane mask to the required tolerances. Because the mask-making infrastructure in the semiconductor industry is largely geared to 4× and 5× reduction masks, considerable improvements in mask-making technology are needed to produce the much smaller features on a 1× mask. Proportional reductions in line width tolerance and placement tolerance are also needed. To provide a transparent, flat substrate for an x-ray mask, a thin, tightly stretched membrane of a low atomic weight material such as silicon carbide can be used. The membrane is typically thinner than 1 μm to give sufficient transparency to the x-rays. The membrane supports an absorber pattern made of a high atomic weight material such as gold or tungsten that strongly absorbs x-rays in the 1 keV energy range.
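The √(λd) scaling can be checked with a short sketch. The gap value and the unity prefactor below are illustrative assumptions (the chapter quotes neither); the wavelengths are the near-UV and ~1 keV x-ray values discussed above.

```python
# Sketch of proximity-printing resolution, resolution ~ k * sqrt(lambda * d).
# Gap and prefactor are assumed for illustration only.
import math

def proximity_resolution_nm(wavelength_nm, gap_um, k=1.0):
    return k * math.sqrt(wavelength_nm * gap_um * 1000.0)   # gap converted to nm

gap_um = 10.0                                        # assumed mask-to-wafer gap
optical_nm = proximity_resolution_nm(400.0, gap_um)  # near-UV flood exposure
xray_nm = proximity_resolution_nm(1.3, gap_um)       # ~1 keV x-rays, lambda ~ 1.3 nm

print(f"near-UV proximity: ~{optical_nm / 1000:.1f} um")   # ~2.0 um
print(f"x-ray proximity:   ~{xray_nm:.0f} nm")             # ~110 nm
print(f"improvement:       ~{optical_nm / xray_nm:.0f}x")  # ~sqrt(300), close to the
                                                           # factor of 15 quoted above
```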
X-ray proximity lithography remained under development throughout the 1980s and 1990s with the support of national governments and large semiconductor corporations, but further development efforts have nearly come to a halt. The technology continues to be mentioned in discussions of nonoptical lithographic methods, but most research and development is concentrated on techniques such as extreme ultraviolet (EUV) or ebeam projection lithography that allow the use of reduction masks.

1.3.3 Ebeam Proximity Lithography
A beam of moderately low energy electrons (2 keV, typically) can be used for proximity printing with a membrane mask similar to an x-ray proximity mask. Unlike x-rays, low energy electron beams cannot penetrate even a 1 μm membrane, so the mask must be used as a stencil with holes pierced through the membrane where the patterns are to be formed. Electrons have wavelengths defined by quantum mechanical laws and are subject to diffraction just as optical and x-ray photons are. A 2 keV electron has a wavelength of about 0.03 nm, compared to 0.62 nm for an x-ray photon of the same energy. Because of the shorter wavelength, an electron proximity mask can be spaced up to 50 μm from the wafer surface, and the diffractive limit of resolution will still be below 50 nm. The electron beam typically illuminates only a few square millimeters of the mask surface at a time, and it has to be scanned across the mask to expose the entire patterned area, unlike x-ray lithography where the entire mask is exposed at one time.
Ebeam proximity lithography has some important advantages over x-ray proximity lithography. An intense, collimated beam of electrons can easily be created with inexpensive equipment, providing a major advantage over synchrotron or plasma x-ray sources. The angle with which the electron beam strikes the mask can readily be modulated by fast electronic deflection circuitry. This variable angle of exposure, in combination with the 50 μm print gap, allows electronic control of the final image placement on the wafer. Errors in the wafer position and even distortions in the mask can be corrected dynamically during exposure.
The principal disadvantage of this type of lithography is the requirement for a 1× mask, a disadvantage shared with x-ray proximity lithography. The ebeam requirement for a stencil mask instead of a continuous membrane increases the difficulty. Stencil masks cannot be made with long slots or closed loops that break the continuity of the membrane. To overcome this difficulty, the pattern is broken up into two or more complementary masks. Each mask is allowed to have only short line segments, but when the masks are exposed sequentially, any type of structure can be built up as the union of the complementary patterns. The requirement for two or more sequential exposures increases the total exposure time and requires extremely accurate placement of each part of the subdivided pattern. Ebeam exposures must be done with the wafer in a vacuum, somewhat more complex than the helium environment that can be used for x-ray lithography.
Development of ebeam proximity lithography began in the late 1980s [5], but the technology was not initially competitive with the well-established optical projection lithography. More recently, the technology has been revived under the name low energy ebeam proximity lithography (LEEPL), and commercial ebeam exposure systems are becoming available [6].
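The wavelength and blur figures quoted above can be cross-checked with textbook constants. The sketch below uses the non-relativistic de Broglie relation for the electron and the same √(λd) blur estimate used for optical and x-ray proximity printing; it is an illustration, not part of the original text.

```python
# Cross-check of the ebeam proximity numbers quoted in the text.
import math

H = 6.626e-34      # Planck constant, J*s
M_E = 9.109e-31    # electron rest mass, kg
Q = 1.602e-19      # elementary charge, C
C = 2.998e8        # speed of light, m/s

energy_eV = 2000.0
electron_lambda_m = H / math.sqrt(2.0 * M_E * Q * energy_eV)   # non-relativistic de Broglie
photon_lambda_m = H * C / (Q * energy_eV)                      # x-ray photon of the same energy
print(f"2 keV electron wavelength: {electron_lambda_m * 1e9:.3f} nm")  # ~0.027 nm
print(f"2 keV photon wavelength:   {photon_lambda_m * 1e9:.2f} nm")    # ~0.62 nm

gap_m = 50e-6                                   # 50 um print gap
blur_m = math.sqrt(electron_lambda_m * gap_m)   # sqrt(lambda * d) diffraction blur
print(f"diffraction blur at 50 um gap: ~{blur_m * 1e9:.0f} nm")  # ~37 nm, below the 50 nm stated
```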
1.3.4 Imprint Lithography
Recently, a large amount of research has been done on lithography using a physical imprinting process instead of optical pattern transfer. This method can be thought of as a high-tech version of wood block printing. A 1× mask is made with a 3-dimensional pattern etched into its surface. The pattern is transferred by physically pressing the mask against the surface of the wafer, which has been coated with a material that receives the imprinted pattern. A number of ingenious methods have been developed for effecting the image transfer.
The mask may be inked with a thin layer of a chemical that is transferred to a resist layer on the wafer during contact. The transferred chemical layer catalyzes a chemical reaction in the resist and allows a pattern to be developed. Another method of image transfer is to wet the surface of the wafer with an organic liquid. A transparent imprint mask is pressed against the wetted wafer surface, forming the liquid into the desired pattern. Then, the wafer surface is exposed to a bright visible or ultraviolet light source through the transparent mask, photochemically polymerizing the liquid into a solid. Afterward, the mask is pulled away from the surface, leaving the patterned material behind. This method has the unexpected advantage that small bits of debris on the surface of the mask are likely to be trapped by the liquid imaging material and pulled away from the mask after the liquid has been hardened. Therefore, the mask is cleaned with each use.
Imprint lithography does away with the limitations on feature size set by the wavelength of light. Features smaller than 10 nm have been created with this technology, and a growing number of researchers are developing a passionate interest in it. If the practical difficulties with imprint lithography can be overcome, it provides a path to future nonoptical lithographic manufacturing.
The practical difficulties with imprint lithography are many. The 1× mask magnification that is required will present mask makers with challenges similar to those of 1× x-ray mask technology. If the liquid polymerization method is used, a way must be found of wetting the surface of the wafer and mask without introducing bubbles or voids. When the liquid is hardened and the mask is pulled away, some way must be found to ensure that the pattern sticks to the wafer instead of the mask. This need for carefully tailored adhesion strengths to different materials is a new one for resist chemists. When the mask is pressed against the wafer surface to form the mold for the casting liquid, surface tension will prevent the mask from totally squeezing the liquid to a true zero thickness. This means that there will be a thin residue of resist in all of the spaces between the lines that form the pattern. The etch process that uses the resist pattern will have to include a descum step to remove the residue, and this will thin the resist and possibly degrade the resist sidewalls. Excess liquid that squeezes out around the edges of the mask pattern must be accommodated somehow. Finally, the sequence of surface wetting, contacting the mask to the wet resist, alignment, hardening, and mask separation will almost inevitably be much slower than the sequence of alignment, focus, and exposure used in optical steppers. None of these difficulties appears to be a total roadblock to the technology, but the road is certainly not a smooth one.

1.3.5 1× Scanners
In the 1970s, optical proximity printing was replaced by the newly developed scanning lithography [7]. Optical scanners are able to project the image of a mask through a lens system onto the surface of a wafer. The mask is the same as that used by a proximity printer: a 1× chrome-on-glass pattern that is large enough to cover the entire wafer. But the use of a projection system means that masks are no longer damaged by accidental or deliberate contact with the wafer surface. It would be difficult to design a lens capable of projecting micron-scale images onto an entire 4- to 6-in. wafer in a single field of view.
But a clever design by the Perkin–Elmer Corporation allows wafers of this size to be printed by simultaneously scanning the mask and wafer through a lens field shaped like a narrow arc. The lens design takes advantage of the fact that most lens aberrations are functions of the radial position within the field of view. A lens with an extremely large circular field can be designed with aberrations corrected only at a single radius within this field. An aperture limits the exposure field to a narrow arc centered on this radius. Because the projector
operates at 1× magnification, a rather simple mechanical system can scan the wafer and mask simultaneously through the object and image fields of the lens. Resolution of the projection optics is determined by the wavelength and numerical aperture using Rayleigh's formula

D = k1 λ / NA

where D is the minimum dimension that can be printed, λ is the exposure wavelength, and NA is the numerical aperture of the projection lens. The proportionality constant k1 is a dimensionless number in an approximate range from 0.6 to 0.8. The numerical aperture of the Perkin–Elmer scanner is about 0.17, and its illumination source contains a broad band of wavelengths centered around 400 nm. The Rayleigh formula predicts a minimum image size somewhat smaller than 2 μm for this system (Figure 1.4).
1× scanners are still in use for semiconductor lithography throughout the world. Resolution of these systems can be pushed to nearly 1 μm by using a deep ultraviolet light source at 250 nm wavelength. But the most advanced lithography is being done by reduction projectors, similar to the one described in the example at the beginning of this chapter. The one advantage still retained by a 1× scanner is the immense size of the scanned field. Some semiconductor devices, such as 2-dimensional video detector arrays, require this large field size, but in most cases the need for smaller images has driven lithography toward steppers or the newer step-and-scan technology.
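A short worked example of the Rayleigh formula for this scanner follows; the k1 values are taken from the 0.6–0.8 range quoted above, and the 400 nm / 0.17 NA figures are those given in the text.

```python
# Worked example of D = k1 * lambda / NA for the full-wafer 1x scanner.
def rayleigh_resolution_nm(k1, wavelength_nm, na):
    return k1 * wavelength_nm / na

for k1 in (0.6, 0.7, 0.8):
    d_nm = rayleigh_resolution_nm(k1, 400.0, 0.17)   # broadband source near 400 nm, NA ~ 0.17
    print(f"k1 = {k1:.1f}: D = {d_nm / 1000:.1f} um")
# Prints roughly 1.4-1.9 um, i.e. "somewhat smaller than 2 um" as stated above.
```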
FIGURE 1.4 A scanning exposure system projects the image of a 1× mask into an arc-shaped slit. The wafer and mask are simultaneously scanned across the field aperture (shaded area) until the entire wafer is exposed.
1.3.6 Reduction Steppers
Steppers were first commercialized in the early 1980s [8]. A projection lens is used with a field size just large enough to expose one or two semiconductor chips. The fields are exposed sequentially, with the wafer being repositioned by an accurate x–y stage between exposures. The time to expose a wafer is considerably greater than with a scanner, but there are some great advantages to stepper lithography. The stepper lens can be made with a considerably higher numerical aperture than is practical for the full-wafer scanner lenses. The earliest steppers had numerical apertures of 0.28, yielding a resolution of about 1.25 μm at an exposure wavelength of 436 nm (the mercury g line). Another key advantage of steppers is their ability to use a reduction lens. The demagnification factor of 4× to 10× provides considerable relief in the minimum feature size and dimensional tolerances that are required on the mask (Figure 1.5).
The resolution of steppers has improved considerably since their first introduction. The numerical aperture of lithographic lens designs has gradually increased, so today values up to 0.85 are available. At the same time, there have been incremental changes in the exposure wavelength. In the mid-1980s, there was a shift from g-line (436 nm) to i-line (365 nm) wavelength for leading-edge lithography. During the 1990s, deep-ultraviolet wavelengths around 248 nm came into common use. By the early 2000s, the most advanced steppers began to use laser light sources at 193 nm wavelength. The combination of high numerical aperture and short wavelengths allows a resolution of 180 nm to be routinely achieved and 130 nm resolution to be produced in the most advanced production lines of 2003. Future extensions to numerical apertures greater than 0.90 and research into even shorter exposure wavelengths give some confidence that optical lithography can be extended to 45 nm or below. This expectation would have been unimaginable as recently as 1995.
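The resolution trend described above follows from the same Rayleigh expression. In the sketch below, only the first row (436 nm, NA 0.28, resolution about 1.25 μm) and the upper NA limit of 0.85 come from the text; the other NA and k1 pairings are assumptions chosen for illustration within the ranges the chapter discusses.

```python
# Sketch of stepper resolution across exposure generations, D = k1 * lambda / NA.
# Only the g-line row is fully specified by the text; other NA/k1 values are assumed.
generations = [
    # (label, wavelength_nm, NA, k1)
    ("earliest g-line stepper", 436.0, 0.28, 0.80),   # ~1.25 um, as quoted
    ("KrF DUV stepper",         248.0, 0.80, 0.60),   # assumed NA/k1; ~186 nm ("180 nm routine")
    ("ArF stepper (2003)",      193.0, 0.85, 0.60),   # assumed k1; ~136 nm (130 nm-class lines)
]
for label, wavelength_nm, na, k1 in generations:
    print(f"{label:24s}: ~{k1 * wavelength_nm / na:.0f} nm")
```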
FIGURE 1.5 A stepper employs reduction optics, and it exposes only one chip at a time. The 4× or 5× mask remains stationary with respect to the lens whose maximum exposure field is shown as the shaded area. After each chip is exposed, a high-precision stage moves the wafer to the position where the next exposure will occur. If the chip pattern is small enough, two or more chips may be printed in each exposure.
1.3.7 1× Steppers
Although the main development of lithography over the past decade has been with the use of reduction steppers, a few other notable lithographic techniques have been used. The Ultratech Stepper Corporation developed a stepper with 1× magnification using a particularly simple and elegant lens design. This lens design has been adapted to numerical apertures from 0.35 to 0.70 and wavelengths from 436 to 193 nm. The requirement for a 1× mask has prevented the general acceptance of this technology for the most critical levels of lithography, but it is an economical alternative for the less demanding masking levels [9].

1.3.8 Step-and-Scan
As lithographic image sizes evolve to smaller and smaller dimensions, the size of the semiconductor chip has been gradually increasing. Dynamic random-access memory chips are usually designed as rectangles with a 2:1 length-to-width ratio. A typical 16-Mbit DRAM has dimensions slightly less than 10×20 mm, and the linear dimensions tend to increase by 15%–20% each generation. Two adjacent DRAM chips form a square that fits into a circular lens field that must be 28–30 mm in diameter. Logic circuits such as microprocessor chips usually have a square aspect ratio, and they put similar demands on the field size. The combined requirements of higher numerical aperture and larger field size have been an enormous challenge for lithographic lens design and fabrication.
One way to ease the demands on field size is to return to scanning technology. Lithographic exposure equipment developed in the late 1980s and perfected in the 1990s employs a technique called "step-and-scan" where a reduction lens is used to scan the image of a large exposure field onto a portion of a wafer [10]. The wafer is then moved to a new position where the scanning process is repeated. The lens field is required only to be a narrow slit, as in the older full-wafer scanners. This allows a scanned exposure whose height is the diameter of the static lens field and whose length is limited only by the size of the mask and the travel of the mask-positioning stage.
Step-and-scan technology puts great demands on the mechanical tolerances of the stage motion. Whereas a traditional step-and-repeat system has only to move the wafer rapidly to a new position and hold it accurately in one position during exposure, the step-and-scan mechanism has to simultaneously move both the mask and wafer, holding the positional tolerances within a few nanometers continuously during the scan. Because the step-and-scan technique is used for reduction lithography, the mask must scan at a much different speed than the wafer and possibly in the opposite direction. All of the step-and-scan equipment designed so far has used a 4× reduction ratio. This allows the very large scanned field to be accommodated on a smaller mask than a 5× reduction ratio would permit. It also allows a very accurate digital comparison of the positional data from the wafer stage and mask stage interferometers (Figure 1.6).
The first step-and-scan exposure system was developed by the Perkin–Elmer Corporation using an arc-shaped exposure slit. The projection lens had a numerical aperture of 0.35, and it was designed to use a broadband light source centered at a wavelength of 248 nm. The advantage of the fixed-radius arc field was not as great for a high-numerical-aperture reduction lens as it had been for the 1× scanners, and all subsequent step-and-scan systems have been designed with a rectangular slit aperture along a diameter of a conventional circular lens field. Step-and-scan lithographic equipment is now manufactured by every maker of lithographic exposure systems, and step-and-scan has become the dominant technology for advanced lithographic manufacturing throughout the semiconductor industry [11].
Although the complexities of step-and-scan technology are obvious, the benefit of a large scanned field is great. There are also a few more subtle advantages of step-and-scan technology. Because the exposure field is scanned, a single feature on the mask is imaged through a number of different parts of the lens. Any localized aberrations or distortions in the lens will be somewhat reduced by averaging along the scan direction.
FIGURE 1.6 A step-and-scan system combines the operations of a stepper and a scanner. The dotted outline represents the maximum scanned region. The shaded area is the slit-shaped exposure field aperture. The wafer and mask are simultaneously scanned across the field aperture. At the end of the scan, the wafer is stepped to a new position where the scanning process is repeated. In this example, the 4× mask contains two chip patterns.
Also, any local nonuniformity of illumination intensity is unimportant as long as the intensity integrated along the scan direction is constant.

1.3.9 Immersion Lithography

In the pursuit of improved resolution in lithographic optics, there is a constant drive toward shorter wavelengths and higher numerical apertures. Wavelengths have followed a progression from 436 nm (mercury g line) to 365 nm (mercury i line) to 248 nm (krypton fluoride excimer laser) to 193 nm (argon fluoride excimer laser) to 157 nm (molecular fluorine laser—a technology still under development). Each reduction of wavelength has been accompanied by a great deal of engineering trauma as new light sources, new lens materials, new photomask materials, and new photoresists had to be developed for each wavelength. Whereas wavelengths have evolved in quantum jumps, numerical aperture has increased in a more gradual fashion with much less distress for the lithographer. The principal negative effects of increased numerical aperture are an increased complexity of lens design, a requirement for a narrower wavelength range in the laser light sources, and a reduced depth of focus in the aerial image projected by the lens. Although these negative factors increase the cost and complexity of lithography, the changes are incremental. However, whereas wavelength has no physical lower limit, numerical aperture does have an upper limit. Numerical aperture is defined as the index of refraction of the medium surrounding the lens times the sine of the angular acceptance of the lens. Because most lithography is done on dry land, the index of refraction is generally taken as that of air: n = 1.0. If a lens projects an aerial image with a maximum angular radius of 45°, then the lens will have a numerical aperture of sin 45° = 0.707. A little thought on this subject will lead to the conclusion that the maximum possible numerical aperture is sin 90° = 1.00. In fact, this is true as long as
the lens is surrounded by air. However, microscope makers have long known that they can extend the numerical aperture of a lens well above 1.00 by immersing the high-magnification side of the lens and the sample they are observing in water, glycerin, or oil with an index of refraction much higher than 1. The thought of trying to do optical lithography in some sort of immersion liquid has occurred to many lithographers in the past, but the technical difficulties have always seemed too daunting. For immersion lithography to work, the entire gap between the wafer and the lithographic lens must be filled with the immersion liquid. The thought of a pool of liquid sloshing around on top of a fast-moving stepper or scanner stage is usually enough to put a stop to these sorts of daydreams. Even if a way could be found to keep the liquid under control, there are several other concerns that have to be examined. Photoresists are notoriously sensitive to the environment, and an underwater environment is much different than the temperature and humidity controlled atmosphere that photoresists are designed for. Dirt particles suspended in the immersion fluid could be a cause for serious concern, but filtration of liquids has been developed to a high art in the semiconductor industry, and it is likely that the immersion fluid could be kept clean enough. Bubbles in the liquid are a greater concern. An air bubble trapped between the liquid and the wafer’s surface would create a defect in the printed image just as surely as a dirt particle would. If the technical challenges can be overcome, there are some enticing advantages of immersion lithography. Foremost is the removal of the NA = 1.00 limitation on the numerical aperture. If water is used as the immersion liquid, its refractive index of n = 1.44 at the 193 nm exposure wavelength [12] will allow numerical apertures up to 1.3 or possibly greater. This, by itself, will give a greater improvement in resolution than the technically challenging jump from the 193 nm wavelength to 157 nm. A second great benefit of immersion lithography is an increased depth of focus. At a given value of the numerical aperture, the aerial image of an object is stretched along the z axis by a factor equal to the refractive index of the immersion fluid. This means that there is a benefit to using immersion lithography even at a numerical aperture less than 1.00. As a concrete example, a 0.90 NA non-immersion stepper using a 193 nm exposure wavelength might have a 190 nm total depth of focus for a mix of different feature types. A water-immersion stepper with the same exposure wavelength and numerical aperture would have a total depth of focus of 275 nm. Several recent advances have been made in immersion lithography. Application of sophisticated liquid-handling technology has led to the invention of water dispensers that apply ultra-pure water to the wafer surface just before it passes under the lithographic lens; they then suck the water off the wafer on the other side of the lens. In this way, only the portion of the wafer immediately under the lens is covered with water. Experiments have shown that extreme levels of filtration and de-gassification are needed to prevent defects from being created in the printed pattern. Hydrophobicity of the resist surface has been shown to have a strong effect on the number of bubble defects created during scanning, and water-insoluble surface coatings have been used to tune the wetting angle for minimum levels of bubble formation.
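The numerical-aperture and depth-of-focus figures quoted in this section are easy to reproduce. The following is only an illustrative sketch of the scaling arguments above; the 70° half-angle is an assumed value chosen to land near the quoted NA of 1.3–1.35, and real depth-of-focus budgets depend on feature type and resist.

```python
import math

def numerical_aperture(n, half_angle_deg):
    """NA = n * sin(theta), where n is the refractive index of the medium around the lens."""
    return n * math.sin(math.radians(half_angle_deg))

# Dry lens: air (n = 1.0) caps NA at sin(90 deg) = 1.00.
print(numerical_aperture(1.0, 45.0))   # ~0.707, the example in the text
print(numerical_aperture(1.0, 90.0))   # 1.00, the dry-lens ceiling

# Water at 193 nm (n ~ 1.44) lifts the ceiling well above 1.0.
print(numerical_aperture(1.44, 70.0))  # ~1.35, in the range quoted for water immersion

# Depth-of-focus benefit at fixed NA: the aerial image stretches along z
# by roughly the refractive index of the immersion fluid.
dof_dry_nm = 190.0                     # 0.90 NA, 193 nm, dry (the text's example)
print(dof_dry_nm * 1.44)               # ~274 nm, close to the 275 nm quoted
```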
Prototype immersion lithography equipment has been used to build fully functional integrated circuits [13,14]. The rapid development of immersion lithography has made it part of the strategic plan for lithography of essentially every semiconductor manufacturer.

1.3.10 Serial Direct Writing

All of the microlithographic technologies discussed so far in this chapter have one thing in common: they are able to print massive amounts of information in parallel.
Other technologies have been developed for writing lithographic patterns in a serial fashion. For example, several varieties of electron beam lithographic systems exist. Other systems use scanned, focused laser beams to write patterns in photoresist. The great failure of all of these serial technologies has been their speed of operation. A semiconductor circuit pattern may consist of a 21×21 mm square filled with patterns having a minimum resolvable image size of 0.35 µm. If a pixel is defined as a square one minimum image on a side, then the circuit in this example will be made up of 3.6×10⁹ pixels. A serial pattern writer scanning in a raster fashion must sequentially address every one of these pixels. At a data rate of 40 MHz, it will take 90 s to write each circuit pattern. If there are 60 such patterns on a wafer, then the wafer writing time will be 1.5 h. This is nearly two orders of magnitude slower than a parallel exposure system. It is a considerable oversimplification to calculate serial writing rates as though the design were drawn on a piece of graph paper with every square colored either black or white and every square being one minimum feature size on a side. Actual chip circuits are designed on a grid at least ten times smaller than the minimum feature size, which increases the writing time by about two additional orders of magnitude. The actual writing time for a single circuit is likely to be two hours or more, comparable to the time it takes to pattern a photomask on a raster-scan mask writer, and the writing time for an entire wafer will be measured in days. Various tricks have been used to increase writing speeds of serial exposure systems (often called direct-write systems). For example, a vector scan strategy may improve the speed of writing by eliminating the need to raster over unpatterned areas of the circuit. A certain amount of parallelism in pattern writing has been introduced with shaped-beam systems that can project a variable-sized rectangular electron beam [15]. An even greater amount of parallelism is achieved with electron beam cell projectors [16]. These systems use a stencil mask with a small repeating section of a larger circuit pattern. A large circuit can be stitched together from a library of these repeating patterns, using a rectangular shaped-beam strategy to fill in parts of the circuit that are not in the cell library. But even the fastest direct-write systems are far slower than parallel-exposure systems. Very rarely, direct-write systems have been used for commercial lithography on low-volume, high-value semiconductor circuits. They have also been used occasionally for early development of advanced semiconductor designs when the parallel-exposure equipment capable of the required resolution has not yet been developed. However, the most common use of serial-exposure equipment has been for mask making. In this application, the slow writing time is not as serious an issue. It can make economic sense to spend hours writing a valuable mask that will then be used to create thousands of copies of itself at a fraction of a second per exposure.

1.3.11 Parallel Direct Writing/Maskless Lithography

The attraction of direct writing lithography has become greater in recent years because of the unfavorable economics of masked lithography for low-volume semiconductor manufacturing. A specialized integrated circuit design might require only a few thousand chips to be made. This would represent only a few dozen 200 or 300 mm wafers.
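As a rough check of the serial-writing throughput arithmetic in Section 1.3.10, which also drives the economics discussed here, the short sketch below reproduces the pixel-count and write-time estimates quoted above. It is purely illustrative; real writers use vector scanning, shaped beams, and finer design grids that change these numbers substantially.

```python
chip_side_mm = 21.0          # 21 x 21 mm chip, as in the text
min_feature_um = 0.35        # minimum resolvable image size
data_rate_hz = 40e6          # 40 MHz pixel rate
chips_per_wafer = 60

pixels_per_side = (chip_side_mm * 1000.0) / min_feature_um
pixels_per_chip = pixels_per_side ** 2
print(f"pixels per chip: {pixels_per_chip:.1e}")           # ~3.6e9

chip_write_s = pixels_per_chip / data_rate_hz
print(f"seconds per chip: {chip_write_s:.0f}")             # ~90 s

wafer_write_h = chip_write_s * chips_per_wafer / 3600.0
print(f"hours per wafer: {wafer_write_h:.1f}")             # ~1.5 h

# Designing on a grid ~10x finer than the minimum feature multiplies the raster
# pixel count by ~100, pushing a raster write into the multi-hour-per-chip
# regime described in the text.
print(f"hours per chip on a 10x finer grid: {chip_write_s * 100 / 3600:.1f}")
```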
The manufacturing process typically requires twenty or thirty masks to create a complex circuit, and the cost of a photomask has risen dramatically with every new lithographic generation. A process that could write a lithographic pattern without a mask could become economically competitive even if the writing process were quite slow. The serial direct-writing strategy discussed in Section 1.3.10 is still orders of magnitude too slow to be competitive with masked lithography. Work has been done to increase
the parallelism of direct-writing pattern generators. Laser beam writers in use by mask makers write patterns using 32 independently modulated laser beams in parallel with a corresponding increase in speed over a single-beam pattern generator. Researchers are working on ebeam systems that can write with 4000 or more independent beams in parallel. A recent development by Micronic Laser Systems uses a 2-dimensional programmable array of 1 million micromirrors with each mirror being 16 µm square [17]. By independently tilting each mirror, a sort of programmable mask segment can be generated. Light reflected from the array is projected through a reduction lens onto a wafer surface, and the pattern is printed. Although this represents a million-fold increase in parallelism over a single-beam writer, it still does not approach the speed of a masked lithographic exposure system. With a large number of beams writing in parallel, the maximum rate of data transfer between the computer storing the pattern and the beam modulation electronics becomes a major bottleneck. The crossover point when maskless lithography can compete with the massive parallelism of data transfer during a masked exposure is not yet here, but there is still active interest in maskless lithography. The technology may start to be used when it reaches a speed of 1–10 wafers per hour (wph) compared to over 100 wph for conventional steppers using masks.

1.3.12 Extreme Ultraviolet Lithography

Extensive amounts of work have been done over the past few years to develop the EUV wavelength range for use in lithography. This region of the spectrum with photon energies around 100 eV and wavelengths around 10–15 nm can be considered either the low-energy end of the x-ray spectrum or the short-wavelength limit of the ultraviolet. Multilayer interference coatings with good EUV reflectivity at normal incidence were originally developed for x-ray astronomy. Extreme ultraviolet mirrors made with this multilayer technology have been used to make all-reflective lithographic lenses with relatively low numerical apertures. A diffraction-limited lens designed for a 13.5 nm exposure wavelength can achieve a resolution of 0.1 µm with a numerical aperture of only 0.10. This very modest numerical aperture could allow a fairly simple optical design. Such a projection lens can be designed with a conventional 4× image size reduction, easing the tolerances on the mask [18]. The multilayer mirrors used to make the projection optics can also be used to make reflective EUV masks. A defect-free EUV mirror coating is deposited on a mask blank and then overcoated with an EUV absorber such as tungsten or tantalum. A pattern is etched in the absorber using conventional mask-making techniques, and the exposed regions of the mirror create the bright areas of the mask pattern. The requirement for low levels of defects in the multilayer coatings on a mask blank is much greater than for a reflective lens element. A few point defects on a reflective lens element slightly reduce the efficiency of the lens and contribute to scattered radiation, but similar defects on a mask blank will create pattern defects that make the mask unusable. As development of EUV technology has progressed, conventional optical lithography has continued to advance as well. It now appears that EUV lithography will not be needed until the 45 nm or even 32 nm lithographic node. Because of the extremely short wavelength, EUV lenses with numerical apertures between 0.25 and 0.35 can easily resolve images in these ranges.
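A quick estimate with the Rayleigh resolution formula (given later in Section 1.5.5) shows why such modest numerical apertures suffice at 13.5 nm. The k1 values below are illustrative assumptions, not measured data.

```python
def rayleigh_resolution_nm(k1, wavelength_nm, na):
    """Minimum printable dimension D = k1 * lambda / NA."""
    return k1 * wavelength_nm / na

wavelength = 13.5  # nm, EUV

# Early EUV demonstration optics: NA ~ 0.10 with a relaxed k1
print(rayleigh_resolution_nm(0.7, wavelength, 0.10))     # ~95 nm, i.e., ~0.1 um

# Production-oriented designs: NA between 0.25 and 0.35
for na in (0.25, 0.30, 0.35):
    print(na, rayleigh_resolution_nm(0.5, wavelength, na))  # ~27, ~22, ~19 nm
```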
The practical difficulties associated with EUV are great, but substantial resources are being applied worldwide to develop this technology. The stepper optics and EUV illumination system must be in a vacuum to prevent absorption and scattering of the x-rays. The multilayer interference mirrors require deposition of tens to hundreds of very accurate
films only a few nanometers thick. These mirrors reflect a rather narrow band of x-ray wavelengths, and they have a reflectivity that strongly varies with angle of incidence. The best reflectivity achieved by this technology is 65%–70%, which is extremely high for a normal-incidence x-ray mirror but dismally low by the standards of visible-light optics. The total transmission of an EUV lithographic lens is very low if more than two or three mirrors are used in the entire assembly. When the mirrors used in the illumination system and the reflective mask are factored into the total transmission, the net transmission from the radiation source to the wafer is less than 10%. The output power requirement for the EUV radiation source is over 100 W compared to about 20 W for a high-output excimer laser stepper. With the high level of investment in EUV development in recent years, many of the problems of EUV lithography have been overcome. High quality multilayer reflective coatings have been developed, suitable for lens components and nearly good enough for reflective masks. Good quality all-reflective EUV lens assemblies and masks have been built and tested. The radiation source technology is probably the most challenging problem that remains to be solved. If the level of commitment to EUV remains high in the industry, the remaining difficulties can probably be surmounted, but it will still be several years before the technology is ready for full manufacturing use.

1.3.13 Masked Particle Beam Lithography

Electron beam lithography has already been discussed as a direct-writing or proximity printing lithographic technique. Charged particle beams, both electron beams and ion beams, have also been explored as exposure sources for masked reduction lithography. Electrons that are transmitted through a mask pattern can be imaged onto a wafer with an electronic lens very similarly to the way an optical lens forms an image with light. This technology is called electron projection lithography (EPL). Electronic lenses form images by steering fast-moving charged particles in a vacuum using carefully shaped regions of electric and magnetic fields. In analogy to optics, an electronic lens has a numerical aperture, and electrons have a wavelength defined by the laws of quantum mechanics. Their wavelengths are vanishingly short compared to the wavelengths of optical or x-ray radiation used for lithography. A 100 keV electron has a wavelength of 0.004 nm, and the wavelength scales approximately as 1/√E in this energy range. Because of the extremely short wavelengths of electrons in the energy ranges used for lithography, the minimum feature size limit is set by lens aberrations, electron scattering, and charging effects rather than by the fundamental diffraction limit of the electron lenses. At the time of this writing, EPL has demonstrated image resolutions between 50 and 100 nm. An electron projection lens system called projection reduction exposure with variable axis immersion lenses (PREVAIL) was developed by IBM and Nikon in the late 1990s [19]. This system uses a 100 keV electron energy. It has a 4× magnification and a 250 µm square exposure field. Although this is only 1/100 the size of a typical optical lens field, the mask and wafer can be scanned through the exposure field in a raster fashion until the entire mask pattern is exposed.
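The electron wavelength quoted earlier in this section follows from the de Broglie relation. The sketch below is a minimal calculation including the small relativistic correction at 100 keV; the 1/√E scaling quoted in the text is the non-relativistic limit.

```python
import math

H = 6.626e-34        # Planck constant, J*s
M0 = 9.109e-31       # electron rest mass, kg
C = 2.998e8          # speed of light, m/s
EV = 1.602e-19       # joules per electron volt

def electron_wavelength_nm(energy_ev):
    """Relativistic de Broglie wavelength of an electron accelerated through energy_ev volts."""
    e_joule = energy_ev * EV
    momentum = math.sqrt(2.0 * M0 * e_joule * (1.0 + e_joule / (2.0 * M0 * C**2)))
    return (H / momentum) * 1e9

print(electron_wavelength_nm(100e3))   # ~0.0037 nm at 100 keV, i.e., ~0.004 nm
print(electron_wavelength_nm(25e3))    # ~0.0077 nm; roughly 2x longer for 1/4 the energy
```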
The stencil masks used for ebeam proximity lithography can also be used for ebeam projection lithography, but the much higher energy range of the projection systems makes it impractical for the mask membrane to be made thick enough to completely stop the electrons. Instead, the mask membrane scatters the electrons to a high enough angle that they are not transmitted by the electronic lens that is designed with a very small angular acceptance. Another type of electron beam mask has been developed based on the fact that some fraction of high energy electrons will pass through a very thin film of low atomic weight
material with no scattering. An ultrathin membrane (150 nm or less) is made of low atomic weight materials, and a layer of metal with high atomic weight is deposited on the membrane and etched to form the mask patterns. The metal is not thick enough to stop the electrons; rather, it scatters them strongly. The image is formed by only those electrons passing through the thin membrane with no scattering. This method of masked electron beam lithography was developed by Lucent Technologies in the mid-1990s, and it was named SCattering with Angular Limitation in Projection Electron beam Lithography (SCALPEL) [20]. The continuous membrane used for a SCALPEL mask allows a single mask to be used for each patterning level instead of the two complementary masks that would be needed if a stencil mask were used. However, the contrast and total transmission of a SCALPEL mask are considerably lower than those of a stencil mask. Both stencil masks and ultrathin membrane masks are very fragile. Rather than create a mask with the entire chip pattern on a single continuous membrane, ebeam projection masks are usually subdivided into multiple sub-field segments approximately 1 mm on a side. These subfields are butted against each other when the image is printed to form the complete mask pattern. The subfields are separated on the mask by a thick rib of material that provides support for the delicate membrane and greatly improves the stiffness of the mask. Because of the extremely short wavelength of high energy electrons, diffractive spreading of the electron beam is negligible, and the depth of focus is very large compared to optical lithography. Resists used for ebeam lithography do not need to use complicated and expensive methods for controlling thin-film interference as are often needed for optical lithography (see Section 1.6.2). There are some serious problems with electron beam lithography as well. High energy electrons tend to scatter within the photoresist and also within the wafer substrate beneath the resist layer. These scattered electrons partially expose the resist in a halo around each of the exposed features. A dense array of features may contain enough scattered electrons to seriously overexpose the resist and drive the critical line width measurements out of their specified tolerances. To counteract this, complex computer algorithms have been designed to anticipate the problem and adjust the dose of each feature to compensate for scattered electrons from neighboring features [21]. In direct-write electron beam systems, this proximity correction is directly applied to the pattern-writing software. Masked electron beam lithography must have its proximity corrections applied to the mask design. Electron-beam lithography on highly insulating substrates can be very difficult because electrostatic charges induced by the exposure beam can force the beam out of its intended path, distorting the printed pattern. In addition, electrical forces between electrons in the beam can cause the beam to lose its collimation with the problem becoming worse with increasing current density. Because of the electron scattering problem, it has been proposed that heavier charged particles such as protons or heavier ions should be used for masked lithographic exposures [22]. Heavy ions have very little tendency to scatter, but they are still susceptible to beam deflections from electrostatic charges on the substrate. 
Ion beams are more sensitive than electron beams to interactions between ions within the beam because of the slower ion velocity and resulting higher charge per unit volume. Ebeam projection lithography is not yet competitive with optical projection lithography, but the resolution that can be achieved is not currently limited by any fundamental laws of nature. Further improvements in resolution are expected, and the main barrier to use may turn out to be the complexity and expense of ebeam masks.
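The dose-correction idea mentioned in the discussion of electron scattering above can be illustrated with a toy model. A common approximation, used here purely as an assumption and not as a method prescribed by this chapter, describes the deposited energy with a double-Gaussian point-spread function: a narrow forward-scattering term of width alpha and a broad backscattering term of width beta weighted by a ratio eta. The dose assigned to each feature can then be adjusted iteratively so that the blurred exposure matches the target pattern.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Illustrative, assumed parameters (in pixels of an arbitrary writing grid):
ALPHA, BETA, ETA = 3.0, 60.0, 0.7   # forward width, backscatter width, backscatter ratio

def deposited_energy(dose):
    """Double-Gaussian proximity model: forward + eta * backscatter, normalized."""
    forward = gaussian_filter(dose, ALPHA)
    back = gaussian_filter(dose, BETA)
    return (forward + ETA * back) / (1.0 + ETA)

def correct_dose(target, iterations=20):
    """Fixed-point iteration: boost dose where energy falls short, reduce it where it is high."""
    dose = target.astype(float).copy()
    for _ in range(iterations):
        energy = deposited_energy(dose)
        scale = np.where(energy > 1e-6, target / np.maximum(energy, 1e-6), 1.0)
        dose *= np.clip(scale, 0.5, 2.0)         # damp the update for stability
        dose = np.where(target > 0, dose, 0.0)   # only written features receive dose
    return dose

# Dense line array: uncorrected, interior lines collect extra backscattered energy.
target = np.zeros((400, 400))
target[:, 100:300:20] = 1.0                      # a block of dense lines
dose = correct_dose(target)
energy = deposited_energy(dose)
print(energy[200, 200], energy[200, 100])        # center and edge lines now nearly equal
```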
1.4 Lithographic Light Sources

1.4.1 Requirements

Light sources of fairly high power and radiance are required to meet the demands of a modern, high-speed lithographic exposure system. The optical power required at the wafer is easily calculated from the sensitivity of the photoresist and the time allowed for the exposure. A typical resist used at mid-ultraviolet wavelengths (365–436 nm) may require an optical energy density of 100 mJ/cm2 for its exposure. If the exposure is to take 0.5 s or less, a minimum optical power density of 200 mW/cm2 is required. Often, power densities of 500–1000 mW/cm2 are provided in order to allow for lower resist sensitivities or shorter exposure times. The total illuminated area that a stepper projects onto a wafer may be a circle 30 mm in diameter that has an area of about 7 cm2. If the power density for this example is taken to be 500 mW/cm2, then a total power of 3.5 W is required at the wafer. This is a substantial amount of optical power.

1.4.2 Radiance

Radiance (also known as luminance or brightness) is a concept that may be somewhat less familiar than power density. It is defined as the power density per steradian (the unit of solid angle), and it has units of W/cm2 sr. This quantity is important because fundamental thermodynamic laws prevent the radiance at any point in an optical imaging system from being greater than the radiance of the light source. If light is lost because of absorption or inefficient mirrors in the optical system, the radiance will decrease. Likewise, an aperture that removes a portion of the light will reduce the radiance. The concept of radiance is important because it may limit the amount of optical power that can be captured from the light source and delivered to the wafer. The power from a diffuse light source cannot be concentrated to make an intense one. If the stepper designer wants to shorten the exposure time per field, he or she must get a light source with more power, and the power must be concentrated within a region of surface area similar to that of the light source being replaced. If the additional power is emitted from a larger area within the light source, it probably cannot be focused within the exposure field.

1.4.3 Mercury–Xenon Arc Lamps

The requirements for high power and high radiance have led to the choice of high-pressure mercury–xenon arc lamps as light sources for lithography. These lamps emit their light from a compact region a few millimeters in diameter, and they have total power emissions from about 100 to over 2000 W. A large fraction of the total power emerges as infrared and visible light energy that must be removed from the optical path with multilayer dielectric filters and directed to a liquid-cooled optical trap that can remove the large heat load from the system. The useful portion of the spectrum consists of several bright emission lines in the near ultraviolet and a continuous emission spectrum in the deep ultraviolet. Because of their optical dispersion, refractive lithographic lenses can use only a single emission line: the g line at 435.83 nm, the h line at 404.65 nm, or the i line at 365.48 nm. Each of these lines contains less than 2% of the total power of the arc lamp. The broad emission region between about 235 and 260 nm has also been used as a light source for deep-UV lithography, but the power available in this region is less than that of the line emissions (Figure 1.7).
FIGURE 1.7 The mercury arc spectrum. The g line at 436 nm, the h line at 405 nm, the i line at 365 nm, and the emission region centered at 248 nm have all been used for microlithography.
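The exposure-power arithmetic from Section 1.4.1 is easy to reproduce. The sketch below simply restates the numbers quoted there; actual requirements depend on resist sensitivity and on the optical efficiency of the illuminator.

```python
import math

resist_dose_mj_cm2 = 100.0      # typical mid-UV resist sensitivity
exposure_time_s = 0.5
field_diameter_mm = 30.0

# Minimum power density needed to deliver the dose in the allotted time
min_power_mw_cm2 = resist_dose_mj_cm2 / exposure_time_s
print(min_power_mw_cm2)                           # 200 mW/cm^2

field_area_cm2 = math.pi * (field_diameter_mm / 10.0 / 2.0) ** 2
print(round(field_area_cm2, 1))                   # ~7.1 cm^2

# With the 500 mW/cm^2 design margin quoted in the text:
total_power_w = 500.0 * field_area_cm2 / 1000.0
print(round(total_power_w, 1))                    # ~3.5 W at the wafer
```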
1.4.4 The Arc-Lamp Illumination System

A rather complex illumination system or condenser collects the light from the arc lamp, projecting it through the mask into the entrance pupil of the lithographic projection optics. The illumination system first collects light from a large angular region surrounding the arc using a paraboloidal or ellipsoidal mirror. The light is sent through one or more multilayer dielectric filters to remove all except the emission line that will be used by the projection optics. The light emitted from the arc lamp is not uniform enough to be used without further modification. In order to meet the ±1% uniformity requirement for mask illumination, a uniformizing or beam-randomizing technique is used. Two common optical uniformizing devices are light tunnels and fly’s-eye lenses. Both of these devices create multiple images of the arc lamp. Light from the multiple images is recombined to yield an averaged intensity that is much more uniform than the raw output of the lamp. The illumination system projects this combined, uniform beam of light through the mask. The illuminator optics direct the light in such a way that it passes through the mask plane and comes to a focus in the entrance pupil of the lithographic projection optics. The technique of focusing the illumination source in the entrance pupil of the image-forming optics is called Köhler illumination. If a fly’s eye or light tunnel is used in the system, the resulting multiple images of the arc lamp will be found focused in a neat array in the entrance pupil of the lithographic lens. A number of problems can occur if the illumination system does not accurately focus an image of the light source in the center of the entrance pupil of the lithographic lens [23]. The entrance and exit pupils of a lens are optically conjugate planes. This means that an object in the plane of the entrance pupil will have an image in the plane of the exit pupil. The exit pupil in a telecentric lithographic lens is located at infinity. However, if the illumination source is focused above or below the plane of the entrance pupil, then its image will not be in the proper location at infinity. This leads to the classic telecentricity error, a change of magnification with shifts in focus (Figure 1.8). An error in centering the image of the illumination source in the entrance pupil will cause an error known as focus walk. This means that the lithographic image will move
FIGURE 1.8 (a) A telecentric pupil illuminates each point on the wafer’s surface with a cone of light whose axis is perpendicular to the surface. (b) The effects of a telecentricity error where the illumination does not appear to come from infinity. If the wafer surface is not perfectly located at the position of best focus, the separation between the three images will change, leading to a magnification error. (c) The effects of a decentration of the illumination within the pupil. In this situation, a change in focus induces a side-to-side shift in the image position.
from side to side as the focus shifts up and down. Focus walk will make the alignment baseline change with changes of focus, and it can affect the alignment accuracy. Spherical aberration in the illuminator optics causes light passing through the edge of the mask to focus at a different position than light passing through the center. This means that the image of the illumination source will be at different positions for each location in the image field, and the third-order distortion characteristics of the image will change with a shift in focus.

1.4.5 Excimer Lasers

Laser light sources have been developed to provide higher power at deep-UV wavelengths [24]. The only laser light source that has been successfully introduced for commercial wafer steppers is the excimer laser. An excimer is an exotic molecule formed by a noble gas atom and a halogen atom. This dimeric molecule is bound only in a quasi-stable excited state. The word excimer was coined from the phrase “excited dimer.” When the excited state decays, the molecule falls apart into its two constituent atoms. A variety of noble gas chloride and fluoride excimers can be used as laser gain media. The 248 nm krypton fluoride excimer laser is now in common use for commercial lithography. It has many features that make it attractive as a lithographic light source, and it has a few undesirable features as well. Excimer lasers have considerably more usable power at 248 nm than the mercury arc lamp emission spectrum between 235 and 260 nm. The radiance of a laser is many orders of magnitude larger than that of a mercury arc lamp because the laser emits a highly collimated beam of light whereas the arc lamp isotropically emits light. The excimer laser’s spectral line width is about 0.5 nm. Although this is somewhat narrower than the line width of a high-pressure mercury arc lamp, it is extremely wide compared to most lasers. The reason for this is the lack of a well-defined ground state for the energy level transition that defines the laser wavelength, the ground state in this case being two dissociated atoms.
Excimer lasers have a very low degree of coherence compared to most other kinds of lasers. The laser cavity operates with a very large number of spatial modes, giving it low spatial coherence, and the relatively broad line width results in low temporal coherence. Low coherence is very desirable for a lithographic light source. If the degree of coherence is too high, undesirable interference patterns can be formed within the aerial image. These effects of interference may appear as a grainy pattern of bright and dark modulation or as linear or circular fringe patterns in the illuminated parts of the image. The term speckle is often applied to this sort of coherent interference effect. Although speckle is relatively slight with excimer laser illumination, it is still a factor to consider in the design of the illumination system. The 0.5 nm natural spectral width of the 248 nm excimer laser is not narrow enough to meet the bandwidth requirements of refractive lithographic lenses. These lenses used to be made of a single material—fused silica—and they had essentially no chromatic correction. The high level of chromatic aberration associated with these lenses forced the illumination source to have a spectral line width of less than 0.003 nm (3 pm). Krypton fluoride excimer lasers have been modified to produce this spectral line width or less. A variety of techniques have been used, mostly involving dispersive elements such as prisms, diffraction gratings, and/or etalons within the optical cavity of the excimer laser. The addition of these elements reduces the total power of the laser somewhat, and it tends to decrease the stability of the power level. A rather complex feedback system is required to hold the center wavelength of the line-narrowed laser constant to about the same picometer level of accuracy. If the laser wavelength drifts by a few picometers, the focus of the lithographic lens may shift by several hundred nanometers. More recently, lithographic lenses for the deep ultraviolet have been designed using some elements made of calcium fluoride. The difference between the optical dispersion of calcium fluoride and fused silica allows some amount of correction for chromatic aberration in the lens. However, the newer lenses are being designed with increasingly high numerical apertures that tend to require narrower optical bandwidths. Today, excimer lasers are available with bandwidths less than 1 pm. There are several difficulties associated with using an excimer laser as a lithographic light source. The most significant problem is the pulsed nature of the excimer laser light. Excimer lasers in the power range useful for lithography (between 10 and 40 W) typically produce pulses of laser energy at a rate of 200–4000 Hz. Each pulse is between 5 and 20 ns in length. Because of the extremely short pulses and the relatively long time between pulses, the peak power within each pulse is extremely high even when the time-averaged power is relatively modest. For example, an excimer laser running at 400 Hz with an average power of 10 W and a 10 ns pulse length will have a peak power of 2.5 MW for the duration of each pulse. Peak powers in this range can cause damage to optical materials and coatings if the laser beam is concentrated in a small area. Although optical materials vary over a large range in their susceptibility to laser damage, peak power in the range of 5 MW/cm2 can produce some degradation to a lithographic lens after prolonged exposure. 
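The peak-power figures quoted above follow directly from the pulse energy and pulse length. A short sketch using the text's example numbers is shown below.

```python
def excimer_peak_power_mw(avg_power_w, rep_rate_hz, pulse_length_ns):
    """Peak power (MW) of a pulsed laser from its average power, repetition rate, and pulse length."""
    pulse_energy_j = avg_power_w / rep_rate_hz
    return pulse_energy_j / (pulse_length_ns * 1e-9) / 1e6

# The text's example: 10 W average power at 400 Hz with 10 ns pulses
print(excimer_peak_power_mw(10.0, 400.0, 10.0))    # 2.5 MW

# Raising the repetition rate lowers the energy per pulse, and hence the peak
# power, for the same average power.
print(excimer_peak_power_mw(10.0, 4000.0, 10.0))   # 0.25 MW
```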
Many of the damage mechanisms are functions of the square of the power density, so design features that keep the power density low are very desirable. The lithographic lens can be designed to avoid high concentrations of light within the lens material. Increases in the repetition rate of the laser are also beneficial because this produces the same average power with a lower energy per pulse. It is also possible to modify the laser optics and electronics to produce somewhat longer pulses that proportionately decrease the peak power within each pulse. Pulsed lasers also present a problem in exposure control. A typical control scheme for a static-field stepper is to integrate the pulse energy until the required exposure has been accumulated, at which point, a signal is sent to stop the laser from pulsing. This requires
a minimum of 100 pulses in order to achieve a dose control accuracy of 1%. With the low repetition rates of early models of excimer lasers, the requirement of 100 pulses for each exposure slowed the rate of wafer exposures considerably. More sophisticated exposure control schemes have been developed using several high-energy pulses to build up the exposure rapidly and then reducing the laser power to a low value to trim the exposure to its final value. Laser repetition rates have been steadily increasing as excimer laser lithography has developed. Between better control schemes and higher laser repetition rates, exposure speeds of excimer laser steppers are at least the equal of arc lamp systems. Early models of excimer lasers used in prototype lithographic systems had pulse-to-pulse energy variations of up to 30%. This does not seriously affect the exposure control system that can integrate pulses over a broad range of energies. It does cause the exposure time for each field to vary with the statistical fluctuations of the pulse energies. This is not a concern for a static-field stepper, but it is a serious problem when excimer lasers are used as light sources for scanning or step-and-scan systems. Scanners require a light source that does not fluctuate in time because they must scan at a constant velocity. A pulsed light source is not impossible to use with a scanning system as long as a sufficiently large number of pulses are accumulated during the time the exposure slit sweeps across a point on the wafer. The number of pulses required is determined by the pulse-to-pulse stability of the laser and the uniformity requirements of the final exposure on the wafer’s surface. The variability of the laser’s pulse energy is reduced by approximately the square root of the number of pulses that are accumulated. To reduce a 30% pulse energy variability to 1% would require accumulating about 900 pulses, which would make any scanned exposure system impracticably slow. Fortunately, excimer lasers now used in lithography have a pulse stability much better than that of the earliest systems, and it is continuing to improve. With a pulse rate of 2–4 kHz, a state-of-the-art excimer laser can produce 0.3% exposure uniformity on a step-and-scan system with scan speeds up to 350 mm/s. Excimer lasers have a few undesirable features. The laser is physically large, occupying roughly 10–20 sq ft of the crowded and expensive floor space around the stepper. The laser itself is expensive, adding 5%–10% to the cost of the stepper. Its plasma cavity is filled with a rather costly mixture of high purity gases, including toxic and corrosive fluorine. The plasma cavity is not permanently sealed, and it needs to be refilled with new gas on a fairly regular basis. This requires a gas handling system that meets rigorous industrial safety requirements for toxic gases. The electrical efficiency of an excimer laser is rather low, and the total electrical power consumption can be greater than 10 kW while the laser is operating. The pulsed power generation tends to produce a large amount of radiofrequency noise that must be carefully shielded within the laser enclosure to prevent damage to sensitive computer equipment in the area where the laser is installed. The ultraviolet light output of the laser is very dangerous to human eyes and skin, and fairly elaborate safety precautions are required when maintenance work is done on the laser.
Lithographic excimer lasers are classified as class IV lasers by federal laser standards, and they require an interlocked enclosure and beam line to prevent anyone operating the system from being exposed to the laser light. Excimer lasers are also being used as the light source for a new generation of lithography at an exposure wavelength of 193 nm [25]. This is the wavelength of the argon fluoride excimer laser. Both krypton fluoride and argon fluoride lasers have reached a high state of technological maturity and make reliable, albeit expensive, light sources.

1.4.6 157 nm F2 Lasers

Although not technically an excimer laser, the pulsed fluorine laser with an emission line at 157 nm is very similar to an excimer laser in its design and operating characteristics.
This laser has been developed into a light source suitable for lithographic applications with a power output and repetition rate equivalent to those of krypton fluoride and argon fluoride excimer lasers. A large effort has been underway for several years to develop a commercially useful 157 nm lithography. Difficulties at this wavelength are significantly greater than those of the 248 and 193 nm excimers. At 157 nm, fused silica is no longer transparent enough to be used as a refractive lens material. This leaves only crystalline materials such as calcium fluoride with enough transparency for 157 nm lenses. Prototype lenses for 157 nm lithography have been successfully built from calcium fluoride, but there are constraints on the availability of calcium fluoride with high enough quality to be used for lithographic lenses. Calcium fluoride has an intrinsic birefringence that makes design and fabrication of a lithographic lens considerably more complex than for lenses made of amorphous materials. Another difficulty with the 157 nm exposure wavelength is the loss of optical transmission caused by oxygen absorption below about 180 nm. A stepper operating at 157 nm requires an atmosphere of nitrogen or helium around the stepper and laser beam transport. Water and most organic compounds are also highly opaque to 157 nm radiation. It has been found that water vapor and volatile organic compounds readily condense out of the atmosphere onto lens and mask surfaces, and they can seriously degrade transmission at this wavelength. The worst difficulty encountered in the development of 157 nm lithography has been the lack of a material suitable for a protective pellicle for the mask pattern (see Section 1.11.8). Strategic decisions in the semiconductor industry have brought 157 nm lithography development to a near halt in recent years. Advances in immersion lithography (Section 1.3.9) and the technical hurdles facing 157 nm lithography were the most important factors influencing this change of direction. It is possible that interest in the 157 nm exposure wavelength might revive when the limits of 193 nm lithography are reached, but for now, this technology is on hold.

1.4.7 Other Laser Light Sources

Other light sources have been investigated for possible use in microlithography. The neodymium yttrium–aluminum–garnet (YAG) laser has a large number of applications in the laser industry, and its technology is very mature. The neodymium YAG laser’s fundamental wavelength of 1064 nm can be readily converted to wavelengths useful for lithographic light sources by harmonic frequency multiplication techniques. The two wavelengths with the greatest potential application to lithography are the 266 nm fourth harmonic and the 213 nm fifth harmonic [26]. Diode-pumped YAG lasers at these two wavelengths could potentially compete with 248 and 193 nm excimer lasers as lithographic light sources. The advantages of the solid-state YAG laser are low cost, simplicity, and compactness compared to excimer lasers. The main disadvantage is an excessively high level of coherence. Although excimer lasers are complex, inefficient, and expensive, they have achieved complete dominance in the lithographic laser market. The low coherence that is characteristic of excimer lasers and their excellent reliability record have effectively eliminated interest in replacing them with other types of lasers.
1.4.8 Polarization

Arc lamp and excimer laser light sources typically produce unpolarized light, and until recently, the vector nature of light waves could be ignored. But as lithography progresses into the region of ultrahigh numerical apertures, the polarization properties of light become increasingly important [27]. When the openings in a photomask approach
the size of a wavelength of light, the amount of light transmitted is affected by polarization. Even if the illumination source is unpolarized, the difference in transmission efficiency between the light polarized parallel to the direction of a narrow mask feature and the light polarized perpendicular to the feature will partially polarize the light. The amount of polarization induced by the mask pattern depends in complex ways on the refractive index of the mask absorber and the feature pitch. A grating pattern on a mask made with a highly conductive metal absorber will preferentially transmit light polarized perpendicular to the lines making up the grating as long as the grating pitch is less than half the wavelength of the light used for illumination. This effect has been known for a long time, and it has been used commercially to make wire-grid polarizers for visible, infrared, and microwave radiation. The polarization effect reverses for grating pitches between half a wavelength and two wavelengths, and the grating preferentially transmits light polarized along the direction of the metal lines. For pitches larger than two wavelengths, the preferred polarization remains in the direction of the metal lines, but the polarization effect decreases until it becomes nearly negligible for pitches several times the wavelength of light. When a light ray is bent by diffraction from a grating or by refraction at a dielectric surface, the plane containing both the incoming ray and the deflected outgoing ray makes a convenient reference plane to define the two possible orientations of polarization. Light polarized with its electric field vector within the bending plane is called p polarized light or transverse magnetic (TM) light. The other polarization with the electric field perpendicular to the bending plane is called s polarization or transverse electric (TE) polarization. Using these definitions, the effect of a wire-grid polarizer can be summarized. When the pitch is less than λ/2, gratings act as very efficient p polarizers; gratings with larger pitches act as moderately efficient s polarizers. Typical steppers with a reduction of 4× are not able to resolve a mask grating with pitch less than one wavelength of the illumination even when the numerical aperture is enhanced by immersion lithography techniques, so at the limits of optical resolution, mask patterns transmit s polarization better than p polarization. When the light projected through the stepper lens combines to reconstruct an image, polarization plays a second role. The light interfering to form the aerial image comes from a range of angles limited by the numerical aperture of the projection lens. Two equal-intensity beams of light can interfere with 100% efficiency regardless of the angle between them if they are s polarized. This occurs because the interference is proportional to the dot product of their electric vectors, and these vectors are parallel if both beams are s polarized. For the case of p polarized light, the electric vectors of two interfering beams are not parallel, and the dot product is proportional to the cosine of the angle between the two beams. This means that an image formed by two p polarized beams will have less contrast than the image formed by two s polarized beams.
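The contrast penalty for p (TM) polarization described above can be made concrete for the simplest case of two equal plane waves crossing at an angle. For s (TE) light the electric vectors stay parallel and the fringe visibility is 1; for p (TM) light the visibility falls as the cosine of the angle between the beams. This is a minimal sketch of that textbook relation, not a full vector image simulation.

```python
import math

def fringe_visibility(beam_angle_deg, polarization):
    """Fringe visibility for two equal-intensity plane waves crossing at beam_angle_deg."""
    if polarization == "TE":      # s polarization: E-vectors parallel regardless of angle
        return 1.0
    if polarization == "TM":      # p polarization: contrast scales with cos(angle between beams)
        return abs(math.cos(math.radians(beam_angle_deg)))
    raise ValueError("polarization must be 'TE' or 'TM'")

for angle in (30, 60, 90, 120):
    print(angle, fringe_visibility(angle, "TE"), round(fringe_visibility(angle, "TM"), 2))
# At a 90 degree crossing angle (very high-NA imaging), TM fringes vanish entirely
# while TE fringes retain full contrast.
```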
Because s polarized light is more easily transmitted by grating patterns and is more efficient in forming an aerial image near the resolution limit of advanced lithographic steppers, it has been proposed that the stepper illumination be polarized so that only the s polarization is used. Computer modeling has shown that useful improvements of image contrast can be realized with s polarized illumination, and steppers with polarized illumination have been built.

1.4.9 Nonoptical Illumination Sources

Electron synchrotrons have already been mentioned as sources of 1 keV x-rays for x-ray proximity lithography [28]. These use technology developed for elementary particle research in the 1960s and 1970s to accelerate electrons to approximately 1 GeV and to
maintain them at that energy, circulating in an evacuated ring of pipe surrounded by a strong magnetic field. The stored beam of electrons radiates x-rays generated by a process called synchrotron radiation. The x-rays produced by an electron synchrotron storage ring have a suitable wavelength, intensity, and collimation to be used for x-ray proximity lithography. The magnets used to generate the synchrotron radiation can also be retuned to generate x-rays in the soft x-ray or EUV range. A synchrotron x-ray source can supply beams of x-rays to about 10–20 wafer aligners. If the cost of the synchrotron is divided among all of the wafer aligners, then it does not grossly inflate the normally expensive cost of each lithographic exposure system. However, if the lithographic exposure demands of the manufacturing facility are not sufficient to fully load the synchrotron, then the cost of each exposure system will rise. Likewise, if the demand for exposures exceeds the capacity of one synchrotron, an additional synchrotron will need to be installed. This makes the cost of a small additional amount of lithographic capacity become extremely large. Often, this is referred to as the problem of granularity. After a period of interest in synchrotron radiation sources in the early 1990s, there has been almost no further industrial investment in this technology. Other sources of x-rays have been developed to avoid the expense of an electron synchrotron. Ideally, each x-ray stepper should have its own illumination source just as optical steppers do. Extremely hot, high density plasmas are efficient sources of x-rays. Dense plasma sources are often called point sources to differentiate them from the collimated x-ray beams emitted by synchrotron storage rings. A point source of x-rays can be generated by a magnetically confined electrical discharge or a very energetic pulsed laser beam focused to a small point on a solid target. These x-ray point sources are not as ideal for proximity lithography as a collimated source. Because the radiation is isotropically emitted from a small region, there is a tradeoff between angular divergence of the beam and the energy density available to expose the resist. If the mask and wafer are placed close to the source of x-ray emission, the maximum energy will be intercepted, but the beam will diverge widely. A diverging x-ray beam used for proximity printing will make the magnification of the image printed on the wafer become sensitive to the gap between the mask and the wafer. This is analogous to a telecentricity error in a projection lens (see Section 1.5.1). In addition, there is a possibility of contaminating the mask with debris from the plasma, especially in the case of laser-generated point sources. Great advances have been made in point-source technology for soft x-ray or EUV radiation. With the development of normal-incidence EUV mirrors, it is now possible to design a condenser to collimate the radiation from an EUV point source. The most efficient sources of 13.5 nm EUV radiation are laser-generated xenon or tin plasmas. A solid tin target produces intense EUV radiation when struck by a focused laser beam, but the solid metal source also produces debris that can quickly contaminate nearby parts of the condenser optics. A plasma produced from gaseous xenon does not produce debris like a tin target, but xenon has a different set of problems. The material is expensive, and it must be recycled through a complex gas capture and recompression system. 
There must also be some way of preventing xenon from escaping into the vacuum environment of the EUV optics. In contrast to x-ray sources, electron sources are simple and compact. A hot filament provides a copious source of electrons that can be accelerated to any desired energy with electrostatic fields. Higher intensity and radiance can be achieved with the use of materials such as lanthanum and zirconium in the electron source or the use of field emission sources.
Ion sources are also compact yet somewhat more complex than electron sources. They use radio frequency power to create an ionized plasma. The ions are extracted by electric fields and accelerated similarly to electrons.
1.5 Optical Considerations

1.5.1 Requirements

Above all else, a lithographic exposure system is defined by the properties of its projection lens. Lithographic lenses are unique in their simultaneous requirements for diffraction-limited image formation, large field size, extremely low field curvature, a magnification accurate to six decimal places, and near-zero distortion. Distortion refers to errors in image placement. The lowest orders of placement error, namely x and y offset, magnification, and rotation, are not included in the lens distortion specification. Whereas a good quality camera lens may have distortion that allows an image placement error of 1% or 2% of the field of view, the permissible distortion of a modern lithographic lens is less than one part per million. Fortunately, a high degree of chromatic correction is not required because an extremely monochromatic light source can be selected. In fact, lenses designed for ultraviolet excimer laser light sources may have no chromatic correction whatsoever. The lithographic lens needs to achieve its designed level of performance for only one well-defined object plane and one image plane; therefore, the design is totally optimized for these two conjugate planes. A lens with high numerical aperture can give diffraction-limited performance only within a narrow focal range. This forces extremely accurate mechanical control of the wafer position to keep it within the range of best focus. A modern high-resolution lithographic lens may have a depth of focus of only a few tenths of a micron. The focus tolerance for the photomask is looser than that for the wafer by a factor of the lens magnification squared. A 4× stepper with a 200 nm depth of focus on the wafer side will have a 3.2 µm depth of focus for the mask. However, a large focus error at the mask will proportionately reduce the available depth of focus at the wafer. For this reason, every effort is made to keep the mask as flat as possible and maintain its focal position accurately. There is another, somewhat unobvious, requirement for a lithographic lens called telecentricity. This term means that the effective exit pupil of the lens is located at infinity. Under this condition, the light forming the images converges symmetrically around the normal to the wafer surface at every point in the exposure field. If the wafer is exposed slightly out of focus, the images will blur slightly, but the exposure field as a whole will not exhibit any change in magnification. If the lens pupil were not at infinity, a rather insignificant focus error could cause a severe change in magnification. Some lenses are deliberately designed to be telecentric on the wafer side but nontelecentric on the mask side. This allows the lens magnification to be fine-tuned by changing the separation between the mask and the lithographic lens. Other lenses are telecentric on both the wafer side and the mask side (called a double-telecentric design). With double-telecentric lenses, magnification adjustments must be made by other means.

1.5.2 Lens Control

The ability to adjust magnification through a range of ±50 ppm or so is needed to compensate for slight changes in lens or wafer temperature or for differences in calibration
between different steppers. Changing the mask-to-lens separation in a nontelecentric lithographic lens is only one common technique. Often, lithographic lenses are made with a movable element that induces magnification changes when it is displaced along the lens axis by a calibrated amount. In some designs, the internal gas pressure in the lithographic lens can be accurately adjusted to induce a known magnification shift. The magnification of a lens designed to use a line-narrowed excimer laser light source can be changed by deliberate shifts in the laser wavelength. Most of these methods for magnification adjustment also induce shifts in the focal position of the lens. The focus shift that results from a magnification correction can be calculated and fed back to the software of the focus control system.

1.5.3 Lens Defects

Stray light scattered from imperfections in the lens material, coatings, or lens-mounting hardware can cause an undesirable haze of light in regions of the image that are intended to be dark. This imperfection, sometimes called flare, reduces the image contrast and generally degrades the quality of the lithography. A surprisingly large amount of flare sometimes occurs in lithographic lenses. More than 5% of the illumination intensity is sometimes scattered into the nominally dark areas of the image. Although this level of flare can be tolerated by a high-contrast resist, it is preferable to reduce the flare to less than 2%. Optical aberrations in either the lithographic projection lens or the illuminator can lead to a variety of problems in the image. Simple tests can sometimes be used to identify a particular aberration in a lithographic lens, but often, a high-order aberration will have no particular signature other than a general loss of contrast or a reduced depth of focus. Many manufacturers of advanced lithographic systems have turned to sophisticated interferometric techniques to characterize the aberrations of their lenses. A phase-measuring interferometer can detect errors approaching 1/1000 of a wavelength in the optical wave front.

1.5.4 Coherence

Even a lens that is totally free of aberrations may not give perfect images from the perspective of the lithographer. The degree of coherence of the illumination has a strong effect on the image formation. Too high a degree of coherence can cause ringing where the image profile tends to oscillate near a sharp corner, and faint ghost images may appear in areas adjacent to the features being printed. On the other hand, a low degree of coherence can cause excessive rounding of corners in the printed images as well as loss of contrast at the image boundaries. The degree of coherence is determined by the pupil filling ratio of the illuminator, called σ. This number is the fraction of the projection lens’s entrance pupil diameter that is filled by light from the illuminator. Highly coherent illumination corresponds to small values of σ, and relatively incoherent illumination results from large values of σ (Figure 1.9). A coherence of σ = 0.7 gives nearly the best shape fidelity for a two-dimensional feature. However, there has been a tendency for lithographers to use a greater degree of coherence, σ = 0.6 or even σ = 0.5, because of the greater image contrast that results. This sort of image formation, neither totally coherent nor totally incoherent, is often called partially coherent imaging [29].
1.5.5 k-Factor and the Diffraction Limit

The ultimate resolution of a lithographic lens is set by fundamental laws of optical diffraction. The Rayleigh formula briefly mentioned in Section 1.3.5 is repeated here
FIGURE 1.9 The computer-modeled aerial image of a T-shaped mask object (a) is shown projected with five different pupil-filling ratios (σ = 0.1–0.9). Vertical height in each figure (b–f) represents the light intensity at the wafer surface. Note the high contrast, but excessive amount of ringing, in the images with the greatest coherence, or lowest values of σ. The bars in the mask object have dimensions of 0.7 λ/NA.
$$D = k_1 \frac{\lambda}{\mathrm{NA}}$$

where D is the minimum dimension that can be printed, λ is the exposure wavelength, and NA is the numerical aperture of the projection lens. The proportionality constant k1 is determined by factors unrelated to the projection lens such as illumination conditions, resist contrast, and photomask contrast enhancement techniques. If D is defined as one-half of the minimum resolvable line/space pitch, then there is an absolute minimum to the value of k1. Below a value of k1 = 0.25, the contrast of a line/space pattern falls to zero. Even this limit can only be approached with incoherent illumination (σ = 1) or with phase shifting masks. For totally coherent illumination (σ = 0) using conventional binary masks, the diffraction limit for line/space patterns is k1 = 0.50. Partially coherent illumination produces a diffraction limit of k1 = 0.5/(1 + σ), spanning the range between the coherent and incoherent diffraction limits. Near the diffraction limit, the contrast of a line/space pattern becomes so low that it is virtually unusable. The value of k1 is often used to define the aggressiveness of various lithographic techniques. Conventional lithography with no special enhancements can readily yield values
of k1 between 0.6 and 0.8. Unconventional illumination (Section 1.13.4) in combination with attenuating phase shifting masks (Section 1.13.3) can be used to drive k1 down to about 0.45. Strong phase shifting techniques such as alternating aperture or phase edge masks may push k1 to 0.30. A second Rayleigh formula gives an expression for depth of focus
$$\Delta Z = k_2 \frac{\lambda}{\mathrm{NA}^2}$$

where ΔZ is the depth of focus, λ is the exposure wavelength, and NA is the numerical aperture of the lens. The value of the proportionality constant k2 depends on the criteria used to define acceptable imaging and on the type of feature being imaged. A convenient rule of thumb is to use k2 = 0.8 as the total depth of focus for a mix of different feature types. Some particular feature types such as equal line and space gratings may have a much greater value of k2.
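The two Rayleigh expressions, and the mask-side focus tolerance discussed in Section 1.5.1, are easy to evaluate numerically. The sketch below does so for a representative exposure tool; the numbers chosen (193 nm wavelength, NA = 0.75, 4× magnification) are illustrative assumptions rather than values taken from this chapter.

def min_feature(wavelength_nm: float, na: float, k1: float) -> float:
    """Rayleigh resolution: D = k1 * lambda / NA (result in nm)."""
    return k1 * wavelength_nm / na

def depth_of_focus(wavelength_nm: float, na: float, k2: float = 0.8) -> float:
    """Rayleigh depth of focus: dZ = k2 * lambda / NA**2 (result in nm)."""
    return k2 * wavelength_nm / na**2

def k1_limit(sigma: float) -> float:
    """Diffraction limit for line/space patterns with partially coherent illumination."""
    return 0.5 / (1.0 + sigma)

wavelength, na, magnification = 193.0, 0.75, 4.0       # illustrative ArF exposure tool
print(min_feature(wavelength, na, k1=0.6))             # ~154 nm, conventional processing
print(min_feature(wavelength, na, k1_limit(0.7)))      # ~76 nm, theoretical limit at sigma = 0.7
dof = depth_of_focus(wavelength, na)
print(dof)                                             # ~274 nm wafer-side focus budget
print(dof * magnification**2)                          # ~4.4 um mask-side focus tolerance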
1.5.6 Proximity Effects

Some problems in lithography are caused by the fundamental physics of image formation. Even in the absence of any optical aberrations, the widths of lines in the lithographic image are influenced by other nearby features. This is often called the optical proximity effect. The systematic biases caused by this effect, although small, are quite undesirable. For example, the maximum speed of a logic circuit is greatly influenced by the uniformity of the transistor gate dimensions across the total area of the circuit. However, optical proximity effects give an isolated line a width somewhat greater than that of an identical line in a cluster of equal lines and spaces. This so-called isolated-to-grouped bias is a serious concern. It is actually possible to introduce deliberate lens aberrations that reduce the isolated-to-grouped bias, but there is always a fear that any aberrations will reduce image contrast and degrade the quality of the lithography. It is also possible to introduce selective image size biases into the photomask to compensate for optical proximity effects. The bookkeeping required to keep track of the density-related biases necessary in a complex mask pattern is daunting, but computer algorithms for generating optical proximity corrections (OPCs) on the mask have been developed and are being used with increasing frequency in semiconductor manufacturing.

The final output of the lithographic illuminator, photomask, and lithographic lens is an aerial image, or image in space. This image is as perfect as the mask maker's and lens maker's art can make it. But the image must interact with a complex stack of thin films and patterns on the surface of the wafer to form a latent image within the bulk of the photoresist.
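As a toy illustration of the rule-based mask biasing for proximity effects described above, the sketch below assigns a small mask-side line-width correction according to the distance to a line's nearest neighbor. The bias table, the break points, and the function name are invented purely for illustration; they do not come from this chapter or from any production OPC tool.

def rule_based_opc_bias(space_to_neighbor_nm: float) -> float:
    """Return a mask-side line-width bias (nm, at wafer scale) from a simple
    lookup table: isolated lines print narrower than dense lines, so they
    receive a larger positive bias.  All values are illustrative only."""
    rules = [
        (200.0, 0.0),    # dense: neighbor closer than 200 nm, no bias
        (400.0, 4.0),    # semi-dense
        (800.0, 8.0),    # semi-isolated
    ]
    for max_space, bias in rules:
        if space_to_neighbor_nm <= max_space:
            return bias
    return 12.0          # fully isolated line gets the largest bias

print(rule_based_opc_bias(150.0))    # 0.0 nm
print(rule_based_opc_bias(1000.0))   # 12.0 nm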
1.6 Latent Image Formation

1.6.1 Photoresist

Photoresists are typically mixtures of an organic polymer and a photosensitive compound. A variety of other chemicals may be included to modify the optical or physical properties of the resist or to participate in the reactions between the photosensitive materials and
the polymer. Resist is applied to the surface of a wafer in an organic solvent using a process called spin casting. In this process, the surface of the wafer is first wetted with a small amount of resist, and the wafer is then spun at several hundred to a few thousand rpm on a rotating spindle. The balance between surface tension and centrifugal forces on the wet resist creates an amazingly uniform film. The wafer continues to rotate at a constant angular velocity for a minute or so until the solvent has evaporated and a stable dry film of photoresist has been created. The wafer is then subjected to a post-apply bake that drives out any remaining solvent and leaves the photoresist ready to be exposed.

Photoresist has been brought to a very sophisticated level of development, and it tends to be expensive, on the order of several hundred dollars per liter, making the cost of photoresist one of the major raw material costs in semiconductor manufacturing. Several milliliters of liquid resist are usually used initially to wet the wafer surface, yet the actual amount of photoresist remaining on the wafer is tiny, typically 0.01–0.05 ml after spin casting. There is obviously a good opportunity for cost savings in this process. Attempts to salvage and reuse the spun-off resist that goes down the drain have not been particularly successful, but over the past several years, there have been spectacular reductions in the amount of resist dispensed per wafer. Shot sizes of 1 ml or even less can be achieved on advanced, automated resist application stations. This leads to an interesting economic dilemma. Most of the cost of photoresist is the recovery of development expenses, not the cost of the raw materials used to make the photoresist. Therefore, when resist usage is reduced by improvements in application technology, the resist makers have no choice but to increase the price per liter. This leads to a sort of technological arms race among semiconductor manufacturers. Those with the lowest resist usage per wafer not only save money, but they also force their competitors to bear a disproportionate share of supporting the photoresist industry.

When the aerial image interacts with photoresist, chemical changes are induced in its photosensitive components. When the exposure is over, the image is captured as a pattern of altered chemicals in the resist. This chemical pattern is called the latent image. When the resist-coated wafer is exposed to a developer, the developer chemistry selectively dissolves either the exposed or the unexposed parts of the resist. Positive-tone resists are defined as resists whose exposed areas are removed by the developer. Negative-tone resists are removed by the developer only in the unexposed areas. Choice of a positive- or negative-tone resist is dictated by a number of considerations, including the relative defect levels of positive and negative photomasks, the performance of the available positive and negative resists, and the differences in the fundamental optics of positive and negative image formation. Perhaps surprisingly, complementary photomasks with clear and opaque areas exactly reversed do not produce aerial image intensity profiles that are exact inverses of each other. Photoresist, unlike typical photographic emulsion, is designed to have an extremely high contrast. This means that its response to the aerial image is quite nonlinear, and it tends to exhibit a sort of threshold response.
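A crude way to see what a threshold-like response does to a low-contrast aerial image is to model the resist as printing wherever the local intensity falls below (or above) a fixed dose threshold. The sketch below applies a hard threshold to a sinusoidal line/space image; the threshold value, pitch, and contrast numbers are illustrative assumptions, not data from this chapter.

import math

def printed_linewidth(pitch_nm: float, contrast: float, threshold: float = 0.5) -> float:
    """Width of the region of a sinusoidal aerial image that stays below a dose
    threshold (a crude model of a line printed in positive resist).
    Intensity model: I(x) = 0.5 + 0.5*contrast*cos(2*pi*x/pitch)."""
    samples = 10_000
    below = sum(
        1 for i in range(samples)
        if 0.5 + 0.5 * contrast * math.cos(2 * math.pi * i / samples) < threshold
    )
    return pitch_nm * below / samples

# At the nominal threshold, even a low-contrast image prints a line of about half the
# pitch, because the thresholded response ignores the shallow slopes of the profile.
print(printed_linewidth(260.0, contrast=0.3))                  # ~130 nm
print(printed_linewidth(260.0, contrast=0.9))                  # ~130 nm
# A 10% shift in threshold (roughly a 10% dose change) moves the low-contrast line
# much more than the high-contrast one, illustrating reduced exposure latitude.
print(printed_linewidth(260.0, contrast=0.3, threshold=0.45))  # ~102 nm
print(printed_linewidth(260.0, contrast=0.9, threshold=0.45))  # ~121 nm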
The aerial image of a tightly spaced grating may have an intensity profile that resembles a sine curve with very shallow slopes at the transitions between bright and dark areas. If this intensity profile were directly translated into a resist thickness profile, the resulting resist patterns would be unacceptable for semiconductor manufacturing. However, the nonlinear resist response can provide a very steep sidewall in a printed feature even when the contrast of the aerial image is low (Figure 1.10). The width of a feature printed in photoresist is a fairly sensitive function of the exposure energy. To minimize line width variation, stringent requirements must be placed on the exposure uniformity and repeatability. Typically, the allowable line width tolerance is ±10% (or less) of the minimum feature size. For a 130 nm feature, this implies line width control of ±13 nm or better. Because there are several factors contributing to line width variation, the portion that results from exposure variations must be quite small.
[Figure 1.10 plots image size (μm) versus exposure dose (mJ/cm²) for the two resists described in the caption.]
FIGURE 1.10 Two plots of line width versus exposure energy. (a) A rather sensitive negative-toned resist with good exposure latitude. The curve in (b) has a much steeper slope, indicating a much narrower exposure window for good line width control. The second resist is positive toned, and it is less sensitive than the resist in (a).
Exposure equipment is typically designed to limit intrafield exposure variation and field-to-field variation to ±1%. Over a moderate range of values, deliberate changes in exposure energy can be used to fine-tune the dimensions of the printed features. In positive-tone photoresist, a 10% increase in exposure energy can give a 5%–10% reduction in the size of a minimum-sized feature.

1.6.2 Thin-Film Interference and the Swing Curve

The optical interaction of the aerial image with the wafer can be very complicated. In the later stages of semiconductor manufacture, many thin layers of semitransparent films will have been deposited on the surface of the wafer, topped by approximately a micron of photoresist. Every surface in this film stack will reflect some fraction of the light and transmit the rest. The reflected light interferes with the transmitted light to form standing waves in the resist. Standing waves have two undesirable effects. First, they create a series of undesirable horizontal ridges in the resist sidewalls, corresponding to the peaks and troughs in the standing wave intensity. But more seriously, the standing waves affect the total amount of light captured by the layer of resist (Figure 1.11). A slight change in thickness of the resist can dramatically change the amount of light absorbed by the resist, effectively changing its sensitivity to the exposure. A graph of resist sensitivity versus thickness will show a regular pattern of oscillations, often referred to as the swing curve. The swing between maximum and minimum sensitivity occurs with a thickness change of λ/4n, where λ is the exposure wavelength and n is the resist's index of refraction. Because of the direct relationship between resist sensitivity and line width, the swing curve can make slight variations in resist thickness show up as serious variations in line width. The greatest line width stability is achieved when the resist thickness is tuned to an extremum in the swing curve (Figure 1.12). For best control, resist thickness must be held within about ±10 nm of the optimum thickness. Variations in chip surface topography may make it impossible to achieve this level of control everywhere on the wafer.

A swing curve can also be generated by variations in the thickness of a transparent film that lies somewhere in the stack of films underneath the resist [30]. This allows a change in an earlier deposition process to unexpectedly affect the behavior of a previously stable lithographic process. Ideally, all films on the wafer surface should be designed for thicknesses that minimize the swing curve in the resist.
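The two interference periods quoted above are easy to evaluate. The short sketch below computes the standing-wave node spacing (λ/2n) and the swing from a sensitivity maximum to the adjacent minimum (λ/4n), and includes a crude cosine model of the swing curve; the resist index, the swing amplitude, and the model itself are illustrative assumptions rather than results from this chapter.

import math

def standing_wave_spacing(wavelength_nm: float, n_resist: float) -> float:
    """Spacing between absorption maxima through the resist depth: lambda / (2n)."""
    return wavelength_nm / (2.0 * n_resist)

def max_to_min_swing(wavelength_nm: float, n_resist: float) -> float:
    """Resist-thickness change from a sensitivity maximum to the adjacent minimum: lambda / (4n)."""
    return wavelength_nm / (4.0 * n_resist)

def relative_sensitivity(thickness_nm: float, wavelength_nm: float,
                         n_resist: float, swing_amplitude: float = 0.2) -> float:
    """Crude cosine model of the swing curve around a mean sensitivity of 1.0."""
    return 1.0 + swing_amplitude * math.cos(4.0 * math.pi * n_resist * thickness_nm / wavelength_nm)

wavelength, n = 248.0, 1.76                    # KrF exposure, illustrative resist index
print(standing_wave_spacing(wavelength, n))    # ~70 nm between standing-wave peaks
print(max_to_min_swing(wavelength, n))         # ~35 nm thickness change from peak to trough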
FIGURE 1.11 Optical energy absorption versus depth in a 1 μm layer of photoresist. The effect of standing waves is very prominent, with the regions of maximum absorption separated by a spacing of λ/2n. This resist is rather opaque, and the loss of energy is evident from the top to the bottom surface of the resist.
However, because of the many other process and design requirements on these films, it is rare that their thicknesses can be controlled solely for the benefit of the lithography. The swing curve has the greatest amplitude when the exposure illumination is monochromatic. If a broad band of exposure wavelengths is used, the amplitude of the swing curve can be greatly suppressed. The band of wavelengths used in refractive lithographic lenses is not broad enough to have much effect on the swing curve. However, reflective lens designs allow optical bandwidths of 10–20 nm or more. This can significantly reduce the swing curve, to the point that careful optimization of the resist thickness may no longer be necessary.

1.6.3 Mask Reflectivity

Light that is specularly reflected from the wafer surface will be transmitted backward through the projection lens, and it will eventually strike the front surface of the mask. If the mask has a high degree of reflectivity, as uncoated chromium does, this light will be reflected back to the wafer surface. Because this light has made three trips through the projection optics, the images will be substantially broadened by diffraction. This can result in a faint halo of light around each bright area in the image, causing some loss in contrast. The effect is substantially the same as the flare caused by random light scattering in the lens. The almost universally adopted solution to this problem is an antireflective coating on the chromium surface of the mask.
FIGURE 1.12 Standing waves set up by optical interference within the layer of photoresist can lead to undesirable ridges in the sidewalls of the developed resist pattern.
This can easily reduce the mask reflectivity from more than 50% to around 10%, and it can considerably suppress the problem of reflected light.

1.6.4 Wafer Topography

In addition to the problems caused by planar reflective films on the wafer, there is a set of problems that result from the three-dimensional circuit structures etched into the wafer surface. Photoresist tends to bridge across micron-scale pits and bumps in the wafer, leaving a planar surface. But longer-scale variations in the wafer surface are conformally coated by resist, producing a nonplanar resist surface to interact with the planar aerial image (Figure 1.13). This vertical wafer topography directly reduces the usable depth of focus. Vertical surfaces on the sides of etched structures can also reflect light into regions that were intended to receive no exposure. This effect is often called reflective notching, and in severe cases, circuit layouts may have to be completely redesigned before a working device can be manufactured.

1.6.5 Control of Standing Wave Effects

A number of solutions to all of these problems have been worked out over the years. Horizontal ridges, induced by standing waves, in resist sidewalls can usually be reduced or eliminated by baking the resist after exposure. This post-exposure bake allows the chemicals forming the latent image to diffuse far enough (about 50 nm) to eliminate the ridges without significantly degrading the contrast at the edge of the image. The post-exposure bake does nothing to reduce the swing curve, the other main effect of standing waves. Reflective notching and the swing curve can be reduced, to some extent, by using dyed photoresist. The optical density of the photoresist can be increased enough to suppress the light reflected from the bottom surface of the resist. This is a fairly delicate balancing act because too much opacity will reduce the exposure at the bottom surface of the resist and seriously degrade the sidewall profiles.

An antireflective coating (often referred to by the acronym ARC) provides a much better optical solution to the problems of reflective notching and the swing curve. In this technique, a thin layer of a heavily dyed polymer or an opaque inorganic material is applied to the wafer underneath the photoresist layer. The optical absorption of this ARC layer is high enough to decouple the resist from the complex optical behavior of any underlying film stacks. Ideally, the index of refraction of the ARC should be matched to that of the photoresist so that there are no reflections from the resist-ARC interface. If this index matching is done perfectly, the swing curve will be totally suppressed. Although this is an elegant technique on paper, there are many practical difficulties with ARC layers. The ARC material must not be attacked by the casting solvent of the resist, and it must not interact chemically with the resist during exposure and development. The ARC substantially adds to the cost, time, and complexity of the photoresist application process. After the resist is developed, the ARC layer must be removed from the open areas by an etch process (Figure 1.14).
FIGURE 1.13 In the presence of severe topographical variations across the chip’s surface, it may be impossible to project a focused image into all parts of the resist at the same time. The shaded area represents the stepper’s depth of focus. It has been centered on the higher regions of topography, leaving the lower regions badly out of focus.
There is typically very little etch selectivity between organic ARCs and photoresist, so the ARC removal step often removes a substantial amount of resist and may degrade the resist's sidewall profile. This means that the ARC layer must be as thin as possible, preferably less than 10% of the resist thickness. If the ARC must be this thin, its optical absorbance must be very high. Aside from the difficulty of finding dyes with high enough levels of absorbance, a large discrepancy between the absorbance of the resist and that of the ARC makes it impossible to get an accurate match of the index of refraction. A fairly substantial swing curve usually remains a part of the lithographic process, even when an ARC is used. This remaining swing curve can be controlled by holding the resist thickness within tight tolerance limits.

The swing curve can also be controlled with an antireflective layer on the top surface of the resist [31]. This works because the light-trapping effect that induces the swing curve is caused by optical interference between light waves reflected from the top and bottom surfaces of the resist layer. If there is no reflection from the top surface, this interference effect will not occur. A simple interference coating can be used, consisting of a quarter-wavelength thickness of a material whose index of refraction is the square root of the photoresist's index. Because photoresists typically have refractive indices that lie between 1.5 and 1.8, a top ARC should have a refractive index between 1.2 and 1.35. There are few materials of any kind with indices in this range, but some have been found and used for this application.
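The quarter-wave design rule just described is straightforward to evaluate. The sketch below computes the ideal top-ARC refractive index and thickness for an illustrative resist index and exposure wavelength; these numbers are assumptions chosen for illustration, not recommendations from this chapter.

import math

def top_arc_design(n_resist: float, wavelength_nm: float) -> tuple[float, float]:
    """Ideal single-layer top antireflective coating on resist:
    index = sqrt(resist index), thickness = a quarter wave inside the coating."""
    n_arc = math.sqrt(n_resist)
    thickness_nm = wavelength_nm / (4.0 * n_arc)
    return n_arc, thickness_nm

n_arc, t_arc = top_arc_design(n_resist=1.7, wavelength_nm=248.0)
print(f"ideal top-ARC index ~ {n_arc:.2f}, thickness ~ {t_arc:.0f} nm")
# -> index ~ 1.30, thickness ~ 48 nm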
FIGURE 1.14 A series of swing curves showing the improvements that can be achieved with antireflective coatings (ARCs). (a) The suppression in swing curve when a bottom-surface ARC is used. (b) A similar suppression of swing curve by a top-surface ARC. Note that more light is coupled into the resist with the top ARC, increasing the resist’s effective sensitivity. (c) The much larger swing curve when no ARC is used. In these computer-modeled curves, resist sensitivity is taken as the optical absorption per unit volume of resist.
The optical benefits of a top ARC are not as great as those of a conventional bottom-surface ARC because reflective notching, thin-film interference from substrate films, and sidewall ridges are not suppressed. However, top ARC has some substantial process advantages over the conventional ARC. If a water-soluble material is used for the top ARC, it does not have much tendency to interact with the resist during spin casting of the top-ARC layer. The top ARC can actually protect the underlying resist from airborne chemical contamination, a well-known problem for some types of modern photoresists. After the exposure is completed, the top ARC can be stripped without affecting the resist thickness or sidewall profiles. In fact, top ARC can be designed to dissolve in the aqueous base solutions that are typically used as developers for photoresist, eliminating the need for a separate stripping step. (It should be noted that this clever use of a water-soluble film cannot be used for a bottom-surface ARC. If a bottom ARC washes away in the developer, the resist features sitting on top of it will wash away as well.) Using a moderately effective bottom-surface ARC along with a combined top ARC and chemical barrier should provide the maximum benefits. However, the expense and complexity of this belt-and-suspenders approach usually make it unrealistic in practice. Because of the cost, ARCs of both kinds are usually avoided unless a particular lithographic level cannot be made to work without them.

1.6.6 Control of Topographic Effects

Chip surface topography presents a challenge for lithography even when reflective notching and the swing curve are suppressed by ARCs. When the wafer surface within the exposure field is not completely flat, it may be impossible to get both high and low areas into focus at the same time. If the topographical variations occur over a short distance, then the resist may planarize the irregularities; however, it will still be difficult to create images on the thick and thin parts of the resist with the same exposure. A variety of optical tricks and wafer planarization techniques have been developed to cope with this problem. The most successful planarizing technique has been chemical–mechanical polishing, in which the surface of the wafer is planarized by polishing it with a slurry of chemical and physical abrasives, much as one might polish an optical surface. This technique can leave a nearly ideal planar surface for the next layer of lithographic image formation.

1.6.7 Latent Image Stability

The stability of the latent image varies greatly from one type of resist to another. Some resists allow exposed wafers to be stored for several days between exposure and development. However, there are also many resists that must be developed within minutes of exposure. If these resists are used, an automated wafer developer must be integrated with the exposure system so that each wafer can be developed immediately after exposure. This combination of exposure system and wafer processing equipment is referred to as an integrated photosector or photocluster.
1.7 The Resist Image

1.7.1 Resist Development

After the latent image is created in resist, a development process is used to produce the final resist image. The first step in this process is a post-exposure bake. Some types of resist
(especially the so-called chemically amplified resists) require this bake to complete the formation of the latent image by accelerating reactions between the exposed photosensitizer and the other components of the resist. Other types of resist are also baked to reduce the sidewall ridges resulting from standing waves. The baked resist is then developed in an aqueous base solution. In many cases, this just involves immersing an open-sided container of wafers in a solution of potassium hydroxide (KOH). This simple process has become more complicated in recent years as attempts are made to improve uniformity of the development, to reduce contamination of the wafer surface by metallic ions from the developer, and to reduce costs. Today, a more typical development would be done on a specialized single-wafer developer that would mechanically transport each wafer in its turn to a turntable, flood the surface of the wafer with a shallow puddle of metal ion-free developer (such as tetramethylammonium hydroxide, TMAH), rinse the wafer, and spin it dry. In a spray or puddle develop system, only a small amount of developer is used for each wafer, and the developer is discarded after use.

With the continuing improvement in lens resolution, a new problem has begun to occur during the drying step after development. If a narrow resist structure has an excessively high aspect ratio (the ratio between the height and width after development), then it has a strong tendency to be toppled over by the surface tension of the developer as it dries. Resist collapse may limit the achievable lithographic resolution before lens resolution does. Chemical surfactants in the developer can improve the situation, as can improvements in the adhesion between the resist and the wafer surface. A somewhat exotic technology called supercritical drying is also under investigation to prevent resist collapse. With this method, the aqueous developer is displaced by another liquid chosen to have a relatively accessible critical point. The critical point is the temperature and pressure at which the phase transition between the liquid and gas states disappears. The liquid containing the developed wafer is first pressurized above the critical pressure; it is then heated above the critical temperature. At this point, the fluid has become a gas without ever having passed through a phase transition. The gas can be pumped out while the temperature is held high enough to prevent recondensation, and the dry wafer can then be cooled back to room temperature. The entire process is complicated and slow, but it completely prevents the creation of a surface meniscus or the forces that cause resist collapse during drying.

1.7.2 Etch Masking

The developed resist image can be used as a template or mask for a variety of processes. Most commonly, an etch is performed after the image formation. The wafer can be immersed in a tank of liquid etchant. The resist image is not affected by the etchant, but the unprotected areas of the wafer surface are etched away. For a wet etch, the important properties of the resist are its adhesion and the dimension at the base of the resist images. Details of the resist sidewall profile are relatively unimportant. However, wet etches are seldom used in critical levels of advanced semiconductor manufacturing. This occurs because wet chemical etches of amorphous films are isotropic.
As well as etching vertically through the film, the etch proceeds horizontally at an equal rate, undercutting the resist image and making it hard to control the final etched pattern size. Much better pattern size control is achieved with reactive ion etching (RIE). In this process, the wafer surface is exposed to the bombardment of chemically reactive ions in a vacuum chamber. Electric and/or magnetic fields direct the ions against the wafer surface at normal incidence, and the resulting etch can be extremely anisotropic. This prevents the undercut usually seen with wet etches, and it allows a much greater degree of dimensional control.
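A first-order way to see why isotropic wet etching loses dimensional control is to note that an etch that just clears a film of thickness t also undercuts the resist edge laterally by roughly t on each side, whereas an ideally anisotropic RIE does not etch laterally at all. The sketch below compares the two cases; the 1:1 lateral-to-vertical etch assumption and the numbers are simplifications chosen for illustration.

def etched_opening(resist_opening_nm: float, film_thickness_nm: float,
                   anisotropy: float) -> float:
    """Width of the etched opening at the bottom of the film.
    anisotropy = 1.0 models an ideal RIE (no lateral etch);
    anisotropy = 0.0 models a fully isotropic wet etch (lateral etch = vertical etch)."""
    undercut_per_side = (1.0 - anisotropy) * film_thickness_nm
    return resist_opening_nm + 2.0 * undercut_per_side

print(etched_opening(200.0, 300.0, anisotropy=0.0))   # wet etch: 800 nm opening
print(etched_opening(200.0, 300.0, anisotropy=1.0))   # ideal RIE: 200 nm opening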
Reactive ion etching places a much different set of requirements on the resist profile than does a wet etch process. Because the RIE process removes material by both chemical reaction and mechanical bombardment, it tends to erode the resist much more than a wet etch. This makes the profile of the resist sidewall very important (Figure 1.15). Any shape other than a straight, vertical sidewall with square corners at the top and foot of the resist image is undesirable because it allows the transfer of a sloped profile into the final etched structure. A variety of profile defects such as sidewall slopes, T-tops, and image feet can be induced by deficiencies in the photoresist process. Another set of profile defects can result from the properties of the aerial image. With today's high-NA lithographic lenses, the aerial image can change substantially over the thickness of the resist. This leads to different resist profiles, depending on whether the best focus of the aerial image is at the top, bottom, or middle of the resist film.

Although the best resist image profiles can be formed in thin resist, aggressive RIE processes may force the use of a thick resist layer. Tradeoffs between the needs of the etch process and the lithographic process commonly result in a resist thickness between 0.3 and 1.0 μm. As lateral image dimensions shrink below 0.18 μm, the height-to-width ratio (or aspect ratio) of the resist images can become very large. Resist adhesion failure can become a problem for aspect ratios greater than about 3:1.

Although resist patterns are most frequently used as etch masks, there are other important uses as well. The resist pattern can be used to block ion implantation or to block deposition of metal films. Ion implants require square resist profiles similar to the profiles required for RIE; however, the thickness requirements are dictated by the stopping range of the energetic ions in the resist material. For ion implants in the MeV range, a resist thickness considerably greater than 1 μm may be required. Fortunately, implant masks do not usually put great demands on the lithographic resolution or overlay tolerance.

Both etch and implant processes put fairly strong demands on the physical and thermal durability of the resist material. The etch resistance and thermal stability of the developed resist image may be enhanced by a variety of resist-hardening treatments. These may involve diffusion of etch-resistant additives (such as silicon compounds) into the already patterned resist.
FIGURE 1.15 A comparison of wet and dry etch processes. The wet etch process shown in (a) is isotropic and tends to undercut the resist pattern. The etch does not attack the resist, and the final etch profiles are sloped. In contrast, the reactive-ion etch (RIE) process shown in (b) can be used to etch a much thicker film while retaining nearly vertical sidewall slopes. The resist is partially eroded during the RIE process. The bottom illustration in each series shows the etched pattern after the resist is stripped.
More frequently, the hardening treatment consists of a high-temperature bake and/or ultraviolet flood exposure of the wafer. These processes tend to cross-link the resist polymers and toughen the material. A large number of g-line, i-line, and 248 nm (deep-UV) resists can be successfully UV or bake hardened, but there are some resists that do not cross-link under these treatments. The choice of whether or not to use resist-hardening treatments is dictated by details of the etch process and the etch resistance of the untreated resist.

As the image size shrinks from one lithographic generation to the next, there has been no corresponding reduction of the resist thickness required by the etch or implant processes. If anything, there is a tendency toward more aggressive etches and higher energy ion implants. This is pushing resist images to ever higher aspect ratios. Already, 0.13 μm features in 0.4 μm thick resist have an aspect ratio of 3:1. This is likely to grow to around 4:1 in the following generation. If resist processes start to fail at these high aspect ratios, there will have to be a migration to more complex resist processes. High aspect ratios can be readily (if expensively) achieved with multilayer resist (MLR) processes or top-surface-imaging (TSI) resists.

1.7.3 Multilayer Resist Process

An MLR consists of one or two nonphotosensitive films covered with a thin top layer of photoresist. The resist is exposed and developed normally; the image is then transferred into the bottom layers by one or more RIE processes. The patterned bottom layer then acts as a mask for a different RIE process to etch the substrate. For example, the top layer may be a resist with a high silicon content. This will block an oxygen etch that can be used to pattern an underlying polymer layer. The patterned polymer acts as the mask for the substrate etch. Another MLR process uses a sandwich of three layers: a bottom polymer and a top resist layer separated by a thin layer of silicon dioxide (Figure 1.16). After the resist layer is exposed and developed, the thin oxide layer is etched with a fluorine RIE. The oxide then acts as a mask to etch the underlying polymer with an oxygen etch. Finally, the patterned polymer layer is used to mask a substrate etch.

An MLR process allows the customization of each layer to its function. The bottom polymer layer can be engineered for maximum etch resistance, and the top resist layer is specialized for good image formation. The sidewall profiles are generated by RIE, and they tend to be extremely straight and vertical. It is relatively easy to create features with very high aspect ratios in MLRs. The cost of MLR is apt to be very high because of the multiple layers of materials that have to be deposited and the corresponding multiple RIE steps. Another way to design an MLR process is to deposit a thin layer of material by sputtering or chemical vapor deposition. This first layer acts as both a hard mask for the final etch process and as an inorganic antireflective coating for the top layer of photoresist. By judicious choice of materials with high relative etch selectivities, both the resist layer and the ARC/hard-mask layers can be made quite thin.

1.7.4 Top-Surface Imaging

Top-surface imaging resists are deposited in a single layer like conventional resists.
These resists are deliberately made extremely opaque, and the aerial image does not penetrate very deeply into the surface. The latent image formed in the surface is developed by treating it with a liquid or gaseous silicon-bearing material. Depending on the chemistry of the process, the exposed surface areas of the resist will either exclude or preferentially absorb the silylating agent.
FIGURE 1.16 In a multilayer resist (MLR) process, a thin photosensitive layer is exposed and developed. The pattern is then transferred into an inert layer of polymer by a dry etch process. The imaging layer must have a high resistance to the pattern transfer etch. Very small features can be created in thick layers of polymer with nearly vertical sidewall profiles.
The silicon incorporated in the surface acts as an etch barrier for an oxygen RIE that creates the final resist profile. Top-surface imaging shares most of the advantages of MLR, but at somewhat reduced costs. It is still a more expensive process than standard single-layer resist. Top-surface-imaging resist images often suffer from vertical striations in the sidewalls and from unwanted spikes of resist in areas that are intended to be clear. The presence or absence of these types of defects is quite sensitive to the etch conditions (Figure 1.17).

1.7.5 Deposition Masking and the Liftoff Process

Use of resist as a mask for material deposition is less common, but it is still an important application of lithography. In rare cases, a selective deposition process is used to grow a crystalline or polycrystalline material on exposed areas of the wafer surface.
FIGURE 1.17 Top-surface imaging (TSI) uses a very opaque resist whose surface is activated by exposure to light. The photoactivated areas selectively absorb organic silicon-bearing compounds during a silylation step. These silylated areas act as an etch barrier, and the pattern can be transferred into the bulk of the resist with a dry etch process.
Because of the selectivity of the chemical deposition process, no material is deposited on the resist surface. The resist sidewalls act as templates for the deposited material, and the resist must be thick enough to prevent overgrowth of the deposition past its top surface. Other than this, there are no special demands on the resist used as a mask for selective deposition.

Another masked deposition process involves nonselective deposition of a film over the top of the resist image. The undesired material deposited on top of the resist is removed when the resist is stripped. This is known as the liftoff process. A very specialized resist profile is required for the liftoff process to be successful. A vertical or positively sloped profile can allow the deposited material to form bridges between the material on the resist and that on the wafer surface (Figure 1.18). These bridges anchor down the material that is intended to be removed. There are various resist processes, some of them quite complex, for producing the needed negatively sloped or undercut resist profile.

1.7.6 Directly Patterned Insulators

There is one final application of lithography that should be mentioned. It is possible to incorporate the patterned resist directly into the final semiconductor device, usually as an insulating layer. This is difficult to do near the beginning of the chip fabrication process because high temperature steps that are encountered later in the process would destroy the organic resist polymers. However, photosensitive polyimide materials have been used as photodefinable insulators in some of the later stages of semiconductor fabrication.
FIGURE 1.18 The liftoff process requires a specialized undercut photoresist profile. A thin layer of metal is deposited on the wafer surface. The metal that lies on top of the photoresist is washed away when the resist is stripped, leaving only the metal that was directly deposited on the wafer surface.
1.7.7 Resist Stripping

The final step in the lithographic process is stripping the resist that remains on the wafer after the etch, deposition, or ion implant process is complete. If the resist has not been chemically altered by the processing, it can often be removed with an organic solvent. However, resists that have been heavily cross-linked during processing or by a deliberate resist-hardening step are much harder to remove. This problem, well known to anyone who has tried to clean the neglected top of a kitchen stove, is usually solved with powerful chemical oxidants such as strong acids, hydrogen peroxide, or ozone, or with an oxygen plasma asher that oxidizes the organic resist polymers with practically no residue.
1.8 Alignment and Overlay

Image formation and alignment of the image to previous levels of patterns are equally critical parts of the lithographic process. Alignment techniques have evolved over many years from simple two-point alignments to very sophisticated multi-term models. Over a ten-year period, this has brought overlay tolerances from 0.5 μm down to below 50 nm today. Over the years that semiconductor microlithography has been evolving, overlay requirements have generally scaled linearly with the minimum feature size. Different technologies have different proportionality factors, but, in general, the overlay tolerance requirement has been between 25% and 40% of the minimum feature size. If anything, there has been a tendency for overlay tolerance to become an even smaller fraction of the minimum feature size.

1.8.1 Definitions

Most technically educated people have a general idea of what alignment and overlay mean, but there are enough subtleties in the jargon of semiconductor manufacturing that a few definitions should be given. Both alignment accuracy and overlay accuracy are the positional errors resulting when a second-level lithographic image is superimposed on a first-level pattern on a wafer. Alignment accuracy is measured only at the location of the alignment marks. This measurement serves to demonstrate the accuracy of the stepper's alignment system. The total overlay accuracy is measured everywhere on the wafer, not just in the places where the alignment marks are located. It includes a number of error terms beyond those included in the alignment error. In particular, lens distortion, chuck-induced wafer distortion, and image placement errors on the mask can give significant overlay errors even if the alignment at the position of the alignment marks is perfect. Of course, it is the total overlay error that determines the production yield and quality of the semiconductor circuits being manufactured.

Alignment and overlay could have been defined in terms of the mean length of the placement error vectors across the wafer, but it has been more productive in semiconductor technology to resolve placement error into x and y components and to analyze each component separately as a scalar error. This occurs because integrated circuits are designed on a rectangular grid with dimensions and tolerances specified in Cartesian coordinates. If a histogram is made of the x-axis overlay error at many points across a wafer, the result will be a more or less Gaussian distribution of scalar errors. The number quoted as the x-axis overlay or alignment error is the absolute value of the mean error plus
three times the standard deviation of the distribution about the mean:

$$\mathrm{Overlay}_x = |\bar{X}| + 3\sigma_x$$

The y-axis overlay and alignment errors have an analogous form. The evolutionary improvement of overlay tolerance has paralleled the improvements in optical resolution for many years, but the technologies involved in overlay are nearly independent of those involved in image formation. Resolution improvements are largely driven by increases in numerical aperture and reduction of optical aberrations in the lithographic lens. Although lens distortion is a significant component of the overlay budget, most of the overlay accuracy depends on the technologies of alignment mark detection, stage accuracy, photomask tolerance, thermal control, and wafer chucking.

1.8.2 Alignment Methodology

Overlay errors consist of a mixture of random and systematic placement errors. The random component is usually small, and the best alignment strategy is usually to measure and correct as many of the systematic terms as possible. Before the wafer is placed on the vacuum chuck, it is mechanically pre-aligned to a tolerance of a few tens of microns. This prealignment is good enough to bring alignment marks on the wafer within range of the alignment mark detection system, but the rotation and x- and y-translation errors must be measured and corrected before the wafer is exposed. A minimum of two alignment marks must be measured to correct rotation and x and y translation. The use of two alignment marks also gives information about the wafer scale. If the previous level of lithography was exposed on a poorly calibrated stepper or if the wafer dimensions have changed because of thermal effects, then the information on wafer scale can be used to adjust the stepping size to improve the overlay. The use of a third alignment mark adds information about wafer scale along a second axis and about orthogonality of the stepping axes. These terms usually need to be corrected to bring overlay into the sub-0.1 μm regime.

Each alignment mark that is measured provides two pieces of information: its x and y coordinates. This means that 2n alignment terms can be derived from n alignment mark measurements. It is usually not productive to correct stepping errors beyond the six terms just described: x and y translation, wafer rotation, x and y wafer scale, and stepping orthogonality. However, a large number of alignment mark positions can be measured on the wafer and used to calculate an overspecified, or constrained, fit to these six terms. The additional measurements provide redundant information that reduces the error on each term (Figure 1.19).

An additional set of systematic alignment terms results from errors within a single exposure field. The dominant terms are intrafield magnification error and field rotation relative to the stepping axes. In a static-field stepper, the lens symmetry prevents any difference between the magnification in the x and y directions (known as anamorphism); however, there are higher-order terms that are important. Third-order distortion (barrel or pincushion distortion) is a variation of magnification along a radius of the circular exposure field. X- and y-trapezoid errors result from a tilted mask in an optical system that is not telecentric on the mask side. As the name implies, trapezoid errors distort a square coordinate grid into a trapezoidal grid.
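The overdetermined six-term fit and the |mean| + 3σ reporting convention described above can be illustrated in a few lines of code. The sketch below fits translation, scale, rotation, and orthogonality terms to a set of mark measurements by linear least squares and summarizes one axis of the raw error; the particular linear error model, the function names, and the synthetic data are illustrative assumptions, not any specific stepper's algorithm.

import numpy as np

def fit_wafer_terms(nominal_xy: np.ndarray, measured_xy: np.ndarray) -> np.ndarray:
    """Least-squares fit of a linear wafer model to alignment-mark data.
    Model: dx = Tx + Sx*x - R*y ;  dy = Ty + Sy*y + (R + O)*x,
    where Tx, Ty = translation, Sx, Sy = scale, R = rotation, O = orthogonality.
    Returns the six fitted terms [Tx, Ty, Sx, Sy, R, O]."""
    x, y = nominal_xy[:, 0], nominal_xy[:, 1]
    dx = measured_xy[:, 0] - x
    dy = measured_xy[:, 1] - y
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    rows_x = np.column_stack([ones, zeros, x, zeros, -y, zeros])
    rows_y = np.column_stack([zeros, ones, zeros, y, x, x])
    a = np.vstack([rows_x, rows_y])
    b = np.concatenate([dx, dy])
    terms, *_ = np.linalg.lstsq(a, b, rcond=None)
    return terms

def overlay_number(errors_nm: np.ndarray) -> float:
    """Report one axis of overlay as |mean| + 3 * standard deviation."""
    return abs(errors_nm.mean()) + 3.0 * errors_nm.std()

# Synthetic example: a 5 ppm x-scale error plus 10 nm of random mark-detection noise
rng = np.random.default_rng(0)
nominal = rng.uniform(-100_000_000, 100_000_000, size=(20, 2))   # mark positions in nm
measured = nominal.copy()
measured[:, 0] *= 1 + 5e-6                                       # wafer x-scale error
measured += rng.normal(0, 10, size=nominal.shape)                # mark measurement noise
print(fit_wafer_terms(nominal, measured)[2])                     # recovered Sx, close to 5e-6
print(overlay_number(measured[:, 0] - nominal[:, 0]))            # x overlay before correction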
In previous generations of steppers, these intrafield errors (except for third-order distortion, which is a property of the lens design) were removed during the initial installation of the system and readjusted at periodic maintenance intervals. Today, most advanced steppers have the ability to adjust magnification and field rotation on each wafer to minimize overlay error.
FIGURE 1.19 Common wafer-scale overlay errors. The solid chip outlines represent the prior exposure level, and the dotted outlines are the newly aligned exposures. (a) A simple x and y translation error. (b) A positive wafer scale error in x and a negative wafer scale error in y. (c) Wafer rotation. (d) Stepping orthogonality error.
Alignment marks in at least two different locations within the exposure field are needed to measure these two intrafield terms. Step-and-scan exposure systems can have additional intrafield error terms. Because the total exposure field is built up by a scanning slit, an error in the scanning speed of either the mask or wafer can produce a difference between the x and y field magnifications. If the mask and the wafer scans are not exactly parallel, then an intrafield orthogonality error (often called skew) is generated. Of course, the lens magnification and field rotation errors seen in static exposure steppers are also present in step-and-scan systems. A minimum of three intrafield alignment marks is necessary to characterize x and y translation, x and y field magnification, field rotation, and skew in such a system.

Although step-and-scan systems have the potential of adding new intrafield error terms, they also have the flexibility to correct any of these terms that occur in the mask. Because of the way photomasks are written, they are subject to some amount of anamorphic magnification error and intrafield skew. A static-field stepper cannot correct these errors, but a step-and-scan system can do so. The mask has to be analyzed for image placement errors prior to use; then corrections for x and y magnification and skew can be programmed into the scanning software (Figure 1.20).

1.8.3 Global Mapping Alignment

Intrafield and stepping error terms can be simultaneously derived in a constrained fit to a large number of measured alignment mark positions across the wafer. This alignment strategy is called global mapping.
FIGURE 1.20 Correctable intra-field overlay errors. (a) A field magnification error; (b) field rotation. Contrast these with the corresponding wafer magnification and rotation errors in Figure 1.19b and c. There are many higher orders of intra-field distortion errors, but, in general, they cannot be corrected except by modifications to the lens assembly.
It requires that the wafer be stepped past the alignment mark detector while position information on several alignment marks is collected. After a quick computer analysis of the data, corrections to the intrafield and stepping parameters of the system are made, and the wafer exposure begins. The corrections made after the mapping are assumed to be stable during the one or two minutes required to expose the wafer. Global mapping has been a very successful alignment strategy. Its main drawback is the length of time required for the mapping pass, which degrades the productivity of the system.

Two-point global alignment and site-by-site alignment are the main strategies competing with global mapping. Because two-point global alignment is only able to correct for x and y translation, wafer rotation, and isotropic wafer scale errors, it can succeed only if the stage orthogonality is very good and the intrafield error terms are very small and stable over time. It also requires that all previous levels of lithography have very low values of these terms. Although most stepper manufacturers have adopted global mapping alignment, ASM Lithography (ASML) has been very successful with a highly accurate two-point alignment strategy.

The time penalty for the global mapping strategy has recently been reduced by an innovative development by ASML [32]. A step-and-scan system has been designed with two completely independent, identical stages. One stage performs the global mapping of alignment marks and simultaneously maps the vertical contour of the wafer. At the same time, the other stage is being used to expose a previously mapped wafer. The stage performing the exposures does not use any time for alignment or for performing an automatic focus at each exposure site. It spends all the available time moving to the premapped x, y, and z positions and performing the scanned exposures. When all of the exposures on the wafer are finished, the two stages exchange positions, and the stage that just completed its exposure pass starts a mapping pass on a new wafer. The mapping is all done with reference to a few fixed marks on the stage, and these marks must be aligned after the stage swap occurs and before the exposure pass begins. With this system, exposure rates on 300 mm wafers have exceeded 100 wafers per hour (wph).

1.8.4 Site-by-Site Alignment

Site-by-site alignment can be performed by most steppers with global mapping capability. In this alignment strategy, alignment marks at each exposure site are measured and corrections specific to that site are calculated. The site is exposed after the alignment
measurement is done, and the process is repeated at the next exposure site. The time required for this process is usually greater than that required for global mapping because of the need for alignment mark measurements at each exposure site on the wafer. The accuracy of site-by-site alignment can be better than that of global mapping, but this is not always the case. As long as the alignment is done to a lithographic level where the systematic component of placement error is greater than the random component, the data averaging ability of global mapping can yield a better final overlay. In general, global mapping reduces random measurement errors that occur during detection of the alignment mark positions, whereas site-by-site alignment does a better job of matching random placement errors of the previous level of lithography. As long as the random placement errors are small, global mapping can be expected to give better results than site-by-site alignment.

1.8.5 Alignment Sequence

Global mapping, two-point global alignment, and site-by-site alignment are all used to align the current exposure to a previous level of lithographic patterns. Another important part of the alignment strategy is to determine the alignment sequence. This is the choice of the previous level to which the current level should align. One common choice is level-to-level alignment. With this strategy, each level of lithographic exposure is aligned to the most recent previous critical level. (Levels, sometimes called layers, of lithographic exposures are classified as critical or noncritical, depending on the image sizes and overlay tolerances. Critical levels have image sizes at or near the lithographic resolution limits and the tightest overlay tolerances. Noncritical levels have image sizes and overlay tolerances that are relaxed by 1.5×–2× or more from those of the critical levels.) This provides the most accurate possible alignment between adjacent critical levels. But critical levels that are separated by another intervening critical level are only related by a second-order alignment. This will be less accurate than a first-order alignment by a factor of √2. More distantly separated levels will have even less accurate alignments to each other. In general, a high-order alignment will have an alignment error that is the root sum square of the individual alignment errors in the sequence. If all of the alignments in the sequence have the same magnitude of error, then an nth-order alignment error will be greater than a first-order alignment error by a factor of √n.

Another common alignment strategy is called zero-level alignment. In this strategy, every level of lithographic exposure is aligned to the first level that was printed on the wafer. This first level may be the first processing level or a specialized alignment level (called the zero level) containing nothing but alignment marks. With this strategy, every level has an accurate first-order alignment to the first level and a less accurate second-order alignment to every other level. The choice of whether to use zero-level or level-to-level alignment depends on the needs of the semiconductor circuits being fabricated. In many cases, the most stringent requirement is for alignment of adjacent critical levels, and alignment to more remote levels is not as important. Level-to-level alignment is clearly called for in this case. After a chain of six or eight level-to-level alignments, the first and last levels printed will have a 2.5×–3×
degradation in the accuracy of their alignment relative to a first-order alignment. Zero-level alignments suffer from another problem. Many semiconductor processing steps are designed to leave extremely planar surfaces on the wafer. For example, chemical–mechanical polishing leaves a nearly optical finish on the wafer surface. If an opaque film is deposited on top of such a planarized surface, any underlying alignment marks will be completely hidden. Even in less severe cases, the accumulation of many levels of processing may seriously degrade the visibility of a zero-level alignment mark.
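The root-sum-square accumulation described above is simple to compute. The sketch below estimates the relative error between two levels connected through a chain of alignments, assuming, as the text does, that each individual alignment contributes an equal, independent error; the 15 nm figure is an illustrative assumption.

import math

def chained_alignment_error(single_alignment_error_nm: float, chain_length: int) -> float:
    """Error between two levels linked by a chain of independent alignments,
    accumulated as the root sum of squares of the individual errors."""
    return single_alignment_error_nm * math.sqrt(chain_length)

# With 15 nm per first-order alignment, a chain of six alignments degrades to ~37 nm,
# roughly the 2.5x penalty quoted in the text.
print(chained_alignment_error(15.0, 1))   # 15.0 nm
print(chained_alignment_error(15.0, 6))   # ~36.7 nm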
Often, a mixed strategy is adopted. If zero-level alignment marks become unusable after a particular level of processing, then a new set of marks can be printed and used for successive levels of alignment. If a level-to-level alignment strategy fails because of the inaccuracy of a third- or fourth-order alignment, then the alignment of that level may be changed to an earlier level in the alignment sequence. Generally speaking, the impact of any particular alignment sequence on the overlay accuracy between any two exposure levels should be well known to the circuit designer, and the design tolerances must take these figures into account from the start of any new circuit design.

1.8.6 Distortion Matching

After all of the correctable alignment terms have been measured and corrected, there are still some systematic, but uncorrectable, overlay errors. One important term is lens distortion. Although lens distortion has dramatically improved over the years (to values below 50 nm today), it is still an important term in the overlay budget. The distortion characteristics of a lens tend to be stable over time, so if both levels in a particular alignment are exposed on the same stepper, the relative distortion error between the two levels will be close to zero. This strategy is called dedication. It is a remarkably unpopular strategy in large-scale semiconductor manufacturing. A large amount of bookkeeping is required to ensure that each wafer lot is returned to its dedicated stepper for every level of lithography. Scheduling problems are likely to occur, in which some steppers are forced to stand idle while others have a large backlog of work. Worst of all, the failure of a single stepper can completely halt production on a large number of wafer lots.

Lithographic lenses made by a single manufacturer to the same design tend to have similar distortion characteristics. This means that two masking levels exposed on steppers with the same lens design will usually have lower overlay errors than the absolute lens distortion values would predict. It is quite likely that a semiconductor fabricator will dedicate production of a particular product to a set of steppers of one model. As well as providing the best overlay (short of single-stepper lot dedication), a set of identical steppers provides benefits for operator training, maintenance, and spare parts inventory. For an additional cost, stepper manufacturers will often guarantee distortion matching within a set of steppers to much tighter tolerances than the absolute distortion tolerance of that model.

Overlay between steppers of different models or from different manufacturers is likely to give the worst distortion matching. This strategy is often called mix-and-match. There may be several reasons for adopting a mix-and-match strategy. A small number of very expensive steppers may be purchased to run a level with particularly tight dimensional tolerances, but a cheaper stepper model may be used for all the other levels. Often, a semiconductor fabricator will have a number of older steppers that are adequate for the less critical levels of lithography. These will have to achieve a reasonable overlay tolerance with newer steppers that expose the more difficult levels. In recent years, improvements in lens design and fabrication have greatly reduced the total distortion levels in lithographic lenses. As this improvement continues, the characteristic distortion signatures of different lens designs have become less pronounced.
Today, there are several examples of successful mix-and-match alignment strategies in commercial semiconductor manufacturing. The distortion characteristics of step-and-scan systems are different from those of static exposure lenses. Because the exposure field is scanned across the wafer, every point in the printed image represents the average of the distortion along the direction of the scan. This averaging effect somewhat reduces the distortion of the scanned exposure. (Scanning an exposure field with a small amount of distortion also causes the image to move slightly during the scan. This induces a slight blurring of the image. If the magnitude of the
distortion vectors is small compared to the minimum image size, then the blurring effect is negligible.) The distortion plot of a scanned exposure may have a variety of different displacement vectors across the long axis of the scanning slit. However, in the scanned direction, the displacement vectors in every row will be nearly identical. In contrast, the distortion plot of a conventional static stepper field typically has displacement vectors that are oriented along radii of the field, and the distortion plot tends to have a rotational symmetry about its center.

There is a potential solution to the problem of distortion matching among a set of steppers. If the masks used on these steppers are made with their patterns appropriately placed to cancel the measured distortion signature of each stepper, then nearly perfect distortion matching can be achieved. As with most utopian ideas, this one has many practical difficulties. There are costs and logistical difficulties in dedicating masks to particular steppers just as there are difficulties in dedicating wafer lots to a particular stepper. The software infrastructure required to merge stepper distortion data with mask design data has not been developed. Masks are designed on a discrete grid, and distortion corrections can be made only when they exceed the grid spacing. The discontinuity that occurs where the pattern is displaced by one grid space can cause problems in the mask inspection and even in the circuit performance at that location. The possibility remains that mask corrections for lithographic lens distortion may be used in the future, but it will probably not happen as long as lens distortion continues to improve at the present rate.

1.8.7 Off-Axis Alignment

The sensors used to detect alignment mark positions have always used some form of optical position detection. The simplest technique is to have one or more microscope objectives mounted close to the lithographic projection lens and focused on the wafer's surface when it is mounted on the vacuum chuck. The image of the alignment mark is captured by a television camera or some other form of image scanner; the position of the mark is determined by either a human operator or an automated image detection mechanism. Operator-assisted alignments are almost totally obsolete in modern lithographic exposure equipment, and automated alignment systems have become very sophisticated. Alignment by use of external microscope objectives is called off-axis alignment. The alternative to off-axis alignment is called through-the-lens (TTL) alignment. As the name implies, this technique captures the image of a wafer alignment mark directly through the lithographic projection lens.

With off-axis alignment, every wafer alignment mark is stepped to the position of the detection microscope. Its x and y positions are recorded in absolute stage coordinates. After the mapping pass is complete, a calculation is done to derive the systematic alignment error terms. The position of the wafer relative to the alignment microscope is now known extremely accurately. An x and y offset must be added to the position of the wafer in order to translate the data from the alignment microscope's location to that of the center of the mask's projected image. This offset vector is usually called the baseline. Any error in the value of the baseline will lead directly to an overlay error in the wafer exposure. A number of factors can affect the stability of the baseline.
Temperature changes in the stepper environment can cause serious baseline drifts because of thermal expansion of the stepper body. Every time a mask is removed and replaced, the accuracy with which the mask is returned to its original position directly affects the baseline. The mask is aligned to its mounting fixture in the stepper by use of specialized mask alignment marks that are part of the chromium pattern generated by the original mask data. Any pattern placement error affecting these marks during the mask-making process also adds a term to the baseline error.
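The way the mapping result and the baseline combine can be illustrated with a few lines of arithmetic; the numbers below (die positions, wafer rotation, baseline vector) are invented for illustration, and the wafer model is deliberately reduced to a translation plus rotation.

    import numpy as np

    # Nominal die centers in wafer coordinates (mm).
    die_centers = np.array([[0.0, 0.0], [22.0, 0.0], [0.0, 22.0]])

    # Wafer placement found by the mapping pass: translation (mm) and rotation (rad).
    t = np.array([0.0123, -0.0087])
    theta = 25e-6
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    # Baseline: offset from the off-axis microscope to the projected mask center (mm),
    # plus a hypothetical 20 nm error in the value assumed by the stepper.
    baseline = np.array([80.0, 5.0])
    baseline_error = np.array([20e-6, 0.0])

    targets = die_centers @ R.T + t + baseline                   # intended stage positions
    actual = die_centers @ R.T + t + baseline + baseline_error   # positions actually used

    print((actual - targets) * 1e6)   # overlay error in nm: every site inherits the 20 nm

This is why baseline stability matters so much: the error is common to every exposure on the wafer and is not reduced by any of the per-wafer alignment corrections.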
In older generations of steppers, baseline drift was a significant source of overlay error. The baseline was corrected at frequent service intervals by a painstaking process of exposing lithographic test wafers and analyzing their overlay errors. In between corrections, the baseline—and the overlay on all the wafers exposed on the system—drifted. Baseline stability in modern steppers has greatly improved as a result of improved temperature control, better mounting technology for the alignment microscopes, and relocation of the microscopes closer to the projection lens, which physically shortens the baseline.

Another key improvement has been a technology for rapid, automated measurement of the baseline. A specialized, miniature alignment mark detector can be built into the stepper's wafer stage. (In some steppers, the arrangement described here is reversed. Instead of using a detector on the stage, an illuminated alignment mark is mounted on the stage. The image of this mark is projected backward through the lithographic projection lens onto the surface of the mask. The alignment between the mask and the projection alignment mark is measured by a detector above the mask.) This detector is designed to measure the position of the projected image of an alignment mark built into the mask pattern on every mask used on that stepper. Permanently etched into the faceplate of the detector on the stage is a normal wafer alignment mark. After the image of the mask alignment mark has been measured, the stage moves the detector to the off-axis alignment position. There, the etched wafer alignment mark on the detector faceplate is measured by the normal alignment microscope. The difference between these two measured positions plus the small separation between the detector and the etched alignment mark equals the baseline. The accuracy of the baseline is now determined by the accuracy of the stage-positioning interferometers and the stability of the few-millimeter spacing between the detector and the etched alignment mark. The detector faceplate can be made of a material with a low thermal expansion coefficient, such as fused silica, to further reduce any remaining baseline drift. This automated baseline measurement can be made as often as necessary to keep the baseline drift within the desired tolerances. In high-volume manufacturing, baseline drift can be kept to a minimum by the techniques of statistical process control. By feeding back corrections from the constant stream of overlay measurements on product wafers, the need for periodic recalibration of the baseline can be completely eliminated.

1.8.8 Through-the-Lens Alignment

Through-the-lens alignment avoids the problem of baseline stability by directly comparing an alignment mark on the mask to the image of a wafer alignment mark projected through the lithographic projection lens. There are a number of techniques for doing this. One typical method is to illuminate the wafer alignment mark through a matching transparent window in the mask. The projected image is reflected back through the window in the mask, and its intensity is measured by a simple light detector. The wafer is scanned so that the image of the wafer alignment mark passes across the window in the mask, and the position of maximum signal strength is recorded. The wafer scan is done in both the x and y directions to determine both coordinates of the wafer alignment mark. Although TTL alignment is a very direct and accurate technique, it also suffers from a few problems.
The numerical aperture of the detection optics is limited to that of the lithographic projection lens even though a higher NA might be desirable to increase the resolution of the alignment mark’s image. Because the wafer alignment mark must be projected through the lithographic lens, it should be illuminated with the lens’s designed wavelength. However, this wavelength will expose the photoresist over each alignment
mark that is measured. This is often quite undesirable because it precludes the mask designer from making a choice about whether or not to expose the alignment mark in order to protect it from the next level of processing. The alignment wavelength can be shifted to a longer wavelength to protect the resist from exposure, but the projection lens will have to be modified in order to accept the different wavelength. This is often done by inserting very small auxiliary lenses in the light path that is used for the TTL alignment. These lenses correct the focal length of the lens for the alignment wavelength but interfere with only a small region at the edge of the lens field that is reserved for use by the mask alignment mark. Whether the exposure wavelength or a longer wavelength is used, the chromatic aberration of the lithographic lens forces the use of monochromatic light for the TTL alignment. Helium–neon or argon-ion lasers are often used as light sources for TTL alignment. Monochromatic light is not ideal for detecting wafer alignment marks. Because these marks are usually made of one or more layers of thin films, they exhibit a strong swing curve when illuminated by monochromatic light. For some particular film stacks, the optical contrast of the alignment mark may almost vanish at the alignment wavelength. This problem is not as likely to occur with broadband (white light) illumination that is usually used in off-axis alignment systems.

In general, off-axis alignment offers more flexibility in the design of the detector. Because it is decoupled from the projection optics, there is a free choice of numerical apertures and alignment wavelengths. There is no interference with the optical or mechanical design of the projection lens as there usually is with TTL alignment detectors. Off-axis alignment may require an additional amount of travel in the wafer stage in order that all parts of the wafer can be viewed by the alignment mark detector. The most serious difficulty with off-axis alignment is the baseline stability. If the baseline requires frequent recalibration, then the availability and productivity of the stepper will suffer.

1.8.9 Alignment Mark Design

The alignment mark design is usually specified by the stepper manufacturer rather than being left up to the imagination of the mask designer. The alignment mark detector is optimized for best performance with one particular mark design. At minimum, the mark must have structures in two orthogonal directions so that its x and y position can be measured. A simple cross-shaped mark has been successfully used in the past. There are benefits gained by measuring multiple structures within the alignment mark. Today, many alignment marks are shaped like gratings with several horizontal and vertical bars. This allows the measurement error to be reduced by averaging the position error from the measurement of each bar. It also reduces the effect of tiny edge placement errors that may have occurred in the manufacture of the mask that was used to print the mark on the wafer and the effects of edge roughness in the etched image of the mark. The size of the alignment mark involves a tradeoff between signal strength and the availability of space in the chip design. Alignment marks have been used with a variety of sizes, from less than 50 µm to more than 150 µm on a side. The space allowed for an alignment mark also depends on the prealignment accuracy of the wafer and the capture area of the alignment mark detector.
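The averaging benefit of a grating-style mark is easy to quantify; the pitch, bar count, and noise level in the sketch below are assumed values. With independent noise on each bar measurement, the uncertainty of the averaged mark position falls roughly as the square root of the number of bars.

    import numpy as np

    rng = np.random.default_rng(1)
    pitch_um, n_bars = 8.0, 8
    bar_offsets = (np.arange(n_bars) - (n_bars - 1) / 2) * pitch_um   # known bar layout

    # Each bar position is measured with ~20 nm of independent noise.
    measured = bar_offsets + rng.normal(scale=0.020, size=n_bars)

    mark_center = np.mean(measured - bar_offsets)    # average out the per-bar noise
    print("estimated mark center: %+.4f um" % mark_center)
    print("expected noise: ~%.0f nm single bar, ~%.0f nm averaged"
          % (20, 20 / np.sqrt(n_bars)))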
If the alignment marks are placed too close to other structures on the mask, the detector may not be able to reliably find the alignment mark. In order to reduce the requirement for a dead band around the alignment mark, some stepper manufacturers use a two-step alignment procedure. A crude two-point global alignment is made using large marks that are printed in only two places on the wafer. This brings the wafer position well within the capture range of the small fine-alignment targets within each exposure field.
1.8.10 Alignment Mark Detection

The technology for alignment mark detection has advanced steadily since the beginning of microlithography. Normal microscope objectives with bright-field illumination were originally used. Dark-field illumination has the advantage that only the edges of the alignment marks are detected. This often provides a cleaner signal that can be more easily analyzed. In standard dark-field illumination, light is projected onto the wafer surface at grazing incidence, and the scattered light is captured by a normal microscope objective. A technique sometimes called reverse dark-field detection is often used. In this arrangement, light is projected through the central portion of the microscope objective onto the wafer surface. The directly reflected light is blocked by the illuminator assembly, but light scattered from the edges of the alignment mark is captured by the outer portions of the microscope objective. This provides a compact dark-field microscope. Because of the blocked central region, the microscope uses an annular pupil, forming an image with good contrast in the edges of features. Some types of process films, especially grainy metal films, do not give good alignment signals with dark-field imaging. Because of this, many steppers provide both bright-field and dark-field alignment capability. Bright-field, standard dark-field, and reverse dark-field detection can be used for either off-axis or TTL alignment. A great amount of sophisticated signal analysis is often used to reduce the raw output of the alignment microscope to an accurate alignment mark position. All of the available information in the signal is used to reduce susceptibility to detection noise and process-induced variability in the appearance of the mark on the wafer.
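Much of that signal analysis amounts to using the whole waveform rather than individual edge crossings. The sketch below is a generic correlation-based detector (synthetic signal, assumed Gaussian mark profile), locating the mark by cross-correlating the detected intensity trace with a template and interpolating the correlation peak to sub-pixel precision.

    import numpy as np

    rng = np.random.default_rng(2)
    x = np.arange(512, dtype=float)                     # detector pixels
    true_pos = 261.4                                    # assumed mark position (pixels)
    signal = np.exp(-0.5 * ((x - true_pos) / 6.0) ** 2) + rng.normal(scale=0.05, size=x.size)

    # Symmetric template of the expected mark profile (assumed width).
    template = np.exp(-0.5 * (np.arange(-40, 41) / 6.0) ** 2)

    corr = np.correlate(signal - signal.mean(), template - template.mean(), mode="same")
    k = int(np.argmax(corr))

    # Parabolic interpolation of the correlation peak for sub-pixel resolution.
    y0, y1, y2 = corr[k - 1], corr[k], corr[k + 1]
    estimate = k + 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
    print("estimated mark position: %.2f pixels (true %.2f)" % (estimate, true_pos))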
1.9 Mechanical Considerations

1.9.1 The Laser Heterodyne Interferometer

In addition to the optical perfection of a lithographic exposure system, there is an almost miraculous mechanical accuracy. The overlay tolerance needed to produce a modern integrated circuit can be less than 40 nm. This requirement must be met by mechanically holding a wafer at the correct position, within the overlay tolerances, during the lithographic exposure. There are no clever optical or electronic ways of steering the aerial image the last few microns into its final alignment. The entire 200 mm wafer must physically be in the right place. This is the equivalent of bringing a 50 km iceberg to dock with an accuracy of 1 cm.

The technology that enables this remarkable accuracy is the laser heterodyne interferometer. An extremely stable helium–neon laser is operated in a carefully controlled magnetic field. Under these conditions, the spectrum of the laser beam is split into two components with slightly different wavelengths. This effect is called Zeeman splitting. Each of the two Zeeman components has a different polarization. This allows them to be separated and sent along different optical paths. One beam is reflected from a mirror mounted on the wafer stage. The other beam is reflected from a stationary reference surface near the moving stage. Preferably, this reference surface should be rigidly attached to the lithographic lens. After the two beams are reflected, their planes of polarization are rotated to coincide with each other, and the two beams are allowed to interfere with each other on the surface of a simple power sensor. Because the two beams have different wavelengths and, therefore, different optical frequencies, a beat frequency will be generated. This beat frequency is just the difference between the optical frequencies of the two
Zeeman components of the laser beam. It is on the order of a few MHz. When the stage begins moving, the frequency of the signal changes by 2v/λ, where v is the stage velocity and λ is 632.8 nm (the helium–neon laser wavelength). This change in frequency is caused by the Doppler shift in the beam reflected from the moving stage. A stage velocity of 1 m/s will cause a frequency shift of 3.16 MHz, and a velocity of 1 mm/s will cause a frequency shift of 3.16 kHz. This gives an accurate relationship between stage velocity and frequency; however, what is actually wanted is a measurement of stage position. This is accomplished by comparing the frequency of the stage interferometer signal with the beat frequency of a pair of beams directly out of the laser. The two beat frequencies are monitored by a sensitive phase comparator. If the stage is stationary, the phase of the two signals will remain locked together. If the stage begins to move, the phase of the stage interferometer signal will begin to drift relative to the signal directly from the laser. When the phase has drifted one full cycle (2π), the stage will have moved a distance of λ/2 (316.4 nm). The phase comparator can keep track of phase differences from a fraction of a cycle to several million cycles, corresponding to distance scales from a fraction of a nanometer to several meters. The best phase comparators can detect phase changes of 1/1024 cycle, corresponding to a positional resolution of 0.31 nm. The maximum velocity that can be accommodated by a heterodyne interferometer is limited. If the stage moves so fast that the Doppler shift drives the detected beat frequency to zero, then the phase tracking information will be lost. For practical purposes, Zeeman splitting frequencies are limited to about 4 MHz, imposing a limit of 1.27 m/s on the stage velocity. One helium–neon interferometer laser supplies all the metrology needs of the wafer stage (Figure 1.21). After the beam is split into the two Zeeman components, each component is further split into as many beams as are needed to monitor the stage properly. Each of the final pairs of beams forms its own independent interferometer.
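The numbers quoted in this paragraph follow directly from the helium–neon wavelength and the Zeeman split, as the short sketch below reproduces (values taken from the text; nothing here is new information).

    wavelength = 632.8e-9        # helium-neon laser wavelength (m)

    def doppler_shift_hz(stage_velocity):
        # Single-bounce heterodyne interferometer: beat-frequency change = 2*v/lambda.
        return 2.0 * stage_velocity / wavelength

    print(doppler_shift_hz(1.0))        # ~3.16e6 Hz at 1 m/s
    print(doppler_shift_hz(1.0e-3))     # ~3.16e3 Hz at 1 mm/s

    # One full cycle (2*pi) of phase difference corresponds to lambda/2 of stage travel.
    distance_per_cycle = wavelength / 2.0          # 316.4 nm
    resolution = distance_per_cycle / 1024.0       # ~0.31 nm with a 1/1024-cycle comparator

    # The Doppler shift must never cancel the split frequency, which caps the velocity.
    zeeman_split = 4.0e6
    max_velocity = zeeman_split * wavelength / 2.0        # ~1.27 m/s, single bounce
    aom_max_velocity = 20.0e6 * wavelength / 4.0          # ~3.2 m/s, 20 MHz AOM, double bounce
    print(distance_per_cycle, resolution, max_velocity, aom_max_velocity)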
FIGURE 1.21 A simplified drawing of a stepper’s stage interferometer. A polarizing beam splitter sends one component of the laser beam to the stage along path B while a second component at slightly different wavelength travels along a reference path A to the lens mounting assembly. The retroreflected beams from the stage and the lens assembly are recombined in the beam splitter and collected by a detector C. Analysis of the beat frequency in the combined beam allows the stage position to be tracked to high accuracy. The second axis of stage motion is tracked by another interferometer assembly. Some components that control the beam polarization within the interferometer have been omitted for simplicity.
The minimum stage control requires one interferometer on both the x and y axes. The laser beams are aimed so that they would intersect at the center of the lithographic projection lens's exposure field. Linearity of the stage travel and orthogonality of the axes are guaranteed by the flatness and orthogonality of two mirrors mounted on the stage. To ensure that the axis orthogonality cannot be lost by one of the mirrors slipping in its mount, the two mirrors are usually ground from a single piece of glass. With two-interferometer control, the yaw, pitch, and roll accuracies of the stage are guaranteed only by the mechanical tolerances of the ways. Pitch and roll errors are relatively insignificant because they result in overlay errors proportional to the cosine of the angular errors (Figure 1.22). However, yaw (rotation in the plane of the wafer surface) can be a serious problem because it has the same effect on overlay as a field rotation error. Three-interferometer stage control has been used for many years. With this strategy, an additional interferometer monitors the mirror on one axis to eliminate yaw errors. With continued reduction of overlay tolerances, sources of placement error that could previously be ignored are now being brought under interferometer control. Pitch and roll of the stage and even the z (focus) position are now commonly monitored by additional interferometer beams.

The resolution of interferometer position detection can be doubled by using a double-bounce strategy. The laser beam that is used to monitor the stage position is reflected back and forth twice between the stage mirror and a fixed mirror on the stepper frame before the beam is sent to the phase detector. This doubles the sensitivity to any stage motion, but it also reduces the maximum allowable stage velocity by a factor of two. Any reduction in stage velocity is undesirable because it reduces wafer throughput. The solution has been to use a different technology to induce the frequency splitting. A device called an acousto-optical modulator (AOM) can be used to produce frequency splitting of 20 MHz, allowing stage velocities up to about 3 m/s even when a double-bounce configuration is used.

1.9.2 Atmospheric Effects

Although interferometer control is extremely accurate, there are some factors that affect its accuracy and must be carefully regulated. Because the length scale of interferometry is the laser wavelength, anything that affects the laser wavelength will cause a corresponding change in the length scale of the stage positioning. The laser's optical frequency is extremely well controlled, but changes in the wavelength can be induced by any changes in the index of refraction of the air. Barometric pressure and air temperature affect the refractive index of air, but slow changes in these variables can be monitored and corrected.
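The size of the effect is easy to estimate. The refractivity of air at the helium–neon wavelength is roughly n − 1 ≈ 2.7 × 10⁻⁴ at standard conditions and, to first order, scales with barometric pressure; the figures below are approximate and intended only to show the order of magnitude.

    n_minus_1 = 2.7e-4          # approximate refractivity of air at 633 nm, standard conditions
    pressure = 101325.0         # Pa
    path = 0.3                  # one-way interferometer path length (m), illustrative

    # A 1 hPa (100 Pa) barometric change scales the refractivity proportionally.
    delta_n = n_minus_1 * (100.0 / pressure)
    apparent_shift = delta_n * path
    print("apparent stage motion: about %.0f nm per hPa over a 0.3 m path"
          % (apparent_shift * 1e9))

An error of this size (tens of nanometers) is comparable to the whole overlay budget, which is why pressure and temperature are tracked and why turbulence in the beam path is so damaging.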
FIGURE 1.22 Three axes of translational motion and three axes of rotation on a wafer stage. X, Y, and θZ must be controlled to high precision to ensure the accuracy of overlay. The Z, θX, and θY axes affect focus and field tilt and can be controlled to somewhat lower precision.
In fact, slow drifts in the interferometry are not very important because the stage interferometers are used both for measuring the position of the wafer alignment marks and for determining the position of the exposure. Errors in the length scale tend to cancel (except where they affect the baseline); however, rapid changes in the air index can cause serious problems for the stage accuracy. Air turbulence in the interferometer paths can cause just this sort of problem. A fairly large effort has been made by stepper manufacturers to enclose or shield the light paths of the stage interferometers from stray air flows. Heat sources near the interferometers such as stage drive motors have been relocated or water cooled. More work remains to be done in this area, especially considering the inexorable tightening of overlay tolerances and the ongoing change in wafer sizes from 200 to 300 mm, which will increase the optical path length of the interferometers.

1.9.3 Wafer Stage Design

Wafer stage design varies considerably from manufacturer to manufacturer. All the designs have laser interferometers tied into a feedback loop to ensure the accurate position of the stage during exposure. Many designs use a low-mass, high-precision stage with a travel of only a few millimeters. This stage carries the wafer chuck and the interferometer mirrors. It sits on top of a coarse positioning stage with long travel. The coarse stage may be driven by fairly conventional stepper motors and lead screws. The high-precision stage is usually driven directly by magnetic fields. When the high-precision stage is driven to a target position, the coarse stage acts as a slave, following the motion of the fine stage to keep it within its allowable range of travel. The control system must be carefully tuned to allow both stages to rapidly move to a new position and settle within the alignment tolerances, without overshoot or oscillation, and within a time of less than one second. Coarse stages have been designed with roller bearings, air bearings, and sliding plastic bearings. High-precision stages have used flexure suspension or magnetic field suspension. Not all advanced wafer stage designs follow the coarse stage and fine stage design. Some of the most accurate steppers use a single massive stage, driven directly by magnetic fields.

Requirements on the stages of step-and-scan exposure equipment are even more severe than for static exposure steppers. When the stage on a static exposure stepper gets slightly out of adjustment, the only effect may be a slight increase in settling time before the stepper is ready to make an exposure. The stage on a step-and-scan system must be within its position tolerance continuously throughout the scan. It must also run synchronously with a moving mask stage. These systems have been successfully built, generally with the same stage design principles as conventional steppers.

Mounted on the stage are the interferometer mirrors and the wafer chuck. The chuck is supported by a mechanism that can rotate the chuck to correct wafer rotation error from the prealigner. There may also be a vertical axis motion for wafer focus adjustment and even tilt adjustments along two axes to perform wafer leveling. All of these motions must be made without introducing any transverse displacements in the wafer position because the interferometer mirrors do not follow these fine adjustments. Often, flexure suspension is used for the tilt and rotation adjustments.
Because of the difficulty of performing these many motions without introducing any translation errors, there is a tendency to move these functions to a position between the coarse and fine stages so that the interferometer mirrors will pick up any translations that occur. At the extreme, monolithic structures have been designed, consisting of the interferometer mirrors and a wafer chuck all ground from a single piece of low thermal-expansion ceramic. This structure can be moved through six axes of motion (three translation and three rotation). Up to six laser interferometers may be used to track all of these motions.
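The master–slave relationship between the fine and coarse stages can be caricatured in a few lines; the gains, travel limit, and iteration count below are invented and are not meant to represent any particular machine.

    # The fine stage servos on the interferometer error; the coarse stage slowly follows
    # so that the fine stage stays near the middle of its few-millimeter travel range.
    target = 55.0        # mm, commanded absolute position
    coarse, fine = 50.0, 0.0
    fine_travel = 2.0    # mm, allowed fine-stage excursion (+/-)

    for _ in range(300):
        error = target - (coarse + fine)          # error seen by the interferometer
        fine += 0.2 * error                       # fast fine-stage correction
        fine = max(-fine_travel, min(fine_travel, fine))
        coarse += 0.05 * fine                     # slow coarse stage re-centers the fine stage

    print("coarse %.3f mm, fine %+.4f mm, total %.4f mm" % (coarse, fine, coarse + fine))

Real stage controllers are of course continuous, multi-axis, and carefully tuned for settling time, but the division of labor is the same: the short-travel stage removes the residual error while the long-travel stage absorbs the large motion.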
1.9.4 The Wafer Chuck

The wafer chuck has a difficult job to perform. It must hold a wafer flat to extremely tight tolerances (less than 100 nm) all the way across the wafer's surface. At this scale of tolerances, micron-sized particles of dirt or thin residues of resist on the back side of the wafer will result in a completely unacceptable wafer surface flatness. Particle contamination and wafer backside residues are minimized by strict control of the resist application process and by particle filtration of the air supply in the stepper enclosure. The chuck is made as resistant as possible to any remaining particle contamination by using a low contact-area design. This design, sometimes called a bed of nails, consists of a regular array of rather small studs whose tips are ground and polished so that they are coplanar with the other studs in the array to the accuracy of an optical flat. The space between the studs is used as a vacuum channel to pull the wafer against the chuck's surface. The space also provides a region where a stray particle on the back side of the wafer may exist without lifting the wafer surface. The actual fraction of the wafer area in contact with the chuck may be as low as 5%, providing considerable immunity to backside particle contamination. The symmetry of the stud array must be broken at the edge of the wafer where a thin solid rim forms a vacuum seal. This region of discontinuity frequently causes problems in maintaining flatness all the way to the edge of the wafer.

At the end of the lithographic exposure, the chuck must release the wafer and allow a wafer handler to remove it. This causes another set of problems. It is almost impossible for a vacuum handler to pick up a wafer by the front surface without leaving an unacceptable level of particle contamination behind. Although front-surface handlers were once in common use, it is rare to find them today. The only other alternative is somehow to get a vacuum handler onto the back surface of the wafer. Sometimes, a section of the chuck is cut away near the edge to give the handler access to the wafer back side. However, the quality of lithography over the cutout section invariably suffers. The unsupported part of the wafer curls and ripples unpredictably, depending on the stresses in that part of the wafer. Another solution has been to lift the wafer away from the chuck with pins inserted through the back of the chuck. This allows good access for a backside wafer handler, and it is generally a good solution. The necessity for a vacuum seal around the lifter pin locations and the disruption of the chuck pattern there may cause local problems in wafer flatness.

1.9.5 Automatic Focus Systems

Silicon wafers, especially after deposition of a few process films or after hot processing, tend to become somewhat curled or bowed (on the scale of several microns) when they are not held on a vacuum chuck. The vacuum chuck does a good job of flattening the wafer, but there are still some surface irregularities at the sub-micron scale. The wafer may also have some degree of wedge between the front and back surface. Because of these irregularities, a surface-referencing sensor is used to detect the position and levelness of the top wafer surface so that it can be brought into the proper focal plane for exposure. A variety of surface sensors have been used in the focus mechanisms of steppers. One common type is a grazing-incidence optical sensor.
With this technique, a beam of light is focused onto the surface of the wafer at a shallow, grazing angle (less than 5° from the plane of the surface). The reflected light is collected by a lens system and focused onto a position detector. The wavelength chosen for this surface measurement must be much longer than the lithographic exposure wavelength so that the focus mechanism does not expose the photoresist. Frequently, near-infrared laser diodes are used for this application. The shallow angle of reflection is intended to give the maximum geometrical sensitivity to
the wafer's surface position and also to enhance the signal from the top of the resist (Figure 1.23). If a more vertical angle of incidence were used, there would be a danger that the sensor might look through the resist and any transparent films beneath the resist, finally detecting a reflective metal or silicon layer deep below the surface. The great advantage of the grazing-angle optical sensor is its ability to detect the wafer surface at the actual location where the exposure is going to take place without blocking or otherwise interfering with the exposure light path.

A second surface-sensing technique that has been successfully used is the air gauge. A small-diameter air tube is placed in close proximity to the wafer surface so that the wafer surface blocks the end of the tube. The rate at which air escapes from the tube is strongly dependent on the gap between the wafer and the end of the tube. By monitoring this flow rate, the position of the wafer surface can be very accurately determined. Air gauges are compact, simple, and reliable. They are sensitive only to the physical surface of the resist, and they are never fooled by multiple reflections in a complex film stack. However, they are not completely ideal. They can be placed close to the exposure field, but they cannot intrude into it without blocking some of the image. This leaves two choices. The exposure field can be surrounded with several air gauges, and their average can be taken as the best guess of the surface position within the exposure field; or the air gauge can take a measurement at the actual exposure site on the wafer before it is moved into the exposure position. This second option is not desirable because it adds a time-consuming extra stage movement to every exposure. Because the air gauges are outside the exposure field, they may fall off the edge of the wafer if a site near the edge of the wafer is exposed. This may require some complexity in the sensing and positioning software to ignore the signals from air gauges that are off the wafer. A surprisingly large amount of air is emitted by air gauges. This can cause turbulence in the critical area where the lithographic image is formed unless the air gauges are shut off during every exposure. Another problem is particle contamination of the wafer surface that is carried in by air flow from the gauges.

The third commonly used form of surface detector is the capacitance gauge. A small, flat electrode is mounted near the lithographic exposure lens close to the wafer surface.
FIGURE 1.23 A grazing-incidence optical sensor used as a stepper autofocus mechanism. Light from a source on the left is focused at a point where the wafer surface must be located for best performance of the lithographic projection lens (not shown). If the wafer surface is in the correct position as in (a), the spot of light is reflected and refocused on a detector on the right. If the wafer surface is too high or low as in (b), the reflected light is not centered on the detector and an error signal is generated. (c) The optics do not generate an error if the wafer surface is tilted.
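The geometric leverage of the shallow angle can be checked with a two-line calculation. For a simple specular reflection at a grazing angle θ measured from the surface plane, raising the surface by Δz displaces the reflected beam laterally by about 2Δz·cos θ, so the displacement approaches 2Δz as the angle becomes shallow; the 3° value below is an assumed example.

    import math

    def spot_shift(delta_z_um, grazing_angle_deg):
        # Lateral displacement of the reflected beam for a surface height change delta_z,
        # assuming simple specular reflection at a grazing angle from the surface plane.
        return 2.0 * delta_z_um * math.cos(math.radians(grazing_angle_deg))

    print(spot_shift(0.1, 3.0))     # 0.1 um height change at 3 deg  -> ~0.20 um spot shift
    print(spot_shift(0.1, 45.0))    # same height change at 45 deg   -> ~0.14 um spot shift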
This electrode and the conductive silicon wafer form a capacitor. The capacitance is a function of the electrode geometry and of the gap between the electrode and the wafer. When the wafer is in position under the capacitance gauge, the capacitance is measured electronically, and the gap spacing is accurately determined. A capacitance gauge is somewhat larger than an air gauge, with the electrode typically being a few millimeters in diameter. It has similar problems to the air gauge in its inability to intrude into the exposure field and its likelihood to fall off the edge of the wafer for some exposure sites. Capacitance gauges do not induce any air turbulence or particle contamination. The capacitance gauge actually measures the distance to the uppermost conductive film on the wafer, not to the physical surface of the resist. At first glance, this would seem like a serious deficiency, but the insulating films (including the photoresist) on the surface of the wafer have very high dielectric constants relative to air. This means that the air gap between the capacitance gauge and the wafer is weighted much more heavily than the dielectric layers in the capacitance measurement. Although there is still a theoretical concern about the meaning of capacitance gauge measurements when thick insulating films are present, in practice, capacitance gauges have given very stable focus settings over a large range of different film stacks.

1.9.6 Automatic Leveling Systems

Because of the large size of the exposure field in modern steppers (up to 30 mm field diameter) and the very shallow depth of focus for high-resolution imaging, it is usually not sufficient to determine the focus position at only one point in the exposure field. Many years ago, when depths of focus were large and exposure fields were small, the mechanical levelness of the chuck was the only guarantee that the wafer surface was not tilted relative to the lithographic image. In more recent years, leveling of the wafer has been corrected by a global leveling technique. The wafer surface detector is stepped to three locations across the wafer, and the surface heights are recorded. Corrections to two axes of tilt are made to bring the three measured points to the same height. Any surface irregularities of the wafer are ignored. This level of correction is no longer sufficient. Site-by-site leveling has practically become a requirement. With detectors outside the exposure field (i.e., air gauges or capacitance gauges), three or four detectors can be placed around the periphery of the field. The field tilt can be calculated for these external positions and assumed with a good degree of confidence to be the same as the tilt within the field. Optical sensors can be designed to measure several discrete points within the exposure field and calculate tilt in the same way. A different optical technique has also been used. If a collimated beam of infrared light is reflected from the wafer surface within the exposure field, it can be collected by a lens and focused to a point. The position of this point is not sensitive to the vertical displacement of the wafer surface, but it is sensitive to tip and tilt of the surface [33]. If a quadrant detector monitors the position of the spot of light, its output can be used to level the wafer within the exposure field (Figure 1.24).
This measurement automatically averages the tilt over all of the field that is illuminated by the collimated beam rather than sampling the tilt at three or four discrete points. If desired, the entire exposure field can be illuminated and used for the tilt measurement. With global wafer leveling, the leveling could be done before the alignment mapping, and any translations introduced by the leveling mechanism would not matter. With site-by-site leveling, any translation arising from the leveling process will show up in overlay error. The leveling mechanism has to be designed with this in mind. Fortunately, leveling usually involves very small angular corrections in the wafer tilt.
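Deriving the field tilt from a handful of height readings is a small least-squares problem. The sensor positions and heights below are invented for illustration; the fit returns the focus offset (piston) and the two tilts that the stage must correct.

    import numpy as np

    # Illustrative sensor positions around the exposure field (mm) and measured heights (um).
    xy = np.array([[-13.0, 0.0], [13.0, 0.0], [0.0, -16.0], [0.0, 16.0]])
    z = np.array([0.12, 0.05, 0.10, 0.07])

    # Fit z = z0 + a*x + b*y: z0 is the focus offset, a and b are the field tilts (um/mm).
    G = np.column_stack([np.ones(len(xy)), xy[:, 0], xy[:, 1]])
    (z0, a, b), *_ = np.linalg.lstsq(G, z, rcond=None)
    print("focus offset %.3f um, tilt-x %.2f urad, tilt-y %.2f urad" % (z0, a * 1e3, b * 1e3))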
FIGURE 1.24 An optical sensor used for wafer leveling. This mechanism is very analogous to the focus sensor in Figure 1.23. A collimated beam of light from the source on the left is accurately refocused on a detector on the right if the wafer is level as in (a). Vertical displacements of the wafer surface, as in (b), do not affect the detector. However, tilts of the wafer surface generate an error signal as shown in (c).
The ultimate answer to the problem of leveling without inducing translation errors is 5- or 6-axis interferometry that can cleanly separate the pitch and roll motions used to level the stage from x and y translational motions.

Step-and-scan systems face a slightly different local leveling issue. Because the scanning field moves continuously during exposure, it has the capability of following the irregularities of the wafer surface. This capability is often called terrain following. The scanning field does not actually tilt to match the surface in the direction of scanning, but the field is so short in that direction (about 5 mm) that the amount of defocus at the leading and trailing edges is small. If two focus sensors are used, one at each end of the long axis of the scanning field, then there is potential to adjust the roll axis continually during scanning to keep both ends of the scanning field in focus. This results in a very accurate 2-axis terrain-following capability. The terrain-following ability of step-and-scan systems allows them to match focus to an irregular wafer surface in much more detail than can be done with the large, planar exposure field of a traditional stepper.

1.9.7 Wafer Prealignment

A rather mundane, but very important, piece of the lithographic exposure system is the wafer prealigner. This mechanism mechanically positions the wafer and orients its rotation to the proper angle before it is placed on the wafer chuck for exposure. The prealignment is done without reference to any lithographic patterns printed on the wafer. The process only uses the physical outline of the wafer to determine its position. Early prealignment systems were fairly crude affairs with electromechanical or pneumatic solenoids tapping the edges of the wafer into centration on a prealignment chuck. When the wafer was centered, the prealignment chuck rotated it while a photodiode searched its edge for an alignment structure, typically a flattened section of the edge, but occasionally a small V-shaped notch. Today's prealigners still use a rotating prealignment chuck, but the mechanical positioners are no longer used (banging on the edge of a wafer generates too much particle contamination). Instead, optical sensors map the edge of the wafer while it is rotating, and the centration of the wafer and rotation of the alignment flat (or notch) are calculated from the information collected by the sensor. After the wafer is rotated and translated into its final position, it must be transferred onto the wafer chuck that will hold it during exposure. The transfer arm used for this purpose must maintain the accuracy of the prealignment, and it is a fairly high-precision piece of
equipment. Of course, there is no problem with accurately positioning the wafer chuck to receive the wafer from the transfer arm because it is mounted on an interferometer stage. The hand-offs from prealignment chuck to the transfer arm and from transfer arm to the wafer chuck must be properly timed so that the vacuum clamp of the receiving mechanism is activated before the vacuum clamp of the sending mechanism is turned off. The entire transfer must be as rapid as possible because the stepper is doing no exposures during the transfer, and it is expensive to let it sit idle. The same transfer arm can also be used to unload the wafer at the end of its exposure, but the operation is simpler and faster if a second transfer arm is used for this purpose.

After the prealignment is complete, the wafer will be positioned on the wafer chuck to an accuracy of better than 50 µm. The alignment flat (or notch) will be accurately oriented to some direction relative to the top of the mask image. Perhaps surprisingly, there is no standard convention among stepper makers for the orientation of the alignment flat relative to the mask image. The flat has been variously oriented at the top, bottom, or right side of the wafer by different manufacturers at different times. For a stepper with a symmetrical exposure field, this problem is only a matter of orienting the pattern properly on the mask regardless of where the nominal top side of the mask is located. However, for exposure equipment with rectangular exposure fields such as step-and-scan systems and some steppers, there is the possibility of serious incompatibility if one manufacturer orients the long axis of the exposure field parallel to the flat and another chooses a perpendicular orientation. By now, most steppers allow any orientation of the wafer flat as specified by the operator.

1.9.8 The Wafer Transport System

A lithographic exposure system is a large piece of equipment, and within it, wafers must follow a winding path from an input station to a prealigner, transfer mechanism, wafer chuck, unload mechanism, and output station. The mechanisms that move the wafers along this path must be fast, clean, and—above all—reliable. A wafer handling system that occasionally drops wafers is disastrous. Aside from the enormous cost of a 200 mm wafer populated by microprocessor chips, the fragments from a broken wafer will contaminate the stepper enclosure with particles of silicon dust and take the stepper out of production until the mess can be cleaned up. Most steppers use at least a few vacuum handlers. These devices hold the wafer by a vacuum channel in a flat piece of metal in contact with the back surface of the wafer. A vacuum sensor is used to ensure that the wafer is clamped before it is moved to its new location. The mechanism that rotates the vacuum handler from one position to another must be designed to avoid particle generation. Some steppers use nothing but vacuum handlers to maneuver the wafer from place to place. Other steppers have used conveyor belts or air tracks to move the wafers. The conveyor belts, consisting of parallel pairs of elastic bands running on rotating guides, are quite clean and reliable. Air tracks that use jets of air to float wafers down flat track surfaces tend to release too much compressed air into the stepper environment with a corresponding risk of carrying particle contamination into the stepper. Air tracks were commonly used in steppers several years ago, but they are rare today.
The general issue of particle contamination has received great attention from stepper manufacturers for several years. Stepper components with a tendency to generate particle contamination have been redesigned. The environmental air circulation within the stepper chamber uses particle filters that are extremely efficient at cleaning the air surrounding the stepper. In normal use, a stepper will add fewer than five particles greater than 0.25 µm in diameter to a 200 mm wafer in a complete pass through the system. If particles are
released into the stepper environment (for example, by maintenance or repair activities), the air circulation system will return the air to its normal cleanliness within about ten minutes.

1.9.9 Vibration

Steppers are notoriously sensitive to vibration. This is not surprising considering the 50 nm tolerances with which the image must be aligned. The mask and wafer in a stepper are typically separated by 500–800 mm, and it is difficult to hold them in good relative alignment in the presence of vibration. Vibration generated by the stepper itself has been minimized by engineering design of the components. There was a time when several components generated enough vibration to cause serious problems. The first solution was to prevent the operation of these components (typically, wafer handlers and other moving equipment) during the exposure and stepping procedure. In order to achieve good productivity, many of the stepper components must work in parallel. For example, the total rate of production would suffer greatly if the prealigner could not begin aligning a new wafer until the previous wafer had been exposed. Step-and-scan systems have an even greater potential for system-generated vibration. Heavy stages for both the wafer and the mask move rapidly between exposures, and they scan at high velocities while the wafer is being exposed. Elaborate measures, including moving counter masses, have been employed to suppress vibrations from this source.

Externally generated vibrations are also a serious problem. Floors in a large factory tend to vibrate fairly actively. Even a semiconductor fabricator, which does not have much heavy moving equipment, can suffer from this problem. Large air-handling systems generate large amounts of vibration that can easily be transmitted along the floor. Large vacuum pumps are also common in many wafer fabricating processes, and they are serious sources of vibration. Steppers are isolated from floor vibrations as much as possible by air isolation pedestals. These passively isolate the stepper from the floor by suspending it on relatively weak springs made of compressed air. Some lithographic exposure systems have an active feedback vibration suppression system built into their supporting frames. Even with these measures, lithographic equipment requires a quiet floor.

Semiconductor manufacturers are acutely aware of the vibration requirements of their lithographic equipment. Buildings that are designed to house semiconductor fabricators have many expensive features to ensure that vibration on the manufacturing floor is minimized. Air-handling equipment is usually suspended in a "penthouse" above the manufacturing area, and it is anchored to an independent foundation. Heavy vacuum pumps are often placed in a basement area beneath the manufacturing floor. The manufacturing floor itself is frequently mounted on heavy pillars anchored to bedrock. The floor is vibrationally isolated from surrounding office and service areas of the building. Even with these precautions, there is usually an effort to find an especially quiet part of the floor to locate the lithographic exposure equipment. Manufacturers of exposure equipment often supply facility specifications that include detailed requirements on floor vibration. This is usually in the form of a vibration spectrum, showing the maximum accelerometer readings allowed at the installation site from a few Hz to a few kHz.
Horizontal and vertical components of floor vibration are both important, and they may have different specifications (Figure 1.25). Vibration causes problems when it induces relative motion between the aerial image and the surface of the wafer. If the vibration is parallel to the plane of the image and the period of oscillation is short relative to the exposure time, the image will be smeared across the surface, degrading the contrast of the latent image captured by the resist.
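The smearing can be estimated by averaging a sinusoidal aerial image over one vibration cycle (a long exposure covers many cycles when the vibration is fast compared with the exposure time). The image pitch and vibration amplitudes below are assumed values, chosen only to show the trend.

    import numpy as np

    pitch = 180.0                                     # assumed aerial-image pitch (nm)
    x = np.linspace(0.0, pitch, 400, endpoint=False)
    phases = np.linspace(0.0, 2 * np.pi, 1000, endpoint=False)

    def smeared_contrast(amplitude_nm):
        # Time-average a unit-contrast sinusoidal image displaced by a*sin(wt).
        shifts = amplitude_nm * np.sin(phases)
        image = np.mean([1 + np.cos(2 * np.pi * (x - s) / pitch) for s in shifts], axis=0)
        return (image.max() - image.min()) / (image.max() + image.min())

    for a in (0.0, 5.0, 15.0, 30.0):
        print("vibration amplitude %4.1f nm -> image contrast %.2f" % (a, smeared_contrast(a)))

For amplitudes that are a small fraction of the image pitch the contrast loss is mild, but it grows rapidly as the amplitude approaches the minimum feature size, which is the basis for the rule of thumb stated below.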
FIGURE 1.25 Maximum floor vibration specifications (floor acceleration, in g, plotted against frequency in Hz) for a random selection of three optical exposure and measurement systems. The sensitivity to vibration depends on the optical resolution for which the equipment is designed as well as the efficiency of its vibration isolation system.
The allowable transverse vibration should be considerably less than the minimum image size in order not to degrade the printed image resolution. There is remarkably little sensitivity to vibration in the direction perpendicular to the image plane (i.e., along the axis of focus or z axis). Aerial image modeling shows that there is very little degradation of the image contrast for z-axis vibration amplitudes up to the full depth of focus. Only differential motion between the wafer and aerial image causes problems. Very low vibration frequencies tend to simultaneously move the image and the wafer as a solid body. Without a great deal of structural modeling, it is very difficult to predict how an accelerometer reading on the floor of the factory will translate into the amplitude of image vibration across the surface of a wafer.

1.9.10 Mask Handlers

In addition to the mechanisms that the stepper uses for wafer handling, there is likely to be another entire set of mechanisms for automatically loading, aligning, and unloading masks. Previous generations of steppers required that masks be manually loaded and aligned. A hand-held mask was placed on a mask platen and aligned with a mechanical manipulation system while mask-to-platen alignment marks were inspected through a microscope. The procedure required good eyesight, considerable dexterity, and great care to avoid dropping the valuable mask. Today's automatic mask loaders represent a great improvement over the manual system. A mask library holds from six to twelve masks in protective cassettes. Any of these masks can be specified for an exposure by the stepper's control software. The selected mask is removed from its cassette by a vacuum arm or a mechanically clamped carrier. It is moved past a laser bar code reader that reads a bar code on the mask and verifies that it is the mask that was requested. (In the days of manual mask loading, a surprising number of wafer exposures were ruined because the wrong mask was selected from the storage rack.) The mask is loaded onto the mask platen, and the alignment of the mask to the platen is done by an automatic manipulator. The entire procedure can be nearly as quick as loading a wafer to be exposed (although there is a great deal of variation in loading speed from one stepper manufacturer to another). With a fast automatic mask loader, it is possible to expose more than one mask pattern on each wafer by changing masks while the wafer is still on the exposure chuck.
1.9.11 Integrated Photo Cluster

The traditional way to load wafers into an exposure system is to mount a wafer cassette carrying 25 wafers onto a loading station. Wafers are removed from the cassette by a vacuum handler, one at a time, as they are needed. At the end of the exposure, each wafer is loaded into an empty output cassette. Sometimes, an additional rejected wafer cassette is provided for wafers that fail the automatic alignment procedure. When all the exposures are over, an operator unloads the filled output cassette and puts it into a protective wafer carrying box to be carried to the photoresist development station. There is currently a tendency to integrate an exposure system with a system that applies photoresist and another system that develops the exposed wafers. The combination of these three systems is called an integrated photo cluster, or photocluster. With such an arrangement, clean wafers can be loaded into the input station of the cluster, and half an hour later, patterned, developed wafers can be unloaded and carried away. This has great benefits for reducing the total processing time of a lot of wafers through the manufacturing line. When the three functions of resist application, exposure, and development are separated, the wafers have a tendency to sit on a shelf for several hours between being unloaded from one system and being loaded on the next. In the case of resists with low chemical stability of the latent image, there is also a benefit for developing each wafer immediately after it is exposed. The main drawback of such a system is the increased complexity of the system and the corresponding decrease in the mean time between failures. An exposure system that is part of a photocluster may be built with no loading station or output station for wafer cassettes. Every wafer that comes into the system is handed from the resist application system directly to a vacuum handler arm in the stepper, and every wafer that comes out is handed directly to the resist developer. A robotic wafer handler is used to manage the transfers between the exposure system and the wafer processing systems. The software that controls the whole cluster is apt to be quite complex, interfacing with three different types of systems that are often made by different manufacturers.

1.9.12 Cost of Ownership and Throughput Modeling

The economics of semiconductor manufacturing depend heavily on the productivity of the very expensive equipment used. Complex cost-of-ownership models are used to quantify the effects of all the factors involved in the final cost of wafer processing. Many of these factors, such as the cost of photoresist and developer and the amount of idle time on the equipment, are not under the control of the equipment manufacturer. The principal factors that depend on the manufacturer of the exposure system are the system's capital cost, mean time between failures, and throughput. These numbers have all steadily increased over the years. Today, a single advanced lithographic stepper or step-and-scan exposure system costs over $10 million. Mean time between failures has been approaching 1000 h for several models. Throughput can approach 100 wph for 200 mm wafers. The number quoted as the throughput for an exposure system is the maximum number of wafers per hour for a wafer layout completely populated with the system's maximum field size.
Actual throughput for semiconductor products made on the system will vary considerably depending on the size of the exposed field area for that particular product and the number of fields on the wafer. For purposes of establishing the timing of bakes, resist development cycles, and other processes in the photocluster, it is important to know the actual throughput for each product. A simple model can be used to estimate throughput. The exposure time for each field is calculated by dividing the exposure requirement for the photoresist (in mJ/cm2) by the power density of the illumination in the exposure field (in mW/cm2). In a step-and-scan system, the calculation is somewhat
different. Because the power density within the illuminated slit does not need to be uniform along the direction of scan, the relevant variable is the integral of the power density along the scan direction. This quantity might be called the linear power density, and it has units of mW/cm. The linear power density divided by the exposure requirement of the resist (in mJ/cm²) is the scan speed in cm/s. The illuminated slit must overscan the mask image by one slit width in order to complete the exposure. The total length of the scan divided by the scan speed is the exposure time. Exposure times for both steppers and step-and-scan systems can be as low as a few tenths of a second. After each exposure is completed, the stage moves to the next exposure site. For short steps between adjacent exposure sites, this stepping time can be approximated by a constant value. If a more accurate number is needed, it can be calculated from a detailed analysis of the stage acceleration, maximum velocity, and settling time. The number of exposure sites times the sum of the stepping time and the exposure time is the total exposure time of the wafer. At the end of the exposure, there is a delay while one wafer is unloaded and the next is loaded. The newly loaded wafer is stepped through several alignment sites, and the positions of the alignment marks are mapped. The sum of the wafer exchange time and alignment time is called the wafer overhead. The sum of the wafer overhead and the wafer exposure time is the inverse of the steady-state throughput. A 60-wafer-per-hour throughput rate allows one minute for each wafer, which may be broken down into 24 s for wafer overhead, 0.2 s for each exposure, and 0.2 s to step between exposures for 90 image fields on the wafer. The actual throughput achieved by an exposure system is less than the steady-state throughput because of a factor called lot overhead. Lot overhead is the time required for the first wafer in a lot to arrive at the exposure chuck plus the time required for the last wafer in the lot to be moved from the chuck to the output station. Its effect is distributed across the total number of wafers in the lot. For a 25-wafer lot running at a steady-state throughput of 60 wph, each minute of lot overhead reduces the net throughput by about 4%. The trend toward linking wafer processing equipment and exposure systems into an integrated photocluster can have serious effects on lot overhead. Although the addition of a resist application track and a wafer development track does not change the steady-state throughput of the exposure system, there is a great increase in the processing time before the first exposure begins and after the last exposure ends (Figure 1.26). If a new lot cannot be started until the previous lot has completely finished all the processing steps and has
[Figure: throughput (wafers per hour, 0–80) plotted against exposure + stepping time (1–5 s).]
FIGURE 1.26 The relationship between stepper throughput and exposure time. Curves a, b, and c correspond to wafer overhead times of 15, 30, and 45 s, respectively. It is assumed that 45 exposure fields are required to populate the wafer fully.
been removed from the system, the actual throughput of the total photocluster will be considerably less than the steady-state throughput. To remedy this situation, complex software controls on the photocluster are needed to allow lots to be cascaded. That is, a new lot must be started as soon as the input station is empty while the previous lot is still being processed. The software must recognize the boundary between the two lots, and it must change the processing recipe when the first wafer of the new lot arrives at each processing station.
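The throughput and lot-overhead arithmetic described in this section can be collected into a short calculation. The following Python sketch is illustrative only; the function names and the numerical inputs are assumptions chosen to reproduce the examples quoted above, not parameters of any particular exposure system.

def exposure_time_stepper(dose_mj_cm2, power_mw_cm2):
    # Full-field exposure time: resist dose divided by the power density
    # of the illumination in the exposure field.
    return dose_mj_cm2 / power_mw_cm2

def exposure_time_scanner(dose_mj_cm2, linear_power_mw_cm, scan_length_cm, slit_width_cm):
    # Step-and-scan exposure time: the scan speed is the linear power
    # density divided by the dose, and the slit must overscan the mask
    # image by one slit width to complete the exposure.
    scan_speed_cm_s = linear_power_mw_cm / dose_mj_cm2
    return (scan_length_cm + slit_width_cm) / scan_speed_cm_s

def steady_state_throughput_wph(n_fields, t_exposure_s, t_step_s, wafer_overhead_s):
    # Wafer time = wafer overhead + fields x (exposure + stepping).
    return 3600.0 / (wafer_overhead_s + n_fields * (t_exposure_s + t_step_s))

def net_throughput_wph(steady_wph, lot_size, lot_overhead_s):
    # Lot overhead is spread over all the wafers in the lot.
    lot_time_s = lot_size * 3600.0 / steady_wph + lot_overhead_s
    return lot_size * 3600.0 / lot_time_s

# Example from the text: 24 s wafer overhead, 0.2 s exposure, 0.2 s stepping,
# and 90 fields give 60 wph; one minute of lot overhead on a 25-wafer lot
# then costs about 4% of the net throughput.
wph = steady_state_throughput_wph(90, 0.2, 0.2, 24.0)
print(wph, net_throughput_wph(wph, 25, 60.0))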
1.10 Temperature and Environmental Control Lithographic exposure equipment is quite sensitive to temperature variations. The baseline offset between the lithographic lens and the off-axis alignment microscope will vary with thermal expansion of the structural materials. The index of refraction of glass and fused silica changes with temperature, altering the optical behavior of the lithographic projection lens. The index of refraction of air is also a function of temperature. This can affect the lithographic lens and the performance of the stage interferometers. In most cases, a rapid change in temperature causes more serious effects than a slow drift. For most lens designs, a change from one stable temperature to another primarily causes changes in focus and magnification that can be measured and corrected. However, some lens designs react to temperature changes by developing aberrations that are not so easily corrected. The calibration of the stage interferometers is also sensitive to temperature. If the temperature changes during the course of a wafer alignment and exposure, there may be serious errors in the stepping-scale term of the overlay. 1.10.1 The Environmental Chamber This thermal sensitivity requires that the stepper be housed in an enclosed environmental chamber with the ability to control temperature to ±0.1°C or better. Some manufacturers supplement the thermal control of the environmental chamber with water coils on particularly sensitive elements of the system or on heat-generating items such as motors or arc lamps. As long as the environmental chamber remains closed, it can maintain very accurate temperature control, but if it is frequently necessary to open the chamber (for maintenance, for example), the chamber may suffer a large temperature fluctuation that takes a long time to stabilize. Fortunately, current steppers very rarely need any sort of manual intervention in their operations. In many cases, the most frequent reason for opening the environmental chamber is to replace masks in the mask library. The consequences of opening the chamber can be reduced if the manufacturing area around the environmental chamber is maintained at the same mean temperature as the inside (although with a looser tolerance for fluctuations). Because of the desirability of keeping the stepper temperature close to that of the surrounding factory, it is usually designed to operate at a fixed temperature between 20°C and 22°C, a fairly standard range of environmental temperatures in a semiconductor clean room. The environmental chamber maintains a constant flow of air past the stepper to keep the temperature uniform and within its specifications. The air is passed through high-efficiency particle filters before flowing through the chamber. As long as care has been taken that the stepper’s moving parts do not generate additional particle contamination, the environment in the environmental enclosure is very clean. Wafers are exposed to this environment for several minutes during transport, prealignment, and exposure, with fewer than five additional particles added per pass through the system.
1.10.2 Chemical Filtration In some circumstances, especially when acid-catalyzed deep-UV resists are used in the stepper, chemical air filters are used in series with the particle filters. These types of resists are extremely sensitive to airborne vapors of volatile base compounds such as amines. Some resists show degradation of the printed image profile with exposure to as little as a few parts per billion of particular chemical vapors. Several types of chemical filters, relying on either absorption or chemical reaction with the atmospheric contaminants, have been developed for use in the air circulation units of environmental chambers. These chemical filters are frequently used in the air supply of wafer processing equipment as well. Another type of chemical contamination has also been seen in deep-UV steppers. Lens surfaces that are exposed to high fluxes of ultraviolet light will occasionally become coated with a film of some sort of contaminant. Sometimes, the film will be nearly invisible to visual inspection, but it will show up strongly in transmission tests at deep-UV wavelengths. In other cases, the contamination can be seen as a thick film of material. The problem is usually seen in regions where light intensity is the greatest as in the illuminator optics of a stepper with an excimer laser light source. Surface contamination has been seen in lithographic projection lenses and even on photomask surfaces. The occurrence of this problem seems somewhat random, and the different occurrences that have been seen have varied enough in detail that they were probably caused by several different mechanisms. As more anecdotal reports of photochemical contamination emerge, more understanding of this problem will develop.
1.10.3 Effects of Temperature, Pressure, and Humidity Although the environmental chamber does an excellent job of protecting the exposure system and the wafers from thermal drift, particle contamination, and chemical contamination, it can do nothing to control variations of atmospheric pressure. It is not practical to make a chamber strong enough to hold a constant pressure as the barometer fluctuates through a ±50 Torr range. (50 Torr is about one pound per square inch. On a 4×8 ft. construction panel, this gives a load of slightly more than two tons.) A constant-pressure environmental chamber would pose other difficulties such as slow and complex airlock mechanisms to load and unload wafers. Although atmospheric pressure cannot be readily controlled, it has a stronger effect on the index of refraction of air than do temperature variations. At room temperature and atmospheric pressure, a 1°C temperature change will change the index of refraction of air by roughly −10⁻⁶ (−1 ppm). In an environmental chamber where the temperature variation is ±0.1°C, the thermally induced index change will be ±0.1 ppm, corresponding to an interferometer error of ±20 nm over a 200 mm stage travel. An atmospheric pressure change of 1 Torr will produce an index change of about 0.36 ppm. With a change of 50 Torr in barometric pressure, an index change of 18 ppm will occur, shifting the stage interferometer calibration by 3.6 µm across 200 mm of stage travel. Although this comparison makes the effect of temperature appear insignificant relative to that of barometric pressure, it should be kept in mind that temperature changes also induce thermal expansion of critical structures, and they cause changes in the index of refraction of optical glasses. Fused silica, by far the most common refractive material used in deep-UV lithographic lenses, has an index of refraction with an unusually high thermal sensitivity. Its index changes approximately +15 ppm per °C when the index is measured in the deep UV. Note that this change of index has the opposite sign of the value for air. Other glasses greatly vary in their sensitivity to
temperature, with index changes from −10 to +20 ppm per °C, but most commonly used optical glasses have thermal index changes between 0 and +5 ppm per °C. The index of refraction of air has a relatively low sensitivity to humidity, changing by approximately 1 ppm for a change between 0% and 100% relative humidity for air at 21°C. Humidity is usually controlled to ±10% within a wafer fabrication facility in order to avoid problems with sensitive processes like resist application. This keeps the humidity component of the air index variation to ±0.1 ppm. Sometimes, an additional level of humidity control is provided by the stepper environmental chamber, but often it is not. 1.10.4 Compensation for Barometric and Thermal Effects The effects of uncontrolled barometric pressure variations and the residual effects of temperature and humidity variation are often compensated by an additional control loop in the stage interferometers and the lens control system. A small weather station is installed inside the stepper environmental enclosure. Its output is used to calculate corrections to the index of refraction of the air in the stepper enclosure. This can be directly used to apply corrections to the distance scale of the stage interferometers. Corrections to the lithographic projection lens are more complex. Information from the weather station is combined with additional temperature measurements on the lens housing. The amount of field magnification and focus shift are calculated from a model based on the lens design or empirical data, and they are automatically corrected. These corrections are made to compensate for slow drifts in external environmental conditions. Internally generated heating effects must also be taken into account. It has been found that some lithographic lenses are heated enough by light absorbed during the lithographic exposure that their focal positions can shift significantly. This heating cannot be reliably detected by temperature sensors on the outside of the lens housing because the heat is generated deep inside the lens, and it takes a long time to get to the surface. However, the focus drift can be experimentally measured as a function of exposure time and mask transmission. The time dependence is approximated by a negative exponential curve (1 − e^(−t/τ)) that asymptotically approaches the focus value for a hot lens. When an exposure is complete, the lens begins to cool. The focus follows the inverse of the heating curve, but usually with a different (and longer) time constant τ′. If these lens heating effects are well characterized, the stepper’s computer controller can predict the focus drift with its knowledge of how long the shutter has been open and closed over its recent history of operation. The stepper can adjust the focus to follow this prediction as it goes through its normal business of exposing wafers. Although this is an open-loop control process with no feedback mechanism, it is also an extremely well-behaved control process. The focus predictions of the equations are limited to the range between the hot-lens and the cold-lens focus values. There is no tendency for error to accumulate because contributions of the exposure history farther in the past than 3 or 4 times τ′ fall rapidly to zero. If the stepper sits idle, the focus remains stable at the cold-lens value. It should be noted that the average optical transmission of each mask determines the difference between the cold-lens focus and the hot-lens focus as well as the time constant for heating, τ.
A mask that is mostly opaque will not generate much lens heating, whereas one that has mostly transparent areas will allow the maximum heating effect. The average transmission of each mask used in a stepper with lens-heating corrections will need to be known in order to generate the correct value for the hot-lens focus and the time constant for heating. Depending on details of the lithographic lens design and the optical materials used in its construction, lens heating effects may or may not be significant enough to
require this sort of correction procedure. A number of lithographic lenses use no lens-heating corrections at all.
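The compensation scheme of this subsection lends itself to a compact open-loop calculation. The Python sketch below uses the air-index sensitivities quoted in Section 1.10.3 (about −1 ppm per °C and +0.36 ppm per Torr) and the exponential heating and cooling behavior described above; all numerical values in the example are illustrative assumptions, not data for any real lens.

import math

def air_index_correction_ppm(delta_temp_c, delta_press_torr):
    # Scale correction for the stage interferometers from the internal
    # weather station, using the sensitivities quoted in the text.
    return -1.0 * delta_temp_c + 0.36 * delta_press_torr

def predict_focus_drift(shutter_history, hot_lens_offset, tau_heat_s, tau_cool_s):
    # Open-loop prediction of lens-heating focus drift.  shutter_history is
    # a list of (duration_s, shutter_open) intervals in time order.  While
    # the shutter is open the drift relaxes toward the hot-lens focus offset
    # (which, like tau_heat_s, depends on the average mask transmission);
    # while it is closed it relaxes back toward the cold-lens value (zero)
    # with the longer time constant tau_cool_s.
    focus = 0.0
    for duration_s, shutter_open in shutter_history:
        target, tau = (hot_lens_offset, tau_heat_s) if shutter_open else (0.0, tau_cool_s)
        focus = target + (focus - target) * math.exp(-duration_s / tau)
    return focus

# Illustrative only: a 100 nm hot-lens offset, 10 min heating and 30 min
# cooling time constants, and a short exposure/idle history.
history = [(120.0, True), (30.0, False), (120.0, True), (600.0, False)]
print(air_index_correction_ppm(0.1, 2.0))                  # ppm scale correction
print(predict_focus_drift(history, 100e-9, 600.0, 1800.0))  # predicted defocus (same units as offset)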
1.11 Mask Issues 1.11.1 Mask Fabrication Optical masks are made on a substrate of glass or fused silica. Typical masks for a 4× or 5× reduction stepper are 5×5 or 6×6 in. square and between 0.090 and 0.250 in. thick. Although much more massive and expensive than the thinner substrates, 0.250 in. masks are considerably more resistant to deformation by clamping forces on the mask platen. As larger exposure fields become more common, 6 in. masks are more frequently used. With further increases in field sizes, it is likely that a larger mask format will be required. In fact, the first step-and-scan exposure system that was developed, the Perkin–Elmer Micrascan, had the capability of scanning a 20×50 mm field. With a 4× reduction, this would have required an 80×200 mm mask pattern. Allowing an additional space for clamping the mask on the platen, a 9 in. mask dimension would have been required in the scan direction. At the time of the Micrascan’s debut, there was no mask-making equipment available that could pattern a 9 in. mask, so the scanned field had to be limited to 20×32.5 mm. This was the maximum field size that could be accommodated on a 6 in. mask. This points out a distressing fact of life. Mask-making equipment is expensive and specialized, and the market for new equipment of this type is very small. It is not easy for manufacturers of mask-making systems to economically justify the large development effort needed to make significant changes in their technology. Today, many years after the ability to use a 9 in. mask was first developed, there is still no equipment capable of making such a mask. Fused silica is typically used in preference to borosilicate glass for mask making because of its low coefficient of thermal expansion. It is always used for masks in the deep-UV portion of the spectrum from 248 to 193 nm because other types of glass are not sufficiently transparent at these wavelengths. Fused silica is not transparent enough to be used as a mask material at 157 nm, and some investigation of calcium fluoride as a 157 nm mask substrate has been done. Calcium fluoride is very fragile compared to fused silica, and its coefficient of thermal expansion is about 30 times larger. Fortunately, a modified form of fused silica, containing a large amount of fluorine dopant, has been developed [34]. This material can be used to make 0.250 in. thick masks with as high as 78% transmission. Although much lower than the 90%+ transmission at 248 and 193 nm, this transmission is acceptable, and the development of calcium fluoride as a mask material has been dropped with a certain sense of relief. The recent interest in polarized illumination has created a new requirement for mask substrates to be used with polarized light: low birefringence. Birefringence is a property of optically anisotropic materials, leading to the index of refraction changing as a function of the direction of polarization. It is measured in nm/cm, i.e., nanometers of optical path length difference between the fast and slow polarization axes, per centimeter of sample thickness. This effect is often seen in crystalline materials such as calcite (calcium carbonate), which exhibits a dramatic amount of birefringence. Normally, isotropic materials like fused silica are not birefringent. However, stress induced during annealing or even polishing processes can induce birefringence in the range of 10–15 nm/cm in a mask blank.
Careful control of the blank manufacturing process can reduce the level of birefringence below 2 nm/cm, which is thought to be acceptable for use with polarized illumination.
Uncontrolled birefringence can convert a pure linearly polarized illumination beam into an elliptically polarized beam with a large component of polarization along the undesired axis. Even high levels of birefringence have a negligible effect on unpolarized illumination. Chromium has, for many years, been the material of choice for the patterned layer on the mask’s surface. A layer of chromium less than 0.1 µm thick will block 99.9% of the incident light. The technology for etching chromium is well developed, and the material is extremely durable. The recent development of phase-shifting mask technology has led to the use of materials such as molybdenum silicon oxynitride that transmit a controlled amount of light while shifting its phase by 180° relative to adjacent clear areas of the mask (see Section 1.13.3). Masks must be generated from an electronically stored original pattern. Some sort of direct-writing lithographic technique is required to create the pattern on a mask blank coated with photoresist. Both electron beam and laser beam mask writers are in common use. The amount of data that must be transferred onto the mask surface may be in the gigabyte range, and the time to write a complex mask is often several hours. After the resist is developed, the pattern is transferred to the film of chromium absorber using an etch process. Although the mask features are typically 4 or 5 times larger than the images created on the wafer (as a result of the reduction of the lithographic lens), the tolerances on the mask dimensions are a much smaller percentage of the feature sizes. Because of these tight tolerances and the continuing reduction of feature dimensions on the mask, chromium etch processes have recently moved from wet etch to dry RIE processes for masks with the most critical dimensional tolerances. 1.11.2 Feature Size Tolerances The dimensional tolerance of critical resist patterns on a wafer’s surface may be ±10% of the minimum feature size. Many factors can induce variations in line width, including nonuniformity of resist thickness, variations in bake temperatures or developer concentration, changes of the exposure energy, aberrations in the projection lens, and variations in the size of the features on the photomask. Because the mask represents only one of many contributions to variations in the size of the resist feature, it must have a tighter fractional dimensional tolerance than the resist image. It is not completely obvious how to apportion the allowable dimensional variation among the various sources of error. If the errors are independent and normally distributed, then there is a temptation to add them in quadrature (i.e., take the square root of the sum of squares or RSS). This gives the dominant weight in the error budget to the largest source of error and allows smaller errors to be ignored. Unfortunately, this sort of analysis ignores an important feature of the problem, namely the differences in the spatial distribution of the different sources of error. As an example of this, two important contributions to line width error will be examined: exposure repeatability and mask dimensional error. The distribution of exposure energies may be completely random in time and may be characterized by a Gaussian distribution about some mean value. Because of the relationship between exposure and resist image size, the errors in exposure will create a Gaussian distribution in image sizes across a large number of exposures.
Likewise, the distribution of feature sizes on a photomask (for a set of features with the same nominal dimension) may randomly vary across the mask’s surface with a Gaussian distribution of errors about the mean dimension. Yet, these two sources of dimensional error in the resist image cannot be added in quadrature. Combining errors with an RSS, instead of a straight addition, accounts for the fact that random, uncorrelated errors will only rarely experience a maximum positive excursion on the same experimental measurement. Statistically, it is much more likely that a maximum error of
one term will occur with an average error of the other term. However, when combining errors of exposure energy with errors of mask feature sizes, there is no possibility of such a cancellation. When the exposure energy fluctuates to a higher value, the entire exposure field is overexposed, and the transparent mask feature that has the highest positive departure from nominal image size will determine if the chip fails its dimensional tolerances. Conversely, if the energy fluctuates low, the most undersized transparent feature will determine if the chip fails. The key thing to notice is that the oversized and undersized mask features are distributed across the mask surface, but the errors in exposure energy simultaneously affect the entire mask. Therefore, a difference in spatial distribution of errors prevents them from being added in quadrature even though the errors are uncorrelated and normally distributed. An analysis of other sources of dimensional error in the resist image shows few that have the same spatial distribution as mask errors. This forces the mask dimensional tolerance to be linearly added in the error budget for image size control on the wafer, and it makes the control of feature sizes on the mask correspondingly more critical. A typical minimum feature size on a mask may be 0.50 µm with a tolerance of 3% or 15 nm (3σ). When printed with 4× reduction on a wafer, the resist image size will be 0.13 µm with a required accuracy of ±10%. In this example, the mask error has consumed nearly one third of the total error budget. 1.11.3 Mask Error Factor The analysis in the previous section has made an assumption that, until recently, seemed like common sense. If a stepper has an image size reduction of 4×, then it seems completely logical that an error in the mask pattern will translate into a wafer image error with 1/4 the magnitude. However, as lithography moves deeper into the regime of low k-factors (see Section 1.5.5), the linear relationship between mask image size and wafer image size begins to degrade. A 10 nm dimensional error on the mask would produce a 2.5 nm error on the wafer if the error remained proportional to the geometric reduction factor of the lens. Sometimes, it is found that a 10 nm mask error will produce a 5 nm or even 10 nm error on the wafer. The ratio between the observed wafer error and the wafer error expected from the simple reduction value of the lens is called the mask error factor (MEF), or alternatively, the mask error enhancement factor (MEEF). The MEF can sometimes be determined by lithographic modeling programs, but it is often easier to experimentally determine it by measuring the resist images printed from a series of known mask feature sizes. If the printed image sizes are plotted against the corresponding mask feature sizes, the slope of the curve will be the product of the lens magnification and the MEF (see the short fitting sketch below). For some types of features such as contact holes printed near the diffraction limit of the projection lens, the MEF may soar to values over 10. The tolerance for image size errors on the mask may have to be reduced to compensate for MEF values greater than one. If it is impossible to tighten the mask tolerances enough to compensate for large values of MEF, then the mask will end up taking a larger portion of the total error budget than expected. 1.11.4 Feature Placement Tolerance The control of image placement on the wafer surface is subject to tolerances that are nearly as tight as those for image size control.
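As a brief aside on the experimental determination of the mask error factor described in Section 1.11.3: plotting printed image size against mask feature size and taking the slope gives the product of the lens magnification and the MEF. A minimal Python sketch of that fit follows; the measurement values are hypothetical numbers chosen for illustration.

import numpy as np

def mask_error_factor(mask_cd_nm, wafer_cd_nm, reduction=4.0):
    # The slope of printed size versus mask size is (1/reduction) x MEF,
    # so MEF = slope x reduction.
    slope, _intercept = np.polyfit(mask_cd_nm, wafer_cd_nm, 1)
    return slope * reduction

# Hypothetical through-size data for a 4x system: a 20 nm change at the
# mask produces a 10 nm change in resist, i.e., slope 0.5 and MEF = 2.
mask_cd = np.array([480.0, 500.0, 520.0, 540.0])
wafer_cd = np.array([112.0, 122.0, 132.0, 142.0])
print(mask_error_factor(mask_cd, wafer_cd))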
Once again, the mask contribution to image placement error is only one component of many. Thermal wafer distortions, chucking errors, lens distortion, errors in wafer stage motion, and errors in acquisition of wafer alignment marks also make significant contributions to the total image placement or
overlay budget. The spatial distribution of errors in mask feature placement also determines if the mask contribution can be added in quadrature or if it must be added linearly. Some components of mask error can be corrected by the lithographic optics. For example, a small magnification error in the mask can be corrected by the magnification adjustment in the lithographic projection lens. For this reason, correctable errors are usually mathematically removed when the mask feature placement tolerance is calculated, but the higher order, uncorrectable terms usually must be linearly added in the image overlay budget. The total overlay tolerance for the resist image on an underlying level is quite dependent on the details of the semiconductor product’s design, but it is often around 20%–30% of the minimum image size. The corresponding feature placement tolerance on the mask is about 5% of the minimum mask dimension, or 25 nm (3σ), on a mask with 0.5 µm minimum feature sizes. 1.11.5 Mask Flatness For many years, the flatness of the photomask has made only a negligible contribution to the field curvature budget of a stepper. It is relatively easy to make a photomask blank that is flat to 2 µm, and the non-flatness is reduced by the square of the mask demagnification when the image is projected into the wafer plane. Therefore, a 2 µm mask surface variation will show up as an 80 nm focal plane variation at the wafer in a 5× stepper. For numerical apertures less than 0.5, the total depth of focus is a substantial fraction of a micron, and an 80 nm focus variation is not too objectionable. Today, numerical apertures up to 0.85 are available with a total depth of focus less than 200 nm. At the same time, typical stepper magnifications have dropped from 5× to 4×. Now, a 2 µm mask non-flatness can devour 125 nm of a 200 nm total focus budget at the wafer. With some increase in cost, mask blanks can be made with flatness down to 0.5 or 0.3 µm, reducing the contribution to the focus budget to acceptable levels again. With this improved level of flatness, processing factors that could previously be ignored need to be taken into account. For example, excessive levels of stress in the chromium absorber may slightly bow even a 0.25 in. thick mask substrate. When a pattern is etched into the film, the stress is released in the etched regions, potentially producing an irregular surface contour. When a pellicle frame is attached to the mask (see Section 1.11.7), it can easily change the flatness of the mask surface if the adhesive is not compressed uniformly or if the pellicle frame is not initially flat. Temperature changes after the pellicle is attached can also deform the mask because of the large difference in thermal expansion between fused silica and the pellicle frame material (typically aluminum). 1.11.6 Inspection and Repair When a mask is made, it must be perfect. Any defects in the pattern will destroy the functionality of the semiconductor circuit that is printed with that mask. Before a mask is delivered to the semiconductor manufacturing line, it is passed through an automated mask inspection system that searches for any defects in the pattern. There are two possible strategies in mask inspection, known as die-to-database and die-to-die inspection. The first method involves an automated scanning microscope that directly compares the mask pattern with the computer data used to generate the mask. This requires a very large data handling capability, similar to that needed by the mask writer itself.
Any discrepancy between the inspected mask pattern and the data set used to create it is flagged as an error. The inspection criteria cannot be set so restrictively
that random variations in line width or image placement are reported as defects. A typical minimum defect size that can be reliably detected without producing too many false-positive error detections is currently about 0.13 µm. This number is steadily decreasing as the mask feature sizes and tolerances become smaller from year to year. A mask defect may be either an undesired transparent spot in the chromium absorber or a piece of absorber (either chromium or a dirt particle) where a clear area is supposed to be. These types of defects are called clear defects and opaque defects, respectively. As mask requirements become more and more stringent, new categories of defects such as phase errors in transparent regions of the mask have been discovered. Phase defects can often be detected as low contrast defects that change intensity through focus. Die-to-die inspections can find many phase defects, and die-to-database algorithms capable of detecting phase defects are under development. Die-to-die inspection can be used only on a mask with two or more identical chip patterns. It is fairly common for two, three, or four chips to be exposed in a single large stepper field in order to improve the stepper throughput. The die-to-die inspection system scans both chip patterns and compares them, point by point. Any difference between the two patterns is recorded as a defect in the mask. This does not require the massive data handling capacity of die-to-database inspection. In general, die-to-die inspection is rather insensitive to process deficiencies such as rounded corners on mask features that may be common to all the features on the mask. The cost and time involved in making a mask are too great to allow the mask to be discarded for small defects. If defects are found, attempts are made to repair the mask. Opaque defects can be blasted away by a focused pulse of laser light, or they can be eroded by ion milling, using a focused beam of gallium ions. Clear defects that must be made opaque can be covered with patches using laser-assisted or ion beam-assisted chemical deposition processes. New methods for repairing opaque defects have emerged in the past few years. Ion-activated chemical etch processes using focused ion beams to define the etch area are faster and more selective than simple ion milling. A method has been commercialized by Rave, LLC for mechanically scraping away opaque defects using a microscopic probe similar to an atomic force microscope tip [35]. Both of these technologies have the potential to repair phase defects in alternating aperture phase masks (see Section 1.13.3). As previously noted, the inspection criteria cannot be set tight enough to detect small errors in feature size or placement. These important parameters are determined in a separate measurement step using specialized image placement measurement equipment and image size measuring microscopes. Image sizes and placement errors are statistically sampled over dozens to hundreds of sites across the mask. If the mean or standard deviation of the measured feature sizes is not within the specifications, there is no choice but to rebuild the mask. The same is true when the image placement tolerances are exceeded. There is no mask repair process capable of correcting these properties.
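The point-by-point comparison used in die-to-die inspection can be illustrated with a simple image-differencing sketch. This is only a schematic Python illustration: real inspection tools align the two die images, model the imaging optics, and apply far more sophisticated detection criteria; the threshold and minimum defect area below are arbitrary assumptions.

import numpy as np
from scipy import ndimage

def die_to_die_defects(die_a, die_b, diff_threshold=0.25, min_area_px=4):
    # Flag pixels where the two nominally identical die images differ by
    # more than the threshold, group them into connected regions, and
    # discard regions too small to report, so that ordinary noise and
    # line-width variation are not flagged as defects.
    diff = np.abs(die_a.astype(float) - die_b.astype(float)) > diff_threshold
    labels, n_regions = ndimage.label(diff)
    if n_regions == 0:
        return []
    sizes = ndimage.sum(diff, labels, index=range(1, n_regions + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_area_px]
    if not keep:
        return []
    # Return the centroid (row, column) of each candidate defect.
    return ndimage.center_of_mass(diff, labels, keep)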
1.11.7 Particulate Contamination and Pellicles When a mask has been written, inspected, repaired, and delivered to the semiconductor manufacturing line, it might be assumed that it can be used to produce perfect images without any further concerns until it is made obsolete by a new mask design, but the mask faces a life of hazards. There is an obvious possibility of dropping and breaking the valuable object during manual or automatic handling. Electrostatic discharge has recently been recognized as another hazard. If a small discharge of static electricity occurs when the mask is picked up, a surge of current through micron-sized chromium lines on the mask can actually melt the lines and destroy parts of the pattern. However,
the most serious threat to a mask is a simple particle of dirt. If an airborne dirt speck lands in a critical transparent area of the mask, the circuits printed with that mask may no longer be functional. Wafer fabrication facilities are probably the cleanest working environments in the world, but a mask will inevitably pick up dust particles after several months of handling and use for wafer exposures. The mask can be cleaned using a variety of washing techniques, often involving ultrasonic agitation, high-pressure jets of water, or automated scrubbing of the surface with brushes. Often, powerful oxidizing chemicals are used to consume any organic particles on the mask’s surface. These procedures cannot be repeated very frequently without posing their own threat to the mask pattern. A better solution to the problem of dirt on the mask is to protect the surface with a thin transparent membrane called a pellicle. A pellicle made of a film of an organic polymer is suspended on a frame 4–10 mm above the mask surface. The frame seals the edges of the pellicle so that there is no route for dust particles to reach the mask’s surface. When a dust particle lands on the pellicle, it is so far out of the focal plane that it is essentially invisible to the projection optics. If a pellicle height of 5 mm is used, particles up to 75 µm in diameter will cause less than a 1% obscuration of the projection lens pupil area for a point on the mask directly beneath the particle. Thin (0.090 in.) masks are sometimes given a pellicle on the back as well as the front surface. Backside pellicles are not used on thick (0.250 in.) masks because the back surface of the mask is already reasonably far from the focal plane of the projection optics. The effectiveness of a pellicle increases with higher numerical aperture of the projection lens, smaller lens reduction factor, and increased height of the pellicle. A mask-protecting pellicle is directly in the optical path of the lithographic projection lens, so its optical effects must be carefully considered. The pellicle acts as a freestanding interference film, and its transmission is sensitive to the exact thickness relative to the exposure wavelength. Pellicles are typically designed for a thickness that maximizes the transmission. A transparent film with parallel surfaces a few millimeters from the focal plane will produce a certain amount of spherical aberration, but this is minimized if the pellicle is thin. Typical pellicles are less than 1 µm thick, and they produce negligible spherical aberration (Figure 1.27). Any variations of the pellicle’s thickness across a transverse distance of 1 or 2 mm will show up directly as wavefront aberrations in the aerial image. Small tilts in the pellicle’s orientation relative to the mask surface have little significant effect. A wedge angle between the front and rear pellicle surfaces will induce a transverse shift in the image that is projected by the lithographic lens. If the amount of wedge varies over the surface of the pellicle, surprisingly large image distortions can be produced. The amount of transverse image displacement at the wafer is equal to h(n−1)θw/M, where h is the pellicle height, n is the index of refraction of the pellicle material, θw is the wedge angle, and M is the reduction factor of the lithographic lens. Note that the pellicle thickness does not appear in this expression. For a 5-mm pellicle
FIGURE 1.27 The function of a mask pellicle is to keep dirt particles from falling onto the surface of a mask. This figure illustrates a large particle of dirt on the top surface of a pellicle. The dotted lines represent the cone of illumination angles that pass through the mask surface. At 5 mm separation from the mask surface, the dirt particle interrupts an insignificant amount of energy from any one point on the mask’s surface.
height, a refractive index of 1.6, and a lens reduction of 4×, θw must be less than 20 µrad if the image displacement is to be kept below 15 nm at the wafer. Several materials have been used for pellicles such as nitrocellulose acetate and various Teflon-like fluorocarbons. When a pellicle is used with deep-UV exposure wavelengths, transparency of the pellicle and its resistance to photo-erosion at those wavelengths must be carefully evaluated. Although pellicles do a very good job of protecting the mask’s surface, a large enough dust particle on the pellicle can still cause a dark spot in the aerial image. Some steppers provide a pellicle inspection system that can detect large dust particles using scattered light. The pellicle can be inspected every time the mask is loaded or unloaded to provide an extra measure of protection. 1.11.8 Hard Pellicles for 157 nm One of the principal technical challenges in the development of 157 nm lithography has been the lack of a suitable polymer that can be used as a pellicle at that wavelength. Most polymer films are extremely opaque to 157 nm radiation, and the few materials with sufficient transparency degrade so fast under 157 nm exposure that they could not be used to expose more than a few wafers. Although research into polymer pellicles for 157 nm lithography is continuing, most of the development efforts are concentrating on thick, non-polymeric pellicles. A plate of fluorinated fused silica between 300 and 800 µm thick is sufficiently stiff and durable to act as a pellicle for 157 nm lithography [36]. A polymer pellicle thickness is typically less than 1 µm, and new problems have to be addressed with the thicker hard pellicles. The thick piece of material in the light path between the mask surface and the projection lens introduces a substantial amount of spherical aberration in the image. This must be corrected by modifications in the lens design. Because the sensitivity to pellicle wedge does not depend on thickness, a thick pellicle must maintain the same wedge tolerances as a thin pellicle. In addition, a thick pellicle must maintain control of tilt. For small tilt angles θt, the image displacement at the wafer is tθt(n−1)/(Mn), where t is the pellicle thickness, n is the pellicle refractive index, and M is the demagnification of the projection lens. Note that the pellicle height is not a factor in this expression. For a hard pellicle with a thickness of 800 µm and a refractive index of 1.6, used in a stepper with a demagnification of 4×, a 200 µrad pellicle tilt will produce an image displacement of 15 nm at the wafer (both displacement expressions are checked numerically in the short sketch below). 1.11.9 Field-Defining Blades The patterned area of the mask rarely fills the stepper field to its extremes. When the mask is made, there must be an opaque chromium border to define the limits of the exposed area. This allows each field to be butted against the adjacent fields without stray light from one field double-exposing its neighbor. The chromium that defines the border must be free of pinhole defects that would print as spots of light in a neighboring chip. It is expensive to inspect and repair all the pinholes in a large expanse of chromium. For this reason, almost all steppers have field-defining blades that block all of the light that would hit the mask except in a rectangular area where the desired pattern exists. The blades then take over the job of blocking light leaks except in a small region surrounding the patterned area that must be free of pinholes.
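Picking up the pellicle image-displacement expressions quoted in Section 1.11.7 and Section 1.11.8, the following Python lines reproduce the two 15 nm examples. They are a numerical check only, using the same illustrative parameter values given in the text.

def wedge_displacement_m(height_m, n, wedge_rad, reduction):
    # Image shift at the wafer from pellicle wedge: h (n - 1) theta_w / M.
    return height_m * (n - 1.0) * wedge_rad / reduction

def tilt_displacement_m(thickness_m, n, tilt_rad, reduction):
    # Image shift at the wafer from hard-pellicle tilt: t theta_t (n - 1) / (M n).
    return thickness_m * tilt_rad * (n - 1.0) / (reduction * n)

# 5 mm pellicle height, n = 1.6, 4x reduction, 20 urad wedge -> about 15 nm.
print(wedge_displacement_m(5e-3, 1.6, 20e-6, 4.0))
# 800 um hard pellicle, n = 1.6, 4x reduction, 200 urad tilt -> about 15 nm.
print(tilt_displacement_m(800e-6, 1.6, 200e-6, 4.0))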
It is desirable that the field-defining blades be as sharply focused as possible to avoid a wide blurred area or penumbra at their edges. Some amount of penumbra, on the order of 100 µm, is unavoidable, so the limits of the exposed field must always be defined by a chromium border.
The field-defining blades are also useful in a few special circumstances. For diagnostic and engineering purposes, the blades may be used to define a small sub-region of the mask pattern that can be used to expose a compact matrix of exposure and focus values. In this case, the fuzzy edges of the field can be ignored. The blades can also be used to select among several different patterns printed on the same mask. For example, a global wafer alignment mark or a specialized test structure could be defined on the same mask as a normal exposure pattern. The specialized pattern could be rapidly selected by moving the blades, avoiding the slow procedure of changing and realigning the mask. This would allow two or more different patterns to be printed on each wafer without the necessity of changing masks.
1.12 Control of the Lithographic Exposure System 1.12.1 Microprocessor Control of Subsystems Lithographic exposure systems have to perform several complex functions in the course of their operations. Some of these functions are performed by specialized analog or digital electronics designed by the system’s manufacturer. Other functions are complex enough to require a small computer or microprocessor to control their execution. For example, magnification and focus control of the lithographic projection lens requires a stream of calculations using inputs from temperature sensors, atmospheric data from the internal weather station, and a running history of exposure times for calculating lens-heating effects. This task cannot easily be performed by analog circuitry, so a dedicated microprocessor controller is often used. Other functions have also been turned over to microprocessors in systems made by various manufacturers. Excimer lasers, used as light sources for the lithographic exposure in some steppers, usually have an internal microprocessor control system. The environmental enclosure often has its own microprocessor control. Wafer transport and prealignment functions are sometimes managed by a dedicated microprocessor. A similar control system can be used for the automatic transportation of masks between the mask library and mask platen and for alignment of the mask. Some manufacturers have used an independent computer controller for the acquisition of wafer alignment marks and the analysis of alignment corrections. The exposure system also has a process control computer that controls the operation of the system as well as coordinating the activities of the microprocessor-controlled subsystems. Often, the controlling computer is also used to create, edit, and store the data files that specify the details of the operations that are to be performed on each incoming lot of wafers. These data files, called job control files or product files, include information on the mask or masks to be used, the alignment strategy to be used, the location of the alignment marks to be measured on the wafer, and the exact placement and exposure energy for each field that is to be printed on the wafer. In some systems, these bookkeeping functions are delegated to an auxiliary computer that also acts as an interface to the operator. Control of the stepper’s various subsystems with microprocessor controllers has been a fairly successful strategy. The modularity that results from this approach has simplified the design of the control system. There have occasionally been problems with communication and data transfer links between the microprocessors and the central computer. Recently, some manufacturers have started integrating the microprocessor functions into a single, powerful controlling computer or workstation.
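The job control (or product) files described above collect the per-lot exposure information in one place. A minimal sketch of such a data structure is shown below in Python; the field names and values are purely illustrative, since each manufacturer uses its own proprietary file format.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FieldSpec:
    x_mm: float          # field placement on the wafer
    y_mm: float
    dose_mj_cm2: float   # exposure energy for this field

@dataclass
class JobControlFile:
    mask_id: str
    alignment_strategy: str                     # e.g., a mapped global alignment
    alignment_marks: List[Tuple[float, float]]  # mark locations to be measured
    fields: List[FieldSpec]

job = JobControlFile(
    mask_id="METAL1-REV03",
    alignment_strategy="global-map",
    alignment_marks=[(-80.0, 0.0), (80.0, 0.0), (0.0, 80.0)],
    fields=[FieldSpec(0.0, 0.0, 28.0), FieldSpec(26.0, 0.0, 28.0)],
)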
1.12.2 Photocluster Control There has also been an increase in the complexity of the control system caused by linking wafer processing equipment and the exposure system into an integrated photocluster. The simplest photocluster arrangements simply provide a robotic mechanism to transfer the wafer between the wafer processing tracks and the exposure system with a few data lines to exchange information when a wafer is waiting to be transferred. The increasing need to cascade wafer lots through the photocluster without breaks between lots has forced the development of a much more complex level of control. 1.12.3 Communication Links Frequently, a data link is provided for communications between the exposure system and a central computer that monitors the entire manufacturing operation. The Semiconductor Equipment Communications Standard (SECS II) protocols are often used. This link allows the central computer to collect diagnostic data generated by the exposure system and track the performance of the system. It also allows the detailed job control files to be stored on the central computer and transferred to the exposure system at the time of use. This ensures that every exposure system on the factory floor is using the same version of every job control file, and it makes it possible to revise or update the job control files in one central location. A central computer system linked to each exposure system on the factory floor can automatically collect data on the operating conditions of each system and detect changes that may signal the need for maintenance. The same central computer can track the progress of each lot of wafers throughout its many processing steps, accumulating a valuable record for analysis of process variables that affect yield. There has been an emphasis on automated data collection, often using computer-readable bar codes on masks and wafer boxes to avoid the errors inherent in manual data entry. The possibility of tracking individual wafers via a miniature computer-readable code near the wafer’s edge has been considered. 1.12.4 Stepper Self-Metrology Every year shows an increased level of sophistication in stepper design. As stepper control systems are improved, there has been a trend toward including a greater level of self-metrology and self-calibration functions. Automatic baseline measurement systems (discussed in Section 1.8.7) are often provided on exposure systems with off-axis alignment. The same type of image detection optics that is used for the baseline measurement can often be used to analyze the aerial image projected by a lithographic lens [37]. If a detector on the wafer stage is scanned through the aerial image, the steepness of the transition between the bright and dark areas of the image can be used as an indication of the image quality. It is difficult to make a practical detector that can sample on a fine enough scale to discriminate details of a high-resolution stepper image; however, another trick is possible. The aerial image of a small feature can be scanned across the sharply defined edge of a rather large light detector. The detected signal represents the spatial integral of the aerial image in the direction of the scan. It can be mathematically differentiated to reconstruct the shape of the aerial image. Best focus can be defined as the position where the aerial image achieves the highest contrast. By repeatedly measuring the contrast of the aerial image through a range of focus settings, the location of the best focus can be found.
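The knife-edge measurement just described can be expressed in a few lines: differentiating the detected signal recovers the image profile, and the focus setting giving the highest contrast is taken as best focus. The Python sketch below is schematic; the contrast metric and the data handling are assumptions for illustration, not the algorithm of any particular stepper.

import numpy as np

def reconstruct_profile(edge_signal, dx_nm):
    # The detector signal is the spatial integral of the aerial image along
    # the scan direction; its derivative is the intensity profile.
    return np.gradient(np.asarray(edge_signal, dtype=float), dx_nm)

def image_contrast(profile):
    # Simple contrast metric: (Imax - Imin) / (Imax + Imin).
    return (profile.max() - profile.min()) / (profile.max() + profile.min())

def best_focus(focus_positions_nm, edge_signals, dx_nm):
    # Repeat the knife-edge scan at each focus setting and pick the setting
    # whose reconstructed aerial image has the highest contrast.
    contrasts = [image_contrast(reconstruct_profile(s, dx_nm)) for s in edge_signals]
    return focus_positions_nm[int(np.argmax(contrasts))]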
Using an aerial image measurement system in this way allows the stepper to calibrate its automatic focus
mechanism to the actual position of the aerial image and correct for any drifts in the projection optics. Besides simple determination of best focus, aerial image measurements can be used to diagnose some forms of projection lens problems. Field tilts and field curvature can be readily measured by determining best focus at many points in the exposure field and analyzing the deviation from flatness. Astigmatism can be determined by comparing positions of best focus for two perpendicular orientations of lines. The field tilt measurements can be used to calibrate the automatic leveling system. However, there are no automated adjustments for field curvature or astigmatism. Instead, these measurements are useful for monitoring the health of the lithographic lens so that any degradation of the imaging can be detected early. Most of the aerial image measurements described here can be done with an equivalent analysis of the developed image in resist. For example, a sequence of exposures through focus can be analyzed for the best focus at several points across the image field, providing the same information on tilt and field curvature as an aerial image measurement. Measurements in resist are extremely time-consuming compared to automated aerial image measurements. A complete analysis of field curvature and astigmatism using developed photoresist images could easily require several hours of painstaking data collection with a microscope. Such a procedure may only be practical to perform as an initial test of a stepper at installation. An automated measurement, on the other hand, can be performed as part of a daily or weekly monitoring program. The automatic focus mechanism can be used to perform another test of the stepper’s health. With the availability of the appropriate software, the stepper can analyze the surface flatness of a wafer on the exposure chuck. The automatic focus mechanism, whether it is an optical mechanism, capacitance gauge, or air gauge, can sample the wafer’s surface at dozens or hundreds of positions and create a map of the surface figure. This analysis provides important information, not so much about the wafer, but about the flatness of the chuck. Wafer chucks are subject to contamination by specks of debris carried in on the back sides of wafers. A single large particle transferred onto the chuck can create a high spot on the surface of every wafer that passes through the exposure system until the contamination is discovered and removed. Occasional automated wafer surface measurements can greatly reduce the risk of yield loss from this source. To reduce the effect of random non-flatness of the wafers used in this measurement, a set of selected ultra-flat wafers can be reserved for this purpose. 1.12.5 Stepper Operating Procedures The ultimate cost-effectiveness of the lithographic operations performed in a semiconductor fabrication plant depends on a number of factors. The raw throughput of the stepper, in wafers per hour, is important; however, other factors can have a significant effect on the cost of operations. Strategies such as dedication of lots to particular steppers (in order to achieve the best possible overlay) can result in scheduling problems and high amounts of idle time. One of the most significant impacts on stepper productivity is the use of send-ahead wafers. This manufacturing technique requires one or more wafers from each lot to be exposed, developed, and measured for line width and/or overlay before the rest of the lot is exposed. 
The measurements on the send-ahead wafer are used to correct the exposure energy or adjust the alignment by small amounts. If the stepper is allowed to sit idle while the send-ahead wafer is developed and measured, there will be a tremendous loss of productivity. A more effective strategy is to interleave wafer lots and send-ahead wafers so that a lot can be exposed while the send-ahead wafer for the next lot is being developed and analyzed. This requires a sort of logistical juggling
act with some risk of confusing the correction data of one lot for that of another. Even with this strategy, there is a substantial waste of time. The send-ahead wafer is subject to the full lot overhead, including the time to load the lot control software and the mask plus the time to load and unload the wafer. A manufacturing facility that produces large quantities of a single semiconductor product can attempt a different send-ahead strategy. A large batch of several lots that require the same masking level can be accumulated and run as a superlot with a single send-ahead wafer. This may introduce serious logistical problems as lots are delayed to accumulate a large batch. The most successful strategy can be adopted when such a large volume of a single semiconductor product is being manufactured that a stepper can be completely dedicated to a single mask level of that product. When every wafer is exposed with the same mask, the concept of send-ahead wafers is no longer needed. Instead, statistical process control can be introduced. Sample wafers can be pulled at intervals from the product stream, and measurements from these samples can be fed back to control the stepper exposure and alignment. Of course, the most desirable situation would be one where the stepper is so stable and insensitive to variations in alignment marks that send-ahead wafers are not needed. This goal has been pursued by all stepper manufacturers with some degree of success. Nearly all semiconductor manufacturers have found some way of operating without send-ahead wafers because of the serious loss in productivity that they cause. When the stability of the stepper is great enough, lots can be exposed with a so-called risk strategy. The entire lot is exposed without a send-ahead wafer, and sample wafers are then measured for line width and overlay. If the lot fails one of these measurements, it is reworked. The resist is stripped and reapplied, then the lot is exposed again with corrected values of alignment or exposure. As long as only a small fraction of lots require rework, the risk strategy can be much more effective than a strategy requiring a send-ahead for each lot. The risk strategy is most successful when the steppers and processes are so stable that the line width and overlay are rarely outside the tolerance specifications and when the flow of wafers through each stepper is continuous enough that statistical feedback from the lot measurements can be used to fine-tune the stepper settings.
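The productivity argument in this subsection can be made concrete with a rough comparison of the time charged to each lot under the different strategies. The Python sketch below is a crude model under stated assumptions (send-ahead processing that can overlap the previous lot, a fixed rework overhead, a fixed rework fraction); all numbers are illustrative only.

def send_ahead_lot_hours(lot_exposure_h, send_ahead_turnaround_h, interleaved=True):
    # With interleaving, the send-ahead wafer for the next lot is developed
    # and measured while the current lot is still exposing, so the stepper
    # only idles for whatever turnaround time is not hidden by the exposure.
    if interleaved:
        return lot_exposure_h + max(0.0, send_ahead_turnaround_h - lot_exposure_h)
    return lot_exposure_h + send_ahead_turnaround_h

def risk_strategy_lot_hours(lot_exposure_h, rework_fraction, rework_overhead_h):
    # Every lot is exposed immediately; the fraction that fails the sampled
    # line-width or overlay measurement is stripped, recoated, and re-exposed.
    return lot_exposure_h + rework_fraction * (lot_exposure_h + rework_overhead_h)

# Illustrative numbers: a 25-wafer lot at 60 wph, a 1 h send-ahead
# turnaround, and a 2% rework rate with 0.5 h of strip/recoat overhead.
lot_h = 25.0 / 60.0
print(send_ahead_lot_hours(lot_h, 1.0, interleaved=False))
print(send_ahead_lot_hours(lot_h, 1.0, interleaved=True))
print(risk_strategy_lot_hours(lot_h, 0.02, 0.5))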
1.13 Optical Enhancement Techniques

An optical lithographic projection system is usually designed for perfection in each component of the system. The mask is designed to represent the ideal pattern that the circuit designer intends to see on the surface of the wafer. The projection lens is designed to form the most accurate image of the mask that is possible. The photoresist and etch processes are designed to faithfully capture the image of the mask and transfer its pattern into the surface of the wafer. Any lack of fidelity in the image transfer, whether caused by mechanical imperfections or by fundamental limitations in the physics and chemistry, tends to be cumulative. The errors in the mask are faithfully transmitted by the optics, and the optical diffractive limitations of the projection lens are just as faithfully recorded by the photoresist. Instead of striving for perfect masks, perfect optics, and perfect resist and etch processes, it may be more practical for some of these elements of the lithographic process to be designed to compensate for the deficiencies of the others. For example, masks can be designed to correct some undesirable effects of optical diffraction.
The nonlinear nature of photoresist has been exploited since the earliest days of microlithography to compensate for the shallow slope of the aerial image's intensity profile. Etch biases can sometimes compensate for biases of the aerial image and the photoresist process. It is sometimes possible to use a trick of optics to enhance some aspect of the image forming process. Usually, this exacts a cost of some sort. For example, the multiple-exposure process known as focus latitude enhancement exposure (FLEX) (described below) can greatly enhance the depth of focus of small contact holes in a positive-toned resist. This comes at the cost of reduced image contrast for all the features on the mask. However, for some applications, this tradeoff can be very advantageous.

1.13.1 Optical Proximity Corrections

The physics of optical image formation leads to interference between closely spaced features within the aerial image. This can lead to a variety of undesirable effects. As discussed in Section 1.5.6, a prominent proximity effect is the relative image size bias between isolated and tightly grouped lines in the aerial image. This effect can be disastrous if the circuit design demands that isolated and grouped lines print at the same dimension, as when the lines form gates of transistors that must all switch at the same speed. In simple cases, the circuit designer can manually introduce a dimensional bias into his or her design that will correct for the optical proximity effect. To extend this to a general optical proximity correction (OPC) algorithm requires a fairly massive computer program with the ability to model the aerial image of millions of individual features in the mask pattern and add a correcting bias to each. The more sophisticated of these programs attempt to correct the 2-dimensional shape of the aerial image in detail instead of just adding a 1-dimensional bias to the line width. This can result in a mask design so complicated and with such a large number of tiny pattern corrections that the mask generation equipment and the automated die-to-database inspection systems used to inspect the mask for defects are taxed by the massive quantity of data.

Another technique of OPC is to add sub-resolution assist features (SRAFs) to the design. By surrounding an isolated line with very narrow assist lines, sometimes called scattering bars, the isolated line will behave as though it were part of a nested group of lines even though the assist lines are so narrow that their images are not captured by the photoresist. Rather complex design rules can be generated to allow automated computer generation of assist features.

Pattern-density biases are only one form of optical proximity effect. Corner rounding, line-end shortening, and general loss of shape fidelity in small features are caused by the inability of the lithographic projection lens to resolve details below the optical diffraction limit of that lens. These effects are often classified as another form of optical proximity effect. Corner rounding can be reduced by the addition of pattern structures that enhance the amount of light transmitted through the corners of transparent mask features or by increasing the amount of chromium absorber at the corners of opaque mask features. These additional structures are often called anchors or serifs, in analogy to the tiny decorations at the ends of lines in printed letters and numerals.
The serifs effectively increase the modulation at high spatial frequencies in order to compensate for the diffractive loss of high spatial frequencies in the transmission of the lithographic lens [38]. The methods of OPC are being rapidly implemented in semiconductor manufacturing, driven by the increased nonlinearity and reduced image fidelity of low k-factor lithography. Selective pattern bias, SRAF structures, and corner serifs can all be added to a mask design by automated computer programs. Two methods called rules-based and model-based OPC are in use. In rules-based OPC, the pattern is corrected according to a tabulated set of rules. For example, the program may scan the mask pattern and
FIGURE 1.28 A T-shaped feature (a) with dimensions near the resolution limit of the projection lens is printed as a rather featureless blob (c) on the wafer. Addition of decorative serifs (b) brings the printed image (d) much closer to the shape originally designed. The multitude of tiny features in (b) adds to the volume of data that must be processed when making and inspecting the mask.
automatically insert a sub-resolution scattering bar in any space larger than some specified minimum. Model-based OPC analyzes each pattern on the mask using a semi-empirical model of image formation, and it makes corrections to the pattern to improve the quality of the printed image. Model-based OPC requires considerably greater computational effort than rules-based OPC, but it usually produces a more accurate correction.

Once the methods of correcting optical diffraction are in place, the same methods can be extended to correct for nonlinearities in other parts of the manufacturing process. Corrections for feature-size nonlinearity in mask making, the photoresist process, and wafer etch processes can be folded into the OPC programs, giving a top-to-bottom image correction that includes all parts of the lithographic process (Figure 1.28). A large number of measurements are required to calibrate the OPC algorithm, and once the calibration has been done, all parts of the process are effectively frozen. Any changes to a single process, even changes that yield improvements, will usually require a new calibration of the entire chain of processes.

The enormous increase in pattern complexity and data volumes driven by OPC would have completely overloaded the data handling capacity of mask writers and inspection equipment just a few years ago. However, the exponential rate of improvement in lithography has driven a corresponding rate of improvement in affordable computer power. Today's computers can readily process mask designs approaching 100 GB of data, and they can accommodate very complex levels of proximity correction.

1.13.2 Mask Transmission Modification

Optical proximity correction by image size biasing or the addition of scattering bars or serifs uses fairly standard mask-making technology. The proximity corrections are simply modifications of the original design patterns. There are more radical ways to modify the mask to print images beyond the normal capability of the lithographic lens. Corner rounding and the general issue of shape fidelity could be remedied if there were a way to increase the mask transmission above 100% in the corners of transparent mask features. This is not physically possible, but there is an equivalent technique. If the transparent parts of the mask are covered with a partially absorbing film and the illumination intensity is increased to compensate for this, there will be no difference in the image formation compared to a standard mask image. If the partially absorbing film is now removed in selected regions of the mask, the desired effect of greater than 100% mask transmission will be effectively achieved [39]. If free rein is given to the imagination, masks can be designed with multiple levels of transmission, approaching a continuous gray scale in the limit. If such a mask could be
built, it could provide a high level of correction for optical proximity effects, yet the practical difficulties of such a task are disheartening. There are no technologies for patterning, inspecting, or repairing masks with multiple levels of transmission. Although such technologies could be developed, the enormous cost would probably not be worth the modest benefits of gray-scale masks.

1.13.3 Phase-Shifting Masks

The concept of increasing the resolution of a lithographic image by modifying the optical phase of the mask transmission was proposed by Levenson et al. in 1982 [40]. This proposal was very slow to catch the interest of the lithographic community because of the difficulties of creating a defect-free phase-shifting mask and the continual improvement in the resolution achievable by conventional technologies. As the level of difficulty in conventional lithography has increased, there has been a corresponding surge of interest in phase-shifting masks. Several distinct types of phase masks have been invented. All share the common feature that some transparent areas of the mask are given a 180° shift in optical phase relative to nearby transparent areas. The interaction between the aerial images of two features with a relative phase difference of 180° generates an interference node or dark band between the two features. This allows two adjacent bright features to be printed much closer together than would be the case on a conventional mask. Except for the obvious difficulties in fabricating such masks, the drawbacks to their use are surprisingly small.

The first type of optical phase-shifting mask to be developed was the so-called alternating-phase mask. In this type of mask, closely spaced transparent features are given alternate phases of 0° and 180°. The interference between the alternating phases allows the features to be spaced very closely together. Under ideal circumstances, the maximum resolution of an alternating-phase mask may be 50% better than that of a conventional mask. A mask consisting of closely spaced transparent lines in an opaque background gains the maximum benefit from this phase-shifting technique (Figure 1.29).

Some feature geometries may make it difficult or impossible to use the alternating-phase approach. For example, tightly packed features that are laid out in a brick pattern with alternating rows offset from each other cannot be given phase assignments that allow every feature to have a phase opposite to that of its neighbors. Non-repetitive patterns can rarely be given phase assignments that meet the alternating-phase requirement. Another type of problem occurs in a mask with opaque features in a transparent background. Although there may be a way to create an alternating-phase pattern within a block of opaque features, there will be a problem at the edges of the array where the two opposite phases must meet at a boundary. Interference between the two phases will make this boundary print as a dark line. Cures for these problems have been proposed, involving the use of additional phase values between 0° and 180°, but few of these cures have been totally satisfactory (Figure 1.30).

Phase-shifted regions on an alternating-phase mask can be created either by etching the proper distance into the fused silica mask substrate or by adding a calibrated thickness of a transparent material to the surface of the mask.
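The proper etch depth follows from requiring an optical path difference of half a wavelength between the etched and unetched clear regions: (n - 1)d = λ/2, where n is the refractive index of fused silica at the exposure wavelength. The short Python sketch below evaluates this for a few common wavelengths; the index values are approximate and are given here only for illustration.

    # Etch depth giving a 180-degree phase shift in a clear area of a fused-silica mask:
    # (n - 1) * d = wavelength / 2, so d = wavelength / (2 * (n - 1)).
    # The refractive indices are approximate values for fused silica.
    for wavelength_nm, n_silica in [(365, 1.47), (248, 1.51), (193, 1.56)]:
        depth_nm = wavelength_nm / (2.0 * (n_silica - 1.0))
        print(f"{wavelength_nm} nm exposure: etch depth of roughly {depth_nm:.0f} nm")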
The regions that receive the phase shift must be defined in a second mask-writing process and aligned accurately to the previously created chromium mask pattern. Techniques for inspection and repair of phase defects are under development and have achieved a good level of success. Bumps of unetched material in an etched region of the mask can be removed with gas-assisted ion milling or mechanical microplaning techniques (see Section 1.11.6). There is still no way of successfully repairing a pit accidentally etched into a region where no etch is desired. The only way to prevent this type of defect is to ensure that there are no pinholes in the second-level resist coating.
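The contrast benefit illustrated in Figure 1.29 can be reproduced with a very simple scalar imaging calculation. The Python/NumPy sketch below assumes a one-dimensional mask, perfectly coherent on-axis illumination, and an ideal lens modeled as a hard cutoff at spatial frequency NA/λ; the wavelength, NA, and pitch values are chosen only for illustration. It compares a conventional binary grating with the same grating when alternate clear lines are given a 180° phase shift (transmission of -1).

    import numpy as np

    wavelength = 0.248   # exposure wavelength in microns (assumed)
    na = 0.6             # numerical aperture (assumed)
    pitch = 0.35         # grating pitch in microns, below the coherent resolution limit
    n_points = 4096
    window = 16 * pitch  # simulation window, an integer number of periods

    x = (np.arange(n_points) - n_points / 2) * window / n_points
    clear = np.mod(x, pitch) < pitch / 2          # clear lines occupy half of each period

    # Conventional binary mask: transmission 1 in clear lines, 0 elsewhere.
    binary_mask = np.where(clear, 1.0, 0.0)

    # Alternating-phase mask: every other clear line transmits with a 180 deg phase (-1).
    line_parity = np.mod(np.floor(x / pitch), 2)
    alt_psm_mask = np.where(clear, np.where(line_parity == 0, 1.0, -1.0), 0.0)

    def coherent_image(mask):
        """Low-pass filter the mask spectrum at the lens cutoff NA/wavelength
        and return the image intensity (scalar, coherent illumination)."""
        freqs = np.fft.fftfreq(n_points, d=window / n_points)
        spectrum = np.fft.fft(mask)
        spectrum[np.abs(freqs) > na / wavelength] = 0.0
        return np.abs(np.fft.ifft(spectrum)) ** 2

    for name, mask in [("binary mask", binary_mask), ("alternating PSM", alt_psm_mask)]:
        image = coherent_image(mask)
        contrast = (image.max() - image.min()) / (image.max() + image.min())
        print(f"{name:15s} image contrast: {contrast:.2f}")

With these numbers the binary grating's ±1st orders fall outside the pupil and the image contrast collapses toward zero, while the phase-shifted grating diffracts at half the spatial frequency and prints with nearly full contrast, which is the behavior sketched in Figure 1.29.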
FIGURE 1.29 The benefits of an alternating-phase mask. (a) Aerial image of a line-space grating near the resolution limit of a stepper using a conventional mask. The line width of the image is 0.5 λ/NA. When alternating clear areas are given a 180° phase shift on the mask as in (b), the contrast of the aerial image is markedly improved.
Because alternating-phase masks are not universally applicable to all types of mask patterns, other types of phase-shifting techniques have been devised. With one method, a narrow, 180° phase-shifted rim is added to every transparent feature on the mask. The optical interference from this rim steepens the slope of the aerial image at the transition between transparent and opaque regions of the mask. A variety of procedures have been invented for creating this phase rim during the mask-making process without requiring a second, aligned, mask-writing step. Masks using this rim-shifting technique do not provide as much lithographic benefit as do alternating-phase masks, but they do not suffer from the pattern restrictions that afflict the alternating-phase masks. Most of the current interest in rim-shifting phase masks is centered on contrast enhancement for contact level masks. Another pattern-independent phase-shifting technique is the use of
FIGURE 1.30 Intractable design problems for alternating-phase masks. In (a) there is no way to assign phases so that a 180° phase difference occurs between adjacent transparent features. In this example, the phases have been assigned to the odd-numbered rows, but there is no way to assign phases consistently to the even rows. (b) A problem that occurs when a mask consists of isolated opaque features in a clear background. Although the alternating-phase condition is met within the array of lines and spaces, the opposite phases will collide at the end of each line. At the boundaries marked by a dashed line, an unwanted, thin dark line will print.
a partially transmitting, 180° phase-shifting film to replace the chromium absorbing layer on the mask. These masks are often called attenuated phase-shifting masks. Interference between light from the transparent regions of the mask and phase-shifted light passing through the partially transmitting absorber gives a steep slope to the aerial image of feature edges. Transmission of the phase-shifting absorber in the range from 5 to 10% provides a modest benefit in contrast, and it does not seem to have any negative effects on image fidelity. Higher levels of transmission (around 20%) give a stronger contrast enhancement, but at the cost of fairly severe pattern distortions such as ghost images at the edges of grating arrays. High-transmission attenuated phase masks may be designed with opaque chromium regions strategically placed to block the formation of ghost images. This type of design is sometimes called a tritone mask (Figure 1.31). Away from the interference region at the edges of the mask features, a partially transmitting absorber allows a fairly undesirable amount of light to fall on the photoresist. However, a high-contrast photoresist is not seriously affected by this light that falls below the resist's exposure threshold.

The contrast enhancement from partially transmitting phase shifters is rather mild compared to alternating-phase masks. However, the technology for making these masks is much easier than for other styles of phase masks. The phase-shifting absorber can usually be patterned just like the chromium layer in a conventional mask. With appropriate adjustments in detection thresholds, mask defects can be detected with normal inspection equipment. Although the inspection does not reveal any errors in phase, it does detect the presence or absence of the absorbing film. Isolated defects, either clear or opaque, can be repaired with conventional mask repair techniques. Defects in the critical region at the edge of an opaque feature cannot yet be perfectly repaired with the correct phase and transmission. Despite the problem with mask repair, attenuating phase shift masks are now commonly used in semiconductor
FIGURE 1.31 The benefits of partially transparent phase-shifting mask absorbers. (a) Aerial image of a line-space grating using a conventional mask. The line width of the image is 0.7 λ/NA. (b) Aerial image that results when the opaque absorber is replaced with a material that transmits 6% of the incident light with a 180° phase shift. The slope of the aerial image is steepened, but a certain amount of light leaks into the dark spaces between the bright lines.
manufacturing. They provide a valuable improvement in image contrast in exchange for some increase in mask cost and delivery time.

The most radical application of phase-shifting technology is in the phase edge mask. This type of mask consists only of transparent fused silica with a pattern etched into the surface to a depth yielding a 180° phase shift. Only the edges of the etched regions project images onto the wafer, but these images are the smallest features that can be transmitted through the lithographic lens. The resolution of a phase edge mask can be twice as good as that of a conventional chromium mask. There are serious limitations to the types of features that can be printed with a phase edge mask. All of the lines in the printed pattern represent the perimeter of etched regions on the mask, so they must always form closed loops. It is rare that a semiconductor circuit requires a closed loop. The loops may be opened by exposing the pattern with a second trimming mask, but this adds a great deal of complexity to the process. All of the lines printed by a phase edge mask are the same width. This puts a serious constraint on the circuit designer who is used to considerably greater latitude in the types of features he or she can specify. There is a possibility that hybrid masks containing some conventional chromium features and some phase edge structures may provide the ultimate form of phase mask, but the challenges to mask fabrication, inspection, and repair technologies are severe.

1.13.4 Off-Axis Illumination

Until recently, the standard form of illumination for lithographic lenses was a circular pupil fill centered in the entrance pupil of the projection optics. The only variable that lithographers occasionally played with was the pupil filling ratio that determines the degree of partial coherence in the image formation. In 1992, researchers at Canon, Inc. [41] and the Nikon Corporation [42] introduced quadrupole illumination that has significant benefits for imaging small features. Today, several different pupil illumination patterns are available on steppers, often as software-selectable options. These can be selected as appropriate for each mask pattern that is exposed on the stepper.

The process of image formation can most easily be understood for a simple structure such as a grating of equal lines and spaces. Also for simplicity, it is best to consider the contribution to the image formation from a single point of illumination. The actual image formed by an extended source of illumination is just the sum of the images formed by the individual point sources within the extended source. A grating mask, illuminated by a single point of illumination, will create a series of diffracted images of the illumination source in the pupil of the lithographic lens. The lens aperture acts as a filter, excluding the higher diffracted orders. When the lens recombines the diffracted orders that fall within its aperture, it forms an image with the higher spatial frequency content removed. A grating with the minimum resolvable pitch will cast its ±1st-order diffraction just inside the lens aperture along with the undiffracted illumination point at the center of the pupil (the 0th diffraction order). The diffraction from gratings with smaller pitches will fall completely outside the pupil aperture, and the gratings will not be resolved (Figure 1.32).
If the point of illumination is moved away from the center of the pupil, closer to the edge of the aperture, then it is possible to resolve a grating with a smaller pitch than can be resolved with on-axis illumination. The −1st diffracted order will now fall completely out of the pupil aperture, but the 0th and +1st orders will be transmitted to form an image. The asymmetry between the 0th and +1st order leads to severe telecentricity errors. However, the symmetry can be restored by illuminating with two point sources placed on opposite sides of the pupil. The final image will be composed of the 0th and +1st diffracted orders from one source and the 0th and −1st diffracted orders from the other.
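The resolution gain can be quantified with the grating equation. A grating of pitch p places its ±1st orders at spatial frequency 1/p away from the 0th order, and an order is transmitted only if it lies within NA/λ of the pupil center. The short sketch below compares the two illumination cases; the wavelength and NA are assumed values used only for illustration.

    # Minimum grating pitch that keeps two diffraction orders inside the pupil.
    # On-axis (coherent) illumination: 0th order at the pupil center, +/-1st orders at
    # +/-1/p, so 1/p <= NA/wavelength and p_min = wavelength/NA.
    # Extreme off-axis illumination: 0th order at one pupil edge, +1st order at the
    # other, so 1/p <= 2*NA/wavelength and p_min = wavelength/(2*NA).
    wavelength_nm = 248.0   # assumed exposure wavelength
    na = 0.6                # assumed numerical aperture

    print(f"on-axis minimum pitch:  {wavelength_nm / na:.0f} nm")
    print(f"off-axis minimum pitch: {wavelength_nm / (2.0 * na):.0f} nm")

The factor-of-two improvement is the ideal limit for a single point source at the pupil edge; a real off-axis source of finite size gives a smaller, but still substantial, benefit.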
FIGURE 1.32 With conventional illumination, the incident light is directed at the center of the lens pupil. A mask consisting of a line-space grating near the resolution limit of the lens diffracts light into a multitude of symmetrical diffracted orders. The lens accepts only the 0th and ±1st orders. When these orders are recombined into an image, the high spatial frequencies contained in the 2nd and higher orders are lost, producing an aerial image with reduced contrast. When the mask pitch becomes so small that the ±1st diffracted orders fall outside the lens aperture, the image contrast falls to zero.
This type of illumination is called dipole illumination, and it provides the best possible resolution for conventional masks with features oriented perpendicular to a line passing through the two illuminated spots in the pupil. On the other hand, features oriented parallel to this line will not see the dipole illumination and will have a much larger resolution limit. Differences in imaging properties for two different orientations of lines are usually not desirable, although it is possible to imagine a mask design with all of the critical dimensions aligned along one axis (Figure 1.33).

In order to improve the symmetry between the x and y axes, it seems fairly obvious to add another pair of illuminated spots at the top and bottom of the pupil. A more careful analysis of this situation shows that it has an undesirable characteristic. The light sources at 6 o'clock and 12 o'clock spoil the dipole illumination for vertically oriented lines, and the light sources at 3 o'clock and 9 o'clock have the same effect on horizontally oriented lines. The quadrupole illumination introduced by Nikon and Canon is a more clever way of achieving dipole illumination along two axes. The four illuminated spots are placed at the ends of two diagonal lines passing through the center of the pupil. This provides two dipole illumination patterns for features oriented along either the x or y axis. The separation of the two dipoles can be, at most, 70% of the separation of a single dipole, so the enhancement in resolution is not nearly as great, but the ability to print both x- and y-oriented features with the same resolution is fairly important. It should be noted that features oriented along the ±45° diagonals of the field will not see the dipole illumination, and they will suffer considerably worse resolution than features oriented along the x and y axes. Although this is somewhat undesirable, it can usually be tolerated because critical features are rarely oriented at odd angles to the sides of the chip (Figure 1.34).

Some of the benefits of dipole or quadrupole illumination can be achieved with an annular ring of illumination. Annular illumination does not give as strong a resolution enhancement as the other two forms of off-axis illumination, but it does have a completely symmetrical imaging behavior.
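The 70% figure follows from simple geometry: for a feature oriented along x or y, only the component of each pole's offset along the pitch direction of that feature matters, and a pole placed on a 45° diagonal projects onto either axis with a factor of cos 45°. A quick check (values are illustrative):

    import math

    # A quadrupole pole at radial offset r on the 45-degree diagonal is seen by
    # x- or y-oriented gratings as an effective offset of only r*cos(45 deg).
    print(f"effective offset fraction: {math.cos(math.radians(45.0)):.2f}")   # about 0.71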
FIGURE 1.33 With off-axis illumination, the incident light i is directed toward the edge of the lens pupil. The 0th and +1st diffracted orders are captured to form an image. Because the 0th and 1st orders can be separated by the full width of the pupil, an image can be formed with a much tighter pitch than would be possible with conventional illumination. To create symmetrical pupil illumination, a second off-axis beam i′ is used as well. The 0th and −1st orders of this beam form another image identical to that formed by the illumination from i. The intensities from these two identical images are added together in the final image.
FIGURE 1.34 Three unconventional forms of pupil illumination. The shaded areas represent the illuminated portions of the circular pupil. (a) Dipole illumination that provides a benefit only for lines oriented along the y axis. Quadrupole illumination, illustrated in (b), provides benefits for lines oriented along the x and y axes but gives much poorer results for 45° angled lines. Annular illumination (c) provides milder benefits than dipole or quadrupole illumination, but the benefits are independent of feature orientation.
One of the most attractive features of off-axis illumination is its relative ease of use. The stepper manufacturer can usually provide any of the illumination patterns described above by simply inserting an aperture at the appropriate location in the stepper's illuminator. Often, a series of apertures can be provided on a turret, and the particular aperture desired for any mask can be automatically supplied by the stepper control program. The most difficult design issues for these forms of illumination are the loss of illumination intensity when an off-axis aperture is used and maintenance of good field illumination uniformity when changing from one aperture to another.

1.13.5 Pupil Plane Filtration

Modifications to masks and the illumination system have been intensively studied for their benefits to the practice of lithography. The projection lens is the last remaining optical component that could be modified to provide some sort of lithographic enhancement. Any modification to the lens behavior can be defined in terms of a transmission or phase filter in the pupil plane. Apodization filters are a well-known example of pupil modification sometimes used in astronomical telescopes. By gradually reducing the transmission toward the outer parts of the pupil, an apodization filter reduces optical oscillations, or ringing, at the edges of an image. (Apodize was coined from the Greek words meaning "no foot"; the foot on the image is caused by the abrupt discontinuity in transmission of high spatial frequencies at the limits of the pupil.) These oscillations are very small at the coherence values used in lithography, and they are generally of no concern; however, filters with different patterns of transmission may have some advantages. Reducing the transmission at the center of the pupil can enhance the contrast of small features at the cost of reducing the contrast of large features. Phase modifications in the lens pupil can affect the relative biases between isolated and grouped features. It should be noted that a phase variation in the pupil plane is just another name for an optical aberration. A phase filter in the pupil plane simply introduces a controlled aberration into the projection optics. Combinations of phase and transmission filtration can sometimes be found that enhance the depth of focus or contrast of some types of image features. The benefits that could be realized from pupil plane filtration are more easily achieved by optical proximity corrections on the mask. As OPC has become increasingly common, interest in pupil plane filtration has declined.
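In a scalar imaging model, a pupil filter is simply a multiplicative function of spatial frequency applied to the mask spectrum, so it is easy to experiment with in a simulation. The hard cutoff used in the coherent-imaging sketch given earlier for phase-shifting masks corresponds to a top-hat pupil; the hypothetical function below replaces it with a filter that can attenuate the pupil center and roll off smoothly toward the edge. It is an illustration of the idea only, not a filter design from this chapter.

    import numpy as np

    def pupil_filter(freqs, na, wavelength, center_transmission=1.0, taper=0.0):
        """Illustrative pupil transmission: optional attenuation at the pupil center
        (center_transmission < 1) and an optional cosine roll-off over the outer
        'taper' fraction of the pupil radius. freqs are spatial frequencies."""
        cutoff = na / wavelength
        rho = np.abs(freqs) / cutoff                 # normalized pupil radius
        t = center_transmission + (1.0 - center_transmission) * np.clip(rho, 0.0, 1.0)
        if taper > 0.0:
            roll = 0.5 * (1.0 + np.cos(np.pi * np.clip((rho - (1.0 - taper)) / taper, 0.0, 1.0)))
            t = t * roll
        return np.where(rho <= 1.0, t, 0.0)

    # Example: 50% transmission at the pupil center, full transmission at the edge.
    freqs = np.linspace(-3.0, 3.0, 7)                # spatial frequencies in 1/micron
    print(pupil_filter(freqs, na=0.6, wavelength=0.248, center_transmission=0.5))

Multiplying such a function into the filtered spectrum in place of the hard cutoff models the filter's effect: attenuating the pupil center weakens the 0th order relative to the ±1st orders of a fine grating, which is the small-feature contrast enhancement described above, purchased at the cost of light loss and poorer contrast for large features.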
1.14 Lithographic Tricks

A variety of ingenious techniques have been used in the practice of lithography. Some of these tricks are used in the day-to-day business of exposing wafers, and others are only useful for unusual requirements of an experiment or early development project. The following are the most interesting tricks available.

1.14.1 Multiple Exposures through Focus (FLEX)

In 1987, researchers at the Hitachi Corporation came up with an ingenious method for increasing the depth of focus that they called FLEX for Focus Latitude Enhancement Exposure [43]. This technique works especially well for contact holes. These minimum-sized transparent features in an opaque background typically have the shallowest depth
of focus of anything that the lithographer tries to print. If the etched pattern on the wafer's surface has a large amount of vertical height variation, it may be impossible to print contact holes on the high and low parts of the pattern at the same focus setting. Fukuda et al. realized that the exposure field can be exposed twice: once with low regions of the surface in focus and once focusing on the high regions. Each contact hole image will consist of two superimposed images: one in focus and one out of focus. The out-of-focus image spreads over a broad region and only contributes a small background haze to the in-focus image.

This technique can also be used on wafer surfaces with more than two planes of topography or with random surface variations caused by wafer non-flatness. The two exposures are made with only a slight change in focus so that their in-focus ranges overlap. This effectively stretches the depth of focus without seriously degrading the image contrast by the presence of the out-of-focus image. The technique can be further extended by exposing at three or more different focal positions or even continuously exposing through a range of focus. If an attempt is made to extend the focal range too much, the process will eventually fail because the many out-of-focus images will degrade the contrast of the in-focus image until it is unusable.

Isolated bright lines and line-space gratings receive fewer benefits from FLEX than do contact holes (Figure 1.35). The out-of-focus image of a line or grating does not subside into a negligible background as fast as that of a contact hole, and it induces a much greater degradation in the in-focus image. If the reduced contrast can be compensated by a high-contrast resist process, some increased depth of focus may be achieved for line-space gratings. The greatest benefit of FLEX is seen in extending the depth of focus of contact holes.

The use of FLEX tends to increase exposure time somewhat because of the stepper's need to shift focus one or more times during the exposure of each field. The technique also requires modification to the stepper software to accommodate the double exposure. Otherwise, it is one of the easiest lithographic tricks to implement, and it seems to be used fairly frequently.
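The mechanism can be illustrated with a deliberately crude model in which the aerial image of an isolated contact hole is treated as a Gaussian spot whose width grows, and whose peak intensity falls, with defocus. This is a toy model with invented parameters, not a diffraction calculation, but it shows how splitting the dose between two focus settings widens the range of wafer positions over which a usable image is delivered.

    import numpy as np

    def contact_image(x, defocus_um, w0=0.15, spread=0.25):
        """Toy image of a contact hole: a Gaussian whose width grows with defocus.
        w0 and spread are arbitrary illustrative parameters (microns)."""
        w = np.sqrt(w0 ** 2 + (spread * defocus_um) ** 2)
        return (w0 / w) * np.exp(-(x / w) ** 2)

    x = np.linspace(-1.0, 1.0, 801)
    surface_positions = np.linspace(-1.5, 1.5, 13)   # wafer surface offsets in microns

    for label, focus_settings in [("single exposure", [0.0]),
                                  ("FLEX, two exposures", [-0.6, +0.6])]:
        peaks = []
        for z in surface_positions:
            image = sum(contact_image(x, z - f) for f in focus_settings) / len(focus_settings)
            peaks.append(image.max())
        usable = surface_positions[np.array(peaks) > 0.7 * max(peaks)]
        print(f"{label}: usable surface range about {usable.min():+.2f} to {usable.max():+.2f} um")

With these invented numbers the usable range roughly doubles for the double exposure, while the best-focus peak intensity drops somewhat, mirroring the behavior shown in Figure 1.35.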
FIGURE 1.35 The focus latitude enhancement exposure (FLEX) technique allows a great increase in depth of focus for small, isolated, bright features. (a) Aerial image of a 0.35 μm contact hole through ±1 μm of focus. (b) Aerial image of the same feature, double exposed with a 1.75 μm focus shift between exposures. The depth of focus is nearly double that of the conventional exposure technique. With close inspection, it can be seen that the contrast of the aerial image at best focus is slightly worse for the FLEX exposures.
1.14.2 Lateral Image Displacement
Another trick involving double-exposed images can be used to print lines that are smaller than the normal resolution of the lithographic optics. If the aerial image is laterally shifted between two exposures of a dark line, then the resulting latent image in resist will be the sum of the two exposures. The left side of the image will be formed by one of the two exposures, and the right side by the other. By varying the amount of lateral shift between the two exposures, the size of the resulting image can be varied, and very small lines may be produced. Horizontal and vertical lines can be produced at the same time with this technique by shifting the image along a 45° diagonal.

The benefits of this trick are rather mild, and the drawbacks are rather severe. The only difference between the aerial image of a single small line and that of a line built up from two exposures of image edges is the difference in coherence between the two cases (Figure 1.36). In the lateral image displacement technique, the light forming one edge of the image is incoherent with the light forming the other edge. In a single small feature, the light forming the two edges has a considerable amount of relative coherence. This difference gives a modest benefit in contrast to the image formed with lateral image displacement. Images formed with this technique cannot be printed on a tight pitch. A grating with equal lines and spaces is impossible because of the constraints of geometry. Only lines with horizontal or vertical orientation can be used.
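The idea can be sketched with a toy model in which each exposure contributes the image of a single bright-field edge, approximated by a smooth error-function profile. The blur width, displacement values, and 30% dose threshold below are arbitrary illustrative numbers, not results from this chapter.

    from math import erf

    def edge_image(x, edge_position, blur=0.08, dark_on_right=True):
        """Toy aerial image of a chrome edge: intensity near 1 on the bright side and
        near 0 on the dark side, with a transition of width 'blur' (microns)."""
        profile = 0.5 * (1.0 - erf((x - edge_position) / blur))
        return profile if dark_on_right else 1.0 - profile

    xs = [i / 1000.0 - 1.0 for i in range(2001)]          # positions from -1 to +1 um

    for shift in (0.15, 0.20, 0.30):
        # Exposure 1 defines the left edge of the final dark line and exposure 2 the
        # right edge; the summed dose stays low only where the two dark regions overlap.
        dose = [edge_image(x, -shift / 2, dark_on_right=True)
                + edge_image(x, +shift / 2, dark_on_right=False) for x in xs]
        dark = [x for x, d in zip(xs, dose) if d < 0.3]   # arbitrary 30% dose threshold
        print(f"shift {shift:.2f} um -> printed line width about {max(dark) - min(dark):.2f} um")

The printed line width tracks the displacement between the two exposures, illustrating how very small dark lines can be dialed in; the severe restrictions on pitch and orientation noted above remain.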
FIGURE 1.36 Use of lateral image displacement to produce very small dark features. Two large dark features are superimposed using a double exposure to create a very narrow dark feature in the region of overlap. (a) The result of adding the aerial images of two edges to produce a single dark feature. (b) The same double-exposed composite feature on graphical axes for comparison with the aerial image of a conventionally exposed dark feature in (c). Benefits of the technique are real, but the practical difficulties are severe. The technique only works with extremely coherent illumination (σ = 0.2 in this example).
Lateral image displacement has been used in a small number of experimental studies, but there do not seem to be any cases of its use in semiconductor manufacturing. Other techniques for producing sub-resolution features on large pitches are more easily used.

1.14.3 Resist Image Modifications

The most common way of producing images below the resolution limit of the lithographic optics is to use some chemical or etch technique to reduce the size of the developed image in resist. The resist may be partially eroded in an oxygen plasma to reduce the image size in a controlled way. This reduces the image size, but it cannot reduce the pitch. It also reduces the resist thickness, which is quite undesirable. Another trick, with nearly the opposite effect, is to diffuse a material into the exposed and developed resist to induce swelling of the resist patterns. In these cases, the spaces between the resist images can be reduced to dimensions below the resolution limits of the optics. Both of these tricks have seen limited use in semiconductor development laboratories and occasionally in semiconductor manufacturing as well. Often, they serve as stop-gap measures to produce very small features on relatively large pitches before the lenses become available to produce the needed image sizes with conventional lithographic techniques.

Simple changes of exposure can also be used to bias the size of a resist image. Overexposing a positive resist will make the resist lines become smaller, and underexposing will make the spaces smaller. This works very well for small changes in image size, and it is the usual method of controlling image size in a manufacturing line. However, using large over- or under-exposures to achieve sub-resolution lines or spaces usually results in a drastic loss of depth of focus and is not as controllable as the post-development resist image modifications.

1.14.4 Sidewall Image Transfer

Another technique with the ability to produce sub-resolution features is a processing trick called sidewall image transfer. A conformal coating of a material such as silicon dioxide is deposited over the developed resist image. Then, the oxide is etched with a very directional etch until the planar areas of oxide are removed. This leaves the resist features surrounded by collars of silicon dioxide. If the resist is then removed with an oxygen etch, only the oxide collars will remain. These collars form a durable etch mask, shaped like the outline of the original photoresist pattern. The effect is almost identical to that of a chromeless, phase edge mask. Only feature edges are printed, and all of the lines have a fixed, narrow width. All of the features form closed loops, and a second, trim mask is required to cut the loops open. In the case of sidewall image transfer, the line width is determined by the thickness of the original conformal oxide coating. Very narrow lines with well-controlled widths can be formed with this process. The line width control is limited by the verticality of the original resist image sidewalls, the accuracy of the conformal coating, and the directionality of the etch; and it is practically independent of optical diffraction effects. The pitch of the sidewall pattern can be one half that of the original pattern in resist (Figure 1.37). This trick does not seem to have been used in semiconductor manufacturing, mostly because of the serious design limitations and the relative complexity of the processing required.
It does provide a way of generating very small lines for early semiconductor device studies, long before the standard techniques of lithography can create the same image sizes.
FIGURE 1.37 Sidewall image transfer. This non-optical lithographic trick uses wafer processing techniques to produce extremely small patterns on a very tight pitch. The photoresist pattern produced in (a) has a conformal coating of a material such as silicon dioxide applied in (b). The oxide is etched away with a directional etch, leaving the wafer surface and the top of the resist pattern exposed but the sidewalls of the resist images still coated with oxide. After the resist is stripped, the freestanding oxide sidewalls form a series of very narrow features at twice the density (one half the pitch) of the original photoresist features.
1.14.5 Field Stitching

Most of the tricks described in this section were developed to surpass the resolution limits of the lithographic lens or to extend the depth of focus. Field stitching is a multiple exposure technique intended to increase the field size. Very large chips can be built up by accurately abutting two or more subchips, each of which can fit into a single exposure field of a stepper. For the most accurate stitching boundary, the wafer must be exposed with each of the subchips before the wafer is removed from the exposure chuck. This requires that one or more mask changing operations must be performed for each wafer exposure, greatly reducing the system throughput. Because of the high accuracy of automatic mask loading systems and wafer stages, the alignment at the field-stitching boundaries can be remarkably good. The chip must be designed with no features having critical tolerances right at the stitching boundary, and it may not be possible to traverse the stitching boundary with lines on the minimum pitch. Otherwise, there are few impediments to the use of field stitching.

Field-stitching strategies have not found their way into commercial manufacturing, partly because of the low throughput inherent in the scheme, but also because there is rarely a need for a chip that is too large to fit into a single stepper field. Field stitching is sometimes contemplated in the earliest stages of a development program when the only steppers that can support the small lithographic dimensions are experimental prototypes with small field sizes. However, commercial steppers with suitable field sizes have always been available by the time the chip reaches the manufacturing stage. Future development of field stitching in optical lithography has been largely preempted by the step-and-scan technology that, in some ways, can be considered a sort of continuous field-stitching technique.
Field stitching is commonly and successfully used in ebeam and scanned laser beam lithography. Mask writing systems are designed to stitch small scanned fields together without detectable errors at the stitching boundaries. Masked ebeam systems (the PREVAIL technology) are also designed to stitch multiple sub-millimeter fields into a final, seamless pattern on the wafer.
References

1. G. Moore. 1965. "Cramming more components onto integrated circuits," Electronics, 38:8, 114–117.
2. The International Technology Roadmap for Semiconductors. 2003 Edition.
3. B.J. Lin. 1975. "Deep UV lithography," J. Vac. Sci. Technol., 12:6, 1317–1375.
4. R. DellaGuardia, C. Wasik, D. Puisto, R. Fair, L. Liebman, J. Rocque, S. Nash et al. 1995. "Fabrication of 64 Mbit DRAM using x-ray lithography," Proc. SPIE, 2437: 112–125.
5. U. Behringer, P. Vettiger, W. Haug, K. Meissner, W. Ziemlich, H. Bohlen, T. Bayer, W. Kulcke, H. Rothuizen, and G. Sasso. 1991. "The electron beam proximity printing lithography, a candidate for the 0.35 and 0.25 micron generations," Microelectron. Eng., 13:1–4, 361–364.
6. T. Utsumi. 1999. "Low energy electron-beam proximity projection lithography: Discovery of a missing link," J. Vac. Sci. Technol., 17:6, 2897–2902.
7. D.A. Markle. 1974. "A new projection printer," Solid State Technol., 17:6, 50–53.
8. J.H. Bruning. 1980. "Optical imaging for microfabrication," J. Vac. Sci. Technol., 17:5, 1147–1155.
9. J.T. Urbano, D.E. Anberg, G.E. Flores, and L. Litt. 1994. "Performance results of large field mix-match lithography," Proc. IEEE/SEMI Adv. Semicond. Manuf. Conf., 38.
10. J.D. Buckley and C. Karatzas. 1989. "Step-and-scan: A system overview of a new lithography tool," Proc. SPIE, 1088: 424–433.
11. K. Suzuki, S. Wakamoto, and K. Nishi. 1996. "KrF step and scan exposure system using higher NA projection lens," Proc. SPIE, 2726: 767–770; D. Williamson, J. McClay, K. Andresen, G. Gallatin, M. Himel, J. Ivaldi, C. Mason et al. 1996. "Micrascan III, 0.25 μm resolution step and scan system," Proc. SPIE, 2726: 780–786.
12. J.H. Burnett and S.G. Kaplan. 2004. "Measurement of the refractive index and thermo-optic coefficient of water near 193 nm," J. Microlith. Microfab. Microsyst., 3:1, 68–72.
13. J.H. Chen, L.J. Chen, T.Y. Fang, T.C. Fu, L.H. Shiu, Y.T. Huang, N. Chen et al. 2005. "Characterization of ArF immersion process for production," Proc. SPIE, 5754: 13–22.
14. D. Gil, T. Bailey, D. Corliss, M.J. Brodsky, P. Lawson, M. Rutten, Z. Chen, N. Lustig, and T. Nigussie. 2005. "First microprocessors with immersion lithography," Proc. SPIE, 5754: 119–128.
15. H.C. Pfeiffer, D.E. Davis, W.A. Enichen, M.S. Gordon, T.R. Groves, J.G. Hartley, R.J. Quickle, J.D. Rockrohr, W. Stickel, and E.V. Weber. 1993. "EL-4, A new generation electron-beam lithography system," J. Vac. Sci. Technol. B, 11:6, 2332–2341.
16. Y. Okamoto, N. Saitou, Y. Haruo, and Y. Sakitani. 1994. "High speed electron beam cell projection exposure system," IEICE Trans. Elect., E77-C:3, 445–452.
17. T. Sandstrom, T. Fillion, U. Ljungblad, and M. Rosling. 2001. "Sigma 7100, A new architecture for laser pattern generators for 130 nm and beyond," Proc. SPIE, 4409: 270–276.
18. T.E. Jewell. 1995. "Optical system design issues in development of projection camera for EUV lithography," Proc. SPIE, 2437: 340–346.
19. H.C. Pfeiffer. 2000. "PREVAIL: Proof-of-concept system and results," Microelectron. Eng., 53:1, 61–66.
20. S.D. Berger and J.M. Gibson. 1990. "New approach to projection-electron lithography with demonstrated 0.1 μm linewidths," Appl. Phys. Lett., 57:2, 153–155; L.R. Harriot, S.D. Berger, C. Biddick, M.I. Blakey, S.W. Bowler, K. Brady, R.M. Camarda et al. 1997. "SCALPEL proof of concept system," Microelectron. Eng., 35:1–4, 477–480.
21. J. Zhu, Z. Cui, and P.D. Prewett. 1995. "Experimental study of proximity effect corrections in electron beam lithography," Proc. SPIE, 2437: 375–382.
22. W.H. Bruenger, H. Loeschner, W. Fallman, W. Finkelstein, and J. Melngailis. 1995. "Evaluation of critical design parameters of an ion projector for 1 Gbit DRAM production," Microelectron. Eng., 27:1–4, 323–326.
23. D.S. Goodman and A.E. Rosenbluth. 1988. "Condenser aberrations in Koehler illumination," Proc. SPIE, 922: 108–134.
24. V. Pol, J.H. Bennewitz, G.C. Escher, M. Feldman, V.A. Firtion, T.E. Jewell, B.E. Wilcomb, and J.T. Clemens. 1986. "Excimer laser-based lithography: A deep ultraviolet wafer stepper," Proc. SPIE, 633: 6–16.
25. M. Hibbs and R. Kunz. 1995. "The 193-nm full-field step-and-scan prototype at MIT Lincoln Laboratory," Proc. SPIE, 2440: 40–48.
26. J.M. Hutchinson, W.M. Partlo, R. Hsu, and W.G. Oldham. 1993. "213 nm lithography," Microelectron. Eng., 21:1–4, 15–18.
27. D. Flagello, B. Geh, S. Hansen, and M. Totzeck. 2005. "Polarization effects associated with hyper-numerical-aperture (>1) lithography," J. Microlith. Microfab. Microsyst., 4:3, 031104-1–031104-17.
28. M.N. Wilson, A.I.C. Smith, V.C. Kempson, M.C. Townsend, J.C. Schouten, R.J. Anderson, A.R. Jorden, V.P. Suller, and M.W. Poole. 1993. "Helios 1 compact superconducting storage ring x-ray source," IBM J. Res. Dev., 37:3, 351–371.
29. H.H. Hopkins. 1953. "On the diffraction theory of optical images," Proc. R. Soc. Lond., A-217: 408–432.
30. D.D. Dunn, J.A. Bruce, and M.S. Hibbs. 1991. "DUV photolithography linewidth variations from reflective substrates," Proc. SPIE, 1463: 8–15.
31. T.A. Brunner. 1991. "Optimization of optical properties of resist processes," Proc. SPIE, 1466: 297–308.
32. R. Rubingh, Y. Van Dommelen, S. Templaars, M. Boonman, R. Irwin, E. Van Donkelaar, H. Burgers et al. 2002. "Performance of a high productivity 300 mm dual stage 193 nm 0.75 NA Twinscan AT:1100B system for 100 nm applications," Proc. SPIE, 4691: 696–708.
33. K. Suwa and K. Ushida. 1988. "The optical stepper with a high numerical aperture i-line lens and a field-by-field leveling system," Proc. SPIE, 922: 270–276.
34. Y. Ikuta, S. Kikugawa, T. Kawahara, H. Mishiko, K. Okada, K. Ochiai, K. Hino, T. Nakajima, M. Kawata, and S. Yoshizawa. 2000. "New modified silica glass for 157 nm lithography," Proc. SPIE, 4066: 564–570.
35. B. LoBianco, R. White, and T. Nawrocki. 2003. "Use of nanomachining for 100 nm mask repair," Proc. SPIE, 5148: 249–261.
36. K. Okada, K. Ootsuka, I. Ishikawa, Y. Ikuta, H. Kojima, T. Kawahara, T. Minematsu, H. Mishiro, S. Kikugawa, and Y. Sasuga. 2002. "Development of hard pellicle for 157 nm," Proc. SPIE, 4754: 570–578.
37. R. Unger and P. DiSessa. 1991. "New i-line and deep-UV optical wafer steppers," Proc. SPIE, 1463: 725–742.
38. A. Starikov. 1989. "Use of a single size square serif for variable print bias compensation in microlithography: Method, design, and practice," Proc. SPIE, 1088: 34–46.
39. W.-S. Han, C.-J. Sohn, H.-Y. Kang, Y.-B. Koh, and M.-Y. Lee. 1994. "Overcoming of global topography and improvement of lithographic performance using a transmittance controlled mask (TCM)," Proc. SPIE, 2197: 140–149.
40. M.D. Levenson, N.S. Viswanathan, and R.A. Simpson. 1982. "Improving resolution in photolithography with a phase-shifting mask," IEEE Trans. Electron. Dev., ED-29:12, 1812–1846.
41. M. Noguchi, M. Muraki, Y. Iwasaki, and A. Suzuki. 1992. "Subhalf micron lithography system with phase-shifting effect," Proc. SPIE, 1674: 92–104.
42. N. Shiraishi, S. Hirukawa, Y. Takeuchi, and N. Magome. 1992. "New imaging technique for 64M-DRAM," Proc. SPIE, 1674: 741–752.
43. H. Fukuda, N. Hasegawa, and S. Okazaki. 1989. "Improvement of defocus tolerance in a half-micron optical lithography by the focus latitude enhancement exposure method: Simulation and experiment," J. Vac. Sci. Technol. B, 7:4, 667–674.
2 Optical Lithography Modeling

Chris A. Mack

CONTENTS
2.1 Introduction  98
2.2 Structure of a Lithography Model  98
2.3 Aerial Image Formation  100
    2.3.1 Basic Imaging Theory  100
    2.3.2 Aberrations  104
    2.3.3 Zero-Order Scalar Model  106
    2.3.4 First-Order Scalar Model  106
    2.3.5 High-NA Scalar Model  107
    2.3.6 Full Scalar and Vector Models  108
2.4 Standing Waves  109
2.5 Photoresist Exposure Kinetics  112
    2.5.1 Absorption  112
    2.5.2 Exposure Kinetics  115
    2.5.3 Chemically Amplified Resists  117
2.6 Photoresist Bake Effects  122
    2.6.1 Prebake  122
    2.6.2 Postexposure Bake  127
2.7 Photoresist Development  129
    2.7.1 Kinetic Development Model  129
    2.7.2 Enhanced Kinetic Development Model  131
    2.7.3 Surface Inhibition  132
2.8 Linewidth Measurement  133
2.9 Lumped-Parameter Model  135
    2.9.1 Development-Rate Model  135
    2.9.2 Segmented Development  137
    2.9.3 Derivation of the Lumped-Parameter Model  138
    2.9.4 Sidewall Angle  139
    2.9.5 Results  140
2.10 Uses of Lithography Modeling  141
    2.10.1 Research Tool  141
    2.10.2 Process Development Tool  142
    2.10.3 Manufacturing Tool  143
    2.10.4 Learning Tool  143
References  144
2.1 Introduction

Optical lithography modeling began in the early 1970s at the IBM Yorktown Heights Research Center, when Rick Dill began an effort to describe the basic steps of the lithography process with mathematical equations. At a time when lithography was considered a true art, such an approach was met with considerable skepticism. The results of this pioneering work were published in a landmark series of papers in 1975 [1–4], now referred to as the Dill papers. These papers not only gave birth to the field of lithography modeling, they represented the first serious attempt to describe lithography as a science. They presented a simple model for image formation with incoherent illumination—the first-order kinetic Dill model of exposure—and an empirical model for development coupled with a cell algorithm for photoresist profile calculation. The Dill papers are still the most referenced works in the body of lithography literature.

While Dill's group worked on the beginnings of lithography simulation, a professor from the University of California at Berkeley, Andy Neureuther, spent a year on sabbatical working with Dill. Upon returning to Berkeley, Neureuther and another professor, Bill Oldham, started their own modeling effort. In 1979, they presented the first result of their effort, the lithography modeling program SAMPLE [5]. SAMPLE improved the state of the art in lithography modeling by adding partial coherence to the image calculations and by replacing the cell algorithm for dissolution calculations with a string algorithm. More importantly, SAMPLE was made available to the lithography community. For the first time, researchers in the field could use modeling as a tool to understand and improve their lithography processes.

The author began working in the area of lithographic simulation in 1983 and, in 1985, introduced the model PROLITH (the positive resist optical lithography model) [6]. This model added an analytical expression for the standing wave intensity in the resist, a prebake model, a kinetic model for resist development (now known as the Mack model), and the first model for contact and proximity printing. PROLITH was also the first lithography model to run on a personal computer (the IBM PC), making lithography modeling accessible to all lithographers from advanced researchers to process development engineers and manufacturing engineers. Over the years, PROLITH advanced to include a model for contrast enhancement materials, the extended source method for partially coherent image calculations, and an advanced focus model for high numerical aperture (NA) imaging.

Since the late 1980s, commercial lithography simulation software has been available to the semiconductor community, providing dramatic improvements in the usability and graphics capabilities of the models. Modeling has now become an accepted tool for use in a wide variety of lithography applications.
2.2 Structure of a Lithography Model

Any lithography model must simulate the basic lithographic steps of image formation, resist exposure, postexposure bake diffusion, and development to obtain a final resist profile. Figure 2.1 shows a basic schematic of the calculation steps required for lithography modeling. Below is a brief overview of the physical models found in a typical lithography simulator:
FIGURE 2.1 Flow diagram of a lithography model: aerial image and standing waves → intensity within the resist film → exposure kinetics and PEB diffusion → concentration of photoactive compound → development kinetics and etch algorithm → developed resist profile.
† Aerial image: The extended source method, or Hopkins' method, can be used to predict the aerial image of a partially coherent, diffraction-limited or aberrated projection system based on scalar diffraction theory. Single-wavelength or broadband illumination is possible. The image model must account for the effect of image defocus through the resist film, at a minimum. Mask patterns can be one-dimensional lines and spaces or two-dimensional contacts and islands, as well as arbitrarily complex two-dimensional mask features. The masks often vary in the magnitude and phase of their transmission in what are called phase-shifting masks. The illumination source may be of a conventional disk shape or other more complicated shapes, as in off-axis illumination. For very high numerical apertures, vector calculations should be used.

† Standing waves: An analytical expression is used to calculate the standing-wave intensity as a function of depth into the resist, including the effects of resist bleaching on planar substrates. Film stacks can be defined below the resist with many layers between the resist and substrate. Contrast enhancement layers or top-layer antireflection coatings can also be included. The high-NA models should include the effects of nonvertical light propagation.

† Prebake: Thermal decomposition of the photoresist photoactive compound during prebake is modeled using first-order kinetics, resulting in a change in the resist's optical properties (the Dill parameters A and B). Other important effects of baking have not yet been modeled.

† Exposure: First-order kinetics are used to model the chemistry of exposure using the standard Dill ABC parameters. Both positive and negative resists can be used.

† Postexposure bake: A diffusion calculation allows the postexposure bake to reduce the effects of standing waves. For chemically amplified resists, this diffusion includes an amplification reaction which accounts for cross-linking, blocking, or deblocking in an acid-catalyzed reaction. Acid loss mechanisms and nonconstant diffusivity could also be needed.

† Development: A model relating resist dissolution rate to the chemical composition of the film is used in conjunction with an etching algorithm to determine the resist profile. Surface inhibition or enhancement can also be present. Alternatively, a data file of development rate information could be used in lieu of a model.
† CD measurement: The measurement of the photoresist linewidth should provide the accuracy and flexibility needed to match the model to an actual CD measurement tool.

The combination of the models described above provides a complete mathematical description of the optical lithography process. Use of the models incorporated in a simulation software package allows the user to investigate many significant aspects of optical lithography. The following sections describe each of the models in detail, including derivations of most of the mathematical models, as well as physical descriptions of their basis. Of course, more work has been done in the field of lithography simulation than is possible to report in one chapter. Typically there are several approaches, sometimes equivalent and sometimes not, that can be applied to each problem. Although the models presented here are representative of the possible solutions, they are not necessarily comprehensive reviews of all possible models.
2.3 Aerial Image Formation

2.3.1 Basic Imaging Theory

Consider the generic projection system shown in Figure 2.2. It consists of a light source, a condenser lens, the mask, the objective lens, and finally the resist-coated wafer. The combination of the light source and the condenser lens is called the illumination system. In optical design terms, a lens is a system of lens elements, possibly many. Each lens element is an individual piece of glass (refractive element) or a mirror (reflective element). The purpose of the illumination system is to deliver light to the mask (and eventually to the objective lens) with sufficient intensity, the proper directionality and spectral characteristics, and adequate uniformity across the field. The light then passes through the clear areas of the mask and diffracts on its way to the objective lens. The purpose of the objective lens is to pick up a portion of the diffraction pattern and project an image onto the wafer that will, ideally, resemble the mask pattern.

The first, and most basic, phenomenon occurring is the diffraction of light. Diffraction is typically thought of as the bending of light as it passes through an aperture, which is an
FIGURE 2.2 Block diagram of a generic projection system: light source, condenser lens, mask, objective lens, wafer.
appropriate description for diffraction by a lithographic mask. More accurately, diffraction theory simply describes how light propagates. This propagation includes the effects of the surroundings (boundaries). Maxwell's equations describe how electromagnetic waves propagate, but with partial differential equations of vector quantities that, for general boundary conditions, are extremely difficult to solve without the aid of a powerful computer. A simpler approach is to artificially decouple the electric and magnetic field vectors and describe light as a scalar quantity. Under most conditions, scalar diffraction theory is surprisingly accurate.

Scalar diffraction theory was first used rigorously by Kirchhoff in 1882, and involves performing one numerical integration (simpler than solving partial differential equations). Kirchhoff diffraction was further simplified by Fresnel for the case where the distance from the diffracting plane (the distance from the mask to the objective lens) is much greater than the wavelength of light. Finally, if the mask is illuminated by a spherical wave that converges to a point at the entrance to the objective lens, Fresnel diffraction simplifies to Fraunhofer diffraction.

Consider the electric field transmittance of a mask pattern as m(x,y), where the mask is in the x-y plane and m(x,y) has both magnitude and phase. For a simple chrome–glass mask, the mask pattern becomes binary: m(x,y) is 1 under the glass and 0 under the chrome. Let the x′-y′ plane be the diffraction plane, the entrance to the objective lens, and let z be the distance from the mask to the objective lens. Finally, assume monochromatic light of wavelength λ is used and that the entire system is in air (enabling its index of refraction to be dropped). Then, the electric field of the diffraction pattern, E(x′,y′), is given by the Fraunhofer diffraction integral:

$$E(x', y') = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} m(x,y)\, e^{-2\pi i (f_x x + f_y y)}\, dx\, dy \qquad (2.1)$$
where f_x = x′/(zλ) and f_y = y′/(zλ) are called the spatial frequencies of the diffraction pattern. For many scientists and engineers (electrical engineers, in particular), this equation should be quite familiar: it is simply a Fourier transform. Thus, the diffraction pattern (i.e., the electric field distribution as it enters the objective lens) is just the Fourier transform of the mask pattern. This is the principle behind an entire field of science called Fourier optics (for more information, consult Goodman's classic textbook [7]).

Figure 2.3 shows two mask patterns, one an isolated space, the other a series of equal lines and spaces, both infinitely long in the y direction. The resulting mask pattern functions, m(x), look like a square pulse and a square wave, respectively. The Fourier transforms are easily found in tables or textbooks and are also shown in Figure 2.3. The isolated space gives rise to a sinc function diffraction pattern, and the equal lines and spaces yield discrete diffraction orders. The diffraction pattern for equal lines and spaces deserves a closer look. Note that the graphs of the diffraction patterns in Figure 2.3 use spatial frequency as their x axis; since z and λ are fixed for a given stepper, the spatial frequency is simply a scaled x′ coordinate. At the center of the objective lens entrance (f_x = 0), the diffraction pattern has a bright spot called the zero order. The zero order is the light that passes through the mask and is not diffracted. The zero order can be thought of as DC light, providing power but no information as to the size of the features on the mask. To either side of the zero order are two peaks called the first diffraction orders. These peaks occur at spatial frequencies of ±1/p, where p is the pitch of the mask pattern (linewidth plus spacewidth). Because the position of these diffraction orders depends on the mask pitch, their position contains information about the pitch. It is this information that the objective lens will use to reproduce the image of the mask. In fact, for the objective lens to form a true image of the mask, it must have the zero order and at least one higher order. In addition to the first order, there can be many higher orders, with the nth order occurring at a spatial frequency of n/p.
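To make the connection to Equation 2.1 concrete, the short Python sketch below (my own illustration, not code from the text) evaluates the diffraction pattern of an equal line/space mask numerically with a fast Fourier transform. The pitch, linewidth, and grid spacing are arbitrary example values; the printed peaks fall at the zero order and at spatial frequencies of roughly ±1/p, as described above.

```python
# Illustrative sketch: Fraunhofer diffraction pattern (Eq. 2.1) of a binary mask.
import numpy as np

pitch = 0.50          # mask pitch p in microns (linewidth + spacewidth), assumed
linewidth = 0.25      # chrome linewidth in microns (equal lines and spaces), assumed
dx = 0.005            # sample spacing in microns
x = np.arange(-8 * pitch, 8 * pitch, dx)

# Binary chrome-on-glass mask: m(x) = 1 in the clear (glass) regions, 0 under chrome
m = (np.abs((x / pitch) - np.round(x / pitch)) * pitch >= linewidth / 2).astype(float)

# Diffraction pattern E(fx) = Fourier transform of m(x)
E = np.fft.fftshift(np.fft.fft(m)) * dx
fx = np.fft.fftshift(np.fft.fftfreq(x.size, d=dx))   # spatial frequency axis (1/um)

# The energy concentrates in discrete orders at fx = n/p; show the strongest bins.
order_idx = np.argsort(np.abs(E))[::-1][:5]
for i in sorted(order_idx, key=lambda k: fx[k]):
    print(f"fx = {fx[i]:+.2f} 1/um   |E| = {np.abs(E[i]):.3f}")
```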
FIGURE 2.3 Two typical mask patterns, an isolated space and an array of equal lines and spaces, and the resulting Fraunhofer diffraction patterns.
Summarizing, given a mask in the x-y plane described by its electric-field transmission m(x,y), the electric field M as it enters the objective lens (the x′-y′ plane) is given by

$$M(f_x, f_y) = \mathcal{F}\{m(x,y)\} \qquad (2.2)$$

where the symbol F represents the Fourier transform, and f_x and f_y are the spatial frequencies and are simply scaled coordinates in the x′-y′ plane. We are now ready to describe what happens next and follow the diffracted light as it enters the objective lens.

In general, the diffraction pattern extends throughout the x′-y′ plane. However, the objective lens, being of finite size, cannot collect all of the light in the diffraction pattern. Typically, lenses used in microlithography are circularly symmetric and the entrance to the objective lens can be thought of as a circular aperture. Only those portions of the mask diffraction pattern that fall inside the aperture of the objective lens go on to form the image. Of course, the size of the lens aperture can be described by its radius, but a more common and useful description is to define the maximum angle of diffracted light that can enter the lens. Consider the geometry shown in Figure 2.4. Light passing through the mask is diffracted at various angles. Given a lens of a certain size placed a certain distance from the mask, there is some maximum angle of diffraction, α, for which the diffracted light barely makes it into the lens. Light emerging from the mask at larger angles misses the lens and is not used in forming the image. The most convenient way to describe the size of the lens aperture is by its numerical aperture (NA), defined as the sine of the maximum half-angle of diffracted light that can enter the lens multiplied by the index of refraction of the surrounding medium. In the case of lithography, all of the lenses are in air and the numerical aperture is given by NA = sin α. Note that the spatial frequency is the sine of the diffracted angle divided by the wavelength of light. Thus, the maximum spatial frequency that can enter the objective lens is given by NA/λ.
FIGURE 2.4 The numerical aperture is defined as NA = sin α, where α is the maximum half-angle of the diffracted light that can enter the objective lens.
Obviously, the numerical aperture is going to be quite important. A large numerical aperture means that a larger portion of the diffraction pattern is captured by the objective lens. For a small numerical aperture, much more of the diffracted light is lost.

To proceed further, the discussion must now focus on how the lens affects the light entering it. Obviously, one would like the image to resemble the mask pattern. Because diffraction gives the Fourier transform of the mask, if the lens gave the inverse Fourier transform of the diffraction pattern, the resulting image would resemble the mask pattern. In fact, spherical lenses do behave in this manner. An ideal imaging lens can be described as one that produces an image identically equal to the inverse Fourier transform of the light distribution entering the lens. It is the goal of lens designers and manufacturers to create lenses as close as possible to this ideal.

An ideal lens does not produce a perfect image. Because of the finite size of the numerical aperture, only a portion of the diffraction pattern enters the lens. Thus, unless the lens is infinitely large, even an ideal lens cannot produce a perfect image. Because, in the case of an ideal lens, the image is limited only by the diffracted light that does not make it through the lens, such an ideal system is termed diffraction-limited.

To write the final equation for the formation of an image, the objective lens pupil function P (a pupil is another name for an aperture) will be defined. The pupil function of an ideal lens describes what portion of light enters the lens; it is one inside the aperture and zero outside:

$$P(f_x, f_y) = \begin{cases} 1, & \sqrt{f_x^2 + f_y^2} < \mathrm{NA}/\lambda \\ 0, & \sqrt{f_x^2 + f_y^2} > \mathrm{NA}/\lambda \end{cases} \qquad (2.3)$$

Thus, the product of the pupil function and the diffraction pattern describes the light entering the objective lens. Combining this with our description of how a lens behaves gives us our final expression for the electric field at the image plane (i.e., at the wafer):

$$E(x,y) = \mathcal{F}^{-1}\{M(f_x, f_y)\, P(f_x, f_y)\} \qquad (2.4)$$
The aerial image is defined as the intensity distribution at the wafer and is simply the square of the magnitude of the electric field.

Consider the full imaging process. First, light passing through the mask is diffracted. The diffraction pattern can be described as the Fourier transform of the mask pattern. Because the objective lens is of finite size, only a portion of the diffraction pattern actually enters the lens. The numerical aperture describes the maximum angle of diffracted light that enters the lens, and the pupil function is used to mathematically describe this behavior. Finally, the effect of the lens is to take the inverse Fourier transform of the light entering the lens to give an image that resembles the mask pattern. If the lens is ideal, the quality of the resulting image is only limited by how much of the diffraction pattern is collected. This type of imaging system is called diffraction-limited, as defined above.

Although the behavior of a simple ideal imaging system has been completely described, one more complication must be added before the operation of a projection system for lithography has been described. Thus far, it has been assumed that the mask is illuminated by spatially coherent light. Coherent illumination means simply that the light striking the mask arrives from only one direction. It has been further assumed that the coherent illumination on the mask is normally incident. The result was a diffraction pattern which was centered in the entrance to the objective lens. What would happen if the direction of the illumination was changed so that the light struck the mask at some angle θ′? The effect is simply to shift the diffraction pattern with respect to the lens aperture (in terms of spatial frequency, the amount shifted is sin θ′/λ). Recalling that
only the portion of the diffraction pattern passing through the lens aperture is used to form the image, it is quite apparent that this shift in the position of the diffraction pattern can have a profound effect on the resulting image. Letting f_{x0} and f_{y0} be the shift in the spatial frequency because of the tilted illumination, Equation 2.4 becomes

$$E(x, y; f_{x0}, f_{y0}) = \mathcal{F}^{-1}\{M(f_x - f_{x0},\, f_y - f_{y0})\, P(f_x, f_y)\} \qquad (2.5)$$
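As an illustration of Equations 2.2 through 2.5, the Python sketch below (an assumed example, not the book's implementation) forms the image for a single illumination direction: it Fourier transforms the mask, applies an ideal pupil of radius NA/λ, shifts the diffraction pattern for a tilted illumination direction, and inverse transforms. The wavelength, numerical aperture, and mask dimensions are made-up values.

```python
# Hedged sketch: coherent image formation per Eqs. 2.2-2.5 (one-dimensional mask).
import numpy as np

wavelength = 0.248      # um (KrF); assumed example value
NA = 0.6                # assumed numerical aperture
pitch, linewidth = 0.50, 0.25
dx = 0.005
x = np.arange(-8 * pitch, 8 * pitch, dx)
m = (np.abs((x / pitch) - np.round(x / pitch)) * pitch >= linewidth / 2).astype(float)

fx = np.fft.fftfreq(x.size, d=dx)                    # spatial frequencies (1/um)
P = (np.abs(fx) <= NA / wavelength).astype(float)    # ideal pupil, Equation 2.3

def coherent_image(fx0=0.0):
    """Image intensity for one illumination direction; fx0 = sin(theta')/lambda (Eq. 2.5)."""
    M_shifted = np.fft.fft(m * np.exp(2j * np.pi * fx0 * x))   # M(fx - fx0)
    E = np.fft.ifft(M_shifted * P)                             # Equation 2.4
    return np.abs(E) ** 2                                      # squared magnitude

I_onaxis = coherent_image(0.0)
I_tilted = coherent_image(0.3 * NA / wavelength)     # an off-axis source point (assumed)
print(f"on-axis image max/min: {I_onaxis.max():.3f} / {I_onaxis.min():.3f}")
```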
If the illumination of the mask is composed of light coming from a range of angles, rather than just one angle, the illumination is called partially coherent. If one angle of illumination causes a shift in the diffraction pattern, a range of angles will cause a range of shifts, resulting in broadened diffraction orders. One can characterize the range of angles used for the illumination in several ways, but the most common is the partial coherence factor, σ (also called the degree of partial coherence, the pupil filling function, or just the partial coherence). Partial coherence is defined as the sine of the half-angle of the illumination cone divided by the objective lens numerical aperture. It is, therefore, a measure of the angular range of the illumination relative to the angular acceptance of the lens. Finally, if the range of angles striking the mask extends from −90° to 90° (that is, all possible angles), the illumination is said to be incoherent.

The extended-source method for partially coherent image calculations is based upon the division of the full source into individual point sources. Each point source is coherent and results in an aerial image given by Equation 2.5. Two point sources from the extended source, however, do not interact coherently with each other. Thus, the contributions of these two sources must be added to each other incoherently (i.e., the intensities are added together). The full aerial image is determined by calculating the coherent aerial image from each point on the source, and then integrating the intensity over the source.

2.3.2 Aberrations

Aberrations can be defined as the deviation of the real behavior of an imaging system from its ideal behavior (the ideal behavior was described above using Fourier optics as diffraction-limited imaging). Aberrations are inherent in the behavior of all lens systems and come from three basic sources: defects of construction, defects of use, and defects of design. Defects of construction include rough or inaccurate lens surfaces, inhomogeneous glass, incorrect lens thicknesses or spacings, and tilted or decentered lens elements. Defects of use include use of the wrong illumination or tilt of the lens system with respect to the optical axis of the imaging system. Also, changes in the environmental conditions during use, such as the temperature of the lens or the barometric pressure of the air, result in defects of use. Defects of design may be a misnomer because the aberrations of a lens design are not mistakenly designed into the lens, but rather were not designed out of the lens. All lenses have aberrated behavior because the Fourier optics behavior of a lens is only approximately true and is based on a linearized Snell's law for small angles. It is the job of a lens designer to combine elements of different shapes and properties so that the aberrations of each individual lens element tend to cancel in the sum of all of the elements, giving a lens system with only a small residual amount of aberrations. It is impossible to design a lens system with absolutely no aberrations.

Mathematically, aberrations are described as a wavefront deviation: the difference in phase (or path difference) of the actual wavefront emerging from the lens compared to the ideal wavefront as predicted from Fourier optics. This phase difference is a function of the position within the lens pupil, most conveniently described in polar coordinates.
This wavefront deviation is, in general, quite complicated, so the mathematical form used to describe it is also quite complicated. The most common model for describing the phase error across the pupil is the Zernike polynomial, a 36-term polynomial of powers of
the radial position, R, and trigonometric functions of the polar angle, θ. The Zernike polynomial can be arranged in many ways, but most lens design software and lens measuring equipment in use today employ the fringe or circle Zernike polynomial, defined below:
$$
\begin{aligned}
W(R,\theta) =\ & Z_1\, R\cos\theta + Z_2\, R\sin\theta + Z_3\,(2R^2 - 1) \\
& + Z_4\, R^2\cos 2\theta + Z_5\, R^2\sin 2\theta \\
& + Z_6\,(3R^2 - 2)R\cos\theta + Z_7\,(3R^2 - 2)R\sin\theta \\
& + Z_8\,(6R^4 - 6R^2 + 1) \\
& + Z_9\, R^3\cos 3\theta + Z_{10}\, R^3\sin 3\theta \\
& + Z_{11}\,(4R^2 - 3)R^2\cos 2\theta + Z_{12}\,(4R^2 - 3)R^2\sin 2\theta \\
& + Z_{13}\,(10R^4 - 12R^2 + 3)R\cos\theta + Z_{14}\,(10R^4 - 12R^2 + 3)R\sin\theta \\
& + Z_{15}\,(20R^6 - 30R^4 + 12R^2 - 1) \\
& + Z_{16}\, R^4\cos 4\theta + Z_{17}\, R^4\sin 4\theta \\
& + Z_{18}\,(5R^2 - 4)R^3\cos 3\theta + Z_{19}\,(5R^2 - 4)R^3\sin 3\theta \\
& + Z_{20}\,(15R^4 - 20R^2 + 6)R^2\cos 2\theta + Z_{21}\,(15R^4 - 20R^2 + 6)R^2\sin 2\theta \\
& + Z_{22}\,(35R^6 - 60R^4 + 30R^2 - 4)R\cos\theta + Z_{23}\,(35R^6 - 60R^4 + 30R^2 - 4)R\sin\theta \\
& + Z_{24}\,(70R^8 - 140R^6 + 90R^4 - 20R^2 + 1) \\
& + Z_{25}\, R^5\cos 5\theta + Z_{26}\, R^5\sin 5\theta \\
& + Z_{27}\,(6R^2 - 5)R^4\cos 4\theta + Z_{28}\,(6R^2 - 5)R^4\sin 4\theta \\
& + Z_{29}\,(21R^4 - 30R^2 + 10)R^3\cos 3\theta + Z_{30}\,(21R^4 - 30R^2 + 10)R^3\sin 3\theta \\
& + Z_{31}\,(56R^6 - 105R^4 + 60R^2 - 10)R^2\cos 2\theta + Z_{32}\,(56R^6 - 105R^4 + 60R^2 - 10)R^2\sin 2\theta \\
& + Z_{33}\,(126R^8 - 280R^6 + 210R^4 - 60R^2 + 5)R\cos\theta + Z_{34}\,(126R^8 - 280R^6 + 210R^4 - 60R^2 + 5)R\sin\theta \\
& + Z_{35}\,(252R^{10} - 630R^8 + 560R^6 - 210R^4 + 30R^2 - 1) \\
& + Z_{36}\,(924R^{12} - 2772R^{10} + 3150R^8 - 1680R^6 + 420R^4 - 42R^2 + 1)
\end{aligned}
\qquad (2.6)
$$
where W(R,θ) is the optical path difference relative to the wavelength and Z_i is called the ith Zernike coefficient. It is the magnitudes of the Zernike coefficients that determine the aberration behavior of a lens; they have units of optical path length relative to the wavelength. The impact of aberrations on the aerial image can be calculated by modifying the pupil function of the lens with the phase error due to aberrations given by Equation 2.6:

$$P(f_x, f_y) = P_{\mathrm{ideal}}(f_x, f_y)\, e^{i 2\pi W(R,\theta)} \qquad (2.7)$$
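A minimal sketch of how Equations 2.6 and 2.7 might be applied in practice is shown below; only the first eight terms of the fringe Zernike polynomial are included, and the coefficient values are invented for illustration.

```python
# Hedged sketch: applying a few fringe Zernike terms (Eq. 2.6) as a pupil phase
# error, per Eq. 2.7. Coefficients are made-up example values, in waves.
import numpy as np

def zernike_opd(R, theta, Z):
    """First few fringe Zernike terms of Eq. 2.6; Z is a dict of coefficients in waves."""
    return (Z.get(1, 0) * R * np.cos(theta)                    # x tilt
          + Z.get(2, 0) * R * np.sin(theta)                    # y tilt
          + Z.get(3, 0) * (2 * R**2 - 1)                       # defocus
          + Z.get(4, 0) * R**2 * np.cos(2 * theta)             # astigmatism
          + Z.get(5, 0) * R**2 * np.sin(2 * theta)
          + Z.get(6, 0) * (3 * R**2 - 2) * R * np.cos(theta)   # x coma
          + Z.get(7, 0) * (3 * R**2 - 2) * R * np.sin(theta)   # y coma
          + Z.get(8, 0) * (6 * R**4 - 6 * R**2 + 1))           # primary spherical

def aberrated_pupil(fx, fy, NA, wavelength, Z):
    """Equation 2.7: ideal circular pupil multiplied by exp(i*2*pi*W(R,theta))."""
    rho = np.hypot(fx, fy) / (NA / wavelength)   # normalized pupil radius R
    theta = np.arctan2(fy, fx)
    P_ideal = (rho <= 1.0).astype(complex)
    return P_ideal * np.exp(1j * 2 * np.pi * zernike_opd(rho, theta, Z))

# Example: 50 milliwaves of spherical aberration plus 30 milliwaves of x coma
fx, fy = np.meshgrid(np.linspace(-3, 3, 256), np.linspace(-3, 3, 256))
P = aberrated_pupil(fx, fy, NA=0.6, wavelength=0.248, Z={8: 0.05, 6: 0.03})
```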
2.3.3 Zero-Order Scalar Model

Calculation of an aerial image means, literally, determining the image in air. Of course, in lithography, one projects this image into a photoresist. The propagation of the image into a resist can be complicated, so models usually make one or more approximations. This section and the sections that follow describe approximations made in determining the intensity of light within the photoresist.

The lithography simulator SAMPLE [5] and the 1985 version of PROLITH [6] used the simple imaging approximation first proposed by Dill [4] to calculate the propagation of an aerial image in a photoresist. First, an aerial image I_i(x) is calculated as if projected into air (x being along the surface of the wafer and perpendicular to the propagation direction of the image). Second, a standing-wave intensity I_s(z) is calculated assuming a plane wave of light is normally incident on the photoresist-coated substrate (where z is defined as zero at the top of the resist and is positive going into the resist). Then, it is assumed that the actual intensity within the resist film, I(x,z), can be approximated by

$$I(x,z) \approx I_i(x)\, I_s(z) \qquad (2.8)$$
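Equation 2.8 amounts to an outer product of two one-dimensional profiles; a toy sketch (with invented image and standing-wave shapes, not data from the text) is shown below.

```python
# Toy sketch of the separability approximation in Eq. 2.8: I(x,z) ~ Ii(x) * Is(z).
import numpy as np

x = np.linspace(-0.5, 0.5, 201)                  # um across the feature
z = np.linspace(0.0, 0.85, 171)                  # um into the resist
Ii = 0.5 * (1 + np.cos(2 * np.pi * x / 0.5))     # assumed aerial image for a 0.5-um pitch
Is = 1 + 0.6 * np.cos(4 * np.pi * 1.68 * z / 0.436) * np.exp(-0.5 * z)  # assumed standing wave
I = np.outer(Is, Ii)                             # I[z, x] = Is(z) * Ii(x)
print(I.shape, round(float(I.max()), 3))
```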
For very low numerical apertures and reasonably thin photoresists, these approximations are valid. They begin to fail when the aerial image changes as it propagates through the resist (i.e., it defocuses) or when the light entering the resist is appreciably nonnormal. Note that if the photoresist bleaches (changes its optical properties during exposure), only I_s(z) changes in this approximation.

2.3.4 First-Order Scalar Model

The first attempt to correct one of the deficiencies of the zero-order model was made by the author [8] and, independently, by Bernard [9]. The aerial image, while propagating through the resist, is continuously changing focus. Thus, even in air, the aerial image is a function of both x and z. An aerial image simulator calculates images as a function of x and the distance from the plane of best focus, d. Letting d₀ be the defocus distance of the image at the top of the photoresist, the defocus within the photoresist at any position z is given by

$$d(z) = d_0 + \frac{z}{n} \qquad (2.9)$$

where n is the real part of the index of refraction of the photoresist. The intensity within the resist is then given by

$$I(x,z) = I_i\big(x, d(z)\big)\, I_s(z) \qquad (2.10)$$
Here, the assumption of normally incident plane waves is still used when calculating the standing-wave intensity.

2.3.5 High-NA Scalar Model

The light propagating through the resist can be thought of as various plane waves traveling through the resist in different directions. Consider first the propagation of the light in the absence of diffraction by a mask pattern (i.e., exposure of the resist by a large open area). The spatial dimensions of the light source determine the characteristics of the light entering the photoresist. For the simple case of a coherent point source of illumination centered on the optical axis, the light traveling into the photoresist would be the normally incident plane wave used in the calculations presented above. The standing-wave intensity within the resist can be determined analytically [10] as the square of the magnitude of the electric field given by

$$E(z) = E_I\, \frac{t_{12}\left(e^{-i 2\pi n_2 z/\lambda} + r_{23}\, \tau_D^2\, e^{i 2\pi n_2 z/\lambda}\right)}{1 + r_{12}\, r_{23}\, \tau_D^2} \qquad (2.11)$$

where the subscripts 1, 2, and 3 refer to air, the photoresist, and the substrate, respectively, D is the resist thickness, E_I is the incident electric field, λ is the wavelength, and where

complex index of refraction of film j: n_j = n_j − iκ_j
transmission coefficient from i to j: t_{ij} = 2n_i/(n_i + n_j)
reflection coefficient from i to j: r_{ij} = (n_i − n_j)/(n_i + n_j)
internal transmittance of the resist: τ_D = e^{−i2πn₂D/λ}

A more complete description of the standing-wave Equation 2.11 is given in Section 2.4. The above expression can be easily modified for the case of nonnormally incident plane waves. Suppose a plane wave is incident on the resist film at some angle θ₁. The angle of the plane wave inside the resist will be θ₂, as determined from Snell's law. An analysis of the propagation of this plane wave within the resist will give an expression similar to Equation 2.11, but with the position z replaced with z cos θ₂:

$$E(z, \theta_2) = E_I\, \frac{t_{12}(\theta_2)\left(e^{-i 2\pi n_2 z\cos\theta_2/\lambda} + r_{23}(\theta_2)\, \tau_D^2(\theta_2)\, e^{i 2\pi n_2 z\cos\theta_2/\lambda}\right)}{1 + r_{12}(\theta_2)\, r_{23}(\theta_2)\, \tau_D^2(\theta_2)} \qquad (2.12)$$
The transmission and reflection coefficients are now functions of the angle of incidence and are given by the Fresnel formulas (see Section 2.4). A similar approach was taken by Bernard and Urbach [11]. By calculating the standing-wave intensity at one incident angle θ₁ to give I_s(z, θ₁), the full standing-wave intensity can be determined by integrating over all angles. Each incident angle comes from a given point in the illumination source, so that integration over angles is the same as integration over the source. Thus, the effect of partial coherence on the standing waves is accounted for. Note that for the model described here, the effect of the nonnormal incidence is included only with respect to the zero-order light (the light which is not diffracted by the mask).
Besides the basic modeling approaches described above, there are two issues that apply to any model. First, the effects of defocus are taken into account by describing defocus as a phase error at the pupil plane. Essentially, if the curvature of the wavefront exiting the objective lens pupil is such that it focuses in the wrong place (i.e., not where you want it), one can consider the wavefront curvature to be wrong. Simple geometry then relates the optical-path difference (OPD) of the actual wavefront from the desired wavefront as a function of the angle of the light exiting the lens, θ:

$$\mathrm{OPD}(\theta) = d\,(1 - \cos\theta) \qquad (2.13)$$

Computation of the imaging usually involves a change in variables where the main variable used is sin θ. Thus, the cosine adds some algebraic complexity to the calculations. For this reason, it is common in optics texts to simplify the OPD function for small angles (i.e., low numerical apertures):

$$\mathrm{OPD}(\theta) = d\,(1 - \cos\theta) \approx \frac{d}{2}\sin^2\theta \qquad (2.14)$$
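A quick numeric comparison of Equation 2.13 and Equation 2.14 (using an assumed defocus distance, not a value from the text) shows how the paraxial approximation degrades as the angle grows:

```python
# Worked comparison of the exact (Eq. 2.13) and paraxial (Eq. 2.14) defocus OPD.
import numpy as np

d = 0.2   # assumed defocus distance in microns
for sin_theta in (0.3, 0.5, 0.7, 0.9):
    theta = np.arcsin(sin_theta)
    exact = d * (1 - np.cos(theta))           # Equation 2.13
    approx = 0.5 * d * sin_theta**2           # Equation 2.14
    print(f"sin(theta)={sin_theta:.1f}  exact={exact:.4f} um  "
          f"approx={approx:.4f} um  error={100 * (approx / exact - 1):.1f}%")
```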
Again, the approximation is not necessary, and is only made to simplify the resulting equations. In this work, the approximate defocus expression is used in the standard image model. The high-NA model uses the exact defocus expression.

Reduction in the imaging system adds an interesting complication. Light entering the objective lens will leave the lens with no loss in energy (the lossless lens assumption). However, if there is reduction in the lens, the intensity distribution of the light entering will be different from that leaving because the intensity is the energy spread over a changing area. The result is a radiometric correction well known in optics [12] and first applied to lithography by Cole and Barouch [13].

2.3.6 Full Scalar and Vector Models

The above method for calculating the image intensity within the resist still makes the assumption of separability: that an aerial image and a standing-wave intensity can be calculated independently and then multiplied together to give the total intensity. This assumption is not required. Instead, one could calculate the full I(x,z) at once, making only the standard scalar approximation. The formation of the image can be described as the summation of plane waves. For coherent illumination, each diffraction order gives one plane wave propagating into the resist. Interference between the zero order and the higher orders produces the desired image. Each point in the illumination source will produce another image that will add incoherently (i.e., intensities will add) to give the total image. Equation 2.12 describes the propagation of a plane wave in a stratified medium at any arbitrary angle. By applying this equation to each diffraction order (not just the zero order as in the high-NA scalar model), an exact scalar representation of the full intensity within the resist is obtained.

Light is an electromagnetic wave that can be described by time-varying electric and magnetic field vectors. In lithography, the materials used are generally nonmagnetic so that only the electric field is of interest. The electric field vector is described by its three vector components. Maxwell's equations, sometimes put into the form of the wave equation, govern the propagation of the electric field vector. The scalar approximation assumes that each of the three components of the electric field vector can be treated separately as scalar quantities and each scalar electric field component must individually satisfy the wave equation. Further, when two fields of light (say, two plane waves) are
added together, the scalar approximation means that the sum of the fields would simply be the sum of the scalar amplitudes of the two fields. The scalar approximation is commonly used throughout optics and is known to be accurate under many conditions. There is one simple situation, however, in which the scalar approximation is not adequate. Consider the interference of two plane waves traveling past each other. If each plane wave is treated as a vector, they will interfere only if there is some overlap in their electric field vectors. If the vectors are parallel, there will be complete interference. If, however, their electric fields are at right angles to each other, there will be no interference. The scalar approximation essentially assumes that the electric field vectors are always parallel and will always give complete interference. These differences come into play in lithography when considering the propagation of plane waves traveling through the resist at large angles. For large angles, the scalar approximation may fail to account for these vector effects. Thus, a vector model would keep track of the vector direction of the electric field and use this information when adding two plane waves together [14,15].
2.4 Standing Waves

When a thin dielectric film placed between two semi-infinite media (e.g., a thin coating on a reflecting substrate) is exposed to monochromatic light, standing waves are produced in the film. This effect has been well documented for such cases as antireflection coatings and photoresist exposure [1,16–19]. In the former, the standing-wave effect is used to reduce reflections from the substrate. In the latter, standing waves are an undesirable side effect of the exposure process. Unlike the antireflection application, photolithography applications require a knowledge of the intensity of the light within the thin film itself. Previous work [1,19] on determining the intensity within a thin photoresist film has been limited to numerical solutions based on Berning's matrix method [20]. This section presents an analytical expression for the standing-wave intensity within a thin film [10]. This film may be homogeneous or of a known inhomogeneity. The film may be on a substrate or between one or more other thin films. The incident light can be normally incident or incident at some angle.

Consider a thin film of thickness D and complex index of refraction n₂ deposited on a thick substrate with complex index of refraction n₃ in an ambient environment of index n₁. An electromagnetic plane wave is normally incident on this film. Let E₁, E₂, and E₃ be the electric fields in the ambient, thin film, and substrate, respectively (see Figure 2.5). Assuming monochromatic illumination, the electric field in each region is a plane wave or the sum of two plane waves traveling in opposite directions (i.e., a standing wave).
FIGURE 2.5 Film stack showing the geometry for the standing-wave derivation: incident field E_I in air (index n₁); resist film of thickness D (index n₂), with z = 0 at its top surface; substrate (index n₃).
Maxwell's equations require certain boundary conditions to be met at each interface: specifically, E_j and the magnetic field, H_j, are continuous across the boundaries z = 0 and z = D. Solving the resulting equations simultaneously, the electric field in region 2 can be shown to be [10]

$$E_2(x,y,z) = E_I(x,y)\, \frac{t_{12}\left(e^{-i 2\pi n_2 z/\lambda} + r_{23}\, \tau_D^2\, e^{i 2\pi n_2 z/\lambda}\right)}{1 + r_{12}\, r_{23}\, \tau_D^2} \qquad (2.15)$$

where E_I(x,y) is the incident wave at z = 0, which is a plane wave; r_{ij} = (n_i − n_j)/(n_i + n_j) is the reflection coefficient; t_{ij} = 2n_i/(n_i + n_j) is the transmission coefficient; τ_D = exp(−ik₂D) is the internal transmittance of the film; k_j = 2πn_j/λ is the propagation constant; n_j = n_j − iκ_j is the complex index of refraction; and λ is the vacuum wavelength of the incident light. Equation 2.15 is the basic standing-wave expression, where film 2 represents the photoresist. Squaring the magnitude of the electric field gives the standing-wave intensity. Note that absorption is taken into account in this expression through the imaginary part of the index of refraction. The common absorption coefficient α is related to the imaginary part of the index by

$$\alpha = \frac{4\pi\kappa}{\lambda} \qquad (2.16)$$
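The sketch below evaluates Equation 2.15 directly for a single resist film on a substrate at normal incidence. The optical constants and thickness are assumed, round-number values chosen only to illustrate the calculation, not data from the text.

```python
# Hedged sketch of Eq. 2.15: standing-wave field in a resist film on a substrate.
import numpy as np

wavelength = 0.436            # um (g-line)
n1 = 1.0 + 0j                 # air
n2 = 1.68 - 0.02j             # resist, assumed n - i*kappa
n3 = 4.7 - 0.14j              # silicon-like substrate, assumed
D = 0.850                     # resist thickness in um

k2 = 2 * np.pi * n2 / wavelength
r12 = (n1 - n2) / (n1 + n2)
r23 = (n2 - n3) / (n2 + n3)
t12 = 2 * n1 / (n1 + n2)
tauD = np.exp(-1j * k2 * D)   # internal transmittance of the resist

z = np.linspace(0.0, D, 500)  # depth into the resist (z = 0 at the top)
E = t12 * (np.exp(-1j * k2 * z) + r23 * tauD**2 * np.exp(1j * k2 * z)) \
    / (1 + r12 * r23 * tauD**2)
I = np.abs(E)**2              # standing-wave intensity relative to the incident wave
print(f"intensity swing in the film: {I.min():.2f} to {I.max():.2f}")
```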
It is very common to have more than one film coated on a substrate. The problem then becomes that of two or more absorbing thin films on a substrate. An analysis similar to that for one film yields the following result for the electric field in the top layer of an (m − 1)-layer system:

$$E_2(x,y,z) = E_I(x,y)\, \frac{t_{12}\left(e^{-i 2\pi n_2 z/\lambda} + r'_{23}\, \tau_D^2\, e^{i 2\pi n_2 z/\lambda}\right)}{1 + r_{12}\, r'_{23}\, \tau_D^2} \qquad (2.17)$$

where

$$r'_{23} = \frac{n_2 - n_3 X_3}{n_2 + n_3 X_3}, \qquad X_3 = \frac{1 - r'_{34}\, \tau_{D3}^2}{1 + r'_{34}\, \tau_{D3}^2}, \qquad r'_{34} = \frac{n_3 - n_4 X_4}{n_3 + n_4 X_4}, \quad \ldots$$

$$X_m = \frac{1 - r_{m,m+1}\, \tau_{Dm}^2}{1 + r_{m,m+1}\, \tau_{Dm}^2}, \qquad r_{m,m+1} = \frac{n_m - n_{m+1}}{n_m + n_{m+1}}, \qquad \tau_{Dj} = e^{-i k_j D_j}$$

and all other parameters are defined previously. The parameter r′₂₃ is the effective reflection coefficient between the thin film and what lies beneath it.
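The recursion above lends itself to a simple implementation: collapse the stack from the substrate upward into a single effective reflection coefficient. The sketch below is one possible rendering (layer indices and thicknesses are assumed example values, not data from the text).

```python
# Hedged sketch of the effective-reflection-coefficient recursion following Eq. 2.17.
import numpy as np

wavelength = 0.248
# layers beneath the resist, top to bottom: (complex index, thickness in um);
# thickness None marks the semi-infinite substrate. All values are assumed.
stack = [(1.47 + 0j, 0.060),      # thin oxide-like film
         (1.58 - 3.6j, None)]     # metallic substrate

def effective_r(n_above, stack):
    """Effective reflection coefficient r' seen from the medium of index n_above."""
    n_j, d_j = stack[0]
    if d_j is None or len(stack) == 1:            # bottom layer: plain Fresnel r
        return (n_above - n_j) / (n_above + n_j)
    r_below = effective_r(n_j, stack[1:])         # r' for everything below layer j
    tauD = np.exp(-1j * 2 * np.pi * n_j * d_j / wavelength)
    X_j = (1 - r_below * tauD**2) / (1 + r_below * tauD**2)
    return (n_above - n_j * X_j) / (n_above + n_j * X_j)

n_resist = 1.76 - 0.01j                           # assumed resist index
print("effective reflection coefficient r'_23 =", effective_r(n_resist, stack))
```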
If the thin film in question is not the top film (layer 2), the intensity can be calculated in layer j from

$$E_j(x,y,z) = E_I^{\mathrm{eff}}(x,y)\, t^*_{j-1,j}\, \frac{e^{-i k_j z_j} + r'_{j,j+1}\, \tau_{Dj}^2\, e^{i k_j z_j}}{1 + r^*_{j-1,j}\, r'_{j,j+1}\, \tau_{Dj}^2} \qquad (2.18)$$

where t*_{j−1,j} = 1 + r*_{j−1,j}. The effective reflection coefficient r* is analogous to the coefficient r′, looking in the opposite direction. E_I^eff is the effective field incident on layer j. Both E_I^eff and r* are defined in detail by Mack [10].

If the film in question is not homogeneous, the equations above are, in general, not valid. One special case will now be considered in which the inhomogeneity takes the form of small variations in the imaginary part of the index of refraction of the film in the z direction, leaving the real part constant. In this case, the absorbance, Abs, is no longer simply αz, but becomes

$$\mathrm{Abs}(z) = \int_0^{z} \alpha(z')\, dz' \qquad (2.19)$$
It can be shown that Equation 2.15 through Equation 2.18 are still valid if the depth-dependent expression for absorbance (Equation 2.19) is used. Thus, I(z) can be found if the absorption coefficient is known as a function of z. Figure 2.6 shows a typical result of the standing-wave intensity within a photoresist film coated on an oxide-on-silicon film stack.

Equation 2.15 can be easily modified for the case of nonnormally incident plane waves. Suppose a plane wave is incident on the resist film at some angle θ₁. The angle of the plane wave inside the resist will be θ₂, as determined from Snell's law. An analysis of the propagation of this plane wave within the resist will give an expression similar to Equation 2.15, but with the position z replaced with z cos θ₂:

$$E(z, \theta_2) = E_I\, \frac{t_{12}(\theta_2)\left(e^{-i 2\pi n_2 z\cos\theta_2/\lambda} + r_{23}(\theta_2)\, \tau_D^2(\theta_2)\, e^{i 2\pi n_2 z\cos\theta_2/\lambda}\right)}{1 + r_{12}(\theta_2)\, r_{23}(\theta_2)\, \tau_D^2(\theta_2)} \qquad (2.20)$$
FIGURE 2.6 Standing-wave intensity within a photoresist film at the start of exposure (850 nm of resist on 100 nm SiO₂ on silicon, λ = 436 nm). The intensity shown is relative to the incident intensity.
The transmission and reflection coefficients are now functions of the angle of incidence (as well as the polarization of the incident light) and are given by the Fresnel formulas:

$$r_{ij}^{\perp}(\theta) = \frac{n_i\cos\theta_i - n_j\cos\theta_j}{n_i\cos\theta_i + n_j\cos\theta_j}, \qquad t_{ij}^{\perp}(\theta) = \frac{2 n_i\cos\theta_i}{n_i\cos\theta_i + n_j\cos\theta_j},$$

$$r_{ij}^{\parallel}(\theta) = \frac{n_i\cos\theta_j - n_j\cos\theta_i}{n_i\cos\theta_j + n_j\cos\theta_i}, \qquad t_{ij}^{\parallel}(\theta) = \frac{2 n_i\cos\theta_i}{n_i\cos\theta_j + n_j\cos\theta_i} \qquad (2.21)$$
For the typical unpolarized case, the light entering the resist will become polarized (but only slightly). Thus, a separate standing wave can be calculated for each polarization and the resulting intensities summed to give the total intensity.
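For reference, Equation 2.21 translates directly into code once the internal angle is obtained from Snell's law. The function below is a hedged sketch using assumed indices and an assumed 30° incidence angle.

```python
# Hedged sketch of the Fresnel formulas of Eq. 2.21 for both polarizations.
import numpy as np

def fresnel(n_i, n_j, theta_i):
    """Return (r_perp, t_perp, r_par, t_par) for incidence angle theta_i in medium i."""
    cos_i = np.cos(theta_i)
    # Snell's law, allowing complex indices: n_i sin(theta_i) = n_j sin(theta_j)
    sin_j = n_i * np.sin(theta_i) / n_j
    cos_j = np.sqrt(1 - sin_j**2 + 0j)
    r_perp = (n_i * cos_i - n_j * cos_j) / (n_i * cos_i + n_j * cos_j)
    t_perp = 2 * n_i * cos_i / (n_i * cos_i + n_j * cos_j)
    r_par  = (n_i * cos_j - n_j * cos_i) / (n_i * cos_j + n_j * cos_i)
    t_par  = 2 * n_i * cos_i / (n_i * cos_j + n_j * cos_i)
    return r_perp, t_perp, r_par, t_par

# Air into resist at 30 degrees (assumed indices)
print(fresnel(1.0 + 0j, 1.7 - 0.02j, np.deg2rad(30.0)))
```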
2.5 Photoresist Exposure Kinetics

The kinetics of photoresist exposure is intimately tied to the phenomenon of absorption. The discussion below begins with a description of absorption, followed by the chemical kinetics of exposure. Finally, the chemistry of chemically amplified resists will be reviewed.

2.5.1 Absorption

The phenomenon of absorption can be viewed on a macroscopic or a microscopic scale. On the macro level, absorption is described by the familiar Lambert and Beer laws, which give a linear relationship between absorbance and path length times the concentration of the absorbing species. On the micro level, a photon is absorbed by an atom or molecule, thereby promoting an electron to a higher energy state. Both methods of analysis yield useful information needed in describing the effects of light on a photoresist.

The basic law of absorption is an empirical one with no known exceptions. It was first expressed by Lambert in differential form as

$$\frac{dI}{dz} = -\alpha I \qquad (2.22)$$

where I is the intensity of light traveling in the z direction through a medium, and α is the absorption coefficient of the medium and has units of inverse length. In a homogeneous medium (i.e., α is not a function of z), Equation 2.22 may be integrated to yield

$$I(z) = I_0\, e^{-\alpha z} \qquad (2.23)$$

where z is the distance the light has traveled through the medium and I₀ is the intensity at z = 0. If the medium is inhomogeneous, Equation 2.23 becomes

$$I(z) = I_0\, e^{-\mathrm{Abs}(z)} \qquad (2.24)$$
where

$$\mathrm{Abs}(z) = \int_0^{z} \alpha(z')\, dz' = \text{the absorbance}$$

When working with electromagnetic radiation, it is often convenient to describe the radiation by its complex electric field vector. The electric field can implicitly account for absorption by using a complex index of refraction n such that

$$n = n - i\kappa \qquad (2.25)$$

The imaginary part of the index of refraction, sometimes called the extinction coefficient, is related to the absorption coefficient by

$$\alpha = \frac{4\pi\kappa}{\lambda} \qquad (2.26)$$

In 1852, Beer showed that, for dilute solutions, the absorption coefficient is proportional to the concentration of the absorbing species in the solution:

$$\alpha_{\text{solution}} = a\,c \qquad (2.27)$$
where a is the molar absorption coefficient, given by a = α·MW/ρ, MW is the molecular weight, ρ is the density, and c is the concentration. The stipulation that the solution be dilute expresses a fundamental limitation of Beer's law. At high concentrations, where absorbing molecules are close together, the absorption of a photon by one molecule may be affected by a nearby molecule [21]. Because this interaction is concentration-dependent, it causes deviation from the linear relation (Equation 2.27). Also, an apparent deviation from Beer's law occurs if the real part of the index of refraction changes appreciably with concentration. Thus, the validity of Beer's law should always be verified over the concentration range of interest.

For an N-component homogeneous solid, the overall absorption coefficient becomes

$$\alpha_T = \sum_{j=1}^{N} a_j c_j \qquad (2.28)$$

Of the total amount of light absorbed, the fraction of light that is absorbed by component i is given by

$$\frac{I_{A_i}}{I_{A_T}} = \frac{a_i c_i}{\alpha_T} \qquad (2.29)$$
where I_{A_T} is the total light absorbed by the film, and I_{A_i} is the light absorbed by component i.

The concepts of macroscopic absorption will now be applied to a typical positive photoresist. A diazonaphthoquinone positive photoresist is made up of four major components: a base resin R that gives the resist its structural properties, a photoactive compound M (abbreviated PAC), exposure products P generated by the reaction of M with ultraviolet light, and a solvent S. Although photoresist drying during prebake is intended to drive off solvents, thermal studies have shown that a resist may contain 10% solvent after a 30-minute,
100°C prebake [22,23]. The absorption coefficient α is then

$$\alpha = a_M M + a_P P + a_R R + a_S S \qquad (2.30)$$

If M₀ is the initial PAC concentration (i.e., with no UV exposure), the stoichiometry of the exposure reaction gives

$$P = M_0 - M \qquad (2.31)$$

Equation 2.30 may be rewritten as [2]

$$\alpha = A\,m + B \qquad (2.32)$$

where A = (a_M − a_P)M₀, B = a_P M₀ + a_R R + a_S S, and m = M/M₀. A and B are called the bleachable and nonbleachable absorption coefficients, respectively, and make up the first two Dill photoresist parameters [2].

The quantities A and B are experimentally measurable [2] and can be easily related to typical resist absorbance curves, measured using a UV spectrophotometer. When the resist is fully exposed, M = 0 and

$$\alpha_{\text{exposed}} = B \qquad (2.33)$$

Similarly, when the resist is unexposed, m = 1 (M = M₀) and

$$\alpha_{\text{unexposed}} = A + B \qquad (2.34)$$
From this, A may be found by

$$A = \alpha_{\text{unexposed}} - \alpha_{\text{exposed}} \qquad (2.35)$$
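As a small worked illustration of Equations 2.32 through 2.35 (with assumed, not measured, absorption values), the sketch below converts unexposed and fully exposed absorption coefficients into A and B and then computes the film transmittance at several exposure states.

```python
# Numeric illustration of Eqs. 2.32-2.35 and the Lambert law, with assumed values.
import numpy as np

alpha_unexposed = 0.90   # 1/um, assumed
alpha_exposed = 0.07     # 1/um, assumed

A = alpha_unexposed - alpha_exposed      # Equation 2.35
B = alpha_exposed                        # Equation 2.33
thickness = 1.0                          # um, assumed

for m in (1.0, 0.5, 0.0):                # unexposed, half exposed, fully exposed
    alpha = A * m + B                    # Equation 2.32
    T = np.exp(-alpha * thickness)       # Lambert law, Equation 2.23
    print(f"m = {m:.1f}:  alpha = {alpha:.2f} 1/um,  transmittance = {T:.3f}")
```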
Thus, A(λ) and B(λ) may be determined from the UV absorbance curves of unexposed and completely exposed resist (Figure 2.7). As mentioned previously, Beer's law is empirical in nature and, therefore, should be verified experimentally. In the case of positive photoresists, this means formulating resist mixtures with differing photoactive-compound-to-resin ratios and measuring the
FIGURE 2.7 Resist parameters A and B as a function of wavelength measured using a UV spectrophotometer.
resulting A parameters. Previous work has shown that Beer's law is valid for conventional photoresists over the full practical range of PAC concentrations [24].

2.5.2 Exposure Kinetics

On a microscopic level, the absorption process can be thought of as photons being absorbed by an atom or molecule, causing an outer electron to be promoted to a higher energy state. This phenomenon is especially important for the photoactive compound because it is the absorption of UV light that leads to the chemical conversion of M to P:

$$M \xrightarrow{\ \mathrm{UV}\ } P \qquad (2.36)$$
This concept is stated in the first law of photochemistry: only the light that is absorbed by a molecule can be effective in producing photochemical change in the molecule. The actual chemistry of diazonaphthoquinone exposure is given below.

[Reaction scheme: the diazonaphthoquinone photoactive compound absorbs UV light and releases nitrogen (N₂); in the presence of water, the resulting intermediate forms the carboxylic acid photoproduct, with the sulfonate group R remaining attached throughout.]
The chemical reaction in Equation 2.36 can be rewritten in general form as

$$M \underset{k_2}{\overset{k_1}{\rightleftharpoons}} M^* \xrightarrow{\ k_3\ } P \qquad (2.37)$$

where M is the photoactive compound (PAC), M* is the molecule in an excited state, P is the carboxylic acid (product), and k₁, k₂, and k₃ are the rate constants for each reaction. Simple kinetics can now be applied. The proposed mechanism (Equation 2.37) assumes that all reactions are first order. Thus, the rate equation for each species can be written:

$$\frac{dM}{dt} = k_2 M^* - k_1 M, \qquad \frac{dM^*}{dt} = k_1 M - (k_2 + k_3) M^*, \qquad \frac{dP}{dt} = k_3 M^* \qquad (2.38)$$

This system of three coupled, linear, first-order differential equations can be solved exactly using Laplace transforms and the initial conditions

$$M(t=0) = M_0, \qquad M^*(t=0) = P(t=0) = 0 \qquad (2.39)$$
However, if one uses the steady-state approximation, the solution becomes much simpler. This approximation assumes that in a very short time the excited molecule M* comes to a steady state, i.e., M* is formed as quickly as it disappears. In mathematical form,
$$\frac{dM^*}{dt} = 0 \qquad (2.40)$$

A previous study has shown that M* does indeed come to a steady state quickly, on the order of 10⁻⁸ s or faster [25]. Thus,

$$\frac{dM}{dt} = -K M \qquad (2.41)$$

where

$$K = \frac{k_1 k_3}{k_2 + k_3}$$

Assuming K remains constant with time,

$$M = M_0\, e^{-K t} \qquad (2.42)$$

The overall rate constant, K, is a function of the intensity of the exposure radiation. An analysis of the microscopic absorption of a photon predicts that K is directly proportional to the intensity of the exposing radiation [24]. Thus, a more useful form of Equation 2.41 is

$$\frac{dm}{dt} = -C\, I\, m \qquad (2.43)$$
where the relative PAC concentration m (= M/M₀) has been used and C is the standard exposure rate constant and the third Dill photoresist parameter. A solution to the exposure rate Equation 2.43 is simple if the intensity within the resist is constant throughout the exposure. However, this is generally not the case. In fact, many resists bleach upon exposure, i.e., they become more transparent as the photoactive compound M is converted to product P. This corresponds to a positive value of A, as seen, for example, in Figure 2.7. Because the intensity varies as a function of exposure time, this variation must be known to solve the exposure rate equation. In the simplest possible case, a resist film coated on a substrate of the same index of refraction, only absorption affects the intensity within the resist. Thus, Lambert's law of absorption, coupled with Beer's law, can be applied:

$$\frac{dI}{dz} = -(A\,m + B)\, I \qquad (2.44)$$
where Equation 2.32 was used to relate the absorption coefficient to the relative PAC concentration. Equation 2.43 and Equation 2.44 are coupled, and thus become first-order nonlinear partial differential equations that must be solved simultaneously. This pair of equations was first solved numerically for the case of lithography simulation [2], but in fact had been solved analytically by Herrick [26] many years earlier. The same solution was also presented more recently by Diamond and Sheats [27] and by Babu and Barouch [28]. These solutions take the form of a single numerical integration, which is much simpler than solving two differential equations. Although an analytical solution exists for the simple problem of exposure with absorption only, in more realistic problems the variation of intensity with depth in the film is more complicated than Equation 2.44. In fact, the general exposure situation results in the formation of standing waves, as discussed previously. In such a case, Equation 2.15
through Equation 2.18 can give the intensity within the resist as a function of the PAC distribution m(x,y,z,t). Initially, this distribution is simply m(x,y,z,0) = 1. Thus, Equation 2.15, for example, would give I(x,y,z,0). The exposure rate equation (Equation 2.43) can then be integrated over a small increment of exposure time, Δt, to produce the PAC distribution m(x,y,z,Δt). The assumption is that over this small increment in exposure time the intensity remains relatively constant, leading to the exponential solution. This new PAC distribution is then used to calculate the new intensity distribution I(x,y,z,Δt), which in turn is used to generate the PAC distribution at the next increment of exposure time, m(x,y,z,2Δt). This process continues until the final exposure time is reached.

2.5.3 Chemically Amplified Resists

Chemically amplified photoresists are composed of a polymer resin (possibly “blocked” to inhibit dissolution), a photoacid generator (PAG), and possibly a crosslinking agent, dye, or other additive. As the name implies, the photoacid generator forms a strong acid when exposed to deep-UV light. Ito and Willson first proposed the use of an aryl onium salt [29], and triphenylsulfonium salts have been studied extensively as PAGs. The reaction of a common PAG is shown below:

[Reaction scheme: a triphenylsulfonium trifluoroacetate PAG absorbs a photon (hν) and generates trifluoroacetic acid, CF₃COOH, plus other products.]
The acid generated in this case (trifluoroacetic acid) is a derivative of acetic acid where the electron-withdrawing properties of the fluorines are used to greatly increase the acidity of the molecule. The PAG is mixed with the polymer resin at a concentration of typically 5%–15% by weight, with 10% as a typical formulation. The kinetics of the exposure reaction are standard first order:

$$\frac{\partial G}{\partial t} = -C\, I\, G \qquad (2.45)$$

where G is the concentration of PAG at time t (the initial PAG concentration is G₀), I is the exposure intensity, and C is the exposure rate constant. For constant intensity, the rate equation can be solved for G:

$$G = G_0\, e^{-C I t} \qquad (2.46)$$

The acid concentration H is given by

$$H = G_0 - G = G_0\left(1 - e^{-C I t}\right) \qquad (2.47)$$
Exposure of the resist with an aerial image I(x) results in an acid latent image H(x). A postexposure bake (PEB) is then used to thermally induce a chemical reaction. This may be the activation of a crosslinking agent for a negative resist or the deblocking of the polymer resin for a positive resist. The reaction is catalyzed by the acid so that the acid is not consumed by the reaction and H remains constant. Ito and Willson first proposed the concept of deblocking a polymer to change its solubility [29]. A base polymer such as poly (p-hydroxystyrene), PHS, is used which is very soluble in an aqueous base developer. The hydroxyl groups give the PHS its high solubility, so by “blocking” these sites (by reacting the hydroxyl group with some longer-chain molecule) the solubility can be reduced. Ito and Willson employed a t-butoxycarbonyl group (t-BOC), resulting in a
very slowly dissolving polymer. In the presence of acid and heat, the t-BOC blocked polymer will undergo acidolysis to generate the soluble hydroxyl group, as shown below.

[Reaction scheme: acid-catalyzed cleavage of the t-BOC group from the blocked poly(p-hydroxystyrene), regenerating the soluble hydroxyl (OH) group and releasing carbon dioxide (CO₂) and isobutylene.]
One drawback of this scheme is that the cleaved t-BOC is volatile and will evaporate, causing film shrinkage in the exposed areas. Higher-molecular-weight blocking groups can be used to reduce this film shrinkage to acceptable levels (below 10%). Also, the blocking group is such an effective inhibitor of dissolution that nearly every blocked site on the polymer must be deblocked to obtain significant dissolution. Thus, the photoresist can be made more “sensitive” by only partially blocking the PHS. Typical photoresists use 10%–30% of the hydroxyl groups blocked, with a typical value of 20%. Molecular weights for the PHS run in the range of 3000–5000, giving about 20–35 hydroxyl groups per molecule. Using M as the concentration of some reactive site, these sites are consumed (i.e., are reacted) according to kinetics of some unknown order n in H and first order in M [30]:

$$\frac{\partial M}{\partial t'} = -K_{\mathrm{amp}}\, M\, H^n \qquad (2.48)$$

where K_amp is the rate constant of the amplification reaction (crosslinking, deblocking, etc.) and t′ is the bake time. Simple theory would indicate that n = 1, but the general form will be used here. Assuming H is constant, Equation 2.48 can be solved for the concentration of reacted sites X:

$$X = M_0 - M = M_0\left(1 - e^{-K_{\mathrm{amp}} H^n t'}\right) \qquad (2.49)$$
(Note: Although H⁺ is not consumed by the reaction, the value of H is not locally constant. Diffusion during the PEB and acid-loss mechanisms cause local changes in the acid concentration, thus requiring the use of a reaction-diffusion system of equations. The approximation that H is constant is a useful one, however, that gives insight into the reaction as well as accurate results under some conditions.)
q 2007 by Taylor & Francis Group, LLC
hZ
H ; G0
xZ
X ; M0
mZ
M : M0
(2.50)
Optical Lithography Modeling
119
Equation 2.47 and Equation 2.49 become h Z 1KeKCIt ; n
m Z 1Kx Z eKah ;
(2.51)
where a is a lumped “amplification” constant equal to Gn0 Kamp t 0 . The result of the PEB is an amplified latent image m(x), corresponding to an exposed latent image h(x), resulting from the aerial image I(x). The above analysis of the kinetics of the amplification reaction assumed a locally constant concentration of acid, H. Although this could be exactly true in some circumstances, it is typically only an approximation and is often a poor approximation. In reality, the acid diffuses during the bake. In one dimension, the standard diffusion equation takes the form vH v vH DH Z ; (2.52) vt 0 vz vz where DH is the diffusivity of acid in the photoresist. Solving this equation requires a number of things: two boundary conditions, one initial condition, and a knowledge of the diffusivity as a function of position and time. The initial condition is the initial acid distribution within the film, H(x,0), resulting from the exposure of the PAG. The two boundary conditions are at the top and bottom surface of the photoresist film. The boundary at the wafer surface is assumed to be impermeable, giving a boundary condition of no diffusion into the wafer. The boundary condition at the top of the wafer will depend on the diffusion of acid into the atmosphere above the wafer. Although such acid loss is a distinct possibility, it will not be treated here. Instead, the top surface of the resist will also be assumed to be impermeable. The solution of Equation 2.52 can now be performed if the diffusivity of the acid in the photoresist is known. Unfortunately, this solution is complicated by two important factors: the diffusivity is a strong function of temperature and, most probably, the extent of amplification. because the temperature is changing with time during the bake, the diffusivity will be time-dependent. The concentration dependence of diffusivity results from an increase in free volume for typical positive resists: as the amplification reaction proceeds, the polymer blocking group evaporates resulting in a decrease in film thickness but also an increase in free volume. Because the acid concentration is time and position dependent, the diffusivity in Equation 2.52 must be determined as a part of the solution of Equation 2.52 by an iterative method. The resulting simultaneous solution of Equation 2.48 and Equation 2.52 is called a reaction-diffusion system. The temperature dependence of the diffusivity can be expressed in a standard Arrhenius form: D0 ðTÞ Z AR expðKEa =RTÞ;
(2.53)
where D0 is a general diffusivity, AR is the Arrhenius coefficient, and Ea is the activation energy. A full treatment of the amplification reaction would include a thermal model of the hotplate to determine the actual time-temperature history of the wafer [31]. To simplify the problem, an ideal temperature distribution will be assumed: the temperature of the resist is zero (low enough for no diffusion or reaction) until the start of the bake, at which time it immediately rises to the final bake temperature, stays constant for the duration of the bake, then instantly falls back to zero. The concentration dependence of the diffusivity is less obvious. Several authors have proposed and verified the use of different models for the concentration dependence
of diffusion within a polymer. Of course, the simplest form (besides a constant diffusivity) would be a linear model. Letting D_0 be the diffusivity of acid in completely unreacted resist and D_f be the diffusivity of acid in completely reacted resist,

D_H = D_0 + x(D_f - D_0).  (2.54)
Here, diffusivity is expressed as a function of the extent of the amplification reaction. Another common form is the Fujita–Doolittle equation [32] that can be predicted theoretically using free-volume arguments. A form of that equation which is convenient for calculations is shown here:

D_H = D_0 exp[a x/(1 + b x)],  (2.55)
where a and b are experimentally determined constants and are, in general, temperature-dependent. Other concentration relations are also possible [33], but the Fujita–Doolittle expression will be used in this work. Through a variety of mechanisms, acid formed by exposure of the resist film can be lost and thus not contribute to the catalyzed reaction to change the resist solubility. There are two basic types of acid loss: loss that occurs between exposure and postexposure bake, and loss that occurs during the postexposure bake. The first type of loss leads to delay-time effects—the resulting lithography is affected by the delay time between exposure and postexposure bake. Delay-time effects can be very severe and, of course, are very detrimental to the use of such a resist in a manufacturing environment [34,35]. The typical mechanism for delay-time acid loss is the diffusion of atmospheric base contaminants into the top surface of the resist. The result is a neutralization of the acid near the top of the resist and a correspondingly reduced amplification. For a negative resist, the top portion of a line is not insolubilized and resist is lost from the top of the line. For a positive resist, the effects are more devastating. Sufficient base contamination can make the top of the resist insoluble, blocking dissolution into the bulk of the resist. In extreme cases, no patterns can be observed after development. Another possible delay-time acid-loss mechanism is base contamination from the substrate, as has been observed on TiN substrates [35]. The effects of acid loss due to atmospheric base contaminants can be accounted for in a straightforward manner [36]. The base diffuses slowly from the top surface of the resist into the bulk. Assuming that the concentration of base contaminant in contact with the top of the resist remains constant, the diffusion equation can be solved for the concentration of base, B, as a function of depth into the resist film:

B = B_0 exp[-(z/σ)^2],  (2.56)
where B_0 is the base concentration at the top of the resist film, z is the depth into the resist (z = 0 at the top of the film), and σ is the diffusion length of the base in the resist. The standard assumption of constant diffusivity has been made here so that the diffusion length goes as the square root of the delay time. Because the acid generated by exposure for most resist systems of interest is fairly strong, it is a good approximation to assume that all of the base contaminant will react with acid if there is sufficient acid present. Thus, the acid concentration at the beginning of
the PEB, H*, is related to the acid concentration after exposure, H, by

H* = H - B  or  h* = h - b,  (2.57)
where the lower-case symbols again represent the concentration relative to G_0, the initial photoacid-generator concentration. Acid loss during the PEB could occur by other mechanisms. For example, as the acid diffuses through the polymer, it may encounter sites that "trap" the acid, rendering it unusable for further amplification. If these traps were in much greater abundance than the acid itself (for example, sites on the polymer), the resulting acid loss rate would be first-order:

∂h/∂t' = -K_loss h,  (2.58)
where K_loss is the acid-loss reaction rate constant. Of course, other more complicated acid-loss mechanisms can be proposed, but in the absence of data supporting them, the simple first-order loss mechanism will be used here. Acid can also be lost at the two interfaces of the resist. At the top of the resist, acid can evaporate. The amount of evaporation is a function of the size of the acid and the degree of its interaction with the resist polymer. A small acid (such as the trifluoroacetic acid discussed above) may have very significant evaporation. A separate rate equation can be written for the rate of evaporation of acid:

∂h/∂t' |_{z=0} = -K_evap (h(0,t') - h_air(0,t')),  (2.59)
where z = 0 is the top of the resist and h_air is the acid concentration in the atmosphere just above the photoresist surface. Typically, the PEB takes place in a reasonably open environment with enough air flow to eliminate any buildup of evaporated acid above the resist, making h_air = 0. If K_evap is very small, then virtually no evaporation takes place and the top boundary of the resist is said to be impenetrable. If K_evap is very large (resulting in evaporation that is much faster than the rate of diffusion), the effect is to bring the surface concentration of acid in the resist to zero. At the substrate there is also a possible mechanism for acid loss. Substrates containing nitrogen (such as titanium nitride and silicon nitride) often exhibit a foot at the bottom of the resist profile [35]. Most likely, the nitrogen acts as a site for trapping acid molecules, which gives a locally diminished acid concentration at the bottom of the resist. This, of course, leads to reduced amplification and a slower development rate, resulting in the resist foot. The kinetics of this substrate acid loss will depend on the concentration of acid trap sites at the substrate, S. It will be more useful to express this concentration relative to the initial concentration of PAG:

s = S/G_0.  (2.60)
A simple trapping mechanism would have one substrate trap site react with one acid molecule:

∂h/∂t' |_{z=D} = -K_trap h(D,t') s.  (2.61)
Of course, the trap sites would be consumed at the same rate as the acid. Thus, knowing the rate constant K_trap and the initial relative concentration of substrate trapping sites s_0, one can include Equation 2.61 in the overall mechanism of acid loss. The combination of a reacting system and a diffusing system where the diffusivity is dependent on the extent of reaction is called a reaction-diffusion system. The solution of such a system is the simultaneous solution of Equation 2.48 and Equation 2.52 using Equation 2.47 as an initial condition, and Equation 2.54 or Equation 2.55 to describe the reaction-dependent diffusivity. Of course, any or all of the acid-loss mechanisms can also be included. A convenient and straightforward method to solve such equations is the finite difference method (see, for example, Incropera and DeWitt [37]). The equations are solved by approximating the differential equations by difference equations. By marching through time and solving for all space at each time step, the final solution is the result after the final time step. A key part of an accurate solution is the choice of a sufficiently small time step: if the spatial dimension of interest is Δx (or Δy or Δz), the time step should be chosen such that the diffusion length per step is less than Δx (a diffusion length of about one third of Δx is common).
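The finite-difference approach just described is easy to sketch in a few lines of code. The following is a minimal, illustrative 1-D solver for the acid reaction-diffusion system (Equation 2.52 with the Fujita–Doolittle diffusivity of Equation 2.55 and an optional first-order acid loss as in Equation 2.58). The function name, parameter values, and the lumped amplification constant are assumptions made only for this sketch; they are not taken from the text or from any simulator.

import numpy as np

def peb_reaction_diffusion(h0, dz_nm, t_peb_s, n=1, k_amp=0.05, D0=5.0,
                           a_fd=2.0, b_fd=1.0, k_loss=0.0, dt_s=0.01):
    """Explicit finite-difference sketch of the 1-D PEB reaction-diffusion system.
    h0 is the normalized acid profile h(z) after exposure; returns (m, h) after the bake.
    k_amp lumps K_amp*G0^n (1/s); D0 (nm^2/s), a_fd, b_fd, and k_loss are assumed values."""
    h = np.asarray(h0, dtype=float).copy()
    x = np.zeros_like(h)                       # extent of the amplification reaction
    for _ in range(int(round(t_peb_s / dt_s))):
        # reaction-dependent diffusivity (Fujita-Doolittle, Equation 2.55)
        D = D0 * np.exp(a_fd * x / (1.0 + b_fd * x))
        # flux-form Laplacian with impermeable (no-flux) top and bottom boundaries
        hp = np.pad(h, 1, mode="edge")
        Dp = np.pad(D, 1, mode="edge")
        flux_r = 0.5 * (Dp[1:-1] + Dp[2:]) * (hp[2:] - hp[1:-1])
        flux_l = 0.5 * (Dp[1:-1] + Dp[:-2]) * (hp[1:-1] - hp[:-2])
        diff = (flux_r - flux_l) / dz_nm**2
        # catalytic amplification (cf. Equation 2.48) and first-order loss (Equation 2.58)
        dxdt = k_amp * (1.0 - x) * h**n
        h = np.clip(h + dt_s * (diff - k_loss * h), 0.0, None)
        x = np.minimum(x + dt_s * dxdt, 1.0)
    return 1.0 - x, h                          # m = 1 - x (unreacted sites), final acid

For stability of this explicit scheme, dt_s should be kept small enough that max(D)·dt_s/dz_nm² stays below about 0.5, which is the same requirement as keeping the per-step diffusion length below the grid spacing, as noted above.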
2.6 Photoresist Bake Effects

2.6.1 Prebake

The purpose of a photoresist prebake (also called a postapply bake) is to dry the resist by removing solvent from the film. However, as with most thermal processing steps, the bake has other effects on the photoresist. When heated to temperatures above about 70°C, the photoactive compound (PAC) of a diazo-type positive photoresist begins to decompose to a nonphotosensitive product. The reaction mechanism is thought to be identical to that of the PAC reaction during ultraviolet exposure [22,23,38,39]:

(Reaction scheme 2.62: thermal decomposition (Δ) of the diazo-type PAC with loss of N2 to give the decomposition product X; structure not reproduced here.)  (2.62)
The identity of the product X will be discussed in a following section. To determine the concentration of PAC as a function of prebake time and temperature, consider the first-order decomposition reaction

M →(Δ) X,  (2.63)

where M is the photoactive compound. If M_00 represents the concentration of PAC before prebake and M_0 represents the concentration of PAC after prebake, simple kinetics dictate that

dM_0/dt = -K_T M_0,  M_0 = M_00 exp(-K_T t_b),  m' = exp(-K_T t_b),  (2.64)
where t_b is the bake time, K_T is the decomposition rate constant at temperature T, and m' = M_0/M_00. The dependence of K_T upon temperature may be described by the Arrhenius equation,

K_T = A_R exp(-E_a/RT),  (2.65)
where A_R is the Arrhenius coefficient, E_a is the activation energy, and R is the universal gas constant. Thus, the two parameters E_a and A_R allow m' to be known as a function of the prebake conditions, provided Arrhenius behavior is followed. In polymer systems, caution must be exercised because bake temperatures near the glass-transition temperature sometimes lead to non-Arrhenius behavior. For normal prebakes of typical photoresists, the Arrhenius model appears well founded. The effect of this decomposition is a change in the chemical makeup of the photoresist. Thus, any parameters that are dependent upon the quantitative composition of the resist are also dependent upon prebake. The most important of these parameters fall into two categories: (1) optical (exposure) parameters such as the resist absorption coefficient, and (2) development parameters such as the development rates of unexposed and completely exposed resist. A technique will be described to measure E_a and A_R and thus quantify these effects of prebake. In the model proposed by Dill et al. [2], the exposure of a positive photoresist can be characterized by three parameters: A, B, and C. A and B are related to the optical absorption coefficient of the photoresist, α, and C is the overall rate constant of the exposure reaction. More specifically,

α = Am + B,  A = (a_M - a_P) M_0,  B = a_P M_0 + a_R R + a_S S,  (2.66)

where a_M is the molar absorption coefficient of the photoactive compound M, a_P is the molar absorption coefficient of the exposure product P, a_S is the molar absorption coefficient of the solvent S, a_R is the molar absorption coefficient of the resin R, M_0 is the PAC concentration at the start of the exposure (i.e., after prebake), and m = M/M_0 is the relative PAC concentration as a result of exposure. These expressions do not explicitly take into account the effects of prebake on the resist composition. To do so, we can modify Equation 2.66 to include absorption by the component X:

B = a_P M_0 + a_R R + a_X X,  (2.67)

where a_X is the molar absorption coefficient of the decomposition product X and the absorption term for the solvent has been neglected. The stoichiometry of the decomposition reaction gives

X = M_00 - M_0.  (2.68)

Thus,

B = a_P M_0 + a_R R + a_X (M_00 - M_0).  (2.69)
Consider two cases of interest: no-bake (NB) and full-bake (FB). When there is no prebake (meaning no decomposition), M_00 = M_0 and

A_NB = (a_M - a_P) M_00,  B_NB = a_P M_00 + a_R R.  (2.70)

Full bake shall be defined as a prebake that decomposes all PAC. Thus M_0 = 0 and

A_FB = 0,  B_FB = a_X M_00 + a_R R.  (2.71)

Using these special cases in our general expressions for A and B,

A = A_NB m',  B = B_FB - (B_FB - B_NB) m'.  (2.72)
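As a quick numerical illustration of Equation 2.64, Equation 2.65, and Equation 2.72, the sketch below computes the surviving PAC fraction m' and the corresponding A and B parameters for a given bake time and temperature. All parameter values (E_a, A_R, A_NB, B_NB, B_FB) are placeholders chosen only for illustration; they are not measured values for any particular resist.

import numpy as np

R_GAS = 1.987e-3   # universal gas constant, kcal/(mol K)

def prebake_pac_and_dill(t_bake_min, T_celsius, Ea=30.0, lnAR=35.0,
                         A_NB=0.9, B_NB=0.05, B_FB=0.25):
    """Surviving PAC fraction m' (Equations 2.64-2.65) and the prebaked
    Dill parameters A, B (Equation 2.72). All defaults are illustrative only."""
    K_T = np.exp(lnAR - Ea / (R_GAS * (T_celsius + 273.15)))   # 1/min
    m_prime = np.exp(-K_T * t_bake_min)
    A = A_NB * m_prime
    B = B_FB - (B_FB - B_NB) * m_prime
    return m_prime, A, B

# Example: a 30 min, 100 C convection-oven bake with the assumed constants
m_prime, A, B = prebake_pac_and_dill(30.0, 100.0)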
The A parameter decreases linearly as decomposition occurs, and B typically increases slightly. The development rate is, of course, dependent on the concentration of PAC in the photoresist. However, the product X can also have a large effect on the development rate. Several studies have been performed to determine the composition of the product X [22,23,39]. The results indicate that there are two possible products and the most common outcome of a prebake decomposition is a mixture of the two. The first product is formed via the reaction in Equation 2.73 and is identical to the product of UV exposure:

(Reaction scheme 2.73: in the presence of water, the ketene (C=O) formed by the decomposition reacts with H2O to give the carboxylic acid (COOH) product; structure not reproduced here.)  (2.73)
As can be seen, this reaction requires the presence of water. A second reaction, which does not require water, is the esterification of the ketene with the resin:

(Reaction scheme 2.74: esterification of the ketene with the phenolic resin (Resin–OH) to form an ester product; structure not reproduced here.)  (2.74)
Both possible products have a dramatic effect on dissolution rate. The carboxylic acid is very soluble in developer and enhances dissolution. The formation of carboxylic acid can be thought of as a blanket exposure of the resist. The dissolution rate of unexposed resist (r_min) will increase due to the presence of the carboxylic acid. The dissolution rate of fully exposed resist (r_max), however, will not be affected. Because the chemistry of the
dissolution process is unchanged, the basic shape of the development rate function will also remain unchanged. The ester, on the other hand, is very difficult to dissolve in aqueous solutions and thus retards the dissolution process. It will have the effect of decreasing r_max, although the effects of ester formation on the full dissolution behavior of a resist are not well known. If the two mechanisms given in Equation 2.73 and Equation 2.74 are taken into account, the rate from Equation 2.64 will become

dM_0/dt = -K_1 M_0 - K_2 [H2O] M_0,  (2.75)
where K_1 and K_2 are the rate constants of Equation 2.73 and Equation 2.74, respectively. For a given concentration of water in the resist film, this reverts to Equation 2.64, where

K_T = K_1 + K_2 [H2O].  (2.76)
Thus, the relative importance of the two reactions will depend not only on the ratio of the rate constants, but also on the amount of water in the resist film. The concentration of water is a function of atmospheric conditions and the past history of the resist-coated wafer. Further experimental measurements of development rate as a function of prebake temperature are needed to quantify these effects. Examining Equation 2.72, one can see that the parameter A can be used as a means of measuring m 0 , the fraction of PAC remaining after prebake. Thus, by measuring A as a function of prebake time and temperature, one can determine the activation energy and the corresponding Arrhenius coefficient for the proposed decomposition reaction. Using the technique given by Dill et al. [2], A, B, and C can be easily determined by measuring the optical transmittance of a thin photoresist film on a glass substrate while the resist is being exposed. Examples of measured transmittance curves are given in Figure 2.8, where transmittance is plotted vs. exposure dose. The different curves represent different prebake temperatures. For every curve, A, B, and C can be calculated. Figure 2.9 shows the variation of the resist parameter A with prebake conditions. According to Equation 2.64 and Equation 2.72, this variation should take the form
FIGURE 2.8 Two transmittance curves for Kodak 820 resist at 365 nm. The curves are for a convectionoven prebake of 30 min at the temperatures shown. (From Mack, C. A. and Carback, R.T. in Proceedings of the Kodak Microelectronics Seminar, 1985, 155–158.)
FIGURE 2.9 The variation of the resist absorption parameter A with prebake time and temperature for Kodak 820 resist at 365 nm. (From Mack, C. A. and Carback, R. T. in Proceedings of the Kodak Microelectronics Seminar, 1985, 155–158.)
A/A_NB = e^{-K_T t_b},  (2.77)

ln(A/A_NB) = -K_T t_b.  (2.78)
Thus, a plot of ln(A) vs. bake time should give a straight line with a slope equal to -K_T. This plot is shown in Figure 2.10. Knowing K_T as a function of temperature, one can determine the activation energy and Arrhenius coefficient from Equation 2.65. One should note that the parameters A_NB, B_NB, and B_FB are wavelength-dependent, but E_a and A_R are not. Figure 2.9 shows an anomaly in which there is a lag time before decomposition occurs. This lag time is the time it took the wafer and wafer carrier to reach the temperature of the convection oven. Equation 2.64 can be modified to accommodate this phenomenon:
FIGURE 2.10 Log plot (ln(A/A_NB) vs. bake time) of the resist absorption parameter A with prebake time and temperature for Kodak 820 resist at 365 nm. (From Mack, C. A. and Carback, R. T. in Proceedings of the Kodak Microelectronics Seminar, 1985, 155–158.)
m' = e^{-K_T (t_b - t_wup)},  (2.79)
where t_wup is the warm-up time. A lag time of about 11 min was observed when convection-oven baking a 1/4-inch-thick glass substrate in a wafer carrier. When a 60-mil glass wafer was used without a carrier, the warm-up time was under 5 min and could not be measured accurately in this experiment [40]. Although all the data presented thus far have been for convection oven prebake, the above method of evaluating the effects of prebake can also be applied to hotplate prebaking.

2.6.2 Postexposure Bake

Many attempts have been made to reduce the standing-wave effect and thus increase linewidth control and resolution. One particularly useful method is the post-exposure, pre-development bake as described by Walker [41]. A 100°C oven bake for 10 min was found to reduce the standing-wave ridges significantly. This effect can be explained quite simply as the diffusion of photoactive compound (PAC) in the resist during a high-temperature bake. A mathematical model which predicts the results of such a post-exposure bake (PEB) is described below. In general, molecular diffusion is governed by Fick's second law of diffusion, which states (in one dimension):

∂C_A/∂t = D ∂²C_A/∂x²,
(2.80)
where C_A is the concentration of species A, D is the diffusion coefficient of A at some temperature T, and t is the time that the system is at temperature T. Note that the diffusivity is assumed to be independent of concentration here. This differential equation can be solved given a set of boundary conditions, i.e., an initial distribution of A. One possible boundary condition is known as the impulse source: at some point x_0 there are N moles of substance A, and at all other points there is no A. Thus, the concentration at x_0 is infinite. Given this initial distribution of A, the solution to Equation 2.80 is the Gaussian distribution function,

C_A(x) = [N/√(2πσ²)] e^{-r²/2σ²},  (2.81)

where σ = √(2Dt), the diffusion length, and r = x - x_0. In practice there are no impulse sources. Instead, we can approximate an impulse source as having some concentration C_0 over some small distance Δx centered at x_0, with zero concentration outside of this range. An approximate form of Equation 2.81 is then

C_A(x) = [C_0 Δx/√(2πσ²)] e^{-r²/2σ²}.  (2.82)

This solution is fairly accurate if Δx < 3σ. If there are two "impulse" sources located at x_1 and x_2, with initial concentrations C_1 and C_2 each over a range Δx, the concentration of A at x after diffusion is

C_A(x) = { [C_1/√(2πσ²)] e^{-r_1²/2σ²} + [C_2/√(2πσ²)] e^{-r_2²/2σ²} } Δx,  (2.83)

where r_1 = x - x_1 and r_2 = x - x_2.
If there are a number of sources, Equation 2.83 becomes

C_A(x) = [Δx/√(2πσ²)] Σ_n C_n e^{-r_n²/2σ²}.  (2.84)
Extending the analysis to a continuous initial distribution C_0(x), Equation 2.84 becomes

C_A(x) = [1/√(2πσ²)] ∫_{-∞}^{∞} C_0(x - x') e^{-x'²/2σ²} dx',  (2.85)

where x' is now the distance from the point x. Equation 2.85 is simply the convolution of two functions,

C_A(x) = C_0(x) ⊗ f(x),  (2.86)

where

f(x) = [1/√(2πσ²)] e^{-x²/2σ²}.
This equation can now be made to accommodate two-dimensional diffusion:

C_A(x,y) = C_0(x,y) ⊗ f(x,y),  (2.87)

where

f(x,y) = [1/(2πσ²)] e^{-r²/2σ²},  r = √(x² + y²).
We are now ready to apply Equation 2.87 to the diffusion of PAC in a photoresist during a postexposure bake. After exposure, the PAC distribution can be described by m(x,z), where m is the relative PAC concentration. According to Equation 2.87, the relative PAC concentration after a postexposure bake, m*(x,z), is given by

m*(x,z) = [1/(2πσ²)] ∫∫_{-∞}^{∞} m(x - x', z - z') e^{-r'²/2σ²} dx' dz'.  (2.88)
In evaluating Equation 2.88 it is common to replace the integrals by summations over intervals Δx and Δz. In such a case, the restrictions that Δx < 3σ and Δz < 3σ will apply. An alternative solution is to solve the diffusion equation (Equation 2.80) directly, for example using a finite-difference approach. The diffusion model can now be used to simulate the effects of a postexposure bake. Using the lithography simulator, a resist profile can be generated. By including the model for a postexposure bake, the profile can be generated showing how the standing-wave effect is reduced. The only parameter that needs to be specified in Equation 2.88 is the
diffusion length σ, or equivalently, the diffusion coefficient D and the bake time t. In turn, D is a function of the bake temperature, T, and, of course, the resist system used. Thus, if the functionality of D with temperature is known for a given resist system, a PEB of time t and temperature T can be modeled. A general temperature dependence for the diffusivity D can be found using the Arrhenius equation (for temperature ranges which do not traverse the glass-transition temperature):

D = D_0 e^{-E_a/RT},  (2.89)

where D_0 is the Arrhenius constant (units of nm²/min), E_a is the activation energy, R is the universal gas constant, and T is the temperature in Kelvin. Unfortunately, very little work has been done in measuring the diffusivity of photoactive compounds in photoresists.
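A direct way to apply Equation 2.88 numerically is to convolve the post-exposure PAC distribution with a discrete Gaussian kernel whose width is the diffusion length. The sketch below assumes a regular (x, z) grid and uses simple edge padding at the resist boundaries; the function name and arguments are illustrative only, not those of any particular simulator.

import numpy as np

def peb_gaussian_diffusion(m, dx_nm, dz_nm, D_nm2_s, t_s):
    """Relative PAC distribution after the bake, m*(x,z) of Equation 2.88,
    computed as a discrete convolution with a normalized Gaussian kernel."""
    sigma = np.sqrt(2.0 * D_nm2_s * t_s)             # diffusion length
    if sigma == 0.0:
        return np.array(m, dtype=float, copy=True)
    kx = np.arange(-int(np.ceil(3 * sigma / dx_nm)),
                    int(np.ceil(3 * sigma / dx_nm)) + 1) * dx_nm
    kz = np.arange(-int(np.ceil(3 * sigma / dz_nm)),
                    int(np.ceil(3 * sigma / dz_nm)) + 1) * dz_nm
    X, Z = np.meshgrid(kx, kz, indexing="ij")
    kern = np.exp(-(X**2 + Z**2) / (2.0 * sigma**2))
    kern /= kern.sum()                                # discrete normalization
    px, pz = len(kx) // 2, len(kz) // 2
    m_arr = np.asarray(m, dtype=float)
    mp = np.pad(m_arr, ((px, px), (pz, pz)), mode="edge")
    out = np.empty_like(m_arr)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(mp[i:i + 2 * px + 1, j:j + 2 * pz + 1] * kern)
    return out

The grid spacings should satisfy Δx < 3σ and Δz < 3σ, as noted above, for the discrete sum to approximate the integral well.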
2.7 Photoresist Development

An overall positive resist processing model requires a mathematical representation of the development process. Previous attempts have taken the form of empirical fits to development rate data as a function of exposure [2,42]. The model formulated below begins on a more fundamental level, with a postulated reaction mechanism that then leads to a development rate equation [43]. The rate constants involved can be determined by comparison with experimental data. An enhanced kinetic model with a second mechanism for dissolution inhibition is also presented [44]. Deviations from the expected development rates have been reported under certain conditions at the surface of the resist. This effect, called surface induction or surface inhibition, can be related empirically to the expected development rate, i.e., to the bulk development rate as predicted by a kinetic model. Unfortunately, fundamental experimental evidence of the exact mechanism of photoresist development is lacking. The model presented below is reasonable, and the resulting rate equation has been shown to describe actual development rates extremely well. However, faith in the exact details of the mechanism is limited by this dearth of fundamental studies.

2.7.1 Kinetic Development Model

To derive an analytical development rate expression, a kinetic model of the development process will be used. This approach involves proposing a reasonable mechanism for the development reaction and then applying standard kinetics to this mechanism to derive a rate equation. It will be assumed that the development of a diazo-type positive photoresist involves three processes: diffusion of developer from the bulk solution to the surface of the resist, reaction of the developer with the resist, and diffusion of the product back into the solution. For this analysis, it is assumed that the last step—diffusion of the dissolved resist into solution—occurs very quickly so that this step may be ignored. The first two steps in the proposed mechanism will now be examined. The diffusion of developer to the resist surface can be described with the simple diffusion rate equation, given approximately by

r_D = k_D (D - D_S),  (2.90)
where rD is the rate of diffusion of the developer to the resist surface, D is the bulk
developer concentration, D_S is the developer concentration at the resist surface, and k_D is the rate constant. A mechanism will now be proposed for the reaction of developer with the resist. The resist is composed of large macromolecules of resin R along with a photoactive compound M, which converts to product P upon exposure to UV light. The resin is quite soluble in the developer solution, but the presence of the PAC (photoactive compound) acts as an inhibitor to dissolution, making the development rate very slow. The product P, however, is very soluble in developer, enhancing the dissolution rate of the resin. Assume that n molecules of product P react with the developer to dissolve a resin molecule. The rate of the reaction is

r_R = k_R D_S P^n,
(2.91)
where r_R is the rate of reaction of the developer with the resist and k_R is the rate constant. (Note that the mechanism shown in Equation 2.91 is the same as the "polyphotolysis" model described by Trefonas and Daniels [45].) From the stoichiometry of the exposure reaction,

P = M_0 - M,
(2.92)
where M_0 is the initial PAC concentration (i.e., before exposure). The two steps outlined above are in series, i.e., one reaction follows the other. Thus, the two steps will come to a steady state such that

r_R = r_D = r.
(2.93)
Equating the rate equations, one can solve for D_S and eliminate it from the overall rate equation, giving

r = k_D k_R D P^n / (k_D + k_R P^n).  (2.94)
Using Equation 2.92 and letting m = M/M_0, the relative PAC concentration, Equation 2.94 becomes

r = k_D D (1 - m)^n / [k_D/(k_R M_0^n) + (1 - m)^n].  (2.95)
When m = 1 (resist unexposed), the rate is zero. When m = 0 (resist completely exposed), the rate is equal to r_max, where

r_max = k_D D / [k_D/(k_R M_0^n) + 1].  (2.96)
If a constant a is defined such that

a = k_D/(k_R M_0^n),  (2.97)
the rate equation becomes

r = r_max (a + 1)(1 - m)^n / [a + (1 - m)^n].  (2.98)
Note that the simplifying constant a describes the rate constant of diffusion relative to the surface reaction rate constant. A large value of a will mean that diffusion is very fast, and thus less important, compared to the fastest surface reaction (for completely exposed resist). There are three constants that must be determined experimentally: a, n, and r_max. The constant a can be put in a more physically meaningful form as follows. A characteristic of some experimental rate data is an inflection point in the rate curve at about m = 0.2–0.7. The point of inflection can be calculated by setting d²r/dm² = 0, giving

a = [(n + 1)/(n - 1)] (1 - m_TH)^n,  (2.99)
where m_TH is the value of m at the inflection point, called the threshold PAC concentration. This model does not take into account the finite dissolution rate of unexposed resist (r_min). One approach is simply to add this term to Equation 2.98, giving

r = r_max (a + 1)(1 - m)^n / [a + (1 - m)^n] + r_min.  (2.100)
This approach assumes that the mechanism of development of the unexposed resist is independent of the above-proposed development mechanism. In other words, there is a finite dissolution of resin that occurs by a mechanism that is independent of the presence of exposed PAC. Consider the case when the diffusion rate constant is large compared to the surface reaction rate constant. If a >> 1, the development rate in Equation 2.100 will become

r = r_max (1 - m)^n + r_min.  (2.101)
The interpretation of a as a function of the threshold PAC concentration m_TH given by Equation 2.99 means that a very large a would correspond to a large negative value of m_TH. In other words, if the surface reaction is very slow compared to the mass transport of developer to the surface, there will be no inflection point in the development rate data and Equation 2.101 will apply. It is quite apparent that Equation 2.101 could be derived directly from Equation 2.91 if the diffusion step were ignored.

2.7.2 Enhanced Kinetic Development Model

The previous kinetic model is based on the principle of dissolution enhancement. The carboxylic acid enhances the dissolution rate of the resin/PAC mixture. In reality, this is a simplification. There are really two mechanisms at work. The PAC acts to inhibit
dissolution of the resin while the acid acts to enhance dissolution. Thus, the rate expression should reflect both of these mechanisms. A new model, called the enhanced kinetic model, was proposed to include both effects [44]:

R = R_resin [1 + k_enh (1 - m)^n] / [1 + k_inh m^l],  (2.102)

where k_enh is the rate constant for the enhancement mechanism, n is the enhancement reaction order, k_inh is the rate constant for the inhibition mechanism, l is the inhibition reaction order, and R_resin is the development rate of the resin alone. For no exposure, m = 1 and the development rate is at its minimum. From Equation 2.102,

R_min = R_resin / (1 + k_inh).  (2.103)
Similarly, when m = 0, corresponding to complete exposure, the development is at its maximum:

R_max = R_resin (1 + k_enh).  (2.104)
Thus, the development rate expression can be characterized by five parameters: R_max, R_min, R_resin, n, and l. Obviously, the enhanced kinetic model for resist dissolution is a superset of the original kinetic model. If the inhibition mechanism is not important, then l = 0. For this case, Equation 2.102 is identical to Equation 2.101 when

R_min = R_resin,  R_max = R_resin k_enh.  (2.105)
The enhanced kinetic model of Equation 2.102 assumes that mass transport of developer to the resist surface is not significant. Of course, a simple diffusion of developer can be added to this mechanism as was done above with the original kinetic model.

2.7.3 Surface Inhibition

The kinetic models given above predict the development rate of the resist as a function of the photoactive compound concentration remaining after the resist has been exposed to UV light. There are, however, other parameters that are known to affect the development rate, but which were not included in this model. The most notable deviation from the kinetic theory is the surface inhibition effect. The inhibition, or surface induction, effect is a decrease in the expected development rate at the surface of the resist [38,46,47]. Thus, this effect is a function of the depth into the resist and requires a new description of development rate. Several factors have been found to contribute to the surface inhibition effect. High temperature baking of the photoresist has been found to produce surface inhibition and is thought to cause oxidation of the resist at the resist surface [38,46,47]. In particular, prebaking the photoresist may cause this reduced development rate phenomenon [38,47]. Alternatively, the induction effect may be the result of reduced solvent content near the resist surface. Of course, the degree to which this effect is observed depends upon the prebake time and temperature. Finally, surface inhibition can be induced with the use of surfactants in the developer.
An empirical model can be used to describe the positional dependence of the development rate. If it is assumed that the development rate near the surface of the resist exponentially approaches the bulk development rate, the rate as a function of depth, r(z), is

r(z) = r_B [1 - (1 - r_0) e^{-z/δ}],  (2.106)
where r_B is the bulk development rate, r_0 is the development rate at the surface of the resist relative to r_B, and δ is the depth of the surface inhibition layer. In several resists, the induction effect has been found to take place over a depth of about 100 nm [38,47].
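The bulk rate of Equation 2.100, with the constant a written in terms of the threshold PAC concentration through Equation 2.99, combines naturally with the surface-inhibition factor of Equation 2.106 into a single depth-dependent rate function. The following sketch uses illustrative parameter values and function names; it is not the parameterization of any particular resist.

import numpy as np

def bulk_rate(m, r_max=100.0, r_min=0.1, n=5, m_th=0.5):
    """Kinetic development rate of Equation 2.100 (same units as r_max), with
    the constant a expressed through the threshold PAC concentration
    (Equation 2.99). Parameter values are illustrative only."""
    a = (n + 1.0) / (n - 1.0) * (1.0 - m_th) ** n
    return r_max * (a + 1.0) * (1.0 - m) ** n / (a + (1.0 - m) ** n) + r_min

def rate_vs_depth(m, z_nm, r0_rel=0.2, delta_nm=100.0, **bulk_kwargs):
    """Surface-inhibited rate r(z) of Equation 2.106 applied to the bulk rate."""
    return bulk_rate(m, **bulk_kwargs) * (1.0 - (1.0 - r0_rel) * np.exp(-z_nm / delta_nm))

At z = 0 the rate is r0_rel times the bulk value, and it recovers to the bulk value within a few inhibition depths δ, in keeping with the empirical model above.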
2.8 Linewidth Measurement

A cross-section of a photoresist profile has, in general, a very complicated two-dimensional shape (Figure 2.11). To compare the shapes of two different profiles, one must find a convenient description for the shapes of the profiles that somehow reflects their salient qualities. The most common description is to model the resist profile as a trapezoid. Thus, three numbers can be used to describe the profile: the width of the base of the trapezoid (linewidth, w), its height (resist thickness, D), and the angle that the side makes with the base (sidewall angle, θ). Obviously, to describe such a complicated shape as a resist profile with just three numbers is a great, though necessary, simplification. The key to success is to pick a method of fitting a trapezoid to the profile which preserves the important features of the profile, is numerically practical, and as a result is not overly sensitive to slight changes in the profile. There are many possible algorithms for measuring the resist profile. One algorithm, called the linear weight method, is designed to mimic the behavior of a top-down linewidth measurement system. The first step is to convert the profile into a "weighted" profile as follows: at any given x position (i.e., along the horizontal axis), determine the "weight" of the photoresist above it. The weight is defined as the total thickness of resist along a vertical line at x. Figure 2.12 shows a typical example. The weight at this x position would be the sum of the lengths of the line segments that are within the resist profile. As can be seen, the original profile is complicated and multivalued, whereas the weighted profile is smooth and single-valued. A trapezoid can now be fit accurately to the weighted profile. The simplest type of fit will be called the standard linewidth determination method: ignoring the top and bottom 10% of the weighted resist thickness, a straight line is fit through the remaining 80% of the sidewall. The intersection of this line with the substrate gives the linewidth, and the slope
FIGURE 2.11 Typical photoresist profile and its corresponding trapezoid (width w, thickness D, sidewall angle θ).
FIGURE 2.12 Determining the weighted resist profile: (a) original profile; (b) weighted profile.
of this line determines the sidewall angle. Thus, the standard method gives the best-fit trapezoid through the middle 80% of the weighted profile. There are cases where one part of the profile may be more significant than another. For these situations, one could select the threshold method for determining linewidth. In this method, the sidewall angle is measured using the standard method, but the width of the trapezoid is adjusted to match the width of the weighted profile at a given threshold resist thickness. For example, with a threshold of 20%, the trapezoid will cross the weighted profile at a thickness of 20% up from the bottom. Thus, the threshold method can be used to emphasize the importance of one part of the profile. The two linewidth determination methods deviate from one another when the shape of the resist profile begins to deviate from the general trapezoidal shape. Figure 2.13 shows two resist profiles at the extremes of focus. Using a 10% threshold, the linewidths of these two profiles are the same. Using a 50% threshold, however, shows profile (a) to be 20% wider than profile (b). The standard linewidth method, on the other hand, shows profile (a) to be 10% wider than profile (b). Finally, a 1% threshold gives the opposite result, with profile (a) 10% smaller than profile (b). The effect of changing profile shape on the measured linewidth is further illustrated in Figure 2.14, which shows CD vs. focus for the standard and 5% threshold CD measurement methods. It is important to note that sensitivity of the measured linewidth to profile shape is not particular to lithography simulation, but is present in any CD measurement system. Fundamentally, this is the result of using the trapezoid model for resist profiles. Obviously, it is difficult to compare resist profiles when the shapes of the profiles are changing. It is very important to use the linewidth method (and proper threshold value, if necessary) that is physically the most significant for the problems being studied. If the bottom of the resist profile is most important, the threshold method with a small (e.g., 5%) threshold is recommended. It is also possible to “calibrate” the simulator to a linewidth
FIGURE 2.13 Resist profiles at the extremes of focus: (a) focus below the resist; (b) focus above the resist.
FIGURE 2.14 Effect of resist profile shape on linewidth measurement (CD vs. focal position) in a lithography simulator. CD measurement methods are standard (dashed line) and 5% threshold (solid line).
measurement system. By adjusting the threshold value used by the simulator, results comparable to actual measurements can be obtained.
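The weighted-profile, threshold, and standard methods described above can be sketched directly. In the example below the resist shape is assumed to be available as a boolean grid resist_mask[x, z]; the function names and details are illustrative only, although the 10%/90% fitting window follows the description in the text.

import numpy as np

def weighted_profile(resist_mask, dz_nm):
    """Weighted profile: total resist thickness (nm) above each x position,
    computed from a boolean grid resist_mask[x_index, z_index]."""
    return resist_mask.sum(axis=1) * dz_nm

def cd_threshold(x_nm, weight_nm, thickness_nm, threshold=0.05):
    """Threshold method: width of the weighted profile at a fractional height."""
    idx = np.where(weight_nm >= threshold * thickness_nm)[0]
    return 0.0 if idx.size == 0 else x_nm[idx[-1]] - x_nm[idx[0]]

def cd_standard(x_nm, weight_nm, thickness_nm):
    """Standard method: fit each sidewall of the weighted profile between 10% and
    90% of the resist thickness and extrapolate the fitted lines to the substrate."""
    lo, hi = 0.1 * thickness_nm, 0.9 * thickness_nm
    center = int(np.argmax(weight_nm))
    base_x = []
    for side in (slice(None, center + 1), slice(center, None)):
        xs, ws = x_nm[side], weight_nm[side]
        sel = (ws >= lo) & (ws <= hi)
        if sel.sum() < 2:
            return None                      # not enough sidewall points to fit
        slope, intercept = np.polyfit(ws[sel], xs[sel], 1)   # x as a function of weight
        base_x.append(intercept)             # extrapolated x at zero thickness
    return abs(base_x[1] - base_x[0])

The sidewall angle can be recovered from the fitted slope if desired; as discussed above, the choice of method (and threshold value) should match the part of the profile that matters most for the problem being studied.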
2.9 Lumped-Parameter Model

Typically, lithography models make every attempt to describe physical phenomena as accurately as possible. However, in some circumstances, speed is more important than accuracy. If a model is reasonably close to correct and fast, many interesting applications are possible. With this trade-off in mind, the lumped-parameter model was developed [48–50].

2.9.1 Development-Rate Model

The mathematical description of the resist process incorporated in the lumped-parameter model uses a simple photographic model relating development time to exposure, whereas the aerial image simulation is derived from the standard optical parameters of the lithographic tool. A very simple development-rate model is used based on the assumption of a constant contrast. Before proceeding, however, a few terms needed for the derivations that follow will be defined. Let E be the nominal exposure energy (i.e., the intensity in a large clear area times the exposure time), let I(x) be the normalized image intensity, and let I(z) be the relative intensity variation with depth into the resist. It is clear that the exposure energy as a function of position within the resist (E_xz) is simply E I(x)I(z), where x = 0 is the center of the mask feature and z = 0 is the top of a resist of thickness D. Defining logarithmic versions of these quantities,

ε = ln[E],  i(x) = ln[I(x)],  i(z) = ln[I(z)],  (2.107)
and the logarithm of the energy deposited in the resist is

ln[E_xz] = ε + i(x) + i(z).  (2.108)
The photoresist contrast (γ) is defined theoretically as [51]

γ ≡ d ln r / d ln E_xz,  (2.109)
where r is the resulting development rate from an exposure of E_xz. Note that the base-e definition of contrast is used here. If the contrast is assumed constant over the range of energies of interest, Equation 2.109 can be integrated to give a very simple expression for development rate. To evaluate the constant of integration, a convenient point of evaluation will be chosen. Let ε_0 be the energy required to just clear the photoresist in the allotted development time, t_dev, and let r_0 be the development rate which results from an exposure of this amount. Carrying out the integration gives

r(x,z) = r_0 e^{γ(ε + i(x) + i(z) - ε_0)} = r_0 (E_xz/E_0)^γ.  (2.110)
As an example of the use of the above development rate expression, and to further illustrate the relationship between r_0 and the dose to clear, consider the standard dose-to-clear experiment where a large clear area is exposed and the thickness of photoresist remaining is measured. The definition of development rate,

r = dz/dt,  (2.111)

can be integrated over the development time. If ε = ε_0, the thickness remaining is by definition zero, so that

∫_0^D dz/r = t_dev = (1/r_0) ∫_0^D e^{-γ i(z)} dz,  (2.112)

where i(x) is zero for an open-frame exposure. Based on this equation, one can now define an effective resist thickness, D_eff, which will be very useful in the derivation of the lumped-parameter model that follows:
D_eff = r_0 t_dev e^{γ i(D)} = e^{γ i(D)} ∫_0^D e^{-γ i(z)} dz = ∫_0^D [I(z)/I(D)]^{-γ} dz.  (2.113)
As an example, the effective resist thickness can be calculated for the case of absorption only, causing a variation in intensity with depth in the resist. For such a case, I(z) will decay exponentially and Equation 2.113 can be evaluated to give

D_eff = (1/αγ)(e^{αγD} - 1).  (2.114)
If the resist is only slightly absorbing so that αγD << 1, the exponential can be approximated by the first few terms in its Taylor series expansion:

D_eff ≈ D(1 + αγD/2).  (2.115)
Thus, the effect of absorption is to make the resist seem thicker to the development process. The effective resist thickness can be thought of as the amount of resist of constant development rate that requires the same development time to clear as the actual resist with a varying development rate.

2.9.2 Segmented Development

Equation 2.110 is an extremely simple-minded model relating development rate to exposure energy based on the assumption of a constant resist contrast. To use this expression, a phenomenological explanation will be developed for the development process. This explanation will be based on the assumption that development occurs in two steps: a vertical development to a depth z, followed by a lateral development to position x (measured from the center of the mask feature) [52], as shown in Figure 2.15. A development ray, which traces out the path of development, starts at the point (x_0, 0) and proceeds vertically until a depth z is reached such that the resist to the side of the ray has been exposed more than the resist below the ray. At this point, the development will begin horizontally. The time needed to develop in both vertical and horizontal directions, t_z and t_x, respectively, can be computed from Equation 2.110. The development time per unit thickness of resist is just the reciprocal of the development rate:

1/r(x,z) = t(x,z) = t_0 e^{-γ(ε + i(x) + i(z))},
(2.116)
where

t_0 = (1/r_0) e^{γ ε_0}.  (2.117)
The time needed to develop to a depth z is given by

t_z = t_0 e^{-γε} e^{-γ i(x_0)} ∫_0^z e^{-γ i(z')} dz'.  (2.118)
Similarly, the horizontal development time is
FIGURE 2.15 Illustration of segmented development: development proceeds first vertically, then horizontally, to the final resist sidewall.
t_x = t_0 e^{-γε} e^{-γ i(z)} ∫_{x_0}^x e^{-γ i(x')} dx'.  (2.119)
The sum of these two times must equal the total development time:

t_dev = t_0 e^{-γε} [ e^{-γ i(x_0)} ∫_0^z e^{-γ i(z')} dz' + e^{-γ i(z)} ∫_{x_0}^x e^{-γ i(x')} dx' ].  (2.120)
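Equation 2.118 through Equation 2.120 can be evaluated numerically to trace out a resist profile: for each depth z, the vertical development time is computed first and the remaining time budget is spent developing laterally. The sketch below assumes the log-image arrays i(x) and i(z) are given on regular grids; all names and numerical details are illustrative, not taken from any simulator.

import numpy as np

def segmented_profile(x_nm, i_x, z_nm, i_z, gamma, t0_s, eps, t_dev_s, x0_nm=0.0):
    """Resist edge position x(z) from the segmented-development picture: the
    vertical time of Equation 2.118 is spent first, and the remaining time
    develops laterally (Equation 2.119) until Equation 2.120 is satisfied.
    i_x = ln I(x), i_z = ln I(z), eps = ln E (nominal log exposure)."""
    x_nm, i_x = np.asarray(x_nm, float), np.asarray(i_x, float)
    z_nm, i_z = np.asarray(z_nm, float), np.asarray(i_z, float)
    pre = t0_s * np.exp(-gamma * eps)
    ix0 = np.interp(x0_nm, x_nm, i_x)
    lateral = x_nm >= x0_nm
    xs = x_nm[lateral]
    profile = []
    for zi, izi in zip(z_nm, i_z):
        below = z_nm <= zi
        t_z = pre * np.exp(-gamma * ix0) * np.trapz(np.exp(-gamma * i_z[below]), z_nm[below])
        budget = t_dev_s - t_z                 # time left for lateral development
        integrand = pre * np.exp(-gamma * izi) * np.exp(-gamma * i_x[lateral])
        cum = np.concatenate(([0.0],
              np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(xs))))
        x_edge = np.interp(budget, cum, xs) if budget > 0.0 else x0_nm
        profile.append((x_edge, zi))
    return profile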
2.9.3 Derivation of the Lumped-Parameter Model

The above equation can be used to derive some interesting properties of the resist profile. For example, how would a small change in exposure energy, Δε, affect the position of the resist profile x? A change in overall exposure energy will not change the point at which the development ray changes direction. Thus, the depth z is constant. Differentiating Equation 2.120 with respect to log [exposure energy], the following equation can be derived:

(dx/dε)|_z = γ t_dev / t(x,z) = γ t_dev r(x,z).  (2.121)

Because the x position of the development ray endpoint is just one half of the linewidth, Equation 2.121 defines a change in critical dimension (CD) with exposure energy. To put this expression in a more useful form, take the log of both sides and use the development rate expression (Equation 2.110) to give
ln(dx/dε) = ln(γ t_dev r_0) + γ(ε + i(x) + i(z) - ε_0).  (2.122)

Rearranging,

ε = ε_0 - i(x) - i(z) + (1/γ) ln(dx/dε) - (1/γ) ln(γ t_dev r_0),  (2.123)
where ε is the (log) energy needed to expose a feature of width 2x. Equation 2.123 is the differential form of the lumped-parameter model and relates the CD vs. log [exposure] curve and its slope to the image intensity. A more useful form of this equation is given below; however, some valuable insight can be gained by examining Equation 2.123. In the limit of very large γ, one can see that the CD vs. exposure curve becomes equal to the aerial image. Thus, exposure latitude becomes image limited. For small γ, the other terms become significant and the exposure latitude is process limited. Obviously, an image-limited exposure latitude represents the best possible case. A second form of the lumped-parameter model can also be obtained in the following manner. Applying the definition of development rate to Equation 2.121 or, alternatively, solving for the slope in Equation 2.123 yields

dε/dx = [1/(γ t_dev r_0)] e^{-γ(ε + i(x) + i(z) - ε_0)}.
(2.124)
Before proceeding, a slight change in notation will be introduced that will make the role of the variable ε more clear. As originally defined, ε is just the nominal exposure energy.
In Equation 2.122 through Equation 2.124, it takes the added meaning as the nominal energy which gives a linewidth of 2x. To emphasize this meaning, ε will be replaced by ε(x), where the interpretation is not a variation of energy with x, but rather a variation of x (linewidth) with energy. Using this notation, the energy to just clear the resist can be related to the energy that gives zero linewidth:

ε_0 = ε(0) + i(x = 0).  (2.125)

Using this relation in Equation 2.124,

dε/dx = [1/(γ t_dev r_0)] e^{-γ i(z)} e^{γ(ε(0) - ε(x))} e^{γ(i(0) - i(x))}.
(2.126)
Invoking the definitions of the logarithmic quantities,

dE/dx = [E(x)/(γ D_eff)] [E(0) I(0)/(E(x) I(x))]^γ,  (2.127)

where Equation 2.113 has been used and the linewidth is assumed to be measured at the resist bottom (i.e., z = D). Equation 2.127 can now be integrated:

∫_{E(0)}^{E(x)} E^{γ-1} dE = [1/(γ D_eff)] [E(0) I(0)]^γ ∫_0^x I(x')^{-γ} dx',  (2.128)

giving

E(x)/E(0) = { 1 + [1/(γ D_eff)] ∫_0^x [I(x')/I(0)]^{-γ} dx' }^{1/γ}.  (2.129)
Equation 2.129 is the integral form of the lumped-parameter model. Using this equation, one can generate a normalized CD vs. exposure curve by knowing the image intensity, I(x), the effective resist thickness, D_eff, and the contrast, γ.

2.9.4 Sidewall Angle

The lumped-parameter model allows the prediction of linewidth by developing down to a depth z and laterally to a position x, which is one-half of the final linewidth. Typically, the bottom linewidth is desired so that the depth chosen is the full resist thickness. By picking different values for z, different x positions will result, giving a complete resist profile. One important result that can be calculated is the resist sidewall slope and the resulting sidewall angle. To derive an expression for the sidewall slope, Equation 2.120 will be rewritten in terms of the development rate:
t_dev = ∫_0^z dz'/r(0,z') + ∫_{x_0}^x dx'/r(x',z).  (2.130)
Taking the derivative of this expression with respect to z,

0 = ∫_0^z (∂t/∂z) dz' + 1/r(0,z) + ∫_{x_0}^x (∂t/∂z) dx' + [1/r(x,z)] (dx/dz).  (2.131)
The derivative of the reciprocal development rate can be calculated from Equation 2.110 or Equation 2.116:

∂t/∂z = -γ t(x,z) ∂ln[E_xz]/∂z.
(2.132)
As one would expect, the variation of development rate with depth into the resist depends on the variation of the exposure dose with depth. Consider a simple example where bulk absorption is the only variation of exposure with z. For an absorption coefficient of α, the result is

d ln[E_xz]/dz = -α.
(2.133)
Using Equation 2.132 and Equation 2.133 in Equation 2.131,

-αγ ( ∫_0^z t dz' + ∫_{x_0}^x t dx' ) = 1/r(0,z) + [1/r(x,z)] (dx/dz).  (2.134)
Recognizing the term in parentheses as simply the development time, the reciprocal of the resist slope can be given as

-dx/dz = r(x,z)/r(0,z) + αγ t_dev r(x,z) = r(x,z)/r(0,z) + α (dx/dε).
(2.135)
Equation 2.135 shows two distinct contributors to sidewall angle. The first is the development effect. Because the top of the photoresist is exposed to developer longer than the bottom, the top linewidth is smaller, resulting in a sloped sidewall. This effect is captured in Equation 2.135 as the ratio of the development rate at the edge of the photoresist feature to the development rate at the center. Good sidewall slope is obtained by making this ratio small. The second term in Equation 2.135 describes the effect of optical absorption on the resist slope. High absorption or poor exposure latitude will result in a reduction of the resist sidewall angle.

2.9.5 Results

The lumped-parameter model is based on a simple model for development rate and a phenomenological description of the development process. The result is an equation that predicts the change in linewidth with exposure for a given aerial image. The major
advantage of the lumped parameter model is its extreme ease of application to a lithography process. The two parameters of the model—resist contrast and effective thickness—can be determined by the collection of linewidth data from a standard focus-exposure matrix. This data is routinely available in most production and development lithography processes; no extra or unusual data collection is required. The result is a simple and fast model that can be used as an initial predictor of results or as the engine of a lithographic control scheme. Additionally, the lumped-parameter model can be used to predict the sidewall angle of the resulting photoresist profile. The model shows the two main contributors to resist slope: development effects due to the time required for the developer to reach the bottom of the photoresist, and absorption effects resulting in a reduced exposure at the bottom of the resist. Finally, the lumped-parameter model presents a simple understanding of the optical lithography process. The potential of the model as a learning tool should not be underestimated. In particular, the model emphasizes the competing roles of the aerial image and the photoresist process in determining linewidth control. This fundamental knowledge lays the foundation for further investigations into the behavior of optical lithography systems.
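As a concrete illustration of the integral form, Equation 2.129 can be evaluated by simple numerical quadrature to produce a CD vs. exposure curve from an aerial image, a contrast, and an effective resist thickness. The image used below is an arbitrary smooth edge chosen only for illustration; the function and variable names are assumptions of this sketch.

import numpy as np

def relative_dose_for_edge(x_nm, I_x, gamma, D_eff_nm):
    """E(x)/E(0) from Equation 2.129: the relative dose that places the developed
    edge at x (half the final linewidth), for a normalized image I(x)."""
    integrand = (np.asarray(I_x, float) / I_x[0]) ** (-gamma)
    cum = np.concatenate(([0.0],
          np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x_nm))))
    return (1.0 + cum / (gamma * D_eff_nm)) ** (1.0 / gamma)

# Illustrative image of a space feature (bright at the feature center)
x = np.linspace(0.0, 400.0, 401)                         # nm from feature center
I = 0.05 + 0.95 / (1.0 + np.exp((x - 150.0) / 30.0))     # arbitrary smooth edge
rel_dose = relative_dose_for_edge(x, I, gamma=5.0, D_eff_nm=1100.0)
cd_vs_dose = lambda dose: 2.0 * np.interp(dose, rel_dose, x)   # linewidth = 2x

Because rel_dose increases monotonically with x, inverting it by interpolation gives the CD vs. exposure curve directly, and its local slope gives the exposure latitude discussed above.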
2.10 Uses of Lithography Modeling

In the twenty years since optical lithography modeling was first introduced to the semiconductor industry, it has gone from a research curiosity to an indispensable tool for research, development, and manufacturing. There are numerous examples of how modeling has had a dramatic impact on the evolution of lithography technology, and many more ways in which it has subtly, but undeniably, influenced the daily routines of lithography professionals. There are four major uses for lithography simulation: (1) as a research tool, performing experiments that would be difficult or impossible to do any other way, (2) as a development tool, quickly evaluating options, optimizing processes, or saving time and money by reducing the number of experiments that have to be performed, (3) as a manufacturing tool, for troubleshooting process problems and determining optimum process settings, and (4) as a learning tool, to help provide a fundamental understanding of all aspects of the lithography process. These four applications of lithography simulation are not distinct—there is much overlap among these basic categories.

2.10.1 Research Tool

Since the initial introduction of lithography simulation in 1974, modeling has had a major impact on research efforts in lithography. Here are some examples of how modeling has been used in research. Modeling was used to suggest the use of dyed photoresist in the reduction of standing waves [53]. Experimental investigation into dyed resists did not begin until 10 years later [54,55]. After phase-shifting masks were first introduced [56], modeling has proven to be indispensable in their study. Levenson used modeling extensively to understand the effects of phase masks [57]. One of the earliest studies of phase-shifting masks used modeling to calculate images for Levenson's original alternating phase mask, then showed how phase
masks increased defect printability [58]. The same study used modeling to introduce the concept of the outrigger (or assist slot) phase mask. Since these early studies, modeling results have been presented in nearly every paper published on phase-shifting masks. Off-axis illumination was first introduced as a technique for improving resolution and depth of focus based on modeling studies [59]. Since then, this technique has received widespread attention and has been the focus of many more simulation and experimental efforts. Using modeling, the advantages of having a variable numerical-aperture, variable partial coherence stepper were discussed [59,60]. Since then, all major stepper vendors have offered variable-NA, variable coherence systems. Modeling remains a critical tool for optimizing the settings of these flexible new machines. The use of pupil filters to enhance some aspects of lithographic performance has, to date, only been studied theoretically using lithographic models [61]. If such studies prove the usefulness of pupil filters, experimental investigations may also be conducted. Modeling has been used in photoresist studies to understand the depth of focus loss when printing contacts in negative resists [62], the reason for artificially high values of resist contrast when surface inhibition is present [51], the potential for exposure optimization to maximize process latitude [63,64], and the role of diffusion in chemically amplified resists [65]. Lithographic models are now standard tools for photoresist design and evaluation. Modeling has always been used as a tool for quantifying optical proximity effects and for defining algorithms for geometry-dependent mask biasing [66,67]. Most people would consider modeling to be a required element of any optical proximity correction scheme. Defect printability has always been a difficult problem to understand. The printability of a defect depends considerably on the imaging system and resist used, as well as the position of the defect relative to other patterns on the mask and the size and transmission properties of the defect. Modeling has proven itself a valuable and accurate tool for predicting the printability of defects [68,69]. Modeling has also been used to understand metrology of lithographic structures [70–73] and continues to find new application in virtually every aspect of lithographic research. One of the primary reasons that lithography modeling has become such a standard tool for research activities is the ability to simulate such a wide range of lithographic conditions. Whereas laboratory experiments are limited to the equipment and materials on hand (a particular wavelength and numerical aperture of the stepper, a given photoresist), simulation gives an almost infinite array of possible conditions. From high numerical apertures to low wavelengths, hypothetical resists to arbitrary mask structures, simulation offers the ability to run "experiments" on steppers that you do not own with photoresists that have yet to be made. How else can one explore the shadowy boundary between the possible and the impossible?

2.10.2 Process Development Tool

Lithography modeling has also proven to be an invaluable tool for the development of new lithographic processes and equipment.
Some of the more common uses include the optimization of dye loadings in photoresists [74,75], simulation of substrate reflectivity [76,77], the applicability and optimization of top and bottom antireflection coatings [78,79], and simulation of the effect of bandwidth on swing-curve amplitude [80,81]. In addition, simulation has been used to help understand the use of thick resists for thin-film head manufacture [82] as well as other nonsemiconductor applications.
Modeling is used extensively by makers of photoresists to evaluate new formulations [83,84] and to determine adequate measures of photoresist performance for quality control purposes [85]. Resist users often employ modeling as an aid for new resist evaluations. On the exposure tool side, modeling has become an indispensable part of the optimization of the numerical aperture and partial coherence of a stepper [86–88] and in the understanding of the print bias between dense and isolated lines [89]. The use of optical proximity correction software requires rules on how to perform the corrections, which are often generated with the help of lithography simulation [90]. As a development tool, lithography simulation excels due to its speed and cost-effectiveness. Process development usually involves running numerous experiments to determine optimum process conditions, shake out possible problems, determine sensitivity to variables, and write specification limits on the inputs and outputs of the process. These activities tend to be both time consuming and costly. Modeling offers a way to supplement laboratory experiments with simulation experiments to speed up this process and reduce costs. Considering that a single experimental run in a wafer fabrication facility can take from hours to days, the speed advantage of simulation is considerable. This allows a greater number of simulations than would be practical (or even possible) in the fab.

2.10.3 Manufacturing Tool

Less published material exists on the use of lithography simulation in manufacturing environments [91–93] because of the limited publications by people in manufacturing rather than the limited use of lithography modeling. The use of simulation in a manufacturing environment has three primary goals: to reduce the number of test or experimental wafers that must be run through the production line, to troubleshoot problems in the fab, and to aid in decision making by providing facts to support engineering judgment and intuition. Running test wafers through a manufacturing line is costly, not so much due to the cost of the test, but due to the opportunity cost of not running product [94]. If simulation can reduce the time a manufacturing line is not running product even slightly, the return on investment can be significant. Simulation can also aid in the time required to bring a new process on-line.

2.10.4 Learning Tool

Although the research, development, and manufacturing applications of lithography simulation presented above give ample benefits of modeling based on time, cost, and capability, the underlying power of simulation is its ability to act as a learning tool. Proper application of modeling allows the user to learn efficiently and effectively. There are many reasons why this is true. First, the speed of simulation vs. experimentation makes feedback much more timely. Because learning is a cycle (an idea, an experiment, a measurement, then comparison back to the original idea), faster feedback allows for more cycles of learning. Because simulation is very inexpensive, there are fewer inhibitions and more opportunities to explore ideas. Furthermore, as the research applications have demonstrated, there are fewer physical constraints on what "experiments" can be performed. All of these factors allow the use of modeling to gain an understanding of lithography. Whether learning fundamental concepts or exploring subtle nuances, the value of improved knowledge cannot be overstated.
References

1. F.H. Dill. 1975. "Optical lithography," IEEE Transactions on Electron Devices, 22:7, 440–444.
2. F.H. Dill, W.P. Hornberger, P.S. Hauge, and J.M. Shaw. 1975. "Characterization of positive photoresist," IEEE Transactions on Electron Devices, 22:7, 445–452.
3. K.L. Konnerth and F.H. Dill. 1975. "In-situ measurement of dielectric thickness during etching or developing processes," IEEE Transactions on Electron Devices, 22:7, 452–456.
4. F.H. Dill, A.R. Neureuther, J.A. Tuttle, and E.J. Walker. 1975. "Modeling projection printing of positive photoresists," IEEE Transactions on Electron Devices, 22:7, 456–464.
5. W.G. Oldham, S.N. Nandgaonkar, A.R. Neureuther, and M. O'Toole. 1979. "A general simulator for VLSI lithography and etching processes: part I—application to projection lithography," IEEE Transactions on Electron Devices, 26:4, 717–722.
6. C.A. Mack. 1985. "PROLITH: a comprehensive optical lithography model," Proceedings of SPIE, 538: 207–220.
7. J.W. Goodman. 1968. Introduction to Fourier Optics, New York: McGraw-Hill.
8. C.A. Mack. 1988. "Understanding focus effects in submicron optical lithography," Optical Engineering, 27:12, 1093–1100.
9. D.A. Bernard. 1988. "Simulation of focus effects in photolithography," IEEE Transactions on Semiconductor Manufacturing, 1:3, 85–97.
10. C.A. Mack. 1986. "Analytical expression for the standing wave intensity in photoresist," Applied Optics, 25:12, 1958–1961.
11. D.A. Bernard and H.P. Urbach. 1991. "Thin-film interference effects in photolithography for finite numerical apertures," Journal of the Optical Society of America A, 8:1, 123–133.
12. M. Born and E. Wolf. 1980. Principles of Optics, 6th Ed., Oxford: Pergamon Press.
13. D.C. Cole, E. Barouch, U. Hollerbach, and S.A. Orszag. 1992. "Extending scalar aerial image calculations to higher numerical apertures," Journal of Vacuum Science and Technology B, 10:6, 3037–3041.
14. D.G. Flagello, A.E. Rosenbluth, C. Progler, and J. Armitage. 1992. "Understanding high numerical aperture optical lithography," Microelectronic Engineering, 17: 105–108.
15. C.A. Mack and C-B. Juang. 1995. "Comparison of scalar and vector modeling of image formation in photoresist," Proceedings of SPIE, 2440: 381–394.
16. S. Middlehoek. 1970. "Projection masking, thin photoresist layers and interference effects," IBM Journal of Research and Development, 14: 117–124.
17. J.E. Korka. 1970. "Standing waves in photoresists," Applied Optics, 9:4, 969–970.
18. D.F. Ilten and K.V. Patel. 1971. "Standing wave effects in photoresist exposure," Image Technology, February/March: 9–14.
19. D.W. Widmann. 1975. "Quantitative evaluation of photoresist patterns in the 1 µm range," Applied Optics, 14:4, 931–934.
20. P.H. Berning. 1963. "Theory and calculations of optical thin films," in Physics of Thin Films, G. Hass, Ed., New York: Academic Press, pp. 69–121.
21. D.A. Skoog and D.M. West. 1976. Fundamentals of Analytical Chemistry, 3rd Ed., New York: Holt, Rinehart, and Winston.
22. J.M. Koyler. 1979. "Thermal properties of positive photoresist and their relationship to VLSI processing," in Kodak Microelectronics Seminar Interface '79, pp. 150–165.
23. J.M. Shaw, M.A. Frisch, and F.H. Dill. 1977. "Thermal analysis of positive photoresist films by mass spectrometry," IBM Journal of Research and Development, 21:3, 219–226.
24. C.A. Mack. 1988. "Absorption and exposure in positive photoresist," Applied Optics, 27:23, 4913–4919.
25. J. Albers and D.B. Novotny. 1980. "Intensity dependence of photochemical reaction rates for photoresists," Journal of the Electrochemical Society, 127:6, 1400–1403.
26. C.E. Herrick, Jr. 1966. "Solution of the partial differential equations describing photo-decomposition in a light-absorbing matrix having light-absorbing photoproducts," IBM Journal of Research and Development, 10: 2–5.
27. J.J. Diamond and J.R. Sheats. 1986. "Simple algebraic description of photoresist exposure and contrast enhancement," IEEE Electron Device Letters, 7:6, 383–386.
28. S.V. Babu and E. Barouch. 1986. "Exact solution of Dill's model equations for positive photoresist kinetics," IEEE Electron Device Letters, 7:4, 252–253.
29. H. Ito and C.G. Willson. 1984. "Applications of photoinitiators to the design of resists for semiconductor manufacturing," in Polymers in Electronics, ACS Symposium Series 242, Washington, DC: American Chemical Society, pp. 11–23.
30. D. Seligson, S. Das, H. Gaw, and P. Pianetta. 1988. "Process control with chemical amplification resists using deep ultraviolet and x-ray radiation," Journal of Vacuum Science and Technology B, 6:6, 2303–2307.
31. C.A. Mack, D.P. DeWitt, B.K. Tsai, and G. Yetter. 1994. "Modeling of solvent evaporation effects for hot plate baking of photoresist," Proceedings of SPIE, 2195: 584–595.
32. H. Fujita, A. Kishimoto, and K. Matsumoto. 1960. "Concentration and temperature dependence of diffusion coefficients for systems polymethyl acrylate and n-alkyl acetates," Transactions of the Faraday Society, 56: 424–437.
33. D.E. Bornside, C.W. Macosko, and L.E. Scriven. 1991. "Spin coating of a PMMA/chlorobenzene solution," Journal of the Electrochemical Society, 138:1, 317–320.
34. S.A. MacDonald, N.J. Clecak, H.R. Wendt, C.G. Willson, C.D. Snyder, C.J. Knors, and N.B. Deyoe. 1991. "Airborne chemical contamination of a chemically amplified resist," Proceedings of SPIE, 1466: 2–12.
35. K.R. Dean and R.A. Carpio. 1994. "Contamination of positive deep-UV photoresists," in Proceedings of the OCG Microlithography Seminar Interface '94, pp. 199–212.
36. T. Ohfuji, A.G. Timko, O. Nalamasu, and D.R. Stone. 1993. "Dissolution rate modeling of a chemically amplified positive resist," Proceedings of SPIE, 1925: 213–226.
37. F.P. Incropera and D.P. DeWitt. 1990. Fundamentals of Heat and Mass Transfer, 3rd Ed., New York: Wiley.
38. F.H. Dill and J.M. Shaw. 1977. "Thermal effects on the photoresist AZ1350J," IBM Journal of Research and Development, 21:3, 210–218.
39. D.W. Johnson. 1984. "Thermolysis of positive photoresists," Proceedings of SPIE, 469: 72–79.
40. C.A. Mack and R.T. Carback. 1985. "Modeling the effects of prebake on positive resist processing," in Proceedings of the Kodak Microelectronics Seminar, pp. 155–158.
41. E.J. Walker. 1975. "Reduction of photoresist standing-wave effects by post-exposure bake," IEEE Transactions on Electron Devices, 22:7, 464–466.
42. M.A. Narasimham and J.B. Lounsbury. 1977. "Dissolution characterization of some positive photoresist systems," Proceedings of SPIE, 100: 57–64.
43. C.A. Mack. 1987. "Development of positive photoresist," Journal of the Electrochemical Society, 134:1, 148–152.
44. C.A. Mack. 1992. "New kinetic model for resist dissolution," Journal of the Electrochemical Society, 139:4, L35–L37.
45. P. Trefonas and B.K. Daniels. 1987. "New principle for image enhancement in single layer positive photoresists," Proceedings of SPIE, 771: 194–210.
46. T.R. Pampalone. 1984. "Novolac resins used in positive resist systems," Solid State Technology, 27:6, 115–120.
47. D.J. Kim, W.G. Oldham, and A.R. Neureuther. 1984. "Development of positive photoresist," IEEE Transactions on Electron Devices, 31:12, 1730–1735.
48. R. Hershel and C.A. Mack. 1987. "Lumped parameter model for optical lithography," in Lithography for VLSI, VLSI Electronics—Microstructure Science, R.K. Watts and N.G. Einspruch, Eds., New York: Academic Press, pp. 19–55.
49. C.A. Mack, A. Stephanakis, and R. Hershel. 1986. "Lumped parameter model of the photolithographic process," in Proceedings of the Kodak Microelectronics Seminar, pp. 228–238.
50. C.A. Mack. 1994. "Enhanced lumped parameter model for photolithography," Proceedings of SPIE, 2197: 501–510.
51. C.A. Mack. 1991. "Lithographic optimization using photoresist contrast," Microelectronics Manufacturing Technology, 14:1, 36–42.
52. M.P.C. Watts and M.R. Hannifan. 1985. "Optical positive resist processing II, experimental and analytical model evaluation of process control," Proceedings of SPIE, 539: 21–28.
53. A.R. Neureuther and F.H. Dill. 1974. "Photoresist modeling and device fabrication applications," in Optical and Acoustical Micro-Electronics, New York: Polytechnic Press, pp. 233–249.
54. H.L. Stover, M. Nagler, I. Bol, and V. Miller. 1984. "Submicron optical lithography: I-line lens and photoresist technology," Proceedings of SPIE, 470: 22–33.
55. I.I. Bol. 1984. "High-resolution optical lithography using dyed single-layer resist," in Kodak Microelectronics Seminar Interface '84, pp. 19–22.
56. M.D. Levenson, N.S. Viswanathan, and R.A. Simpson. 1982. "Improving resolution in photolithography with a phase-shifting mask," IEEE Transactions on Electron Devices, 29:12, 1828–1836.
57. M.D. Levenson, D.S. Goodman, S. Lindsey, P.W. Bayer, and H.A.E. Santini. 1984. "The phase-shifting mask II: imaging simulations and submicrometer resist exposures," IEEE Transactions on Electron Devices, 31:6, 753–763.
58. M.D. Prouty and A.R. Neureuther. 1984. "Optical imaging with phase shift masks," Proceedings of SPIE, 470: 228–232.
59. C.A. Mack. 1989. "Optimum stepper performance through image manipulation," in Proceedings of the KTI Microelectronics Seminar, pp. 209–215.
60. C.A. Mack. 1990. "Algorithm for optimizing stepper performance through image manipulation," Proceedings of SPIE, 1264: 71–82.
61. H. Fukuda, T. Terasawa, and S. Okazaki. 1991. "Spatial filtering for depth-of-focus and resolution enhancement in optical lithography," Journal of Vacuum Science and Technology B, 9:6, 3113–3116.
62. C.A. Mack and J.E. Connors. 1992. "Fundamental differences between positive and negative tone imaging," Microlithography World, 1:3, 17–22.
63. C.A. Mack. 1987. "Photoresist process optimization," in Proceedings of the KTI Microelectronics Seminar, pp. 153–167.
64. P. Trefonas and C.A. Mack. 1991. "Exposure dose optimization for a positive resist containing poly-functional photoactive compound," Proceedings of SPIE, 1466: 117–131.
65. J.S. Petersen, C.A. Mack, J. Sturtevant, J.D. Byers, and D.A. Miller. 1995. "Non-constant diffusion coefficients: short description of modeling and comparison to experimental results," Proceedings of SPIE, 2438: 167–180.
66. C.A. Mack and P.M. Kaufman. 1988. "Mask bias in submicron optical lithography," Journal of Vacuum Science and Technology B, 6:6, 2213–2220.
67. N. Shamma, F. Sporon-Fielder, and E. Lin. 1991. "A method for correction of proximity effect in optical projection lithography," in Proceedings of the KTI Microelectronics Seminar, pp. 145–156.
68. A.R. Neureuther, P. Flanner, III, and S. Shen. 1987. "Coherence of defect interactions with features in optical imaging," Journal of Vacuum Science and Technology B, 5:1, 308–312.
69. J. Wiley. 1989. "Effect of stepper resolution on the printability of submicron 5x reticle defects," Proceedings of SPIE, 1088: 58–73.
70. L.M. Milner, K.C. Hickman, S.M. Gasper, K.P. Bishop, S.S.H. Naqvi, J.R. McNeil, M. Blain, and B.L. Draper. 1992. "Latent image exposure monitor using scatterometry," Proceedings of SPIE, 1673: 274–283.
71. K.P. Bishop, L.M. Milner, S.S.H. Naqvi, J.R. McNeil, and B.L. Draper. 1992. "Use of scatterometry for resist process control," Proceedings of SPIE, 1673: 441–452.
72. L.M. Milner, K.P. Bishop, S.S.H. Naqvi, and J.R. McNeil. 1993. "Lithography process monitor using light diffracted from a latent image," Proceedings of SPIE, 1926: 94–105.
73. S. Zaidi, S.L. Prins, J.R. McNeil, and S.S.H. Naqvi. 1994. "Metrology sensors for advanced resists," Proceedings of SPIE, 2196: 341–351.
74. J.R. Johnson, G.J. Stagaman, J.C. Sardella, C.R. Spinner, III, F. Liou, P. Trefonas, and C. Meister. 1993. "The effects of absorptive dye loading and substrate reflectivity on a 0.5 µm I-line photoresist process," Proceedings of SPIE, 1925: 552–563.
75. W. Conley, R. Akkapeddi, J. Fahey, G. Hefferon, S. Holmes, G. Spinillo, J. Sturtevant, and K. Welsh. 1994. "Improved reflectivity control of APEX-E positive tone deep-UV photoresist," Proceedings of SPIE, 2195: 461–476.
76. N. Thane, C. Mack, and S. Sethi. 1993. "Lithographic effects of metal reflectivity variations," Proceedings of SPIE, 1926: 483–494.
77. B. Singh, S. Ramaswami, W. Lin, and N. Avadhany. 1993. "IC wafer reflectivity measurement in the UV and DUV and its application for ARC characterization," Proceedings of SPIE, 1926: 151–163.
78. S.S. Miura, C.F. Lyons, and T.A. Brunner. 1992. "Reduction of linewidth variation over reflective topography," Proceedings of SPIE, 1674: 147–156.
79. H. Yoshino, T. Ohfuji, and N. Aizaki. 1994. "Process window analysis of the ARC and TAR systems for quarter micron optical lithography," Proceedings of SPIE, 2195: 236–245.
80. G. Flores, W. Flack, and L. Dwyer. 1993. "Lithographic performance of a new generation I-line optical system: a comparative analysis," Proceedings of SPIE, 1927: 899–913.
81. B. Kuyel, M. Barrick, A. Hong, and J. Vigil. 1991. "0.5 µm deep UV lithography using a Micrascan-90 step-and-scan exposure tool," Proceedings of SPIE, 1463: 646–665.
82. G.E. Flores, W.W. Flack, and E. Tai. 1994. "An investigation of the properties of thick photoresist films," Proceedings of SPIE, 2195: 734–751.
83. H. Iwasaki, T. Itani, M. Fujimoto, and K. Kasama. 1994. "Acid size effect of chemically amplified negative resist on lithographic performance," Proceedings of SPIE, 2195: 164–172.
84. U. Schaedeli, N. Münzel, H. Holzwarth, S.G. Slater, and O. Nalamasu. 1994. "Relationship between physical properties and lithographic behavior in a high resolution positive tone deep-UV resist," Proceedings of SPIE, 2195: 98–110.
85. K. Schlicht, P. Scialdone, P. Spragg, S.G. Hansen, R.J. Hurditch, M.A. Toukhy, and D.J. Brzozowy. 1994. "Reliability of photospeed and related measures of resist performances," Proceedings of SPIE, 2195: 624–639.
86. R.A. Cirelli, E.L. Raab, R.L. Kostelak, and S. Vaidya. 1994. "Optimizing numerical aperture and partial coherence to reduce proximity effect in deep-UV lithography," Proceedings of SPIE, 2197: 429–439.
87. B. Katz, T. Rogoff, J. Foster, B. Rericha, B. Rolfson, R. Holscher, C. Sager, and P. Reynolds. 1994. "Lithographic performance at sub-300 nm design rules using high NA I-line stepper with optimized NA and σ in conjunction with advanced PSM technology," Proceedings of SPIE, 2197: 421–428.
88. P. Luehrmann and S. Wittekoek. 1994. "Practical 0.35 µm I-line lithography," Proceedings of SPIE, 2197: 412–420.
89. V.A. Deshpande, K.L. Holland, and A. Hong. 1993. "Isolated-grouped linewidth bias on SVGL Micrascan," Proceedings of SPIE, 1927: 333–352.
90. R.C. Henderson and O.W. Otto. 1994. "Correcting for proximity effect widens process latitude," Proceedings of SPIE, 2197: 361–370.
91. H. Engstrom and J. Beacham. 1994. "Online photolithography modeling using spectrophotometry and PROLITH/2," Proceedings of SPIE, 2196: 479–485.
92. J. Kasahara, M.V. Dusa, and T. Perera. 1991. "Evaluation of a photoresist process for 0.75 micron, G-line lithography," Proceedings of SPIE, 1463: 492–503.
93. E.A. Puttlitz, J.P. Collins, T.M. Glynn, and L.L. Linehan. 1995. "Characterization of profile dependency on nitride substrate thickness for a chemically amplified I-line negative resist," Proceedings of SPIE, 2438: 571–582.
94. P.M. Mahoney and C.A. Mack. 1993. "Cost analysis of lithographic characterization: an overview," Proceedings of SPIE, 1927: 827–832.
3 Optics for Photolithography

Bruce W. Smith
CONTENTS
3.1 Introduction
3.2 Image Formation: Geometrical Optics
    3.2.1 Cardinal Points
    3.2.2 Focal Length
    3.2.3 Geometrical Imaging Properties
    3.2.4 Aperture Stops and Pupils
    3.2.5 Chief and Marginal Ray Tracing
    3.2.6 Mirrors
3.3 Image Formation: Wave Optics
    3.3.1 Fresnel Diffraction: Proximity Lithography
    3.3.2 Fraunhofer Diffraction: Projection Lithography
    3.3.3 Fourier Methods in Diffraction Theory
        3.3.3.1 The Fourier Transform
        3.3.3.2 Rectangular Wave
        3.3.3.3 Harmonic Analysis
        3.3.3.4 Finite Dense Features
        3.3.3.5 The Objective Lens
        3.3.3.6 The Lens as a Linear Filter
    3.3.4 Coherence Theory in Image Formation
    3.3.5 Partial Coherence Theory: Diffraction-Limited Resolution
3.4 Image Evaluation
    3.4.1 OTF, MTF, and PTF
    3.4.2 Evaluation of Partial Coherent Imaging
    3.4.3 Other Image Evaluation Metrics
    3.4.4 Depth of Focus
3.5 Imaging Aberrations and Defocus
    3.5.1 Spherical Aberration
    3.5.2 Coma
    3.5.3 Astigmatism and Field Curvature
    3.5.4 Distortion
    3.5.5 Chromatic Aberration
    3.5.6 Wavefront Aberration Descriptions
    3.5.7 Zernike Polynomials
    3.5.8 Aberration Tolerances
    3.5.9 Microlithographic Requirements
3.6 Optical Materials and Coatings
    3.6.1 Optical Properties and Constants
    3.6.2 Optical Materials Below 300 nm
3.7 Optical Image Enhancement Techniques
    3.7.1 Off-Axis Illumination
        3.7.1.1 Analysis of OAI
        3.7.1.2 Isolated Line Performance
    3.7.2 Phase Shift Masking
    3.7.3 Mask Optimization, Biasing, and Optical Proximity Compensation
    3.7.4 Dummy Diffraction Mask
    3.7.5 Polarized Masks
3.8 Optical System Design
    3.8.1 Strategies for Reduction of Aberrations: Establishing Tolerances
        3.8.1.1 Material Characteristics
        3.8.1.2 Element Splitting
        3.8.1.3 Element Compounding
        3.8.1.4 Symmetrical Design
        3.8.1.5 Aspheric Surfaces
        3.8.1.6 Balancing Aberrations
    3.8.2 Basic Lithographic Lens Design
        3.8.2.1 The All-Reflective (Catoptric) Lens
        3.8.2.2 The All-Refractive (Dioptric) Lens
        3.8.2.3 Catadioptric-Beamsplitter Designs
3.9 Polarization and High NA
    3.9.1 Imaging with Oblique Angles
    3.9.2 Polarization and Illumination
    3.9.3 Polarization Methods
    3.9.4 Polarization and Resist Thin Film Effects
3.10 Immersion Lithography
    3.10.1 Challenges of Immersion Lithography
    3.10.2 High Index Immersion Fluids
References
3.1 Introduction

Optical lithography involves the creation of relief image patterns through the projection of radiation within or near the ultraviolet (UV) to visible portion of the electromagnetic spectrum. Techniques of optical lithography, or photolithography, have been used to create patterns for engravings, photographs, and printing plates. In the 1960s, techniques developed for the production of lithographic printing plates were utilized in the making of microcircuit patterns for semiconductor devices. These early techniques of contact or proximity photolithography were refined to allow circuit resolution on the order of 3–5 µm. Problems encountered with proximity lithography such as mask and wafer damage, alignment difficulty, and field size have limited its application for most photolithographic needs. In the mid-1970s, projection techniques minimized some of the problems encountered with proximity lithography and have led to the development of tools that currently allow resolution below 0.25 µm.
FIGURE 3.1 Schematic of optical lithography techniques (a) proximity and (b) projection lithographic systems.
Diagrammed in Figure 3.1 are generic proximity and projection techniques for photolithography. Figure 3.1a is a schematic of a proximity setup where a mask is illuminated and held in close contact with a resist-coated substrate. The illumination system consists of a source and a condenser lens assembly that provides uniform illumination to the mask. The illumination source outputs radiation in the blue ultraviolet portion of the electromagnetic spectrum. The mercury-rare gas discharge lamp is a source well suited for photolithography, and it is almost entirely relied on for production of radiation in the 350–450 nm range. Because output below 365 nm is weak from a mercury or mercury-rare gas lamp, other sources have been utilized for shorter wavelength exposure. The ultraviolet region from 150 to 300 nm is referred to as the deep UV. Although a small number of lithographic techniques operating at these wavelengths have made use of gas discharge lamps, the use of a laser source is an attractive alternative. Several laser sources have potential for delivering high-power deep ultraviolet radiation for photoresist exposure. A class of lasers that has been shown to be well suited for photolithography is the excimer lasers. Excimer lasers using argon fluoride (ArF) and krypton fluoride (KrF) gas mixtures are most prominent, producing radiation at 193 and 248 nm, respectively. Details of these systems can be found elsewhere.

Figure 3.1b shows a setup for a projection imaging system. The optical configuration for projection microlithography tools most closely resembles a microscope system. Early microlithographic objective lenses were modifications of microscope lens designs that have now evolved to allow diffraction-limited resolution over large fields at high numerical apertures. Like a proximity system, a projection tool includes an illumination system and a mask, but it utilizes an objective lens to project images toward a substrate. The illumination system focuses an image of the source into the entrance pupil of the objective lens to provide maximum uniformity at the mask plane. Both irradiance and coherence properties are influenced by the illumination system. The temporal coherence of a source is a measure of the correlation of the source wavelength to the source spectral bandwidth. As a source spectral bandwidth decreases, its temporal coherence increases. Coherence length, $l_c$, is related to source bandwidth as

$$l_c = \frac{\lambda^2}{\Delta\lambda}$$
Interference effects become sufficiently large when an optical path distance is less than the coherence length of a source. Optical imaging effects such as interference (standing wave) patterns in photoresist become considerable as source coherence length increases. The spatial coherence of a source is a measure of the phase relationships between photons or wavefronts emitted. A true point source, by definition, is spatially coherent, because all wavefronts originate from a single point. Real sources, however, are less than spatially coherent. A conventional laser that utilizes oscillation for amplification of radiation can produce nearly spatially coherent radiation. Lamp sources such as gas discharge lamps exhibit low spatial coherence, as do excimer lasers that require few oscillations within the laser cavity. Both temporal and spatial coherence properties can be controlled by an illumination system. Source bandwidth and temporal coherence are controlled through wavelength selection. Spatial coherence is controlled through manipulation of the effective source size imaged in the objective lens. In image formation, the control of spatial coherence is of primary importance because of its relationship to diffraction phenomena.

Current designs of projection lithography systems include (1) reduction or unit magnification, (2) refractive or reflective optics, and (3) array stepping or field scanning. Reduction tools allow a relaxation of mask requirements, including minimum feature size specification and defect criteria. This, in turn, reduces the contribution to the total process tolerance budget. The drawbacks for reduction levels greater than 5:1 include the need for increasingly larger masks and the associated difficulties in their processing. Both unit magnification (1:1) and reduction (M:1) systems have been utilized in lithographic imaging system design, each well suited for certain requirements. As feature size and control place high demands on 1:1 technology, reduction tools are generally utilized. In situations where feature size requirements can be met with unit magnification, such systems may prove superior as mask field sizes, defect criteria, and lens aberrations are reduced.

A refractive projection system must generally utilize a narrow spectral band of a lamp-type source. Energy outside this range would be removed prior to the condenser lens system to avoid wavelength-dependent defocus effects or chromatic aberration. Some degree of chromatic aberration correction is possible in a refractive lens system by incorporating elements of various glass types. As wavelengths below 300 nm are pursued for refractive projection lithography, the control of spectral bandwidth becomes more critical. Because few transparent optical materials exist at these wavelengths, chromatic aberration correction through glass material selection is difficult. Greater demands are therefore placed on the source, which may be required to deliver a spectral bandwidth on the order of a few picometers. Clearly, such a requirement would limit the application of lamp sources at these wavelengths, leading to laser-based sources as the only alternative for short-wavelength refractive systems. Reflective optical systems (catoptric) or combined refractive–reflective systems (catadioptric) can be used to reduce wavelength influence and reduce source requirements, especially at wavelengths below 300 nm.
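As a rough numerical illustration of these bandwidth considerations, the coherence-length relation given above can be evaluated for a filtered lamp line and a line-narrowed excimer laser. The bandwidth figures used here are illustrative assumptions rather than specifications from the text; a minimal sketch in Python:

```python
# Coherence length l_c = lambda^2 / delta_lambda for two illustrative sources.
# Bandwidth values are assumed for demonstration, not taken from the text.
def coherence_length_um(wavelength_nm, bandwidth_nm):
    """Return the coherence length in micrometers."""
    return (wavelength_nm ** 2 / bandwidth_nm) * 1e-3  # nm -> um

# Mercury i-line with a few-nanometer filter bandwidth (assumed).
print(coherence_length_um(365.0, 5.0))       # ~27 um

# Line-narrowed KrF excimer laser, sub-picometer bandwidth (assumed).
print(coherence_length_um(248.0, 0.0005))    # ~1.2e5 um (about 0.12 m)
```

In either case the coherence length exceeds the optical path through a typical photoresist film, which is why standing-wave interference in resist must be considered for lamp as well as laser sources.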
To understand the underlying principles of optical lithography, fundamentals of both geometrical and physical optics need to be addressed. Because optical lithography using projection techniques is the dominant technology for current integrated circuit (IC) fabrication, the development of the physics behind projection lithography will be concentrated on in this chapter. Contact lithography will be covered in less detail.
3.2 Image Formation: Geometrical Optics

An understanding of optics where the wave nature of light is neglected can provide a foundation for further study into a more inclusive approach. Therefore, geometrical optics
FIGURE 3.2 Single-element lens shapes. At the top are positive lenses—bi-convex, planoconvex, and meniscus convex. At the bottom are negative lenses—bi-concave, plano-concave, and meniscus concave.
will be introduced here, allowing investigation into valuable information about imaging [1]. This will lead to a more complete study of imaging through physical optics where the wave nature of light is considered and interference and diffraction can be investigated. Both refractive lenses and reflective mirrors play important roles in microlithography optical systems. The optical behavior of mirrors can be described by extending the behavior of refractive lenses. Although a practical lens will contain many optical elements, baffles, apertures, and mounting hardware, most optical properties of a lens can be understood through the extension of simple single-element lens properties. The behavior of a simple lens will be investigated to gain an understanding of optical systems in general. A perfect lens would be capable of an exact translation of an incident spherical wave through space. A positive lens would cause a spherical wave to converge faster, and a negative lens would cause a spherical wave to diverge faster. Lens surfaces are generally spherical or planar, and they may have forms, including biconvex, planoconvex, biconcave, planoconcave, negative meniscus, and positive meniscus as shown in Figure 3.2. In addition, aspheric surfaces are possible that may be used in an optical system to improve its performance. These types of elements are generally difficult and expensive to fabricate and are not yet widely used. As design and manufacturing techniques improve,
applications of aspherical elements will grow, including their use in microlithographic lens systems.

3.2.1 Cardinal Points

Knowledge of the cardinal points of a simple lens is sufficient to understand its behavior. These points, the first and second focal points (F1 and F2), the principal points (P1 and P2), and the nodal points (N1 and N2), lie on the optical axis of a lens as shown in Figure 3.3. Also shown are the principal planes, which contain the respective principal points and can be thought of as the surfaces where refraction effectively occurs. Although these surfaces are not truly planes, they are nearly so. Rays that pass through a lens act as if they refract only at the first and second principal planes and not at any individual glass surface. A ray passing through the first focal point (F1) will emerge from the lens at the right parallel to the optical axis. For this ray, refraction effectively occurs at the first principal plane. A ray traveling parallel to the optical axis will emerge from the lens and pass through the second focal point (F2). Here, refraction effectively occurs at the second principal plane. A ray passing through the optical center of the lens will emerge parallel to the incident ray and pass through the first and second nodal points (N1, N2). A lens or lens system can, therefore, be represented by its two principal planes and focal points.

3.2.2 Focal Length

The distance between a lens focal point and corresponding principal point is known as the effective focal length (EFL) as shown in Figure 3.4. The focal length can be either positive, when F1 is to the left of P1 and F2 is to the right of P2, or negative, when the opposite occurs. The reciprocal of the EFL (1/f) is known as the lens power. The front focal length (FFL) is the distance from the first focal point (F1) to the leftmost surface of the lens along the optical axis. The back focal length (BFL) is the distance from the rightmost surface to the second focal point (F2). The lens maker's formula can be used to determine the EFL of a lens if the radii of curvature of the surfaces (R1 and R2 for first and second surfaces), lens refractive index (ni), and lens thickness (t) are known. Several sign conventions are possible. Distances measured toward the left will generally be considered as positive. R1 will be considered
FIGURE 3.3 Cardinal points of a simple lens. (a) Focal points (F1 and F2) and principal points (P1 and P2). (b) Nodal points (N1 and N2).
FIGURE 3.4 Determination of focal length for a simple lens, front focal length (FFL), and back focal length (BFL).
positive if its center of curvature lies to the right of the surface, and R2 will be considered negative if its center of curvature lies to the left of the surface. Focal length is determined by

$$\frac{1}{f} = (n_i - 1)\left[\frac{1}{R_1} - \frac{1}{R_2} + \frac{(n_i - 1)t}{n_i R_1 R_2}\right]$$

3.2.3 Geometrical Imaging Properties

If the cardinal points of a lens are known, geometrical imaging properties can be determined. A simple biconvex lens is considered, such as the one shown in Figure 3.5, where an object is placed a positive distance s1 from focal point F1 at a positive object height y1. This object can be thought of as consisting of many points that will emit spherical waves to be focused by the lens at the image plane. The object distance (d1) is the distance from the principal plane to the object and is positive for objects to the left of P1.
FIGURE 3.5 Ray tracing methods for finding image location and magnification.
The image distance to the principal plane (d2), which is positive for an image to the right of P2, can be calculated from the lens law

$$\frac{1}{d_1} + \frac{1}{d_2} = \frac{1}{f}$$

For systems with a negative EFL, the lens law becomes

$$\frac{1}{d_1} + \frac{1}{d_2} = -\frac{1}{f}$$

The lateral magnification of an optical system is expressed as

$$m = \frac{y_2}{y_1} = \frac{-d_2}{d_1}$$

where y2 is the image height, which is positive upward. The location of an image can be determined by tracing any two rays that will intersect in the image space. As shown in Figure 3.5, a ray emanating from an object point, passing through the first focal point F1, will emerge parallel to the optical axis, being effectively refracted at the first principal plane. A ray from an object point traveling parallel to the optical axis will emerge after refracting at the second principal plane, passing through F2. A ray from an object point passing through the center of the lens will emerge parallel to the incident ray. All three rays intersect at the image location. If the resulting image lies to the right of the lens, the image is real (assuming light emanates from an object on the left). If the image lies to the left, it is virtual. If the image is larger than the object, the magnification is greater than unity. If the image is erect, the magnification is positive.

3.2.4 Aperture Stops and Pupils

The light accepted by an optical system is physically limited by aperture stops within the lens. The simplest aperture stop may be the edge of a lens or a physical stop placed in the system. Figure 3.6 shows how an aperture stop can limit the acceptance angle of a lens. The numerical aperture (NA) is the maximum acceptance angle at the image plane that is determined by the aperture stop:

$$NA_{IMG} = n_i \sin(\theta_{max})$$

Because the optical medium is generally air, $NA_{IMG} \approx \sin(\theta_{max})$. The field stop shown in Figure 3.7 limits the angular field of view, which is generally the angle subtended by the object or image from the first or second nodal point. The angular field of view for the image is generally that for the object. The image of the aperture stop viewed from the object is called the entrance pupil, whereas the image viewed from the image is called the exit pupil, as seen in Figure 3.8 and Figure 3.9. As will be seen, the aberrations of an optical system can be described by the deviations in spherical waves at the exit pupil coming to focus at the image plane.
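As a brief numerical sketch tying together the lens law, lateral magnification, and image-side numerical aperture introduced above (the focal length, object distance, and acceptance angle are arbitrary illustrative values, not taken from the text):

```python
import math

# Thin-lens imaging: 1/d1 + 1/d2 = 1/f and m = -d2/d1 (illustrative values).
f = 100.0    # effective focal length, mm (assumed)
d1 = 500.0   # object distance from the first principal plane, mm (assumed)

d2 = 1.0 / (1.0 / f - 1.0 / d1)   # image distance from the second principal plane
m = -d2 / d1                      # lateral magnification

# Image-side numerical aperture for a maximum half-angle in air (n_i = 1).
theta_max = math.radians(17.0)    # assumed acceptance half-angle
na_img = 1.0 * math.sin(theta_max)

print(f"d2 = {d2:.1f} mm, m = {m:.3f}, NA = {na_img:.2f}")
# d2 = 125.0 mm, m = -0.250 (a real, inverted, 4:1 reduced image), NA = 0.29
```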
FIGURE 3.6 Limitation of lens maximum acceptance angle by an aperture stop.
3.2.5 Chief and Marginal Ray Tracing

We have seen that a ray emitted from an off-axis point, passing through the center of a lens, will emerge parallel to the incident ray. This is called the chief ray, and it is directed toward the entrance pupil of the lens. A ray that is emitted from an on-axis point and directed toward the edge of the entrance pupil is called a marginal ray. The image plane can, therefore, be found where a marginal ray intersects the optical axis. The height of the image is determined by the height of the chief ray at the image plane, as seen in Figure 3.10. The marginal ray also determines the numerical aperture. The marginal and chief rays are related to each other by the Lagrange invariant, which states that the product of the image NA and image height is equal to the product of the object NA and object height, or $NA_{OBJ}\,y_1 = NA_{IMG}\,y_2$. It is essentially an indicator of how much information can be processed by a lens. The implication is that as object or field size increases, NA decreases. To achieve an increase in both NA and field size, system complexity increases. Magnification can now be
FIGURE 3.7 Limitation of angular field of view by a field stop.
FIGURE 3.8 Location of the entrance pupil for a simple lens.
expressed as

$$m = \frac{NA_{OBJ}}{NA_{IMG}}$$
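The Lagrange invariant can be made concrete with a small sketch for a reduction system; the 4:1 reduction ratio, wafer-side NA, and field height below are assumed values typical of steppers, not figures from the text:

```python
# Lagrange invariant: NA_obj * y1 = NA_img * y2, hence m = y2/y1 = NA_obj/NA_img.
na_img = 0.6     # wafer (image)-side numerical aperture (assumed)
m = 0.25         # 4:1 reduction system, |m| = 0.25 (assumed)

na_obj = na_img * m     # mask (object)-side NA is four times smaller
y1 = 52.0               # mask-side field height, mm (assumed)
y2 = m * y1             # wafer-side field height, mm

# Both sides of the invariant agree: 0.15 * 52 = 0.6 * 13 = 7.8
print(na_obj, y2)       # 0.15, 13.0
```

The invariant makes the trade-off explicit: demanding a larger field at a fixed NA, or a higher NA over a fixed field, raises the amount of information the lens must handle and, with it, the system complexity.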
3.2.6 Mirrors

A spherical mirror can form images in ways similar to refractive lenses. Using the reflective lens focal length, the lens equations can be applied to determine image position, height, and magnification. To use these equations, a sign convention for reflection needs to be established. Because refractive index is the ratio of the speed of light in vacuum to the speed of light in the material considered, it is logical that a change of sign would result if the direction of propagation was reversed. For reflective surfaces, therefore,

1. Refractive index values are multiplied by −1 upon reflection.
2. The signs of all distances upon reflection are multiplied by −1.
FIGURE 3.9 Location of the exit pupil for a simple lens.
FIGURE 3.10 Chief and marginal ray tracing through a lens system.
Figure 3.11 shows the location of principal and focal points for two mirror types: concave and convex. The concave mirror is equivalent to a positive converging lens. A convex mirror is equivalent to a negative lens. The EFL is simplified, because of the loss of the thickness term and the sign changes, to

$$f = -\frac{R}{2}$$
FIGURE 3.11 Location of principal and focal points for (a) concave and (b) convex mirrors.
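As a small worked sketch of these mirror relations (using magnitudes and the equivalence of a concave mirror to a positive lens, rather than the full reflected-sign bookkeeping above; the radius and object distance are arbitrary assumptions):

```python
# Concave mirror treated as a positive lens with |f| = R/2 (illustrative values).
R = 200.0            # magnitude of the mirror radius of curvature, mm (assumed)
f = R / 2.0          # equivalent positive focal length, mm
d1 = 300.0           # object distance, mm (assumed)

d2 = 1.0 / (1.0 / f - 1.0 / d1)   # image distance from the mirror
m = -d2 / d1                      # lateral magnification

print(d2, m)   # 150.0 mm and -0.5: a real, inverted, demagnified image
```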
3.3 Image Formation: Wave Optics

Many of the limitations of geometrical optics can be explained by considering the wave nature of light. As it has been reasoned that a perfect lens translates spherical waves from an object point to an image point, such concepts can be used to describe deviations from nongeometrical propagation that would otherwise be difficult to predict. An approach proposed by Huygens [2] allows an extension of optical geometric construction to wave propagation. Through use of this simplified wave model, many practical aspects of the wave nature of light can be understood. Huygens' principle provides a basis for determining the position of a wavefront at any instance based on knowledge of an earlier wavefront. A wavefront is assumed to be made up of an infinite number of point sources. Each of these sources produces a spherical secondary wave called a wavelet. These wavelets propagate with appropriate velocities that are determined by refractive index and wavelength. At any point in time, the position of the new wavefront can be determined as the surface tangent to these secondary waves. Using Huygens' concepts, electromagnetic fields can be thought of as sums of propagating spherical or plane waves. Although Huygens had no knowledge of the nature of the light wave or the electromagnetic character of light, this approach has allowed analysis without the need to fully solve Maxwell's equations.

The diffraction of light is responsible for image creation in all optical situations. When a beam of light encounters the edge of an opaque obstacle, propagation is not rectilinear as might be assumed based on assumptions of geometrical shadowing. The resulting variation in intensity produced at some distance from the obstacle is dependent on the coherence of light, its wavelength, and the distance the light travels before being observed. The situation for coherent illumination is shown in Figure 3.12. Shown are a coherently illuminated mask and the resulting intensity pattern observed at increasing distances. Such an image in intensity is known as an aerial image. Typically, with coherent illumination, fringes are created in the diffuse shadowing between light and dark, a result of interference. Only when there is no separation between the obstacle and the recording plane does rectilinear propagation occur. As the recording plane is moved away from the obstacle, there is a region where the geometrical shadow is still discernible. Beyond this region, far from the obstacle, the intensity pattern at the recording plane no longer
FIGURE 3.12 Diffraction pattern of a coherently illuminated mask opening at near (Fresnel) and far (Fraunhofer) distances.
resembles the geometrical shadow; rather, it contains areas of light and dark fringes. At close distances, where geometric shadowing is still recognizable, near-field diffraction, or Fresnel diffraction, dominates. At greater distances, far-field diffraction, or Fraunhofer diffraction, dominates.

3.3.1 Fresnel Diffraction: Proximity Lithography
The theory of Fresnel diffraction is based on the Fresnel approximation to the propagation of light, and it describes image formation for proximity printing, where separation distances between the mask and wafer are normally held to within a few microns [3]. The distribution of intensity resembles that of the geometric shadow. As the separation between the mask and wafer increases, the integrity of an intensity pattern resembling an ideal shadowing diminishes. Theoretical analysis of Fresnel diffraction is difficult, and Fresnel approximations based on Kirchhoff diffraction theory are used to obtain a qualitative understanding [4]. Because our interest lies mainly with projection systems and diffraction beyond the near-field region, a rigorous analysis will not be attempted here. Instead, analysis of results will provide some insight into the capabilities of proximity lithography.

Fresnel diffraction can be described using a linear filtering approach that can be made valid over a small region of the observation or image plane. For this analogy, a mask function is effectively frequency filtered with a quadratically increasing phase function. This quadratic phase filter can be thought of as a slice of a spherical wave at some plane normal to the direction of propagation as shown in Figure 3.13. The resulting image will exhibit "blurring" at the edges and oscillating "fringes" in bright and dark regions. Recognition of the geometrical shadow becomes more difficult as the illumination wavelength increases, the mask feature size decreases, or the mask separation distance increases. Figure 3.14 illustrates the situation where a space mask is illuminated with 365-nm radiation and the separation distance between mask and wafer is 1.8 µm. For relatively large features, on the order of 10–15 µm, rectilinear propagation dominates, and the resulting image intensity distribution resembles the mask. In order to determine the minimum feature width resolvable, some specification for maximum intensity loss and line width deviation must be made. These specifications are determined by the photoresist material and processes. If an intensity tolerance of ±5% and a mask space width to image width tolerance of ±20% is acceptable, a relationship for minimum resolution results:

$$w \approx 0.7\sqrt{\lambda s}$$
FIGURE 3.13 A quadratic phase function.
FIGURE 3.14 Aerial images resulting from frequency filtering of a slit opening with a quadratic phase function. The illumination wavelength is 365 nm and the separation distance is 1.8 µm for the mask opening sizes shown (0.51–15.48 µm).
where w is the space width, λ is the illumination wavelength, and s is the separation distance. As can be shown, resolution below 1 µm should be achievable with separations of 5 µm or less. A practical limit for resolution using proximity methods is closer to 3–5 µm because of surface and mechanical separation control as well as alignment difficulties.
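A minimal sketch evaluating this proximity-printing estimate for several illustrative mask-to-wafer gaps (the gap values are assumptions chosen for demonstration):

```python
import math

def min_space_width_um(wavelength_nm, gap_um):
    """Approximate minimum printable space width, w ~ 0.7*sqrt(lambda*s)."""
    wavelength_um = wavelength_nm * 1e-3
    return 0.7 * math.sqrt(wavelength_um * gap_um)

for gap in (1.8, 5.0, 10.0, 25.0):   # illustrative gaps, um
    print(f"gap = {gap:5.1f} um -> w ~ {min_space_width_um(365.0, gap):.2f} um")
# ~0.57 um at 1.8 um, ~0.95 um at 5 um, ~1.34 um at 10 um, ~2.11 um at 25 um
```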
3.3.2 Fraunhofer Diffraction: Projection Lithography

For projection lithography, diffraction in the far-field or Fraunhofer region needs to be considered. No longer is geometric shadowing recognizable; rather, fringing takes over in the resulting intensity pattern. Analytically, this situation is easier to describe than Fresnel diffraction. When light encounters a mask, it is diffracted toward the objective lens in the projection system. Its propagation will determine how an optical system will ultimately perform, depending on the coherence of the light that illuminates the mask. Consider a coherently illuminated single space mask opening as shown in Figure 3.15. The resulting Fraunhofer diffraction pattern can be evaluated by examining light coming from various portions of the space opening. Using Huygens' principle, the opening can be divided into an infinite number of individual sources, each acting as a separate source of spherical wavelets. Interference will occur between every portion of this opening, and the resulting diffraction pattern at some far distance will depend on the propagation direction θ. It is convenient for analysis to divide the opening into two halves (d/2). With coherent illumination, all wavelets emerging from the mask opening are in phase. If waves emitted from the center and bottom of the mask opening are considered (labeled W1 and W3), it can be seen that an optical path difference (OPD) exists as one wave travels a distance (d/2) sin θ farther than the other. If the resulting OPD is one half-wavelength or any multiple of one half-wavelength, the waves will interfere destructively. Similarly, an OPD of (d/2) sin θ exists between any two waves that originate from points separated by one half of the space width. The waves from the top portion of the mask opening interfere destructively with waves from the bottom portion of the mask when

$$d \sin\theta = m\lambda \qquad (m = \pm 1, \pm 2, \pm 3, \ldots)$$

where $|m| \le d/\lambda$. From this equation, the positions of dark fringes in the Fraunhofer diffraction pattern can be determined. Figure 3.16 shows the resulting diffraction pattern from a single space, where a broad central bright fringe exists at positions corresponding to θ = 0, and dark fringes occur where θ satisfies the destructive interference condition.
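A short sketch locating these dark fringes for one illustrative opening; the 2-µm space width is an assumption, while the wavelength matches the i-line value used elsewhere in this section:

```python
import math

wavelength = 0.365   # um (i-line)
d = 2.0              # single space (slit) width, um (assumed)

m_max = int(d / wavelength)          # |m| <= d / lambda
for m in range(1, m_max + 1):
    theta = math.degrees(math.asin(m * wavelength / d))
    print(f"m = +/-{m}: dark fringe at {theta:.1f} degrees")
# +/-1: 10.5, +/-2: 21.4, +/-3: 33.2, +/-4: 46.9, +/-5: 65.9 degrees
```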
FIGURE 3.15 Determination of Fraunhofer diffraction effects for a coherently illuminated single mask opening.
FIGURE 3.16 (a) A single space mask pattern and (b) its corresponding Fraunhofer diffraction pattern. These are Fourier transform pairs.
Although this geometric approach is satisfactory for a basic understanding of Fraunhofer diffraction principles, it cannot do an adequate job of describing the propagation of diffracted light. Fourier methods and scalar diffraction theory provide a description of the propagation of diffracted light through several approximations (previously identified as the Fresnel approximation), specifically [5,6]:

1. The distance between the aperture and the observation plane is much greater than the aperture dimension.
2. Spherical waves can be approximated by quadratic surfaces.
3. Each plane wave component has the same polarization amplitude (with polarization vectors perpendicular to the optical axis).

These approximations are valid for optical systems with numerical apertures below 0.6 if illumination polarization can be neglected. Scalar theory has been extended beyond these approximations to numerical apertures of 0.7 [7], and full vector diffraction theory has been utilized for more rigorous analysis [8].

3.3.3 Fourier Methods in Diffraction Theory

Whereas geometrical methods allow determination of interference minima for the Fraunhofer diffraction pattern of a single slit, the distribution of intensity across the pattern is most easily determined through Fourier methods. The coherent field distribution of a Fraunhofer diffraction pattern produced by a mask is essentially the Fourier transform of the mask function. If m(x,y) is a two-dimensional mask function or electric field distribution across the x–y mask plane and M(u,v) is the coherent field distribution across the u–v Fraunhofer diffraction plane, then

$$M(u,v) = \mathcal{F}\{m(x,y)\}$$

will represent the Fourier transform operation. Both m(x,y) and M(u,v) have amplitude and phase components. From Figure 3.12, we could consider M(u,v) the distribution (in amplitude) at the farthest distance from the mask. The field distribution in the Fraunhofer diffraction plane represents the spatial frequency spectrum of the mask function. In the analysis of image detail, preservation of spatial structure is generally of most concern. For example, the lithographer is interested in optimizing an imaging process to maximize the reproduction integrity of fine feature detail. To separate out such spatial structure from an image, it is convenient to
work in a domain of spatial frequency rather than of feature dimension. The concept of spatial frequency is analogous to temporal frequency in the analysis of electrical communication systems. Units of spatial frequency are reciprocal distance. As spatial frequency increases, pattern detail becomes finer. Commonly, units of cycles/mm or mm⁻¹ are used, where 100 mm⁻¹ is equivalent to 5 µm, 1000 mm⁻¹ is equivalent to 0.5 µm, and so forth. The Fourier transform of a function, therefore, translates dimensional (x,y) information into spatial frequency (u,v) structure.

3.3.3.1 The Fourier Transform

The unique properties of the Fourier transform allow convenient analysis of spatial frequency structure [9]. The Fourier transform takes the general form
$$F(u) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i u x}\, dx$$
for one dimension. Uppercase and lowercase letters are used to denote Fourier transform pairs. In words, the Fourier transform expresses a function f(x) as the sum of weighted sinusoidal frequency components. If f(x) is a real-valued, even function, the complex exponential ($e^{-2\pi i u x}$) could be replaced by a cosine term, cos(2πux), making the analogy more obvious. Such transforms are utilized but are of little interest for microlithographic applications because masking functions, m(x,y), will generally have odd as well as even components. If the single slit pattern analyzed previously with Fraunhofer diffraction theory is revisited, it can be seen that the distribution of the amplitude of the interference pattern produced is simply the Fourier transform of an even, one-dimensional, nonperiodic, rectangular pulse, commonly referred to as a rect function, rect(x). The Fourier transform of rect(x) is a sinc(u), where

$$\mathrm{sinc}(u) = \frac{\sin(\pi u)}{\pi u}$$
which is shown in Figure 3.16. The intensity of the pattern is proportional to the square of the amplitude, or a sinc²(u) function, which is equivalent to the power spectrum. The two functions, rect(x) and sinc(u), are Fourier transform pairs, where the inverse Fourier transform of F(u) is f(x):
$$f(x) = \int_{-\infty}^{\infty} F(u)\, e^{+2\pi i u x}\, du$$
The Fourier transform is nearly its own inverse, differing only in sign. The scaling property of the Fourier transform is of specific importance in imaging applications. Properties are such that

$$\mathcal{F}\left\{ f\!\left(\frac{x}{b}\right) \right\} = |b|\, F(bu)$$
FIGURE 3.17 Scaling effects on rect(x) and sinc(u) pairs.
and

$$\mathcal{F}\left\{ \mathrm{rect}\!\left(\frac{x}{b}\right) \right\} = |b|\, \mathrm{sinc}(bu)$$

where b is the effective width of the function. The implication of this is that as the width of a slit decreases, the field distribution of the diffraction pattern becomes more spread out with diminished amplitude values. Figure 3.17 illustrates the effects of scaling on a one-dimensional rect function. A mask object is generally a function of both x and y coordinates in a two-dimensional space. The two-dimensional Fourier transform takes the form
$$F(u,v) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\, e^{-2\pi i (ux + vy)}\, dx\, dy$$
The variables u and v represent spatial frequencies in the x and y directions, respectively. The inverse Fourier transform can be determined in a fashion similar to the one-dimensional case with a conventional change in sign. In IC lithography, isolated as well as periodic lines and spaces are of interest. Diffraction for isolated features through Fourier transform of the rect function has been analyzed. Diffraction effects for periodic feature types can be analyzed in a similar manner.
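The rect/sinc transform pair and the scaling property can be checked numerically. Below is a minimal sketch that uses NumPy's FFT as a stand-in for the continuous transform; the grid spacing and the width b are arbitrary choices:

```python
import numpy as np

# Sample rect(x/b) on a fine grid and compare its transform with |b|*sinc(b*u).
b = 2.0                                   # effective width (assumed)
dx = 0.01
x = np.arange(-50.0, 50.0, dx)            # spatial grid, arbitrary extent
f = np.where(np.abs(x / b) <= 0.5, 1.0, 0.0)

# Approximate F(u) = integral of f(x) exp(-2*pi*i*u*x) dx with a DFT.
F = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(f))) * dx
u = np.fft.fftshift(np.fft.fftfreq(x.size, d=dx))

analytic = b * np.sinc(b * u)             # np.sinc(t) = sin(pi*t)/(pi*t)
print(np.max(np.abs(F.real - analytic)))  # small residual, on the order of 1e-2
```

Rerunning the sketch with a smaller b spreads the transform out and lowers its peak, which is the scaling behavior illustrated in Figure 3.17.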
FIGURE 3.18 A periodic rectangular wave, representing dense mask features.
3.3.3.2 Rectangular Wave
Where a single-slit mask can be considered as a nonperiodic rectangular pulse, line/space patterns can be viewed as periodic rectangular waves. In Fraunhofer diffraction analysis, this rectangular wave is analogous to the diffraction grating. The rectangular wave function of Figure 3.18 has been chosen as an illustration, where the maximum amplitude is A and the wave period is p, also known as the pitch. This periodic wave can be broken up into components of a rect function, with width one half of the pitch, or p/2, and a periodic function that will be called comb(x), where

comb(x/p) = Σ_{n=−∞}^{∞} δ(x − np)

an infinite train of unit-area impulse functions spaced one pitch unit apart. (An impulse function is an idealized function with zero width and infinite height, having an area equal to 1.0.) To separate these functions, rect(x) and comb(x), from the rectangular wave, we need to realize that it is a convolution operation that relates them. Because convolution in the space (x) domain becomes multiplication in the frequency (u) domain,
m(x) = rect(x/(p/2)) * comb(x/p)

M(u) = F{m(x)} = F{rect(x/(p/2))} × F{comb(x/p)}
By utilizing the transform properties of the comb function,

F{comb(x/b)} = |b| comb(bu)

the Fourier transform of the rectangular wave can be expressed as

M(u) = F{rect(x/(p/2)) * comb(x/p)} = (A/2) sinc(u/2u_0) Σ_{n=−∞}^{∞} δ(u − nu_0)
where u_0 = 1/p is the fundamental frequency of the mask grating. The amplitude spectrum of the rectangular wave is shown in Figure 3.19, where (A/2) sinc(u/2u_0) provides an envelope for the discrete Fraunhofer diffraction pattern. It can be shown that the discrete interference maxima correspond to d sin θ = mλ, where m = 0, ±1, ±2, ±3, and so on, and d is the mask pitch.
FIGURE 3.19 The amplitude spectrum of a rectangular wave, A/2 sinc(u/2u0). This is equivalent to the discrete orders of the coherent Fraunhofer diffraction pattern.
3.3.3.3 Harmonic Analysis
The amplitude spectrum of the rectangular wave can be utilized to decompose the function into a linear combination of complex exponentials by assigning proper weights to complex-valued coefficients. This allows harmonic analysis through the Fourier series expansion, utilizing complex exponentials as basis functions. These exponentials, or sine and cosine functions, allow the spatial frequency structure of periodic as well as nonperiodic functions to be represented. Consider the periodic rectangular wave function m(x) of Figure 3.18. Because the function is even and real-valued, the amplitude spectrum can be utilized to decompose m(x) into the cosinusoidal frequency components
m(x) = A/2 + (2A/π) cos(2πu_0x) − (2A/3π) cos(2π(3u_0)x) + (2A/5π) cos(2π(5u_0)x) − (2A/7π) cos(2π(7u_0)x) + …
By graphing these components in Figure 3.20, it becomes clear that each additional term brings the sum closer to the function m(x). These discrete coefficients are the diffraction orders of the Fraunhofer diffraction pattern that are produced when a diffraction grating is illuminated by coherent illumination. These coefficients, represented as terms in the harmonic decomposition of m(x) in Figure 3.20, correspond to the discrete orders seen in Figure 3.19. The zeroth order (centered at u = 0) corresponds to the constant DC term A/2. At either side are the ± first orders at u_1 = ±1/p; the ± second orders correspond to u_2 = ±2/p, and so on. It follows that if an imaging system were not able to collect all diffracted orders propagating from a mask, complete reconstruction would not be possible. Furthermore, as higher frequency information is lost, fine image detail is sacrificed. There is, therefore, a fundamental limitation to resolution for an imaging system determined by its inability to collect all possible diffraction information. A numerical sketch of this convergence is shown after the figure below.
FIGURE 3.20 Reconstruction of a rectangular wave (right) using Fourier series expansion: the constant term plus the 1st, 3rd, and 5th harmonics.
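The convergence behavior sketched in Figure 3.20 can be reproduced with a few lines of code. The Python fragment below is a minimal sketch: the amplitude A, pitch p, and number of harmonics are arbitrary example values, not values taken from the figure. It sums the cosine terms of the expansion above and shows how the partial sums at the center of a bright feature approach the rectangular wave amplitude.

import numpy as np

A, p = 1.0, 1.0                     # amplitude and pitch (arbitrary units)
u0 = 1.0 / p                        # fundamental frequency of the grating
x = np.linspace(-1.5 * p, 1.5 * p, 2001)

def partial_sum(n_terms):
    """Sum the DC term plus the first n_terms odd harmonics of m(x)."""
    m = np.full_like(x, A / 2.0)
    sign = 1.0
    for n in range(1, 2 * n_terms, 2):          # n = 1, 3, 5, ...
        m += sign * (2.0 * A / (n * np.pi)) * np.cos(2.0 * np.pi * n * u0 * x)
        sign = -sign
    return m

# The value at x = 0 oscillates toward A as harmonics are added (Gibbs overshoot aside).
for terms in (1, 2, 3, 4):
    print(terms, round(partial_sum(terms)[1000], 3))

Dropping the higher harmonics, as a finite lens aperture does, is exactly the loss of fine image detail described above.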
3.3.3.4 Finite Dense Features
The rectangular wave is very useful for understanding the fundamental concepts and Fourier analysis of diffraction. In reality, however, finite mask functions are dealt with rather than infinite functions such as the rectangular wave. The extent to which a finite number of mask features can be represented by an infinite function depends on the number of features present. Consider a mask consisting of five equal line/space pairs, or a five-bar function, as shown in Figure 3.21. This mask function can be represented, as before, as the convolution of a scaled rect(x) function and an impulse train comb(x). In order to limit the mask function to five features only, a windowing function must be introduced as follows:
m(x) = [rect(x/(p/2)) * comb(x/p)] × rect(x/5p)
FIGURE 3.21 A five-bar mask function m(x) and its corresponding coherent spatial frequency distribution M(u).
As before, the spatial frequency distribution is a Fourier transform, but now each diffraction order is convolved with a sinc(u) function scaled appropriately by the inverse width of the windowing function:

M(u) = [(A/2) sinc(u/2u_0) Σ_{n=−∞}^{∞} δ(u − nu_0)] * 5 sinc(u/(u_0/5))

As more features are added to the five-bar function, the width of the convolved sinc(u) is narrowed. In the limit where an infinite number of features is considered, the sinc(u) function becomes a δ(u), and the result is identical to the rectangular wave. At the other extreme, if a one-bar mask function is considered, the resulting spatial frequency distribution is the continuous function shown in Figure 3.16.

3.3.3.5 The Objective Lens
In a projection imaging system, the objective lens has the ability to collect a finite amount of diffracted information from a mask, determined by its maximum acceptance angle or numerical aperture. A lens behaves as a linear filter for a diffraction pattern propagating from a mask. By limiting high-frequency diffraction components, it acts as a low-pass filter, blocking information propagating at angles beyond its capability. Information that is passed is acted on by the lens to produce a second, inverse Fourier transform operation, directing a limited reconstruction of the mask object toward the image plane. It is limited not only by the loss of higher frequency diffracted information but also by any lens aberrations that may act to introduce image degradation. In the absence of lens aberrations, imaging is referred to as diffraction limited. The influence of lens aberration on imaging will be addressed later. At this point, if an ideal diffraction-limited lens can be considered, the concept of a lens as a linear filter can provide insight into image formation.

3.3.3.6 The Lens as a Linear Filter
If an objective lens could produce an exact inverse Fourier transform of the Fraunhofer diffraction pattern emanating from an object, complete image reconstruction would be possible. A finite lens numerical aperture will prevent this. Consider a rectangular grating where p sin θ = mλ describes the positions of the discrete coherent diffraction orders.
If a lens can be described in terms of a two-dimensional pupil function H(u,v), limited by its scaled numerical aperture, NA/λ, then

H(u,v) = 1 if √(u² + v²) < NA/λ
H(u,v) = 0 if √(u² + v²) > NA/λ
describes the behavior of the lens as a low-pass filter. The resulting image amplitude produced by the lens is the inverse Fourier transform of the mask's Fraunhofer diffraction pattern multiplied by this lens pupil function,

A(x,y) = F{M(u,v) × H(u,v)}

The image intensity distribution, known as the aerial image, is equal to the square of the image amplitude,

I(x,y) = |A(x,y)|²

For the situation described, coherent illumination allows simplification of optical behavior. Diffraction at a mask is effectively a Fourier transform operation. Part of this diffracted field is collected by the objective lens, where diffraction is, in a sense, reversed through a second Fourier transform operation. Any loss incurred through the limitations of a lens with NA < 1.0 results in less than complete reconstruction of the original mask detail. To extend this analysis to real systems, an understanding of coherence theory is needed.
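This filtering action can be sketched numerically. In the hypothetical Python example below (NumPy only; the wavelength, NA, and pitch values are arbitrary illustrations rather than recommendations), a one-dimensional equal line/space mask is Fourier transformed, its spectrum is truncated at the coherent cutoff NA/λ, and the squared magnitude of the inverse transform gives the aerial image.

import numpy as np

wavelength = 0.365      # um (i-line, used only as an example)
NA = 0.5
pitch = 1.0             # um, equal lines and spaces

# Sample several periods of the mask.
n, span = 4096, 16.0 * pitch
x = (np.arange(n) - n / 2) * span / n
mask = (np.mod(x, pitch) < pitch / 2).astype(float)      # binary 0/1 transmission

# Fraunhofer diffraction: the mask spectrum M(u).
M = np.fft.fftshift(np.fft.fft(mask))
u = np.fft.fftshift(np.fft.fftfreq(n, d=span / n))        # spatial frequency in 1/um

# Ideal objective lens: low-pass pupil with cutoff NA/lambda (coherent illumination).
H = (np.abs(u) <= NA / wavelength).astype(float)

# Limited reconstruction at the image plane; the aerial image is |A(x)|^2.
A = np.fft.ifft(np.fft.ifftshift(M * H))
aerial_image = np.abs(A) ** 2
print(round(aerial_image.max(), 3), round(aerial_image.min(), 3))

With these example values the fundamental grating frequency (1/pitch) lies inside the pupil, so the zero and first orders are passed and a modulated image results; reducing the pitch below λ/(2NA) would leave only the zero order and a featureless image.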
3.3.4 Coherence Theory in Image Formation
Much has been written about coherence theory and the influence of spatial coherence on interference and imaging [10]. For projection imaging, three illumination situations are possible that allow the description of interference behavior. These are coherent illumination, where wavefronts are correlated and are able to interfere completely; incoherent illumination, where wavefronts are uncorrelated and unable to interfere; and partially coherent illumination, where partial interference is possible. Figure 3.22 shows the situation where spherical wavefronts are emitted from point sources that can be used to describe coherent, incoherent, and partially coherent illumination. With coherent illumination, spherical waves emitted by a single point source on axis result in plane waves normal to the optical axis when acted upon by a lens. At all positions on the mask, radiation arrives in phase. Strictly speaking, coherent illumination implies zero intensity. For incoherent illumination, an infinite collection of off-axis point sources results in plane waves at all angles (±π). The resulting illumination at the mask has essentially no phase-to-space relationship. For partially coherent illumination, a finite collection of off-axis point sources describes a source of finite extent, resulting in plane waves within a finite range of angles. The situation of partial coherence is of most interest for lithography, and its degree will have a great influence on imaging results. Through the study of interference, Young's double-slit experiment has allowed an understanding of a great deal of optical phenomena. The concept of partial coherence can be understood using modifications of Young's double-slit experiment. Consider two slits separated a distance p apart and illuminated by a coherent point source as depicted in Figure 3.23a.
FIGURE 3.22 The impact of on-axis (a) and off-axis (b) point sources on illumination coherence. Plane waves result for each case and are normal to the optical axis only for an on-axis point.
The resulting interference fringes are cosinusoidal with frequency u_0 = 1/p, as would be predicted using interference theory or the Fourier transform concepts of Fraunhofer diffraction (the Fourier transform of two symmetrically distributed point sources or impulse functions is a cosine). Next, consider a point source shifted laterally and the resulting phase-shifted cosinusoidal interference pattern as shown in Figure 3.23b. If this approach is extended to a number of point sources to represent a real source of finite extent, it can be expected that the resulting interference pattern would be an average of many cosines with reduced modulation and with a frequency u_0, as shown in Figure 3.23c. The assumption for this analysis is that the light emitted from each point source is of identical wavelength, that is, a condition of temporal coherence.

3.3.5 Partial Coherence Theory: Diffraction-Limited Resolution
The concept of degree of coherence is useful as a description of the illumination condition. The Abbe theory of microscope imaging can be applied to microlithographic imaging with coherent or partially coherent illumination [11]. Abbe demonstrated that when a ruled grating is coherently illuminated and imaged through an objective lens, the resulting image depends on the lens numerical aperture. The minimum resolution that can be obtained is a function of both the illumination wavelength and the lens NA, as shown in Figure 3.24 for coherent illumination. Because no imaging is possible if no more than the undiffracted beam is accepted by the lens, it can be reasoned that a minimum of the first diffraction order is required for resolution. The position of this first order is determined as

sin(θ) = λ/p
FIGURE 3.23 Diffraction patterns from two slits separated by a distance d for (a) coherent illumination, (b) oblique off-axis illumination, and (c) partially coherent illumination.
FIGURE 3.24 The condition for minimum diffraction limited resolution for a coherently illuminated grating mask.
Because a lens numerical aperture is defined as the sine of the half acceptance angle (θ), the minimum resolvable line width (R = p/2) becomes

R = p/2 = 0.5 λ/NA

Abbe's work made use of a smooth uniform flame source and a substage condenser to form its image in the object plane. To adapt to nonuniform lamp sources, Köhler devised a two-stage illuminating system to form an image of the source in the entrance pupil of the objective lens, as shown in Figure 3.25 [12]. A pupil at the condenser lens can control the numerical aperture of the illumination system. As the pupil is closed down, the source size (d_s) and the effective source size (d_s′) are decreased, resulting in an increase in the extent of coherency. Thus, Köhler illumination allows control of partial coherence. The degree of partial coherence (σ) is conventionally measured as the ratio of effective source size to full objective aperture size, or the ratio of condenser lens NA to objective lens NA:

Degree of coherence (σ) = d_s′/d_o = NA_C/NA_O
FIGURE 3.25 Schematic of Köhler illumination. The degree of coherence (σ) is determined as d_s′/d_o or NA_C/NA_O.
As σ approaches zero, a condition of coherent illumination exists. As σ approaches one, incoherent illumination exists. In lithographic projection systems, σ is generally in the range 0.3–0.9. Values below 0.3 will result in "ringing" in images, fringes that result from coherent interference effects similar to those shown as terms are added in Figure 3.20. Partial coherence can be thought of as taking an incoherent sum of coherent images. For every point within a source of finite extent, a coherent Fraunhofer diffraction pattern is produced that can be described by Fourier methods. For a point source on axis, diffracted information is distributed symmetrically and discretely about the axis. For off-axis points, diffraction patterns are shifted off axis and, as all points are considered together, the resulting diffraction pattern becomes a summation of individual distributions. Figure 3.26 depicts the situation for a rectangular wave mask pattern illuminated with σ greater than zero. Here, the zeroth order is centered on axis but with a width greater than zero, a result of the spread of partially coherent illumination angles. Similarly, each higher diffraction order also has nonzero width, an effective spreading of the discrete orders. The impact of partial coherence is realized when the influence of an objective lens is considered. By spreading the diffraction orders about their discrete coherent frequencies, operation on the diffracted information by the lens produces a frequency averaging effect in the image and a loss of image modulation, as previously seen in Figure 3.23 for the double-slit example. This image degradation is not desirable when coherent illumination would allow superior image reconstruction. If, however, a situation exists where coherent illumination of a given mask pattern does not allow lens collection of diffraction orders beyond the zeroth order, partially coherent illumination would be preferred. Consider a coherently illuminated rectangular grating mask where the ± first diffraction orders fall just outside a projection system's lens NA. With coherent illumination, imaging is not possible as feature sizes fall below the R = 0.5λ/NA limit. Through the use of partially coherent illumination, partial first diffraction order information can be captured by the lens, resulting in imaging capability. Partially coherent illumination, therefore, is desirable as mask features fall below R = 0.5λ/NA in size. An optimum degree of coherence can be determined for a feature based on its size, the illumination wavelength, and the objective lens NA.
FIGURE 3.26 Spread of diffraction orders for partially coherent illumination. Resolution below 0.5λ/NA becomes possible.
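The "incoherent sum of coherent images" picture lends itself to a simple numerical sketch. In the illustrative Python fragment below, the wavelength, NA, σ, pitch, and the one-dimensional, uniformly weighted source are all assumptions made for the example; a real simulator would integrate over a two-dimensional source shape. Each source point tilts the illumination, which shifts the pupil relative to the mask spectrum before a coherent image is formed; the intensities of these coherent images are then averaged.

import numpy as np

wavelength, NA, sigma, pitch = 0.365, 0.5, 0.5, 0.6    # example values (um, unitless)
n, span = 4096, 16.0 * pitch
x = (np.arange(n) - n / 2) * span / n
mask = (np.mod(x, pitch) < pitch / 2).astype(float)

M = np.fft.fftshift(np.fft.fft(mask))
u = np.fft.fftshift(np.fft.fftfreq(n, d=span / n))
pupil_cutoff = NA / wavelength

# Discretize a one-dimensional source of half-extent sigma*NA/lambda.
source_points = np.linspace(-sigma * pupil_cutoff, sigma * pupil_cutoff, 41)

image = np.zeros(n)
for us in source_points:
    # An off-axis source point is equivalent to applying the pupil about a shifted center.
    H = (np.abs(u + us) <= pupil_cutoff).astype(float)
    A = np.fft.ifft(np.fft.ifftshift(M * H))
    image += np.abs(A) ** 2            # intensities, not amplitudes, add across source points
image /= len(source_points)

center = image[n // 4: 3 * n // 4]     # avoid window-edge effects
modulation = (center.max() - center.min()) / (center.max() + center.min())
print(round(modulation, 3))

With this pitch the first orders fall just outside the pupil for on-axis illumination, so only the obliquely illuminated source points contribute modulation, which is the behavior described above for features below the coherent 0.5λ/NA limit.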
FIGURE 3.27 Intensity aerial images for features with various levels of partial coherence. Features corresponding to 0.6λ/NA are relatively large and are shown on the left (a). Small features corresponding to 0.4λ/NA are shown on the right (b).
Figure 3.27 shows the effect of partial coherence on imaging features of two sizes. The first case, Figure 3.27a, is one where the features are larger than the resolution possible with coherent illumination (here, 0.6λ/NA). As seen, any increase in partial coherence above σ = 0 results in a degradation of the aerial image produced. This is due to the averaging effect of the fundamental cosinusoidal components used in image reconstruction. As seen in Figure 3.27b, features smaller than the resolution possible with coherent illumination (0.4λ/NA) are resolvable only as partial coherence levels increase above σ = 0. It stands to reason that for every feature size and type, there exists a unique optimum partial coherence value that allows the greatest image improvement while causing the minimum degradation. Focus effects also need to be considered as partial coherence is optimized. This will be addressed further as depth of focus is considered.
3.4 Image Evaluation
The minimum resolution possible with coherent illumination is that which satisfies

R = 0.5λ/NA
which is commonly referred to as the Rayleigh criterion [13]. Through incoherent or partially coherent illumination, resolution beyond this limit is made possible. Methods of image assessment are required to evaluate an image that is transferred through an optical system. As will be shown, such methods will also prove useful as an imaging system deviates from ideal and optical aberrations are considered.

3.4.1 OTF, MTF, and PTF
The optical transfer function (OTF) is often used to evaluate the relationship between an image and the object that produced it [14]. In general, a transfer function is a description of an entire imaging process as a function of spatial frequency. It is a scaled Fourier transform of the point spread function (PSF) of the system. The PSF is the response of the optical system to a point object input, essentially the distribution of a point aerial image.
For a linear system, the transfer function is the ratio of the image modulation (or contrast) to the object modulation (or contrast), C_image(u)/C_object(u), where contrast (C) is the normalized modulation at frequency u:

C(u) = (S_max − S_min)/(S_max + S_min) ≤ 1

Here, S is the image or object signal. To fulfill the requirements of a linear system, several conditions must be met. In order to be linear, a system's response to the superposition of two inputs must equal the superposition of the individual responses. If Q{f(x)} = g(x) represents the operation of a system on an input f(x) to produce an output g(x), then

Q{f_1(x) + f_2(x)} = g_1(x) + g_2(x)

represents a system linear with superposition. A second condition of a linear system is shift invariance, where a system operates identically at all input coordinates. Analytically, this can be expressed as

Q{f(x − x_0)} = g(x − x_0)

that is, a shift in input results in an identical shift in output. An optical system can be thought of as shift invariant in the absence of aberrations. Because the aberration of a system changes from point to point, the PSF can significantly vary from a center to an edge field point. Intensities must add for an imaging process to be linear. In the coherent case of the harmonic analysis of a square wave in Figure 3.20, the amplitudes of individual components have been added rather than their intensities. Whereas an optical system is linear in amplitude for coherent illumination, it is linear in intensity only for incoherent illumination. The OTF, therefore, can be used as a metric for analysis of image intensity transfer only for incoherent illumination. Modulation is expressed as

M = (I_max − I_min)/(I_max + I_min)

where I is image or object intensity. It is a transfer function for a system over a range of spatial frequencies. A typical OTF is shown in Figure 3.28, where modulation is plotted as a function of spatial frequency in cycles/mm. As seen, higher frequency objects (corresponding to finer feature detail) are transferred through the system with lower modulation. The characteristics of the incoherent OTF can be understood by working backward through an optical system. We have seen that an amplitude image is the Fourier transform of the product of an object spectrum and the lens pupil function. Here, the object is a point, and the image is its PSF. The intensity PSF for an incoherent system is a squared amplitude PSF, also known as an Airy disk, shown in Figure 3.29. Because multiplication becomes a convolution via a Fourier transform, the transfer function of an imaging system with incoherent illumination is proportional to the self-convolution, or autocorrelation, of the lens pupil function, which is equivalent to the Fourier transform of its intensity PSF. As seen in Figure 3.28, the OTF resembles a triangular function, the result of the autocorrelation of a rectangular pupil function (which would be circular in two dimensions). For coherent illumination, the coherent transfer function is proportional to the pupil function itself.
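The autocorrelation property is easy to check numerically. The sketch below is a one-dimensional illustration with assumed wavelength and NA values; a one-dimensional slit pupil autocorrelates to an exact triangle, whereas a two-dimensional circular pupil would give the familiar arccos-shaped curve. Either way, the computed transfer function falls to zero at twice the coherent cutoff.

import numpy as np

wavelength, NA = 0.365, 0.5         # example values only
cutoff = NA / wavelength            # coherent cutoff frequency (1/um)

# One-dimensional pupil sampled in spatial frequency.
u = np.linspace(-4 * cutoff, 4 * cutoff, 4001)
du = u[1] - u[0]
pupil = (np.abs(u) <= cutoff).astype(float)

# Incoherent OTF ~ autocorrelation of the pupil, normalized to 1 at zero frequency.
otf = np.correlate(pupil, pupil, mode="same") * du
otf /= otf.max()

# The OTF reaches zero at 2*NA/lambda, twice the coherent cutoff.
for freq in (0.0, cutoff, 2 * cutoff):
    idx = np.argmin(np.abs(u - freq))
    print(round(freq, 3), round(otf[idx], 3))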
FIGURE 3.28 Typical incoherent optical transfer function (OTF) and coherent contrast transfer function (CTF) plotted against spatial frequency (cy/mm).
The incoherent transfer function is twice as wide as the coherent transfer function, indicating that the cutoff frequency is twice that for coherent illumination. The limiting resolution for incoherent illumination becomes

R = 0.25λ/NA
Although Rayleigh's criterion for incoherent illumination describes the point beyond which resolution is no longer possible, it does not give an indication of image quality at lower frequencies (corresponding to larger feature sizes). The OTF is a description not only of the limiting resolution but also of the modulation at spatial frequencies up to that point.
FIGURE 3.29 Intensity point spread function (PSF) for an incoherent system.
The OTF is generally normalized to 1.0. The magnitude of the OTF is the modulation transfer function (MTF), which is the form most commonly used. The MTF ignores phase information transferred by the system, which can be described using the phase transfer function (PTF). Because of the linear properties of incoherent imaging, the OTF, MTF, and PTF are independent of the object. Knowledge of the pupil shape and the lens aberrations is sufficient to completely describe the OTF. For coherent and partially coherent systems, there are no such metrics that are object independent.
3.4.2 Evaluation of Partial Coherent Imaging
For coherent or partially coherent imaging, the ratio of image modulation to object modulation is object dependent, making the situation more complex than for incoherent imaging. The concept of a transfer function can still be utilized, but limitations should be kept in mind. As previously shown, the transfer function of a coherent imaging system is proportional to the pupil function itself. The cutoff frequency corresponds exactly to the Rayleigh criterion for coherent illumination. For partially coherent systems, the transfer function is neither the pupil function nor its autocorrelation, resulting in a more complex situation. The evaluation of images requires a summation of coherent images correlated by the degree of coherence at the mask. A partially coherent transfer function must include a unique description of both the illumination system and the lens. Such a transfer function is commonly referred to as a cross transfer function or the transmission cross coefficient [15]. For a mask object with equal lines and spaces, the object amplitude distribution can be represented as

f(x) = a_0 + 2 Σ_{n=1}^{∞} a_n cos(2πnux)
where x is image position and u is spatial frequency. From partial coherence theory, the aerial image intensity distribution becomes

I(x) = A + B cos(2πu_0x) + C cos²(2πu_0x)

which is valid for u ≥ (1 + σ)/3. The terms A, B, and C are given by

A = a_0² T(0,0) + 2a_1² [T(u_1,u_2) − T(−u_1,u_2)]
B = 4a_0a_1 Re[T(0,u_2)]
C = 4a_1² T(−u_1,u_2)

where T(u_1,u_2) is the transmission cross coefficient, a measure of the phase correlation at the two frequencies u_1 and u_2. Image modulation can be calculated as M = B/(A + C). The concepts of an MTF can be extended to partially coherent imaging if generated for each object uniquely. Steel [16] developed approximations to an exact expression for the MTF for partially coherent illumination. Such normalized MTF curves (denoted as MTF_p curves) can be generated for various degrees of partial coherence [17], as shown in Figure 3.30. In systems with few aberrations, the impact of changes in the degree of partial coherence can be evaluated for any unique spatial frequency. By assuming a linear change in MTF between spatial frequencies u_1 and u_2, a correlation factor G(σ,u) can be calculated that relates the incoherent MTF_INC to the partially coherent
FIGURE 3.30 Partially coherent MTF_p curves for σ values from 0.3 to 0.7 for a 365-nm, 0.37-NA diffraction-limited system.
MTF_p:

u_1 = (1 − σ)NA/λ,  u_2 = (1 + 0.18σ)NA/λ

G(σ,u) = 1/[1 − (4/π) sin(uλ/2NA)]                                            for u ≤ u_1
G(σ,u) = [1 − (4/π) sin(u_2λ/2NA)(u − u_1)/(u_2 − u_1)] / [1 − (4/π) sin(u_2λ/2NA)]   for u_1 < u < u_2
G(σ,u) = 1                                                                     for u_2 < u
The partially coherent MTF becomes

MTF_P(σ,u) = G(σ,u) MTF_INC(u)

Using MTF curves such as those in Figure 3.30 for a 0.37-NA i-line system, partial coherence effects can be evaluated. With a partial coherence of 0.3, the modulation at a spatial frequency of 1150 cycles/mm (corresponding to 0.43-μm lines) is near 0.35. Using a σ of 0.7, modulation increases by 71%. At 950 cycles/mm, however (corresponding to 0.53-μm lines), modulation decreases as partial coherence increases. The requirements of photoresist materials need to be addressed to determine the appropriate σ value for a given spatial frequency. The concept of the critical modulation transfer function (CMTF) is a useful approximation for relating the minimum modulation required for a photoresist material. The minimum required modulation for a resist with contrast γ can be determined as

CMTF = (10^{1/γ} − 1)/(10^{1/γ} + 1)
For a resist material with a γ of 2, a CMTF of 0.52 results. At this modulation, large σ values are best suited for the system depicted in Figure 3.30, and resolution is limited to somewhere near 0.38 μm. Optimization of partial coherence will be further addressed as additional image metrics are introduced. As linearity is related to the coherence properties of mask illumination, stationarity is related to aberration properties across a specific image field. For a lens system to meet the requirements of stationarity, an isoplanatic patch needs to be defined in the image plane where the transfer function and PSF do not significantly change. Shannon [18] described this region as "much larger than the dimension of the significant detail to be examined on the image surface" but "small compared to the total area of the image." A real lens, therefore, requires a set of evaluation metrics across the field; the number required for any lens will be a function of required performance and financial or technical capabilities. Although a large number of OTFs will better characterize a lens, more than a few may be impractical. Because an OTF will degrade with defocus, a position of best focus is normally chosen for lens characterization.

3.4.3 Other Image Evaluation Metrics
MTF or comparable metrics are limited to periodic features or gratings of equal lines and spaces. Other metrics may be used for the evaluation of image quality by measuring some aspect of an aerial image with less restriction on feature type. These may include measurements of image energy, image shape fidelity, critical image width, and image slope. Because feature width is a critical parameter for lithography, aerial image width is a useful metric for insight into the performance of resist images. A 30% intensity threshold is commonly chosen for image width measurement [19]. Few of these metrics, though, give an adequate representation of the impact of aerial image quality on resist process latitude. Through measurement of the aerial image log slope, or ILS, an indication of resist process performance can be obtained [20]. Exponential attenuation of radiation through an absorbing photoresist film leads to an exposure profile that is related to the aerial image intensity through the gradient of its logarithm, d(ln I)/dx. Because an exposure profile leads to a resist profile upon development, measurement of the slope of the log of an aerial image (at the mask edge) can be directly related to a resist image. Changes in this log aerial image gradient will, therefore, directly influence resist profile and process latitude. Using ILS as an image metric, aerial image plots such as those in Figure 3.27 can be more thoroughly evaluated. Shown in Figure 3.31 is a plot of image log slope versus partial coherence for features of size R = 0.4–0.6λ/NA. As can be seen, for increasing levels of partial coherence, the image log slope increases for features smaller than 0.5λ/NA and decreases for features larger than 0.5λ/NA. It is important to notice, however, that all cases converge to a similar image log slope value. Improvements for small features achieved by increasing partial coherence values cannot improve the aerial image in a way equivalent to decreasing wavelength or increasing NA. To determine the minimum usable ILS value and optimize situations such as the one above for use with a photoresist process, resist requirements need to be considered.
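Because ILS and NILS are defined directly on the aerial image, they can be extracted from any sampled intensity profile. The fragment below is a minimal sketch; the cosine image model, modulation, and feature size are assumed purely for illustration and are not taken from the figures. It differentiates ln I numerically, evaluates the log slope at the nominal feature edge, and normalizes by the feature width.

import numpy as np

# Assumed aerial image model: an idealized two-beam image with modulation m.
pitch, m = 0.5, 0.6                     # um, unitless modulation (example values)
width = pitch / 2.0                     # nominal line width for equal lines/spaces
x = np.linspace(-pitch, pitch, 4001)
I = 0.5 * (1.0 + m * np.cos(2.0 * np.pi * x / pitch))

# ILS = d(ln I)/dx evaluated at the nominal feature edge, x = width/2.
log_slope = np.gradient(np.log(I), x)
edge = np.argmin(np.abs(x - width / 2.0))
ILS = abs(log_slope[edge])              # units of 1/um
NILS = ILS * width                      # dimensionless

print(round(ILS, 2), round(NILS, 2))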
As the minimum image modulation required for a resist (CMTF) has been related to resist contrast properties, there is also a relationship between resist performance and minimum ILS requirements.
FIGURE 3.31 Image log slope (ILS) versus partial coherence for dense features from 0.4λ/NA to 0.7λ/NA in size.
As bulk resist properties such as contrast may not be adequately related to process-specific responses such as feature size control, exposure latitude, or depth of focus, exposure matrices can provide usable depth-of-focus information for a resist-imaging system based on exposure and feature size specifications. Relating DOF to aerial image data for an imaging system can result in determination of a minimum ILS specification. Although ILS is feature size dependent (in units of μm−1), the image log slope normalized by multiplying it by the feature width is not. A minimum normalized image log slope (NILS) can then be determined for a resist-imaging system with less dependence on feature size. A convenient rule-of-thumb value for minimum NILS is between 6 and 8 for a single-layer positive resist with good performance. With image evaluation requirements established, Rayleigh's criterion can be revisited and modified for other situations of partial coherence. A more general form becomes

R = k_1 λ/NA
where k_1 is a process-dependent factor that incorporates everything in a lithography process that is not wavelength or numerical aperture. Its importance should not be minimized, as any process or system modification that allows improvements in resolution effectively reduces the k_1 factor. Diffraction-limited values are 0.25 for incoherent and 0.50 for coherent illumination, as previously shown. For partial coherence, k_1 can be expressed as

k_1 = 1/[2(σ + 1)]
where the minimum resolution is that which places the ± first diffraction order energy within the objective lens pupil, as shown in Figure 3.26.
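A short sketch of this resolution scaling follows; the wavelength, NA, and σ values are illustrative assumptions, not recommendations.

def rayleigh_resolution(wavelength_nm, na, sigma):
    """Minimum half-pitch R = k1*lambda/NA with k1 = 1/(2*(sigma + 1))."""
    k1 = 1.0 / (2.0 * (sigma + 1.0))
    return k1, k1 * wavelength_nm / na

# Example i-line and KrF cases with increasing partial coherence (assumed values).
for wavelength_nm, na in ((365.0, 0.5), (248.0, 0.6)):
    for sigma in (0.0, 0.5, 0.9):
        k1, r = rayleigh_resolution(wavelength_nm, na, sigma)
        print(wavelength_nm, na, sigma, round(k1, 3), round(r, 1), "nm")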
3.4.4 Depth of Focus
Depth of focus needs to be considered along with resolution criteria when imaging with a lens system. Depth of focus is defined as the distance along the optical axis that produces an image of some suitable quality. The Rayleigh depth of focus generally takes the form

DOF = ± k_2 λ/NA²
where k_2 is also a process-dependent factor. For a resist material of reasonably high contrast, k_2 may be on the order of 0.5. A process-specific value of k_2 can be defined by determining the resulting useful DOF after specifying exposure latitude and tolerances. DOF decreases linearly with wavelength and as the square of numerical aperture. As measures are taken to improve resolution, it is therefore more desirable to decrease wavelength than to increase NA. Depth of focus is closely related to defocus, the distance along the optical axis from the best focus position. The acceptable level of defocus for a lens system will determine the usable DOF. Tolerable levels of this aberration will ultimately be determined by the entire imaging system as well as the feature sizes of interest. To understand the interdependence of image quality and focus, defocus can be thought of as a deviation from a perfect spherical wave emerging from the exit pupil of a lens toward an image point. This is analogous to working backward through an optical system, where a true point source in image space would correspond to a perfect spherical wave at the lens exit pupil. As shown in Figure 3.32, the deviation of an actual wavefront from an unaberrated wavefront can be measured in terms of an optical path difference (OPD). The OPD in a medium is the product of the geometrical path length and the refractive index. For a point object, an ideal spherical wavefront leaving the lens pupil is represented by a dashed line. This wavefront will come to focus as a point in the image plane. Compared to this reference wavefront, a defocused wavefront (one that would focus at a point some distance from the image plane) introduces error in the optical path distance to the image plane. This error increases with pupil radius. The resulting image will generally no longer resemble a point; instead, it will be blurred. The acceptable DOF for a lithographic process can be determined by relating OPD to phase error. An optical path is best measured in terms of the number (or fraction) of corresponding waves. OPD is realized, therefore, as a phase-shifting effect or phase error (Φ_err) that can be expressed as

Φ_err = (2π/λ) OPD
FIGURE 3.32 Depiction of optical path error (δ) introduced with defocus. Both reference (S) and defocused (W) wavefronts pass through the center of the objective lens pupil.
By determining the maximum allowable phase error for a process, an acceptable level of defocus can be determined. Consider again Figure 3.32. The optical path difference can be related to defocus (δ) as

OPD = δ(1 − cos θ) = δ[sin²θ/2 + sin⁴θ/4 + sin⁶θ/8 + …]

OPD ≈ (δ/2) sin²θ = Φ_err λ/2π

for small angles (the Fresnel approximation). Defocus can now be expressed as

δ = Φ_err λ/(π sin²θ) = Φ_err λ/(π NA²)

A maximum phase error term can be determined by defining the maximum allowable defocus that will maintain process specifications. DOF can, therefore, be expressed in terms of corresponding defocus (δ) and phase error (Φ_err/π) terms through use of the process factor k_2,

DOF = ± k_2 λ/NA²
as previously seen. If the distribution of mask frequency information in the lens pupil is considered, it is seen that the impact of defocus is realized as the zero and first diffraction orders travel different optical path distances. For coherent illumination, the zero order experiences no OPD while the ± first orders go through a pupil-dependent OPD. It follows that only features that have sufficiently important information (i.e., first diffraction orders) at the edge of the lens aperture will possess a DOF as calculated by the full lens NA. For larger features, whose diffraction orders are distributed closer to the lens center, DOF will be substantially higher. For dense features of pitch p, an effective NA can be determined for each feature size that can subsequently be used for DOF calculation:

NA_effective ≈ λ/p
As an example, consider dense 0.5-μm features imaged using coherent illumination with a 0.50-NA objective lens and 365-nm illumination. The first diffraction orders for these features are contained within the lens aperture at an effective NA of 0.365 rather than 0.50. The resulting DOF (for a k_2 of 0.5) is, therefore, closer to ±1.37 μm rather than to the ±0.73 μm determined for the full NA. The distribution of diffraction orders needs to be considered in the case of partial coherence. By combining the wavefront description in Figure 3.32 with the frequency distribution description in Figure 3.26, DOF can be related to partial coherence as shown in Figure 3.33. For coherent illumination, there is a discrete difference in the optical path length traveled between diffraction orders. By using partial coherence, however, there is an averaging effect of OPD over the lens pupil. By distributing frequency information over a broad portion of the lens pupil, the difference in path lengths experienced between diffraction orders is reduced. In the limit of complete incoherence, the zero and first diffraction orders essentially share the same pupil area, effectively eliminating the effects of defocus (something that is possible only in the absence of any higher order diffraction terms).
FIGURE 3.33 An increase in DOF will result from an increase in partial coherence as path length differences are averaged across the lens pupil. In the limit for incoherent illumination, the zero and first diffraction orders fill the lens pupil, and DOF is theoretically infinite (in the absence of higher orders).
This can be seen in Figure 3.34, which is similar to Figure 3.31 except that a large defocus value has been incorporated. Here, it is seen that at higher partial coherence values, ILS remains high, indicating that a greater DOF is possible.
3.5 Imaging Aberrations and Defocus
Discussion of geometrical image formation has so far been limited to the paraxial region that allows determination of the size and location of an image for a perfect lens. In reality, some degree of lens error, or aberration, exists in any lens, causing deviation from this first-order region. For microlithographic lenses, an understanding of the tolerable level of aberrations and their interrelationships becomes more critical than for most other optical applications. To understand their impact on image formation, aberrations can be classified by their origin and effects. Commonly referred to as the Seidel aberrations, these include the monochromatic aberrations (spherical, coma, astigmatism, field curvature, and distortion) as well as chromatic aberration. A brief description of each aberration will be given along
FIGURE 3.34 ILS versus partial coherence for dense features with λ/NA² of defocus. Degradation is minimal with higher σ values.
with the effect of each on image formation. In addition, defocus is considered as an aberration and will be addressed. Although each aberration is discussed individually, all aberration types will nearly always be present at some level.

3.5.1 Spherical Aberration
Spherical aberration is a variation in focus as a function of radial position in a lens. Spherical aberration exists for objects either on or off the optical axis. Figure 3.35 shows the situation for a distant on-axis point object, where rays passing through the lens near the optical axis come into focus nearer the paraxial focus than rays passing through the edge of the lens. Spherical aberration can be measured as either a longitudinal (or axial) or a transverse (or lateral) error. Longitudinal spherical aberration is the distance from the paraxial focus to the axial intersection of a ray. Transverse spherical aberration is similar, but it is measured in the vertical direction. Spherical aberration is often represented graphically in terms of ray
FIGURE 3.35 Spherical aberration for an on-axis point object.
FIGURE 3.36 Longitudinal spherical aberration (LAR) plotted against ray height (YR).
height as in Figure 3.36, where longitudinal error (LAR) is plotted against ray height at the lens (YR). The effect of spherical aberration on a point image is a blurring effect or the formation of a diffuse halo by peripheral rays. The best image of a point object is no longer located at the paraxial focus; instead, it is at the position of the circle of least confusion. Longitudinal spherical aberration increases as the square of the aperture, and it is influenced by lens shape. In general, a positive lens will produce an undercorrection of spherical aberration (a negative value), whereas a negative lens will produce an overcorrection. As with most primary aberrations, there is also a dependence on object and image position. As an object changes position, for example, ray paths change, leading to potential increases in aberration levels. If a lens system is scaled up or down, aberrations are also scaled. This scaling would lead to a change in field size but not in numerical aperture. A simple system that is scaled up by 2× with a 1.5× increase in NA, for example, would lead to a 4.5× increase in longitudinal spherical aberration.

3.5.2 Coma
Coma is an aberration of object points that lie off axis. It is a variation in magnification with aperture that produces an image point with a diffuse comet-like tail. As shown in Figure 3.37, rays passing through the center and edges of a lens are focused at different heights. Tangential coma is measured as the distance between the height of the lens rim ray and the lens center ray. Unlike spherical aberration, comatic flare is not symmetric, and locating the point image is sometimes difficult. Coma increases with the square of the lens aperture and also with field size. Coma can be reduced, therefore, by stopping down the lens and limiting field size. It can also be reduced by shifting the aperture and optimizing the field angle. Unlike spherical aberration, coma is linearly influenced by lens shape. Coma is positive for a negative meniscus lens and decreases to negative for a positive meniscus lens.
FIGURE 3.37 Coma for an off-axis object point. Rays passing through the center and edges of a lens are focused at different heights.
3.5.3 Astigmatism and Field Curvature
Astigmatism is also an off-axis aberration. With astigmatism present, rays that lie in different planes do not share a common focus. Consider, for instance, the plane that contains the chief ray and the optical axis, known as the tangential plane. The plane perpendicular to this, which also contains the chief ray, is called the sagittal plane. Rays in the tangential plane will come to focus at the tangential focal surface, as shown in Figure 3.38. Rays in the sagittal plane will come to focus at the sagittal focal surface, and if these two do not coincide, the intermediate surface is called the medial image surface. If no astigmatism exists, all surfaces coincide with the lens field curvature, called the Petzval curvature. Astigmatism does not exist for on-axis points and increases with the square of field size. Undercorrected astigmatism exists when the tangential surface is to the left of the sagittal surface. Overcorrection exists when the situation is reversed. Point images in the presence of astigmatism generally exhibit circular or elliptical blur. Field curvature results in a Petzval surface that is not a plane. This prevents imaging of point objects in focus on a planar surface. Field curvature and astigmatism are closely related and must be considered together if methods of field flattening are used for correction.

3.5.4 Distortion
Distortion is a radial displacement of off-axis image points, essentially a field variation in magnification. If an increase in magnification occurs as distance from field center
FIGURE 3.38 Astigmatism for an off-axis object point. Rays in different planes do not share a common focus.
increases, a pincushion or overcorrected distortion exists. For a decrease in magnification, barrel distortion results. Distortion is expressed either as a dimensional error or as a percentage. It varies as the third power of field size dimensionally, or as the square of field size in terms of percent. The location of the aperture stop will greatly influence distortion.

3.5.5 Chromatic Aberration
Chromatic aberration is a change in focus with wavelength. Because the refractive index of glass materials is not constant with wavelength, the refractive properties of a lens will vary. Generally, glass dispersion is negative, meaning that refractive index decreases as wavelength increases. This leads to an increase in refraction for shorter wavelengths and to image blurring when multiple wavelengths are used for imaging. Figure 3.39 shows longitudinal chromatic aberration for two wavelengths, a measure of the separation of the two focal positions along the optical axis. For this positive lens, there is a shortening of focal length with decreasing wavelength, or undercorrected longitudinal chromatic aberration. The effects of chromatic aberration are of great concern when light is not temporally coherent. For most primary aberrations, some degree of control is possible by sacrificing aperture or field size. Generally, these methods are not sufficient to provide adequate reduction, and methods of lens element combination are utilized. Lens elements with opposite aberration sign can be combined to correct for a specific aberration. Chromatic and spherical aberration can be reduced through use of an achromatic doublet, where a positive element (biconvex) is used in contact with a negative element (negative meniscus or planoconcave). On its own, the positive element possesses undercorrected spherical as well as undercorrected chromatic aberration. The negative element on its own has both overcorrected spherical and overcorrected chromatic aberration. If the positive element is chosen to have greater power as well as lower dispersion than the negative element, positive lens power can be maintained while chromatic aberration is reduced. To address the reduction of spherical aberration with the doublet, the glass refractive index is also considered. As shorter wavelengths are considered for lens systems, the choice of suitable optical materials becomes limited. At wavelengths below 300 nm, few glass types exist, and aberration correction, especially for chromatic aberration, becomes difficult. Although aberration correction can be quite successful through the balancing of several elements of varying power, shape, and optical properties, it is difficult to correct a lens over the entire aperture. A lens is corrected for rays at the edge of the lens. This results in either overcorrection or undercorrection in different zones of the lens. Figure 3.40, for example, is a plot of longitudinal spherical aberration (LA) as a function of field height. At the center of the field, no spherical aberration exists. This lens has been corrected so that
FIGURE 3.39 Chromatic aberration for an on-axis point using two wavelengths. For a positive lens, focal length is shortened with decreasing wavelength.
FIGURE 3.40 Spherical aberration corrected on-axis and at the edge of the field. The largest aberration (−) is at a 70% zone position.
no spherical aberration also exists at the edge of the field. Other portions of the field exhibit undercorrection, and positions outside the field edge become overcorrected. The worst-case zone here is near 70%, which is common for many lens systems. Figure 3.41 shows astigmatism or field curvature plotted as a function of image height. For this lens, there exists one position in the field where tangential and sagittal surfaces coincide or astigmatism is zero. Astigmatism is overcorrected closer to the axis (relative to the Petzval surface), and it is undercorrected farther out.
FIGURE 3.41 Astigmatism and field curvature plotted as a function of image height. One field position exists where the surfaces coincide.
3.5.6 Wavefront Aberration Descriptions
For reasonably small levels of lens aberrations, analysis can be accomplished by considering the wave nature of light. As demonstrated for defocus, each primary aberration will produce unique deviations in the wavefront within the lens pupil. An aberrated pupil function can be described in terms of wavefront deformation as

P(r,θ) = H(r,θ) exp[i(2π/λ)W(r,θ)]

The pupil function is represented in polar coordinates, where W(r,θ) is the wavefront aberration function and H(r,θ) is the pupil shape, generally circular. Each aberration can, therefore, be described in terms of the wavefront aberration function W(r,θ). Table 3.1 shows the mathematical description of W(r,θ) for the primary aberrations: spherical, coma, astigmatism, and defocus. As an example, defocus aberration can be described in terms of wavefront deformation. Using Figure 3.32, the aberration of the wavefront W relative to the reference wavefront S is the OPD between the two. The defocus wave aberration W(r) increases with aperture as [21]

W(r) = (n/2)(1/R_s − 1/R_w) r²

where R_s and R_w are the radii of the two spherical surfaces. Longitudinal defocus is defined as (R_s − R_w). Defocus wave aberration is proportional to the square of the aperture distance, as previously seen. Shown in Figure 3.42 through Figure 3.45 are three-dimensional plots of defocus, spherical, coma, and astigmatism as wavefront OPD in the lens pupil. The plots represent differences between an ideal spherical wavefront and an aberrated wavefront. For each case, 0.25 waves of each aberration are present. Higher order aberration terms also produce unique and related shapes in the lens pupil.

FIGURE 3.42 Defocus aberration (r^2) plotted as pupil wavefront deformation. Total OPD is 0.25λ.

TABLE 3.1
Mathematical Description for Primary Aberrations and Values of Peak-to-Valley Aberrations

Aberration              W(r,θ)                   W_p–v
Defocus                 Ar^2                     A (a)
Spherical               Ar^4                     A
Balanced spherical      A(r^4 − r^2)             A/4
Coma                    Ar^3 cos θ               2A
Balanced coma           A(r^3 − 2r/3) cos θ      2A/3
Astigmatism             Ar^2 cos^2 θ             A
Balanced astigmatism    (A/2)r^2 cos 2θ          A

(a) The A coefficient represents the peak value of an aberration.

3.5.7 Zernike Polynomials
Balanced aberrations are desired to minimize the variance within a wavefront. Zernike polynomials describe balanced aberrations in terms of a set of coefficients that are orthogonal over a unit circle [22]. The polynomial can be expressed in Cartesian (x,y) or polar (r,θ) terms, and it can be applied to rotationally symmetrical as well as nonsymmetrical systems. Because these polynomials are orthogonal, each term individually represents a best fit to the aberration data.
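The low-order fringe Zernike terms can be assembled into a pupil wavefront map with a few lines of code. The sketch below uses arbitrary example coefficients (in waves) for only four of the terms listed in Table 3.2; it evaluates W(r,θ) on a unit-radius pupil grid and forms the aberrated pupil function P = H·exp(i2πW) with W expressed in waves.

import numpy as np

# Pupil grid in normalized coordinates (r = 1 at the pupil edge).
n = 256
y, x = np.mgrid[-1:1:1j * n, -1:1:1j * n]
r = np.hypot(x, y)
theta = np.arctan2(y, x)
H = (r <= 1.0).astype(float)                     # clear circular pupil

# A few fringe Zernike terms (Table 3.2) with assumed coefficients in waves.
coeffs = {
    "defocus":     (0.10, 2 * r**2 - 1),
    "astigmatism": (0.05, r**2 * np.cos(2 * theta)),
    "coma_x":      (0.05, (3 * r**3 - 2 * r) * np.cos(theta)),
    "spherical":   (0.10, 6 * r**4 - 6 * r**2 + 1),
}
W = sum(c * z for c, z in coeffs.values())       # wavefront error in waves

# Aberrated pupil function; multiplying a mask spectrum by P and inverse
# transforming would give the aberrated coherent image amplitude.
P = H * np.exp(1j * 2 * np.pi * W)
rms = np.sqrt(np.mean(W[H > 0] ** 2) - np.mean(W[H > 0]) ** 2)
print(round(rms, 4), "waves RMS over the pupil")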
Generally, fringe Zernike coefficient normalization to the pupil edge is used in lens design, testing, and simulation. Other normalizations do exist, including renormalization to the root-mean-square (RMS) wavefront aberration. The fringe Zernike coefficients are shown in Table 3.2 along with the corresponding primary aberrations.

3.5.8 Aberration Tolerances
For OPD values less than a few wavelengths of light, aberration levels can be considered small. Because any amount of aberration results in image degradation, tolerance levels must be established for lens systems, dependent on the application. This results in the need to consider not only specific object requirements and illumination but also resist requirements.
FIGURE 3.43 Primary spherical aberration (r^4).
FIGURE 3.44 Primary coma aberration (r^3 cos θ).
For microlithographic applications, resist and process capability will ultimately influence the allowable lens aberration level. Conventionally, an acceptably diffraction-limited lens is one that produces no more than one quarter-wavelength (λ/4) of wavefront OPD. For many nonlithographic lens systems, the reduced performance resulting from this level of aberration is allowable. To measure image quality as a result of lens aberration, the distribution of energy in an intensity PSF (or Airy disk) can be evaluated. The ratio of energy at the center of an aberrated point image to the energy at the center of an unaberrated point image is known as the Strehl ratio, as shown in Figure 3.46. For an aberration-free lens, of course, the Strehl ratio is 1.0. For a lens with λ/4 OPD, the Strehl ratio is 0.80, nearly independent of the specific primary aberration types present. This is conventionally known as the Rayleigh λ/4 rule [23]. A general rule of thumb is that the effects on image quality are similar for identical levels of primary wavefront aberration. Table 3.3 shows the relationship between peak-to-valley (P–V) OPD, RMS OPD, and Strehl ratio. For low-order aberrations, RMS OPD can be related
FIGURE 3.45 Primary astigmatism (r^2 cos^2 θ).
TABLE 3.2
Fringe Zernike Polynomial Coefficients and Corresponding Aberrations

Term  Fringe Zernike Polynomial                                        Aberration
1     1                                                                Piston
2     r cos(a)                                                         X Tilt
3     r sin(a)                                                         Y Tilt
4     2r^2 − 1                                                         Defocus
5     r^2 cos(2a)                                                      3rd Order astigmatism
6     r^2 sin(2a)                                                      3rd Order 45° astigmatism
7     (3r^3 − 2r) cos(a)                                               3rd Order X coma
8     (3r^3 − 2r) sin(a)                                               3rd Order Y coma
9     6r^4 − 6r^2 + 1                                                  3rd Order spherical
10    r^3 cos(3a)                                                      3rd Order X three foil
11    r^3 sin(3a)                                                      3rd Order Y three foil
12    (4r^4 − 3r^2) cos(2a)                                            5th Order astigmatism
13    (4r^4 − 3r^2) sin(2a)                                            5th Order 45° astigmatism
14    (10r^5 − 12r^3 + 3r) cos(a)                                      5th Order X coma
15    (10r^5 − 12r^3 + 3r) sin(a)                                      5th Order Y coma
16    20r^6 − 30r^4 + 12r^2 − 1                                        5th Order spherical
17    r^4 cos(4a)
18    r^4 sin(4a)
19    (5r^5 − 4r^3) cos(3a)
20    (5r^5 − 4r^3) sin(3a)
21    (15r^6 − 20r^4 + 6r^2) cos(2a)                                   7th Order astigmatism
22    (15r^6 − 20r^4 + 6r^2) sin(2a)                                   7th Order 45° astigmatism
23    (35r^7 − 60r^5 + 30r^3 − 4r) cos(a)                              7th Order X coma
24    (35r^7 − 60r^5 + 30r^3 − 4r) sin(a)                              7th Order Y coma
25    70r^8 − 140r^6 + 90r^4 − 20r^2 + 1                               7th Order spherical
26    r^5 cos(5a)
27    r^5 sin(5a)
28    (6r^6 − 5r^4) cos(4a)
29    (6r^6 − 5r^4) sin(4a)
30    (21r^7 − 30r^5 + 10r^3) cos(3a)
31    (21r^7 − 30r^5 + 10r^3) sin(3a)
32    (56r^8 − 105r^6 + 60r^4 − 10r^2) cos(2a)                         9th Order astigmatism
33    (56r^8 − 105r^6 + 60r^4 − 10r^2) sin(2a)                         9th Order 45° astigmatism
34    (126r^9 − 280r^7 + 210r^5 − 60r^3 + 5r) cos(a)                   9th Order X coma
35    (126r^9 − 280r^7 + 210r^5 − 60r^3 + 5r) sin(a)                   9th Order Y coma
36    252r^10 − 630r^8 + 560r^6 − 210r^4 + 30r^2 − 1                   9th Order spherical
37    924r^12 − 2772r^10 + 3150r^8 − 1680r^6 + 420r^4 − 42r^2 + 1      11th Order spherical

Coefficients are normalized to the pupil edge.
For low-order aberration, RMS OPD can be related to P–V OPD by

RMS OPD = (P–V OPD)/3.5
The Strehl ratio can be used to understand a good deal about an imaging process. The PSF is fundamental to imaging theory and can be used to calculate the diffraction image of both coherent and incoherent objects. By convolving a scaled object with the lens system PSF, the resulting incoherent image can be determined. In effect, this becomes the summation of the irradiance distribution of the image elements. Similarly, a coherent image can be determined by adding the complex amplitude distributions of the image elements. Figure 3.47 and Figure 3.48 show the effects of various levels of aberration and defocus on the PSF for an otherwise ideal lens system. Figure 3.47a through Figure 3.47c show
FIGURE 3.46 Strehl ratio for an aberrated point image. The reduced central peak of the aberrated PSF relative to the unaberrated PSF (first zero at 0.61λ/NA) defines the Strehl ratio.
PSFs for spherical, coma, and astigmatism aberration at 0.15λ OPD levels. It is seen that the aberrations produce similar levels of reduced peak intensity. Energy distribution, however, varies somewhat with aberration type. Figure 3.48 shows how PSFs are affected by these primary aberrations combined with defocus. For each aberration type, defocus is fixed at 0.25λ OPD. To extend the evaluation of aberrated images to partially coherent systems, the use of the PSF (or OTF) becomes difficult. Methods of aerial image simulation can be utilized for lens performance evaluation. By incorporating lens aberration parameters into a scalar or vector diffraction model, most appropriately through the use of Zernike polynomial coefficients, aerial image metrics such as image modulation or ILS can be used. Figure 3.49 shows the results of a three-bar mask object imaged through an aberrated lens system at a partial coherence of 0.5. Figure 3.49a shows aerial images produced in the presence of 0.15λ OPD of spherical aberration with ±0.25λ OPD of defocus. Figure 3.49b and Figure 3.49c show the resulting images with coma and astigmatism, respectively.
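The use of the PSF described above can be illustrated with a fully incoherent, scalar sketch: an aberrated pupil function is transformed into an intensity PSF, and the image is the convolution of the object intensity with that PSF. The pupil radius, aberration coefficient, and three-bar geometry below are illustrative assumptions (Python/NumPy), not parameters taken from the figures.

```python
import numpy as np

# Scalar sketch: pupil function -> intensity PSF by FFT, then an incoherent
# image as the convolution of the object intensity with that PSF.
n = 512
f = np.fft.fftfreq(n)
fx, fy = np.meshgrid(f, f)
rho = np.hypot(fx, fy) / 0.25                 # pupil radius = 0.25 cycles/sample
pupil = (rho <= 1.0).astype(float)

w40 = 0.07                                    # primary spherical coefficient, waves
pupil_fn = pupil * np.exp(2j * np.pi * w40 * rho ** 4 * pupil)

psf = np.abs(np.fft.ifft2(pupil_fn)) ** 2     # intensity point spread function
psf /= psf.sum()

obj = np.zeros((n, n))                        # three-bar object (binary transmission)
for k in (-1, 0, 1):
    c = n // 2 + 16 * k
    obj[:, c - 4:c + 4] = 1.0

# Incoherent image: convolution of object intensity with the PSF via the FFT.
image = np.real(np.fft.ifft2(np.fft.fft2(obj) * np.fft.fft2(psf)))
print("peak incoherent image intensity:", round(float(image.max()), 3))
```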
TABLE 3.3
Relationship Between Peak-to-Valley OPD, RMS OPD, and Strehl Ratio

P-V OPD | RMS OPD | Strehl Ratio (a)
0.0 | 0.0 | 1.00
0.25 RL = λ/16 | 0.018λ | 0.99
0.5 RL = λ/8 | 0.036λ | 0.95
1.0 RL = λ/4 | 0.07λ | 0.80
2.0 RL = λ/2 | 0.14λ | 0.4

(a) Strehl ratios below 0.8 do not provide for a good metric of image quality.
FIGURE 3.47 Point spread functions for 0.15λ of primary (a) spherical aberration, (b) coma, and (c) astigmatism.
Figure 3.49d shows the unaberrated aerial image through the same defocus range. These aerial image plots suggest that the allowable aberration level will be influenced by resist capability, as more capable resists and processes will tolerate larger levels of aberration.
FIGURE 3.48 Point spread functions for 0.15λ of primary aberrations combined with 0.25λ of defocus: (a) spherical aberration, (b) coma, and (c) astigmatism.
3.5.9 Microlithographic Requirements
FIGURE 3.49 Aerial images for three-bar mask patterns imaged with a partial coherence of 0.5: (a) 0.15λ OPD of spherical aberration with ±0.25λ OPD of defocus. Note that optimal focus is shifted positively. (b) Coma with defocus. Defocus symmetry remains, but positional asymmetry is present. (c) Astigmatism with defocus. Optimal focal position is dependent on orientation. (d) Aerial images with no aberration present.
It is evident from the preceding image plots that the Rayleigh λ/4 rule may not be suitable for microlithographic applications, where small changes in the aerial image can be translated into photoresist and result in substantial loss of process latitude. To establish allowable aberration tolerances, photoresist requirements need to be considered along with process specifications. For a photoresist with reasonably high contrast and reasonably low NILS requirements, a balanced aberration level of 0.05λ OPD and a Strehl ratio of 0.91 would have been acceptable a short while ago [24]. As process requirements are tightened, demands on a photoresist process will be increased to maintain process latitude at this level of aberration. As shorter wavelength technology is pursued, resist and process demands require that aberration tolerance levels be further reduced to 10% of this level. It is also important to realize that aberrations cannot be strictly considered independent as they contribute to image degradation in a lens. In reality, aberrations are balanced with one another to minimize the size of an image point in the image plane. Although asymmetric aberrations (i.e., coma, astigmatism, and lateral chromatic aberration) should be minimized for microlithographic lens application, this may not necessarily be the case for spherical aberration. This occurs because imaging is not carried out through a uniform medium toward an imaging plane; instead, it is through several material media and within a photoresist layer. Figure 3.50 shows the effects of imaging in photoresist with an aberration-free lens using a scalar diffraction model and a positive resist model [25] for simulation. These are plots of resist feature width as a function of focal position for various levels of exposure. Focal position is chosen to represent the resist top surface (zero position) as well as a range below (negative) and above (positive) the top surface. This focus-exposure matrix does not behave symmetrically throughout the entire focal range. The change in feature size with exposure is not equivalent for positive and negative defocus amounts, as seen in Figure 3.50.
FIGURE 3.50 Focus-exposure matrix plots (linewidth in microns versus focus in microns) for imaging of 0.6 μm dense features, 365 nm, 0.5 NA, 0.3σ. Spherical aberration levels are (a) −0.2λ, (b) −0.05λ, (c) −0.03λ, (d) 0.00λ, and (e) +0.20λ.
FIGURE 3.51 Resist linearity plots (resist linewidth versus mask linewidth, with a ±10% CD specification) from 0.35 to 1.0 μm for the imaging system of Figure 3.50 with positive resist and spherical aberration of −0.10λ and +0.10λ. Linearity is improved with the presence of positive spherical aberration.
Figure 3.50d is a focus-exposure matrix plot for positive resist and an unaberrated objective lens. Figure 3.50a through e are plots for systems with various amounts of primary spherical aberration, showing how CD slope and asymmetry are affected through focus. For positive spherical aberration, an increase in through-focus CD slope is observed, whereas, for small negative aberration, a decrease results. For this system, 0.03λ of negative spherical aberration produces better symmetry and process latitude than no aberration at all. The opposite would occur for a negative resist. It is questionable whether such techniques would be appropriate to improve imaging performance, because some degree of process dedication would be required. Generally, a lithographic process is optimized for the smallest feature detail present; however, optimal focus and exposure may not coincide for larger features. Feature size linearity is also influenced by lens aberration. Figure 3.51 shows a plot of resist feature size versus mask feature size for various levels of spherical aberration. Linearity is also strongly influenced by photoresist response. These influences of photoresist processes and lens aberration on lithographic performance can be understood by considering the nonlinear response of photoresist to an aerial image. Consider a perfect aerial image with a modulation of 1.0 and infinite image log slope, such as would result from collection of all diffraction orders. If this image is used to expose photoresist of any reasonable contrast, a resist image with near-perfect modulation could result. In reality, small-feature aerial images do not have unity modulation; instead, they have a distribution of intensity along the x-y plane. Photoresist does not behave linearly with intensity, nor is it a high-contrast threshold detector. Imaging into a resist film is dependent on the distribution of the aerial image intensity and resist exposure properties. Resist image widths are not equal at the top and at the bottom of the resist. Some unique optimum focus and exposure exist for every feature/resist process/imaging system combination, and any system or process changes will affect features differently.
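The nonlinear resist response discussed above can be illustrated with a much simpler model than the positive resist model of Reference 25: a constant-threshold resist applied to a one-dimensional aerial image whose modulation falls off through focus. All of the numbers below (pitch, modulation, threshold, doses) are illustrative assumptions (Python/NumPy).

```python
import numpy as np

# Constant-threshold resist sketch (a simplification of the positive resist
# model referenced in the text).  All numbers are illustrative assumptions.
pitch = 0.8                                   # um
x = np.linspace(-pitch / 2, pitch / 2, 4001)  # one period centered on a line

def aerial_image(defocus_um):
    # Biased cosine whose modulation degrades quadratically with defocus.
    m = max(0.0, 0.6 * (1.0 - 0.6 * defocus_um ** 2))
    return 0.5 * (1.0 - m * np.cos(2 * np.pi * x / pitch))

def printed_cd(image, dose, threshold=0.5):
    # Resist clears wherever delivered intensity exceeds the threshold;
    # the remaining (uncleared) band around x = 0 is the printed line.
    uncleared = image * dose <= threshold
    return np.ptp(x[uncleared]) if uncleared.any() else 0.0

for focus in (0.0, 0.5, 1.0):
    cds = [printed_cd(aerial_image(focus), dose) for dose in (0.9, 1.0, 1.1)]
    print(f"defocus {focus:.1f} um -> CD (um) at relative dose 0.9/1.0/1.1:",
          ", ".join(f"{c:.3f}" for c in cds))
```

The widening spread of CD across dose as defocus grows mirrors the loss of exposure latitude seen in the focus-exposure plots.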
3.6 Optical Materials and Coatings
Several properties of optical materials must be considered in order to effectively design, optimize, and fabricate optical components. These properties include transmittance,
reflectance, refractive index, surface quality, chemical and mechanical stability, and purity. Transmittance, reflectance, and absorbance are fundamental material properties that are generally determined by the glass type and structure, and they can be described locally using optical constants.
3.6.1 Optical Properties and Constants
Transmittance through an optical element will be affected by the internal absorption of the material and external reflectances at its surfaces. Both of these properties can be described for a given material thickness (t) through the complex refractive index

n̂ = n(1 + ik)

where n is the real component of the refractive index, and k is the imaginary component, also known as the extinction coefficient. These constants can be related to a material's dielectric constant (ε), permeability (μ), and conductivity (σ), for real σ and ε, as

n²(1 − k²) = με
n²k = μσ/ν

where ν is the frequency. Internal transmittance for a homogeneous material is dependent on material absorbance (α) by Beer's law

I(t) = I(0) exp(−αt)

where I(0) is the incident intensity, and I(t) is the transmitted intensity through the material thickness t. Transmittance becomes I(t)/I(0). Transmittance cascades through an optical system through multiplication of individual element transmittance values. Absorbance, expressed as −(1/t) ln(transmittance), is additive through an entire system. External reflection at optical surfaces occurs as light passes from a medium of one refractive index to a medium of another. For materials with nonzero absorption, surface reflection (from air) can be expressed as

R = |n(1 + ik) cos θi − n₁ cos θt|² / |n(1 + ik) cos θi + n₁ cos θt|²

where n and n₁ are the medium refractive indices, θi is the incident angle, and θt is the transmitted angle. For normal incidence in air, this becomes

R = [n²(1 + k²) + 1 − 2n] / [n²(1 + k²) + 1 + 2n]

This simplifies for nonabsorbing materials in air to

R = (n − 1)² / (n + 1)²
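A short numerical illustration of these transmittance and reflectance relations follows (Python); the index, extinction coefficient, absorption coefficient, and path length are assumed values, not material data from the text.

```python
import math

# Numerical illustration of the relations above; n, k, alpha, and t are
# assumed values, not material data from the text.
n, k = 1.56, 0.0        # real index and extinction coefficient
alpha = 0.05            # absorption coefficient, 1/cm
t = 1.0                 # internal path length, cm

internal_T = math.exp(-alpha * t)                  # Beer's law, I(t)/I(0)
absorbance = -(1.0 / t) * math.log(internal_T)     # additive through a system

# Normal-incidence reflectance in air for an absorbing surface ...
R = (n ** 2 * (1 + k ** 2) + 1 - 2 * n) / (n ** 2 * (1 + k ** 2) + 1 + 2 * n)
# ... reducing to (n-1)^2/(n+1)^2 when k = 0.
R_nonabsorbing = (n - 1) ** 2 / (n + 1) ** 2

# Element transmittance: two surface reflections plus the internal loss
# (multiple internal reflections are neglected in this sketch).
element_T = (1 - R) ** 2 * internal_T
print(f"R = {R:.4f} (k=0 form: {R_nonabsorbing:.4f}), "
      f"absorbance = {absorbance:.3f}, element T = {element_T:.3f}")
```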
FIGURE 3.52 Frequency dependence of refractive index (n) near a resonant frequency ν₀. Values approach 1.0 at the high and low frequency extremes.
Because refractive index is wavelength dependent, transmission, reflection, and refraction cannot be treated as constant over any appreciable wavelength range. The real refractive index for optical materials may behave as shown in Figure 3.52 where a large spectral range is plotted and areas of index discontinuity occur. These transitions represent absorption bands in a glass material that generally occur in the UV and infrared (IR) regions. For optical systems operating in or near the visible region, refractive index is generally well behaved and can be described through the use of dispersion equations such as a Cauchy equation [26]
n = a + b/λ² + c/λ⁴ + ⋯
where the constants a, b, and c are determined by substituting known index and wavelength values between absorption bands. For optical systems operating in the UV or IR, absorption bands may limit the application of many otherwise suitable optical materials.
3.6.2 Optical Materials Below 300 nm
Optical lithography below 300 nm is made difficult by the increase in absorption in optical materials. Few transparent materials exist below 200 nm, limiting design and fabrication flexibility in optical systems. Refractive projection systems are possible at these short wavelengths, but they require consideration of issues concerning aberration effects and radiation damage. The optical characteristics of glasses in the UV are important when considering photolithographic systems containing refractive elements. As wavelengths below 250 nm are utilized, issues of radiation damage and changes in glass molecular structure become additional concerns. Refraction in insulators is limited by interband absorption at the material's band gap energy, Eg. For 193 nm radiation, a photon energy of E ≈ 6.4 eV limits optical materials to those with relatively large band gaps. Halide crystals, including CaF2, LiF, BaF2, MgF2, and NaF, and amorphous SiO2 (fused silica) are the few materials that possess large enough band gaps and have suitable transmission below 200 nm. Table 3.4 shows experimentally determined band gaps and UV cutoff wavelengths of several halide crystals and fused silica [27]. The UV cutoff wavelength is determined as hc/Eg.
TABLE 3.4
Experimentally Determined Band Gaps and UV Cutoff Wavelengths for Selected Materials

Material | Eg (eV) | λc = hc/Eg (nm)
BaF2 | 8.6 | 144
CaF2 | 9.9 | 126
MgF2 | 12.2 | 102
LiF | 12.2 | 102
NaF | 11.9 | 104
SiO2 | 9.6 | 130
The performance of fused silica, in terms of environmental stability, purity, and manufacturability, makes it a superior candidate in critical UV applications such as photolithographic lens components, beam delivery systems, and photomasks. Although limiting the number of available materials to fused silica does introduce optical design constraints (for correction of aberrations, including chromatic), the additional use of materials such as CaF2 and LiF does not provide a large increase in design flexibility because of the limited additional refractive index range (ni at 193 nm is 1.492 for CaF2, 1.521 for LiF, and 1.561 for fused silica [28]). Energetic particles (such as electrons and x-rays) and short-wavelength photons have been shown to alter the optical properties of fused silica [29]. Furthermore, because of the high peak power of pulsed lasers, optical damage through rearrangement is possible with excimer lasers operating at wavelengths of 248 and 193 nm [30]. Optical absorption and luminescence can be caused by a lack of stoichiometry in the fused silica molecular matrix. Changes in structure can come about through absorption of radiation and energy transfer processes. E′ color centers in type III fused silica (wet fused silica synthesized directly by flame hydrolysis of silicon tetrachloride in a hydrogen–oxygen flame [31]) have been shown to exist at 2.7 eV (458 nm), 4.8 eV (260 nm), and 5.8 eV (210 nm) [32].
3.7 Optical Image Enhancement Techniques
3.7.1 Off-Axis Illumination
Optimization of the partial coherence of an imaging system has been introduced for circular illuminator apertures. By controlling the distribution of diffraction information in the objective lens, maximum image modulation can be obtained. An illumination system can be further refined by considering illumination apertures that are not necessarily circular. Shown in Figure 3.53 is a coherently illuminated mask grating imaged through an objective lens. Here, the ±1 diffraction orders are distributed symmetrically around the zeroth order. As previously seen in Figure 3.33, when defocus is introduced, an OPD between the zeroth and the ±first orders results. The acceptable depth of focus is dependent on the extent of the OPD and the resulting phase error introduced. Figure 3.54 shows a system where illumination is obliquely incident on the mask at an angle such that the zeroth and first diffraction orders are distributed on alternate sides of the optical axis. Using reasoning similar to that used for incoherent illumination, it can be shown that the minimum k factor for this oblique condition of partially coherent illumination is 0.25.
FIGURE 3.53 Coherently illuminated mask grating and objective lens. Only the 0 and ±1st diffraction orders are collected.
The illumination angle is chosen uniquely for a given wavelength, NA, and feature size, and it can be calculated for dense features as sin⁻¹(0.5λ/d) for NA = 0.5λ/d, where d is the feature pitch. The most significant impact of off-axis illumination is realized when considering focal depth. In this case, the zeroth and first diffraction orders now travel an identical path length regardless of the defocus amount. The consequence is a depth of focus that is effectively infinite. In practice, limiting illumination to allow for one narrow beam or pair of beams leads to zero intensity. Also, imaging is limited to features oriented along one direction in an x-y plane. To overcome this, an annular or ring aperture can be employed that delivers illumination at the needed angles with a finite ring width to allow some finite intensity, as shown in Figure 3.55a. The resulting focal depth is less than that for the ideal case, but an improvement over a full circular aperture can be achieved.
FIGURE 3.54 Oblique or off-axis illumination of a mask grating where the 0 and 1st diffraction orders coincide in the lens pupil.
FIGURE 3.55 Off-axis illumination schemes for projection imaging: (a) annular, (b) quadrupole with horizontal and vertical poles, and (c) quadrupole with diagonal poles.
For most integrated circuit applications, features can be limited to horizontal and vertical orientation, and quadrupole configurations may be more suitable. For the quadrupole configuration shown in Figure 3.55b, two beams are optimally off axis for one feature direction, whereas the opposite two beams are optimal for the orthogonal orientation. There is an offsetting effect between the two sets of poles for both feature directions. An alternative configuration is depicted in Figure 3.55c where poles are at diagonal positions oriented 45° to horizontal and vertical mask features. Here, each beam is off axis to all mask features, and minimal image degradation occurs. Either the annular or quadrupole off-axis system would need to be optimized for a specific feature size and would provide non-optimal illumination for all others. Consider, for instance, features that are larger than those optimal for a given illumination angle. Only at angles corresponding to sin⁻¹(0.5λ/d) do mask frequency components coincide. With smaller features, higher frequency components do not overlap, and additional spatial frequency artifacts are introduced. This can lead to a possible degradation of imaging performance. For the optimal quadrupole situation with poles oriented at diagonal positions, resolution to 0.25λ/NA is not possible as it is with the two-pole or the horizontal/vertical quadrupole. As shown in Figure 3.56, the minimum resolution becomes λ/(2√2·NA).
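A small worked example of these relations follows (Python); the wavelength, NA, and pitches are assumed values used only to show how the optimum pole angle, its pupil position σ, and the quoted resolution limits scale.

```python
import math

# Optimum off-axis angle theta = asin(0.5*lambda/d) for dense pitch d, also
# expressed as the pupil position sigma = sin(theta)/NA.  Wavelength, NA, and
# the pitches are assumed example values.
wavelength, NA = 0.193, 0.75   # um
for pitch in (0.26, 0.30, 0.40):
    s = 0.5 * wavelength / pitch
    print(f"pitch {pitch:.2f} um: theta = {math.degrees(math.asin(s)):5.2f} deg, "
          f"sigma = {s / NA:.2f}")

print("minimum resolution, two-pole or H/V quadrupole:",
      round(0.25 * wavelength / NA, 4), "um")
print("minimum resolution, diagonal quadrupole:      ",
      round(wavelength / (2 * math.sqrt(2) * NA), 4), "um")
```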
FIGURE 3.56 Optimal quadrupole illumination with diagonal poles. Pole size and position can be specified in relative sigma values, σpole = NApole/NA0 and σcenter = NAcenter/NA0. The minimum resolution, Rmin = λ/(2√2·NA0), is also derived.
3.7.1.1 Analysis of OAI
To evaluate the impact of off-axis illumination on image improvement, consider the electric field for a binary grating mask illuminated by two discrete beams as shown in Figure 3.54. The normalized amplitude or electric field distribution can be represented as
A(x) = 0.25[2 cos θ + cos(2πx/λ + θ) + cos(2πx/λ − θ)]

which can be derived by multiplying the electric field of a coherently illuminated mask by e^(iθ) and e^(−iθ) and summing. The resulting aerial image takes the form

I(x) ∝ |E(x)|² = (1/32)[6 + 8 cos(2πx/λ) + 2 cos(4πx/λ) + 6 cos(2θ) + 4 cos(2πx/λ + 2θ) + 4 cos(2πx/λ − 2θ) + cos(2(2πx/λ + θ)) + cos(2(2πx/λ − θ))]

The added frequency terms present can lead to improper image reconstruction compared to that for an aerial image resulting from simple coherent illumination:

E(x) = (1/2)[1 + cos(2πx/λ)]

I(x) = (1/8)[3 + 4 cos(2πx/λ) + cos(4πx/λ)]
The improvement in the aerial image for two-beam illumination is seen when using a lens with NA < λ/p. With two-beam illumination, high-frequency artifact terms are not passed by the lens, but information beyond the zeroth order is acted upon, expressed as

I(x) = (1/32)[5 + 4 cos(2πx/λ) + 4 cos(2θ) + 4 cos(2πx/λ − 2θ) + cos(2(2πx/λ − θ))]

At the optimum illumination angle, spatial frequency vectors are symmetrical about the optical axis, and the aerial image simplifies to

I(x) = (9/32)[1 + cos(2πx/λ)]
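Evaluating these image expressions numerically makes the harmonic content explicit. The sketch below (Python/NumPy) uses the normalized coordinate u = x/λ that appears in the expressions above and compares the Fourier content of the coherent three-beam image with the optimally off-axis two-beam image.

```python
import numpy as np

# Harmonic content of the coherent three-beam image versus the optimally
# off-axis two-beam image, using the expressions above with u = x/lambda.
u = np.linspace(0.0, 2.0, 2048, endpoint=False)   # two periods of the image

I_three_beam = (3 + 4 * np.cos(2 * np.pi * u) + np.cos(4 * np.pi * u)) / 8.0
I_two_beam = (9.0 / 32.0) * (1 + np.cos(2 * np.pi * u))

for name, img in (("three-beam (coherent)", I_three_beam),
                  ("two-beam (off-axis) ", I_two_beam)):
    spectrum = np.abs(np.fft.rfft(img)) / img.size
    # Over this two-period window the fundamental sits in bin 2 and its
    # first harmonic in bin 4; the two-beam image has no harmonic term.
    print(f"{name}: fundamental = {spectrum[2]:.3f}, harmonic = {spectrum[4]:.4f}")
```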
There are no higher "harmonic" frequencies present in the aerial image produced with off-axis illumination. This becomes evident by comparing three-beam interference (0 and ±1st orders) with two-beam interference (0 and 1st orders only). Under coherent illumination, three-beam interference results in a cosine biased by the amplitude of the zeroth order. The amplitude of the zeroth-order bias is less than the amplitude of the first-order cosine, resulting in sidelobes at twice the spatial frequency of the mask features, as seen in Figure 3.20. With off-axis illumination and two-beam interference, the electric field is represented by an unbiased cosine, resulting in a frequency-doubled resolution and no higher frequency effects.
3.7.1.2 Isolated Line Performance
By considering grating features, optical analysis of off-axis and conventional illumination can be quite straightforward. When considering more isolated features, however, diffraction orders are less discrete. Convolving such a frequency representation with either illumination poles or an annular ring will result in diffraction information distributed over a range of angles. An optimal angle of illumination that places low-frequency information out at the full numerical aperture of the objective lens will distribute most energy at non-optimal angles. Isolated line performance is, therefore, minimally enhanced by off-axis illumination. Any improvement is also significantly reduced as the pole or ring width is increased. When both dense and isolated features are considered together in a field, it follows that the dense-to-isolated feature size bias, or proximity effect, will be affected by off-axis illumination [33]. Figure 3.57 shows, for instance, the decrease in image CD bias between dense and isolated 0.35-μm features for increasing levels of annular illumination using a 0.55 NA i-line exposure system. As the obscuration in the condenser lens pupil is increased (resulting in annular illumination of decreasing ring width), dense-to-isolated feature size bias decreases. As features approach 0.25λ/NA, however, larger amounts of energy go uncollected by the lens, which may lead to an increase in this bias, as seen in Figure 3.58. Off-axis illumination schemes have been proposed by which the modulation of nonperiodic features could be improved [34]. Resolution improvement for off-axis illumination requires multiple mask pattern openings for interference, leading to discrete diffraction orders. Small auxiliary patterns can be added close to an isolated feature to allow the required interference effects. By adding features below the resolution cutoff of an imaging system (0.2λ/NA, for example) and placing them at optimal distances so that their side lobes coincide with the main feature lobes (0.7λ/NA, for instance), peak amplitude and image log slope can be improved [35].
FIGURE 3.57 Image CD bias versus annular illumination (dense and isolated linewidth and CD bias plotted against inner sigma). Inner sigma values correspond to the amount of obscuration in the condenser lens pupil. Partial coherence σ (outer) is 0.52 for 0.35 μm features using 365 nm illumination and 0.55 NA. Defocus is 0.5 μm. As central obscuration is increased, image CD bias increases.

FIGURE 3.58 Similar conditions as Figure 3.57 for 0.22 μm features. Image CD bias is now reversed with increasing inner sigma.
Higher order lobes of isolated feature diffraction patterns can be further enhanced by adding additional 0.2λ/NA spaces at corresponding distances [36]. Various arrangements are possible as shown in Figure 3.59. This figure shows arrangements for opaque line/space patterns, an isolated opaque line, clear line/space patterns, and an isolated clear space. The image enhancement offered by using these techniques is realized as focal depth is considered. Figure 3.60 through Figure 3.62 show the DOF improvement for a five-bar space pattern through focus. Figure 3.60 shows aerial images through λ/NA² (±0.74 μm) of defocus for 0.5λ/NA features using conventional illumination with σ = 0.5. Figure 3.61 gives results using quadrupole illumination. Figure 3.62 shows aerial images through focus for the same feature width with auxiliary patterns smaller than 0.2λ/NA and off-axis illumination. An improvement in DOF is apparent with minimal intensity in the dark field. Additional patterns would be required to increase peak intensity, which may be improved by as much as 20%. Another modification of off-axis illumination has been introduced that modifies the illumination beam profile [37]. This modified beam illumination technique fills the condenser lens pupil with weak quadrupoles where energy is distributed within and between poles, as seen in Figure 3.63. This has been demonstrated to allow better control of DOF and proximity effects for a variety of feature types.
FIGURE 3.59 Arrangement of additional auxiliary patterns to improve isolated line and CD bias performance using OAI: (a) opaque dense features, (b) clear dense features, (c) opaque isolated features, and (d) clear isolated features.
3.7.2 Phase Shift Masking
Up to this point, control of the amplitude of a mask function has been considered, and phase information has been assumed to be nonvarying. It has already been shown that the spatial coherence or phase relation of light is responsible for interference and diffraction effects. It would follow, therefore, that control of phase information at the mask may allow additional manipulation of imaging performance. Consider the situation in Figure 3.64 where two rectangular grating masks are illuminated with coherent illumination. The conventional "binary" mask in Figure 3.64a produces an electric field that varies from 0 to 1 as a transition is made from opaque to transparent regions. The minimum numerical aperture that can be utilized for this situation is one that captures the zero and ±first diffraction orders, or NA ≥ λ/p. The lens acts on this information to produce a cosinusoidal amplitude image appropriately biased by the zeroth diffraction order.
FIGURE 3.60 Aerial image intensity for 0.37 μm features through ±0.5λ/NA² of defocus using σ = 0.5, 365 nm, and 0.5 NA.
FIGURE 3.61 Aerial images as in Figure 3.60 using OAI.
The aerial image is proportional to the square of the amplitude image. Now consider Figure 3.64b where a π "phase shifter" is added (or subtracted) at alternating mask openings, creating an electric field at the mask that varies from −1 to +1, where a negative amplitude represents a π phase shift (a π/2 phase shift would be 90° out of the paper, 3π/2 would be 90° into the paper, and so forth). Analysis of this situation can be simplified if the phase shift mask function is decomposed into separate functions, one for each state of phase, where m(x) = m₁(x) + m₂(x).
FIGURE 3.62 Aerial images for features with 0.08λ/NA auxiliary patterns and OAI. Note the improvement in minimum intensity of the outermost features at greatest defocus.
FIGURE 3.63 Modified illumination profiles for conventional and OAI. (Reproduced from Ogawa, T., Uematsu, M., Ishimaru, T., Kimura, M., and Tsumori, T., SPIE, 2197, 19, 1994.)
The first function, m₁(x), can be described as a rectangular wave with a pitch equal to four times the space width:

m₁(x) = rect(x/(p/2)) ∗ comb(x/2p)

The second mask function, m₂(x), can be described as

m₂(x) = rect(x/(3p/2)) ∗ comb(x/2p) − 1

The spatial frequency distribution becomes

M(u) = F{m(x)} = F{m₁(x)} + F{m₂(x)}

which is shown in Figure 3.65. It is immediately noticed that the zero term is removed through the subtraction of the centered impulse function, δ(x). Also, the distribution of the diffraction orders has been defined by a comb(u) function with one half the frequency required for a conventional binary mask. The minimum lens NA required is that which captures the ±first diffraction orders, or λ/2p. The resulting image amplitude, pupil filtered and distributed to the wafer, is an unbiased cosine with a frequency of one half the mask pitch. When the image intensity is considered (I(x) = |A(x)|²), the result is a squared cosine with the original mask pitch. Intensity minimum points are ensured as the amplitude function passes through zero. This "forced zero" results in minimum intensity transfer into photoresist, a situation that will not occur for the binary case as shown. For coherent illumination, a lens acting on this diffracted information has a 50% decrease in the numerical aperture required to capture these primary orders. Alternatively, for a given lens numerical aperture, a mask that utilizes such alternating aperture phase shifters can produce a resolution twice that possible using a conventional binary mask.
FIGURE 3.64 Schematic of (a) a conventional binary mask and (b) an alternating phase shift mask. The mask electric field, image amplitude, and image intensity are shown for each.
Next, consider image degradation through defocus or other aberrations. For the conventional case, the resulting intensity image becomes an average of cosines with decreased modulation. The ability to maintain a minimum intensity becomes more difficult as the aberration level is increased. For the phase-shifted mask case, the minimum intensity remains exactly zero, increasing the likelihood that photoresist can reproduce a usable image. For the phase-shifted mask, because features one half the size can be resolved, the minimum resolution can be expressed as

R = 0.25λ/NA

As the partial coherence factor is increased from zero, the impact of this phase shift technique is diminished to a point at which, for incoherent illumination, no improvement is realized for phase shifting over the binary mask.
FIGURE 3.65 Spatial frequency distribution M(u) resulting from coherent illumination of an alternating phase shift mask, as decomposed into m₁(x) and m₂(x). Diffraction orders appear at ±λ/2p, with no zeroth-order term.
To evaluate the improvement of phase shift masking over conventional binary masking, the electric field at the wafer, neglecting higher order terms, can be considered:

E(x) = cos(πx/λ)

The intensity in the aerial image is approximated by

I(x) = (1/2)[1 + cos(2πx/λ)]

which is comparable to that for off-axis illumination. In reality, higher order terms will affect DOF. Phase shift masking may, therefore, result in a lower DOF than fully optimized off-axis illumination. The technique of phase shifting alternating features on a mask is appropriately called alternating phase shift masking. Phase information is modified by either adding or subtracting optical material from the mask substrate at a thickness that corresponds to a π phase shift [38,39]. Figure 3.66 shows two wave trains traveling through a transparent refracting medium (a glass plate), both in phase on entering the material. The wavelength of light as it enters the medium from air is compressed by a factor proportional to the refractive index at that wavelength. Upon exiting the glass plate into air, the initial wavelength of the wavefronts is restored. If one wave train travels a greater optical path length than the other, a shift in phase between the two will result. By controlling the relationship between the respective optical path distances traveled over the area of some refracting medium with refractive index ni, a phase shift can be produced as follows:

Δφ = (2π/λ)(ni − 1)t

where t is the shifter thickness. The required shifter thicknesses for a π phase shift at 365, 248, and 193 nm wavelengths in fused silica are 3720, 2470, and 1850 Å, respectively. At shorter wavelengths, less phase shift material thickness is required. Depending on the mask fabrication technique, this may limit the manufacturability of these types of phase shift masks for short UV wavelength exposures. Generally, a phase shift can be produced by using either thin-film deposition and delineation or direct glass etch methods. Both techniques can introduce process control problems. In order to control phase shifting to within ±5°, a reasonable requirement for low-k-factor lithography, i-line phase shifter thickness must be held to within 100 Å in fused silica. For 193 nm lithography, this becomes 50 Å.
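The quoted etch depths can be checked against the phase relation above. The sketch below (Python) simply inverts Δφ = π = (2π/λ)(n − 1)t to show the refractive index each quoted thickness implies.

```python
# Refractive index implied by each quoted pi-shift etch depth,
# from delta_phi = (2*pi/lambda)(n - 1) t with delta_phi = pi.
for wavelength_nm, t_nm in ((365, 372.0), (248, 247.0), (193, 185.0)):
    n_implied = 1.0 + wavelength_nm / (2.0 * t_nm)
    print(f"{wavelength_nm} nm: t = {t_nm:.0f} nm -> implied n = {n_implied:.3f}")
```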
FIGURE 3.66 Diagram of wave train propagation through phase-shifted and unshifted positions of a mask. For a substrate of thickness d and index n′ > 1.0 with a phase shifter of thickness t and index n″ > 1.0 in air (n = 1.0), the accumulated phases are Φ₁ = (2π/λ)(n′d + t) for the unshifted path and Φ₂ = (2π/λ)(n′d + n″t) for the shifted path.
If etching techniques cannot operate within this tolerance level over large mask substrates (in a situation where an etch stop layer is not present), the application of etched glass phase shift masks for IC production may be limited to longer wavelengths. There also exists a trade-off between the phase errors allowed through fabrication techniques and those allowed through increasing partial coherence. As partial coherence is increased above zero, higher demands are placed on phase shifter etch control. If etch control ultimately places a limitation on the maximum partial coherence allowed, the issue of exposure throughput becomes a concern. Variations on the alternating phase shift mask have been developed to allow for application to nonrepetitive structures [40]. Figure 3.67 shows several approaches where phase-shifting structures are applied at or near the edge of isolated features. These rim phase-shifting techniques do not offer the resolution-doubling improvement of the alternating approach, but they do produce a similar forced zero in intensity at the wafer because of a phase transition at feature edges. The advantage of these types of schemes is their ability to be applied to arbitrary feature types. As with the alternating phase shift mask, these rim masks require film deposition and patterning or glass etch processing and may be difficult to fabricate for short UV wavelength applications. In addition, pattern placement accuracy for these features, which are sub-0.25 k factor in size, is increasingly challenging as wavelength decreases. Other phase shift mask techniques make use of a phase-only transition and destructive interference at edges [41]. A "chromeless" phase edge technique, as shown in Figure 3.67, requires a single mask patterning step and produces intensity minima at the wafer plane at each mask phase transition. When used with a sufficiently optimized resist process, this can result in resolution well beyond the Rayleigh limit. Resist features as small as k = 0.20 have been demonstrated with this technique, which introduces opportunities for application, especially for critical isolated feature levels. An anomaly of using such structures is the addition of phase transitions at every shifter edge. To eliminate the resulting intensity dips produced at these edges, multiple-level masks have been used [42]. After exposure with the chromeless phase edge mask, a binary chrome mask can be utilized to eliminate undesired field artifacts. An alternative way to reduce these unwanted phase edge effects is to engineer into the mask additional phase levels such as 60° and 120° [43]. To achieve such a phase combination, two phase etch process steps are required during mask fabrication. This may ultimately limit application.
FIGURE 3.67 Various phase shift mask schemes: (a) etch outriggers, (b) additive rim shifters, (c) etched rim shifters, and (d) chromeless phase shift mask.
Variations on these phase-shifting schemes include a shifter-shutter structure that allows control over feature width and reduces field artifacts, and a clear-field approach using sub-Rayleigh-limit grating or checkerboard structures [36]. Each of these phase shift masking approaches requires some level of added mask and process complexity. In addition, none of these techniques can be used universally for all feature sizes, shapes, or parity. An approach that can minimize mask design and fabrication complexity may gain the greatest acceptance for application to manufacturing. An attenuated phase shift mask (APSM) may be such an approach, where conventional opaque areas on a binary mask are replaced with partially transmitting regions (5%–15%) that produce a π phase shift with respect to clear regions. This is a phase shift mask approach that has evolved out of x-ray masking, where attenuators inherently possess some degree of transparency [44]. As shown in Figure 3.68, such a mask will produce a mask electric field that varies from 1.0 to −0.1 in amplitude (for a 10% transmitting attenuator) with a shift in phase, represented by a transition from a positive electric field component to a negative one.
FIGURE 3.68 A 10% attenuated phase shift mask. A π phase shift and 10% transmission are achieved in the attenuated regions. The transition of the mask electric field through zero ensures a minimum in the aerial image intensity.
The electric field at the wafer possesses a loss of modulation, but it retains the phase change and transition through zero. Squaring the electric field results in an intensity with a zero minimum. Recent work in areas of attenuated phase shift masking has demonstrated both resolution and focal depth improvement for a variety of feature types. Attenuated phase shift mask efforts at 365, 248, and 193 nm have shown a near doubling of focal depth for features on the order of k = 0.5 [45,46]. As such technologies are considered for IC mask fabrication, practical materials that can satisfy both the 180° phase shift and the required transmittance at wavelengths down to 193 nm need to be investigated. A single-layer APSM material is most attractive from the standpoint of process complexity, uniformity, and control. The optimum degree of transmission of the attenuator can be determined through experimental or simulation techniques. A maximum image modulation or image log slope is desired while maintaining a minimum printability level of side lobes formed from intensity within shadowed regions. Depending on feature type and size and on resist processes, APSM transmission values between 4 and 15% may be appropriate. In addition to meeting the optical requirements for appropriate phase shift and transmission properties, an APSM material must be able to be patterned using plasma etch techniques, have high etch selectivity to fused silica, be chemically stable, have high absorbance at alignment wavelengths, and not degrade with exposure. These requirements may ultimately limit the number of possible candidates for practical mask application. Phase shifting in a transparent material is dependent on a film's thickness, real refractive index, and the wavelength of radiation, as seen earlier. To achieve a phase shift of 180°, the required film thickness becomes

t = λ/(2(n − 1))
The requirements of an APSM material demand that films are absorbing, i.e., that they possess a nonzero extinction coefficient (k). This introduces additional phase-shifting contributions from film interfaces that can be determined by

Φ = arg[2n₂/(n₁ + n₂)]

where n₁ is the complex refractive index (n + ik) of the first medium and n₂ is the complex refractive index of the second [47]. These additional phase terms are non-negligible as k increases, as shown in Figure 3.69. In order to determine the total phase shift resulting from an absorbing thin film, material and interface contributions need to be accounted for. To deliver both phase shift and transmission requirements, the film absorption (α) or extinction coefficient (k) is considered:

α = 4πk/λ

where α is related to transmission as T = e^(−αt). In addition, mask reflectivity below 15% is desirable and can be related to n and k through the Fresnel equation for normal incidence:

R = [(n − 1)² + k²] / [(n + 1)² + k²]
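Combining the thickness, transmission, and reflectance relations above gives a quick screening calculation for a candidate single-layer attenuator. The film constants n and k below are assumed for illustration only; they are not values from the text.

```python
import math

# Screening calculation for a single-layer attenuator using the relations
# above.  The film constants n and k are assumed for illustration only.
wavelength = 248.0          # nm
n, k = 1.9, 0.42            # assumed attenuator optical constants

t = wavelength / (2.0 * (n - 1.0))        # thickness for a 180 degree shift
alpha = 4.0 * math.pi * k / wavelength    # absorption coefficient, 1/nm
T = math.exp(-alpha * t)                  # film transmission
R = ((n - 1.0) ** 2 + k ** 2) / ((n + 1.0) ** 2 + k ** 2)

print(f"t = {t:.1f} nm, T = {100 * T:.1f}%, R = {100 * R:.1f}%")
```

With these assumed constants, the film lands inside the 4%–15% transmission window and below the 15% reflectivity guideline mentioned above.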
In order to meet all optical requirements, a narrow range of material optical constants is suitable at a given exposing wavelength.

FIGURE 3.69 Additional phase terms resulting at interfaces as a function of n and k.

Both chromium oxydinitride-based and
molybdenum silicon oxynitride-based materials have been used as APSM materials at 365 nm. For shorter wavelength applications, these materials become too opaque. Alternative materials have been introduced that, through modification of material composition or structure, can be tailored for optical performance at wavelengths from 190 to 250 nm [48]. These materials include understoichiometric silicon nitride, aluminum-rich aluminum nitride, and other metal oxides, nitrides, and silicides. The usefulness of these materials in production may ultimately be determined by their ability to withstand short wavelength exposure radiation. In general, understoichiometric films possess some degree of instability that may result in optical changes during exposure.
3.7.3 Mask Optimization, Biasing, and Optical Proximity Compensation
When considering one-dimensional imaging, features can often be described through use of fundamental diffraction orders. Higher order information lost through pupil filtering leads to less square-wave image reconstruction and a loss of aerial image integrity. With a high-contrast resist, such a degraded aerial image can be used to reconstruct a near-square-wave relief image. When considering two-dimensional imaging, the situation becomes more complex. Whereas mask-to-image width bias for a simplified one-dimensional case can be controlled via exposure/process or physical mask feature size manipulation, for two-dimensional imaging there are high-frequency interactions that need to be considered. Loss or redistribution of high-frequency information results in such things as corner or contact rounding that may influence device performance. Other problems encountered when considering complex mask patterns are the fundamental differences between imaging isolated lines, isolated spaces, contacts, and dense features. The reasons for these differences are many-fold. First, a partially coherent system is not linear in either amplitude or intensity. As we have seen, only an incoherent system is linear in intensity, and only a coherent system is linear in amplitude. Therefore, it should not be expected that an isolated line and an isolated space feature are complementary. In addition, photoresist is a nonlinear detector, responding differently to the thresholds introduced by these two feature types. This reasoning can be extended to the concept of
mask biasing. At first guess, it may be reasoned that a small change in the size of a mask feature would result in a near-equivalent change in resist feature width or at least aerial image width. Neither is possible because, in addition to the nonlinearity of the imaging system, biasing is not a linear operation. Differences in image features of various types are also attributed to the fundamental frequency representation of dense versus isolated features. Dense features can be suitably represented by discrete diffraction orders using coherent illumination. Orders are distributed with some width for incoherent and partially coherent illumination. Isolated features, on the other hand, can be represented as some fraction of a sinc function for coherent illumination, distributed across the frequency plane for incoherent and partially coherent illumination. In terms of frequency information, these functions are very different. Figure 3.70 shows the impact of partial coherence on dense-to-isolated feature bias for 0.6λ/NA features. Dense lines (equal lines and spaces) print smaller than isolated lines for low values of partial coherence. At high partial coherence values, the situation is reversed. There also exists some optimum where the dense-to-isolated feature bias is near zero. Variations in exposure, focus, aberrations, and resist process will also have effects. Through characterization of the optical and chemical processes involved in resist patterning, image degradation can be predicted. If the degradation process is understood, small feature biases can be introduced to account for losses. This predistortion technique is often referred to as optical proximity compensation (OPC); it is not a true correction in that lost diffraction detail is not accounted for. Mask biasing for simple shapes can be accomplished with an iterative approach, but complex geometry or large fields probably require rule-based computation schemes [49]. Generally, several adequate solutions are possible. Those that introduce the least process complexity are chosen for implementation. Figure 3.71a shows a simple two-dimensional mask pattern and the resulting simulated resist image for a k₁ = 0.5 process. Feature rounding is evident at both inside and outside corners. The image degradation can be quantified by several means. Possible approaches may be to measure linear deviation, area deviation, or radius deviation. Figure 3.71b shows a biased version of the same simple pattern and the resulting simulated aerial image. Comparisons of the two images show the improvement realized with such correction schemes. The advantage of these techniques is the relatively low cost of implementation.
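Rule-based OPC schemes of the kind cited above amount to lookup tables of bias versus local pitch (and feature size). The sketch below is purely hypothetical; the rule values are invented for illustration and are not taken from Reference 49.

```python
# Hypothetical rule-based bias table: (maximum pitch in um, bias in um).
# The values are invented for illustration; a real rule deck is built from
# measured or simulated proximity data.
BIAS_RULES = [(0.8, 0.02), (1.2, 0.01), (1.6, 0.005), (float("inf"), 0.0)]

def biased_width(drawn_width_um, pitch_um):
    """Mask width after applying the pitch-dependent bias rule."""
    for max_pitch, bias in BIAS_RULES:
        if pitch_um <= max_pitch:
            return drawn_width_um + bias
    return drawn_width_um

for pitch in (0.8, 1.2, 2.0):
    print(f"pitch {pitch:.1f} um: drawn 0.40 um -> mask {biased_width(0.40, pitch):.3f} um")
```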
FIGURE 3.70 The variation in dense-to-isolated feature bias with partial coherence. Feature size is plotted against pitch for σ = 0.3, 0.5, 0.7, and 0.9. For low σ values, dense features (2× pitch) print smaller than isolated features (6× pitch). For high σ values, the situation is reversed (365 nm, 0.5 NA, 0.4 μm lines, positive resist).
FIGURE 3.71 Simple two-dimensional mask patterns (a) without OPC and the resulting resist image, and (b) with OPC and the resulting resist image.
3.7.4 Dummy Diffraction Mask
A technique of illumination control at the mask level is possible that offers resolution improvement similar to that for off-axis illumination [50]. Here, two separate masks are used. In addition to a conventional binary mask, a second diffraction mask composed of line/space or checkerboard phase patterns is created with 180° phase shifting between patterns. Coherent light incident on the diffraction mask is diffracted by the phase grating as shown in Figure 3.72. When the phase grating period is chosen so that the angle of diffraction is sin⁻¹(λ/p), the first diffraction orders from the phase mask will deliver illumination at an optimum off-axis angle to the binary mask. There is no energy in the phase diffraction pattern on axis (no DC term), and higher orders have less energy than the first. For a line/space phase grating mask, illumination is delivered to the binary mask as with off-axis two-pole illumination. For a checkerboard phase grating mask, a situation similar to quadrupole illumination results.
FIGURE 3.72 Schematic of a grating diffraction mask (pitch 2p) used to produce off-axis illumination, at sin θ₁ = λ/2p, for imaging of a primary mask of pitch p.
A basic requirement for such an approach is that the phase mask and the binary mask are sufficiently far apart to allow far-field diffraction effects from the phase mask to dominate. This distance is maximized for coherent illumination on the order of 2p/λ, where 2p is the phase mask grating period. As partial coherence is increased, a collection of illumination angles exists. This will decrease image contrast as well as maximum intensity and decrease the required mask separation distance. The tolerance to phase error has been shown to be greater than ±10%. Angular misregistration of 10° may also be tolerable. Resolution capability for coherent illumination is identical to that for alternating phase shift masking and off-axis illumination. This approach is, however, limited to periodic mask features.
3.7.5 Polarized Masks
So far, the amplitude and phase components of light for the design of lithographic masks have been considered. Light also possesses polarization characteristics that can be utilized to influence imaging performance [51]. Consider Figure 3.19 and Figure 3.24 where the zero and ±first diffraction orders are collected for an equal line/space object. Here the zeroth-order amplitude is A/2, and the ±first-order amplitudes are A/π. For a transverse electric (TE) state of linear polarization, these orders can be represented in terms of complex exponentials as

zeroth order: (A/2) (0, 1, 0)ᵀ exp[i(2π/λ)(0x + 0y + 1z)]
+first order: (A/π) (0, 1, 0)ᵀ exp[i(2π/λ)(ax + 0y + cz)]
−first order: (A/π) (0, 1, 0)ᵀ exp[i(2π/λ)(−ax + 0y + cz)]

For transverse magnetic (TM) polarization, these orders become

zeroth order: (A/2) (1, 0, 0)ᵀ exp[i(2π/λ)(0x + 0y + 1z)]
+first order: (A/π) (c, 0, −a)ᵀ exp[i(2π/λ)(ax + 0y + cz)]
−first order: (A/π) (c, 0, a)ᵀ exp[i(2π/λ)(−ax + 0y + cz)]

As previously shown, the sum of these terms produces the electric field at the wafer plane. The aerial images at the wafer plane for TE and TM polarization become

I_TE(x) = A/4 + (4A/π²) cos²(2πax/λ) + (2A/π) cos(2πax/λ)

I_TM(x) = A/4 + (4A/π²)[a² + (c² − a²) cos²(2πax/λ)] + (2Ac/π) cos(2πax/λ)

The normalized image log slope (NILS = ILS × linewidth) for each aerial image becomes

NILS_TE = 8

NILS_TM = 8√(1 − a²) / [1 + (16a²/π²)]

The second term in the NILS_TM expression is less than one, resulting in a lower NILS for TM polarization as compared to TE polarization. Therefore, there can be some benefit to using TE polarization over TM polarization or non-polarized light. Conventionally, polarized light has not been used for optical lithographic systems, but recent advances in catadioptric systems do require polarization control. For any system, it is difficult to illuminate all critical features with TE-only polarization through source control because feature orientation would be limited to one direction only. The concept of polarization modulation built into a mask itself has been introduced as a potential step for mask modification. This would require the development of new, probably single-crystalline, materials and processes. A polarized mask has been proposed as a means of accomplishing optimization of various feature orientations [52,53]. An alternating aperture polarization mask can also be imagined that could produce maximum image contrast.
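The NILS expressions above are easily evaluated numerically. In the sketch below (Python), the first-order direction cosine is taken as a = sin θ = λ/p for a grating of pitch p (with c = cos θ); the wavelength and pitches are assumed example values.

```python
import math

# NILS for TE and TM three-beam imaging using the expressions above, with the
# first-order direction cosine taken as a = sin(theta) = lambda/p for pitch p.
wavelength = 0.193   # um; assumed example value
for pitch in (0.26, 0.32, 0.40):   # um, assumed example pitches
    a = wavelength / pitch
    c = math.sqrt(1.0 - a * a)
    nils_te = 8.0
    nils_tm = 8.0 * c / (1.0 + 16.0 * a * a / math.pi ** 2)
    print(f"pitch {pitch:.2f} um: a = {a:.3f}, "
          f"NILS_TE = {nils_te:.0f}, NILS_TM = {nils_tm:.2f}")
```

The TM value falls increasingly below 8 as the pitch shrinks and a grows, which is the benefit of TE polarization noted above.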
3.8 Optical System Design
In an ideal lens, the image formed is a result of all rays at all wavelengths from all object points forming image plane points. Lens aberrations create deviations from this ideal, and a lens designer must make corrections or compensations. The degrees of freedom available to a designer include material refractive index and dispersion, lens surface curvatures, element thicknesses, and lens stops. Other application-specific requirements generally lead lens designers toward only a few practical solutions. For a microlithographic optical system, Köhler illumination is generally used. Requirements for a projection lens are that two images are simultaneously relayed: the image of the reticle and the image of the source (or the illumination exit pupil). The projection lens cannot be separated from the entire optical system; consideration of the illumination optics needs to be included. In designing a lens system for microlithographic work, image quality is generally the primary consideration. Limits must often be placed on lens complexity and size to allow workable systems. The push toward minimum aberration, maximum numerical aperture, maximum field size, maximum mechanical flexibility,
and minimum environmental sensitivity has led to designs that incorporate features somewhat unique to microlithography.
3.8.1 Strategies for Reduction of Aberrations: Establishing Tolerances
Several classical strategies can be used to achieve maximum lens performance with minimum aberration. These might include modification of material indices and dispersion, splitting the power of elements, compounding elements, using symmetric designs, reducing the effective field size, balancing existing aberrations, or using elements with aspheric surfaces. Incorporating these techniques is often a delicate balancing operation.
3.8.1.1 Material Characteristics
When available, the use of several glass types of various refractive index values and dispersions allows significant control over design performance. Generally, for positive elements, high-index materials will allow reduction of most aberrations because of the reduction of ray angles at element surfaces. This is especially useful for the reduction of Petzval curvature. For negative elements, lower-index materials are generally favored, which effectively increases the extent to which correction is effective. Also, a high value of dispersion is often used for the positive element of an achromatic doublet, whereas a low dispersion is desirable for the negative element. For microlithographic applications, the choice of materials that allows these freedoms is limited to those that are transparent at design wavelengths. For g-line and i-line wavelengths, several glass types transmit well, but below 300 nm, only fused silica and fluoride crystalline materials can be used. Without the freedom to control refractive index and dispersion, a designer is forced to look for other ways to reduce aberrations. In the case of chromatic aberration, reduction may not be possible, and restrictions must be placed on source bandwidth if refractive components are used.
3.8.1.2 Element Splitting
Aberrations can be minimized or balanced by splitting the power of single elements into two or more components. This allows a reduction in ray angles, resulting in a lowering of aberration. This technique is often employed to reduce spherical aberration, where negative aberration can be reduced by splitting a positive element and positive aberration can be reduced by splitting a negative element. The selection of the element to split can often be determined through consideration of higher order aberration contributions. Using this technique for microlithographic lenses has resulted in lens designs with a large number of elements.
3.8.1.3 Element Compounding
Compounding single elements into a doublet is accomplished by cementing the two and forming an interface. This technique allows control of ray paths and allows element properties not possible with one glass type. In many cases, a doublet will have a positive element with a high index combined with a negative element of lower index and dispersion. This produces an achromatized lens component that performs similarly to a lens with a high index and very high dispersion. This accomplishes both a reduction in chromatic aberration and a flattening of the Petzval field. Coma aberration can also be modified by taking advantage of the refraction angles at the cemented interface, where upper and lower rays may be bent differently. The problem with utilizing a cemented
doublet approach with microlithographic lenses is again in the suitable glass materials. Most UV and deep-UV glass materials available have a low refractive index (~1.5), limiting the corrective power of a doublet. This results in a narrow wavelength band over which an achromatized lens can be corrected in the UV.
3.8.1.4 Symmetrical Design
An optical design that has mirror symmetry about the aperture stop is free of distortion, coma, and chromatic aberration. This is due to an exact canceling of aberrations on each side of the pupil. In order to have complete symmetry, unit magnification is required. Optical systems that are nearly symmetrical can result in substantial reduction of the higher-order residuals of distortion, coma, and chromatic aberration. These systems, however, operate with unit magnification, a requirement for object-to-image symmetry. Because 1× imaging limits mask and wafer geometry, these systems can be limiting for very high resolution applications but are widely used for larger-feature lithography.
3.8.1.5 Aspheric Surfaces
Most lens designs restrict surfaces to being spherically refracting or reflecting. The freedom offered by allowing incorporation of aspheric surfaces can lead to dramatic improvements in residual aberration reduction. Problems encountered with aspheric surfaces include difficulties in fabrication, centering, and testing. Several techniques have been utilized to produce parabolic as well as general aspheres [54]. Lithographic system designs have started to take advantage of aspheric elements on a limited basis. The success of these surfaces may allow lens designs to be realized that would otherwise be impossible.
3.8.1.6 Balancing Aberrations
For well-corrected lenses, individual aberrations are not necessarily minimized; instead, they are balanced with respect to wavefront deformation. The optimum balance of aberration is unique to the lens design and is generally targeted to achieve minimum OPD. Spherical aberration can be corrected for in several ways, depending largely on the lens application. When high-order residual aberrations are small, correction of spherical aberration to zero at the edge of the aperture is usually best, as shown in Figure 3.40. Here, the aberration is balanced for minimum OPD and is best for diffraction-limited systems such as projection lenses. If a lens is operated over a range of wavelengths, however, this correction may result in a shift in focus with aperture size. In this case, spherical aberration may be overcorrected. This situation would result in a minimum shift in best focus through the full aperture range, but a decrease in resolution would result at full aperture. Chromatic aberration is generally corrected at a 0.7 zone position within the aperture. In this way, the inner portion of the aperture is undercorrected, and the outer portion of the lens is overcorrected. Astigmatism can be minimized over a full field by overcorrecting third-order astigmatism and undercorrecting fifth-order astigmatism. This will result in the sagittal focal surface located inside the tangential surface in the center of the field and vice versa at the outside of the field. Petzval field curvature is adjusted so that the field is flat with both surfaces slightly inward. Corrections such as these can be made through control of element glass, power, shape, and position. The interaction of the many elements of a full lens design makes minimization and optimization very difficult.
Additionally, corrections such as those discussed operate primarily on third-order aberration. Corrections of higher order and interactions cannot be made with single element or surface modifications. Lens design becomes a delicate process best handled with optical design programs that utilize local and global
optimization. Such computational tools allow interaction of lens parameters based on a starting design and an optical designer's experience. By taking various paths to achieve low aberration, high numerical aperture, large flat fields, and robust lithographic systems, several lens designs have evolved.
3.8.2 Basic Lithographic Lens Design
3.8.2.1 The All-Reflective (Catoptric) Lens
Historically, the 1× ring-field reflective lens used in a scanning mode was one of the earliest projection systems used in integrated circuit manufacture [55]. The reflective aspect of such catoptric systems has several advantages over refractive lens designs. Because most or all of the lens power is in the reflective surfaces, the system is highly achromatized and can be used over a wide range of wavelengths. Chromatic variation of aberrations is also absent. In addition, aberrations of spherical mirrors are much smaller than those of a refractive element. A disadvantage of a conventional catoptric system, such as the configurations shown in Figure 3.73, is the obscuration required for imaging. This blocking of light rays close to the optical axis acts, in effect, as a low-pass filter that can affect image modulation and depth of focus. The 1× Offner design of the ring-field reflecting system avoids this obscuration by scanning through a restricted off-axis annulus of a full circular field, as shown in Figure 3.74. This not only eliminates the obscuration problems, but it also substantially reduces radial aberration variation. Because the design is rotationally symmetrical, all aberrations are constant around the ring. By scanning the image field through this ring, astigmatism, field curvature, and distortion are averaged. It can also be seen that this design is symmetric on both image and object sides. This results in 1× magnification, but it allows further cancellation of aberration. Vignetting of rays by the secondary mirror forces operation off axis and introduces an increase in aberration level. Mechanically, at larger numerical apertures, reticle and wafer planes may be accessible only by folding the design. Field size is also limited by lens size and high-order aberration. Moreover, unit magnification limits both resolution and wafer size.
3.8.2.2 The All-Refractive (Dioptric) Lens
Early refractive microlithographic lenses resembled microscope objectives, and projection lithography was often performed using off-the-shelf microscope designs and construction. As IC device areas have grown, requirements for lens field sizes have increased. Field sizes greater than 25 mm are not uncommon for current IC technology, using lens numerical apertures above 0.50. Such requirements have led to the development of UV lenses that operate well
FIGURE 3.73 Two-mirror catoptric systems: (a) the Schwarzschild and (b) the Cassegrain configurations.
FIGURE 3.74 The 1× Offner ring-field reflective lens.
beyond λ/4 requirements for diffraction-limited performance, delivering resolution approaching 0.15 μm. Shown in Figure 3.75 is a refractive lens design for use in a 5× i-line reduction system [56]. The design utilizes a large number of low-power elements for minimization of aberration as well as aberration-canceling surfaces. The availability of several glass types at i-line and g-line wavelengths allows chromatic aberration correction of such designs over bandwidths approaching 10 nm. The maximum NA for these lens types is approaching 0.65, with field sizes larger than 30 mm. Achromatic refractive lens design is not possible at wavelengths below 300 nm and, apart from chromatic differences of paraxial magnification, chromatic aberration cannot be corrected. Restrictions must be placed on exposure sources, generally limiting spectral bandwidth to the order of a few picometers. First-order approximations for source bandwidth, based on paraxial defocus of the image by half of the Rayleigh focal depth, also show the strong dependence on lens NA and focal length. Chromatic aberration can be expressed as

df = f (dn) / (n − 1)
FIGURE 3.75 An all-refractive lens design for a 5× i-line reduction system.
FIGURE 3.76 A chromatic all-refracting lens design for a 4× 248 nm system.
where f is focal length, n is refractive index, and df is the focus error or chromatic aberration. Combining this with the Rayleigh depth of focus condition

DOF = ±0.5 λ / NA²
produces the relationship

Δλ (FWHM) = (n − 1) λ / [2 f (dn/dλ) NA²]
where dn/dλ is the dispersion of the lens material. Lens magnification, m, further affects the required bandwidth as

Δλ (FWHM) = (n − 1) λ / [2 f (1 + m) (dn/dλ) NA²]
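The magnitude of this restriction is easy to estimate from the expressions above. The sketch below is a minimal illustration only: the fused-silica index and dispersion near 193 nm are typical literature values, and the focal length, NA, and magnification are assumed for the example rather than taken from the text.

```python
import math

# Assumed, representative values for a fused-silica lens near 193 nm
wavelength_nm = 193.0
n = 1.56            # approximate refractive index of fused silica at 193 nm
dn_dlambda = -1.6e-6  # approximate dispersion dn/dlambda, in 1/pm
f_mm = 500.0        # assumed effective focal length
NA = 0.75           # assumed numerical aperture
m = 0.25            # assumed magnification (4x reduction system)

# Focus error per picometer of wavelength change: df = f*dn/(n - 1)
df_um_per_pm = (f_mm * 1e3) * abs(dn_dlambda) / (n - 1.0)
print(f"focus shift ~ {df_um_per_pm:.2f} um per pm of wavelength change")

# Allowed FWHM bandwidth: dlambda = (n-1)*lambda / [2 f (1+m) (dn/dlambda) NA^2]
f_pm = f_mm * 1e9                       # focal length expressed in pm
lambda_pm = wavelength_nm * 1e3         # wavelength expressed in pm
dlam_pm = ((n - 1.0) * lambda_pm) / (2 * f_pm * (1 + m) * abs(dn_dlambda) * NA**2)
print(f"allowed source bandwidth ~ {dlam_pm:.2f} pm FWHM")
```

With these assumed numbers the focus shift is on the order of a micrometer per picometer of wavelength change, and the allowed bandwidth falls in the sub-picometer range, consistent with the picometer-scale source restriction described above.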
A desirable chromatic refractive lens from the standpoint of the laser requirements would, therefore, have a short focal length and a small magnification (high reduction factor) for a given numerical aperture. Requirements for IC manufacture, however, do not coincide. Shown in Figure 3.76 is an example of a chromatic refractive lens design [56]. This system utilizes an aspherical lens element that is close to the lens stop [57]. Because refractive index is also dependent on temperature and pressure, chromatic refractive lens designs are highly sensitive to barometric pressure and lens heating effects.
3.8.2.3 Catadioptric-Beamsplitter Designs
Both the reflective (catoptric) and refractive (dioptric) systems have advantages that would be beneficial if a combined approach to lens design were utilized. Such a refractive–reflective approach is known as a catadioptric design. Several lens designs have been developed for microlithographic projection lens application. A catadioptric lens design that is similar to the reflective ring-field system is the 4× reduction Offner shown in Figure 3.77 [56]. The field for this lens is also an annulus or ring that must be scanned for full-field imaging. The design uses four spherical mirrors and two fold mirrors. The refractive elements are utilized for aberration correction, and their power is minimized, reducing chromatic effects and allowing the lens to be used with an Hg lamp at DUV wavelengths. This also minimizes the sensitivity of the design to lens heating and barometric pressure. The drawbacks of this system are its numerical aperture, limited to sub-0.5 levels by vignetting, and the aberration contributions from the large number of reflective surfaces. The alignment of lens elements is also inherently difficult. To avoid high amounts of obscuration or prohibitively low lens numerical apertures, many lens designs have incorporated a beam-splitter. Several beam-splitter types are possible. The conventional cube beam-splitter consists of matched pairs of right-angle prisms, one with a partially reflecting film deposited on its face, optically
FIGURE 3.77 The 4× catadioptric MSI design.
FIGURE 3.78 A polarizing beam-splitter. A linearly polarized beam is divided into TM and TE states at right angles.
cemented. A variation on the cube beam-splitter is a polarizing beam-splitter, as shown in Figure 3.78. An incident beam of linearly polarized light is divided, with transverse magnetic (TM) and transverse electric (TE) states emerging at right angles. Another possibility is a beam-splitter that is incorporated into a lens element, known as a Mangin mirror, as shown in Figure 3.79. Here, a partial reflector allows one element to act as both a reflector and a refractor. Although use of a Mangin mirror does require central obscuration, if a design can achieve levels below 10% (radius), the impact on imaging resolution and depth of focus is minimal [58]. The 4× reduction Dyson shown in Figure 3.80 is an example of a catadioptric lens design based on a polarizing beam-splitter [56]. The mask is illuminated with linearly polarized light that is directed through the lens toward the primary mirror [59]. Upon reflection, a waveplate changes the state of linear polarization, allowing light to be transmitted toward the wafer plane. Variations on this design use a partially reflecting beam-splitter that may suffer from reduced throughput and a susceptibility to coating damage at short wavelengths. Obscuration is eliminated, as is the low-NA requirement of the off-axis designs to prevent vignetting. The beam-splitter is well corrected for operation on axis, minimizing high-order aberrations and the requirement for an increasingly thin ring field at high NA, as with the reduction Offner. The field is square, which can be used in a stepping mode, or rectangular for step and scan. The simplified system, with only one mirror possessing most of the lens power, leads to lower aberration levels than for the reduction Offner design. This design allows a spectral bandwidth on the order of 5–10 nm, allowing operation with a lamp or laser source.
FIGURE 3.79 A Mangin mirror-based beam-splitter approach to a catadioptric system. A partially reflective surface allows one element to act as a reflector and a refractor.
FIGURE 3.80 A 4× reduction Dyson catadioptric lens design utilizing a polarizing beam-splitter.
As previously seen, at high NA values (above 0.5) for high-resolution lithography, diffraction effects for TE and TM are different. When the vectorial nature of light is considered, a biasing between horizontally oriented and vertically oriented features results. Although propagation into a resist material will reduce this biasing effect [60], it cannot be neglected. Improvements on the reduction Dyson in Figure 3.80 have included elimination of the linear polarization effect by incorporating a second waveplate near the wafer plane. The resulting circular polarization removes the H-V biasing possible with linear polarization and also rejects light reflected from the wafer and lens surface, reducing lens flare. Improvements have also increased the NA of the Dyson design, up to 0.7 using approaches that include larger NA beam-splitter cubes, shorter image conjugates, increased mirror asphericity, and source bandwidths below 1 nm. This spectral requirement, along with increasingly small field widths to reduce aberration, requires that these designs be used only with excimer laser sources. Designs have been developed for both 248 and 193 nm wavelengths. Examples of these designs are shown in Figure 3.81 [61] and Figure 3.82 [62].
3.9 Polarization and High NA
As with any type of imaging, lithography is influenced by the polarization of the propagating radiation. In practice, the impact of polarization on imaging has been relatively low at NA values below 0.80, as interfering rays are imaged into a photoresist with a refractive
FIGURE 3.81 An improved reduction Dyson, utilizing a second waveplate to eliminate linear polarization effects at the wafer. (Reproduced from D. Williamson, J. McClay, K. Andresen, G. Gallatin, M. Himel, J. Ivaldi, C. Mason, A. McCullough, C. Otis, and J. Shamaly, SPIE, 2726, 780, 1996.)
index greater than that of air. Because the refractive index of the resist (nPR) is in the range of 1.60–1.80, the resulting angles of interference are reduced by Snell's law to NA/nPR. Concerns with polarization have, therefore, been limited to the requirements of the optical coatings within lens systems and to those lithography approaches making use of polarization for selection, such as the reduction Dyson lens designs seen in Figure 3.80 and Figure 3.81. As immersion lithography has enabled numerical apertures above 1.0, the impact of polarization becomes more significant. For this reason, special attention needs to be paid to the influence of polarization at nearly all stages of the lithographic imaging process. Polarized radiation results when the vibrations of the electric (or magnetic) field vector are restricted to a single plane. The direction of polarization refers to the electric field vector, which is normal to the direction of propagation. Linear polarization exists when the direction of polarization is fixed. Any polarized electric field can be resolved into two orthogonally polarized components. Circular polarization occurs when the electric field vector has two equal orthogonal components, causing the resultant polarization direction to rotate about the direction of propagation. Circular polarization with a preferred linear component is termed elliptical polarization. Unpolarized radiation has no preferred direction of polarization.
3.9.1 Imaging with Oblique Angles
At oblique angles, radiation polarized in the plane of incidence exhibits reduced image contrast as interference is reduced [63]. This is referred to as transverse magnetic (TM), p, or X polarization with respect to vertically oriented geometry. As angles approach π/4
FIGURE 3.82 A reduction Dyson approach with the stop behind the beam-splitter. Numerical apertures to 0.7 can be achieved with a high degree of collimation. Spectral narrowing is likely needed.
[or sin⁻¹(1/√2)], no interference is possible, and image contrast in air is reduced to zero. If the image is translated into a medium of higher index, the limiting angle is increased by the medium index (n) as sin⁻¹(n/√2). For polarization perpendicular to the plane of incidence, complete interference exists, and no reduction in image contrast will result. Figure 3.83 shows the two states of linear polarization that contribute to a mask feature oriented out of the plane of the page. TM polarization is in the plane of the page, and transverse electric (TE or Y) polarization is perpendicular to it. For non-linear polarization, an image is formed as the sum of the TE and TM image states.
3.9.2 Polarization and Illumination
At high NA values, methods can be used that avoid TM interference. Several approaches to remove this field cancellation from TM polarization have been proposed, including image decomposition for polarized dipole illumination [64]. Illumination that is consistently TE polarized in a circular pupil could achieve the optimum polarization for any object orientation. This is possible with an illumination field that is TE polarized over all angles in the pupil, known as azimuthal polarization, which is shown in Figure 3.84 along with TM or radially polarized illumination.
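The TM contrast loss that motivates these polarized-illumination schemes can be sketched with a simple two-beam interference model. This is only an illustration (equal-intensity beams, a resist index of 1.7 assumed from the range quoted earlier), not a full vector image calculation.

```python
import math

def tm_contrast(sin_theta_air, n_medium=1.0):
    """Signed fringe modulation of two equal TM-polarized beams crossing at
    +/- theta inside a medium of index n_medium; negative values indicate
    contrast reversal. TE modulation is 1.0 at all angles."""
    sin_t = sin_theta_air / n_medium      # Snell's law into the medium
    theta = math.asin(sin_t)
    return math.cos(2 * theta)

for na in (0.5, 0.7, 0.93):
    print(f"NA = {na:.2f}: TM modulation in air = {tm_contrast(na):+.2f}, "
          f"in resist (n = 1.7) = {tm_contrast(na, 1.7):+.2f}")
```

At a half-angle of 45° (sin θ = 1/√2) the TM modulation in air passes through zero, while refraction into the higher-index resist keeps the angles smaller and preserves some TM contrast, as described above.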
FIGURE 3.83 Interference in a photoresist film depicted in the two states of linear polarization (TE, s, or Y-polarized and TM, p, or X-polarized), for air (ni = 1.00) above resist (ni > 1.0). The mask features are oriented out of the paper.
Such an arrangement provides for homogeneous coupling of the propagating radiation regardless of angle or orientation. As an example, Figure 3.85 shows imaging results for four cases of illumination of line/space features as focus is varied. The conditions plotted are TE polarized dipole, TE polarized cross-quadrupole, azimuthal annular, and unpolarized annular illumination. In this case, the TE polarized cross-quadrupole illumination results in superior image performance.
3.9.3 Polarization Methods
Selection of a single linear polarization state requires a method that provides efficiency in the wavelength region of interest. Though many methods exist at longer wavelengths, the choices are more limited in the UV. Polarization can be achieved with crystalline materials that have a different index of refraction in different crystal planes. Such materials are said to be birefringent or doubly refracting. A number of polarizing prisms have been devised that make use of birefringence to separate two beams in a crystalline material. Often, they make use of total internal reflection to eliminate one of the planes. The Glan-Taylor, Glan-Thompson, Glan-Laser, beam-splitting Thompson,
FIGURE 3.84 TE or azimuthally polarized illumination (left) and TM or radially polarized illumination.
FIGURE 3.85 The modulation in a photoresist film for 45 nm features imaged at 1.20 NA (water immersion) under various conditions of illumination (TE polarized dipole, TE polarized cross-quadrupole, azimuthal annular, and unpolarized annular), plotted as modulation in resist versus defocus (nm). The greatest modulation through focus is achieved using a TE polarized dipole illuminator.
beam-displacing, and Wollaston prisms are most widely used, and they are typically made of nonactive crystals such as calcite that transmit well from 350 to 2300 nm. Active crystals such as quartz can also be used in this manner if cut with the optic axis parallel to the surfaces of the plate. Polarization can also be achieved through reflection. The reflection coefficient for light polarized in the plane of incidence is zero at the Brewster angle, leaving the reflected light at that angle linearly polarized. This method is utilized in polarizing beam-splitter cubes that are coated with many layers of quarter-wave dielectric thin films on the interior prism face to achieve a high extinction ratio between the TE and TM components. Wire-grid polarization can also be employed as a selection method [65]. Wire grids, generally in the form of an array of thin parallel conductors supported by a transparent substrate, have been used as polarizers for the visible, infrared, and other portions of the electromagnetic spectrum. When the grid period is much shorter than the wavelength, the grid functions as a polarizer that reflects electromagnetic radiation polarized parallel to the grid elements, and it transmits radiation of the orthogonal polarization. These effects were first reported by Wood in 1902, and they are often referred to as "Wood's anomalies" [66]. Subsequently, Rayleigh analyzed Wood's data and believed that the anomalies occur at combinations of wavelength and angle where a higher diffraction order emerges [67].
3.9.4 Polarization and Resist Thin Film Effects
To reduce the reflectivity at the interface between a resist layer and a substrate, a bottom anti-reflective coating (BARC) is coated beneath the resist, as discussed in detail in Chapter 12. Interference minima occur as the reflectance from the BARC/substrate interface destructively interferes with the reflection at the resist/BARC interface. This destructive interference repeats at quarter-wave thickness intervals. Optimization of a single-layer BARC is possible for oblique illumination and also for specific cases of polarization, as seen in the plots of Figure 3.86. The issue with a single-layer AR film, however, is its inability to achieve low reflectivity across all angles and through both states of linear polarization. This can be achieved using a multilayer BARC design as shown in Figure 3.87 [68]. By combining two films in a stack and optimizing their optical and thickness properties,
FIGURE 3.86 Reflectance plots for a bottom ARC for various polarization states and conditions of optimization. Panels: 0° optimized, ARC(1.7, 0.5), 35 nm; 45° optimized, ARC(1.7, 0.3), 52 nm; 45° TM optimized, ARC(1.7, 0.5), 52 nm; 45° TE optimized, ARC(1.7, 0.3), 48 nm. Each panel plots reflectance (%) versus angle (degrees) for TE, TM, and average polarization.
FIGURE 3.87 A two-layer BARC stack using matched refractive index (n) and dissimilar extinction coefficient (k) films: resist (1.70, 0.005) over a 42 nm (1.70, 0.20) layer and a 48 nm (1.70, 0.50) layer on poly-Si.
FIGURE 3.88 The optimized 193 nm reflection for the film stack of Figure 3.87 measured beneath the photoresist; full angular optimization over 0–45° (1.2 NA), plotted as reflectance (%) versus angle (degrees) for TE, TM, and average polarization.
reflectivity below 0.6% is possible for angles up to 45° (or at 1.2 NA) for all polarization states, as shown in Figure 3.88.
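The single-layer interference behavior that underlies these optimizations can be sketched with the standard thin-film (Airy) reflectance formula. The optical constants and thicknesses below are illustrative assumptions in the spirit of the ARC(n, k) notation of Figure 3.86, not the optimized values of the figures, and only normal incidence is treated.

```python
import math, cmath

def reflectance(n_top, n_film, n_sub, thickness_nm, wavelength_nm):
    """Normal-incidence reflectance seen from the top medium (e.g., the resist)
    for a single absorbing film (BARC) on a substrate. Indices are n + 1j*k."""
    r12 = (n_top - n_film) / (n_top + n_film)      # resist/ARC interface
    r23 = (n_film - n_sub) / (n_film + n_sub)      # ARC/substrate interface
    beta = 2 * math.pi * n_film * thickness_nm / wavelength_nm   # phase thickness
    phase = cmath.exp(2j * beta)                   # absorption makes |phase| < 1
    r = (r12 + r23 * phase) / (1 + r12 * r23 * phase)
    return abs(r) ** 2

# Assumed 193 nm optical constants: resist over an organic ARC on a Si-like substrate.
n_resist = 1.70 + 0.005j
n_arc    = 1.70 + 0.30j
n_si     = 0.88 + 2.78j     # approximate value for silicon at 193 nm (assumed)
for d in (20, 28, 35, 45, 60):
    print(f"ARC {d:2d} nm: R = {100 * reflectance(n_resist, n_arc, n_si, d, 193.0):5.2f}%")
```

The minimum falls near the quarter-wave thickness λ/(4n) ≈ 28 nm (shifted somewhat by the film absorption), which is the interference condition described at the start of this subsection; achieving low reflectance over all angles and both polarizations is what requires the two-layer stack of Figure 3.87.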
3.10 Immersion Lithography
Ernst Abbe was the first to discover that the maximum ray slope entering a lens from an axial point on an object could be increased by a factor equal to the refractive index of the imaging medium. He first realized this in the late 1870s by observing an increase in the ray slope in the Canada balsam mounting compound used in microscope objectives at the time. To achieve a practical system employing this effect, he replaced the air layer between a microscope objective and a cover glass with an oil having a refractive index in the visible near that of the glass on either side. This index matching prevents reflective effects at the interfaces (and total internal reflection at large angles), leading to the term homogeneous immersion for the system he developed. The most significant application of the immersion lens was in the field of medical research, where oil immersion objectives with high resolving power were introduced by Carl Zeiss in the 1880s. Abbe and Zeiss developed oil immersion systems by using oils that matched the refractive index of glass. This resulted in numerical aperture values up to a maximum of 1.4, allowing light microscopes to resolve two points only 0.2 μm apart (corresponding to a k1 factor value in lithography of 0.5). Immersion imaging was not applied to lithography until recently for several reasons. The immersion fluids used in microscopy are generally opaque in the UV and are not compatible with photoresist materials. Also, the outgassing of nitrogen during the exposure of DNQ/novolac (g-line and i-line) resists would prevent their application in a fluid-immersed environment. Most important, however, was the availability of alternative approaches to extend optical lithography. Operation at the 193 nm ArF wavelength leaves few choices other than immersion to continue the pursuit of optical lithography. Fortunately, the polyacrylate photoresists used at this wavelength do not outgas upon exposure (nor do 248 nm PHOST resists). The largest force behind the emergence of immersion imaging into mainstream optical lithography has been the unique properties of water in the UV. Figure 3.89 shows the refractive index of water in the UV and visible
FIGURE 3.89 The refractive index of water in the ultraviolet (plotted versus wavelength from 180 to 380 nm).
FIGURE 3.90 The transmission of water in the ultraviolet at 1 mm and 1 cm depths (% transmission versus wavelength, 190–220 nm).
region. Figure 3.90 shows the transmission for 1 mm and 1 cm water thickness values. As the wavelength decreases toward 193 nm, the refractive index increases to a value of 1.44, significantly larger than its value of about 1.33 in the visible. Furthermore, the absorption remains low, at 0.05/cm. Combined with the natural compatibility of IC processing with water, the incentive to explore water immersion lithography at DUV wavelengths now exists. The advantages of immersion lithography can be realized when resolution is considered together with depth of focus. The minimum resolvable pitch for an optical imaging system is determined by the wavelength and numerical aperture:

p = λ / (n sin θ)
where n is the refractive index of the imaging medium and θ is the half angle. As the refractive index of the medium is increased, the NA increases proportionately. At a sin θ value of 0.93 (or 68°), which is near the maximum angle of any practical optical system, the largest NA allowed using water at 193 nm is 1.33. The impact on resolution is clear, but it is really just half of the story. The paraxial depth of focus for any medium of index n takes the form

DOF = ± k2 λ / (n sin²θ)
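A quick numerical check of the two expressions above, comparing a dry and a water-immersion system at the same NA, anticipates the comparison discussed next (Python sketch; k2 = 0.5 and the water index of 1.44 are the values used in this section):

```python
def min_pitch_nm(wavelength_nm, numerical_aperture):
    # p = lambda / (n sin(theta)) = lambda / NA
    return wavelength_nm / numerical_aperture

def paraxial_dof_nm(wavelength_nm, n, sin_theta, k2=0.5):
    # DOF = +/- k2 * lambda / (n sin^2(theta))
    return k2 * wavelength_nm / (n * sin_theta ** 2)

wavelength = 193.0
NA = 0.85
n_water = 1.44

dof_air   = paraxial_dof_nm(wavelength, 1.00, NA / 1.00)      # dry: sin(theta) = 0.85
dof_water = paraxial_dof_nm(wavelength, n_water, NA / n_water)  # immersion: smaller half angle
print(f"minimum pitch (either case): {min_pitch_nm(wavelength, NA):.0f} nm")
print(f"DOF in air:   +/- {dof_air:.0f} nm")
print(f"DOF in water: +/- {dof_water:.0f} nm ({dof_water / dof_air - 1:.0%} larger)")
```

The same NA gives the same minimum pitch, but the immersion system realizes that NA at a smaller half angle, which is where the DOF advantage quoted below comes from.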
Taken together, as resolution is driven to smaller dimensions with increasing NA values, the cost to DOF is significantly lower if the refractive index is increased instead of the half angle. As an example, consider two lithography systems operating at NA values of 0.85, one being water immersion and the other imaging through air. Although the minimum resolution is the same, the paraxial DOF for the water immersion system is 45% larger than for the air imaging system, a result of a half angle of 36° in water versus 58° in air.
3.10.1 Challenges of Immersion Lithography
Early concerns regarding immersion lithography included the adverse effects of small micro-bubbles that may form or become trapped during fluid filling, scanning, or exposure [69]. Though defects caused by trapped air bubbles remain a concern associated with fluid mechanics issues, the presence or creation of microbubbles has proven non-critical. In the process of forming a water fluid layer between the resist and lens surfaces, air bubbles
FIGURE 3.91 The geometric description of the reflection of an air bubble sphere in water.
are often created as a result of the high surface tension of water. The presence of air bubbles in the immersion layer could degrade the image quality because of the inhomogeneity-induced light scattering in the optical path. Scattering from air bubbles in water can be approximately described using a geometrical optics model [70]. As seen in Figure 3.91, an air bubble assumes a spherical shape in water when the hydrostatic pressure due to gravity is ignored. The assumption is reasonable for immersion lithography, where a thin layer of water with a thickness of about 0.5 mm is applied. The reflection/refraction at the spherical interface causes the light to scatter into various directions, and this can be approximated by flat-surface Fresnel coefficients. However, the air bubble in water is a special case in which the refractive index of the bubble is less than that of the surrounding medium, resulting in a contribution of total reflection to the scattered irradiance at certain angles. The geometry is described in Figure 3.91. For an arbitrary ray incident on a bubble, the angle of incidence is

i = arcsin(s/a)

where a is the radius of the bubble and s is the deviation from the center. The critical incident angle is

ic = arcsin(ni / nw)
FIGURE 3.92 The ratio of the scattered intensity to the incident intensity for a 2 μm air bubble in water at separations of 100, 200, 500, and 1000 μm from the particle, plotted as log10(Is/Ii) versus lateral distance (μm).
where ni is the refractive index of the air and nw is the refractive index of water. The corresponding critical scattering angle is

θc = 180° − 2ic

At a wavelength of 193 nm, the refractive index of water is nw = 1.437. Therefore, the critical incident angle and critical scattering angle are

ic = arcsin(1/1.437) = 44°
θc = 180° − 2ic = 92°

The presence of total reflection greatly enhances the light scattered into the region subtended by the critical scattering angle. In this case, the region covers all the forward directions. Hence, air bubbles in water cause strong scattering in all the forward directions. However, a complete understanding of scattering requires taking into account the interference of the reflected light with the transmitted light. The rigorous solution of the scattering pattern can be numerically evaluated by partial wave (Mie) theory. In Mie scattering theory, the incident, scattered, and internal fields are expanded in a series of vector spherical harmonics [71]. At the wavelength of 193 nm, the scattering from an air bubble 2 μm in diameter was calculated according to Mie theory and plotted in Figure 3.92. At relatively short lateral distances from such a bubble, and at separation distances beyond 100 μm, the scattered intensity is very low. Trapping of bubbles or the collection of several bubbles is a concern that needs to be addressed in the design of the liquid flow cell for an immersion lithography system [72].
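The geometric part of this description is easy to reproduce. The short sketch below evaluates the critical angles quoted above and flags which incident rays undergo total reflection; the 1 μm bubble radius simply corresponds to the 2 μm diameter case of Figure 3.92.

```python
import math

n_air, n_water = 1.0, 1.437              # indices at 193 nm (water value from the text)
i_c = math.degrees(math.asin(n_air / n_water))
theta_c = 180.0 - 2 * i_c
print(f"critical incidence angle  i_c     = {i_c:.0f} deg")
print(f"critical scattering angle theta_c = {theta_c:.0f} deg")

# Incidence angle for rays striking a bubble of radius a at offset s from its center;
# rays with i > i_c are totally reflected at the water/air interface.
a_um = 1.0                               # assumed bubble radius (2 um diameter)
for s_um in (0.2, 0.5, 0.7, 0.9):
    i = math.degrees(math.asin(s_um / a_um))
    regime = "total reflection" if i > i_c else "partial reflection/refraction"
    print(f"s = {s_um:.1f} um: i = {i:4.1f} deg -> {regime}")
```

This reproduces the 44° and 92° values above; the interference and diffraction effects that the geometric model omits are what the Mie calculation of Figure 3.92 captures.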
3.10.2 High Index Immersion Fluids
Optical lithography is being pushed against fundamental physical and chemical limits, presenting the real challenges involved with resolution at dimensions of 32 nm and below. Hyper-NA optical lithography is generally considered to be imaging at angles close to 90° to achieve numerical apertures above 1.0. Because of the small gains in numerical aperture at propagation angles above 65° in the optics, values much above this are not likely in current or future lens design strategies. Hyper NA is, therefore, forced upon material refractive index, where the medium with the lowest index creates a weak link to system resolution. The situation is one where the photoresist possesses the highest refractive index and
FIGURE 3.93 The effect of media index on the propagation angle for imaging equivalently sized geometry: (a) low index (air) imaging, where sin θa > sin θr; (b) high index imaging, where sin θm = sin θr (air na, media nm, resist nr).
a photoresist top-coat has the lowest refractive index:

n_photoresist > n_glass > n_fluid > n_top-coat

The ultimate resolution of a lithography tool then becomes a function of the lowest refractive index. RET methods that are already being employed in lithography can achieve k1 process factors near 0.30, where 0.25 is the physical limit. It is, therefore, not likely that much ground will be gained from k1 alone as the move is made into hyper-NA immersion lithography. The minimum half-pitch (hp) for 193 nm lithography following classical optical scaling, using a 68° propagation angle, becomes

hp_min = k1 λ / (ni sin θ) = (0.25 to 0.30)(193) / [ni (0.93)] = 52/ni to 62/ni nm
for aggressive k1 values between 0.25 and 0.30, where ni is the lowest refractive index in the imaging path. Water as an immersion fluid is currently the weak link in a hyper-NA optical lithography scenario. Advances with second-generation fluid indices approaching 1.65 may shift this liability toward the optical materials and photoresists. As resolution is pushed below 32 nm, it will be difficult for current photoresist refractive index values (~1.7) to accommodate. As photoresist refractive index is increased, the burden is once again placed on the fluid and the optical material. As suitable optical materials with refractive indices larger than that of fused silica are identified, the fluid is, once again, the weak link. This scenario will exist until a fluid is identified with a refractive index approaching 1.85 (limited by potential glass alternatives currently benchmarked by sapphire) along with high-index polymer platforms. To demonstrate the advantages of using higher refractive index liquids, an imaging system using an immersion fluid is shown in Figure 3.93. The left portion of this figure depicts an optical wavefront created by a projection imaging system that is focused into a photoresist (resist) material with refractive index nr. The refractive index of the imaging medium is na (in this example, air). The right portion of the figure depicts an optical wavefront focused through a medium of refractive index larger than the one on the left, specifically nm. As the refractive index nm increases, the effect of defocus, which is proportional to sin²θ, is reduced. Furthermore, as shown in Figure 3.94, a refractive index approaching that of the photoresist is desirable to allow for large angles into the photoresist film and also to reduce reflection at the interfaces between the media and
Resist >Media glass ng
media nm nm sin qm = nr sin qr R
resist nr
q 2007 by Taylor & Francis Group, LLC
R FIGURE 3.94 The effect of media index on reflection and path length.
Microlithography: Science and Technology
240 TABLE 3.5
The Absorption Peak (in eV and nm) For Several Anions in Water
I BrK ClK ClOK 4 HPO42K1 SO42K1 H2 POK 4 HSOK 4 K
eV
nm
5.48 6.26 6.78 6.88 6.95 7.09 7.31 7.44
227 198 183 180 179 175 170 167
the resist. Ultimately, a small NA/n is desirable in all media, and the maximum NA of the system is limited to the smallest media refractive index. In general, the UV absorption of a material involves the excitation of an electron from the ground state to an excited state. When solvents are associated, additional “charge-transferto-solvent” transitions (CTTS) are provided [73,74]. The absorption wavelength resulting from CTTS properties and absorption behavior of aqueous solutions of phosphate, sulfate, and halide ions follow the behavior 2K K K K K Phosphates PO3K 4 ! Sulfates SO4 ! F ! Hydroxides OHK!Cl ! Br ! I
where phosphate anions absorb at shorter wavelengths than iodide. Table 3.5 shows the effect of these ions on the absorption peak of water where anions resulting in a shift of this peak sufficiently below 193 nm are the most interesting. The presence of alkalai metal cations can shift the maximum absorbance wavelength to lower values. Furthermore, the change in the absorption with temperature is positive and small (w500 ppm/8C), whereas the change with pressure is negative and small. These anions represent one avenue for exploration into high refractive index fluids for 193 and 248 nm application [75].
References

1. W. Smith. 1966. Modern Optical Engineering, New York: McGraw-Hill.
2. C. Huygens. 1690. Traité de la lumière, Leyden. (English translation by S.P. Thompson, Treatise on Light, Macmillan, London, 1912.)
3. A. Fresnel. 1816. Ann. Chem. Phys., 1: 239.
4. J.W. Goodman. 1968. Introduction to Fourier Optics, New York: McGraw-Hill.
5. H. von Helmholtz. 1859. J. Math., 57: 7.
6. G. Kirchhoff. 1883. Ann. Phys., 18: 663.
7. D.C. Cole. 1992. "Extending scalar aerial image calculations to higher numerical apertures," J. Vac. Sci. Technol. B, 10:6, 3037.
8. D.G. Flagello and A.E. Rosenbluth. 1992. "Lithographic tolerances based on vector diffraction theory," J. Vac. Sci. Technol. B, 10:6, 2997.
9. J.D. Gaskil. 1978. Linear Systems, Fourier Transforms and Optics, New York: Wiley.
10. H.H. Hopkins. 1953. "The concept of partial coherence in optics," Proc. R. Soc. A, 208: 408.
11. R. Kingslake. 1983. Optical System Design, London: Academic Press.
12. D.C. O'Shea. 1985. Elements of Modern Optical Design, New York: Wiley.
13. Lord Rayleigh. 1879. Philos. Mag., 8:5, 403.
14. H.H. Hopkins. 1981. "Introductory methods of image assessment," SPIE, 274: 2.
15. M. Born and E. Wolf. 1964. Principles of Optics, New York: Pergamon Press.
16. W.H. Steel. 1957. "Effects of small aberrations on the images of partially coherent objects," J. Opt. Soc. Am., 47: 405.
17. A. Offner. 1979. "Wavelength and coherence effects on the performance of real optical projection systems," Photogr. Sci. Eng., 23: 374.
18. R.R. Shannon. 1995. "How many transfer functions in a lens?" Opt. Photon. News, 1: 40.
19. A. Hill, J. Webb, A. Phillips, and J. Connors. 1993. "Design and analysis of a high NA projection system for 0.35 μm deep-UV lithography," SPIE, 1927: 608.
20. H.J. Levinson and W.H. Arnold. 1987. J. Vac. Sci. Technol. B, 5:1, 293.
21. V. Mahajan. 1991. Aberration Theory Made Simple, Bellingham, WA: SPIE Press.
22. F. Zernike. 1934. Physica, 1: 689.
23. Lord Rayleigh. 1964. Scientific Papers, Vol. 1, New York: Dover.
24. B.W. Smith. 1995. First International Symposium on 193 nm Lithography, Colorado Springs, CO.
25. PROLITH/2, KLA-Tencor FINLE Division, 2006.
26. M. Born and E. Wolf. 1980. Principles of Optics, Oxford: Pergamon Press.
27. F. Gan. 1992. Optical and Spectroscopic Properties of Glass, New York: Springer-Verlag.
28. Refractive Index Information (Approximate), Acton Research Corporation, Acton, MA, 1990.
29. M. Rothschild, D.J. Ehrlich, and D.C. Shaver. 1989. Appl. Phys. Lett., 55:13, 1276.
30. W.P. Leung, M. Kulkarni, D. Krajnovich, and A.C. Tam. 1991. Appl. Phys. Lett., 58:6, 551.
31. J.F. Hyde. 1942. U.S. Patent 2,272,342. Method of making a transparent article of silica.
32. H. Imai, K. Arai, T. Saito, S. Ichimura, H. Nonaka, J.P. Vigouroux, H. Imagawa, H. Hosono, and Y. Abe. 1988. In The Physics and Technology of Amorphous SiO2, R.A.B. Devine, ed., New York: Plenum Press, p. 153.
33. W. Partlow, P. Thampkins, P. Dewa, and P. Michaloski. 1993. SPIE, 1927: 137.
34. S. Asai, I. Hanyu, and K. Hikosaka. 1992. J. Vac. Sci. Technol. B, 10:6, 3023.
35. K. Toh, G. Dao, H. Gaw, A. Neureuther, and L. Fredrickson. 1991. SPIE, 1463: 402.
36. E. Tamechika, T. Horiuchi, and K. Harada. 1993. Jpn. J. Appl. Phys., 32: 5856.
37. T. Ogawa, M. Uematsu, T. Ishimaru, M. Kimura, and T. Tsumori. 1994. SPIE, 2197: 19.
38. R. Kostelak, J. Garofalo, G. Smolinsky, and S. Vaidya. 1991. J. Vac. Sci. Technol. B, 9:6, 3150.
39. R. Kostelak, C. Pierat, J. Garafalo, and S. Vaidya. 1992. J. Vac. Sci. Technol. B, 10:6, 3055.
40. B. Lin. 1990. SPIE, 1496: 54.
41. H. Watanabe, H. Takenaka, Y. Todokoro, and M. Inoue. 1991. J. Vac. Sci. Technol. B, 9:6, 3172.
42. M. Levensen. 1993. Phys. Today, 46:7, 28.
43. H. Watanabe, Y. Todokoro, Y. Hirai, and M. Inoue. 1991. SPIE, 1463: 101.
44. Y. Ku, E. Anderson, M.L. Shattenburg, and H. Smith. 1988. J. Vac. Sci. Technol. B, 6:1, 150.
45. R. Kostelak, K. Bolan, and T.S. Yang. 1993. Proc. OCG Interface Conference, p. 125.
46. B.W. Smith and S. Turget. 1994. SPIE Optical/Laser Microlithography VII, 2197: 201.
47. M. Born and E. Wolf. 1980. Principles of Optics, Oxford: Pergamon Press.
48. B.W. Smith, S. Butt, Z. Alam, S. Kurinec, and R. Lane. 1996. J. Vac. Sci. Technol. B, 14:6, 3719.
49. Y. Liu and A. Zakhor. 1992. IEEE Trans. Semicond., 5: 138.
50. H. Yoo, Y. Oh, B. Park, S. Choi, and Y. Jeon. 1993. Jpn. J. Appl. Phys., 32: 5903.
51. B.W. Smith, D. Flagello, and J. Summa. 1993. SPIE, 1927: 847.
52. S. Asai, I. Hanyu, and M. Takikawa. 1993. Jpn. J. Appl. Phys., 32: 5863.
53. K. Matsumoto and T. Tsuruta. 1992. Opt. Eng., 31:12, 2656.
54. D. Golini, H. Pollicove, G. Platt, S. Jacobs, and W. Kordonsky. 1995. Laser Focus World, 31:9, 83.
55. A. Offner. 1975. Opt. Eng., 14:2, 130.
56. D.M. Williamson, by permission.
57. J. Buckley and C. Karatzas. 1989. SPIE, 1088: 424.
58. J. Bruning. 1996. OSA Symposium on Design, Fabrication, and Testing for Sub-0.25 Micron Lithographic Imaging.
59. H. Sewell. 1995. SPIE, 2440: 49.
60. D. Flagello and A. Rosenbluth. 1992. J. Vac. Sci. Technol. B, 10:6, 2997.
61. D. Williamson, J. McClay, K. Andresen, G. Gallatin, M. Himel, J. Ivaldi, C. Mason, A. McCullough, C. Otis, and J. Shamaly. 1996. SPIE, 2726: 780.
62. G. Fürter, Carl-Zeiss-Stiftung, by permission.
63. B.W. Smith, J. Cashmore, and M. Gower. 2002. "Challenges in high NA, polarization, and photoresists," SPIE Opt. Microlith. XV, 4691.
64. B.W. Smith, L. Zavyalova, and A. Estroff. 2004. "Benefiting from polarization–effects of high NA on imaging," Proc. SPIE Opt. Microlith. XVII, 5377.
65. A. Estroff, Y. Fan, A. Bourov, B. Smith, P. Foubert, L.H. Leunissen, V. Philipsen, and Y. Aksenov. 2005. "Mask-induced polarization effects at high NA," Proc. SPIE Opt. Microlith., 5754.
66. R.W. Wood. 1902. "Uneven distribution of light in a diffraction grating spectrum," Philosophical Magazine, September.
67. Lord Rayleigh. 1907. "On the remarkable case of diffraction spectra described by Prof. Wood," Philosophical Magazine, July.
68. B.W. Smith, L. Zavyalova, and A. Estroff. 2004. "Benefiting from polarization–effects of high NA on imaging," Proc. SPIE Opt. Microlith. XVII, 5377.
69. B.W. Smith, Y. Fan, J. Zhou, A. Bourov, L. Zavyalova, N. Lafferty, F. Cropanese, and A. Estroff. 2004. "Hyper NA water immersion lithography at 193 nm and 248 nm," J. Vac. Sci. Technol. B: Microelectron. Nanometer Struct., 22:6, 3439–3443.
70. P.L. Marston. 1989. "Light scattering from bubbles in water," Ocean 89 Part 4, Acoust. Arct. Stud., 1186–1193.
71. C.F. Bohren and D.R. Huffman. 1983. Absorption and Scattering of Light by Small Particles, Wiley.
72. Y. Fan, N. Lafferty, A. Bourov, L. Zavyalova, and B.W. Smith. 2005. "Air bubble-induced light-scattering effect on image quality in 193 nm immersion lithography," Appl. Opt., 44:19, 3904.
73. E. Rabinowitch. 1942. Rev. Mod. Phys., 14, 112; G. Stein and A. Treinen. 1960. Trans. Faraday Soc., 56, 1393.
74. M.J. Blandamer and M.F. Fox. 1968. Theory and Applications of Charge-Transfer-To-Solvent Spectra.
75. B.W. Smith, A. Bourov, H. Kang, F. Cropanese, Y. Fan, N. Lafferty, and L. Zavyalova. 2004. "Water immersion optical lithography at 193 nm," J. Microlith. Microfab. Microsyst., 3:1, 44–51.
4 Excimer Lasers for Advanced Microlithography
Palash Das
CONTENTS
4.1 Introduction and Background
4.2 Excimer Laser
    4.2.1 History
    4.2.2 Excimer Laser Operation
        4.2.2.1 KrF and ArF
        4.2.2.2 Ionization Phase
        4.2.2.3 Preionization
        4.2.2.4 Glow Phase
        4.2.2.5 Streamer Phase
    4.2.3 Laser Design
    4.2.4 F2 Laser Operation
    4.2.5 Comments
4.3 Laser Specifications
    4.3.1 Power and Repetition Rate
    4.3.2 Spectral Linewidth
    4.3.3 Wavelength Stability
    4.3.4 Pulse Duration
    4.3.5 Coherence
    4.3.6 Beam Stability
4.4 Laser Modules
    4.4.1 Chamber
    4.4.2 Line Narrowing
    4.4.3 Wavelength and Linewidth Metrology
    4.4.4 Pulsed Power
    4.4.5 Pulse Stretching
    4.4.6 Beam Delivery Unit
    4.4.7 Master Oscillator: Power Amplifier Laser Configuration
    4.4.8 Module Reliability and Lifetimes
4.5 Summary
References
ABSTRACT
Since its introduction in 1987, the excimer laser for the stepper has evolved from a laboratory instrument to fully production-worthy fabrication-line equipment. Its role as the light source for advanced lithography cannot be overstated. Excimer lasers provide direct deep-UV light, are scalable in energy and power, and are capable of operating with narrow spectral widths. Also, by providing three wavelengths at 248, 193, and 157 nm, excimer lasers span three device generations. They have large beams and a low degree of coherence. Their physics and chemistry are well understood. Thanks to major technical developments, excimer laser performance has kept pace with semiconductor industry requirements. Key developments that have established the excimer laser as the total light solution for advanced microlithography are discussed.
4.1 Introduction and Background
When this chapter was first written, microlithography for advanced Ultra-Large Scale Integration (ULSI) fabrication was making a transition from using the i-line (365-nm) mercury lamp to the deep-UV excimer laser—krypton fluoride (248 nm)—as the illumination source. This transition was revolutionary because of the complexity and pulsed nature of the laser compared to the simple and continuously operating lamp. That was 1995. By 1997, the Hg i-line light sources were replaced with excimer lasers in volume manufacturing of semiconductor devices. Today, there are more than 2000 excimer-laser-based scanners in use in over 30 semiconductor factories worldwide. Figure 4.1 shows how the spectral power (the ratio of power and linewidth) of KrF and ArF lasers has increased in the past ten years. This increase is fueled by requirements for higher scanner productivity and smaller features in semiconductor devices. The success behind the microelectronics revolution is attributed to many factors, including the excimer laser. The transition from Hg-lamp-light-source technology to excimer technology is clearly illustrated in Figure 4.2. As one can see, this Hg-to-excimer transition was driven by the need to make sub-0.25-μm features in semiconductor devices. Based on Rayleigh's criterion, one would think that the introduction of shorter wavelengths should have occurred much earlier, either in 1993 or 1995. Rayleigh's criterion states that the resolution
FIGURE 4.1 Evolution of spectral power in the last decade for KrF and ArF lasers for lithography, plotted as spectral power (W/pm) versus year (1990–2004); KrF spectral power doubles every 24 months, and ArF doubles every 21 months.
FIGURE 4.2 Electron-beam-sustained KrF laser. The electron beam enters the discharge region through the anode foil. Due to foil heating, high repetition rate operation is not possible. (Schematic elements: electron-beam cathode, anode foil, discharge cathode and anode, laser chamber, gas inlet, voltage monitor, C = 405 nF, and +HV connection to a Marx generator.)
of an imaging lens with a numerical aperture represented by NA is affected by the wavelength, λ, by the relationship

R = k1 λ / NA,    (4.1)
where k1 is the dimensionless process k factor. The larger the process k factor, the easier it is to produce the wafer, but at the expense of the resolution of the imaging lens. KrF at 248 nm would have been a better source wavelength than i-line at 365 nm in 1995 for 0.3-μm features, or even in 1993 for 0.35-μm features. However, associated with each transition in wavelength are enormous technical issues related to photoresists and materials (primarily optical) at the new wavelength. Instead, two techniques were used to extend i-line. One was to increase the NA of the lens from 0.4 to 0.6. The other was to decrease the process k factor from 0.8 to 0.5 by the use of enhanced reticle techniques such as phase shift masks or oblique illumination. By 1995, the development of deep-UV-grade fused silica and 248-nm resists was complete, and the KrF laser became the mainstay of semiconductor manufacturing. Ironically, the issues that prevented the entry of KrF lasers would now extend its usability to beyond 0.18 μm. The entry-feature size for ArF was 0.13 μm because the quality of fused silica, calcium fluoride (optical materials at 193 nm), and resists matured only around 2001. The entry-feature size for F2 is probably 0.07 or 0.05 μm in 2005, presuming fused silica (for reticles) and calcium fluoride quality can go through another round of improvement and robust resists can be developed by that time. Based on what the author has experienced this past decade, the excimer laser will be the source for 16-Gbit DRAM and 10-GHz Micro Processing Unit (MPU) lithography ten years from now. To better understand the role of the excimer laser in the lithography process, it is helpful to establish some simple relationships between laser parameters and corresponding stepper performance. This is presented in Table 4.1. This table also shows how the specifics of these parameters have changed since 1995 (when this chapter was first written) compared to today.
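Equation 4.1 makes the wavelength, NA, and k1 trades easy to quantify. The short sketch below is illustrative only: the i-line parameter shifts are the ones quoted in this paragraph, while the KrF and ArF settings are assumed, representative values.

```python
def resolution_nm(k1, wavelength_nm, na):
    # Rayleigh criterion, Equation 4.1: R = k1 * lambda / NA
    return k1 * wavelength_nm / na

cases = [
    ("i-line, early (k1=0.8, NA=0.4)",      0.8, 365.0, 0.4),
    ("i-line, extended (k1=0.5, NA=0.6)",   0.5, 365.0, 0.6),
    ("KrF, assumed (k1=0.5, NA=0.6)",       0.5, 248.0, 0.6),
    ("ArF, assumed (k1=0.5, NA=0.75)",      0.5, 193.0, 0.75),
]
for label, k1, lam, na in cases:
    print(f"{label:38s} R = {resolution_nm(k1, lam, na):.0f} nm")
```

The extended i-line case lands near 0.3 μm and the assumed ArF case near 0.13 μm, consistent with the node assignments discussed above.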
TABLE 4.1
Excimer Laser Requirements

Laser Specification | Effect on Scanner Performance | Requirements in 1995 | Requirements in 2004
Wavelength | Resolution | KrF at 248 nm | KrF at 248 nm; ArF at 193 nm; F2 at 157 nm (development)
Linewidth | Resolution, depth of focus of projection lens | 0.8 pm @ 248 nm | 0.3 pm @ 248 nm; 0.2 pm @ 193 nm; ~1 pm @ 157 nm
Relative wavelength stability | Focus and resolution of projection lens | ±0.15 pm | ±0.05 pm, all
Absolute wavelength stability | Magnification and distortion of reticle image at wafer | ±0.25 pm | ±0.05 pm, all
Power | Throughput | 10 W @ 248 nm | 30 W @ 248 nm; 40 W @ 193 nm; >40 W @ 157 nm
Repetition rate | Scanner throughput | 1000 Hz @ 248 nm | 4000 Hz, all
Dose stability | Linewidth control | ±0.8% @ 248 nm | ±0.3%, all
Pulse duration | Fused silica life @ 193 nm | No requirement | >75 ns @ 193 nm
Pointing stability | Reticle illumination uniformity | Approx. ±200 μrad | <±50 μrad, all
Polarization stability | Illuminator efficiency | ±5% | ±5% @ 248 nm; ±5% @ 193 nm; ±2% @ 157 nm
Gas lifetime | Uptime | 100 M @ 248 nm | 300 M @ 248 nm; 100 M @ 193 nm; 100 M @ 157 nm
Chamber lifetime | Cost-of-operation | 4 B @ 248 nm | 20 B @ 248 nm; 15 B @ 193 nm; 5 B @ 157 nm
Line narrowing lifetime | Cost-of-operation | 5 B @ 248 nm | 20 B @ 248 nm; 15 B @ 193 nm; 5 B @ 157 nm
Laser metrology lifetime | Cost-of-operation | 5 B @ 248 nm | 20 B @ 248 nm; 15 B @ 193 nm; 5 B @ 157 nm
In the next section, the theory, design, and performance of an excimer laser for lithography are discussed. The basic operating principles of an excimer laser, the wafer exposure process, and stepper and scanner operation are discussed as they pertain to laser operation. The technology is then discussed in detail, along with the changes in the fundamental architecture of these lasers that had to be made to meet power and linewidth requirements.
4.2 Excimer Laser
4.2.1 History
The term "excimer" comes from "excited dimer," a class of molecules that exists only in the upper excited state but not in the ground state. The excimer molecule has a short upper
state lifetime, and it decays to the ground state through dissociation while emitting a photon [1]. There are two types of excimer molecules: rare-gas excited dimers such as Xe2* and Kr2*, and the rare-gas halides, such as XeF*, XeCl*, KrF*, and ArF*. The latter class of excimer molecules is of greater interest because they emit deep-UV photons (351, 308, 248, and 193 nm). The F2 laser is not an excimer laser; it is a molecular laser. However, the principle of operation of the laser is similar to that of a KrF or ArF laser. The concept of using an excimer molecule as a laser medium was initially stated in 1960 [2]. The first successful rare-gas halide lasers were demonstrated in 1975 by several researchers [3,4]. The availability of pulsed energetic electron beams permitted excitation of rare gas and halogen mixtures to create the so-called "e-beam" pumped excimer lasers (Figure 4.2). In these lasers, a short pulse from a high-power electron beam provides the only source of power to the laser gas. The electron beam maintains very high electric fields in the discharge such that electron multiplication due to ionization dominates. If the e-beam is not pulsed, the discharge can collapse into an arc, resulting in the termination of laser output. Therefore, to maintain discharge stability, the electron beam is pulsed. The high efficiencies (about 9% in KrF [5]) and large energies (about 350 J with KrF [6]) obtained from these systems revolutionized the availability of high-power photon sources in the UV. These energetic UV beams found applications in isotope separation, x-ray generation, and spectroscopy [7]. There were several technical problems associated with optics and beam transport at UV wavelengths, especially at these high energies. In addition, electron-beam-pumped lasers suffered from self-pinching of the electron beam due to its own magnetic field and from heating of the foil through which electrons enter the discharge region. These issues limited the growth of electron-beam-pumped lasers in the commercial environment. Commercial excimer lasers belong to a class of lasers called discharge-pumped self-sustained lasers. A self-sustained discharge-pumped laser is similar to an electron-beam laser, with the electron beam turned off. The foil in the electron-beam laser is replaced with a solid electrode, thus avoiding problems of foil heating at high repetition rate. In the first moment after the voltage is turned on across the electrodes, the gas experiences an electric field. The few free electrons that are present (these electrons are created by preionizing the gas prior to or during application of the sustaining voltage) are accelerated in this field, collide with and ionize the gas atoms or molecules, thus creating new electrons that again ionize, and so on, resulting in an avalanche effect known as the Townsend avalanche. At high pressures and high voltages, the avalanche proceeds rapidly from the cathode to the anode, resulting in a diffuse, uniformly ionized region between the electrodes—the so-called glow discharge. During this phase, excited excimer molecules are formed in the region between the electrodes, and it becomes a gain medium. This means that a single photon at the right wavelength will multiply exponentially as it traverses the length of the gain medium. With proper optics, laser energy can be extracted from the gain medium during this glow discharge.
This discharge is self-sustained (i.e., it provides its own ionizing electrons) without an external electron-beam source. With a continued supply of energy, after a sufficient length of time, the discharge becomes an arc. Experience shows that limiting the discharge current density and optimizing the gas mixture and pressure can delay the formation of arcs. Nevertheless, the glow-discharge duration is short, about 20–30 ns. The typical round-trip time of a photon between the two mirrors of the laser is 6–7 ns. This means the photons make very few passes between the mirrors before exiting as useful laser energy. As a result, the output is highly multimode and spatially incoherent. It is this incoherence that makes the excimer laser suitable for lithography, because speckle problems are reduced compared to a coherent beam. However, as will be discussed, the short gain duration also complicates the laser's pulsed-power and spectral-control technology.
4.2.2 Excimer Laser Operation
4.2.2.1 KrF and ArF The potential energy diagram of a KrF laser [8] is shown in Figure 4.3. The radiative lifetime of the upper laser state is about 9 ns, and the dissociative lifetime of the lower level is on the order of 1 ps. Therefore, population inversion in a KrF laser is easily achieved. The upper state (denoted by KrF*) is formed from harpooning collisions between excited Kr (Kr*) and the halogen (F2),

Kr* + F2 → KrF* + F,   (4.2)
and from the ion channel via collisions of F⁻ with rare-gas ions,

Kr⁺ + F⁻ → KrF*.   (4.3)
Numerical calculations [9] indicate that the ion-channel contribution to the formation of KrF* is approximately 20%, with the balance from harpooning collisions. The kinetics of the KrF laser are best understood by referring to the operation of a commercial discharge-pumped KrF laser. A typical electrical circuit used for this purpose is shown in Figure 4.4. The operating sequence of this circuit is as follows:
1. The high-voltage (HV) supply charges the storage capacitor Cs at a rate faster than the repetition rate of the laser. The inductor L across the laser head prevents Cp from being charged. The high-voltage switch (thyratron) is open.
2. The thyratron is then commanded to commute, i.e., the switch is closed. The closing time is about 30 ns.
3. At this point, Cs pulse-charges the capacitors, Cp, at a rate determined by Cs, Cp, and Lm. A typical charge rate of Cp by Cs is 100 V/ns.
4. As Cp charges, voltage appears between the electrodes, and the gas becomes ionized due to the electric fields created by the electrodes.
FIGURE 4.3 Energy diagram for a KrF* excimer laser. KrF* is formed via two reaction channels. It decays to the ground state via dissociation into Kr and F while emitting a photon at 248 nm.
FIGURE 4.4 Typical discharge circuit for an excimer laser.
If the electrode gap, gas mixture, and pressure are right, the voltage on Cp can ring up to higher than the voltage on Cs; therefore, Cp is often referred to as a peaking capacitor (Figure 4.5).
5. When the voltage across Cp reaches a certain threshold, the gap between the electrodes breaks down and conducts heavily. The current between the electrodes rises at a rate determined by Cp and (Lc + Lh).
6. If conditions are correct, the discharge forms an amplifying gain medium; with suitable optics, laser energy can be extracted.
7. Subsequently, any residual energy is dissipated in the discharge and in the electrodes.
There are three distinct phases in the discharge process: the ionization, glow, and streamer phases.
FIGURE 4.5 (a) Voltage on peaking capacitors, current through the laser discharge, and the three discharge phases (ionization, glow, and streamer). (b) Voltage on peaking capacitors, laser pulse waveform, and breakdown voltage, Vb. The laser pulse occurs only during the glow phase.
4.2.2.2 Ionization Phase The ionization phase constitutes sequences 3 and 4 discussed above. It lasts for approximately 100–200 ns, depending on the magnitudes of Cs, Cp, and Lm. Experience shows that the ionization phase needs a minimum of 10⁶–10⁸ electrons per cm³ for its initiation. This is generally achieved by using either arcs or a corona to generate deep-UV photons that ionize the gas and create the electron density. This process is known as preionization and is described later. Ionization proceeds by direct electron excitation of Kr to create Kr⁺ in a two-step process:

Kr + e⁻ → Kr* + e⁻,
Kr* + e⁻ → Kr⁺ + 2e⁻.   (4.4)

However, ionization is moderated by loss of electrons through attachment to F2:

F2 + e⁻ → F⁻ + F.   (4.5)
Subsequently, the electron density grows exponentially as a result of the intense electric fields in the vicinity of the discharge until it reaches about 10¹³/cm³. At this point, gas breakdown occurs, resulting in a rapid drop of the voltage across the electrodes and a rapid rise of the current through them. The next phase (the glow phase) is initiated. 4.2.2.3 Preionization UV preionization of excimer lasers has been the subject of much investigation. Its role in creating a stable glow discharge was realized in 1974 by Palmer [10]. Pioneering investigations by Taylor [11] and Treshchalov [12] showed that achieving discharge stability depends on two main criteria: (1) a very uniform initial electron density in the discharge region, and (2) a minimum preionization electron density. This threshold density depends on the gas mixture and pressure, the concentration of F2 and electronegative impurities, the electrode shape and profile, and the voltage rise time. The uniformity of the initial electron density in the discharge depends on the method of preionization. The common method of preionization, at least for some commercial lasers, is to use an array of sparks located near and along the length of the discharge electrodes (Figure 4.6). These sparks provide a very high level of preionization, and the resultant electron density far exceeds the required minimum. However, it is difficult to achieve sufficient uniformity with discrete, finite sparks, resulting in increased discharge instability.
FIGURE 4.6 Two common methods of UV-preionization in lithography lasers.
Because the sparks are connected to the peaking capacitors, Cp, the discharge current that passes through the peaking capacitors also passes through the pins, resulting in their erosion. Erosion promotes chemical reactions with the laser gas and causes rapid F2 burn-up. At high repetition rates, the spark electrodes can become a source of localized heating of the laser gas and can cause index gradients in their vicinity. The resulting variation in beam position, often referred to as pointing instability of the beam, can be a problem for the optical designer. A variation in preionizer gaps can lead to nonsimultaneous firing of the pins, leading to increased beam-pointing instability. Corona preionization is now the most common preionization technique. A corona preionizer consists of two electrodes of opposite polarity with a dielectric sandwiched between them (Figure 4.6). As with spark preionization, the corona preionizer is located along the length of the discharge electrodes. When one of the corona preionizer electrodes is charged with respect to the other, a corona discharge develops on the surface of the dielectric. Increasing the dielectric constant and the rate of voltage rise can increase the level of preionization. Although corona preionization is considered a relatively weaker source of preionization electrons than the spark preionizer, the uniformity of the preionization is excellent due to the continuous nature of the preionizer (as compared to discrete sparks). Theoretical estimates [13] show that under weak preionization conditions, the voltage rise time across the discharge electrodes should be on the order of 1 kV/ns. This is much faster than the 0.04 kV/ns shown in Figure 4.5. The question arises as to why corona-preionized KrF and ArF lasers work. A possible explanation [14] is that for lasers using F2, electron attachment produces F⁻ ions within a few nanoseconds. For homogeneous discharge development, some of these weakly attached electrons become available again through collisional detachment as the discharge voltage promotes acceleration. This collisional detachment partially compensates for electron loss due to attachment, which could explain why corona-preionized KrF, ArF, and F2 lasers work. 4.2.2.4 Glow Phase During the glow phase, energy from Cp is transferred to the region between the electrodes (sequence 5). The rapid rise of current through the electrodes is controlled only by the magnitudes of Cp and (Lh + Lc); i.e., the current rise time is approximately 1/√[(Lh + Lc)Cp]. This region conducts heavily, and the upper-state KrF* excimer is formed via three-body reactions:

Kr* + F2 + Ne → KrF* + F + Ne,
Kr⁺ + F⁻ + Ne → KrF* + Ne.   (4.6)
The excited KrF* molecule decays to the ground level through dissociation into Kr and F. An amplifying gain medium is created by this dissociation, with photons emitted via both spontaneous (Equation 4.7) and stimulated (Equation 4.8) processes:

KrF* → Kr + F + hν,   (4.7)
KrF* + hν → Kr + F + 2hν.   (4.8)
It has been observed that fluorescence due to spontaneous emission follows the current waveform through the discharge (Figure 4.5), indicating the close dependence of KrF* density on electron density. The lasing, however, does not begin until much later in the discharge. Laser energy is extracted, by means of an optical resonator, when the gain
exceeds a certain threshold. It is estimated [9] that about 40%–60% of the KrF* population is lost as fluorescence before the start of the laser pulse. During the glow phase, the voltage across the electrodes is approximately constant, albeit for a short duration (20–30 ns). This voltage is often referred to as the discharge voltage (Vd), or glow voltage. Vd may be calculated from E/P:

Vd = (E/P) H,   (4.9)
where E is the electric field between the electrodes, P is the total gas pressure, and H is the electrode spacing. The magnitude of the discharge voltage depends primarily on the gas mixture and the total pressure; values have been measured by Taylor [11]. Table 4.2 lists the contribution of the component gases used in a typical commercial laser to the discharge voltage. The discharge voltage is calculated by simply summing the contribution from each component gas. For example, for an electrode spacing of 2 cm at a pressure of 300 kPa with 0.1% F2, 1% Kr, and a balance of Ne, the contributions are 4.0 kV (due to F2), 1.1 kV (due to Kr), and 3.4 kV (due to Ne); therefore, for that laser, the discharge voltage is 8.5 kV. As previously mentioned, the duration of the glow phase is very short: 20–30 ns for a typical KrF, ArF, or F2 laser. It is advantageous to lengthen the glow phase because it permits deposition and extraction of greater energy; more importantly, the energy can be deposited more slowly. A slower rate of energy deposition reduces the discharge peak current and also increases the pulse duration. As will be described later, an increase in the pulse duration can ease the requirements of spectral line narrowing and reduce fused-silica damage. Experimental evidence [11], coupled with theoretical modeling of these complex discharges [15], indicates that the glow-phase duration can be increased by decreasing the F2 concentration and by decreasing the number density of electrons at the onset of the glow phase. The glow phase is initiated by the ionization phase, during which the electron density increases as the field (or voltage) across the electrodes increases. Therefore, to a large extent, the number density of electrons at the onset of the glow phase depends on the voltage across the electrodes just before the initiation of the glow phase; in Figure 4.5, this voltage is referred to as Vb. Thus, it is possible to increase the glow-phase time by reducing the peak voltage across the electrodes and by reducing the concentration of F2. These facts form some of the critical design rules for the laser designer. 4.2.2.5 Streamer Phase After the glow phase, the discharge degenerates into streamers or an arc and the laser intensity drops. Whereas energy is deposited uniformly over the electrode surface during the glow phase, energy is localized during the streamer phase. The total energy deposited in the streamer phase is generally a small fraction of that deposited during the glow phase.

TABLE 4.2 Contribution of the Gases to the Discharge Voltage of Excimer Lasers
Gas    kV/(cm·torr)
Ne     7.5 × 10⁻⁴
Kr     2.4 × 10⁻²
Ar     1.0 × 10⁻²
F2     0.9
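To make the worked example above concrete, the following sketch (not part of the original text) simply sums the per-gas contributions of Table 4.2 for the quoted conditions: a 2-cm electrode gap and a 300-kPa fill with 0.1% F2 and 1% Kr in neon. The function name and the kPa-to-torr conversion are illustrative assumptions.

```python
# Illustrative estimate of the discharge (glow) voltage: sum the per-gas
# contributions of Table 4.2 (kV per cm of gap per torr of partial pressure).

KPA_TO_TORR = 7.50062

e_over_p = {"Ne": 7.5e-4, "Kr": 2.4e-2, "Ar": 1.0e-2, "F2": 0.9}  # kV/(cm*torr)

def discharge_voltage_kv(mix, total_kpa, gap_cm):
    """mix: {gas: fraction of total pressure}; returns estimated Vd in kV."""
    total_torr = total_kpa * KPA_TO_TORR
    return sum(e_over_p[gas] * frac * total_torr * gap_cm
               for gas, frac in mix.items())

mix = {"F2": 0.001, "Kr": 0.01, "Ne": 0.989}
print(round(discharge_voltage_kv(mix, total_kpa=300, gap_cm=2), 1))  # ~8.5 kV
```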
Nevertheless, because of its localized nature, the energy density (and corresponding current density) on the electrodes is very high. The high current density heats the electrode surface, causing it to vaporize. Other deleterious effects of the streamer phase include loss of fluorine due to continuous electrode passivation after each pulse and creation of metal-fluoride dust. At high repetition rates, the residual effects of a streamer adversely affect the energy stability of the following pulse. Maximizing power transfer into the discharge can minimize the energy deposited in the electrodes during this phase. As will be described later, an added benefit of solid-state switching technology (which replaces the thyratron-switched technology shown in Figure 4.4) is that residual energy in the laser's electrical circuit is recovered for the next pulse; therefore, the energy supplied to the electrodes during the streamer phase is quenched. 4.2.3 Laser Design How does the preceding discussion affect the design of the laser? To the laser designer, the discharge voltage, Vd, is the key parameter. The discharge voltage determines the electrode spacing, gas pressure, and the range of operating voltages on the peaking capacitors (Cp). The analysis generally proceeds as follows. Prior to the initiation of the ionization phase, the storage capacitor, Cs, is charged to a DC voltage, V1 (Figure 4.4). The charging current bypasses the peaking capacitors, Cp, because the electrode gap is nonconductive during this time. When the thyratron switches, Cs pulse-charges Cp (Figure 4.4). The voltage on the peaking capacitors, V2, also appears across the electrodes. After a time (the discharge formation time), the number density of electrons reaches a certain threshold value, ne0. At this time, the gap between the electrodes breaks down into the glow phase. The magnitude of ne0 depends, as mentioned before, on the gas mixture and on the voltage on Cp at breakdown (Vb). During the discharge formation time, the voltage waveform (V2) on Cp is
V2 = [b V1/(b + 1)] (1 − cos ωt),   (4.10)

where

b = Cs/Cp,
C = Cp Cs/(Cp + Cs),
ω = 1/√(Lm C).
Therefore, for a given Cp and Cs, the charging rate is controlled by Lm. A typical value of b is between 1.1 and 1.2, and Lm is approximately 100 nH. The designer adjusts b, Lm, and the gas mixture such that the energy transfer between Cs and Cp is nearly complete when the voltage across the electrodes breaks down. At that moment, the ratio of the energy on Cp to that stored in Cs is 4b/(b + 1)²; this is the fraction of the energy stored in Cs that is transferred to Cp. The inductance Lm is large enough that Cs cannot deliver charge directly to the discharge during its short duration (less than 30 ns). The residual energy in Cs rings in the circuit after the glow phase until it is damped in the discharge and circuit elements.
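A minimal sketch of Equation 4.10 (not from the original text), assuming representative values b = 1.15, Lm = 100 nH, and an arbitrary Cp; it is an illustration only, not a circuit design, and simply shows the ring-up above V1 and the energy-transfer fraction 4b/(b + 1)².

```python
import math

V1 = 1.0          # charge voltage on Cs (normalized)
b = 1.15          # b = Cs/Cp (typical 1.1-1.2 per the text)
Cp = 10e-9        # assumed peaking capacitance, farads (placeholder)
Cs = b * Cp
Lm = 100e-9       # henries (typical value quoted in the text)

C = Cp * Cs / (Cp + Cs)          # series combination of Cs and Cp
omega = 1.0 / math.sqrt(Lm * C)  # ring-up angular frequency

def v2(t):
    """Peaking-capacitor voltage during the discharge formation time (Eq. 4.10)."""
    return b * V1 / (b + 1.0) * (1.0 - math.cos(omega * t))

t_peak = math.pi / omega              # time of the first voltage maximum
print(v2(t_peak))                     # peak ring-up = 2b/(b+1) * V1, i.e. > V1
print(4 * b / (b + 1) ** 2)           # fraction of Cs energy delivered to Cp (~0.995)
```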
However, the aforementioned requirements do not automatically guarantee high efficiency or energy. The designer has to compromise among other issues to operate the laser at its highest efficiency. Some of these issues are: electrode profile, electrode gap, total loop inductance in the head (Lc + Lh), gas mixture and pressure, and the materials that come into contact with the laser gas. It can be shown [16] that if the breakdown voltage is twice the discharge voltage, i.e.,

Vb = 2Vd,   (4.11)

the discharge impedance matches the impedance of the discharge circuit, and maximum power is transferred from the circuit to the discharge. Under these conditions, the power transferred to the discharge is

Pd = (E/P)² P² H² √[Cp/(Lc + Lh)].   (4.12)
Therefore, for a given volume, a large electrode gap favors strong discharge pumping. However, when the discharge volume is fixed, increasing H leads to an increase in Lh, which would decrease the power transferred to the discharge. This explains why excimer lasers have a tall, narrow beam. An increase in Cp would increase the power transferred to the discharge (i.e., b ≈ 1), consistent with the comments above. A decrease in Lc and Lh would also help power transfer, although their square-root dependence (Equation 4.12) reduces their sensitivity. It should be noted that increasing H has its own undesirable consequence: higher voltages. The designer thus faces several conflicting issues. Ultimately, however, the design path is based on such considerations as line-narrowed or broadband operation, reliability, cost, and mechanical constraints.
4.2.4 F2 Laser Operation The typical gas mixture of an F2 laser is about 0.1% F2 in helium. The total pressures are usually higher than in KrF or ArF lasers, around 400–600 kPa. A large fraction of the energy during the glow phase goes into ionizing helium (that is, into creating He⁺). A small fraction of excited helium (He*) is also created during this phase. Unlike KrF and ArF, the energy levels involved in laser action are both bound electronic states (from D′ to A′). Some of the dominant reactions leading to the formation of the F2*(D′) state and the subsequent spontaneous and stimulated emissions are the following [17]:
• Charge transfer:
He⁺ + F2 → F2⁺ + He   (4.13)
• He Penning ionization:
He* + F2 → F2⁺ + e⁻ + He   (4.14)
• Dissociation:
He* + F2 → F + F + He   (4.15)
• Dissociative attachment:
e⁻ + F2 → F + F⁻   (4.16)
• Collision:
F⁻ + F2⁺ → F* + 2F   (4.17)
• Production of excited molecular fluorine:
F* + F2 → F2(D′) + F   (4.18)
• Spontaneous emission:
F2(D′) → F2(A′) + hν   (4.19)
• Stimulated emission:
F2(D′) + hν → F2(A′) + 2hν   (4.20)
Because both states involved in lasing are bound states, the radiation has a much narrower linewidth than KrF or ArF. Typically, broadband (non-line-narrowed) KrF and ArF linewidths are about 300 and 500 pm, respectively, at FWHM (Figure 4.7a), whereas the F2 laser linewidth is about 1 pm at FWHM (Figure 4.7b).
FIGURE 4.7 (a) Broadband ArF spectrum (~519 pm FWHM). (b) F2 laser spectrum (~1.1 pm FWHM).
There exists an adjacent electronic state next to the D′ state, and a weaker line exists about 100 pm from the stronger line. 4.2.5 Comments Both KrF and ArF lasers operate in a buffer gas of neon; the kinetics of these lasers favor neon over helium or argon as the buffer gas. On the other hand, an F2 laser operates best with helium as a buffer gas. Typically, a line-narrowed KrF laser is the most efficient (about 0.3%). A similarly line-narrowed ArF laser is about 50% as efficient as KrF, and a single-line F2 laser is about 70%–80% as efficient as a line-narrowed ArF laser. These rules of thumb are generally used to guide system-level laser design.
4.3 Laser Specifications 4.3.1 Power and Repetition Rate The requirement for high power and repetition rate is determined by the throughput requirements of the scanner. The scanner exposure time is given by

Ts = (W + S)/V,   (4.21)
where Ts is the scanner exposure time for a chip, W is the chip width, S is the slit width, and V is the scan speed for the chip. For a given chip width, the slit width must be minimized and the scan speed maximized to reduce Ts. In general, the slit width is matched to the maximum wafer speed:

S = Vm (n/f),   (4.22)
where Vm is the maximum wafer scan speed, n is the minimum number of pulses needed to deliver the dose at the wafer with the required dose stability (the dose is the energy integrated over n pulses, and dose stability is an indicator of how this dose varies from the target dose), and f is the laser's repetition rate. Advanced scanners are capable of scan speeds of up to 500 mm/s. Therefore, to minimize S, the ratio n/f must be minimized. The number n cannot be small, because a finite number of pulses is required to attain a specific dose at the wafer with a given dose stability [18]; a typical number is 100 pulses to attain a dose stability of ±0.25%. Slit widths of 7–8 mm are common. For a 7-mm slit width, a 300-mm/s scanner would require a repetition rate of approximately 4300 Hz. The scan-and-step operation of a scanner and the expose-and-step operation of a stepper impact the operation of the laser. Excimer lasers operate in burst mode, meaning that they expose for a few hundred pulses and then wait for a longer period during wafer exchange (Figure 4.8). Continuous operation, similar to a lamp, would be the preferred operating method of an excimer laser because it permits stabilization of all laser operating conditions, such as gas temperature and pressure. Based on the exposure conditions shown in Figure 4.8, however, continuous operation implies wasting 50% of the pulses. Laser manufacturers now live with burst-mode operation despite the presence of significant transients in energy and beam pointing at the start of a burst.
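As a quick numerical check of Equation 4.21 and Equation 4.22, the sketch below is an illustration only: the 7-mm slit, 300-mm/s scan speed, and 100-pulse dose are the figures quoted above, while the 26-mm chip width is an assumed value.

```python
def required_rep_rate(scan_speed_mm_s, slit_width_mm, n_pulses):
    """Equation 4.22 rearranged: f = Vm * n / S."""
    return scan_speed_mm_s * n_pulses / slit_width_mm

def exposure_time_s(chip_width_mm, slit_width_mm, scan_speed_mm_s):
    """Equation 4.21: Ts = (W + S) / V."""
    return (chip_width_mm + slit_width_mm) / scan_speed_mm_s

print(required_rep_rate(300, 7, 100))      # ~4286 Hz, i.e. roughly 4300 Hz
print(exposure_time_s(26, 7, 300))         # ~0.11 s per scan for an assumed 26-mm chip
```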
FIGURE 4.8 Operating mode of a stepper or scanner.
predictable and the laser’s control software corrects for these transients [19]. After the laser’s controller learns the transient behavior of the laser (Figure 4.9), the subsequent wafers are exposed correctly. In practice, the laser’s software learns about the transient behavior prior to exposing wafers. 4.3.2 Spectral Linewidth As stated by Rayleigh’s criteria, the resolution of an imaging system could be improved by increasing its NA. In the mid-1990s, only fused silica was fully qualified as the suitable lens material at KrF wavelengths. But recently, UV-grade CaF2 materials in sizes large enough 9500
FIGURE 4.9 Transient in laser energy during exposure. The laser control software learns the energy transients and, by the fourth burst, corrects them using a feed-forward technique.
More recently, UV-grade CaF2 materials in sizes large enough for imaging lenses became available. For KrF, the lenses did not correct for chromatic aberration, and the increase in NA severely narrowed the linewidth (ΔλFWHM) requirement:

ΔλFWHM ≈ (n − 1)λ / [2f (dn/dλ)(1 + m) NA²],   (4.23)
where n is the refractive index of the material, λ is the wavelength, dn/dλ is the material dispersion, m is the lens magnification, f is the lens focal length, and ΔλFWHM is the full width at half-maximum (the other common measure of linewidth is Δλ95%, the full width containing 95% of the energy). The equation predicts that at NA = 0.7 and 0.8 at 248 nm, the linewidth requirements would be 0.6 and 0.45 pm, respectively, in close agreement with what lens designers have required for KrF lasers. At 193 nm, due to the large increase in the dispersion of fused silica, the same NA = 0.7 and 0.8 would require linewidths of 0.2 and 0.15 pm, respectively. Such narrow linewidths from excimer lasers were not practical in the mid-1990s when the technical approach for ArF lithography was formed. Hence, ArF lenses were chromatically corrected by using a combination of fused silica and CaF2. However, the delay of at least two years in the availability of large, lens-quality CaF2 delayed the introduction of ArF, and KrF continued to be the laser of choice until recently. At 157 nm, where CaF2 is essentially the only refractive lens material available, chromatic correction is not possible. Attempts to spectrally narrow the F2 linewidth to below 0.20 pm were not successful; therefore, 157-nm scanners would use a catadioptric imaging system. As mentioned earlier, a weak F2 line co-exists with the strong line. This weak line must be suppressed in the catadioptric imaging system; otherwise, it would appear as background radiation at the wafer. 4.3.3 Wavelength Stability The penalty for the high NA of a lens manifests itself in an additional way. The Rayleigh depth of focus (DOF) is related to the NA of the lens by

DOF = ±0.5 λ/NA².   (4.24)
Thus, for a 0.6-NA lens at 248 nm, the DOF is about ±0.35 µm, and for a 0.8-NA lens, the DOF is only about ±0.2 µm. A change in wavelength induces a change in focus; this is shown in Figure 4.10 for a 0.6-NA lens at 248 nm. The permissible change in wavelength to maintain the focus of the lens is large, about 3 pm. However, the 3-pm range also restricts the spectral distribution of the laser line shape. If the laser contains significant energy (greater than 5%) in the tails of its spectrum that extend beyond the acceptable wavelength range, there is a deterioration of the lens imaging properties [20]. The spectral distribution encompasses all the energy from the laser within the DOF of the lens. However, during laser operation, the stochastic processes that cause energy fluctuation also lead to wavelength fluctuation; as a result, the spectral distribution during exposure is actually an envelope formed by the fluctuations. The effect of wavelength fluctuations on focus can be analyzed easily once one realizes that wavelength fluctuations follow a Gaussian distribution, as do energy fluctuations (Figure 4.11). Thus, wavelength fluctuations can be characterized in the same manner as energy fluctuations: as the standard deviation around the mean (σλ) and as the deviation of the average over a number of pulses from the target (λavg). When averaged over 100 pulses (a typical number to expose a chip), λavg is close to target, but the magnitude of σλ is not insignificant (Figure 4.12). During the exposure of the chip, σλ fluctuations tend to broaden the spectrum.
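A minimal sketch of Equation 4.24 (not from the original text), reproducing the depth-of-focus values quoted above for 248 nm:

```python
def rayleigh_dof_um(wavelength_nm, na):
    """Equation 4.24: DOF = +/- 0.5 * lambda / NA^2, returned in micrometers."""
    return 0.5 * wavelength_nm / na**2 / 1000.0

for na in (0.6, 0.8):
    print(na, round(rayleigh_dof_um(248, na), 2))   # ~0.34 um and ~0.19 um
```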
FIGURE 4.10 Measurement of best focus as a function of wavelength, and the lineshape of a KrF laser.
FIGURE 4.11 KrF laser energy and wavelength distribution around a target, measured over 10,000 pulses.
FIGURE 4.12 Wavelength stability for a KrF laser, characterized by two parameters: the wavelength standard deviation around the mean, and the average wavelength around a target wavelength. The averaging is over the number of pulses required to expose a chip, typically 100.
This effect is illustrated in Figure 4.13, which is Figure 4.12 with the 3σλ fluctuation superimposed. The net result is that the specifications of ΔλFWHM and σλ go together; combined, they could cause lens defocus at the wafer, especially for high-NA lenses [20]. The next section contains a discussion of how the operation of lasers at high repetition rates tends to have dramatic effects on wavelength stability. A further impact of non-chromatically-corrected lenses with high NA is their sensitivity to pressure and temperature changes [21]. For the 0.6-NA lens at 248 nm, a 1-mmHg change in pressure results in a focus shift of 0.24 µm, and a 1°C change in temperature induces a focus shift of 5 µm. This high sensitivity to temperature and pressure increases the precision to which corrections need to be made. The shift in focus can be compensated by a shift in laser wavelength. Because these environmental changes can occur during the exposure, a rapid change in wavelength is required, usually at the start of chip exposure. In other words, the laser's target wavelength must be changed, and the laser must reach the new target within a few pulses (fewer than 10). The laser specification that relates to how closely it can maintain its target wavelength during exposure is the average deviation from the target, λavg. Because rapid pressure changes of a fraction of a mmHg can occur, the laser must respond to wavelength changes of a few tenths of a pm (as much as 0.5 pm).
FIGURE 4.13 Illustration of how wavelength fluctuation and lineshape combine to defocus a lens image.
TABLE 4.3 Contribution of Laser Operating Parameters on F2 Laser Wavelength at 157.630 nm

Parameter     Sensitivity                                     Expected Effect
Pressure      0.0018 pm/kPa (helium); 0.00081 pm/kPa (neon)   ±0.0009 pm for ±5 kPa fluctuation with helium
F2            −0.020 pm/kPa                                   ±0.001 pm for ±0.05 kPa fluctuation of F2
Voltage       0.001 pm/V                                      ±0.020 pm for ±20 V fluctuation in voltage
Temperature   −0.0006 to −0.0013 pm/°C                        ±0.0065 pm for a ±5°C fluctuation in temperature
Energy        0.0022 pm/mJ                                    ±0.0022 pm for ±1 mJ fluctuation in energy
In the next section, a description is given of how advanced excimer lasers have adopted new technologies to lock the wavelength to the target to within 0.005 pm. At 157 nm, wavelength tuning is not possible because the laser emits single, narrow lines involving two electronic states of F2. However, the operating pressure and buffer gas type (helium/neon mixtures) have been shown to shift the central wavelength and to broaden the bandwidth [22] due to collisional broadening of these electronic states. Variability in operating parameters (pressure, temperature, voltage, and F2 concentration) has the potential to shift the central wavelength and therefore affect focus (Table 4.3). According to the table, the wavelength could shift by as much as ±0.03 pm due to pulse-to-pulse fluctuation in the operating parameters. This fluctuation cannot be reduced; therefore, the projection lens must accommodate it (Figure 4.14). 4.3.4 Pulse Duration Under exposure to ArF radiation [23], fused silica tends to densify according to the equation

Δρ/ρ = k (N I²/τ)^0.6,   (4.25)

where ρ is the density of fused silica, Δρ is the increase in density, N is the number of pulses in millions, I is the energy density in mJ/cm², k is a sample-dependent constant, and τ is the integral-square pulse duration defined below.
FIGURE 4.14 Pressure-induced shift of the F2 laser line.
The integral-square pulse duration is defined by

τ = (∫P(t) dt)² / ∫P(t)² dt,   (4.26)

where P(t) is the time-dependent power of the pulse.
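The following sketch (illustrative only, not from the original text) evaluates Equation 4.26 numerically for an assumed Gaussian pulse shape with a 25-ns FWHM; for such a pulse the integral-square duration comes out to roughly 1.5 times the FWHM.

```python
import numpy as np

def integral_square_duration(t, p):
    """Equation 4.26: tau = (integral of P dt)^2 / integral of P^2 dt."""
    return np.trapz(p, t) ** 2 / np.trapz(p ** 2, t)

# Assumed example pulse: Gaussian with 25-ns FWHM (a typical excimer pulse length).
fwhm_ns = 25.0
sigma = fwhm_ns / 2.3548
t = np.linspace(-100, 100, 4001)         # time axis, ns
p = np.exp(-0.5 * (t / sigma) ** 2)      # relative power

print(integral_square_duration(t, p))    # ~37.6 ns, i.e. ~1.5x the FWHM
```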
The refractive index of fused silica is affected by densification; after billions of pulses, an irradiated fused-silica lens would seriously degrade the image. Experience has shown that the magnitude of k can vary greatly depending on the fused-silica supplier, meaning that the details of fused-silica manufacturing are important. Also, k is a factor of 10 smaller at 248 nm than at 193 nm; therefore, compaction is primarily observed at 193 nm. One technique to increase lens lifetime is thus to improve the fused silica itself. Another technique is to reduce the intensity, I, by increasing the repetition rate, and a third is to stretch the pulse duration. Numerous fused-silica manufacturers working in conjunction with International Sematech [24] have investigated the first solution, and as a result the quality of fused silica today has improved significantly; the laser manufacturers are investigating the other two. The validity of Equation 4.25 has been questioned [24] at low energy densities (less than 0.1 mJ/cm²), comparable to those experienced by a projection lens. Despite this ongoing debate, we expect that a long pulse duration could soon become an ArF laser specification. 4.3.5 Coherence The requirements for narrower linewidth result in lower beam divergence; as a result, the spatial coherence of the beam improves. A simple relationship [25] between spatial coherence (Cs) and divergence (θ) is

θ Cs ≈ 2λ.   (4.27)
With narrower linewidths, the coherence lengths have increased. Today, for a 0.4-pm KrF laser, the coherence length is about one-tenth of the beam size in the short dimension and about one-fiftieth in the tall dimension. At the same time, a narrower linewidth increases the temporal coherence of the beam. A simple relationship between temporal coherence (CT) and linewidth is

CT = λ²/Δλ.   (4.28)
Do narrow linewidths make the excimer laser no longer an "incoherent" laser source? If so, the lithography optics must correct for coherence effects and a significant advantage of the excimer laser is lost. A simple calculation answers this question. Because the coherence length is 1/10th and 1/50th of the short and tall beam dimensions, respectively, there are 10 × 50, or 500, spatially coherent cells in the beam. The temporal coherence length of a KrF laser with a 0.5-pm linewidth is 123 mm. Combined, this may be interpreted as 500 spatially coherent cells of 123-mm length exiting the laser and then incident upon the chip during the laser pulse. Each of these 123-mm-long cells can cause interference effects, as they are fully spatially coherent. Because the pulse length of an excimer laser is about 25 ns, the number of temporally coherent cells during the pulse (the product of the speed of light and the pulse length, divided by the temporal coherence length) is about 60. Thus, the total number of cells incident on the chip is 500 × 60, or 30,000. All of these contribute to interference effects, or noise, at the chip. The speckle can then be estimated as simply 1/√N, where N is the total number of coherent cells (30,000), or about 0.6%.
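A minimal sketch (not from the original) of the estimate above, assuming the quoted figures: a 0.5-pm linewidth at 248 nm, a 25-ns pulse, and 10 × 50 spatially coherent cells across the beam.

```python
C_LIGHT = 3.0e8                       # speed of light, m/s

wavelength = 248.0e-9                 # m
linewidth = 0.5e-12                   # m (0.5 pm)
pulse_length = 25e-9                  # s
spatial_cells = 10 * 50               # coherent cells across the beam cross-section

temporal_coherence = wavelength**2 / linewidth                 # Equation 4.28, ~0.123 m
temporal_cells = C_LIGHT * pulse_length / temporal_coherence   # ~60
total_cells = spatial_cells * temporal_cells                   # ~30,000

speckle = 1.0 / total_cells**0.5                               # ~0.6%
print(temporal_coherence, round(temporal_cells), round(speckle * 100, 2))
```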
This amount of speckle is not negligible considering the tight tolerance requirements of the features in present-day semiconductor devices. This is the sad fact of life: narrow linewidths imply a coherent beam. These two properties go together, and the lithography optics must handle the coherent excimer beam [25]. 4.3.6 Beam Stability The term beam stability is used here as a measure of how well the beam exiting the laser tracks a specified target at the scanner. Quantitatively, beam stability is measured by position and pointing-angle errors from the target as viewed along the optical axis of the beam. Position stability impacts dose stability (energy per pulse integrated over several pulses) at the wafer, because a shift in beam position induces a shift in transmission through the scanner optics. Pointing instability adversely affects the illumination uniformity at the reticle. To lithography process engineers, the effects of beam instability are not new; both result in loss of CD control. At the 130-nm node and above, the loss of CD control due to beam instability was insignificant and therefore ignored. Below that node, however, it will be shown that unless the beam exiting the beam delivery unit (BDU) is stabilized in position and pointing, the loss in CD control is on the order of 1 nm, which is a significant portion of the total CD control budget. For example, for an MPU gate node of 65 nm, the International Technology Roadmap for Semiconductors (ITRS) allocates a CD control of 3.7 nm; thus, a 1-nm loss of CD control due to the aforementioned instability alone is very significant. To understand how beam stabilization impacts CD control, the role of the illuminator in the optical train of a scanner will be examined. The function of the illuminator is to spatially homogenize, expand, and illuminate the reticle. Figure 4.15 illustrates the key elements of the illuminator and the optical path of a beam along its axis. The first fly's eye element (FE1) segments the beam into multiple beamlets, typically 3 × 3 segments; the focal length of each lens element is 50 mm. The relay lens (R1) directs the output of FE1 to a rod homogenizer (HOM). The HOM is a hexagonal rod with a 10:1 aspect ratio (length:diameter). A beam incident on one face exits the other face after undergoing multiple reflections.
FIGURE 4.15 Key elements of an illuminator. Dr. Russ Hudyma, Paragon Optics for Cymer, Inc., performed the design and simulation.
Thus, at the exit face, a series of uniform virtual point sources is created by the HOM. A zoom relay images the output of the HOM onto the input of the second fly's eye, FE2. Typically, the number of elements in FE2 is higher than in FE1, about 81. The relay lens R2 then channels the output of FE2 to the reticle. For the simulation, the intensity profile at the reticle was first calculated for a beam along the optical axis. Then, the beam was misaligned along the y-axis (Figure 4.15) in steps of 50 µrad up to 400 µrad, and the intensity profile was calculated for each of the misaligned beams. The deviation in each case from the axial beam is the resultant nonuniformity (Figure 4.16). For advanced scanners, the maximum permissible deviation is about 0.5%. This deviation is the sum total from all sources: beam misalignment, optical aberrations, and optical defects. For beam misalignment to be a negligible component of reticle uniformity, it must therefore be much less than 200 µrad; a 50-µrad misaligned beam would result in a 0.1% nonuniformity, which is considered acceptable even for advanced lithography. The simulation presented here was carried further down the optical train, right up to the wafer. Ultimately, the reticle nonuniformity must be translated into CD error. By using a Monte Carlo simulation technique, the across-chip linewidth variation (ACLV), i.e., the average CD variation over a chip, was calculated. The results are shown in Figure 4.17 for a microprocessor with 53- and 35-nm gates, corresponding to the 90- and 65-nm nodes. It can be seen that a 0.5% illumination nonuniformity could result in 1.5 nm of ACLV. To a process engineer in advanced lithography, this variation is unacceptable. We again conclude that the beam must be aligned to within 50 µrad during wafer exposure. The stochastic processes during laser operation that create energy and wavelength fluctuation also create beam instability in pointing and position. But unlike energy and wavelength control, beam instability can only be minimized outside the laser. This is because the angular adjustment of the laser can be done by adjusting the laser resonator mirror; however, this is exactly the optic that is used to adjust wavelength. Because two independent parameters (wavelength and pointing) cannot be adjusted by one adjustment, we introduce a novel beam stabilization control system in the BDU that transports the beam from the laser to the scanner. Such beam stabilization maintains beam position and pointing during exposure of a die on a wafer, virtually eliminating CD control errors. In summary, the requirements of excimer lasers have increased significantly since they were introduced for semiconductor R&D and then for volume production. The exponential
FIGURE 4.16 Effect of beam misalignment on reticle uniformity.
FIGURE 4.17 Effect of illumination nonuniformity on across-chip linewidth variation (ACLV), also known as CD variation. Dr. Alfred Wong, Fortis System, performed this simulation.
growth in power requirements at all wavelengths, in combination with a massive drop in the ΔλFWHM specification, is spurred by the drive for higher-resolution features on wafers and higher wafer throughput from scanners. In the next section, the key technologies that comprise an excimer laser will be described, and the changes that were made to meet these challenging demands will be discussed.
4.4 Laser Modules The lithography excimer laser consists of the following major modules:
1. Chamber
2. Pulsed power
3. Line narrowing
4. Energy, wavelength, and linewidth monitoring
5. Control
6. Support
7. BDU
The control and support modules have kept up with the advances in laser technology and will not be discussed here. Figure 4.18 shows the layout of a commercial excimer laser for lithography. Typical dimensions for a 2-kHz laser are about 1.7 m long, 0.8 m wide, and about 2 m tall. Because these lasers occupy clean-room space, the laser manufacturers are sensitive to the laser's footprint: in the last seven years, as the laser's power increased threefold, its footprint remained virtually unchanged. 4.4.1 Chamber The discharge chamber is a pressure vessel designed to hold high-pressure F2 gas (approximately 0.1% of the total) in a buffer of krypton and neon. Typical operating pressures of excimer lasers range from 250 to 500 kPa, of which about 99% is neon for KrF and ArF (and helium for F2). The chambers are quite massive, usually greater than 100 kg.
FIGURE 4.18 The key modules of an excimer laser for lithography.
One would think that chamber sizes would have increased with output power. However, scientists and engineers have learned how to get more power from the same chamber: the same chamber that produced 2.5 W in 1988 produced 20 W in 1999. Figure 4.19 shows a cross-section of the chamber. The construction materials of present-day chambers are aluminum, nickel, and brass; the only nonmetallic material that comes into contact with the gas is 99.5% pure ceramic. The electrical discharge is initiated between the electrodes. The electrode shape and the gap between the electrodes determine the size and shape of the beam.
FIGURE 4.19 The cross section of a chamber.
Typical beam widths are 2–3 mm, and heights range from 10 to 15 mm. It is advantageous to keep the beam size large: a large beam assures that the beam is multimode and therefore spatially more incoherent than that of small-beam lasers. The energy deposited between the electrodes heats the gas adiabatically. This heating generates pressure waves originating at the electrodes that travel to the chamber walls and other structures, which can reflect the sound waves back to the electrode region. At a typical gas operating temperature of 45°C, the speed of sound in neon is about 470 m/s. In 1 ms, a sound wave must travel 47 cm before it can reach the electrode region coincident with the next pulse. Considering the dimensions of the chamber shown in Figure 4.19, the sound wave must have made a few reflections before it reached the electrode region 1 ms later, so the reflected wave after 1 ms is weak. But at 2 kHz, the sound wave must travel only 23.5 cm before it is coincident with the next pulse. Although the gas temperature determines the exact timing of the arriving waves, the dimensions of the chamber and the location of structures within it almost guarantee that some reflected sound wave is coincident with the next 2-kHz pulse. The situation is much worse at 4 kHz, because the sound wave must travel less than 12 cm. Additionally, during the burst-mode operation of the laser that is typical during scanner exposure, the laser gas temperature changes by several degrees over a few milliseconds. These changing temperatures change the location of the coincident pressure waves from pulse to pulse within the discharge region. In turn, this affects the index of refraction of the discharge region, causing the laser beam to change direction every pulse. The line-narrowing technology described below is sensitive to the angle of the incident beam; a change in the angle of the beam incident on the narrowing module induces a change in wavelength. Figure 4.20 shows this wavelength variation in a chamber (with a line-narrowing module) at a fixed temperature (~45°C) as a function of repetition rate. This variation results in loss of control of the target wavelength and also causes an effective broadening of the spectrum, as discussed in the previous section. This problem manifests itself at high repetition rates and worsens as the repetition rate increases. Also, depending on the gas temperature, the repetition rate, and the location of structures near the discharge, there are some resonant repetition rates where stability is much worse. The effect of these pressure waves can also be seen in the laser's energy stability, beam pointing stability, beam uniformity, and linewidth stability. A proper choice of temperature and of spacing between the discharge and the structures may delay the pressure waves for a particular repetition rate but not for another. Because excimer lasers in lithography applications do not operate at a fixed repetition rate, temperature optimization is not a solution. The other, impractical, solution is to increase the distance of all support structures from the discharge, which would lead to a larger chamber for every increase in repetition rate; at 4 kHz, the cross-section of the chamber would have to be four times larger.
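The timing argument above can be checked with a short sketch (illustrative only), assuming the quoted speed of sound in neon of about 470 m/s at the operating temperature.

```python
SOUND_SPEED_NE = 470.0   # m/s in neon at ~45 C (figure quoted in the text)

def travel_distance_cm(rep_rate_hz):
    """Distance a pressure wave travels between successive pulses."""
    return SOUND_SPEED_NE / rep_rate_hz * 100.0

for f in (1000, 2000, 4000):
    print(f, round(travel_distance_cm(f), 1))   # 47.0, 23.5, 11.8 cm
```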
FIGURE 4.20 Increase in wavelength variation as a function of repetition rate.
Although the presence of these pressure waves does not bode well for future high-repetition-rate excimer lasers, very innovative and practical techniques have been invented by a group of scientists [26]. They introduced several reflecting structures into the chamber shown in Figure 4.19 such that the reflected waves are directed away from the discharge region. These reflecting structures, made from F2-compatible metals, were designed to scatter the pressure waves. The effect of these so-called "baffles" is shown in Figure 4.21: they reduce the wavelength variation by a factor of three for most repetition rates. As the lithography industry continues to strive for higher scanner throughput via higher repetition rates, excimer laser designers will face great technical hurdles related to the presence of pressure waves. 4.4.2 Line Narrowing The most effective line-narrowing technique, implemented on nearly all lithography lasers, is shown in Figure 4.22. This technique utilizes a highly dispersive grating in the Littrow configuration, in which the angle of incidence on the grating equals the angle of diffraction. Due to the dispersive nature of the grating, the linewidth is proportional to the divergence of the beam incident on the grating. Thus, the beam incident on the grating is magnified, usually by a factor of 25–30, to fill the width of the grating. Prisms are used for beam expansion because they maintain the beam wavefront during expansion. Because the beam étendue (the product of beam divergence and beam dimension) is constant, the large beam has reduced divergence, which in turn reduces the linewidth of the laser.
FIGURE 4.21 Wavelength fluctuations are reduced by a factor of three when acoustic damping is introduced in a chamber.
FIGURE 4.22 Line narrowing via prisms and grating.
The beam divergence is also limited by the presence of apertures in the line-narrowing module and near the output mirror. These apertures effectively define the number of transverse modes, i.e., the divergence of the beam. The combination of high magnification, large gratings, and narrow apertures will be used in the future to meet the linewidth requirements; we expect that this technology could be extended to the 0.2-pm range. Because the beam-expansion prisms are made of CaF2 and the grating is reflective, this line-narrowing technology is applicable to both KrF and ArF lasers. For F2 lasers, line narrowing is not required, just line selection; therefore, the grating in Figure 4.22 is not used, and a combination of prisms and mirrors is used instead. The angle of the beam incident on the grating determines the wavelength of the laser; therefore, adjusting this angle adjusts the wavelength. In practice, the mirror shown in Figure 4.22 is adjusted to change the wavelength because it is less massive than the grating. Until recently, simple linear stepping motors were used to accomplish small wavelength changes. Typically, the minimum change in wavelength that could be accomplished was 0.1 pm over a period of 10 ms. This means that at 4000 Hz, the response to a 0.1-pm change would take about 40 pulses, which is nearly half the number used to expose a chip. Also, 0.1 pm corresponds to nearly 20%–30% of the laser's linewidth and is therefore unacceptable. Recent advances [27] in wavelength-control technology have reduced the minimum wavelength change to about 0.01 pm over a period of only 1 ms, or 4 pulses at 4000 Hz. The mirror movement is now carried out via a piezoelectric (PZT)-driven adjustment. The rapid response of the PZT permits tighter control of the laser's wavelength stability, as shown in Figure 4.23. The use of the PZT also permits a rapid change in wavelength to maintain the focus of the lens during the exposure of the chip; thus, active adjustment of lens focus in response to pressure or temperature changes in the lens is now feasible. 4.4.3 Wavelength and Linewidth Metrology Associated with tight wavelength stability to maintain the focus of the lens is the requirement to measure wavelength accurately and quickly (i.e., every pulse). In 1995, a wavelength-measurement precision of ±0.15 pm was adequate.
FIGURE 4.23 Wavelength stability before and after PZT-based active wavelength correction.
Now, wavelength must be measured to a precision of ±0.01 pm, consistent with maintaining wavelength stability to within less than 0.05 pm. In addition, the metrology must be capable of measuring linewidths from 0.8 pm (in 1995) down to 0.2 pm today. The fundamental metrology used to perform these measurements has not changed since 1995. Figure 4.24a shows the layout of the metrology tool integrated into the laser. Today, such tools can measure wavelengths and linewidths at 4000 Hz more precisely than in 1995, without any significant change in their size. The grating and the etalon are used to make an approximate and an accurate measurement of wavelength, respectively. The output from the grating is imaged on a 1024-element silicon photodiode array (PDA). The fringe pattern of the etalon is imaged on one side of the grating signal (Figure 4.24b) on the PDA; the central fringe from the etalon is intentionally blocked so that it does not overlap with the grating signal. The approximate wavelength is calculated directly from the grating equation:

2d sin θ = mλ,   (4.29)
where d is the grating groove spacing, θ is the angle of incidence on the grating, and m is the diffraction order. By selecting d and m appropriately, knowledge of the angle is sufficient to provide the wavelength; the angle is measured from the position of the signal on the PDA. In practice, the PDA is calibrated with a known wavelength, which encompasses all the constants of the grating and imaging optics. This equation gives only an approximate wavelength, determined by the location of the grating signal on the PDA with respect to the wavelength used for calibrating the PDA. In practice, it is adjusted to be within one free spectral range of the etalon (about 5 pm). Knowledge of the approximate wavelength, coupled with the inner and outer diameters of an etalon fringe, is used to calculate the exact wavelength:

λ1 = λ0 + Cd (D1² − D0²) + N × FSR,   (4.30)
where D0 and D1 are defined in Figure 4.24b, λ1 is the wavelength corresponding to D1, λ0 is the calibration wavelength, Cd is a calibration constant that depends on the optics of the setup, FSR is the free spectral range of the etalon, and N is an integer: 0, ±1, ±2, ±3, ….
FIGURE 4.24 Optical components in a wavelength and linewidth metrology tool.
The magnitudes of λ0, Cd, and FSR are predetermined and saved by the tool's controller. The value of N is selected such that

|λ1 − λg| ≤ ½ FSR,   (4.31)
where λg is the approximate wavelength calculated by the grating. Similarly, λ2 is calculated from D2. The final wavelength is the average of λ1 and λ2:

λ = (λ1 + λ2)/2.    (4.32)
Due to the laser's linewidth, each fringe is broadened. The laser's linewidth at full width at half maximum is calculated by

Δλ = (λ1 − λ2)/2.    (4.33)
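The reduction from fringe diameters to wavelength and linewidth in Equation 4.30 through Equation 4.33 is simple enough to sketch in a few lines of code. The following Python fragment is purely illustrative: the values of FSR, λ0, Cd, the fringe diameters, and the grating-channel wavelength are hypothetical placeholders, not parameters of any actual metrology module.

# Illustrative reduction of etalon fringe diameters to wavelength and
# linewidth, following Equation 4.30 through Equation 4.33.
# All numerical values are hypothetical placeholders.

FSR = 5.0        # etalon free spectral range, pm (about 5 pm in the text)
LAMBDA0 = 0.0    # calibration wavelength, written here as an offset in pm
CD = 1.0e-3      # calibration constant of the etalon optics, pm per pixel^2

def etalon_wavelength(D, D0, lambda_grating):
    """Exact wavelength from one fringe diameter D (Equation 4.30).

    The integer N is chosen so that the result lies within half a free
    spectral range of the approximate grating wavelength (Equation 4.31).
    """
    base = LAMBDA0 + CD * (D**2 - D0**2)
    N = round((lambda_grating - base) / FSR)
    return base + N * FSR

# Hypothetical inputs: calibration fringe diameter D0, inner and outer
# diameters D1 and D2 of the measured fringe (PDA pixels), and the
# approximate wavelength from the grating channel (pm offset).
D0, D1, D2 = 100.0, 140.0, 160.0
LAMBDA_G = 9.5

lam1 = etalon_wavelength(D1, D0, LAMBDA_G)
lam2 = etalon_wavelength(D2, D0, LAMBDA_G)

wavelength = 0.5 * (lam1 + lam2)     # Equation 4.32
fwhm = abs(lam1 - lam2) / 2.0        # Equation 4.33, before the fixed
                                     # etalon-resolution correction
print(wavelength, fwhm)

In a real controller the same arithmetic runs on every pulse, followed by the linewidth correction for the finite etalon resolution discussed next.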
Due to the finite finesse of the etalon, its FSR/finesse ratio limits the resolution of the etalon. Typically, the resolution is between 0.4 and 0.8 pm. This finite resolution broadens the measured linewidth in Equation 4.33. The common practice to extract the correct linewidth is to subtract a fixed correction factor from Equation 4.33. As linewidths continue to decrease with each generation of laser, practical techniques to extract linewidth must be refined. An innovative approach [28] uses a double-pass etalon to improve the etalon's resolution for measuring linewidth. The metrology for KrF and ArF is similar. For F2 lasers, practical metrology that fits in a laser must be invented; because F2 lasers are operated at their natural linewidth, the metrology may be simpler than for KrF or ArF lasers.

All metrology tools need periodic calibration to compensate for drifts in the optics of the tool. Fortunately, atomic reference standards exist for all three wavelengths, and laser manufacturers have integrated these standards into the metrology tools. For KrF, the standard is an atomic iron line at 248.3271 nm (wavelength at standard temperature and pressure [STP] conditions). For ArF, the standard is an atomic platinum line at 193.4369 nm (vacuum wavelength). For F2, the standard is D2 at 157.531 nm.

4.4.4 Pulsed Power

Excimer lasers require their input energies to be switched in very short times, typically 100 ns. Thus, for a typical excimer laser for lithography, at 5 J/pulse, the peak power into the laser is 50 MW. This may appear trivial until one considers the repetition rate (greater than 4000 Hz) and lifetime requirements (switch life of greater than 50 billion pulses). High-voltage switched circuits such as those driven by thyratrons have worked well for excimer lasers in other industries such as medicine. For lithography, the switching must be precise and reliable: the precision of the input switched energy must be within 0.1%–0.2% and the reliability of the switching must be 100%. By 1995, it was realized that conventional switching with a high-voltage switch, such as a thyratron, was not appropriate for this industry. Thyratrons were unpredictable (numerous missed pulses) and limited in lifetime. Instead, solid-state switching, using a combination of solid-state switches, magnetic switches, and fast-pulse transformers, was adopted. This proved to be worthwhile; this same technology that switched lasers at 1 kHz would be carried forward to all generations of excimer lasers.

Figure 4.25 is a schematic of a solid-state switched circuit used in a 4000-Hz excimer laser [29]. The power supply charges the capacitor C0 to within 0.1%. Typical voltages are less than 1000 V; thus, a precision of 0.1% corresponds to 1 V. Typical dE/dV values of these lasers are approximately 50 mJ/V. For a 5000-mJ-per-pulse output energy, this corresponds to 1% of the energy. If the laser must achieve a dose stability of 0.3%, the precision of the supply cannot be worse than 0.1%. When the insulated gate bipolar transistor (IGBT) commutes, the energy is transferred to C1. The inductor L0 is in series with the switch to temporarily limit the current through the IGBT while it changes state from open to closed. Typically, the transfer time between C0 and C1 is 5 µs. The saturable inductor L1 holds off the voltage on the capacitor until it saturates, allowing the transfer of energy from C1 through a step-up transformer to the Cp−1 capacitor in a transfer time of about 500–550 ns.
The transformer efficiently transfers the 1000-V, 20,000-A, 500-ns pulse to a 23,000-V, 860-A, 550-ns pulse that is stored in Cp−1.
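As a rough check on these figures, the input and output pulse energies and the step-up ratio can be estimated by treating both pulses as rectangular, and the charging-precision arithmetic quoted above can be reproduced directly. The short Python fragment below is only a back-of-the-envelope consistency check on the numbers in the text, not a circuit model.

# Back-of-the-envelope checks on the pulsed-power numbers quoted above,
# treating the transformer input and output as rectangular pulses.

v_in, i_in, t_in = 1.0e3, 20.0e3, 500e-9       # 1000 V, 20,000 A, 500 ns
v_out, i_out, t_out = 23.0e3, 860.0, 550e-9    # 23,000 V, 860 A, 550 ns

e_in = v_in * i_in * t_in      # ~10 J into the transformer
e_out = v_out * i_out * t_out  # ~10.9 J on Cp-1; the quoted figures are
                               # rounded, so agreement to ~10% is expected
step_up = v_out / v_in         # ~23x voltage step-up

# Charging precision: 0.1% of a 1000-V charge is 1 V; at dE/dV ~ 50 mJ/V
# that is 50 mJ, i.e. about 1% of a 5000-mJ pulse, as stated in the text.
delta_v = 0.001 * 1000.0       # 1 V
delta_e = 50e-3 * delta_v      # 0.05 J = 50 mJ
fraction = delta_e / 5.0       # ~1% of 5 J

print(e_in, e_out, step_up, delta_e, fraction)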
FIGURE 4.25 A solid-state switched-pulsed power circuit capable of operating at 4000 Hz. (Components shown include the power supply, C0, the IGBT with series inductor L0, the commutation stage with C1 and L1, the pulse transformer XFMR, the compression stage with Lp−1, Cp−1, Lh, and Cp, the bias current, and the chamber.)
The saturable inductor Lp−1 holds off the voltage on the Cp−1 capacitor bank for approximately 500 ns and then allows the charge on Cp−1 to flow onto the laser's capacitor Cp in about 100 ns. As Cp is charged, the voltage across the electrodes increases until gas breakdown between the electrodes occurs. The discharge lasts about 50 ns, during which the laser pulse occurs.

The excimer laser manufacturers have accepted the increased complexity of solid-state switched pulsed power because the switches have the capability of recovering the residual energy in the circuit that is not dissipated in the discharge, so that the subsequent pulse requires less input energy [30]. With the solid-state pulsed power module (SSPPM), this energy no longer rings back and forth between the SSPPM and the laser chamber. The SSPPM circuit is designed to transmit this reflected energy all the way back through the pulse-forming network into C0. Upon recovery of this energy onto C0 (Figure 4.26), the IGBT switches off to ensure that the captured energy remains on C0. Thus, regardless of the operating voltage, gas mixture, or chamber conditions, the voltage waveform across the laser electrodes exhibits the behavior of a well-tuned system. This performance is maintained over all laser operating conditions. Today, solid-state switching and chamber developments proceed together. The long-term reliability of a lithography laser is as dependent on the chamber as it is on its pulsed power.

4.4.5 Pulse Stretching

Previous investigations of long-pulse operation of broadband ArF lasers were sporadic [14], with the conclusion that a practical long-pulse ArF laser was not feasible. Recently, however, a simple modification to the circuit shown in Figure 4.25 was proposed [31]. The simplicity of the technique makes it an attractive technology for stretching ArF pulses (Figure 4.27). In this pulsed-power technique, the capacitor Cp in Figure 4.25 is replaced with two capacitors, Cp1 and Cp2, with a saturable inductor, Ls, between them. The compression module in Figure 4.25 charges Cp1. The saturable inductor prevents Cp2 from being charged until Cp1 reaches a voltage close to twice the laser's discharge voltage.
FIGURE 4.26 Voltage on C0 after the laser pulse indicates recovered energy from the discharge. (Voltage on C0 before, during, and after the laser pulse: the input energy, the point where the solid-state switch closes, the laser pulse of ~15–30 ns, and the recovered energy on C0.)
Once Ls saturates, charge is transferred to Cp2 until the discharge breaks down. By adjusting the relative magnitudes of Cp1 and Cp2, two closely spaced pulses are generated: one driven by Cp2 and the next driven by Cp1. Figure 4.28 shows the stretched pulse and compares it with a normal ArF pulse. Due to discharge stability issues, ArF pulses are inherently short, approximately 20–25 ns. The penalty for a long pulse is a degradation of pulse stability by as much as 25%.

The technique now used on lasers that have excess energy is an optical delay-line technique (Figure 4.29). A 2× pulse stretcher doubles the pulse length τ (Equation 4.26). A beam splitter at the input splits the beam by 50%. The transmitted beam generates peak 1. The remaining 50% of the beam reflects off the front surface of the beam splitter towards mirror 1. From mirror 1, the beam traverses mirrors 2, 3, and 4. The mirrors are arranged in a confocal arrangement; therefore, the beam that is incident on the rear surface of the beam splitter is identical to the input beam. About 50% of the beam from mirror 4 is output by the beam splitter, resulting in peak 2. The time interval between the two peaks is the time it takes for light to traverse the four-mirror path shown in Figure 4.29. The mirrors have a typical reflectivity of about 95%.
FIGURE 4.27 A modification to the solid-state switched-pulsed power to extend the laser pulse. (Cp is replaced by Cp1 and Cp2, with a saturable inductor Ls between them, fed from the compression stage and driving the chamber.)
FIGURE 4.28 A comparison of ArF pulse shapes using pulsed power from Figure 4.25 through Figure 4.27, respectively. (Relative power versus time: normal pulse, Tis = 28 ns; stretched pulse, Tis = 45 ns.)
This means that peak 2 is 80% as intense as peak 1 (the reflection from four mirrors is 0.95⁴ ≈ 0.81). Peak 3 would be only 40% as intense as peak 2 (80% reflection due to four mirrors and 50% reflection of the beam splitter), and so on. After about 5 peaks, the stretched pulse terminates. The total loss through a pulse stretcher is about 20%, but from Equation 4.26, the compaction is reduced by 35%. As stated in Table 4.1, the requirement for pulse length is 80 ns. Chaining two pulse stretchers in series can achieve this; the penalty of such an approach is that the losses increase to 40%. Thus, optical pulse stretching is only possible if the laser has enough output margin to compensate for these losses. Fused silica is not used in 157-nm lithography; hence, pulse stretching will not be required there.

4.4.6 Beam Delivery Unit

With the advent of advanced 193-nm systems processing 300-mm wafers, the production lithography cell has undergone a technology shift, because processing 300-mm wafers required the introduction of several new technologies. These included technologies that increase the laser power at 193 nm, the NA of the projection lens, and the speed of the scanner stages. Coupled with the need to maintain high wafer throughput, the scanners must also deliver very tight CD control, to within a few nanometers (typically less than 3 nm). The author believes that certain key technologies, traditionally ignored at 248 nm for 200-mm wafers, must be revisited. This section pertains to one such technology: the mechanism that delivers stable light from the laser to the input of the scanner [32], referred to here as the beam delivery unit (BDU). With a BDU, all laser performance specifications, traditionally defined at the laser exit, are now defined at the BDU exit. The BDU exit is the input to the scanner, the point of use of the laser beam. Thus, the BDU is simply an extension of the laser, and this unit should be integrated with the laser.

A typical BDU is shown in Figure 4.30. The total length of the BDU can be between 5 and 20 m. Although a two-mirror BDU is shown, in practice a BDU can comprise three to five mirrors. The beam exiting the laser is first attenuated. The attenuator is under the control of the scanner and is used to vary the output of the laser from 3% to nearly 100%.
FIGURE 4.29 Optical pulse stretcher based on delay line technique. (An OPuS module: the input beam meets a beam splitter, and mirrors 1–4 form the delay path back to the beam splitter and the output beam. Lower panels show power at 10 mJ versus time: input pulse, TiS ~ 25 ns; output pulse, TiS ~ 50 ns, composed of peaks #1–#5.)
FIGURE 4.30 A typical beam delivery unit delivering light from laser to scanner. (Components shown include the laser, attenuator, beam tubes, turning mirror modules with coarse alignment stages and fast steering mirrors (FSM), beam expander, and a metrology module with photodiode arrays at the beam output to the scanner.)
Such a wide range of output is not possible from the laser alone. Figure 4.31 shows the detail of the attenuator. The attenuation is controlled by the angle of incidence. Normally, the laser output is 95% polarized in one direction (usually horizontal). The attenuator plates reject the remaining 5%, and the output of the attenuator is a 100% polarized, attenuated beam. The beam is then turned 90° with a turning mirror. The mirror is mounted on a fast-steering motor and a slow-adjustment motor, both of which are based on PZT technology. The laser's beam size and divergence usually do not match the scanner requirements. The beam expander ensures the beam has the right size (typically 25×25 mm²) and divergence (~1×1 mrad²). The second turning mirror is similar to the first. Just as the beam exits the BDU and enters the scanner, a beam metrology module measures the laser energy, beam size, beam divergence, beam position, and beam pointing. The energy measurement supplements energy measurements at the laser and in the scanner and helps isolate defective optics, either in the scanner or in the BDU. The beam size and divergence measurements are used to monitor the beam at the scanner input and, as in the case of energy, can be used to isolate defective optics. The beam position and pointing are measured to ensure the beam is located and pointed correctly at the scanner input.

As previously discussed, stochastic processes can induce large beam-pointing fluctuations from the laser; a 100-µrad pointing fluctuation can cause a 2-mm beam position fluctuation at the exit of a 20-m-long BDU. In addition, floor vibrations due to the scanner or other machinery can induce fluctuations in the beam angle. The BDU handles the pointing and position fluctuations using a closed-loop control involving the metrology module and the two turning mirrors. The technology and algorithm used to maintain beam pointing and position are similar to the laser's wavelength control described earlier. The metrology module generates signals proportional to the deviation of the beam position and angle, and the fast steering motors in the turning mirrors compensate for these deviations. The fast-turning mirrors permit active single-shot correction of beam position and angle, resulting in a well-aligned system during exposure. Figure 4.32 shows the performance of the BDU with control on and off. As one can see, without control, the beam pointing can deviate in excess of 200 µrad. From Figure 4.16, this corresponds to an illumination nonuniformity of 0.5% and a CD variation of 0.5 nm. With control on, the pointing variation is negligible.

A problem that has been occasionally observed at 248 nm, and now at 193 nm, is a gradual degradation of BDU transmission even though the laser energy is maintained constant. The degradation is usually significant, around 25%–30%, and occurs over 2–3 billion pulses.
FIGURE 4.31 Optical details of the attenuator. The beam towards the scanner is 100% polarized. (Laser in; optical plates and compensating optical plates at a variable angle of incidence; horizontally polarized light is transmitted to the scanner, unwanted vertically polarized light is rejected, and the beam deviation is compensated.)
FIGURE 4.32 Beam pointing as seen at the input to the scanner with beam stabilization control on and off. The numbers 31% and 75% refer to the duty cycle use of the laser. Without control, the beam drifts rapidly. (Horizontal pointing in µrad versus burst count.)
As a result, the intensity at the wafer decreases by the corresponding amount. Usually, increasing the number of exposure pulses by 25%–30% compensates for this decrease in intensity. The degradation in transmission is normally due to contamination-induced damage of the BDU optics. The energy density in the BDU is much higher than in the scanner, typically 2–10 mJ/cm². Unless steps are taken to keep the contaminants (hydrocarbons, oxygen, etc.) low, the coated BDU optics can degrade rapidly. The technology of long-life, contaminant-free opto-mechanical assemblies has already been developed in the laser; this explains why laser optical modules can last longer than 10 billion pulses without any degradation. By applying similar technologies, the lifetime of the BDU optics can be increased significantly. This is shown in Figure 4.33: after 17 billion pulses, the transmission of a BDU is unchanged. In other words, the intensity of light entering the scanner remained unchanged over 17 billion pulses. At a scanner usage of 8–10 billion pulses per year, this corresponds to about two years of operation.

In summary, although laser performance has gone through rapid changes, there is a need to change how light is delivered to the scanner. Technology to do this effectively and efficiently already exists in the laser. As a result, the scanner is assured a laser beam with fixed, stable beam properties, and the process engineer reaps the biggest benefit: CD control.

4.4.7 Master Oscillator: Power Amplifier Laser Configuration

An examination of technology roadmaps, such as the one published by the ITRS, indicates that the power requirements for excimer lasers increase dramatically to match the throughput requirements of scanners. Thus, for ArF, the output power was 40 W in 2003 and will be 60 W by 2005, as compared to 20 W in 2001. Likewise, the linewidths decrease with shrinking feature size, to 0.18 pm in 2005 from about 0.25 pm in 2003 and 0.4 pm in 2001.
FIGURE 4.33 BDU transmission (i.e., the amount of energy entering the scanner with respect to energy from the laser) as a function of BDU pulses. (Transmission in percent versus BDU life in billions of pulses; within the accuracy of the measurements, there is no decrease in transmission.)
Power increases have been handled by increasing the repetition rate of the lasers while maintaining the same energy per pulse. However, this has resulted in increasing blower power to move the gas between the electrodes in the chamber. Given that blower power increases as the cube of the laser power, everything else being held constant [33], a 40-W ArF laser would consume 28,000 W (~37 hp) of power compared to 3500 W (~4.6 hp) for a 20-W laser (Figure 4.34). Likewise, these increasing power requirements severely stress the thermal capacity of the line-narrowing technology. An alternate approach to achieving higher powers, albeit a major shift in laser architecture, is to freeze the repetition rate and increase the energy. A single-chamber laser with associated line narrowing cannot provide the increased energy, as the line-narrowing technology would be severely stressed at high power.
FIGURE 4.34 The drastic increase of blower power as the laser power increases. For an 80-W ArF laser, the total power into the system would be nearly 160 kW. (System power in kW versus output power at 193 nm; total power = HV power + blower power.)

FIGURE 4.35 A master oscillator power amplifier configuration.
Instead, the increase in energy is achieved by using a master oscillator, power amplifier configuration (Figure 4.35). A low-power, high-performance laser (the master oscillator, MO) produces the required narrow linewidth at low energy, and a high-gain amplifier (the power amplifier, PA) boosts the output power to the required levels. In practice, the MO laser is triggered first and the PA is triggered about 20–30 ns after the MO. The so-called MOPA architecture has shown extremely promising results and was introduced as a 4-kHz, 40-W, 0.25-pm-linewidth ArF product in 2003 (Figure 4.36). With this shift in laser architecture, the author believes that excimer technology can continue to support the aggressive technology roadmaps of the semiconductor industry [34]. The advantages of the MOPA configuration are the following:

1. The functions of generating the required spectrum and generating raw power are separated.
   a. The line-narrowing optics operate at greatly reduced power, thereby increasing lifetime and decreasing optics heating.
   b. The MO need not produce high energies, making it easier to achieve ultranarrow spectral bandwidths. In the XLA 100 product, the MO generates between 1 and 2 mJ per pulse, significantly reducing the power loading on the optics. In comparison, a single-chamber, 20-W laser generates 5 mJ per pulse.
   c. The service life of the MO gain generator (discharge chamber) is greatly extended, as a 5-mJ-per-pulse chamber is being operated at 1–2 mJ per pulse.
2. The power amplifier has tremendous operational overhead (Figure 4.37), thereby extending the lifetime of the PA chamber and allowing flexibility in system design.
FIGURE 4.36 A commercial MOPA laser, the XLA 100 from Cymer, Inc. The laser produces 40 W with a linewidth of 0.25 pm. (The MO and PA are indicated.)
FIGURE 4.37 Output from a MOPA laser in comparison to a single-chamber laser. (Laser energy in mJ versus input voltage in V, for a MOPA laser and a single-chamber 4-kHz laser.)
This leads to increased service life. Furthermore, the energy overhead could be used to compensate for losses in the pulse stretcher and BDU. As a result, the scanner receives at its input the full power of the laser with a long pulse.

3. The power amplifier works in the saturated regime. As a result, the MOPA's energy stability is superior (by a factor of 2–3) to that of a single-chamber laser (Figure 4.38). This fact, combined with the higher pulse energy of the laser, permits wafer exposure with fewer pulses, which leads to a decrease in the cost of consumables.
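To see why better pulse-to-pulse stability and higher pulse energy go together, note that if the pulse-energy fluctuations are roughly uncorrelated, the dose integrated over N pulses fluctuates by roughly the single-pulse stability divided by the square root of N. The Python fragment below is only an illustration of that scaling under that assumption; the stability percentages and pulse counts are hypothetical, not specifications of any product.

# Illustrative dose-stability scaling, assuming roughly uncorrelated
# pulse-to-pulse energy fluctuations (sigma_dose ~ sigma_pulse / sqrt(N)).
# All numbers are hypothetical.

from math import sqrt

def dose_stability(sigma_pulse_pct, n_pulses):
    """Approximate 1-sigma dose stability (%) for n_pulses per exposure."""
    return sigma_pulse_pct / sqrt(n_pulses)

# Hypothetical single-chamber case: poorer per-pulse stability, many pulses
print(dose_stability(6.0, 50))    # ~0.85%

# Hypothetical MOPA case: 2-3x better per-pulse stability, so a comparable
# dose stability is reached with far fewer, more energetic pulses
print(dose_stability(2.5, 10))    # ~0.79%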
FIGURE 4.38 MOPA's energy stability as compared to the energy stability of a single-chamber laser (MO). (MO and MOPA energy 1σ deviation in percent versus pulse count, both at 4 kHz.)
FIGURE 4.39 MOPA preserves MO linewidth independent of MOPA energy. (Spectral width in pm versus MOPA energy in mJ: MO and MOPA Δλ95% and Δλ FWHM.)
There are some practical MOPA performance concerns that we will discuss:

1. Does the PA preserve the spectral shape (bandwidth) of the MO? The PA preserves the MO spectrum (Figure 4.39). There is no change in the FWHM, although the PA Δλ95% is slightly lower. There appears to be no variation in linewidth with increasing output energy from the MOPA.
2. What is the level of amplified spontaneous emission (ASE)? ASE levels depend on the MO–PA timing. If the PA is triggered outside a certain window with respect to the MO, the ASE levels increase, with a corresponding decrease in MOPA energy (Figure 4.40). Generally, the ASE levels are kept to less than 0.1% at the laser, which means the timing window must be approximately 20 ns.
3. How closely do the MO and PA have to be synchronized? The MO and PA can be synchronized to within ±3 ns (Figure 4.41) thanks to solid-state pulsed-power technology. This results in stable MOPA energy.
FIGURE 4.40 The delay window between MO and PA to keep ASE levels below 0.1% of the laser energy is about 20 ns. (MOPA energy in mJ and the ratio of energy in the ASE pedestal to laser energy, versus time delay τMO − τPA in ns.)
FIGURE 4.41 MO and PA lasers are synchronized with a timing jitter of ±2 ns. The jitter control software maintains the timing between the two via a dithering technique, hence the bimodal distribution. (Frequency count versus jitter deviation from the mean in ns.)
4. What will the overall system power draw be? The higher efficiency of the MOPA leads to lower system power at the higher energies, in spite of driving two chambers. With a fixed repetition rate, there is no increase in MOPA blower power with energy (Figure 4.42). For ArF, the crossover point is around 30 W.
5. What about optical damage at the higher energies? Optical damage is an issue at higher energy due to fused-silica compaction, especially at 193 nm. However, doubling the pulse length can compensate for doubling the energy (Equation 4.25). By the use of an external pulse stretcher, the pulse length can be doubled or quadrupled. The associated optical losses can be compensated by the laser's energy overhead.

4.4.8 Module Reliability and Lifetimes

When excimer lasers were introduced for volume production, their reliability was a significant concern. Various cost-of-operation scenarios were created that portrayed the excimer laser as the cost center of the lithography process.
FIGURE 4.42 When the laser power exceeds about 30 W at 193 nm, the power consumed by dual-chamber MOPA technology is less than that consumed by single-chamber technology. (System power in kW versus laser output power at 193 nm, for single-chamber and MOPA technology; crossover power ~30 W.)
FIGURE 4.43 The uptime of excimer lasers for lithography, based on 2000 lasers, is 99.8% and the MTBF exceeds 4500 h. (Monthly uptime in percent and MTBF in hours, November 2002 through October 2003, with average.)
The cost of operation is governed by three major components in the laser: the chamber, the line-narrowing module, and the metrology module. The chamber's efficiency degrades as a function of lifetime measured in pulses, due to erosion of its electrodes; as a result, its operating voltage increases until it reaches the top of the laser's operating voltage range. The line-narrowing module's grating reflectivity degrades with lifetime, which eventually makes the module unusable. The metrology tool's etalons and internal optics degrade due to coating damage until they cannot measure linewidth correctly. In all cases, the end of life is gradual as a function of pulses. Thus, the end of life of a module can be predicted, and that module can be replaced before the laser becomes inoperable. Today, most excimer lasers in a production environment have an uptime of 99.8% (Figure 4.43).

Laser manufacturers have combined good physics and engineering to make remarkable strides in lifetime. Figure 4.44 shows a comparison of chamber lifetime in 1996 to that of today. This is remarkable considering that the present-day 20-W chamber is slightly smaller in size than the one manufactured in 1996.
FIGURE 4.44 The increase in the chamber's operating voltage as a function of the number of pulses on the chamber. An increase in voltage indicates a decrease in efficiency. Beyond the laser's operating range, the output stability of the laser suffers, making the chamber unusable. (Voltage in V versus chamber life in billions of pulses: 10 W, 1000 Hz, Δλ = 0.8 pm, ~3B pulses in 1996; 20 W, 2500 Hz, Δλ = 0.5 pm, ~16B pulses in 2002; laser maintenance is performed when the voltage leaves the laser's operating range.)
Similarly, the optical module lifetimes have improved fivefold through a combination of durable coatings and materials, an understanding of the damage mechanisms that limit coating lifetime, and systematic studies of the interaction of matter with deep-UV light. Today, the lifetimes of the chamber, line-narrowing module, and metrology module are 12 billion, 15 billion, and 15 billion pulses, respectively. Thanks to MOPA technology, ArF will match KrF, and F2 will not be far behind. If history is any indicator, the lifetimes of all modules will continue to improve.
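The lifetime figures above translate directly into replacement intervals. The small Python fragment below is only a back-of-the-envelope illustration, using the module lifetimes just quoted and the 8–10 billion pulses per year of scanner usage cited earlier for the BDU; actual service schedules depend on duty cycle and on how the laser is operated.

# Rough module-replacement intervals implied by the lifetimes quoted above,
# assuming 8-10 billion pulses of laser usage per year. Illustrative only.

lifetimes_bpulses = {
    "chamber": 12.0,
    "line-narrowing module": 15.0,
    "metrology module": 15.0,
}

for usage in (8.0, 10.0):  # billion pulses per year
    for module, life in lifetimes_bpulses.items():
        print(f"{module}: ~{life / usage:.1f} years at {usage:.0f}B pulses/year")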
4.5 Summary

This chapter has reviewed developments since excimer lasers became the light source for lithography. Today, excimer laser manufacturers are rapidly advancing the state of the technology at all three wavelengths: 248, 193, and 157 nm. With the successful commercialization of the MOPA-based excimer laser and the actively stabilized beam delivery unit, future power, linewidth, stability, productivity, and lifetime requirements can be met at all three wavelengths. Furthermore, these specifications can be met by the laser at its point of use: the scanner entrance.
References
1. M. Krauss and F.H. Mies. 1979. In Excimer Lasers, C.K. Rhodes, ed., Berlin: Springer.
2. F.G. Houtermans. 1960. Helv. Phys. Acta, 33: 933.
3. S.K. Searles and G.A. Hart. 1975. Appl. Phys. Lett., 27: 243.
4. C.A. Brau and J.J. Ewing. 1975. Appl. Phys. Lett., 27: 435.
5. M. Rockni, J.A. Mangano, J.H. Jacobs, and J.C. Hsia. 1978. IEEE J. Quantum Electron., 14: 464.
6. R. Hunter. 1977. In 7th Winter Colloquium on High Power Visible Lasers, Park City, UT.
7. C.K. Rhodes and P.W. Hoff. 1979. In Excimer Lasers, C.K. Rhodes, ed., Berlin: Springer.
8. U.K. Sengupta. 1993. Opt. Eng., 32: 2410.
9. R. Sze. 1979. IEEE J. Quantum Electron., 15: 1338.
10. A.J. Palmer. 1974. Appl. Phys. Lett., 25: 138.
11. R.S. Taylor. 1986. Appl. Phys. B, 41: 1.
12. A.B. Treshchalov and V.E. Peet. 1988. IEEE J. Quantum Electron., 24: 169.
13. S.C. Lin and J.I. Levatter. 1979. Appl. Phys. Lett., 34: 8.
14. J. Hsia. 1977. Appl. Phys. Lett., 30: 101.
15. J. Coutts and C.E. Webb. 1986. J. Appl. Phys., 59: 704.
16. D.E. Rothe, C. Wallace, and T. Petach. 1983. In Excimer Lasers, C.K. Rhodes, H. Egger, and H. Pummer, eds, New York: American Institute of Physics, pp. 33–44.
17. J.R. Woodworth and J.K. Rice. 1978. J. Chem. Phys., 69: 2500.
18. K. Suzuki, K. Ozawa, O. Tanitsu, and M. Go. 1995. "Dosage control for scanning exposure with pulsed energy fluctuation and exposed position jitter," Jpn. J. Appl. Phys., 34: 6565.
19. D. Myers, H. Besaucele, P. Das, T. Duffey, and A. Ershov. 2000. Reliable, modular, production quality, narrow band, high repetition rate excimer laser. US Patent 6,128,323.
20. A. Kroyan, N. Ferrar, J. Bendik, O. Semprez, C. Rowan, and C. Mack. 2000. Modeling the effects of laser bandwidth on lithographic performance. In Proceedings of SPIE, Vol. 4000, 658.
21. H. Levinson. 2001. Principles of Lithography, Bellingham, WA: SPIE Press.
22. R. Sandstrom, E. Onkels, and C. Oh. 2001. ISMT Second Annual Symposium on 157 nm Lithography.
23. W. Oldham and R. Schenker. 1997. "193-nm lithographic system lifetimes as limited by UV compaction," Solid State Technol., 40: 95.
24. R. Morton, R. Sandstrom, G. Blumentock, Z. Bor, and C. Van Peski. 2000. "Behavior of fused silica materials for microlithography irradiated at 193 nm with low-fluence ArF radiation for tens of billions of pulses," Proc. SPIE, 4000: 507.
25. Y. Ichihara, S. Kawata, I. Hikima, M. Hamatani, Y. Kudoh, and A. Tanimoto. 1990. "Illumination system of an excimer laser stepper," Proc. SPIE, 1138: 137.
26. W. Partlo, I. Fomenkov, J. Hueber, Z. Bor, E. Onkels, M. Cates, R. Ujazdowski, V. Fleurov, and D. Gaidarenko. 2001. US Patent 6,317,447.
27. J. Algots, J. Buck, P. Das, F. Erie, A. Ershov, I. Fomenkov, and C. Marchi. 2001. Narrow band laser with fine wavelength control. US Patent 6,192,064.
28. A. Ershov, G. Padmabandu, J. Tyler, and P. Das. 2000. "Laser spectrum line shape metrology at 193 nm," Proc. SPIE, 4000: 1405.
29. W. Partlo, D. Birx, R. Ness, D. Rothweil, P. Melcher, and B. Smith. 1999. US Patent 5,936,988.
30. W. Partlo, R. Sandstrom, I. Fomenkov, and P. Das. 1995. In SPIE Proc., Vol. 2440, p. 90.
31. T. Hofmann, B. Johnson, and P. Das. 2000. "Prospects for long pulse operation of ArF lasers for 193 nm microlithography," Proc. SPIE, 4000: 511–518.
32. L. Lublin, D. Warkentin, P. Das, A. Ershov, J. Vipperman, R. Spangler, and B. Klene. 2003. "High-performance beam delivery unit for next-generation ArF scanner systems," Proc. SPIE, 5040: 1682.
33. V. Fleurov, D. Colon III, D. Brown, P. O'Keeffe, H. Besaucele, A. Ershov, F. Trintchouk et al. 2003. "Dual-chamber ultra-narrowed excimer light source for 193 nm lithography," Proc. SPIE, 5040: 1694.
34. D. Knowles, D. Brown, H. Beasucele, D. Myers, A. Ershov, W. Partlo, and R. Sandstrom et al. 2003. Very narrow band, two chamber, high rep rate gas discharge laser system. US Patent 6,625,191.
5 Alignment and Overlay
Gregg M. Gallatin

CONTENTS
5.1 Introduction 287
5.2 Overview and Nomenclature 288
5.2.1 Alignment Marks 289
5.2.2 Alignment Sensors 289
5.2.3 Alignment Strategies 291
5.2.4 Alignment vs. Leveling and Focusing 292
5.2.5 Field and Grid Distortion 292
5.2.6 Wafer vs. Reticle Alignment 293
5.3 Overlay Error Contributors 293
5.3.1 Measuring Overlay 294
5.4 Precision, Accuracy, Throughput, and Sendaheads 295
5.5 The Fundamental Problem of Alignment 295
5.5.1 Alignment-Mark Modeling 297
5.6 Basic Optical Alignment Sensor Configurations 303
5.7 Alignment Signal Reduction Algorithms 306
5.7.1 Threshold Algorithm 309
5.7.1.1 Noise Sensitivity of the Threshold Algorithm 310
5.7.1.2 Discrete Sampling and the Threshold Algorithm 311
5.7.2 Correlator Algorithm 312
5.7.2.1 Noise Sensitivity of the Correlator Algorithm 316
5.7.2.2 Discrete Sampling and the Correlator Algorithm 316
5.7.3 Fourier Algorithm 317
5.7.3.1 Noise Sensitivity of the Fourier Algorithm 318
5.7.3.2 Discrete Sampling and the Fourier Algorithm 319
5.7.3.3 Application of the Fourier Algorithm to Grating Sensors 320
5.7.4 Global Alignment Algorithm 320
Appendix 324
References 327

5.1 Introduction

This chapter discusses the problem of alignment in an exposure tool and its net result: overlay. Relevant concepts are described and standard industry terminology is defined.
The discussion has purposely been kept broad and tool-nonspecific. The content should be sufficient to make understanding the details and issues of alignment and overlay in particular tools relatively straightforward. The following conventions will be used: orthogonal Cartesian in-plane wafer coordinates are (x,y) and the normal to the wafer or out-of-plane direction will be the z axis.
5.2 Overview and Nomenclature

As discussed in other chapters, integrated circuits are constructed by successively depositing and patterning layers of different materials on a silicon wafer. The patterning process consists of a combination of exposure and development of photoresist followed by etching and doping of the underlying layers and deposition of another layer. This process results in a complex and, on the scale of microns, very nonhomogeneous material structure on the wafer surface. Typically, each wafer contains multiple copies of the same pattern called "fields" arrayed on the wafer in a nominally rectilinear distribution known as the "grid." Often, but not always, each field corresponds to a single "chip." The exposure process consists of projecting the image of the next level pattern onto (and into) the photoresist that has been spun onto the wafer. For the integrated circuit to function properly, each successive projected image must be accurately matched to the patterns already on the wafer. The process of determining the position, orientation, and distortion of the patterns already on the wafer and then placing the projected image in the correct relation to these patterns is termed "alignment." The actual outcome, i.e., how accurately each successive patterned layer is matched to the previous layers, is termed overlay.

The alignment process requires, in general, both the translational and rotational positioning of the wafer and/or the projected image as well as some distortion of the image to match the actual shape of the patterns already present. The fact that the wafer and the image need to be positioned correctly to get one pattern on top of the other is obvious. The requirement that the image often needs to be distorted to match the previous patterns is not at first obvious, but is a consequence of the following realities. No exposure tool or aligner projects an absolutely perfect image: all images produced by all exposure tools are slightly distorted with respect to their ideal shape, and different exposure tools distort the image in different ways. Silicon wafers are not perfectly flat or perfectly stiff, and any tilt or distortion of the wafer during exposure, either fixed or induced by the wafer chuck, results in distortion of the as-printed patterns. Any vibration or motion of the wafer relative to the image that occurs during exposure and is unaccounted for or uncorrected by the exposure tool will "smear" the image in the photoresist. Thermal effects in the reticle, the projection optics, and/or the wafer will also produce distortions. The net consequence of all this is that the shape of the first level pattern printed on the wafer is not ideal, and all subsequent patterns must, to the extent possible, be adjusted to fit the overall shape of the first level printed pattern.

Different exposure tools have different capabilities to account for these effects; in general, however, the distortions or shape variations that can be accounted for include x and y magnification and skew. These distortions, when combined with translation and rotation, make up the complete set of linear transformations in the plane. They are defined and discussed in detail in the Appendix. Because the problem is to successively match the projected image to the patterns already on the wafer and not simply to position the wafer itself, the exposure tool must effectively be able to detect or infer the relative position, orientation, and distortion of both the wafer
patterns themselves and the projected image. The position, orientation, and distortion of the wafer patterns are always measured directly, whereas the image position, orientation, and distortion are sometimes measured directly and sometimes inferred from the reticle position after a baseline reticle-to-image calibration has been performed.

5.2.1 Alignment Marks

It is difficult to directly sense the circuit patterns themselves; therefore, alignment is accomplished by adding fiducial marks, known as alignment marks, to the circuit patterns. These alignment marks can be used to determine the reticle position, orientation, and distortion and/or the projected image position, orientation, and distortion. They can also be printed on the wafer along with the circuit pattern and hence can be used to determine the wafer pattern position, orientation, and distortion. Alignment marks generally consist of one or more clear or opaque lines on the reticle, which then become "trenches" or "mesas" when printed on the wafer. More complex structures, such as gratings (simply periodic arrays of trenches and/or mesas) and checkerboard patterns, are also used. Alignment marks are usually located either along the edges or "kerf" of each field, or a few "master marks" are distributed across the wafer. Although alignment marks are necessary, they are not part of the chip circuitry and therefore, from the chip makers' point of view, they waste valuable wafer area or "real estate." This drives alignment marks to be as small as possible, and they are often less than a few hundred microns on a side.

In principle, it would be ideal to align to the circuit patterns themselves, but this has so far proved to be very difficult to implement in practice. The circuit pattern printed in each layer is highly complex and varies from layer to layer. This approach, therefore, requires an adaptive pattern recognition algorithm; although such algorithms exist, their speed and accuracy are not equal to those obtained with simple algorithms working on signals generated by dedicated alignment marks.

5.2.2 Alignment Sensors

To "see" the alignment marks, alignment sensors are incorporated into the exposure tool, with separate sensors usually being used for the wafer, the reticle, and/or the projected image itself. Depending on the overall alignment strategy, each of these sensors may be an entirely separate system, or they may be effectively combined into a single sensor. For example, a sensor that can "see" the projected image directly would nominally be "blind" with respect to wafer marks, and hence a separate wafer sensor is required. But a sensor that "looks" at the wafer through the reticle alignment marks themselves is essentially performing reticle and wafer alignment simultaneously, and therefore no separate reticle sensor is necessary. Note that in this case the positions of the alignment marks in the projected image are being inferred from the position of the reticle alignment marks, and a careful calibration of reticle-to-image positions must have been performed prior to the alignment step.

There are two generic system-level approaches for incorporating an alignment sensor into an exposure tool, termed "through-the-lens" and "not-through-the-lens" or "off-axis" (Figure 5.1). In the through-the-lens (TTL) approach, the alignment sensor looks through the same, or mostly the same, optics that are used to project the aerial image onto the wafer.
In the not-through-the-lens (NTTL) approach, the alignment sensor uses its own optics that are completely or mostly separate from the image-projection optics. The major advantage of TTL is that, at least to some extent, it provides “common-mode rejection” of optomechanical instabilities in the exposure tool. That is, if the projection optics move, then, to first order, the shift in the position of the projected
FIGURE 5.1 (a) Simplified schematic of a TTL alignment system. In such a system, the wafer marks are viewed through the projection optics. (b) Simplified schematic of an NTTL alignment system. In such a system, the wafer marks are viewed with an optical system that is completely separate from the projection optics.
image at the wafer plane matches the shift in the image of the wafer as seen by the alignment sensor. This cancellation helps desensitize the alignment process to optomechanical instabilities. The major disadvantage of TTL is that it requires the projection optics to be simultaneously good for exposure as well as alignment. Because alignment and exposure generally do not work at the same wavelength, the imaging capabilities of the projection optics for exposure must be compromised to allow for sufficiently accurate performance of the alignment sensor. The net result is that neither the projection optics nor the alignment sensor provides optimum performance.

The major advantage of the NTTL approach is precisely that it decouples the projection optics and the alignment sensor, therefore allowing each to be independently optimized. Also, because an NTTL sensor is independent of the projection optics, it is compatible with different tool types such as i-line, DUV, and EUV. Its main disadvantage is that optomechanical drift is not automatically compensated, and hence the "baseline" between the alignment sensor and the projected image must be recalibrated on a regular basis, which can reduce throughput. The calibration procedure is illustrated in Figure 5.2. The TTL approach requires that this same projected-image-to-alignment-sensor calibration be made as well, but it does not need to be repeated as often.

Further, as implied above, essentially all exposure tools use sensors that detect the wafer alignment marks optically. The sensors project light at one or more wavelengths onto the wafer and detect the scattering/diffraction from the alignment marks as a function of position in the wafer plane. Many types of alignment sensor are in common use, and their optical configurations cover the full spectrum from simple microscopes to heterodyne grating interferometers.
FIGURE 5.2 The two steps in the calibration process of an NTTL system are shown. (a) The projected image of the reticle at the exposure or actinic wavelength is located in wafer-stage coordinates using a fiducial and detector mounted on the wafer stage. (b) The axis of the alignment sensor is located in wafer-stage coordinates by using it to detect the same wafer stage fiducial used to locate the actinic image.
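The baseline calibration of Figure 5.2 amounts to simple bookkeeping in wafer-stage coordinates. The Python sketch below is a schematic illustration only: the numbers are synthetic, the sign convention is one reasonable choice, and real systems also carry rotation and scale terms in addition to the offset shown here.

# Illustrative baseline bookkeeping for an NTTL (off-axis) alignment sensor,
# following the two-step calibration of Figure 5.2. All numbers are synthetic.

import numpy as np

# Stage readings (nm) at which the common stage fiducial is centered
# (a) under the projected actinic image and (b) under the alignment sensor.
s_image = np.array([102_000.0, 55_000.0])
s_sensor = np.array([152_340.0, 54_980.0])

baseline = s_image - s_sensor   # sensor-to-image offset in stage coordinates

# A wafer mark centered under the alignment sensor at stage reading s_mark
# is brought under the projected image by moving the stage by the baseline.
s_mark = np.array([10_000.0, 20_000.0])
s_expose = s_mark + baseline

print(baseline, s_expose)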
Also, because different sensor configurations operate better or worse on given wafer types, most exposure tools offer more than one sensor configuration to allow for good overlay on the widest possible range of wafer types. For detailed descriptions of various alignment sensor configurations, see Refs. [1–12].

5.2.3 Alignment Strategies

The overall job of an alignment sensor is to determine the position of each of a given subset of all the alignment marks in a coordinate system fixed with respect to the exposure tool. This position data is then used in either of two generic ways, termed global and field-by-field, to perform alignment. In global alignment, the marks in only a few fields are located by the alignment sensor(s), and all of this data is combined in a best-fit sense to determine the optimum alignment of all the fields on the wafer. In field-by-field alignment, the data collected from a single field is used to align only that field. Global alignment is usually both faster (because not all the fields on the wafer are located) and less sensitive to noise (because it combines all the data to find a best overall fit). But, because the results of the best fit are used in a feed-forward or dead-reckoning approach, it does rely on the overall optomechanical stability of the exposure tool. A detailed discussion of global alignment is presented in Section 5.7.4.

Alignment is generally implemented as a two-step process, i.e., a fine alignment step with an accuracy of tens of nanometers follows an initial coarse alignment step with an accuracy of microns. When a wafer is first loaded into the exposure tool, the uncertainty in its position in exposure-tool coordinates is often on the order of several hundred microns. The coarse alignment step uses a few large alignment targets and has a capture range equal to or greater than the initial wafer-position uncertainty.
The coarse alignment sensor is generally very similar to the fine alignment sensor in configuration, but in some cases these two sensors can be combined into two modes of operation of a single sensor. The output of the coarse alignment step is the wafer position to within several microns or less, which is within the capture range of the fine alignment system. Sometimes there is a "zero" step performed, known as prealignment, in which the edge of the wafer is detected mechanically or optically so that the wafer can be brought into the capture range of the coarse-alignment sensor.

5.2.4 Alignment vs. Leveling and Focusing

In an overall sense, along with image distortion, alignment requires positioning the wafer in all six degrees of freedom: three translational and three rotational. However, adjusting the wafer so that it lies in the projected image plane, i.e., leveling and focusing the wafer, which involves one translational degree of freedom (motion along the optic axis) and two rotational degrees of freedom (orienting the plane of the wafer to be parallel to the projected image plane), are generally considered separate from "alignment" as used in the standard sense. Only in-plane translation (two degrees of freedom) and rotation about the projection optic axis (one degree of freedom) are commonly meant when referring to alignment. The reason for this separation in nomenclature is the difference in accuracy required. The accuracy required for in-plane translation and rotation generally needs to be on the order of about 20%–30% of the minimum feature size or critical dimension (CD) to be printed on the wafer. Current state-of-the-art CD values are on the order of a hundred nanometers, and thus the required alignment accuracy is on the order of a few tens of nanometers. On the other hand, the accuracy required for out-of-plane translation and rotation is related to the total usable depth of focus of the exposure tool, which is generally only a few times the CD value. Thus, out-of-plane focusing and leveling of the wafer requires less accuracy than in-plane alignment. Also, the sensors for focusing and leveling are completely separate from the alignment sensors, and focusing and leveling do not usually rely on special fiducial patterns, i.e., alignment marks, on the wafer; only the wafer surface needs to be sensed.

5.2.5 Field and Grid Distortion

As discussed above, along with in-plane rigid-body translation and rotation of the wafer, various distortions of the image may be required to achieve the necessary overlay. The deviation of the circuit pattern in each field from its ideal rectangular shape is termed field distortion. Along with field distortion, it is usually necessary to allow for grid distortion, i.e., deviations of the field centers from the desired perfect rectilinear grid, as well. Both the field and grid distortions can be separated into linear and nonlinear terms, as discussed in the Appendix. Depending on the location and number of alignment marks on the wafer, most exposure tools are capable of accounting for some or all of the linear components of field and grid distortion. Although all of the as-printed fields on a given wafer are nominally distorted identically, in reality the amount and character of the distortion of each field varies slightly from field to field. If the lithographic process is sufficiently well controlled, then this variation is generally small enough to ignore.
It is this fact that makes it possible to perform alignment using the global approach. As mentioned above, different exposure tools produce different specific average distortions of the field and grid. In other words, each tool has a unique distortion signature. A tool aligning to patterns that it printed on the wafer will, on average, be better able to
match the distortion in the printed patterns than a different tool with a different distortion signature. The net result is that the overlay will be different in the two cases: the "tool-to-itself" or "machine-to-itself" overlay, i.e., the result of a tool aligning to patterns that it printed itself, is generally several nanometers to a few tens of nanometers better than when one tool aligns to the patterns printed by a different tool, the so-called "tool-to-tool" or "machine-to-machine" result. Ideally, one would like to make all tools have the minimum distortion, but this is not necessary. All that is necessary is to match the distortion signatures of all the tools that will be handling the same wafers. This can be done by tuning the tools to match a single "master tool," or they can be tuned to match their average signature.

5.2.6 Wafer vs. Reticle Alignment

Although both the reticle and wafer alignment must be performed accurately, wafer alignment is usually the larger contributor to alignment errors. The main reason is the following: a single reticle is used to expose many wafers. Thus, after the reticle alignment marks have been "calibrated," they do not change, whereas the detailed structure of the wafer alignment marks varies not only from wafer to wafer, but also across a single wafer, in multiple and unpredictable ways. Just as real field patterns are distorted from their ideal shape, the material structure making up the trenches and/or mesas in real alignment marks is distorted from its ideal shape. Therefore, the width, depth, side-wall slope, etc., as well as the symmetry of the trenches and mesas, vary from mark to mark. The effect of this variation in mark structure on the alignment signal from each mark is called process sensitivity. The ideal alignment system, i.e., combination of optics and algorithm, would be the one with the least possible process sensitivity. The result of all this is that the major fundamental limitation to achieving good overlay is almost always associated with wafer alignment. Further, most projection optical systems reduce or demagnify the reticle image at the wafer plane; therefore, less absolute accuracy is generally required to position the reticle itself.
5.3 Overlay Error Contributors

The overall factors that affect overlay are the standard ones of measurement and control. The position, orientation, and distortion of the patterns already on the wafer must be inferred from a limited number of measurements, and the position, orientation, and distortion of the pattern to be exposed must be controlled using a limited number of adjustments. For actual results and analysis from particular tools, see Refs. [13–21]. Here, a list is presented of the basic sources of error.

Measurement

† Alignment system: Noise and inaccuracies in the alignment system induce errors in determining the positions of the alignment marks. This includes not only the alignment sensor itself, but also the stages and laser gauges that serve as the coordinate system for the exposure tool, the calibration and stability of the alignment system axis relative to the projected image (this is true for both NTTL and TTL), and the electronics and algorithm that are used to collect and reduce the alignment data to field and grid terms. Finally, it must be remembered that the alignment marks are not the circuit pattern, and the exposure tool is predicting the circuit pattern position, orientation, and distortion from the mark positions.
prediction, due to the nonperfection of the initial calibration of the mark-to-pattern relationship, changes in that relationship caused by thermal and/or mechanical effects, or simplifications in the algorithmic representation, such as the linear approximation to the nonlinear distortion, all contribute to overlay error.
† Projection optics: Variations in, and/or inaccuracies in the determination of, the distortion induced in the projected pattern by the optical system contribute to overlay error. Thermomechanical effects change the distortion signature of the optics. At the nanometer level, this signature is also dependent on the actual aberrations of the projection optics, which cause different linewidth features to print at slightly different positions. In machine-to-itself overlay, the optical distortion is nominally the same for all exposed levels, so this effect tends to be minimal in that case. In machine-to-machine overlay, the difference in the optical distortion signatures of the two different projection optics is generally not trivial and thus can be a significant contributor to overlay errors.
† Illumination optics: Nontelecentricity in the source pupil, when coupled with focus errors and/or field nonflatness, will produce image shifts and/or distortion. Variation in the source pupil intensity across the field can also shift the printed alignment mark position with respect to the circuit position.
† Reticle: Errors in the reticle mark-to-pattern position measurement, as well as shifts caused by reticle mounting and/or reticle heating, contribute. Particulate contamination of the reticle alignment marks can also shift the apparent mark position.

Control

† Wafer stage: Errors in the position and rotation of the wafer stage during exposure, both in-plane and out-of-plane, contribute to overlay errors, as does wafer stage vibration. These are rigid-body effects. There are also nonrigid-body contributors, such as wafer and wafer-stage heating, which can distort the wafer with respect to the exposure pattern, and chucking errors that "stretch" the wafer in slightly different ways each time it is mounted.
† Reticle stage: Essentially all the same considerations as for the wafer stage apply to the reticle stage, but with some mitigation due to the reduction nature of the projection optics.
† Projection optics: Errors in the magnification adjustment cause pattern mismatch. Heating effects can alter the distortion signature in uncontrollable ways.

5.3.1 Measuring Overlay

Overlay is measured simply by printing one pattern on one level and a second pattern on a consecutive level and then measuring, on a standalone metrology system, the difference in the position, orientation, and distortion of the two patterns. If both patterns are printed on the same exposure tool, the result is machine-to-itself overlay; if they are printed on two different exposure tools, the result is machine-to-machine overlay. The standalone metrology systems consist of a microscope for viewing the patterns, connected to a laser-gauge-controlled stage for measuring their relative positions. The most common pattern is a square inside a square, called box-in-box, and its 45°-rotated version, called diamond-in-diamond. The shift of the inner square with respect to the outer square is the overlay at that point in the field. The results from multiple points in the field can be expressed as field magnification, skew, and rotation; the average position of each field can be expressed as grid translation, magnification, skew, and rotation.
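At its core, the reduction of such box-in-box measurements to grid terms is a linear least-squares fit. The sketch below illustrates one possible form of that reduction for the grid terms only; the linear model, variable names, and numerical values are illustrative assumptions and do not represent the reduction used by any particular metrology or exposure tool.

```python
import numpy as np

def fit_grid_terms(x, y, dx, dy):
    """Least-squares fit of measured overlay (dx, dy) at wafer positions (x, y)
    to an assumed linear grid model:
        dx = Tx + Mx*x - R1*y
        dy = Ty + My*y + R2*x
    Returns (Tx, Ty, Mx, My, R1, R2): translations, magnifications, rotations.
    The average of R1 and R2 approximates grid rotation; their difference
    approximates orthogonality (skew)."""
    ones = np.ones_like(x)
    Ax = np.column_stack([ones, x, -y])          # model matrix for the x residuals
    Ay = np.column_stack([ones, y, x])           # model matrix for the y residuals
    (Tx, Mx, R1), *_ = np.linalg.lstsq(Ax, dx, rcond=None)
    (Ty, My, R2), *_ = np.linalg.lstsq(Ay, dy, rcond=None)
    return Tx, Ty, Mx, My, R1, R2

# Hypothetical example: overlay measured at a few field centers
# (positions in mm, overlay in nm; values are arbitrary).
x = np.array([-80.0, 0.0, 80.0, 0.0, 60.0])
y = np.array([0.0, -80.0, 0.0, 80.0, 60.0])
dx = np.array([12.0, 3.0, -8.0, 1.0, -4.0])
dy = np.array([-5.0, 9.0, 2.0, -11.0, -6.0])
print(fit_grid_terms(x, y, dx, dy))
```

Field terms can be fit in the same way, using the intrafield coordinates of each measurement site instead of the wafer coordinates.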
Finally, there is the possibility of measuring the so-called latent image in the resist before the resist is fully developed and processed. Exposing the resist causes a chemical change that varies spatially with the intensity of the projected image. This spatially dependent chemical change is termed the latent image. In principle, this chemical change can be sensed directly, so that the position of the latent image relative to the underlying pattern can be determined without further resist processing, thereby saving time and money. Unfortunately, at least for the chemically amplified resists that are commonly used in production, this chemical change is very weak and difficult to detect. Latent image sensing is discussed in Refs. [22,23].
5.4 Precision, Accuracy, Throughput, and Sendaheads

Ideally, alignment should be fast and accurate. At a minimum, it needs to be repeatable and precise. Alignment should be fast because lithography is a manufacturing technology and not a science experiment. Thus, there is a penalty to be paid if the alignment process takes too long: the number of wafers produced per hour decreases. This leads to a trade-off between alignment accuracy and alignment time. Given sufficient time, it is possible, at least in principle, to align essentially any wafer with arbitrary accuracy. However, as the allowed time gets shorter, the achievable accuracy will, in general, decrease. Due to a combination of time constraints, alignment sensor nonoptimality, and excess mark structure variation, it is sometimes only possible to achieve repeatable and precise alignment. However, because accuracy is also necessary for overlay, a predetermined correction factor or "offset" must be applied to such alignment results. The correction factor is most commonly determined using a "sendahead" wafer: a wafer carrying the nominal alignment-mark structures is aligned and exposed, and the actual overlay, i.e., the difference between the desired and the actual exposure position, rotation, and distortion, is measured. These measured differences are then applied as an offset directly to the alignment results on all subsequent wafers of that type, which effectively cancels out the alignment error "seen" by the sensor. For this approach to work, the alignment process must produce repeatable results, so that measuring the sendahead wafer is truly indicative of how subsequent wafers will behave. It must also have sufficient precision to satisfy the overlay requirements after the sendahead correction has been applied.
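In software terms, the sendahead correction is simply a stored offset applied to later alignment results. The following sketch illustrates only the bookkeeping; the sign convention, data layout, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def sendahead_offset(overlay_measurements):
    """Mean overlay error (dx, dy) measured on the sendahead wafer;
    overlay_measurements is an (M, 2) array of box-in-box results (nm)."""
    return np.mean(np.asarray(overlay_measurements), axis=0)

def corrected_alignment(mark_positions, offset):
    """Apply the stored sendahead offset to sensor-reported mark positions.
    Subtracting the measured error is an assumed sign convention."""
    return np.asarray(mark_positions) - offset

# Hypothetical numbers (nm): overlay measured at four sites on the sendahead.
sendahead = [[11.0, -6.0], [13.0, -8.0], [12.0, -7.5], [12.5, -6.5]]
offset = sendahead_offset(sendahead)
marks = np.array([[1000.0, 2000.0], [5000.0, 7000.0]])
print(corrected_alignment(marks, offset))
```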
5.5 The Fundamental Problem of Alignment

The fundamental job of the alignment sensor is to determine, as rapidly as possible, the positions of each of a set of alignment marks in exposure-tool coordinates to the required accuracy. Here, the word position nominally refers to the center of the alignment mark. To put the accuracy requirement in perspective, it must be remembered that the trenches and/or mesas that make up an alignment mark are sometimes on the order of the critical dimension, but often are much larger. Therefore, the alignment sensor must be able to locate the center of an alignment mark to a very small fraction of the mark dimensions. If the alignment mark itself, including any overcoat layers such as photoresist, is perfectly symmetric about its center and the alignment sensor is perfectly symmetric, then the center of the alignment signal corresponds exactly with the center of the alignment mark. Thus, only for the case of perfect symmetry does finding the center
of the signal correspond to finding the center of the alignment mark. If the mark and/or the alignment sensor is in any way not perfectly symmetric, then the signal center will not be coincident with the mark center and finding the signal center does not mean you have found the mark center. This is the fundamental problem of alignment. Noise also causes the detected signal to be asymmetric. However, if the signal is sufficiently sampled and the data are reduced appropriately, then, within limits, the effect of noise on the determination of the signal center can be made as small as necessary, leaving only the systematic variations in signal shape to contend with. The relation between the true mark center and the signal center can be determined by solving Maxwell’s equations for each particular mark structure and sensor configuration and using that result to determine the signal shape and offset, as illustrated in Figure 5.3. The result of such an analysis is a complicated function of the details of the mark structure and the sensor configuration. Generally, the details of the mark structure and its variation both across a wafer and from wafer-to-wafer are not well known; the results of such calculations have thus far been used only for sensitivity studies and off-line debugging of alignment problems. As discussed above, when the offset between the true mark center and the signal center is too large to ignore, it can be determined by using sendahead wafers. The overlay on a sendahead wafer is measured and applied as a fixed offset or correction to the mark positions found by the alignment sensor on subsequent wafers of the same type. Therefore, sendaheads effectively provide an empirical solution as opposed to an analytical solution to the difference between the true and measured mark positions. However,
FIGURE 5.3 The fundamental problem that alignment mark modeling must solve is to determine the scattered/diffracted or outgoing distribution of light as a function of the illumination or incoming distribution and the mark structure.
sendaheads take time and cost money, and thus chip manufacturers would prefer alignment systems that do not require sendaheads. Over the years, tool manufacturers have generally improved the symmetry of the alignment sensors and increased the net signal-to-noise ratio of the data to meet the tighter overlay requirements associated with shrinking CD values. There also have been significant improvements in wafer processing. However, whereas in the past the dominant contributor to alignment error may well have been the sensor itself, inherent mark asymmetry is now in many cases an equal, or in some cases dominant, contributor. The economic benefit of achieving the required overlay with no sendaheads is nontrivial, but it does require the development of a detailed understanding of the interaction of particular sensor configurations with process-induced mark asymmetries. The development of such a knowledge base will allow for the design of robust alignment sensors, algorithms, and strategies. In effect, a nonsymmetric mark, as drawn in Figure 5.3, has no well-defined center, and the tool user must be able to define what is meant by its center if the tool is to meet the overlay requirements. In the past, this was done using sendaheads. In the future, it may well be done using accurate models of how the signal distorts as a function of known mark asymmetries.

5.5.1 Alignment-Mark Modeling

The widths, depths, and thicknesses of the various "blocks" of material that make up an alignment mark are usually between a few tenths of and several times the sensing wavelength in size. In this regime, geometrical and physical optics are both, at best, rough approximations, and only a reasonably rigorous solution to Maxwell's equations for each particular case will be able to make valid predictions of signal shape, intensity, and mark offset. This is illustrated in Figure 5.3. Because of the overall complexity of wave propagation and boundary condition matching in an average alignment mark, it is essentially impossible to intuitively predict or understand how the light will scatter and diffract from a given mark structure. Therefore, to truly understand the details of why a particular mark structure produces a particular signal shape and a particular offset requires actually solving Maxwell's equations for that structure. Also, the amplitude and phase of the light scattered/diffracted in a particular direction depend sensitively on the details of the mark structure. Variations in the thickness or shape of a given layer by as little as a few nm, or in its index by as little as a few percent, can significantly alter the alignment signal shape and detected position. Thus, again, to truly understand what is happening requires detailed knowledge of the actual three-dimensional structure, as well as its variation in real marks. In general, all the codes and algorithms used for the purpose of alignment-mark modeling are based on rigorous techniques for solving multilayer grating diffraction problems, and they essentially all couch the answer in the form of a "scattering matrix" that is nothing but the optical transfer function of the alignment mark. It is beyond the scope of this discussion to describe in detail the various forms that these algorithms take; the reader is referred to the literature for details (see Refs. [24–31]). Although how a scattering matrix is computed will not be discussed, it is worthwhile to understand what a scattering matrix is and how it can be used to determine alignment signals for different sensor configurations.
The two key aspects of Maxwell's equations and of the electromagnetic field they describe are the following:
1. The electromagnetic field is a vector field, i.e., the electric and magnetic fields have a magnitude and a direction. The fact that light is a vector field, i.e., that it has polarization states, should not be ignored when analyzing the properties of
alignment marks because this can, in many cases, lead to completely erroneous results.
2. Light obeys the wave equation, i.e., it propagates. However, it must also obey Gauss' law. In other words, Maxwell's equations contain more physics than just the wave equation.

It is convenient to use the natural distinction between the wafer in-plane directions, x and y, and the out-of-plane direction or normal to the wafer, z, to define the two basic polarization states of the electromagnetic field, which will be referred to as TE (for tangential electric) and TM (for tangential magnetic). For TE polarization, the electric field vector $\vec{E}$ is tangent to the surface of the wafer, i.e., it has only x and y components, $\vec{E} = \hat{e}_x E_x + \hat{e}_y E_y$. For TM polarization, the magnetic field vector $\vec{B}$ is tangent to the surface of the wafer, i.e., it has only x and y components, $\vec{B} = \hat{e}_x B_x + \hat{e}_y B_y$ (see Figure 5.4). In the convention used here, $\hat{e}_x$, $\hat{e}_y$, and $\hat{e}_z$ are the unit vectors for the x, y, and z directions, respectively. The work can be limited to just the electric field for both polarizations because the corresponding magnetic field can be calculated unambiguously from it; the notation $\vec{E}_{TE}$ for the TE polarized waves and $\vec{E}_{TM}$ for the TM polarized waves can then be used. For completeness, Maxwell's equations, in MKS units, for a homogeneous, static, isotropic, nondispersive, nondissipative medium take the form

$$\vec{\nabla}\cdot\vec{E} = 0, \qquad \vec{\nabla}\cdot\vec{B} = 0, \qquad \vec{\nabla}\times\vec{E} = -\partial_t\vec{B}, \qquad \vec{\nabla}\times\vec{B} = \frac{n^{2}}{c^{2}}\,\partial_t\vec{E},$$

where n is the index of refraction and c is the speed of light in vacuum.
FIGURE 5.4 The most convenient pair of polarization states are the so-called (a) TE and (b) TM configurations. In either case, in two dimensions the full vector form of Maxwell’s equations reduces to a single scalar partial differential equation with associated boundary conditions.
The wave equation follows directly from Maxwell's equations and is given by

$$\left(\frac{n^{2}}{c^{2}}\,\partial_t^{2} - \vec{\nabla}^{2}\right)\vec{E}(\vec{x},t) = 0, \qquad \left(\frac{n^{2}}{c^{2}}\,\partial_t^{2} - \vec{\nabla}^{2}\right)\vec{B}(\vec{x},t) = 0.$$

The notation $\partial_t = \partial/\partial t$, $\vec{\nabla} \equiv \hat{e}_i\partial_i = \hat{e}_x\partial_x + \hat{e}_y\partial_y + \hat{e}_z\partial_z = \hat{e}_x\,\partial/\partial x + \hat{e}_y\,\partial/\partial y + \hat{e}_z\,\partial/\partial z$, and $\vec{\nabla}^{2} = \vec{\nabla}\cdot\vec{\nabla} = \partial_x^{2} + \partial_y^{2} + \partial_z^{2}$ is used, where the "$\cdot$" indicates the standard dot product of vectors, i.e., for two vectors $\vec{A}$ and $\vec{B}$, $\vec{A}\cdot\vec{B} = A_xB_x + A_yB_y + A_zB_z \equiv \sum_i A_iB_i$, with i taking the values x, y, z. To simplify the notation, the summation convention will be used, in which repeated indices are automatically summed over their appropriate range. This allows the summation sign, $\sum$, to be dropped, so that $\vec{A}\cdot\vec{B} = A_iB_i$. Also, using the summation convention, $\vec{A} = \hat{e}_iA_i$ and $\vec{B} = \hat{e}_iB_i$, etc. The cross product of $\vec{A}$ and $\vec{B}$, which is denoted by $\vec{A}\times\vec{B}$, is defined by $\hat{e}_i\,\epsilon_{ijk}A_jB_k$, where $\epsilon_{ijk}$, with i, j, k taking the values x, y, z, is defined by $\epsilon_{xyz} = \epsilon_{yzx} = \epsilon_{zxy} = +1$, $\epsilon_{zyx} = \epsilon_{yxz} = \epsilon_{xzy} = -1$, with all other index combinations being zero. Therefore, for example,

$$\vec{\nabla}\times\vec{E} = \hat{e}_x\left(\partial_y E_z - \partial_z E_y\right) + \hat{e}_y\left(\partial_z E_x - \partial_x E_z\right) + \hat{e}_z\left(\partial_x E_y - \partial_y E_x\right).$$
The Gauss law constraint is

$$\vec{\nabla}\cdot\vec{E}(\vec{x},t) = 0, \qquad \vec{\nabla}\cdot\vec{B}(\vec{x},t) = 0.$$

The solution to the wave equation can be written as a four-dimensional Fourier transform, which is nothing but a linear superposition of plane waves of the form $e^{i\vec{p}\cdot\vec{x} - i\omega t}$. These are plane waves because their surfaces of constant phase, i.e., the positions $\vec{x}$ that satisfy $\vec{p}\cdot\vec{x} - \omega t = \text{constant}$, are planes. The unit vector $\hat{p} = \vec{p}/|\vec{p}|$ defines the normal to these planes or wavefronts, and for ω positive, the wavefronts propagate in the $+\hat{p}$ direction with speed v = ω/p. The wavelength, λ, is related to $\vec{p}$ by $p = |\vec{p}| = \sqrt{\vec{p}^{\,2}} = \sqrt{p_ip_i} = \sqrt{p_x^{2} + p_y^{2} + p_z^{2}} = 2\pi/\lambda$, and the frequency f, in Hertz, is related to the radian frequency ω by ω = 2πf. Combining these relations with the speed of propagation yields $v = 2\pi f/(2\pi/\lambda) = \lambda f$. The variable ω will be taken to be positive throughout the analysis. Substituting a single unit-amplitude plane-wave electric field, $\vec{E} = \hat{\varepsilon}\,e^{i\vec{p}\cdot\vec{x} - i\omega t}$, into the wave equation, with $\hat{\varepsilon}$ (a unit vector) representing the polarization direction of the electric field, one finds that $\vec{p}$ and ω must satisfy

$$-\frac{n^{2}}{c^{2}}\,\omega^{2} + \vec{p}^{\,2} = 0.$$

This is the dispersion relation in a medium of index n.
Substituting $\vec{E} = \hat{\varepsilon}\,e^{i\vec{p}\cdot\vec{x} - i\omega t}$ into the Gauss law constraint yields

$$\vec{\nabla}\cdot\left(\hat{\varepsilon}\,e^{i\vec{p}\cdot\vec{x} - i\omega t}\right) = i\,\vec{p}\cdot\hat{\varepsilon}\,e^{i\vec{p}\cdot\vec{x} - i\omega t} = 0,$$

which is satisfied by demanding that $\vec{p}\cdot\hat{\varepsilon} = 0$. That is, for a single plane wave, the electric field vector must be perpendicular to the direction of propagation. Note that this requires $\hat{\varepsilon}$ to be a function of $\hat{p}$. Substituting $\vec{E} = \hat{\varepsilon}\,e^{i\vec{p}\cdot\vec{x} - i\omega t}$ into the particular Maxwell equation $\partial_t\vec{B} = -\vec{\nabla}\times\vec{E}$, where "$\times$" is a cross product, yields
$$\partial_t\vec{B} = -i\,\vec{p}\times\hat{\varepsilon}\,e^{i\vec{p}\cdot\vec{x} - i\omega t} = -i\,p\,(\hat{p}\times\hat{\varepsilon})\,e^{i\vec{p}\cdot\vec{x} - i\omega t} = -i\,\omega\,\frac{n}{c}\,(\hat{p}\times\hat{\varepsilon})\,e^{i\vec{p}\cdot\vec{x} - i\omega t},$$

where p = ωn/c follows from the dispersion relation. The solution to this equation is

$$\vec{B} = \frac{n}{c}\,(\hat{p}\times\hat{\varepsilon})\,e^{i\vec{p}\cdot\vec{x} - i\omega t},$$

which shows that $\vec{E}$ and $\vec{B}$ for a single plane wave are in phase with one another. They propagate in the same direction. $\vec{B}$ has units that differ from those of $\vec{E}$ by the factor n/c. Also, $\vec{B}$ is perpendicular to both $\hat{p}$, the direction of propagation, and $\hat{\varepsilon}$, the polarization direction of the electric field. Note that $\hat{\varepsilon}\times(\hat{p}\times\hat{\varepsilon}) = \hat{p}$, and so $\vec{E}\times\vec{B}$ points in the direction of propagation of the wave.

For the purpose of modeling the optical properties of alignment marks, it is convenient to separate $\vec{p}$ into the sum of two vectors: one parallel to and one perpendicular to the wafer surface. The parallel or tangential vector will be written as $\vec{\beta} = \hat{e}_x\beta_x + \hat{e}_y\beta_y$, and the perpendicular vector will be written as $-\gamma\hat{e}_z$ for waves propagating toward the wafer, i.e., generally in the −z direction, and as $+\gamma\hat{e}_z$ for waves propagating away from the wafer, i.e., generally in the +z direction. $\vec{\beta}$ will be referred to as the tangential propagation vector. The magnitude of $\vec{\beta}$ is related to the angle of incidence or the angle of scatter/diffraction by $|\vec{\beta}| = nk\sin(\theta)$, where θ is the angle between the propagation vector $\vec{p}$ and the z axis. Because only propagating and not evanescent waves need be considered here, γ is real (for n real) and positive. Using this notation for $\vec{p}$, the dispersion relation takes the form $\vec{\beta}^{\,2} + \gamma^{2} - (n^{2}/c^{2})\,\omega^{2} = 0$, which gives $\gamma(\vec{\beta}) = \sqrt{n^{2}k^{2} - \vec{\beta}^{\,2}}$, where $k \equiv \omega/c = 2\pi/\lambda$, λ being the wavelength in vacuum. γ is purely real (for n real) for $|\vec{\beta}| < nk$, which corresponds to propagating waves, i.e., $e^{\pm i\gamma z}$ is an oscillating function of z, whereas for $|\vec{\beta}| > nk$, γ becomes purely imaginary, $\gamma = i|\gamma|$, and $e^{\pm i\gamma z} = e^{\mp|\gamma| z}$, which is exponentially decaying or increasing with z and corresponds to evanescent waves.
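As a small numerical illustration of this propagating/evanescent classification (the wavelength and index below are arbitrary choices, not values from the text), γ can be evaluated directly from the dispersion relation:

```python
import numpy as np

def gamma_of_beta(beta, wavelength, n=1.0):
    """z-component of the propagation vector, gamma(beta) = sqrt((n*k)^2 - beta^2),
    with k = 2*pi/wavelength (vacuum wavelength). The result is real for
    propagating waves (|beta| < n*k) and purely imaginary for evanescent waves."""
    k = 2.0 * np.pi / wavelength
    return np.sqrt(complex((n * k) ** 2 - beta ** 2))

wavelength = 0.633   # micrometers, arbitrary illustrative value
n = 1.0
k = 2.0 * np.pi / wavelength
for beta in (0.0, 0.5 * n * k, 0.99 * n * k, 1.5 * n * k):
    g = complex(gamma_of_beta(beta, wavelength, n))
    kind = "propagating" if g.imag == 0.0 else "evanescent"
    print(f"beta/(n*k) = {beta / (n * k):.2f}   gamma = {g:.4f}   ({kind})")
```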
Because the wave equation is linear, a completely general solution for $\vec{E}$ can be written as a superposition of the basic plane-wave solutions:

$$\vec{E}(\vec{x},t) = \underbrace{\int \left[\hat{\varepsilon}_{TE}(\vec{\beta})\,a_{TE}(\vec{\beta},k) + \hat{\varepsilon}_{TM}(\vec{\beta})\,a_{TM}(\vec{\beta},k)\right] e^{\,i\vec{\beta}\cdot\vec{r} + i\gamma z - ickt}\, d^{2}\beta\, dk}_{\equiv\,\vec{E}_{out}\;=\;\text{outgoing, i.e., mark scattered/diffracted waves}} \;+\; \underbrace{\int \left[\hat{\varepsilon}_{TE}(\vec{\beta})\,b_{TE}(\vec{\beta},k) + \hat{\varepsilon}_{TM}(\vec{\beta})\,b_{TM}(\vec{\beta},k)\right] e^{\,i\vec{\beta}\cdot\vec{r} - i\gamma z - ickt}\, d^{2}\beta\, dk}_{\equiv\,\vec{E}_{in}\;=\;\text{incoming, i.e., sensor illumination waves}},$$
where we have explicitly indicated the contributions from TE and TM waves, and $\vec{r} = \hat{e}_x x + \hat{e}_y y$ is just the in-plane position. The outgoing waves, $\vec{E}_{out}$, are those that have been scattered/diffracted by the alignment mark. Different sensor configurations collect and detect different portions of $\vec{E}_{out}$ in different ways to generate alignment signal data. The functions $a_{TE}(\vec{\beta},k)$ and $a_{TM}(\vec{\beta},k)$ are the amplitudes, respectively, of the TE and TM outgoing waves with tangential propagation vector $\vec{\beta}$ and frequency $f = kc/2\pi$. The incoming waves, $\vec{E}_{in}$, are the illumination, i.e., the distribution of light that the sensor projects onto the wafer. Different sensor configurations project different light distributions, i.e., different combinations of plane waves, onto the wafer. The functions $b_{TE}(\vec{\beta},k)$ and $b_{TM}(\vec{\beta},k)$ are the amplitudes, respectively, of the TE and TM incoming waves with tangential propagation vector $\vec{\beta}$ and frequency $f = kc/2\pi$. Because Maxwell's equations are linear (nonlinear optics are not of concern here), the incoming and outgoing waves are linearly related to one another. This relation can conveniently be written in the form

$$\underbrace{\begin{pmatrix} A_{TE}(\vec{\beta},k) \\ A_{TM}(\vec{\beta},k) \end{pmatrix}}_{\text{outgoing waves}} = \int \underbrace{\begin{pmatrix} S_{EE}(\vec{\beta},\vec{\beta}\,') & S_{EM}(\vec{\beta},\vec{\beta}\,') \\ S_{ME}(\vec{\beta},\vec{\beta}\,') & S_{MM}(\vec{\beta},\vec{\beta}\,') \end{pmatrix}}_{\text{mark scattering matrix}\;\equiv\;S} \cdot \underbrace{\begin{pmatrix} B_{TE}(\vec{\beta}\,',k) \\ B_{TM}(\vec{\beta}\,',k) \end{pmatrix}}_{\text{incoming waves}} d^{2}\beta'.$$
Each element of S is a complex number that can be interpreted as the coupling from a particular incoming wave to a particular outgoing wave. For example, $S_{EE}(\vec{\beta},\vec{\beta}\,')$ is the coupling from the incoming TE wave with tangential propagation vector $\vec{\beta}\,'$ to the outgoing TE wave with tangential propagation vector $\vec{\beta}$. In the same way, $S_{EM}(\vec{\beta},\vec{\beta}\,')$ is the coupling from the incoming TM wave at $\vec{\beta}\,'$ to the outgoing TE wave at $\vec{\beta}$. Note that because the elements of S are complex numbers, and complex numbers have an amplitude and a phase, the elements of S account both for the amplitude of the coupling, i.e., how much amplitude the outgoing wave will have for a given amplitude incoming wave, and for the phase shift that occurs when incoming waves are coupled to outgoing waves. Note that for stationary and optically linear media there is no cross-coupling of different temporal frequencies, $f_{in} = f_{out}$, or, equivalently, $k_{in} = k_{out}$. The diagonal elements of S with respect to the tangential propagation vector are those for which $\vec{\beta} = \vec{\beta}\,'$, and these elements correspond to specular reflection from the wafer. The off-diagonal elements, i.e., those with $\vec{\beta} \neq \vec{\beta}\,'$, are the nonspecular waves, i.e., the waves that have been scattered/diffracted by the alignment mark. See Figure 5.5 for an illustration of S and how it separates into propagating and evanescent sectors. The value of each element of S depends on the detailed structure of the mark, i.e., on the thicknesses, shapes, and indices of refraction of all the material "layers" that make up the alignment mark, as well as on the wavelength of the light. The scattering matrix for a
FIGURE 5.5 (a) The physical meaning of the elements of the scattering matrix. The different elements in the matrix correspond to different incoming and outgoing angles of propagation of plane waves. (b) To generate valid solutions to Maxwell’s equations requires including evanescent as well as propagating waves.
perfectly symmetric mark has an important property: it is centrosymmetric, as illustrated in Figure 5.6. It follows from this that the scattering matrix must be computed for each particular mark structure and for each wavelength of use. However, after this matrix has been computed, the alignment signals that are generated by that mark for all possible sensor
FIGURE 5.6 A perfectly symmetric alignment mark has a scattering matrix that is perfectly centrosymmetric. That is, elements at equal distances but opposite directions from the center of the matrix ($\vec{\beta} = \vec{\beta}\,' = 0$) are equal, as indicated.
configurations that use the specified wavelengths are completely contained in S. In standard terminology, S is the optical transfer function of the mark.
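Although the computation of S itself is left to the cited rigorous methods, applying a known S is just the discretized form of the linear relation above. The sketch below shows only that bookkeeping; the discretization onto a finite grid of tangential propagation vectors, the array layout, and the random placeholder matrix are assumptions made for illustration, not the output of any real grating solver.

```python
import numpy as np

def outgoing_amplitudes(S, b_in):
    """Apply a discretized scattering matrix to incoming wave amplitudes.

    S    : (2, 2, N, N) complex array; S[i, j, m, p] couples the incoming wave of
           polarization j (0 = TE, 1 = TM) at tangential propagation vector beta'_p
           to the outgoing wave of polarization i at beta_m.
    b_in : (2, N) complex array of incoming amplitudes [B_TE, B_TM] on the beta grid.

    Returns the (2, N) array of outgoing amplitudes [A_TE, A_TM]; the integral
    over beta' becomes a sum over the sampled beta' grid.
    """
    return np.einsum('ijmp,jp->im', S, b_in)

# Toy example: N sampled tangential propagation vectors, a single incoming
# TE plane wave, and a random placeholder "mark" matrix.
N = 8
rng = np.random.default_rng(0)
S = rng.standard_normal((2, 2, N, N)) + 1j * rng.standard_normal((2, 2, N, N))
b_in = np.zeros((2, N), dtype=complex)
b_in[0, N // 2] = 1.0            # unit-amplitude TE wave at one incidence angle
a_out = outgoing_amplitudes(S, b_in)
print(np.abs(a_out))             # magnitudes of the scattered/diffracted waves
```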
5.6 Basic Optical Alignment Sensor Configurations

This section describes, in very general terms, the various basic configurations that an alignment sensor can take and the nominal signal shapes that it will produce. As mentioned in the first section, only optical sensors are considered, i.e., sensors that project light, either infrared, visible, or UV, onto the wafer and detect the scattered/diffracted light. The purpose of the alignment sensor is to detect the position of an alignment mark, and so, irrespective of the configuration of the alignment sensor, the signal it produces must depend in one way or another on the mark position. It follows that all alignment sensors, in a very general sense, produce a signal that can be considered to represent some sort of image of the alignment mark. This image can be thoroughly conventional, such as in a standard microscope, or it can be rather unconventional, such as in a scanned grating interference sensor. Simplified diagrams of each basic configuration are included below for completeness. The simplicity of these diagrams is in stark contrast to the schematics of real alignment systems, whose complexity almost always belies the very simple concept they embody. For specific designs, see Refs. [1–12]. The following are common differentiators among basic alignment sensor types.

† Scanning vs. staring: A staring sensor simultaneously detects position-dependent
information over a finite area on the wafer. A standard microscope represents an example of a staring sensor. The term staring comes directly from the idea that all necessary data for a single mark can be collected with the sensor simply staring at the wafer. A scanning sensor, on the other hand, can effectively “see” only a single point on the wafer and therefore must be scanned either mechanically or optically to develop the full wafer-position-dependent signal. Mechanical scanning may amount simply to moving the wafer in front of the sensor. Optical
scanning can be accomplished by changing the illuminating light so as to move the illumination intensity pattern on the wafer. Optical scanning may involve the physical motion of some element in the sensor itself, such as a steering mirror, or it may not. For example, if the illumination spectrally contains only two closely spaced wavelengths, then, for certain optical configurations, the intensity pattern of the illumination will automatically sweep across the wafer at a predictable rate.
† Brightfield vs. darkfield: All sensors illuminate the mark over some range of angles. This range can be large, such as in an ordinary microscope, or it can be small, such as in some grating sensors. If the range of angles over which the sensor detects the light scattered/diffracted from the wafer is the same as the range of illumination angles, it is called brightfield detection. The reason for this terminology is that, for a flat wafer with no mark, the specularly reflected light will be collected and the signal is bright where there is no mark. A mark scatters light out of the range of illumination angles; therefore, marks send less light to the detectors and appear dark relative to the nonmark areas. In darkfield detection, the range of scatter/diffraction angles that are detected is distinctly different from the range of illumination angles. In this case, specularly reflected light is not collected, and so a nonmark area appears dark and the mark itself appears bright. The relation of brightfield and darkfield to the scattering matrix is illustrated in Figure 5.7.
† Phase vs. amplitude: Because light is a wave, it carries both phase and amplitude information, and sensors that detect only the amplitude, only the phase, or some combination of both have been and are being used for alignment. A simple microscope uses both, because the image is essentially the Fourier transform of the scattered/diffracted light, and the Fourier transform is a result of both the amplitude and phase information. A sensor that senses the position of the interference pattern generated by the light scattered at two distinct angles is detecting only the phase, whereas a sensor that looks only at the total intensity scattered into a specific angle or range of angles is detecting only the amplitude.
† Broadband vs. laser: Sensors use either broadband illumination, i.e., a full spectrum of wavelengths spread over a few hundred nanometers generated by a lamp, or they use one or perhaps two distinct laser wavelengths. The advantage of laser illumination is that it is a very bright coherent source that therefore
FIGURE 5.7 In all cases, the complete optical properties of the alignment mark are contained in the scattering matrix and all alignment sensors simply combine the rows and columns of this matrix in different ways. The above diagram shows a simple example of this. The brightfield illumination and collection numerical apertures (NA) or angular ranges coincide, whereas the darkfield range does not. The darkfield range is a combination of waves with more positive (CNA) and negative (KNA) numerical apertures than the illumination.
allows for the detection of weakly scattering alignment marks and for phase detection. The disadvantage is that, because it is coherent, the signal strength and shape are very sensitive to the detailed thin-film structure of the alignment mark. Therefore, small changes in thickness and/or shape of the alignment mark structure can lead to large changes in the alignment signal strength and shape. In certain cases, thin-film effects lead to destructive interference and therefore no measurable alignment signal. To mitigate this, a second distinctly different laser wavelength is often used so that if the signal vanishes at one wavelength, it should not at the same time vanish at the other wavelength. This, of course, requires either user intervention to specify which wavelength should be used in particular cases, or it requires an algorithm that can automatically switch between the signals at the different wavelengths depending on signal strength and/or signal symmetry. The advantage of broadband illumination is that it automatically averages out all of the thin-film effects; it is therefore insensitive to the details of the mark thin-film structure. Therefore, the signal has a stable intensity and shape, even as the details of the alignment mark structure vary. This is a good thing because the sensor is only trying to find the mark position; it is not nominally trying to determine any details of the mark shape. Its disadvantage is that, generally, broadband sources are not as bright as laser sources and it may therefore be difficult to provide enough illumination to accurately sense weakly scattering alignment marks. Also, phase detection with a broadband source is difficult because it requires equal-path interference. Simplified schematic diagrams of the various sensor types are shown in Figure 5.8 through Figure 5.12.
FIGURE 5.8 There are two generic forms of illumination: (a) Kohler and (b) critical. For Kohler illumination, each point in the source becomes a plane wave at the object. For critical illumination, each point in the source is imaged at the object.
FIGURE 5.9 This diagram illustrates the two generic illumination coherence configurations that are in common use. Specifically, it shows the difference in the spatial intensity distribution projected onto the mark between coherent and incoherent illumination with the same numerical aperture.
5.7 Alignment Signal Reduction Algorithms

Only when process offsets are known a priori or are measured using sendaheads does the alignment problem default to finding the signal centroid itself. In this section, it is assumed that this is the case. If the only degrading influence on the signal were zero-mean Gaussian white noise, then the optimum algorithm for determining the signal centroid would be to correlate the signal with itself and find the position of the peak of the output. Equivalently, because
FIGURE 5.10 This diagram illustrates the two generic light collection configurations, imaging and nonimaging, that are in common use. (BF, brightfield; DF, darkfield).
FIGURE 5.11 (a) The curves indicate the signal intensity. The nomenclature brightfield (BF) and darkfield (DF) refers to the intensity that the sensor will produce when looking at an area of the wafer with no mark. In brightfield (darkfield), the range of incoming and outgoing plane wave angles is (is not) the same. Generally, a mark will appear dark in a bright background in brightfield imaging and bright in a dark background in darkfield imaging. (b) If the background area is very rough compared to the mark structure, then the scatter from the mark may actually be less than from the nonmark area and the brightfield and darkfield images will appear reversed as shown.
FIGURE 5.12 This is the generic configuration of a grating alignment sensor. The upper figure shows the angular distribution of the grating orders as given by the grating equation. If only the +1 and −1 orders are collected, then the signal is purely sinusoidal as shown in the lower figure.
the derivative, i.e., the slope, of a function at its peak is zero, the signal can be correlated with its derivative and the position where the output of the correlation crosses through zero can be found. One proof that this is the optimal algorithm involves using the technique of maximum likelihood. Below is presented a different derivation that starts with the standard technique of finding the centroid, or center of mass, of the signal and shows that the optimum modification to it in the presence of noise results in the same autocorrelation algorithm, even for nonzero-mean noise. Wafer alignment signals are generated by scattering/diffracting light from an alignment mark on the wafer. For nongrating marks, the signal from a single alignment mark will generally appear as one, or perhaps several, localized “bumps” in the alignment-sensor signal data. As discussed above, the real problem is to determine the center of the alignment mark from the signal data. If sufficient symmetry is present in both the alignment sensor and the alignment-mark structure itself, then this reduces to finding the centroid or “center of mass” of the signal. For grating marks, the signal is often essentially perfectly periodic and usually sinusoidal, with perhaps an overall slowly varying amplitude envelope. In this case, the centroid can be associated with the phase of the periodic signal, as measured relative to a predefined origin. In general, all of the algorithms
discussed below can be applied to periodic as well as isolated signal "bumps." But for periodic signals, the Fourier algorithm is perhaps the most appropriate. In all cases, there may be some known offset, based on a sendahead wafer or on some baseline-tool calibration, that must be added to the measured signal centroid to shift it to match the mark center. As discussed above, real signals collected from real wafers will, of course, be corrupted by noise and degraded by the asymmetry present in real marks and real alignment sensors. Because the noise contribution can be treated statistically, it is straightforward to develop algorithms that minimize, on average, its contribution to the final result. If this were the only problem facing alignment, then simply increasing the number of alignment marks scanned would allow overlay to become arbitrarily accurate. However, as discussed above, process variation both across a wafer and from wafer to wafer changes not only the overall signal amplitude, but also the mark symmetry and hence the signal symmetry. This effect is not statistical and is currently not predictable. Therefore, other than trying to make the algorithm as insensitive to signal level and asymmetry as possible, and potentially using sendaheads, there is not much that can be done. Potential algorithms and general approaches for dealing with explicitly asymmetric marks and signals are given in Refs. [30,31].

It is simplest to proceed with the general analysis in continuum form. For completeness, the adjustments to the continuum form that must be made to use discrete, i.e., sampled, data are briefly discussed. These adjustments are straightforward, but also tedious and somewhat tool dependent. They are therefore only briefly described. Also, one dimension will be used here because x and y values are computed separately anyway. Finally, the portions of the signal representing alignment marks can be positive bumps in a nominally zero background, as would occur in darkfield imaging, or they can be negative bumps in a nominally nonzero background, as would occur in brightfield imaging. For simplicity, the tacit assumption is made that darkfield-like signals are being dealt with. In the grating case, the signals are generally sinusoidal, which can be viewed either way, and this case will be treated separately.

Let I(x) be a perfect, and therefore symmetric, signal bump (or bumps) as a function of position, x. The noise, n(x), will be taken to be additive and white, i.e., uncorrelated, and spatially and temporally stationary, i.e., with statistics constant in space and time. When a new wafer is loaded and a given alignment mark is scanned, the signal will be translated by an unknown amount s relative to some predetermined origin of the coordinate system. Thus, the actual detected signal, D(x), is given by I(x) shifted a distance s, i.e., D(x) = I(x − s). It is the purpose of an alignment sensor to determine the value of s from the detected signal. The position of the centroid or center of mass of the pure signal is defined as

$$C = \frac{\int x\,D(x)\,dx}{\int D(x)\,dx} = \frac{\int x\,I(x - s)\,dx}{\int I(x - s)\,dx}.$$

This is illustrated in Figure 5.13. To show that s = C, i.e., that s can be found by computing the centroid, let y = x − s; then
$$C = \frac{\int (y + s)\,I(y)\,dy}{\int I(y)\,dy} = \underbrace{\frac{\int y\,I(y)\,dy}{\int I(y)\,dy}}_{=0} + s\,\underbrace{\frac{\int I(y)\,dy}{\int I(y)\,dy}}_{=1}.$$
FIGURE 5.13 The "center of mass" algorithm estimates the mark center, x-center, by summing the product of the distance from the origin, x, and the signal intensity at x over a specified range and normalizing by the net area under the signal curve over the same range.
The first term vanishes because

$$\int y\,I(y)\,dy = \int (\text{odd})\times(\text{even})\,dy = 0,$$

where it is assumed that symmetric limits of integration are used, which for all practical purposes can be set to ±∞. Thus, s = C, and the shift position can be found by computing the centroid. In the presence of noise, the actual signal is the pure signal, shifted by the unknown amount s, with noise added:

$$D(x) = I(x - s) \;\rightarrow\; D(x) = I(x - s) + n(x).$$

Below are discussed standard algorithms for computing a value for C from the measured data D(x), which, based on the above discussion, amounts to determining an estimated value of s, labeled s_E. Along with using the measured data, some of the algorithms also make use of any a priori knowledge of the ideal signal shape I(x). The digitally sampled real data are not continuous. The convention D_i = D(x_i) is used to label the signal values measured at the sample positions x_i, where i = 1, 2, …, N, with N representing the total number of data values for a single signal.

5.7.1 Threshold Algorithm

Consider an isolated single bump in the signal data that represents an "image" of an alignment mark. The threshold algorithm attempts to find the bump centroid by finding the midpoint between the two values of x at which the bump has a given value called, obviously, the threshold. In the case where the signal contains multiple bumps representing multiple alignment marks, the algorithm can be applied to each bump separately, and the results can be combined to produce an estimate of the net bump centroid. For now, let D(x) consist of a single positive bump plus noise, and let D_T be the specified threshold value. Then, if the bump is reasonably symmetric and smoothly varying and D_T has been chosen appropriately, there will be two and only two values of x, x_L and x_R, that satisfy

$$D_T = D(x_L) = D(x_R).$$

The midpoint between x_L and x_R, which is the average of x_L and x_R, is taken as the estimate for s, i.e.,

$$s_E = \tfrac{1}{2}\left(x_L + x_R\right).$$

This is illustrated in Figure 5.14.
FIGURE 5.14 The threshold algorithm estimates the mark center, x_center, by finding the midpoint between the threshold crossover positions x_left and x_right.
As shown below, this algorithm is very sensitive to noise because it uses only two points out of the entire signal. A refinement that eliminates some of this noise dependence is to average the results from multiple threshold levels. Taking $D_{T1}, D_{T2}, \ldots, D_{TN}$ to be N different threshold levels, with $s_{E1}, s_{E2}, \ldots, s_{EN}$ being the corresponding centroid estimates, the net centroid is taken to be

$$s_E = \frac{1}{N}\left(s_{E1} + s_{E2} + \cdots + s_{EN}\right).$$

The above definition of s_E weights all N threshold estimates equally. A further refinement of the multiple-threshold approach is to weight the separate threshold results nonuniformly. This weighting can be based on intuition, modeling, and/or experimental results that indicate that certain threshold levels tend to be more reliable than others. In this case,

$$s_E = w_1 s_{E1} + w_2 s_{E2} + \cdots + w_N s_{EN}, \qquad \text{where } w_1 + w_2 + \cdots + w_N = 1.$$

5.7.1.1 Noise Sensitivity of the Threshold Algorithm

Noise can lead to multiple threshold crossovers, and it is generally best to pick the minimum threshold value to be greater than the noise level. This is, of course, signal dependent, but a minimum threshold level of 10% of the peak value is reasonable. Also, because a bump has zero slope at its peak, noise will completely dominate the result if the threshold level is set too high. Generally, the greatest reasonable threshold level that should be used is on the order of 90% of the peak. The sensitivity of s_E to noise can be determined in the following way. Let $x_{L0}$ and $x_{R0}$ be the true noise-free threshold positions, i.e.,

$$I(x_{L0}) = I(x_{R0}) = D_T.$$

Now let $\Delta_L$ and $\Delta_R$ be the deviations in threshold position caused by noise, so that $x_L = x_{L0} + \Delta_L$ and $x_R = x_{R0} + \Delta_R$. Substituting this into the threshold equation and assuming the Δ's are small gives

$$D_T = D(x_L) = I(x_{L0} + \Delta_L) + n(x_{L0} + \Delta_L) \approx I(x_{L0}) + I'(x_{L0})\,\Delta_L + n(x_{L0}) + n'(x_{L0})\,\Delta_L$$
and

$$D_T = D(x_R) = I(x_{R0} + \Delta_R) + n(x_{R0} + \Delta_R) \approx I(x_{R0}) + I'(x_{R0})\,\Delta_R + n(x_{R0}) + n'(x_{R0})\,\Delta_R,$$

where the prime on I(x) and n(x) indicates differentiation with respect to x. Using $I(x_{L0}) = I(x_{R0}) = D_T$ and solving for the Δ's yields

$$\Delta_L = \frac{-\,n(x_{L0})}{I'(x_{L0}) + n'(x_{L0})}, \qquad \Delta_R = \frac{-\,n(x_{R0})}{I'(x_{R0}) + n'(x_{R0})}.$$

The temptation at this stage is to assume that n′ is much smaller than I′, but for this to be true, the noise must be highly correlated as a function of x, i.e., it cannot be white. The derivative of uncorrelated noise has an rms slope of infinity. The discrete nature of real sampled data will mitigate the "derivative" problem somewhat, but nonetheless, to obtain reasonable answers using this algorithm, the noise must be well behaved. Assuming that n(x) is smooth enough for the approximation n′ ≪ I′ to be made gives

$$s_E = \frac{x_{L0} + x_{R0}}{2} \;-\; \frac{n(x_{L0})}{2I'(x_{L0})} \;-\; \frac{n(x_{R0})}{2I'(x_{R0})}.$$
The rms error, σ_s, in the single-threshold algorithm as a function of the rms noise, σ_n, assuming the noise is spatially stationary and uncorrelated from the left to the right side of the bump and that $I'(x_{L0}) = I'(x_{R0}) \equiv I'$, is then

$$\sigma_s = \frac{\sigma_n}{\sqrt{2}\,I'}.$$

This result shows explicitly that the error will be large in regions where the slope I′ is small, and it is therefore best to choose the threshold to correspond to large slopes. If the results from N different threshold levels are averaged and the slope I′ is essentially the same at all the threshold levels, then

$$\sigma_s \approx \frac{\sigma_n}{\sqrt{2N}\,I'}.$$

5.7.1.2 Discrete Sampling and the Threshold Algorithm

For discretely sampled data, only rarely will any of the D_i correspond exactly to the threshold value. Instead, there will be two positions on the left side of the signal and two positions on the right where the D_i values cross over the threshold level. Let the i values between which the crossover occurs on the left be $i_L$ and $i_L + 1$, and on the right $i_R$ and $i_R + 1$. Then, the actual threshold positions can be determined by linear interpolation between the corresponding sample positions. The resulting x_L and x_R values are then given by
$$x_L = \frac{x_{i_L+1} - x_{i_L}}{D_{i_L+1} - D_{i_L}}\left(D_T - D_{i_L}\right) + x_{i_L}, \qquad x_R = \frac{x_{i_R+1} - x_{i_R}}{D_{i_R+1} - D_{i_R}}\left(D_T - D_{i_R}\right) + x_{i_R}.$$
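A minimal software sketch of the single-threshold estimate, including the linear interpolation of the crossover positions, is given below. The synthetic Gaussian bump, the noise level, and the 50% threshold are illustrative assumptions only.

```python
import numpy as np

def threshold_center(x, d, d_t):
    """Single-threshold centroid estimate: find the left and right crossings of
    the level d_t by linear interpolation between samples, then return their midpoint."""
    above = d >= d_t
    idx = np.flatnonzero(np.diff(above.astype(int)) != 0)   # crossing intervals
    if idx.size < 2:
        raise ValueError("threshold must be crossed on both sides of the bump")
    i_l, i_r = idx[0], idx[-1]
    x_l = (x[i_l + 1] - x[i_l]) / (d[i_l + 1] - d[i_l]) * (d_t - d[i_l]) + x[i_l]
    x_r = (x[i_r + 1] - x[i_r]) / (d[i_r + 1] - d[i_r]) * (d_t - d[i_r]) + x[i_r]
    return 0.5 * (x_l + x_r)

# Synthetic darkfield-like bump shifted by s = 0.30, plus weak white noise.
x = np.linspace(-5.0, 5.0, 501)
s_true = 0.30
rng = np.random.default_rng(1)
d = np.exp(-((x - s_true) ** 2) / (2 * 0.8 ** 2)) + 0.01 * rng.standard_normal(x.size)
print(threshold_center(x, d, d_t=0.5 * d.max()))   # should be close to 0.30
```

Averaging the estimates from several threshold levels, as described above, is a straightforward extension of this sketch.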
5.7.2 Correlator Algorithm
The correlator algorithm is somewhat similar to the variable-weighting-threshold algorithm in that it uses most or all of the signal data, but it weights the data nonuniformly. The easiest approach to deriving the correlator algorithm is to minimize the noise contribution to the determination of C as given by the integration above. It is obvious that, in the presence of additive noise, the centroid integration should be restricted to the region where the signal bump is located. Integrating over regions where there is noise but no bump simply corrupts the result. In other words, there is no point in integrating where the true signal is not located. The integration range can be limited by including a function f(x) in the integrand that is nonzero only over a range about equal to the bump width. The centroid calculation then takes the form

$$C = \int f(x)\,D(x)\,dx.$$

In the standard centroid calculation given above, f(x) is proportional to x, which is an antisymmetric function. The optimum form of f(x) in the presence of noise, as determined below, is also antisymmetric, but it has a limited width. Of course, f(x) must be centered close to the actual bump centroid position for this to work. This can be accomplished in several ways. For example, an approximate centroid position could first be determined using a simple algorithm such as the threshold algorithm. The function f(x) is then shifted by this amount by letting f(x) → f(x − x_0), so that there is significant overlap between it and the bump. The value of C computed via the integration is then the bump centroid position estimate, s_E, as measured relative to the position x_0. Alternatively, f(x) could progressively be shifted by small increments and the centroid computed for each case. In this case, when the bump is far from the center of f(x), there is little or no overlap between the two; the output of the integration will be small, with the main contribution coming from noise. However, as f(x) is shifted close to the bump, the two will begin to overlap and the magnitude of the integral will increase. The sign of the result depends on the relative signs of f(x) and the bump in the overlap region. As f(x) is shifted through the bump, the integral will first peak, then decrease and pass through zero as f(x) becomes coincident with the bump centroid, then peak with the opposite sign as f(x) moves away from the bump centroid, and eventually decrease back to just the noise contribution as the overlap decreases to zero. Mathematically, this process takes the form of computing
$$C(x_0) = \int f(x - x_0)\,D(x)\,dx$$

for all values of x_0 in the signal range. The value of C(x_0) is the estimate of the centroid position measured relative to the shift position x_0, i.e., C(x_0) = s_E − x_0. The point is that the integral provides a valid estimate of position only when there is significant overlap between the bump and f(x). This occurs in the region where the magnitude of the integration passes through zero, with the optimum overlap being exactly when the bump is centered at x_0, so that C(x_0) = s_E − x_0 = 0, from which it follows that s_E = x_0. Thus, the algorithm takes the form of correlating f(x) with D(x), with the best estimate of the centroid position, s_E, being given by the value of x_0 that produces the exact zero crossing in the integration. The optimum form of the function f(x) is that which minimizes the noise contribution to C in the region of the zero crossing. In the presence of noise, the centroid calculation is
rewritten as

$$C(x_0) = \int f(x - x_0)\,D(x)\,dx = \int f(x - x_0)\,I(x - s)\,dx + \int f(x - x_0)\,n(x)\,dx = \int f(x)\,I\big(x - (s - x_0)\big)\,dx + \int f(x)\,n(x + x_0)\,dx,$$

where in the last step the integration variable has been changed by replacing (x − x_0) with x. Assuming that the bump centroid, s, is close to x_0, i.e., s − x_0 is much less than the width of the signal, then

$$C(x_0) = \int f(x)\left[I(x) - (s - x_0)\,I'(x)\right]dx + \int f(x)\,n(x + x_0)\,dx,$$

where $I'(x) \equiv \partial I(x)/\partial x$. To have a nonbiased estimate of the centroid position measured relative to x_0, i.e., a nonbiased estimate of the value of s − x_0, the value of C(x_0) as given above must, on average, equal the true value, s − x_0, that is obtained in the absence of noise. Taking the statistical expectation value of both sides gives

$$\langle C(x_0)\rangle = \int f(x)\,I(x)\,dx - (s - x_0)\int I'(x)\,f(x)\,dx + \int f(x)\,\langle n(x + x_0)\rangle\,dx.$$

Letting $\langle n(x + x_0)\rangle \equiv \langle n\rangle = \text{constant}$, then

$$\langle C(x_0)\rangle = \int f(x)\,I(x)\,dx - (s - x_0)\int I'(x)\,f(x)\,dx + \langle n\rangle\int f(x)\,dx.$$

To have $\langle C(x_0)\rangle = s - x_0$, f(x) must satisfy the following set of equations:
f ðxÞIðxÞdx Z 0;
ð
0
I ðxÞf ðxÞdx ZK1;
ð
f ðxÞdx Z 0:
The first and last of these conditions demands that f(x) be an antisymmetric, i.e., an odd, function of x. The second condition is consistent with this because I 0 (x) is antisymmetric because it is assumed that I(x) is symmetric. However, in addition, it specifies the normalization of f(x). All three equations can be satisfied if f(x) is written in the form f ðxÞ ZKÐ
aðxÞ I 0 ðxÞaðxÞdx
;
where a(x) is an antisymmetric function. To determine the optimum form for a(x), the expectation value of the noise is minimized relative to the signal. The last term in the formula for C(x0) is the noise and the second term is the signal. Substituting these terms and the above form for f(x) and taking the expectation value,
q 2007 by Taylor & Francis Group, LLC
Microlithography: Science and Technology
314
noise signal
* Ð
Z
Ð
aðxÞnðxÞdx
2 +
aðxÞI 0 ðxÞdx
2
Z s2 Ð
Ð
ðaðxÞÞ2 dx
aðxÞI 0 ðxÞdx
2
after using hnðxÞnðx 0 ÞiZ s2 dðxKx 0 Þ as appropriate for white noise. To find the function a(x) that minimizes hnoise/signali, replace a(x) with a(x)CDa(x) in the above result, expand in powers of Da(x), and demand that the coefficient of Da(x) to the first power vanish. After some manipulation, this yields the following equation for a(x): Ð ½aðxÞ2 dx aðxÞ Z I ðxÞ Ð : aðxÞI 0 ðxÞdx 0
The solution to this equation is simply aðxÞ Z I 0 ðxÞ and so f ðxÞ ZKÐ
I 0 ðxÞ Z Stationary White Noise Optimum Correlation Function: ðI 0 ðxÞÞ2 dx
This is the standard result: the optimum correlator is the derivative of the ideal bump shape. If one correlated the signal with the ideal bump shape and searched for the peak in the result instead of finding the zero-cross position after correlating with the derivative, then this is the standard matched-filter approach used in many areas of signal processing. The same result can also be derived using the method of least squares. The mean square difference between D(x) and I(xKs) is given by ð
ðDðxÞKIðxKsÞÞ2 dx
The minimization of this requires finding sE such that 0Z
v vs
ð
ðDðxÞKIðxKsÞÞ2 dx
: sZsE
Taking the derivative inside the integral and using the fact that vanishes at the endpoints of the integration gives
Ð
IðxÞI 0 ðxÞdxZ 0 if I
ð
0 Z I 0 ðxKsE ÞDðxÞdx: This is the same result as above, but without the normalization factor. This derivation was presented many years ago by Robert Hufnagel. Note that at the peak of the bump the derivative is zero, whereas at the edges the slope has the largest absolute value. Using the derivative of the bump as the “weighting” function in the correlation shows explicitly that essentially all of the information about the bump centroid comes from its edges with essentially no information coming from its peak. Simply put, if the signal is shifted a small amount, the largest change in signal value occurs in the regions with the largest slope, i.e., the edges, and there is essentially no change in the
q 2007 by Taylor & Francis Group, LLC
Alignment and Overlay
315
Alignment signal “bump”
“Bump” center line
Area
Area
x “The correlator” Left zone Left and right zone area values are assigned to the output at the correlator center line as it slides along the x axis
Right zone
Left zone output
Rightzone output Total=left zone output − right zone output
FIGURE 5.15 The correlator algorithm estimates the mark center as the position that has equal areas in the signal bump in the left and right zones. The algorithm computes the area difference, as shown here for a discretized correlator with only two nonzero zones, as a function of position. The estimated mark position corresponds to the zero crossing position in the bottom curve.
value at the peak. The edges are therefore the most sensitive to the bump position and hence contain the most position information. This is illustrated in Figure 5.15. The above result assumes that the only degrading influence on the signal is stationary white noise, i.e., spatially uncorrelated noise with position-independent statistics. With some effort, the stationary and uncorrelated restrictions can be removed and the corresponding result for correlated nonstationary noise can be derived. However, that is not the problem. The problem is that, generally, noise is not the dominant degrading influence on the alignment signal—process variation is. Thus, the above result only provides a good starting point for picking a correlator function. To achieve the optimum insensitivity to process variations, this result currently must be fine tuned based on actual signal data. In the future, if sufficient understanding of the effect of symmetric and asymmetric process variation on alignment structures is developed, then the optimum correlator for particular cases can be designed from first principles. The correlator algorithm can clearly be implemented in software, but it also can be implemented directly in hardware where it takes the form of a “split detector.” Consider two detectors placed close to one another with their net width being approximately equal to the expected bump width. The voltage from each detector is proportional to the area under the portion of the signal that it intercepts. When these two voltages are equal, then, assuming identical detectors, the signal center is exactly in the middle of the two detectors. If a simple circuit is used to produce the voltage difference, then, just as above, a zero crossing indicates the signal center. Note that the detectors uniformly weight the signal that they intercept rather than derivatize the signal. Therefore, the split-detector approach is equivalent to a “lumped” correlator algorithm where the smoothly varying signal derivative has been replaced by rectangular steps.
5.7.2.1 Noise Sensitivity of the Correlator Algorithm

In the presence of noise, the value of s is still determined by finding the value of $x_0$ for which $C(x_0)$ is zero. From the above equations, this amounts to
$$C(x_0) = 0 = \int f(x)\,I(x)\,dx - (s - x_0)\int I'(x)\,f(x)\,dx + \int f(x)\,n(x + x_0)\,dx.$$
Using $\int f(x)I(x)\,dx = 0$ and $\int I'(x)f(x)\,dx = -1$,
$$(s - x_0) = \int f(x)\,n(x + x_0)\,dx,$$
which is, in general, not equal to zero and amounts to the error in $s_E$ for the particular noise function $n(x)$. Using the form for $f(x)$ given above and calculating the rms error, $\sigma_s$, in $s_E$ assuming spatially stationary noise yields
$$\sigma_s = \sqrt{\frac{\iint I'(x_1)\,I'(x_2)\,\langle n(x_1)\,n(x_2)\rangle\,dx_1\,dx_2}{\left[\int \bigl(I'(x)\bigr)^2\,dx\right]^2}}.$$
For the case where the noise is uncorrelated, so that $\langle n(x_1)n(x_2)\rangle = \sigma_n^2\,\delta(x_1 - x_2)$, this reduces to
$$\sigma_s = \frac{\sigma_n}{\sqrt{\int \bigl(I'(x)\bigr)^2\,dx}}.$$
This result shows explicitly again that the error in $s_E$ is larger when the slope of the ideal bump shape is small. Note that $\langle n(x_1)n(x_2)\rangle = \sigma_n^2\,\delta(x_1 - x_2)$ requires $\sigma_n$ to have units of $I\times\sqrt{\text{length}}$ because the delta function has units of 1/length and $n(x)$ has units of $I$.

5.7.2.2 Discrete Sampling and the Correlator Algorithm

First, for discrete sampling, the integration is replaced by summation, i.e.,
$$C(x_0) = \int f(x - x_0)\,D(x)\,dx \;\longrightarrow\; C_{i_0} = \sum_i f_{i - i_0}\,D_i.$$
Second, as with the threshold algorithm, discrete sampling means that only rarely will an exact zero crossing in the output of the correlation occur exactly at a sample point. Usually, two consecutive values of $i_0$ will straddle the zero crossing, i.e., $C_{i_0}$ and $C_{i_0+1}$ are both small but have opposite signs, so that the true zero crossing occurs between them. If $f_i$ is appropriately normalized, then both $C_{i_0}$ and $C_{i_0+1}$ provide valid estimates of the bump centroid position as measured relative to the $i_0$ and $i_0+1$ positions, respectively, with either result being equally valid. Assuming that $i = 0$ corresponds to the origin of the coordinate system and $\Delta x$ is the sample spacing, the two estimates are $s_E = C_{i_0} + i_0\,\Delta x$ and $s_E = C_{i_0+1} + (i_0 + 1)\,\Delta x$, respectively. Averaging the two results yields a better estimate given by
$$s_E = \left(i_0 + \tfrac{1}{2}\right)\Delta x + \tfrac{1}{2}\bigl(C_{i_0} + C_{i_0+1}\bigr).$$
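A minimal numerical sketch of the discretized correlator follows. This code is not from the original text; the Gaussian mark shape, noise level, lag search range, and normalization are illustrative assumptions. It weights the sampled signal with the derivative of the ideal bump, locates the zero crossing of the correlator output, and interpolates between the two straddling samples as described above.

```python
import numpy as np

rng = np.random.default_rng(0)

dx = 0.05                                   # sample spacing (assumed units)
x = np.arange(-10.0, 10.0, dx)
ideal = np.exp(-x**2)                       # assumed ideal mark signal I(x)

true_shift = 0.37                           # unknown mark offset to be recovered
signal = np.exp(-(x - true_shift) ** 2) + 0.01 * rng.standard_normal(x.size)

# Correlator function f = -I', normalized so that sum(I' * f) * dx = -1.
# With this normalization, C(x0) ~ (s - x0) near the zero crossing.
dI = np.gradient(ideal, dx)
f = -dI / (np.sum(dI**2) * dx)

# Correlator output C_{i0} = sum_i f_{i - i0} D_i * dx, evaluated by sliding f.
# (np.roll wraps around, but f is essentially zero at the array ends.)
lags = np.arange(-40, 41)
C = np.array([np.sum(np.roll(f, k) * signal) * dx for k in lags])

# Pick the positive-to-negative crossing with the largest drop; small spurious
# crossings far from the mark are caused by noise alone.
drops = np.where((C[:-1] > 0) & (C[1:] <= 0))[0]
i = drops[np.argmax(C[drops] - C[drops + 1])]

# Average of the two straddling estimates, as in the text.
est = (lags[i] + 0.5) * dx + 0.5 * (C[i] + C[i + 1])

print(f"true shift  = {true_shift:.4f}")
print(f"estimated s = {est:.4f}")
```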
This result is exactly equivalent to linear interpolation of the zero-crossing position from the two bounding values, because assuming proper normalization of $f(x)$ is equivalent to making the slope of the curve equal to unity.

5.7.3 Fourier Algorithm

This algorithm is based on Fourier analysis of the signal. It is perhaps most straightforward to apply it to signals that closely approximate sinusoidal waveforms, but it can, in fact, be applied to any signal. We discuss the algorithm first for nonsinusoidal signals and then show the added benefit that accrues when it is applied to sinusoidal signals such as would be produced by a grating sensor as discussed, for example, by Gatherer and Meng [35]. Assuming that $I(x)$ is real and symmetric, its Fourier transform,
$$\tilde I(\beta) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} I(x)\,e^{i\beta x}\,dx,$$
is real and symmetric, i.e.,
$$\tilde I(\beta) = \tilde I^{*}(\beta)\quad\text{(real)}, \qquad \tilde I(\beta) = \tilde I(-\beta)\quad\text{(symmetric)}.$$
The parameter $\beta$ is the spatial frequency in radians/(unit length) $= 2\pi\times$ cycles/(unit length). The Fourier transform of the measured signal, $\tilde D(\beta)$, is related to $\tilde I(\beta)$ by a phase factor:
$$\tilde D(\beta) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} D(x)\,e^{i\beta x}\,dx
= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} I(x - s)\,e^{i\beta x}\,dx
= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} I(x')\,e^{i\beta(x' + s)}\,dx'
= e^{i\beta s}\,\tilde I(\beta),$$
where in the first step $D(x) = I(x - s)$ has been used as given above, and in the second step the integration variable has been changed according to $x = x' + s$. Remembering that $\tilde I(\beta)$ is real, $s$ can then be calculated by scaling the arctangent of the ratio of the imaginary to the real component of the Fourier transform as follows:
$$s = \frac{1}{\beta}\arctan\!\left(\frac{\mathrm{Im}\bigl(\tilde D(\beta)\bigr)}{\mathrm{Re}\bigl(\tilde D(\beta)\bigr)}\right).$$
This can be proven by first noting that, because $\tilde I(\beta)$ is real, $\mathrm{Im}\bigl(\tilde D(\beta)\bigr) = \sin(\beta s)\,\tilde I(\beta)$ and $\mathrm{Re}\bigl(\tilde D(\beta)\bigr) = \cos(\beta s)\,\tilde I(\beta)$. Therefore, $\tilde I(\beta)$ cancels in the ratio, leaving $\sin(\beta s)/\cos(\beta s) = \tan(\beta s)$. Then, taking the arctangent and dividing by the spatial frequency, $\beta$, leaves the shift, $s$, as desired.
Like the correlator algorithm, the Fourier algorithm can also be derived using least squares. Substituting the Fourier transform representations
$$I(x) = \frac{1}{\sqrt{2\pi}}\int \tilde I(\beta)\,e^{-i\beta x}\,d\beta, \qquad
D(x) = I(x - s_0) = \frac{1}{\sqrt{2\pi}}\int \tilde I(\beta)\,e^{-i\beta(x - s_0)}\,d\beta$$
into the least-squares integral $\int \bigl(D(x) - I(x - s)\bigr)^2\,dx$, taking a derivative with respect to $s$, and setting the result equal to zero for $s = s_E$ yields
$$0 = \int \beta\,\bigl|\tilde I(\beta)\bigr|^2 \sin\bigl(\beta(s_E - s_0)\bigr)\,d\beta.$$
Using $\sin\bigl(\beta(s_E - s_0)\bigr) \approx \beta(s_E - s_0)$ for $s_E$ close to $s_0$ and $\beta s_0 = \arctan\bigl(\mathrm{Im}[\tilde D(\beta)]/\mathrm{Re}[\tilde D(\beta)]\bigr)$, the same result as above is obtained. There are several interesting aspects to the above result. Although the right-hand side can be evaluated for different values of $\beta$, they all yield the same value of $s$. Therefore, in the absence of any complicating factors such as noise or inherent signal asymmetry, any value of $\beta$ can be used and the result will be the same.

5.7.3.1 Noise Sensitivity of the Fourier Algorithm

In the presence of noise, the Fourier transform of the signal data takes the form
$$\tilde D(\beta) = e^{i\beta s}\,\tilde I(\beta) + \tilde n(\beta).$$
Substituting the above form for $\tilde D(\beta)$ into the result given above for $s$ yields
$$s(\beta) = \frac{1}{\beta}\arctan\!\left(\frac{\tilde I(\beta)\sin(\beta s) + \mathrm{Im}[\tilde n(\beta)]}{\tilde I(\beta)\cos(\beta s) + \mathrm{Re}[\tilde n(\beta)]}\right),$$
where $s$ is now a function of $\beta$. That is, different $\beta$ values will yield different estimates for $s$. The best estimate will be obtained from a weighted average of the different $s$ values. This weighted average can be written as
$$s_E = \int f(\beta)\,s(\beta)\,d\beta
= \int \frac{f(\beta)}{\beta}\arctan\!\left(\frac{\tilde I(\beta)\sin(\beta s) + \mathrm{Im}[\tilde n(\beta)]}{\tilde I(\beta)\cos(\beta s) + \mathrm{Re}[\tilde n(\beta)]}\right) d\beta
\approx s\int f(\beta)\,d\beta + \int \frac{f(\beta)}{\beta}\,\frac{\cos(\beta s)\,\mathrm{Im}[\tilde n(\beta)] - \sin(\beta s)\,\mathrm{Re}[\tilde n(\beta)]}{\tilde I(\beta)}\,d\beta.$$
In the last step, it was assumed that $f(\beta)$ is large in regions where the signal-to-noise ratio is large, i.e., $\tilde I \gg \tilde n$, and that it is essentially zero in regions where the signal-to-noise ratio is small, i.e., $\tilde I \ll \tilde n$. Assuming zero-mean noise so that $\langle \tilde n(\beta)\rangle = 0$, then $\int f(\beta)\,d\beta = 1$ must hold for $s_E$ to be equal to the true answer, $s$, on average, i.e., $\langle s_E\rangle = s$. The error in $s_E$ is then given by the second term, and
$$\sigma_s^2 = \iint \frac{f(\beta_1)}{\beta_1\,\tilde I(\beta_1)}\,\frac{f(\beta_2)}{\beta_2\,\tilde I(\beta_2)}
\Bigl\langle \bigl(\cos(\beta_1 s)\,\mathrm{Im}[\tilde n(\beta_1)] - \sin(\beta_1 s)\,\mathrm{Re}[\tilde n(\beta_1)]\bigr)
\bigl(\cos(\beta_2 s)\,\mathrm{Im}[\tilde n(\beta_2)] - \sin(\beta_2 s)\,\mathrm{Re}[\tilde n(\beta_2)]\bigr)\Bigr\rangle\,d\beta_1\,d\beta_2.$$
Using the fact that $n(x)$ is real gives $\mathrm{Re}[\tilde n(\beta)] = \tfrac{1}{2}\bigl[\tilde n(\beta) + \tilde n(-\beta)\bigr]$ and $\mathrm{Im}[\tilde n(\beta)] = \tfrac{1}{2i}\bigl[\tilde n(\beta) - \tilde n(-\beta)\bigr]$. Assuming $n(x)$ is uncorrelated, i.e., $\langle n(x)\,n(x')\rangle = \sigma_n^2\,\delta(x - x')$, it follows that
$$\langle \mathrm{Re}[\tilde n(\beta_1)]\,\mathrm{Re}[\tilde n(\beta_2)]\rangle = \frac{\sigma_n^2}{2}\bigl[\delta(\beta_1 - \beta_2) + \delta(\beta_1 + \beta_2)\bigr],$$
$$\langle \mathrm{Im}[\tilde n(\beta_1)]\,\mathrm{Im}[\tilde n(\beta_2)]\rangle = \frac{\sigma_n^2}{2}\bigl[\delta(\beta_1 - \beta_2) - \delta(\beta_1 + \beta_2)\bigr],$$
$$\langle \mathrm{Re}[\tilde n(\beta_1)]\,\mathrm{Im}[\tilde n(\beta_2)]\rangle = 0.$$
Substituting the above equations and assuming $f(\beta) = f(-\beta)$ yields
$$\sigma_s^2 = \sigma_n^2 \int \left(\frac{f(\beta)}{\beta\,\tilde I(\beta)}\right)^2 d\beta.$$
The optimum form for $f(\beta)$ can be found by letting $f(\beta) = a(\beta)/\!\int a(\beta)\,d\beta$, so that $\int f(\beta)\,d\beta = 1$ is automatically satisfied; then by replacing $a$ with $a + \Delta a$, expanding in powers of $\Delta a$, and finally demanding that the first-order-in-$\Delta a$ terms vanish for all $\Delta a$. This yields the following relation:
$$\frac{\int \Delta a(\beta)\,d\beta}{\int a(\beta)\,d\beta}\int \left(\frac{a(\beta)}{\beta\,\tilde I(\beta)}\right)^2 d\beta = \int \frac{a(\beta)\,\Delta a(\beta)}{\bigl(\beta\,\tilde I(\beta)\bigr)^2}\,d\beta,$$
which is satisfied by letting $a(\beta) = \bigl(\beta\,\tilde I(\beta)\bigr)^2$, which then gives
$$f(\beta) = \frac{\bigl(\beta\,\tilde I(\beta)\bigr)^2}{\int \bigl(\beta\,\tilde I(\beta)\bigr)^2\,d\beta}.$$
Substituting the above equation then gives
$$s_E = \frac{1}{\int \bigl(\beta\,\tilde I(\beta)\bigr)^2\,d\beta}\int \beta\,\bigl(\tilde I(\beta)\bigr)^2 \arctan\!\left(\frac{\mathrm{Im}[\tilde D(\beta)]}{\mathrm{Re}[\tilde D(\beta)]}\right) d\beta$$
and
$$\sigma_s = \frac{\sigma_n}{\sqrt{\int \bigl(\beta\,\tilde I(\beta)\bigr)^2\,d\beta}}.$$
Thus, the optimum weighting is proportional to the power spectrum of the signal, $\tilde I^2$, as one would expect when the noise is uncorrelated. Also, the $\beta$ factor shows that there is no information about the position of the bump for $\beta \approx 0$. This is simply a consequence of the fact that $\beta = 0$ corresponds to a constant value in $x$ that carries no centroid information. Finally, $I'(x)$ in Fourier or $\beta$ space is given by $i\beta\,\tilde I(\beta)$, and thus $\sigma_s$ has the same basic form in both the correlator and Fourier algorithms. Note that $\langle n(x)\,n(x')\rangle = \sigma_n^2\,\delta(x - x')$ requires $\sigma_n$ to have units of $I\times\sqrt{\text{length}}$ because the delta function has units of 1/length and $n(x)$ has units of $I$. This is exactly what is required for $\sigma_s$ to have units of length, because $\tilde I$ has units of $I\times$ length.
5.7.3.2 Discrete Sampling and the Fourier Algorithm The main effect of having discretely sampled rather than continuous data is to replace all the integrals in the above analysis with sums, i.e., replace true Fourier transforms with discrete Fourier transforms (DFTs) or their fast algorithmic implementation, fast Fourier transforms (FFTs).
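As an illustration of the discretely sampled version, the following Python sketch estimates the shift from the phase of the discrete Fourier transform and combines the per-frequency estimates with the $(\beta\,\tilde I(\beta))^2$ weighting derived above. The code is not part of the original text; the Gaussian mark, noise level, and grid are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

n, dx = 1024, 0.05
x = (np.arange(n) - n // 2) * dx
ideal = np.exp(-x**2)                          # assumed ideal mark signal I(x)

true_shift = 0.43
data = np.exp(-(x - true_shift) ** 2) + 0.02 * rng.standard_normal(n)

beta = 2 * np.pi * np.fft.rfftfreq(n, d=dx)    # spatial frequency (rad / unit length)
I_t = np.fft.rfft(np.fft.ifftshift(ideal))     # \tilde I(beta): real for a symmetric mark
D_t = np.fft.rfft(np.fft.ifftshift(data))      # \tilde D(beta)

# numpy's FFT uses the e^{-i beta x} convention, opposite in sign to the text,
# so the measured phase is -beta*s rather than +beta*s.
phase = np.angle(D_t[1:] * np.sign(I_t[1:].real))
s_beta = -phase / beta[1:]                     # per-frequency shift estimates s(beta)

# Optimum weighting f(beta) proportional to (beta * I(beta))^2, normalized to sum to 1.
w = (beta[1:] * np.abs(I_t[1:])) ** 2
w /= w.sum()

print(f"true shift  = {true_shift:.4f}")
print(f"estimated s = {np.sum(w * s_beta):.4f}")
```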
5.7.3.3 Application of the Fourier Algorithm to Grating Sensors

In many grating sensors and in some nongrating sensors, the pure mark signal is not an isolated bump; it is a sinusoid of a specific known frequency, say $\beta_0$, multiplied possibly by a slowly varying envelope function. The information about the mark position in this case is encoded in the phase of the sinusoid. The total detected signal will, as usual, be corrupted by noise and other effects that, in general, add sinusoids of all different frequencies, phases, and amplitudes to the pure $\beta_0$ sinusoid. However, because it is known that the mark position information is contained only in the $\beta_0$ frequency component of the signal, all the other frequency components can simply be ignored in a first approximation. They are useful only as a diagnostic for estimating the goodness of the signal. That is, if all the other frequency components are small enough so that the signal is almost purely a $\beta_0$ sinusoid, then the expectation is that the mark is clean and uncorrupted and the noise level is low, in which case one can have high confidence in the mark position predicted by the signal. On the other hand, if the other frequency components of the signal are as large as or larger than the $\beta_0$ frequency component, then it is likely that the $\beta_0$ frequency component is severely corrupted by noise and the resulting centroid prediction is suspect. Using the above result for computing $s$ from the Fourier transform of the signal, but using only the $\beta_0$ frequency component in the calculation, yields
$$s = \frac{1}{\beta_0}\arctan\!\left(\frac{\mathrm{Im}\bigl[\tilde D(\beta_0)\bigr]}{\mathrm{Re}\bigl[\tilde D(\beta_0)\bigr]}\right),$$
and in the presence of noise,
$$s_E = \frac{1}{\beta_0}\arctan\!\left(\frac{\tilde I(\beta_0)\sin(\beta_0 s) + \mathrm{Im}[\tilde n(\beta_0)]}{\tilde I(\beta_0)\cos(\beta_0 s) + \mathrm{Re}[\tilde n(\beta_0)]}\right)
\approx s + \frac{\cos(\beta_0 s)\,\mathrm{Im}[\tilde n(\beta_0)] - \sin(\beta_0 s)\,\mathrm{Re}[\tilde n(\beta_0)]}{\beta_0\,\tilde I(\beta_0)}$$
for $\tilde n(\beta_0) \ll \tilde I(\beta_0)$. The effect of noise on the grating result is given by
$$\sigma_s = \frac{\sigma_n}{\sqrt{\bigl(\beta_0\,\tilde I(\beta_0)\bigr)^2\,\Delta\beta}},$$
where $\Delta\beta$ is the frequency resolution of the sensor.

5.7.4 Global Alignment Algorithm

The purpose of the global alignment algorithm is to combine all the separate alignment-mark position measurements into an optimum estimate of the correctable components of the field and grid distortions, along with the overall grid and field positions. These "correctable" components generally consist of some or all of the linear distortion terms described in the Appendix. As discussed in previous sections, each field will be printed with roughly the same rotation, magnification, skew, etc., with respect to the expected field. The linear components of the average field distortion are referred to collectively as field terms. The position of a given reference point in each field, such as the field center, defines the "grid," and these points will also have some amount of rotation, magnification, skew, etc., with respect to the expected grid. The linear components of the grid distortion are referred to collectively as grid terms. In global fine alignment, where the alignment marks on only a few fields on the wafer are measured, both field and grid terms need to be determined from the alignment data to perform overlay. In field-by-field alignment, where
each field is aligned based only on the data from that field, the grid terms are not directly relevant. Here, only global fine alignment is considered. To be the most general, all six linear distortion terms discussed in the Appendix will be solved for: x and y translation, rotation, skew, x magnification, and y magnification. Note that not all exposure tools can correct for all of these terms; therefore, the algorithm must be adjusted accordingly. A generic alignment system will be considered that measures and returns the x and y position values of each of $N_M$ alignment marks in each of $N_F$ fields on a wafer. Let $m = 1, 2, \ldots, N_M$ label the marks in each field and $f = 1, 2, \ldots, N_F$ label the fields. The following matrix-vector notation will be used for position, as measured with respect to some predefined coordinate system fixed with respect to the exposure tool:
$$\mathbf{r}_{mf} = \begin{pmatrix} x_{mf}\\ y_{mf}\end{pmatrix} = \text{expected position of mark } m \text{ in field } f,$$
$$\mathbf{r}'_{mf} = \begin{pmatrix} x'_{mf}\\ y'_{mf}\end{pmatrix} = \text{measured position of mark } m \text{ in field } f,$$
$$\mathbf{R}_{f} = \begin{pmatrix} X_{f}\\ Y_{f}\end{pmatrix} = \text{expected position of the field } f \text{ reference point},$$
$$\mathbf{R}'_{f} = \begin{pmatrix} X'_{f}\\ Y'_{f}\end{pmatrix} = \text{measured position of the field } f \text{ reference point}.$$
To be explicit, a reference point for each field must now be chosen. It is the difference between the measured and expected positions of this reference point that defines the translation of the field. A suitable choice would be the center of the field, but this is not necessary. Basically, any point within the field can be used, although this is not to say that all points are equal in this regard. Different choices will result in different noise propagation and rounding errors in any real implementation; the reference point must be chosen to minimize these effects to the extent necessary. The "center of mass" of the mark positions will be taken to be the reference point, i.e., the position of the reference point of field $f$ is defined by
$$\mathbf{R}_f = \frac{1}{N_M}\sum_m \mathbf{r}_{mf}.$$
If the alignment marks are symmetrically arrayed around a field, then $\mathbf{R}_f$ as defined above corresponds to the field center. The analysis is simplified if it is assumed that the field terms are defined with respect to the field reference point, i.e., field rotation, skew, and x and y magnification do not affect the position of the reference point. This can be done by writing
$$\mathbf{r}_{mf} = \mathbf{R}_f + \mathbf{d}_m,$$
which effectively defines $\mathbf{d}_m$ as the position of mark $m$ measured with respect to the reference point. The field terms are applied to $\mathbf{d}_m$ and the grid terms are applied to $\mathbf{R}_f$. Combining the previous two equations yields the following constraint:
$$\sum_m \mathbf{d}_m = 0.$$
Remember that the inherent assumption of the global fine alignment algorithm is that all the fields are identical; therefore, $\mathbf{d}_m$ does not require a field index, $f$. However, the
measured $\mathbf{d}_m$ values will vary from field to field. Therefore, for the measured data,
$$\mathbf{r}'_{mf} = \mathbf{R}'_f + \mathbf{d}'_{mf}.$$
The implicit assumption of global fine alignment is that, to the overlay accuracy required, one can write
$$\mathbf{r}'_{mf} = \mathbf{T} + \mathbf{G}\cdot\mathbf{R}_f + \mathbf{F}\cdot\mathbf{d}_m + \mathbf{n}_{mf},$$
where $\mathbf{R}_f$ and $\mathbf{d}_m$ are the expected grid and mark positions, and
$$\mathbf{T} = \begin{pmatrix} T_x\\ T_y\end{pmatrix} = \text{grid translation},$$
$$\mathbf{G} = \begin{pmatrix} G_{xx} & G_{xy}\\ G_{yx} & G_{yy}\end{pmatrix} = \text{grid rotation, skew, and magnification matrix},$$
$$\mathbf{F} = \begin{pmatrix} F_{xx} & F_{xy}\\ F_{yx} & F_{yy}\end{pmatrix} = \text{field rotation, skew, and magnification matrix}.$$
See the Appendix for the relationship between the matrix elements and the geometric concepts of rotation, skew, and magnification. The term $\mathbf{n}_{mf}$ is noise, which is nominally assumed to have a zero-mean Gaussian probability distribution and to be uncorrelated from field to field and from mark to mark. The field translations are, by definition, just the shifts of the reference point of each field:
$$[\text{translation of field } f] = \mathbf{T} + \mathbf{G}\cdot\mathbf{R}_f = \begin{pmatrix} T_x\\ T_y\end{pmatrix} + \begin{pmatrix} G_{xx} & G_{xy}\\ G_{yx} & G_{yy}\end{pmatrix}\cdot\begin{pmatrix} X_f\\ Y_f\end{pmatrix}.$$
Throughout this analysis, the "+" and "$\cdot$" indicate standard matrix addition and multiplication, respectively. In the equation for $\mathbf{r}'_{mf}$, the unknowns are the field and grid terms. The expected positions and the measured positions are known. Thus, the equation must be inverted to solve for the combined field and grid terms, which amounts to 10 nominally independent numbers (two from the translation vector and four each from the grid and field matrices). The nominal independence of the 10 terms must be verified in each case because some exposure tools and/or processes will, for example, have no skew (so that term is explicitly zero), or the grid and field isotropic magnification terms will automatically be equal, etc. All 10 terms will be taken to be independent for the remainder of this discussion. Appropriate adjustment of the results for dependent or known terms is straightforward. Solving for the 10 terms from the expected and measured position values is generally done using some version of a least-squares fit. The least-squares approach, in a strict sense, applies only to Gaussian-distributed uncorrelated noise. Because real alignment measurements are often corrupted by "flyers" or "outliers," i.e., data values that are not part of a Gaussian probability distribution, some alteration of the basic least-squares approach must be made to eliminate or at least reduce their effect on the final result. Iterative least-squares uses weighting factors to progressively reduce the contribution from data values that deviate significantly from the fitted values. For example, if $\sigma$ is the rms deviation between the measured and fitted positions, one can simply eliminate all data values that fall outside some specified range measured in units of $\sigma$; e.g., all points outside a $\pm 3\sigma$ range could be eliminated, and the fit is then recalculated without these points. This is
an all-or-nothing approach: a data value is either used, i.e., has weight 1 in the algorithm, or not, i.e., it has weight 0. A refinement of this approach allows the weight values to be chosen anywhere in the range 0–1. Often a single iteration of this procedure is not enough, and it must be repeated several times before the results stabilize. Procedures of this type, i.e., ones that attempt, based on some criterion, to reduce or eliminate the effect of "flyers" on the final results, go by the general name of robust statistics. Under this heading, there are also some basic variations on the least-squares approach itself, such as "least median of squares" or the so-called "L1" approach, which minimizes the sum of absolute values rather than the sum of squares. An excellent and complete discussion of all the above considerations is given by Branham [36]. Which, if any, of these approaches is used is exposure-tool dependent. The optimum approach to apply in a particular case must be determined from the statistics of the measured data, including overlay results. Finally, it is not the straightforward software implementation of the least-squares solution derived below that is difficult; it is all the ancillary problems that must be accounted for that present the difficulty in any real application, such as the determination and elimination of flyers, allowing for missing data, determining when more fields are needed and which fields to add, etc. More sophisticated approaches to eliminating flyers are discussed by Nakajima et al. [37]. For the purposes of understanding the basic concept of global alignment, a single iteration of the standard least-squares algorithm is assumed in the derivation given below. Substituting the matrix-vector form for the field and grid terms into the equation for $\mathbf{r}'_{mf}$, rearranging terms, and separating out the x and y components yields
$$n_{x\,mf} = x'_{mf} - T_x - X_f\,G_{xx} - Y_f\,G_{xy} - d_{x\,m}\,F_{xx} - d_{y\,m}\,F_{xy}$$
and
$$n_{y\,mf} = y'_{mf} - T_y - X_f\,G_{yx} - Y_f\,G_{yy} - d_{x\,m}\,F_{yx} - d_{y\,m}\,F_{yy}.$$
The x and y terms can be treated separately. With the equations written again in matrix-vector form, but now clustering with respect to the grid and field terms, the x equations become
$$\underbrace{\begin{pmatrix} n_{x\,11}\\ n_{x\,21}\\ n_{x\,31}\\ \vdots\\ n_{x\,N_M N_F}\end{pmatrix}}_{\text{error}\ \equiv\ \boldsymbol{\varepsilon}_x}
=
\underbrace{\begin{pmatrix} x'_{11}\\ x'_{21}\\ x'_{31}\\ \vdots\\ x'_{N_M N_F}\end{pmatrix}}_{\text{data}\ \equiv\ \mathbf{D}_x}
-
\underbrace{\begin{pmatrix}
1 & X_1 & Y_1 & d_{x\,1} & d_{y\,1}\\
1 & X_1 & Y_1 & d_{x\,2} & d_{y\,2}\\
1 & X_1 & Y_1 & d_{x\,3} & d_{y\,3}\\
\vdots & \vdots & \vdots & \vdots & \vdots\\
1 & X_{N_F} & Y_{N_F} & d_{x\,N_M} & d_{y\,N_M}
\end{pmatrix}}_{\mathbf{A}}
\cdot
\underbrace{\begin{pmatrix} T_x\\ G_{xx}\\ G_{xy}\\ F_{xx}\\ F_{xy}\end{pmatrix}}_{\text{unknowns}\ \equiv\ \mathbf{U}_x}$$
and the y equations become
$$\underbrace{\begin{pmatrix} n_{y\,11}\\ n_{y\,21}\\ n_{y\,31}\\ \vdots\\ n_{y\,N_M N_F}\end{pmatrix}}_{\text{error}\ \equiv\ \boldsymbol{\varepsilon}_y}
=
\underbrace{\begin{pmatrix} y'_{11}\\ y'_{21}\\ y'_{31}\\ \vdots\\ y'_{N_M N_F}\end{pmatrix}}_{\text{data}\ \equiv\ \mathbf{D}_y}
-
\underbrace{\begin{pmatrix}
1 & X_1 & Y_1 & d_{x\,1} & d_{y\,1}\\
1 & X_1 & Y_1 & d_{x\,2} & d_{y\,2}\\
1 & X_1 & Y_1 & d_{x\,3} & d_{y\,3}\\
\vdots & \vdots & \vdots & \vdots & \vdots\\
1 & X_{N_F} & Y_{N_F} & d_{x\,N_M} & d_{y\,N_M}
\end{pmatrix}}_{\mathbf{A}}
\cdot
\underbrace{\begin{pmatrix} T_y\\ G_{yx}\\ G_{yy}\\ F_{yx}\\ F_{yy}\end{pmatrix}}_{\text{unknowns}\ \equiv\ \mathbf{U}_y}.$$
Using the indicated notation, the above equations reduce to
$$\boldsymbol{\varepsilon}_x = \mathbf{D}_x - \mathbf{A}\cdot\mathbf{U}_x
\qquad\text{and}\qquad
\boldsymbol{\varepsilon}_y = \mathbf{D}_y - \mathbf{A}\cdot\mathbf{U}_y.$$
The standard least-squares solutions are found by minimizing the sums of the squares of the errors:
$$\boldsymbol{\varepsilon}_x^{T}\cdot\boldsymbol{\varepsilon}_x = (\mathbf{D}_x - \mathbf{A}\cdot\mathbf{U}_x)^{T}\cdot(\mathbf{D}_x - \mathbf{A}\cdot\mathbf{U}_x)
\qquad\text{and}\qquad
\boldsymbol{\varepsilon}_y^{T}\cdot\boldsymbol{\varepsilon}_y = (\mathbf{D}_y - \mathbf{A}\cdot\mathbf{U}_y)^{T}\cdot(\mathbf{D}_y - \mathbf{A}\cdot\mathbf{U}_y).$$
The superscript "T" indicates the matrix transpose. Taking derivatives with respect to the elements of the unknown vectors, i.e., taking derivatives one by one with respect to the field and grid terms, and setting the results to zero to find the minimum yields, after some algebra,
$$\mathbf{U}_x = (\mathbf{A}^{T}\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^{T}\cdot\mathbf{D}_x
\qquad\text{and}\qquad
\mathbf{U}_y = (\mathbf{A}^{T}\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^{T}\cdot\mathbf{D}_y,$$
where the superscript "$-1$" indicates the matrix inverse. Note that the $\mathbf{A}$ matrix is fixed for a given set of fields and marks. Thus, the combination $(\mathbf{A}^{T}\cdot\mathbf{A})^{-1}\cdot\mathbf{A}^{T}$ can be computed for a particular set of fields and marks, and the result can simply be matrix-multiplied against the column vectors of x and y data to produce the best-fit field and grid terms. Alignment is then performed by using this data in the $\mathbf{r}_{mf} = \mathbf{T} + \mathbf{G}\cdot\mathbf{R}_f + \mathbf{F}\cdot\mathbf{d}_m$ equation in a feed-forward sense to compute the position, orientation, and linear distortion of all the fields on the wafer.
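A compact numerical sketch of this procedure is given below. The code is not from the original text; the field layout, mark layout, distortion values, noise level, and the single 3-sigma trimming pass are all illustrative assumptions. It builds the A matrix from the expected field and mark positions, solves for the x and y unknown vectors, and repeats the fit once after discarding flyers. A linear least-squares solver is used here in place of the explicit normal-equation product, since it computes the same solution in a numerically stabler way.

```python
import numpy as np

rng = np.random.default_rng(2)

# Expected grid (field reference points) and marks within a field, relative to
# the field "center of mass", in mm.  Both layouts are illustrative choices.
R = np.array([[x, y] for x in (-30.0, 0.0, 30.0) for y in (-30.0, 0.0, 30.0)])
d = np.array([[-10.0, -10.0], [10.0, -10.0], [10.0, 10.0], [-10.0, 10.0]])

# "True" distortion used only to synthesize measured data.
T_true = np.array([0.050e-3, -0.020e-3])                   # translation, mm
G_true = np.array([[1 + 3e-6, -2e-6], [2e-6, 1 - 1e-6]])   # grid matrix
F_true = np.array([[1 + 5e-6, -4e-6], [4e-6, 1 + 1e-6]])   # field matrix

meas = np.array([T_true + G_true @ Rf + F_true @ dm for Rf in R for dm in d])
meas += 3e-6 * rng.standard_normal(meas.shape)             # ~3 nm rms noise

# Design matrix: one row per (field, mark) pair, columns [1, Xf, Yf, dx, dy].
A = np.array([[1.0, Rf[0], Rf[1], dm[0], dm[1]] for Rf in R for dm in d])

def solve(A, Dx, Dy):
    Ux, *_ = np.linalg.lstsq(A, Dx, rcond=None)   # [Tx, Gxx, Gxy, Fxx, Fxy]
    Uy, *_ = np.linalg.lstsq(A, Dy, rcond=None)   # [Ty, Gyx, Gyy, Fyx, Fyy]
    return Ux, Uy

Ux, Uy = solve(A, meas[:, 0], meas[:, 1])

# One simple robust-statistics pass: drop points whose residual exceeds 3*rms
# and recompute the fit without them.
res = np.hypot(meas[:, 0] - A @ Ux, meas[:, 1] - A @ Uy)
keep = res < 3.0 * res.std()
Ux, Uy = solve(A[keep], meas[keep, 0], meas[keep, 1])

print("Tx, Gxx, Gxy, Fxx, Fxy =", np.round(Ux, 8))
print("Ty, Gyx, Gyy, Fyx, Fyy =", np.round(Uy, 8))
```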
Appendix

Let x and y be standard orthogonal Cartesian coordinates in two dimensions. Consider an arbitrary combination of translation, rotation, and distortion of the points in the plane. This will carry each original point (x, y) to a new position, (x', y'), i.e.,
$$x \rightarrow x' = f(x, y), \qquad y \rightarrow y' = g(x, y).$$
The functions f and g can be expressed as power series in the x and y coordinates with the form
$$x' = f(x, y) = T_x + M_{xx}\,x + M_{xy}\,y + C x^2 + D xy + \cdots,$$
$$y' = g(x, y) = T_y + M_{yx}\,x + M_{yy}\,y + E y^2 + F xy + \cdots,$$
where the T, M, C, D, E, F, ... coefficients are all constant, i.e., independent of x and y. The T terms represent a constant shift of all the points in the plane by the amount $T_x$ in the x direction and by the amount $T_y$ in the y direction. The M terms represent shifts in the coordinate values that depend linearly on the original coordinate values. The remaining C, D, E, F, and higher-order terms all depend nonlinearly on the original coordinate values. Using matrix-vector notation, the above two equations can then be written as a single
equation of the form
$$\begin{pmatrix} x'\\ y'\end{pmatrix}
= \underbrace{\begin{pmatrix} T_x\\ T_y\end{pmatrix}}_{\text{constant}}
+ \underbrace{\begin{pmatrix} M_{xx} & M_{xy}\\ M_{yx} & M_{yy}\end{pmatrix}\cdot\begin{pmatrix} x\\ y\end{pmatrix}}_{\text{linear term}}
+ \ \text{nonlinear terms},$$
where the "+" and "$\cdot$" indicate standard matrix addition and multiplication, respectively. The constant term is a translation that has separate x and y values. The linear term involves four independent constants: $M_{xx}$, $M_{xy}$, $M_{yx}$, and $M_{yy}$. These can be expressed as combinations of the more geometric concepts of rotation, skew, x-magnification (x-mag), and y-magnification (y-mag). Each of these "pure" transformations can be written as a single matrix:
$$\text{Rotation} = \begin{pmatrix} \cos(\theta_z) & -\sin(\theta_z)\\ \sin(\theta_z) & \cos(\theta_z)\end{pmatrix},$$
$$\text{Skew} = \begin{pmatrix} 1 & 0\\ \sin(\psi) & 1\end{pmatrix},$$
$$\text{x-mag} = \begin{pmatrix} m_x & 0\\ 0 & 1\end{pmatrix},$$
$$\text{y-mag} = \begin{pmatrix} 1 & 0\\ 0 & m_y\end{pmatrix}.$$
Here, $\theta_z$ is the rotation angle and $\psi$ is the skew angle, both measured in radians; $m_x$ and $m_y$ are the x and y magnifications, respectively, both of which are unitless. Both skew and rotation are area-preserving because their determinants are unity, whereas x-mag and y-mag change the area by factors of $m_x$ and $m_y$, respectively. Skew has been defined above to correspond geometrically to a rotation of just the x axis by itself, i.e., x-skew. Instead of using rotation and x-skew, one could use rotation and y-skew, or the combination x-skew and y-skew. Similarly, instead of using x-mag and y-mag, the combinations isotropic magnification ("iso-mag") and x-mag, or iso-mag and y-mag, could have been used. Which combinations are chosen is purely a matter of convention (Figure 5.16). The net linear transformation matrix, $\mathbf{M}$, can be written as the product of the mag, skew, and rotation matrices. Because matrix multiplication is not commutative, the exact form that $\mathbf{M}$ takes in this case depends on the order in which the separate matrices are multiplied. However, because most distortions encountered in an exposure tool are small, only the infinitesimal forms of the matrices need to be considered, in which case the result is commutative. Using the approximations $\cos(\phi) \approx 1$, $\sin(\phi) \approx \phi$, $m_x \approx 1 + m_x$, and $m_y \approx 1 + m_y$ (where, in the last two expressions, the $m_x$ and $m_y$ on the right-hand sides denote the small deviations of the magnifications from unity),
FIGURE 5.16 The various standard linear distortions in the plane are illustrated (panels: x-translation, y-translation, x-skew, y-skew, rotation (iso-skew), isotropic magnification, x-mag, and y-mag). As discussed in the text, various combinations of rotation, skew, and magnification can be used as a complete basis set for linear distortion. For example, isotropic magnification is the equal-weight linear combination of x-magnification and y-magnification; rotation is the equal-weight linear combination of x-skew and y-skew.
and then expanding to first order in all the small terms, $\theta_z$, $\psi$, $m_x$, and $m_y$, gives
$$\mathbf{M} = (\text{x-mag})\cdot(\text{y-mag})\cdot(\text{Skew})\cdot(\text{Rotation})
= \begin{pmatrix} m_x & 0\\ 0 & 1\end{pmatrix}\cdot\begin{pmatrix} 1 & 0\\ 0 & m_y\end{pmatrix}\cdot\begin{pmatrix} 1 & 0\\ \sin(\psi) & 1\end{pmatrix}\cdot\begin{pmatrix} \cos(\theta_z) & -\sin(\theta_z)\\ \sin(\theta_z) & \cos(\theta_z)\end{pmatrix}$$
$$= \begin{pmatrix} m_x\cos(\theta_z) & -m_x\sin(\theta_z)\\ m_y\bigl(\sin(\theta_z) + \sin(\psi)\cos(\theta_z)\bigr) & m_y\bigl(\cos(\theta_z) - \sin(\psi)\sin(\theta_z)\bigr)\end{pmatrix}
\approx \underbrace{\begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix}}_{\text{identity matrix}} + \begin{pmatrix} m_x & -\theta_z\\ \theta_z + \psi & m_y\end{pmatrix}.$$
Thus, the transformation takes the infinitesimal form
$$\begin{pmatrix} x'\\ y'\end{pmatrix}
\approx \begin{pmatrix} x\\ y\end{pmatrix}
+ \underbrace{\begin{pmatrix} T_x\\ T_y\end{pmatrix} + \begin{pmatrix} m_x & -\theta_z\\ \theta_z + \psi & m_y\end{pmatrix}\cdot\begin{pmatrix} x\\ y\end{pmatrix}}_{\displaystyle \equiv\ \begin{pmatrix} \Delta x\\ \Delta y\end{pmatrix}}
= \begin{pmatrix} T_x\\ T_y\end{pmatrix} + \begin{pmatrix} 1 + m_x & -\theta_z\\ \theta_z + \psi & 1 + m_y\end{pmatrix}\cdot\begin{pmatrix} x\\ y\end{pmatrix}.$$
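For small distortions, the first-order form above can be inverted directly: given a fitted linear matrix M, the geometric terms follow as $m_x \approx M_{xx} - 1$, $m_y \approx M_{yy} - 1$, $\theta_z \approx -M_{xy}$, and $\psi \approx M_{yx} + M_{xy}$. The short Python sketch below is not from the original text and its numerical values are illustrative; it composes the exact product of the pure transformations and then extracts the small terms back using these first-order relations.

```python
import numpy as np

def compose(theta, psi, mx, my):
    """Exact product (x-mag)(y-mag)(skew)(rotation) of the pure transformations."""
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    skew = np.array([[1.0, 0.0], [np.sin(psi), 1.0]])
    mag = np.diag([mx, my])          # (x-mag)(y-mag) combined
    return mag @ skew @ rot

def decompose(M):
    """First-order extraction of rotation, skew, and magnification deviations."""
    mx, my = M[0, 0] - 1.0, M[1, 1] - 1.0
    theta = -M[0, 1]
    psi = M[1, 0] + M[0, 1]
    return theta, psi, mx, my

# Illustrative small distortions: 2 urad rotation, 1 urad skew, +3 ppm / -2 ppm mags.
M = compose(2e-6, 1e-6, 1 + 3e-6, 1 - 2e-6)
print([f"{v:.3e}" for v in decompose(M)])   # recovers ~2e-6, 1e-6, 3e-6, -2e-6
```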
References

1. D.C. Flanders et al. 1977. "A new interferometric alignment technique," Applied Physics Letters, 31: 426.
2. G. Bouwhuis and S. Wittekoek. 1979. "Automatic alignment system for optical projection printing," IEEE Transactions on Electron Devices, 26: 723.
3. D.R. Bealieu and P.P. Hellebrekers. 1987. "Dark field technology: A practical approach to local alignment," Proceedings of SPIE, 772: 142.
4. M. Tabata and T. Tojo. 1987. "High-precision interferometric alignment using checker grating," Journal of Vacuum Science and Technology B, 7: 1980.
5. M. Suzuki and A. Une. 1989. "An optical-heterodyne alignment technique for quarter-micron x-ray lithography," Journal of Vacuum Science and Technology B, 9: 1971.
6. N. Uchida et al. 1991. "A mask-to-wafer alignment and gap setting method for x-ray lithography using gratings," Journal of Vacuum Science and Technology B, 9: 3202.
7. G. Chen et al. 1991. "Experimental evaluation of the two-state alignment system," Journal of Vacuum Science and Technology B, 9: 3222.
8. S. Wittekoek et al. 1990. "Deep-UV wafer stepper with through-the-lens wafer to reticle alignment," Proceedings of SPIE, 1264: 534.
9. K. Ota et al. 1991. "New alignment sensors for a wafer stepper," Proceedings of SPIE, 1463: 304.
10. D. Kim et al. 1995. "Base-line error-free non-TTL alignment system using oblique illumination for wafer steppers," Proceedings of SPIE, 2440: 928.
11. R. Sharma et al. 1995. "Photolithographic mask aligner based on modified moire technique," Proceedings of SPIE, 2440: 938.
12. S. Drazkiewicz et al. 1996. "Micrascan adaptive x-cross correlative independent off-axis modular (AXIOM) alignment system," Proceedings of SPIE, 2726: 886.
13. A. Starikov et al. 1992. "Accuracy of overlay measurements: Tool and asymmetry effects," Optical Engineering, 31: 1298.
14. D.J. Cronin and G.M. Gallatin. 1994. "Micrascan II overlay error analysis," Proceedings of SPIE, 2197: 932.
15. N. Magome and H. Kawaii. 1995. "Total overlay analysis for designing future aligner," Proceedings of SPIE, 2440: 902.
16. A.C. Chen et al. 1997. "Overlay performance of 180 nm ground rule generation x-ray lithography aligner," Journal of Vacuum Science and Technology B, 15: 2476.
17. F. Bornebroek et al. 2000. "Overlay performance in advanced processes," Proceedings of SPIE, 2440: 520.
18. R. Navarro et al. 2001. "Extended ATHENA alignment performance and application for the 100 nm technology node," Proceedings of SPIE, 4344: 682.
19. Chen-Fu Chien et al. 2001. "Sampling strategy and model to measure and compensate overlay errors," Proceedings of SPIE, 4344: 245.
20. J. Huijbregtse et al. 2003. "Overlay performance with advanced ATHENA alignment strategies," Proceedings of SPIE, 5038: 918.
21. S.J. DeMoor et al. 2004. "Scanner overlay mix and match matrix generation: Capturing all sources of variation," Proceedings of SPIE, 5375: 66.
22. J.A. Liddle et al. 1997. "Photon tunneling microscopy of latent resist images," Journal of Vacuum Science and Technology B, 15: 2162.
23. S.J. Bukofsky et al. 1998. "Imaging of photogenerated acid in a chemically amplified resist," Applied Physics Letters, 73: 408.
24. G.M. Gallatin et al. 1987. "Modeling the images of alignment marks under photoresist," Proceedings of SPIE, 772: 193.
25. G.M. Gallatin et al. 1988. "Scattering matrices for imaging layered media," Journal of the Optical Society of America A, 5: 220.
26. N. Bobroff and A. Rosenbluth. 1988. "Alignment errors from resist coating topography," Journal of Vacuum Science and Technology B, 6: 403.
27. Chi-Min Yuan et al. 1989. "Modeling of optical alignment images for semiconductor structures," Proceedings of SPIE, 1088: 392.
28. J. Gamelin et al. 1989. "Exploration of scattering from topography with massively parallel computers," Journal of Vacuum Science and Technology B, 7: 1984.
29. G.L. Wojcik et al. 1991. "Laser alignment modeling using rigorous numerical simulations," Proceedings of SPIE, 1463: 292.
30. A.K. Wong et al. 1991. "Experimental and simulation studies of alignment marks," Proceedings of SPIE, 1463: 315.
31. Chi-Min Yuan and A. Strojwas. 1992. "Modeling optical microscope images of integrated-circuit structures," Journal of the Optical Society of America A, 8: 778.
32. X. Chen et al. 1997. "Accurate alignment on asymmetrical signals," Journal of Vacuum Science and Technology B, 15: 2185.
33. J.H. Neijzen et al. 1999. "Improved wafer stepper alignment performance using an enhanced phase grating alignment system," Proceedings of SPIE, 3677: 382.
34. T. Nagayama et al. 2003. "New method to reduce alignment error caused by optical system," Proceedings of SPIE, 5038: 849.
35. A. Gatherer and T.H. Meng. 1993. "Frequency domain position estimation for lithographic alignment," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 3, p. 380.
36. R.L. Branham. 1990. Scientific Data Analysis, New York: Springer.
37. S. Nakajima et al. 2003. "Outlier rejection with mixture models in alignment," Proceedings of SPIE, 5040: 1729.
6 Electron Beam Lithography Systems
Kazuaki Suzuki
CONTENTS
6.1 Introduction
6.2 The Electron Optics of Round-Beam Instruments
6.2.1 General Description
6.2.2 Electron Guns
6.2.3 The Beam Blanker
6.2.4 Deflection Systems
6.2.5 Electron–Electron Interactions
6.3 An Example of a Round-Beam Instrument: EBES
6.4 Shaped-Beam Instruments
6.4.1 Fixed Square-Spot Instruments
6.4.2 Shaped Rectangular-Spot Instruments
6.4.3 Character Projection Instruments
6.5 Electron Projection Lithography and Other Emerging Methods
6.5.1 Scattering Contrast
6.5.2 Image Blur by Electron–Electron Interactions
6.5.3 Dynamic Exposure Motion
6.5.4 Other Emerging Method
6.6 Electron Beam Alignment Techniques
6.6.1 Pattern Registration
6.6.2 Alignment Mark Structures
6.6.3 Alignment Mark Signals
6.6.4 The Measurement of Alignment Mark Position
6.6.5 Machine and Process Monitoring
6.7 The Interaction of the Electron Beam with the Substrate
6.7.1 Power Balance
6.7.2 The Spatial Distribution of Energy in the Resist Film
6.8 Electron Beam Resists and Processing Techniques
6.9 The Proximity Effect
6.9.1 Description of the Effect
6.9.2 Methods of Compensating for the Proximity Effect
Acknowledgments
References
6.1 Introduction

Lithography using beams of electrons to expose the resist was one of the earliest processes used for integrated circuit fabrication, dating back to 1957 [1]. Today, essentially all high-volume production, even down to less-than-200-nm feature sizes, is done with optical techniques as a result of the advances in stepper technology described thoroughly elsewhere in this volume. Nevertheless, electron beam systems continue to play two vital roles that will, in all probability, not diminish in importance for the foreseeable future. First, they are used to generate the masks that are used in all projection, proximity, and contact exposure systems; second, they are used in the low-volume manufacture of ultra-small features for very high performance devices, as described by Dobisz et al. in Chapter 15. In addition, there is some activity in so-called mix-and-match lithography, where the e-beam system is used to expose one or a few levels with especially small features and optical systems are used for the rest. Therefore, it is possible that as feature sizes move below about 100 nm (where optical techniques face substantial obstacles, especially for critical layers such as contacts and vias), electron beam systems might play a role in advanced manufacturing despite their throughput limitations as serial exposure systems. For these reasons, it is important for the lithographer to have some knowledge of the features of e-beam exposure systems, even though it is expected that optical lithography will continue to be the dominant manufacturing technique. This chapter provides an introduction to such systems. It is intended to have sufficient depth for the reader to understand the basic principles of operation and design guidelines without attempting to be a principal source for a system designer or a researcher pushing the limits of the technique. The treatment is based on a monograph by Owen [2] that the reader should consult for more detail as well as background information and historical aspects. Processing details (including discussion of currently available resists) and aspects unique to ultra-small features (sub-100 nm) are covered in Chapter 15. Originally, this chapter was written by Owen and Sheats for the first edition of Microlithography. In this edition, the technology developments of the intervening years have been incorporated and minor corrections have been made.
6.2 The Electron Optics of Round-Beam Instruments

6.2.1 General Description

Figure 6.1 is a simplified ray diagram of a hypothetical scanned-beam electron lithography instrument in which the lenses have been idealized as thin optical elements. The electron optics of a scanning electron microscope (SEM) [3] would be similar in many respects. Electrons are emitted from the source, whose crossover is focused onto the surface of the workpiece by two magnetic lenses. The beam half-angle is governed by the beam shaping aperture. This intercepts current emitted by the gun that is not ultimately focused onto the spot. In order to minimize the excess current flowing down the column, the beam shaping aperture needs to be placed as near as possible to the gun and, in extreme cases, may form an integral part of the gun itself. This is beneficial because it reduces electron–electron interactions that have the effect of increasing the diameter of the focused spot at the workpiece. A second benefit is that the lower the current flowing through the column, the less opportunity there is for polymerizing residual hydrocarbon or siloxane molecules and forming insulating contamination films on the optical elements. If present, these can acquire electric charge and cause beam drift and loss of resolution.
FIGURE 6.1 Simplified ray diagram of the electron optical system of a hypothetical round-beam electron lithography system. (Figure elements, from gun to workpiece: source, crossover, beam shaping aperture, lens 1, beam blanking deflector, beam blanking aperture, lens 2, beam position deflector, beam half-angle α, workpiece.)
A magnetic or electrostatic deflector is used to move the focused beam over the surface of the workpiece; this deflector is frequently placed after the final lens. The beam can be turned off by a beam blanker that consists of a combination of an aperture and a deflector. When the deflector is not activated, the beam passes through the aperture and exposes the workpiece. However, when the deflector is activated, the beam is diverted, striking the body of the aperture. A practical instrument would incorporate additional optical elements such as alignment deflectors and stigmators. The arrangement shown in the figure is only one possible configuration for a scanned-beam instrument. Many variations are possible; a number of instruments, for example, have three magnetic lenses. If the beam current delivered to the workpiece is I, the area on the wafer to be exposed is A, and the charge density to be delivered to the exposed regions (often called the "dose") is Q, it then follows that the total exposure time is
$$T = \frac{QA}{I}. \tag{6.1}$$
Thus, for short exposure times, the resist should be as sensitive as possible and the beam current should be as high as possible. The beam current is related to the beam half-angle ($\alpha$) and the diameter of the spot focused on the substrate ($d$) by the relationship
$$I = \beta\left(\frac{\pi d^2}{4}\right)\bigl(\pi\alpha^2\bigr), \tag{6.2}$$
where $\beta$ is the brightness of the source. In general, the current density in the spot is not uniform, but it consists of a bell-shaped distribution; as a result, d corresponds to an effective spot diameter. Note that the gun brightness and the beam half-angle need to be as high as possible to maximize current density. Depending on the type of gun, the brightness can vary by several orders of magnitude (see Section 6.2.2): a value in the middle of the range is 10⁵ A cm⁻² sr⁻¹. The numerical aperture is typically about 5×10⁻³ rad. Using these values and assuming a spot diameter
of 0.5 µm, Equation 6.2 predicts a beam current of about 15 nA, a value that is typical for this type of lithography system. For Equation 6.2 to be valid, the spot diameter must be limited only by the source diameter and the magnification of the optical system. This may not necessarily be the case in practice because of the effects of geometric and chromatic aberrations and electron–electron interactions. The time taken to expose a chip can be calculated using Equation 6.1. As an example, a dose of 10 µC cm⁻², a beam current of 15 nA, and 50% coverage of a 5×5 mm² chip will result in a chip exposure time of 1.4 min. A 3-in-diameter wafer could accommodate about 100 such chips, and the corresponding wafer exposure time would be 2.3 h. Thus, high speed is not an attribute of this type of system, particularly bearing in mind that many resists require doses well in excess of 10 µC cm⁻². For reticle making, faster electron resists with sensitivities of up to 1 µC cm⁻² are available; however, their poor resolution precludes their use for direct writing. Equation 6.1 can also be used to estimate the maximum allowable response time of the beam blanker, the beam deflector, and the electronic circuits controlling them. In this case, A corresponds to the area occupied by a pattern pixel; if the pixel spacing is 0.5 µm, then A = 0.25×10⁻¹² m². Assuming a resist sensitivity of 10 µC cm⁻² and a beam current of 15 nA implies that the response time must be less than 1.7 µs. Thus, the bandwidth of the deflection and blanking systems must be several MHz. Instruments that operate with higher beam currents and more sensitive resists require correspondingly greater bandwidths. Because the resolution of scanned-beam lithography instruments is not limited by diffraction, the diameter of the disc of confusion ($\Delta d$) caused by a defocus error $\Delta z$ is given by the geometrical optical relationship
$$\Delta d = 2\alpha\,\Delta z. \tag{6.3}$$
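The numbers quoted in this subsection follow directly from Equation 6.1 through Equation 6.3. The short Python sketch below, which is not part of the original text, simply re-evaluates them using the example values given above.

```python
import math

# Eq. 6.1: T = Q*A/I;  Eq. 6.2: I = beta*(pi d^2/4)*(pi alpha^2);  Eq. 6.3: delta_d = 2*alpha*delta_z.
beta  = 1e5 * 1e4        # brightness: 1e5 A cm^-2 sr^-1, converted to A m^-2 sr^-1
alpha = 5e-3             # beam half-angle (rad)
d     = 0.5e-6           # spot diameter (m)

I = beta * (math.pi * d**2 / 4) * (math.pi * alpha**2)
print(f"beam current     I = {I * 1e9:.1f} nA")            # ~15 nA

Q = 10e-6 * 1e4          # dose: 10 uC cm^-2, converted to C m^-2
A_chip = 0.5 * (5e-3)**2 # 50% coverage of a 5 x 5 mm^2 chip (m^2)
T = Q * A_chip / I
print(f"chip exposure    T = {T / 60:.1f} min")             # ~1.4 min
print(f"wafer, 100 chips   = {100 * T / 3600:.1f} h")       # ~2.3 h

delta_d = 0.2e-6         # allowable disc of confusion (m)
print(f"defocus tolerance  = {delta_d / (2 * alpha) * 1e6:.0f} um")   # ~20 um
```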
Thus, if the beam half-angle is 5×10⁻³ rad and the allowable value of $\Delta d$ is 0.2 µm, then $\Delta z$ < 20 µm. This illustrates the fact that, in electron lithography, the available depth of focus is sufficiently great that it does not affect resolution.

6.2.2 Electron Guns

The electron guns used in scanning electron lithography systems are similar to those used in SEMs (for a general description see, for example, Oatley [3]). There are four major types: thermionic guns using a tungsten hairpin as the source, thermionic guns using a lanthanum hexaboride source, tungsten field emission guns, and tungsten thermionic field emission (TF) guns. Thermionic guns are commonly used as they are simple and reliable. The source of electrons is a tungsten wire, bent into the shape of a hairpin, that is self-heated to a temperature of between 2300°C and 2700°C by passing a DC current through it. The brightness of the gun and the lifetime of the wire strongly depend on temperature. At low heater currents, the brightness is of the order of 10⁴ A cm⁻² sr⁻¹, and the lifetime is of the order of 100 h. At higher heating currents, the brightness increases to about 10⁵ A cm⁻² sr⁻¹, but the lifetime decreases to a value of the order of 10 h (see, for example, Broers [4] and Wells [5]). Space charge saturation prevents higher brightnesses from being obtained. (The brightness values quoted here apply to beam energies of 10–20 keV.) Lanthanum hexaboride is frequently used as a thermionic emitter by forming it into a pointed rod and heating its tip indirectly using a combination of thermal radiation and electron bombardment [4]. At a tip temperature of 1600°C and at a beam energy of 12 keV,
Broers reported a brightness of over 105 A cmK2 srK1 and a lifetime of the order of 1000 h. This represents an increase in longevity of a factor of two orders of magnitude over a tungsten filament working at the same brightness. This is accounted for by the comparatively low operating temperature that helps to reduce evaporation. Two factors allow lanthanum hexaboride to be operated at a lower temperature than tungsten. The first is its comparatively low work function (approximately 3.0 eV as opposed to 4.4 eV). The second, and probably more important, factor is that the curvature of the tip of the lanthanum hexaboride rod is about 10 mm, whereas that of the emitting area of a bent tungsten wire is an order of magnitude greater. As a result, the electric field in the vicinity of the lanthanum hexaboride emitter is much greater, and the effects of space charge are much less pronounced. Because of its long lifetime at a given brightness, a lanthanum hexaboride source needs to be changed only infrequently; this is a useful advantage for electron lithography because it reduces the downtime of a very expensive machine. A disadvantage of lanthanum hexaboride guns is that they are more complex than tungsten guns, particularly as lanthanum hexaboride is extremely reactive at high temperatures, making its attachment to the gun assembly difficult. The high reactivity also means that the gun vacuum must be better than about 10K6 Torr if corrosion by gas molecules is not to take place. In the field emission gun, the source consists of a wire (generally of tungsten), one end of which is etched to a sharp tip with a radius of curvature of approximately 1 mm. This forms a cathode electrode, the anode being a coaxial flat disc that is located in front of the tip. A hole on the axis of the anode allows the emitted electrons to pass out of the gun. To generate a 20 keV beam of electrons, the potential difference between the anode and cathode is maintained at 20 kV, and the spacing is chosen so as to generate an electric field of about 109 V mK1 at the tip of the tungsten wire. At this field strength, electrons within the wire are able to tunnel through the potential barrier at the tungsten-vacuum interface, after which they are accelerated to an energy of 20 keV. An additional electrode is frequently included in the gun structure to control the emission current. (A general review of field emission has been written by Gomer [6]). The brightness of a field emission source at 20 keV is generally more than 107 A cmK2 srK1. Despite this very high value, field emission guns have not been extensively used in electron lithography because of their unstable behavior and their high-vacuum requirements. In order to keep contamination of the tip and damage inflicted on it by ion bombardment to manageable proportions, the gun vacuum must be about 10K12 Torr. Even under these conditions, the beam is severely affected by low-frequency flicker noise, and the tip must be reformed to clean and repair it at approximately hourly intervals. Stille and Astrand [7] converted a commercial field emission scanning microscope into a lithography instrument. Despite the use of a servo-system to reduce flicker noise, dose variations of up to 5% were observed. The structure of a TF gun is similar to that of a field emission gun except that the electric field at the emitting tip is only about 108 V mK1 and that the tip is heated to a temperature of 1000–1500 C. 
Because of the Schottky effect, the apparent work function of the tungsten tip is lowered by the presence of the electric field. As a result, a copious supply of electrons is thermionically emitted at comparatively low temperatures. The brightness of a typical TF gun is similar to that of a field emission gun (at least 10⁷ A cm⁻² sr⁻¹), but the operation of a TF gun is far simpler than that of the field emission gun. Because the tip is heated, it tends to be self-cleaning, and a vacuum of 10⁻⁹ Torr is sufficient for stable operation. Flicker noise is not a serious problem, and lifetimes of many hundreds of hours are obtained, with tip reforming being unnecessary. A description of this type of gun is given by Kuo and Siegel [8], and an electron lithography system using a TF gun is described below.
A thermionic gun produces a crossover whose diameter is about 50 mm, whereas field emission and TF guns produce crossovers whose diameters are of the order of 10 nm. For this reason, to produce a spot diameter of about 0.5 mm, the lens system associated with a thermionic source must be demagnifying a crossover, but that associated with a field emission or TF source needs to be magnifying it. 6.2.3 The Beam Blanker The function of the beam blanker is to switch the current in the electron beam on and off. To be useful, a beam blanker must satisfy three performance criteria. 1. When the beam is switched off, its attenuation must be very great; typically, a value of 106 is specified. 2. Any spurious beam motion introduced by the beam blanker must be much smaller than the size of a pattern pixel; typically, the requirement is for much less than 0.1 mm of motion. 3. The response time of the blanker must be much less than the time required to expose a pattern pixel; typically, this implies a response time of much less than 100 ns. In practice, satisfying the first criterion is not difficult, but careful design is required to satisfy the other two, the configuration of Figure 6.1 being a possible scheme. An important aspect of its design is that the center of the beam blanking deflector is confocal with the workpiece. Figure 6.2 is a diagram of the principal trajectory of electrons passing through an electrostatic deflector. The real trajectory is the curve ABC that, if fringing fields are negligible, is parabolic. At the center of the deflector, the trajectory of the deflected beam is displaced by the distance B 0 B. The virtual trajectory consists of the straight lines AB 0
A
and B 0 C. Viewed from outside, the effect of the deflector is to turn the electron trajectory through the angle f about the point B 0 (the center of the deflector). If, therefore, B 0 is confocal with the workpiece, the position of the spot on the workpiece will not change as the deflector is activated, and performance criterion two will be satisfied (the angle of incidence will change, but this does not matter). The second important aspect of the design of Figure 6.1 is that the beam blanking aperture is also confocal with the workpiece. As a result, the cross section of the beam is smallest at the plane of the blanking aperture. Consequently, when the deflector is activated (shifting the real image of the crossover from B 0 to B), the transition from on to off occurs more rapidly than it would if the aperture were placed in any other position. This helps to satisfy criterion three. The blanking aperture is metallic; therefore, placing it within the deflector itself would, in practice, disturb the deflecting field. As a result, the scheme of Figure 6.1 is generally modified by placing the blanking aperture just outside the deflector; the loss in time resolution is usually insignificant. An alternative solution has been implemented by Kuo et al. [9]. This is to approximate the blanking arrangement of Figure 6.2 by two blanking deflectors, one above and one below the blanking aperture. This particular blanker was intended for use at a data rate of 300 MHz that is unusually fast for an electron lithography system; therefore, an additional factor, the transit time of the beam through the blanker structure, became important. Neglecting relativistic effects, the velocity (v) of an electron traveling with a kinetic energy of v electron volts is
$$v = \left(\frac{2qV}{m}\right)^{1/2} \tag{6.4}$$
(q being the charge of electron and m the mass of electron). Thus, the velocity of a 20 keV electron is approximately 8.4!107 m sK1. The length of the blanker of Kuo et al. [9] in the direction the beam’s travel was approximately 40 mm, giving a transit time of about 0.5 Ns. This time is significant compared to the pixel exposure time of 3 Ns and, if uncorrected, would have resulted in a loss of resolution caused by the partial deflection of the electrons already within the blanker structure when a blanking signal was applied. To overcome the transit time effect, Kuo et al. [9] inserted a delay line between the upper and lower deflectors. This arrangement approximated a traveling wave structure where the deflection field and the electron beam both moved down the column at the same velocity, eliminating the possibility of partial deflection. 6.2.4 Deflection Systems Figure 6.3 is a diagram of a type of deflection system widely used in SEMs, the “prelens double-deflection” system. The deflectors D1 and D2 are magnetic coils that are located behind the magnetic field of the final magnetic lens. In a frequently used configuration, L1 and L2 are equal, and the excitation of D2 is arranged to be twice that of D1, but it is acting in the opposite direction. This has the effect of deflecting the beam over the workpiece yet not shifting the beam in the principal plane of the final lens, thus keeping its off-axis aberrations to a minimum. The size of this arrangement may be gauged from the fact that L1 typically lies between 50 and 100 mm. The prelens double-deflection system is suitable for use in SEMs because it allows a very small working distance (L) to be used (typically, it is less than 10 mm in SEM). This is essential in microscopy because spherical aberration is one of the most important factors
FIGURE 6.3 (a) A prelens double-deflection system, a type of deflector commonly used in scanning electron microscopes. (b) An in-lens double-deflection system, a type of deflector that is used in scanning electron lithography instruments. Note that the working distance (L) is made greater in (b) than it is in (a) because spherical aberration is not a limiting factor in scanning electron lithography; the longer working distance reduces the deflector excitation necessary to scan the electron beam over distances of several millimeters. The purpose of the ferrite shield in (b) is to reduce eddy current effects by screening D1 from the upper bore of the final lens.
Ferrite shield D1
D1
L2
D2
L1 Principal plane of final lens
D2
L L Specimen plane (a)
Workpiece plane (b)
limiting the resolution, and the aberration coefficient increases rapidly with working distance. Thus, the prelens double-deflection system allows a high ultimate resolution (10 nm or better) to be achieved; however, this is generally obtained only over a limited field (about 10×10 μm). Outside this region, off-axis deflection aberrations enlarge the electron spot and distort the shape of the scanned area. Although this limited field coverage is not a serious limitation for electron microscopy, it is for electron lithography. In this application, the resolution required is comparatively modest (about 100 nm), but it must be maintained over a field whose dimensions are greater than 1×1 mm². Furthermore, distortion of the scan field must be negligible.

Early work in scanning electron lithography was frequently carried out with converted SEMs using prelens double deflection. Chang and Stewart [10] used such an instrument and reported that deflection aberrations degraded its resolution to about 0.8 μm at the periphery of a 1×1 mm field at a beam half-angle of 5×10⁻³ rad. However, they also noted that the resolution could be maintained at better than 0.1 μm throughout this field if the focus and stigmator controls were manually readjusted after deflecting the beam. In many modern systems, field curvature and astigmatism are corrected in this way but under computer control, the technique being known as "dynamic correction" (see, for example, Owen [11]). Chang and Stewart [10] also measured the deflection distortion of their instrument. They found that the nonlinear relationship between deflector current and spot deflection caused a positional error of 0.1 μm at a nominal deflection of 100 μm. The errors at larger deflections would be much worse because the relationship between distortion error and nominal deflection is a homogeneous cubic polynomial under the conditions used in scanning electron lithography. Deflection errors are often corrected dynamically in modern scanning lithography systems by characterizing the errors before exposure, using the laser interferometer as the calibration standard. During exposure, appropriate corrections are made to the excitations of the deflectors.

Owen and Nixon [12] carried out a case study on a scanning electron lithography system with a prelens double-deflection system. On the basis of computer calculations, they showed that the source of off-axis aberrations was the deflection system, the effects of
lens aberrations being considerably less serious. In particular, they noted that the effects of spherical aberration were quite negligible. This being the case, they went on to propose that, for the purposes of electron lithography, post-lens deflection was feasible. At working distances of several centimeters, spherical aberration would still be negligible for a well-designed final lens, and there would be sufficient room to incorporate a single deflector between it and the workpiece. A possible configuration was proposed and built, the design philosophy adopted being to correct distortion and field curvature dynamically and to optimize the geometry of the deflection coil so as to minimize the remaining aberrations. Amboss [13] constructed a similar deflection system that maintained a resolution of better than 0.2 μm over a 2×2 mm² scan field at a beam half-angle of 3×10⁻³ rad. Calculations indicated that the resolution of this system should have been 0.1 μm, and Amboss attributed the discrepancy to imperfections in the winding of the deflection coils.

A different approach to the design of low-aberration deflection systems was proposed by Ohiwa et al. [14]. This in-lens scheme, illustrated in Figure 6.3b, is an extension of the prelens double-deflection system from which it differs in two respects:

1. The second deflector is placed within the pole piece of the final lens, with the result that the deflection field and the focusing field are superimposed.
2. In a prelens double-deflection system, the first and second deflectors are rotated by 180 degrees about the optic axis with respect to each other. This rotation angle is generally not 180 degrees in an in-lens deflection system.

Ohiwa et al. showed that the axial position and rotation of the second deflector can be optimized to reduce the aberrations of an in-lens deflection system to a level far lower than would be possible with a prelens system. The reasons for this are as follows:

1. Superimposing the deflection field of D2 and the lens field creates what Ohiwa et al. termed a "moving objective lens." The superimposed fields form a rotationally symmetrical distribution centered not on the optic axis but on a point whose distance from the axis is proportional to the magnitude of the deflection field. The resultant field distribution is equivalent to a magnetic lens that, if the system is optimized, moves in synchronism with the electron beam deflected by D1 in such a way that the beam always passes through its center.
2. The rotation of D1 with respect to D2 accounts for the helical trajectories of the electrons in the lens field.

A limitation of this work was that the calculations involved were based, not on physically realistic lens and deflection fields, but on convenient analytic approximations. Thus, although it was possible to give convincing evidence that the scheme would work, it was not possible to specify a practical design. This limitation was overcome by Munro [15], who developed a computer program that could be used in the design of this type of deflection system. Using this program, Munro designed a number of post-lens, in-lens, and prelens deflection systems. A particularly promising in-lens configuration was one that had an aberration diameter of 0.15 μm after dynamic correction when covering a 5×5 mm² field at an angular aperture of 5×10⁻³ rad and a fractional beam voltage ripple of 10⁻⁴.
Because the deflectors of an in-lens deflection system are located near metallic components of the column, measures have to be taken to counteract eddy current effects. The first deflector can be screened from the upper bore of the final lens by inserting a tubular ferrite shield as indicated in Figure 6.3b [16]. This solution would not work for the second
deflector because the shield would divert the flux lines constituting the focusing field. Chang et al. [17] successfully overcame this problem by constructing the lens pole pieces not of soft iron but of ferrite.

Although magnetic deflectors are in widespread use, electrostatic deflectors have several attractive attributes for electron lithography. In the past, they have rarely been used because of the positional instability associated with them, caused by the formation of insulating contamination layers on the surfaces of the deflection plates. However, recent improvements in vacuum technology now make the use of electrostatic deflection feasible. The major advantage of electrostatic over magnetic deflection is the comparative ease with which fast response times can be achieved. A fundamental reason for this is that, to exert a given force on an electron, the stored energy density associated with an electrostatic deflection field (U_E) is always less than that associated with a magnetic deflection field (U_M). If the velocity of the electrons within the beam is v and that of light in free space is c, the ratio of energy densities is

U_E/U_M = (v/c)²    (6.5)
Thus, an electrostatic deflection system deflecting 20 keV electrons stores only 8% as much energy as a magnetic deflection system of the same strength occupying the same volume. It follows that the output power of an amplifier driving the electrostatic system at a given speed needs to be only 8% of that required to drive the magnetic system. Electrostatic deflection systems have further attractions in addition to their suitability for high-speed deflection:

1. They are not prone to the effects of eddy currents or magnetic hysteresis.
2. The accurate construction of electrostatic deflection systems is considerably easier than the construction of magnetic deflection systems. This is so because electrostatic deflectors consist of machined electrode plates, whereas magnetic deflectors consist of wires that are bent into shape. Machining is an operation that can be carried out to close tolerances comparatively simply, whereas bending is not.

An electron lithography system that uses electrostatic deflection is described below. Computer-aided techniques for the design and optimization of electrostatic, magnetic, and combined electrostatic and magnetic lens and deflection systems have been described by Munro and Chu [18–21].

6.2.5 Electron–Electron Interactions

The mean axial separation Δ between electrons traveling with a velocity v and constituting a beam current I is

Δ = qv/I = (1/I)(2q³V/m)^{1/2}    (6.6)
In a scanning electron microscope, the beam current may be 10 pA, and for 20 keV electrons, this corresponds to a mean electron–electron spacing of 1.34 m. Because the length of an electron optical column is about 1 m, the most probable number of electrons in the column at any given time is less than one, and electron–electron interactions are effectively nonexistent. In a scanning lithography instrument, however, the beam current has a value of between 10 nA and 1 μA, corresponding to mean electron spacings of between 1.34 mm and 13.4 μm. Under these circumstances, electron–electron interactions are noticeable. At these current levels, the major effect of the forces between electrons is to push them radially, thereby increasing the diameter of the focused spot. (In heavy-current electron devices such as cathode-ray tubes or microwave amplifiers, the behavior of the beam is analogous to the laminar flow of a fluid, and electron–electron interaction effects can be explained on this basis. However, the resulting theory is not applicable to lithography instruments, where the beam currents are considerably lower.) Crewe [22] has used an analytic technique to estimate the magnitude of interaction effects in lithography instruments and has shown that the increase in spot radius is given approximately by
Δr = [1/(8πε₀)] (m/q)^{1/2} LI/(αV^{3/2})    (6.7)
In this equation, α represents the beam half-angle of the optical system, L represents the total length of the column, m and q are the mass and charge of an electron, and ε₀ is the permittivity of free space. Note that neither the positions of the lenses nor their optical properties appear in Equation 6.7; it is only the total distance from source to workpiece that is important. For 20 keV electrons traveling down a column of length 1 m at a beam half-angle of 5×10⁻³ rad, the spot radius enlargement is 8 nm for a beam current of 10 nA, which is negligible for the purposes of electron lithography. However, at a beam current of 1 μA, the enlargement would be 0.8 μm, which is significant. Thus, great care must be taken in designing electron optical systems for fast lithography instruments that utilize comparatively large beam currents. The column must be kept as short as possible, and the beam half-angle must be made as large as possible. Groves et al. [23] have calculated the effects of electron–electron interactions using, not an analytic technique, but a Monte Carlo approach. Their computations are in broad agreement with Crewe's equation. Groves et al. also compared their calculations with experimental data, obtaining reasonable agreement.
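As a rough numerical check of Equation 6.6 and Equation 6.7, the following short Python sketch (not part of the original text) reproduces the spacing and spot-growth figures quoted above; the constants and operating points are simply those assumed in this section.

    import math

    q, m, eps0 = 1.602e-19, 9.109e-31, 8.854e-12   # SI constants

    def velocity(V):
        # Non-relativistic electron velocity for beam voltage V (volts), Equation 6.4
        return math.sqrt(2.0 * q * V / m)

    def mean_spacing(I, V):
        # Mean axial electron-electron separation (m) at beam current I (A), Equation 6.6
        return q * velocity(V) / I

    def spot_growth(I, V, L, alpha):
        # Crewe's estimate of the spot radius enlargement (m), Equation 6.7
        return (1.0 / (8.0 * math.pi * eps0)) * math.sqrt(m / q) * L * I / (alpha * V**1.5)

    V = 20e3
    print(mean_spacing(10e-12, V))            # ~1.3 m at 10 pA (SEM case)
    print(mean_spacing(1e-6, V))              # ~13 um at 1 uA (lithography case)
    print(spot_growth(10e-9, V, 1.0, 5e-3))   # ~8 nm at 10 nA
    print(spot_growth(1e-6, V, 1.0, 5e-3))    # ~0.8 um at 1 uA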
6.3 An Example of a Round-Beam Instrument: EBES

The Electron-Beam Exposure System (EBES) was designed and built primarily for making masks for optical lithography on a routine basis. It had a resolution goal of 2 μm line widths, and it was designed to achieve maximum reliability in operation rather than to push the limits of capability. The most unusual feature of this machine was that the pattern was written by mechanically moving the mask plate with respect to the beam. The plate was mounted on an X–Y table that executed a continuous raster motion with a pitch (separation between rows) of 128 μm. If the mechanical raster were perfectly executed, each point on the mask could be accessed if the electron beam were scanned in a line, 128 μm long, perpendicular to the
long direction of the mechanical scan. However, in practice, because mechanical motion of the necessary accuracy could not be guaranteed, the actual location of the stage was measured using laser interferometers, and the positional errors were compensated for by deflecting the beam appropriately. As a result, the scanned field was 140×140 μm, sufficient to allow for errors of ±70 μm in the x direction and ±6 μm in the y direction. The advantage of this approach was that it capitalized on well-known technologies. The manufacture of the stage, although it required high precision, used conventional mechanical techniques. The use of laser interferometers was well established. The demands made on the electron optical system were sufficiently inexacting to allow the column of a conventional SEM to be used, although it had to be modified for high-speed operation [16]. Because EBES was not intended for high-resolution applications, it was possible to use resists of comparatively poor resolution but high sensitivity, typically 1 μC cm⁻². At a beam current of 20 nA, Equation 6.1 predicts that the time taken to write an area of 1 cm² would be 50 s (note that the exposure time is independent of pattern geometry in this type of machine). Therefore, the writing time for a 10×10 cm² mask or reticle would be about 1.4 h, regardless of pattern geometry. For very large scale integrated (VLSI) circuits, this is approximately an order of magnitude less than the exposure time using an optical reticle generator. Because of its high speed, it is practicable to use EBES for directly making masks without going through the intermediate step of making reticles [24]. However, with the advent of wafer steppers, a major use of these machines is now the manufacture of reticles, and they are in widespread use. The writing speeds of later models have been somewhat increased, but the general principles remain identical to those originally developed.
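The writing-time estimate above follows directly from the dose, the exposed area, and the beam current. The sketch below assumes that Equation 6.1 has the form T = QA/I_B, which is how it is applied in this paragraph, and reproduces the quoted figures.

    def raster_write_time_s(dose_C_per_cm2, area_cm2, beam_current_A):
        # Exposure time for a raster-scan machine: total charge required / beam current
        return dose_C_per_cm2 * area_cm2 / beam_current_A

    print(raster_write_time_s(1e-6, 1.0, 20e-9))             # 50 s per cm^2
    print(raster_write_time_s(1e-6, 100.0, 20e-9) / 3600.0)  # ~1.4 h for a 10 x 10 cm^2 plate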
6.4 Shaped-Beam Instruments

6.4.1 Fixed Square-Spot Instruments

Although it is possible to design a high-speed round-beam instrument with a data rate as high as 300 MHz, it is difficult, and its implementation is expensive. Pfeiffer [25] proposed an alternative scheme that allows patterns to be written at high speed using high beam currents, without the need for such high data rates. In order to do this, Pfeiffer made use of the fact that the data supplied to a round-beam machine are highly redundant. The spot produced by a round-beam machine is an image of the gun crossover, modified by the aberrations of the optical system. As a result, not only is it round, but the current density within it is also nonuniform, conforming to a bell-shaped distribution. Because of this, the spot diameter is often defined as the diameter of the contour at which the current density falls to a particular fraction of its maximum value, this fraction typically being arbitrarily chosen as 1/2 or 1/e. In order to maintain good pattern fidelity, the pixel spacing (the space between exposed spots) and the spot diameter must be relatively small compared with the minimum feature size to be written. A great deal of redundant information must then be used to specify a pattern feature (a simple square will be composed of many pixels). Pfeiffer and Loeffler [26] pointed out that electron optical systems could be built that produced not round, nonuniform spots, but square, uniformly illuminated ones. Thus, if a round-spot instrument and a square-spot instrument operate at the same beam current and expose the same pattern at the same dose, the data rate for the square-spot instrument will be smaller than that for the round-spot instrument by a factor of n², where n is the
number of pixels that form the side of a square. Typically n = 5 to get adequate uniformity, and the adoption of a square-spot scheme will reduce a data rate of 300 MHz to 12 MHz, a speed at which electronic circuits can operate with great ease. To generate a square, uniformly illuminated spot, Pfeiffer and Loeffler [26] used Koehler's method of illumination, a technique well known in optical microscopy (see, for example, Born and Wolf [27]). The basic principle is illustrated in Figure 6.4a. A lens (L1) is interposed between the source (S) and the plane to be illuminated (P). One aperture (SA1) is placed on the source side of the lens, and another (BA) is placed on the other side. The system is arranged in such a way that the following optical relationships hold:

1. The planes of S and BA are confocal.
2. The planes of SA1 and P are confocal.

Under these circumstances, the shape of the illuminated spot at P is similar to that of SA1, but it is demagnified by the factor d1/d2. (For this reason, SA1 is usually referred to as a spot shaping aperture.) The beam half-angle of the imaging system is determined by the diameter of the aperture BA. Thus, if SA1 is a square aperture, a square patch of illumination will be formed at P even though the aperture BA is round. The uniformity of the illumination stems from the fact that all trajectories emanating from a point such as s on the source are spread out to cover the whole of the illuminated patch. When Koehler's method of illumination is applied to optical microscopes, a second lens is used to ensure that trajectories from a given source point impinge on P as a parallel beam. However, this is unnecessary for electron lithography.

FIGURE 6.4 (a) The principle of the generation of a square, uniformly illuminated spot, using Koehler's method of illumination. (b) The extension of the technique to the generation of a rectangular spot of variable dimensions with the spot shaping deflector D unactivated. (c) As for (b), but with D activated.

Pfeiffer [25] and Mauer et al. [28] described a lithography system, the EL1, that used this type of illumination. It wrote with a square spot nominally measuring 2.5×2.5 μm containing a current of 3 μA. Because of the effects of electron–electron interactions, the edge acuity of the spot was 0.4 μm. The optical system was based on the principles illustrated in Figure 6.4a, but it was considerably more complex, consisting of four magnetic lenses. The lens nearest the gun was used as a condenser, and the spot shaping aperture was located within its magnetic field. This aperture was demagnified
by a factor of 200 by the three remaining lenses, the last of which was incorporated in an in-lens deflection system. The EL1 lithography instrument was primarily used for exposing interconnection patterns on gate array wafers. It was also used as a research tool for direct writing and for making photomasks [29].

6.4.2 Shaped Rectangular-Spot Instruments

A serious limitation of fixed square-spot instruments is that the linear dimensions of pattern features are limited to integral multiples of the minimum feature size. For example, using a 2.5×2.5 μm square spot, a 7.5×5.0 μm rectangular feature can be written, but an 8.0×6.0 μm feature cannot. An extension of the square-spot technique that removed this limitation was proposed by Fontijn [30] and first used for electron lithography by Pfeiffer [31]. The principle of the scheme is illustrated in Figure 6.4b. In its simplest form, it involves adding a deflector (D), a second shaping aperture (SA2), and a second lens (L2) to the configuration of Figure 6.4a. The positions of these additional optical components are determined by the following optical constraints:

1. SA2 is placed in the original image plane, P.
2. The new image plane is P′, and L2 is positioned so as to make it confocal with the plane of SA2.
3. The beam shaping aperture BA is removed and replaced by the deflector D. The center of deflection of D lies in the plane previously occupied by BA.
4. A new beam shaping aperture BA′ is placed at a plane conjugate with the center of deflection of D (i.e., with the plane of the old beam shaping aperture BA).

SA1 and SA2 are both square apertures, and their sizes are such that a pencil from s that just fills SA1 will also just fill SA2 with the deflector unactivated. This is the situation depicted in Figure 6.4b, and under these circumstances, a square patch of illumination, jj, is produced at the new image plane P′. Figure 6.4c shows what happens when the deflector is activated. The unshaded portion of the pencil emitted from s does not reach P′ because it is intercepted by SA2. However, the shaded portion does reach the image plane, where it forms a uniformly illuminated rectangular patch of illumination kk′. By altering the strength of the deflector, the position of k′ and the shape of the illuminated patch can be controlled. Note that because the center of deflection of D is confocal with BA′, the beam shaping aperture does not cause vignetting of the spot as its shape is changed. Only one deflector is shown in the figure; in a practical system, there would be two such deflectors mounted perpendicular to each other so that both dimensions of the rectangular spot could be altered.

Weber and Moore [29] built a machine, the EL2, based on this principle that was used as a research tool. Several versions were built, each with slightly different performance specifications. The one capable of the highest resolution used a spot whose linear dimensions could be varied from 1.0 to 2.0 μm in increments of 0.1 μm. A production version of the EL2, the EL3, was built by Moore et al. [32]. The electron optics of this instrument were similar to those of the EL2 except that the spot shaping range was increased from 2:1 to 4:1. A version of the EL3 that was used for 0.5 μm lithography is described by Davis et al. [33]. In this instrument, the spot current density was reduced from 50 to 10 A cm⁻² and the maximum spot size to 2×2 μm to reduce the effects of electron–electron interactions.
Equation 6.1 cannot be used to calculate the writing time for a shaped-beam instrument because the beam current is not constant; it varies in proportion to the area of the spot. A pattern is converted for exposure in a shaped-spot instrument by partitioning it into rectangular shots. The instrument writes the pattern by exposing each shot in turn, having adjusted the spot size to match that of the shot. The time taken to expose a shot is independent of its area and is equal to Q/J (Q being the dose and J the current density within the shot), so the time taken to expose a pattern consisting of N shots is

T = NQ/J    (6.8)
Therefore, for high speed, the following requirements are necessary:

1. The current density in the spot must be as high as possible.
2. In order to minimize the number of pattern shots, the maximum spot size should be as large as possible. In practice, a limit is set by electron–electron interactions, which degrade the edge acuity of the spot at high beam currents.
3. The pattern conversion program must be efficient at partitioning the pattern into as few shots as possible, given the constraint imposed by the maximum shot size.

In a modern ultra-high-resolution commercial system such as that manufactured by JEOL, the area that can be scanned by the beam before moving the stage is of the order of 1×1 mm, and the minimum address size (pixel size) can be chosen to be 5 or 25 nm. More recently, apertures with oblique orientations such as 45 and 135 degrees have been provided on SA2 (Hitachi, NuFlare, Leica), so that such patterns can be written on the wafer as well as rectangular patterns.

6.4.3 Character Projection Instruments

Pfeiffer proposed extending the shaped rectangular-spot method to character projection by replacing SA2 with a character plate containing an array of complex aperture shapes [34]. An example of such a character plate is shown in Figure 6.5. This method reduces the number of exposure shots and therefore increases throughput. Hitachi and Advantest independently developed instruments based on this concept.
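Both the shaped-spot and character-projection approaches trade shot count against throughput, and Equation 6.8 makes the trade-off explicit: the exposure time depends only on the shot count and the current density, not on the shot areas. The following sketch uses illustrative numbers; the shot count and dose are assumptions, and the current density is the EL3-like value quoted above.

    def shaped_beam_time_s(n_shots, dose_C_per_cm2, current_density_A_per_cm2):
        # Equation 6.8: each shot takes Q/J regardless of its area
        return n_shots * dose_C_per_cm2 / current_density_A_per_cm2

    Q = 1e-6        # dose, C/cm^2 (assumed)
    J = 10.0        # current density within the shot, A/cm^2
    N = 2_000_000   # shots in the pattern (assumed)
    print(shaped_beam_time_s(N, Q, J))   # 0.2 s of beam-on time for this shot count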
6.5 Electron Projection Lithography and Other Emerging Methods

6.5.1 Scattering Contrast

The difference in the scattering angles (Rutherford scattering) of incident electrons in different materials can be used to generate contrast. This concept has long been used in transmission electron microscopy, and it was applied by Koops [35] to image formation on an actinic film using a master stencil. Berger [36] applied the concept to a lithography tool with a membrane-type mask consisting of heavy-metal patterns on a thin membrane. A silicon stencil can also be used as the mask for electron projection lithography (EPL) [37].

FIGURE 6.5 Example of a character plate in a character projection instrument. (From Pfeiffer, H. C., IEEE Trans. Electron. Dev., ED-26, 1979, 663.)

6.5.2 Image Blur by Electron–Electron Interactions

Image blur can be used as a metric for the resolution capability of EPL. The image blur is defined as the full width at half maximum (FWHM) of the point spread function of
the point image, which corresponds to the width of the 12%–88% edge slope height. An equation similar to Equation 6.7 is given by
B = k I^{5/6} L^{5/4} M/(α^{3/5} SF^{1/2} V^{3/2})    (6.9)
where B is the image blur caused by electron–electron interactions, I is the total electrical current at the wafer, L is the length between mask and wafer, M is the magnification, α is the beam half-angle at the wafer, SF is the subfield size at the wafer, V is the acceleration voltage, and k is a coefficient [38,39]. A higher acceleration voltage, a larger subfield size, and a larger beam half-angle are effective in obtaining a smaller image blur. An acceleration voltage of 100 kV, a subfield size of 0.25 mm×0.25 mm, and a beam half-angle of 3.5 mrad are adopted in the EPL exposure tool [40]. For this purpose, electrons are emitted from the surface of a tantalum crystal cathode, the backside of which is heated by an electron bombardment current supplied by a directly heated tungsten filament [41].

6.5.3 Dynamic Exposure Motion

Dynamic exposure motion is realized in the EPL exposure tool as shown in Figure 6.6. Each subfield on the mask is irradiated in turn by a combination of beam deflection and mask stage motion. Simultaneously, the wafer stage moves in the direction opposite to that of the mask stage, and the patterns on the mask are projected onto the wafer one after another. Position errors of both stages are compensated by deflection control of the electron beam. Because a deflection width of 5 mm is realized at the wafer and the maximum stage scan length is 25 mm, four continuous areas of 5 mm×25 mm can be exposed from a ϕ200 mm mask [40]. The large subfield size, large deflection width, ϕ200 mm mask, and higher electrical
current (while keeping the image blur small through the higher acceleration voltage) give a throughput of several ϕ300 mm wafers per hour or higher.

FIGURE 6.6 Dynamic exposure motion of the EPL exposure tool.

6.5.4 Other Emerging Methods

Proximity electron lithography was proposed in 1999 [42]. Electrons at 2 kV pass through a silicon stencil mask placed in close proximity to the wafer and expose it. A pre-production tool was manufactured and evaluated [43]. Many other emerging approaches, such as multiple-column and multiple-beam systems, have been proposed and are being developed in order to obtain higher throughput [44–47]. Several years will be necessary for these to become mature technologies.
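Because the coefficient k in Equation 6.9 is not given, only relative comparisons can be made. The sketch below uses the reconstructed form of that equation with assumed operating values (the column length and current are placeholders, not published figures) to show how the blur scales with half-angle and voltage.

    def blur_relative(I, L, M, alpha, SF, V):
        # Reconstructed Equation 6.9 without the unknown coefficient k
        return (I**(5.0 / 6.0) * L**1.25 * M) / (alpha**0.6 * SF**0.5 * V**1.5)

    base = blur_relative(I=10e-6, L=0.6, M=0.25, alpha=3.5e-3, SF=0.25e-3, V=100e3)
    print(blur_relative(10e-6, 0.6, 0.25, 7.0e-3, 0.25e-3, 100e3) / base)  # ~0.66 with double alpha
    print(blur_relative(10e-6, 0.6, 0.25, 3.5e-3, 0.25e-3, 200e3) / base)  # ~0.35 with double V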
6.6 Electron Beam Alignment Techniques

6.6.1 Pattern Registration

The first step in directly writing a wafer is to define registration marks on it. Commonly, these are grouped into sets of three, each set being associated with one particular chip site. The registration marks may be laid down on the wafer in a separate step before any of the chip levels are written, or they may be written concurrently with the first level of the chip. Pattern registration is necessary because no lithography instrument can write with perfect reproducibility. Several factors, discussed in detail in the following and illustrated numerically in the sketch after this list, can lead to an offset of a given integrated circuit level from its intended position with respect to the previous one:

1. Loading errors. Wafers are generally loaded into special holders for exposure. Frequently, the location and orientation of a wafer are fixed by a kinematic arrangement of three pins or an electrostatic chuck. However, the positional
errors associated with this scheme can be several tens of micrometers, and the angular errors can be as large as a few hundred microradians.

2. Static and dynamic temperature errors. Unless temperature is carefully controlled, thermal expansion of the wafer can cause these errors to be significant. The thermal expansion coefficient of silicon is 2.4×10⁻⁶ °C⁻¹, and over a distance of 100 mm, this corresponds to a shift of 0.24 μm for a 1°C change. When two pattern levels are written on a wafer at two different temperatures but the wafer is in thermal equilibrium in each case, a static error results. However, if the wafer is not in thermal equilibrium while it is being written, a dynamic error results whose magnitude varies as the temperature of the wafer changes with time.

3. Substrate height variations. These are important because they give rise to changes in deflector sensitivity. For example, consider a post-lens deflector, nominally 100 mm above the substrate plane and deflecting over a 5×5 mm scan field. A change of 10 μm in the height of the chip being written results in a maximum pattern error of 0.25 μm. Height variations can arise from two causes. The first is nonperpendicularity between the substrate plane and the undeflected beam, which makes the distance between the deflector and the portion of the substrate immediately below it a function of stage position. The second cause is curvature of the wafer. Even unprocessed wafers are bowed, and high-temperature processing steps can significantly change the bowing. The deviations from planarity can amount to several micrometers.

4. Stage yaw. The only type of motion that a perfect stage would execute would be linear translation. However, when any real stage is driven, it will rotate slightly about an axis perpendicular to its translational plane of motion. Typically, this motion, called yaw, amounts to several arcseconds. The positional error introduced by a yaw of 10″ for a 5×5 mm chip is 0.24 μm.

5. Beam position drift. The beam in a lithography instrument is susceptible to drift, typically amounting to a movement of less than 1 μm in 1 h.

6. Deflector sensitivity drift. The sensitivities of beam deflectors tend to drift because of variations in beam energy and gain changes in the deflection amplifiers. This effect can amount to a few parts per million in 1 h.
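The error magnitudes quoted in items 2 through 4 follow from simple geometry; the short sketch below (not part of the original text) reproduces them.

    # Static temperature error: silicon expansion over 100 mm for a 1 C change
    alpha_si = 2.4e-6                     # thermal expansion coefficient, 1/C
    print(alpha_si * 100e-3)              # 2.4e-7 m = 0.24 um

    # Height-variation error: deflector 100 mm above the wafer, 2.5 mm deflection
    # at the field edge, 10 um change in substrate height
    print((2.5e-3 / 100e-3) * 10e-6)      # 2.5e-7 m = 0.25 um

    # Stage yaw: 10 arcsec rotation acting over a 5 mm chip dimension
    yaw_rad = 10.0 / 206265.0
    print(yaw_rad * 5e-3)                 # ~2.4e-7 m = 0.24 um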
By aligning the pattern on registration marks, exact compensation is made for loading errors, static temperature errors, substrate height variations, and yaw, all of which are time independent. Usually, dynamic temperature errors, beam position drift, and deflector sensitivity drift are reduced to negligible levels by pattern registration because, although they are time dependent, the time scales associated with them are much greater than the time taken to write a chip. A typical alignment scheme would consist of a coarse registration step followed by a fine registration step. The procedures are, in general, quite similar to those used in optical lithography, which are discussed at length in Chapter 1 and Chapter 5. Here, the aspects unique to electron lithography, primarily having to do with the nature of the alignment detection signals, are the focus. Using the wafer flat (or some other mechanical feature in the case of nonstandard substrates), a coarse positioning is carried out, and the wafer is scanned (not at high resolution) in the area of an alignment mark. Assuming the mark is detected, the coordinates of its center (with reference to an origin in the machine's system) are now known, and an offset is determined from the coordinates specified for it in the pattern data. This level of accuracy is still inadequate for pattern writing, but it is sufficient to allow the fine registration step to be carried out. One purpose of fine registration is to improve the
accuracy with which the coarse registration compensated for loading errors. In addition, it compensates for the remaining misregistration errors (temperature errors, substrate height variations, stage yaw, beam position drift, and deflector sensitivity drift). Because these will, in general, vary from chip to chip, fine registration is carried out on a chip-by-chip basis. Each chip is surrounded by three registration marks. Because the residual errors after coarse registration are only a few micrometers, the chip marks need to be only about 100 μm long. The steps involved in fine registration could be as follows:

1. The pattern data specify coordinates corresponding to the center of the chip. These are modified to account for the wafer offset and rotation measured during the coarse registration step. The stage is moved accordingly so as to position the center of the chip under the beam.
2. The electron beam is deflected to the positions of the three alignment marks in turn, and each mark is scanned. In this way, the position of each of the three marks is measured.
3. The pattern data are transformed linearly so as to conform to the measured positions of the marks, and the pattern is then written onto the chip.

This procedure is repeated for each chip on the wafer. Numerous variations of the scheme described here exist. A serious drawback of this scheme is that it works on the assumption that each chip corresponds to a single scanned field. A registration scheme described by Wilson et al. [48] overcomes this limitation, allowing any number of scanned fields to be stitched together to write a chip pattern, making it possible to write chips of any size.
6.6.2 Alignment Mark Structures

Many types of alignment marks have been used in electron lithography, including pedestals of silicon or silicon dioxide, metals of high atomic number, and trenches etched into the substrate. The last type of mark is frequently used and will be used here as an example of how alignment signals are generated. A common method of forming trenches is by etching appropriately masked silicon wafers in aqueous potassium hydroxide. Wafers whose top surface corresponds to the (100) plane are normally used for this purpose. The etching process is anisotropic, causing the sides of the trenches to be sloped as illustrated in Figure 6.7a. Typically, the trench is 10 μm wide and 2 μm deep. The way in which the resist film covers a trench depends on the dimensions of the trench, the material properties of the resist, and the conditions under which the resist is spun onto the wafer. Little has been published on this subject, but from practical experience, it is found that two extreme cases exist:

1. If the resist material shows little tendency to planarize the surface of the wafer and is applied as a thin film, the situation depicted in Figure 6.7b arises. The resist forms a uniform thin film whose top surface faithfully follows the shape of the trench.
2. If, on the other hand, a thick film of a resist that has a strong tendency to planarize is applied to the wafer, the situation shown in Figure 6.7c results. The top surface of the resist is nearly flat, but the thickness of the resist film increases significantly in the vicinity of the trench.
FIGURE 6.7 (a) A cross-sectional view of an alignment mark consisting of a trench etched into silicon. The wall angle of tan⁻¹√2 is the result of the anisotropic nature of the etch. (b) Coverage of the trench by a film of resist that has little tendency to planarize. (c) Coverage by a resist that has a strong tendency to planarize. (d) The variation of the backscatter coefficient η as a function of position for the situation depicted in (c).
The mechanisms for the generation of alignment mark signals are different in these two cases.

6.6.3 Alignment Mark Signals

The electrons emitted when an electron beam with an energy of several kilo-electron-volts bombards a substrate can be divided into two categories:

1. The secondary electrons are those ejected from the substrate material itself. They are of low energy, and their energy distribution has a peak at an energy of a few electron volts. By convention, it is assumed that electrons with energies below 50 eV are secondaries.
2. The backscattered electrons are primaries that have been reflected from the substrate. For a substrate of silicon (atomic number Z = 14), their mean energy is approximately 60% of that of the primary beam [49].

The electron collectors used in scanning electron microscopes are biased at potentials many hundreds of volts above that of the specimen in order to attract as many secondary electrons as possible. As a consequence, it is these electrons that dominate the formation of the resulting image. Everhart et al. [50] explain why this is done: the paths of backscattered electrons from the object to the collector are substantially straight, whilst those of secondary electrons are usually sharply curved. It follows that backscattered electrons cannot reveal detail of any part of the object from which there is not a straight-line path to the collector, while secondary electrons are not subject to this limitation. Thus, secondary electrons provide far more detail when a rough surface is under examination. However, this argument does not apply to the problem of locating a registration mark, a comparatively large structure whose fine surface texture is of no interest. Consequently, no discrimination is made against backscattered electrons in alignment mark detection, and, in fact, it is these electrons that contribute most strongly to the resulting signals. Backscattered electrons may be collected either by using a scintillator–photomultiplier
arrangement or by using a solid-state diode as a detector. This is a popular collection scheme and is usually implemented by mounting an annular diode above the workpiece. Wolf et al. [51] used a solar-cell diode 25 mm in diameter with a 4-mm-diameter hole in it through which the primary electron beam passed, the total solid angle subtended at the workpiece being 0.8 sr. Detectors of this type are insensitive to secondary electrons because they are not sufficiently energetic to penetrate down to the depletion region that lies under the surface; the threshold energy for penetration is generally several hundred eV. The gain of the detector varies linearly with excess energy above the threshold, the gradient of the relationship being approximately 1 hole–electron pair per 3.5 eV of beam energy.

A useful extension of this technique (see, for example, Reimer [52]) is to split the detector into two halves. When the signals derived from the two halves are subtracted, the detector responds primarily to topographic variations on the substrate; this mode is well suited to detecting the type of mark depicted in Figure 6.7b. When the signals are added, the detector responds to changes in the backscattering coefficient (η). The type of mark shown in Figure 6.7c is best detected in this mode because the values of η at locations such as P and Q are significantly different, whereas the topographic variations are small.

The backscattering coefficients of composite samples consisting of thin films supported on bulk substrates have been studied by electron microscopists (see, for example, Niedrig [53]). The composite backscattering coefficient varies approximately linearly from a value corresponding to the substrate material (η_S) for very thin films to a value corresponding to the film material (η_F) for very thick films. The value η_F is reached when the film thickness is greater than about half the electron range in the film material. A silicon substrate has a backscattering coefficient η_S = 0.18 (see, for example, Reed [54]). The widely used electron resist poly(methyl methacrylate) (PMMA) has the chemical formula C5H8O2, and its mass-concentration-averaged atomic number is 6.2; as a rough approximation, it can be assumed that its backscattering coefficient is equal to that of carbon, i.e., that η_F = 0.07. The density of PMMA is 1.2 g cm⁻³, and the data of Holliday and Sternglass [55] imply that the extrapolated range of 20 keV electrons in the material is about 8 μm. From these values, it follows that the backscattering coefficient of bulk silicon is reduced by roughly 0.02 for every 1 μm of PMMA covering it. A similar result has been calculated by Aizaki [56] using a Monte Carlo technique.

Figure 6.7d is a sketch of the variation of η along the fiducial mark depicted in Figure 6.7c. It has been assumed that at P and Q, well away from topographical changes, η has the values 0.16 and 0.14, respectively. The change in η caused by the sides of the trench is assumed to occur linearly over a distance x_rise of 5 μm. The exact form of the transition depends on the shape of the trench, the way the resist thickness changes in its vicinity, the range of electrons in the resist, and, most important, their range in silicon. (The extrapolated ranges of 20 and 50 keV electrons in silicon are about 2 μm and 10 μm, respectively.)
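A crude linear-interpolation model of the composite backscattering coefficient described above can be sketched as follows. The saturation thickness (half the electron range) and the end-point values are the ones quoted in the text; the model gives a decrease of roughly 0.02–0.03 per micrometer of PMMA, consistent with the figure above.

    def eta_composite(t_film_um, eta_substrate=0.18, eta_film=0.07, electron_range_um=8.0):
        # Linear variation from the substrate value (bare surface) to the film value,
        # reached once the film is about half the electron range thick
        t_sat = 0.5 * electron_range_um
        if t_film_um >= t_sat:
            return eta_film
        return eta_substrate + (t_film_um / t_sat) * (eta_film - eta_substrate)

    print(eta_composite(0.0))   # 0.18 (bare silicon)
    print(eta_composite(1.0))   # ~0.15 (about 1 um of PMMA)
    print(eta_composite(4.0))   # 0.07 (thick PMMA)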
Because a split backscattered electron detector connected in the adding mode responds to changes in η, Figure 6.7d also represents the signal collected as a well-focused electron beam is scanned over the registration mark. Despite the simplifying assumptions that have been made, this sketch is representative of the signals that are obtained in practice.

6.6.4 The Measurement of Alignment Mark Position

A threshold technique is frequently used to measure the position of an alignment mark. In Figure 6.7d, for example, the threshold has been set to correspond to η = 0.15, and the position of the center of the trench is x = (x1 + x2)/2. The accuracy with which x1 and x2 can
be measured is limited by electrical noise, of which there are three major sources:

1. The shot noise associated with the primary electron beam.
2. The noise associated with the generation of backscattered electrons.
3. The noise associated with the detector itself.

Wells et al. [57] analyzed the effects of shot noise and secondary emission noise on alignment accuracy. Their theory may be adapted to deal with backscatter noise by replacing the coefficient of secondary emission by the backscattering coefficient η. With this modification, the theory predicts that the spatial accuracy with which a threshold point such as x1 or x2 may be detected is
Δx = [128 q x_rise² (1 + η)/(η m² H Q)]^{1/3}    (6.10)
The quantities appearing in this equation are defined as follows:

1. Δx is the measure of the detection accuracy. It is defined such that the probability of a given measurement being in error by more than Δx is 10⁻⁴.
2. x_rise is the rise distance of the signal. A value of 5 μm is assumed in Figure 6.7d.
3. η is the mean backscattering coefficient. A value of 0.15 is assumed.
4. m is the fractional change in signal as the mark is scanned. In Figure 6.7d, m = 0.02/0.15 = 0.13.
5. It is assumed that the beam oscillates rapidly in the direction perpendicular to the scan direction with an amplitude (1/2)H. In this way, the measurement is made, not along a line, but along a strip of width H. A typical value is H = 10 μm.
6. Q is the charge density deposited in the scanned strip.

Equation 6.10 indicates that a compromise exists between registration accuracy (Δx) and charge density (Q) and that Δx may be made smaller than any set value provided that Q is large enough. For 0.5 μm lithography, an acceptable value of Δx could be 0.1 μm. To achieve this level of accuracy, Equation 6.10 predicts that a charge density Q = 2300 μC cm⁻² has to be deposited. This calculation illustrates the point made earlier: because electron resists require doses in the range 1–100 μC cm⁻² for exposure, pattern features cannot be used as registration marks. Wells et al. pointed out that a threshold detection scheme is wasteful because it uses only that part of the alignment signal that corresponds to the immediate vicinity of the threshold point. If the complete waveform is used, all the available information is utilized. The charge density necessary to achieve an accuracy of Δx is then reduced by the factor Δx/x_rise. In the case of the example considered above, this would reduce Q from 2300 to 46 μC cm⁻². One way in which the complete waveform can be utilized is to use a correlation technique to locate the alignment mark; such schemes have been described by Cumming [58], Holburn et al. [59], and Hsu [60].

6.6.5 Machine and Process Monitoring

Although a modern commercial electron lithography exposure system comes with a great deal of computer control, close system monitoring is essential for obtaining optimum
performance. Table 6.1 shows a set of tests that have been found useful for a high-resolution system operating in an R&D mode.

TABLE 6.1
Tests for a High-Resolution Electron Lithography System

Feature | Evaluation Tool and Purpose
25-step incremental dose pads in two versions: large and small shot sizes | Optical/Nanospec: at the lowest resolving dose, the beam current density distribution is visible in the pad section; track thickness vs. dose for resist shelf/film life
25-step incremental dose line/space features from 0.1 to 0.5 μm | SEM: determine shot-butting quality for fine-line exposure (0.2 μm); line width vs. dose
Single- and four-field grid pattern | System mark detection: measure field gain/distortion
10×10 array of single crosses | System mark detection: measure stage accuracy
Mask mode field butting | Optical: measure verniers to determine needed corrections
Aligned overlay test pattern | Optical: measure verniers to determine needed corrections
Custom | Add any feature for special exposures

In addition, a weekly alignment monitor is run using an electrical test pattern. The first level consists of standard van der Pauw pads connected to resistors oriented in the X and Y directions. The second level cuts a slot in the resistor, dividing it into two equal parts. The difference between the resistance values gives the amount of misalignment.
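A sketch of how the split-resistor monitor converts a resistance difference into a misalignment is given below. The geometry (a uniform resistor of known length, cut in two by the second-level slot) is an assumption for illustration, not a description of the actual test-structure dimensions.

    def misalignment(R1_ohm, R2_ohm, resistor_length):
        # If the cut were perfectly centered, the two halves would measure equally.
        # A cut offset d from the center makes the halves differ by 2*d worth of resistor.
        ohms_per_unit_length = (R1_ohm + R2_ohm) / resistor_length
        return (R1_ohm - R2_ohm) / (2.0 * ohms_per_unit_length)

    # Example: a 100-um-long resistor whose halves measure 510 and 490 ohms
    print(misalignment(510.0, 490.0, 100.0))   # 1.0 um offset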
6.7 The Interaction of the Electron Beam with the Substrate

6.7.1 Power Balance

Consider a silicon substrate, coated with a film of PMMA 0.5 μm thick, with an electron beam of energy 20 keV and current 1 μA impinging on the top surface of the resist; the power flowing is 20 mW. Part of this power is chemically and thermally dissipated in the resist, part thermally in the substrate, and the remainder leaves the substrate. The energy loss as the beam passes in the forward direction through the resist can be calculated using the Thomson–Whiddington law, expressed by Wells [61] as
E_A² − E_B² = b′z, with b′ = 6.9×10⁹ ρE_A^{0.5}/Z^{0.2}    (6.11)
Here, E_A is the energy (in eV) of the electrons as they enter the resist, and E_B is the mean energy as they exit it; z is the thickness (in cm) of the resist film, ρ is the density of the resist (in g cm⁻³), and Z is its effective atomic number. Because the resist film is thin, it is assumed that the number of electrons absorbed or generated within it is negligible. The mean energy loss that occurs within the resist film in this forward direction is ΔE_f = E_A − E_B = 1.04 keV, and the power dissipated by the beam as it passes through the resist
is given by

P_f = I_B ΔE_f    (6.12)
which, for the conditions cited above, is 1 mW. The beam penetrates into the silicon, and a fraction η = 0.18 of the electron current is backscattered out of the substrate into the resist. Bishop's data [49] indicate that the mean energy of the backscattered electrons is 60% of the incident energy (12.0 keV in this case), so the power dissipated in the substrate is 16.9 mW. The backscattered electrons pass through the resist film toward the surface in the backward direction. It is assumed that the number of electrons absorbed or generated within the resist film is negligible, so these electrons constitute a current ηI_B. Although the mean electron energy as the electrons re-enter the resist (11.4 keV) is known, one cannot use the Thomson–Whiddington law to calculate the energy they lose in the resist film, ΔE_B, for two reasons: first, the spread of electron energies at C is wide (unlike that at A), and, second, the directions of travel of the backscattered electrons are not, in general, normal to the resist–substrate interface. ΔE_B may instead be estimated from the results of Kanter [62], who investigated the secondary emission of electrons from aluminum (Z = 13). He proposed that, because of the lower energies and oblique trajectories of electrons backscattered from the sample, they would be a factor β more efficient at generating secondary electrons at the surface than the primary beam. Kanter's estimated value for β was 4.3, and his experimentally measured value was 4.9. Because secondary electron generation and resist exposure are both governed by the rate of energy dissipation at the surface of the sample, and because the atomic numbers of aluminum (Z = 13) and silicon (Z = 14) are nearly equal, it is reasonable to assume that these results apply to the exposure of resist on a silicon wafer. Thus,

ΔE_B = βΔE_f    (6.13)
where β is expected to have a value between 4 and 5. The power dissipated in the resist film by the backscattered electrons is

P_b = ηI_B ΔE_B = ηβI_B ΔE_f    (6.14)
Comparing Equation 6.12 and Equation 6.14, the ratio P_b/P_f is given by

η_e = P_b/P_f = ηβ    (6.15)
It is generally acknowledged that for silicon at a beam energy of 20 keV, η_e lies between 0.7 and 0.8, corresponding to β = 4. (A number of experimental measurements of η_e have been collated by Hawryluk [63]; these encompass a range of values varying from 0.6 to 1.0.) If it is assumed that β = 4.0, then ΔE_B = 4.2 keV, and the mean energy deposited in the resist film by the backscattered electrons, averaged over all incident electrons, is ηΔE_B = 0.75 keV. Therefore, in this example, of the incident power in the beam, approximately 5% is dissipated in the resist by forward-traveling electrons, approximately 4% is dissipated in the resist by backscattered electrons, 85% is dissipated as heat in the substrate, and the remaining 7% leaves the workpiece as the kinetic energy of the emergent backscattered electrons.
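The power balance above can be tabulated with a short sketch (Python, not part of the original text); the substrate dissipation is obtained by difference, and the energy of the emerging backscattered electrons follows the assumptions stated in this section.

    import math

    E_A, I_B = 20e3, 1e-6            # incident energy (eV) and beam current (A)
    z_cm, rho, Z_eff = 0.5e-4, 1.2, 6.2
    eta, beta = 0.18, 4.0

    b_prime = 6.9e9 * rho * math.sqrt(E_A) / Z_eff**0.2     # Equation 6.11
    E_B = math.sqrt(E_A**2 - b_prime * z_cm)                # ~18.96 keV at the substrate
    dE_f = E_A - E_B                                        # ~1.04 keV forward loss

    P_total = E_A * I_B                                     # 20 mW (volts x amperes)
    P_fwd   = dE_f * I_B                                    # ~1 mW in the resist (forward)
    P_back  = eta * beta * dE_f * I_B                       # ~0.75 mW in the resist (backscatter)
    P_out   = eta * I_B * (0.6 * E_B - beta * dE_f)         # ~1.3 mW leaves the workpiece
    P_sub   = P_total - P_fwd - P_back - P_out              # ~16.9 mW heats the substrate

    for name, P in [("forward", P_fwd), ("backscatter", P_back),
                    ("substrate", P_sub), ("leaves workpiece", P_out)]:
        print(f"{name:17s} {P*1e3:5.2f} mW  {100*P/P_total:4.1f} %")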
The quantity that controls the change in solubility of a resist is the total energy absorbed per unit volume (energy density), ε (J m⁻³). Assuming that the energy is uniformly absorbed in a resist layer of thickness z, this is related to the exposure dose Q by the expression

ε = ΔE_f (1 + η_e) Q/z    (6.16)
For the example considered in this section, Equation 6.16 indicates that an exposure dose of 1 μC cm⁻² corresponds to an absorbed energy density of 3.6×10⁷ J m⁻³.

6.7.2 The Spatial Distribution of Energy in the Resist Film

Monte Carlo techniques have been used to investigate the interactions of electrons with matter in the context of electron probe microanalysis and scanning electron microscopy by Bishop [49], Shimizu and Murata [64], and Murata et al. [65]. Kyser and Murata [66] investigated the interaction of electron beams with resist films on silicon using the same method, pointing out that its conceptual simplicity and the accuracy of the physical model were useful attributes. Monte Carlo calculations indicate that the lateral distribution of energy dissipated by the forward-traveling electrons at the resist–silicon interface may be approximated closely as a Gaussian. Broers [67] has noted that the standard deviation computed in this way may be expressed as σ_f (measured in μm), where
σ_f = (9.64z/V)^{1.75}    (6.17)
where V (measured in keV) is the energy of the incident electron beam and z (measured in μm) is the thickness of the resist film. For 20 keV electrons penetrating 0.5 μm of resist, Equation 6.17 predicts that σ_f = 0.08 μm. Thus, forward scattering is not a serious limitation until one reaches the nanolithography regime discussed in Chapter 15. In those cases, techniques such as thin resist films and electron beams with energies greater than 20 keV are used, as discussed extensively in Chapter 15. The electrons backscattered from the substrate expose the resist film over a region with a characteristic diameter σ_b that is approximately twice their range in the substrate. Monte Carlo simulation techniques have been used extensively to simulate electron backscattering. Shimizu and Murata applied this technique to compute the backscatter coefficient and the lateral distribution of electrons backscattered from aluminum at a beam energy of 20 keV. Their calculations showed that η = 0.18 and that the electrons emerged from a circular region 4 μm in diameter. Because silicon and aluminum are adjacent elements in the periodic table, the results for silicon are almost identical. In the context of electron lithography, it is important that the volume in the resist within which the forward-traveling electrons dissipate their energy is much smaller than that within which the backscattered electrons dissipate theirs. The comparatively diffuse backscattered energy distribution determines the contrast of the latent image in the resist, and the more compact forward-scattered energy distribution determines the ultimate resolution.
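The two numbers quoted in this section follow directly from Equation 6.16 and Equation 6.17; a minimal sketch using the values assumed in the text:

    dE_f  = 1.04e3        # forward energy loss per electron, eV (Section 6.7.1)
    eta_e = 0.72          # back-to-forward deposition ratio (eta * beta)
    z_m   = 0.5e-6        # resist thickness, m
    Q     = 1e-6 / 1e-4   # 1 uC/cm^2 expressed in C/m^2

    print(dE_f * (1.0 + eta_e) * Q / z_m)    # ~3.6e7 J/m^3, Equation 6.16

    def sigma_forward_um(z_um, V_keV):
        # Broers' forward-scattering width, Equation 6.17 (z in um, V in keV)
        return (9.64 * z_um / V_keV) ** 1.75

    print(sigma_forward_um(0.5, 20.0))       # ~0.08 um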
6.8 Electron Beam Resists and Processing Techniques

Poly(methyl methacrylate), one of the first resists used with electron lithography, is still commonly used because of its high resolution and because it is one of the best known and understood positive e-beam resists. Its primary disadvantages are very low sensitivity and poor etch resistance under many important plasma conditions. Some positive resists that eliminate these problems are now available, but their process sensitivities are much greater; they are based on the acid catalysis phenomena described at length in Chapter 10. Traditional negative-acting resists possessed far higher sensitivities (because only one crosslink per molecule is sufficient to insolubilize the material), but at the expense of resolution. These resists were developed in a solvent appropriate for the non-crosslinked portion, and the crosslinked material would be swelled by this solvent, often touching neighboring patterns and resulting in extensive pattern distortion. As a result, they were limited to feature sizes greater than about 1 μm. The introduction of Shipley's acid-catalyzed novolak-based resist (SAL-601 or its subsequent versions) was a major advance because it gives excellent resolution and good process latitude with etch resistance similar to that of conventional optical resists. Even though it relies on crosslinking for its action, it avoids the swelling problem because, being a novolak (a phenolic-type polymer with acidic hydrogen), it is developed in aqueous base in essentially the same manner as positive optical novolak resists. Water is not a good solvent for the polymer, so no swelling occurs. (For an extensive discussion of the mechanism of development of novolak and phenolic resists, see Chapter 10.) Much of the discussion of resist processing at nanolithography dimensions in Chapter 15 is also relevant to high-resolution e-beam lithography, and the reader should consult this material for further details and references.
6.9 The Proximity Effect 6.9.1 Description of the Effect The proximity effect is the exposure of resist by electrons backscattered from the substrate, which constitute a background on which the pattern is superimposed. If this background were constant, it would create no lithographic problem other than a degradation in contrast that, although undesirable, would be by no means catastrophic. However, the background is not constant, and the serious consequence of this was observed in positive resists by Chang and Stewart [10]: "Several problems have been encountered when working at dimensions of less than 1 μm. For example, it has been found that line width depends on the packing density. When line spacing is made less than 1 μm, there is a noticeable increase in the line width over that obtained when the spacing is large." Figure 6.8 illustrates the way in which the proximity effect affects pattern dimensions. In Figure 6.8a, it is assumed that an isolated narrow line (for example, of width 0.5 μm) is to be exposed. The corresponding energy density distribution is shown in Figure 6.8d. The energy density deposited in the resist by forward-traveling electrons is ε, and the effects of lithographic resolution and forward scattering on the edge acuity of the pattern have been ignored. Because the width of the isolated line is a fraction of 1 μm and the characteristic diameter of the backscattered electron distribution (σb) is several micrometers, the backscattered energy density deposited in the resist is negligibly small.
FIGURE 6.8 The influence of the proximity effect on the exposure of (a) an isolated line, (b) equally spaced lines and spaces, and (c) an isolated space (exposed regions are denoted by shading). The absorbed energy densities are shown in (d), (e), and (f). The profiles of the resist patterns after a development time appropriate for the isolated line are shown in (g), (h), and (i). The profiles shown in (j), (k), and (l) occur after a development time that is appropriate for the isolated space. The shapes of the developed resist patterns are drawn to be roughly illustrative of what would be observed in practice.
This, however, is not the case for the exposure of the isolated space depicted in Figure 6.8c. Because the width of the space is much smaller than σb, it is a good approximation to assume that the pattern is superimposed on a uniform background energy density ηeε, as shown in Figure 6.8f. Figure 6.8b and e deal with an intermediate case: an infinite periodic array of equal lines and spaces. Here, the pattern is superimposed on a background energy density of ηeε/2. Thus, the energy densities corresponding to exposed and unexposed parts of the pattern are strongly dependent on the nature of the pattern itself. This leads to difficulties in maintaining the fidelity of the pattern during development. In the drawings in Figure 6.8g through i, it has been assumed that a development time has been chosen that allows the isolated line (g) to develop out to the correct dimensions. However, because the energy densities associated with the isolated space are appreciably higher, this pattern may well be so overdeveloped at the end of this time that it has completely disappeared (i). The problem may not be so drastic for the equal lines and spaces, for which the energy densities have lower values. Nevertheless, the lines will be overdeveloped; in (h) they are depicted (arbitrarily) as having a width of 0.36 μm, not the required 0.5 μm. The situation cannot be improved by reducing the development time. Figure 6.8j through l illustrate what happens if the patterns are developed for a time appropriate for the isolated space. Although this is now developed out to the correct dimension, the isolated line and the equal line and space patterns are inadequately developed. In general, because of the proximity effect, any one of these types of patterns may be made to develop out to the correct dimensions by choosing an appropriate development time. However, it is impossible to develop them all out to the correct dimensions simultaneously. Although the proximity effect poses serious problems, it does not constitute a fundamental resolution limit to electron lithography. A number of methods for compensating for the effect, or at least reducing the severity of the problems, have been devised. These are described below. It should be noted that this discussion changes significantly when feature sizes reach sub-100-nm dimensions; these effects are described in Chapter 15.
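As a concrete illustration of the energy levels in Figure 6.8, the short Python sketch below tabulates the absorbed energy density in the exposed and unexposed regions for the three pattern types, following the simple model described above. The value of ηe is an assumed example value used only for illustration.

```python
# Absorbed energy densities for the three pattern types of Figure 6.8,
# using the simple model in the text: exposed regions receive the
# forward-scattered density eps plus the local backscattered background.
eta_e = 0.5   # assumed backscatter ratio, for illustration only
eps = 1.0     # forward-scattered energy density (arbitrary units)

patterns = {
    "isolated line":          0.0,             # negligible backscattered background
    "equal lines and spaces": eta_e * eps / 2,  # half of the surrounding area is exposed
    "isolated space":         eta_e * eps,      # fully exposed surroundings
}

for name, background in patterns.items():
    exposed = eps + background  # energy density in the exposed features
    print(f"{name:24s} exposed = {exposed:.2f}  unexposed = {background:.2f}")
```

Running the sketch reproduces the levels labeled in Figure 6.8d through f: ε, (1 + ηe/2)ε, and (1 + ηe)ε for the exposed regions, with backgrounds of 0, ηeε/2, and ηeε, respectively.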
6.9.2 Methods of Compensating for the Proximity Effect The most popular form of proximity effect compensation is probably dose correction. To implement this method, the dose delivered by the lithography instrument is varied in such a way as to deposit the same energy density in all exposed regions of the pattern. Referring to Figure 6.8, this situation occurs if the dose delivered to the isolated line (a) is increased by the factor (1 + ηe) and that delivered to the lines of the line and space pattern (b) by a factor of (1 + ηe)(1 + ηe/2)⁻¹. If this is done, all three types of patterns will develop out to approximately the correct dimensions after the same development time (that appropriate for the isolated space in Figure 6.8c). The dose correction scheme was first proposed by Chang et al. [68], who applied it to a vector-scan round-beam lithography instrument, with variations in dose obtained by varying the speed with which the beam was scanned over the substrate. Submicrometer bubble devices were successfully made in this way, and because of the simplicity and highly repetitive nature of the pattern, the calculation of the necessary dose variations was not arduous. However, for complex, nonrepetitive patterns, the dose calculations become involved, making the use of a computer to carry them out essential. The calculations are time consuming and, therefore, expensive to carry out, with several hours of CPU time sometimes necessary even when using large, fast computers. A set of programs designed for this purpose has been described by Parikh [69], and these have since become widely used. Parikh's algorithm involves a convolution of the pattern data with the inverse of the response function associated with electron backscattering. In order to make the computation tractable, the approximations made are considerable. Kern [70] has pointed out that the dose compensation calculations can be carried out exactly and more conveniently in the spatial frequency domain using a Fourier transform technique. Further analysis of dose compensation algorithms has been published by Owen [71,72]. Another popular form of compensation is shape correction. This would be applied to the patterns of Figure 6.8 by decreasing the widths of the exposed lines in (b), increasing the width of the isolated space in (e), and developing the pattern for a time appropriate for the isolated line pattern in (a). In practical applications, the magnitudes of the shape corrections to be applied are determined empirically from test exposures. For this reason, this technique is generally applied only to simple, repetitive patterns for which the empirical approach is not prohibitively time consuming. A correction scheme that involves no computation other than reversing the tone of the pattern has been described by Owen and Rissman [73]. The scheme is implemented by making a correction exposure in addition to the pattern exposure. The correction exposure consists of the reversed field of the pattern, and it is exposed at a dose a factor ηe(1 + ηe)⁻¹ less than the pattern dose using a beam defocused to a diameter a little less than σb. The attenuated, defocused beam deposits an energy density distribution in the resist that mimics the backscattered energy density associated with a pattern pixel. Because the correction exposure is the reverse field of the pattern exposure, the combination of the two produces a uniform background energy density, regardless of the local density of the pattern.
If this correction technique were applied to the patterns of Figure 6.8, all pattern regions would absorb an energy density (1 + ηe)ε and all field regions an energy density of ηeε. These are identical to the exposure levels of a narrow isolated space (to which a negligible correction dose would have been applied). As a result, after development, all the patterns (a), (b), and (c) would have the correct dimensions. It has been reported that the use of beam energies much greater than 20 keV (e.g., 50 keV) reduces the proximity effect, and experimental and modeling data have been presented to support this claim (see, for example, Neill and Bull [74]). At first sight, it
would seem that the severity of the proximity effect should not be affected by increasing the beam energy because there is good evidence that this has a negligible effect on ηe [75]. A possible explanation for the claimed reduction of the proximity effect at high energies hinges on the small size of the test patterns on which the claims are based. At 20 keV, the range of backscattered electrons in silicon is about 2 μm, and at 50 keV it is about 10 μm [55]. A typical test structure used for modeling or experimental exposure may have linear dimensions of the order of a few micrometers. As a result, at 20 keV, nearly all the backscattered electrons generated during exposure are backscattered into the pattern region itself. However, at 50 keV, the electrons are backscattered into a considerably larger region, thereby giving rise to a lower concentration of backscattered electrons in the pattern region. Thus, the use of a high beam energy will reduce the proximity effect if the linear dimensions of the pattern region are considerably smaller than the electron range in the substrate. A more general explanation that applies to pattern regions of any size is statistical. At 20 keV, a given point on a pattern receives backscattered energy from a region about 4 μm in diameter, whereas at 50 keV, it would receive energy from a region about 20 μm in diameter. It is, therefore, to be expected that the variation in backscattered energy from point to point should be less at 50 keV than at 20 keV because a larger area of the pattern is being sampled at the higher beam energy. Consequently, because the proximity effect is caused by point-to-point variations in backscattered energy density, it should become less serious at higher beam energies.
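To make the dose-correction arithmetic of this section concrete, the following minimal Python sketch evaluates the relative pattern doses given above and the corresponding reverse-field (Owen–Rissman) correction dose. The backscatter ratio ηe is an assumed illustrative value; this is not a production proximity-correction code.

```python
# Relative doses that equalize the absorbed energy density for the three
# pattern types of Figure 6.8, plus the Owen-Rissman reverse-field
# correction dose, using the factors quoted in the text.
eta_e = 0.5  # assumed backscatter ratio, for illustration only

dose_isolated_space = 1.0                                  # reference (nominal) dose
dose_isolated_line  = 1.0 + eta_e                          # factor (1 + eta_e)
dose_lines_spaces   = (1.0 + eta_e) / (1.0 + eta_e / 2.0)  # factor (1 + eta_e)(1 + eta_e/2)^-1

# Reverse-field correction exposure: delivered with a defocused beam at a dose
# reduced by the factor eta_e(1 + eta_e)^-1 relative to the pattern dose.
ghost_dose = eta_e / (1.0 + eta_e)

print(f"isolated line dose    : {dose_isolated_line:.3f}")
print(f"lines-and-spaces dose : {dose_lines_spaces:.3f}")
print(f"isolated space dose   : {dose_isolated_space:.3f}")
print(f"correction dose       : {ghost_dose:.3f} x pattern dose")
```

With these doses, every exposed feature absorbs (1 + ηe)ε, which is why all three pattern types of Figure 6.8 then develop out correctly with a single development time.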
Acknowledgments I would like to thank Geraint Owen and James R. Sheats for their original works in the first edition. In particular, I appreciate James R. Sheats’ help in providing his electronic file when I started to revise this chapter.
References
1. D.A. Buck and K. Shoulders. 1957. Proceedings of Eastern Joint Computer Conference, New York: ATTE, p. 55.
2. G. Owen. 1985. Rep. Prog. Phys., 48: 795.
3. C.W. Oatley. 1972. The Scanning Electron Microscope, Cambridge: Cambridge University Press, chaps. 2 and 3.
4. A.N. Broers. 1969. J. Sci. Instrum. (J. Phys. E), 2: 272.
5. O.C. Wells. 1974. Scanning Electron Microscopy, New York: McGraw-Hill, Table 4.
6. R. Gomer. 1961. Field Emission and Field Ionization, Cambridge, MA: Harvard University Press, chaps. 1 and 2.
7. G. Stille and B. Astrand. 1978. Phys. Scr., 18: 367.
8. H.P. Kuo and B.M. Siegel. 1978. In Electron and Ion Beam Science and Technology, 8th International Conference, R. Bakish, ed., Princeton, NJ: The Electrochemical Society, pp. 3–10.
9. H.P. Kuo, J. Foster, W. Haase, J. Kelly, and B.M. Oliver. 1982. In Electron and Ion Beam Science and Technology, 10th International Conference, R. Bakish, ed., Princeton, NJ: The Electrochemical Society, pp. 78–91.
10. T.H.P. Chang and A.D.G. Stewart. 1969. In Proceedings of 10th Symposium on Electron, Ion and Laser Beam Technology, L. Marton, ed., San Francisco, CA: IEEE, p. 97.
11. G. Owen. 1981. J. Vac. Sci. Technol., 19: 1064.
12. G. Owen and W.C. Nixon. 1973. J. Vac. Sci. Technol., 10: 983.
13. K. Amboss. 1975. J. Vac. Sci. Technol., 12: 1152.
14. H. Ohiwa, E. Goto, and A. Ono. 1971. Electron. Commun. Jpn., 54B: 44.
15. E. Munro. 1975. J. Vac. Sci. Technol., 12: 1146.
16. L.H. Lin and H.L. Beauchamp. 1973. J. Vac. Sci. Technol., 10: 987.
17. T.H.P. Chang, A.J. Speth, C.H. Ting, R. Viswanathan, M. Parikh, and E. Munro. 1982. In Electron and Ion Beam Science and Technology, 7th International Conference, R. Bakish, ed., Princeton, NJ: The Electrochemical Society, pp. 376–391.
18. E. Munro and H.C. Chu. 1982. Optik, 60: 371.
19. E. Munro and H.C. Chu. 1982. Optik, 61: 1.
20. H.C. Chu and E. Munro. 1982. Optik, 61: 121.
21. H.C. Chu and E. Munro. 1982. Optik, 61: 213.
22. A.V. Crewe. 1978. Optik, 52: 337.
23. T. Groves, D.L. Hammon, and H. Kuo. 1979. J. Vac. Sci. Technol., 16: 1680.
24. R.F.W. Pease, J.P. Ballantyne, R.C. Henderson, M. Voshchenkov, and L.D. Yau. 1975. IEEE Trans. Electron Dev., ED-22: 393.
25. H.C. Pfeiffer. 1975. J. Vac. Sci. Technol., 12: 1170.
26. H.C. Pfeiffer and K.H. Loeffler. 1970. In Proceedings of 7th International Conference on Electron Microscopy, Paris: Société Française de Microscopie Électronique, pp. 63–64.
27. M. Born and E. Wolf. 1975. Principles of Optics, Oxford: Pergamon.
28. J.L. Mauer, H.C. Pfeiffer, and W. Stickel. 1977. IBM J. Res. Dev., 21: 514.
29. E.V. Weber and R.D. Moore. 1979. J. Vac. Sci. Technol., 16: 1780.
30. L.A. Fontijn. 1972. PhD Thesis, Delft: Delft University Press.
31. H.C. Pfeiffer. 1978. J. Vac. Sci. Technol., 15: 887.
32. R.D. Moore, G.A. Caccoma, H.C. Pfeiffer, E.V. Weber, and O.C. Woodard. 1981. J. Vac. Sci. Technol., 19: 950.
33. D.E. Davis, S.J. Gillespie, S.L. Silverman, W. Stickel, and A.D. Wilson. 1983. J. Vac. Sci. Technol., B1: 1003.
34. H.C. Pfeiffer. 1979. IEEE Trans. Electron Dev., ED-26: 663.
35. W.H.P. Koops and J. Grob. 1984. In X-ray Microscopy, Springer Series in Optical Sciences, Vol. 43, G. Schmahl and D. Rudolph, eds., pp. 119–128.
36. S.D. Berger and J.M. Gibson. 1990. Appl. Phys. Lett., 57: 153.
37. K. Suzuki. 2002. Proc. SPIE, 4754: 775.
38. H.C. Pfeiffer et al. 1999. J. Vac. Sci. Technol., B17: 2840–2846.
39. W. Stickel. 1998. J. Vac. Sci. Technol., B16: 3211.
40. K. Suzuki et al. 2004. J. Vac. Sci. Technol., B22: 2885.
41. S.D. Golladay, R.A. Kendall, and S.K. Doran. 1999. J. Vac. Sci. Technol., B17: 2856.
42. T. Utsumi. 1999. J. Vac. Sci. Technol., B17: 2897.
43. N. Samoto et al. 2005. J. Microlith., Microfab., Microsyst., 4: 023008-1.
44. M. Muraki and S. Gotoh. 2000. J. Vac. Sci. Technol., B18: 3061.
45. H. Haraguchi et al. 2004. J. Vac. Sci. Technol., B22: 985.
46. P. Kruit. 2004. Proceedings of Litho Forum (January 29, 2004), http://www.sematech.org.
47. C. Brandstatter et al. 2004. Proc. SPIE, 5374: 601.
48. A.D. Wilson, T.W. Studwell, G. Folchi, A. Kern, and H. Voelker. 1978. In Electron and Ion Beam Science and Technology, 8th International Conference, R. Bakish, ed., Princeton, NJ: The Electrochemical Society, pp. 198–205.
49. H.E. Bishop. 1967. Br. J. Appl. Phys., 18: 703.
50. T.E. Everhart, O.C. Wells, and C.W. Oatley. 1959. J. Electron. Control, 7: 97.
51. E.D. Wolf, P.J. Coane, and F.S. Ozdemir. 1975. J. Vac. Sci. Technol., 6: 1266.
52. L. Reimer. 1984. Electron Beam Interactions with Solids, Chicago: SEM, Inc., pp. 299–310.
53. H. Niedrig. 1977. Opt. Acta, 24: 679.
54. S.J.B. Reed. 1975. Electron Microprobe Analysis, Cambridge: Cambridge University Press, Table 13.1.
55. J.E. Holliday and E.J. Sternglass. 1959. J. Appl. Phys., 30: 1428.
56. N. Aizaki. 1979. Jpn. J. Appl. Phys., 18, suppl. 18-1: 319–325.
57. O.C. Wells. 1965. IEEE Trans. Electron Dev., ED-12: 556.
58. D. Cumming. 1981. In Microcircuit Engineering 80, Proceedings of the International Conference on Microlithography, Delft: Delft University Press, pp. 75–81.
59. D.M. Holburn, G.A.C. Jones, and H. Ahmed. 1981. J. Vac. Sci. Technol., 19: 1229.
60. T.J. Hsu. 1981. Hewlett-Packard J., 32(5): 34–46.
61. O.C. Wells. 1974. Scanning Electron Microscopy, New York: McGraw-Hill.
62. H. Kanter. 1961. Phys. Rev., 121: 681.
63. R.J. Hawryluk. 1981. J. Vac. Sci. Technol., 19: 1.
64. R. Shimizu and K. Murata. 1971. J. Appl. Phys., 42: 387.
65. K. Murata, T. Matsukawa, and R. Shimizu. 1971. Jpn. J. Appl. Phys., 10: 678.
66. D.F. Kyser and K. Murata. 1974. In Electron and Ion Beam Science and Technology, 6th International Conference, R. Bakish, ed., Princeton, NJ: The Electrochemical Society, pp. 205–223.
67. A.N. Broers. 1981. IEEE Trans. Electron Dev., ED-28: 1268.
68. T.H.P. Chang, A.D. Wilson, A.J. Speth, and A. Kern. 1974. In Electron and Ion Beam Science and Technology, 6th International Conference, R. Bakish, ed., Princeton, NJ: The Electrochemical Society, pp. 580–588.
69. M. Parikh. 1979. J. Appl. Phys., 50: 4371.
70. D.P. Kern. 1980. In Electron and Ion Beam Science and Technology, 9th International Conference, R. Bakish, ed., Princeton, NJ: The Electrochemical Society, pp. 326–329.
71. G. Owen. 1990. J. Vac. Sci. Technol., B8: 1889.
72. G. Owen. 1993. Opt. Eng., 32: 2446.
73. G. Owen and P. Rissman. 1983. J. Appl. Phys., 54: 3573.
74. T.R. Neill and C.J. Bull. 1981. In Microcircuit Engineering 80, Proceedings of the International Conference on Microlithography, Delft: Delft University Press, pp. 45–55.
75. L.D. Jackel, R.E. Howard, P.M. Mankiewich, H.G. Craighead, and R.W. Epworth. 1984. Appl. Phys. Lett., 45: 698.
7 X-ray Lithography Takumi Ueno
CONTENTS
7.1 Introduction
7.2 Characteristics of X-ray Lithography
7.3 Factors Affecting the Selection of X-ray Wavelength
7.4 Resolution of X-ray Lithography
7.4.1 Geometrical Factors
7.4.2 Effect of Secondary Electrons
7.4.3 Diffraction Effect
7.5 X-ray Sources
7.5.1 Electron Beam (EB) Bombardment X-ray Sources
7.5.2 Synchrotron Orbit Radiation
7.5.3 Plasma X-ray Sources
7.6 X-ray Masks
7.6.1 Mask Fabrication
7.6.2 Mask Patterning
7.6.3 Mask Alignment
7.7 X-ray Resist Materials
7.7.1 Factors Determining Sensitivity of X-ray Resists
7.7.2 Trend in X-ray Resists
7.8 X-ray Projection Systems
7.8.1 Mirror-Type Reduction Projection Systems
7.9 Conclusions
References
7.1 Introduction The use of a very short electromagnetic wavelength, i.e., x-rays, has an obvious a priori potential for high-resolution patterning. A quarter century has passed since x-ray lithography was described in 1972 by Spears and Smith [1]; in this time, many research institutes have been concerned with its development. However, the application of x-ray lithography to real production environments has not yet materialized. The main reason for this delay is that optical lithography has been doing much better than was expected three decades ago. When x-ray lithography was proposed, the resolution limit of optical lithography was considered to
be 2–3 μm. Today, one does not have much difficulty in fabricating 0.1-μm patterns with a current ArF reduction projection aligner. Attempts to develop exposure machines with large numerical aperture (NA) lenses and short-wavelength irradiation have continued in pursuit of better resolution. Developments in resist processes and resist materials for high resolution have also extended the use of optical lithography. X-ray lithography is a novel technology, requiring an entirely new combination of source, mask, resist, and alignment system. It is said that quantum jumps are required on each front for practical use of x-ray lithography [2]. Therefore, there has been reluctance to consider x-ray lithography as an alternative to optical lithography. In view of these difficulties and the continued encroachment of optical lithography, it is not unreasonable to ask whether x-ray lithography still has a future. It is difficult to tell how far optical lithography can extend its resolution. However, it is not easy to obtain patterns smaller than the irradiation wavelength for practical use unless an exotic technique such as wavefront engineering, including interference lithography [3,4], is utilized. Therefore, provided that the device-related problems of scaling integrated circuits to lateral dimensions smaller than the current dimensional size of ~90 nm are solved, some means is required of doing lithography at dimensions smaller than can be provided by conventional light sources and optics (here, conventional includes lamps, lasers, and optics that look like the mirrors and lenses discussed in Chapter 2 of this book). It is necessary to continue research on x-ray lithography to provide one possible solution to this demand. Another solution, or family thereof, is discussed in Chapter 15 on ultrahigh-resolution electron beam (EB) lithography and proximal probe lithography. Each of these methods has associated with it a set of advantages and disadvantages, and despite the many years of research, many more years will be required to determine which will be the most appropriate for high-volume, high-yield manufacturing of ultra-large-scale integrated circuits (ULSICs). Because x-ray lithography has not reached the stage of practical application, this chapter will be of a somewhat different nature than most of the book (it shares this feature to some extent with Chapter 13). Many questions must be answered before a practical system can be described and a practitioner's handbook can be written. The present work attempts to provide a general introduction to what is known and gives some indication of probable future directions. As in any research effort, many surprises may be in store. An excellent overview of research is given by Smith et al. [5].
7.2 Characteristics of X-ray Lithography The x-ray lithography system proposed by Spears and Smith [1] is a proximity printing scheme, shown in Figure 7.1. Although there have been reports on reduction projection x-ray lithography (see Section 7.8), the focus here is mainly on x-ray proximity printing, for which the technology is at this point most highly developed. The advantages of 1:1 x-ray lithography are summarized as follows, based on the reported literature (although some are controversial, as described below):
1. By using soft x-rays, wavelength-related diffraction problems that limit the resolution of optical lithography are effectively reduced
2. High-aspect-ratio pattern fabrication can be achieved because of the transparency of the resist film to x-rays
3. Many defect-causing particles in the light-optics regime are transparent to x-rays
FIGURE 7.1 Schematic diagram of an x-ray lithography system.
4. There is a possibility of high throughput because large areas can be irradiated
5. Large depth of focus gives process latitude
6. There is practically no field size limitation
7. Unwanted scattering and reflection are negligible because the index of refraction in the x-ray spectral region is about the same for all materials (close to unity) [5]. This eliminates the standing-wave effect and other reflection-based problems that plague optical lithography.
7.3 Factors Affecting the Selection of X-ray Wavelength It is generally accepted that x-ray wavelengths range from 10 to 0.1 nm, overlapping with the ultraviolet region at the long-wavelength end and with γ-rays at the short-wavelength end. However, although x-rays cover a wide spectral range, the region of wavelengths useful for x-ray lithography is rather limited. The limited choice of materials for masking determines this wavelength range. The absorption coefficient of mask materials as a function of wavelength is shown in Figure 7.2 [6]. A mask with a reasonable contrast ratio is necessary to obtain good definition of x-ray images. According to Spears and Smith [1], the wavelength must be longer than 0.4 nm to obtain 90% x-ray absorption using the most highly absorbing materials (Au, Pt, Ta, W, etc.) with 0.5-μm thickness. On the other hand, substrate materials (Be, Si, SiC, Si3N4, BN, and organic polymers) restrict the usable wavelength to less than 2 nm in order to transmit more than 25% of the incident x-rays (see Figure 7.2). Therefore, the wavelength range is limited to 0.4 nm < λ < 2 nm. A plot of contrast ratios (related to the modulation transfer function (MTF) of the mask) as a function of Au thickness for four different wavelengths is shown in Figure 7.3 [6]. Because x-rays are generated under vacuum, a vacuum window separating the vacuum from the exposure area is needed. The material (Be) for this window also restricts the wavelength range of the x-rays. Although exposure systems operating under vacuum have been proposed, wafer handling would be complicated.
FIGURE 7.2 Absorption coefficients of some of the most absorbing and most transparent materials for x-rays, plotted as a function of wavelength (Å).
As will be discussed in Section 7.4, the range of the photoelectrons and the diffraction effect limit the resolution. The effects of photoelectron range and diffraction on resolution are depicted in Figure 7.4 [7]. The range of the photoelectrons is smaller for longer wavelengths, and the diffraction effect is smaller for shorter wavelengths. The optimum wavelength for x-ray lithography therefore seems to fall near 1 nm. It is known that the absorption coefficient varies strongly with wavelength. Organic resist materials consist mainly of C, H, and O, and the absorption coefficients for these elements are higher in the longer wavelength region and lower in the shorter one. Therefore, the longer wavelength is desirable for higher resist sensitivity. The amount of absorbed energy is closely related to resist sensitivity, as will be described in Section 7.7. In summary, the materials available for x-ray masks are of the utmost importance in determining the wavelength range 0.4 nm < λ < 2 nm.
7.4 Resolution of X-ray Lithography 7.4.1 Geometrical Factors A typical exposure system for x-ray lithography using an electron-beam bombardment source is schematically depicted in Figure 7.1. The opaque part of the mask casts shadows onto the wafer below.
FIGURE 7.3 X-ray contrast of Au as a function of Au thickness for four different wavelengths (bremsstrahlung neglected).
The edge of the shadow is not absolutely sharp because of the finite size of the x-ray source d (the diameter of the focal spot of the electrons on the anode) at a distance D from the mask. If the gap between the mask and the wafer is called s, the penumbral blur δ is given by
δ = s(d/D)    (7.1)
A smaller gap and a smaller x-ray source size lead to a smaller penumbra. Small size and high intensity are therefore important factors in developing x-ray sources. The incident angle of the x-rays on the wafer varies from 90° at the center of the wafer to tan⁻¹(2D/W) at the edge of a wafer of diameter W. The shadows are slightly longer at the edge by the amount Δ, which is given by
Δ = s(W/2D)    (7.2)
A smaller gap and a larger D give a smaller Δ. A full-wafer exposure system can be adopted when this geometrical distortion is acceptable. Otherwise, a step-and-repeat exposure mode is necessary.
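The following short Python sketch evaluates Equation 7.1 and Equation 7.2 for a representative geometry. The source diameter, source-to-mask distance, gap, and wafer diameter are assumed example values chosen only for illustration; they are not taken from the text.

```python
# Penumbral blur (Eq. 7.1) and edge run-out (Eq. 7.2) for an assumed geometry.
d = 3.0e-3     # x-ray source (focal spot) diameter, m  (assumed: 3 mm)
D = 0.30       # source-to-mask distance, m             (assumed: 30 cm)
s = 20e-6      # mask-to-wafer gap, m                   (assumed: 20 um)
W = 0.15       # wafer diameter, m                      (assumed: 150 mm)

delta_penumbra = s * d / D          # Eq. 7.1: penumbral blur
run_out        = s * W / (2.0 * D)  # Eq. 7.2: shadow lengthening at the wafer edge

print(f"penumbral blur  delta = {delta_penumbra * 1e9:.0f} nm")
print(f"edge run-out    Delta = {run_out * 1e6:.2f} um")
```

For these assumed values the penumbral blur is 200 nm and the edge run-out is 5 μm, which illustrates why a full-wafer point-source exposure may require either a very small gap or a step-and-repeat mode.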
FIGURE 7.4 Relation between resolution W0 and x-ray wavelength l. The effect of photoelectron range Rg and mask-wafer gap S on resolution as a function of the wavelength is also shown. Es is the synchrotron storage ring energy (MeV) to give the corresponding x-ray wavelength.
7.4.2 Effect of Secondary Electrons Three processes are involved in the absorption of x-rays: Compton scattering, the generation of photoelectrons, and the formation of electron–positron pairs. The magnitude of the three processes depends strongly on the x-ray energy and the atomic number, as shown in Figure 7.5 [8]. Electron–positron pair formation is possible only at x-ray energies above 1.02 MeV. The energies of the x-ray quanta used in x-ray lithography are lower than 10 keV. For such energies, the cross section for the photoelectric effect is about 100 times larger than that for Compton scattering. Therefore, only the effect of the photoelectrons needs to be considered. The absorption of an x-ray by an atom is an inner-shell excitation followed by the emission of a photoelectron (Figure 7.6). In the simplest picture, when an x-ray photon is absorbed, a photoelectron is generated with kinetic energy Ep, where
Ep = Ex − Eb    (7.3)
Here, Ex is the energy of the x-ray photon, and Eb is the binding energy required to release an electron from the atom, calculated by assuming that the final state has the same electronic configuration as before ionization. In reality, there is some relaxation, a consequence of many-electron configuration interaction effects, resulting in the emitted electron having a slightly greater energy than it otherwise would have [9]. This relaxation energy is on the order of a few electron volts, whereas Eb is on the order of several hundred electron volts (eV).
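As a numerical example of Equation 7.3, the sketch below converts an assumed exposure wavelength to photon energy and subtracts approximate core binding energies for the light elements common in organic resists. The binding-energy values are typical literature figures used only for illustration.

```python
# Photoelectron kinetic energy Ep = Ex - Eb (Eq. 7.3) for a 1-nm x-ray photon.
H = 4.135667e-15   # Planck constant, eV*s
C = 2.997925e8     # speed of light, m/s

wavelength = 1.0e-9        # assumed exposure wavelength: 1 nm
Ex = H * C / wavelength    # photon energy in eV (about 1240 eV at 1 nm)

binding_energy = {"C 1s": 284.0, "O 1s": 532.0}  # approximate core levels, eV

for shell, Eb in binding_energy.items():
    Ep = Ex - Eb
    print(f"{shell}: photoelectron energy Ep = {Ep:.0f} eV")
```

For 1-nm radiation the photoelectrons carry roughly 700–1000 eV, while the subsequent Auger electrons carry energies close to the binding energies themselves; both are in the sub-keV range where the electron range in resist is tens of nanometers.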
FIGURE 7.5 Main contribution of photon–atom interaction as a function of atomic number and photon energy.
The vacancy that is created in the atom is quickly filled, and the energy Eb is distributed to the surroundings via Auger electrons or fluorescent radiation. Usually, the vacancy is filled by an electron from the next higher level, and the energy released is a fraction smaller than Eb. X-ray fluorescence occurs at a wavelength slightly longer than the wavelength corresponding to the absorption edge. The ratio of the probability of Auger emission to that of fluorescence for light elements is about 9:1. Hence, for light elements, 90% of the binding energy is transferred to the surroundings by Auger electrons. The electrons produced by the photoelectric effect and the Auger process cause excitation and ionization in the resist film, leading to chemical changes in the polymers. Therefore, the range of these electrons relates directly to the resolution of the exposed patterns. The range of electrons in poly(methyl methacrylate) (PMMA) films as a function of electron energy is shown in Figure 7.7 [1], based on conventional electron energy loss analysis. The energies of the characteristic copper, aluminum, and molybdenum x-rays are indicated for comparison.
FIGURE 7.6 Schematic energy diagram showing the general processes involved in x-ray absorption. Core ionization results in the emission of a photoelectron of energy Ep, and the filling of this core level by an internal transition of energy Eb − Eval gives rise to an Auger electron of energy Ea (which may or may not come from exactly the same level as the one filling the core).
FIGURE 7.7 Characteristic electron range as a function of electron energy for a typical polymer film (ρ = 1 g/cm³).
The first attempt to measure the range of photoelectrons was carried out by Feder et al. [6]. They measured the maximum penetration depth of electrons from a heavy metal layer into PMMA resist. The effective range was determined with the experimental arrangement shown in Figure 7.8. An erbium film evaporated onto the resist acted as an x-ray absorber and electron generator to expose the resist. After the exposure, the erbium film was removed, and the resist was developed. In Figure 7.8, a plot of the change in resist film thickness as a function of development time is also shown. In all cases, the initial stages of development showed a rapid decrease in thickness followed by a normal development curve representative of PMMA. They regarded the extrapolation of the normal part of the curve as giving the effective range of the electrons.
FIGURE 7.8 Schematic showing the measurement of effective range of electrons generated by x-ray exposure (a). The incident x-ray transmitted through the mask is incident on a resist film coated with erbium. Removal of erbium and development of the resist film will give the depth of the electrons. The depth of the developed exposed area in resist is plotted as a function of development time (b). The intercepts on the vertical axis represent the maximum penetration depth of electrons as measured in the developed resist.
It is clear from this figure that the effective range of electrons depends on the wavelength of the x-rays; the range is smaller for longer wavelengths. Although the energy of photoelectrons from Er would be different from that of photoelectrons from C and O, the measured ranges are much smaller than the expected values shown in Figure 7.7. The resolution of this apparent discrepancy comes, in part, from considering that the spatial distribution of deposited energy is affected by the source of the electrons [10]. The Auger electrons are shorter ranged; the standard deviation of the Gaussian distribution for C (290 eV) is 3 nm and for O (530 eV) is 6 nm, and these ranges are, of course, independent of the photoelectron energy. The associated photoelectrons have longer ranges, but they deposit energy more diffusely. Several factors enter into the increase in resolution over that suggested by the nominal range data in Figure 7.7. Lower energy leads to larger elastic scattering cross sections, thus helping confine the electrons [10]. The inelastic scattering mean free path tends to be smallest (for most materials) in the vicinity of 100 eV [9]. Finally, it is important to realize that the definition of range is not a precise value; what counts is resist development. Resist development is, in general, not a linear function of the exposed dose, and it may be possible to correctly develop features in the presence of a background exposure. It has been shown experimentally that 30-nm linewidths can be produced with good fidelity in PMMA resist using an x-ray wavelength of 0.83 nm, with a corresponding maximum photoelectron range of 40–70 nm [11], and Smith et al. [5] suggest that feature sizes down to 20 nm are feasible using wavelengths of 1 nm and longer. In x-ray lithography, one should also take into account the effects of photo and Auger electrons near the interface of the resist and the silicon substrate. Ticher and Hundt [12] have reported the depth profile of the deposited energy of photoelectrons and Auger electrons generated in the resist and in the silicon by AlK (0.83 nm) and RhL (0.43 nm) x-rays. The angular distributions of the electrons are important in calculating the distribution of the energy transferred. The Auger electrons have a spherical, or isotropic, distribution about their starting point, whereas the photoelectrons generated by x-rays with energies below 10 keV are emitted preferentially perpendicular to the impinging x-rays. Whereas for AlK radiation the electrons from the silicon have only a minor influence on the deposited energy profile, the electrons generated in the silicon by RhL radiation can make a large contribution to the energy density at the resist/silicon interface, even in the unexposed area, as shown in Figure 7.9. This difference was attributed to the larger electron range for RhL than for AlK radiation. The calculated energy density curves for RhL and AlK radiation are in good agreement with the resist profiles after development. Some simulations [13–15] have shown that it is important to expose with a wavelength above the absorption edge of silicon, i.e., λ > 0.7 nm, because of photoelectrons originating in the substrate and propagating back into the resist. Photoelectrons are also generated from the absorber on the mask by absorption of the incident radiation. These electrons cause unwanted exposure of the resist that deteriorates the mask contrast and the system resolution. These photoelectrons can be eliminated by coating the mask with an organic layer.
7.4.3 Diffraction Effect In proximity x-ray printing, the diffraction of x-rays must be carefully considered, even though diffraction in x-ray lithography is much weaker than in optical lithography. The minimum linewidth Wmin is related to the mask–sample gap s by
Wmin = (sλ/α)^(1/2)    (7.4)
where λ is the source wavelength, and α is the reciprocal of the square of the so-called Fresnel number [5].
FIGURE 7.9 Effect of the secondary (Photo and Auger) electrons on the deposited energy in a resist film near the resistsubstrate interface. Lines of constant energy density (D(x,z)/D0Z0.5) near the bottom of the resist layer are plotted for various contrast values for RhL and AlK radiation.
Both accurate calculations (using Maxwell's equations and the real dielectric properties of the absorber) and experiment [16–18] show that an α value as large as 1.5 can be used if the spatial coherence of the source is optimized, which turns out to correspond to β = δ/Wmin ≈ 1.5 (δ being the penumbral edge blur due to the finite source size, as defined in Equation 7.1, and equivalent to a measure of spatial coherence). Under these conditions, edge ringing (rapid spatial variation in transmitted intensity close to the mask edge) is eliminated at the expense of a less abrupt edge transition [5]. The image quality is nevertheless sufficient to print 100-nm pitch electrode patterns with high fidelity using 1.32-nm x-rays at s = 2.7 μm with an exposure latitude of at least 2.3× [18]. Extension of these results by the foregoing analysis suggests that a gap of 15 μm will suffice for 100-nm features using 1-nm radiation and a source subtending 3.5 mrad as viewed from the substrate [5]. Figure 7.4 is a plot of minimum feature size versus mask–sample gap. Smith and co-workers [5] have found it possible to routinely achieve controlled gaps of 5 μm and smaller in a research setting; this would yield ~60–70 nm or smaller features. Whether this can be done in manufacturing has not yet been demonstrated. Factors to be considered include the cleanliness and flatness of both substrate and mask. Features below about 30–40 nm require mask–substrate contact. Again, there remains a gap between what has been demonstrated in a research mode and high-volume manufacturing.
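The gap-resolution trade-off of Equation 7.4 can be tabulated directly. The short sketch below uses the α = 1.5 value quoted above, 1-nm radiation, and a few representative gaps; it approximately reproduces the 100-nm/15-μm and ~60-nm/5-μm figures cited in the text.

```python
import math

# Minimum printable linewidth Wmin = sqrt(s * lambda / alpha)  (Eq. 7.4)
alpha = 1.5          # Fresnel-number parameter quoted in the text
wavelength = 1.0e-9  # 1-nm x-rays

for gap_um in (5.0, 15.0, 40.0):
    s = gap_um * 1e-6
    w_min = math.sqrt(s * wavelength / alpha)
    print(f"gap = {gap_um:5.1f} um  ->  Wmin = {w_min * 1e9:5.1f} nm")
```

The square-root dependence means that halving the minimum feature size requires reducing the gap by a factor of four, which is why sub-40-nm features effectively demand mask-substrate contact.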
7.5 X-ray Sources 7.5.1 Electron Beam (EB) Bombardment X-ray Sources X-ray radiation produced by the bombardment of a material with accelerated electrons has been used as an x-ray source since the discovery of x-rays. When an accelerated electron (several tens of keV) impinges on the target, two types of radiation are produced. One is continuous radiation, primarily produced by interactions with nuclei; the electron emits energy when it experiences the strong electric field near a nucleus (bremsstrahlung).
FIGURE 7.10 X-ray emission spectrum from Pd with electron beam (EB) bombardment.
The other is characteristic radiation (x-ray fluorescence): when the incident electron energy is high enough, it can cause inner-shell excitation followed by the emission of characteristic lines. A typical x-ray spectrum generated by EB bombardment is shown in Figure 7.10 [19]; it consists of sharp, high-intensity characteristic lines and broad, low-intensity bremsstrahlung. Although it is primarily the characteristic radiation that is used for x-ray lithography, the influence of the continuous radiation on the exposure dose cannot be neglected. The energy of the characteristic line is determined by the material used as the target. A Gaines-type x-ray source is shown in Figure 7.11 [20]. This source has an inverted-cone geometry, providing a large surface area with a minimum (and symmetric) projected spot. At the same time, the inverted cone acts as an excellent black-body absorber for electrons. The cathode of the electron gun is ring-shaped and masked from the view of the target because it is necessary to prevent evaporated cathode material from being deposited on the target. The target is cooled by high-velocity water flow.
FIGURE 7.11 Schematic view of electron gun and target Gaines-type assembly for x-ray generation.
7.5.2 Synchrotron Orbit Radiation
Synchrotron orbit radiation (SOR) is a very intense and well-collimated x-ray source. It has received attention as a source for x-ray lithography since the report by Spiller et al. [21]. Synchrotron radiation is emitted when a relativistic electron experiences an acceleration perpendicular to its direction of motion. The characteristics of SOR are summarized as follows [21]:
1. The radiation is a broad continuum, spanning the infrared through the x-ray range
2. The intensity of the flux is several orders of magnitude larger than that of conventional EB bombardment sources
3. The radiation is vertically collimated, and its divergence is small (a few milliradians)
4. The radiation is horizontally polarized in the orbital plane (the electron trajectory plane)
5. The source is clean, operating in a high vacuum
6. The radiation can be thought of as pulsed because bursts of radiation are seen as the electron bunches circulate.
Characteristics 2 and 3 are exploited in x-ray lithography. The high intensity of the x-ray radiation can reduce the exposure time. The small divergence of SOR essentially eliminates the problem of geometrical distortion that imposes severe constraints on mask-to-wafer positioning with conventional x-ray sources, as discussed above. The spectral distribution and intensity of synchrotron radiation depend on the electron energy, the magnetic field, and the orbital radius in the deflection magnet. As described in Section 7.3 on the selection of x-ray wavelength, the wavelength region for x-ray lithography is determined by the mask contrast and the mask substrate. Therefore, the electron energy and magnetic field of the SOR should be optimized to provide a desirable spectral distribution and intensity for x-ray lithography. Although SOR is a collimated beam and the effective source size is small, the x-ray beam is rectangular or slit-like in shape at some distance from the source. The emitted radiation is horizontally uniform but very nonuniform vertically. To obtain a sufficient exposure area, several methods have been reported [22]: moving the wafer together with the mask during exposure, oscillating a mirror that scans the reflected light vertically, and oscillating the electron beam in the storage ring. Other disadvantages of synchrotron radiation are its large physical dimensions and high cost. Although several attempts to design compact SOR sources have been reported to reduce the construction cost [22], a system still requires on the order of a billion dollars. The cost per beam port can be reduced by using a multiport system, which might be cost-competitive with optical lithography. However, at least two SOR sources are necessary in case of shutdown.
7.5.3 Plasma X-ray Sources Several attempts have been made to obtain high-intensity x-ray emission from extremely high-energy plasmas. Devices capable of producing such plasmas rely on the ability to deliver energy to a target more rapidly than it can be carried away by loss processes. Several devices capable of producing dense, high-temperature plasmas by electric discharge and by laser pulse irradiation have been reported. Economou and Flanders [23] reviewed the gas-puff configuration reported by Stalling et al. [24], which is shown in Figure 7.12. This consists of a fast valve and a supersonic nozzle.
FIGURE 7.12 Schematic representation of the gas-puff configuration. The fast-acting gas release valve and the shaped nozzle form a cylinder of gas. A discharge current through the gas causes the cylinder to collapse, forming the plasma x-ray source.
When the valve is fired, the gas expands through the nozzle and forms a hollow cylinder at the nozzle exit. A high, pulsed current is then driven through the gas cylinder, causing the gas to ionize. The magnetic pressure induced by the current through the resulting plasma cylinder causes it to collapse onto the axis, forming a dense, hot plasma that is a strong x-ray source. For a laser plasma, either an infrared or an ultraviolet laser is used, with pulses varying from 50 ps to 10 ns. The beam focused on the target (10¹⁴ W/cm² is needed) creates a plasma of high enough temperature to produce black-body radiation [25]. The conversion efficiency from laser energy to x-ray photons is higher than that of electron-impact excitation, but the conversion efficiency from electrical energy to laser light is low. The advantage of this approach is that the power supply can be positioned away from the aligner, preventing electromagnetic interference.
7.6 X-ray Masks 7.6.1 Mask Fabrication One of the most difficult technologies in x-ray lithography is mask fabrication. In proximity x-ray lithography, the production, inspection, and repair of x-ray masks are the most problematical aspects. X-ray masks consist of a thin membrane as a substrate, x-ray absorber patterns, and a frame to support the membrane. The x-ray mask uses a very thin membrane as the substrate instead of the glass substrate used for photomasks because no material is highly transparent to x-rays. Materials currently investigated as membranes include SiC, SiN, and Si. The x-ray absorber materials, which define the circuit pattern on the membrane, are mostly Au, Ta, and W. The absorber patterns are defined using resist patterns fabricated by EB lithography.
Examples of x-ray mask fabrication schemes, an Au additive process and a Ta subtractive process, are shown in Figure 7.13. The additive process refers to plating the x-ray absorber onto the resist-patterned membrane, and the subtractive process refers to etching the x-ray absorber using the resist patterns as an etching mask.
FIGURE 7.13 X-ray mask fabrication processes: (a) additive process, (b) subtractive process. The glass frame and membrane (nominally 1 μm thick) must be flat to submicrometer specifications in order to be useful in high-resolution proximity printing. In reproducible laboratory processing, 3-cm diameter masks have been made that are flat to 250 nm or better, enabling gaps below 5 μm. (From Smith, H.I., Schattenburg, M.L., Hector, S.D., Ferrera, J., Moon, E.E., Yang, I.Y., and Burkhardt, M., Microelectron. Eng., 32, 143, 1993.)
X-ray mask fabrication by the additive method includes deposition of the membrane film on a silicon wafer; back-etching the silicon to the membrane film; glass frame attachment; deposition of Cr as a plating base; resist coating and pattern formation by EB lithography; Au plating (the additive step); and resist removal. In the subtractive method [26], after deposition of a SiN film for the membrane and a Ta film for the absorber, resist patterns are formed on the Ta film by EB lithography. The patterns are first transferred to SiO2, and the SiO2 patterns are then transferred to the Ta film by dry etching. Finally, the silicon substrate is etched from the backside to the membrane. It is difficult to relax the stress of the membrane in both the additive and the subtractive processes. Optical lithography has extended its resolution capability down to 90 nm, and the use of the ArF excimer laser (193 nm) as a light source together with wavefront engineering may have the potential of resolution near the 60-nm level. Therefore, the issues of x-ray masks for sub-100-nm lithography, i.e., nanolithography, are especially important. X-ray lithography in this realm requires absorber patterns near 0.1 μm to be fabricated precisely at 1:1 dimensions on the mask membrane. An x-ray wavelength of ~1 nm is used, as discussed earlier. The difference in absorption coefficient of the materials on the mask provides the image contrast. To obtain appropriate mask contrast, the absorber thickness must be 0.5–1 μm, as described before. These structures become more difficult to fabricate as the minimum feature size becomes smaller because of the increasing aspect ratio. In addition, these precise, high-aspect-ratio patterns must be maintained with low distortion on the thin membranes. However, the aspect-ratio problem can be alleviated, to some extent, if one can use a thinner absorber film and exploit a phase-shift effect, because it has been reported that a thinner absorber (0.3–0.35 μm) can improve the image quality by letting some of the x-ray radiation pass through, in the same manner as a leaky-chrome optical phase-shift mask [27,28]. These masks are remarkably robust mechanically; they can, for example, withstand 1 atm of pressure [5]. They are, however, subject to distortion by radiation damage, by the supporting frame, and by the stress in the absorber film. The first of these problems appears negligible for SiC and Si masks, but it has not yet been fully solved for SiNx. The issue of absorber stress is the most critical one, and it has yet to be adequately addressed in a way that is compatible with high-volume manufacturing, although stress-free masks have been made. Smith et al. [5] discuss these issues in depth and give additional references. 7.6.2 Mask Patterning Absorber patterns are usually made by EB lithography, although a variety of techniques have been proposed and investigated, including photolithography, interferometric lithography, x-ray lithography (i.e., mask replication), and ion-beam lithography. However, most of the EB exposure machines currently used are unable to meet the requirements for the very accurate pattern size and beam placement needed for pattern features below 0.2 μm. In the subtractive method, resist pattern formation must be carried out on x-ray absorber materials of high atomic number, such as Ta and W, which usually show a higher backscattering effect than materials of low atomic number. The backscattering causes electron energy deposition in unwanted areas (the proximity effect described in Chapter 6), resulting in pattern size variation. The proximity correction required during EB exposure makes the EB exposure more complicated. With the additive approach, a very thin plating base is patterned, minimizing the backscattering problem.
Care must be taken, however, to avoid pinholes in this film. A second important problem associated with e-beam patterning arises from e-beam scan distortion and stitching errors. Conventional e-beam systems do not have feedback to maintain precise positioning of the beam relative to the interferometrically controlled stage. Research is currently in progress to address this problem [5]. X-ray lithography has been shown to be sufficiently well controlled and capable of producing replicas from a master mask with adequate fidelity; therefore, it is feasible
for the original pattern generation using EBs to be carried out with greater precision (and at correspondingly higher cost) than for optical masks without sacrificing cost effectiveness for the overall process. 7.6.3 Mask Alignment Although Taniguchi et al. [29] demonstrated several years ago a misalignment detectivity of ~5 nm using an interferometric scheme, such techniques have not achieved 3σ alignments close to their detectivity. As with any alignment procedure, performance with real-world wafers is inferior to that obtained under carefully controlled conditions. The difficulties of achieving alignment of successive layers to within 70 Å (25% of a 30-nm critical feature size) are formidable with any technique. Electron beam lithography (the nanometer-scale application of which is discussed in Chapter 15) has some advantage in that, because it is a direct-write process, the precision of detection of alignment marks is, in general, not substantially worse than the resolution itself. X-ray lithography must rely on optical techniques, and it is a major design challenge to provide a set of signals whose fidelity is not unacceptably degraded by the various reflective and scattering effects present in integrated circuit wafers.
7.7 X-ray Resist Materials Requirements for x-ray resists depend strongly on the x-ray source and the lithographic process. A variety of multilayer resist systems, as well as the conventional single-layer resist process, has been extensively studied [30]. A simple single-layer resist is desirable for practical use of x-ray lithography, to be compatible with the existing processes for optical lithography. A single-layer resist process requires a resist with high resolution as well as high dry-etch resistance. 7.7.1 Factors Determining Sensitivity of X-ray Resists The absorption of a photon is the first step of the photochemical or radiation chemical reactions in the resist film. X-ray energy absorption is given by Beer's law:
I = I0 exp(−μm ρ l)    (7.5)
where I0 is the intensity of the incident x-rays, I is the intensity of the x-rays after penetration through a thickness l of a homogeneous material having a mass absorption coefficient μm for x-rays, and ρ is the bulk density of the material. The mass absorption coefficient for a polymer, μmp, is therefore given by
μmp = (Σi Ai μmi) / (Σi Ai)    (7.6)
where Ai and μmi are the atomic weights and mass absorption coefficients of the constituent atoms, respectively. The percentage of x-rays absorbed in a polymer can be calculated from the relation
% absorbed = (I0 − I)/I0 = 1 − exp(−μmp ρ l)    (7.7)
For these calculations, values for μ_m in Table 7.1 [30] can be used. For example, the absorption fractions of x-ray energy by the high resolution EB resist PMMA with 1-μm
TABLE 7.1
Mass Absorption Coefficients of Selected Elements, μ_m (cm²/g)

Element  Z    Pd 4.36 Å   Rh 4.60 Å   Mo 5.41 Å   Si 7.13 Å   Al 8.34 Å   Cu 13.36 Å   C 44.70 Å
C        6    100         116         184         402         627         2714         2373
N        7    155         180         264         622         970         4022         3903
O        8    227         264         416         908         1415        5601         6044
F        9    323         376         594         1298        2022        6941         8730
Si       14   1149        1337        2115        279         423         1959         36980
P        15   1400        1630        2579        371         564         2405         41280
S        16   1697        1975        232         483         733         3079         47940
Cl       17   2013        197         301         628         953         3596         50760
Br       35   1500        1730        2649        3680        1456        3101         32550
Fe       26   630         726         1118        2330        3536        10690        13300
Sn       50   675         772         1160        2318        3437        9623         6332
Tl       81   1083        1213        1032        1904        2697        6276         13030

Most absorbing elements:  Pd: Cl, S, Br, P, Si, heavy atoms;  Rh: S, Br, P, Si, heavy atoms;  Mo: Br, P, Si, heavy atoms, F;  Si: Br, heavy atoms, F, O, Cl;  Al: heavy atoms, F, Br, O, N;  Cu: heavy atoms, F, O, N, Cl;  C: Cl, S, P, Si, heavy atoms.
Least absorbing elements: Pd: C, N, O;  Rh: C, N, Cl;  Mo: C, S, N;  Si: Si, P, C;  Al: Si, P, C;  Cu: Si, P, C;  C: C, N, O.

Source: From Taylor, G. N., Coquin, G. A., and Somek, S. 1977. Polym. Eng. Sci., 17: 420.
thickness are only 3.1% and 1.7% for Mo (5.41 Å) and Pd (4.36 Å) x-rays, respectively. Improvement in the sensitivity of x-ray resists is not an easy task as a result of the low energy deposited by x-rays in the resist film. As can be seen in Table 7.1, the mass absorption coefficients for halogen atoms and metals have larger values for 4–15 Å x-rays. Therefore, incorporation of these atoms into polymers is an effective approach to increasing x-ray absorption. Because the Cl atom strongly absorbs Pd (4.36 Å) x-rays, Cl-containing acrylate polymers such as poly(2,3-dichloro-1-propylacrylate) (DCPA) were reported by Bell Laboratories workers [31]. The absorption fraction for DCPA is, at most, 9.9% for Pd (4.36 Å) x-rays. On the other hand, the absorption fractions of 1-μm-thick AZ-1350J photoresist are 70% and 40% for i-line (365 nm) and g-line (436 nm), respectively. More than 90% of the x-ray energy penetrates through the resist film, whereas in optical lithography about half of the incident light can be utilized. Sensitivity is generally described in terms of incident radiation energy rather than absorbed energy. The sensitivity of diazonaphthoquinone-novolak photoresist is about 100 mJ/cm². This means that an x-ray resist should be more than five times as sensitive as a conventional photoresist as far as the energy absorbed in the resist film is concerned. Photoelectrons and Auger electrons produced following absorption of x-rays in a resist film cause the chemical changes that lead to differential solubility behavior. Therefore, the chemical reactions induced by x-ray irradiation are similar to those induced by EB irradiation. Most EB resists can be used as x-ray resists. In addition, x-ray sensitivity shows a linear relation with EB sensitivity, as shown in Figure 7.14 [32]. 7.7.2 Trend in X-ray Resists Because electron sensitivity shows good correlation with x-ray sensitivity, positive-working resists using radiation-induced chain-scission-type polymers such as PMMA and negative-working resists using cross-linking-type polymers can be used. However, to achieve a drastic improvement in sensitivity, the recent trend in x-ray resists has been the use of a photo-induced chain reaction system, as described under chemical amplification [33] in the chapter on the chemistry of photoresists (Chapter 10). The chemical amplification system utilizes strong acids produced by photodecomposition of an acid generator to catalyze the reaction of acid-sensitive groups either in the polymer backbone or on the side chain. Because the chemical amplification system produces a drastic change in solubility in aqueous base developer, it shows high resolution capability. Device fabrications using x-ray lithography have been demonstrated using chemical amplification resists
FIGURE 7.14 Relation of x-ray sensitivity (SR dose, mJ/cm²) with electron-beam sensitivity (EB dose, μC/cm²) for various positive, negative, and chemically amplified resists.
[26,34–36]. Generally, a deprotection reaction is used for positive resists, and an acid-hardening reaction of melamine derivatives is used for negative ones. One of the disadvantages of this type of resist is acid diffusion into unexposed areas. The range of the diffusion depends on the process conditions, especially the post-exposure baking temperature, which has been investigated in detail [34] (see also Chapter 8).
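As a numerical illustration of Equation 7.5 through Equation 7.7, the short sketch below estimates the absorbed fraction for a 1-μm PMMA film using the element values from Table 7.1. The PMMA composition (C5H8O2 repeat unit), its assumed bulk density of 1.19 g/cm³, and the neglect of hydrogen's small contribution are assumptions made here for illustration, not values taken from the text.

```python
# Sketch: x-ray absorption in a resist film via Equations 7.5-7.7.
# Assumptions: PMMA repeat unit C5H8O2, density 1.19 g/cm^3, hydrogen neglected.
import math

MU_M = {  # mass absorption coefficients (cm^2/g) taken from Table 7.1
    "Pd 4.36 A": {"C": 100, "O": 227},
    "Mo 5.41 A": {"C": 184, "O": 416},
}
ATOMIC_WEIGHT = {"C": 12.011, "O": 15.999, "H": 1.008}
PMMA = {"C": 5, "O": 2, "H": 8}     # atoms per monomer unit
DENSITY = 1.19                      # g/cm^3 (assumed)
THICKNESS = 1.0e-4                  # 1 um expressed in cm

for line, mu in MU_M.items():
    total_weight = sum(n * ATOMIC_WEIGHT[el] for el, n in PMMA.items())
    # Equation 7.6: weight-averaged mass absorption coefficient of the polymer
    mu_mp = sum(n * ATOMIC_WEIGHT[el] * mu.get(el, 0.0)
                for el, n in PMMA.items()) / total_weight
    # Equation 7.7: fraction of incident x-ray energy absorbed in the film
    absorbed = 1.0 - math.exp(-mu_mp * DENSITY * THICKNESS)
    print(f"{line}: mu_mp ~ {mu_mp:.0f} cm^2/g, absorbed ~ {100 * absorbed:.1f}%")
```

With these assumptions the sketch gives roughly 1.6% absorption for Pd and 2.9% for Mo x-rays, consistent with the 1.7% and 3.1% figures quoted above; the small differences come from the assumed density and the neglected hydrogen contribution.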
7.8 X-ray Projection Systems 7.8.1 Mirror-Type Reduction Projection Systems Workers at AT&T Bell Laboratories demonstrated an x-ray reduction projection exposure system based on a Schwarzschild reflection-type 20:1 reduction in 1990 (Figure 7.15) [37], and they achieved 100 nm line and space patterns. Since then, much attention has been focused on x-ray lithography again, although reduction projection systems had already been reported by workers at Lawrence Livermore National Laboratory [38] and Nippon Telegraph and Telephone (NTT) [39]. The x-ray reduction projection system can be accepted as an extension of the projection systems (steppers) used in present-day photolithography. The minimum feature size on the mask can be several times larger than that for proximity printing. The difficulty in an x-ray reduction system is the fabrication of the mirrors. Aspherical multilayer mirrors are necessary to reduce the number of mirrors, and the thickness of each layer is only a few tens of angstroms. Because the wavefront error of these mirrors is required to be less than λ/14, extremely precise thickness control is needed, especially for shorter wavelengths. A multilayer mirror is prepared by periodic vapor deposition of dielectrics with different refractive indices. The pitch of the dielectrics determines the wavelength of maximum reflectivity, and the accuracy of the pitch determines the reflectivity.
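The "few tens of angstroms" layer thickness follows directly from the condition that reflections from successive layer pairs add in phase. The sketch below applies the first-order Bragg relation, ignoring refraction corrections, at the two illumination wavelengths noted in Figure 7.15; it is an illustrative estimate only, not a design value from the text.

```python
# Sketch: first-order Bragg condition for a multilayer mirror,
# d = lambda / (2*cos(theta)), theta measured from the surface normal.
# Refraction corrections inside the stack are ignored in this estimate.
import math

def multilayer_period_nm(wavelength_nm, angle_from_normal_deg=0.0):
    """Bilayer period giving peak reflectivity at the given wavelength."""
    return wavelength_nm / (2.0 * math.cos(math.radians(angle_from_normal_deg)))

for wl in (36.0, 14.0):   # wavelengths noted in Figure 7.15
    d = multilayer_period_nm(wl)
    print(f"lambda = {wl:4.1f} nm -> bilayer period ~ {d:.1f} nm "
          f"(individual layers of a few tens of angstroms)")
```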
FIGURE 7.15 Schematic diagram showing the Schwarzschild objective used with an eccentric aperture and off-axis illumination (incident synchrotron radiation at λ = 36 and 14 nm; transmission mask, primary and secondary mirrors, and resist-covered wafer).
The requirements for resists for the reduction projection system are different because of the different exposure wavelength. The use of wavelengths shorter than 10 nm is limited by the present status of multilayer mirror fabrication. Most materials have a high absorption coefficient in this wavelength region. The extinction of the x-rays along the film thickness causes severe problems for image formation at around 10 nm wavelengths. The workers of AT&T Bell Laboratories demonstrated pattern formation with a three-layer resist process using PMMA as an imaging layer. Smith et al. [5] have described some interesting concepts involving the use of Fresnel zone plates for projection x-ray nanolithography; in one manifestation, the exposure requires no mask.
7.9 Conclusions X-ray lithography carries the potential for resolution at the limit of conceivable device structures that depend on the bulk properties of matter (i.e., larger than the scale of individual molecules or small aggregates). However, the advantages described above sometimes give rise to disadvantages. Because the refractive index of materials for x-rays is near unity (minimizing unwanted scattering and interference effects), optical elements for x-rays are limited to mirrors and Fresnel zone plates. It is, therefore, difficult to collimate or focus the x-rays. Because the transmittance of the resist film is high (giving rise to exposures that are very vertically uniform and that allow high aspect ratios), only a small percentage of the energy is deposited in the resist film. This high transmittance requires extremely high resist sensitivity to avoid x-ray damage to the devices and to maintain adequate throughput. Proximity printing requires a system to control the gap between mask and wafer. This gap also gives rise to such problems as penumbra and magnification error when the x-ray source size cannot be neglected. For the application of x-ray lithography to actual large-scale integration (LSI) fabrication, alignment technology must also be established. Above all, x-ray mask fabrication is the most difficult issue for the actual use of x-ray lithography. It is always difficult to predict the limit of the minimum feature size that can be fabricated with optical lithography. There is no doubt, however, that the obstacles for x-ray lithography described above must be overcome before the end of the optical lithography era. As always in a fundamentally commercial endeavor, economics will make the final determination of which technology is brought to fruition because it is the cost per device that is the primary driver for the continued miniaturization of integrated circuits. Although it seems rather likely that these problems will be solved, their solutions must be achieved at a cost lower than that offered by other technologies that may mature with larger features but whose costs may be steadily lowered by continuous, aggressive engineering efforts. For the next several years, the foregoing discussion should provide useful guidelines to allow the interested reader to monitor the research activity in this field.
References
1. E. Spears and H. I. Smith. 1972. Electron Lett., 8: 102; E. Spears and H. I. Smith. 1972. Solid State Technol., 15:7, 21.
2. B. Fay, L. Tai, and D. Alexander. 1985. Proc. SPIE, 537: 57.
3. H. I. Smith. 1974. Proc. IEEE, 62: 1361.
4. T. Terasawa and S. Okazaki. 1993. IEICE Trans. Electron., E76-C: 19.
5. H. I. Smith, M. L. Schattenburg, S. D. Hector, J. Ferrera, E. E. Moon, I. Y. Yang, and M. Burkhardt. 1993. Microelectron. Eng., 32: 143.
6. R. Feder, E. Spiller, and J. Topalian. 1977. J. Vac. Sci. Technol., 12: 1332; R. Feder, E. Spiller, and J. Topalian. 1977. Polym. Eng. Sci., 17: 385.
7. N. Atoda. 1984. Hoshasenkagaku (Radiation Chemistry), 19, 41; N. Atoda. 1994. Proceedings of International Conference on Advances Microelectronic Devices and Processing, p. 109.
8. T. Watanabe. 1996. Hoshasen to Genshi Bunshi (Radiation Effect on Atoms and Molecules), S. Shida, Ed., Tokyo: Kyoritsu Publications, p. 30.
9. D. P. Woodruff and T. A. Delchar. 1986. Modern Techniques of Surface Science, Cambridge: Cambridge University Press.
10. L. E. Ocola and F. Cerrina. 1993. J. Vac. Sci. Technol., B11: 2839.
11. K. Early, M. L. Schattenburg, and H. I. Smith. 1990. Microelectron. Eng., 11: 317.
12. P. Tischer and E. Hundt. 1987. Proc. Symp. 8th Electron Ion Beam Sci. Technol., 78:5, 444.
13. K. Murata. 1985. J. Appl. Phys., 57: 575; K. Murata, M. Tanaka, and H. Kawata. 1990. Optik, 84: 163.
14. L. E. Ocola and F. Cerrina. 1993. J. Vac. Sci. Technol., B11: 2839.
15. T. Ogawa, K. Mochiji, Y. Soda, and T. Kimura. 1989. Jpn. J. Appl. Phys., 28: 2070.
16. S. D. Hector, M. L. Schattenburg, E. H. Anderson, W. Chu, V. V. Wong, and H. I. Smith. 1992. J. Vac. Sci. Technol., B10: 3164.
17. J. Z. Y. Guo, F. Cerrina, E. Difabrizio, L. Luciani, M. Gentili, and D. Gerold. 1992. J. Vac. Sci. Technol., B10: 3150.
18. W. Chu, H. I. Smith, and M. L. Schattenburg. 1991. Appl. Phys. Lett., 59: 1641.
19. H. Leslie, A. Neukermans, T. Simon, and J. Foster. 1983. J. Vac. Sci. Technol., B1: 1251.
20. J. L. Gaines and R. A. Hansen. 1975. Nucl. Instrum. Method, 126: 99.
21. E. Spiller, D. E. Eastman, R. Feder, W. D. Grobman, W. Gudat, and J. Topalion. 1976. J. Appl. Phys., 47: 5450.
22. A. Heuberger. 1986. Solid State Technol., 29:2, 93; M. N. Wilson, A. I. Smith, V. C. Kempson, A. L. Purvis, R. J. Anderson, M. C. Townsend, A. R. Jorden, D. E. Andrew, V. P. Suller, and M. W. Poole. 1990. Jpn. J. Appl. Phys., 29: 2620.
23. N. P. Economou and D. C. Flander. 1981. J. Vac. Sci. Technol., 19: 868.
24. C. Stalling, K. Childers, I. Roth, and R. Schneider. 1979. Appl. Phys. Lett., 35: 524.
25. A. L. Hoffman, G. F. Albrecht, and E. A. Crawford. 1985. J. Vac. Sci. Technol., B3: 258.
26. K. Deguchi. 1993. J. Photopolym. Sci. Technol., 4: 445.
27. Y. Somemura, K. Deguchi, K. Miyoshi, and T. Matsuda. 1992. Jpn. J. Appl. Phys., 31: 4221; Y. Somemura and K. Deguchi. 1992. Jpn. J. Appl. Phys., 31: 938.
28. J. Xiao, M. Kahn, R. Nachman, J. Wallance, Z. Chen, and F. Cerrina. 1994. J. Vac. Sci. Technol., B12: 4038.
29. S. Ishihara, M. Kanai, A. Une, and M. Suzuki. 1989. J. Vac. Sci. Technol., B6: 1652.
30. G. N. Taylor. 1980. Solid State Technol., 23:5, 73; G. N. Taylor. 1984. Solid State Technol., 27:6, 124.
31. G. N. Taylor, G. A. Coquin, and S. Somek. 1977. Polym. Eng. Sci., 17: 420.
32. K. Deguchi. 1994. ULSI Lithography No Kakushin (Innovation of ULSI Lithography), p. 245, Science Forum.
33. H. Ito and C. G. Willson. 1983. "Chemical amplification in the design of dry developing resist materials," Polym. Eng. Sci., 23: 1012; H. Ito and C. G. Willson. 1984. "Polymers in electronics," ACS Symp. Ser., 242: 11.
34. J. Nakamura, H. Ban, K. Deguchi, and A. Tanaka. 1991. Jpn. J. Appl. Phys., 30: 2619.
35. R. DellaGardia, C. Wasik, D. Puisto, R. Fair, L. Liebman, J. Rocque, S. Nash et al. 1995. Proc. SPIE, 2437: 112.
36. K. Fujii, T. Yoshihara, Y. Tanaka, K. Suzuki, T. Nakajima, T. Miyatake, E. Orita, and K. Ito. 1994. J. Vac. Sci. Technol., B12: 3949.
37. J. E. Bjorkholm, J. Borkor, L. Eicher, R. R. Freeman, J. Gregus, T. E. Jewell, W. M. Mansfield et al. 1990. J. Vac. Sci. Technol., B8: 1509.
38. A. M. Hawryluk and L. G. Seppala. 1988. J. Vac. Sci. Technol., B6: 2162.
39. H. Kinoshita, K. Kurihara, Y. Ishii, and Y. Torii. 1989. J. Vac. Sci. Technol., B7: 1648.
8 EUV Lithography Stefan Wurm and Charles Gwyn
CONTENTS
8.1 Introduction 384
    8.1.1 Fundamental Relationships 385
    8.1.2 Why EUV Lithography? 385
    8.1.3 Technology Delays 386
    8.1.4 Development Structure for EUVL 387
    8.1.5 Other EUV Programs 388
    8.1.6 Technology Description 388
8.2 EUV Optics and EUV Multilayers 390
    8.2.1 Finishing Tolerances 390
    8.2.2 EUV Multilayers 393
        8.2.2.1 Deposition Systems 394
        8.2.2.2 Uniform and Graded Multilayer Coatings 395
        8.2.2.3 Stability of EUV Mirrors 396
        8.2.2.4 Engineered Multilayers 397
    8.2.3 Remaining Challenges and Issues 398
8.3 EUV Sources 399
    8.3.1 Commercial EUV Source Requirements 400
    8.3.2 EUV Discharge Produced Plasma Sources 401
        8.3.2.1 Z-pinch 403
        8.3.2.2 HCT Z-pinch 404
        8.3.2.3 Capillary Discharge 405
        8.3.2.4 Plasma Focus 405
        8.3.2.5 Other Designs 406
        8.3.2.6 DPP Source System Aspects 406
    8.3.3 EUV Laser Produced Plasma Sources 407
        8.3.3.1 Spray Jet Sources 408
        8.3.3.2 Filament or Liquid Jet Sources 409
        8.3.3.3 Droplet Sources 410
        8.3.3.4 LPP Source System Aspects 410
    8.3.4 Synchrotron Radiation Sources 411
    8.3.5 EUV Source Metrology 411
    8.3.6 Commercialization Status and Production Needs 413
8.4 EUV Masks 414
    8.4.1 EUV Mask Substrates 415
    8.4.2 EUV Mask Blanks 416
    8.4.3 EUV Mask Blank Defect Repair 421
    8.4.4 EUV Mask Patterning 426
    8.4.5 Mask Costs 427
    8.4.6 EUV Mask Commercialization Status 429
    8.4.7 EUV Mask Production Needs and Mask Cost 429
8.5 EUV Resists 430
    8.5.1 Commercialization Status 431
    8.5.2 Production Needs 431
8.6 EUV Exposure Tool Development 432
    8.6.1 The ETS Alpha Tool 433
        8.6.1.1 System Overview 434
        8.6.1.2 ETS Lithographic Setup and Qualification 436
        8.6.1.3 ETS System Learning 442
8.7 Source Power Requirements and System Tradeoffs 443
8.8 Remaining Challenges 447
8.9 Conclusions 448
8.10 Postscript 449
    8.10.1 EUV Optics and EUV Multilayers 450
    8.10.2 EUV Sources 451
    8.10.3 EUV Masks 452
    8.10.4 EUV Resists 454
    8.10.5 EUV Exposure Tool Development 454
    8.10.6 Outlook 455
References 456
8.1 Introduction For nearly three decades, the number of transistors contained on an integrated circuit (IC) has grown exponentially following Moore's law, doubling on average every 18 months. With each new technology generation, lithography has become an even more important key technology driver for the semiconductor industry because smaller feature sizes and tighter overlay requirements cause an increase in lithography tool costs relative to the total tool costs for an IC manufacturing facility. These technology needs and manufacturing constraints have been documented in the Semiconductor Industry Association (SIA) international technology roadmap for semiconductors (ITRS) [1]. The ITRS represents an industry-wide consensus on the estimates of the technology requirements needed to support advanced microelectronics trends to reduce minimum feature sizes, increase functionality, increase speed, and reduce the cost per function, i.e., following Moore's law. Historically, the continued scaling has reduced the cost per function ~25% per year and supported a market growth of ~15% per year. Traditionally, the ITRS describes the technology generation by "node," which refers to the half-pitch dimensional spacing, where the pitch is measured between the centerlines of parallel lines. The node does not necessarily describe the minimum feature size or the gate length for a specific technology, e.g., the microprocessor gate length may be 60%–70% of the node dimension.
8.1.1 Fundamental Relationships The two fundamental relationships describing a lithography imaging system, resolution (RES) and depth of focus (DOF), are given by RES = k1 λ/NA
(8.1)
and
DOF = ±k2 λ/NA²
(8.2)
where λ is the wavelength of the radiation used for imaging and NA is the numerical aperture of the camera. The parameters k1 and k2 are empirically determined and correspond to those values that yield the desired critical dimension (CD) control within an acceptable IC manufacturing process window. Values for k1 and k2 of 0.6 and greater have been used in high-volume manufacturing. However, a given lithographic technology can be extended further to smaller values of k1 by using resolution enhancement techniques (RET) and optimizing the IC fabrication process at the cost of tighter process control [2,3]. Setting k1 and k2 equal to 0.5 corresponds to the theoretical values (Rayleigh criteria) for RES and DOF. Equation 8.1 and Equation 8.2 demonstrate that improvements in RES achieved by incremental decreases in wavelength and increases in NA result in a decrease in DOF and a corresponding decrease in the process window. Extreme ultraviolet (EUV) lithography, or EUVL, extends optical lithography by using much shorter wavelengths rather than increasing NA to achieve better RES. As a result, it is possible to simultaneously achieve a RES of less than 50 nm with a DOF of 1 μm or larger by operating with a wavelength of 20 nm or less using a camera having an NA of 0.1. 8.1.2 Why EUV Lithography? Extensions of conventional 248 nm optical lithography have dominated and are expected to continue to dominate semiconductor device manufacturing and support the SIA roadmap to 90 nm. These extensions rely on incremental increases in the system optical NA and RET such as off-axis illumination, phase-shift masks, and optical proximity correction (OPC). Whereas 157 nm lithography initially emerged as the technology of choice for 70 and 50 nm, difficulties with optics materials and resists have delayed the 157 nm technology, and large NA 193 nm tools (perhaps in an immersion configuration) are expected to be initially used for many 45 nm applications. However, at 45 nm, a next-generation lithography (NGL) technology will be required for printing 45 nm half-pitch features. EUVL, using 10–14 nm EUV light, is the most promising NGL technology. This technology builds on the industrial optical experience, uses an EUV light source, and although initially targeted for introduction at 70 nm, is expected to support IC fabrication at ~45 nm; scaling is expected to support several technology generations down to below 22 nm without system throughput loss. One of the major reasons for pursuing EUV lithography is the similarity with conventional optical lithography. For example, the optical process follows the Rayleigh laws; the systems use reduction optics; and the lithography process builds on the experience in the wafer fabrication facility (process margin, similar resists, etc.). Following the standard optics laws, the large k1 supports the technology extendibility with conventional OPC and phase shifting to smaller technology nodes and provides for aberration
compensation, etc. In addition, tool operation uses step and scan printing similar to today's lithography, and, in essence, the technology exploits the present knowledge base. However, there are some differences between optical and EUV lithography that present special challenges. The differences occur as a result of the very short wavelength light. For example, all materials absorb EUV radiation. This requires special reflecting optics with high precision finishes that are coated with quarter-wave uniform and graded Bragg reflectors. The reflecting optics are somewhat inefficient and result in approximately a 30% loss at each mirror, limiting the practical number of optics and requiring the use of aspheric surfaces. An even number of optics is required to accommodate scanning stages. Reflective masks with off-normal illumination are required instead of transmission masks. The lithography process must occur in a vacuum with precision environmental control to minimize contamination, oxidation, and loss of EUV by gaseous absorption. Finally, a plasma source is required to generate the short wavelength light, adding another set of challenges. These differences presented substantial challenges beyond the normal optical lithography system migration (I-line, 248 nm, 193 nm), and most of the early work on EUVL was focused on meeting these challenges. 8.1.3 Technology Delays In spite of the attractiveness of EUVL and the support by the Extreme Ultraviolet Limited Liability Company (EUVLLC), the technology has encountered several implementation delays. Some of the main reasons for the delays are listed below. 1. One of the major reasons for the delay was the fact that existing deep ultraviolet (DUV) technologies were extended far beyond the expectations in the mid 1990s when EUVL was first proposed. Even though EUVL was first proposed for introduction at the 100 nm node, DUV has been extended, and it is expected to be the main technology for 90 nm and will probably support the 70 nm and part of the 45 nm node with immersion technology extensions. Because the DUV extensions from 248 to 193, 157, and consideration of 126 nm (and recently immersion) were viewed as simple extensions of optical lithography, there was widespread industry support and a reluctance to divert resources to a new technology like EUVL. As the technology extensions progressed, the fabrication of optical materials became a showstopper for 126 and perhaps 157 nm lithography. It is estimated that the work on 157 nm lithography and the accompanying diversion of resources and funding caused at least a two year delay in EUVL development. 2. The EUVL program has always highlighted technology challenges and risks in addition to reporting technical progress. This gave EUV technology issues much higher visibility than comparable, and often even more severe, problems received in other NGL technologies. EUVL issues such as required optics finishing accuracy, mask phase defects, CD control and costs, the requirement of defect-free masks, high source power, and high cost of ownership (CoO) were considered technology showstoppers at intermediate points during the program. 3. Although the development of EUVL and other NGL technologies was started in the mid 1990s when industry and technology advancement interests were high, the subsequent economic downturn caused many companies to delay investments in advanced lithography technologies.
The economic fluctuations caused oscillations in the development schedule, and they tended to increase
costs. In addition, the projected high tool cost of ~$25M discouraged company commitments. Even though the cost was high by I-line and DUV standards, the projected cost was well in line with the International SEMATECH (ISMT) predictions for an NGL technology based on the increasing cost trends with technology node and time lines. 4. The industry support for NGL technologies, including EUVL, provided mixed messages to the equipment supplier industry. Even though each of the proposed NGL technologies had a strong sponsor, the levels of industry commitment varied. Various alliances were established between companies to support a specific technology, e.g., IBM and AT&T for electron projection lithography (EPL), Europe for ion projection lithography (IPL), and the EUVLLC for EUVL. The lack of an apparent strong consensus for the NGL technology caused equipment development companies and their suppliers to delay investments, leading to additional cascading delays.
8.1.4 Development Structure for EUVL During the initial research phase for advanced lithography, experiments are usually performed in major company, government, or university research laboratories. During this phase, the first principles of the technology are demonstrated, and some of the challenges are identified. The first papers proposing the use of EUV or soft x-ray radiation (wavelengths from 2 to 50 nm) for projection lithography were published in 1988 by groups from Lawrence Livermore National Laboratories (LLNL) [4] and Bell Laboratories [5] for all-reflective projection lithography systems (using multilayer-coated mirrors and reflection masks). In 1989, a group from NTT demonstrated projection imaging of 0.5 μm features [6], and the first demonstration of the technology's potential with nearly diffraction-limited imaging followed in 1990 with the printing of 0.05 μm features in a poly(methyl methacrylate) (PMMA) resist, using 13 nm radiation, by Bell Laboratories [7]. In 1991, Sandia National Laboratories (SNL) demonstrated the first EUVL imaging system using a compact laser-produced plasma source [8], and in 1996, it fabricated the first functioning device patterned with EUVL [9] using a microstepper developed in collaboration with Bell Laboratories. As the interest in soft x-ray projection lithography (SXPL) grew, the name of the technology was changed to EUVL in order to avoid confusion with proximity x-ray lithography. As lithography technology concepts begin to mature, much larger investments are required to further demonstrate the viability of a technology for commercial use. Traditionally, government investment has contributed to the early development with commercial lithography equipment companies assuming the commercialization role after most of the basic problems have been solved. International SEMATECH has estimated that an investment of around $1B is required to perform the basic research and development and to commercialize a new lithography technology. As the feature sizes for the technology decrease, the required investment increases. Because of the high development costs and the technology challenges previously noted, an alternative development and funding approach for EUVL was needed. In 1996, changes in the U.S. government funding priorities resulted in reduced support from the Department of Energy (DOE) for EUV research at LLNL and SNL, and a consortium of semiconductor manufacturers, the EUVLLC, was formed in 1997 to provide funding and direction for the commercialization of EUVL. The Intel-led EUVLLC consortium, including AMD, Motorola, Micron, Infineon, and IBM, contracted with the DOE
Virtual National Laboratory (VNL) that consisted of Lawrence Berkeley National Laboratory (LBNL), LLNL, and SNL in 1997 to develop the EUVL technology. The program goal was to facilitate the research, development, and engineering to enable the semiconductor equipment manufacturers (SEM) to provide production quantities of 100 nm EUV exposure tools for IC manufacturing by 2005 [10,11]. Funding was provided by the EUVLLC to the DOE VNL to perform the required research, engineering, and joint-development programs to support key EUV component technologies with industry partners. The EUVLLC members also invested resources within their own companies to support mask and resist development. 8.1.5 Other EUV Programs In addition to the EUVLLC program, ISMT started an EUVL program in 1998 to support mask modeling, development of a 0.3 NA set of optics, and microstepper enhancements. Prior to funding specific programs, ISMT conducted numerous workshops that examined, in detail, the various NGL technologies, and it ranked the critical issues associated with the technologies. In 1997, EUVL was ranked last out of four NGL technologies by the attendees. Based on the EUVL development program progress, the technology ranking moved to first place in 1999 as the most probable technology to be used for the 50 nm node [12]. In 1998, the European research program Extreme UV Concept Lithography Development System (EUCLIDES) was formed to evaluate EUVL as a viable NGL solution, to perform initial development for key technologies, and to perform a system architecture study [13]. The R&D focus was on mirror substrates, high-reflectivity multilayer coatings, vacuum stages, and a comparison of plasma and synchrotron EUV sources. The French PREUVE program was also initiated in 1999 with a focus on developing a 0.3 NA micro exposure tool (MET), EUV sources, optics, multilayers, mask and resist modeling, and defect metrology [14]. The Japanese Association of Super-Advanced Electronics Technologies (ASET) program was also established in 1998 [15]. One of the goals of the program was to develop the basic technologies for EUV lithography. The EUV portion of the ASET program focused on multilayer, mask, resist, and process development. The PREUVE and EUCLIDES programs have transitioned into the new European MEDEA+ program, and the Japanese ASET program is phasing into the new Extreme Ultraviolet Lithography System Development Association (EUVA) program [16,17]. 8.1.6 Technology Description Although many contributions have been made to the development of EUVL by the different programs, the emphasis in this chapter will focus on results from the EUVLLC program. Following a brief description of the technology and R&D status, the commercialization status and remaining production needs will be outlined for EUV sources, exposure tools, masks, and resists. Finally, some of the key remaining challenges for making EUVL a viable manufacturing technology will be summarized. A schematic for an EUVL system showing the optical path is shown in Figure 8.1.
System operation is as follows: (1) EUV radiation is generated by a 25–75 eV plasma formed by focusing a high power laser beam or an electric discharge on a xenon (Xe) gas, liquid, or solid target; the plasma emits visible and EUV radiation; (2) a condenser consisting of a multilayer coated collector and grazing incidence mirrors collects and shapes the EUV beam into an arc field 6 mm wide by 104 mm long to illuminate the reflective mask or reticle; (3) a low expansion reflective reticle is clamped to a scanning reticle stage that moves the mask across the illumination beam; (4) 4× reflective
FIGURE 8.1 Extreme ultraviolet lithography (EUVL) system schematic showing major components: plasma source, beam shaping optics, condenser mirror/illuminator, EUV reflective mask on a scanning reticle stage, projection optics, and wafer on a scanning wafer stage. (From Gwyn, C. W. et al., in Extreme Ultraviolet Lithography: A White Paper, Livermore, CA, 1999: Extreme Ultraviolet Limited Liability Company.)
reduction optics containing aspheric mirrors are used to de-magnify the mask image; and, finally, (5) the scanning wafer stage containing a wafer coated with EUV sensitive photoresist scans the wafer across the EUV beam in perfect synchronism and at one-fourth of the speed of the scanning reticle stage. To prevent the buildup of carbon on the reflective surfaces in the presence of EUV, the partial pressures of hydrocarbon-containing gases in the vacuum are controlled. Because the reflection of EUV from the reflective surfaces is less than 70%, the deposited EUV flux causes localized heating of the reticle and optical surfaces; this requires thermal management of critical surfaces. In addition, the velocity and position of the magnetically levitated stages must be controlled with nanometer precision. A complete description or summary of all aspects of EUVL lithography is beyond the scope of this chapter. Instead, the discussion will focus on the basic status of the technology and most of the remaining key challenge areas associated with the commercial implementation of EUV lithography. The next section provides a summary of the optics development, finishing status, and the status of Mo/Si multilayer development. Section 8.3 outlines the status of EUV source development with a focus on both laser produced and electric discharge sources. Section 8.4 describes the development of reflective EUV masks along with a qualitative discussion of the cost of EUV reflective masks relative to a 193 nm transmission mask. A brief summary of the status of extended DUV resists is contained in Section 8.5. In Section 8.6, the fabrication of a full field scanning, or alpha class, tool is described as implemented in the engineering test stand (ETS) by the EUVLLC program along with some imaging results. Some of the system implementation tradeoffs are discussed briefly in Section 8.7, outlining the impact of various assumptions on the required source power. Although some of the remaining commercialization challenges are mentioned for each of the major development areas, a summary of the remaining key challenges is provided in Section 8.8. Conclusions are summarized in Section 8.9. Although metrology is extremely important in developing EUV lithography, it is
beyond the scope of this chapter to discuss the subject in detail, and only brief discussions are provided in key areas.
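Before turning to the component technologies, a rough numeric check of Equation 8.1 and Equation 8.2 helps make the scaling argument concrete. The sketch below uses the Rayleigh values k1 = k2 = 0.5; the (wavelength, NA) pairs are illustrative choices for comparison, not specific tool parameters from the text.

```python
# Sketch: Rayleigh-criterion resolution and depth of focus (Equations 8.1, 8.2)
# with k1 = k2 = 0.5. The (wavelength, NA) pairs are illustrative only.
def res_nm(wavelength_nm, na, k1=0.5):
    return k1 * wavelength_nm / na

def dof_nm(wavelength_nm, na, k2=0.5):
    return k2 * wavelength_nm / na**2

for label, wl, na in (("ArF 193 nm, NA 0.75", 193.0, 0.75),
                      ("EUV 13.5 nm, NA 0.10", 13.5, 0.10)):
    print(f"{label}: RES ~ {res_nm(wl, na):.0f} nm, "
          f"DOF ~ +/-{dof_nm(wl, na) / 1000:.1f} um")
```

The low-NA EUV case yields roughly a comparable resolution to high-NA 193 nm imaging but with several times the depth of focus; pushing k1 below the Rayleigh value or raising the NA modestly (e.g., to 0.25) extends the resolution well below 50 nm, consistent with the discussion above.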
8.2 EUV Optics and EUV Multilayers The basic requirements for EUVL optics have been documented in previous white papers and publications [11,18–21]. Precision reflective projection optics with a wavefront accuracy of <λ/14 are required to provide adequate image quality. Because of the energy loss associated with each reflection, the number of optics used in a practical high throughput system is limited, and, to minimize aberrations, aspheric optical surfaces are required. A six-mirror system is usually assumed for a practical 0.25 NA system, although high-NA (0.5) systems using up to eight mirrors have been proposed [22]. For these systems, the shape of the surface, or figure, must be finished to <0.15 nm rms to maintain the λ/14 quality. In addition to figure, the atomic level finish or roughness must also be on the order of 0.1 nm rms. In order to reflect EUV light, the mirror substrates must be coated with distributed quarter-wave multilayer Bragg reflectors using materials having dissimilar complex indices of refraction [23]. The coatings must preserve figure and finish (or eventually improve them), and they must be stable over their operational lifetime of approximately five years. 8.2.1 Finishing Tolerances Tight optical tolerances and spatial positions of the optics in an operating system must be precisely maintained. This requires that the optical substrates be made of low thermal expansion materials (LTEM) such as Zerodur®* or ULE®† with near-zero coefficients of thermal expansion (CTE). For accurate mirror placement in metrology and finishing tools and for the final mounting in the projection optics, new kinematic mirror mounting methods have been developed [24,25]. During the manufacture of the aspheric optics, high quality figure and finish accuracies must be achieved. The finishing specifications for the aspheric optics shown in Table 8.1 are divided into three categories corresponding to spatial periods or dimensions. Imaging quality of optical systems is usually characterized by using Zernike polynomials to quantify deviations of the measured transmitted wavefront in the exit pupil of the system from an ideal spherical wavefront [26,27]. Fitting Zernike's polynomial expansion to actual measured wavefront data yields the Zernike coefficients, i.e., the weights of individual polynomial terms. Zernike coefficients 5–36 are called figure and quantify aberrations such as astigmatism, coma, three-leaf clover, and spherical aberration. Coefficients 37 and higher add up to what is called mid spatial frequency roughness (MSFR) that describes the smoothness over spatial periods of 1 μm–1 mm. As illustrated in Figure 8.2, MSFR causes small angle scattering where the scattered light remains in the image field. MSFR leads to a background illumination that is superimposed on the desired image that is called flare. Both the absolute level of flare and the non-uniformity of flare across the image field reduce image quality. The absolute flare level reduces the image contrast, which adversely limits the range of acceptable operating conditions or process window for performing lithography. And flare
* Zerodur® is a registered trademark of Schott Glaswerke GmbH, Germany.
† ULE® is a registered trademark of Corning, Inc., U.S.A.
TABLE 8.1
Nominal Specifications for Extreme Ultraviolet Lithography (EUVL) Projection Systems

                                            Maximum error specification (nm rms)
Error Term                                  4-Mirror Alpha       6-Mirror Beta      6-Mirror Production    Band limits for integrating the power spectral
                                            System (ETS),        System,            System,                density (PSD) of surface errors(a)
                                            NA = 0.1             NA = 0.25          NA = 0.25
Figure                                      0.25                 0.20               0.15                   (clear aperture)⁻¹ to 1 mm⁻¹
Mid-spatial frequency roughness (MSFR)      0.20                 0.15               0.10                   1 mm⁻¹ to 1 μm⁻¹
High-spatial frequency roughness (HSFR)     0.10                 0.10               0.10                   1 μm⁻¹ to 50 μm⁻¹
(a) The band limits depend on NA, and for NA = 0.25 systems the MSFR range becomes larger, making the MSFR requirement for production optics significantly more challenging than for an NA = 0.1 alpha system. Source: From Kürz, P. 2nd International Extreme Ultraviolet Lithography (EUVL) Symposium, September 30–October 02, 2003.
non-uniformity over the image field causes non-uniformity in the CD of printed features. Scattering at larger angles leads to light being scattered outside the image field. This loss of light because of wide-angle scattering is due to the high spatial frequency roughness (HSFR) component of the mirror finish. HSFR is associated with surface roughness at spatial periods of <1 μm. As indicated in Figure 8.2, the main effect of HSFR is a reduction of throughput. The figure specification is dictated by the requirement of high resolution imaging with low distortion across the image field, and in order to achieve diffraction-limited performance according to the Marechal criterion [26,28], the composite wavefront error (WFE) must be less than λ/14. For a four-mirror system like the ETS 0.1 NA alpha tool, this means that the composite WFE must be less than ~1 nm (≈13.4 nm/14). Assuming errors of individual mirrors are not correlated and all optical surfaces in the system contribute on average equally to the composite WFE, the maximum average figure error of a single mirror would then have to be <0.5 nm rms. However, because figure errors of the optics cause twice the error in the reflected wavefront, the maximum average
FIGURE 8.2 Angular distribution of small angle scattering into the image field [mid spatial frequency roughness (MSFR), producing short-, mid-, and long-range flare] and large angle scattering [high spatial frequency roughness (HSFR)] out of the image field [spatial frequency: f = sin(θ)/λ].
WFE of a single mirror in the four-mirror ETS optics system cannot exceed 0.25 nm rms. For a six-mirror 0.25 NA system, the same calculation would give a figure value of less than 0.20 nm rms per mirror. This figure quality will be sufficient for six-mirror EUV beta tools; however, for production tools targeting the 32 nm node, figure values <0.15 nm rms are needed to achieve the 10% CD control target [29]. The finishing process usually focuses on finishing the three spatial frequency regions separately, and it often involves using different finish or polishing processes. Usually, the optic is finished to obtain the required figure; then it is polished to reduce the MSFR and, finally, the HSFR. Finishing the optic in these three spatial regions is an iterative process, and, quite often, reducing the HSFR introduces error in the MSFR and figure and vice versa. Achieving all three specifications simultaneously requires a high level of polishing skill and accurate metrology. For each of the finish parameters, the relevant rms surface error can be obtained by integrating the 2-D power spectral density (PSD) of the surface errors over the associated band of spatial frequencies. For example, the effects of MSFR can be determined by integrating the PSD between the spatial frequencies of 1/mm and 1/μm. Simultaneously achieving the required figure, MSFR, and HSFR values becomes even more challenging for higher NA optical systems, i.e., for larger optics [30]. This is mainly due to the higher aspherical gradients of the larger optics. Because of the difficulty in simultaneously obtaining low MSFR and HSFR, the use of multilayer smoothing has been developed to reduce the HSFR after the MSFR has been reduced to an acceptable level. As shown in Section 8.4, defect smoothing or covering with properly deposited multilayer coatings can smooth roughness and cover defects of less than 100 nm spatial scale that easily fall into the HSFR spatial frequency range. A variety of EUV aspheric optics has been produced by a number of manufacturers, including ASML Optics (formerly Tinsley), Zeiss, Nikon, Canon, Tropel, and others. Parent and clear aperture optics have been fabricated with sizes up to 209 mm with a 9 μm aspheric departure. During optics manufacturing, only the clear aperture is fabricated, with a freeboard buffer region surrounding the optic. The continuous improvement in manufacturing quality is shown in Figure 8.3, where the average figure, MSFR, and HSFR values have been plotted for several sets of EUV projection optics boxes (POB). Continuous improvement in the metrology of at-wavelength and visible-light testing has been made to support more accurate finishing of the optics. Those metrologies are needed for the qualification of individual mirror elements [31–33] as well as for the alignment of assembled EUV optical systems [24,34–37]. In addition to extending the
FIGURE 8.3 Demonstrated improvement in optics finishing: average figure, MSFR, and HSFR residual errors (nm rms) for successive optics sets. ETS POB sets 1 and 2 were 4×, 0.1 NA, four-mirror systems; MET POB sets 1 and 2 were 5×, 0.25 NA, Schwarzschild systems.
capabilities of interferometry techniques currently used in the optics industry that rely on using compensation optics to measure optical aspheres [19,33], a new technique for EUV optics interferometry using phase-shifting point diffraction interferometers (PS/PDI) has demonstrated accuracy levels of 40–70 pm [38–41]. An advanced version of the phase shifting diffraction interferometer (PSDI) for visible interferometric characterization of individual optics has been developed that minimizes the metrology errors by removing essentially all optical elements, thereby minimizing the errors introduced by optical fabrication errors, dust particles, and alignment [24,42]. This lensless PSDI uses software to back-propagate the measured fringes in an ideal environment to support measurement accuracies of less than 1 nm rms. For higher throughput, at-wavelength lateral shearing interferometry (LSI), which has looser coherence requirements, is being considered [37,43]. 8.2.2 EUV Multilayers As previously noted, EUV radiation is absorbed by all materials, and optical surfaces designed to reflect EUV wavelength light must be coated with distributed quarter-wave multilayer Bragg reflectors. A variety of material combinations have been used in experiments; however, most of the experimental work has been done using molybdenum (Mo) in combination with either silicon (Si) or beryllium (Be). The period of the multilayer is selected to produce constructive interference for the light reflected from each layer, with the maximum reflectivity occurring at the Bragg or peak wavelength. For the quarter-wave Mo/Si stack, the thickness of the Mo is slightly reduced, resulting in a thickness of 2.8 nm for the Mo and 4.1 nm for the Si layers. For Mo/Be multilayers, ruthenium (Ru) is added to the Mo to produce an alloy layer with a thickness of 2.3 nm and a Be thickness of 3.4 nm. The nominal peak reflectivity wavelength for Mo/Si is 13.5 nm, and it is 11.4 nm for MoRu/Be. The reflectivity of the multilayer stack rapidly increases for the first 20 bi-layers, and it then tends to saturate. For typical bi-layer stacks, the number of bi-layers ranges between 40 and 60. Typical reflectance curves for the two multilayers are shown in Figure 8.4. The full width half maximum (FWHM) value, i.e., the bandwidth of the reflectivity curve with respect to the peak wavelength, is usually in the 3.5%–4% range. Because of the toxic environmental concerns in using Be in Europe and Japan, essentially all of the recent multilayer work has been performed with Mo/Si multilayers. The primary
FIGURE 8.4 Reflectance response of Mo/Si and MoRu/Be multilayer mirror coatings measured at 5° from normal incidence. Typical peak reflectances of 69.3% and 67.2% are achieved for MoRu/Be (at 11.4 nm, FWHM = 0.35 nm, 50 bilayers) and Mo/Si (at 13.5 nm, FWHM = 0.55 nm, 40 bilayers), respectively. (Data courtesy of Montcalm, C. et al., Proc. SPIE, 3331, 43, 1998.)
materials-dependent characteristics of the EUV multilayers for mirrors are reflectance, stress, stability, and deposition methods that meet the tight specifications on thickness control and repeatability. In addition to these parameters, multilayers for masks require very low levels of defects that can be repaired to make them non-printable during a lithography process (see Section 8.4.2). EUV projection lithography system throughput is a strong function of the multilayer reflectance because of the multiple reflections required. A production lithography system can contain 10–12 multilayer coated optics that operate at near-normal incidence. Therefore, it is critical that the multilayer mirrors have the highest possible normal incidence reflectance and meet stringent multilayer matching requirements for production lithography systems to be practical in terms of imaging performance and throughput. The requirements involve (1) the intrinsic material properties and performance of the multilayer structures and (2) the deposition technology needed to meet tight specifications on thickness control and repeatability. 8.2.2.1 Deposition Systems Multilayers can be deposited using a variety of methods. The two most popular are DC-magnetron sputter deposition (MSD) and ion beam sputter deposition (IBSD) [11,44–54]. Another technique uses ion beam-assisted electron beam evaporation deposition (EBED), and it is likely the technique used for coating the first beta tool optics [30,55–57]. MSD is typically used for optics coating because of its ability to simultaneously coat large elements with good control of both uniform and graded coatings, coating speed, and temperature control. IBSD provides better defect control, and it allows the deposition conditions to be tailored to cover mask substrate defects. EBED has the advantage that, compared to sputter deposition techniques, the deposited atoms reach the substrate with low thermal energy. However, it does require an ion beam polishing step after each layer deposition. Other multilayer deposition methods such as atomic layer deposition (ALD) are also being investigated. 8.2.2.1.1 Magnetron Deposition For the MSD systems [11,44–46,49], 2–4 rectangular sputter sources are placed at various angular locations in a horizontal circular chamber with chimneys around them that limit the deposition zone to the area directly above the source. The substrates are held face down on a rotating table (platter) above the sources in a "sputter-up" configuration. The sputter chamber is typically cryo-pumped to the low 10⁻⁷ Torr range. Ultrahigh purity argon (Ar) at pressures of 0.50–2.00 mTorr is used to sputter the Si, Mo, and other targets at various power levels. The multilayers are deposited by sweeping the substrates over the sources with controlled rotation of the platter. One bi-layer is deposited for each complete revolution of the platter. The layer thicknesses are determined by the time the substrate is exposed to the source, which depends on the substrate transit velocity. Additionally, the substrates are rapidly spun about their own axis of symmetry (i.e., in a planetary motion), which provides azimuthal uniformity. Because of the sensitivity of the deposition rates to changes in process variables such as power, pressure, flow rate, substrate-to-target distance, and platter velocity, these parameters must be closely controlled.
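The tight control demanded of these deposition parameters follows from how quickly system throughput degrades with per-mirror reflectance. As noted above, a production system may contain 10–12 near-normal-incidence multilayer mirrors, so the transmission of the optical train scales as R to the power of the mirror count; the short sketch below uses reflectance values in the 67%–70% range quoted in this section.

```python
# Sketch: optical-train transmission versus per-mirror reflectance.
# Mirror counts (10, 12) and reflectances (0.67, 0.70) follow the figures
# quoted in the surrounding text; the calculation itself is illustrative.
for n_mirrors in (10, 12):
    for reflectance in (0.67, 0.70):
        throughput = reflectance ** n_mirrors
        print(f"N = {n_mirrors}, R = {reflectance:.2f} -> "
              f"train transmission ~ {100 * throughput:.1f}%")
```

Only a few percent of the collected EUV light survives the full train, and a 3% absolute change in per-mirror reflectance changes the delivered power by roughly 50%, which is why both the absolute reflectance and mirror-to-mirror wavelength matching are treated as critical.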
The desired uniform or graded thickness distribution on a given optic can be achieved by modulating the velocity of the substrate while it passes through the sputter flux. If, for example, the system produces a coating too thin at the edge of the optic with a constant platter velocity, a more uniform coating is obtained by reducing the velocity while the substrate enters and leaves the deposition zones when the substrate edges are being coated. The optimized platter velocity modulation recipe is rapidly determined with computer software that predicts the film thickness uniformity for any given platter
velocity modulation recipe. The system can be used to obtain precisely graded coatings as well as uniform coatings, and it is applicable to both curved (concave and convex) and flat optics. The effect of substrate curvature is accounted for within the deposition model. 8.2.2.1.2 Ion Beam Deposition Ion beam sputter deposition has traditionally been used to deposit multilayer coatings on mask blanks [53,54], but it is also used to coat imaging optics [52]. Although the deposition process is much slower than magnetron sputtering, the process supports low defect coatings and a variety of options for controlling the multilayer smoothing of substrate defects and mirror roughness. To support low defect level multilayer coating of mask blanks, the systems are interfaced to standard mechanical interface (SMIF) pods. Important system parameters include target design, finish, purity, cleanliness, accelerating voltages, beam current, ion optics, shield materials, geometries, and locations, handling protocols, system maintenance, venting, and pump-down processes. Although substantial work must be done to continue the defect reduction process for mask blanks, repeated coating-added defect levels as low as 0.05 cm⁻² have been achieved [58] (see Section 8.4.2). 8.2.2.1.3 Electron Beam Evaporation Deposition EBED uses an electron beam to vaporize the material to be deposited, with the substrates located at an appropriate distance facing the evaporation source. For large area thickness control, hardware baffles are needed; these introduce another adjustable parameter that has to be controlled to achieve the desired run-to-run control between individual mirror coatings of EUV optics sets. Though this deposition technology seems to be the most likely choice for the manufacture of the first beta tool optics [30], little is known about the actual achieved coating quality, and no data on optics matching comparable to what has been demonstrated for the MSD technique are available. An advantage of multilayers fabricated with EBED is the lower as-deposited film stress compared to multilayers deposited with MSD [56]. 8.2.2.2 Uniform and Graded Multilayer Coatings To preserve the figure of the projection optics, thickness control to about 0.1% rms is necessary across the 160 mm diameter optical substrates. This imposes rigorous control on the uniformity of the multilayer period thickness over the surface of the substrate. With new MSD sputter sources and a constant substrate rotation velocity, a thickness uniformity of ±0.5% peak-to-valley (P–V) over a 140 mm diameter region can be achieved, with the coating at the edge slightly thinner than at the center. This thickness uniformity was further improved to ±0.06% (i.e., 0.11% P–V) by the use of a substrate platter rotation velocity modulation technique in which the substrate is moved more slowly while its periphery is entering and exiting the deposition zones. The 40 bi-layer Mo/Si multilayer stack is nominally 280 nm thick, corresponding to a thickness of approximately 1050 Mo and Si atoms. Therefore, a uniformity of 0.11% P–V is equivalent to 0.3 nm, or a thickness variation of only about one atom between the thickest and thinnest regions of the coating. Platter velocity modulation is an extremely useful technique that has been applied to address a critical challenge—accurate control of coating thickness distribution on the curved surfaces of the ETS projection optics.
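The essence of the velocity-modulation recipe can be captured in a toy model: the thickness deposited at a given position scales with the dwell time over the sputter source, i.e., inversely with the instantaneous platter velocity at that position. The baseline profile and recipe values below are invented purely for illustration and are not data from the text.

```python
# Toy sketch of platter velocity modulation: thickness ~ dwell time ~ 1/velocity.
# The baseline profile (centre -> edge) and velocity recipe are hypothetical.
def deposited_thickness(baseline_profile, velocity_recipe):
    """Scale a per-position baseline thickness by 1/velocity at that position."""
    return [t / v for t, v in zip(baseline_profile, velocity_recipe)]

baseline = [1.00, 1.00, 0.99, 0.97, 0.94]   # edge runs thin at constant speed
recipe   = [1.00, 1.00, 0.99, 0.97, 0.94]   # slow down in proportion while coating the edge

print([round(t, 3) for t in deposited_thickness(baseline, recipe)])
# -> [1.0, 1.0, 1.0, 1.0, 1.0]: slowing in proportion to the shortfall flattens
#    the profile, which is what the optimized modulation recipe accomplishes.
```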
Some optics require multilayer coatings with a thickness gradient to accommodate a variation in angle of incidence across the optic. A prime example is the first condenser optic that must collect a large solid angle of the radiation from the laser plasma source. Other examples include the optics used for the EUV Microstepper systems, i.e., the two Schwarzschild projection optics and the ellipsoidal condenser optic.
Run-to-run repeatability of the deposition rate is critical to ensure that all optics in a system reflect at the same peak wavelength. Wavelength-mismatched optics significantly reduce the optical throughput of a system with multiple reflections. For deposition systems that can coat only one optic at a time because of size limitations, the run-to-run repeatability of thickness must be controlled to 0.4% at 3σ. The wavelength shifts among the curves must be relatively small compared to the spectral width of the reflectance peak.

8.2.2.3 Stability of EUV Mirrors
The reflectance, peak wavelength, and stress of the multilayer coatings must be stable over time, temperature, and radiation exposure. The temporal stability of the reflectance and peak wavelength of a Mo/Si multilayer stored in air has been demonstrated over a period of 25 months [46]. The observed fluctuation in the measured reflectance is largely caused by the relative uncertainty of 0.5% in the reflectance measurements. Therefore, the reflectance appears stable within the limits of the metrology. Measurement of the peak wavelength is more repeatable, so the shift toward shorter wavelengths with aging, although small, is probably caused by continued formation of Mo/Si compound at the interfaces, which has a negative volume change of reaction, or by densification of Si. Experiments characterizing stress have shown that Mo/Si multilayer samples optimized for high EUV reflectance had an as-deposited compressive stress of about −420 MPa [11,59,60] that decreases by ~10% over the first few months after deposition but stabilizes thereafter. The reflectance of those Mo/Si multilayers is observed to be stable within 0.4% over a period of more than 400 days. A film stress of 400 MPa is large enough to deform the figure of the projection optics in an EUV lithography system. Model calculations show that, except near the edges, most of the deformation caused by film stress is spherical [61]. The spherical component of the deformation can be compensated for during alignment of the optics [62]. Nevertheless, because of the stringent surface figure requirements for these optics, it is desirable to minimize deformation and, in particular, nonspherical deformation of the optics as a result of the multilayer film stress. Previous techniques for reducing stress in Mo/Si multilayer films included varying the multilayer composition, deposition conditions, and post-deposition annealing, as well as doping the multilayers during deposition using various gases [59,60,63]. However, any technique used to reduce the effects of multilayer film stress must do so without incurring a significant loss in reflectance. These methods have been tested, revealing that post-deposition annealing yields the greatest stress reduction at the lowest cost in reflectance [60]. For example, Mo/Si stress can be reduced by 75% with only a 1.3% (absolute) drop in reflectance at annealing temperatures near 200°C. Multilayer mirrors must also maintain their reflective properties after long-term exposure to the EUV flux expected in a high volume manufacturing EUV lithography system. Multilayer mirror EUV reflectance can be degraded by surface contamination, oxidation, and/or erosion. Contamination and oxidation are the biggest concerns for projection optics, whereas erosion is more typically an issue with condenser mirrors that directly face the EUV-generating plasma.
In addition, degradation of multilayer properties in the multilayer bulk, e.g., because of interdiffusion at the Mo/Si/Mo interface boundaries, can reduce EUV optics lifetime as well. Contamination and oxidation can reduce projection optics performance in two ways: as a dc effect that only reduces throughput and as an ac effect that introduces wavefront aberrations [64]. Most likely, in situ metrologies will be needed to closely monitor the state of health of individual mirrors. Metrologies monitoring optics contamination of individual mirrors on an atomic level, as well as contamination prevention schemes and in situ optics cleaning methods, have been proposed and demonstrated [65–69]. Because the
vacuum vessel containing the projection optics cannot be baked as would typically be done for ultra-high vacuum systems to reduce the water background pressure, water adsorbs on the optics surfaces and is dissociated mainly by the secondary electron flux generated by the incident EUV flux (and, to a lesser degree, by the primary EUV photons themselves). If not counterbalanced, this process slowly oxidizes the optics surfaces. However, if hydrocarbons are present in the system, they are also dissociated on the surface. Given the right surface properties, carbon and oxygen atoms can recombine on the optics surface to carbon monoxide molecules that can then be desorbed via electron stimulated desorption (ESD) by secondary electrons and (with a smaller cross section) by photon stimulated desorption (PSD) by the incident EUV flux itself. By the same argument, the scheme that prevents oxidation also works, of course, to prevent carbon build-up, provided the right amount of water vapor is introduced in a hydrocarbon-rich environment. The challenge for commercial exposure systems is to develop this mitigation scheme into a stable process with a large enough cleaning process window in terms of water and hydrocarbon partial pressures. The key to achieving this will be the selection of the right optics capping layer material to support this mitigation scheme over the required operational lifetime of EUV optics. The optimal process would be a reaction that does not use up the optics surface material (like a catalytic reaction). Transition metals like Ru are prime candidates for optics capping layer materials.

8.2.2.4 Engineered Multilayers
Surface- and interface-engineered multilayers are being developed to meet the stability requirements for commercial EUV optics. Those designs use alternating layers of Mo and Si separated by angstrom-thin boron carbide (B4C) or carbon (C) layers [48,51,70–72]. These barrier layers control the interdiffusion of Mo and Si at the interfaces. Based on a study to optimize the thickness of the individual layers to achieve the highest reflectance, the optimum thickness of the B4C layer is 0.3–0.4 nm on a Mo-on-Si interface and 0.25–0.3 nm on a Si-on-Mo interface. Peak reflectance values of 70% have been obtained at 13.5 nm with a FWHM of 0.54 nm using 50 bi-layers. Typical experimental reflectivity curves of such engineered multilayers are shown in Figure 8.5. Note that the peak can be adjusted slightly by varying the spacing of the multilayer components.
[Figure 8.5: reflectance (%) versus wavelength (nm), 12.4–13.8 nm, for two interface-engineered multilayers with peak reflectances of 70.6% and 69.6% near 13.5 nm.]
FIGURE 8.5 Typical reflectance curves of interface-engineered multilayers using B4C as a diffusion barrier. (Data courtesy of VNL/EUVLLC.)
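The statement that the reflectance peak can be tuned by the multilayer spacing is, to first order, the Bragg condition for a periodic stack. The following minimal check neglects refraction inside the stack (which in practice pushes the Mo/Si period up to roughly 7 nm); the 5° incidence angle is an arbitrary near-normal assumption, not a parameter taken from the measurements above.

    import numpy as np

    lam = 13.5e-9                    # target wavelength (m)
    theta = np.deg2rad(5.0)          # assumed angle of incidence from normal

    # First-order Bragg condition for a periodic multilayer (refraction neglected):
    #   lambda = 2 * d * cos(theta)
    d = lam / (2.0 * np.cos(theta))
    print(f"estimated bi-layer period d ~ {d * 1e9:.2f} nm")
    print(f"40 bi-layers ~ {40 * d * 1e9:.0f} nm, close to the ~280 nm stack quoted earlier")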
The engineered multilayers were tested for thermal stability during rapid thermal annealing. During these tests, the temperature was varied between 50 and 350°C, and the annealing time was kept constant at 2 min. The results showed excellent thermal stability compared to standard Mo/Si multilayers. There was no reflectance change for the multilayers with the B4C interfaces and only a very small peak wavelength shift (0.02 nm) up to 200°C. Although the stress is approximately 30% higher for the engineered multilayers, thermal annealing at 300°C can reduce the stress by 60% with a loss of 0.3% reflectivity and a wavelength shift of 0.102 nm. The protective capping layer must be applied to the multilayer stack prior to removing the multilayer-coated optic or mask from the deposition system. It protects the multilayer mirrors from degradation during exposure and from damage during cleaning processes used to remove surface contamination and, in the case of reticles, to remove particulates that may be deposited during handling or system use. A variety of materials has been used, including the Si and Mo of the multilayers themselves and deposited C and Ru. Not only are the material properties important, but the deposition conditions must also be controlled to optimize the effectiveness of the capping layer. The morphology and microstructure of the capping layer must be such that it is oxidation resistant and can be cleaned multiple times of contaminants like C without loss of EUV reflectivity, that it prevents damage to the underlying multilayer structure by, e.g., diffusion of hydrogen atoms along the grain boundaries, and that the capping layer thickness is such that the multilayer stack reflectivity is not reduced. It is not yet clear whether all of this can be accomplished using one capping layer material or whether, e.g., the temperature window for MSD is too limited (because the temperature cannot exceed the limit where the multilayer interface structure is damaged) and other techniques using low-temperature precursors for deposition, such as ALD, might be needed. Although a small decrease in reflectance has been observed for the Ru-capped multilayers, the capped multilayers have demonstrated a dramatic improvement in lifetime in accelerated e-beam testing, with a 1.4% loss for a 22-h accelerated exposure versus complete loss of reflectivity for a Si-capped multilayer [73,74]. In addition, the Ru-capped multilayers have been tested in an atomic hydrogen environment that could be used to remove carbon contamination and are shown to be superior to Si-capped multilayers [68].

8.2.3 Remaining Challenges and Issues
Although continued progress has been made in optics finishing and metrology, additional improvements are required to support production lithography tools. For example, to support adequate process control, continuous reduction of the WFE is required as feature sizes shrink with each lithography node. For the 32 nm node and below, the figure for individual optics in a six-mirror system will need to be approximately 0.12 nm rms [29]. As the printing dimensions are reduced, the necessary reduction in flare will require a MSFR of <0.15 nm rms [75]. In order to simultaneously obtain low values of MSFR and HSFR, multilayer smoothing methods will be used to reduce the HSFR to ≤0.10 nm rms. As figure and finish requirements become more and more demanding, wavefront metrology for qualifying the optics will be ever more challenging.
Even with current EUV optics quality, differences in wavefront measurements using different metrology tools are of the same order of magnitude as the WFE to be measured [36]. In addition, it is not yet clear whether visible-light interferometry will be sufficient for optics qualification. Visible-light interferometry can measure wavefront and distortion, and it can quantify MSFR, whereas at-wavelength metrology, in addition to measuring wavefront, flare, and
chromatic effects, is also sensitive to optics contamination and EUV multilayer parameters [36,64]. The operational lifetime of the optical system must be greater than five years, i.e., the figure and finish quality of the optical system have to be preserved over an extended time period. This will require precise control of the vacuum environment surrounding the optical system to minimize carbon deposition and oxidation of the multilayers on the optics. The primary remaining challenge in deposition technology is improved run-to-run repeatability of the deposition rate to provide wavelength-matched optics. Wavelength-mismatched optics reduce the optical throughput of EUVL systems. Advanced EUVL systems with higher numerical apertures than the ETS will require the coating of ever-larger optics with either highly uniform or accurately graded coatings. New deposition systems are being developed to meet the challenges of coating such large curved optics. Although existing multilayer coatings possess the high reflectance required for commercial use of EUV lithography, further reflectance improvements will increase throughput and continue to drive down CoO. To date, experiments have shown the coatings to be relatively stable. However, more progress will be needed to meet the operational lifetime requirements of EUV mirrors in commercial lithography tools. Indispensable in this respect is the availability of accelerated lifetime testing methods for EUV mirrors that can be used by EUV mirror manufacturers to benchmark their progress toward meeting the commercial lifetime requirements. Significant progress has been made in this area, and dose equivalents of EUV photons and electrons used in accelerated electron beam testing are being established [76]. International SEMATECH is leading and coordinating the effort to enable accelerated EUV optics lifetime testing [77] to provide full verification methods for EUV optics stability. The effects (and compatibility) of various optics cleaning strategies on multilayer properties and lifetime need to be verified. The value of the optical substrates also makes it desirable to have techniques available for substrate recovery, i.e., the ability to either remove or overcoat a multilayer without damaging the substrate. Multiple approaches for substrate recovery are being improved, including wet etching, dry etching, and simply overcoating the original multilayer.
8.3 EUV Sources
Extreme ultraviolet light in the soft x-ray spectrum between 11 and 14 nm can be produced by several methods; however, the EUV source power required to achieve the desired wafer throughput for commercial lithography scanners clearly favors hot, dense plasmas as EUV radiation sources [78–81]. A good estimate of the required plasma temperature is the temperature of a black body radiator with its emission maximum at 13.5 nm. For the emission maximum of a black body radiator to be at 13.5 nm, temperatures around 20 eV are needed. (Plasma temperatures are commonly specified in energy equivalents, i.e., kT in eV rather than T in kelvin; 20 eV corresponds to approximately 220,000 K, consistent with Wien's displacement law, λ_max T ≈ 2.9 × 10⁻³ m K, for λ_max = 13.5 nm.) Plasma temperatures in this range can be generated by exciting target materials either with high power lasers, in laser produced plasma (LPP) sources, or with electrical discharges, in so-called discharge produced plasma (DPP) sources. Other methods of producing EUV radiation in the 13.5 nm range include synchrotrons, high harmonic generation with femtosecond laser pulses, discharge pumped x-ray lasers, and electron beam-driven radiation devices such as emission from relativistic electrons. However, the power that can be extracted from high harmonic generation is too low (although these
sources are very promising for metrology and plasma characterization purposes), reliable and efficient x-ray lasers are still far in the future, and current synchrotron sources cannot provide the EUV flux needed and are not practical sources for chip manufacturers. Only free electron lasers might be able to meet the requirements for EUV sources, but they may not be practical for use in current manufacturing environments. In addition to powerful EUV sources for high volume manufacturing lithography tools, lower power EUV sources are also needed for resist and mask evaluation in low throughput METs [36] and for metrology tools such as aerial image measurement systems [82], laboratory reflectivity monitors [83], open frame exposure tools for resist screening at the EUV wavelength [84,85], at-wavelength inspection of EUV mask blank defects or repaired areas [86], and at-wavelength interferometry of EUV optics [36,87]. Though some of the sources mentioned above may not provide the source power eventually needed for high volume manufacturing lithography tools, they hold promise for use as metrology sources, for the development of coherent pulsed EUV sources in the femtosecond regime, and for the characterization and improved understanding of hot dense plasmas [23,88,89] that could support the development of high power EUV plasma sources.

8.3.1 Commercial EUV Source Requirements
The requirements for EUV sources for commercial lithography, based on a system throughput of 100 wafers per hour (wph) as agreed to by three lithography tool manufacturers [81], are listed in Table 8.2. The EUV power for lithography tools is specified in terms of "clean photons" at the so-called intermediate focus (IF), the aperture between the source and the illuminator. Figure 8.6 shows a schematic illustration of the concept of the intermediate focus. As shown, all debris filtering and all wavelength filtering must be performed prior to the IF location so that only clean photons within the 2% bandwidth window around 13.5 nm arrive at the IF. Because of limitations in EUV radiation collection efficiency (all of the radiation emitted by the source cannot be collected because only a limited collection angle is accessible) and possible absorption losses as a result of trace amounts of gases along the light path, more EUV power than required at the IF has to be generated by the source plasma to provide the 115 W of in-band EUV power at the IF [79].

TABLE 8.2
Joint Source Requirements from Lithography Tool Manufacturers

Parameter | Specification
Wavelength | 13.5 nm
Extreme ultraviolet (EUV) power (in-band) at intermediate focus | 115 W
Repetition frequency | >7–10 kHz (a)
Integrated energy stability | ±0.3%, 3σ over 50 pulses
Source cleanliness (after intermediate focus) | ≥30,000 h
Etendue of source output | Maximum 1–3.3 mm² sr (a)
Maximum solid angle input to illuminator | 0.03–0.2 sr (a)
Spectral purity, 130–400 nm (DUV/UV) | ≤7% (a)
Spectral purity, ≥400 nm (IR/visible) at wafer | To be determined (a)

(a) Design dependent.
Source: From Watanabe, Y., Ota, K., and Franken, H., EUV Source Workshop, Belgium, 2003.
[Figure 8.6 schematic: a plasma source with debris mitigation and a spectral purity filter inside a vacuum chamber, with a collector focusing the radiation onto the aperture at the intermediate focus (IF), which separates the source side from the illuminator side.]
FIGURE 8.6 Definition of clean photon spot at intermediate focus (IF). (From Watanabe, Y., Ota, K., and Franken, H. Joint Spec ASML, Canon and Nikon, EUV Source Workshop, Antwerp, Belgium, September 29, 2003.)
The 115 W in-band power requirement at the IF is governed by the lithography system characteristics, including assumptions for a resist sensitivity of 5 mJ/cm², a 67.5% reflectivity for each of the six mirrors of the projection optics, a 65% reticle reflectivity, and some additional losses in the camera system because of bandwidth mismatch, polarization losses, and gas absorption (each about a 5% loss), as well as an overall loss of about 8% in the illuminator after the IF. Taken together, these losses mean that only roughly 5% of the in-band power at the IF reaches the wafer. The requirement for source repetition frequency is driven by CD control, i.e., image quality. For a given scanning speed, the CD error increases with decreasing repetition rate. With scanning speeds of several hundred mm/s required to achieve the system throughput, repetition rates above 7 kHz are needed. The integrated energy stability requirement, i.e., pulse-to-pulse repeatability, is also driven by CD control, whereas source cleanliness is driven by CoO. The source size (etendue) and collectable angle specification, i.e., the maximum solid angle input to the illuminator, are primarily driven by throughput considerations (collectable EUV power) and, to a lesser extent, by illuminator design considerations [90,91]. Spectral purity requirements are driven by the resist wavelength sensitivity, image quality, lithography tool design considerations, and, to some extent, CoO. There are significant differences in DPP versus LPP source designs, requiring different condenser designs and materials, heat dissipation schemes, and/or debris mitigation techniques. In the following sections, the generic features of DPP and LPP source technologies, i.e., how the EUV power is generated, are described, and the environmental issues are outlined that currently limit source component lifetimes, thereby increasing the CoO of commercial EUV lithography sources.

8.3.2 EUV Discharge Produced Plasma Sources
Converting electrical energy into radiation in gas discharge lamps is a very efficient and widely used process, as demonstrated by the multitude of fluorescent lamps encountered in everyday life. However, traditional gas discharge plasmas are not sufficiently hot or dense to produce the intense 13.5 nm radiation necessary for EUV lithography. The discharge plasma must be heated and compacted by using the so-called pinch effect [92]. As illustrated in Figure 8.7, this effect is based on the compression of a plasma column through the magnetic field generated by the axial discharge current. The strong radial inward-pointing Lorentz forces (pinch effect) drive a fast compression or implosion at supersonic speed that heats and compacts the forming plasma until a hot, dense, and
FIGURE 8.7 Schematic illustration of the pinch effect. (a) The current density j and the magnetic field B driven by the discharge current result in a radial force F = j × B that compresses the plasma column. (b) The resulting magnetic field pressure, p_magnetic field, is balanced by the plasma pressure p_plasma.
stable plasma column emerges that balances the magnetic field pressure. Plasma temperatures generated by the pinch effect can be estimated from the equilibrium situation shown in Figure 8.7, where the thermal plasma pressure (p_plasma) equals the magnetic field pressure (p_magnetic field) generated by the discharge current:

\[ \frac{B^{2}}{2\mu_{0}} = \left(\langle Z\rangle + 1\right) n_{i}\,kT \]    (8.3)

where ⟨Z⟩ = n_e/n_i is the degree of ionization, n_e and n_i are the electron and ion densities, respectively, and kT is the plasma temperature. With the magnetic field B defined by the axial discharge current I (B = μ₀I/2πr) and the line density N_i given by N_i = n_i πr², the resulting plasma temperature is

\[ kT = \frac{\mu_{0} I^{2}}{8\pi \left(\langle Z\rangle + 1\right) N_{i}} \]    (8.4)
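As a quick numerical check of Equation 8.4, the sketch below evaluates the pinch temperature for one representative combination of parameters drawn from the ranges quoted in the following paragraph (20 kA discharge current, ⟨Z⟩ = 10 for Xe, an ion density of 5 × 10¹⁸ cm⁻³, and a 300 μm column diameter). The specific combination is illustrative only, not a measured operating point of any particular source.

    import numpy as np

    mu0 = 4.0 * np.pi * 1e-7     # vacuum permeability (H/m)
    eV = 1.602e-19               # joules per electronvolt

    I_discharge = 20e3           # discharge current (A)
    Z_mean = 10.0                # mean ionization state for Xe
    n_i = 5e24                   # ion density (m^-3), i.e. 5e18 cm^-3
    r = 150e-6                   # radius of the pinched plasma column (m)

    N_i = n_i * np.pi * r**2                                            # line density (ions per m)
    kT = mu0 * I_discharge**2 / (8.0 * np.pi * (Z_mean + 1.0) * N_i)    # Equation 8.4, in joules
    print(f"kT ~ {kT / eV:.0f} eV")   # roughly 30 eV, inside the targeted 20-40 eV window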
With targeted plasma temperatures in the 20–40 eV range, a medium ionization of ⟨Z⟩ = 10 for Xe, and plasma densities in the mid 10¹⁸ cm⁻³ range, the magnetic field pressures for the equilibrium condition in Equation 8.3 have to be in the 2–4 × 10⁸ Pa range (2–4 kbar). The respective magnetic fields in the 20–30 T range can be generated by discharge currents in the 20–30 kA range. By keeping the scaling parameter I²/N_i in Equation 8.4 constant via simultaneous adjustments of the discharge current and the line density N_i, plasmas with similar temperatures can be generated with different configurations. Discharge sources producing stable plasma columns with the characteristic parameter values given above can only be realized in pulsed mode operation. Typically, target media vapor pressures in those sources are between 5 and 100 Pa, with plasma columns of several millimeters in length compressed to diameters of a few 100 μm by pulsed high current discharges with peak currents in the range of 5–50 kA. Plasma temperatures have been reported between 15 and 50 eV, with pulse durations ranging from several nanoseconds to microseconds [93] and duty cycles up to several kHz. The target material that has been used most widely is Xe with some helium (He) added for stability [94,95], but other higher conversion efficiency targets are also being considered, foremost among them tin vapor. In addition, there are other targets, such as lithium or oxygen, that have promising emission characteristics, but handling highly contaminating liquid metals or highly reactive species presents significant obstacles when compared with more benign targets. For all discharge systems currently being explored, the drive current is generated by the fast discharge of charged high
FIGURE 8.8 Schematic depiction of the most commonly used gas discharge designs for extreme ultraviolet (EUV) sources: Z-pinch, HCT Z-pinch, plasma focus, and capillary discharge. In (a), the ignition phase is shown; in (b), the current flow in the high current phase is shown; (c) shows a schematic of the actual plasma regions formed and the EUV emission from each source type.
voltage capacitors. Several DPP concepts are being explored [78,96], and the four most widely used designs are shown in Figure 8.8. Each of the source concepts shown (Z-pinch, hollow cathode triggered (HCT) Z-pinch, plasma focus, and capillary discharge) has significant advantages and drawbacks. The emission characteristics of the plasmas generated by the four designs are very similar. However, the electrode configurations differ, specifically with respect to electrode erosion (i.e., electrode lifetime) and cooling (scalability to high repetition rates), usable solid angle (etendue), and self-absorption of the emitted EUV radiation in the respective neutral gas environment, and these properties can be quite different for the respective discharge source concepts.

8.3.2.1 Z-pinch
Intense EUV radiation in the spectral range around 13.5 nm has been generated using Z-pinch source configurations with Xe gas as the target material and different pre-ionization
schemes. One source design uses radio-frequency (rf) pre-ionization [97,98], whereas other designs use pre-ionization generated by an electrical discharge in the electrode area [94,95,99–103]. In the Z-pinch design, electrical energy stored in high voltage capacitors is supplied to the electrode system in a Xe/He gas environment. For the hot, dense plasmas generated by the pinch effect along the axis between the electrodes, an in-band EUV power emission (2% bandwidth) of 120 W into 2π sr at 4 kHz repetition rate, a 1.8 sr collectable angle, 0.55% conversion efficiency, source dimensions of 1.5 mm length and 500 μm diameter, and a source energy stability of <5% (1σ) have been reported [100,101]. The power at the IF is approximately 10 W.

8.3.2.2 HCT Z-pinch
In the basic HCT Z-pinch source design, flat, cylindrically shaped hollow electrodes with central boreholes opposite to each other are connected to a capacitor bank that is charged to a high voltage [104–108]. During the ionization phase, the gas in the volume along the field lines in the gap between the boreholes and extending into the hollow electrodes is ionized, and the hollow cathode plasma is formed, with its plasma column providing the required low resistance for the compressing high current pulse. The breakdown voltage of the system is determined by the Paschen curve [109], an empirical curve describing the dependence of the breakdown voltage on the reduced pressure, i.e., the product of pressure and electrode separation. The Paschen curve has a minimum that depends on the gas or gas composition being used (for air, it is approximately 0.8 Pa m), with the breakdown voltage increasing for lower reduced pressures because the mean free path of the electrons becomes comparable to the electrode separation, and electrons recombine at the electrode instead of ionizing atoms or molecules between the electrodes. For higher reduced pressures, the breakdown voltage increases because electrons cannot gain enough energy from the electromagnetic field between two collisions. In contrast to other Z-pinch sources, the HCT Z-pinch source operates on the left branch of the Paschen curve (as known from pseudo-sparks), i.e., the breakdown voltage increases if the electrode distance and/or the pressure are reduced. This results in the electrical breakdown occurring in the volume between the two boreholes. In addition, breakdown occurs spontaneously when the voltage is increased. This eliminates the need for an external switch, i.e., capacitor banks at voltages in the range of 10 kV can be directly connected to the electrode system. The basic HCT Z-pinch design advantages include the fact that no external switch is needed and that there are no insulators in the vicinity of the discharge region. Operating the HCT Z-pinch in self-breakdown mode has some disadvantages, including comparatively difficult control of the discharge timing and a strong dependence of the breakdown voltage on system pressure, i.e., small pressure variations shift the breakdown voltage. Therefore, stable operation of the self-triggered HCT Z-pinch is only possible for frequencies up to ~1 kHz. By inserting a third trigger electrode biased to a few hundred volts (as shown in Figure 8.8) into the hollow cathode, the trigger electrode can remove the initial electrons in the hollow cathode prior to a breakdown, thereby impeding the breakdown.
Although this introduces an additional switch into the HCT Z-pinch design, there is still a major difference between this type of switch and the switches that have to be used in other designs, because this trigger electrode does not carry high currents. Using the HCT Z-pinch with the trigger electrode allows stable operation of the source over a wider pressure range and provides precise discharge triggering using low voltage trigger pulses [105,106]. For the HCT Z-pinch source operating with Xe, an EUV power (2% bandwidth) of 10 W at the IF at 7 kHz repetition rate, a conversion efficiency of ~0.5%, a 1.8 sr collectable angle, and source dimensions of less than 2 mm length have been reported [107,108].
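The two branches of the Paschen curve invoked above can be made concrete with the standard empirical form V_b = B·pd / [ln(A·pd) − ln(ln(1 + 1/γ))]. The sketch below uses generic textbook coefficients for air rather than parameters of the HCT source, so the computed minimum (on the order of 1 Pa m) only roughly matches the approximately 0.8 Pa m quoted above.

    import numpy as np

    def paschen_breakdown_voltage(pd_torr_cm, A=15.0, B=365.0, gamma=0.01):
        # Classic Paschen law; A in 1/(Torr cm), B in V/(Torr cm), gamma is the
        # secondary electron emission coefficient (illustrative values for air).
        denom = np.log(A * pd_torr_cm) - np.log(np.log(1.0 + 1.0 / gamma))
        vb = B * pd_torr_cm / denom
        vb[denom <= 0] = np.inf      # left of the asymptote, no breakdown occurs
        return vb

    pd = np.logspace(-0.6, 1.5, 200)             # reduced pressure p*d in Torr cm
    vb = paschen_breakdown_voltage(pd)
    i_min = np.argmin(vb)
    print(f"minimum ~ {vb[i_min]:.0f} V at p*d ~ {pd[i_min] * 1.333:.2f} Pa m")
    # Left branch (smaller p*d): V_b rises steeply; right branch: V_b rises slowly.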
8.3.2.3 Capillary Discharge
The capillary discharge source is one of the smallest and simplest EUV sources. Capillary discharge sources for EUVL have been developed by SNL in collaboration with the University of Central Florida [110], by GREMI-ESPEO in France [111,112], and also by the EUVA consortium in Japan [113]. The electrode configuration is similar to the Z-pinch discussed above. Instead of the insulator and the pre-ionized plasma, a capillary consisting of a ceramic tube is integrated on the axis between the electrodes [114]. The discharge between the two electrodes is guided through the capillary. Although this limits the radial size of the source to the inner diameter of the capillary, the plasma is well localized in space, and the spatial jitter of the plasma is low compared to other types of discharges. Because of the small distance between the plasma and the capillary walls, erosion is a severe problem. In addition, the high thermal stress of the capillary material, in combination with shock waves, can cause the capillary material to crack. Therefore, the choice of capillary material is critical, and ideal materials are electrical insulators with high thermal conductivity that can withstand the high stresses generated by the steep thermal gradients at the insulator surface. Typically, the capillary materials that have been used are specifically engineered ceramics or materials such as aluminum nitride or diamond [110,111,115–117]. Because capillary discharge source erosion tends to be more severe than for other source designs, debris mitigation techniques such as the helium curtain, which can also be used in other designs, have been pioneered for this source [110]. One of the main advantages of the capillary discharge source is its longer pulse length, which, in principle, would make it easier to scale up this source type for high power operation. For a non-commercial prototype discharge source, an EUV power emission (2% bandwidth) of 9.4 W into π sr at 1 kHz repetition rate (burst mode) and 0.1% conversion efficiency has been reported [110]. For comparison with other sources, the 2π-equivalent value is 18.8 W, and with a typical EUV transmission of 40% between source and IF, the clean photon power at the IF would be close to 4 W.

8.3.2.4 Plasma Focus
The plasma focus device electrodes are arranged concentrically with an insulator between them on one end (compare Figure 8.8). When a high voltage is supplied to the electrodes, a plasma is generated on the surface of the insulator. The current through the plasma creates a magnetic field that forces the plasma to the open end of the electrodes. This geometry leads to a compression of the plasma on the axis of the electrodes, where high densities and temperatures are achieved. In the plasma focus configuration, the electrically stored energy is converted to magnetically stored energy before being used for the pinch plasma generation. Compared to the Z-pinch design, the current pulse duration for the plasma focus design is typically one order of magnitude longer. This relaxes the design constraints for the pulse generator and usually allows operation at a lower charging voltage. As with other discharge sources, electrode erosion can be a severe problem. Hot ions and electrons from the plasma expand in a shock wave that interacts with the electrodes, resulting in erosion of the inner electrode closest to the plasma.
For the plasma focus devices known as dense plasma focus (DPF) devices, in-band EUV output energies in the 60 mJ/pulse range into 2π sr at 1 kHz continuous operation and 4 kHz burst mode operation, and conversion efficiencies close to 0.5%, have been reported [118–125]. At 1 kHz, this corresponds to 60 W (200 W for short-term 4 kHz burst mode) of generated EUV power. With collector geometries providing 2 sr of EUV light sampling and an EUV transmission between source and IF of around 40%, the typical EUV power measured at the IF is then around 7 W at 1 kHz (20 W for short-term 4 kHz burst mode).
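The conversion from generated power to usable power quoted above is simple geometric bookkeeping: the 2π-equivalent source power is scaled by the fraction of solid angle actually collected and by the source-to-IF transmission. A minimal helper is sketched below; the 2 sr collection and 40% transmission are the figures quoted in the text, and the function name is an arbitrary choice.

    import math

    def power_at_intermediate_focus(p_2pi_watts, collection_sr=2.0, transmission=0.40):
        # Usable in-band power at the IF from the 2*pi-equivalent source power.
        return p_2pi_watts * (collection_sr / (2.0 * math.pi)) * transmission

    print(f"{power_at_intermediate_focus(60.0):.1f} W")
    # ~7.6 W, consistent with the "around 7 W" quoted for 1 kHz DPF operation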
8.3.2.5 Other Designs
In addition to the concepts shown in Figure 8.8, a new concept called the star pinch [126–128] has emerged that uses a discharge geometry in which the hot plasma is farther removed from the wall material (electrode or insulator) than in the other discharge designs. Therefore, the star pinch has less of a problem with cooling (radiative heating of the wall scales with R⁻², with R being the distance between wall and plasma) and electrode lifetime. Per pulse, EUV fluences (2% bandwidth) of 7 mJ per sr and per mm³ of emitting source volume have been reported [127]. With an EUV-emitting source volume of ~1 mm³, a collector geometry (similar to other devices) sampling close to 2 sr, and a current maximum repetition rate in the 1 kHz range, this translates into ~14 W of generated in-band EUV power. Another source design, the capillary Z-pinch, which is currently in development by the EUVA consortium in Japan [129], combines features of the capillary discharge source and the Z-pinch. The Extreme Ultraviolet Lithography System Development Association has reported 9.7 W into 1.55 sr at 2 kHz and 0.57% conversion efficiency. The equivalent 2π value is 39.3 W, which, assuming π sr collection and 40% EUV transmission between the plasma source and the IF point, corresponds to 7.9 W at the IF.

8.3.2.6 DPP Source System Aspects
For most DPP sources, the electrodes are in close proximity to each other and to the walls of the discharge lamp; therefore, electrode erosion and erosion of the wall material present significant debris and source lifetime issues. For most DPP sources (with the exception of the star pinch source), the EUV emission is not isotropic, with most of the radiation emitted in a narrow cone along the symmetry axis of the source (compare Figure 8.8). Therefore, grazing incidence collectors are needed to collect as much of the EUV light as possible. Although the maximum solid angle for EUV light collection is not known exactly, grazing incidence collectors using a sophisticated condenser design should be able to capture most of the emitted EUV light. In principle, much of the technology that has been developed for x-ray and synchrotron radiation grazing incidence mirrors can be adapted to illuminator designs using DPP sources. However, the DPP source environment is much more challenging for the mirror materials typically used (e.g., noble or transition metals). Understanding condenser erosion in high power DPP sources is critical for developing solutions that are cost effective and meet the source goals for critical component lifetime. This can be done either by frequent replacement of condenser elements, provided they have low component costs and negligible associated tool downtime, or by developing new condenser materials that can withstand the harsh source environment for longer periods of time before the EUV reflectivity and reflectivity uniformity drop below acceptable levels [130]. The other important factor in extending critical component lifetime is reducing the amount of debris that is generated by the source or reducing the debris that arrives at the condenser mirrors. Reducing the debris generated by the source may prove to be impractical because most debris problems get worse if targets such as tin, with higher EUV conversion efficiency, are used. This makes it even more important to measure and understand the kind of debris being produced by discharge sources in order to be able to develop effective debris mitigation techniques.
Several debris mitigation techniques have already been developed, such as the helium curtain technique [110] that sweeps out debris without reducing the EUV flux, foil traps [131] that prevent debris from reaching the condenser surface (but also absorb EUV light), and electromagnetic confinement of debris that has been ionized using secondary plasmas [132]. Because of debris mitigation, the collected EUV flux is reduced, and the maximum clean source power achievable by a single DPP source is likely to be much less than 115 W, so methods may be needed for multiplexing several DPP sources.
Next to debris and contamination reduction, managing the thermal load of a high power EUV DPP source will be the other major engineering challenge that has to be solved. Thermal loading becomes more severe as repetition rates are increased in order to scale power toward the commercial source requirement of 115 W of clean EUV photons at the IF. With typical DPP source electrical-to-EUV conversion efficiencies in the 0.1% range and in-band plasma power on the order of several hundred watts (needed to produce 115 W at the IF point), effective dissipation of power away from critical source components in the range of 100 kW has to be achieved. Though a ~0.1% electrical-to-EUV conversion efficiency may not appear to be very efficient, it is much better than the electrical-to-EUV conversion efficiency of LPP sources. However, the relatively large radiating volume of DPP sources might lead to an etendue limitation of the extractable EUV power. Compared to LPP sources, the perceived lower overall system complexity is seen as an advantage of DPP sources.

8.3.3 EUV Laser Produced Plasma Sources
Laser produced plasma sources produce EUV radiation by focusing high intensity laser pulses onto a gaseous, liquid, or solid target. At light intensities exceeding 10¹⁰–10¹¹ W/cm², the target material is highly ionized, and the electrons are accelerated in the high intensity fields to heat the plasma. Because the initial plasma is produced by the leading edge of the laser pulse, the main portion of the laser pulse interacts with this plasma and with neutral target material. The dielectric function of a free electron gas is given by

\[ \varepsilon(\omega) = 1 - \frac{\omega_{p}^{2}}{\omega^{2}}, \qquad \text{with} \qquad \omega_{p}^{2} = \frac{n e^{2}}{\varepsilon_{0} m_{e}} \]    (8.5)

where ω_p is the plasma frequency, ω the light frequency, n the electron density, and e and m_e the electron charge and mass, respectively. When the plasma density exceeds the critical density n_c, at which the plasma frequency equals the frequency of the incident light, the dielectric function becomes negative and the refractive index imaginary. For densities above n_c, the incident light is reflected. This effect, together with a steep density gradient on the vacuum side of the target, limits the effective target material thickness to a few hundred μm for typical laser pulses of 10 ns duration. Optimizing the target morphology, the duration of the interaction (i.e., the length of the laser pulse), and the laser intensity and wavelength is key to efficient absorption of the laser radiation by the target material. As with DPP sources, the spectral distribution of the emitted radiation is governed by the plasma temperature, and it can be estimated using the relation [133,134]

\[ T[\mathrm{eV}] = 2.85 \times 10^{-4}\, \left( I[\mathrm{W/cm^{2}}] \right)^{4/9} \]    (8.6)
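Two quick numbers follow from Equation 8.5 and Equation 8.6: the critical electron density for 1064 nm Nd:YAG light (the laser wavelength discussed later in this section) and the plasma temperature at 10¹² W/cm². The sketch below is a plain evaluation of those formulas with standard physical constants.

    import numpy as np

    eps0, m_e, q_e, c = 8.854e-12, 9.109e-31, 1.602e-19, 2.998e8

    # Critical density at which omega equals omega_p (Equation 8.5), for 1064 nm light
    lam = 1064e-9
    omega = 2.0 * np.pi * c / lam
    n_c = eps0 * m_e * omega**2 / q_e**2
    print(f"critical density n_c ~ {n_c:.1e} m^-3 (~{n_c * 1e-6:.0e} cm^-3)")

    # Plasma temperature from the intensity scaling of Equation 8.6
    intensity = 1e12                              # laser intensity (W/cm^2)
    T_eV = 2.85e-4 * intensity ** (4.0 / 9.0)
    print(f"kT ~ {T_eV:.0f} eV")                  # ~60 eV, as stated in the text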
For typical LPP source laser intensities in the 10¹² W/cm² range, this relation results in ~60 eV plasmas. At these temperatures, line emission of highly ionized ions produces the desired EUV radiation in the 13–14 nm range (e.g., for Xe the relevant ion is Xe¹⁰⁺) [135–137]. As for DPP sources, the target material that has been most widely used for LPP sources is Xe because of the ease of material handling and the low levels of contamination. Other target elements that have been identified as efficient spectral emitters in the EUV wavelength range include lithium, oxygen, and metals such as gold, tin, tantalum, tungsten, and copper [138,139]. Although in the past solid state targets were widely used because of the higher achievable conversion efficiencies from laser light to EUV, these targets have the disadvantage of high debris generation and severe contamination
of the condenser optics [140–142]. Therefore, mass-limited targets such as clustered or condensed noble gases or liquid droplets are favored. Typical LPP sources use diode-pumped Nd:YAG lasers (1064 nm) for compactness and favorable power conversion efficiency, with pulse widths of 3–10 ns, pulse energies of 0.5–1 J, and intensities of 10¹¹ to a few 10¹² W/cm², producing plasma temperatures in the 40–100 eV range. In a typical LPP installation, the laser strikes the target jet at a right angle to the target material jet. EUV light emitted from the plasma is collected using an elliptical condenser mirror. A diffuser collects the unused target material from the jet, which, together with gas in the illumination chamber, is recycled for continuous use in the system. Generally, LPP sources can be differentiated by the way the target material is delivered into three groups: spray jet sources, filament type sources, and droplet type sources [79,94,95,100,101,143–153]. There are also other LPP source types, such as ablation type sources, but they are not considered to be among the most promising for commercialization into an EUVL light source.

8.3.3.1 Spray Jet Sources
[Figure 8.9 graph: EUV output (W into C1, 2.5% BW) and conversion efficiency (% into 2π, 2% BW) plotted versus plasma/nozzle separation d from 0 to 10 mm.]
Figure 8.9 shows a picture of the liquid spray jet and the dependence of the conversion efficiency and the EUV output, as measured at the first condenser surface, on the plasma/nozzle separation [145]. (The liquid spray jet consists of small Xe droplets; it is an improvement over the earlier gas and cluster spray jets, with higher EUV output, but it has the same general features and problems as earlier spray jet sources.) In order to have a high EUV output, the laser has to intersect the spray jet as close to the nozzle as possible because the spray jet diverges, and the intersected target flux in the laser focus area decreases with distance. On the other hand, the closer the plasma production is to the nozzle, the more severe the nozzle erosion and heating. Erosion limits the lifetime of the nozzle, a critical source component, and heating not only presents a thermal management problem (the heat has to be conducted away) but also affects spray jet stability. Therefore, the operating window for the liquid spray jet has to be chosen carefully in order to provide high EUV output with sufficient source stability and low nozzle erosion. Although the EUV output of liquid spray jet sources is too low to enable the high wafer throughput required in commercial EUV scanners, a liquid spray jet source has been the workhorse used in the first EUV alpha tool (the ETS at the
FIGURE 8.9 Liquid spray jet source developed at the Virtual National Laboratory (VNL). The picture shows the liquid spray jet stream and indicates where the laser beam intersects the jet. The graph shows the extreme ultraviolet (EUV) power measured at the location of the condenser and the conversion efficiency as a function of the plasma/nozzle separation d; in the picture shown, d is 2 mm. (From Ballard, W. P. et al., Proc. SPIE, 4688, 302, 2002; picture courtesy of VNL and EUVLLC.)
VNL in Livermore) for full field imaging [154]. For the ETS LPP source, a 1.5 kW laser consisting of three 500 W modules, each operating at 1.6 kHz, was used. Operating a single 500 W module with EUV conversion efficiencies around 0.2%, the peak EUV power output reached 1 W, with collected power close to 0.4 W (depending on the plasma/nozzle separation; refer to Figure 8.9).

8.3.3.2 Filament or Liquid Jet Sources
One of the key advantages of LPP sources is that the location of plasma generation can be chosen to minimize the source proximity to surface materials. The spray jet sources are not really capable of using this advantage because the target jet rapidly diverges as it leaves the nozzle exit, and the plasma has to be generated just a few millimeters away from the nozzle. The filament jet or liquid jet [94,143,147,149–152,155–157] and the droplet source solve this problem in different ways. The filament jet or liquid jet can be either completely frozen or a liquid stream; it essentially projects a solid rod or liquid stream of target material out into the vacuum, where it can be intersected far away from the nozzle. Producing the filament jet or liquid jet can be achieved by operating a Xe liquefaction system in the right temperature/pressure range and ejecting the stream through nozzles of 10–100 μm size into the vacuum. The distinction between liquid jet and filament jet is only a gradual one because evaporative cooling ensures that the liquid jet will start to freeze from the outside toward the center of the jet stream as it passes through the vacuum. Therefore, a liquid jet will always be partly frozen, and the filament jet starts out at the nozzle in liquid form (otherwise, it would not pass through it), but its starting temperature is lower than for the liquid jet, and it freezes to the core of the stream exiting the nozzle very quickly. Figure 8.10 shows a photo of a 50 μm diameter filament jet with the target rod protruding 25 mm into the vacuum. The critical issue with filament/liquid jets is the production of a rod of ~50–100 μm diameter far from the nozzle that is stable enough so that it can be hit in space by a focused laser beam whose focal area is probably just a few times the filament diameter. Working distances of up to 50 mm (i.e., nozzle/plasma separation) with a shot-to-shot EUV power stability of <3% (1σ) have been achieved [150,151]. Producing the output power required by commercial EUV lithography tools can only be accomplished if LPP sources can be operated at high repetition frequencies. This, in turn, requires a high enough filament speed so that the filament disturbance caused by one laser shot does not affect the arriving filament volume that supplies the target material for the next laser shot. If the velocity of shock waves through the liquid stream or the filament rod were to determine the repetition frequency, liquid jets should top filament jets because sound (and, therefore, a shock wave) travels faster in a solid than in a liquid. However, this does not seem to be the case, because the laser shot completely breaks the filament, and the important timescale governing the repetition frequency is the time it takes to transport the broken liquid jet or filament jet section away from the laser focus area before the next laser shot is fired. Measurements show [150] that the broken sections have length scales in the
FIGURE 8.10 Example of a Xe filament of 50 μm diameter and 25 mm length. (Photo courtesy of VNL and EUVLLC.)
millimeter range, and with respective jet velocities around 50 m/s, the minimum time that has to pass until the next laser shot can be fired is around 20 μs, i.e., repetition rates up to 50 kHz should be possible. An improved version based on the ETS source design and using a liquid Xe jet demonstrated 22 W of generated EUV power (2% bandwidth) into 2π sr with 0.87% conversion efficiency (laser power to EUV power) and 9.4 W at the IF using 2.475 kW of laser power at a 5 kHz repetition frequency, and it demonstrated 1.6% (1σ) dose stability averaged over 50 pulses [158,159].

8.3.3.3 Droplet Sources
As with the filament or liquid jet source, the droplet source [141,160–168] solves the problem of moving the location of plasma generation away from the nozzle and, at the same time, reduces the amount of target material that is introduced into the source chamber. For rare gas targets, this may not be such a big advantage; therefore, the focus has shifted to filament and liquid jet development. However, this might change if targets have to be used that are more contaminating, e.g., tin, for which mass-limited droplet targets have definite advantages. The use of droplet sources also requires synchronization of the laser pulse with the arrival of the droplet at the target location.

8.3.3.4 LPP Source System Aspects
Total system efficiency is determined by the efficiency of converting electrical power to laser power, the percentage of laser power that is converted into EUV, and the collection of the EUV flux by the condenser. Solid state lasers nominally operate in the 5% efficiency range in converting electrical power to laser power. The conversion of laser power into EUV power requires tailoring the laser pulse shape, duration, target point, and power delivered to the target to maximize the efficiency. For Xe liquid or solid targets, conversion efficiencies as high as 1.2% are expected. Power scaling is typically obtained by increasing the operating efficiency through laser and Xe target improvements. Assuming repetition rates of up to 25 kHz for a Xe target, a laser pulse energy of 1 J, a conversion efficiency of 1.2%, and a collection angle of 5 sr, a 25 kW laser could produce 115 W of EUV power at the IF point. Use of non-Xe targets such as tin (Sn) would lead to even higher EUV power levels. Laser produced plasma sources with noble gas targets are inherently cleaner sources than DPP sources, and the point-source plasma geometry in a vacuum environment provides emission characteristics that are essentially isotropic (i.e., into 4π). Therefore, near-normal incidence Mo/Si Bragg reflector condenser mirrors are used to collect as much light as possible. Because these mirrors are exposed to the high energy particle flux ejected from the plasma, multilayer erosion is expected to be a significant problem. For the LPP liquid spray jet source that has been used for the ETS alpha tool, this has indeed been observed [169,170]. As for DPP sources, a detailed understanding of the debris responsible for the damage is a prerequisite to reducing condenser erosion. Most likely, there are two primary causes of condenser erosion: very high kinetic energy ions (kinetic sputtering) and highly charged ions that cause Coulomb explosions (potential sputtering) when hitting a surface, regardless of their kinetic energy [171].
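The power-scaling example given earlier in this subsection (25 kHz repetition rate, 1 J pulses, 1.2% conversion efficiency, 5 sr collection) can be checked in a few lines. Losses between the collector and the IF are neglected here, which is why the result comes out slightly above the 115 W quoted in the text.

    import math

    rep_rate = 25e3          # pulses per second
    pulse_energy = 1.0       # laser pulse energy (J)
    conversion = 0.012       # laser-to-EUV (in-band) conversion efficiency
    collection_sr = 5.0      # collector solid angle; LPP emission is roughly isotropic into 4*pi

    laser_power = rep_rate * pulse_energy                        # 25 kW average laser power
    euv_into_4pi = laser_power * conversion                      # ~300 W of in-band EUV
    collected = euv_into_4pi * collection_sr / (4.0 * math.pi)   # ~120 W directed toward the IF
    print(f"laser {laser_power / 1e3:.0f} kW -> {euv_into_4pi:.0f} W EUV (4*pi) -> {collected:.0f} W collected")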
For LPP sources, there are some simple debris mitigation techniques that are easy to implement, such as using an Ar background gas in the chamber, or the addition of other methods such as combinations of electric grids to stop or divert ions before they hit an optics surface [172]. Laser produced plasma system advantages include: a small radiation volume, i.e., a small, well-defined plasma providing a good etendue match for collection by the illumination system; exact initiation of the plasma by the laser pulse; no direct current on the source components; and no need for a spectral purity filter, because the spectral emission of the plasma is primarily within the bandwidth of interest and, after a single multilayer
reflection, a spectral purity filter is not required. Disadvantages include low electrical-to-EUV conversion efficiencies (~0.02% for Xe and 0.044% for Sn), a result of the intermediate electrical-to-optical energy conversion, and the perceived higher system complexity and cost.

8.3.4 Synchrotron Radiation Sources
Synchrotron radiation sources have been investigated as a mature candidate source technology for EUVL [173,174]. They can be made very reliable, and they have no issues with contamination. On the other hand, synchrotrons are not compact and have unique and costly facilities requirements. A significant issue regarding the use of bending magnet synchrotrons for EUVL is that these sources emit broadband EUV power that achieves high brightness in the narrow vertical axis but low brightness in the extended horizontal plane. Thus, a complex condenser design will be required to extract sufficient in-band power to achieve acceptable throughput with a scanning ring-field projection optic. Several patents and publications have appeared describing high-efficiency synchrotron condenser designs, but the power level remains a factor of two to three below laser-plasma source projections. For synchrotrons to produce the required throughput, it will be necessary to employ insertion devices such as undulators or wigglers, consisting of a periodic array of permanent magnets inserted into a straight section of the synchrotron. The power increases with the number of periods, and these sources would be more complex and expensive than their bending magnet counterparts, but they would allow more efficient radiation collection because of their much smaller emission solid angle. A synchrotron-based system has been proposed in which each synchrotron ring supports two EUVL steppers [174]. The projected power for a hybrid ring with a wiggler, a beam energy of 500 MeV, and a beam current of 1 A is 1.9 kW over all wavelengths and 8 W of power at 13 nm into a 2% bandwidth. The power scales slowly with beam energy, and it is estimated that a synchrotron source generating 100 W of EUV would cost on the order of $350M. With the right facilitization, this source could support up to eight scanners. Out-of-band power dissipation and the Bremsstrahlung radiation, with its associated shielding and monitoring, present additional facilitization challenges.

8.3.5 EUV Source Metrology
True to the motto, "You can't make what you can't measure," EUV source metrology is a key enabler for commercial quality EUV sources. Precise and stable measurement methods for measuring EUV emission from the source volume are a prerequisite for effective source optimization. EUV sources must be characterized in terms of absolute in-band power, the spectral distribution of in-band EUV radiation and out-of-band radiation (VUV, DUV, and IR), the spatial dimensions and stability of the emitting source volume, and the angular distribution of the light emission [175–177]. An important step in benchmarking source capabilities has been the so-called flying circus (FC) EUV source comparison [178–180]. Initiated by ASML to review the development status of EUV sources, including a method for standard comparison, a portable narrow-band XUV diagnostic was applied in a benchmark effort to evaluate the commercial EUV source development status for a number of source concepts, including DPP and LPP sources [179].
International SEMATECH expanded this effort under the FC-2 program and continues to drive standardization of source metrology and industry benchmarking to assess the progress of EUV source development and commercialization [77]. Among the metrology needs identified by industry experts that could be addressed in a precompetitive fashion are time-resolved spectral, imaging, and high-repetition-rate
diagnostics for individual EUV pulses and for repetition rates up to 10 kHz, respectively; broadband radiation measurements; radiometry and spectral characterization capabilities, including instrument calibration; reflectometry for multilayer and other materials and source metrology for IF characterization; a common source metrology roadmap; and, finally, metrology for in situ, real-time measurements [181]. An international EUV source metrology roadmap would be very helpful in addressing the need for qualification and benchmarking of commercial grade EUV sources. As important as the characterization of source performance is for the optimization of EUV output, a standardized metrology is even more important in determining progress in EUV source critical component lifetimes in order to lower EUV source CoO. Methods are needed for monitoring and characterizing the actual status of electrode degradation, optics component erosion and/or deposition of materials on optics surfaces, the effectiveness of debris mitigation, thermal stress, etc. The respective measurement methods needed include environmental metrology techniques for EUV sources. Commercial grade EUV sources will benefit from the availability of in situ, real-time metrologies that can closely track the environmental health of individual source components.
TABLE 8.3
Development Status of Electric Discharge-Produced Plasma Sources

Company/Consortium | Source Type | Target | Rep. Rate (kHz) | Conversion Efficiency(a) (%) | Power into 2π sr(b) (W, 2% bandwidth) | Power at IF (W, 2% bandwidth)
VNL/UCF/GREMI [93] | Capillary discharge(c) | Xe | 1 | 0.1 | 18.8 | 4
Cymer [124,125] | Dense plasma focus | Xe | 2 | 0.45 | 66 | 6.6
Philips extreme UV(d) [107,108] | HCT Z-pinch | Xe | 5 (burst) | 0.45 | 200 | 20
 | | Xe | 7 | 0.4–0.5 | 50(e) | 10
XTREME technologies [94,101] | Z-pinch | Sn | 4 | 1.2–1.86 | 106 | 21
 | | Xe, Sn(f) | 4 | >0.55 | 120 | 10–20
TRINITI(g) [102,103] | Z-pinch | Xe | 1 | 1.2 | 70 | 14(e)
 | | Xe | 1.7 (burst) | 1.2 | 200 | 40(e)
 | | Sn | 1 (burst) | 1.4–2.26 | 100 | 20(e)
PLEX LLC [128] | Star pinch | Xe | 1 | 0.5 | 14 | 2.8(e)
Extreme ultraviolet lithography system development association (EUVA)(h) [129] | "Capillary Z-pinch" | Xe | 2 (burst) | 0.57 | 39.3(b) | 7.9(e)

(a) Average conversion efficiency and highest observed value.
(b) If values for less than 2π sr were reported, values were recalculated to equivalent 2π sr values for ease of comparison.
(c) Laboratory prototype.
(d) Collaboration with Fraunhofer ILT, Aachen, Germany.
(e) No data available; estimated from the reported value (power at source or at IF, respectively) assuming π sr collection efficiency and 40% EUV transmission between source location and IF.
(f) Sn results are identical with results from TRINITI.
(g) Troitsk Institute for Innovation and Fusion Research, Russia.
(h) Early development: USHIO, Gigaphoton, Tokyo Institute of Technology, Kumamoto University.
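The equivalent values flagged by footnotes b and e are simple geometric rescalings. As a rough illustration of how such estimates can be reproduced (a sketch only; the π sr collection and 40% transmission between source and intermediate focus are the assumptions stated in footnote e, and the function names are ours):

import math

def equivalent_2pi_sr(power_w, reported_solid_angle_sr):
    # Footnote b: rescale a power reported over a smaller solid angle to an
    # equivalent 2*pi sr value, assuming uniform emission over the hemisphere.
    return power_w * 2.0 * math.pi / reported_solid_angle_sr

def power_at_intermediate_focus(power_into_2pi_sr_w,
                                collection_sr=math.pi, transmission=0.4):
    # Footnote e: estimate in-band power delivered to the intermediate focus (IF)
    # from the power radiated into 2*pi sr.
    return power_into_2pi_sr_w * (collection_sr / (2.0 * math.pi)) * transmission

# Example: 70 W into 2*pi sr gives roughly 14 W at IF, consistent with the
# footnote-e entries in the table.
print(round(power_at_intermediate_focus(70.0), 1))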
8.3.6 Commercialization Status and Production Needs
Although substantial progress has been made in improving source power, and although over ten companies are developing laser-produced and electric discharge plasma sources, substantial development is still required to meet the increased power levels needed for production. For example, the EUV source power has increased from a fraction of 1 W in 1996 to greater than 15 W (2π sr, 2% bandwidth); at the same time, the source requirements have increased to over 300 W in the collection volume to support high wafer throughput and affordable operating costs. The higher power requirements have placed increased demands on source thermal control and debris mitigation and on improving the EUV flux transmission through the entire system by improving mirror reflectance, reducing the number of mirrors, reducing the need for spectral purity filters, improving resist sensitivity, and improving environmental control. As a consequence of the increased power requirements, it has become less likely that commercial grade sources (DPP as well as LPP) for high volume manufacturing will use Xe as a target [102,164,165]. For some of the sources, the EUV power limitation of Xe targets has become clear, and more effort is now being refocused on targets with higher EUV conversion efficiency, such as Sn [107,108,182–185]. For LPP sources, double-pulse techniques, in which the first pulse prepares the optimum target state for the second pulse to produce maximum EUV output, are being explored [135], and multiplexed lower power lasers are being used instead of a single high power laser beam to increase efficiency and reduce cost [155,156]. Table 8.3 and Table 8.4 summarize the development status of DPP and LPP sources, respectively. As can be seen, there is more activity in DPP than in LPP source development. This is a reversal of the early EUV source development, where the emphasis was on LPP sources. Since then, development of the most powerful LPP source demonstrated [158] has been discontinued, and others have dropped out of the race [182] or have shifted their emphasis from development of high power lithography EUV light sources to development of high brightness metrology sources [152].
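Because the collectable in-band power scales directly with conversion efficiency, the attraction of higher-efficiency targets such as Sn can be illustrated with a back-of-the-envelope estimate (a sketch only: the 300 W figure is the collection-volume requirement quoted above, and the efficiencies are representative Xe-like and Sn-like values from Table 8.3 and Table 8.4, not specifications):

def required_drive_power_kw(euv_power_w, conversion_efficiency_percent):
    # Laser or electrical drive power (kW) needed to radiate the requested
    # in-band EUV power at the stated conversion efficiency.
    return euv_power_w / (conversion_efficiency_percent / 100.0) / 1000.0

for ce in (0.5, 2.0):  # roughly Xe-like vs. Sn-like conversion efficiency, in %
    print(ce, "% CE ->", required_drive_power_kw(300.0, ce), "kW of drive power")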
TABLE 8.4
Development Status of Laser-Produced Plasma Sources

Company/Consortium | Source Type(a) | Target | Laser Power (kW)/Rep. Rate (kHz) | Conversion Efficiency (%) | Power into 2π sr (W, 2% bandwidth) | Power at IF (W, 2% bandwidth)
NGST/CEO(b) [154,158] | SJ | Xe | 0.5/1.6 | 0.2 | 1 | 0.4
 | LJ | Xe | 2.5/5 | 0.87 | 22 | 9.4
JMAR(c) [182] | ML | Sn | 0.6/2.5 | >1 | 1.6 | 0.64(d)
EXULITE(e) | SJ | Xe | 0.035/NA(f) | 0.71 | 0.025 | 0.008
Innolite(g) [152] | LJ | Xe, Sn | 0.05/>17 | 0.95 | 0.5 | 0.2(d)
XTREME technologies [94,101] | LJ | Xe | 0.3–0.5/3.3 | 0.94 | 2 | 1
Powerlase [155,156] | SJ | Xe | 0.6/3.5 | 0.9 | 5.4 | 2.7
EUVA [129,157] | FJ | Xe | 0.6/10 | 0.33 | 2 | 0.8(d)

(a) LJ, liquid jet; ML, mass limited (droplet target); SJ, spray jet (or cluster jet); FJ, filament jet.
(b) NGST/CEO has discontinued LPP source development, but other companies could still use the technology.
(c) JMAR has discontinued EUV source development.
(d) No data available; estimated assuming 2π sr collection efficiency and 40% EUV transmission between plasma and IF.
(e) The members of the French EXULITE consortium are THALES Laser SA, CEA, and Alcatel Vacuum Technology.
(f) No data available.
(g) Innolite is commercializing the filament jet/liquid jet technology developed at the Swedish Royal Institute of Technology.
Discharge produced plasma sources currently appear to have the lead and will most likely be used in the first commercial EUV lithography tools. However, for high volume manufacturing, the best source, in terms of providing the most reliable and cost efficient solution, has yet to be demonstrated.
8.4 EUV Masks
Although EUVL is an optical extension of DUV lithography technology to shorter wavelengths and supports the direct transfer of learning from 248, 193, or 157 nm optical lithography, the basic structure of an EUV mask is different from that of conventional optical masks [186,187]. In contrast to the other technologies, EUVL cannot use a transparent mask; it must use reflective masks employing the same kind of multilayer technology that is used for the EUV camera optics. The major steps in producing an EUV mask are schematically outlined in Figure 8.11. The process starts with a substrate material that is a highly finished blank on which the multilayer stack is deposited. This intermediate blank, i.e., the substrate with the multilayer stack on top, is commonly referred to as the EUV mask blank. Finally, patterning of the EUV mask blank, including the respective inspection and repair steps, yields a finished EUV mask. The following sections will describe, in detail, the requirements and specifications, current state of technology, and challenges ahead for EUV mask substrates, EUV mask blanks, and the EUV mask patterning process.
FIGURE 8.11 Extreme ultraviolet (EUV) mask manufacturing flow: substrate qualification; multilayer deposition and mask blank inspection; buffer layer deposition; absorber layer deposition; pattern generation; pattern transfer into absorber; patterned mask inspection and defect repair; buffer layer etch and final inspection.
8.4.1 EUV Mask Substrates
Low thermal expansion materials (LTEM) are required for EUV mask substrates because EUV radiation-induced heating of the mask would otherwise lead to image distortions at higher CTE values. The form factor of the EUV mask substrate is the same as for the transparent masks currently in use. However, the specifications for flatness, roughness, and defects are much more stringent than for the transparent glass substrates currently used. Table 8.5 lists the most stringent specification requirement in each category from the industry standard SEMI P37-1102 [188] for EUV mask substrates, which outlines the specifications in much more detail. Figure 8.12 shows the dimensional specifications for EUV mask substrates, and it identifies the area specifications referred to in Table 8.5.
During the last few years, significant progress has been made by several suppliers in meeting flatness, roughness, and defect specifications for EUV masks. A major challenge has been meeting all requirements simultaneously, specifically with respect to flatness and roughness; improvements in flatness could be made at the expense of increased roughness, and if roughness was improved, non-flatness increased. The best substrates available in late 2003 have flatness values in the range of 150–200 nm P–V and surface roughness values of approximately 0.25 nm rms. A comparison to the specifications in Table 8.5 indicates that flatness still needs to be improved by a factor of 6 (a factor of 3 for the 50 nm specification requirement, which is sufficient for beta type exposure tools for the 45 nm node) and surface roughness by about a factor of 2.

TABLE 8.5
Mask Substrate Specifications for Commercial Extreme Ultraviolet (EUV) Mask Substrates

Parameter | Definition | Specification
Coefficients of thermal expansion (CTE) | Mean value | 0 ± 5 ppb/°C
Front side flatness | P–V within flatness quality area | 30 nm
 | P–V flatness over entire surface: λspatial(a) ≤ edge length | 1000 nm
Back side flatness | P–V within flatness quality area | 30 nm
 | P–V flatness over entire surface: λspatial ≤ edge length | 1000 nm
Low order thickness variation | Within flatness quality area after removing wedge angle: λspatial ≤ edge length | 30 nm
Wedge | Wedge angle | ≤100 µrad
Local slope of front surface | Local slope angle: 400 nm ≤ λspatial ≤ 100 mm | ≤1.0 mrad
Front side surface roughness | Surface roughness in quality area: λspatial ≤ 10 µm | ≤0.15 nm rms
Back side surface roughness | Surface roughness over entire back surface: 50 nm ≤ λspatial ≤ 10 µm | ≤0.5 nm rms
Front side surface defects | Localized light scatterers >50 nm PSL(b) equivalent size in defect quality area | 0 cm⁻²
Back side surface defects | Number of localized light scatterers with PSL equivalent size >1.0 µm in flatness quality area | 0

(a) λspatial refers to the spatial period.
(b) PSL, polystyrene latex sphere.
Source: From SEMI P37-1102, Specification for Extreme Ultraviolet Lithography Mask Substrates, Semiconductor Equipment and Materials International, San Jose, CA, 2002.
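The flatness and roughness entries above are defined over specific spatial-period bands of the surface height map. A minimal sketch of how such quantities might be evaluated from measured height data (NumPy-based; the uniform sampling grid, the plane fit for P–V flatness, and the simple FFT high-pass used to isolate short spatial periods are our illustrative simplifications, not the SEMI P37-1102 procedure):

import numpy as np

def peak_to_valley_nm(height_map_nm):
    # P-V flatness over the supplied quality area after removing the best-fit plane.
    ny, nx = height_map_nm.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    basis = np.column_stack([xx.ravel(), yy.ravel(), np.ones(xx.size)])
    coeff, *_ = np.linalg.lstsq(basis, height_map_nm.ravel(), rcond=None)
    residual = height_map_nm.ravel() - basis @ coeff
    return float(residual.max() - residual.min())

def band_limited_rms_nm(height_map_nm, pixel_size_nm, max_period_nm):
    # rms roughness keeping only spatial periods <= max_period_nm
    # (the roughness entries in Table 8.5 limit the spatial period), via FFT filtering.
    h = height_map_nm - height_map_nm.mean()
    spectrum = np.fft.fft2(h)
    fy = np.fft.fftfreq(h.shape[0], d=pixel_size_nm)
    fx = np.fft.fftfreq(h.shape[1], d=pixel_size_nm)
    freq = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
    spectrum[freq < 1.0 / max_period_nm] = 0.0  # drop long-period (flatness) content
    roughness = np.fft.ifft2(spectrum).real
    return float(np.sqrt(np.mean(roughness ** 2)))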
FIGURE 8.12 Dimensions and flatness quality area specification of extreme ultraviolet (EUV) mask substrates (152 mm edge length, 6.3 mm thick; the flatness quality area is inset 5 mm from each edge). The current working definition for the defect quality area is the same as for the flatness quality area. Area specifications apply to front and back.
Whereas flatness and surface roughness values improved over time, very little to no progress has been made in reducing the defect levels. No concerted effort has been made to reduce EUV mask substrate defect levels because the perception has been that, once the right processes have been found for meeting roughness and flatness requirements, defect reduction should be straightforward. In addition to reducing the number of defects generated by the grinding and polishing processes, defect mitigation techniques such as a substrate smoothing layer to cover small defects are being considered. CTE values reported by suppliers are already very close to the 5 ppb/°C specification value. Low order thickness variation, wedge angle, and local slope specifications are currently not being addressed, but they will need more attention with respect to the metrologies that can be used to measure those parameters. Specifically for low order thickness variation and local slope, the metrology requirements, such as the number and location of data points needed to quantify those parameters over the mask substrate surface, are not known. Those specifications may be revisited because additional smoothing layers applied before depositing the EUV multilayer stack, or the chucking methods used to grip EUV masks, could change those requirements.
Progress in meeting EUV substrate specifications is driven by specific product demands. Over the last few years, mask houses of semiconductor manufacturers and independent mask shops began purchasing EUV mask substrates according to the substrate class specifications outlined in SEMI P37-1102 [188]. However, the overall demand for EUV substrates is still low. In an effort to help develop the EUV substrate and EUV mask blank infrastructure, ISMT and the state of New York have created the EUV Mask Blank Development Center (MBDC) in Albany, New York. In addition to generating demand for EUV mask substrates, the MBDC provides the metrology infrastructure for both EUV mask substrate and EUV mask blank development that is accessible to prospective EUV mask substrate and mask blank suppliers.

8.4.2 EUV Mask Blanks
Several thin film deposition technologies have been evaluated for depositing the EUV multilayer stacks. The requirements for EUV multilayers used on mask blanks are different from those used for EUV optics because the mask surface is at the object focus of the imaging system, whereas the imaging optics surfaces are not in focal planes. In contrast to defects on a mask surface, defects on the imaging optics are not imaged into the image plane (although light scattering caused by those defects will still affect contrast and flare). Therefore, for depositing EUV multilayers on EUV mask substrates, thin film deposition techniques generating few defects are favored. Because thin film deposition using MSD generates significantly more defects than IBSD, most efforts have concentrated on developing IBSD for EUV mask blank manufacturing. In addition, IBSD offers other useful advantages, such as its defect smoothing properties.
FIGURE 8.13 Reflectivity curve for a 40 bi-layer Mo/Si multilayer mask blank, assuming unpolarized light, perfect interfaces, and a 6° angle of incidence. The so-called centroid wavelength is the median wavelength in the full width half maximum (FWHM) range, λcentroid = (λ1 + λ2)/2, where λ1 and λ2 are the wavelengths at the FWHM points. (Calculated using the program provided by Eric Gullikson on the Lawrence Berkeley National Laboratory Center for X-Ray Optics (CXRO) website: http://www.cxro.lbl.gov/optical_constants/.)
The only drawback of IBSD with respect to MSD is that multilayers produced with IBSD currently have somewhat lower peak reflectivity values than those produced with MSD. Other deposition techniques such as EBED or ALD have not been used for making mask blanks, although EBED is being used for manufacturing EUV projection optics. The SEMI P38-1103 [189] standard details the specifications for EUV mask blanks (as well as for absorber stacks and any additional capping layer, under layer, or conductive back side layer). Figure 8.13 shows a reflectivity curve calculated using the program provided in Ref. [190] for a 40 bi-layer Mo/Si EUV multilayer, 6° off-normal incidence, unpolarized light, and perfect interfaces. It shows the definition of the centroid wavelength, which is the arithmetic mean of the two wavelengths defining the FWHM points of the curve. The most important specifications for the EUV blank multilayer are those for peak reflectivity, centroid wavelength, FWHM (i.e., bandwidth), and defect count. Those specifications are summarized in Table 8.6 for the most aggressive product class specified (as for the EUV mask substrate specifications [188], less aggressive specifications apply to the development and beta type mask blank product classes that are also specified in [189]). Although the mask blank area over which those specifications apply is not defined in [189], it is for all practical purposes identical with the flatness quality area shown for the mask substrate in Figure 8.12.

TABLE 8.6
Extreme Ultraviolet (EUV) Multilayer Specifications for Commercial EUV Mask Blanks

Parameter | Specification
Peak EUV reflectivity | >67%
Maximum range of peak reflectivity (absolute) | 0.50%
Maximum range of bandwidth | 0.005 nm at full width half maximum (FWHM)
Maximum range of centroid wavelength | 0.06 nm
Defect requirements in PSL equivalent size range | 0 defects >25 nm

Source: From SEMI P38-1103, Specification for Absorbing Film Stacks and Multilayers on Extreme Ultraviolet Lithography Mask Blanks, Semiconductor Equipment and Materials International, San Jose, CA, 2003.
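Given a measured reflectivity curve such as the one in Figure 8.13, the quantities specified in Table 8.6 can be extracted directly. A minimal sketch (assuming a sampled single-peaked curve and using linear interpolation at the half-maximum crossings; the function name is ours):

import numpy as np

def multilayer_curve_metrics(wavelength_nm, reflectivity):
    # Peak reflectivity, FWHM bandwidth, and centroid wavelength, where the
    # centroid is the arithmetic mean of the two half-maximum wavelengths
    # (lambda_centroid = (lambda_1 + lambda_2) / 2, as defined in Figure 8.13).
    w = np.asarray(wavelength_nm, dtype=float)
    r = np.asarray(reflectivity, dtype=float)
    r_peak = float(r.max())
    half = r_peak / 2.0
    above = np.where(r >= half)[0]
    i1, i2 = above[0], above[-1]
    # Interpolate the crossing wavelengths on the rising and falling edges.
    lam1 = np.interp(half, [r[i1 - 1], r[i1]], [w[i1 - 1], w[i1]]) if i1 > 0 else w[i1]
    lam2 = np.interp(half, [r[i2 + 1], r[i2]], [w[i2 + 1], w[i2]]) if i2 < len(w) - 1 else w[i2]
    fwhm = abs(lam2 - lam1)
    centroid = 0.5 * (lam1 + lam2)
    return r_peak, fwhm, centroid

Mapping such curves across the blank then gives the peak reflectivity, bandwidth, and centroid wavelength ranges against which the Table 8.6 limits are checked.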
Commercial EUV blanks meeting the specifications in Table 8.6 for peak reflectivity variation and centroid wavelength uniformity over the flatness quality area depicted in Figure 8.12 became available in the second half of 2003. The peak reflectivity target of >67% has not been met; the best EUV blanks achieve about 63% reflectivity. The one specification that has proved the most difficult to meet is the defect specification. Because EUV mask blank defect levels govern mask blank yield, the number of defects has a large impact on EUV mask cost. In the early days of EUV, it seemed as if the only way to attain a reasonable mask blank yield was to start with zero-defect EUV mask substrates and add no defects during multilayer deposition, because it was believed to be impossible to repair EUV mask blank defects. However, this belief has changed dramatically because of a concerted industry effort, mainly by the EUVLLC, that funded mask blank development work at the VNL. Over the last several years, researchers at the VNL developed a double-pronged strategy of reducing the number of defects by harnessing the inherent smoothing properties of IBSD multilayer deposition processes [191,192] and developing repair techniques for those defects that could not be smoothed or covered and that could end up as printable EUV mask blank defects. In addition, the VNL demonstrated the feasibility of EUV mask blank repair techniques. The multilayer deposition technique developed by the VNL (which, in the meantime, has been adopted by suppliers) uses an ion beam thin film planarization process between successive layer depositions to smooth substrate defects during the deposition process and essentially render substrate defects ≤50 nm nonprintable. This is illustrated in Figure 8.14, which shows, on the left, a cross section transmission electron microscope (TEM) image of a ~60 nm gold sphere deposited on a mask blank substrate and then coated with 40 Mo/Si bi-layers; the accompanying graph on the right shows atomic force microscope (AFM) height scans of an as-deposited 50 nm gold sphere, of the 6.5 nm high bump left on average on top of the multilayer film after a normal Mo/Si deposition process, and of the 1 nm high bump left on average after a deposition process utilizing an ion beam thin film planarization technique. Although the defect height that gets transported to the surface of the multilayer stack is very much reduced and does not print, the disturbance caused by the defect throughout the bulk of the multilayer can shift the wavelength and cause a change in local reflectivity, impacting CD uniformity.
FIGURE 8.14 Cross section transmission electron microscope (TEM) image of a multilayer-coated gold sphere of 50 nm size (left). Atomic force microscope (AFM) scans of such a gold sphere (a) as deposited on the mask substrate; (b) after a standard Mo/Si IBSD deposition process (mean height 6.5 nm); and (c) after a deposition process that utilizes the thin film planarization technique (mean height 1.0 nm). The length scale on the abscissa shows that the width of the defect does not change very much. (From Mirkarimi, P. B. and Stearns, D. G., Appl. Phys. Lett., 77, 2243, 2000. Image and data courtesy of Mirkarimi, P. B.)
However, the changes introduced by the smoothing process are small when compared to the CD change introduced by an unsmoothed defect. Simulations for a 35 nm isolated line with a proximity defect (NA = 0.25; σ = 0.9; 9-mirror bandpass; and 30% threshold resist) show that using the standard multilayer deposition process leads to high double-digit CD changes (>70%), whereas the planarization process shifts the wavelengths of the multilayers much less and results in CD changes of <10% [193]. With a CD change of less than 10%, the 1 nm high bump left after the smoothing deposition process shown in Figure 8.14 will not be considered a printable defect [194,195]. The dual strategy developed by the VNL/EUVLLC in the late 1990s of making substrate defects ≤55 nm nonprintable via thin film planarization while simultaneously reducing the size and number of defects added during the deposition process is illustrated in Figure 8.15. As noted, progress has been made with respect to rendering larger and larger substrate defects nonprintable. In effect, the thin film planarization technique has been so successful that the goal of being able to render a 55 nm substrate defect nonprintable was not only met but surpassed in 2003. When thin film planarization was first used to smooth substrate defects, it was not clear whether this process would be compatible with the centroid uniformity requirement in Table 8.6. In fact, it initially looked as if high centroid uniformity and planarization of substrate defects were not compatible. However, further development improved the smoothing capability and the deposition process in such a way that they became compatible with the stringent mask blank uniformity requirement. This is illustrated in Figure 8.16 and Figure 8.17. Figure 8.16 shows the centroid uniformity and the peak reflectivity uniformity for a mask blank manufactured using the thin film planarization technique that renders the 70 nm substrate defect shown in Figure 8.17 nonprintable. Centroid and peak reflectivity uniformity both meet the most stringent SEMI P38-1103 specifications, and the maximum peak reflectivity is within 1% of the target specification.
FIGURE 8.15 The graph on the right illustrates the dual strategy conceived by the VNL/EUVLLC in the late 1990s of reaching the defect-free mask blank goal by smoothing substrate defects ≤55 nm and by reducing the size and number of defects added during the deposition process. As the left graph demonstrates, the smoothing goal was exceeded in 2003 when the virtual national laboratory (VNL) demonstrated smoothing of 70 nm substrate particles to less than 1 nm residual height transported to the multilayer surface. (From Walton, C. C. et al., Proc. SPIE, 3997, 496, 2003; Mirkarimi, P. B. et al., 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Belgium, 2000. Image and data courtesy of Walton, C. C. et al. and Mirkarimi, P. B. et al.)
FIGURE 8.16 Examples of centroid and peak reflectivity uniformity obtained with a thin film planarization process capable of smoothing 70 nm substrate particles. For the blank shown (V1612), plotted as a function of distance from the blank center, the centroid wavelength averages 13.483 nm with a standard deviation of 0.0042 nm, and the peak reflectivity Rmax averages 66.23% with a standard deviation of 0.11%. (Data courtesy of Mirkarimi, P. B. et al.)
Interestingly, with the planarization process used, a shallow depression is observed in the multilayer surface instead of a bump (compare Figure 8.15). Whereas progress has been made in rendering substrate defects up to 70 nm in size nonprintable by using thin film planarization techniques, far less progress has been made in reducing the number of defects introduced during the multilayer deposition process itself. The best defect levels achieved are in the range of 0.05 defects/cm² for defects larger than 90 nm PSL equivalent size [58]. This defect level would still amount to about 20 defects >90 nm within the quality area outlined in Figure 8.12, and to many more printable defects if smaller defect sizes could have been detected by the inspection tools. Inspection tools capable of detecting defects down to sizes of 60 nm PSL equivalent on mask blanks are now becoming available. Combined with a better understanding of defect generation and migration, those tools will enable progress toward engineering lower-defect deposition processes.
FIGURE 8.17 Defect smoothing results for a 70 nm substrate particle while simultaneously obtaining high centroid and peak reflectivity uniformity. Interestingly, no bump is left at the multilayer surface, but rather a very shallow depression of about 0.5 nm depth, shown in the close-up of the smoothed particle. (Data courtesy of Mirkarimi, P. B. et al.)
As the deposition techniques developed in research laboratories are transferred to deposition tool suppliers, and those suppliers successfully reproduce the excellent centroid and peak reflectivity uniformity results and reduce the number and size of defects on the mask blank, more emphasis will be placed on reducing the number of remaining defects in the multilayer.

8.4.3 EUV Mask Blank Defect Repair
Figure 8.18 shows the mask blank yield as a function of defect density for different numbers of repairable defects and for two different quality areas. Mask blank yields were calculated using the standard Poisson yield model. The top half of Figure 8.18 assumes the full mask blank area is subject to the stringent defect requirements, and the bottom half assumes that only the actual die area needs to meet those requirements. For both cases, yield curves are shown that assume that 0, 1, 2, 3, or 4 defects within the specified quality area can be repaired. As demonstrated, even the capability to repair only one defect within the specified quality area results in a significant yield improvement. The defect roadmap target for EUV mask blanks is 0.003 defects/cm² for EUV production tools at the 45 nm half pitch. This specification is equivalent to a mask blank yield of about 60% if the quality area is specified as the full mask blank area. If one defect per mask blank could be repaired, the specification could already be met at 0.007 defects/cm²; if 2, 3, or 4 defects could be reliably repaired, defect levels of 0.012, 0.017, or 0.021 defects/cm², respectively, would be acceptable. If only the actual die area on the mask were used to define the defect quality area, defect levels of up to 0.04 defects/cm² would still provide a 60% mask blank yield.
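The yield numbers quoted above follow from the standard Poisson model: with defect density D and quality area A, the expected defect count is D·A, and a blank is usable if it carries no more defects than can be repaired. A minimal sketch (the 142 × 142 mm² area is the full quality area of Figure 8.12; treating every defect as repairable up to the stated count is our simplifying assumption):

import math

def mask_blank_yield(defect_density_per_cm2, area_cm2, repairable_defects=0):
    # Cumulative Poisson probability of finding at most `repairable_defects`
    # defects in the quality area, i.e., the usable fraction of mask blanks.
    mean_defects = defect_density_per_cm2 * area_cm2
    return sum(math.exp(-mean_defects) * mean_defects ** k / math.factorial(k)
               for k in range(repairable_defects + 1))

quality_area_cm2 = 14.2 * 14.2  # full 142 x 142 mm^2 quality area
for density, repairable in [(0.003, 0), (0.007, 1), (0.012, 2), (0.017, 3), (0.021, 4)]:
    print(density, repairable, round(mask_blank_yield(density, quality_area_cm2, repairable), 2))

Each of these combinations gives a comparable yield in the 55–60% range, consistent with the roughly 60% level read from Figure 8.18.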
FIGURE 8.18 Mask blank yield calculated using a simple Poisson yield model, plotted against defect density (cm⁻²). The following two different scenarios are shown: (a) the mask blank quality area is the 142 × 142 mm² area shown in Figure 8.12; (b) a smaller 100 × 100 mm² area is used. The numbers next to the curves indicate the number of defects that can be repaired.
FIGURE 8.19 Illustration of phase (a) and amplitude (b) defects on extreme ultraviolet (EUV) mask blanks and the corresponding wafer plane images. (Graphics courtesy of Barty, A. et al.)
Of course, the objective is both to reach the defect level of 0.003 defects/cm² and to be able to repair several defects, increasing the mask blank yield to greater than 90%. When serious development of EUV lithography started in the late 1990s, it was assumed that EUV mask blank defects could not be repaired. However, as has been demonstrated by the VNL/EUVLLC consortium, EUV blank defects can be repaired using different repair strategies for amplitude and phase defects. Figure 8.19 illustrates phase and amplitude defects for EUV mask blanks and the respective wafer plane images [196]. Phase defects are caused by substrate defects for which thin film planarization (i.e., defect smoothing) is not sufficient to render the defects nonprintable. Small phase differences (in the worst case, around quarter wave magnitude) lead to interference patterns in the wafer plane. For illustration purposes, only a substrate particle is shown causing the positive phase defect in Figure 8.19; however, particles deposited during the Mo/Si multilayer deposition can cause phase defects as well. In addition, scratches or pits on the substrate surface can introduce so-called negative phase defects (i.e., a small dent in the multilayer surface instead of a small bump). If particles are added during the final stages of the multilayer deposition process, as shown in Figure 8.19, the disturbance they cause will lead to amplitude defects. In contrast to phase defects, which are pure interference phenomena, amplitude defects cause incident light to be scattered out of the entrance pupil of the optical system. For both defect types, phase and amplitude defects, repair methods have been demonstrated. The repair method for EUV mask blank phase defects uses the fact that Mo/Si multilayers can be compacted by heating because of silicide formation at the Mo/Si interfaces [197].
FIGURE 8.20 Illustration of extreme ultraviolet (EUV) mask blank phase defect repair, including (a) the original defect; (b) multilayer compaction through localized heating of the multilayers above the defect using an e-beam; and (c) the original defect area after repair. (Graphics courtesy of Hau-Riege, S. P. et al.)
Because the density of the silicide is higher than that of the Si layer, the multilayer stack contracts. This is illustrated in Figure 8.20, which shows the phase defect repair process. If each of the 30 multilayer pairs above the defect in Figure 8.20 contracts by 0.1 nm, a phase defect bump of 3 nm can be repaired (i.e., about a quarter of an EUV wavelength). Of course, the multilayer heating has to be localized, so that only the multilayer volume above the defect is compacted and not the volume around it. Because the respective defect sizes are small, focused electron beams are best suited for this task and have been successfully used to demonstrate e-beam phase defect repair of EUV masks. Initially, it was assumed that the electron current needed to repair a phase defect would be very high (on the order of several mA), making it very difficult to achieve the small beam sizes needed to repair small defects. However, the thermal properties of the thin films used in the Mo/Si stack are significantly different from the bulk Mo and Si values that were initially used to estimate the necessary e-beam currents. It has been demonstrated that 1 nm deep and 2.5 µm wide surface depressions can be made in Mo/Si multilayers using e-beam currents in the 1 nA range. The repair method of using localized heating as shown in Figure 8.20 works only for positive phase defects, i.e., when there is a bump in the multilayer surface. Negative phase defects, as caused by scratches and pits in the substrate surface, cannot be repaired using multilayer stack compaction. However, in contrast to positive phase defects, which can be caused by substrate defects and by particles added during the multilayer deposition process, it is reasonable to assume that pit-like defects can only be introduced at the substrate level; it is unlikely that new pit-like defects are added in the bulk of the multilayer stack during deposition. Therefore, all pit defects should see maximum thin film planarization, provided this defect smoothing process indeed works for pits as well as for bumps on the substrate surface. Assuming that there are as many pit-like defects as there are bump-like defects on the EUV mask substrate, and that additional positive, but no additional negative, phase defects are added during the deposition process, positive phase defects should dominate overall. However, it is not known whether the implicit and explicit assumptions made here are true, namely that the size distributions of pit- and bump-like substrate defects are similar and that their numbers are comparable.
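As a quick check of the compaction budget quoted above (the per-pair contraction, the number of heated pairs, and the quarter-wave criterion are the figures from the text):

pairs_above_defect = 30          # Mo/Si bi-layer pairs heated above the defect
contraction_per_pair_nm = 0.1    # contraction per bi-layer pair
euv_wavelength_nm = 13.5

repairable_bump_nm = pairs_above_defect * contraction_per_pair_nm
print(repairable_bump_nm, "nm repairable vs.", euv_wavelength_nm / 4.0, "nm quarter wave")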
Multilayer compaction changes the bi-layer period; therefore, the centroid wavelength will be slightly shifted toward shorter wavelengths in the repaired area. This mismatch of the reflectivity curve within the repaired area leads to a drop in reflectivity there, because the curve no longer perfectly matches the EUV bandpass of the system. However, post-repair reflectivity measurements of micron-sized repair areas show that the drop in EUV reflectivity is small, on the order of a few (2–3) percent, and no dependence of the reflectivity drop on the actual size of the repaired area has been found. A reflectivity drop of a few percent within a small area will not lead to ΔCD/CD variations in the 10% range, even if the most unfavorable circumstances are assumed with respect to placement of lines, etc. Therefore, this drop is not considered critical. Electron energies and currents needed to make a surface indentation of a certain depth can be calibrated, and an in situ inspection of the repaired area in the phase defect repair tool is most likely not necessary. However, the phase defect has to be characterized beforehand in order to determine how much multilayer compaction is needed.
The amplitude defect repair technique developed for EUV mask blanks makes use of the fact that the multilayer stack is a distributed Bragg reflector and that the refractive index of Mo and Si for EUV light is only slightly different from unity. If, therefore, only a small number of layers are removed from the top of the multilayer during amplitude defect repair, the phase difference introduced between rays reflected from the repaired area and from non-repaired neighboring areas will be negligible (i.e., it causes no phase defect), and the drop in reflectance as a result of the removal of only a few layers will be minimal. The amplitude defect repair process, illustrated in Figure 8.21, uses focused ion beam (FIB) milling to scoop out a very shallow crater of multilayer material together with the amplitude defect. A critical aspect of this repair technology is that the scooping exposes Mo/Si interfaces [198,199]. As shown in Figure 8.22, the exposed Mo quickly oxidizes in atmosphere, leading to significant reflectivity variation within the repaired area (actually, an oscillatory reflectivity variation on crossing the exposed interfaces). Therefore, it is critical that, after scooping out the amplitude defect, the freshly exposed surface of the repair area be coated with a protective capping layer that prevents Mo oxidation and, at the same time, introduces only a small overall reflectivity drop. It would be advantageous if existing FIB tools, without any or with only minor modifications, could be used for amplitude defect repair.
FIGURE 8.21 Illustration of extreme ultraviolet (EUV) mask blank amplitude defect repair, including (a) the original defect; (b) removal of the damaged top layers using a focused ion beam (FIB); and (c) the defect area after repair, covered with a protective layer. The depth-to-width aspect ratio is not drawn to scale: the crater width is typically on the order of microns, whereas the crater depth is at most a few tens of nanometers. (Graphics courtesy of Hau-Riege, S. P. et al.)
FIGURE 8.22 Demonstration of the scooping technique for repairing amplitude defects (scale bar: 100 μm). As can be seen, milling the crater exposes alternating Mo/Si surfaces, and in atmosphere the exposed Mo quickly oxidizes (dark rings), leading to significant reflectivity variation. (Photo courtesy of Hau-Riege, S. P. et al.)
Therefore, different sputter geometries, ranging from near-normal to large-angle sputtering, and a range of Ar ion energies from 500 eV to 2 keV have been tested for amplitude repair. The experimental results have shown that neither the sputtering angle nor the ion energy has a significant impact on the reflectivity variation observed within the repaired area, meaning that, most likely, FIB tools could easily be adapted for amplitude repair. The same tool that is used for milling should also be capable of depositing the protective capping layer and should preferably provide in situ reflectivity characterization of the repaired area. Although FIB is the only amplitude repair technology that has been demonstrated, other amplitude repair technologies might be even more advantageous. This is specifically true for amplitude defect repair by electron beam-induced etching [200]. This technique uses secondary electrons generated by a focused electron beam to induce chemical reactions between the surface material and a process gas that is directed at the repair location. Depending on the gas mixture used, those reactions can produce non-volatile and/or volatile substances. For removing amplitude defects by scooping out part of the multilayer stack, the reaction products must be volatile. As soon as the etch process is completed, the repair area could be covered with a capping layer using the same technique but a different gas mixture.
Amplitude defect repair has a greater impact on mask blank yield if deeper craters can be milled to scoop out defects. The depth of the craters that can be milled is governed by the minimum number of bi-layers that must remain and by the possible damage done to those remaining bi-layers. If more bi-layers are used to make mask blanks, more bi-layers can be removed, because the reflectivity as a function of the number of bi-layers levels off and is nearly constant beyond about 40 bi-layers. If mask blank multilayer stacks were made of 80 bi-layers, nearly 40 bi-layers could be removed without introducing significant reflectivity changes. However, the prolonged exposure of the stack needed to scoop out 40 bi-layers might damage the remaining 40 bi-layers. The actual useful amplitude defect repair window needs to be mapped out, and there are other, probably more important, considerations than the number of bi-layers that can technically be removed while still meeting the reflectivity requirements. The area scooped out of the multilayer to repair an amplitude defect is large and shallow. Circuit features written within the repaired mask area will be at a slightly different z position (with z measuring the distance from the mask focal plane). This will lead to image distortion because the affected features will be at different focal positions. The statement made above for amplitude repair, that the actual repair window has to be mapped out in order to better understand which defects will be repairable, also holds for phase defect repair. In addition, parameters such as the distribution of defects within the multilayer stack are currently not very well known. As has been pointed out, there is no demonstrated repair technique for negative phase defects caused by pits and scratches on the EUV substrate surface. However, considering that several years ago EUV mask blank
repair was considered virtually impossible, great progress has been made, and most of the challenges ahead are engineering challenges. As with many aspects of EUV technology, the basic feasibility of mask blank defect repair has been successfully demonstrated, and what is needed now (and is already in progress) is the transfer of the respective expertise and knowledge, both in actual defect repair and in defect repair simulation, to commercial suppliers. Now that improvements in the mask blank manufacturing process with respect to centroid wavelength and reflectivity uniformity are exceeding roadmap targets for volume manufacturing, the focus in EUV mask blanks will be on reducing defects. It is to be expected that a significant reduction of the defects introduced at the substrate level and during the multilayer deposition process, in combination with efforts to improve and perfect current phase and amplitude repair technologies, will provide the high EUV mask blank yield needed. In addition, there are other methods for making the few remaining defects nonprintable that can be employed during mask patterning. A simple measure consists of arranging the mask patterns so that defects reside beneath the absorber material of the EUV mask. Defects covered by the absorber material will certainly not print.

8.4.4 EUV Mask Patterning
Mask patterning starts with an absorber-coated blank that is nearly defect free in the quality area, as shown in Figure 8.23 [201]. After the photoresist is deposited, the mask blank defect locations and the circuit pattern are compared. For a few defects, an attempt is made to shift the mask pattern to cover the defects. The mask pattern is then written in the resist and transferred into the absorber layer using reactive ion etching (RIE) while maintaining the integrity of the buffer layer. After patterning, the mask is inspected for absorber defects, consisting of excess material or pinholes, using bright field optical microscopy. Defects can be repaired using conventional FIB or electron beam methods to remove the excess absorber material or to deposit metal in pinholes or voids.
FIGURE 8.23 Mask patterning process flow: mask blanks from supplier (with known mask blank defect locations); deposit resist and pre-bake; shift mask pattern to cover one or more defects; e-beam write, etch, and clean; inspect patterned masks; repair masks using FIB or e-beam; final inspect and clean; mount in frame with removable protective cover.
Electron beam repair has potential advantages over FIB repair: it eliminates the FIB Ga stains and potential ion damage to the buffer layer, and it provides higher etch selectivity between the absorber and the buffer layer. If electron beam repair could be successfully demonstrated, it would eliminate the need for a thick repair buffer (for FIB, the buffer has to be thick enough to protect the multilayers from being damaged by Ga ions) and would thereby reduce the absorber stack height. Because the buffer layer can absorb EUV, the last fabrication step consists of removing the buffer layer. Another patterned-mask inspection is performed after the buffer layer removal, and FIB or e-beam repair is used to remove any residual buffer layer material. Although the EUV mask writing process is similar to that used for conventional optical masks and uses the same e-beam writing tools, the pattern definition process is simpler. Because of the large k1 values and a mask error factor of approximately one, the need for OPC is minimized. In addition, any mask biasing required because of the non-telecentric illumination and/or correction for flare can be easily implemented. Because flare is constant across the field of illumination except near the boundaries, and because it is deterministic, a simple compensation for CD pattern distortions can be introduced [202–204]. For future technology extensions, it should be noted that RET experiments have demonstrated the possible use of phase shifting masks (PSM) and dipole illumination to increase the RES of EUVL for small features.

8.4.5 Mask Costs
Mask costs continue to increase as the technology node and printing tolerances decrease. Masks for 45 nm node printing and below are considerably more expensive than those for present technologies used in production. These costs are driven by the requirement for more defect-free substrates, increased mask writing times for the large number of pattern geometries, increased precision for mask inspection, and the finer control required for mask repair. In addition, because of the low k1 values for 248 and 193 nm lithography, OPC adds a further complication caused by the data volume and the longer time required to write and repair the masks. Moreover, PSMs will be required for 193 nm lithography at the 45 nm node, which increases the writing and inspection complexity, and if complementary PSMs are required, the additional mask blank, writing time, and inspection steps further increase the cost. To illustrate the difference in complexity, a small pattern is shown in Figure 8.24 for EUV and 193 nm lithography technologies at the 45 nm node. For the comparison, it should be noted that there is a lot of commonality between making the two types of masks. For example, both will use essentially the same e-beam patterning tools and similar DUV inspection methods. In addition, repair will use similar FIB or e-beam repair methods for removing or depositing extra chrome material. However, the 193 nm masks are more complex, as shown in the figure, because OPC and PSMs will be required.
FIGURE 8.24 Comparison of extreme ultraviolet (EUV) and 193 nm optical proximity correction (OPC) mask complexity.
These complications impact the mask writing and repair times and costs. The increased pattern complexity for 193 nm masks is expected to require close to twice the writing time needed for EUV masks. The inspection cost is expected to be proportional to the pattern complexity, so the inspection cost for 193 nm masks should track the mask writing time. The relative cost of EUV to 193 nm masks is illustrated in the flow diagram shown in Figure 8.25. The major steps in the fabrication sequence are shown on the left portion of the diagram, and the key assumptions are summarized in the center. For example, starting with the mask blanks, the blanks for EUV and 193 nm will have similar requirements: they must be defect free and flat. Although the flatness requirement may be tighter for EUV, defects can be smoothed and removed; there are no similar solutions for optical masks. For the comparison, it has been assumed that EUV mask blanks are twice as expensive as 193 nm optical mask blanks. The mask blank inspection process will use similar EUV inspection methods and tools; therefore, the costs should be comparable for similar defect size inspection specifications. If distributed phase shifters are required for 193 nm masks, the cost of inspection may be higher than for EUV blanks. For the pattern alignment and writing step, the cost associated with mask writing is proportional to the pattern complexity. For OPC patterns, the writing time is likely to be 1.6–2 times the time required for EUV masks. Inspection of the patterned masks will use similar DUV inspection methods, and for repair, FIB or e-beam techniques will be used. The inspection time is expected to be proportional to the pattern complexity. Depending on the complexity of the repair process for PSMs, the repair process may be simpler for EUV; however, an at-wavelength inspection of the repaired area of the EUV mask may be required to verify the repair process. For comparison, it was assumed that the cost ratio for inspection and repair is the same as for the writing step. The relative range of EUV to 193 nm mask cost is estimated by multiplying the low numbers and the high numbers in the per-step range estimates to produce a total cost ratio of 0.5–1.3.
FIGURE 8.25 Extreme ultraviolet (EUV) versus 193 nm cost comparison for the major steps in EUV mask fabrication.

Fabrication Step | EUV vs. 193 nm Assumptions | EUV Cost Relative to 193 nm
Mask blanks from supplier | Flat, defect free, similar complexity and specifications | 2
Inspect, locate defects for pattern alignment | DUV inspection, similar resolution (PSM for 193 nm) | 1
Align and pattern using e-beam | Standard process, less complex patterning, shorter write time (if complementary PSMs are needed for 193 nm, the fraction is smaller) | 0.5–0.8
Inspect and repair using FIB or e-beam | DUV inspection, simpler patterns (defects above ML) | 0.5–0.8
Mount in container | Similar to requirements for 193 nm pellicle mounting | 1
Total | | 0.5–1.3
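The 0.5–1.3 total quoted above is simply the product of the per-step ratios in Figure 8.25; a short check (values copied from the figure):

step_ratios_low  = [2, 1, 0.5, 0.5, 1]   # blank, inspect, write, inspect/repair, mount
step_ratios_high = [2, 1, 0.8, 0.8, 1]

low, high = 1.0, 1.0
for lo, hi in zip(step_ratios_low, step_ratios_high):
    low, high = low * lo, high * hi
print(round(low, 2), round(high, 2))   # 0.5 and ~1.3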
This is a relatively simple way to compare the mask costs; however, a more detailed CoO analysis produces similar values, which depend on the detailed assumptions and the selection of equipment to support the mask production. One present problem is that the LTEM blank cost is about a factor of 4 higher than that of 193 nm blanks, increasing the cost. However, the 2× initial blank cost ratio represents a reasonable target, and if the blanks are less than a factor of two higher, EUVL masks could be significantly less expensive than 193 nm masks. As previously noted, International SEMATECH (ISMT) is supporting the characterization of mask blanks. ISMT has established a finished mask blank cost goal of $5k per blank. Although this is much lower than the present cost, as experience is gained in finishing mask blanks, the cost could approach the $5k level. With the defect avoidance, smoothing, and repair strategy for EUV masks, the estimated 45 nm EUV mask cost ranges from $50k to $130k (193 nm PSM ~$100k and complementary PSM ~$150k). Several CoO analyses have been performed for EUVL and other NGL technologies [186,205]. These analyses have been somewhat conservative because of the unknown costs for new or extended inspection and repair tools. For mask blank prices ranging from $5k to $15k, the finished EUV mask prices range between $45k and $75k, subject to a number of assumptions regarding the yields at various stages of fabrication. Using the methods outlined earlier for defect mitigation and repair of mask blanks, the mask blank price is being targeted at $5k. In addition, the large values of k1 and a mask error factor of one are expected to minimize any need for OPC. These advantages for EUV masks tend to support a target patterned mask cost in the lower part of the price range and substantially less than the predicted cost for optical masks.

8.4.6 EUV Mask Commercialization Status
A number of companies, including Schott-Lithotech, Asahi, Hoya, and Corning, are producing mask substrate materials, and several companies are producing multilayer-coated blanks complete with a capping or buffer layer. As mentioned, ISMT and the state of New York have created the EUV MBDC in Albany, New York. The goals for the MBDC include developing the equipment, processes, and infrastructure to produce EUV mask blanks in partnership with commercial blank and tool suppliers while reducing the cost of mask blanks. A photograph of a full field patterned mask is shown in Figure 8.26. A number of captive and commercial mask fabrication facilities have produced patterned masks. The complete fabrication process has been performed both on 8 in. silicon wafers and on 6 in. square format masks. Captive mask facilities at Intel, IBM, Infineon, and Motorola have produced full field masks. Commercial facilities include the Mask Center of Competency (MCoC), a joint facility managed by IBM and Photronics, and the Reticle Technology Center LLC, a joint project of DPI, Motorola, AMD, and Micron.

8.4.7 EUV Mask Production Needs and Mask Cost
Although good progress has been made toward improving the quality of the LTEM mask blanks, the present blank flatness is in the 200–300 nm rms range. Continued polishing improvements will be needed to reach the 50 nm rms flatness level. In addition, the MSFR finish needs to be below 0.15 nm rms, and a continuing effort focused on reducing the defects in the multilayer deposition process is needed.
Continued improvement in mask writing tools is required as the feature size on the mask decreases. Deep ultraviolet inspection methods have been demonstrated, and an EUVL mask inspection microscope has been proposed for defect inspection and classification.
FIGURE 8.26 Full field extreme ultraviolet lithography (EUVL) test reticle on a 6 in. square commercial mask blank.
To facilitate use, consideration should be given to integrating the inspection and defect repair tools.
8.5 EUV Resists
The attenuation of EUV radiation in organic materials requires that EUV resists use a thin-layer imaging (TLI) process [206–209]. Two main approaches have been investigated as potential EUV resist solutions: single-layer ultra-thin resists (UTR) over hardmasks and bi-layer resists (BLR) [210,211]. The UTR scheme has been the main focus because single-layer resist processing is the most familiar to the semiconductor industry. The five key challenges for EUVL resists are high photo-speed, small line edge roughness (LER), high etch resistance, low outgassing, and low defect levels [11].
The UTR approach is the simplest EUV resist process. Experiments performed using the ETS and 10× systems indicate that the RES and sensitivity requirements for EUV lithography can likely be met by using extended conventional-style single-layer resists originally formulated for the DUV 248 nm optical wavelength [210]. For example, when these resists are spin-applied to a thickness of 100 nm, both 70 and 100 nm 1:1 lines have been patterned at a dose of ~6 mJ/cm² with corresponding 3-σ LER values of 6.7 and 5.1 nm, respectively. It should be noted that, in the case of thin resists, pinholes were not the problem that was initially feared. Although most experiments have used a ~100 nm thick organic imaging layer on an etch-resistant hardmask, absorption and imaging calculations and experiments have shown that high quality EUV imaging is possible for film thicknesses of up to 160 nm with greater than 50% transmission of the radiation through the film [212,213]. Experiments using commercially available DUV photoresists have also demonstrated that well-optimized, standard processes are capable of depositing UTR films without increases in defect levels above those seen for more conventional resist thicknesses [206]. Ultra-thin resist film experiments have been performed for pattern transfer into SiO2 and SiON hardmasks as well as for pattern transfer into poly-silicon without a hardmask [214,215].
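The film thickness limit quoted above follows from simple attenuation. Taking the stated 50% transmission through 160 nm of resist as the limiting case implies a 1/e attenuation length of roughly 230 nm; a minimal sketch (the attenuation length is back-calculated from that statement, not a measured material constant):

import math

limit_thickness_nm = 160.0
limit_transmission = 0.5
attenuation_length_nm = -limit_thickness_nm / math.log(limit_transmission)  # ~231 nm

def resist_transmission(thickness_nm):
    # Beer-Lambert transmission of EUV through the organic resist film.
    return math.exp(-thickness_nm / attenuation_length_nm)

for t in (50, 100, 160):
    print(t, "nm ->", round(resist_transmission(t), 2))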
FIGURE 8.27 Best achieved resolution (45 nm and 39 nm 3:1 features) with the projection optics box (POB 2) optics in static imaging in the static exposure system (SES) tool at Berkeley. (Data courtesy of Naulleau, P. P. et al.)
For the SiO2 and SiON hardmask experiments, patterns etched into the substrates with standard recipes resulted in sidewall angles greater than 85°, with approximately half of the resist remaining on top after patterning (the unexposed film thickness loss for this resist is approximately 10 nm). Outgassing measurements at EUV wavelengths of materials used for 248 nm lithography have shown low quantities of volatiles under 13.4 nm radiation. These results support the concept of designing resists with low outgassing qualities. Although BLR has not received as much attention as single-layer materials, several Si-BLR materials have been successfully synthesized and imaged with EUV radiation [11]. The imaging performance of these materials has been nearly as good as that of the UTR materials, with the most recent Si-BLR materials being capable of 90 nm RES at about 7 mJ/cm² with 10.4 nm LER (3-σ) [11]. These imaging materials have been simultaneously engineered to provide sufficient adhesion at the interface between the imaging layer and specific underlayer materials.
There are indications that currently available EUV resists may not resolve features below 40 nm, i.e., resist RES may be limited by an intrinsic point spread function of 40–50 nm FWHM [216]. Figure 8.27 shows the highest RES EUV images obtained with the POB 2 optics mounted in the static exposure system (SES) at the advanced light source (ALS) in Berkeley [217]. The images were recorded by combining the variable sigma capability of the SES tool, i.e., off-axis illumination such as dipole or quadrupole illumination, with controlled overdosing of the exposures. Extensive process window studies using the POB 2 optics in the SES tool also indicate that the LER process window may be more critical than the CD process window [218]. A better understanding of the contributions to LER from the mask, and of how mask LER is transferred to resist LER, is needed [219].

8.5.1 Commercialization Status
The fastest positive resist tested to date has demonstrated 100 nm RES and 7.2 nm LER (3-σ) at 2.3 mJ/cm² sensitivity, and the fastest negative resist tested to date has demonstrated 100 nm RES and 7.6 nm LER (3-σ) at 3.2 mJ/cm². A summary of the resist data showing sensitivity as a function of LER is shown in Figure 8.28. As noted, there are at least four commercial suppliers developing and testing EUV resists.

8.5.2 Production Needs
For production, resists that meet all five requirements are needed. Continued work is required to formulate sensitive resists with low LER while maintaining the present resists' low outgassing qualities and good etch resistance.
FIGURE 8.28 Positive and negative resist printing results: sensitivity (mJ/cm2) versus line edge roughness (LER, nm) for resists from four suppliers and published Shipley data, with the production goal indicated. Best positive resist (100 nm l/s): dose 2.3 mJ/cm2, LER 7.2 nm; best negative resist (100 nm l/s): dose 3.2 mJ/cm2, LER 7.6 nm.
requirement of ±10% of the CD budget would indicate that the LER must be less than 3 nm, and for 22 nm, a LER of less than 2.2 nm will be required. In addition, methods for controlling the defect levels within the resist and during deposition and post-processing are required. The sensitivity and LER target values for production resists are indicated by the small inset in Figure 8.28. Whereas shot noise is not likely to be a problem for the 45 nm node, continued evaluation of the resist molecular structure and degree of chemical amplification must be considered during the development and improvement of more sensitive resists [220–226]. The remaining issues consist of simultaneously achieving resist RES, high photospeed, and low LER for both positive and negative resists. Although continued work in extending present DUV resist formulations is required, a new formulation may be needed. As noted in the section on system tradeoffs (see Section 8.7), developing more sensitive resists with the required characteristics alleviates the requirement for higher source power and so provides an important system and cost benefit. The remaining challenges are summarized in Table 8.7.
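The LER budget arithmetic and the shot-noise remark above can be made concrete with a rough back-of-the-envelope sketch. In the Python snippet below, the ~92 eV photon energy corresponds to 13.5 nm EUV radiation, but the edge-pixel size and the example doses are illustrative assumptions, not values taken from this chapter.

```python
# Rough sketch: LER budget and EUV photon shot noise (illustrative assumptions).
PHOTON_ENERGY_J = 92 * 1.602e-19           # ~92 eV per 13.5 nm EUV photon

def ler_budget_nm(cd_nm, fraction=0.10):
    """LER allowance if LER may consume ~10% of the CD budget."""
    return fraction * cd_nm

def photons_per_pixel(dose_mj_cm2, pixel_nm):
    """Photons landing in a pixel_nm x pixel_nm area at the given dose."""
    dose_j_m2 = dose_mj_cm2 * 1e-3 / 1e-4  # mJ/cm^2 -> J/m^2
    area_m2 = (pixel_nm * 1e-9) ** 2
    return dose_j_m2 * area_m2 / PHOTON_ENERGY_J

for cd in (32, 22):
    print(f"{cd} nm node: LER budget ~{ler_budget_nm(cd):.1f} nm")

for dose in (5.0, 2.0):                    # mJ/cm^2, illustrative doses
    n = photons_per_pixel(dose, pixel_nm=16)   # assumed 16 nm edge pixel
    print(f"{dose} mJ/cm2: ~{n:.0f} photons/pixel, "
          f"shot noise ~{100.0 / n ** 0.5:.1f}% (1-sigma)")
```

The point of the sketch is only that lowering the dose toward 2 mJ/cm2 roughly doubles the relative photon-count fluctuation per pixel, which is why shot noise must be tracked as resist sensitivity is pushed.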
TABLE 8.7 Resist Line Edge Roughness (LER), Sensitivity, and Resolution Status and 2007 Requirements

Parameter      Present (Best)         Required by 2007/32 nm Node
LER            ~3 nm @ 10 mJ/cm2      2–3 nm
Sensitivity    5 mJ/cm2               2–3 mJ/cm2
Resolution     ~40 nm @ 5 mJ/cm2      <20 nm

8.6 EUV Exposure Tool Development

Extreme ultraviolet lithography exposure tool development can be divided into four classes: early 10× METs; 0.3 NA METs; a full field development tool; and initial alpha and beta tool designs by commercial companies. The 0.3 NA METs and the experimental full field scanning tool, the ETS, have addressed essentially all aspects of EUVL technology development and have reduced the commercialization risks to essentially zero.
TABLE 8.8 World-Wide Micro Exposure Tool (MET) Tool Availability

System              Location                                NA/Demagnification   Field Size(a)   Lens Quality   Reticle/Wafer Size
10× Microsteppers   Livermore, California                   0.1/10×              0.5 × 0.5       λ/20           1 in.² wafer section
BESSY MET           Berlin, Germany                         0.3/5×               200 × 600       λ/10           8 in. wafer
BEL                 Grenoble, France                        0.3/10×              100 × 200       >λ/10          8 in. wafer
ASET MET            Atsugi, Japan                           0.3/5×               300 × 500       λ/7            6 in. reticle, 8 in. wafer
MET(b)              Berkeley, California                    0.3/5×               200 × 600       λ/20           4 in. wafer, 6 in. reticle
Exitech MET         Intel, International SEMATECH (ISMT)    0.3/5×               200 × 600       <λ/20          8 in. wafer, 6 in. reticle

(a) All field sizes in μm except for the 10× microsteppers in Livermore, which are in mm.
(b) Use of the MET tool in Berkeley is split between ISMT and the EUVLLC.
Although continued development is required in all phases of the technology as commercialization occurs, the risk of new surprises as the technology matures has been eliminated. This can be contrasted with the development of other NGL technologies: in 157 nm lithography, the production of CaF2 became a serious issue because of birefringence; in proximity x-ray lithography, the use of 1× masks became a barrier; and in projection electron beam lithography, throughput and space charge became limiting factors.

The exposure tools fall into two classes: static METs used for resist development and experimental studies, and full field scanning tools used for mask and process development. The world-wide MET availability is presented in Table 8.8, where the characteristics of the various systems are summarized. In addition to those listed, there are 10× interferometric exposure systems installed at the University of California at Berkeley and at the University of Wisconsin. Exitech is manufacturing a commercial 0.3 NA system; the first was installed at Intel in Portland, Oregon, and the second is scheduled for installation at the ISMT Resist Test Center (RTC) in Albany, New York, in the spring of 2004. Based on the experimental results from the existing systems and the expanding number of METs, it can be concluded that a good selection of METs is available for resist and mask development.

The ETS and the MET address two extremes of the technology. The ETS is a 0.1 NA system that includes all elements of a full field scanning system, and the MET is a large 0.3 NA static exposure system that focuses on printing small geometries and the attendant resist RES, flare, etc. Although the first 0.3 NA MET tools are already assembled, no printing results are available yet. The following sections give a description of the ETS alpha tool and a summary of its lithographic capabilities [11,154,216,227–229].

8.6.1 The ETS Alpha Tool

In developing the ETS, the original goals were to demonstrate a complete full field EUV lithography system by developing and integrating all of the major subsystems and components; to demonstrate and evaluate full field printing; to obtain system learning experience through tool operation and upgrades; and to transfer this learning to stepper companies to assist in the development of alpha, beta, and production tools. Following a brief
description of the system, the ETS lithographic setup and qualification will be described, and a short summary of the ETS system learning experience, gained through upgrading and operating the system, will be given.

8.6.1.1 System Overview

The intent in developing the ETS was to demonstrate the integration of subsystems and components into an alpha class tool with the flexibility of supporting data acquisition and specific system, subsystem, and component learning. The tool was designed to emphasize flux through the system, not wafer throughput, and to have a simplified wafer handling and software interface. A drawing and a photograph of the system are shown in Figure 8.29. There are two major modules, the source/illuminator chamber and the main optics or exposure chamber, which are supported by two vacuum systems. In the illuminator, the EUV radiation generated by the LPP source is collected using a six-channel condenser and shaped to properly illuminate the arc-shaped field of view of the projection system. The Xe gas is recompressed, purified before liquefaction, and recirculated. A spectral purity filter and seal minimize the Xe gas leakage into the main optics vacuum chamber during source operation, and they remove out-of-band radiation from the illuminating beam. The final ETS LPP source configuration was a Xe spray-jet target with a 1500 W Nd:YAG laser system developed by TRW, supporting power scalability by using one to three laser chains [145,154]. Each of the three TRW modules operates at 1667 Hz and produces a 300 mJ pulse having a width of 10 ns. The three chains of the TRW laser can be operated in a synchronous mode to deliver a 900 mJ pulse or in an interlaced mode to increase the pulse repetition rate to 5000 Hz.

The main optics vacuum chamber has three separate vacuum environments that can be individually addressed: the reticle stage with the metrology tray on top, the housing for the projection optics box, and the wafer stage environment. The exposure chamber contains the final grazing incidence mirror (the C4 element in Figure 8.29) that further shapes the illumination from the illuminator chamber and focuses the flux onto the 6-in. reticle, which is transported on a magnetically levitated stage that provides long travel in the scan direction. The reticle pattern is projected onto the wafer using a 4-mirror optical system having a NA of 0.1 and producing a 4× reduction image.
FIGURE 8.29 Drawing of the optics path in the engineering test stand (ETS), identifying major system components (reticle stage, C4 element, projection optics, wafer stage, drive laser beam, C1 collector, gas jet assembly, laser-produced plasma, C2/C3 pupil optics, and spectral purity filter), and photograph of the ETS. (Graphic and picture courtesy of VNL/EUVLLC.)
The wafer stage is a 1-D long-travel magnetically levitated stage, similar to the reticle stage, but combined with a mechanical stage for long travel in the cross-scan direction as required to cover sites on 8-in. wafers. Stage system performance for 70 nm imaging with scanning speeds of 10 mm/s (40 mm/s for the reticle stage) was achieved, with wafer stage jitter in the X-, Y-, and Z-directions across a 5 × 3 die area (24 mm × 32.5 mm die size) well within specifications [227]. The system specifications for the 0.1 NA system include ±0.5 μm DOF at an operating wavelength of 13.4 nm; a scanned field size of 24 × 32.5 mm; and support for a CD of 100 nm with k1 = 0.75 on 200 mm wafers.

In Figure 8.30, the various EUV sensors and their locations within the ETS are identified, and a picture of the C1 condenser mirror assembly indicating the positions of the C1 photodiodes is shown.

† The C1 sensors: four in-band photodiodes that are integrated into the C1 condenser assembly. In order to increase C1 sensor lifetime, the C1 sensors do not directly face the laser-produced plasma; instead, they receive their light via small EUV multilayer mirrors that direct light from the plasma source onto the photodiode surface.
† The reticle illumination monitor (RIM) sensor: a linear array of 80 in-band photodiodes integrated into the reticle stage and grouped into four subsets of 20 diodes covering the exposure slit in the cross-scan direction. The illumination pattern is mapped by using the reticle stage to scan the array across the EUV ring field.
† The C4 photoemission sensor: a photoemission sensor to monitor the signal from the final condenser element.
† The wafer dose sensor (WDS): an in-band photodiode integrated into the wafer stage that measures EUV dose at the wafer level. It is located in one corner of the wafer platen and positioned behind a 25 μm pinhole to provide a well-defined detection area. In addition to measuring the spatial distribution of the EUV radiation as the stage is scanned, the sensor provided absolute pulse energy measurements at the wafer level.
† An aerial image monitor (AIM) sensor at the wafer plane: high spatial RES is achieved by placing a patterned structure consisting of 100 nm wide slits etched into a nickel absorber on an EUV-transparent silicon nitride membrane.
FIGURE 8.30 Schematic of extreme ultraviolet (EUV) sensors and alignment devices in the engineering test stand (ETS) (including the in-band C1 photodiodes, reticle illumination monitor, reticle flux sensor, M1–M4 and C2/C3/C4 photoemission sensors, thru-lens imager illuminator and receiver, wafer dose sensor, aerial image monitor, and EUV CCDs), and photograph of the C1 condenser mirror assembly, showing the six individual petals and the Xe jet flow direction as well as indicating the positions of the four C1 photodiodes. (Graphic and picture courtesy of VNL/EUVLLC.)
During operation, the AIM detector scans across a periodic pattern in the aerial image that has the same pitch (10 μm) as the patterned structure in front of the detector assembly. The AIM sensor can be used for a variety of measurements, including finding focus and monitoring focus stability, measuring drift or jitter between the EUV image and the wafer stage, determining scan magnification and skew, and studying image distortion. The C2, C3, and M1–M4 photoemission sensors, although present in the ETS system, were not used. Specifically, for the mirrors M1–M4, photoemission signals should provide a good monitor for surface contamination (which was not an issue for the ETS because of the clean environment and the relatively low EUV power levels). In addition to these sensors, a number of temperature sensors, vibration monitors, vacuum gauges, residual gas monitors, photodiodes to monitor source output, and CCD cameras to monitor source size and position are included in the ETS implementation [227].

8.6.1.2 ETS Lithographic Setup and Qualification

In late 2002, the development set of optics, POB 1, was replaced with an upgraded set of optics, POB 2. The POB structure was identical to POB 1; however, each of the optics had a higher optical quality than in the earlier set (compare Section 8.2). Prior to installing POB 2 in the ETS, it was installed in the SES tool at LBNL with the objectives of verifying the at-wavelength alignment and imaging accuracy at various field points in the exposure field (as had been done previously with POB 1) and of first printing using the unique pupil shaping capabilities developed at LBNL [217,230]. The ring field illumination generated by the six condenser petals shown in Figure 8.30 creates a somewhat unusual, but well understood, pupil fill. Figure 8.31 shows a comparison of 70 and 80 nm images taken with POB 2 in the ETS, SES images taken with POB 2 at the ALS simulating the six-channel pupil fill of the ETS, and SES images produced using a disk-shaped pupil fill. As expected, the horizontal–vertical (H–V) bias can be seen in both six-channel illuminations, but not with the disk-shaped pupil fill, and is more evident for isolated than for dense lines.

During the initial lithographic setup, static imaging was first done to characterize static imaging performance, to adjust focus and stage tip/tilt, and to correct for image aberrations such as astigmatism. Focus was found to be within 7 μm of the expected value, and dose non-uniformity for the sites shown in Figure 8.32 was measured to be within ±2% across the smile field. Stage tip and tilt corrections were 20 μm and −110 μrad, and about 0.75 nm of astigmatism was found and corrected (discussed in the following section). Figure 8.32 shows a set of 90 nm elbow images taken at different field points across the exposure field after completion of static lithographic setup (the center field corresponds to site 3B). By comparing lithographic data with image simulations using alignment data and information available from the interferometric characterization of the POB 2 box, astigmatism was minimized at sites 1B, 2B, 3B, 4B, and 5B. Figure 8.33 shows the actual changes in astigmatism at 1.25 μm defocus as a function of M4-mirror tilt at the center field position (site 3B in Figure 8.32). As indicated by the arrow in Figure 8.33, the best setting for M4-mirror tilt was found to be between −3 and −6 μrad and, therefore, was set to −4.5 μrad.
Figure 8.34 shows an image comparison for 100 nm elbow features at the center field point (site 3B in Figure 8.32) before the astigmatism correction and after tilting the M4 mirror by −4.5 μrad. As demonstrated, astigmatism is reduced and more balanced than before the correction, where horizontals are blurred in the left-most image and verticals are blurred in the right-most image (so the isolated lines can be seen vanishing at both extreme out-of-focus positions).
FIGURE 8.31 Comparison of 70 and 80 nm images from the engineering test stand (ETS) (a) with images from the static exposure system (SES) simulating the six-channel pupil fill of the ETS (b) and using a disk illumination (c); the ETS six-channel fill shown is a HeNe illumination image of the actual pupil fill in the ETS. (Data courtesy of VNL/EUVLLC and Naulleau, P. P. et al.)
Across the exposure smile, image quality at sites 2B, 3B, and 4B was found to be very good, whereas sites 1B and 5C were found to be underdosed by ~7.5% and ~10%, respectively. As expected from the POB 2 wavefront data, astigmatism changed sign between sites 3A and 3C, with the best field point occurring at site 3B (having the best field point in the center of the field was the goal of the POB 2 alignment at LLNL). ETS image resolution after completion of the static lithographic setup of POB 2 is illustrated in Figure 8.35, which compares an 80 nm image recorded with POB 1 in February 2002 with a similar image recorded with POB 2 in November 2002. As can be seen, the image quality of the POB 2 image is significantly better than that of the image recorded earlier with POB 1 (although different masks were used for recording the images, the large difference in image quality is not related to mask improvements).

Following the completion of static tool setup, scanned operation of the ETS with POB 2 was started. An apparent skew of 350 μrad between the directions of the mask and wafer stages was found and corrected. The magnification error was measured and corrected in the scanning direction: based on visible-wavelength interferometry, a −680 ppm magnification correction was expected, whereas a larger value of 765 ppm was found and corrected.
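To put the quoted ppm-level scan-magnification corrections in perspective, the short sketch below converts a magnification error into the pattern displacement it would cause at the edge of the 24 mm × 32.5 mm scanned field; this is elementary geometry using only the numbers quoted above, not additional ETS data.

```python
# Displacement at the field edge caused by an uncorrected scan-magnification error.
def edge_displacement_um(mag_error_ppm, half_field_mm):
    """Displacement (um) at half_field_mm from field center for a ppm-level error."""
    return mag_error_ppm * 1e-6 * half_field_mm * 1e3   # mm -> um

half_field_mm = 32.5 / 2                 # half of the 32.5 mm scan direction
for ppm in (680, 765):                   # expected and measured corrections
    print(f"{ppm} ppm -> ~{edge_displacement_um(ppm, half_field_mm):.1f} um at the field edge")
```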
FIGURE 8.32 90 nm in-focus elbows recorded at field points across the exposure field (sites 1–5, rows A–C) after completion of lithographic setup. (Data courtesy of O'Connel, D. J. et al. and VNL/EUVLLC.)
FIGURE 8.33 Astigmatism as a function of M4-mirror tilt (−6, −3, and 0 μrad shown). The arrow indicates the selection of the best setting for mirror tilt between −3 and −6 μrad, i.e., −4.5 μrad. The images were recorded at 1.25 μm defocus. (Data courtesy of VNL/EUVLLC.)
FIGURE 8.34 Comparison of astigmatism for 100 nm elbow features (center field point), through focus from −1.5 μm to +1.5 μm, before and after applying the M4 −4.5 μrad tilt correction. (Data courtesy of O'Connel, D. J. et al. and VNL/EUVLLC.)
FIGURE 8.35 Comparison of projection optics box (POB) 1 (February 2, 2002) and POB 2 (November 22, 2002) static image resolution for 80 nm elbow features. Both images were recorded with the same drive laser power (500 W) but with different masks. (Data courtesy of O'Connel, D. J. et al. and VNL/EUVLLC.)
Figure 8.36 illustrates the imaging performance in scanned mode for 80 nm elbow features across the scanned field. Dose non-uniformity for static exposures was found to be less than a ±2% variation across the field. For scanned exposures, a dose non-uniformity of ~10% across the field was observed, which was found to be caused by a variation in apparent slit width. By adjusting the slit width (i.e., narrowing the slit and reducing the effective field size), dose non-uniformity across the field could be improved to about the same level as observed in static printing. Figure 8.37 shows the through-focus behavior for 45° 90 nm elbow features at best dose and the through-dose behavior for 90 nm elbows at best focus, recorded at the center field site 3B indicated in Figure 8.32. Figure 8.38 shows a comparison of static and scanned 90 nm elbow features, including very similar measured LER values. In Figure 8.39, more high resolution images of dense line features with a relaxed pitch of 2:1 are shown, along with some contact hole images. The POB 2 optic was designed for 100 nm resolution, but it resolves feature sizes well below 100 nm.
FIGURE 8.36 Scanned images of 80 nm elbows across the 24 mm × 5 mm scanned field, recorded after completion of engineering test stand (ETS) scan characterization. (Data courtesy of O'Connel, D. J. et al. and VNL/EUVLLC.)
FIGURE 8.37 Through-focus (−1.0 μm to +1.0 μm) and through-dose (−20% to +20% of nominal dose) scanned 90 nm elbow images recorded at the center field point, site 3B in Figure 8.32 (500 W drive laser power). (Data courtesy of O'Connel, D. J. et al. and VNL/EUVLLC.)
When judging the POB 2 resolution from images such as those shown in Figure 8.39, it has to be understood that it is far from clear where, in going from 70 to 50 nm features, the resolution of the optics starts to be the limiting factor or where, for example, the inherent resolution limit of the resist starts to play an increasing role. In addition, other factors such as the unknown source size (or changes of source size) and the effective sigma of the illumination (different for horizontal and vertical) have to be considered.

Flare for POB 2 (and, previously, for POB 1) was measured using a resist-clearing method (often referred to as the Kirk test). For both systems, the results are shown in the left graph of Figure 8.40 as a function of position across the arc field. As can be seen, for both systems the magnitude of flare agrees very well with values calculated from mirror roughness, with POB 2 demonstrating much lower flare than POB 1, and with constant flare across the field. As was the case for POB 1, a difference is observed in the POB 2 flare values for features oriented in the horizontal and vertical directions; however, the difference is more apparent for smaller feature sizes than for larger feature sizes. The results of the orientation-dependent flare for the central field point are shown in the right graph of Figure 8.40. Interestingly, for POB 1 the vertical and horizontal flare values differ for large features, whereas for POB 2 they differ for small features. There seems to be no apparent reason for this, and a slight bias in either the vertical or the horizontal flare measurements for POB 1 could easily change this ordering so that the overall
FIGURE 8.38 Comparison of static (LER = 6.6 nm) and scanned (LER = 6.4 nm) 90 nm elbow features, including line edge roughness (LER). (Data courtesy of O'Connel, D. J. et al. and VNL/EUVLLC.)
FIGURE 8.39 High resolution images of relaxed pitch lines and spaces and contact hole features: 50, 60, 70, 100, and 110 nm (2:1), 100 nm (1:1), and 70 nm (underdosed). (Data courtesy of VNL/EUVLLC.)
behavior (but not the magnitude) of vertical and horizontal flare as a function of feature size would look just like what is observed for POB 2. As for the fact that flare for vertical and horizontal lines differs at all, it was later found that this might be due to an anisotropy in mirror polishing, leading to higher scatter in one direction compared to the orthogonal direction. Simulations of 1-D PSD distributions extracted from the two-dimensional (at-wavelength) wavefront data showed a clear difference between horizontal and vertical PSDs [229]. It is not clear whether this observed flare anisotropy is due to only a single mirror or whether it is a net effect of contributions from all four camera optic mirrors. After lithographic setup and qualification were finished, IC manufacturers used the ETS for proprietary exposures. Using the ETS, IC manufacturers ran experiments for mask and resist characterization and development as well as process window analysis [231].
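The resist-clearing (Kirk) flare measurement mentioned above reduces to a simple ratio: the open-frame dose-to-clear divided by the dose at which resist inside the image of an opaque pad clears from scattered light alone. The sketch below shows that ratio with invented dose numbers purely for illustration; it is not the analysis applied to the POB data.

```python
# Minimal sketch of the Kirk (resist-clearing) flare estimate.
# flare ~ E0 / E_pad: E0 is the open-field dose-to-clear, E_pad the dose at
# which resist inside the image of an opaque pad clears from scattered light.
def kirk_flare_percent(e0_clear, e_pad_clear):
    return 100.0 * e0_clear / e_pad_clear

e0 = 5.0                                 # mJ/cm2, invented open-field dose-to-clear
for pad_um, e_pad in [(1, 16.0), (2, 20.0), (8, 33.0)]:   # invented pad clearing doses
    print(f"{pad_um} um pad: flare ~ {kirk_flare_percent(e0, e_pad):.0f}%")
```

Because larger pads require higher doses before scattered light clears their centers, the computed flare falls with pad size, which is the feature-size dependence plotted in Figure 8.40(b).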
FIGURE 8.40 Flare for 2 μm horizontal lines measured across the arc field at seven positions (a), and flare measured at the central field point for feature sizes from 1 to 8 μm in horizontal and vertical orientations for POB 1 and POB 2 (b). (Data courtesy of Lee, S. H. et al.)
FIGURE 8.41 Full field engineering test stand (ETS) exposed wafer from December 2002 with 15 fields and a 24 mm × 32 mm scanned field size. (Data courtesy of O'Connel, D. J. et al. and VNL/EUVLLC.)
During this time, the ETS produced full field exposures on 8-in. wafers with 15 dies per wafer on a routine basis, as shown in Figure 8.41.

8.6.1.3 ETS System Learning

All unique aspects of EUVL have been successfully demonstrated with the ETS through system and module integration, including all components, metrology, environmental control, and a variety of masks. By design, the ETS system did not include overlay capability. The system has been in operation since late in the first quarter of 2001, and over two years of experience has been obtained using the tool under a variety of operating conditions. The system experience included many planned activities and unplanned events. It included planned system upgrades such as replacing the system projection optics and the illuminator optics, changing the methods for exchanging reticles, experiments with electrostatic chucks, upgrading sensors, improving the system software, and planned system maintenance. Unplanned activities included operational mistakes that led to accidental contamination, thermal excursions, and component outgassing. Several component failures were experienced, especially of sensors; these represent real-time operational and practical learning experiences. The total system experience has minimized the system and lithographic printing risks.

From a system perspective, the extensive modeling and simulation results, as compared with experimental results, have demonstrated an excellent understanding of all issues and have increased the confidence level for designing new EUV systems. The imaging studies continue to validate the optical lithography similarities and the detailed understanding of the printing process and of all of the system attributes that affect the quality of the printed results. The experiments studying flare and process window compare favorably with the modeling results and with the flare mitigation and compensation methods that have been developed. The data obtained through the extended experimental studies and the detailed modeling support the conclusions listed. To emphasize, the validated models support the detailed system understanding and the use of CAD tools for production system design. The integration approach has supported the subsystem interface choices and the associated environmental control in each of the different environmental zones. The lithographic studies have validated the flare prediction and mitigation methods. Lastly, the extensive experimentation, the wide variety of personnel expertise directed toward uncovering and solving problems, and the focus by six IC companies have provided a high level of confidence that all technology and operational surprises have
been identified [231,232]. No lurking issues of the kind that affected some of the other NGL technologies (e.g., CaF2 birefringence, membrane masks, stitching, magnification control) have been identified. There are, however, some system challenges remaining. The increased source power requirement is an issue, and appropriate system trade-offs need to be identified [232]. The reticle particle protection issue needs to be resolved. In addition, environmental engineering needs to be applied to the system to extend the mitigation of contamination and possibly erosion. Finally, even though all of the prerequisite work has been done, the tool makers must solidify their beta and production tool manufacturing schedules.
8.7 Source Power Requirements and System Tradeoffs

Most of the emphasis in source development has been on maximizing the power output to achieve 115 W at the IF. This has placed most of the challenge on the source manufacturers and strained the illuminator capabilities. In examining the 115 W target power level, there has been a tendency for the tool suppliers to keep this level high because of the uncertainties associated with power conditioning in the condenser, contamination mitigation, mirror reflectivity, component aging, and manufacturing tolerances. It has usually been assumed that the 100 wph throughput level must be maintained until major tool maintenance is scheduled. As evidenced during many of the source supplier discussions, material limits are being reached as the power requirements continue to increase. Practically, system and source power tradeoffs will be required to obtain a viable production tool. To illustrate some of the tradeoffs, a crude system throughput model has been developed using the system schematic shown in Figure 8.42.
FIGURE 8.42 Optical throughput schematic (source, illuminator, intermediate focal point, reticle, projection optics box, and wafer) and throughput improvement opportunities for each subsystem.
The optical portion of the tool can be divided into five subsystems: source, illuminator, reticle, projection optics, and wafer. Each of these subsystems can be further subdivided into elements that affect the system throughput. Furthermore, because there are many uncertainties in the exact values of the model parameters, contingencies are assumed to account for the uncertainties and for component aging or degradation with use. Each of the subsystems is described below with the associated parameters included in the throughput model.

1. Source
For this model, the source includes the plasma, the collection optics, the debris mitigation system, and a spectral purity filter if required. Although not included in this model, the total CoO model must contain the relative efficiency of converting wall-plug power into power contained in the plasma, as noted in the previous source discussion. The throughput model includes the following:
† EUV flux collected in a volume as a percentage of 2π sr
† Reflectivity of the central collector (may not be required for the discharge source)
† Transmission of the debris mitigation system
† Transmission of the spectral purity filter
For the parameters of the nominal or present throughput model, it has been assumed that flux is collected from 80% of the collection volume, that a single multilayer collector mirror with a 50% reflectivity is used, that the spectral purity filter has a transmission of 50%, and that the debris attenuation system (gas environment or debris foil) causes a 20% reduction in the amount of flux reaching the IF. All of these parameters are multiplied together to obtain the net source-to-IF conversion efficiency. For the LPP source, if a spectral purity filter is not needed, the conversion efficiency is 32%. For the discharge source, the collector mirror may not be needed but the spectral purity filter may be, and the efficiency can be calculated to be the same for both sources.

2. Illuminator or Condenser
The illuminator contains the optical elements for conditioning the illumination flux, providing the proper pupil fill, and illuminating the reticle with a uniform flux. The system usually contains several multilayer and grazing incidence mirrors. Parameters in the throughput model include
† Reflectivity of the multilayer mirrors
† Number of multilayer mirrors
† Number and reflectivity of grazing incidence mirrors
† Degree of matching of the multilayer mirrors
† Contingency and system aging percentage
For the model parameters used, it has been assumed that four multilayer and three grazing incidence mirrors will be required. For a multilayer reflectivity of 65%, grazing incidence reflectivity of 85%, and a contingency loss of 23%, the illuminator efficiency is 8.4%. The contingency loss can contain a number of factors such as multilayer mismatch, reflectance loss with aging, and environmental losses.
3. Reticle
The reticle region contains the mask and any contamination protection required to keep particles away from the mask. Throughput factors include
† Reticle reflectivity
† Flux attenuation caused by the contamination protection and reticle degradation with time
For the model parameters, a reticle reflectivity of 65% was used, and it was assumed that a thermophoresis reticle protection method was used that resulted in no flux attenuation.

4. Projection Optics
For 0.25–0.3 NA projection systems, six multilayer mirrors are required. Factors impacting the throughput include
† Individual mirror reflectivity
† Multilayer mismatching
† Degradation of the optical efficiency caused by oxidation or carbon deposition on the mirrors
For the projection optics, a uniform reflectivity of 68% was assumed, and a contingency loss of 20% was assumed to account for any losses caused by multilayer mismatching, optics aging, or environmental effects.

5. Wafer
The required EUV flux at the wafer depends on the stage scanning speed, the resist sensitivity, and the overhead or idle time when exposures cannot occur. For this analysis, it has been assumed that the total stage time for a wafer exposure is 36 s, with 9 s required for the actual die exposures and 27 s allocated to wafer overhead. For the initial throughput calculations, a resist sensitivity of 5 mJ/cm2 was used.

A throughput model based on the parameters listed above is summarized in Table 8.9. In this model, a further reduction in efficiency of 37% was assumed as a contingency for system aging and other parametric changes. Although a number of different assumptions can be made regarding each of the parameters, nominal values have been assumed for the initial model that are consistent with the overall model results presented by lithography tool suppliers [81]. For these assumptions, the total power at the IF is 115 W, with an LPP source power requirement of 360 W. For a discharge source, the power requirement ranges between 1× and 2× the LPP power.

To illustrate the reduction in required source power for system improvements, the following assumptions were made:

Source
† The source collection efficiency was improved by 5%, to 85%
† The multilayer reflectivity was improved by 5%, to 55%
† The spectral purity filter transmission was improved by 5% (using a grating), to a transmission of 55%
TABLE 8.9 Throughput Model Noting Improvement Effects

System Characteristics (100 wph)                    Present        Improved
Source to IF efficiency                             16%            20.6%
  Collection (% of 2π sr)                           80%            86%
  Reflectivity of collector mirror                  50%            55%
  Spectral filter transmission                      50%            55%
  Transmission of debris system                     80%            80%
Illuminator efficiency                              8.4%           10%
  Reflectivity (4 ML, 3 GI @ 85%)                   65% (0.11)     67% (0.12)
  Contingency loss                                  23%            20%
Reticle reflectivity                                65%            67%
Projection optics efficiency                        8.0%           9.2%
  Reflectivity (6 ML)                               68% (0.11)     69% (0.11)
  Contingency loss                                  20%            15%
System aging degradation                            37%            30%
Wafer power (90 die per wafer)                      300 mW         137 mW
Resist sensitivity                                  5 mJ/cm2       3.5 mJ/cm2
Total source power @ IF                             115 W          33 W
Laser produced plasma (LPP) source (no filter)      316 W          87 W
Discharge produced plasma (DPP) source              720 W          160 W
Illuminator
† An improvement in multilayer reflectivity of 2%, to 67%
† A slight reduction in contingency loss, from 23% to 20%

Reticle
† The reticle reflectivity was improved by 2%, to 67%

Projection Optics
† A 1% improvement in multilayer reflectivity was assumed, to produce a reflectivity of 68%
† The contingency loss was reduced from 20% to 15%

Wafer
† A resist improvement was assumed to provide a sensitivity of 3.5 mJ/cm2
Finally, the overall system aging contingency was reduced from 37% to 35%. These improvements were multiplied together, as shown in Table 8.9, to provide an overall reduction in the required source power at the IF from 115 to 53 W. For an LPP source, this translates into a total source power of 140 W, and between 140 and 255 W for the discharge source. If a further improvement can be made in the system stages to reduce the stage overhead by 20%, i.e., from 27 to 21.6 s, the power at the IF can be reduced to 33 W. This translates into an LPP power requirement of 87 W and a discharge source power of 87–160 W. Because of the number of contingency factors for system aging, a new tool should have a much higher throughput than an older system that has experienced degradation in mirror
reflectivity and other losses. However, the initial throughput is limited by the stage speed and overhead, and later in the tool life, the throughput is limited by the source power. One possible tradeoff is to design a system with a fixed stage speed corresponding to a starting or minimum source frequency and to gradually increase the source repetition frequency as the system ages (in the range of a factor of 1.5–3) in order to optimize the system efficiency and CoO. An alternate system tradeoff could involve a design where a single source is used to support two wafer exposure systems and the source is switched between the two with a movable mirror. This could double the time the source power is effectively used and decrease the demands on stage performance by allowing slower scanning speeds.
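The tradeoff discussion above amounts to multiplying subsystem efficiencies together and dividing the wafer-level power demand by the product. The Python sketch below strings the nominal "present" parameters of Table 8.9 together in that way; it is a simplified reading of the model, not the tool suppliers' CoO model, and the small differences from the quoted 115 W and 360 W figures reflect rounding and contingency factors that are only approximated here.

```python
# Hedged sketch of the throughput/source-power bookkeeping described above,
# using the "present" parameter values from Table 8.9.
def chain(*factors):
    """Multiply a chain of transmission/reflectivity factors."""
    product = 1.0
    for f in factors:
        product *= f
    return product

# Source -> intermediate focus (IF): collection, collector mirror,
# spectral purity filter, debris-mitigation transmission.
source_to_if = chain(0.80, 0.50, 0.50, 0.80)          # -> 0.16

# Illuminator: 4 multilayer mirrors @ 65%, 3 grazing-incidence @ 85%, 23% loss.
illuminator = chain(0.65 ** 4, 0.85 ** 3, 1 - 0.23)   # -> ~0.084

reticle = 0.65                                        # reticle reflectivity
proj_optics = chain(0.68 ** 6, 1 - 0.20)              # 6 ML mirrors, 20% loss -> ~0.08
aging = 1 - 0.37                                      # overall aging contingency

wafer_power_w = 0.300                                 # required power at the wafer

power_at_if = wafer_power_w / (illuminator * reticle * proj_optics * aging)
lpp_source = power_at_if / chain(0.80, 0.50, 0.80)    # LPP: no spectral filter
dpp_source = power_at_if / source_to_if               # DPP: filter assumed needed

print(f"illuminator efficiency ~{illuminator * 100:.1f}%")   # ~8.4%
print(f"power at IF            ~{power_at_if:.0f} W")        # ~110 W (text: 115 W)
print(f"LPP source power       ~{lpp_source:.0f} W")         # ~340 W (text: ~360 W)
print(f"DPP source power       ~{dpp_source:.0f} W")         # ~690 W (table: 720 W)
```

Written this way, each factor in Table 8.9 can be varied independently to see its leverage on the required source power, which is exactly the kind of tradeoff the text argues must be made at the system level.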
8.8 Remaining Challenges

Although there are many issues that must be addressed in detail as the production tools are developed, the key challenges are summarized in Table 8.10, where the issues, their relative priority, the technical requirements, the challenges, and the relative difficulties are listed. The issues shown in this list have been part of other critical issues lists developed by the NGL committees in the past, although the priorities may be different.

TABLE 8.10 Remaining Challenges for Extreme Ultraviolet (EUV) Commercialization

Priority   Issue                              Requirement                        Challenges                             Difficulty
1          Source power                       ≤115 W @ IF                        System tradeoffs                       High
           Source and condenser reliability   ~1 year lifetime                   Debris, erosion, and thermal control   Moderate to high
2          CoO                                >115 wph and <$25 M per tool       System cost versus throughput          Moderate to high
           Low cost mask blanks               50 nm flatness, low defects        Manufacturing, volume scale up         Moderate to high
           Contamination                      Mitigation, cleaning               Environmental control                  Moderate
           High NA optics                     Figure, finish, lifetime           Continuous improvement                 Moderate
3          Commercial masks                   Costs < 193 nm masks               Implementation                         Low to moderate
           Reticle protection                 Protection in lieu of pellicle     Container design, in situ cleaning     Low
4          Resists                            2 mJ/cm2, 2–3 nm LER, 20 nm Res    New formulation                        Moderate
5          Thermal management                 Support 100 wph                    System design                          Low

Although there has been a lot of focus on source development by numerous developers, and although there are tradeoffs that can be made between source power and other system improvements, the source remains the top priority (Priority 1) challenge because of the high power requirement. The source and condenser reliability and the source power requirements are essentially part of the same challenge, in that increasing source power places more severe requirements on maintaining the condenser optics integrity. The difficulty in continuing to increase source power is very high, in that many of the fundamental thermal and debris limitations are being approached. For these reasons, the source power must be addressed from a systems viewpoint, as previously noted. Tradeoffs must be addressed between total
power requirements and improved multilayer reflectivities in the condenser and projection optics box, multilayer bandwidth matching, resist sensitivity, debris control methods, spectral purity control, and wafer stage overhead. In addition, the throughput requirement and the contingencies included in the source power model must be adjusted for tool CoO issues associated with tool operational, maintenance, and component replacement strategies.

The Priority 2 challenges address several issues that are associated with CoO, including masks, tool maintenance associated with contamination control, and the availability of high quality, high NA optics. Reticle protection is listed because no proven consensus methods have been adopted. Methods under consideration include thermophoresis and electrostatic protection. In addition, in-vacuum reticle inspection and cleaning in an auxiliary chamber attached to the reticle handler may be required prior to loading a reticle into the tool or for precautionary cleaning at specific intervals during use. Each of these issues must be addressed for production tools.

Improved resists with low LER and improved resolution and photospeed are required. Although the emphasis has been focused on extending DUV resists, a new formulation may be required to obtain the tolerances needed to support the 32 nm node and below.

Thermal management has been listed as the last priority in Table 8.10. Whereas this is an important issue as power levels are scaled to production levels, thermal issues are well understood and can be controlled in vacuum environments using active cooling methods. Thermal issues must be addressed at the system design level; differential heating of the various optics with appropriate cooling and/or incremental alignment/focus adjustments, the use and type of spectral purity filter, and heat-generating sources, including absorption of EUV flux, stage components, and the source/condenser environment, must be considered.

Although a number of challenges in developing and manufacturing production tools remain, a large number of researchers, private companies, and consortia are engaged in world-wide efforts to address each of the issues. For example, there are over 40 companies, universities, and laboratories working on EUV-related problems in the United States. In addition, ISMT has taken the lead in developing the Mask Blank Development Center and the Resist Test Center, both in Albany, New York, as well as sponsoring the development of high quality mask blanks. SRC continues to fund several university programs, and leading semiconductor manufacturers are supporting the infrastructure development. In Europe, there are over 50 companies and research institutes working on EUV source, mask, and tool projects. ASML and Exitech have taken the lead role in the development of alpha tools and microsteppers. In addition, European governments are sponsoring programs through the MEDEA+ program, and consortia are sponsoring projects at IMEC and LETI. In Japan, the number of companies and research institutes working on EUV projects has grown from less than 10 to over 30 in the last two years. The Japanese government is sponsoring projects within the ASET and AIST programs and has established a new consortium, EUVA, that is focusing on EUVL tool development. Both Nikon and Canon have active tool development programs with a goal of providing prototype tools in 2006.
Many of the EUV research results have been reported at numerous conferences over the past six years such as SPIE, EIPBN, and MNE as well as at International EUVL Workshops and Symposia. Over 1000 papers have been presented in over 40 conferences.
8.9 Conclusions

Over the last six years, the risks associated with EUVL have been dramatically reduced. There is a good general understanding of the remaining issues, and
world-wide support by over 100 companies and research institutions is being directed toward solving the remaining problems. Most key technology decisions have been made, the technology is well understood, and extensive research and development results have been published. The proof of technology has been demonstrated in all areas, ranging from the fabrication of optics, multilayer coatings, and mask patterning and repair to a variety of EUV sources and integrated micro and full field exposure tools. In the areas where additional development is needed, the target requirements have been quantified and the system tradeoff issues have been identified. The R&D knowledge base is in place, and the commercial production capability is established. Based on the extensive R&D, the well-planned experiments, the detailed modeling and analysis by many different scientists and engineers at leading research institutions, and the continuous close scrutiny by advocates and opponents of EUVL, it can categorically be stated that there are no lurking issues threatening EUVL of the kind encountered in other NGL technologies, e.g., CaF2 birefringence for 157 nm lithography, mask distortion and fabrication for 1× x-ray lithography, and complex membrane masks, stitching, and charging effects for e-beam and ion beam technologies. EUVL can provide the only viable solution for 32 nm, and if solid commitments were made by the user and production communities, it could be ready for 45 nm. Although there will always be incremental improvements, the R&D has been completed, the knowledge base is in place, the commercial base is available, and user commitments must be made in order to complete the technology implementation. In summary:

† Six years of R&D and two years of experience with the full field ETS have demonstrated EUVL technology feasibility and minimized risks.
† Source/system tradeoffs can provide achievable, realistic source power goals.
† Mask costs are affordable; costs and complexity are potentially less than for 193 nm masks.
† Suppliers are engaged to commercialize the technology.
8.10 Postscript

Because of the rapid pace of development in the field of EUV technology, significant progress has been made since the main portion of this chapter was written. Some of the results were reported at the 3rd International EUVL Symposium held in Miyazaki, Japan, on November 2–4, 2004 [233], and they are discussed here. Based on the progress in the different EUV technology and infrastructure areas, the conference steering committee at the 3rd International EUVL Symposium ranked three EUV critical issues and listed three more without ranking them. They are the following:
1. Availability of defect-free masks
2. Lifetime of source components and collector optics
3. Resist resolution, sensitivity, and LER met simultaneously
Additional critical issues that were not ranked included
1. Reticle protection during storage, handling, and use
2. Source power
3. Projection and illuminator optics lifetime

This list has to be compared to the EUV critical issues list as ranked by the conference steering committee at the 2nd International EUVL Symposium, held in Antwerp, Belgium, in 2003, which listed the following six critical issues:
1. Source power and lifetime, including condenser optics lifetime
2. Availability of defect-free masks
3. Reticle protection during storage, handling, and use
4. Projection and illuminator optics lifetime
5. Resist resolution, sensitivity, and LER
6. Optics quality for 32 nm node
Because of the significant progress in generating EUV photons, source power is no longer among the top three EUV critical issues, and the availability of defect-free masks is ranked first. The CoO is one of the main challenges for EUV, and associated topics such as the lifetime of source components and collector optics are ranked high on the critical issues list. Not surprisingly for a maturing technology, resist issues have come into focus, and "Resist resolution, sensitivity, and LER met simultaneously" is now the third EUV critical issue. The following paragraphs provide a brief summary of the major achievements reported at the 3rd International EUVL Symposium and the associated SEMATECH workshops [234–239].

8.10.1 EUV Optics and EUV Multilayers

Progress in EUV optics and EUV multilayers is more difficult to assess than during the period when coating technologies were developed in research laboratories, because optics substrate manufacturing technology and coating technology are now extremely competitive areas. However, several MET tool optics have been manufactured and are being used for resist and mask-related work. The optics quality, e.g., wavefront error, is within specification, and the bottleneck for sub-40 nm printing is not the optics but the availability of high resolution resists. The first 6-mirror EUV alpha tool optics are nearly complete, and the supplier is confident that the first alpha tool will be available for exposures by mid-2006. Meeting the figure and finish requirements for the 45 nm half-pitch node is not seen as critical, but extensibility toward the 32 nm half-pitch node must be demonstrated. However, a greater concern to the end customers than optics quality is the optics lifetime. The optics lifetime of the first 6-mirror alpha tool is currently specified at 1000 light-on hours at standard tool operating conditions (~10 wafers per hour throughput). For the production tool, the specification is 30,000 light-on hours (at 100 wafers per hour throughput). A lot of effort is being focused on identifying oxidation-resistant capping layers, which, in combination with suitable mitigation techniques, are perceived as the best way to reach the 30,000 h optics lifetime specification. Oxygen diffusion through the capping layer has to be prevented, and optimizing the capping layer material, texture, and morphology are the main challenges [240,241]. Finding the best
capping layer is not trivial because the process parameters for depositing these layers of 1–2 nm thickness are restricted by the temperature load the Mo/Si multilayer stack (or the interface-engineered multilayer stack) can handle without inter-diffusion and multilayer compaction, which affect the EUV reflectance characteristics of the Bragg reflector. Although several different materials or material combinations have been evaluated, Ru is still one of the main contenders, and it is more resistant to oxidation than Si-capped multilayers [240–242]. Among the cleaning techniques, atomic hydrogen cleaning investigated by the EUVLLC/VNL [68] is receiving more attention; however, hydrogen diffusion through the capping layer can cause serious damage to the B4C interface layers. Surface scientists have been investigating many of the phenomena that are responsible for capping layer oxidation using model systems such as the oxidation of different single crystal surfaces of Ru. The application of this surface science knowledge base to find capping layer solutions is encouraging [243].

8.10.2 EUV Sources

As previously mentioned, tremendous progress has been made in generating EUV photons. The power levels reported for Sn DPP sources are in the range of 50–70 W (2π sr, 2% bandwidth) at the IF [244–247]. This significant improvement is mainly due to the use of Sn as the source target material to produce EUV. Although the power for Xe EUV sources has also been increased, to 25 W at the IF [244], Sn has a much higher conversion efficiency. Conversion efficiencies of 2.3% have been measured [248,249], and even higher conversion efficiencies are predicted for solid Sn [250]. The higher conversion efficiency of Sn allows Sn-based sources to be scaled to production tool requirements (115 W at the IF). In addition, for the same EUV power, the source size (i.e., the plasma volume that emits the EUV radiation) of Sn DPP sources is smaller than that of Xe DPP sources [244]. Therefore, an optical system with a given étendue can accommodate a much more powerful Sn source; the étendue limitation on useable power for Xe DPP EUV sources is not an issue for Sn DPP sources. Sn DPP sources are now able to generate about half the power at the IF that will be needed for high volume manufacturing production EUV steppers. LPP sources are still lagging behind, with much lower EUV power levels at the IF. To scale an LPP source to 60 W at the IF using Sn as the target material, about 12 kW of laser pump power will be required, and with higher collection efficiencies, this figure may be reduced to 8 kW [245]. In addition to Sn, In and Li are being investigated as potential source target materials. One of the major commercial source suppliers is shifting development work from a Sn DPP source to a Li-droplet LPP source [247]. The main reason for considering Li (which, according to the supplier, already produces about 60 W at the IF) is the limitation of the collection efficiency of Sn DPP source designs to less than 10% (for other Sn DPP source designs, suppliers claim that >10% collection efficiency will be possible). A Li EUV source provides a narrow line source with less heating than the other continuum-type sources, where a lot of energy goes into producing unwanted radiation and debris as well as heat that has to be removed. Although there are other challenges related to Li contamination, the conversion efficiency for Li is reported to be around 2.5% [247].
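The laser pump powers quoted above for a Sn LPP source can be sanity-checked with the same kind of chain arithmetic used in Section 8.7. In the sketch below, the 2.3% conversion efficiency is the value quoted in the text, whereas the collection and transport fractions are assumptions introduced only to show that 8–12 kW is the right order of magnitude for ~60 W at the IF.

```python
# Order-of-magnitude check of the Sn LPP scaling quoted above.  The 2.3%
# conversion efficiency is from the text; the collection and transport
# fractions are illustrative assumptions.
def pump_power_kw(target_if_w, conv_eff=0.023, collected=0.27, transport=0.80):
    """Laser pump power (kW) needed for a target in-band EUV power at the IF."""
    return target_if_w / (conv_eff * collected * transport) / 1e3

print(f"~{pump_power_kw(60, collected=0.27):.0f} kW pump for 60 W at IF")
print(f"~{pump_power_kw(60, collected=0.40):.0f} kW pump if collection improves")
```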
By heating a collector to around 400°C, Li that has condensed on the mirror surface can be desorbed, thereby solving one of the main problems of current high power sources, i.e., collector mirror lifetime. Collector optics using optimized EUV Bragg reflectors that can withstand higher thermal loads are being developed; however, the peak reflectance of these mirrors is ~40% lower than the 70% of typical Mo/Si Bragg reflectors. Regardless of the target material being used, the main challenge for EUV source development is to increase the lifetime of source components (such as electrodes for
DPP sources, debris filters, etc.) and collector optics. The progress in producing more EUV photons has removed the issue of EUV source power as a potential EUV technology showstopper for high volume manufacturing. The focus of EUV source development is now on CoO issues, e.g., finding low cost, replaceable collectors for which the unit costs of EUV source consumables (collector, debris filter, etc.) and the component replacement times, governed by the mean time between failure or interrupt (MTBF or MTBI) and the mean time to repair (MTTR), are comparable to the replacement unit costs and repair times for current ArF lithography sources. Until recently, EUV source metrology has focused on measuring EUV photons; for example, the measurements of generated source power and source power at the IF have been critical to benchmark EUV source power progress [251]. New metrologies are needed to measure and characterize the debris coming from high power Sn DPP sources and to reduce and/or mitigate this debris before it hits the collector mirrors. Several efforts are currently providing critical data that will help mitigate debris in DPP and also in LPP sources [252–259]. In addition to high power EUV sources, the development of low power EUV sources has continued. Micro exposure tools do not need high power sources, nor do EUV microscopes or other EUV imaging systems that could be used for metrology purposes. Good progress has also been made in developing coherent EUV sources around 13 nm [260] that could be used for at-wavelength interferometry of EUV optics.
8.10.3 EUV Masks

The availability of defect-free mask blanks is the top challenge for EUVL. To meet this challenge, improvements in many different areas are needed, including mask substrates (flatness, roughness, defectivity); substrate metrology (flatness); inspection sensitivity down to 30 nm PSL equivalent size; cleaning of substrates and mask blanks; defect-free multilayer deposition processes; verification of visible light inspection versus at-wavelength inspection of blanks; integrated handling concepts that demonstrate zero adders at 30 nm PSL equivalent defect size for blanks and patterned masks; mask blank repair of amplitude and phase defects; and the respective metrology to verify repair results. In almost all of these areas, significant progress has been made, and mask substrate and blank suppliers are meeting increasingly better specification requirements [234,261–263]. In the substrate and blank metrology area, all of the learning and know-how from the EUVLLC/VNL projects (flatness, roughness, defectivity, reflectivity, reflectivity uniformity, etc.) has been transferred to benefit commercial suppliers, and focused industry efforts build on and expand the demonstrated capabilities [261].

For EUV mask substrates, defectivity and flatness are the two most critical issues that need improvement. Specifically, mask substrate defects have been a problem because it is difficult to detect defects added during the multilayer deposition process if substrate defects become decorated and cannot be distinguished from real defect adders originating from the deposition process. Although an excellent method, the so-called multilayer-on-multilayer process [262], has been developed to discriminate against decoration of substrate defects, ultimately much lower substrate defect levels will be needed. The multilayer-on-multilayer process was developed to enable monitoring progress toward lower added defects during multilayer deposition. The SEMATECH MBDC has demonstrated <0.105 added defects per cm2 at 80 nm PSL sensitivity for the deposition process, whereas the best total blank defect level demonstrated to date is 0.32 defects/cm2 at 80 nm PSL sensitivity [262,263]. The best supplier defect density results are somewhat higher; however, all suppliers show good improvement in some areas (defectivity, peak
reflectivity, reflectivity uniformity, flatness, roughness, etc.), and some suppliers show very good improvement in all areas [261]. Whereas two different deposition tools were used at the EUVLLC/VNL to obtain low added defect levels and high peak reflectivity and reflectivity uniformity, the industry has succeeded in merging the two processes using one tool to achieve low added defect levels and very good reflectivity and uniformity at the same time [234,263]. Because of the integrated systematic effort of reducing blank defects, it has been found that many of the defects are organic in nature although they are coated with Mo and Si [263]. This clearly shows that there is a great potential for reducing the defect levels by improving substrate and blank handling processes because most of the organic defects result from handling contamination. The MBDC deposition process reporting 0.105 added defects/cm2 at 80 nm PSL sensitivity is, therefore, likely to be much cleaner than indicated by this current number of defects added during deposition. The topic of reticle handling and storage is receiving increased attention, and zero added defects at 150 nm PSL sensitivity level have been demonstrated by one stepper supplier for their reticle handling system to be used on the first EUV alpha tool [264]. Pseudo pellicle concepts have been suggested to protect EUV masks during handling and storage, and the status of five different mask carrier designs has been reported [234]. This is a significant increase in the work on a critical area that is expected to drive progress and standardization efforts toward a common industry solution. Clearly, a common carrier/pseudo pellicle standard is needed to enable continued progress toward demonstrating zero defect mask blanks, testing, and benchmarking of proposed industry carrier/pseudo pellicle concepts [234,265]. Currently, the bottleneck for improving blank defectivity is inspection capability. The best inspection sensitivity for defect inspection on Mo/Si multilayers is 80 nm PSL equivalent size. To demonstrate the 0.003 defects/cm2 at 30 nm PSL equivalent size required for the 45 nm half-pitch node, an improved defect inspection method is needed. Although good progress is being made in demonstrating actinic mask blank inspection [266,267], one of the remaining challenges is to provide better actinic and visual inspection techniques. The other challenge that could limit progress toward achieving the required low mask blank defectivities is mask cleaning. Whereas larger particles are bound by Van-der-Waals-type forces and can be easily cleaned, smaller particles are bound by much stronger chemical bonds. This requires the development of novel cleaning technologies to enable sub-100 nm particle cleaning that may actually blur the boundary between amplitude defect repair and cleaning of small particles. Although defect repair has not received much attention, some progress has been reported. The defect repair technologies pioneered by the EUVLLC/VNL are being employed by commercial mask repair tool suppliers to develop repair processes for both amplitude and phase defects. In addition to SEMATECH’s MBDC that is focused on reducing mask blank defects, the first EUV mask pilot lines are being started [268,269]. Topics receiving increased attention include defect inspection [270], analysis of defect composition [263,271], development of capping layer materials for EUV blanks [272], and EUV blank repair [273,274]. 
In addition, mask patterning [275] and printing of patterned EUV masks are being verified on the pilot lines using available EUV MET tools. Overall, significant progress has been made in enabling the availability of low-defect mask blanks. Given the past successes and the ongoing efforts, the level of optimism for obtaining the 0.003 defects/cm2 target at 30 nm PSL equivalent size is much higher than it was a year ago. However, the low defect levels represent a formidable challenge that will need continued industry focus.
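The defect-density numbers quoted in this subsection can be put in perspective with a short calculation: an areal defect density times the blank quality area gives the expected defect count per blank, and a Poisson model gives the fraction of blanks that are defect free. The sketch below is illustrative only; the 142 mm x 142 mm quality area is an assumed value, not a figure taken from the specifications discussed above.

```python
import math

def defects_per_blank(density_per_cm2: float, quality_area_cm2: float) -> float:
    """Expected number of defects on one blank for a given areal defect density."""
    return density_per_cm2 * quality_area_cm2

def defect_free_yield(mean_defects: float) -> float:
    """Poisson probability that a blank carries zero defects."""
    return math.exp(-mean_defects)

# Assumed quality area of roughly 142 mm x 142 mm (~202 cm^2); illustrative only.
area_cm2 = 14.2 * 14.2

for label, density in [("best total blank, 80 nm PSL", 0.32),
                       ("deposition only, 80 nm PSL", 0.105),
                       ("target, 30 nm PSL", 0.003)]:
    mean = defects_per_blank(density, area_cm2)
    print(f"{label}: {mean:6.2f} defects/blank, "
          f"defect-free yield {defect_free_yield(mean):.1%}")
```

Under these assumptions, even the 0.003 defects/cm2 target corresponds to roughly 0.6 defects per blank, i.e., only about half of the blanks would be defect free, which is why the target is described above as a formidable challenge.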
8.10.4 EUV Resists

The major challenge for EUV resists is still to meet several tough requirements simultaneously, i.e., resist resolution, sensitivity, LER, and low levels of outgassing. For a long time, Shipley's (Rohm and Haas) EUV-2D has been the standard EUV resist, showing resolution down to 50 nm for 2:1 features and down to 40 nm for overexposed 3:1 features. For resist resolution, the major question posed by those EUV-2D images was whether chemically amplified resists (CAR) can resolve features below 40–50 nm. Several resists have been tested on the MET and EUV interference lithography (IL) tools [276]. Reasonable resist resolution results, including not only top-down but also cross-section analysis, have been shown for 50 nm 1:1 features; 35–45 nm 1:1 features were achieved with some top rounding, resist footing, and higher LER. The best resist resolution to date was obtained with the MET-1K Rohm and Haas resist exposed at the Berkeley MET tool [277,278]. However, resist sensitivity for the MET-1K is >20 mJ/cm2. Initial results for molecular glass resists [279] have not shown superior resolution or LER, and outgassing data are not yet available for those resists. An important result is that PMMA exposed on EUV IL tools shows resist resolution down to 20 nm. Because of outgassing, PMMA cannot be exposed on the MET tools without risking optics contamination; however, it is clear from the optics data available for those tools that feature resolution is currently limited by resist and not optics. Resist resolution will need significant improvement before optics resolution becomes the limiting factor. The availability of high-resolution resists that can be used for qualifying EUV optics may actually become a problem in qualifying the first EUV alpha tools if sufficiently high-resolution resists are not available by late 2005. Therefore, the emphasis is now more on improving resolution and less on increasing sensitivity. Although no progress has been shown for LER, LER is an issue that is not EUV specific and will have to be addressed by any technology targeting the 45 nm node. Learning from other technologies (e.g., LER improvements for 193 nm immersion resists) may be transferable to EUV. With new exposure capabilities becoming available that can be used by resist suppliers to test resists, it is not unreasonable to expect a significant acceleration in EUV resist development. The Berkeley MET, one of the first high-resolution EUV exposure capabilities to become available to resist suppliers in mid-2004, has been used to screen over 100 resists/resist modifications. In addition, a new EUV Resist Test Center (RTC), equipped with a stand-alone MET tool, an integrated track, and all of the respective metrology capabilities on-site, is opening to provide resist exposure services for resist benchmarking [280]. The EUV RTC, which is located next to the MBDC in Albany, New York, will become operational in late 2004, and it can be used to accelerate EUV resist learning by resist suppliers that, until now, have been limited by exposure capability.

8.10.5 EUV Exposure Tool Development

Three 0.3 NA MET tools are now operational and running, and the fourth will provide an exposure service in late 2004. Two of those, the HiNA tool at ASET and the MET tool at Berkeley, use synchrotron radiation as the EUV light source [278,281]. The other two, Intel's MET [282] and SEMATECH's MET at the EUV RTC, use Xe DPP sources.
The Berkeley MET tool provides a variable illumination system that can produce any kind of desired illumination and is predicted to enable printing resolution in the 15 nm range. In addition to the 0.3 NA MET tools with 5× demagnification, the 0.1 NA, 10× demagnification tools at Sandia are still available, and new EUV IL exposure capability has become available to test resists.
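Resist sensitivity, delivered source power, and exposure-tool throughput are linked by a simple energy balance: the dose required by the resist times the exposed wafer area must be supplied by the in-band power that survives the optical train, and the resulting exposure time adds to the stage and handling overhead. The sketch below makes that scaling explicit; it is not a model of any tool named in this chapter, and the power at wafer level, exposed area, and overhead time are assumed values chosen only for illustration.

```python
# Hedged throughput estimate from resist dose and delivered in-band EUV power.
# All numerical inputs are assumptions used only to illustrate the scaling.

def wafers_per_hour(dose_mj_cm2: float,
                    power_at_wafer_w: float,
                    exposed_area_cm2: float,
                    overhead_s_per_wafer: float) -> float:
    """Throughput when exposure time = dose * area / power, plus a fixed per-wafer overhead."""
    expose_s = dose_mj_cm2 * 1e-3 * exposed_area_cm2 / power_at_wafer_w  # J / W = s
    return 3600.0 / (expose_s + overhead_s_per_wafer)

# Assumptions: 300 mm wafer (~706 cm^2 exposed), 20 s/wafer of stage and handling overhead.
area = 706.0
for dose in (5.0, 10.0, 20.0):      # mJ/cm^2; 20 corresponds to the >20 mJ/cm^2 resist noted above
    for p_wafer in (0.2, 1.0):      # watts of in-band EUV reaching the wafer (assumed)
        wph = wafers_per_hour(dose, p_wafer, area, overhead_s_per_wafer=20.0)
        print(f"dose {dose:4.1f} mJ/cm2, {p_wafer:.1f} W at wafer: {wph:5.1f} wph")
```

The estimate shows why sensitivity targets and the source power requirements discussed earlier cannot be negotiated independently: doubling the required dose at fixed power roughly doubles the exposure time per wafer.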
TABLE 8.11
Extreme Ultraviolet (EUV) Tool Introduction Schedule of Major Stepper Suppliers

Company   Alpha Tool                 Beta Tool            Gamma Tool
ASML      2 tools, <10 wph, 2006     30 wph tool, 2007    80 wph tool, 2009
Nikon     EUV1 tool, 2006                                 EUV2 HVM tool, 2008
Canon     <10 wph tool, 2006/2007                         60 wph tool, 2008/2009
As shown in Table 8.11, all major lithography tool suppliers now have EUV firmly on their roadmaps, and currently, two alpha tools are being built by one of the suppliers to be delivered in 2006 [264,283,284]. Four of the six projection optics mirrors for the first of those two alpha tools have been fabricated to specification, and the target lifetime for the projection optics is about three years, translating into the 1000 light-on hours mentioned in Section 8.10.1 [264,285]. Because source power is no longer viewed as one of the key critical issues, the main concerns for EUV tools are now optics contamination, optics lifetime, and overall CoO [286].
8.10.6 Outlook

Significant progress has been made toward EUV commercialization during 2004, and if the pace of progress continues, the outlook is good that EUV will be ready for introduction at the 45 nm half-pitch node. However, EUV development cannot occur in isolation from developments in other lithography technologies. For example, competing technologies such as 193 nm immersion lithography have drawn a great deal of attention, dedication, and industry resources; this emphasizes the important role that major consortia like ASET, EUVA, MEDEA+, and SEMATECH have in coordinating and focusing valuable resources so that there is little to no duplication of effort at the international level. The International EUV Initiative provides a framework through which those organizations can coordinate work, and it helps drive industry consensus through international technical working groups for the main technology areas. Those efforts help foster the discussions among the key players necessary to drive industry standardization, specifically in the mask area, but also increasingly in other areas such as resist (outgassing), source (source metrology), and optics contamination. Although both main contenders, EUV and 193 nm immersion, must address major challenges, it is fair to state that, presently, more is known about the EUV critical issues than about potential issues with extending 193 nm immersion technology down to 45 nm half-pitch. However, assuming that the industry will overcome the challenges for both technologies, the main criterion for down selection will likely be CoO. For EUV, this means that as commercialization passes firmly into the hands of the suppliers and as the infrastructure for the 45 nm half-pitch node becomes operational, increasing effort has to be directed toward EUV extensibility. Clearly, EUV is a technology that will be able to support several nodes, but the technological challenges need to be solved. Looking at the EUV specifications for the 32 nm half-pitch node should make it clear that, in order to meet those challenges by 2011, work on EUV extensibility has to start now. Currently, there is no alternative to EUV for high-volume manufacturing at the 32 nm half-pitch node and below.
References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25.
26. 27. 28. 29. 30.
31. 32. 33. 34. 35. 36.
37. 38. 39.
International Technology Roadmap for Semiconductors, http://public.itrs.net/ H.J. Levinson. 2001. Principles of Lithography, Bellingham: SPIE Press. A.K.-K. Wong. 2001. Resolution Enhancement Techniques, Bellingham: SPIE Press. A.M. Hawryluk and L.G. Seppala. 1988. J. Vac. Sci. Technol. B, 6: 2162. W.T. Silfvast and O.R. Wood, II. 1988. Microelectron. Eng., 8: 3. H. Kinoshita et al. 1989. J. Vac. Sci. Technol. B, 7: 1648. J.E. Bjorkholm et al. 1990. J. Vac. Sci. Technol. B, 8: 1509. D.A. Tichenor et al. 1991. Opt. Lett., 16: 1557. K.B. Nguyen et al. 1996. OSA TOPS on Extreme Ultraviolet Lithography, Vol. 4, Washington, DC: Optical Society of America. C.W. Gwyn et al. 1997. Extreme Ultraviolet Lithography, EUVLLC. C.W. Gwyn et al. 1999. Extreme Ultraviolet Lithography: A White Paper, Livermore, CA: Extreme Ultraviolet Limited Liability Company. International SEMATECH NGL Task Force meetings. European EUCLIDES program. French PREUVE program. Japanese ASET program. European MEDEACprogram. Japanese EUVA program. P.J. Silverman. 2001. Proc. SPIE, 4343: 12. H. Meiling et al. 2001. Proc. SPIE, 4343: 38. H. Meiling et al. 2002. Proc. SPIE, 4688: 52. H. Meiling et al. 2003. Proc. SPIE, 5037: 24. F. Bociort, M.F. Bal, and J.J.M. Braat. 2000. in Proceedings of the 2nd International Conference on Optical Design and Fabrication, Tokyo, Japan, ODF2000, November 15–17, p. 339. D. Attwood. 1999. Soft s-rays and Extreme Ultraviolet Radiation, Cambridge: Cambridge University Press. D.A. Tichenor et al. 2000. Proc. SPIE, 3997: 48. J.S. Taylor et al. 2003. Visible Light Interferometry and Alignment of the MET Projection Optics, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. F. Zernike. 1934. Physica, 1: 689. M. Born and E. Wolf. 1999. Principles of Optics, 7th Ed., Cambridge: Cambridge University Press. A. Mare´chal. 1947. Rev. d’ Optique, 26: 257. C. Krautschik et al. 2001. Proc. SPIE, 4343: 38. P. Ku¨rz. 2003. The EUV Optics Development Program at Carl Zeiss SMT AG, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. G.E. Sommargren. 1996. Phase Shifting Diffraction Interferometer, U.S. Patent 5,548,403, August 20. H.N. Chapman and D.W. Sweeney. 1998. Proc. SPIE, 3331: 102. U. Dinger et al. 2000. Proc. SPIE, 4146: 35. K.A. Goldberg et al. 1999. Proc. SPIE, 3676: 635. K.A. Goldberg et al. 2002. Proc. SPIE, 4688: 329. K.A. Goldberg et al. 2002. VNL Research in Interferometry for EUV Optics, 1st International Extreme Ultra-Violet Lithography (EUVL) Symposium, Dallas, Texas, October 15–17, 2002. K.A. Goldberg et al. 2003. Proc. SPIE, 5037: 69. H. Medecki. 1998. Phase-Shifting Point Diffraction Interferometer, U.S. Patent No. 5,835,217, November 10, 1998. H. Medecki et al. 1996. Opt. Lett., 21:19, 1526.
40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53. 54.
55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65. 66.
67. 68. 69. 70. 71. 72. 73. 74.
75. 76. 77. 78. 79. 80.
81.
K.A. Goldberg. 1997. Extreme Ultraviolet Interferometry, PhD thesis, University of California, Berkeley. P.P. Naulleau et al. 1999. Appl. Opt., 38:35, 7252. G.E. Sommargren et al. 2002. Proc. SPIE, 4688: 316. K. Murakami et al. 2003. Proc. SPIE, 5037: 257. C. Montcalm et al. 1998. Proc. SPIE, 3331: 43. D.G. Stearns, R.S. Rosen, and S.P. Vernon. 1991. J. Vac. Sci. Technol. A, 9: 2669. J.A. Folta et al. 1999. Proc. SPIE, 3676: 702. T. Feigel et al. 2000. Proc. SPIE, 3997: 420. S. Bajt, D.G. Stearns, and P.A. Kearney. 2001. J. Appl. Phys., 90:2, 1017. R. Soufli et al. 2001. Proc. SPIE, 4343: 51. M. Shiraishi, N. Kandaka, and K. Murakami. 2003. Proc. SPIE, 5037: 249. S. Braun et al. 2003. Proc. SPIE, 5037: 274. E. Spiller et al. 2003. Appl. Opt., 42:19, 4049. C.C. Walton et al. 2000. Proc. SPIE, 3997: 496. P.B. Mirkarimi et al. 2003. Advances in the Ion Beam Thin Film Planarization Process for Mitigating EUV Mask Defects, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02. E. Louis et al. 1999. Proc. SPIE, 3676: 844. E. Louis et al. 2000. Proc. SPIE, 3997: 406. E. Zoethout et al. 2003. Proc. SPIE, 5037: 872. J.A. Folta et al. 2002. Proc. SPIE, 4688: 173. P.B. Mirkarimi and C. Montcalm. 1998. Proc. SPIE, 3331: 133. C. Montcalm. 2001. Opt. Eng., 40:3, 469. A.K. Ray-Chaudhuri et al. 1998. Proc. SPIE, 3331: 124. H.N. Chapman and D.W. Sweeney. 1998. Proc. SPIE, 3331: 102. M. Shiraishi et al. 2002. Proc. SPIE, 4688: 516. A. Barty and K.A. Goldberg. 2003. Proc. SPIE, 5037: 450. M.E. Malinowski et al. 2002. Proc. SPIE, 4688: 442. M.E. Malinowski. SEMATECH Project LITH113: EUV Optics Contamination Control Gas Blend Carbon Mitigation Data and Final Report, Report to International SEMATECH, Project LITH113, Agreement 399509-OJ. M.E. Malinowski et al. 2001. Proc. SPIE, 4343: 347. S. Graham et al. 2002. Proc. SPIE, 4688: 431. B. Mertens et al. 2003. Proc. SPIE, 5037: 95. S. Bajt et al. 2002. Opt. Eng., 41: 1797. S. Bajt et al. 2001. Proc. SPIE, 4506: 121. S. Braun et al. 2002. Jpn. J. Appl. Phys., 41: 4074. S. Bajt et al. 2003. Proc. SPIE, 5037: 236. S. Bajt et al. 2003. Investigation of Oxidation Resistance of EUV Multilayers, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. C. Krautschik et al. 2002. Proc. SPIE, 4688: 289. M.E. Malinowski et al. 2003. Proc. SPIE, 5037: 429. V. Bakshi et al. 2003. ”Extreme ultraviolet lithography: Status and challenges ahead,“ Semiconductor Fabtech, 19: 67. R. Lebert et al. 2001. Proc. SPIE, 4343: 215. U. Stamm, H. Schwoerer, and R. Lebert. 2002. Phys. J. 33–39. G.D. Kubiak. Prospects and challenges for high-power EUV sources, SPIE 28th Annual International Symposium and Education Program on Microlithography, Februray 23–28, paper 5037-14. Y. Watanabe, K. Ota, and H. Franken. 2003. Joint Spec ASML, Canon and Nikon, EUV Source Workshop, Antwerp/Belgium, September 29.
82.
A. Barty et al. 2002. Aerial Image Microscopes for Inspection of Defects in EUV Mask Blanks, 1st International Extreme Ultra-Violet Lithography (EUVL) Symposium, Dallas, Texas, October 15–17, 2002. H. Fiedorowicz et al. 2003. Proc. SPIE, 5037: 389. R. Lebert et al. 2003. Inband EUV Open Frame Resist Exposer TEUVL, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. W.-D. Domke et al. 2003. Resist Characterization for EUV-Lithography, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. M. Yi et al. 2002. Proc. SPIE, 4688: 395. K.A. Goldberg et al. 2002. Proc. SPIE, 4688: 329. A. Rundquist et al. 1998. Science, 280: 1412. A. Paul et al. 2003. Nature, 421: 51. V. Banine et al. 2000. Proc. SPIE, 3997: 126. V. Banine and J. Moors. 2001. Proc. SPIE, 4343: 203. W.H. Bennett. 1934. Phys. Rev., 45: 890. N.R. Fornaciari et al. 2002. Proc. SPIE, 4688: 110. U. Stamm et al. 2003. Proc. SPIE, 5037: 119. U. Stamm et al. 2002. Proc. SPIE, 4688: 87. G. Schriever et al. 2000. Proc. SPIE, 3997: 162. M.W. McGeoch. 1998. Appl. Opt., 37: 1651. M.W. McGeoch. 2000. Proc. SPIE, 3997: 861. V.M. Borisov et al. 2002. Proc. SPIE, 4688: 626. U. Stamm et al. 2003. XTREME Technologies EUV Sources, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29, 2003. U. Stamm et al. 2003. High Power Gas Discharge and Laser Produced Plasma Sources for EUV Lithography, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium. September 30–October 02, 2003. V. Borisov et al. 2003. A Comparison of EUV Sources for Lithography Based on Xe and Sn, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29, 2003. V. Borisov et al. 2003. Development of High Conversion Efficiency High Power EUV Sources for Lithography, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. K. Bergmann et al. 1999. Appl. Opt., 38: 5413. J. Pankert et al. 2002. Proc. SPIE, 4688: 87. J. Pankert et al. 2003. Proc. SPIE, 5037: 112. J. Pankert et al. 2003. Philips’s EUV Lamp: Status and Roadmap, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29. J. Pankert et al. 2003. Philips’s EUV Lamp: Status and Roadmap, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02. Y.P. Raizer. 1997. Gas Discharge Physics, Berlin: Springer. N.R. Fornaciari et al. 2002. Proc. SPIE, 4688: 110. J.M. Pouvesle et al. 2003. Plasma Sources Sci. Technol., 12: S43. E. Robert et al. 2002. Proc. SPIE, 4688: 672. Y. Teramoto et al. 2003. Proc. SPIE, 5037: 767. W.T. Silfvast. 1999. Proc. SPIE, 3676: 272. M.A. Klosner and W.T. Sifvast. 1998. Opt. Lett., 23: 1609. N.R. Fornaciari et al. 2000. Proc. SPIE, 3997: 120. N.R. Fornaciari et al. 2001. Proc. SPIE, 4343: 226. W.N. Partlo et al. 2000. Proc. SPIE, 3997: 136. W. Partlo, I. Fomenkov, and D. Birx. 1999. Proc. SPIE, 3676: 846. W.N. Partlo et al. 2001. Proc. SPIE, 4343: 232. I.V. Fomenkov et al. 2002. Proc. SPIE, 4688: 634.
83. 84.
85.
86. 87. 88. 89. 90. 91. 92. 93. 94. 95. 96. 97. 98. 99. 100. 101.
102. 103.
104. 105. 106. 107. 108. 109. 110. 111. 112. 113. 114. 115. 116. 117. 118. 119. 120. 121.
122.
123. 124. 125.
126. 127. 128. 129. 130. 131. 132.
133. 134. 135. 136. 137. 138. 139. 140. 141. 142. 143. 144.
145. 146. 147. 148. 149. 150. 151. 152. 153. 154. 155. 156.
I.V. Fomenkov et al. 2002. Characterization of a Dense Plasma Focus Device as a Light Source for EUV Lithography, 1st International Extreme Ultra-Violet Lithography (EUVL) Symposium, Dallas, Texas, October 15–17. I.V. Fomenkov et al. 2003. Proc. SPIE, 5037: 807. I.V. Fomenkov. 2003. Performance of a Dense Plasma Focus Light Source for EUV Lithography, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29. I.V. Fomenkov. 2003. Performance of a Dense Plasma Focus Light Source for EUV Lithography, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02. M.W. McGeoch and C.T. Pike. 2003. Proc. SPIE, 5037: 141. M.W. McGeoch and C. Pike. 2002. High Power EUV Source, 1st International Extreme UltraViolet Lithography (EUVL) Symposium, Dallas, Texas, October 15–17. M.W. McGeoch. 2003. Star Pinch Update, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29. A. Endo et al. 2003. EUV Light Source Development at EUVA, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29. A. Hassanein et al. 2003. Proc. SPIE, 5037: 358. L.A. Shmaenok et al. 1998. Proc. SPIE, 3331: 90. D.N. Ruzic et al. 2003. Secondary Plasma-Based Debris Mitigation for Next Generation EUVL Sources, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02. H. Puell. 1970. Z. Naturforsch. Teil. A, 25: 1807. G. Schriever, K. Bergmann, and R. Lebert. 1998. J. Appl. Phys., 83: 4566. G. O’Sullivan. 1982. J. Phys. B, 15: L765. J. Blackburn et al. 1983. J. Opt. Soc. Am., 73: 1325. G. O’Sullivan. 2003. EUV Emission from Xe and Sn Plasmas, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29. F. Jin et al. 1995. Proc. SPIE, 2523: 81. R.C. Spitzer et al. 1996. J. Appl. Phys., 79: 2251. D. Colombant and G. Tonon. 1973. J. Appl. Phys., 44: 3524. M. Richardson et al. 1998. Opt. Comm., 145: 109. R. Constantinescu, J. Jonkers, and J. Vrakking.1999. OSA Proceedings on Applications of High Field and Short Wavelength Sources VIII, Potsdam, Germany, June 27–30, 1999, p. 111. H. Shields et al. 2002. Proc. SPIE, 4688: 94. H. Shields. 2002. Progress and Current Performance for Laser Produced Plasma EUV Power, 1st International Extreme Ultra-Violet Lithography (EUVL) Symposium, Dallas, Texas, October 15–17, 2002. W.P. Ballard et al. 2002. Proc. SPIE, 4688: 302. L. Rymell et al. 1999. Proc. SPIE, 3676: 421. B.A.M. Hansson et al. 2000. Microel. Engin., 53: 667. L. Malmqvist et al. 1996. Rev. Sci. Inst., 67: 4150. B.A.M. Hansson et al. 2001. Proc. SPIE, 4506: 1. B.A.M. Hansson et al. 2002. Proc. SPIE, 4688: 102. B.A.M. Hansson et al. 2002. Status of the Liquid Xenon-Jet Laser-Plasma Source, 1st International Extreme Ultra-Violet Lithography (EUVL) Symposium, Dallas, Texas, October 15–17, 2002. B.A.M. Hansson. 2003. Status of the Liquid-Xenon-Jet Laser-Plasma EUV Source, International Sematech EUV Source Workshop, Santa Clara, California, Februray 23, 2003. T. Abe et al. 2003. Proc. SPIE, 5037: 776. W.P. Ballard et al. 2003. Proc. SPIE, 5037: 47. S. Ellwi. 2003. Powerlase LPP EUV Source Update, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29, 2003. S. Ellwi. 2003. High Power Short Pulse and Cost Effective Laser Modules for Laser Produced Plasma (LPP) EUV Source, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003.
157.
A. Endo et al. 2003. Laser-Produced Plasma Light Source Development for EUV Lithography at EUVA, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. S. McNaught. 2003. Laser-Produced Plasma EUV Source Program at Cutting Edge Optronics, International Sematech EUV Source Workshop, Santa Clara, California, Februray 23, 2003. Bunze, V. 2003. EUV Source Technology: The challenge of High Throughput and Cost Effective EUV Lithography, SEMI Technical Symposium: Innovations in Semiconductor Manufacturing (STS:ISM), SEMICON West 2003. L. Rymell and H. Hertz. 1993. Opt. Commun., 103: 105. H. Hertz et al. 1995. Proc. SPIE, 2523: 88. L. Rymell, M. Berglund, and H. Hertz. 1995. Appl. Phys. Lett., 66: 2625. M. Berglund, L. Rymell, and H. Hertz. 1996. Appl. Phys. Lett., 69: 1683. M. Richardson et al. 2003. The Case for Tin as an EUV Source, 2nd International Extreme UltraViolet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. M. Richardson et al. 2003. The Case for Tin as an EUV Source, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29, 2003. F. Jin et al. 1993. Proc. SPIE, 2015: 151. R.C. Constantinescu et al. 2000. Proc. SPIE, 4146: 101. S. Dusterer et al. 2001. Appl. Phys. B, 73: 693. P.A. Grunow et al. 2003. Proc. SPIE, 5037: 418. L. Klebanoff. 2003. Condenser Erosion Observations in the ETS, 2nd International Extreme UltraViolet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. J.D. Gillaspy et al. 2003. Study of EUV Source Collector Damage Mechanism, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30– October 02, 2003. K. Takenoshita, C.-S. Koay, and M. Richardson. 2003. Proc. SPIE, 5037: 792. J.B. Murphy et al. 1993. Appl. Opt., 32: 6920. D.C. Ockwell, N.C.E. Crosland, and V.C. Kempson. 1999. J. Vac. Sci. Technol. B, 17:6, 3043. K. Mann et al. 2003. Proc. SPIE, 5037: 656. M.C. Schu¨rmann et al. 2003. Proc. SPIE, 5037: 378. M.C. Schu¨rmann et al. 2003. EUV Source Metrology, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29, 2003. R. Stuik et al. 2002. Nucl. Instr. Meth. A, 492:1–2, 305. R. Stuik. 2002. Characterization of XUV Sources, Eindhoven: Proefschrift, Technische Universiteit Eindhoven, pp. 0–1. F. Bijkerkm et al. 2003. FC2 Project Status and Metrology Survey, International Sematech EUV Source Workshop, Santa Clara, California, Februray 23, 2003. V. Bakshi. 2003. EUV Source Projects in Pre-Competitive Arena, International Sematech EUV Source Workshop, Antwerp, Belgium, September 29, 2003. I.C.E. Turcu et al. 2003. Overview of High Efficiency EUV Source Generated by Laser-Produced Plasma, International Sematech EUV Source Workshop, Santa Clara, California, Februray 23, 2003. T. Aota et al. 2003. Proc. SPIE, 5037: 147. C.-S. Koay et al. 2003. Proc. SPIE, 5037: 801. T. Tomie et al. 2003. Progress of Tin Plasma Technologies at AIST, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. S.D. Hector. 2002. Proc. SPIE, 5130: 990. C.W. Gwyn and P.J. Silverman. 2003. Proc. SPIE, 5130: 990. SEMI P37-1102, Specification for Extreme Ultraviolet Lithography Mask Substrates, Semiconductor Equipment and Materials International, San Jose, CA, 2002. 
SEMI P38-1103, Specification for Absorbing Film Stacks and Multilayers on Extreme Ultraviolet Lithography Mask Blanks, Semiconductor Equipment and Materials International, San Jose, CA, 2003.
158. 159.
160. 161. 162. 163. 164. 165. 166. 167. 168. 169. 170. 171.
172. 173. 174. 175. 176. 177. 178. 179. 180. 181. 182.
183. 184. 185.
186. 187. 188. 189.
190. 191. 192. 193. 194. 195. 196. 197. 198.
199. 200. 201. 202.
203.
204. 205. 206. 207. 208. 209. 210. 211.
212. 213. 214. 215. 216. 217. 218. 219. 220. 221. 222.
223. 224. 225.
226.
The program is provided by Eric Gullikson on the Lawrence Berkeley National Laboratory Center for X-Ray Optics (CXRO) website: http://www.cxro.lbl.gov/optical_constants/ P.B. Mirkarimi and D.G. Stearns. 2000. Appl. Phys. Lett., 77: 2243. P.B. Mirkarimi et al. 2001. IEEE J. Quantum Electron., 37: 1514. S.P. Hau-Riege et al. 2003. Proc. SPIE, 5037: 331. E.M. Gullikson et al. 2002. J. Vac. Sci. Technol. B, 20:1, 81. P.P. Naulleau et al. 2003. J. Vac. Sci. Technol. B, 21:4, 1286. A. Barty et al. 2002. Proc. SPIE, 4688: 385. P.B. Mirkarimi et al. 2002. J. Appl. Phys., 91:1, 81. S.P. Hau-Riege et al. 2003. Progress in Multilayer Defect Repair, Virtual National Laboratory Extreme Ultraviolet Lithography Quarterly Status Meeting, Berkeley, California, September 12, 2003. S.P. Hau-Riege et al. 2003. Proc. SPIE, 5037: 331. T. Liang and A. Stivers. 2002. Proc. SPIE, 4688: 375. P.Y. Yan. Masks for Extreme Ultraviolet Lithography—Chapter 11 in the Handbook of Mask Making, Marcel Dekker, to be published. C. Krautschick et al. 2003. Implementing Flare Compensation for EUV Masks Through Localized Mask CD Resizing, SPIE 28th Annual International Symposium and Education Program on Microlithography, Februray 23–28, 2003, paper No. 5037-07. S.H. Lee et al. 2003. Lithographic Flare Measurements of EUV Full-Field Optics, SPIE 28th Annual International Symposium and Education Program on Microlithography, Februray 23–28, paper No. 5037-13. J. Cobb et al. 2003. Flare Compensation in EUV Lithography, 2nd International Extreme UltraViolet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. W. Trybula and P. Seidel. 2002. Analysis of EUV Technology Cost of Ownership, First International Symposium on Extreme Ultraviolet Lithography, Dallas, Texas, October 14–17, 2002. C. Pike et al. 2000. Proc. SPIE, 3997: 328. P. Dentinger et al. 2000. Proc. SPIE, 3997: 588. T. Watanabe et al. 2000. Proc. SPIE, 3997: 600. N.N. Matsuzawa et al. 2001. Proc. SPIE, 4343: 278. J. Cobb et al. 2002. Proc. SPIE, 4688: 412. W.-D. Domke et al. 2003. Resist Characterization for EUV Lithography, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. G.D. Kubiak et al. 1991. Proc. OSA 124. K. Early et al. 1993. Appl. Opt., 32:34, 7044. G.F. Cardinale et al. 1999. J. Vac. Sci. Tech. B. J.E.M. Goldsmith et al. 1999. Proc. SPIE, 3676: 264. S.H. Lee et al. 2002. Proc. SPIE, 4688: 266. P.P. Naulleau et al. 2002. Proc. SPIE, 4688: 64. P.P. Naulleau. 2002. Internal VNL/EUVLLC Memorandum. P.P. Naulleau and G.M. Gallatin. 2003. Appl. Opt., 42: 3390. J. Cobb, F. Houle, and G. Gallatin. 2003. Proc. SPIE, 5037: 397. S.H. Lee, R. Bristol, and J. Bjorkholm. 2003. Proc. SPIE, 5037: 890. S. Hirscher et al. 2003. Advances in EUV Lithography Development for Sub-50 nm DRAM Nodes, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. C.A. Cutler et al. 2003. Proc. SPIE, 5037: 406. S.A. Robertson et al. 2003. Proc. SPIE, 5037: 900. R. Brainard et al. 2003. Effect of Polymer Molecular Weight and Resist Sensitivity on LER and AFM Morphology of EUV Resists, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. T. Watanabe et al. 2003. Resist Characteristics in EUVL—Mitigation of Hydrocarbon Outgassing Species and LER in CA Resist, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003.
227. 228. 229. 230. 231.
D.A. Tichenor et al. 2002. Proc. SPIE, 4688: 72. D.J. O’Connell et al. 2003. Proc. SPIE, 5037: 83. S.H. Lee et al. 2003. Proc. SPIE, 5037: 103. P.P. Naulleau et al. 2002. J. Vac. Sci. Technol. B, 20: 2829. S.H. Lee et al. 2003. Engineering Test Stand (ETS) Updates: Lithographic and Tool Learning, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. C.W. Gwyn. 2003. EUV Lithography in Perspective, 2nd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Antwerp, Belgium, September 30–October 02, 2003. Proceedings of the Third International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki/Japan, November 2–4, 2004. Proceedings of the SEMATECH EUV Mask Technology & Standards Workshop, Miyazaki/Japan, November 1, 2004. Proceedings of the SEMATECH EUV Resist Workshop, Miyazaki/Japan, November 1, 2004. Proceedings of the SEMATECH EUV Source Modeling Workshop, Miyazaki/Japan, November 4, 2004. Proceedings of the SEMATECH EUV Optics & Contamination Lifetime Workshop, Miyazaki/Japan, November 4, 2004. Proceedings of the SEMATECH EUV Source Condenser Erosion Workshop, Miyazaki/Japan, November 4, 2004. Proceedings of the SEMATECH EUV Source Workshop, Miyazaki/Japan, November 5, 2004. G. Edwards et al. 2004. Progress Toward Projection Optics Contamination and Condenser Erosion Test Protocols, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. S. Bajt et al. 2004. Screening of Oxidation Resistant Capping Layers for EUV Multilayers, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. S. Hill et al. 2004. EUV and E-Beam Exposures of Ruthenium-Capped Multilayer Mirrors, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. T.E. Madey et al. 2004. Surface Phenomena Related to Degradation of EUV Mirrors, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. J. Pankert et al. 2004. Update on Philips’ EUV Light Source, 3rd International Extreme UltraViolet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. U. Stamm et al. 2004. EUV Source Development Status at Xtreme Technologies—an Update, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. J. Ringling et al. 2004. Development Status of High Power Tin Gas Discharge Plasma Sources for Next Generation EUV Lithography, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. I.V. Fomenkov et al. 2004. Progress in Development of a High Power Source for EUV Lithography Based on DPF and LPP, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. M. Richardson et al. 2004. The Tin Doped Laser-Plasma EUV Source, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. C.S. Koay et al. 2004. Precision 13 nm Metrology of the Microscopic Tin Doped Laser-Plasma Source, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. G. O’Sullivan et al. 2004. Recent Results on EUV Emission from Laser Produced Plasmas with Slab Targets Containing Tin, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. S. Grantham et al. 2004. 
Characterization of EUV Detectors and Tools at NIST, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004.
232. 233. 234. 235. 236. 237. 238. 239. 240.
241.
242.
243.
244. 245.
246.
247.
248. 249.
250.
251.
252.
253.
254.
255.
256.
257.
258.
259.
260. 261.
262.
263.
264. 265.
266.
267.
268. 269.
270.
R.J. Anderson et al. 2004. The Erosion of Materials Exposed to a Laser-Produced Plasma Extreme Ultraviolet Illumination Source, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. M.A. Jaworski et al. 2004. Gaseous Tin EUV Light Source Debris Mitigation and Characterization, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. E.L. Antonsen et al. 2004. Debris Characterization from a Z-pinch Extreme Ultra-Violet Light Source, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. M.A. Jaworski et al. 2004. Secondary RF Plasma System for Mitigation of EUV Source Debris and Advanced Fuels, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. D.N. Ruzic et al. 2004. Time Exposure and Surface Analysis of EUV Light and Debris Exposed Condenser Optics, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. H. Furukawa et al. 2004. Estimations on Generation of High Energy Particles from EUV LPP Light Sources, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. I. Nishiyama et al. 2004. Modeling for Charge Transfer of Highly Ionized Xe Ions Produced by Laser Produced Plasma, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. H. Komori et al. 2004. Magnetic Field Ion Mitigation for Collector Mirrors, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. J. Rocca et al. 2004. New NSF Center for EUV Science and Technology, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. P. Seidel. 2004. Commercial EUV Mask Blank Readiness for 45 nm Half-Pitch (hp) 2009 Manufacturing, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. P. Kearney et al. 2004. Defect Inspection on Extreme Ultraviolet Lithography Mask Blanks at the Mask Blank Development Center, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. R. Randive et al. 2004. Investigating the Composition and Potential Sources of Particles in a LowDefect Mo/Si Deposition Process for Mask Blanks at ISMT-N, 3rd International Extreme UltraViolet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. N. Harned et al. 2004. Progress on ASML’s Alpha Demo Tool, 3rd International Extreme UltraViolet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. K. Orevk et al. 2004. An Integrated Demonstration of EUVL Reticle Particle Defect Control, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. T. Terasawa et al. 2004. Actinic EUV Mask Blank Inspection with Dark-Field Imaging Using LPP EUV Source and Two-Dimensional CCD Camera, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. A. Barty et al. 2004. EUV Mask Inspection and at-wavelength Imaging Using 13.5 nm Light, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. A.R. Stivers et al. 2004. EUV Mask Pilot Line at Intel Corporation, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. C. Holfeld et al. 
2004. Challenges for the Integrated Manufacturing of EUV Masks, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. E.M. Gullikson, E. Tejnil, and A.R. Stivers. 2004. Modeling of Defect Inspection Sensitivity of a Confocal Microscope, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004.
271.
E.Y. Shu. 2004. Applying Advanced Surface Analysis Techniques to Small Defect Characterization on EUV ML Blanks, 24th Annual BACUS Symposium, 5567-169, Monterey, California, September 13–17, 2004. A.R. Stivers et al. 2004. EUV Mask Pilot Line at Intel, 24th Annual BACUS Symposium, 5567-03, Monterey, California, September 13–17, 2004. T. Liang et al. 2004. E-Beam Mask Repair, 24th Annual BACUS Symposium, 5567-49, Monterey, California, September 13–17, 2004. A. Tchikoulaeva, C. Holfeld, and J.H. Peters. 2004. Repair of EUV Masks Using a Nanomachining Tool, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. P. Yan et al. 2004. EUVL Mask Patterning with Blanks from Commercial Supplier, 24th Annual BACUS Symposium, 5567-83, Monterey, California, September 13–17, 2004. R. Gronheid et al. 2004. Resist Evaluation Using EUV Interference Lithography, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. R.L. Brainard et al. 2004. Performance of EUV Photoresists on the ALS Micro Exposure Tool, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. P.P. Naulleau et al. 2004. High Resolution EUV Microexposures at the ALS, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. D. Yang et al. 2004. Molecular Glass Photoresists for EUV Lithography, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. A.C. Rudack et al. 2004. Facility Considerations for International SEMATECH’s EUV Resist Test Center, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. H. Oizumi et al. 2004. Lithographic Performance of High-Numerical Aperture (NAZ0.3) SmallField Exposure Tool (HINA) for EUV Lithography, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. W. Yueh et al. 2004. EUV Resist Patterning Performance from the Intel Microexpsure Tool (MET), 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. T. Asami. 2004. EUV Exposure System Development Plan in Nikon, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. Y. Gomei. 2004. EUVL Development Activity at Cannon, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. P. Kuerz et al. 2004. The EUVAlpha Demo Tool Program at Carl Zeiss SMT AG, 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004. P. Seidel. 2004. Initial Cost of Ownership Analysis for the 45 nm Half-Pitch (hp)/2009 Applications (EUV and 193i for 2009, 2010, 2011 Manufacturing), 3rd International Extreme Ultra-Violet Lithography (EUVL) Symposium, Miyazaki, Japan, November 2–4, 2004.
272. 273. 274.
275. 276.
277.
278.
279. 280.
281.
282.
283. 284. 285.
286.
9
Imprint Lithography
Douglas J. Resnick
CONTENTS
9.1 Introduction .......................................................... 465
9.2 The Rising Cost of Lithography ........................................ 467
9.3 Soft Lithography ...................................................... 469
    9.3.1 Process Background .............................................. 469
    9.3.2 Bioapplications ................................................. 470
9.4 Nanoimprint Lithography ............................................... 471
    9.4.1 Thermal Imprint Lithography ..................................... 471
    9.4.2 Alternative Thermal Imprint Processes ........................... 475
    9.4.3 Thermal Imprint Tools ........................................... 477
9.5 Step-and-Flash Imprint Lithography .................................... 478
    9.5.1 The S-FIL Tool .................................................. 478
    9.5.2 The S-FIL Template .............................................. 481
    9.5.3 The S-FIL Resist ................................................ 484
9.6 Imprint Lithography Issues ............................................ 486
    9.6.1 Defects ......................................................... 486
    9.6.2 Image Placement and Overlay ..................................... 487
9.7 Template Infrastructure ............................................... 489
    9.7.1 Template Writing ................................................ 489
    9.7.2 Template Inspection ............................................. 493
    9.7.3 Template Repair ................................................. 494
9.8 Conclusions ........................................................... 497
Acknowledgments ........................................................... 497
References ................................................................ 498
9.1 Introduction

Relative to the other lithographic techniques discussed in this book, imprint lithography is a very old and established technology. While Johannes Gutenberg is generally credited with the invention of modern printing, imprinting was already being practiced for centuries in China. There is evidence of carved characters in stone and ceramic as early as 500 BC. With the invention of paper in approximately 200 BC, it became possible to reproduce works on large writing surfaces. Important work was etched into stone slab and
transferred onto materials such as hemp, silk rags, or bark. In the seventh century, stone gave way to woodblock, a technique that was also used in Europe during the Middle Ages. To form a printed image, a sheet of paper was placed over the block, and the image was transferred either by rubbing or by inking the block and abrading the paper. Wood block remained the primary means of imprinting well into the nineteenth century in China. However, in 1040, Bi Sheng began to experiment with movable type made from either clay or ceramics. The individual type was arranged on an iron plate, and held in place with wax and resin. By melting the wax, the type could be removed and then rearranged as necessary. Eventually, clay gave way to wood, and later to more durable materials such as copper and brass. The technology never became popular in China because of the thousands of characters that were needed to convey information in the Chinese language. Johannes Gutenberg began his work in the town of Mainz in 1436. His invention was a combination of the formation of easily cast characters combined with a screw-type press, which was a takeoff of machines that were used to produce wine in the Rhine valley. By splitting individual components of language, such as letters, numbers, and punctuation, into individual units, it was now possible to quickly form different sentences and paragraphs. The press was completed in 1440, and in 1455 printed versions of a 42-line Bible (42 lines per page) appeared. Two hundred copies of this Bible were printed, and 48 copies are known to exist today. A picture of the press, along with a page of the Bible, is shown in Figure 9.1. Not unlike startups today, Gutenberg’s work required venture capital, and a moneylender named Johannes Fust realized the potential of Gutenberg’s work. In 1449, Fust loaned Gutenberg 800 florins, which was used for the preparation of the imprinting tool. Prior to the release of the Bible, several books and treatises were published, and the impact of the technology began to blossom. By the time of the printing of the Bible,
FIGURE 9.1 (a) The Gutenberg press. (b) A page from the 42-line Bible.
however, the relationship between Gutenberg and Fust became strained, and Fust took Gutenberg to court, claiming, among other things, embezzlement of funds. The archbishop’s court of worldly justice ruled primarily in favor of Fust, and in the same year that the 42-line Bible was printed, Gutenberg lost his printing workshop and was effectively bankrupt. Gutenberg continued his work, along with other printers, up until 1462, when the Archbishop of Nassau attacked the city of Mainz. Once the city was under control, the surviving citizens, including Gutenberg, were forced to leave. When printers and compositors settled in new towns, the knowledge of the printing process was spread beyond the immediate vicinity of Mainz. By 1477, William Caxton published the first book in England. By the end of the century, printing technology was established in over 250 cities across Europe. Roman type was introduced in 1572, and Oxford University started a printing operation in 1587. In 1593, Shakespeare’s “Venus and Adonis” appeared in print and began a new era in literature. It should also be noted that although movable type is generally thought of as a European invention, a common use of movable type began in Korea around the same time as Gutenberg’s innovative work. An alphabetical script known as “Han’gul,” originally consisting of 28 characters, was officially presented in 1444.
9.2 The Rising Cost of Lithography

The topic of nanotechnology implies different things to different people. High-density semiconductor circuits are now being fabricated with critical dimensions (CDs) less than 100 nm. Recent experiments demonstrate the feasibility of CMOS circuitry for gate lengths smaller than 10 nm. Reduced CDs are accompanied by the development of novel materials, including high-dielectric materials, gate metals, and porous-film stacks. The unique physical and chemical phenomena at the nanoscale can lead to other types of novel devices that have significant practical value. Emerging nanoresolution applications include subwavelength optical components, biochemical analysis devices, high-speed compound semiconductor chips, distributed feedback lasers, photonic crystals, and high-density patterned magnetic media for storage. To take advantage of these opportunities, it is necessary to be able to cost-effectively image features well beyond 100 nm. In the fields of micro and nanolithography, major advancements in resolution have historically been achieved through use of shorter wavelengths of light. Using phase-shift mask technology, it has already been demonstrated that 193-nm photolithography can produce sub-100-nm features. Along this path, such improvements come with an ever-increasing cost for photolithographic tools. It is interesting to note that the cost of photo tools has kept pace with data density, as depicted in Figure 9.2 [1]. As conventional projection lithography reaches its limits, next-generation lithography (NGL) tools may provide a means for further pattern shrinks, but they are expected to have price tags that are prohibitive for many companies. The development of both light sources and the optics to support those sources is primarily responsible for the rise in the cost of an NGL tool. Lithography at 157 nm, for example, requires the use of CaF2 as a lens material. In the case of extreme-ultraviolet lithography (EUVL), no source with sufficient output has yet been identified that will meet the industry's throughput requirements. Several models exist that can estimate the cost of ownership (or cost per wafer level). Two key ingredients in the model are tool cost and mask cost. Figure 9.3 is a three-dimensional rendering of wafer-level cost as a function of both tool and mask cost [2]. Clearly, a technology that can reduce the tool cost by an
Microlithography: Science and Technology
468
1011
$100,000,000 EUV (?) 157nm 193nm
109
$10,000,000
108 Imprint
107 106
$1,000,000
Tool price
# Transistors per chip
1010
105 104
Memory density
103 1975
1980
1985
1990 1995 Date
2000
2005
$100,000 2010
FIGURE 9.2 Transistor density and tool cost as a function of time.
order of magnitude will have a significant effect on the economics of the fabrication process. Imprint lithography is essentially a micromolding process in which the topography of a template defines the patterns created on a substrate. Investigations by several researchers in the sub-50-nm regime indicate that imprint lithography resolution is limited only by the resolution of the template fabrication process. It possesses important advantages over photolithography and other NGL techniques because it does not require expensive projection optics, advanced illumination sources, or specialized resist materials that are central to photolithography and NGL technologies. There are three basic approaches to imprint lithography: soft lithography (SL), nanoimprint lithography (NIL), and step-and-flash imprint lithography (S-FIL). Each technique is described in detail in the next three sections. Following this is a discussion devoted to the key issues that must be addressed if imprint lithography is to play a key role in the advancement of high-density integrated circuits (ICs).
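A minimal version of the wafer-level cost-of-ownership relation behind Figure 9.3 can be written down directly: the per-wafer-level cost is the hourly cost of owning and running the exposure tool, plus the amortized mask cost, divided by the effective throughput. The sketch below implements that relation; it is only a simplified illustration of the kind of model referenced above, and the depreciation period, utilization, operating cost, tool prices, and mask amortization counts are all assumed values, not figures from the cited cost models.

```python
# Hedged cost-of-ownership sketch: cost per wafer level versus tool cost, mask cost,
# and throughput, in the spirit of Figure 9.3. All inputs are illustrative assumptions.

def cost_per_wafer_level(tool_cost: float,            # purchase price, $
                         throughput_wph: float,        # wafers per hour
                         mask_cost: float,             # $ per mask for this level
                         wafers_per_mask: float,       # wafer levels exposed per mask
                         depreciation_years: float = 5.0,
                         utilization: float = 0.7,     # fraction of hours producing wafers
                         operating_cost_per_year: float = 1.0e6) -> float:
    hours = depreciation_years * 365 * 24 * utilization
    tool_per_wafer = (tool_cost + depreciation_years * operating_cost_per_year) / (hours * throughput_wph)
    mask_per_wafer = mask_cost / wafers_per_mask
    return tool_per_wafer + mask_per_wafer

# Compare a hypothetical $40M NGL exposure tool with a hypothetical $4M imprint tool.
for name, tool, wph, mask in [("NGL scanner (assumed)", 40e6, 80, 100e3),
                              ("imprint tool (assumed)", 4e6, 20, 30e3)]:
    c = cost_per_wafer_level(tool, wph, mask, wafers_per_mask=20_000)
    print(f"{name}: ${c:.2f} per wafer level")
```

Even this crude form shows the trade-off captured in Figure 9.3: a large reduction in tool cost can be partially offset by lower throughput, so both axes matter when comparing imprint against projection-based NGL options.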
FIGURE 9.3 Cost-of-ownership model as a function of tool cost and throughput (CoO in $ per wafer level, shown for mask costs M0 = $30K and M0 = $15K).
9.3 Soft Lithography

9.3.1 Process Background

Soft lithography (SL), also known as microcontact printing (μCP), generally refers to the process of transferring a self-assembled monolayer (SAM) using a flexible template (Figure 9.4). The invention of the technology dates back to 1994 and is the result of work from the laboratory of George Whitesides at Harvard [3,4]. The technology surfaced mainly as a quick and easy way for students to print small geometries in a laboratory environment. Whitesides et al. have formed a template by applying a liquid precursor to polydimethylsiloxane (PDMS) over a master mask produced using either electron-beam or optical lithography. More details about the formation of the master are covered in Section 9.4. The liquid is cured, and the PDMS solid is peeled away from the original mask. PDMS is essentially an elastomeric material, consisting of a polymer chain of silicon-containing oils. Typical mechanical properties include a tensile strength of 7.1 MPa, an elongation at break of 140%, and a tear strength of 2.6 kN/m. As a result, relative to either silicon or fused silica, it is quite pliant. Once prepared, the PDMS template can then be coated with a thiol ink solution, such as an alkanethiol. The imprint process is depicted in Figure 9.4. The thiol molecules are subsequently transferred to a substrate coated with a thin layer of gold, thereby forming a SAM on the gold surface. The nature of the gold–sulfur bond is still not completely understood. Kane et al. postulate that the species present at the surface of the gold is a gold thiolate [5]:

R–SH + Au(0)n → RS–Au(I)Au(0)n−1 + 1/2 H2↑    (9.1)
To prevent adhesion between the master and daughter masks, the master surface is passivated by the gas-phase deposition of a long-chain, fluorinated alkylchlorosilane (CF3(CF2)6(CH2)2SiCl3). The fluorosilane reacts with the free silanol groups on the
FIGURE 9.4 Fabrication sequence for microcontact (or soft lithography) printing: (1) PDMS template with thiol; (2) imprint stamp; (3) transfer molecules; (4) pattern transfer.
surface of the master to form a Teflon-like surface with a low interfacial free energy. The passivated surface acts as a release layer that facilitates the removal of the PDMS stamp from the master. The pattern transfer process starts with a wet etch of the thin gold film. A wet etch is typically used because gold is not readily reactive-ion etched. Although it is possible to sputter-etch gold, the thin thiol layer would not hold up to such a process. The gold film then acts as an etch mask for any underlying materials. Because gold is a soft metal, it is often necessary to include a second hard mask beneath the gold. The range of feature sizes that can be imprinted with this technology is broad. Although squares and lines with geometries of several microns are easily achieved, smaller circular features with dimensions as small as 30 nm have also been demonstrated (Figure 9.5). Although the technology has been used to make working field-effect transistors, complex electronic devices are not likely to be the strength of this particular printing technology. Although the SAMs are easily transferred and tend to self-heal during deposition, the thickness of the molecule (~1 nm) makes it difficult for routine pattern transfer of the thicker films typically required for semiconductor devices. Thinner films are also subject to defects, and yield is critical in the semiconductor field. In addition, the elastic nature of the PDMS template makes it impractical for the very precise layer-to-layer overlay tolerances required in the semiconductor industry. Interestingly, however, the same feature that renders it difficult to align one level to another also allows printing on surfaces that are not planar. This attribute is relatively unique to microcontact printing.

9.3.2 Bioapplications

For templates with dimensions as small as a few hundred nanometers, the master can be made at relatively low cost. PDMS material is very inexpensive, and template replication can cost as little as $0.25 per square inch. As a result, the technology may be very attractive for biological applications. It should also be noted that PDMS is biocompatible, permeable to many gases, and can be utilized for cell cultures [6]. Conversely, many photosensitive materials used in the semiconductor industry, in addition to the photo process itself, are not biocompatible. A large number of biological experiments have been made possible through the use of SL. As an example, patterned SAMs can be used to control the adsorption of protein on certain surfaces. Using µCP, Lopez et al. patterned gold surfaces into regions terminated in
FIGURE 9.5 (a) A 30-nm ring created with µCP. (b) A working field-effect transistor.
FIGURE 9.6 Laminar flow patterns in microfluidic channels created using soft lithography.
oligo(ethylene glycol) and methyl groups [7]. Immersion of the patterned SAMs in proteins resulted in adsorption of the proteins on the methyl-terminated regions. It is also possible to use SL as a means for patterning cells on substrates [8]. Furthermore, the shape of a cell can also be affected by the patterns created with SL. Singhvi et al. used this ability to explore the effect of cell shape on cell function [9]. It was noted that cells attached preferentially to treated islands and, in many cases, conformed to the shape of the island. In addition, the size and shape of the island played a role in the size and shape of the cells. There has been much recent attention in the biological field to microfluidic testing, and several groups have carried out groundbreaking work in this field. In the past, channels have been fabricated in fused silica using photolithographic processing and wet chemical etching of the glass. The challenge has always been finding a cost-effective method for producing channels and developing a robust process for sealing the channels. Kim et al. developed a three-dimensional micromolding process by bringing a PDMS mold in contact with a conformal substrate, thereby forming microfluidic channels [10]. The technique is extremely useful for directing specific types of cells through different channels. A limitation for micron-sized channels is the amount of turbulent flow encountered in the device. Flow is characterized by the Reynolds number; a lower number is characteristic of a more laminar flow. By operating under laminar flow conditions, the opportunities for intermixing are reduced. Whitesides et al. introduced the concept of a network of capillaries to better direct cells in micron-sized microfluidic channels [11]. A simple network is shown in Figure 9.6. The technique is very useful for cell capture and subsequent delivery of chemicals for culturing.
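To make the laminar-flow argument concrete, the Reynolds number Re = ρvD/μ can be evaluated for a representative microchannel. The sketch below uses illustrative channel dimensions and flow speed (not values from the work cited above) and the standard properties of water.

```python
# Reynolds number for pressure-driven flow of water in a rectangular microchannel.
# Re = rho * v * D_h / mu, where D_h is the hydraulic diameter.
# Channel dimensions and velocity below are illustrative assumptions.

rho = 1000.0      # density of water, kg/m^3
mu = 1.0e-3       # dynamic viscosity of water, Pa*s

def reynolds(velocity_m_s, width_m, height_m):
    d_h = 2 * width_m * height_m / (width_m + height_m)   # hydraulic diameter
    return rho * velocity_m_s * d_h / mu

# A 100 um x 50 um channel at 1 mm/s:
re = reynolds(1e-3, 100e-6, 50e-6)
print(f"Re = {re:.3f}")   # ~0.07 -- far below the ~2000 transition, so flow is laminar
```

At micron scale the Reynolds number is orders of magnitude below the turbulent transition, which is why adjacent streams in these networks mix only by diffusion.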
9.4 Nanoimprint Lithography

9.4.1 Thermal Imprint Lithography

Thermal imprint lithography, also referred to as NIL, was first introduced by Stephen Chou at the University of Minnesota in 1996 [12]. Chou introduced a mechanical printing process that did not require any type of energetic beam. A schematic of the imprint process is depicted in Figure 9.7. A resist, such as PMMA, is spin-coated and baked onto a substrate, such as silicon. The substrate and resist are then heated above the glass transition temperature of the resist. For PMMA, a typical imprint bake temperature is between 140°C and 180°C. The template is then pressed into the resist until the resist flows into the features of the mold.
FIGURE 9.7 Schematic flow for a thermal imprint process: a silicon template is placed on a dual-layer (thermoplastic over thermoset) substrate, pressure is applied at elevated temperature, the template is separated from the substrate, and the pattern is transferred by an oxygen etch.
Typical imprinting pressures for the initial experiments ranged from 600 to 1900 psi (40–129 atm). Pressures of less than 30 atm are more typical of today's processes. The template and substrate are then cooled and separated, leaving a reverse-tone image that has the potential to precisely replicate the features on the mold. It is possible to use a variety of different resist materials, and several materials are now manufactured specifically for use with imprint lithography. PMMA was initially chosen for its low thermal coefficient of expansion and small pressure-shrinkage coefficient. NIL is usually carried out in a vacuum environment to avoid trapping air in the template during the imprint process. "Sticking" issues are mitigated either by adding release agents to the resist or by treating the surface of the template in a fashion similar to that described in the previous section. The template used in the imprinting process is also a silicon substrate. This is a necessity because NIL takes place at elevated temperatures, and care must be taken to avoid thermal mismatches during imprinting. There are two common ways to pattern the silicon. The first simply requires the use of a resist mask to etch the underlying silicon. For sub-100-nm feature definition, it is necessary to use a high-resolution electron-beam writing tool, such as a Leica VB6. More details on this tool are described in Section 9.5. Alternatively, scanning electron microscopes, equipped with customized e-beam writing packages, can provide excellent resolution. Typical high-resolution electron-beam resists include PMMA and ZEP 520. In the case where it is necessary to etch deep into the silicon, a hard mask may be employed in the pattern transfer process. Alternatively, a metal lift-off process may follow e-beam imaging. The remaining metal then serves as an etch mask for the underlying silicon. The silicon can be dry etched with either fluorine- or chlorine-containing gases. Although chlorine chemistries usually yield a more anisotropic etch, the chlorine aggressively attacks the imaging resist. As a result, some type of hard mask, as described above, is required. A second method for forming a silicon template involves the deposition of a thin oxide film on the silicon, followed by electron-beam imaging. The e-beam resist then serves as a mask for the oxide, and the silicon acts as an etch stop for the oxide. This is the most common method now used for defining thermal imprint templates. Very little data has been published on the specifics of this template fabrication process, but the oxide etch process itself is well known. Details on the etching characteristics of fused silica are also discussed in Section 9.5. Agreement between the template image and the final resist image can be remarkable. A comparison between 70-nm template and resist features is depicted in Figure 9.8. In Chou's first work on the subject, 60-nm lines were imaged in PMMA resist.
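To put these imprint pressures in perspective, the sketch below converts a nominal imprint pressure into the total press force required for full-wafer imprinting. The wafer diameters are illustrative, and the calculation is a simple pressure times area estimate rather than a description of any particular tool.

```python
import math

ATM_TO_PA = 101_325.0  # 1 atm in pascals

def press_force_kN(pressure_atm, wafer_diameter_mm):
    """Total force needed to apply a uniform pressure over a full wafer."""
    area_m2 = math.pi * (wafer_diameter_mm * 1e-3 / 2) ** 2
    return pressure_atm * ATM_TO_PA * area_m2 / 1e3   # kN

for d in (100, 150, 200):                  # wafer diameters, mm
    for p in (30, 100):                    # imprint pressures, atm
        print(f"{d} mm wafer at {p:>3} atm: {press_force_kN(p, d):7.1f} kN")
```

Tens to hundreds of kilonewtons over a full wafer make mechanical pressure uniformity a real engineering challenge, which is one motivation for the air-cushion approach discussed in Section 9.4.3.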
FIGURE 9.8 70-nm features in the mold (a), and in the imprinted resist (b).
Sixty-nanometer lines are clearly not the limit of resolution of the technology, however. Subsequent work has demonstrated that it is possible to print holes smaller than 10 nm (Figure 9.9) [13], again demonstrating that the resolution of the technology is limited only by what can be achieved in the template. The imprinted image is not useful until the pattern transfer process is completed. Any molding-type imprint process is typically accompanied by a residual layer after the imprint step, as shown in Figure 9.7. The first step in the pattern transfer process, therefore, involves a descum process to remove the residual layer. For PMMA, an oxygen descum works well; however, care must be taken to control this step so as not to introduce a lateral etch component, thereby changing the CD of the feature of interest. After this step is complete, either a subtractive or additive process can follow. PMMA has notoriously poor etch selectivity. As a result, most of the early work tended to focus on lift-off processing. Lines of 40 nm and dots of 25 nm are depicted in Figure 9.10 after imprint lithography, descum, and metal lift-off.
FIGURE 9.9 Sub-10-nm holes formed using NIL.
FIGURE 9.10 25-nm pillars and 40-nm lines after metal deposition and lift-off.
Because the thermal imprint process requires no optics, it is easily implemented in research labs and has been used to fabricate a variety of interesting and useful devices. Two examples are discussed below. Interested readers are referred to Refs. [14–17] for further examples of fabricated devices. Gratings are relatively easy to print using NIL and may be used for a variety of interesting applications such as extreme-ultraviolet and ultraviolet filters, transmission filters, visible and infrared polarizing optical elements, waveplates, and phase retarders. Yu et al. formed a 100-nm pitch grating in a silicon mold by first patterning oxide features on a 200-nm pitch [18]. By depositing silicon nitride, performing an etch-back, removing the oxide, and using the nitride as an etch mask into the silicon, he was able to form large-area 100-nm period gratings. The usefulness of the resultant gratings was tested by measuring the ellipsometric parameters of the beam at a wavelength of 632.8 nm at different angles of incidence, with the gratings oriented either parallel or perpendicular to the plane of incidence. Good agreement is obtained with effective medium theory simulations, as shown in Figure 9.11. Microring resonator devices have been the focus of recent interest because of their potential for applications in photonic circuits [19]. A microring resonator typically has the shape of a ring closely coupled to a waveguide, thereby offering capabilities such as narrow-bandwidth filtering and compactness.
FIGURE 9.11 Ellipsometric parameters Psi and Delta (deg) as a function of angle of incidence, measured at a wavelength of 632.8 nm with the gratings oriented either parallel or perpendicular to the plane of incidence. Good agreement is obtained with effective medium theory simulations, shown as solid lines.
FIGURE 9.12 (a) Microring resonator mold. (b) Output spectral response (1550–1580 nm) with a TM-polarized input.
Chao et al. studied two methods for fabricating polymer-based microring resonators [20]. In one method, NIL was used to imprint a PMMA waveguide on top of thermal oxide. The PMMA was heated to a temperature of 175°C, and the imprint was performed at a pressure of 75 kg/cm2. Figure 9.12a shows the oxide mold used for the imprint process. Figure 9.12b shows the output of a fabricated PS microring resonator.

9.4.2 Alternative Thermal Imprint Processes

All of the lithographic methods discussed to this point, including the more conventional forms of photolithography, involve the use of an imaging resist that is then utilized as a mask to define an image into an underlying material. Any type of pattern transfer process is problematic because it has the potential for adding defects and changing CDs. It would be very advantageous to be able to directly image the material of choice without the need of a resist. The development of functional materials is underway, and dielectric materials, such as hydrogen silsesquioxane, have been proven to be both electron-beam sensitive and photosensitive [21]. Imprint lithography combined with functional materials would be a very cost-effective way of patterning features. Chou et al. have developed a variant of thermal imprinting to directly pattern silicon. The technique is called laser-assisted direct imprint (LADI) [22]. A schematic of the LADI process is shown in Figure 9.13. A quartz mold is used to form the features in the silicon. To image the silicon, a 20-ns XeCl laser pulse (at a wavelength of 308 nm) is applied through the quartz template and onto the silicon surface. The energy imparted by the laser causes a thin layer on the surface of the silicon to melt. Because the viscosity of the silicon melt is extremely low, the quartz can then be pressed into the silicon, allowing the silicon melt to fill the mold. It was estimated that the silicon remains as a melt for approximately 220 ns and that the depth of the melt is roughly 280 nm. After the silicon cools, the quartz and silicon can be separated. As a demonstration, Chou patterned 300-nm grating structures, 110 nm deep, into silicon. As with conventional NIL, the patterns were faithfully replicated. SEM micrographs of both the mold and the replicated features are depicted in Figure 9.14. Although the exact force of the imprint process could not be measured, it is estimated that the pressure used for the small demonstration pieces was approximately 17 atm.
FIGURE 9.13 Schematic representation of the laser assisted direct imprint (LADI) method for patterning silicon.
It is very likely that the process, which was performed on 1.5×1.5 mm2 samples, could be scaled up to sizes comparable to those used in conventional NIL tools. It should be noted that LADI is not limited to patterning only silicon. Polysilicon patterning has also been demonstrated, and it may be feasible to extend the technology to very different materials, such as germanium, III–V compounds, and dielectrics.
FIGURE 9.14 SEM image of the template (a), and of the patterned silicon (b), using LADI.
Three-dimensional patterning should also be possible with the technique. Future development of this exciting technology will be interesting to monitor.

9.4.3 Thermal Imprint Tools

Unlike the development of other NGL tools, the lack of expensive optics and sources has made the construction of commercial imprint tools attractive for several companies. This is a significant departure from the development efforts associated with technologies such as EUVL and electron-projection lithography (EPL). For these technologies, if the development efforts are not successful in addressing the needs of the high-density silicon CMOS industry, it is extremely likely that the tools will not be commercialized. The early requirement for optical elements, photonic devices, filters, and other high-end products, along with the development of emerging markets, is likely to sustain the imprint tool business, regardless of whether it can address the needs of CMOS. Several vendors are already supplying thermal imprint systems to customers. As many as four different companies are now selling imprint tools in Europe: Obducat, EVGroup, Suss Microelectronics, and Jenoptik. Entry into the imprint business was an obvious extension for all four companies. Obducat was already in the CD stamper business, and has now developed thermal imprint tools capable of patterning wafers up to six inches in diameter. Their NIL 6-inch tool operates at temperatures and pressures as high as 350°C and 80 bar, respectively. An optional ability to imprint both sides of a wafer is also offered. Obducat is also in the business of building electron-beam systems, and can supply customers either with tools to build their own templates, or with services that provide templates. EVGroup and Suss Microelectronics, competitors in the fields of wafer bonding and proximity/contact aligners, have adapted their tools to address both thermal imprinting and UV imprinting. The EVG520HE is an offering from EVGroup, and is based on technology originally designed for MEMS wafer-bonding applications. This thermal imprint tool can operate at temperatures as high as 550°C and with forces up to 40 kN. Other tools adapted for imprint include the EV620 and the IQ Aligner. Suss offers comparable tools: the SB6E, a thermal imprint version, and the MA6 (based on aligner technology). The newest tool under development has been labeled the Nano Patterning Stepper 200, or NPS200. The tool is being developed with the help of government funding, and is being designed to operate as either a thermal or UV imprint tool. Imprint time is targeted at less than 1 min per die. Finally, the demonstrations done by Jenoptik are based on systems originally designed for hot embossing. In the United States, one startup company, Nanonex, is offering thermal imprint tools. Based on the work of Stephen Chou of Princeton University, Nanonex was founded in 1999 and began tool development in 2000. Tools currently offered are the NX-1000, a thermal imprint tool with no alignment capability, and the NX-2000, a system that can operate as either a thermal or UV imprinter. Also being planned is the NX-3000, which adds alignment capability. All three systems are designed as full-wafer imprinters and can handle substrates as large as 200 mm. One concern during the imprint process is pressure nonuniformity. Pressure nonuniformity can lead to serious errors, such as unfilled imprint areas and nonuniform residual layers.
This can easily occur in a system that relies on the alignment of two parallel plates to make contact between the template and wafer. Other issues that may occur during the process include surface roughness and bow in either the template or the wafer. To minimize these effects, the Nanonex tools use an air-cushion press (ACP) to apply pressure uniformly and to bring the template and wafer into conformal contact during imprinting [23]. An example of the process is shown in Figure 9.15, which depicts the results on pressure-sensitive paper from an imprint made between two parallel plates and an imprint using
FIGURE 9.15 (a) Pressure uniformity using a parallel press. (b) Pressure uniformity using an air cushion press.
the ACP method. The improvement in uniformity is substantial. It should be noted that Obducat and Hitachi (discussed below) use comparable technologies to achieve good pressure uniformity. Two companies from Asia are the latest to offer commercial imprint systems: NND from Korea and Hitachi from Japan. NND offers a full-wafer system called the Nanosis 610. The tool is designed for imprinting wafers up to 150 mm in diameter, although an optional upgrade to 200 mm is offered. There have been demonstrations of processes using the system in which almost no residual layer is evident. The reason for these results has not yet been fully explained, however. Hitachi just entered the market with a thermal imprint system that can do full-wafer imprinting on substrates as large as 300 mm. The tool is fully automated and includes cassette-to-cassette handling of wafers.
9.5 Step-and-Flash Imprint Lithography

Devices that require several lithography steps and precise overlay will need an imprinting process capable of addressing registration issues. A derivative of NIL, ultraviolet–nanoimprint lithography (or UV–NIL) addresses the issue of alignment by using a transparent template, thereby facilitating conventional overlay techniques (Figure 9.16). In addition, the imprint process is performed at low pressures and at room temperature, which minimizes magnification and distortion errors. Two types of approaches are being considered for UV–NIL. The first method uses conventional spin-on techniques to coat a wafer with a UV-curable resist [24]. Although it is possible to uniformly coat the wafer, there are concerns that the viscosity of the resist will be too high to facilitate the formation of very thin residual layers. If the residual layer is too thick, the CD uniformity may suffer as a result of the subsequent pattern transfer process. This problem is addressed by locally dispensing a low-viscosity resist in a single stepper field. This second approach was first disclosed by Willson et al. in 1999 and is generally referred to as step-and-flash imprint lithography, or S-FIL [25]. S-FIL appears to be the most suitable imprint technique for fulfilling the stringent requirements of silicon IC fabrication. Because a tool, a template, and a resist are necessary for the fabrication process, each of these subjects is discussed in detail.

9.5.1 The S-FIL Tool

Imprint lithography relies on the parallel orientation of the imprint template and substrate. Inaccurate orientation may yield a layer of cured etch barrier that is nonuniform across the imprint field. Thus, it is necessary to develop a mechanical system whereby
FIGURE 9.16 Schematic illustration of the step-and-flash imprint lithography (S-FIL) process (quartz template with release layer, monomer dispensed on a planarization layer over the substrate, blanket UV exposure, low-aspect-ratio relief with a residual layer, and final high-resolution, high-aspect-ratio features).
template and substrate are brought into co-parallelism during etch-barrier exposure. In contrast with the ACP process discussed in the previous section, this was originally achieved in S-FIL by way of a two-step orientation scheme. In step one, the template stage and wafer chuck are brought into coarse parallelism via micrometer actuation. The second step uses a passive flexure-based mechanism that takes over during the actual imprint [26,27]. The first step-and-repeat system was built at the University of Texas at Austin by modifying a 248-nm Ultratech stepper that was donated by IBM (Figure 9.17). Key system attributes include a microresolution z-stage that controls the imprint force, an automated x–y stage for step-and-repeat positioning, a precalibration stage that enables parallel alignment between the template and substrate, a fine-orientation flexure stage that provides a highly accurate, automatic parallel alignment of the template and wafer, an exposure source that is used to cure the etch barrier, and an automated fluid delivery system that accurately dispenses known amounts of the liquid etch barrier. A commercialized version of an S-FIL tool is now available from Molecular Imprints, Inc. (MII). MII is a U.S.-based venture startup company that received its initial funding in 2001, and sold its first tool, the Imprio 100, in 2003. The Imprio 100 is designed as a step-and-repeat patterning tool and can accommodate wafer sizes up to 200 mm in diameter [28]. The standard die size is 25×25 mm2, although both smaller and larger die sizes are possible. To minimize defect issues during the imprint process, the tool is equipped with a class-0.1 minienvironment. Although the Imprio 100 from MII is a substantial improvement relative to the first University of Texas tool, it has
FIGURE 9.17 The first step-and-repeat UV-based nanoimprint tool.
neither the throughput nor the overlay specifications (~250-nm 3σ) necessary for silicon IC fabrication. Instead, the system was primarily designed and manufactured to address the needs of the compound semiconductor, mechanical microstructure, advanced packaging, thin-film head, and photonics markets. These markets require high-resolution features but are typically less sensitive to defects. They also operate at low volumes of wafers and are therefore more sensitive to costs, particularly tool costs. The tool has a throughput capacity of approximately two 200-mm wafers per hour. As a result, it will be possible to collect enough statistical information on the performance characteristics of S-FIL to allow the design of a fully engineered high-volume-manufacturing tool in the future. The Imprio 100 was developed in partnership with several key OEM suppliers for the stage technology, the UV source, and the control architecture. The extremely complicated and costly imaging optics, source, and step-and-scan mechanical systems associated with other NGL techniques are not required in S-FIL technology. It is essentially a precise mechanical system with specialized fluid-mechanics subsystems and a mercury arc lamp as its source. Therefore, it is a much simpler system with a significantly smaller footprint, and its cost structure has the potential to be an order of magnitude lower than that of high-end lithography tools. Of particular interest is the resist delivery system, which incorporates a microsolenoid nozzle capable of dispensing drops less than 5 nL in volume. This type of control is essential for controlling the residual layer formed during the imprint process. When integrated with a well-designed flexure stage and wafer chuck, it is possible to print an etch barrier with residual layers well under 100 nm. Figure 9.18 depicts the data for residual-layer uniformity in a single die. In this case, a mean thickness of less than 60 nm was achieved, with a 10-nm 3σ variation.
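A simple volume balance shows why nanoliter-scale dispensing is needed: liquid that is not displaced into template features ends up in the residual layer. The sketch below assumes, for illustration, a 25 × 25 mm2 field and neglects the volume consumed by pattern fill, so the numbers are rough estimates rather than values from the tool.

```python
# Rough volume balance for the dispensed etch barrier in one imprint field:
# liquid left in the residual layer = field area x residual-layer thickness
# (the volume consumed filling template features is neglected here).

field_mm = 25.0        # assumed square field edge, mm
residual_nm = 60.0     # mean residual-layer thickness quoted in the text, nm

area_m2 = (field_mm * 1e-3) ** 2
volume_nL = area_m2 * (residual_nm * 1e-9) * 1e12   # 1 m^3 = 1e12 nL

print(f"{residual_nm:.0f}-nm residual layer over a {field_mm:.0f} x "
      f"{field_mm:.0f} mm field holds ~{volume_nL:.1f} nL of liquid")
# ~37.5 nL: a field is filled by a handful of sub-5-nL drops, so the dispensed
# drop volume directly sets the achievable residual-layer thickness.
```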
FIGURE 9.18 Residual-layer thickness uniformity obtained across a 25×25 mm2 field using a microsolenoid valve dispensing system.
An inkjet dispense system is currently under development. This technology has demonstrated the ability to achieve residual layers as thin as 20 nm, with uniformity close to what is possible by spin coating. An optical photograph of an imprinted die made with the inkjet dispense head is shown in Figure 9.19. Two other tools are also being offered by MII: the Imprio 50, a research S-FIL tool with a 10×10 mm2 imprint field and no alignment capability, and the Imprio 55, a research tool with an alignment specification of 1 µm, 1σ. Under development is the Imprio 200, a cassette-to-cassette system with a planned wafer throughput of five wafers per hour.

9.5.2 The S-FIL Template

Early template fabrication schemes started with a 6×6×0.25 in.3 conventional photomask plate and used established Cr and phase-shift etch processes to define features in the glass substrate [29]. Although sub-100-nm geometries were demonstrated, CD losses during the etching of the thick Cr layer make the fabrication scheme impractical for 1× templates. It is not unusual, for example, to see etch biases as high as 100 nm [30].
FIGURE 9.19 Residual-layer uniformity for several die using an inkjet dispensing system. Note the color uniformity.
More recently, two methods have been employed to fabricate templates [31,32]. The first method uses a much thinner (15 nm) layer of Cr as a hard mask. Thinner layers still suppress charging during the e-beam exposure of the template and have the advantage that CD losses encountered during the pattern transfer through the Cr are minimized. Because the etch selectivity of glass to Cr is better than 18:1 in a fluorine-based process, a sub-20-nm Cr layer is also sufficient as a hard mask during the etching of the glass substrate. The second fabrication scheme attempts to address some of the weaknesses associated with a solid glass substrate. Because there is no conductive layer on the final template, SEM and defect inspection are compromised. By incorporating a conductive and transparent layer of indium tin oxide (ITO) on the glass substrate, charging is suppressed during inspection, and the transparent nature of the final template is not affected. The experimental details of the processes have been covered in previous publications [32,33].

The Cr-based template pattern transfer process consisted of an exposure in a Leica VB6 and development of the ZEP-520 positive resist, followed by an oxygen descum, Cr etch, resist strip, quartz etch, and a Cr wet etch. It is interesting to note that it was necessary to remove the resist prior to the quartz etch. If it is left in place during the CHF3-based quartz etch, the amount of polymer deposited during the etch process is substantial enough to impact the fidelity of the quartz features. Additional amounts of oxygen may be necessary to minimize polymer formation. The process sequence for 30-nm features is depicted in Figure 9.20.

Widespread use of imprint lithography will require that the template be both inspectable and repairable. For applications requiring sub-100-nm lithography, it will likely become necessary to inspect the templates using electron beams. If this is the case, the template will need a charge-reduction layer to dissipate charge during the inspection process. A fabrication scheme that incorporates a transparent conducting oxide, such as ITO, into the final template addresses this problem. A thin layer of PECVD oxide is deposited over the ITO and defines the thickness of the imprinted resist layer. Features are formed on the template by patterning an electron-beam resist, transferring the pattern via reactive etching into the oxide, and stripping the resist. The ITO must have sufficient conductivity to avoid charging effects, first during resist exposure and later during template inspection. The sheet resistance of the as-deposited ITO film is on the order of 2.0×10^6 Ω/sq. It decreases substantially, however, after the films are annealed at a temperature of 300°C. In its annealed state, the ITO film sheet resistance is about 3.5×10^2 Ω/sq. Charge dissipation during e-beam writing and SEM inspection is realized at this conductivity level. The ITO must also be very transparent at the actinic wavelength used during the S-FIL exposure process (365 nm). It is possible to achieve transmission well above 90% at 365 nm [34]. The ITO has the additional attribute of performing as an excellent etch stop during the pattern transfer of the PECVD oxide layer. Examples of final template features formed using this process are shown in Figure 9.21.
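The 18:1 glass-to-Cr selectivity quoted above makes it straightforward to budget the hard-mask thickness. The sketch below is a generic mask-budget estimate; the relief depth and over-etch margin are illustrative assumptions, not values from the referenced process.

```python
# Hard-mask thickness budget: Cr consumed while etching the template relief
# into the glass, given the quoted glass:Cr etch selectivity.

selectivity = 18.0        # glass etched per unit of Cr consumed (from the text)
relief_depth_nm = 100.0   # assumed template relief depth
overetch = 1.3            # assumed 30% over-etch margin

cr_consumed_nm = relief_depth_nm * overetch / selectivity
print(f"Cr consumed: {cr_consumed_nm:.1f} nm")   # ~7.2 nm
# A 15-20 nm Cr film therefore survives the glass etch with margin to spare,
# which is why such a thin hard mask is sufficient.
```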
FIGURE 9.20 Pattern transfer sequence (resist, descum, Cr etch, resist strip, quartz etch) for 30-nm trenches. Lines between the trenches are 100 nm.
FIGURE 9.21 100-, 60-, 30-, and 20-nm features defined using the ITO-based process.
An even simpler way to make a template is to use an electron-beam-sensitive flowable oxide, such as hydrogen silsesquioxane (HSQ). Although the primary use of HSQ is as a low-k dielectric, several investigators have demonstrated its usefulness as a high-resolution electron-beam resist. In its cured state, HSQ becomes a durable oxide, making it a very convenient material for direct patterning of S-FIL template relief structures. Processing of HSQ as an electron-beam resist is less complicated because it is not chemically amplified and can be developed in the standard tetramethyl ammonium hydroxide (TMAH)-based developers used commonly for conventional resists. All that is required to make a template is to coat and bake the HSQ directly on the ITO layer, and then expose and develop the HSQ [35]. It is interesting to note that the methods described in this section can also be used sequentially to form multilayer structures that can be used to fabricate devices such as T-gates or optical grating couplers [36]. SEM pictures depicting two-tiered and three-tiered structures are shown in Figure 9.22. Figure 9.22a and b show tiered structures produced using alternating layers of ITO and PECVD oxide. Figure 9.22c was produced by patterning a bottom oxide film and subsequently coating, exposing, and developing an HSQ layer. The final step in the template fabrication process is a treatment designed to lower the surface free energy. Alkyltrichlorosilanes form strong covalent bonds with the surface of fused silica, or SiO2. In the presence of surface water, they react to form silanol intermediates that undergo a condensation reaction with surface hydroxyl groups and adjacent silanols to form a networked siloxane monolayer. When this functional group is synthetically attached to a long, fluorinated aliphatic chain, a bifunctional molecule that is suitable as a template release film is created. The silane-terminated end bonds itself to the template's surface, providing the durability necessary for repeated imprints. The fluorinated chain, with its tendency to orient itself away from the surface, forms a tightly packed comb-like structure and provides a low-energy release surface. Annealing further enhances the condensation, creating a highly networked, durable, low-surface-energy coating.
FIGURE 9.22 Multitiered structures formed by iterating the fabrication process.
9.5.3 The S-FIL Resist
The resist stack typically consists of a silicon-containing etch barrier over an antireflective coating (also referred to as the transfer layer). The etch barrier is patterned via the imprint process. The subsequent pattern transfer process involves an etch of the remaining residual layer (~60 nm in thickness), followed by an anisotropic etch of the transfer layer. The etch-barrier material is subject to several design constraints. The etch-barrier liquid must be dispensable from an automatic fluid dispensing system, and must not change significantly in composition between dispensing and imprinting through, e.g., component evaporation. It must be readily displaced during the imprint step and photopolymerize rapidly during exposure. Shrinkage due to polymerization must be controlled. The polymer must release from the template while adhering to the transfer layer, and it must exhibit sufficient rigidity to avoid feature collapse. It must exhibit some level of temperature stability to withstand the etching temperatures, and it must exhibit sufficient etch selectivity during the O2 reactive ion etching (RIE) step to allow high aspect ratios to be generated in the transfer layer. The S-FIL process relies on photopolymerization of a low-viscosity, acrylate-based solution. Acrylate polymerization is known to be accompanied by volumetric shrinkage that is the result of chemical bond formation. Consequently, the size, shape, and placement of the replicated features may be affected. Volumetric shrinkage was found to be less than 10% (v/v) in most cases [37]. The current etch-barrier liquid is a multicomponent solution that has previously been described in detail [37]. The silylated monomer provides etch resistance in the O2 transfer etch. Crosslinker monomers provide thermal stability to the cured etch barrier and also improve its cohesive strength. Organic monomers serve as mass-persistent components and lower the viscosity of the etch-barrier formulation. The photoinitiators dissociate to form radicals upon UV irradiation, and these radicals initiate polymerization. SEMs of this etch barrier are shown in Figure 9.23a. Twenty-nanometer features have been resolved with both types of templates described earlier.
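The impact of the quoted volumetric shrinkage on feature dimensions can be bounded with a simple isotropic approximation: the linear shrinkage corresponding to a volume change v is 1 − (1 − v)^(1/3). This is only an illustrative estimate; in practice the cured film is constrained by the template and transfer layer, so the shrinkage is far from isotropic.

```python
# Linear shrinkage implied by a given volumetric shrinkage, assuming (for
# illustration only) that the cured etch barrier shrinks isotropically.

def linear_shrinkage(volumetric_fraction):
    return 1.0 - (1.0 - volumetric_fraction) ** (1.0 / 3.0)

for v in (0.05, 0.10):
    lin = linear_shrinkage(v)
    print(f"{v:.0%} volumetric shrinkage -> {lin:.1%} linear, "
          f"i.e. ~{100 * lin:.1f} nm on a 100-nm feature")
```

Even a 10% volumetric shrinkage corresponds to only a few percent of linear change under this assumption, consistent with the small CD offsets reported for the replicated features.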
FIGURE 9.23 Printed features in the acrylate-based etch barrier: (a) top-down SEMs; (b) cross-sectional images of both single-tier and multitiered features (feature sizes from 20 to 80 nm).
FIGURE 9.24 Uniformity plot of 30-nm dense trenches across a 25!25 mm2 die, in both the template and in the resist. Note that the 3s values do not change.
Cross-sectional images are shown in Figure 9.23b. The profiles closely replicate the relief image in the template for both single- and multitiered structures. Critical-dimension uniformity studies have also been performed. In one study, an 8×8 array of features was defined on a template. The template was then used to print a die on a wafer. The CD variation was measured using a Hitachi 7800 CD-SEM for both the template and the etch barrier. The results for 30-nm features are shown in Figure 9.24. As expected, there is only a small additional variance in the CD caused by the printing process [37]. Prior to etching the underlying transfer layer, it is necessary to remove the residual etch-barrier material formed during the imprint process. Because the silicon content is at least 12%, the best selectivity between the etch barrier and the transfer layer is achieved by using a combination of CF4 and oxygen. After the transfer layer is exposed, the gas chemistry is comprised only of O2. Recent studies indicate that selectivities greater than 6:1 may be possible for both etches. Figure 9.25 shows the pattern transfer sequence. More details on this process can be found in the paper of Johnson et al. [3]. It is interesting to note that the presence of oxygen dissolved in the etch barrier and in the ambient environment causes two undesirable effects on the curing of the acrylate etch barrier. Oxygen dissolved in the etch barrier consumes photoinitiated radicals, resulting in an inhibition period before polymerization begins. Furthermore, oxygen diffusion into the etch barrier limits the curing reaction around the perimeter of the template. Although it may be possible to modify the ambient, other chemistries, such as vinyl ethers, may be more suitable for the imprint process [38]. This approach eliminates the oxygen-inhibition
FIGURE 9.25 Pattern transfer sequence showing the etch barrier over the transfer layer, the residual layer etch, and the etch of the transfer layer.
effect and may also further reduce the viscosity of the etch barrier, thereby further reducing the residual layer formed during the imprint process.
9.6 Imprint Lithography Issues

Several other issues need to be addressed before technologies such as NIL and S-FIL can be considered viable for silicon IC fabrication. The two biggest issues are defects and overlay. Because imprint lithography is a "contact" lithography, there are concerns associated with defects generated during the process. As a 1× technology, there are also concerns relative to wafer alignment. Each of these topics is discussed below. The only comprehensive studies in these fields are restricted to S-FIL, and the review is therefore limited to this particular type of imprint lithography.

9.6.1 Defects
For S-FIL, the low-surface-energy monolayer applied to the template acts effectively as a self-cleaning agent. This attribute has been reported in several publications [25,29]. A dirty template was used to imprint several die on a silicon wafer. The progression of pictures indicated that defects that start on the template embed themselves in the etch barrier, and by the seventh imprint, there were no detectable particles. It is also interesting to note that there does not appear to be any degradation of the release layer over time. Contact angle measurements show no change after more than two months [25]. Although the data clearly illustrate a self-cleaning effect, this is not sufficient evidence to prove that defects are not added after many imprints. A more convincing study involves printing wafers and tracking the defects using an inspection tool. To this end, a study of imprinted wafers was conducted on a KLA-Tencor 2139 wafer-inspection tool in collaboration with KLA-Tencor [38]. Initial inspection of 96 consecutive imprints shows relatively high levels of detected defects, but no significant upward trend in defects over time, as shown in Figure 9.26a. Although the data is noisy and the number of defects is relatively large, there does not appear to be an increase in defects. Statistical analysis of these data has been performed. Figure 9.26b depicts the relationship between the number of defects added per imprint and the number of imprints.
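The kind of statistical analysis described here can be sketched as a simple linear regression of defect count against imprint number: the fitted slope estimates the number of defects added per imprint, and its confidence interval should include zero if the template is not accumulating defects. The data below are synthetic placeholders, not the measured defect counts.

```python
import numpy as np

# Synthetic stand-in for defect counts over consecutive imprints
# (the real data are plotted in Figure 9.26a).
rng = np.random.default_rng(0)
imprints = np.arange(1, 97)
defects = rng.poisson(lam=200, size=imprints.size)   # noisy, no real trend

# Least-squares fit: defects = slope * imprint + intercept
slope, intercept = np.polyfit(imprints, defects, 1)

# Approximate 95% confidence interval on the slope
resid = defects - (slope * imprints + intercept)
s_err = np.sqrt(np.sum(resid**2) / (imprints.size - 2))
se_slope = s_err / np.sqrt(np.sum((imprints - imprints.mean())**2))
print(f"defects added per imprint: {slope:.3f} +/- {1.96 * se_slope:.3f}")
# If the interval includes zero, the data do not show defect accumulation.
```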
FIGURE 9.26 (a) Defect levels vs. imprint number. (b) Defects added per imprint as a function of imprint number.
FIGURE 9.27 40-nm features are intact after 1557 imprints (~40 wafers).
As the size of the data set increases, there is a change in the data that shifts the slope and its confidence interval downward to capture zero. A more recent study examined the surface of a 320 mm2 imprinted field containing 30-, 40-, and 50-nm lines with varying pitches [39]. The results for the 40-nm features are shown in Figure 9.27. Scanning electron micrographs depict the field after imprint numbers 1, 576, 583, and 1557. The results are nominally the same for each picture: the 40-nm lines remain intact, and no defects are visible in the field of view.

9.6.2 Image Placement and Overlay

Two concerns are worth addressing: (1) Does the template fabrication process result in image placement errors that cannot be removed using conventional correction techniques such as scale and orthogonality corrections? (2) If image placement is good, can the imprint tool align and make the corrections necessary to meet the stringent requirements for silicon processing? To examine image placement, a 6025 photoplate was patterned over a 5×5 in.2 area with alignment marks. Image placement was measured using a Leica LMS 2020 during each step of the Cr/quartz template fabrication process described in a previous publication [31]. The resultant image placement error has a maximum of approximately 15 nm. This error can be attributed to the stress of the chromium film. The image placement errors experimentally observed agree very well with finite element models [40]. To determine what type of overlay error would result from the patterning process, a second plate was written using an opposite-tone resist. The center 1×1 in.2 (a typical field size) areas of both plates were then compared. The results, after correcting for scale and orthogonality, are shown in Figure 9.28a [41]. The displacement vectors are typically less
−10.66 8.66
FIGURE 9.28 (a) Distortion map after the chromium layer has been removed from the template. (b) Distortion map comparing the center 1×1 in.2 areas from two different templates.
than 10 nm and are randomly directed, indicating that the error vectors are mostly limited by the sensitivity of the LMS 2020. The issue of overlay comes down to the capabilities of the imaging system and the method used for imprinting. Because S-FIL is a room-temperature and low-pressure (≤1 psi) process, the real concern becomes the ability of the tool to overlay different mask levels. Tool capability has two major components. The first is related to the alignment method and alignment optics. The second is the ability to correct for distortion errors such as magnification and orthogonality. The current method of alignment on the Imprio 100 takes advantage of the transparent template, and a through-the-template alignment system is used to align marks on both the wafer and template. This type of system may actually be advantageous relative to reduction systems because distortion errors from the lens elements are eliminated. It is important to note the differences between alignment in an S-FIL tool such as the Imprio 100 and a typical contact aligner. First, alignment in the S-FIL tool is performed for each die, thereby minimizing runout errors. Second, alignment adjustments can be made with the template and wafer in "contact." Across most of the die, the template and substrate are actually separated by the liquid etch barrier. In the area of the alignment mark, however, there is no etch barrier. This is an important distinction because the etch barrier and the template are closely index-matched. If the etch barrier were allowed in the alignment-mark area, it would not be possible to image the mark. Alignment adjustments are possible in this scheme because the etch barrier is still a liquid. It should be a straightforward task, therefore, to align within a few hundred nanometers. An example of an aligned template and wafer is shown in Figure 9.28b. The real challenge, then, is to be able to correct for distortion between the template and wafer. One possible way to accomplish this is to set a series of piezos around the template. To date, modeling [42] and preliminary experiments suggest that the use of a template whose thickness is substantially larger than the depth of the etched features allows for magnification corrections that are independent of the features etched into the template. Also, very uniform strain fields can be obtained using mechanical means. Experimental
verification of these magnification systems as part of a complete imprinting step-and-repeat tool still remains to be carried out.
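The "correcting for scale and orthogonality" step described above amounts to fitting and removing a linear transformation from the measured displacement field before reporting the residual distortion. The sketch below shows one conventional way to do this with a least-squares fit; the sample data are placeholders, not the measured LMS 2020 values.

```python
import numpy as np

def remove_linear_distortion(xy, dxy):
    """Fit dx and dy as linear functions of position (translation, scale,
    rotation, orthogonality) and return the residual displacements."""
    x, y = xy[:, 0], xy[:, 1]
    A = np.column_stack([np.ones_like(x), x, y])    # per-axis linear model
    resid = np.empty_like(dxy)
    for i in range(2):                              # fit dx and dy separately
        coeffs, *_ = np.linalg.lstsq(A, dxy[:, i], rcond=None)
        resid[:, i] = dxy[:, i] - A @ coeffs
    return resid

# Placeholder mark positions (mm) and raw displacements (nm)
rng = np.random.default_rng(1)
xy = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float) * 6.0
dxy = 2.0 * xy + rng.normal(scale=5.0, size=(25, 2))   # 2 nm/mm scale error + noise

residuals = remove_linear_distortion(xy, dxy)
print(f"max residual after correction: {np.abs(residuals).max():.1f} nm")
```

After the linear terms are removed, only the random, uncorrectable component of the distortion map remains, which is what the sub-10-nm residual vectors in Figure 9.28 represent.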
9.7 Template Infrastructure

Even if all the issues discussed above are successfully addressed, imprint lithography will not become widespread unless there is a commercially available source of templates. Mask shops are primarily focused on 4× technology, and a shift to 1× technology is a daunting task. Not only must the templates be written by fast mask writers (as opposed to the tools generally used by research groups to make templates), they must also be inspected and repaired. All three of these topics are discussed below. Again, the only work to date has been limited to studies on S-FIL templates.

9.7.1 Template Writing

To understand the resolution that is possible to achieve with a commercial mask writer, Dauksher et al. used a Leica SB350 MW to expose ZEP 7000 resist [43]. The SB350 is a variable-shaped-beam tool with 50-keV electron optics and a 1-nm address grid. High-throughput writing is achieved through the use of a vector-scan writing strategy, a write-on-the-fly mode, and high current densities. The maximum shape size available for writing is 2×2 µm2. A robotic reticle management station handles plates as large as 9×9 in.2 and wafers up to 200 mm in diameter. Proximity correction was achieved using a parameter-set determination package (BETA), in combination with PROXECCO correction software from PDF Solutions, GmbH [44,45]. The ZEP 7000 resist was prepared using a spin-coating process at 3300 rpm. The resulting film thickness was approximately 180 nm. After coating, the resist was baked at 180°C for 30 min on a hotplate. After exposure, the resist was puddle-developed for 60 s using ZED 500 developer. The first plate written was used to determine the proximity correction parameters necessary for subsequent exposures. The ZEP 7000 exposure dose ranged from 20 to 90 µC/cm2. A resolution test pattern with features as small as 10 nm was used as a test vehicle. The patterns were written in two ways: (1) no bias, and (2) a negative 20-nm bias. After pattern transfer, feature sizes down to 50 nm were measured, and the data were input into the proximity correction software. Figure 9.29a depicts a three-Gaussian fit of the initial data. The fit was then used to generate CDs for isolated features.
FIGURE 9.29 (a) Three-Gaussian control curve (log intensity versus radius in µm; fit parameters a = 0.071, β = 1.040, e = 0.225, g = 12.00, n = 0.350) generated from measurements on the first test plate. (b) Measured and calculated isolated feature dimensions (dose factor versus linewidth) ranging from 50 to 1000 nm.
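Proximity correction of this kind is usually based on a point-spread function written as a sum of Gaussians, with the fitted parameters describing the relative weights and ranges of forward-scattered and backscattered energy. The sketch below evaluates such a function in a simplified two-Gaussian form for brevity; the parameter names, values, and units are illustrative assumptions and are not taken from the BETA/PROXECCO parameter set quoted in the figure.

```python
import numpy as np

def psf(r_um, alpha_um=0.07, beta_um=12.0, eta=0.35):
    """Two-Gaussian proximity point-spread function (normalized):
    a narrow forward-scattering term of range alpha and a broad
    backscattering term of range beta, weighted by eta."""
    fwd = np.exp(-(r_um / alpha_um) ** 2) / (np.pi * alpha_um ** 2)
    back = np.exp(-(r_um / beta_um) ** 2) / (np.pi * beta_um ** 2)
    return (fwd + eta * back) / (1.0 + eta)

# Fraction of the deposited energy that lands more than 1 um from the beam:
# this long-range backscatter background is what dose-modulation software
# compensates for when it assigns per-shape dose factors.
r = np.linspace(0.0, 30.0, 3001)            # radius in micrometers
weights = 2 * np.pi * r * psf(r) * (r[1] - r[0])
print(f"fraction of energy beyond 1 um: {weights[r > 1.0].sum():.2f}")
```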
Exposure dose (μC/cm2)
(b)
FIGURE 9.30 (a) Critical dimension of semidense trenches, with no proximity correction, as a function of exposure dose. The patterns in (b) received a negative bias of 20 nm (10 nm per edge).
Excellent agreement between calculated and measured features was obtained (Figure 9.29b). Using the proximity correction parameters obtained from the first plate, two additional plates were exposed. Arrays of resolution test patterns were written in a columnar format. The first two columns consisted of nonproximity-corrected dose arrays. The exposure dose varied from 27.0 to 166.5 µC/cm2. The second column had a −20-nm bias relative to the first column. The third and fourth columns were proximity-corrected. The exposure dose ranged from 38 to 54 µC/cm2. The fourth column was biased 20 nm relative to the third column. The exposure latitude for semidense features (the line spacing was held constant at 100 nm) was measured for features ranging from 40 to 100 nm. Figure 9.30 and Figure 9.31 depict the results of the measurements for nonproximity-corrected and proximity-corrected trench features, respectively. Figure 9.30a shows the results for the case when no bias was applied. A nominal dose of 57 µC/cm2 was necessary to obtain nominal
50
51
FIGURE 9.31 Critical dimension of semidense trenches, with proximity correction, plotted as in Figure 9.30. The patterns in (b) received a negative bias of 20 nm (10 nm per edge).
100-nm trenches. The exposure dose necessary to obtain the 50-nm trench increased to 67 µC/cm2. With the inclusion of a 20-nm bias (Figure 9.30b), exposure latitude improves, but linearity degrades. As an example, the nominal doses needed to resolve 100- and 50-nm trenches are 71 and 102.5 µC/cm2, respectively. Much larger exposure doses were needed to resolve the 40-nm features. In Figure 9.30b, the 40-nm trench does not reach its nominal value until a dose of 116 µC/cm2 is applied. Linearity is significantly improved after the application of the proximity correction software. Figure 9.31a depicts the exposure latitude results for the case when no bias is applied. The nominal dose needed to obtain 100-nm trenches is 42 µC/cm2, and it shifts by only 5 µC/cm2 for a 50-nm trench. It is interesting to note that less dose is required for the sub-80-nm trenches. The best results for both latitude and linearity are obtained by applying a 20-nm exposure bias (Figure 9.31b). For this case, 50-nm trenches are slightly oversized for all measured doses; however, the 40-nm trenches are somewhat undersized. Figure 9.32 depicts SEM images of the 100- and 50-nm features at the nominal exposure dose required for 100-nm trenches. The effect of using the proximity correction software is apparent. Without correction, the 50-nm features are small, and it is likely that a resist scum remains in the bottom of the features. With proximity correction, the 50-nm features are well defined, with no apparent scum at the surface of the trench. After pattern transfer, all of the features previously discussed were measured again. The results are shown in Figure 9.33. Figure 9.33a depicts exposure latitude for 100-, 80-, and 50-nm trenches before and after pattern transfer, for the case where no proximity correction is used. There is a negative shift in CD of approximately 5–10 nm after etch. The 50-nm trenches are clear only for the largest exposure dose shown and are undersized by 10 nm, indicating that a significant amount of residual resist remained for the smaller exposure doses. Figure 9.33b is a plot of feature linearity for the case where proximity correction is applied. The filled data points depict linearity as measured in the resist. The open data points show the results after etch. When no exposure bias is applied, all of the trench features are resolved; however, the linearity is not well maintained. The larger trenches are undersized after pattern transfer. Linearity is improved for trenches as small as 50 nm when a 20-nm exposure bias is applied. The 40-nm trenches (after etch) are not present at the nominal exposure dose. They are resolved, however, at higher exposure doses.
FIGURE 9.32 100- and 50-nm resist features written at the nominal dose necessary for the 100-nm trenches. The best results were obtained for the case where the patterns received proximity correction (PC) and were biased by 20 nm.
Coded CD (nm)
FIGURE 9.33 Measurements of final features after pattern transfer. (a) Exposure latitude when no proximity correction is applied. (b) Linearity for two different cases when proximity correction is used.
A second adjustment of the proximity correction conditions will be necessary to further improve linearity. SEMs of the final features, when proximity correction is applied, are shown in Figure 9.34. To better understand resolution limits, other exposure doses were investigated. Although trenches as small as 29 nm were observed, the line-edge quality was poor. At 33 nm, however, the features were better resolved. Dense arrays of 55-nm holes were present, and semidense arrays with holes as small as 44 nm were resolved. Results were compared with data obtained using a Leica VB6. Although both tools yield templates with 40-nm features, line-edge roughness appears to be better for the templates written with the VB6. Better resolution is also obtained with the VB6 (~20–33 nm). This is not surprising because the edge acuity of the SB350MW is approximately 25–30 nm. It should also be noted that the results obtained with the VB6 required no proximity correction. Based on these observations, a shaped-beam system operating at 100 kV, with edge acuity half of what is achieved today, should have resolution comparable to the VB6.
FIGURE 9.34 Final template features after etch, with proximity correction applied. The top row depicts results when no exposure bias is used. The bottom row shows features obtained with a 20-nm exposure bias.
To meet ITRS requirements for future generations of silicon ICs, good resolution must also be coupled with tools having better image placement capability. It is interesting to note that a Gaussian vector-beam tool might also be used if the clock rate can be increased by a factor of four, up to 100 MHz. As an example, a gate level with 35% feature coverage would take approximately 24 h of exposure time. This time is comparable to writing times for the industry's most complex 4× masks [46].
9.7.2 Template Inspection
If imprint lithography is to be considered as a viable method for fabricating complex devices such as high-density silicon-based circuits, an infrastructure must be established that is capable of supplying users with high-quality 1× templates. It is critical, therefore, that tools are available that can both inspect and repair these templates. As an example, the ITRS roadmap for the 45-nm node requires no mask defects larger than 30 nm over the field of the die. The difficulty of this task is quite staggering. As an analogy, if the size of a circuit were equivalent to North America, a 30-nm defect would roughly have the dimensions of a pothole on a freeway. As a result, the task of an inspection system is equivalent to finding every pothole on every road in North America [47].
To investigate current tool capabilities, S-FIL templates were fabricated with special test patterns: 280- and 400-nm nested squares, vias, and pillars were written with e-beam lithography and intended for optical inspection (Figure 9.35a), and a similar pattern of 70- and 100-nm size was written for e-beam inspection (Figure 9.35b). The patterns were written with and without programmed defects of various sizes. Die-to-die inspection was successfully run on a KLA-Tencor RAPID model 526 DUV photomask inspection system. The tool was operated in transmission mode at a 257-nm wavelength under standard sensitivity conditions and found a sufficiently low defect density to form a good baseline and set some preliminary quantitative numbers. An initial optical inspection of a two-cell area of about 2 mm2 confirmed 60 programmed defects and identified 13 nuisance defects. Figure 9.36 shows some of the typical defects identified by the inspection process.
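For readers unfamiliar with the die-to-die mode mentioned above, the comparison amounts to subtracting two nominally identical, registered die images and flagging pixels that differ by more than a threshold. The toy sketch below illustrates only that core idea; it is not the KLA-Tencor RAPID algorithm, and the image size, threshold, and synthetic defect are invented for illustration.

import numpy as np

def die_to_die_defects(die_a, die_b, threshold=0.25):
    """Toy die-to-die comparison: flag pixels where two registered,
    normalized die images differ by more than `threshold`.
    Production inspection adds alignment, optics and noise modeling,
    and defect sizing on top of this basic idea."""
    diff = np.abs(np.asarray(die_a, float) - np.asarray(die_b, float))
    return diff > threshold

# Illustrative use with synthetic images (not real inspection data):
rng = np.random.default_rng(0)
ref = rng.random((64, 64))
test = ref.copy()
test[30:33, 40:43] += 0.8          # plant a "defect"
print(int(die_to_die_defects(ref, test).sum()), "defect pixels flagged")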
FIGURE 9.35 (a) 4× patterned defects on an S-FIL template. (b) 1× programmed defects on the same template.
Figure 9.36 panel labels: small programmed defect (size 2); medium programmed defect (size 5) with a nuisance defect; large programmed defect (size 9).
FIGURE 9.36 4× programmed defects detected by a KLA-Tencor DUV inspection system.
Although successful, the resolution of this tool is approximately 80 nm. To identify smaller defects, it is likely that an electron-beam inspection tool will be required. A parallel electron-beam reticle inspection system (PERIS) is currently under development by KLA-Tencor. The architecture is similar to that of an optical microscope, with the exception that electrons are used in place of a light source. A schematic of the system is shown in Figure 9.37. A two-dimensional electron imager is used to collect information from more than 10,000 pixels in parallel. An example of the resolution of the tool is shown in Figure 9.38. Figure 9.38a depicts a 100-μm field of view, including a captured defect. The close-up of the defect (Figure 9.38b) reveals that it is smaller than 50 nm. Significant work remains before this tool is commercialized, but it is a promising technology for both the inspection of 1× templates and printed wafers.
9.7.3 Template Repair
Once a defect has been located, a complementary repair tool must remove the defect, leaving the repaired area free from debris and chemically unaltered. Two technologies are currently being investigated. The first is a custom nanomachining technique under development at RAVE. The second is reactive electron-beam etching, currently under development by NaWoTec and others.
FIGURE 9.37 Schematic illustration of a parallel electron reticle inspection system (PERIS).
FIGURE 9.38 (a) Defect detected by a multi-pixel electron-beam inspection tool. (b) A close-up of the defect.
Initial repair studies have been started by Dauksher et al. using a RAVE LLC 650-nm system. The nanomachining operation is based on an AFM platform, with RAVE's proprietary nanomachining head and several other support modules installed. The 650-nm system uses both low- and high-numerical-aperture optics for navigation. The system further uses pattern recognition to automate fiducial deskew and to establish a mask coordinate system. After the coordinate origin is established, defect locations can either be downloaded electronically or located and archived manually. The nanomachining head provides both scanning and machining capability. A CO2 cryogenic cleaner is employed for debris removal. For the purpose of studying repair capabilities, S-FIL templates were e-beam written on 6 × 6 × 0.25-in. (6025) plates with a special programmed defect pattern. This pattern comprised subsections containing line-edge defects, point defects, dual defects (adjacent line-edge and point defects), line-end bridging defects, and line-edge bridging defects. In turn, each of these defects existed in both tones and in a variety of sizes down to 50 nm. A representative image of some of the dual defects is shown in Figure 9.39.
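The fiducial-deskew step described above amounts to fitting a coordinate transform between design coordinates and the tool's stage coordinates and then mapping defect locations through it. The sketch below shows a least-squares affine fit of that kind, with invented fiducial readings; it is an illustration of the idea, not RAVE's actual implementation.

import numpy as np

# Design-space fiducial locations (um) and where the tool measured them (invented values).
design = np.array([[0.0, 0.0], [10000.0, 0.0], [0.0, 10000.0]])
measured = np.array([[12.0, -7.5], [10011.8, -4.1], [8.4, 9995.2]])

# Least-squares affine fit: measured ~= design @ A + t
X = np.hstack([design, np.ones((3, 1))])
coef, *_ = np.linalg.lstsq(X, measured, rcond=None)
A, t = coef[:2], coef[2]

def to_stage(defect_xy):
    """Map a defect location from design coordinates to stage coordinates."""
    return np.asarray(defect_xy, float) @ A + t

print(to_stage([5000.0, 5000.0]))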
FIGURE 9.39 Programmed defects used as a test device for template repair tools.
Figure 9.40 panel labels: quartz line and quartz defect; (a) before repair, (b) after AFM repair; 300-nm defect, 50-nm defect, and 50-nm trench.
FIGURE 9.40 (a) Programmed defects before repair. (b) Defects repaired with a RAVE 650-nm nanomachining system.
Figure 9.40 contains AFM images of three different defects before and after repair. Successful removal of the two simulated opaque defects was accomplished with reasonable edge placement. Further repairs were enacted using “mouse bites” in lines as a starting point for milling through the lines in a direction perpendicular to the long axis. More extensive studies are planned for the future. NaWoTec technology combines the use of a thermal field-emission SEM with a precursor injection technology to cause either localized deposition or etching at a substrate surface. The LEO base system features a GEMINI thermal field emitter (TFE) column,
FIGURE 9.41 (a) Pt/C deposited using a NaWoTec electron-beam repair tool. (b) Quartz etching done with the same system.
a variable-pressure (VP) vacuum system, an airlock, and an interferometer-controlled stage for 8-in. wafers or 6-in. photomask plates. NaWoTec integrates the software-controlled gas-supply system, the pattern generator, process control and the general user interface, defect cruising using the KLA format, repair recipes, and repair procedures. Additionally supplied is an automated drift-compensation package. Five computer-controlled gas channels for deposition and etching precursors for mask materials are integrated, and digital image pickup and repair-pattern generation are included. A repair manager supports defect repair. In general, 1-keV electrons and a low electron current are used for the deposition and the etching processes. Although no repairs have been done specifically for imprint lithography, depositions using Pt/C have been demonstrated on photoplates (as a replacement for missing chrome) and on stencil masks (to replace missing silicon). Etching of TaN, MoSi, and quartz has also been demonstrated. Examples of a Pt/C deposition and a quartz etch are depicted in Figure 9.41.
9.8 Conclusions
Imprint lithography has come a long way in a very short period of time. Resolution seems limited only by the ability to form a relief image in the template, and sub-10-nm printing has already been demonstrated. The technology is cost effective, and applications will continue to grow as emerging markets in the field of nanotechnology continue to blossom. To be considered as a method for fabricating silicon ICs, several concerns still need to be addressed. UV–NIL, and in particular S-FIL, seems the best imprinting option for meeting the stringent requirements of future generations of silicon-based circuitry. Tools, templates, and resists are readily available to start exercising the technology and will be used to answer open issues such as defectivity and overlay. If these issues can be solved, imprint lithography may be the right NGL because extendibility to at least 10 nm seems viable. The last consideration, then, becomes the supporting infrastructure. Reduction lithography has been in the mainstream for more than 20 years, so the ability to write, inspect, and correct a 1× template will need to be developed. Electron-based inspection and repair tools, as well as faster Gaussian-based electron-beam writing systems, may provide the pathway for template fabrication in the future.
Acknowledgments The writing of this chapter would have been impossible without the contributions of many different talented scientists. I would like to thank several collaborators in particular. David Mancini, Bill Dauksher, and Kevin Nordquist are responsible for much of the work on S-FIL templates. C. Grant Willson, Norm Schumaker, and S. V. Sreenivasan provided most of the expertise for the tool and resist discussions. Roxanne Engelstad taught me more than I ever wanted to know about finite-element modeling and mask distortion. Finally, I would like to thank Laura Siragusa and Vida Ilderem for their support of this work.
References
1. D.J. Resnick, W.J. Dauksher, D. Mancini, K.J. Nordquist, T.C. Bailey, S. Johnson, N. Stacey et al. 2003. Proceedings of SPIE, 5037: 12–23.
2. S.V. Sreenivasan, C.G. Willson, N.E. Schumaker, and D.J. Resnick. 2002. Proceedings of SPIE, 4688: 903–909.
3. Y. Xia and G.M. Whitesides. 1998. Angewandte Chemie, International Edition, 37: 550–575.
4. J.A. Rogers, K.E. Paul, and G.M. Whitesides. 1998. Journal of Vacuum Science and Technology B, 16:1, 88–97.
5. R.S. Kane, S. Shuichi, T. Takayama, E. Ostuni, D.E. Ingber, and G.M. Whitesides. 1999. Biomaterials, 20: 2363–2376.
6. S.J. Clarson and J.A. Semlyen. 1993. Siloxane Polymers, Prentice Hall: Englewood Cliffs.
7. G.P. Lopez, H.A. Biebuyck, R. Harter, A. Kumar, and G.M. Whitesides. 1998. Journal of the American Chemical Society, 20: 1213–1220.
8. P.M. St. John et al. 1998. Langmuir, 14: 2225–2229.
9. R. Singhvi, A. Kumar, G. Lopez, G.N. Stephanopoulos, D.I.C. Wang, and G.M. Whitesides. 1994. Engineering cell shape and function, Science, 264: 696–698.
10. E. Kim, Y. Xia, and G.M. Whitesides. 1998. Annual Review of Materials Science, 28: 153–184.
11. S. Takayama et al. 1999. Proceedings of the National Academy of Sciences, U.S.A., 198: 5545–5548.
12. S.Y. Chou, P.R. Krauss, and P.J. Renstrom. 1996. Journal of Vacuum Science and Technology B, 14:6, 4129–4133.
13. S.Y. Chou, P.R. Krauss, W. Zhang, L. Guo, and L. Zhuang. 1997. Journal of Vacuum Science and Technology B, 15:6, 2897–2904.
14. Z. Yu, S.J. Schablitsky, and S.Y. Chou. 1999. Applied Physics Letters, 74:16, 2381–2383.
15. Y. Chen, D. Macintyre, E. Boyd, D. Moran, I. Thayne, and S. Thoms. 2002. Journal of Vacuum Science and Technology B, 20:6, 2887–2890.
16. X. Cheng, Y. Hong, J. Kanicki, and L. Jay Guo. 2002. Journal of Vacuum Science and Technology B, 20:6, 2877–2880.
17. W. Wu, B. Cui, X. Sun, W. Zhang, L. Zhuang, L. Kong, and S.Y. Chou. 1998. Journal of Vacuum Science and Technology B, 16:6, 3825–3829.
18. Z. Yu, W. Wu, L. Chen, and S.Y. Chou. 2001. Journal of Vacuum Science and Technology B, 19:6, 2816–2819.
19. B.E. Little and S.T. Chu. 2000. Optics and Photonics News, 11:24, 24–29.
20. C. Chao and L. Jay Guo. 2002. Journal of Vacuum Science and Technology B, 20:6, 2862–2866.
21. C.M. Falco, J.M. van Delft, J.P. Weterings, A.K. van Langen-Suurling, and H. Romijn. 2000. Journal of Vacuum Science and Technology B, 18:6, 3419–3423.
22. S.Y. Chou, C. Keimel, and J. Gu. 2002. Nature, 417: 835–837.
23. H. Tan, L. Kong, M. Li, C. Steere, and L. Koecher. 2004. Proceedings of SPIE, 5374: 213–221.
24. M. Otto, M. Bender, B. Hadam, B. Spangenberg, and H. Kurz. 2001. Microelectronic Engineering, Vols. 57–58: 361–366.
25. M. Colburn, S. Johnson, M. Stewart, S. Damle, T. Bailey, B. Choi, M. Wedlake et al. 1999. Proceedings of SPIE, 3676: 379–389.
26. B.J. Choi, S. Johnson, S.V. Sreenivasan, M. Colburn, T. Bailey, and C.G. Willson. 2000. ASME DETC2000/MECH 14145, Baltimore, MD.
27. B.J. Choi et al. 1999. Design of orientation stages for step and flash imprint lithography, ASPE 1999 Annual Meeting.
28. I. McMackin, P. Schumaker, D. Babbs, J. Choi, W. Collison, S.V. Sreenivasan, N. Schumaker, M. Watts, and R. Voisin. 2003. Proceedings of SPIE, 5037: 178–186.
29. M. Colburn, T. Bailey, B.J. Choi, J.G. Ekerdt, and S.V. Sreenivasan. 2001. Solid State Technology, 46: 67.
30. K.H. Smith, J.R. Wasson, P.J.S. Mangat, W.J. Dauksher, and D.J. Resnick. 2001. Journal of Vacuum Science and Technology B, 19:6, 2906.
31. D.J. Resnick, W.J. Dauksher, D. Mancini, K.J. Nordquist, E. Ainley, K. Gehoski, J.H. Baker et al. 2002. Proceedings of SPIE, 4688: 205.
32. T.C. Bailey, D.J. Resnick, D. Mancini, K.J. Nordquist, W.J. Dauksher, E. Ainley, A. Talin et al. 2002. Microelectronic Engineering, Vols. 61–62: 461–467.
33. W.J. Dauksher, K.J. Nordquist, D. Mancini, D.J. Resnick, J.H. Baker, A.E. Hooper, A.A. Talin et al. 2002. Journal of Vacuum Science and Technology B, 20:6, 2857–2861.
34. A.E. Hooper, A.A. Talin, D.J. Resnick, J.H. Baker, D. Convey, T. Eschrich, H.G. Tompkins, and W.J. Dauksher. 2003. Proceedings of Nanotechnology.
35. D.P. Mancini, K.A. Gehoski, K.J. Nordquist, D.J. Resnick, T.C. Bailey, S.V. Sreenivasan, J.G. Ekerdt, and C.G. Willson. 2002. Journal of Vacuum Science and Technology B, 20:6, 2896–2901.
36. S. Johnson, D.J. Resnick, D. Mancini, K. Nordquist, W.J. Dauksher, K. Gehoski, and J.H. Baker. 2003. Microelectronic Engineering, Vols. 67–68: 221–228.
37. M. Colburn, I. Suez, B.J. Choi, M. Meissl, T. Bailey, S.V. Sreenivasan, J.G. Ekerdt, and C.G. Willson. 2001. Journal of Vacuum Science and Technology B, 19:6, 2685.
38. S.C. Johnson, T.C. Bailey, M.D. Dickey, B.J. Smith, E.K. Kim, A.T. Jamieson, N.A. Stacey et al. 2003. Proceedings of SPIE, 5037: 197–202.
39. F. Xu, N. Stacey, M. Watts, V. Truskett, I. McMackin, J. Choi, P. Schumaker et al. 2004. Proceedings of SPIE, 5374: 232–241.
40. C.J. Martin, R.L. Engelstad, E.G. Lovell, D.J. Resnick, and E.J. Weisbrod. 2002. Journal of Vacuum Science and Technology B, 20:6, 2891.
41. K.J. Nordquist, D.P. Mancini, W.J. Dauksher, E.S. Ainley, K.A. Gehoski, D.J. Resnick, Z.S. Masnyj, and P.J. Mangat. 2002. Proceedings of SPIE, 4889: 1143.
42. D.L. White and O.R. Wood. 2000. Journal of Vacuum Science and Technology B, 18:6, 3552–3556.
43. D. Beyer, D. Löffelmacher, G. Goedl, P. Hudek, B. Schnabel, and T. Elster. 2001. Proceedings of SPIE, 4562.
44. J. Butuschke, D. Beyer, C. Constantine, P. Dress, P. Hudek, M. Irmscher, C. Koepernik, C. Krauss, J. Plumhoff, and P. Voehringer. 2003. Proceedings of SPIE, 5256: 344–354.
45. M. Belic, R. Jaritz, P. Hudek, and H. Eisenmann. 2002. In Proceedings of Mask Patterning for 100 nm Technology Node.
46. F. Kalk. Private communication.
47. D. Adler. Private communication.
10 Chemistry of Photoresist Materials Takumi Ueno and Robert D. Allen
CONTENTS 10.1 Introduction ....................................................................................................................504 10.2 DNQ–Novolac Positive Photoresists ..........................................................................505 10.2.1 Photochemistry of DNQ ................................................................................505 10.2.2 Improvement in Photoresist Performance ..................................................507 10.2.2.1 Novolac Resins ............................................................................507 10.2.2.2 DNQ ..............................................................................................516 10.2.3 Perspective of DNQ Resists ..........................................................................519 10.3 Chemical-Amplification Resist Systems ....................................................................519 10.3.1 Acid Generators ..............................................................................................522 10.3.1.1 Onium Salts..................................................................................522 10.3.1.2 Halogen Compounds ................................................................524 10.3.1.3 o-Nitrobenzyl Esters....................................................................525 10.3.1.4 p-Nitrobenzyl Esters ..................................................................526 10.3.1.5 Alkylsulfonates ............................................................................526 10.3.1.6 a-Hydroxymethylbenzoin Sulfonic Acid Esters ....................529 10.3.1.7 a-Sulfonyloxyketones ................................................................530 10.3.1.8 Diazonaphthoquinone-4-sulfonate (4-DNQ) ..........................530 10.3.1.9 Iminosulfonates ..........................................................................531 10.3.1.10 N-Hydroxyimidesulfonates ......................................................531 10.3.1.11 a,a 0 -Bisarylsulfonyl Diazomethanes ........................................533 10.3.1.12 Disulfones ....................................................................................533 10.3.2 Acid-Catalyzed Reactions..............................................................................533 10.3.2.1 Deprotection Reaction ................................................................533 10.3.2.2 Depolymerization........................................................................541 10.3.2.3 Crosslinking and Condensation ..............................................543 10.3.2.4 Polarity Change ..........................................................................548 10.3.3 Route for Actual Use of Chemically Amplified Resists ..........................550 10.3.3.1 Acid Diffusion..............................................................................550 10.3.3.2 Airborne Contamination ............................................................553 10.3.3.3 N-Methylpyrrolidone (NMP) Uptake ......................................553 10.3.4 Improvement in Process Stability ................................................................554 10.3.4.1 Additives ......................................................................................554 10.3.4.2 Polymer End Groups ..................................................................554 10.3.4.3 Tg of Polymers ............................................................................556 10.4 Surface 
Imaging..............................................................................................................557
10.4.1 Gas-Phase Functionalization ........................................................................557 10.4.2 Desire ................................................................................................................558 10.4.3 Liquid-Phase Silylation ..................................................................................560 10.4.4 Use of Chemical Amplification ....................................................................561 10.4.5 Factors that Influence Pattern Formation ..................................................562 10.5 Resists for ArF Lithography ........................................................................................563 10.6 New Approaches of Contrast Enhancement During Development......................564 10.7 Update of Modern Resist Technology ........................................................................566 10.7.1 Introduction ....................................................................................................566 10.7.2 Chemical Amplification Resist Update for 248 nm ..................................567 10.7.2.1 PHS-Based Resists ......................................................................567 10.7.2.2 Limits of CA Resists ..................................................................569 10.7.2.3 Resolution of CA Resists............................................................570 10.7.3 193-nm Resists ................................................................................................572 10.7.3.1 Backbone Polymers ....................................................................572 10.7.3.2 PAG Effects in 193-nm Resists ..................................................573 10.7.3.3 Cyclic-Olefin-Backbone Polymers ............................................574 10.7.3.4 Backbone Polymers with Hexafluoroalcohol ........................575 10.7.4 Immersion Lithography Materials ..............................................................577 10.8 Conclusion ......................................................................................................................578 References ....................................................................................................................................578
10.1 Introduction
The design requirements of successive generations of very large scale integrated (VLSI) circuits have led to a reduction in lithographic critical dimensions. The aim of this chapter is to discuss the progress of the resists for present and future lithography. It is worthwhile describing the history and the trend of lithography and resists. It is evident from Figure 10.1 that three turning points of resist materials have been reached [1]. The first turning point was the replacement of a negative resist composed of cyclized rubber and a bisazide by a positive photoresist composed of a diazonaphthoquinone (DNQ) and a novolac resin. This was induced by the change of exposure system from a contact printer to a g-line (436 nm) reduction projection step-and-repeat system—the so-called stepper. The cyclized rubber system has poor resolution due to swelling during development and low sensitivity due to a lack of absorption at the g-line. The DNQ-based positive photoresist shows sensitivity at the g-line and high resolution using aqueous alkali development. Performance of the g-line stepper was improved by increasing numerical aperture (NA). Then, shorter-wavelength i-line (365 nm) lithography was introduced. The DNQ–novolac resist can be used for i-line lithography; therefore, much effort has been made to improve the resolution capability of the DNQ–novolac resist as well as the depth-of-focus latitude. The effect of novolac resin and DNQ chemical structure on dissolution inhibition capability has been investigated mainly to obtain high dissolution contrast, which will be discussed in Section 10.2. The progress of this type of resist and the i-line stepper is remarkable, achieving resolution below the exposure wavelength of the i-line (0.365 μm). However, i-line lithography has difficulty accomplishing 0.3-μm processes (64 MDRAM), even using a high-NA i-line stepper in conjunction with a DNQ–novolac resist.
FIGURE 10.1 Development trend of lithography and resists. (The plot shows resolution (μm) versus year: cyclized rubber–bisazide resists with solvent development for contact printing; DNQ–novolak resists with TMAH development for g-line (436 nm) and i-line (365 nm) reduction projection optics; and chemical-amplification resists for KrF (248 nm), ArF (193 nm), ArF immersion, EB, and EUV exposure, with the three turning points marked.)
Several competing lithographic technologies were proposed for the generation following i-line: wavefront engineering of i-line lithography, KrF lithography (deep-UV lithography), and electron-beam lithography [2]. Wavefront engineering includes off-axis illumination (OAI), pupil filtering [3], and phase-shifting lithography (discussed in Chapter 2). DNQ–novolac resists can be used for OAI and pupil filtering, although higher sensitivity is necessary. For phase-shifting lithography, bridging of patterns at the end of line-and-space patterns occurs when a positive resist is used [4]. Therefore, negative resists with high sensitivity and high resolution are required for phase-shifting lithography. In KrF lithography, positive and negative resists with high sensitivity, high resolution, and high transmittance at the exposure wavelength are needed. Chemical-amplification resists have been used in KrF lithography, which is the second turning point. ArF resists are currently used for large-volume production. In Section 10.7, Robert Allen has provided an update of the recent progress in resist materials.
10.2 DNQ–Novolac Positive Photoresists The positive photoresist composed of DNQ and novolac resin was the workhorse for semiconductor fabrication [5]. It is surprising that this resist has a resolution capability below the exposure wavelength of i-line (365 nm). In this section, the photochemical reaction of DNQ and improvement of the resist performance by newly designed novolac resins and DNQ inhibitors are discussed. 10.2.1 Photochemistry of DNQ The Wolff rearrangement reaction mechanism of DNQ was proposed by Suess [6] in 1942 as shown in Scheme 10.1. It is surprising that the basic reaction was already established a half century ago, although Suess [6] suggested the chemical structure of the final product
SCHEME 10.1 Photolysis mechanism for DNQ–PAC proposed by Suess.
was 3-indenecarboxylic acid, which was corrected to 1-indenecarboxylic acid [7]. Packansky and Lyerla [7] showed direct spectroscopic evidence of 1-indenoketene intermediate formation at 77 K using IR spectroscopy. Similar results were also reported by Hacker and Turro [8]. Packansky and Lyerla [7] also investigated the reaction of ketene intermediates using infrared and 13C nuclear magnetic resonance spectroscopy [7]. The reactivity of the ketene depends on the conditions, as shown in Scheme 10.2. Under ambient conditions, ketenes react with water trapped in the novolac resin to yield O C CH3
SCHEME 10.2 UV-induced decomposition pathways for DNQ–PAC in a novolac resin.
3-indenecarboxylic acid. However, UV exposure in vacuo results in ester formation via a ketene-phenolic OH reaction. Vollenbroek et al. [9,10] also investigated the photochemistry of 2,1-diazonaphthoquinone (DNQ)-5-(4-cumylphenyl)-sulfonate and DNQ-4-(4-cumylphenyl)-sulfonate using photoproduct analysis. They confirmed that the photoproduct of DNQ is indenecarboxylic acid and its dissolution in aqueous base gives the formation of indenyl carboxylate dianion that decarboxylates in several hours. They showed that films of mixtures of novolac and indenecarboxylic acid showed no difference in dissolution rate compared to that of exposed photoresist. Many attempts to detect the intermediates in photochemistry of DNQ by time-resolved spectroscopy have been reported. Nakamura et al. [11] detected the strong absorption intermediate at 350 nm that was assigned as a ketene intermediate formed by DNQ sulfonic acid in solution. Shibata et al. [12] observed the transient absorption at 350 nm of hydrated ketene as an intermediate of DNQ-5-sulfonic acid. Similar results were also reported by Barra et al. [13] and Andraos et al. [14]. Tanigaki and Ebbsen [15] observed the oxirene intermediate as a precursor of ketene, which was also confirmed by spectroscopic analysis in an Ar matrix at 22.3 K. It is still controversial whether ketocarbene is a reaction intermediate [9], whereas the existence of the ketene intermediate is confirmed. Because most of the attempts to detect the intermediates were performed in solution, further studies are needed to confirm the reaction intermediates in the resist film. Sheats [16] described reciprocal failure, intensity dependence on sensitivity, in DNQ–novolac resists with 364-nm exposure, which is postulated to involve the time-dependent absorbance of the intermediate ketene. 10.2.2 Improvement in Photoresist Performance 10.2.2.1 Novolac Resins
A group at Sumitomo Chemical has made a systematic study of novolac resin to improve performance of the positive resists [17–22]. It is generally accepted that a resist with high sensitivity gives low film-thickness retention of the unexposed area after development and low heat resistance (Figure 10.2). Novolac resins have been designed with a molecular structure and a molecular weight different from existing materials, although the control of synthesis in novolac resin is considered to be difficult due to poor reproducibility. Sumitomo Chemical investigated the relation between lithographic performance and the characteristics of novolac resins such as isomeric structure of cresol, the position of the
FIGURE 10.2 Trade-off relationships for various performances of a photoresist. Dotted line indicates the improvement of the performance. (From Hanabata, M., Uetani, Y., and Furuta, A., J. Vac. Sci. Technol. B, 7, 640, 1989.)
OH
(1) Molecular weight
CH2
n
CH3
(2) Isomeric structure of cresol OH
OH
OH
CH3 CH3 (o)
CH3 (p)
(m)
(3) Methylene bond position OH (o)
H3C
CH2
OH (m)
OH
OH
CH2
CH2
CH3
CH3
CH2 CH3
OH CH2
(p)
FIGURE 10.3 Factors of novolac resins that influence resist characteristics. (From Hanabata, M., Uetani, Y., and Furuta, A., J. Vac. Sci. Technol. B, 7, 640, 1989.)
CH3
(4) Molecular weight distribution
methylene bond, the molecular weight, and the molecular-weight distribution (Figure 10.3). To clarify the lithographic performance, measurements of the dissolution rates of unexposed and exposed resists were attempted. It is not always easy to judge improvement in contrast by g-value in exposure-characteristic curves where the remaining film thickness after development is plotted as a function of the logarithm of exposure dose. Because the difference in dissolution rate between exposed and unexposed resists is two to five orders of magnitude, it should be noted that the measurement of dissolution rate made the difference in resist performance clearer than that of the exposure characteristic curves used in their early work [17]. With increasing molecular weight, dissolution rate of novolac resin (Rn), unexposed (R0) and exposed (Rp) resist decreases as shown in Figure 10.4. Therefore, the contrast remains almost constant for changes in the molecular weight of the novolac resin [19]. When the para–meta ratio of novolac resin increases, dissolution rates of novolac resin (Rn) and the resist films of unexposed (R0) and exposed (Rp) resists decrease (Figure 10.5) [19]. However, the decrease in dissolution rate of an unexposed resist is larger than that of the exposed region for novolac resin with a high para–meta ratio. Therefore, the resist contrast was improved using the novolac resin with high para–meta ratio at the expense of its sensitivity. The reason for decrease in dissolution rate with high p-cresol content may be ascribed to high polymer regularity and rigidness leading to slow diffusion of the developer. Figure 10.6 shows the dependence of dissolution rate on the S4 value of m-cresol novolac resin, which represents the ratio of “unsubstituted carbon-4 in benzene ring of cresol to carbon-5,” which indicates the fraction of ortho–ortho methylene bonding to high-ortho bonding [19]. With increasing S4 values, the content of type-B structure increases in novolac resin. The dissolution rate of an unexposed resist (R0) shows a drastic decrease with increasing S4 value, as shown in Figure 10.6, whereas the dissolution rate of novolac
q 2007 by Taylor & Francis Group, LLC
Chemistry of Photoresist Materials
Dissolution rate (nm / s)
103
103
RP
102
102
RP Ro
Rn 101
509
101
RP Ro
100
Ro
5
10
100 20 ( 103)
15
Molecular weight
FIGURE 10.4 Effect of molecular weight of novolac resins on dissolution rates. The resists composed of novolac resins synthesized from m-cresol (80%) and p-cresol (20%) and 3HBP-DNQ ester, where 3HBP is 2,3,4-trihydroxybenzophenone. Rn, dissolution rate of novolac resins; R0, dissolution rate of unexposed film; Rp, dissolution rate of exposed (60 mJ/cm2) film. (From Hanabata, M., Uetani, Y., and Furuta, A., J. Vac. Sci. Technol. B, 7, 640, 1989.)
(Rn) resin decreases slightly. It should be noted that the dissolution rate of the exposed area remains constant for various S4 values. Therefore, a high contrast resist is obtained without sensitivity loss using a novolac resin with a high S4 value. Azo coupling of novolac resin with diazonaphthoquinone via base-catalyzed reaction during development, as shown in Scheme 10.3, can explain the difference in resist performance with different S4 values. High-ortho novolac has more vacant para positions compared with a normal novolac resin, and these vacant positions enhance the electrophilic azocoupling reaction.
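For reference, the contrast (γ) value mentioned above is conventionally extracted from the exposure-characteristic curve of a positive resist, and the dissolution-rate ratio provides the complementary, development-side measure of discrimination. In the usual textbook form (a general convention, not a definition specific to the Sumitomo work),

\gamma = \left[ \log_{10}\!\left( D_{100} / D_{0} \right) \right]^{-1}, \qquad \text{discrimination} = R_{\mathrm{p}} / R_{0},

where D100 is the dose that just clears the full resist thickness on development, D0 is the highest dose at which no thickness loss yet occurs, and Rp and R0 are the dissolution rates of the exposed and unexposed film, respectively.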
103
103
Rp Ro
Dissolution rate (nm/s)
102
102
Rp
Rp Ro
101
101
Rn
100
100
Ro
10−1 0
0
8
2
6
meta
4
4
6
2
8
para ratio
q 2007 by Taylor & Francis Group, LLC
0
10
FIGURE 10.5 Effect of meta:para cresol ratio in novolac resins on dissolution rates. The molecular weight of these novolac resins is almost the same. Rn, dissolution rate of novolac resins; R0, dissolution rate of unexposed film; Rp, dissolution rate of exposed (60 mJ/cm2) film. (From Hanabata, M., Uetani, Y., and Furuta, A., J. Vac. Sci. Technol. B, 7, 640, 1989.)
Microlithography: Science and Technology
510
CH
OH
OH
CH2
CH2
CH3 CH2
CH3
OH
OH CH2
CH3
CH3
CH3
"high-ortho " structure
CH3
A
OH CH2
CH2
B
103
103
Dissolution rate (nm/s)
Rp
FIGURE 10.6 Effect of content of “unsubstituted carbon-4 in benzene ring of cresol,” S4, in novolac resins synthesized from m-cresol on dissolution rates. The molecular weight of these novolac resins is almost the same. (From Hanabata, M., Uetani, Y., and Furuta, A., J. Vac. Sci. Technol. B, 7, 640, 1989.)
102
102
Rp
Rn
Ro Rp Ro
101
101 Ro
100
100
10 20 30 40 50 S4
OH
CH2
OH CH2
CH3 OH N N
O
OH CH2 CH3
CH3
N2 SO2 SO2
OH
O
O O
C
O
+
O SO2
CH2
Base
O
C
n
O
CH3
O SO2
(Soluble)
(Insoluble) N N
N2
O
H3C
OH H3C
H3C
CH2
CH2
SCHEME 10.3 Dissolution inhibition azo coupling reaction of DNQ–PAC with novolac resin.
q 2007 by Taylor & Francis Group, LLC
OH
CH2 OH
Chemistry of Photoresist Materials
511
104
104 Noo = 30 100 Novolak
Dissolution rate (nm/s)
103
103
Rp Ro
102
102
Rp 101
101
Rn 100
Rp Ro
100
Ro 10−1
10
FIGURE 10.7 Effect of molecular weight distribution of novolac resins on dissolution rates. The molecular weight of these novolac resins is almost the same. (From Hanabata, M., Uetani, Y., and Furuta, A., Proc. SPIE, 920, 349, 1988.)
−2
3
4
5
6
7
Mw Mn
The effect of the molecular weight distribution of novolac resin on resist performance is shown in Figure 10.7 [18]. The dissolution rate of novolac resin increases with increasing molecular weight distribution (Mw/Mn). The discrimination between exposed and unexposed area is large at a certain Mw/Mn value, indicating that the optimum molecular weight distribution gives high contrast resist. On the basis of their systematic studies on novolac resins, Hanabata et al. [19] proposed the “stone wall” model for a positive photoresist with alkali development as shown in Figure 10.8 [19]. In exposed parts, indenecarboxylic acid formed by exposure and
L
N
H
L
H
N
N
COOH
H
H H H L
H LN
N
ed pos (Ex ar t) p
H
L
L N
H
COOH
COOH
N
H
(U
N
ne x pa p o s r t) e d
H
H
N
H : High molecular weight novolak resin L : Low
H
H
H
"
N : NQD (Naphthoquinorediazide)
q 2007 by Taylor & Francis Group, LLC
Azocoupling reaction product
FIGURE 10.8 “Stone wall” model for development of positive photoresist. (From Hanabata, M., Uetani, Y., and Furuta, A., J. Vac. Sci. Technol. B, 7, 640, 1989.)
512
Microlithography: Science and Technology
low-molecular-weight novolac resin dissolve first into the developer. This increases the surface contact area of high-molecular-weight novolac with the developer, leading to dissolution promotion. In unexposed areas, an azo-coupling reaction of low-molecularweight novolac resin with DNQ retards the dissolution of low-molecular-weight resin. This stone-wall model gave clues to the design a high-performance positive photoresist in the authors’ following works. They extended the study on novolac resins by synthesizing from various alkyl (R)substituted phenolic compounds, including phenol, cresol, ethylphenol, butylphenol, and copolymers of these, were investigated to see the effect of their composition on the resist performance [20]. The requirements for resists were that the dissolution rate of exposed areas should be larger than 100 nm/s and the dissolution-rate ratio of exposed to unexposed area should be larger than 10. To meet the requirements, they proposed the selection principles of phenol compounds for novolac resin synthesis: (1) the average carbon number in substituent R per one phenol nucleus must be 0.5–1.5 (i.e., 0.5%[C]/ [OH]%1.5); (2) the ratio of para-unsubstitution R with respect to OH group must be 50%. The Sumitomo group tried to clarify the roles of individual molecular weight parts of novolac resin: low-molecular-weight (150–500), middle-molecular-weight (500–5000) and high-molecular-weight novolac (greater than 5000) in resist performance [21]. It was found that a novolac resin with a low content of middle-molecular-weight component shows high performance, such as resolution, sensitivity, and heat resistance. Then, “tandem type novolac resin” shown in Figure 10.9 was proposed. The advantage of tandem-type novolac resins can be explained again by the stone-wall model. The low-molecular-weight novolacs and DNQ molecules are stacked between high-molecular-weight novolacs. In exposed areas, dissolution of indenecarboxylic acid and low-molecular-weight novolacs promotes dissolution of high-molecular-weight novolacs due to an increase in surface contact with the developer. In unexposed areas, the azo-coupling reaction of DNQ compounds with low-molecular-weight novolacs retards the dissolution. Therefore, phenolic compounds can be used instead of low-molecular-weight novolacs if the compounds have moderate hydrophobicity and azo-coupling capability with DNQ.
Molecular weight 5000
H
M
500 150
L
Normal novolak resin
Area ratio (%) of GPC trooes
46–60 32–45 5–20 % % %
Tandem type novolak resin FIGURE 10.9 Gel-permeation chromatography traces of a normal novolac and “tandem type” novolak resin. (From Hanabata, M., Oi, F., and Furuta, A., Proc. SPIE, 1466, 132, 1991.)
q 2007 by Taylor & Francis Group, LLC
35–92 0–30 8–35 % % %
Chemistry of Photoresist Materials
513
Hanabata et al. [22] showed high-performance characteristics of the resists composed of phenolic compounds, high-molecular-weight novolacs, and a DNQ compound with high heat resistance. Studies of the effects of novolac molecular structures on resist performance have also been reported by several groups. Kajita et al. [23] of JSR investigated the effect of novolac structure on dissolution inhibition by DNQ. They found that the use of meta-methylsubstituted phenols, especially 3,5-dimethylphenol, is effective to obtain a higher ratio of intra/inter-molecular hydrogen bonds and the ratio can be controlled by selecting the phenolic monomer composition. Interaction between ortho–ortho units of novolac resin and DNQ moiety of the PAC plays an important role in dissolution inhibition in alkali development. It was found that naphthalene sulfonic acid esters without the diazoquinone moiety also showed dissolution inhibition. This dissolution-inhibition effect also depended upon the structure of the novolac resin, indicating the importance of interaction between the naphthalene moiety and the novolac resin. They proposed a host–guest complex composed of a DNQ moiety and a cavity or channel formed with aggregation of several ortho–ortho linked units, as shown in Figure 10.10, where the complex is formed via electrostatic interaction. Honda et al. [24] proposed the dissolution inhibition mechanism called “octopus-pot” of novolac–PAC interaction and the relationship between novolac microstructure and DNQ inhibitor. The mechanism involves two steps. The first step is a static molecular interaction between novolac and DNQ via macromolecular complex formation during spin-coating. A secondary dynamic effect during the development process enhances the dissolution inhibition via formation of cation complexes having lower solubility. The addition of DNQ to novolac caused the OH band to shift to a higher frequency (blue shift) in the IR spectra, which suggests a disruption of the novolac hydrogen bonding by the inhibitor and concomitant hydrogen bonding with the inhibitor. The magnitude of the blue shift increases monotonically with the dissolution inhibition capability as shown in Figure 10.11. An increase in p-cresol content in m/p-cresol novolacs leads to an increase in ortho–ortho bonding because ortho positions only are available for reaction on the p-cresol nucleus. The magnitude of the blue shift for the p-cresol trimer was found to be dependent on the DNQ concentration and goes through a maximum at a mole ratio (p-cresol:DNQ) of 18. This suggests that a complex involving six units of p-cresol trimer and one molecule of DNQ is formed, probably through intermolecular hydrogen bonding. The model of the
OH OH OH
OH
N2
OH
OH
OH
OH
HO
HO H O HO
O
OH
OH
OH OH
H OH O
OH
OH
OH OH
SO3
HO HO
Calixarene
Host-guest complex
Pseudo-cyclophane
FIGURE 10.10 “Host–guest complex” model for dissolution inhibition of a photoresist. (From Kajita, T., Ota, T., Nemoto, H., Yumoto, Y., and Miura, T., Proc. SPIE, 1466, 161, 1991.)
q 2007 by Taylor & Francis Group, LLC
Microlithography: Science and Technology
50
100 80
Inhibition
FIGURE 10.11 Correlation of dissolution inhibition and blue-shift with p-cresol content in m/p-cresol novolacs. The x-axis indicates the p-cresol content in the feed stock for novolac synthesis. The inhibition was defined as the ratio of dissolution rate of the synthesized novolac to that of unexposed resist that was formulated with this novolac and 4HBPDNQ ester, where 4HBP is 2,3,4,4 0 -tyetrahydroxybenzophenone (average esterification levelZ2.75). The ester content in solid film is 20 wt.%. (From Honda, K., Beauchemin, B. T., Jr., Hurditch, R. J., Blankeney, A. J., Kawabe, K., and Kokubo, T., Proc. SPIE, 1262, 493, 1990.)
30
60 40 20
10 0
20
40 60 P/P+m (%)
80
Blue-shift (cm−1)
514
100
macromolecular complex, the “octopus-pot” model, is schematically depicted in Figure 10.12. To improve the dissolution inhibition effect, Honda et al. [25] synthesized novolac resin with a p-cresol trimer sequence of novolac incorporated into a polymeric chain; they copolymerized m-cresol with a reactive precursor that was prepared by attaching two units of m-cresol to the terminal ortho position of p-cresol trimer. This kind of novolac can exhibit a higher degree of dissolution inhibition at a lower content of DNQ–PAC. Secondary dynamic effects of dissolution chemistry during development were investigated for a series of novolac resins with different structures with various quarternary ammonium hydroxides [26,27]. Phenolic resins used were (1) conventional m/p cresol (CON), (2) high-ortho, ortho m/p cresol novolac oligomer (hybrid pentamer: HP), (3) high-ortho, ortho m/p cresol made from polymerization of HP with m-cresol (HON), (4) novolac from xylenol feed stock (PAN), and (5) polyvinylphenol (PVP). UV spectral change as a function of dissolution time is shown in Figure 10.13. The absorbance at 282 nm decreases with decreasing novolac film thickness. A new absorption band appeared at 305 nm that was assigned to the cation complex between novolac and tetramethylammonium cation (Figure 10.14). The relation between cation complex formation rate and dissolution rate of HP is shown in Figure 10.15 for three types of quarternary ammonium hydroxide. The dissolution rate can be monitored by absorption at 282 nm of
PAC
O=S=O O O OO O OO OO N O O N O O O HO H H O O O
FIGURE 10.12 “Octopus-pot” model of macromolecular complex of ortho–ortho bonded novolac microstructure with DNQ–PAC. (From Honda, K., Beauchemin, B. T. Jr., Hurditch, R. J., Blankeney, A. J., Kawabe, K., and Kokubo, T., Proc. SPIE, 1262, 493, 1990.)
q 2007 by Taylor & Francis Group, LLC
O,O-bond Novolak
Chemistry of Photoresist Materials
1.0
515
282 1 305 3
0.5
4
0 250
300 Wavelength (nm)
O
H O
H
H
O
CH3
O H
H3C
H3 H3 C C H3C N CH3 ⊕
340
FIGURE 10.13 UV absorption spectral change of hybrid pentamer (HP see Fig. 15) film with development time with 0.262 N tetramethylammonium hydroxide solution. Development time: (1) 50 s; (2) 100 s, (3) 200 s; (4) 500 s. (From Honda, K., Beauchemin, B. T. Jr., Hurditch, R. J., Blankeney, A. J., and Kokubo, T., Proc. SPIE, 1672, 305, 1992; Honda, K., Blankeney, A. J., Hurditch, R. J., Tan, S., and Kokubo, T., Proc. SPIE, 1925, 197, 1993.)
O
Absorbance
2
H3 C
FIGURE 10.14 Schematic structure of tetramethylammonium ion complex of hybrid pentamer (HP, see Fig. 10.15). (From Honda, K., Beauchemin, B. T. Jr., Hurditch, R. J., Blankeney, A. J., and Kokubo, T., Proc. SPIE, 1672, 305, 1992; Honda, K., Blankeney, A. J., Hurditch, R. J., Tan S., and Kokubo, T., Proc. SPIE, 1925, 197, 1993.)
CH3
Complexation kc ×10−3 (s−1)
8 1
6
4 2 2 3 0 0
2 4 6 Dissolution kd ×10−3 (s−1)
CH3
CH3 H3C
N+ CH3
HOCH2CH2
CH3
C +
OH
OH
OH
C
CH3
CH3
CH3
CH3
CH3
OH CH2
H3C
N+ CH2 CH2 OH
2: DEDM+
1: TMA
HO
8
CH3
Hybrid pentamer
q 2007 by Taylor & Francis Group, LLC
N CH2 CH2 OH CH3
3: BEDM+
FIGURE 10.15 Correlation of cation complex formation rate to the dissolution rate with hybrid pentamer. (From Honda, K., Beauchemin, B. T. Jr., Hurditch, R. J., Blankeney, A. J., and Kokubo, T., Proc. SPIE, 1672, 305, 1992; Honda, K., Blankeney, A. J., Hurditch, R. J., Tan, S., and Kokubo, T., Proc. SPIE, 1925, 197, 1993.)
Microlithography: Science and Technology
516
10 1 1. CON 2. BON 3. PAN
% OD305
2 5 3
1.0 3 2 % OD282
FIGURE 10.16 Spectroscopic dissolution rate monitoring (SDRM) curves of various types of novolac films in 0.262 N TMAH solution. (1) CON, a conventional m/p-cresol novolac; (2) HON, a high ortho–ortho m/p-cresol novolac made from polymerization of hybrid pentamer (HP, see Fig. 10.15) with m-cresol; (3) PAN, a novolac made from xylenol feed stock. (From Honda, K., Beauchemin, B. T. Jr., Hurditch, R. J., Blankeney, A. J., and Kokubo, T., Proc. SPIE, 1672, 305, 1992; Honda, K., Blankeney, A. J., Hurditch, R. J., Tan, S., and Kokubo, T., Proc. SPIE, 1925, 197, 1993.)
0.5
0
1
50
100
150
Time (s)
the aromatic absorption band of novolac resin. The rate of the complex formation is approximately the same as the rate of dissolution when using a water rinse. This evidence supports cation diffusion as the rate-determining step for dissolution. The CON and HON novolacs relatively quickly build a high concentration of quarternary ammonium complex, while PAN shows slower formation of the complex despite of lower molecular weight than in CON novolac, as shown in Figure 10.16. Because the cation diffusion typically controls dissolution, polymer flexibility and microstructure exert a strong influence on cation diffusion rate. The effect of developer cations have also been studied by other workers [27,28]. The studies described above suggest that the dissolution behavior of novolac resins is quite important in the development of positive photoresists. To understand the dissolution mechanism, theoretical and experimental studies of dissolution behavior of phenolic resins have been reported [28–34]. 10.2.2.2 DNQ The effects of chemical structures of DNQ inhibitors on resist performance are addressed next. The effects should be investigated in correlation with novolac structures. DNQ compounds are usually synthesized by the esterification reaction of phenol compounds with DNQ sulfonyl chloride, and many DNQ–PACs are reported. Kishimura and coworkers [35] have reported on a dissolution-inhibition effect of DNQ–PACs derived from polyhydroxybenzophenones and several m-cresol novolac resins. The number of DNQ moieties in the resist film and the average esterification value of DNQ–PACs were the same for each type of ballast molecule. The distance between DNQ moieties in the DNQ–PAC and the degree of dispersion of DNQ moieties in the resist film are important to enhance the dissolution inhibition effect.
q 2007 by Taylor & Francis Group, LLC
Chemistry of Photoresist Materials
OD
517
OD
DO OD OD
DO
FIGURE 10.17 DNQ–PAC made from 3,3,3 0 3 0 -tetramethyl-1,1 0 -spiroindane-5,6,7,5 0 ,6 0 ,7 0 -hexol polyhydroxy ballast compound, where D is 2,1-diazonaphthoquinone-5sulfonyl.
Tan et al. [36] of Fuji Photo Film described DNQ-sulfonyl esters of novel ballast compounds. The PAC structure was designed to minimize the background absorption at 365 nm and to enable a resist formulation to be optimized with low PAC loading. They proposed 3,3,3 0 ,3 0 -tetrametyl-1,1 0 -spiroindane-5,6,7,5 0 ,6 0 ,7 0 -hexol (Figure 10.17) as a polyhydroxy ballast compound. Nemoto et al. [37] investigated the effect of DNQ proximity and the hydrophobicity of a variety of trifunctional PACs on the dissolution characteristics of positive photoresists. They found that the new index L!RT is linearly related to dissolution inhibition for various novolac resins, as shown in Figure 10.18, where L is the average distance between DNQ groups in PAC molecules estimated by molecular orbital (MO) calculations and RT is the retention time in reverse HPLC measurements. RT can be a measure of hydrophobicity of the PAC. Similar results were reported by Uenishi et al. [38] of Fuji Photo Film. They used model backbones without hydroxy groups and fully esterified DNQ–PACs. The inhibition capability was found to be correlated with the retention time on reverse phase HPLC, a measure of hydrophobicity and with the distance of DNQ moiety of DNQ–PACs. The Fuji Photo Film group extended their investigation of PAC structure effects on dissolution behavior to include the effects of the number of unesterified OH groups of DNQ–PAC on image performance [39]. PACs generally lose their inhibition capability with increasing number of unesterified OH groups compared to fully esterified PACs, whereas certain particular PACs still remained strongly inhibiting even when an OH was left unesterified. Such PACs lost inhibition when one more OH is left unesterified (Figure 10.19) and gave a large dissolution discrimination upon exposure, which results in a high-resolution resist. DNQ–PACs with steric crowding around OH groups seem to be the structural requirement in addition to hydrophobicity and remote DNQ configuration. The PAC provides good solubility in resist solvent. Hanawa et al. also proposed PACs obtained by selective esterification of OH groups of ballast molecules with sterically hindered OH groups (Figure 10.20) [40,41]. These PACs give higher sensitivity, g-value,
Dissolution inhibition
1000
100
10
1 0.5
5
10
100 L×R T
q 2007 by Taylor & Francis Group, LLC
1000
FIGURE 10.18 Influence of the resin structure on the relation between the dissolution inhibition and L!RT (see text). (B): 73MX35 novolac resin synthesized from 7/3 molar ratio of m-cresol to 3,5-xylenol; (6): 82MX35 novolac resin synthesized from 8/2 molar ratio of m-cresol to 3,5-xylenol; (,): 91MX35 novolac resin synthesized from 9/1 molar ratio of m-cresol to 3,5-xylenol; (>): 100M novolac resin synthesized from mcresol; (-): polyhydroxystyrene. (From Kishimura, S., Yamaguchi, A., Yamada, Y., and Nagata, H., Polym. Eng. Sci., 32, 1550, 1992.)
Microlithography: Science and Technology
518
104 PAC 7-2D
103
PAC 7-3D PAC 7-4D
Dissolution rate (Å/s)
102 PAC 7-5D
101
DO
100
OD
DO
DO
10−1
DO
OD
OH
OD OD OD
Me OD
OD OD OD
Me
10−2 FIGURE 10.19 Dissolution rate change with exposure dose for DNQ–PAC with different esterification degree. DNQ–PAC with hindered OH shows a high performance of photoresist.
10−3
100
101 102 103 Exposure energy (mJ/cm2)
104
and resolution than those of fully esterified PACs. Hindered OH groups are effective for scum-free development. Workers of IBM and Hoechst–Celanese reported a DNQ–PAC derived from phenolphthalein that showed advantageous scumming behavior; even at high defocus, the patterns did not web or scum as is usually observed [42]. It may be due to a basecatalyzed hydrolysis of a lactone ring that leads to a more soluble photoproduct, as shown in Scheme 10.4. The polyphotolysis model [43] for DNQ resists stimulated study on PAC molecules. The model suggests that more DNQ groups in a single PAC molecule improve the resist
O N2 D= SO2
OH
DO OD
DO
OH
DO OD
DO
OH OH OD OD
DO FIGURE 10.20 DNQ–PACs with steric hindrance OH groups obtained by selective esterification of OH groups with DNQsulfonylchlorides. (From Uenishi, K., Sakaguchi, S., Kawabe, Y., Kokubo, T., Toukhy, M. A., Jeffries, A. T., III, Slater, S. G., and Hurditch, R. J., Proc. SPIE, 1672, 262, 1992.)
q 2007 by Taylor & Francis Group, LLC
Chemistry of Photoresist Materials R–O
R–O
O–R
O
519
O–R
M+OH− +H2O
O
OH O−M+
O
O–R
M+OH− −RO−M+ −H2O
O
O−M+ O
SCHEME 10.4 Lactone-ring opening during alkali development.
contrast. Some results support the model but some do not. Hanawa et al. [40] reported that two DNQ in a PAC showed a contrast similar to that of three DNQ in a PAC. Uenishi et al. [39] also investigated the effect of the number of DNQ in a single PAC and number of OH groups unesterified. Although the number of DNQ in a PAC is important to improve the contrast in some cases, it is not simple because the distance of DNQ groups and hydrophobicity of PAC also affect the dissolution inhibition capability as described above. 10.2.3 Perspective of DNQ Resists As described above, enormous data of novolac resins and DNQ compounds have been accumulated. Several models such as the stone-wall model by the Sumitomo group, hostguest model by JSR and the octopus-pot model by OCG were proposed to explain the dissolution behavior of high-performance photoresists. There are some agreements and contradictions between them. The naphthalene sulfonyl group plays a major role in novolac dissolution inhibition for both reports by Honda et al. (OCG) [24] and Kajita et al. (JSR) [23]. It is generally accepted that high-ortho novolac shows a high inhibition effect by DNQ–PAC. For base-induced reactions, Hanabata et al. (Sumitomo Chemical) [19] reported azo coupling, whereas Honda et al. [24] reported that azo coupling cannot explain the high dissolution-inhibition effect for high-ortho novolac. However, the models proposed above are instructive to understand resist performance and give clues for the development of high-performance resists. Table 10.1 shows the summary of the Sumitomo group’s work on novolac resins. Although these results are impressive, they were obtained at certain conditions. For example, the S4 effect was investigated for novolacs obtained from m-cresol. Therefore, the effects of m/p-cresol, molecular weight, and molecular-weight distribution were not given. The effects of combinations of various novolac resins and DNQ–PAC on resist performance were not fully understood. Further study of these effects may improve the resist performance.
10.3 Chemical-Amplification Resist Systems

Deep-UV lithography is one of several competing candidates for future lithography aimed at resolution below 0.30 μm. DNQ-based positive photoresists are not suitable for deep-UV lithography because of both absorption and sensitivity limitations. The absorption of both novolac resins and DNQ–PACs is high and does not bleach at around 250 nm, resulting in resist profiles with severely sloping sidewalls. Much attention has therefore been focused on chemical-amplification resist systems, especially for deep-UV lithography. These resists
TABLE 10.1
Effect of Five Factors of Novolac Resins on Dissolution Rates and Resist Performance

Factors examined (effect of an increase in each): molecular weight (Mw); molecular-weight distribution (Mw/Mn); isomeric structure of cresol (para content); methylene bond position (S4); NQD/novolac ratio.
Properties evaluated: dissolution rate of the unexposed film (Rp); dissolution rate of the exposed film (R0); Rp/R0 ratio; sensitivity; resolution; heat resistance; film thickness retention.
Key to symbols: (0), improvement; (x), deterioration; (↑), increase; (↓), decrease; (→), no change.
are advantageous because of their high sensitivity, which is important because the light intensity of deep-UV exposure tools is lower than that of conventional i-line steppers. In chemically amplified resist systems, a single photoevent initiates a cascade of subsequent chemical reactions. The resists are generally composed of an acid generator, which produces acid upon exposure to radiation, and acid-labile compounds or polymers whose solubility in the developer is changed by acid-catalyzed reactions. As shown in Figure 10.21, the photogenerated acid catalyzes chemical reactions that change the solubility in a developer; the change from insoluble to soluble is shown. The quantum yield for the acid-catalyzed reaction is the product of the quantum efficiency of acid generation and the catalytic chain length. In chemically amplified resists, acid generation is the only photochemical event; therefore, it is possible to design the acid-labile base polymer with high transmittance in the deep-UV region. Because a small amount of acid can induce many chemical events, the yield of the acid-catalyzed reaction can remain high throughout the film, even when there is a concentration gradient of photogenerated acid along the film thickness. Another important feature of chemically amplified resist systems is the drastic polarity change resulting from the acid-catalyzed reaction, which avoids swelling during development and gives high contrast. Many chemical-amplification resist systems have been proposed since the first report by the IBM group [44,45]. Most are based on acid-catalyzed reactions [46–51], although some based on base-catalyzed reactions have also been reported [52,53]. In the following sections, acid generators and resists are classified by the acid-catalyzed reaction involved and discussed.
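The amplification described above can be illustrated with a short numerical sketch. The script below is not taken from the original text; the parameter values are hypothetical and chosen only to show how a modest acid-generation efficiency combined with a long catalytic chain length yields an effective quantum yield far above unity.

# Illustrative only: hypothetical numbers, not measured resist data.
def effective_quantum_yield(phi_acid, chain_length):
    """Effective quantum yield of the acid-catalyzed reaction:
    deprotection (or crosslinking) events per absorbed photon,
    taken as the product of the acid-generation quantum efficiency
    and the catalytic chain length."""
    return phi_acid * chain_length

phi_acid = 0.3       # assumed acid-generation quantum efficiency
chain_length = 800   # assumed acid-catalyzed events per acid molecule

print(effective_quantum_yield(phi_acid, chain_length))  # -> 240.0 events per photon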
FIGURE 10.21 Schematic representation of a positive chemically amplified resist (PAG + hν → H+; insoluble groups I are converted to soluble groups S).
10.3.1 Acid Generators
10.3.1.1 Onium Salts

The best-known acid generators are onium salts, such as the iodonium and sulfonium salts invented by Crivello [54]. The photochemistry of diaryliodonium and triarylsulfonium salts has been studied in detail by Dektar and Hacker [55–59]. The primary products formed upon irradiation of diphenyliodonium salts are iodobenzene, iodobiphenyl, acetanilide, benzene, and acid, as shown in Scheme 10.5 [55]. The reaction mechanism for product formation from diaryliodonium salts is shown in Scheme 10.6, where the bars indicate reactions in a cage. The mechanism for direct photolysis in solution can be described by three types of processes: in-cage reactions, cage-escape reactions, and termination reactions. The photolysis products are formed by heterolysis of the diaryliodonium salt to a phenyl cation and iodobenzene, and also by homolysis to a phenyl radical and an iodobenzene radical cation. Direct photolysis favors product formation by the heterolytic cleavage pathway. In-cage recombination produces iodobiphenyl, whereas cage-escape reactions produce iodobenzene, acetanilide, and benzene. Direct photolysis of triphenylsulfonium salts produces new rearrangement products, phenylthiobiphenyls, along with diphenylsulfide, as shown in Scheme 10.7 [56]. The reaction mechanism is shown in Scheme 10.8. Heterolytic cleavage gives a phenyl cation and diphenyl sulfide, whereas homolytic cleavage gives a singlet phenyl radical and diphenylsulfinyl radical-cation pair. These pairs of intermediates then produce the observed photoproducts by an in-cage recombination mechanism, leading to phenylthiobiphenyl. Diphenylsulfide is formed by direct photolysis either in-cage or in a cage-escape reaction. Other products are formed by cage-escape or termination reactions. The difference between photolysis of onium salts in the solid state and in solution has been studied from the viewpoint of in-cage and cage-escape reactions [58]. Photolysis of triphenylsulfonium salts in the solid state shows a remarkable counter-ion dependence, and a cage:escape ratio as high as 5:1 is observed, whereas the ratio in solution is about 1:1. In the solid state, cage-escape reactions with solvent and termination processes involving solvent cannot occur; because the environment is rigid, in-cage recombination processes are favored. Thus, for the non-nucleophilic MFn− anions (PF6−, AsF6−, SbF6−, etc.) and the triflate anion (CF3SO3−), in-cage recombination to give phenylthiobiphenyls predominates. Photolysis of onium salts in a polymer matrix, poly(tert-butoxycarbonyloxystyrene) (t-BOC–PHS), has also been studied [59]. Environments with limited diffusion favor the recombination reaction to yield in-cage products. t-BOC–PHS films are such an environment,
SCHEME 10.5 Photoproducts from direct photolysis of diphenyliodonium salts in CH3CN: iodobenzene, 2-, 3-, and 4-iodobiphenyl, acetanilide, benzene, and the acid HX.
SCHEME 10.6 Mechanism of product formation from direct photolysis of diphenyliodonium salts. The bars indicate in-cage reactions.
but there are fewer in-cage products than expected. This can be explained if sensitization of the onium salt by excited t-BOC–PHS occurs. McKean et al. [60,61] measured the acid-generation efficiency in several polymer matrices using the dye-titration method. The quantum yield from triphenylsulfonium salts in poly(4-tert-butoxycarbonyloxystyrene) (t-BOC–PHS) is lower than that in solution. The acid-generation efficiency from triphenylsulfonium salts in polystyrene, poly(4-vinylanisole), and poly(methyl methacrylate) was also measured. They pointed out that the compatibility of the sulfonium salt and the polymer matrix with respect to polarity affects the acid-generation efficiency.
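As a rough illustration of how such a dye-titration measurement can be reduced to a quantum yield, the sketch below converts an indicator absorbance change into moles of photogenerated acid via the Beer–Lambert law and divides by the number of absorbed photons. All numerical values (extinction coefficient, film thickness, dose, absorbance) are hypothetical and are not taken from the cited work.

# Hypothetical illustration of a dye-titration quantum-yield estimate.
h = 6.626e-34    # Planck constant (J s)
c = 2.998e8      # speed of light (m/s)
N_A = 6.022e23   # Avogadro's number (1/mol)

def photons_absorbed_per_cm2(dose_mj_cm2, wavelength_nm, absorbance):
    """Photons absorbed per cm^2 for a given incident dose and film absorbance."""
    photon_energy = h * c / (wavelength_nm * 1e-9)       # J per photon
    incident = (dose_mj_cm2 * 1e-3) / photon_energy      # photons/cm^2
    return incident * (1.0 - 10 ** (-absorbance))        # fraction absorbed

def acid_per_cm2(delta_abs, eps_indicator, film_thickness_cm):
    """Acid molecules per cm^2 from the indicator absorbance change:
    Beer-Lambert gives concentration = dA / (eps * l), converted to an areal density."""
    conc_mol_per_L = delta_abs / (eps_indicator * film_thickness_cm)
    return conc_mol_per_L * 1e-3 * film_thickness_cm * N_A  # molecules/cm^2

# Assumed example values (not experimental data):
absorbed = photons_absorbed_per_cm2(dose_mj_cm2=10.0, wavelength_nm=248.0, absorbance=0.3)
acid = acid_per_cm2(delta_abs=0.05, eps_indicator=4.0e4, film_thickness_cm=1.0e-4)

print("quantum yield ~", acid / absorbed)   # ~0.1 for these assumed values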
SCHEME 10.7 Photoproducts from direct irradiation of triphenylsulfonium salts (SH = solvent; for SH = CH3CN, Z = –NHCOCH3; for SH = CH3OH, Z = –OCH3; for SH = C2H5OH, Z = –OC2H5).
SCHEME 10.8 Mechanism of direct photolysis of triphenylsulfonium salts. The bars indicate in-cage reactions.
10.3.1.2 Halogen Compounds

Halogen compounds such as trichloromethyl-s-triazines have long been known as free-radical initiators for photopolymerization [46,62]. These halogen compounds have also been used as acid generators in acid-catalyzed solubilization compositions [63]. Homolytic cleavage of the carbon–halogen bond produces a halogen-atom radical; subsequent hydrogen abstraction results in the formation of a hydrogen-halide acid, as shown in Scheme 10.9 [64]. Calbrese et al. [65] found that the halogenated acid generator 1,3,5-tris(2,3-dibromopropyl)-1,3,5-triazine-(1H,3H,5H)-trione could be effectively sensitized to 365 and 436 nm
SCHEME 10.9 Mechanism of acid formation from trichloromethyl-s-triazines.
SCHEME 10.10 Acid-formation mechanism from halogen compounds via an electron-transfer reaction.
using electron-rich sensitizers. They proposed a mechanism for sensitization in these systems involving electron transfer from the excited sensitizer to the photoacid generator (Scheme 10.10). The energetics of the electron transfer is described by

ΔG = E(SH/SH+•) − E(RX−•/RX) − E00(SH−SH*) − C,   (10.1)

where ΔG is the free enthalpy, E(SH/SH+•) is the energy required to oxidize the sensitizer, E(RX−•/RX) is the energy required to reduce the acid generator, E00(SH−SH*) is the electronic energy difference between the ground-state and excited-state sensitizer, and C is the Coulomb interaction of the ion pair produced. Electrochemical redox potentials and spectroscopic data support this mechanism [65]. Based on similar data for p-cresol as a model for the phenolic resins in these resists, light absorbed by the resin when such resists are exposed to deep-UV may contribute to the sensitivity via electron transfer from the resin to the brominated isocyanurate to produce an acid.

10.3.1.3 o-Nitrobenzyl Esters

Houlihan and coworkers [66] have described acid generators based on 2-nitrobenzyl esters of sulfonic acids. As shown in Scheme 10.11, the mechanism of the photoreaction of a nitrobenzyl ester involves insertion of an excited nitro-group oxygen into a benzylic carbon–hydrogen bond. Subsequent rearrangement and cleavage generates
SCHEME 10.11 Mechanism of acid formation from o-nitrobenzyl esters.
FIGURE 10.22 Plot of lithographic sensitivity (mJ/cm²) vs. 1/(Φ × catalytic chain length × ABS/μm) for a series of substituted 2-nitrobenzyl benzenesulfonates, where Φ is the quantum yield of acid generation and ABS is the absorbance of the resist film. (From Houlihan, F. M., Neenan, T. X., Reichmanis, E., Kometani, J. M., and Chin, T., Chem. Mater., 3, 462, 1991.)
nitrosobenzaldehyde and sulfonic acid. They also studied the thermal stability and acid-generation efficiency of 2-nitrobenzyl benzenesulfonates with varying substituents [67,68]. The reciprocal of the product of the quantum yield of acid generation, the catalytic chain length, and the absorbance per micron was plotted against sensitivity (Figure 10.22). A reasonably linear plot over the whole range of esters is obtained, indicating that these three basic parameters determine the resist sensitivity [68].

10.3.1.4 p-Nitrobenzyl Esters

Yamaoka and coworkers [69] have described p-nitrobenzyl sulfonic acid esters, such as p-nitrobenzyl-9,10-diethoxyanthracene-2-sulfonate, as bleachable acid precursors. Photodissociation of the p-nitrobenzyl ester proceeds via intramolecular electron transfer from the excited singlet state of the 9,10-diethoxyanthracene moiety to the p-nitrobenzyl moiety, followed by heterolytic cleavage of the oxygen–carbon bond of the sulfonyl ester, as shown in Scheme 10.12 for the dimethoxyanthracene-2-sulfonate. This mechanism is supported by the detection, in laser spectroscopy, of transient absorptions assigned to the dimethoxyanthracene-2-sulfonate radical cation and the nitrobenzyl radical anion [70].

10.3.1.5 Alkylsulfonates

Ueno et al. have shown that tris(alkylsulfonyloxy)benzenes can act as photoacid generators upon deep-UV [71] and electron-beam irradiation [72]. Schlegel et al. [73] found that 1,2,3-tris(methanesulfonyloxy)benzene (MeSB) gives a high quantum yield (number of acid moieties generated per photon absorbed in a resist film) when used in a novolac resin. The quantum yield would be more than 10 when calculated on the basis of the number of photons absorbed only by the sulfonate. This strikingly high apparent quantum efficiency can be explained in terms of sensitization from the excited novolac resin to the sulfonate, presumably via an electron-transfer reaction, as shown in Scheme 10.13 [74]. This mechanism is supported by the following model experiment. When a resist composed of bisphenol A protected with t-butoxycarbonyl groups (t-BOC-BA), cellulose acetate as a base polymer, and MeSB is deprotected by photogenerated acid, the absorbance at 282 nm increases because of bisphenol A formation, which can therefore be used as a "detector" for the acid-catalyzed reaction. Because cellulose acetate shows no absorption at the exposure
SCHEME 10.12 Mechanism of product formation from direct photolysis of p-nitrobenzyl-9,10-dimethoxyanthracene-2-sulfonate.
wavelength (248 nm), there is no possibility of sensitization from the excited polymer to MeSB, and the absorbance change is negligible. On the contrary, when trimethylphenol (TMP) was added to the system as a model compound for a novolac resin, the deprotection reaction proceeded: the absorbance at 282 nm increases with exposure as well as with the TMP content, as shown in Figure 10.23. In addition, it was found that the spectral sensitivity resembles the absorption spectrum of the novolac resin (Figure 10.24), indicating sensitization by the novolac resin. The effect of chemical structure on acid-generation efficiency has been measured for various alkylsulfonates with a pyrogallol backbone (Figure 10.25) and for methanesulfonates of mono-, di-, and trihydroxybenzenes
SCHEME 10.13 Mechanism of sulfonic acid generation from alkylsulfonates by an electron-transfer reaction (sensitization by the excited phenolic resin ArH*).
and of their isomers [75]. The difference in the number of sulfonyl groups per benzene ring may affect the reduction potential or electron affinity of the sulfonates, leading to a change in the rate of the electron-transfer reaction. The quantum yield for the sulfonates is higher for
FIGURE 10.23 Absorption spectra of resists MeSB/TMP/t-BOC-BA/CA = 5/x/15/(80−x) (wt. ratio) after coating, exposure at doses from 0 to 450 mJ/cm², and baking at 80°C for 10 min, for TMP contents of 5.4, 10, and 15.2%. MeSB: 1,2,3-tris(methanesulfonyloxy)benzene; TMP: trimethylphenol; t-BOC-BA: bisphenol A protected with t-butoxycarbonyl; CA: cellulose acetate. Film thickness: ~1.5 μm. (From Schlegel, L., Ueno, T., Shiraishi, H., Hayashi, N., and Iwayanagi, T., Chem. Mater., 2, 299, 1990.)

FIGURE 10.24 Solid line: spectral sensitivity curve of the resist MeSB/t-BOC-BA/novolac (5/30/65%) in the deep-UV region; the ordinate corresponds to the logarithmic decrease of dose values. Dotted line: absorption spectrum of MeSB in a cellulose film (MeSB/CA = 10/90%, d = 1070 nm).
smaller alkyl groups. The acid-generation efficiency is higher for methanesulfonates derived from trihydroxybenzenes than for those derived from dihydroxybenzenes and monohydroxybenzenes.

10.3.1.6 α-Hydroxymethylbenzoin Sulfonic Acid Esters

Roehert et al. [76] described α-hydroxymethylbenzoin sulfonic acid esters as photoacid generators. Irradiation of the sulfonic acid ester populates an excited triplet state, which fragments via α-cleavage (Norrish type I) into two radical intermediates, as shown in Scheme 10.14 [76,77]. The benzoyl radical is stabilized via hydrogen abstraction to yield a (substituted) benzaldehyde almost quantitatively. The second radical intermediate is stabilized via cleavage of the carbon–oxygen bond linked to the sulfonyl moiety (–CH2–OSO2–), forming the respective acetophenone and sulfonic acid. The photoacid-generating efficiency of this type of sulfonate has been compared with those of bis(arylsulfonyl)diazomethane (BAS-DM), 1,2,3-tris(methanesulfonyloxy)benzene (MeSB), and 2,1-diazonaphthoquinone-4-sulfonate (4-DNQ) using the tetrabromophenol blue indicator technique. The order of acid-generating efficiency is BAS-DM > α-hydroxymethylbenzoin sulfonic acid esters ≈ MeSB > 4-DNQ.
FIGURE 10.25 Exposure characteristic curves (film thickness remaining, μm, vs. exposure dose, mJ/cm²) for resists containing pyrogallol trisulfonates with different sulfonyl groups (R = phenyl, tosyl, naphthyl, methyl, ethyl, propyl, butyl). Novolac resin/t-BOC-BA/sulfonate = 100/13.3/1.65 (mole ratio). (From Ueno, T., Schlegel, L., Hayashi, N., Shiraishi, H., and Iwayanagi, T., Polym. Eng. Sci., 32, 1511, 1992.)
SCHEME 10.14 Photochemical reaction mechanism of α-hydroxymethylbenzoin sulfonic acid esters (Norrish type I cleavage giving substituted benzaldehydes, substituted acetophenones, and sulfonic acids).
10.3.1.7 α-Sulfonyloxyketones

α-Sulfonyloxyketones were reported as acid generators for chemical-amplification resist systems by Onishi et al. [78]. The photochemistry of α-sulfonyloxyketones is shown in Scheme 10.15 [77]; p-toluenesulfonic acid is liberated following photoreduction.

10.3.1.8 Diazonaphthoquinone-4-sulfonate (4-DNQ)

1,2-Diazonaphthoquinone-4-sulfonate can be used as an acid generator [79–81]. The photochemistry of 1,2-diazonaphthoquinone-4-sulfonate shown in Scheme 10.16 was proposed by Buhr et al. [79]. The reaction is expected to follow the classical pathway from the diazonaphthoquinone via the Wolff-rearranged ketene to the indene carboxylic acid. In polar media, possibly with proton catalysis, the phenol ester moiety can be eliminated, leading to a sulfene that adds water to generate the sulfonic acid. This reaction mechanism was also supported by Vollenbroek et al. [10]. 4-DNQ acid generators have been used for acid-catalyzed crosslinking in image-reversal resists [79,80] and for acid-catalyzed deprotection reactions [81].
SCHEME 10.15 Photochemical reaction mechanism of α-sulfonyloxyketones.
10.3.1.9 Iminosulfonates

Iminosulfonates, which produce sulfonic acids upon irradiation, are generally synthesized from a sulfonyl chloride and an oxime derived from a ketone. Shirai and Tsunooka proposed the reaction mechanism shown in Scheme 10.17 [82,83]. Upon irradiation with UV light, cleavage of the –O–N= bond of the iminosulfonate and subsequent abstraction of hydrogen atoms lead to the formation of sulfonic acids, accompanied by the formation of azines, ketones, and ammonia.

10.3.1.10 N-Hydroxyimidesulfonates

The photodecomposition of N-tosylphthalimide (PTS) [84] to give an acid is outlined in Scheme 10.18 [85,86]. Irradiation with deep-UV light leads to homolytic cleavage of the N–O bond, giving a radical pair. The pair can either collapse to regenerate the starting PTS or undergo cage escape. Hydrogen abstraction from the polymeric matrix by the tosyloxy radical gives toluenesulfonic acid. An alternative mechanism involves electron transfer from the excited-state polymer or another aromatic species to PTS to form a
SCHEME 10.16 Sulfonic acid formation from direct photolysis of DNQ-4-sulfonates.
SCHEME 10.17 Photochemistry of iminosulfonates.
radical-cation–radical-anion pair. The PTS radical anion is protonated and then decomposes homolytically to give the tosyloxy radical. As in the mechanism for direct photolysis, the sulfonyloxy radical abstracts a hydrogen atom from the medium to give the protic acid.
SCHEME 10.18 Mechanism of sulfonic acid formation from photolysis of N-hydroxyimidesulfonates: (A) direct photolysis of PTS; (B) photoinduced electron transfer.
10.3.1.11 α,α′-Bisarylsulfonyl Diazomethanes

Pawlowski and coworkers [87] have reported nonionic acid generators, α,α′-bisarylsulfonyl diazomethanes, that generate sulfonic acids upon deep-UV irradiation. They analyzed the photochemical products of α,α′-bis(4-t-butylphenylsulfonyl)diazomethane in acetonitrile/water solution by HPLC and proposed the photochemical reaction mechanism shown in Scheme 10.19 [87,88]. In this mechanism, the carbene intermediate formed upon photolytic loss of nitrogen rearranges to a highly reactive sulfene, which then adds water present in the solvent mixture to give the sulfonic acid (8a). The main product is 4-t-butylphenyl thiosulfonic acid 4-t-butylphenyl ester (4a), probably formed via elimination of nitrogen and carbon dioxide from the parent compound. Although 4a does not contribute to the acid-catalyzed reaction, thiosulfonic acid esters are known to undergo a thermally induced, acid- or base-catalyzed decomposition to yield the respective disulfones and sulfinic acids.

10.3.1.12 Disulfones

The reaction mechanism for disulfones is shown in Scheme 10.20 [89]. Homolytic cleavage of the S–S bond is followed by hydrogen abstraction, yielding two equivalents of sulfinic acid. The quantum yields for photolysis of disulfone compounds in THF solution were determined to be in the range of 0.2–0.6, depending on the chemical structure.

10.3.2 Acid-Catalyzed Reactions

10.3.2.1 Deprotection Reaction

Protection of reactive groups is a general method in synthesis [90]. Several protecting groups have been used for the hydroxy group of polyhydroxystyrene.
SCHEME 10.19 Photodecomposition mechanism of α,α′-bisarylsulfonyl diazomethanes.
SCHEME 10.20 Photodecomposition mechanism of disulfones.
10.3.2.1.1 t-BOC Group

The IBM group reported resist systems based on acid-catalyzed thermolysis of the side-chain t-BOC protecting group [44]. A resist formulated with poly(p-t-butoxycarbonyloxystyrene) (t-BOC–PHS) and an onium salt as a photoacid generator has been described [44,45]. The acid produced by photolysis of the onium salt catalyzes acidolysis of the t-BOC group, which converts t-BOC–PHS to poly(hydroxystyrene) (PHS), as shown in Scheme 10.21. Because the photogenerated acid is not consumed in the deprotection reaction, it serves only as a catalyst; this concept is called chemical amplification. The exposed area is converted to a polar polymer that is soluble in polar solvents such as alcohols or aqueous bases, whereas the unexposed area remains nonpolar. This large difference in polarity between exposed and unexposed areas gives a large dissolution contrast in the developer and allows either negative or positive images depending on the developer polarity: development with a polar solvent selectively dissolves the exposed area to give a positive-tone image, whereas development with a nonpolar solvent dissolves the unexposed area to give a negative-tone image. Because this resist is based on a polarity change, it shows none of the swelling encountered in gel-formation-type resists such as cyclized-rubber-based negative resists; swelling causes pattern deformation and limits resolution. Because the tertiary-butyl ester group is sensitive to AAL1-type hydrolysis (acid-catalyzed unimolecular alkyl cleavage) [91] in a reaction that does not require a stoichiometric amount of water, related reactions of the t-butyl group, described in Scheme 10.22, have been applied to resist systems [92,116].
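The catalytic nature of the t-BOC deprotection can be illustrated with a minimal kinetic sketch. This is not the published model for any specific resist; it simply assumes first-order deprotection in both the acid concentration and the remaining protected groups during the postexposure bake, with a hypothetical rate constant and bake time, to show how a small, unconsumed acid concentration can remove most of the protecting groups.

import math

def fraction_deprotected(acid_conc, rate_const, bake_time_s):
    """Fraction of t-BOC groups removed during the postexposure bake,
    assuming d[P]/dt = -k*[H+]*[P] with the acid acting purely as a catalyst
    (its concentration is not consumed)."""
    return 1.0 - math.exp(-rate_const * acid_conc * bake_time_s)

# Hypothetical values: a small acid concentration, an assumed rate constant,
# and a 60-s bake.
acid_conc = 0.005     # arbitrary concentration units
rate_const = 20.0     # 1/(concentration units * s), assumed
bake_time_s = 60.0

print(fraction_deprotected(acid_conc, rate_const, bake_time_s))  # ~0.998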
SCHEME 10.21 Chemically amplified resist using the acid-catalyzed t-BOC deprotection reaction.
SCHEME 10.22 Acid-catalyzed deprotection (AAL1 acidolysis) for polarity change.
Workers at AT&T proposed a series of copolymers of tert-butoxycarbonyloxystyrene and sulfur dioxide prepared by radical polymerization [93]. The sulfone formulation exhibits both improved sensitivity and high contrast; chain degradation of the matrix polymer is thought to contribute to the improved sensitivity of the chemically amplified resist. They also prepared a new terpolymer, poly[(tert-butoxycarbonyloxy)styrene-co-acetoxystyrene-co-sulfone], and found that the acetoxy group can be cleaved from the acetoxystyrene units in aqueous base, as shown in Scheme 10.23 [94]. This cleavage occurs
SCHEME 10.23 Acid-catalyzed deprotection reaction during postexposure baking and base-catalyzed cleavage during development.
only when sulfone is incorporated into the copolymer in the appropriate amount. This base-catalyzed cleavage during development is expected to enhance the solubility of the exposed area, where the t-BOC group has been removed by acid. Incorporation of 50 wt.% acetoxystyrene into the polymer reduces the weight loss in the solid film by approximately 18%, whereas poly[(tert-butoxycarbonyloxy)styrene sulfone] shows a 40% loss; the reduced weight loss is advantageous for adhesion. A disadvantage of onium salts such as diaryliodonium and triarylsulfonium salts is their strong dissolution inhibition in alkaline development. To improve the solubility of onium salts in the developer, Schwalm et al. [95] proposed a unique photoacid generator that is completely converted to phenolic products upon irradiation, as shown in Scheme 10.24. They synthesized sulfonium salts containing an acid-labile t-BOC protecting group in the same molecule. After irradiation, unchanged hydrophobic initiator and hydrophobic photoproducts are present in addition to the acid, but upon thermal treatment the acid-catalyzed reaction converts all of these compounds to phenolic products, resulting in enhanced solubility in aqueous base. Kawai et al. [96] described a chemically amplified resist composed of partially t-BOC-protected monodisperse PHS as a base polymer, t-BOC-protected bisphenol A as a dissolution inhibitor, and a photoacid generator. They reported that the use of a monodisperse polymer and optimization of the t-BOC protection degree improved the resolution capability and the surface-inhibition behavior. Three-component resists using t-BOC-protected compounds as dissolution inhibitors in combination with a photoacid generator and a phenolic resin have been reported by several groups [97–100]. Aoai et al. [101] systematically investigated the effect of the chemical structure of the backbone of t-BOC compounds on their dissolution-inhibition capability; structural effects similar to those reported for DNQ compounds were observed. t-BOC compounds with large distances between t-BOC groups and high hydrophobicity show strong dissolution-inhibition effects.
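To see why removing the t-BOC group causes the large film weight loss mentioned above for these copolymers, the following back-of-the-envelope sketch estimates the theoretical mass fraction lost as CO2 and isobutylene when t-BOC–PHS is fully deprotected, and how diluting the t-BOC monomer with a nonreactive comonomer reduces that loss. The molar masses are standard values; treating the copolymer on a simple weight-fraction basis (and ignoring the sulfone units) is an assumption made here purely for illustration.

# Rough illustration of deprotection weight loss; not data from the cited studies.
MW_TBOC_STYRENE = 220.3   # g/mol, t-butoxycarbonyloxystyrene repeat unit
MW_CO2 = 44.0             # g/mol
MW_ISOBUTYLENE = 56.1     # g/mol

# Fraction of a t-BOC-styrene unit lost as volatile CO2 + isobutylene.
loss_per_unit = (MW_CO2 + MW_ISOBUTYLENE) / MW_TBOC_STYRENE   # ~0.45

def film_weight_loss(tboc_weight_fraction):
    """Approximate film weight loss when only the t-BOC units lose mass."""
    return tboc_weight_fraction * loss_per_unit

print(film_weight_loss(1.0))   # t-BOC homopolymer: ~45% theoretical loss
print(film_weight_loss(0.5))   # 50 wt.% t-BOC monomer: ~23% loss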
RO
OR X
base insoluble
hν OR
(S < 1)
RO
s
OR
+
x
(OR) + RO
s
RO
s
+ H X
Δ
HO
OH + HO
s
S
X
R = acid labile group SCHEME 10.24 Reaction mechanism of t-BOC-protected acid generator.
FIGURE 10.26 Concept of dissolution-rate enhancement during development. t-BOC compounds that show dissolution inhibition are deprotected by the acid-catalyzed reaction; the deprotected compounds with a lactone ring are then cleaved by a base-catalyzed reaction during development, leading to dissolution enhancement of the exposed region. (From Kihara, N., Ushirogouchi, T., Tada, T., Naitoh, T., Saitoh, S., and Sasaki, O., Proc. SPIE, 1672, 194, 1992.)
Workers at Toshiba [102], Nippon Kayaku [103], and NTT [104] reported 1-(3H)-isobenzofuranone derivatives protected with t-BOC as a new type of dissolution inhibitor. The concept of this resist system is shown in Figure 10.26. These inhibitors are decomposed by an acid-catalyzed thermal reaction; in addition, the lactone ring of the decomposed products is cleaved by a base-catalyzed reaction in the developer, which may enhance the dissolution rate of the exposed area.

10.3.2.1.2 THP Group

The tetrahydropyranyl (THP) group can be used as a protecting group for PHS [63,105]. The deprotection reaction of the THP group has been investigated in detail by Sakamizu et al. [106]; the proposed mechanism is shown in Scheme 10.25. A proton first attacks the
SCHEME 10.25 Acid-catalyzed deprotection reaction of tetrahydropyranyl-protected polyhydroxystyrene (THP-M).
FIGURE 10.27 Dissolution rate (nm/s) of THP (tetrahydropyranyl)-protected polyhydroxystyrene, THP-M, as a function of THP protection degree (0–80%) for various developers: 5% NMD-3, 2.38% NMD-3, and 2.3% NMD-3/n-propanol = 7/1. NMD-3 is a tetramethylammonium hydroxide aqueous solution. (From Hattori, T., Schlegel, L., Imai, A., Hayashi, N., and Ueno, T., Opt. Eng., 32, 2368, 1993.)
phenolic oxygen to produce PHS and a carbocation, 1. Carbocation 1 can react with water from the atmosphere or trapped in the resin to give 2-hydroxytetrahydropyran, or it can lose a proton to give 3,4-dihydropyran. 2-Hydroxytetrahydropyran is a hemiacetal and is in equilibrium with 5-hydroxypentanal. Fully THP-protected PHS (THP-M) suffered from poor developability in aqueous base when used as a base polymer. The effect of the deprotection degree on dissolution rate was investigated by Hattori et al. [107]. As shown in Figure 10.27, a 30% protection degree is enough to make the dissolution rate in alkaline developer negligible; note, therefore, that deprotection from 100 to 30% induces no change in dissolution in 2.38% tetramethylammonium hydroxide solution. Optimization of the protection degree can provide alkali-developable two-component resists for KrF lithography (Figure 10.28). As shown in Figure 10.29, it is difficult to deprotect the THP groups completely for both highly (92%) and lightly (20%) protected THP-M. It should be noted that 20% THP-protected THP-M gives a lower residual protection degree in the fully exposed region than 92% THP-protected THP-M; therefore, THP-M with a low protection degree is expected to reduce surface inhibition because of the high yield of alkali-soluble hydroxy groups. Other polymers incorporating the THP protecting group have been reported. Taylor et al. [108] evaluated copolymers of benzyl methacrylate and tetrahydropyranyl methacrylate as deep-UV resists. Kikuchi and coworkers [109] described copolymers of styrene and tetrahydropyranyl methacrylate. Terpolymers of N-hydroxybenzylmethacrylamide, tetrahydropyranyloxystyrene, and acrylic acid were applied to deep-UV resists, which showed good adhesion to silicon substrates and high glass-transition temperatures [110].
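A small arithmetic sketch ties together the two observations above: the roughly 30% protection threshold for negligible dissolution (Figure 10.27) and the incomplete deprotection at full exposure (Figure 10.29). The conversion values used below are hypothetical, chosen only to show how the initial protection degree determines whether the exposed film crosses the dissolution threshold.

DISSOLUTION_THRESHOLD = 30.0   # % protection above which dissolution is negligible (cf. Fig. 10.27)

def residual_protection(initial_pct, conversion):
    """Residual THP protection degree (%) after acid-catalyzed deprotection,
    assuming a fractional conversion of the initially protected sites."""
    return initial_pct * (1.0 - conversion)

# Hypothetical deprotection conversions at full exposure (not measured values):
for initial in (92.0, 20.0):
    for conversion in (0.5, 0.8):
        r = residual_protection(initial, conversion)
        soluble = r < DISSOLUTION_THRESHOLD
        print(f"initial {initial}% -> residual {r:.1f}% at conversion {conversion}; soluble: {soluble}")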
FIGURE 10.28 Line and space patterns of 0.3 μm using a resist composed of partially THP-protected polyhydroxystyrene (THP-M) and an onium salt. (From Hattori, T., Schlegel, L., Imai, A., Hayashi, N., and Ueno, T., Opt. Eng., 32, 2368, 1993.)
FIGURE 10.29 Change in THP protection degree with exposure dose for THP-M of low protection degree (20%) and high protection degree (92%). The deprotection degree was determined by IR after postexposure baking. (From Hattori, T., Schlegel, L., Imai, A., Hayashi, N., and Ueno, T., Opt. Eng., 32, 2368, 1993.)
10.3.2.1.3 Trimethylsilyl Group

Early work on the trimethylsilyl group as a protecting group was reported by Cunningham and Park [111]. Yamaoka et al. [112] made preliminary measurements of the rate of acid-catalyzed hydrolysis for a series of alkylsilylated and arylsilylated phenols; the order of the rates is shown in Figure 10.30. The rate of hydrolysis appears to be governed by the steric hindrance of the trisubstituted silyl group rather than by electronic induction effects, because no obvious correlation between the rate of hydrolysis and the Hammett values of the silylating substituents is observed. Among the silylating substituents studied, the trimethylsilyl group was chosen as a protecting group for polyhydroxystyrene (PHS) because of its high rate of hydrolysis and good stability, although the highest rate of hydrolysis was obtained for the dimethylsilyl group. They reported a resist using trimethylsilyl-protected PHS combined with a p-nitrobenzylsulfonate as the acid generator [113].

10.3.2.1.4 Phenoxyethyl Group

The phenoxyethyl group was proposed as a protecting group for PHS by Jiang and Bassett [114]. As shown in Scheme 10.26, phenol is produced from the protecting group via acid-catalyzed cleavage, along with regeneration of PHS. Because phenol is very soluble in aqueous base, it
FIGURE 10.30 The order of acid-catalyzed hydrolysis rates for alkyl- and arylsilylated phenols.
SCHEME 10.26 Acid-catalyzed reaction of poly[4-(1-phenoxyethoxy)styrene] (the phenoxyethyl group acts as a masked dissolution promoter).
acts as a dissolution promoter in the exposed areas. In addition, phenol is not volatile under lithographic conditions, resulting in a smaller film-thickness loss than in the t-BOC system.

10.3.2.1.5 Cyclohexenyl Group

Poly[4-(2-cyclohexenyloxy)-3,5-dimethylstyrene] has been prepared for a dual-tone imaging system [115]. Poly[4-(2-cyclohexenyloxy)styrene] is less attractive because of the occurrence of some Claisen rearrangement and other side reactions, as shown in Scheme 10.27. In contrast, the polymer with methyl groups in the ortho positions is deprotected easily because the side reactions are limited by blocking of the reaction site.

10.3.2.1.6 t-Butoxycarbonylmethyl Group

It is generally accepted that a carboxylic acid is a stronger dissolution promoter than the hydroxy group of a phenol, so a deprotected polymer containing carboxylic acid groups is expected to act as a dissolution accelerator in aqueous base [116]. However, poly(vinylbenzoic acid) shows high absorbance at 248 nm. To avoid the strong absorption of
SCHEME 10.27 Acid-catalyzed reaction of cyclohexenyl-protected polyhydroxystyrene and methyl-substituted polyhydroxystyrene.
SCHEME 10.28 Chemical structure of BCM-PHS and its acid-catalyzed thermolysis.
the benzoyl group, Onishi et al. [117] reported a partially t-butoxycarbonylmethyl-protected PHS, in which a methylene group is interposed between the phenyl ring and the carboxylic acid. The reaction mechanism is shown in Scheme 10.28. Another advantage is that t-butoxycarbonylmethyl-protected PHS, unlike t-BOC-protected PHS, shows no phase separation in the film after development.

10.3.2.2 Depolymerization

10.3.2.2.1 Polyphthalaldehyde (PPA)

Ito and Willson [118] reported acid-catalyzed depolymerization of polyaldehydes in the first stage of chemical-amplification resist development. Polyphthalaldehyde is classified as an O,O-acetal, a class described further below. The polymerization of aldehydes is known to be an equilibrium process with a low ceiling temperature (Tc); above Tc, the monomer is thermodynamically more stable than its polymer. During polymerization at low temperature, end-capping by alkylation or acylation terminates the equilibrium process and renders the polymer stable to approximately 200°C. Although polymers of aliphatic aldehydes are highly crystalline substances that are not soluble in common organic solvents, polyphthalaldehyde (PPA) is a noncrystalline material that is highly soluble and can be coated to give clear, isotropic films of high quality. The Tc of PPA is approximately −40°C. A resist composed of PPA and an onium salt can give positive-tone images without any subsequent processing; this is called self-development imaging. As shown in Scheme 10.29, the acid generated from the onium salt catalyzes cleavage of the main-chain acetal bond. Once the bond is cleaved above the ceiling temperature, the material spontaneously depolymerizes to monomers. PPA can also be used as a dissolution inhibitor for novolac resin in a three-component system; the photogenerated acid induces depolymerization of PPA, resulting in a loss of dissolution-inhibition capability and a positive image. Although PPA materials are sensitive self-developing resists, they have drawbacks, such as the liberation during exposure of volatile materials that could damage the optics of exposure tools, and poor dry-etching resistance. Ito et al. [119,120] found that
SCHEME 10.29 Acid-catalyzed depolymerization of polyphthalaldehyde (PPA).
poly(4-chlorophthalaldehyde) does not spontaneously depolymerize upon exposure to radiation; it requires a postexposure bake step to obtain a positive relief image, which avoids damage to the optics. This system is called thermal development, as distinguished from self-development.

10.3.2.2.2 O,O- and N,O-Acetals

Workers at Hoechst AG reported three-component chemical-amplification resist systems using acid-catalyzed depolymerization of O,O- and N,O-acetals [121,122]. The reaction mechanisms are shown in Scheme 10.30 and Scheme 10.31, respectively. The poly(N,O-acetal) is protonated at the oxygen atom and liberates an alcohol, XOH. The formation of the carbocation intermediate is the rate-limiting step in hydrolysis, and its stability is influenced by the mesomeric and inductive effects of the substituents R1 and R2. It is noteworthy that the liberation of XOH decreases the molecular weight of the inhibitor and that cleavage products such as alcohols and aldehydes are strong dissolution promoters. Because novolac resin suffers from strong absorption at 248 nm, the authors developed a methylated polyhydroxystyrene, poly(4-hydroxystyrene-co-3-methyl-4-hydroxystyrene), as a base resin that shows a reasonable dissolution-inhibition effect [123–125].

10.3.2.2.3 Polysilylether

Silicon polymers containing silyl ether groups in the main chain are hydrolyzed by acid and degraded to low-molecular-weight compounds [126,127]. Such a polymer can be used as a dissolution inhibitor for a novolac resin in a three-component system. As shown in Scheme 10.32, the decomposition of the polymer involves protonation of the oxygen, nucleophilic attack of water at the Si atom, cleavage of the Si–O bond, and regeneration of the proton, resulting in depolymerization and loss of inhibition capability. Investigation of the effect of the chemical structure around the silyl ether groups on the hydrolysis rate indicated that the rate decreases with increasing bulkiness of the alkyl and alkoxy groups.

10.3.2.2.4 Polycarbonate and Others

Frechet et al. [128–133] have designed, prepared, and tested dozens of new imaging materials based on polycarbonates, polyethers, and polyesters, which are all susceptible
SCHEME 10.30 Acid-catalyzed depolymerization of acetals, leading to loss of their dissolution-inhibition capability.
SCHEME 10.31 Acid-catalyzed depolymerization of N,O-acetals (catalytic cycle with liberation of XOH).