CS250 VLSI Systems Design


CS250 VLSI Systems Design
Fall 2012
John Wawrzynek, Jonathan Bachrach
with Krste Asanovic, John Lazzaro and Rimas Avizienis (TA)

Why CS250 and not EE250?
- Put IC design expertise into the hands of those best qualified to take advantage of its potential: those with intimate knowledge of computation and algorithms, computer scientists!
- Traditionally, and often still today, IC design is stratified:
    Algorithm / architecture
    Microarchitecture
    Circuit design
    Layout
- A better option is the "tall thin" designer, who spans all levels of the design and implementation stack. This leads to more successful innovation and highly optimized designs.

Enabling System Architects
Managing complexity is the key challenge. Manipulating multiple levels of design complexity is difficult and continually getting worse (remember Moore's Law).
Approach:
1. Borrow ideas from software (hierarchy, libraries, design patterns, ...)
2. Focus on design representations
3. Practice using computer-aided design tools
4. Work in the context of some application domain
5. Access to silicon foundries for fabrication

Course Format (1)
- The new CS250 (as of Fall 2009): VLSI design for system architects.
- Focus on common ASIC design methodology: RTL synthesis and standard cell implementation. No transistor-level layout.
- Back to a design-centric course. Learn by doing.
- Requires a lot of infrastructure setup (thanks to Yunsup Lee, Brian Zimmer, Brian Richards).
- The entire class worked on implementing RISC processors: many variations on a theme.
- This semester the focus is on image processors; more details later.

Course Format (2)
Most closely related courses:
- CS 150: undergraduate digital design. Prerequisite.
- CS 152/252: Computer Architecture / Microarchitecture.
- EE 141/242: Transistor-level circuits and layout.
- EE 244: Computer Aided Design of ICs (CAD algorithms).
Course theme: how do we get the best design results from the standard design flow, using tradeoffs in area/performance/energy and exploring micro-architectural alternatives?

Course Structure
Check the website Calendar/Info for details.
- Weeks 1-7: lectures on fundamentals of ASIC design; lab exercises to learn the CAD tools.
- Weeks 8-14: project-related activities; project group presentations (proposal, progress, final report); "private project meetings", in which the instructors meet in private with groups.
- Grading: 5% class participation, 25% labs, 70% project.
- Please, no laptop/iPad/handheld use in class. We will have a short break midway through each class so you can catch up on email, etc.

Some Important Tentative Dates
- Lab 1 Due: Sep 10 (Monday)
- Lab 2 Due: Sep 24
- Brief Oral Project Proposal: Oct 4
- Written Project Proposal Due: Oct 8
- Lab 3 Due: Oct 15
- Project Final Presentations: Dec 7 (RRR week)
- Final Project Report: Dec 12, midnight
These are all hard deadlines, so please budget your time accordingly. There is a total of 4 late days for labs. We will do all we can to help you make the deadlines.

More Course Details
- Discussion section TBA. Very important for tips on doing the labs and project.
- You will need to get a named instructional account to log onto our servers, which have the CAD tools installed.
- Piazza for all Q/A, announcements, etc.; check the website.
- Instructor office hours are on the web.
- Enrollment:
    Undergrad: you need to have taken CS150 (or equivalent) with a B+ or better.
    Grad: we assume you have taken undergraduate digital design. If not, see us for remedial materials.
- Design language: for all, we assume Verilog/VHDL experience. However, we will be introducing you to a brand new hardware design language called Chisel (under construction).

Project Details
- Project groups of 2 people.
- Start with a functional specification for an image processor, then explore multiple micro-architectural variations to optimize performance or energy efficiency.
- Examples: edge detection, segmentation, optical flow detection, compression, ...
- Within the pattern(s) that you choose, generate a set of VLSI implementations, performing a design space exploration that determines the Pareto-optimal points in the performance, area, and energy-efficiency space.
- Lots of background in a few weeks.

End of Introduction part 1
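The Pareto-optimal selection mentioned in the project description can be illustrated with a small sketch. The design names and the delay, area, and energy numbers below are invented purely for illustration (they are not course data), and all three metrics are assumed to be "lower is better".

```python
from typing import List, Tuple

# (name, delay_ns, area_mm2, energy_nj); all values are hypothetical.
Design = Tuple[str, float, float, float]

def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in every metric and strictly better in one."""
    a_metrics, b_metrics = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(a_metrics, b_metrics))
    better = any(x < y for x, y in zip(a_metrics, b_metrics))
    return no_worse and better

def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]

designs = [
    ("baseline", 10.0, 1.0, 5.0),
    ("pipelined", 6.0, 1.4, 6.0),
    ("unrolled", 6.0, 1.6, 7.0),   # dominated by "pipelined"
    ("serial", 15.0, 0.6, 4.0),
]

for name, delay, area, energy in pareto_front(designs):
    print(f"{name}: {delay} ns, {area} mm^2, {energy} nJ")
```

A design that survives this filter represents a genuine tradeoff: improving it in one metric requires giving something up in another.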

What has changed in 30 years since the early days of chip design?

Secondary driver: Wafer size
[Figure: processed wafer cost. Wafer size conversions offset the trend of increasing wafer processing cost. From: "Facing the Hot Chips Challenge Again", Bill Holt, Intel, presented at Hot Chips 17, 2005. Source: Intel.]

Processing advances
[Figure: 4µm to 45nm.]

IC Technology Stuff (1)
- Feature size: then ~4 µm; now 0.028 µm.
- Interconnect: then 2 layers, aluminum; now ~10 layers, copper.
- Transistors: then planar MOSFET; now the same.
- Layout and GDRs: essentially unchanged, though more complex, with density and area-fill rules.
- Circuits: then clocked static CMOS; now the same (lots of crazy stuff in between).
Interestingly, though, most CMOS circuits and layouts designed in 1980 would work if fabricated on today's IC process.

IC Technology Stuff (2)
- Transistors: then a near-perfect switch; now leaky.
- Power consumption: then dynamic (switching) energy; now approaching 50% static leakage (back to the future: nMOS has a similar problem).
- New improved devices coming soon: FinFETs.
- Chip input/output: then perimeter pads; now often area pads.
- Lithographic mask costs: then a few $k; now $M (full die; 65, 45, 28 nm).

IC Technology Stuff (3)
- Device reliability: then, devices nearly never failed; in the future (<65nm), high soft and hard error rates.
- Process variations across die and die-to-die: statistical variations in processing (wire widths/resistivity, transistor dimensions/strengths, doping inconsistencies) become apparent at smaller geometries. Some circuits are fast, others slow. Some are high-power, some low.
- Yield on leading-edge processes is dropping dramatically. IBM quotes yields of 10-20% on the Cell processor.

Design Stuff
- Chip functionality: then limited by area; now usually limited by energy dissipation.
- Design cost: now in the $50M range for full-die custom designs (a high percentage in verification).
- Implementation alternatives: more alternatives that trade up-front design costs for per-unit costs. FPGAs compete aggressively with custom silicon. Then, most custom designs were implemented at the silicon level; now, many more custom designs are implemented with FPGAs.
- Standard design abstraction: then transistors and circuits; now RTL in HDLs, standard cores and standard cells (higher productivity, somewhat less area/energy efficient).

Implementation Alternatives
- Full-custom: all circuits/transistors layouts optimized for application.
- Standard-cell: arrays of small function blocks (gates, FFs) automatically placed and routed.
- Gate-array (structured ASIC): partially prefabricated wafers customized with metal layers or vias.
- FPGA: prefabricated chips customized with loadable latches or fuses.
- Microprocessor: instruction set interpreter customized through software.
- Domain Specific Processor: special instruction set interpreters (ex: DSP, NP, GPU).
By ASIC, most people mean a standard-cell based implementation.

The Important Distinction
Instruction Binding Time: when do we decide what operation needs to be performed? (A. DeHon)
General principles:
- The earlier the decision is bound, the less area and delay/energy are required for the implementation.
- The later the decision is bound, the more flexible the device.

Full-Custom
- Circuit styles and transistor sizes are customized to optimize die size, power, and performance.
- High NRE (non-recurring engineering) costs: time-consuming and error-prone layout.
- Optimizing for small die can result in low per-unit costs, extreme low power, or extreme high performance.
- Common for analog design.
- Requires a full set of custom masks.
- High NRE usually restricts use to high-volume applications/markets or to highly constrained and cost-insensitive markets.

Standard-Cell
- Based around a set of pre-designed (and verified) cells. Ex: NANDs, NORs, flip-flops, buffers, ...
- Each cell comes complete with layout (perhaps for different technology nodes and processes) and behavioral simulation, delay, and power models.
- Chip layout is automatic, reducing NREs (usually no hand layout).
- Requires a full set of masks: nothing is prefabricated.
- Non-optimal use of area and power, leading to higher per-die costs than full-custom.
- Commonly used with other design implementation strategies (large blocks for memory, I/O blocks, etc.).

Gate Array
- Store prefabricated wafers of active and gate layers and local interconnect, comprising, primarily, rows of transistors.
- Customize as needed with back-end metal processing (contact cuts, vias, metal wires). Could use a different factory.
- Shifts a large portion of design and mask NRE to the vendor.
- Shorter design and processing times, reduced time to market.
- Highly structured layout with fixed-size transistors leads to large sub-circuits (ex: flip-flops) and higher per-die costs.
- Memory arrays are particularly inefficient, so they are often prefabricated.
- Also known as: sea-of-gates, structured ASIC, master-slice.

Field Programmable Gate Arrays
- Two-dimensional array of simple logic and interconnection blocks.
- Typical architecture: LUTs implement any function of n inputs (n=3 in this case). Optional flip-flop with each LUT.
- Fuses, EPROM, or static RAM cells are used to store the configuration. Here, it determines the function implemented by the LUT, the selection of the flip-flop, and the interconnection points.
- Many FPGAs include special circuits to accelerate the adder carry chain, and many special cores: RAMs, MAC, Enet, PCI, SERDES, ...

Traditional FPGA versus ASIC argument (circa 2000)
[Figure: total cost versus volume for FPGA and ASIC, with regions marked "FPGAs cost effective" and "ASICs cost effective".]
- ASIC: high NRE costs ($2M for a 0.35um chip). Relatively low cost per die.
- FPGAs: very low NRE costs. Relatively low silicon efficiency, hence high cost per part.
- Cross-over volume from cost-effective FPGA design to ASIC is in the 10K range.
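The cross-over argument above can be written as a simple linear cost model, total cost = NRE + unit cost x volume. The sketch below is only an illustration: the $2M ASIC NRE comes from the slide, while the zero FPGA NRE and both per-unit costs are assumptions chosen so that the cross-over lands in the 10K range the slide quotes.

```python
def total_cost(nre: float, unit_cost: float, volume: float) -> float:
    """Linear cost model: one-time NRE plus a fixed cost per part."""
    return nre + unit_cost * volume

def crossover_volume(asic_nre: float, fpga_nre: float,
                     asic_unit: float, fpga_unit: float) -> float:
    """Volume at which the ASIC and FPGA total costs are equal."""
    if fpga_unit <= asic_unit:
        raise ValueError("no cross-over: FPGA parts must cost more per unit")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)

ASIC_NRE = 2_000_000   # from the slide ($2M for a 0.35um chip)
FPGA_NRE = 0           # assumption: "very low" NRE approximated as zero
ASIC_UNIT = 20         # assumption: hypothetical ASIC cost per part, in dollars
FPGA_UNIT = 220        # assumption: hypothetical FPGA cost per part, in dollars

volume = crossover_volume(ASIC_NRE, FPGA_NRE, ASIC_UNIT, FPGA_UNIT)
print(f"cross-over at about {volume:,.0f} units")
```

Below the cross-over volume the FPGA has the lower total cost; above it the ASIC's lower per-part cost pays back its NRE.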

Cross-over Point has Moved Right
[Figure: total cost versus volume for FPGA and ASIC, with regions marked "FPGAs cost effective" and "ASICs cost effective".]
- ASIC: increasing NRE costs ($40M for a 90nm chip [1]) (verification, mask costs [2], etc.). Fewer silicon designs become inevitable.
- FPGAs: move in to fill the need. Furthermore, FPGAs are better able to follow Moore's Law and are relatively cheaper to test.
- Cross-over volume is now >100K.
[1] Vahid Manian, VP manufacturing and operations, Broadcom Corp.
[2] Roger Minear, Agere Systems Inc.: a 30-35 layer mask set costs $650,000 for 130nm and $1.4M for 90nm.

Hybrid Chip Implementations Abound
- Ex: it was standard practice in microprocessors for data-paths to be full-custom and control (instruction decode, pipeline control) to be in standard cells. (Less common recently.)
- Control ("random") logic is difficult to regularize and is a relatively small percentage of die area/power.
- Permits late binding of design changes.
- Extra NAND or NOR gates were often added to the control section, and some wafers left without metallization, to permit late design fixes through metal mask revisions (the gate-array idea).

System-on-chip (SOC)
- Brings together: standard cell blocks, custom analog blocks, processor cores, memory blocks, embedded FPGAs, ...
- Standardized on-chip buses (or hierarchical interconnect) permit easy integration of many blocks. Ex: AMBA, Sonics, ...
- IP block business model: hard or soft cores available from third-party designers.
- ARM, Inc. is the shining example: hard and synthesizable RISC processors.
- ARM and other companies provide Ethernet, USB controllers, analog functions, memory blocks, ...
- Pre-verified block designs and standard bus interfaces (or adapters) ease integration: lower NREs, shorter TTM.
- SIP, SOP, and MCM are interesting alternatives.

Modern ASIC Methodology and Flow
RTL synthesis based:
- HDL specifies the design as combinational logic + state elements.
- Instantiations are needed for blocks not inferred by synthesis (typically RAM).
- Event simulation verifies the RTL.
- Formal verification compares the logical structure of the gate netlist to the RTL.
- Place & route generates the layout.
- Timing and power are checked statically.
- Layout is verified with LVS and GDRC.
[Figure: flow diagram. Specification and RTL (Verilog/VHDL) + instantiations feed the simulator and logic synthesis; synthesis produces a gate netlist (with area/perf/pwr estimates), checked by formal verification; cell place & route produces GDS, followed by timing/power analysis and GDRC, LVS, and other checks.]
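The idea of checking a gate netlist against its RTL can be sketched on a toy scale. The 2-to-1 mux and both netlists below are hypothetical, and exhaustive enumeration of inputs is used only because the example is tiny; it is not meant to represent how production formal verification tools work.

```python
from itertools import product

def nand(x: int, y: int) -> int:
    return 1 - (x & y)

def inv(x: int) -> int:
    return 1 - x

def rtl_mux(sel: int, a: int, b: int) -> int:
    """Behavioral ("RTL") description: a when sel is 1, otherwise b."""
    return a if sel else b

def netlist_mux(sel: int, a: int, b: int) -> int:
    """Hypothetical gate-level netlist built from NAND2 and INV cells."""
    n1 = nand(a, sel)
    n2 = nand(b, inv(sel))
    return nand(n1, n2)

def buggy_netlist_mux(sel: int, a: int, b: int) -> int:
    """Same netlist with the inverter on sel accidentally dropped."""
    n1 = nand(a, sel)
    n2 = nand(b, sel)
    return nand(n1, n2)

def find_mismatch(spec, impl, n_inputs: int):
    """Return the first input vector where spec and impl differ, else None."""
    for bits in product((0, 1), repeat=n_inputs):
        if spec(*bits) != impl(*bits):
            return bits
    return None

print("good netlist mismatch:", find_mismatch(rtl_mux, netlist_mux, 3))
print("buggy netlist mismatch:", find_mismatch(rtl_mux, buggy_netlist_mux, 3))
```

The returned input vector for the buggy netlist is ordered (sel, a, b) and serves as a counterexample that could be replayed in simulation.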

Design Representations
Lecture 1, Introduction. CS250, UC Berkeley, Fall 2012

Engineering Challenge
[Figure: Application at the top, Physics at the bottom, with a gap between them.]
The gap is usually too large to bridge in one step, but there are exceptions...

Magnetic Compass
[Figure: Application mapped directly onto Physics.]

Design Abstraction Stack
Application
Unit-Transaction Level (UTL)
Register-Transfer Level (RTL)
Gates
Circuits
Devices (Transistors)
Physics
[Figure: band diagram (conduction band, Eg, valence band) and transistor cross-section.]

Properties of a Useful Abstraction
- Hides less important details. E.g., for RTL, don't worry about how combinational logic is decomposed into logic gates.
- Allows control of more important details. E.g., the RTL designer still controls how much logic is performed between any two registers.
- If done right, provides portable efficiency. I.e., the same RTL can be implemented as custom logic, standard cells, FPGA, or even vacuum tube logic, with reasonably good results.
Lecture 2, Design Representations. CS250, UC Berkeley, Fall 2011

Logic Synthesis
- Verilog and VHDL started out as simulation languages, but people quickly wrote programs to automatically convert Verilog code into gate-level netlists.
- Synthesis converts Verilog (or other HDL) descriptions to implementation-technology-specific primitives:
    For FPGAs: LUTs, flip-flops, and RAM blocks.
    For ASICs: standard cell gate and flip-flop libraries. Memory blocks are built with a special memory generator and then hand-instantiated.
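As an illustration of the LUT primitive that synthesis targets on FPGAs, the following sketch models an n-input LUT as a table of 2^n configuration bits. The function names and the bit ordering (input k selects bit k of the table index) are choices made for this example, not a description of any particular FPGA.

```python
from itertools import product

def program_lut(func, n: int) -> list:
    """Build the 2**n configuration bits that make a LUT compute func."""
    config = []
    for index in range(2 ** n):
        # Input k is bit k of the table index (an assumed convention).
        inputs = [(index >> k) & 1 for k in range(n)]
        config.append(func(*inputs) & 1)
    return config

def lut_read(config: list, inputs) -> int:
    """Evaluate the LUT: the inputs form an address into the configuration."""
    index = sum(bit << k for k, bit in enumerate(inputs))
    return config[index]

def majority(a: int, b: int, c: int) -> int:
    """Example 3-input function (the carry-out of a full adder)."""
    return (a & b) | (a & c) | (b & c)

config = program_lut(majority, 3)
print("configuration bits:", config)

for bits in product((0, 1), repeat=3):
    assert lut_read(config, bits) == majority(*bits)
```

Reprogramming the same LUT for a different 3-input function only changes the eight stored bits, which is why a LUT can implement any function of its n inputs.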

Why Logic Synthesis?
1. Automatically manages many details of the design process: fewer bugs, improved productivity.
2. Abstracts the design data (HDL description) from any particular implementation technology. Designs can be re-synthesized targeting different chip technologies. Ex: first implement in FPGA, then later in ASIC.
3. In most cases, leads to a more optimal design than could be achieved by manual means (ex: logic optimization).
Why Not Logic Synthesis?

Main Logic Synthesis Steps (foo.v -> foo.gates)
1. Parsing and Syntax Check: load in the HDL file, run the macro preprocessor for `define, `include, etc.
2. Design Elaboration: compute parameter expressions, process generates, create instances, connect ports.
3. Inference and Library Substitution: recognize and insert special blocks (arithmetic structures, ...).
4. Logic Expansion: expand combinational logic to a primitive Boolean representation.
5. Logic Optimization: apply Boolean algebra and heuristics to simplify and optimize under constraints.
6. Technology Mapping: map the generic logic representation to cell instances from the chosen cell library.
Modern tools incorporate preliminary layout and timing constraints, and attempt timing-driven synthesis.
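The technology mapping step listed above can be sketched for a toy case. The expression format, the two-cell library (NAND2 and INV), and the naive one-to-one mapping rules below are all assumptions made for illustration; a real mapper also optimizes, for example by removing back-to-back inverters.

```python
from itertools import count, product

def map_to_cells(expr, netlist, fresh):
    """Map an AND/OR/NOT expression tree to NAND2/INV cells.

    expr is a variable name or a tuple such as ("and", x, y).
    Returns the name of the net that carries the value of expr.
    """
    if isinstance(expr, str):
        return expr
    op, *args = expr
    ins = [map_to_cells(a, netlist, fresh) for a in args]

    def emit(cell, inputs):
        out = f"n{next(fresh)}"
        netlist.append((cell, out, tuple(inputs)))
        return out

    if op == "not":
        return emit("INV", ins)
    if op == "and":   # a & b == INV(NAND2(a, b))
        return emit("INV", [emit("NAND2", ins)])
    if op == "or":    # a | b == NAND2(INV(a), INV(b))
        return emit("NAND2", [emit("INV", [ins[0]]), emit("INV", [ins[1]])])
    raise ValueError(f"unknown operator: {op}")

def eval_expr(expr, env):
    """Evaluate the generic expression directly (the reference behavior)."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    vals = [eval_expr(a, env) for a in args]
    if op == "not":
        return 1 - vals[0]
    return vals[0] & vals[1] if op == "and" else vals[0] | vals[1]

def eval_netlist(netlist, env):
    """Evaluate the mapped cells in order; returns all net values."""
    nets = dict(env)
    for cell, out, ins in netlist:
        vals = [nets[i] for i in ins]
        nets[out] = 1 - vals[0] if cell == "INV" else 1 - (vals[0] & vals[1])
    return nets

expr = ("or", ("and", "a", "b"), ("not", "c"))
netlist = []
out_net = map_to_cells(expr, netlist, count())
for cell in netlist:
    print(cell)
```

Checking the mapped netlist against the original expression for every input combination confirms that the mapping preserved the function.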

CMOS From the Bottom, Up

IC Fabrication and Layout Representation
Mask drawings are sent to the fabrication facility to make the chips.

Mask set for an n-FET (circa 1986)
[Figure: cross-section with n+ source and drain in a p- substrate and a gate over the dielectric; Vd = 1V, Vg = 0V, Vs = 0V, I ≈ nA. Top-down view shown alongside.]
Masks:
#1: n+ diffusion
#2: poly (gate)
#3: diff contact
#4: metal
Layers to do the p-FET are not shown. Modern processes have 6 to 10 metal layers (or more); in 1986 there were 2.

Design rules for masks, 1986...
- Poly overhang: so that if masks are misaligned, the channel doesn't short out.
- Minimum gate length: so that the source and drain depletion regions do not meet!
- Metal rules: contact separation from channel, one fixed contact size, overlap rules with metal, etc.

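Fabrication

Mask set for an n-FET...
[Figure: cross-section with n+ source and drain in a p- substrate and a gate over the dielectric; Vd = 1V, Vg = 1V, Vs = 0V, I ≈ μA. Top-down view and transistor symbol (Vd, Vg, Vs, Ids) shown alongside.]
Masks:
#1: n+ diffusion
#2: poly (gate)
#3: diff contact
#4: metal
How does a fab use a mask set to make an IC?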

Start with an un-doped wafer...
Steps:
#1: dope wafer p-
#2: grow gate oxide
#3: deposit undoped polysilicon
#4: spin on photoresist
#5: place positive poly mask and expose with UV
UV hardens the exposed resist. A wafer wash leaves only the hard resist.
[Figure: cross-section showing oxide on a p- substrate.]

Wet etch to remove unmasked...
HF acid etches through poly and oxide, but not hardened resist.
[Figure: cross-sections of oxide on p- substrate, including the result after etch and resist removal.]

Use diffusion mask to implant n-type
[Figure: accelerated donor atoms implanted into a p- substrate, forming n+ regions on either side of the gate oxide.]
Notice how the donor atoms are blocked by the gate and do not enter the channel. Thus, the channel is self-aligned; precise mask alignment is not needed!

Metallization completes device
[Figure: three cross-sections of oxide over n+ regions in a p- substrate.]
- Grow a thick oxide on top of the wafer.
- Mask and etch to make contact holes.
- Put a layer of metal on the chip. Be sure to fill in the holes!

Final product...
[Figure: finished device cross-section (oxide, n+ regions, p- substrate, Vd and Vs contacts) with top-down view.]
"The planar process": Jean Hoerni, Fairchild Semiconductor, 1958

p-channel Transistors

p-FET: Change polarity of everything
[Figure: cross-section with p+ source and drain in an n-well on a p- substrate; Vwell = Vs = 1V, Vg = 0V, Vd = 0V, I ≈ μA. Transistor symbol (Vs, Vg, Vd, Isd) shown alongside.]
- New n-well mask.
- Mobility of holes is lower than that of electrons. p-FETs drive less current than n-FETs, all else being equal.

Bulk versus SOI Processing
Silicon on Insulator:
- Lower parasitic capacitance -> lower energy, higher performance.
- Also used for radiation-hard applications (spacecraft), with sapphire instead of oxide.
- 10-15% increase in total manufacturing cost due to substrate cost.

Lithography
- Optical proximity correction (OPC) is an enhancement technique commonly used to compensate for image errors due to diffraction or process effects.
- Current state-of-the-art photolithography tools use deep ultraviolet (DUV) light with wavelengths of 248 and 193 nm, which allow minimum feature sizes down to 50 nm.
[Figure: desired (drawn) shape, modified mask, and resulting exposure.]

Modern Processing Parameters (from the 2009 ITRS Roadmap, http://www.itrs.net/)

Parameter                                        2010   2014
# Mask levels, MPU                                 35     37
# Mask levels, DRAM                                26     26
Maximum lithography field size, area (mm^2)       858    858
Maximum lithography field size, length (mm)        33     33
Maximum lithography field size, width (mm)         26     26
Bulk or epitaxial or SOI wafer size (mm)          300    450
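As a quick arithmetic check on the roadmap parameters above, the sketch below confirms that the 33 mm by 26 mm field gives the listed 858 mm^2 and compares the 300 mm and 450 mm wafers. The count of full fields per wafer is a crude upper bound that simply divides areas and ignores edge loss and scribe lanes; it is an illustration, not a figure from the roadmap.

```python
import math

FIELD_LENGTH_MM = 33   # from the table above
FIELD_WIDTH_MM = 26    # from the table above

def wafer_area_mm2(diameter_mm: float) -> float:
    """Area of a circular wafer of the given diameter."""
    return math.pi * (diameter_mm / 2) ** 2

def max_full_fields(diameter_mm: float) -> int:
    """Upper bound on full-size fields per wafer (ignores edge loss)."""
    field_area = FIELD_LENGTH_MM * FIELD_WIDTH_MM
    return math.floor(wafer_area_mm2(diameter_mm) / field_area)

field_area = FIELD_LENGTH_MM * FIELD_WIDTH_MM
area_ratio = wafer_area_mm2(450) / wafer_area_mm2(300)

print(f"field area: {field_area} mm^2")
print(f"450 mm / 300 mm wafer area ratio: {area_ratio:.2f}")
print(f"full fields, upper bound: {max_full_fields(300)} vs {max_full_fields(450)}")
```

Because wafer area grows with the square of the diameter, the move from 300 mm to 450 mm more than doubles the silicon processed per wafer.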

Processing Enhancements
- Trench isolation: shallow trench isolation (STI), a.k.a. Box Isolation Technique, prevents current leakage between n-well and p-well devices.
- High-K dielectrics / metal gate: replacing the silicon dioxide gate dielectric with a high-κ material allows increased gate capacitance without the concomitant leakage effects.
- Strained silicon: a layer of silicon in which the silicon atoms are stretched beyond their normal interatomic distance, leading to better mobility and resulting in better chip performance and lower energy consumption.
- "Gate engineering": for within-die choice of multiple transistor threshold voltages (Vt) to optimize delay or power.

End of Introduction part 2