Introduction to CMOS VLSI Design (E158) Lecture 11: Decoders and Delay Estimation


Introduction to CMOS VLSI Design (E158)
Lecture 11: Decoders and Delay Estimation
David Harris, Harvey Mudd College, David_Harris@hmc.edu
Based on EE271 developed by Mark Horowitz, Stanford University
MAH E158 Lecture 12 1

Decoders and Delay Estimation
Reading: W&E 4.5-4.6

Introduction
In the last lecture, we looked at memory design. Today we will look at various methods for building decoders to drive the word lines and column multiplexer circuitry. To build a fast memory, we need to minimize the delay of the decoder. This challenge will serve as a jumping-off point for delay estimation and gate sizing to minimize delay.

Peripheral Circuits
[Figure: memory with the decoder and mux blocks labeled.]
We need to build the decoder and wordline drive circuits, and the column select and bitline drive circuits. For both we need to build a decoder, something to select the correct line. Let's look at building decoders for CMOS memories.

Decoders
A decoder is just a structure that contains a number of AND gates, where each gate is enabled for a different input value. For an n-bit to 2^n decoder, we need to build 2^n n-input AND gates. We also want to build these AND gates so that they lay out nicely (in a regular way).
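As a rough behavioral illustration of the definition above (not a circuit or layout description), the sketch below models an n-bit decoder as 2^n AND gates, each fed a different mix of true and complemented address bits. The function name `decode` and the bit ordering (element i of the list is address bit Ai) are assumptions made for this example.

```python
from itertools import product

def decode(address_bits):
    """Return the 2**n outputs of an n-bit decoder (behavioral model).

    address_bits[i] is address bit A_i (0 or 1). Output k is the AND of
    n literals: A_i where bit i of k is 1, NOT A_i where bit i of k is 0.
    """
    n = len(address_bits)
    outputs = []
    for k in range(2 ** n):
        literals = [
            a if (k >> i) & 1 else 1 - a
            for i, a in enumerate(address_bits)
        ]
        outputs.append(int(all(literals)))
    return outputs

# Exactly one output is high for every input combination.
for bits in product((0, 1), repeat=3):
    assert sum(decode(list(bits))) == 1

print(decode([1, 0, 1]))  # A0=1, A1=0, A2=1 selects output 5
```

Each of the 2^n outputs here depends on all n address bits, which is the large fanin that the following slides work to avoid.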

Large Fanin AND Gates
In CMOS, building this type of gate causes a problem, since large fanin implies a series stack. We will see a little later in the notes that the best way to do this is to use a two-level decoder by predecoding the inputs.
In nMOS the problem was easy: large fanin NOR gates work well, so a collection of NOR gates solves the problem very nicely.

CMOS Decoders
In CMOS, a large fanin gate implies a series stack, so we need to build a decoder that does not use a large fanin gate. But how? Use a 2-level decoder.
An n-bit decoder requires 2n wires: A0, A0', A1, A1', ... (a prime marks the complemented signal)
  Each gate is an n-bit NOR (NAND) gate
Could predecode the inputs
  Send A0'A1', A0'A1, A0A1', A0A1, A2'A3', ... instead of A0, A0', A1, A1', ...
  Maps 4 wires into 4 wires that need to go to the decoder
  Reduces the number of inputs to the decode gate by a factor of two

Predecode Example
[Figure: the same decode gates drawn two ways. "No Predecode": gates driven by the true and complemented A0 and A1 lines. "2 Bit Predecode": gates driven by the predecoded A0/A1 combinations.]

Predecode
Predecode is just like what we did when we needed to make a single six-input AND gate. We did it in a few levels: a predecode stage followed by a decode gate.
One can do a 2-input predecode or a 3-input predecode
  A 2-input predecoder generates 4 outputs
  A 3-input predecoder generates 8 outputs
The difference from standard logic is that we need to decode all possible inputs. This means that each predecode gate can be reused by many final decode gates. A little planning can yield a regular layout.
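The two-level structure can be sketched behaviorally as follows. This is a minimal model under assumed conventions, not the lecture's circuit: the helper names `predecode` and `final_decode` are hypothetical, address bit A0 is taken as the least significant bit, and the address is split into groups of consecutive bits.

```python
from itertools import product

def predecode(address_bits, group=2):
    """Split the address into groups and fully decode each group.

    Returns one one-hot list per group: a 2-bit group yields 4 lines,
    a 3-bit group yields 8 lines.
    """
    lines = []
    for start in range(0, len(address_bits), group):
        chunk = address_bits[start:start + group]
        value = sum(bit << i for i, bit in enumerate(chunk))
        lines.append([int(value == k) for k in range(2 ** len(chunk))])
    return lines

def final_decode(predecoded):
    """AND one line from each group for every output word."""
    outputs = []
    # product() varies its last argument fastest, so the groups are
    # reversed to make group 0 (the low address bits) vary fastest.
    for combo in product(*reversed(predecoded)):
        outputs.append(int(all(combo)))
    return outputs

# 6-bit address, three 2-bit predecoders: 12 predecode lines feed
# 64 final gates with 3 inputs each instead of 6.
for bits in product((0, 1), repeat=6):
    address = sum(bit << i for i, bit in enumerate(bits))
    outputs = final_decode(predecode(list(bits)))
    assert sum(outputs) == 1 and outputs.index(1) == address
```

In this model every predecode line is shared by 16 of the 64 final gates, which illustrates the reuse described above.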

Predecode
A predecoded decoder:
[Figure: predecoded decoder with address inputs A0 through A5.]

Layout Issues
Often we need to build large array structures (for example, a large RAM), so we want to lay out the decoder in as little space as possible. We need to find a good way to lay out this structure. Clearly we need to run the address lines through each decoder cell and stack the decoder cells next to each other.

Predecode Layout
The outputs of the predecode gates need to drive the address lines. These address lines usually have high capacitance, so it is usually better to use a NAND with an inverter buffer as the predecode cell. Cells can be placed on top of the address lines, or to the left of the address lines.
[Figure: placement of the predecode cells and decode cells.]

Decoder Cell Layout
Need to have n and p transistors
Need to take up minimum space
Want it to be easy to program the cell
  While the layout is regular, each cell is different
  It connects to a different set of inputs
Look at a couple of layout styles

Decoder Layout
Cell area is proportional to n^2. Decoder area is n^3.
[Figure: decoder cell layout with Gnd and Vdd rails and address lines for A0, A1, and A2.]
The problem with this layout is that most of the space is wasted: all of the area under the wires is wasted. We should rotate the gate to fit under the wires.

A Slightly Better Decoder Layout
Better cell design (like we have talked about)
[Figure: decoder layout with outputs Out1 and Out0, address lines for A0, A1, and A2, and Vdd and Gnd rails.]
In this layout, the basic cell remains unchanged; it is the wire contacts that are programmed. This is sometimes a good idea, since it lets you optimize the decode cell (in this case the 3-input gate).

A Smaller Layout
Leave space for all the tracks in the cell
Address lines in M2/Poly
[Figure: decoder layout with Vdd, Out1, Gnd, and Out0 rows and address lines for A0, A1, and A2.]
Need to program the decoder by placing transistors or metal.
With predecode, you have more tracks per transistor.

Wordline Driver
Decoder is just part of the wordline drive circuit
  Also need to qualify the wordline (AND with clock)
  Also need to buffer the signal to drive the WL capacitance
Clock qualification can be done in the decoder
  A0 ... An, Phi1: the clock is just another input to the decoder
  Usually not a great idea, since this can lead to large skew
Clock AND is usually done in the last stage before the driver
[Figure: decode_s1 is qualified by Φ1 to produce wordline_q1. Annotations: "can be large devices", "or use normal NAND gate".]

Thin Drivers
The wordline pitch of a memory cell is not that tight (about 40λ), but not that large either. There are some memories (ROMs, DRAMs) with much tighter pitch. For many of these applications you need thin gates and drivers. The minimum useful space is 16λ.
[Figure: thin driver layout with In, Out, Gnd, and Vdd labeled, a 16λ dimension, and the decoder location marked. Contacts can be shared.]
For the wordline driver, I might use two of these drivers in parallel to reduce the horizontal length (effectively folding the transistors again).

Putting it Together
Floorplan for a memory
[Figure: floorplan with blocks labeled Bit Line Precharge (Φ1), Memory Array, Row Decode, Mem, Drv, Decoder, Predecoder, Column Mux, Bit IO, 2:1 Mux & Bit IO, R/W, and Address.]
Built using array constructs
  Decoder base is often an array, with programming done by software
  Memory is built by arraying a cell that contains the cell and its mirror

Transistor Sizing
For memories (and other structures) you end up with long, high-capacitance wires
  Need to drive these large capacitors quickly, and this sets the device size
  We will look at a chain of inverters first, and then think about gates
Factors to consider in gate sizing:
  Need to think about the load you are driving
  Need to think about the load you present to your predecessor
Why transistor sizes matter when you are driving a large capacitance:
[Figure: a minimum-size inverter (4λ:2λ) driving 2pF (10mm of metal2); delay is 13ns falling, 26ns rising.]

Buffer (or Gate) Sizing
But bigger gates have bigger input capacitance too:
[Figure: a minimum-size inverter drives a large inverter (400-p, 200-n), which drives 2pF. Delay of the minimum inverter: 4ns falling, 8ns rising. Delay of the large inverter: 0.3ns.]
Clearly we need to make the predriver larger too. Is there an optimal solution? Yes, in a way:
  Minimize the delay of the chain; at the minimum, all stage delays will match (why?)
[Figure: inverter chain with sizes 1, f, f^2, f^3.]
The equal-delay principle applies to any critical path through gates.
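One way to see why the stage delays match at the minimum is a simple numerical sketch. It assumes a delay model that is not stated in these slides: each stage's delay is proportional to the capacitance it drives divided by its own size, with self-loading ignored. The function names `chain_delay` and `equal_ratio_sizes`, and the load of 4096 unit capacitances, are hypothetical values chosen for illustration.

```python
def chain_delay(sizes, load):
    """Total delay of an inverter chain, in units of one fanout-of-1 delay.

    sizes[i] is the size (input capacitance) of stage i relative to the
    first stage; load is the final load in the same units. Assumed model:
    stage delay = (capacitance driven) / (stage size), no self-loading.
    """
    driven = list(sizes[1:]) + [load]
    return sum(c_out / c_in for c_in, c_out in zip(sizes, driven))

def equal_ratio_sizes(load, stages):
    """Sizes 1, f, f^2, ... with f chosen so every stage sees fanout f."""
    f = load ** (1.0 / stages)
    return [f ** i for i in range(stages)]

load = 4096  # hypothetical load, in units of the first inverter's input cap

matched = equal_ratio_sizes(load, 4)   # roughly 1, 8, 64, 512
skewed = [1, 4, 64, 512]               # same chain, second stage too small

# Shrinking one stage speeds up its driver but slows the stage itself
# by more, so the unequal chain is slower overall.
assert chain_delay(matched, load) < chain_delay(skewed, load)
print(chain_delay(matched, load), chain_delay(skewed, load))
```

Under this model the stage fanouts multiply to a fixed value (the load divided by the first stage's size), and a sum of positive terms with a fixed product is smallest when the terms are equal, which is why the sizes grow as 1, f, f^2, f^3.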
