EE-382M VLSI II FLIP-FLOPS

EE-382M VLSI II FLIP-FLOPS Gian Gerosa, Intel Fall 2008 EE 382M Class Notes Page # 1 / 31

OUTLINE
- Trends
- LATCH Operation
- FLOP Timing Diagrams & Characterization
- Transfer-Gate Master-Slave FLIP-FLOP
- Merged Functions
- Clock Skew
- Other Topologies
- SCAN
- References
- Homework Discussion

Where are we going?
Trends in high-performance systems:
- Higher frequency, which leads to...
- Deeper pipelines or more parallelism, which leads to...
- More transistors, which leads to...
- More sequentials (FLOP or LATCH), which leads to...
Consequences:
- Increased flip-flop overhead
  - Cycle time in 12-15 stage pipeline microarchitectures: ~22 FO4 delays
  - FLOP overhead: ~3 FO4 delays (D-Q delay), ~14% of the cycle
- Clock uncertainty (jitter & skew) also affects cycle time
- Clock power

Why work on Sequentials?
In a 3.3 GHz processor (90 nm CMOS), cycle = 300 ps.
- Typical D-Q delay is ~90 ps.
- If one can design a faster sequential, say with a D-Q delay of ~60 ps, this represents a ~10% processor performance improvement.
- If in addition one can absorb 15 ps of uncertainties and/or embed one level of logic, this will yield an additional 5-10% processor performance improvement.
- Attaining a 10-20% performance improvement via architecture enhancements is very expensive (area, power, complexity, etc.)!
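The percentages on this slide can be checked with a few lines of arithmetic. The sketch below assumes, as the slide appears to, that performance scales with the fraction of the cycle that is saved; the function names are illustrative only and are not from the class notes.

```python
def cycle_time_saving(tcycle_ps, saved_ps):
    """Fraction of the cycle removed when saved_ps is taken out of it."""
    return saved_ps / tcycle_ps


def frequency_gain(tcycle_ps, saved_ps):
    """Relative clock-frequency increase if the cycle shrinks by saved_ps."""
    return tcycle_ps / (tcycle_ps - saved_ps) - 1.0


TCYCLE_PS = 300.0  # 3.3 GHz cycle time quoted on the slide

# Faster sequential: D-Q delay goes from ~90 ps to ~60 ps.
faster_flop = cycle_time_saving(TCYCLE_PS, 90.0 - 60.0)

# Absorbing 15 ps of clock uncertainty.
absorbed = cycle_time_saving(TCYCLE_PS, 15.0)

print(f"faster flop: {faster_flop:.1%} of the cycle")
print(f"absorbed uncertainty: {absorbed:.1%} of the cycle")
print(f"frequency gain from 30 ps: {frequency_gain(TCYCLE_PS, 30.0):.1%}")
```

Measured as a frequency increase rather than a cycle-time fraction, the 30 ps saving is slightly larger than 10%, since 300/270 is about 1.11.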

Basic LATCH Operation
[Figure: transparent-low and transparent-high latches (Din to Dout), with Din and Dout waveforms showing the transparent and opaque phases and the Tdq, Tsu and Th intervals.]

Difference between a LATCH and a FLOP
FLOP (F-F): edge triggered. Q only changes at the rising edge of the clock.
LATCH: transparent / opaque. Q follows the input Data while the latch is transparent.
[Figure: F-F and Latch symbols with Clock, Data and Q waveforms.]
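The difference can be illustrated with a small behavioral model. This is a minimal sketch, assuming a transparent-high latch, a rising-edge flop, ideal zero-delay elements and an initial state of 0; the sample sequences are made up for illustration.

```python
def simulate(clock, data):
    """Return (latch_q, flop_q) traces for sampled clock and data sequences."""
    latch_q, flop_q = [], []
    latch_state = flop_state = 0   # assumed initial state
    prev_clk = 0
    for clk, d in zip(clock, data):
        if clk == 1:
            latch_state = d        # transparent-high latch follows Data
        if prev_clk == 0 and clk == 1:
            flop_state = d         # flop samples Data on the rising edge only
        latch_q.append(latch_state)
        flop_q.append(flop_state)
        prev_clk = clk
    return latch_q, flop_q


clock = [0, 1, 1, 1, 0, 0, 1, 1]
data = [1, 1, 0, 1, 0, 1, 0, 0]
latch_q, flop_q = simulate(clock, data)
print("latch Q:", latch_q)
print("flop  Q:", flop_q)
```

While the clock is high, the latch output tracks every change on Data, whereas the flop output holds the value captured at the rising edge.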

Building a FLOP with Two Latches
[Figure: schematic of a FLOP built from two latches, Din to Dout.]

FLOP Delay
The sum of setup time and Clk-output delay is the only true measure of performance with respect to system speed (MAXDELAY).
Tcycle = Tcq + Tlogic + Tsu + Tskew
Tlogic contains interconnect delay.
[Figure: a D-Q FLOP driving N loads and a logic block into a second D-Q FLOP, annotated with Tcq, Tlogic and Tsu.]
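A short sketch of how the cycle-time equation splits the cycle between the FLOP and the logic. The 300 ps cycle and the 35 ps / 65 ps Tsu / Tcq values are taken from other slides (and from different process nodes), and the 15 ps skew is an assumed value, so the result is illustrative only.

```python
def logic_budget_ps(tcycle, tcq, tsu, tskew=0.0):
    """Time left for Tlogic, from Tcycle = Tcq + Tlogic + Tsu + Tskew."""
    return tcycle - (tcq + tsu + tskew)


def flop_overhead_fraction(tcycle, tcq, tsu):
    """Share of the cycle spent in the FLOP itself (Tsu + Tcq)."""
    return (tcq + tsu) / tcycle


# Illustrative numbers: 300 ps cycle, Tcq = 65 ps, Tsu = 35 ps, 15 ps skew (assumed).
budget = logic_budget_ps(300.0, 65.0, 35.0, 15.0)
overhead = flop_overhead_fraction(300.0, 65.0, 35.0)
print(f"logic budget: {budget} ps, flop overhead: {overhead:.1%}")
```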

FLOP Timing Diagrams
Tsu: input setup time
Thold: input hold time
Tcq: clock to out
Tdata to out = Tsu + Tcq
[Figure: Clock, Din and Dout waveforms (volts vs. picoseconds, 0 to 1000 ps) annotated with Tsu, Thold and Tcq.]

Functional Pass/Failure vs. Tsu and Th
[Figure: master internal node waveforms showing pass and fail cases, one panel versus input setup time and one versus input hold time.]

FLOP Characterization
Tsu: input setup time
Thold: input hold time
Tcq: clock to out
Tdata to out = Tsu + Tcq
[Figure: Tcq and Tdata to out (picoseconds) plotted against Data to Clock time (-250 to 250 ps). Plot labels: Tdata to out, 10%, Tcq, Tsu, Thold, minimum Tcq.]
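One common way to turn such a sweep into a setup-time number is sketched below. It assumes that the "10%" label on the plot means Tsu is taken where Tcq has degraded by 10% over its minimum; that reading, the function name and the sample data are all assumptions (the samples are synthetic, not simulation results from the class).

```python
def setup_time_ps(samples, degradation=0.10):
    """Smallest data-before-clock time whose Tcq stays within the limit.

    samples: list of (setup_ps, tcq_ps); tcq_ps is None if capture failed.
    Returns (tsu_ps, tcq_min_ps).
    """
    valid = [(s, t) for s, t in samples if t is not None]
    tcq_min = min(t for _, t in valid)
    limit = tcq_min * (1.0 + degradation)
    passing = [s for s, t in valid if t <= limit]
    return min(passing), tcq_min


# Synthetic sweep: Tcq rises as Data arrives closer to the clock edge.
sweep = [(100, 65), (60, 65), (40, 66), (30, 70), (20, 78), (10, 95), (0, None)]
tsu, tcq_min = setup_time_ps(sweep)
print(f"Tsu ~ {tsu} ps at minimum Tcq = {tcq_min} ps")
```

The same search, run on the other side of the clock edge, would give Thold.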

MAXDELAY
[Figure: FLOP (D1, Q1) driving a logic block into a second FLOP, with waveforms over Tcycle showing Tcq, Tlogic and Tsu.]
Tlogic < Tcycle - (Tcq + Tsu)
or
Tcycle > Tlogic + Tcq + Tsu

MAXDELAY with Clock Skew
[Figure: FLOP (D1, Q1) driving a logic block into a second FLOP, with waveforms over Tcycle showing Tcq, Tlogic, Tsu and Tskew.]
Tlogic < Tcycle - (Tcq + Tsu + Tskew)

MINDELAY
[Figure: FLOP (D1, Q1) driving a logic block into a second FLOP, with waveforms over Tcycle showing Tcq, Tlogic, Tsu and Thold.]
Tlogic > Thold - Tcq + Tskew

DESIGN WINDOW
Thold - Tcq + Tskew < Tlogic
and
Tlogic < Tcycle - (Tcq + Tsu + Tskew)
If Tcq > Thold + Tskew, then the MINDELAY hazard is removed, since Tlogic >= 0 always.
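The two inequalities can be sketched as a simple path check. The function names and the example numbers (in particular the Thold and Tskew values) are assumptions for illustration, not values from the class notes.

```python
def design_window_ps(tcycle, tcq, tsu, thold, tskew):
    """Return (lower, upper) bounds on Tlogic in picoseconds."""
    lower = thold - tcq + tskew            # MINDELAY bound
    upper = tcycle - (tcq + tsu + tskew)   # MAXDELAY bound
    return lower, upper


def check_path(tlogic, tcycle, tcq, tsu, thold, tskew):
    """Classify a path delay against the design window."""
    lower, upper = design_window_ps(tcycle, tcq, tsu, thold, tskew)
    if tlogic <= lower:
        return "MINDELAY violation"
    if tlogic >= upper:
        return "MAXDELAY violation"
    return "ok"


# Tcq > Thold + Tskew: the lower bound is negative, so even Tlogic = 0 is safe.
print(design_window_ps(300, 65, 35, 20, 15))   # (-30, 185)
print(check_path(0, 300, 65, 35, 20, 15))
# A long hold time (assumed 90 ps) brings the MINDELAY hazard back.
print(check_path(10, 300, 65, 35, 90, 15))
print(check_path(200, 300, 65, 35, 20, 15))
```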

T-G Master-Slave FLOP (buffered non-inverting)
TIMING:
- Tsu ~ 1 TG + 2 inverters
- Th ~ 1 inverter
- Tcq ~ 1 TG + 1 inverter
[Figure: schematic, Din to Dout, with non time-borrowing and time-borrowing clocking.]
Time borrowing keeps the MASTER open longer by ~2 inverter delays; need to be careful about MINDELAYS.
Isolates SLAVE latch timing optimization/sensitivities from output load.

Merged Function inverting FLOP
[Figure: schematic with inputs A and B and output Dout.]

RESETTABLE Master-Slave FLOP (asynchronous)
[Figure: schematic with Din, Dout and Rb.]

Clock Skew Impact to Fmax
[Figure: two master-slave FLOPs between Din and Dout, each clocked through a local clock buffer (LCB) fed from the GLOBAL clock, with delays τ1, τ2 and τ3 marked.]
Tcycle = Tcq + Tlogic + Tsu + T_uncertainty
T_uncertainty = skew + jitter
skew = τ1 τ2 τ3
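As a rough illustration of how uncertainty eats into Fmax, the sketch below evaluates the cycle-time equation with and without skew and jitter. All numeric values are assumed for illustration, and the function name is hypothetical.

```python
def fmax_ghz(tcq, tlogic, tsu, skew=0.0, jitter=0.0):
    """Fmax in GHz from Tcycle = Tcq + Tlogic + Tsu + T_uncertainty (times in ps)."""
    t_uncertainty = skew + jitter
    tcycle = tcq + tlogic + tsu + t_uncertainty
    return 1000.0 / tcycle


# Assumed values: Tcq = 65 ps, Tlogic = 185 ps, Tsu = 35 ps, 10 ps skew, 5 ps jitter.
ideal = fmax_ghz(65.0, 185.0, 35.0)
real = fmax_ghz(65.0, 185.0, 35.0, skew=10.0, jitter=5.0)
print(f"no uncertainty: {ideal:.3f} GHz, with uncertainty: {real:.3f} GHz")
```

With these numbers, 15 ps of uncertainty costs about 5% of the clock frequency.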

Other Circuit Topologies for M-S FLOPS
- C²MOS
- Hybrid Latch Flip-Flop (HLFF)
- Pulse Latch
In Backup:
- True Single-Phase Clock FLOP
- K-6 Dual-Rail ETL
- Semi-Dynamic Flip-Flop (SDFF)

C²MOS FLOPS
[Figure: schematic with clocked master and slave stages; labels clk, Din, B, D, Q.]
- Robustness to slope
- Low power feedback
- Poor driving capability

Hybrid Latch Flip-Flop (HLFF)
(AMD K-6, Partovi, ISSCC 1996)
[Figure: schematic with Din, Clk, Dclk_, internal node N and Dout.]

Hybrid Latch Flip-Flop (HLFF) waveforms
TIMING:
- Sampling Window ~ 3 inverters
- Tsu ~ 0 to slightly negative
- Th > sampling window
- Tcq ~ 2 inverters
[Figure: waveforms of Clk, Dclk_, Din, N and Dout with their valid regions.]

Pulse Latch
[Figure: latch from Din to Dout clocked by pclk, which is generated from Clock through a delay τ.]

Pulse Latch Waveforms
TIMING:
- Sampling Window ~ NAND + τ
- Tsu ~ 0 to slightly negative
- Th > sampling window
- Tcq ~ 2 inverters
[Figure: waveforms of Clock, Pclk, Din and Dout with their valid regions.]
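A discrete-time sketch of how a pulse of width about τ can be derived from the clock. It assumes the usual construction in which pclk is the AND of Clock and an inverted copy of Clock delayed by τ; the slide only shows a NAND and a delay τ, so the exact gate arrangement here is an assumption, and gate delays are ignored.

```python
def pulse_clock(clock, tau):
    """pclk[t] = clock[t] AND NOT clock[t - tau]; clock is assumed 0 before t = 0."""
    pclk = []
    for t, clk in enumerate(clock):
        delayed = clock[t - tau] if t >= tau else 0
        pclk.append(clk & (1 - delayed))
    return pclk


clock = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
pclk = pulse_clock(clock, tau=2)
print("clock:", clock)
print("pclk: ", pclk)
```

Each rising clock edge produces a pulse that lasts tau samples, which is the window during which the latch is transparent.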

FLOP with SCAN
[Figure: functional FLOP (Din to Dout) with an attached SCAN GADGET; labels Scan_in, Scan_out, A, B, AB, in, out, p, n.]

A Typical Scan Path
[Figure: scan path through scannable FLOPS, scannable Latches and Hold_scan FLOPS (non-destructive scan); labels DI, DO, SI, SO, Q, A, B, Store_en.]

QUICK AREA and TIMING budgets in 130nm
Inverting FLIP-FLOP:
- Area ~ 60 μm²
- Tsu ~ 35 ps
- Tcq ~ 65 ps
- Total FLOP timing overhead ~ 100 ps
Scan Gadget area ~ 35 μm²
TOTAL scan inverting FLOP ~ 95 μm²
[Figure: FLOP layout.] This layout does not include scan.

QUICK AREA and TIMING budgets in 65nm
Inverting FLIP-FLOP:
- Area ~ 15 μm²
- Tsu ~ ? ps
- Tcq ~ ? ps
- Total FLOP timing overhead ~ ? ps
Scan Gadget area ~ 9 μm²
TOTAL scan inverting FLOP ~ 24 μm²
[Figure: FLOP block diagram; input sized 0.45 μm and 0.25 μm, rest of FLOP, output sized 0.90 μm and 0.50 μm.]
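The area numbers on the two budget slides can be cross-checked with a short calculation. The sketch below only uses the quoted areas; the dictionary layout and function names are illustrative, and the comparison against ideal area scaling (the square of the feature-size ratio) is an added observation, not a claim from the class notes.

```python
BUDGETS = {
    "130nm": {"flop_um2": 60.0, "scan_um2": 35.0},
    "65nm": {"flop_um2": 15.0, "scan_um2": 9.0},
}


def total_area(node):
    """FLOP plus scan gadget area in square microns."""
    b = BUDGETS[node]
    return b["flop_um2"] + b["scan_um2"]


def scan_overhead(node):
    """Scan gadget area as a fraction of the bare FLOP area."""
    b = BUDGETS[node]
    return b["scan_um2"] / b["flop_um2"]


flop_shrink = BUDGETS["130nm"]["flop_um2"] / BUDGETS["65nm"]["flop_um2"]
ideal_shrink = (130.0 / 65.0) ** 2

for node in BUDGETS:
    print(node, total_area(node), f"{scan_overhead(node):.0%}")
print("FLOP area shrink:", flop_shrink, "ideal:", ideal_shrink)
```

The totals match the 95 μm² and 24 μm² quoted on the slides, and the bare FLOP area shrinks by the same factor of 4 that ideal scaling from 130 nm to 65 nm would give.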

Design Goals
Target:
- Small load
- Shortest Din to Dout direct path
- Low-power feedback
- Simultaneously optimize both master and slave latches
- High driving capability
Optimize speed * power product while:
- Minimizing Tsu + Thold (smallest sampling window)
- Reducing sensitivity to slew rate and skew
- Not allowing floating nodes
Characterization:
- Use worst case Tcq + Tsu for MAXDELAY analysis.
- Use worst case Thold for MINDELAY analysis.
- Take into account all sources of power dissipation.

References
1. A. Chandrakasan, W. J. Bowhill, F. Fox, Design of High-Performance Microprocessor Circuits, IEEE Press, New York, 2001. Chapter 11, Clocked Storage Elements, by Hamid Partovi, pages 207-234.
2. V. G. Oklobdzija, The Computer Engineering Handbook, CRC Press, Boca Raton, Florida, 2002. Chapter 10.2, Latches and Flip-Flops, by Fabian Klass, pages 10.34-10.69.
3. R. J. Baker, H. W. Li, D. E. Boyce, CMOS Circuit Design, Layout, and Simulation, IEEE Press, New York, 1998. Chapter 13, pages 255-274.
4. V. G. Oklobdzija et al., Digital System Clocking: High-Performance and Low-Power Aspects, Wiley-IEEE Press, 264 pages, 2003.
Reference 2 has a very nice treatment of FLOPS/LATCHES, MIN/MAXDELAY, SKEW, etc., with plenty of timing diagrams.

BACKUP

Transfer-Gate (T-G) Master-Slave FLOP
- Low power feedback
- Un-buffered inputs:
  - input capacitance depends on the phase of the clock
  - over-shoot and under-shoot with long routes
  - wire length must be restricted at the input
- Buffered input addresses the above issues
- Low power
- Small clk-output delay, but positive setup
- Easily embedded scan, mux, other simple functions

Hybrid Latch Flip-Flop Highlights
Flip-flop features:
- single phase
- edge triggered, on one edge
Latch features:
- Soft edge property
  - brief transparency, equal to 3 inverter delays
  - negative setup time
  - allows slack passing
  - absorbs skew
- minimum delay between flip-flops must be controlled
Fully static
Possible to incorporate logic

ATPG Sequence Timing
[Figure: ATPG sequence timing diagram. Labels: Aclk, Bclk (Freq = 1/16 G); 1st and 2nd system cycles; launch; STUCK @ capture in slave; transition fault testing with capture at speed in master and in slave; observe master; signals SI, DI, DO, SO, A, B, STORE_EN, SHIFT_EN.]

Merged Function MUX-FLOP
[Figure: schematic with data inputs A, B, C, D, selects SelA_, SelB_, SelC_, SelD_ and output Dout.]

Another RESETTABLE Master-Slave FLOP (synchronous)
[Figure: schematic with Din, Dout and Rb.]

True Single-Phase Clock (TSPC) FLOP
[Figure: schematic from Din through internal nodes X and Y to Dout, with stages labeled MASTER, PRE-CHARGE and SLAVE.]
Clock power is low; no local inversion required.

True Single-Phase Clock FLOP Waveforms
TIMING:
- Tsu ~ 2 inverters
- Th ~ 2 inverters
- Tcq ~ 3 inverters
[Figure: waveforms of Din, X, Y and Dout with their valid regions.]

Semi-Dynamic Flip-Flop (SDFF)
[Figure: schematic with internal nodes N and K and delayed clock Dclk.]
- Soft edge conditioned by data since first stage is pre-charged; cross-coupled latch is added for robustness
- Small penalty for adding logic
- Latch has one transistor less in stack: faster than HLFF, but 1-1 glitch exists

Semi-Dynamic Flip-Flop Waveforms
TIMING:
- Sampling Window ~ 2 inverters + 1 NAND
- Tsu ~ 0 to slightly negative
- Th > sampling window
- Tcq ~ 2 inverters
[Figure: waveforms of Clk, Dclk, D, N, K and Q with their valid regions.]

K-6 Dual-Rail ETL
[Figure: schematic with nodes Pch, A, B and Dclk_. Figure annotation: determines A, B, Q, and Q_ pulse widths.]

K-6 Dual-Rail Waveforms
TIMING:
- Sampling Window ~ 3 inverters
- Tsu ~ 0 to slightly negative
- Th > sampling window
- Tcq ~ 2 inverters
[Figure: waveforms of Clk, Dclk_, D, A, B, Pch, Q and Q_ with their valid regions and pulse width T.]
T is determined by 4 inversions.

HMK#3
Problem 1. For both Din transitions (0->1 and 1->0), determine the input setup Tsu, input hold Thold, and clock to out Tcq for the following 4 FLIP-FLOPS (a, b, c, d). Use a 70 ps slew rate (full rail) for Din and clock; use the 130 nm CMOS transistor models. These designs are all driving a 4.2/2.1 inverter. Show ALL your work; also answer the following questions pertaining to each design:
a. List 3 deficiencies with this design. Hint: look at the b and c designs. Will this design work for a cycle time of 450 ps? Why or why not?
b. Is the Din input capacitance lower than in design a? What about the clock capacitance?
c. What are the benefits of placing the slave latch off to the side? Is this a time-borrowing FLOP? Is the clock capacitance lower than in design b? Any benefit in clocking the master LATCH feedback?
d. This design is a pulsed LATCH. Describe its behaviour with timing diagrams. Compared to a traditional FLIP-FLOP scheme, list ONE advantage and ONE disadvantage.
Simulation Tips: Use HSPICE ic statements to properly initialize these sequential circuits.

Homework # 3, Problem #1
FLOP design A
[Figure: schematic with device sizes. Nodes: Din, din, dout, Out. Extracted size labels: /0.6, /0.6, 0.13/0.6, 0.13/0.6, 0.56, 1.4, 0.7, 4.2, 2.1, 18.0.]

Homework # 3, Problem #1
FLOP design B
[Figure: schematic with device sizes. Nodes: Din, din, Dout_b, Out. Extracted size labels: /0.6, /0.6, 0.13/0.6, 0.13/0.6, 0.56, 0.56, 1.4, 0.7, 4.2, 2.1, 18, 0.56, 0.56.]

Homework # 4, Problem #2
FLOP design C
[Figure: schematic with device sizes. Nodes: Din, din, Dout_b, Out. Extracted size labels: /0.6, 0.13/0.6, 0.13, 0.56, 0.56, 1.4, 0.7, 4.2, 2.1, 18.0, 0.13, 0.13, 0.13, 0.13.]

Homework # 4, Problem #2
FLOP design D
[Figure: schematic with device sizes. Nodes: Din, din, Dout, Out, pclk. Extracted size labels: /0.6, 0.13/0.6, 0.13, 0.56, 0.56, 1.4, 0.7, 4.2, 2.1, 18.0, 0.13 (seven times), 0.56, 0.56.]