Digital System Clocking: High-Performance and Low-Power Aspects. Microprocessor Examples

Digital System Clocking: High-Performance and Low-Power Aspects
Vojin G. Oklobdzija, Vladimir M. Stojanovic, Dejan M. Markovic, Nikola M. Nedovic
Chapter 9: Microprocessor Examples
Wiley-Interscience and IEEE Press, January 2003

Microprocessor Examples
Clocking for Intel Microprocessors
  IA-32 Pentium Pro
  First IA-64 Microprocessor
  Pentium 4
Sun Microsystems UltraSPARC-III Clocking
  Clocking and CSEs
Alpha Clocking: A Historical Overview
  Clocking and CSEs
IBM Microprocessors
  Level-Sensitive Scan Design
  Examples of CSEs

Nov. 14, 2003 Digital System Clocking: Oklobdzija, Stojanovic, Markovic, Nedovic

Intel Microprocessor Features

                   Pentium II     Pentium III     Pentium 4
MPR Issue          June 1997      April 2000      Dec 2001
Clock Speed        266 MHz        1 GHz           2 GHz
Pipeline Stages    12/14          12/14           22/24
Transistors        7.5M           24M             42M
Cache (I/D/L2)     16K/16K/-      16K/16K/256K    12K/8K/256K
Die Size           203 mm²        106 mm²         217 mm²
IC Process         0.28 µm, 4M    0.18 µm, 6M     0.18 µm, 6M
Max Power          27 W           23 W            67 W

Source: Microprocessor Report Journal

IA-32 Pentium Pro
Figure: Clock distribution network with deskewing circuit (Geannopoulos and Dai 1998), Copyright 1998 IEEE
Figure labels: Ext FB, CLK Gen, Delay Line, Delay SR, Deskew Control, Left Spine, Core, PD, Right Spine

Adaptive Deskewing Technique
Equalization of two clock distribution spines by compensating for delay mismatch
  Delay lines
  Phase detector
  Controller
Result: global clock skew of only 15 ps
  0.25 µm technology
  7.5M transistors

IA-32 Pentium Pro
Figure: Delay shift register (Geannopoulos and Dai 1998), Copyright 1998 IEEE
Figure labels: In, Delay Line, Out, Load<1:15,2>, Load<0:14,2>, Delay Shift Register

IA-32 Pentium Pro
Figure: Phase detector (Geannopoulos and Dai 1998), Copyright 1998 IEEE
Figure labels: Right, Left, Bandwidth Control, Delay = n, Phase Detector 1 (Left Leads), Phase Detector 2 (Right Leads)

First IA-64 Microprocessor
Figure: Clock distribution topology (Rusu and Tam 2000), Copyright 2000 IEEE
Figure labels: PLL, RCDs, Core Clock, Reference Clock, Deskew Cluster

Programmable Deskew Units
Strategy similar to that in IA-32
External differential clock at the system bus frequency
PLL generates the internal clock at 2x frequency
Clock distribution architecture
  Balanced global clock tree
  Multiple deskew buffers
  Multiple local clock buffers

First IA-64 Microprocessor
Figure: Deskew buffer architecture (Rusu and Tam 2000), Copyright 2000 IEEE
Figure labels: Global Clock, Reference Clock, TAP Interface, Phase Detector, Digital Filter, Control FSM, Deskew Settings, Deskew Buffer, RCD, Regional Clock Grid, Regional Feedback Clock

First IA-64 Microprocessor
Figure: Digitally controlled delay line (Rusu and Tam 2000), Copyright 2000 IEEE
Figure labels: Input, Output, Enable, Delay Control Register
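The deskew buffer figure chains a phase detector, a digital filter, and a control FSM that writes the delay control register. A minimal behavioral sketch of that chain follows. It assumes a simple "N consecutive identical samples" filter and a saturating up/down register; the register width, the filter length, the sample encoding, and the function name `run_deskew_fsm` are illustrative assumptions, not details from Rusu and Tam.

```python
def run_deskew_fsm(samples, code=0, bits=5, filter_len=4):
    """Update a delay-control code from a stream of phase-detector samples.

    samples    : iterable of +1 (regional feedback clock late), -1 (early), 0 (aligned)
    code       : starting value of the delay control register (assumed encoding:
                 a larger code means more delay)
    bits       : register width (assumed)
    filter_len : consecutive identical samples needed before the FSM acts (assumed)
    """
    max_code = (1 << bits) - 1
    run_dir, run_len = 0, 0
    for s in samples:
        # Digital filter: count a run of identical non-zero samples.
        if s != 0 and s == run_dir:
            run_len += 1
        else:
            run_dir, run_len = s, (1 if s != 0 else 0)
        if run_len == filter_len:
            # Late feedback clock: remove delay. Early: add delay. Saturate at the ends.
            code = min(max_code, max(0, code - run_dir))
            run_len = 0
    return code
```

For example, eight consecutive "early" samples starting from code 8 yield code 10, while a stream that alternates between early and late leaves the code unchanged. That is the purpose of the filter: phase-detector jitter around the aligned point should not make the delay setting dither.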

First IA-64 Microprocessor
Figure: Simulated regional clock-grid skew (Rusu and Tam 2000), Copyright 2000 IEEE

First IA-64 Microprocessor
Figure: Measured regional clock skew (Rusu and Tam 2000), Copyright 2000 IEEE

Pentium 4
Figure: Core and I/O clock generation (Kurd et al. 2001), Copyright 2001 IEEE
Blocks shown: core PLL, I/O PLL, core clock distribution, I/O data distribution, clock enable generator, clock enable distribution and sync (1x and 2x enables), divide by 4, outbound deskew state machine, inbound clocks generation state machine, strobe glitch protection and detection, input buffers, MSFFs

Multi-GHz Clock Network in Pentium 4
Three core and three I/O frequencies (6 frequencies in total running concurrently)
Differential off-chip reference clock
PLL synthesizes core and I/O clocks
Global core clock distribution
  47 independent clock domains
  Each domain has a 5-bit deskew control register
  Clock skew < 20 ps

Pentium 4
Figure: Logical diagram of core clock distribution (Kurd et al. 2001), Copyright 2001 IEEE
Blocks shown: PLL, 3-stage binary tree of clock repeaters, Domain Buffers 1 to 47, phase detectors, local clock macros, sequential elements, connection to the Test Access Port

Pentium 4
Figure: Example of local clock buffers generating various frequency, phase and types of clocks (Kurd et al. 2001), Copyright 2001 IEEE
Inputs shown: Stretch 1, Stretch 0, Enable 1, Enable 2, Enable, SlowSync, Gclk
Blocks shown: Adjustable Delay Buffer, Buf Type 1, Buf Type 2, Buf Type 3
Outputs shown: medium freq. pulse clk phase 1, medium freq. pulse clk phase 2, slow freq. pulse clk phase 1, medium freq. normal clk phase 1, fast freq. pulse clk

Intel Clocking: Summary
Increasing clock speeds and die size
Balancing the clock skew in large designs using simple RC trees is becoming less effective
  Insertion delay of 7-8 FO4 due to increased die size
  Comparable to the clock period
Clock skew control has been getting harder due to
  Increased PVT variations
  Inductive effects at multi-GHz rates
Use of active deskewing circuits

UltraSPARC Family Characteristics

                     UltraSPARC-I         UltraSPARC-II        UltraSPARC-III
Year                 1995                 1997                 2000
Architecture         SPARC V9, 4-issue    SPARC V9, 4-issue    SPARC V9, 4-issue
Die size             17.7 x 17.8 mm²      12.5 x 12.5 mm²      15 x 15.5 mm²
# of transistors     5.2M                 5.4M                 23M
Clock frequency      167 MHz              330 MHz              1 GHz
Supply voltage       3.3 V                2.5 V                1.6 V
Process              0.5 µm CMOS          0.35 µm CMOS         0.15 µm CMOS
Metal layers         4 (Al)               5 (Al)               7 (Al)
Power consumption    <30 W                <30 W                <80 W

UltraSPARC-III: Clocking
Performance-driven high-power clock distribution
Eight logic gates per cycle
High-speed semi-dynamic flip-flops with logic embedding
Large hold time mandates use of advanced tools for fixing fast-path violations

UltraSPARC-III: Clocking
Figure: Clock distribution delay in UltraSPARC-III (Heald et al. 2000), Copyright 2000 IEEE

UltraSPARC-III: Clocked Storage Elements
Figure: Semidynamic flip-flop (Klass 1998), Copyright 1998 IEEE
Single-ended dynamic structure with keepers for static operation and clock pulsing
Positive feedback (NAND) improves low-to-high setup time
Fast, at the price of high internal and clock power

UltraSPARC-III: Clocked Storage Elements
Figure: Logic embedding in a semi-dynamic flip-flop; two-input XOR function (Klass 1998), Copyright 1998 IEEE
A non-inverting logic function can be embedded by replacing the input transistor with an nMOS logic network
Necessary for fitting 8 logic stages in the cycle time, also used for scan
Complexity of embedded logic is limited by the nMOS stack depth

UltraSPARC-III: Clocked Storage Elements
Figure: Single-ended dynamic SDFF and differential dynamic SDFF (Klass 1998), Copyright 1998 IEEE
Dynamic version of the SDFF used in dynamic logic paths
Outputs exercise a precharge-evaluate sequence to ensure monotonicity
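The logic-embedding idea for the semi-dynamic flip-flop, where the single input transistor is replaced by an nMOS pull-down network, can be modeled at the switch level. The sketch below is a simplified abstraction rather than the Klass circuit: it assumes the precharged internal node discharges whenever the network conducts during the sampling window and that the flip-flop is non-inverting overall, so the captured output equals the network's conduction function. The network encoding, the signal names (`a`, `a_n`, `b`, `b_n`), and the series/parallel arrangement of the XOR are assumptions.

```python
def conducts(net, values):
    """True if the nMOS network has a conducting path for the given gate values.

    A network is either a signal name (one transistor) or a tuple
    ("series", ...) / ("parallel", ...) of sub-networks.
    """
    if isinstance(net, str):
        return bool(values[net])
    kind, *branches = net
    results = [conducts(b, values) for b in branches]
    return all(results) if kind == "series" else any(results)


def stack_depth(net):
    """Largest number of series transistors, the quantity that limits embedding."""
    if isinstance(net, str):
        return 1
    kind, *branches = net
    depths = [stack_depth(b) for b in branches]
    return sum(depths) if kind == "series" else max(depths)


# Assumed two-input XOR network using true and complement inputs.
XOR_NET = ("parallel", ("series", "a", "b_n"), ("series", "a_n", "b"))


def captured_q(net, a, b):
    """Value captured by the idealized flip-flop for inputs a and b."""
    values = {"a": a, "a_n": 1 - a, "b": b, "b_n": 1 - b}
    x = 0 if conducts(net, values) else 1  # precharged node discharges if a path exists
    return 1 - x                           # non-inverting from the network to Q
```

Here `captured_q(XOR_NET, a, b)` equals `a ^ b` for all four input combinations, and `stack_depth(XOR_NET)` is 2. That depth is in addition to the clocked transistors in the same stack, which is one way to see why only shallow functions are practical to embed.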

UltraSPARC-III: Clocked Storage Elements
Figure: UltraSPARC-III flip-flop (Heald et al. 2000), Copyright 2000 IEEE
Final UltraSPARC-III flip-flop modified by decoupling keepers to increase immunity to α-particles
Somewhat degraded speed and logic embedding property

Alpha Microprocessor Features

                     21064          21164          21264          21364
# transistors [M]    1.68           9.3            15.2           152
Die size [mm²]       16.8 x 13.9    18.1 x 16.5    16.7 x 18.8    21.1 x 18.8
Process              0.75 µm        0.5 µm         0.35 µm        0.18 µm
Supply [V]           3.3            3.3            2.2            1.5
Power [W]            30             50             72             125
Freq. [MHz]          200            300            600            1200
Gates/Cycle          16             14             12             12

Alpha Microprocessors: Clocking
Figure: Alpha microprocessor final clock driver location on the clock grid: (a) 21064, (b) 21164, (c) 21264
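The frequency and gates-per-cycle rows of the Alpha table imply a per-gate delay budget for each generation. The short calculation below makes that explicit. It uses only the table values; the dictionary and function names are illustrative, and dividing the cycle evenly among the gates is a simplification that ignores clocking overhead.

```python
# (frequency in MHz, logic gates per cycle), taken from the table above.
ALPHA = {
    "21064": (200, 16),
    "21164": (300, 14),
    "21264": (600, 12),
    "21364": (1200, 12),
}


def cycle_time_ps(freq_mhz: float) -> float:
    """Clock period in picoseconds for a frequency given in MHz."""
    return 1e6 / freq_mhz


def gate_budget_ps(freq_mhz: float, gates_per_cycle: int) -> float:
    """Average time available per gate if the whole cycle is split evenly."""
    return cycle_time_ps(freq_mhz) / gates_per_cycle


if __name__ == "__main__":
    for name, (freq, gates) in ALPHA.items():
        print(f"{name}: T = {cycle_time_ps(freq):7.1f} ps, "
              f"per gate = {gate_budget_ps(freq, gates):6.1f} ps")
```

The per-gate budget shrinks from about 312 ps for the 21064 to about 69 ps for the 21364. The 6x frequency increase therefore comes from faster gates (process scaling) combined with fewer gates per cycle, 16 down to 12.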

Alpha Microprocessors: Clocking
Figure: 21064 clock skew (Gronowski et al. 1998), Copyright 1998 IEEE

Alpha Microprocessors: Clocking
Figure: 21164 clock skew (Gronowski et al. 1998), Copyright 1998 IEEE

Alpha Microprocessors: Clocking
Figure: 21264 clock hierarchy (Gronowski et al. 1998), Copyright 1998 IEEE
Figure labels: ext. clk, PLL, GCLK grid, box grid, local clk, cond. local clk

Alpha Microprocessors: Clocking
Figure: 21264 clock skew (Gronowski et al. 1998), Copyright 1998 IEEE

Alpha Microprocessors: Clocking
Figure: 21364 major clock domains (Xanthopoulos et al. 2001), Copyright 2001
Figure labels: NCLK, DLL, DLL, DLL, GCLK grid, L2L, L2R

Alpha Microprocessors: Clocking
Figure: 21364, NCLK clock skew (Xanthopoulos et al. 2001), Copyright 2001 IEEE

Alpha µP: Clocked Storage Elements
Figure: 21064 modified TSPC latches (Gronowski et al. 1998), Copyright 1998

Alpha µP: Clocked Storage Elements
Figure: 21164: (a) phase-A latch, (b) phase-B latch (Gronowski et al. 1998), Copyright 1998 IEEE

Alpha µP: Clocked Storage Elements
Figure: Embedding of logic into a latch: (a) 21064 TSPC latch, one level of logic; (b) 21164 latch, two levels of logic (Gronowski et al. 1998), Copyright 1998 IEEE

Alpha µP: Clocked Storage Elements
Figure: 21264 flip-flop (Gronowski et al. 1998), Copyright 1998 IEEE

Alpha Microprocessors: Timing
Figure: Critical-path and race analysis for clock buffering and conditioning (Gronowski et al. 1998), Copyright 1998 IEEE

Critical Path Definition and Criteria
  Identify common clock, D and R
  Maximize D
  Minimize R
  D + U ≤ R + T_cycle
Race Definition and Criteria
  Identify common clock, D and R
  Minimize D
  Maximize R
  D ≥ R + H
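The critical-path and race criteria on the timing slide can be written as two inequality checks. The sketch below rests on an assumed reading of the slide's symbols: D is the delay from the common clock point through the driving element and logic to the receiver's data input, R is the delay from the common clock point to the receiver's clock input, U is the receiver's setup time, H is its hold time, and T_cycle is the clock period. The function names and the example numbers are illustrative and not taken from Gronowski et al.

```python
def critical_path_slack(d_max, r_min, setup, t_cycle):
    """Setup slack: latest data arrival against earliest next-cycle capture clock.

    Non-negative slack means D + U <= R + T_cycle holds.
    """
    return (r_min + t_cycle) - (d_max + setup)


def race_slack(d_min, r_max, hold):
    """Hold slack: earliest data arrival against latest same-cycle capture clock.

    Non-negative slack means D >= R + H holds.
    """
    return d_min - (r_max + hold)


def check_path(d_min, d_max, r_min, r_max, setup, hold, t_cycle):
    """Return (critical_path_ok, race_ok) using worst-case pairings of D and R."""
    return (critical_path_slack(d_max, r_min, setup, t_cycle) >= 0,
            race_slack(d_min, r_max, hold) >= 0)
```

As an illustration with made-up values in picoseconds, `check_path(d_min=120, d_max=1500, r_min=80, r_max=140, setup=60, hold=30, t_cycle=1667)` passes the critical-path check but fails the race check, because 120 is less than 140 + 30. The pairing matters: the critical-path check uses the largest D with the smallest R, and the race check uses the smallest D with the largest R. This is why extra delay in a conditioned clock buffer on the receiver side helps the critical path but makes races more likely.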

Hazard-Free Level-Sensitive Polarity-Hold Latch
Figure labels: Data, +Clock, -Clock, Out (Eichelberger 1983)

General LSSD Configuration
Figure labels: Inputs (X), Combinational Logic, Outputs (Y), Clocked Storage Elements, Clock, Present State S_n, Next State S_{n+1}, Scan-In, Scan-Out
  Y = Y(X, S_n)
  S_{n+1} = f{S_n, X}

LSSD Shift Register Latch
Figure labels: L1 Latch (-Data, -C, -Scan_In, +A, outputs +L1 and -L1), L2 Latch (+B, outputs +L2 and -L2)

LSSD Double Latch Design
Figure labels: Primary Inputs X, Combinational Logic, Primary Outputs Z, State S_n, shift register latches X1 to Xn (each L1 and L2), C1, A Shift, B Shift, Scan In, Scan Out
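The LSSD double-latch arrangement can be illustrated with a small behavioral model: each shift register latch (SRL) has an L1 latch loaded either from system data (clock C) or from the scan input (clock A), and an L2 latch loaded from L1 (clock B). The sketch below is an idealized model with non-overlapping clock pulses and no timing; the class and method names and the 2-bit counter used as the combinational logic are assumptions for illustration only.

```python
class SRL:
    """One shift register latch: L1 (system or scan port) feeding L2."""

    def __init__(self):
        self.l1 = 0
        self.l2 = 0

    def pulse_c(self, data):      # system clock: L1 captures system data
        self.l1 = data

    def pulse_a(self, scan_in):   # shift clock A: L1 captures scan data
        self.l1 = scan_in

    def pulse_b(self):            # shift clock B: L2 captures L1
        self.l2 = self.l1


class ScanChain:
    def __init__(self, n):
        self.srls = [SRL() for _ in range(n)]

    def state(self):
        return [s.l2 for s in self.srls]

    def shift(self, scan_in_bit):
        """One A/B clock pair: move every bit one SRL toward Scan Out."""
        scan_out = self.srls[-1].l2
        inputs = [scan_in_bit] + [s.l2 for s in self.srls[:-1]]
        for srl, bit in zip(self.srls, inputs):
            srl.pulse_a(bit)
        for srl in self.srls:
            srl.pulse_b()
        return scan_out

    def load(self, bits):
        """Scan in a full state so that state() == bits afterwards."""
        for bit in reversed(bits):
            self.shift(bit)

    def unload(self):
        """Scan out the full state (zeros are shifted in behind it)."""
        outs = [self.shift(0) for _ in self.srls]
        return list(reversed(outs))

    def system_cycle(self, next_state_fn, x):
        """One C/B clock pair: S_{n+1} = f(S_n, X)."""
        nxt = next_state_fn(self.state(), x)
        for srl, bit in zip(self.srls, nxt):
            srl.pulse_c(bit)
        for srl in self.srls:
            srl.pulse_b()


def counter2(state, enable):
    """Hypothetical combinational logic: 2-bit counter, state[0] is the LSB."""
    value = (state[0] + 2 * state[1] + enable) % 4
    return [value & 1, (value >> 1) & 1]
```

A typical test sequence is: `chain = ScanChain(2)`, `chain.load([1, 0])`, `chain.system_cycle(counter2, 1)`, then `chain.unload()`, which returns `[0, 1]`. Any internal state can be set and observed through the scan path, so testing the sequential machine reduces to testing its combinational logic.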

IBM S/390 Parallel Server Processor
Figure: LSSD SRL with multiplexer used in the IBM S/390 G4 processor (Sigal et al. 1997), reproduced by permission
Figure labels: CLKG, CLKL, A_CLK, B_CLK, CLK_ENABLE, TEST_DISABLE, SCAN_IN, IN_A, IN_B, SELECT_A, SELECT_N, L1, L2, (SCAN_OUT)

IBM S/390 Parallel Server Processor
Figure: Static multiplexer version of the SRL used in the IBM S/390 G4 (Sigal et al. 1997), reproduced by permission
Figure labels: A_CLK, B_CLK, CLKL, TEST_DISABLE, SCAN_IN, IN_A, IN_B, IN_C, IN_M, IN_N, SELECT_A, SELECT_N, mux_a, mux_m_n, (SCAN_OUT)

IBM S/390 Parallel Server Processor
Figure: Clocked storage element used in the non-timing-critical macros of the IBM S/390 G4 processor (Sigal et al. 1997), reproduced by permission
Figure labels: CLKG, C1, C2, A_CLK, B_CLK, C1_DISABLE, C2_ENABLE, SCAN_IN, IN, L1, L2, (SCAN_OUT)

IBM S/390 Parallel Server Processor
Figure: Clock-generation element used to detect problems created by fast paths, IBM S/390 G4 processor (Sigal et al. 1997), reproduced by permission
Figure labels: CLKG, B_CLK, C1, C2, C1_DISABLE, C2_ENABLE, UNOVERLAP

IBM PowerPC Processor
Figure: The experimental IBM PowerPC processor, panels (a) and (b) (Silberman et al. 1998), reproduced by permission
Blocks shown: True Mux, Complement Mux, Master Latch, Slave Latch
Signals shown: SCAN_GATE, SG, SEL_EXT i, SEL i, SEL 0 to SEL n-1, NCLK, CLK

IBM PowerPC 603: Master-Slave Latch
Figure: The PowerPC 603 MSL (Gerosa et al. 1994), Copyright 1994 IEEE
Signals shown: in, out, SCAN in, ACLK, C1, C2

IBM PowerPC 603: Local Generator
Figure: The PowerPC 603 local clock regenerator (Gerosa et al. 1994), Copyright 1994 IEEE
Signals shown: GCLK, ACLK, SCAN_C1, C1, C1_TEST, C1_FREEZE, C2, C2_TEST, C2_FREEZE, WAITCLK, OVERRIDE

Summary
Intel Microprocessors
  Active clock deskewing in Pentium processors
Sun Microsystems Processors
  Semidynamic flip-flop (one of the fastest single-ended flip-flops today, "soft-edge")
Alpha Processors
  Performance leader in the 90s
  Incorporating logic into CSEs
IBM Processors
  Design for testability techniques
  Low-power champion PowerPC 603