VirtualSync: Timing Optimization by Synchronizing Logic Waves with Sequential and Combinational Components as Delay Units


Grace Li Zhang (1), Bing Li (1), Masanori Hashimoto (2) and Ulf Schlichtmann (1)

(1) Chair of Electronic Design Automation, Technical University of Munich (TUM)
(2) Department of Information Systems Engineering, Osaka University

Overview

- Motivation
- VirtualSync Timing Model
- Timing Optimization Framework of VirtualSync
- Experimental Results
- Summary

The Traditional Timing Paradigm

Example parameters: clock-to-q delay $t_{cq} = 3$, setup time $t_{su} = 1$, hold time $t_h = 1$.

$T_{min} = t_{cq} + d_{max} + t_{su} = 3 + 17 + 1 = 21$

- Sequential components such as flip-flops synchronize signal propagation.
- Combinational gates perform logic computations.
- Advantage: reduced design effort.

Disadvantages:

- Flip-flops have clock-to-q delays and impose setup times.
- Delay imbalances between flip-flop stages degrade performance.
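The minimum-period formula above can be checked with a minimal sketch. The function name `min_clock_period` is an assumption made for illustration; the numbers are the ones on the slide.

```python
def min_clock_period(t_cq, d_max, t_su):
    """Setup-limited minimum clock period of one flip-flop stage:
    T_min = t_cq + d_max + t_su."""
    return t_cq + d_max + t_su


# Slide values: t_cq = 3, longest combinational delay d_max = 17, t_su = 1.
t_min = min_clock_period(t_cq=3, d_max=17, t_su=1)
print(t_min)  # 21
```

Only the longest combinational delay between two flip-flops enters the formula, which is why unbalanced stages limit the clock period.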

Timing Optimization Methods

- Gate sizing: $T_{min} = 3 + 12 + 1 = 16$
- Retiming: $T_{min} = 3 + 7 + 1 = 11$, the limit in the traditional timing paradigm
- VirtualSync: $T_{min} = (3 + 13 + 1)/2 = 8.5$, a 22.7% reduction compared with retiming and sizing
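As a quick arithmetic check of the three figures, the sketch below recomputes each period and the quoted reduction. The helper `period` and its `divisor` parameter are assumptions for illustration; the divisor 2 is copied from the slide's expression rather than derived here.

```python
T_CQ, T_SU = 3, 1  # clock-to-q delay and setup time from the slides


def period(d_max, divisor=1):
    """(t_cq + d_max + t_su) / divisor, mirroring the slide's expressions."""
    return (T_CQ + d_max + T_SU) / divisor


sizing = period(12)                  # 16.0
retiming = period(7)                 # 11.0
virtualsync = period(13, divisor=2)  # 8.5

# Relative reduction of VirtualSync against the retiming result.
reduction = (retiming - virtualsync) / retiming
print(f"{reduction:.1%}")  # 22.7%
```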

VirtualSync Concept

[Figure: circuit under optimization, with its boundary and flip-flops F3, F5 and F6 marked. Annotations: fast path must be delayed; loops must be blocked.]

VirtualSync:

- Step 1: Remove all flip-flops except those at the boundary of the module.
- Step 2: Block fast signals for timing synchronization, including
  - signals arriving at boundary flip-flops too early through fast paths, and
  - signals traveling across combinational loops.

VirtualSync Concept

[Figure: circuit under optimization between its boundaries. Annotations: fast path delayed by buffers; loop blocked by a flip-flop/latch (F/L); relative reference points for timing checking.]

- Delay units (logic gates, flip-flops and latches) are used to slow down signals on fast paths and loops.
- Relative reference points provide relative timing information.

Delay Units in VirtualSync

[Figure: three plots of output arrival time $s_v$ against input arrival time $s_u$, each marking an input gap and the resulting output gap.]

- Combinational delay unit: linear delaying effect (delay $t_d$).
- Flip-flop: constant delaying effect (axis marks $t_h$, $t_{su}$ and $T$; output level $T + t_{cq}$).
- Latch: piecewise delaying effect (axis marks $t_h$, $D \cdot T$ and $T$; output level $T/2 + t_{cq}$; $D$ is the duty cycle).

Definitions:

- Input gap: the difference between the arrival times of two signals at a delay unit.
- Output gap: the difference between their arrival times after they pass through the unit.
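The three delaying effects can be illustrated with a simplified sketch. It assumes a single clock cycle starting at time 0, no phase shift and no guard band, and it assumes the latch is non-transparent until $D \cdot T$. The function names and the values of `T_D`, `T_DQ` and `D` are hypothetical; the other parameters follow the slides.

```python
def comb_delay(s_u, t_d):
    """Combinational unit: every arrival is shifted by the same delay t_d."""
    return s_u + t_d


def flip_flop(s_u, T, t_cq, t_su, t_h):
    """Flip-flop: any arrival inside the capture window leaves at T + t_cq."""
    if not (t_h <= s_u <= T - t_su):
        raise ValueError("arrival outside the flip-flop capture window")
    return T + t_cq


def latch(s_u, T, D, t_cq, t_dq):
    """Latch (assumed non-transparent until D*T): early arrivals are held,
    later arrivals pass through with the data-to-q delay t_dq."""
    return max(D * T + t_cq, s_u + t_dq)


T, T_CQ, T_SU, T_H = 10, 3, 1, 1  # values from the slides
D, T_D, T_DQ = 0.5, 4, 2          # hypothetical values for illustration

early, late = 2, 8                # two arrivals with an input gap of 6
gaps = {
    "comb": comb_delay(late, T_D) - comb_delay(early, T_D),
    "flip-flop": flip_flop(late, T, T_CQ, T_SU, T_H)
                 - flip_flop(early, T, T_CQ, T_SU, T_H),
    "latch": latch(late, T, D, T_CQ, T_DQ) - latch(early, T, D, T_CQ, T_DQ),
}
print(gaps)
```

With these numbers the input gap of 6 is preserved by the combinational unit, removed entirely by the flip-flop, and reduced to 2 by the latch.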

Overall Flow of VirtualSync

1. Start from the sequential circuit.
2. Remove all flip-flops.
3. Mark reference points.
4. Create selection variables for delay units at each circuit node.
5. Maximize performance and minimize area using ILP.
6. Set the lower bound of inserted delay.
7. Check whether all required delays are padded.
   - No: decrease the lower bound of inserted delay and iterate.
   - Yes: output the optimized circuit.

Results of VirtualSync

[Figure: speed increase (%) and area change (%) compared with an ideally balanced design.]

Summary

- A new timing model, VirtualSync, is proposed, in which sequential components and combinational logic gates serve as delay units.
- By viewing flip-flops and latches as delay units, circuit performance can be pushed beyond the limit of the traditional timing paradigm.
- VirtualSync shows good potential for high-performance designs.

Thank you for your attention!

Heuristic Method in VirtualSync

Flowchart boxes, in order:

1. Emulation of sequential delay units with different padding delays for long and short paths.
2. Model approximation with clock/data-to-q delays.
3. Decision: are different padding delays needed? (yes / no)
4. Model legalization using accurate delay models.
5. Decision: are different padding delays needed? (yes / no)
6. Buffer replacement using sequential units and delay discretization.
7. Optimized circuit.

Results of VirtualSync

              Critical part            Optimized circuit                  Comparison
Circuit       #flip-flops   #gates     #flip-flops  #latches  #buffers    Clock period reduction   Area increase
s5378                  35     1877              11        14        94                     11.5%           2.84%
s9234                  91     3981              58        45        91                      2.5%          -5.17%
s13207                191     3483              95        73        52                      2.5%          -1.09%
s15850                 71     3847              72        18        26                        0%           6.01%
s38584                126     9498              62        75        46                      0.5%           -0.5%
systemcdes             92     3232              90        81       227                      3.5%           2.43%
mem_ctrl              136     7500             101        39       140                      3.5%           0.97%
usb_funct             138     5378             123        37        60                        4%           0.21%
ac97_ctrl             237     4873              42       172       218                        0%          -9.76%
pci_bridge            239     9510             188        68       338                        3%           0.05%

The baseline for the comparison is extreme retiming and sizing, at which timing performance has already reached the limit of the traditional timing paradigm.

Relative Timing References in VirtualSync

[Figure: a path F1, F2, F3, F4 between two boundaries, passing nodes o, u, v, w, t, z, with combinational delays 11, 3 and 2 between consecutive flip-flops. Parameters: $T = 10$, $t_{cq} = 3$, $t_{su} = 1$, $t_h = 1$. Arrival times: $s_o = 3$, $s_u = 14$, $s_v = 4$, $s_w = 7$, $s_t = 3$, $s_z = 5$. Two shifts of -10 are marked. Legend: removed after optimization / kept after optimization. Check at the boundary: $s_z \ge t_h$ and $s_z + t_{su} \le T$.]

- The locations of removed flip-flops, such as F2 and F3, are called anchor points.
- Anchor points relate timing information to the boundary flip-flops.
- Each time a signal passes an anchor point, its arrival time is converted by subtracting $T$.
- If F3 is removed, $s_t = s_w - T = 7 - 10 = -3$, so the arrival time $s_z$ becomes $-3 + 2 = -1$, which violates the hold time constraint $s_z \ge t_h$.
- The timing constraints at the boundary flip-flops therefore force the use of internal sequential delay units.
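The arithmetic of this example can be replayed with a short sketch. It assumes the reading of the figure in which F2 is removed (its anchor point subtracts $T$) and F3 is either kept (the signal is relaunched with $t_{cq}$) or removed (its anchor point also subtracts $T$). The function names are hypothetical.

```python
T, T_CQ, T_SU, T_H = 10, 3, 1, 1  # values from the slide


def arrival_at_f4(f3_removed):
    """Arrival time s_z at boundary flip-flop F4 for the slide's example."""
    s_o = T_CQ        # launched by F1: 3
    s_u = s_o + 11    # after the first combinational block: 14
    s_v = s_u - T     # anchor point of removed F2: 4
    s_w = s_v + 3     # after the second combinational block: 7
    # F3 either stays (signal relaunched with t_cq) or becomes an anchor point.
    s_t = s_w - T if f3_removed else T_CQ
    return s_t + 2    # after the third combinational block


def meets_boundary_constraints(s_z):
    """Hold and setup checks at the boundary: s_z >= t_h and s_z + t_su <= T."""
    return s_z >= T_H and s_z + T_SU <= T


for removed in (False, True):
    s_z = arrival_at_f4(removed)
    print(removed, s_z, meets_boundary_constraints(s_z))
```

Keeping F3 gives $s_z = 5$, which passes both checks; removing it gives $s_z = -1$, which fails the hold check.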

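Synchronizing Logic Waves by Delay Units

[Figure: path through nodes u, v, w, t, z with four decisions: combinational delay $\xi_{uv}$ between u and v; sizing of the gate delay $d_{vw}$ between v and w; sequential delay unit between w and t; anchor $\lambda_{tz}$ between t and z.]

1. Combinational delay unit and gate sizing

$\overline{s}_w \ge \overline{s}_u + \xi_{uv} r_u + d_{vw} r_u$  (1)
$\underline{s}_w \le \underline{s}_u + \xi_{uv} r_l + d_{vw} r_l$  (2)

- $\overline{s}_u$, $\underline{s}_u$, $\overline{s}_w$ and $\underline{s}_w$ are the latest and earliest arrival times at nodes u and w.
- $\xi_{uv}$ is the delay of an inserted buffer.
- $r_u$ and $r_l$ are two constants that reserve a guard band for process variations.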

Synchronizing Logic Waves by Delay Units

2. Insertion of sequential delay units

Case 1: No sequential delay unit is inserted between w and t.

$\overline{s}_t \ge \overline{s}_w$  (3)
$\underline{s}_t \le \underline{s}_w$  (4)

$\overline{s}_t$, $\underline{s}_t$, $\overline{s}_w$ and $\underline{s}_w$ are the latest and earliest arrival times at nodes t and w.

Synchronizing Logic Waves by Delay Units

2. Insertion of sequential delay units

Case 2: A flip-flop is inserted between w and t.

$\overline{s}_w, \underline{s}_w \ge N_{wt} T + \phi_{wt} + t_h r_u$  (5)
$\overline{s}_w, \underline{s}_w \le (N_{wt} + 1) T + \phi_{wt} - t_{su} r_u$  (6)
$\overline{s}_t \ge (N_{wt} + 1) T + \phi_{wt} + t_{cq} r_u$  (7)
$\underline{s}_t \le (N_{wt} + 1) T + \phi_{wt} + t_{cq} r_l$  (8)

- $\phi_{wt}$ is the phase shift of the clock signal.
- A flip-flop only works if its input arrives no earlier than $t_h$ after the rising clock edge and no later than $t_{su}$ before the next rising clock edge.
- The signal always starts to propagate from the next active clock edge.
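The flip-flop constraints (5)-(8) can be sketched numerically. The sketch assumes the reading in which (5) and (6) bound the arrival window at w from below and above, and (7) and (8) give the latest and earliest departure at t. The function names and the default guard-band factors of 1.0 are assumptions for illustration.

```python
def ff_capture_window(N, T, phi, t_su, t_h, r_u=1.0):
    """Allowed arrival window at the flip-flop input, per (5) and (6)."""
    lower = N * T + phi + t_h * r_u          # hold side, constraint (5)
    upper = (N + 1) * T + phi - t_su * r_u   # setup side, constraint (6)
    return lower, upper


def ff_departure(N, T, phi, t_cq, r_u=1.0, r_l=1.0):
    """Latest and earliest departure from the flip-flop, per (7) and (8)."""
    latest = (N + 1) * T + phi + t_cq * r_u
    earliest = (N + 1) * T + phi + t_cq * r_l
    return latest, earliest


# Slide parameters T = 10, t_cq = 3, t_su = 1, t_h = 1; N = 0 and phi = 0 assumed.
window = ff_capture_window(N=0, T=10, phi=0, t_su=1, t_h=1)
departure = ff_departure(N=0, T=10, phi=0, t_cq=3)
print(window, departure)
```

With unit guard-band factors the window is 1 to 9 and both departure times equal 13; setting `r_u` above 1 and `r_l` below 1 narrows the window and separates the latest and earliest departures.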

Synchronizing Logic Waves by Delay Units

2. Insertion of sequential delay units

Case 3: A level-sensitive latch is inserted between w and t.

$\overline{s}_t \ge N_{wt} T + \phi_{wt} + D T + t_{cq} r_u$  (9)
$\overline{s}_t \ge \overline{s}_w + t_{dq} r_u$  (10)
$\underline{s}_t \le \max(N_{wt} T + \phi_{wt} + D T + t_{cq} r_l,\ \underline{s}_w + t_{dq} r_l)$  (11)

- $D$ is the duty cycle of the clock signal.
- The first expression covers the case in which the latch is non-transparent; the second covers the case in which it is transparent.
- The earliest signal starts to propagate from the maximum of the two earliest times.

Synchronizing Logic Waves by Delay Units

3. Reference shift with respect to anchor points

$s_z = s_t - \lambda_{tz} T$  (12)

4. Wave non-interference condition

$\overline{s}_u + t_{stable} \le \underline{s}_u + T$  (13)

Overall formulation

- Objective: find a solution that makes the circuit work at a given clock period.
- Subject to: (1)-(13) and the setup and hold time constraints at the boundary flip-flops.
- The problem is NP-hard.
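The reference shift (12) and the wave non-interference condition (13) can be sketched as two small checks. The sketch assumes (13) reads "latest arrival plus stable time is no later than earliest arrival plus $T$". The function names and the value of `t_stable` are hypothetical.

```python
def shift_reference(s_t, lam, T):
    """Reference shift (12): subtract T once per anchor point passed."""
    return s_t - lam * T


def waves_do_not_interfere(s_latest, s_earliest, t_stable, T):
    """Condition (13), as assumed here: the latest arrival of one wave,
    held stable for t_stable, must not run into the earliest arrival
    of the next wave one period later."""
    return s_latest + t_stable <= s_earliest + T


# An arrival time of 14 crossing one anchor point at T = 10 becomes 4,
# matching s_u = 14 and s_v = 4 in the relative timing example.
print(shift_reference(14, 1, 10))  # 4

# Hypothetical stable time t_stable = 2.
print(waves_do_not_interfere(7, 4, 2, 10))   # True
print(waves_do_not_interfere(14, 3, 2, 10))  # False
```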

Results of sequential delay units after buffer replacement

[Figure: number of sequential delay units (axis from 0 to 300) before and after buffer replacement.]

Runtime

Circuit       $T_r$ (s)
s5378            121.6
s9234           7251.1
s13207          3121.6
s15850          289.97
s38584          1142.3
systemcdes      7310.5
mem_ctrl        3750.1
usb_funct       1211.7
ac97_ctrl       2936.8
pci_bridge      7418.5