Power-Driven Flip-Flop Merging and Relocation. Shao-Huan Wang, Yu-Yi Liang, Tien-Yu Kuo, Wai-Kei Mak, National Tsing Hua University

Power-Driven Flip-Flop Merging and Relocation
Shao-Huan Wang, Yu-Yi Liang, Tien-Yu Kuo, Wai-Kei Mak
National Tsing Hua University

Outline
- Introduction
- Problem Formulation
- Algorithms
- Experimental Results
- Conclusions

Flip-Flop Merging
- Merge several 1-bit flip-flops into a multi-bit flip-flop (MBFF)
- Eliminate some inverters and reduce area
- Reduce the number of clock sinks

Flip-Flop Merging

Reduction of clock sinks

Related Work
- [15] "Post-placement power optimization with multi-bit flip-flops," ICCAD 2010
- The objective of [15] is to minimize the total FF power
- In contrast, our objective is to minimize the number of clock sinks and the switching power of signal nets

Wirelength of Signal Nets
- Different merging solutions affect the wirelength and switching power of signal nets differently

Post-Placement Relocation
- After merging, we need to relocate the MBFFs
- The relocation affects the total switching power of signal nets
[Figure: example nets labeled with the values 0.3 and 0.1]

Problem Formulation
- Inputs: a preplaced design and an MBFF library
- Objectives:
  - Minimize the number of sinks in the clock network
  - Minimize the switching power of signal nets, where α_i is the switching rate of signal net i
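The second objective can be illustrated with a small sketch. The slide's formula is not preserved in the transcript, so the proxy below is an assumption: it weights each net's half-perimeter wirelength (HPWL) by its switching rate α_i. The names `hpwl` and `switching_power_proxy` are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter wirelength of one net; pins is a list of (x, y)."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def switching_power_proxy(nets):
    """Sum of alpha_i * HPWL_i over all nets.

    nets is a list of (alpha, pins) pairs. Using HPWL as the wirelength
    measure is an assumption, not something stated in the slides.
    """
    return sum(alpha * hpwl(pins) for alpha, pins in nets)


# Two hypothetical nets with switching rates 0.3 and 0.1.
example_nets = [
    (0.3, [(0, 0), (2, 1)]),
    (0.1, [(0, 0), (0, 5)]),
]
```

A high-activity net contributes more per unit of wirelength, which is why the same merge can be good or bad depending on which nets it stretches.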

Constraints
- Guarantee there is no timing violation
- Feasible region of FFs
- Control the placement density
- Maintain the quality of legalization
- Consider routing congestion

Feasible Region of a FF
- Slack = (maximum allowed delay) - D_AB
- Slack_A = Slack_B = Slack / 2

Feasible Region of a FF (cont.)
[Figure: feasible region, with points labeled P, K, and Q]
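A minimal sketch of how such a feasible region could be computed, under assumptions the slides do not spell out: each connected pin allows the FF to sit within some Manhattan radius `r` (derived from the FF's share of the slack by a delay model not shown here), and the region is the intersection of those diamonds. Rotating coordinates by 45 degrees turns each diamond into an axis-aligned square, so the intersection becomes a rectangle. The function names are hypothetical.

```python
def to_rotated(x, y):
    """Map (x, y) to (u, v) = (x + y, x - y).

    In these coordinates a Manhattan ball of radius r becomes an
    axis-aligned square of half-width r.
    """
    return x + y, x - y


def feasible_region(pins):
    """Intersect the diamonds around all connected pins.

    pins is a list of (x, y, r): the FF must stay within Manhattan
    distance r of (x, y). The radius r is an assumed input.
    Returns (ulo, vlo, uhi, vhi) in rotated coordinates, or None if the
    diamonds have no common point.
    """
    ulo = vlo = float("-inf")
    uhi = vhi = float("inf")
    for x, y, r in pins:
        u, v = to_rotated(x, y)
        ulo, uhi = max(ulo, u - r), min(uhi, u + r)
        vlo, vhi = max(vlo, v - r), min(vhi, v + r)
    if ulo > uhi or vlo > vhi:
        return None
    return (ulo, vlo, uhi, vhi)
```

Working in rotated coordinates is what makes the later steps rectangle problems rather than diamond problems.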

Intersection Graph
- Get the feasible regions of all FFs
- The intersection of feasible regions can be represented by an intersection graph
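As an illustration, the intersection graph can be built by testing every pair of feasible regions for overlap. This sketch assumes each region is a closed axis-aligned rectangle `(xlo, ylo, xhi, yhi)`, for example in rotated coordinates; the quadratic pairwise test is chosen for clarity, not efficiency, and the names are hypothetical.

```python
from itertools import combinations


def overlaps(a, b):
    """True if closed rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersection_graph(rects):
    """Adjacency sets: vertex i is FF i, edge (i, j) if regions overlap."""
    adj = {i: set() for i in range(len(rects))}
    for i, j in combinations(range(len(rects)), 2):
        if overlaps(rects[i], rects[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj
```

An edge between two FFs means there is a location where both can be placed without a timing violation, so they are candidates for merging.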

Design Flow

Find All the Maximal Cliques
- Finding all the maximal cliques is NP-hard in a general graph
- However, it can be solved in polynomial time in a rectangle intersection graph
- Solved by a sweep line algorithm
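The sketch below shows why the rectangle case is tractable. It is not the sweep line algorithm from the slide but a simpler brute-force substitute: pairwise-intersecting axis-aligned rectangles always share a common point (the Helly property), and that point can be taken at some rectangle's left edge and some rectangle's bottom edge. Enumerating those candidate points therefore finds every maximal clique in polynomial time. The function name is hypothetical.

```python
from itertools import product


def maximal_cliques(rects):
    """Maximal cliques of a rectangle intersection graph.

    rects is a list of closed rectangles (xlo, ylo, xhi, yhi).
    Brute force over candidate points, roughly O(n^3); a sweep line
    would be faster but is not shown here.
    """
    xs = {r[0] for r in rects}
    ys = {r[1] for r in rects}
    candidates = set()
    for x, y in product(xs, ys):
        members = frozenset(
            i for i, (xlo, ylo, xhi, yhi) in enumerate(rects)
            if xlo <= x <= xhi and ylo <= y <= yhi
        )
        if members:
            candidates.add(members)
    # Keep only the sets that are not strictly contained in another set.
    maximal = [c for c in candidates if not any(c < d for d in candidates)]
    return sorted(sorted(c) for c in maximal)
```

There are at most n^2 candidate points, so the number of maximal cliques is polynomially bounded, unlike in a general graph.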

MBFF Extraction
- We want to extract the MBFFs by clique partitioning
- Clique partitioning is an NP-hard problem
- Different extraction strategies will affect:
  - The number of clock sinks
  - The wirelength of signal nets

MBFF Extraction (cont.)
Cost of creating an MBFF β, with the following terms:
- D(β): the merging possibility of the FFs in β
- B(β): the number of bits of β
- The switching power of the signal nets connected to β, where α_i is the switching rate of signal net i

Example of Extraction Algorithm
- Assume the library contains 1-, 2-, and 4-bit MBFFs
- There are two maximal cliques: c_1 = {1,2,3,6,7} and c_2 = {4,5,6}
- Randomly sample 1, 2, or 4 FFs from each of c_1 and c_2: β_1 = {1,2,3,6}, β_2 = {4,6}
- cost(β_1) < cost(β_2), so select β_1
- Re-sample β_1 = {7} from c_1
- cost(β_2) < cost(β_1), but FF 6 is already covered, so re-sample β_2 = {4,5} from c_2
- Final extraction: {β_1, β_2, β_1}
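The steps above can be sketched as a sample-and-select loop. This is a simplified illustration rather than the authors' algorithm: it re-samples every clique in each round instead of only the conflicting one, and the cost function is supplied by the caller because the slide's cost formula is not preserved in the transcript. The name `extract_mbffs` and its parameters are hypothetical.

```python
import random


def extract_mbffs(cliques, sizes, cost, seed=0):
    """Greedy MBFF extraction by random sampling (simplified sketch).

    cliques: list of sets of FF ids (maximal cliques).
    sizes:   bit widths available in the library; must include 1.
    cost:    caller-supplied function mapping a frozenset of FFs to a
             number; lower is better (placeholder for the slide's cost).
    """
    assert 1 in sizes, "a 1-bit cell is needed so every FF can be covered"
    rng = random.Random(seed)
    uncovered = set().union(*cliques)
    groups = []
    while uncovered:
        samples = []
        for clique in cliques:
            avail = sorted(clique & uncovered)
            if not avail:
                continue
            k = rng.choice([s for s in sizes if s <= len(avail)])
            samples.append(frozenset(rng.sample(avail, k)))
        best = min(samples, key=cost)   # cheapest candidate this round
        groups.append(best)
        uncovered -= best               # its FFs are now covered
    return groups


# The cliques from the example, with a toy cost that favors wide MBFFs.
example_groups = extract_mbffs(
    [{1, 2, 3, 6, 7}, {4, 5, 6}], [1, 2, 4], lambda g: 1.0 / len(g)
)
```

Because samples are drawn only from uncovered FFs, each FF ends up in exactly one extracted MBFF, and every MBFF lies inside a single clique.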

MBFF Relocation
- For an MBFF β, we want to minimize the switching power of its signal nets, where α_i is the switching rate of signal net i
- We can formulate it as a weighted median problem

MBFF Relocation (cont.)
- The weights of P1 to P5 are 2:1:1:3:1
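A minimal sketch of the weighted median idea, assuming Manhattan wirelength so that x and y can be solved independently. The pin coordinates in the example are invented for illustration; only the weights 2:1:1:3:1 come from the slide. The function names are hypothetical.

```python
def weighted_median(values, weights):
    """Smallest value at which the running weight reaches half the total.

    This point minimizes sum(w_i * |v - v_i|) over v.
    """
    half = sum(weights) / 2.0
    acc = 0.0
    for v, w in sorted(zip(values, weights)):
        acc += w
        if acc >= half:
            return v


def preferred_location(pins):
    """pins is a list of (x, y, weight); returns the weighted-median point."""
    xs = [p[0] for p in pins]
    ys = [p[1] for p in pins]
    ws = [p[2] for p in pins]
    return weighted_median(xs, ws), weighted_median(ys, ws)


# Hypothetical coordinates 1..5 for P1..P5 with the slide's weights.
example_x = weighted_median([1, 2, 3, 4, 5], [2, 1, 1, 3, 1])
```

With these assumed coordinates the total weight is 8, and the running weight first reaches 4 at coordinate 3. Any point between 3 and 4 gives the same weighted cost here, so the optimum is an interval rather than a single point.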

MBFF Relocation (cont.)
- Because of bin density constraints, some MBFFs cannot be placed in their preferred region

Experimental Setup
- Implemented in C++
- Run on Linux with a 2.13 GHz CPU
- 9 test cases:
  - r1~r5 from [22] (Exact Zero-Skew)
  - t0~t3 from the 2010 CAD Contest of Taiwan
- Randomly generated switching rates of 5%~15%

Experimental Results
- Reduction of clock sinks and wirelength of the clock tree

Experimental Results (cont.)
- Reduction of wirelength and estimated switching power of nets connected to FFs

Comparison with [15]
- Our algorithm can be modified to target the objectives of [15]

Conclusions
- We present a power-driven flip-flop merging and relocation approach to reduce the switching power consumption of the entire circuit

Q&A
Thanks for your attention