Minimizing Leakage of Sequential Circuits through Flip-Flop Skewing and Technology Mapping


JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.7, NO.4, DECEMBER, 2007 215

Sewan Heo and Youngsoo Shin

Abstract: Leakage current of CMOS circuits has become a major factor in VLSI design. Although many circuit-level techniques have been developed, most of them require a significant amount of designers' effort and are not well aligned with the traditional VLSI design process. In this paper, we focus on technology mapping, the step of logic synthesis in which gates are selected from a particular library to implement a circuit. We take a radical approach to push the limit of technology mapping in its capability of suppressing leakage current: we use a probabilistic leakage (together with delay) as the cost function that drives the mapping; we consider pin reordering as one of the options in the mapping; we increase the library size by employing gates with larger gate length; and we employ a new flip-flop that is specifically designed for low leakage through selective increase of gate length. When all techniques are applied to several benchmark circuits, an average leakage saving of 46% is achieved with a 45-nm predictive model, compared to conventional technology mapping.

Index Terms: Low power, leakage current, logic synthesis, technology mapping, VLSI design

Manuscript received Nov. 5, 2007; revised Nov. 30, 2007. Department of Electrical Engineering, KAIST, Daejeon 305-701, Korea. E-mail: hsewan@kaist.ac.kr, youngsoo@ee.kaist.ac.kr

I. INTRODUCTION

Scaling down of transistors has resulted in a dramatic increase of leakage current. The threshold voltage of MOSFET devices has been scaled down to compensate for the reduced circuit performance at low supply voltage, which causes an exponential increase of subthreshold leakage. Gate oxide has been scaled down as well for better control of MOSFET channel current, which leads to a large amount of gate leakage.
The leakage current, in fact, has become a major portion of total power consumption; in many technologies, it contributes up to 50% of the overall power consumption [1]. Many circuit-level techniques have been proposed to control leakage, such as power gating, body bias, input vector control, selective MTCMOS, zigzag power gating, and mixed Vt [1]. However, most of these techniques require a significant amount of designers' effort during the design process and the support of dedicated design tools. These are some of the reasons why they are not yet prevalent in large-scale circuit design.

In this paper, we focus on technology mapping, the step of logic synthesis in which gates are selected from a particular library to implement a circuit. Technology mapping takes an optimized logic network (the result of technology-independent logic minimization) as its input and outputs a netlist of gates that minimizes a total cost (usually area, delay, or a combination of the two). Since technology mapping is the only step in logic synthesis where detailed leakage information is available, we take a radical approach to see how much leakage we can save while timing constraints are still satisfied. We use a weighted sum of probabilistic leakage and delay as the cost function of the mapping, as opposed to the traditional area and delay metrics. We consider pin reordering as one of the options in the mapping. We increase the library size by employing gates with larger gate length, which have less leakage at the cost of a slight increase of delay. We employ a new set of flip-flops that are specifically designed for low leakage through selective increase of gate length. Depending on the state probability of each flip-flop, we either choose the gate-length-biased flip-flop or the one with its state complemented. The prototype tool was implemented in the SIS [6] logic synthesis environment. The results with several benchmark circuits show that we can reduce leakage by 46% on average in a 45-nm predictive technology model.

[Fig. 1. Overall flow of the proposed technology mapping: netlist; net probability computation; phase assignment; technology mapping (with pin reordering) using normal gates and L-biased gates, with Cost = w x leakage + (1 - w) x delay; if timing is not satisfied, decrease w and map again; otherwise done.]

The remainder of this paper is organized as follows. In the next section, we briefly explain gate-length biasing and pin reordering, which are the two main techniques we use in the technology mapping, followed by the overall flow of our mapping procedure. In Section III, we propose a gate-length-biased flip-flop, which has unequal leakage and delay for its two states, and a phase assignment procedure that exploits these flip-flops. Experimental results with several benchmark circuits are presented in Section IV, and we draw conclusions in Section V.

II. PRELIMINARIES

2.1 Gate-Length Biasing

Gate-length biasing involves a small increase in the gate lengths of devices. In a 130-nm industrial process, it is reported [2] that an 8 nm increase in gate length yields a 30% decrease in leakage with a 5% increase in delay for a minimum-size inverter. This large decrease in leakage with just a small increase of delay occurs because the nominal gate length of the technology is usually very close to the knee of the leakage versus gate length curve that is produced by short channel effects. This small increase in gate length does not affect printability during the manufacturing process, and usually preserves pin compatibility with the unbiased version of the cell, which benefits post-placement optimization.

In addition to a set of gates with nominal gate length, we have the same set of gates with larger gate length, as shown in Fig. 1. For sequential elements such as flip-flops, we apply gate-length biasing, but only to a subset of transistors, which will be explained in Section III. The additional set of gates created by gate-length biasing enlarges the search space during synthesis. They may be used as a low-leakage library, as long as the small delay increase does not induce a critical timing problem.

[Fig. 2. An example D flip-flop: (a) original and (b) gate-length biased one.]

2.2 Pin Reordering

Pin reordering refers to exchanging the inputs of a gate when they are compatible [3]. Take the example of a two-input NAND gate with inputs A and B (with A being closer to the output). If the signal probability of B is higher than that of A, exchanging the two inputs can help reduce gate leakage, since the nMOS device connected to ground can be a main source of gate leakage when its gate terminal (B) is driven by a signal with a high probability of being one. Furthermore, when combined with gate-length biasing, pin reordering can lead to a substantial reduction of overall leakage, since subthreshold leakage can be reduced by proper gate-length biasing. Our experiments reveal that about 80% of leakage can be reduced in four-input NAND gates via combined pin reordering and gate-length biasing.

Using pin reordering causes almost no penalty. Since the gate with exchanged inputs is logically the same as the original one, the technique can be readily implemented in a conventional synthesis environment and does not incur additional manufacturing cost. Furthermore, subthreshold leakage through the transistor stack is not affected by reordering inputs according to their signal probabilities. Therefore, pin reordering can be used simultaneously with gate-length biasing for further leakage reduction.

2.3 Overall Flow

Fig. 1 shows the overall flow of the proposed technology mapping. It takes a logic network of a sequential circuit, which represents multiple Boolean functions (i.e. flip-flop input functions and circuit output functions), as its input, and generates a gate-level netlist, where gates are selected from a technology library. In the library, we assume gates with larger gate length in addition to those with nominal gate length.

In order to obtain a state probability (i.e. the probability of the Q-output of each flip-flop being logic one), we simulate the network with a sequence of sample input patterns, monitor the Q-outputs, and derive their probabilities. These probabilities, together with the signal probabilities of primary inputs, are propagated through the network [4] to obtain the signal probabilities of all the nets. These probabilities are then used to derive the leakage of any gate that is to be mapped on the network. Before we start the mapping of the combinational subcircuit, we go through a step that we call phase assignment. In this step, we try to minimize the leakage of flip-flops, which will be explained in detail in the next section.
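The probability steps described above can be illustrated with a minimal Python sketch. It is not the authors' SIS implementation: the netlist format, the function names, and the sample values are hypothetical, and gate inputs are assumed to be statistically independent, which is a simplification of the estimation method cited as [4].

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical netlist entry: (output net, gate type, input nets).
Gate = Tuple[str, str, List[str]]


def state_probability(q_samples: Sequence[int]) -> float:
    """Fraction of simulated cycles in which a flip-flop's Q-output is 1."""
    return sum(q_samples) / len(q_samples)


def propagate(gates: List[Gate], sources: Dict[str, float]) -> Dict[str, float]:
    """Propagate P(net = 1) through gates listed in topological order.

    Assumption: gate inputs are statistically independent.
    """
    probs = dict(sources)
    for out, kind, ins in gates:
        if kind == "INV":
            probs[out] = 1.0 - probs[ins[0]]
        elif kind == "NAND2":
            probs[out] = 1.0 - probs[ins[0]] * probs[ins[1]]
        else:
            raise ValueError(f"unsupported gate type: {kind}")
    return probs


# Illustrative data only: Q-output samples from a short simulation run.
q1 = state_probability([0, 0, 1, 0])
netlist = [("n1", "NAND2", ["a", "q1"]), ("n2", "INV", ["n1"])]
net_probs = propagate(netlist, {"a": 0.5, "q1": q1})
```

Only an inverter and a two-input NAND are handled here because they form a complete set of base functions; a real mapper would cover every cell type in its library.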
For technology mapping, each function in the network is represented as a set of base functions (see footnote 1), which is called a subject graph. Each gate in the library is likewise represented using the base functions; these representations are called pattern graphs. Technology mapping, thus, is to find an optimal-cost covering of the subject graphs using the collection of pattern graphs [5]. Since general covering is not likely to be solved in a reasonable amount of time, it is approximated as a series of tree coverings. Tree covering can be solved in polynomial time via dynamic programming.

Footnote 1: Base functions are a set of gates that can implement all Boolean functions. An example is an inverter and a two-input NAND gate.

The cost function we use in the dynamic programming is a weighted sum of leakage and delay (as opposed to the conventional area and/or delay), as indicated in Fig. 1. Note that the leakage is computed from the signal probabilities of the nets. For example, the leakage of a two-input NAND gate can be expressed by:

$$L = l_{00}(1 - P_A)(1 - P_B) + l_{01}(1 - P_A)P_B + l_{10}P_A(1 - P_B) + l_{11}P_A P_B,$$

where $P_A$ and $P_B$ denote the signal probabilities of input A and input B, respectively, and $l_{ij}$ is the leakage of the gate when input A is logic $i$ and input B is logic $j$. We consider the possibility of pin reordering when we consider the candidates for the mapping.

The weight for the leakage ($w$) is initially 1.0, implying that we try to find the mapping that leads to minimum leakage. If the timing is not satisfied, we decrease $w$ and try another mapping. The procedure is iterated until the timing constraints are satisfied, so the result is the lowest-leakage mapping, among those tried, that meets the timing constraint.

III. PHASE ASSIGNMENT

3.1 Gate-Length Biased Flip-Flop

Fig. 2(a) shows an example D flip-flop with inverter and tristate inverter implementation.
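As an aside on the cost function of Section 2.3, the NAND leakage expression and the weight-relaxation loop can be sketched in Python as follows. This is a hedged illustration rather than the paper's tool: the per-state leakage numbers are made-up placeholders, `map_circuit` and `meets_timing` are hypothetical callbacks standing in for the mapper and the timing check, and the fixed step by which the weight decreases is an assumption, since the paper does not state it.

```python
from typing import Callable, Dict, Tuple

LeakTable = Dict[Tuple[int, int], float]  # l[(i, j)]: leakage for A = i, B = j


def nand2_leakage(l: LeakTable, p_a: float, p_b: float) -> float:
    """Probability-weighted leakage of a two-input NAND gate."""
    return (l[(0, 0)] * (1 - p_a) * (1 - p_b) + l[(0, 1)] * (1 - p_a) * p_b
            + l[(1, 0)] * p_a * (1 - p_b) + l[(1, 1)] * p_a * p_b)


def best_pin_order(l: LeakTable, p_x: float, p_y: float) -> Tuple[str, float]:
    """Compare signal x on pin A (as-is) against x on pin B (swapped)."""
    as_is = nand2_leakage(l, p_x, p_y)
    swapped = nand2_leakage(l, p_y, p_x)
    return ("swapped", swapped) if swapped < as_is else ("as-is", as_is)


def relax_weight(map_circuit: Callable[[float], object],
                 meets_timing: Callable[[object], bool],
                 steps: int = 10) -> Tuple[float, object]:
    """Start at w = 1.0 and lower w until the mapped netlist meets timing."""
    for k in range(steps, -1, -1):
        w = k / steps  # assumed uniform step of 1/steps
        netlist = map_circuit(w)
        if meets_timing(netlist):
            return w, netlist
    raise RuntimeError("timing not met even with w = 0")


# Placeholder leakage values (arbitrary units), chosen only for illustration.
leak = {(0, 0): 10.0, (0, 1): 40.0, (1, 0): 20.0, (1, 1): 80.0}
order, value = best_pin_order(leak, 0.1, 0.9)
```

With these placeholder values, the signal that is usually one is better placed on pin A, which mirrors the pin-reordering argument of Section 2.2.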
Over the operation of flip-flops, both D-input and Q-output have the same logic state most of the time, since a new D-input, which is one of the outputs of the combinational subcircuit (and arrives shortly before the active clock edge), will be captured and propagated to the Q-output at the active clock edge. The leakages for the two possible flip-flop states are also shown in Fig. 2(a), which indicates that the leakage is almost independent of the state (for this particular flip-flop).

However, if we employ gate-length biasing for the transistors that are turned off when both D-input and Q-output are logic low, as shown in Fig. 2(b), the leakage for the two flip-flop states can be made very different. Specifically, if we increase the gate length of the transistors marked in Fig. 2(b), the leakage when D-input and Q-output are logic low becomes 480 nA as opposed to the original 1133 nA. The leakage when D-input and Q-output are logic high is also reduced (from 1153 nA to 936 nA), mainly due to the two gate-length-biased transistors in the cascaded inverters, which are responsible for generating internal clock signals. The benefit of leakage reduction from the gate-length-biased flip-flops is considerable in sequential circuits, since a large portion of total leakage is from sequential elements, as shown in Section IV.

The gate-length-biased flip-flop has skewed timing parameters. The rising and falling clock-to-Q delays are increased by 32% and 7%, respectively. The increase of the rising delay is larger than that of the falling delay since the transistors whose gate length is increased are sensitized for a rising signal. The rising and falling setup times are increased by 34% and 24%, respectively.

[Fig. 3. Assignment of complemented state.]

3.2 Phase Assignment of Flip-Flop

Since the leakage of gate-length-biased flip-flops is very different for different flip-flop states, it can be exploited during the technology mapping, as shown in Fig. 1 (the box named phase assignment). If the state probability is higher than 0.5, we want to have the state complemented, so that the flip-flop has more chance to remain in the low-leakage state (both D and Q are logic low). This can be accomplished as follows. Taking the sequential circuit shown in Fig. 3 as an example, suppose we want to complement the state of the first D flip-flop. We simply insert two inverters: one before the D-input and the other after the Q-output. The second inverter can be avoided if the complemented output Q-bar is available, since we can achieve the same goal by swapping Q and Q-bar. The extra inverters, if left, may not be an overhead, since they can be absorbed in the combinational subcircuit and, after its mapping, they are likely to disappear.

The same holds for other types of flip-flops. For a J-K flip-flop, for example, it can be readily shown that by exchanging the J and K inputs and the Q and Q-bar outputs, respectively, we can complement the original flip-flop state. For flip-flops with state probability less than 0.5, we simply use gate-length-biased flip-flops (as long as the timing of the circuit is satisfied) without complementing their states.

Phase assignment is more effective when used with gate-length-biased flip-flops and when used in the control path (where most signal probabilities are far from 0.5) rather than the data path. Since it is based on probabilistic flip-flop states, complementing the state is more powerful when flip-flops can be in a low-leakage state with high probability.

Table 1. Total leakage reduction with benchmarks.

Circuit   #gates  #FFs  L-D map  + pin reordering  + phase
                                 and L-biasing     assignment
s349      550     15    9.4%     33.5%             48.1%
s382      735     21    11.2%    20.6%             46.0%
s386      924     6     6.4%     24.4%             31.9%
s0        788     21    15.5%    27.5%             53.1%
s510      1103    6     6.7%     31.8%             37.5%
s641      885     19    16.7%    35.1%             51.9%
s713      953     19    14.2%    33.8%             49.0%
s838      1891    32    11.9%    30.7%             49.4%
s1423     2492    74    4.1%     19.3%             44.5%
s1488     3555    6     26.5%    48.7%             50.8%
s1494     3606    6     18.9%    43.5%             45.9%
Average                 12.9%    31.7%             46.2%

IV. EXPERIMENTAL RESULTS

We performed experiments on a set of circuits taken from the MCNC and ISCAS'89 benchmarks.
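Before the experimental setup is described, the phase-assignment rule of Section 3.2 can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class and function names are hypothetical, the leakage of any inverters left after complementing is ignored, and the timing check that gates the use of biased flip-flops is omitted. The 480 nA and 936 nA figures are the low-state and high-state leakages quoted in Section 3.1.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PhaseChoice:
    name: str
    complement: bool       # True: store the complemented state
    low_state_prob: float  # probability of sitting in the low-leakage state


def assign_phase(state_probs: Dict[str, float],
                 threshold: float = 0.5) -> List[PhaseChoice]:
    """Complement every flip-flop whose P(Q = 1) exceeds the threshold."""
    choices = []
    for name, p_one in state_probs.items():
        complement = p_one > threshold
        low = p_one if complement else 1.0 - p_one
        choices.append(PhaseChoice(name, complement, low))
    return choices


def expected_leakage(choice: PhaseChoice,
                     leak_low: float = 480.0,
                     leak_high: float = 936.0) -> float:
    """Probability-weighted leakage (nA) of one gate-length-biased flip-flop."""
    return (choice.low_state_prob * leak_low
            + (1.0 - choice.low_state_prob) * leak_high)


# Illustrative state probabilities, not taken from the paper.
choices = assign_phase({"ff1": 0.8, "ff2": 0.3})
leakages = [expected_leakage(c) for c in choices]
```

In this sketch a flip-flop with state probability 0.8 is complemented, so it spends 80% of the time in the low-leakage state instead of 20%.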
Each circuit was synthesized with SIS [6] and mapped into a gate library, which we built for a 45-nm predictive model [7]. The proposed technology mapping was implemented in the SIS [6] environment as well. The first three columns of Table 1 show the name of each circuit, the number of gates in the combinational subcircuit, and the number of flip-flops. The fourth column shows the leakage saving when we use a cost function of the weighted sum of leakage and delay (refer to Fig. 1), compared to the leakage when the conventional cost function of area and delay is used. For each circuit, we assume 1.5 times the critical path delay (obtained when we map the circuit with a cost function of delay alone) as its timing constraint. We see about 13% saving on average. When we add pin reordering and the library of gate-length-biased gates to our mapping, the total saving increases to about 32% on average, as shown in the fifth column, implying that combined pin reordering and gate-length biasing alone yields about 19% of leakage saving. After we employ phase assignment of flip-flops, the overall saving goes up to 46% on average (sixth column), which is significant.

[Fig. 4. Leakage reduction of s382 by each technique. Y-axis: normalized leakage; bars: area-delay mapping, leakage-delay mapping, +pin reordering, +L-biasing, +phase assignment.]

[Fig. 5. Variation of leakage saving for varying input probabilities and varying timing constraints. Y-axis: leakage reduction (%); circuits: bbara, bbsse, beecount, dk14, dk17, ex1, ex4, ex6, keyb, mc, opus, sse; timing-constraint factors 1.2 and 1.3.]

[Fig. 6. Variation of total leakage reduction with temperature. Y-axis: leakage reduction (%); circuits: s526, s832, s1488, s1494; x-axis: temperature, up to 125C.]

The effect of each technique for leakage reduction is analyzed with an example circuit, s382, in Fig. 4. The leakage is normalized to the total leakage of the circuit when it is mapped with the conventional cost function of area and delay (leftmost bar). The effect of the mapping with a cost function of leakage and delay is reflected in the second bar.
The effect of pin reordering in the combinational subcircuit alone is shown in the third bar. The fourth bar represents the effect of gate-length biasing on the sequential as well as the combinational portion of the circuit. The last bar indicates the effect of phase assignment. The total leakage is reduced by about 50% when all techniques are applied simultaneously during technology mapping. Leakage of the combinational subcircuit is reduced by pin reordering and gate-length biasing with leakage-delay mapping, while that of the flip-flops is reduced by phase assignment with the skewed flip-flop. The saving comes from the reduction of subthreshold leakage and gate leakage by gate-length biasing and pin reordering, respectively.

Since our technology mapping is driven by input signal probabilities, which can vary over the execution of circuits, it is important to guarantee sizable leakage saving even when the input signal probabilities vary. Fig. 5 shows the variation of leakage saving of MCNC benchmark circuits for different input signal probabilities. Each bar represents the range of leakage saving under 100 different average input probabilities of circuit inputs. The dot in each bar indicates the average leakage saving. We also repeat the same experiment for different timing constraints. The timing constraint of each circuit is assumed to be 1.2 and 1.3 times, respectively, the critical path delay (obtained when we map the circuit with a cost function of delay alone). As we allow a looser timing constraint, the leakage saving increases, as expected.

Since our mapping involves leakage, which is a function of temperature, and the mapping is performed for a fixed temperature while temperature itself varies over time, it is important to ensure that the mapping is not too sensitive to temperature. We take four example circuits, map them at a fixed temperature, and simulate them to see their leakage saving while we vary the temperature, as shown in Fig. 6. The leakage saving increases with temperature, as expected. At higher temperature, the circuits are leakier (subthreshold leakage dominates) and gate-length biasing is more effective, which governs the leakage saving. As temperature decreases, the absolute leakage itself decreases (gate leakage stays steady while subthreshold leakage decreases exponentially), and pin reordering becomes the main driver of leakage saving.

V. CONCLUSIONS

Although many circuit techniques have been proposed, they do not align well with conventional VLSI design because they require substantial custom engineering. In this paper, we proposed leakage-aware technology mapping, which is one of the steps of logic synthesis and is usually transparent to designers. We made every effort to push the limit of the capability of technology mapping in terms of leakage saving. We used a probabilistic leakage (together with delay) as the cost function that drives the mapping; we considered pin reordering as one of the options in the mapping; we increased the library size by employing gates with larger gate length; and we employed a new flip-flop that is specifically designed for low leakage through selective increase of gate length. When all techniques are applied during technology mapping, an average leakage saving of 46% was achieved, compared to conventional technology mapping.

REFERENCES

[1] S. G. Narendra and A. Chandrakasan, Eds., Leakage in Nanometer CMOS Technologies, Springer, 2005.
[2] P. Gupta, A. B. Kahng, P. Sharma, and D. Sylvester, Selective gate-length biasing for cost-effective runtime leakage control, in Proc. Design Automat. Conf., June 2004, pp. 327-330.
[3] D. Lee, W. Kwong, D. Blaauw, and D. Sylvester, Analysis and minimization techniques for total leakage considering gate oxide leakage, in Proc. Design Automat. Conf., June 2003, pp. 175-180.
[4] S. Ercolani, M. Favalli, M. Damiani, P. Olivo, and B. Riccò, Estimate of signal probability in combinational logic networks, in Proc. European Test Conf., Apr. 1989, pp. 132-138.
[5] K. Keutzer, DAGON: technology binding and local optimization by DAG matching, in Proc. Design Automat. Conf., June 1987, pp. 341-347.
[6] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. K. Brayton, and A. Sangiovanni-Vincentelli, SIS: a system for sequential circuit synthesis, Tech. Rep. UCB/ERL M92/41, U. C. Berkeley, May 1992.
[7] W. Zhao and Y. Cao, New generation of predictive technology model for sub-45nm design exploration, in Proc. Int'l Symp. on Quality Electronic Design, Mar. 2006, pp. 585-590.

Sewan Heo was born in Busan, Republic of Korea, in 1983. He received the B.S. and M.S. degrees from the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Korea, in 2005 and 2007, respectively. Since 2007, he has been working as a researcher at the Electronics and Telecommunications Research Institute (ETRI), Korea. His research interests are in low power digital circuit design and multimedia processor design.

Youngsoo Shin received the B.S., M.S., and Ph.D. degrees in electronics engineering from Seoul National University, Korea. He has worked at the University of Tokyo, Japan, as a Research Associate, and at the IBM T. J. Watson Research Center, Yorktown Heights, NY, as a Research Staff Member. He is currently an Associate Professor in the Department of Electrical Engineering, KAIST, Daejeon, Korea. He received a Best Paper Award at the 2005 Int'l Symp. on Quality Electronic Design (ISQED). He has been on the program committee for the Int'l Symp. on Low Power Electronics and Design (ISLPED), the Int'l Conf. on Computer-Aided Design (ICCAD), and the Asia and South Pacific Design Automation Conf. (ASP-DAC).