International Journal of Advances in Engineering, 2015, 1(4), 518-522
ISSN: 2394-9260 (printed version); ISSN: 2394-9279 (online version); url: http://www.ijae.in

RESEARCH ARTICLE

Multi bit Flip-Flop Grouping Using Data-Driven Clock Gating

M. Lavanya and P. Anitha
Dept. of Electronics and Communication, Kingston Engineering College, India
mlavanyav93@gmail.com

Received 24 March 2015 / Accepted 14 April 2015

Abstract - Clock gating is a predominant technique used for power saving. It is observed that the commonly used synthesis-based gating still leaves a large number of redundant clock pulses. Data-driven gating aims to disable these. To reduce the hardware overhead involved, flip-flops (FFs) are grouped so that they share a common clock enabling signal. The question of which group size maximizes the power savings was answered in a previous paper. Here we answer the question of which FFs should be placed in a group to maximize the power reduction. We propose a practical solution based on the toggling activity correlations of FFs and their physical position proximity constraints in the layout. Our data-driven clock gating is integrated into a commercial Electronic Design Automation (EDA) backend design flow, achieving a total power reduction of 15%-20% for various types of large-scale state-of-the-art industrial and academic designs in 40 and 65 nanometer process technologies.

Index Terms - Flip-flops (FFs), data-driven gating, Electronic Design Automation (EDA).

I. INTRODUCTION

Clock gating is a popular technique used in many synchronous circuits for reducing dynamic power dissipation. Clock gating saves power by adding logic to a circuit to prune the clock tree. Pruning the clock disables portions of the circuitry so that the flip-flops in them do not have to switch states. Switching states consumes power. When the flip-flops are not switched, their switching power consumption goes to zero and only leakage currents are incurred.
Clock enabling signals are usually introduced by designers during the system and clock design phases, where the inter-dependencies of the various functions are well understood. In contrast, it is very difficult to define such signals at the gate level, especially in control logic, since the inter-dependencies among the states of the various flip-flops depend on automatically synthesized logic. There is a big gap between block disabling that is driven from the HDL definitions and what can be achieved with knowledge of the data, that is, of the flip-flops' activities and how they are correlated with each other.

Clock Signal: Digital circuits always have some input and generate digital outputs accordingly. Some digital circuits are not clocked, meaning that the input applied to the circuit flows through the digital gates without any timing or storage and generates the output. It only takes a time equal to the propagation delay to reach the output. On the other hand, most digital circuits that perform more complex processing are clocked. A digital clock signal is basically a square-wave voltage similar to the one shown below:

Figure.1 Clock signal

II. SYSTEM ANALYSIS

The clock tree consumes more than 50% of the dynamic power. The components of this power are the power consumed by combinational logic whose values change on each clock edge, the power consumed by the flip-flops, and the power consumed by the clock buffer tree in the design. It is good design practice to turn off the clock when it is not needed. Automatic clock gating is supported by modern EDA tools, which identify the circuits where clock gating can be inserted. RTL clock gating works by identifying groups of flip-flops which share a common enable control signal. RTL clock gating uses this enable signal to control a clock gating circuit which is connected to the clock ports of all of the flip-flops with the common enable term.
Therefore, if a bank of flip-flops which share a common enable term has RTL clock gating implemented, the flip-flops will consume zero dynamic power as long as this enable signal is false.

There are two types of clock gating:
- Latch-based clock gating
- Latch-free clock gating

Latch free clock gating: The latch-free clock gating style uses a simple AND or OR gate (depending on the edge on which the flip-flops are triggered). Here, if the enable signal goes inactive in the middle of a clock pulse, or if it toggles multiple times,
then the gated clock output can either terminate prematurely or generate multiple clock pulses. This restriction makes the latch-free clock gating style inappropriate for our single-clock flip-flop based design.

Figure.2 Latch Free Clock Gating

Latch based clock gating: The latch-based clock gating style adds a level-sensitive latch to the design to hold the enable signal from the active edge of the clock until the inactive edge of the clock. Since the latch captures the state of the enable signal and holds it until the complete clock pulse has been generated, the enable signal need only be stable around the rising edge of the clock, just as in the traditional ungated design style.

Figure.3 Latch based clock gating

Gated Clock Design: Considering the two circuits in Figure.4 from a functional (zero-delay) aspect, both registers have the same function. However, from a timing aspect, the timing constraints assigned to the signals load and ENA are different. In fact, tighter timing constraints need to be assigned to the signal ENA.

Figure.4 Gated Clock Design

The suggested gating technique requires gathering statistical information about our flip-flops through simulation and statistical analysis. Another issue that influences its effectiveness is the fan-out of the gater. The theory presents a formula for calculating the optimal fan-out of the gater, referred to as k, in terms of the following quantities:
- q: the probability of flip-flop input stability.
- C_FF, C_latch, C_w: the capacitances of the extra flip-flops, latches, and wires.

In our project, we approached this issue by implementing different fan-outs and estimating the effect of each one on the reduction in power consumption.

Figure.5 Graphical Representation Of Gaters Fanout
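The fan-out experiment described above can be sketched as a small sweep. The model below is an assumption made for illustration and is not the formula from the theory: the clock of a group of k flip-flops is gated only when all k inputs are stable (probability q^k if the flip-flops are independent), which saves C_FF + C_w per flip-flop, while the latch overhead C_latch is shared by the k flip-flops. The capacitance values and the function names are hypothetical.

```python
def net_saving_per_ff(k, q, c_ff=1.0, c_w=0.5, c_latch=1.5):
    """Normalized net saving per flip-flop for a gater driving k flip-flops.

    Assumed model (illustration only):
      - the group clock is gated with probability q**k,
      - gating saves c_ff + c_w per flip-flop,
      - the latch overhead c_latch is split across the k flip-flops.
    The result is normalized by the ungated cost c_ff + c_w.
    """
    saved = (q ** k) * (c_ff + c_w)
    overhead = c_latch / k
    return (saved - overhead) / (c_ff + c_w)


def best_fanout(q, k_max=64, **caps):
    """Return the fan-out in 1..k_max with the largest net saving."""
    return max(range(1, k_max + 1),
               key=lambda k: net_saving_per_ff(k, q, **caps))


if __name__ == "__main__":
    # Sweep a few input-stability probabilities (hypothetical values).
    for q in (0.80, 0.90, 0.95, 0.99):
        k = best_fanout(q)
        print(f"q={q:.2f}  best k={k}  net saving={net_saving_per_ff(k, q):.3f}")
```

In this sketch, a higher stability probability q moves the best fan-out upward, because larger groups can still be gated most of the time; with real capacitances the trade-off would have to be re-evaluated.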
Figure.5 shows the normalized net power savings per flip-flop obtained by adaptive gating at the first level of the clock tree, according to the formula above. The saving is compared to the non-gated situation. The optimal fan-out is marked for each toggling probability. Using the statistical information gathered and the optimal fan-out, we could form groups of matching flip-flops for clock gating.

Data-Driven Clock Gating: Clock enabling signals are very well understood at the system level and thus can be defined effectively to capture the periods in which functional blocks and modules do not need to be clocked. These are later synthesized automatically into clock enabling signals at the gate level. In many cases, clock enabling signals are manually added for every FF as a part of a design methodology.

Figure.6 Practical data-driven clock gating

Figure.7 Activity similarity of FFs in a DSP core

Data-driven clock gating is shown to achieve savings of more than 10% of the total dynamic power consumed by the clock tree. Data-driven clock gating for digital filters achieves 20% power savings. There, the gating logic is tailored to the structure of the filter, whereas the approach discussed in this paper is more general and applies to large-scale designs of a wide range of types.

Figure.8 Distribution of the number of FFs sharing common pre-enabled clock signal of a DSP core
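Why the choice of group members matters can be illustrated with a short simulation. The sketch below assumes the usual data-driven arrangement, in which each FF's toggle indication (D XOR Q) is ORed across the group to form the shared enable. The traces, the function name, and the groupings are hypothetical and serve only as an illustration.

```python
def delivered_pulses(toggles, groups):
    """Count clock pulses that reach flip-flops under data-driven gating.

    toggles[i][t] is 1 if flip-flop i changes state in cycle t (D XOR Q).
    groups is a list of lists of flip-flop indices sharing one enable.
    A group is clocked in a cycle if at least one of its members toggles,
    and then every member of the group receives the pulse.
    """
    n_cycles = len(toggles[0])
    total = 0
    for group in groups:
        for t in range(n_cycles):
            enable = any(toggles[i][t] for i in group)  # OR of the XOR outputs
            if enable:
                total += len(group)
    return total


# Hypothetical toggle traces: FF0/FF1 are active together, as are FF2/FF3.
toggles = [
    [1, 1, 0, 0, 0, 0, 0, 0],  # FF0
    [1, 1, 0, 0, 0, 0, 0, 0],  # FF1
    [0, 0, 0, 0, 1, 1, 0, 0],  # FF2
    [0, 0, 0, 0, 1, 1, 0, 0],  # FF3
]

if __name__ == "__main__":
    ungated = len(toggles) * len(toggles[0])
    correlated = delivered_pulses(toggles, [[0, 1], [2, 3]])
    uncorrelated = delivered_pulses(toggles, [[0, 2], [1, 3]])
    print("ungated pulses:      ", ungated)
    print("correlated groups:   ", correlated)
    print("uncorrelated groups: ", uncorrelated)
```

Grouping FFs whose activities coincide leaves fewer redundant pulses than grouping FFs that toggle at different times, which is the motivation for grouping by toggling correlation.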
III. RESULTS AND DISCUSSION

Proposed Clock Gating Circuit:

Figure.9 Clock Gating Circuit

Figure.10 Waveform Of Clock Gating Circuit

Figure.9 shows the schematic design of clock gating for low-power systems and the operation of the circuit using gating methods. In this circuit, the clock pulse from the clock distribution network is given to an XOR gate. Figure.10 shows the output waveform of the clock gating circuit. The power consumed using this technique is 2.473965e-010 watts.

Table.1 Comparison Table Of Gating Flip-flop

Power consumed    Non clock gating        Clock gating
Average power     4.604680e-011 watts     2.473965e-010 watts
Max power         4.174061e-002 watts     5.858986e-002 watts
Min power         2.321770e-003 watts     8.679983e-003 watts

CONCLUSION

In this work, a data-driven clock gating technique is used to reduce the number of redundant clock pulses and the number of transistors, so the power dissipation is also reduced. All the flip-flops are grouped by giving them a common clock pulse, so the area is reduced. More specifically, the technique looks at the real post-clock-tree timing information of the clock logic at the placement stage, which enables the best placement and optimization of the clock logic. Also, by observing the timing slack, it provides the maximum latency of the clock logic for a better clock tree and hence better timing results. It uses an XOR gate to disable the flip-flop, which improves the disabling effectiveness. Clock gating saves more power than other techniques and provides better performance in terms of area and power.

In future work, power gating can be used. In a processor chip, certain areas of the chip are idle and are activated only for certain operations, but these areas (CMOS) are still provided with power for biasing. Power gating limits this unnecessary power being wasted by
shutting down power for that area and resuming it whenever needed.

REFERENCES

1. V. G. Oklobdzija, Digital System Clocking: High-Performance and Low-Power Aspects. New York, NY, USA: Wiley, 2003.
2. L. Benini, A. Bogliolo, and G. De Micheli, "A Survey on Design Techniques for System-Level Dynamic Power Management," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 8, no. 3, pp. 299-316, Jun. 2000.
3. M. S. Hosny and W. Yuejian, "Low Power Clocking Strategies in Deep Submicron Technologies," in Proc. IEEE Int. Conf. Integr. Circuit Design Technol., Jun. 2008, pp. 143-146.
4. C. Chunhong, K. Changjun, and S. Majid, "Activity-Sensitive Clock Tree Construction for Low Power," in Proc. Int. Symp. Low Power Electron. Design, 2002, pp. 279-282.
5. A. Farrahi, C. Chen, A. Srivastava, G. Tellez, and M. Sarrafzadeh, "Activity-Driven Clock Design," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 20, no. 6, pp. 705-714, Jun. 2001.
6. W. Shen, Y. Cai, X. Hong, and J. Hu, "Activity and Register Placement Aware Gated Clock Network Design," in Proc. Int. Symp. Phys. Design, 2008, pp. 182-189.
7. M. Donno, E. Macii, and L. Mazzoni, "Power-Aware Clock Tree Planning," in Proc. Int. Symp. Phys. Design, 2004, pp. 138-147.
8. S. Wimer and I. Koren, "The Optimal Fan-Out of Clock Network for Power Minimization by Adaptive Gating," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 10, pp. 1772-1780, Oct. 2012.
9. Y.-T. Chang, C.-C. Hsu, M. P.-H. Lin, Y.-W. Tsai, and S.-F. Chen, "Post-Placement Power Optimization With Multi-Bit Flip-Flops," in Proc. Saving Based On Interval Graphs, in Proc. Int. Symp. Phys. Design, 2011, pp. 115-121.
10. N. Magen, A. Kolodny, U. Weiser, and N. Shamir, "Interconnect-Power Dissipation in a Microprocessor," in Proc. Int. Workshop Syst. Level Int. Predict., 2004, pp. 7-13.
11. M. Muller, S. Simon, H. Gryska, A. Wortmann, and S. Buch, "Low Power Synthesizable Register Files for Processor and IP Cores," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., Low-Power Design Tech., vol. 39, no. 2, pp. 131-155, Mar. 2006.