International Journal of Inventions in Computer Science and Engineering, Volume 2 Issue 4 April 2015

Partial Bus Specific Clock Gating With DPL Based DDFF Design For Low Power Application

Reshmachandran 1, M.Tamilarasu 2
1 PG Scholar, Department of Electronics and Communication, Dhanalakshmi Srinivasan College of Engineering, Coimbatore.
2 Assistant Professor, Department of Electronics and Communication, Dhanalakshmi Srinivasan College of Engineering, Coimbatore.

Abstract: Clock gating controls which clock signals from the clock distribution network (CDN) are enabled. The technique activates only the clock signals needed for the operation of the circuit; unnecessary clock signals are not activated, which saves dynamic power. Auto gated flip-flops that use the clock gating technique consume only a small amount of power. A circuit based on look-ahead clock gating is used to avoid tight timing constraints on each clock pulse. Enabling clock pulses only through the derived gating signals saves power in the flip-flops. The look-ahead technique can also reduce delay and distortion in the circuit at the application level. The novel approach proposed here is partial bus specific clock gating (PBSCG), implemented in a parallel counter as an application-level example. The approach can be adopted for every section of a given architecture and is available for structural-level implementation with CMOS logic gates on an integrated chip. The architecture is designed and verified using the TANNER EDA tool.
Keywords: Clock Gating, Auto Gated Flip-Flop, Parallel Counter, Partial bus specific clock gating

Reference to this paper should be made as follows: Reshmachandran, M.Tamilarasu (2015) Partial Bus Specific Clock Gating With DPL Based DDFF Design For Low Power Application, International Journal of Inventions in Computer Science and Engineering, Volume 2 Issue 4, 2015.

I. Introduction

Clock enabling signals are defined at the system level [2], where they can effectively capture the functional block modules that need not be clocked. These definitions are later translated into clock enabling signals [2] at the gate level. In other designs [5] the clock enabling signals are added automatically during design. Even so, the circuit still has some floating signals at the high level. In this situation, the dynamic power consumed by the circuit while the clock signals are enabled needs to be calculated [8]. Assessing clock gating over this period requires an analysis of the precharge and evaluation states of the FFs [11]. The clock of a FF can be disabled in the next cycle by XOR-ing its output with the present data input [9], which will appear at the output in the next cycle. The outputs of the XOR gates are then OR-ed to generate the gating signal for the FFs [10], in a way that avoids glitches. The integrated clock gate (ICG), the combination of a latch with an AND gate, is used by design tools [13]. These latches can be used in ultra-low-power applications such as digital filters. Data-driven clock gating signals serve as the enabling signals [12] in these applications. With an ICG there is a trade-off between the number of clock pulses that can be disabled and the hardware overhead [5]. As the number of flip-flops whose enable signals are OR-ed together increases, the hardware overhead decreases, but fewer clock pulses can be disabled.
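The XOR/OR enable generation described above can be illustrated with a small behavioural model. The sketch below is a minimal Python simulation and not the authors' circuit: the function name `simulate_data_driven_gating` and the list-of-bit-vectors input format are assumptions made for illustration. In each cycle it XORs the data input of every flip-flop with its current output, ORs the results into one enable, and clocks the group only when that enable is 1.

```python
from typing import List, Tuple


def simulate_data_driven_gating(
    data: List[List[int]], initial: List[int]
) -> Tuple[List[int], int]:
    """Behavioural model of data-driven clock gating for one group of flip-flops.

    data    -- one list of D-input bits per clock cycle (hypothetical format)
    initial -- initial Q values of the flip-flops
    Returns the final Q values and the number of clock pulses delivered.
    """
    q = list(initial)
    pulses = 0
    for d in data:
        # Per-flip-flop XOR: 1 when the next state differs from the present one.
        toggles = [di ^ qi for di, qi in zip(d, q)]
        # OR the XOR outputs into a single enable for the whole group.
        enable = int(any(toggles))
        if enable:
            q = list(d)  # the group is clocked and captures the new data
            pulses += 1
        # Otherwise the clock pulse is gated off and Q is unchanged.
    return q, pulses


# Four cycles for a group of three flip-flops; the data changes in two of them.
trace = [[0, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 0]]
final_q, pulses = simulate_data_driven_gating(trace, initial=[0, 0, 0])
print(final_q, pulses)
```

In this trace only two of the four clock pulses reach the group. Sharing one gate among more flip-flops reduces the gating hardware, but the OR-ed enable is then asserted more often.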
The high and low states of the signals are processed in the same way to give the proper output. Clock gating does not come for free. Extra logic and interconnections are required [7] to generate the enabling signals, and the resulting area and power overhead must be considered. In some designs an individual gated clock input [4] is given to each FF, which consumes more power. Separating the clocks in this way also increases the area. The result can be a high overhead [8] relative to the small amount of power saved. Clock gating works by taking the enable conditions attached to registers and using them to gate the clocks. A design must contain these enable conditions [2] in order to use clock gating. This process saves power and also removes a large number of MUXes from the logic circuit [11], replacing them with clock gating logic [5] in the CDN. Clock gating logic generally takes the form of ICG cells [6] that become part of the CDN. However, the clock gating logic changes the clock tree structure, since it sits in the clock tree. Clock gating logic can be added to a design in the following ways:
1) It is coded into the RTL as enable conditions that are translated into clock gating logic during logic synthesis.
2) It is inserted for specific modules or registers by instantiating ICG cells from a library.
3) It is inserted semi-automatically by automated clock gating tools, which either insert ICG cells or add enable conditions to the RTL, and which typically also offer optimizations.

ISSN (Online): 2348-3539

II. Look Ahead Clock Gating

A look-ahead path and pipelining are used to eliminate the carry chain delay and to reduce AND gate fan-in and fan-out. The look-ahead clock gating (LACG) block consists of enhanced auto gated flip-flops (AGFFs) with master and slave blocks. It is used as a look-ahead structure to relax the timing constraints of each block.

A. Enhanced AGFF Used For LACG

LACG takes AGFF a leap forward, addressing three goals: stopping the clock pulse also in the master latch, making it applicable for large and general designs, and avoiding the tight timing constraints. LACG is based on using the XOR output of the AGFF to generate clock enabling signals for other FFs in the system whose data depend on that FF. There is a problem, though. The XOR output is valid only during a narrow window [-t_setup, t_ccq] around the clock rising edge, where t_setup and t_ccq are the FF's setup time and clock-to-output contamination delay, respectively. After a delay the XOR output is corrupted and eventually turns to zero.
To be valid during the entire positive half cycle it must be latched, as shown in Fig 1.

Fig 1 Enhanced AGFF Used For LACG

Fig 2 Symbol

Fig 2 is the symbol of the enhanced AGFF with the XOR output. The power consumed by the new latch can be reduced by gating its clock input. Such gating has been proposed elsewhere; it involves another XOR gate and OR gate and is useful when the clock switching probability is high. It is subsequently shown that this probability is very low, so the latch is not gated further.

Fig 3 Block diagram of the look ahead clock gating

Each enhanced auto gated flip-flop serves the section of the circuit fed by its input. The flip-flop outputs Q and X are inputs to the logic block and, in turn, to the next block. The XOR and OR gating logic provides the look-ahead for the input signal. Both the clock and the gated clock are supplied to the logic, and each block of the architecture takes its clock from them. The rising and falling edges of the clock pulse determine the switching of the clock load. The outputs of k flip-flops are given to the logic, and the gated signal for the next-level logic is enabled with probability 1 - (1 - p)^k, the probability that at least one of k independent events of probability p occurs. The gated clock pulses also drive the master and slave latches of the enhanced auto gated logic. The same logic is applied at the next level of flip-flops, where the gate signal is again enabled with probability 1 - (1 - p)^k. The flip-flop outputs are then combined with the clock gating and clock enabling signals to give the final output. This sets the timing constraints of the look-ahead clock path from the inputs. Look-ahead clock gating overcomes the drawback of auto gated flip-flops, namely the tight timing constraints on clock pulses that are not enabled in the gated signal. The structural details of the signals cannot be recognized at the rising and falling edges of the clock pulses. Along the computation path, the setup and hold times can also be met for all the input pulses from the master and slave blocks.

III. Partial Bus Specific Clock Gating

A. PBSCG Architecture

The auto gated flip-flops can be applied in partial bus specific clock gating (PBSCG), which saves power in the flip-flop circuits that hold the outputs. An activity-driven partial bus specific clock gating (PBSCG hereafter) is employed to maximize dynamic power reduction at the RT level before synthesis. It selects only a subset of the flip-flops (FFs) to be gated, and the problem of choosing the gated FFs is thereby reduced from exponential complexity to linear. When OBSC is applied to the design, the components performing redundant operations during the clock-gated period are determined by forward traversing the circuit from the gated FF outputs. These components are power gated using the clock enable signal generated by OBSC, provided that the implementation of RTPG can reduce active leakage power. The feasibility analysis of RTPG is based on the proposed minimum average idle time concept. The BSC circuit compares the inputs and outputs, and gates the clock when they are equal. BSC can be used as a final clock gating option to reduce dynamic power when no other clock gating can be applied during synthesis. However, BSC is far from optimal in terms of dynamic power minimization, which motivates the partial BSC (PBSC hereafter) circuit. To reduce the high power consumption, a related low power structure for high performance is presented here. Each data bit is given to its own auto gated flip-flop, so the gating operates bitwise on the input.
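As an illustration of the selection idea above, the sketch below measures per-bit toggle activity on a recorded bus trace and, in a single linear pass over the bits, selects those whose activity falls below a threshold as the flip-flops to gate. This is a hedged Python sketch rather than the PBSCG selection procedure itself: the threshold rule, the trace format and the names `toggle_rates` and `select_gated_bits` are assumptions introduced for illustration.

```python
from typing import List


def toggle_rates(trace: List[int], width: int) -> List[float]:
    """Fraction of cycle-to-cycle transitions in which each bus bit toggles."""
    counts = [0] * width
    for prev, curr in zip(trace, trace[1:]):
        diff = prev ^ curr  # bits that differ between consecutive bus values
        for bit in range(width):
            counts[bit] += (diff >> bit) & 1
    transitions = max(len(trace) - 1, 1)
    return [c / transitions for c in counts]


def select_gated_bits(rates: List[float], threshold: float) -> List[int]:
    """One pass over the bits: gate those that toggle less often than threshold.

    The threshold criterion is a hypothetical stand-in for an activity-driven
    selection rule.
    """
    return [bit for bit, rate in enumerate(rates) if rate < threshold]


# Bus trace of a 4-bit up counter: 0, 1, ..., 15, 0.
trace = [i % 16 for i in range(17)]
rates = toggle_rates(trace, width=4)
gated = select_gated_bits(rates, threshold=0.3)
print(rates, gated)
```

For this counter trace the two upper bits toggle in at most a quarter of the transitions, so they are the ones selected for gating, while the frequently toggling lower bits stay on the free-running clock.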
Structurally, each input in the sequence is delivered as a clocked output signal. The carry look-ahead uses a prescaler technique with systolic 4-bit counter modules at the cost of an extra detector circuit. The detector circuit detects the assertion of the lower-order bits to enable counting in the higher-order bits.

Fig 4 Block of partial Bus specific clock gating

The main structure consists of the look-ahead path and the counting path. The bus is partitioned into uniform 4-bit synchronous up-counting modules. The counting logic of the counting path controls the counting operations, and the look-ahead logic anticipates future states and thus prepares the parallel data for these future states. In the counting path, each module serves two main purposes. The first is to generate all bits associated with their ordered position, and the second is to enable the future states of the look-ahead path. The partial bus specific clock gating architecture enables high flexibility and reusability, and thus short design time for wide counter applications. The architecture is composed of four basic module types separated by auto gated FFs in a pipelined organization. These four module types are placed in a highly repetitive structure in both the counting path and the look-ahead path.

B. Parallel counter

The auto gated flip-flops can also be applied in a parallel counter, again saving power in the flip-flop circuits that hold the outputs. Counters are widely considered essential building blocks for a variety of circuit operations such as programmable frequency dividers, shifters, code generators, memory select management, and various arithmetic operations. Since many applications are comprised of these basic operations, much research focuses on efficient counter architecture design.
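The partitioning into 4-bit up-counting modules, with a detector that enables a higher-order module only when all lower-order bits are asserted, can be sketched behaviourally. The Python model below is a minimal sketch under that reading of the text: the function name `step_counter` and the list-of-nibbles state are assumptions, and the pipelined look-ahead timing of the actual architecture is not modelled. It also records which modules receive a clock in each cycle, which is where gating saves power.

```python
from typing import List, Tuple


def step_counter(modules: List[int]) -> Tuple[List[int], List[bool]]:
    """Advance a counter built from 4-bit modules (least significant first).

    Returns the next module values and, per module, whether it was clocked.
    """
    nxt = list(modules)
    clocked = []
    enable = True  # the lowest module counts on every cycle
    for i, value in enumerate(modules):
        clocked.append(enable)
        if enable:
            nxt[i] = (value + 1) & 0xF
        # Detector: the next module is enabled only if every lower-order
        # module so far holds all ones (0xF) in the current state.
        enable = enable and value == 0xF
    return nxt, clocked


# 8-bit counter as two 4-bit modules, counted for 256 cycles.
state = [0, 0]
clock_events = [0, 0]
for _ in range(256):
    state, clocked = step_counter(state)
    for i, c in enumerate(clocked):
        clock_events[i] += int(c)
print(state, clock_events)
```

Over one full count of 256 cycles the lower module is clocked every cycle, while the upper module needs a clock only in the 16 cycles in which the lower module holds 0xF.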
Counter design methodologies explore tradeoffs between operational frequency, power consumption, area requirements, and target application specialization. To reduce the high power consumption, a related low power counter design for high performance is presented here. Each data bit is given to its own auto gated flip-flop, so the gating operates bitwise on the input. Structurally, each input in the sequence is delivered as a clocked output signal. The carry look-ahead uses a prescaler technique with systolic 4-bit counter modules at the cost of an extra detector circuit. The detector circuit detects the assertion of the lower-order bits to enable counting in the higher-order bits.

Fig 5 Block of parallel counter

The main structure consists of the look-ahead path and the counting path. The counter is partitioned into uniform 4-bit synchronous up-counting modules. The counting logic of the counting path controls the counting operations, and the look-ahead logic anticipates future states and thus prepares the counting for these future states. In the counting path, each module serves two main purposes. The first is to generate all counter bits associated with their ordered position, and the second is to enable the future states of the look-ahead path. The parallel counter architecture enables high flexibility and reusability, and thus short design time for wide counter applications. The architecture is composed of four basic module types separated by auto gated FFs in a pipelined organization. These four module types are placed in a highly repetitive structure in both the counting path and the look-ahead path.

C. Dual Dynamic Flip-Flop Implementation

The Dual Dynamic node pulsed hybrid Flip-Flop (DDFF) is employed to decrease circuit complexity, increase operating speed and lower power dissipation.

Fig 6 Circuit diagram of DDFF

The actual latching occurs in the evaluation phase, during the 1-1 overlap of CLK. If D is high prior to this overlap period, node X1 is discharged through NM0-2. This switches the state of the cross-coupled inverter pair INV1-2, causing the inverter output to go high and the output QB to discharge through NM4. The low level is preserved by the inverter pair INV1-2, and for the remainder of the evaluation phase no latching occurs. The second dynamic node is held high throughout the evaluation period by the pMOS transistor PM1. As CLK falls low, the circuit enters the precharge phase and node X1 is pulled high through PM4, switching the state of INV1-2. During this period the second dynamic node is not actively driven by any transistor; it stores the charge dynamically. The output nodes, including QB, maintain their voltage levels through INV4. If D is low (zero) prior to the overlap period, node X1 remains high and the second dynamic node is pulled low through NM3 as CLK goes high. Thus node QB is charged high through PM2, with the pull-down path held off. At the end of the evaluation phase, as CLK falls low, node X1 remains high and the second dynamic node stores the charge dynamically.

The faults resulting from charge sharing can be reduced. An unconditional shutoff mechanism in the DDFF overcomes this drawback of other FFs. The reason is that charge sharing becomes uncontrollable as the number of NMOS transistors in the stack increases. The analysis reveals that the DDFF is an efficient flip-flop structure in terms of low power and high speed. The power consumption is controlled by clock gating. The analysis approach reveals the sources of performance and power consumption bottlenecks in various design styles.

D. DDFF with double pass transistor logic implementation in PC and PBSC

In electronics, pass transistor logic (PTL) describes several logic families used in the design of integrated circuits. Double pass transistor logic (DPL) is used here. It reduces the number of transistors used to make different logic gates by eliminating redundant transistors. This reduces the number of active devices, but has the disadvantage that the difference in voltage between the high and low logic levels decreases at each stage. Each transistor in series is less saturated at its output than at its input. Here this logic is implemented in the dual dynamic flip-flop to eliminate redundant transistors and to compare power and delay.

Fig 7 DPL based DDFF circuit

Fig 8 Schematic Of 4 Bit Parallel Counter With PBSCG And DPL Based DDFF Circuit

Fig 9 Output waveform of Look ahead clock gating

Fig 10 Output waveform of partial Bus specific clock gating

In Fig 8, as the input varies between high and low, the auto gating condition produces varying output levels at each output port. Gating is applied to the clock signal, which is enabled only when the auto gating condition is high. This gives higher performance.

IV. Power And Delay Results

CIRCUIT             POWER (µW)    DELAY (ns)
PC_LACG_DFF         0.9902113     0.50400
PC_LACG_DDFF        0.3834653     0.48700
PC_PBSCG_DFF        0.2597194     0.44560
PC_PBSCG_DDFF       0.2206        0.4354
PC_PBSC_DDFF_PTL    0.01939930    0.38600

V. Simulation Results

The proposed PBSCG and look-ahead designs, implemented in a parallel counter, have been simulated and verified using the TANNER EDA tools. From these simulations the power consumption of the proposed circuit is obtained, and the design is analyzed as implemented in the proposed form.

Fig 11 Output waveform of 4 bit parallel counter with LACG and using DFF

Fig 12 Output waveform of 4 bit parallel counter with LACG and using DDFF
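The relative savings implied by the power and delay values reported in Section IV can be checked with a few lines of arithmetic. The Python sketch below copies the reported values and expresses the power and delay of each circuit as a percentage reduction relative to PC_LACG_DFF. The choice of PC_LACG_DFF as the baseline and the helper name `reduction_percent` are assumptions made for illustration.

```python
# Reported values: circuit -> (power in microwatts, delay in nanoseconds).
results = {
    "PC_LACG_DFF": (0.9902113, 0.50400),
    "PC_LACG_DDFF": (0.3834653, 0.48700),
    "PC_PBSCG_DFF": (0.2597194, 0.44560),
    "PC_PBSCG_DDFF": (0.2206, 0.4354),
    "PC_PBSC_DDFF_PTL": (0.01939930, 0.38600),
}


def reduction_percent(baseline: float, value: float) -> float:
    """Percentage by which value is lower than baseline."""
    return 100.0 * (baseline - value) / baseline


base_power, base_delay = results["PC_LACG_DFF"]  # assumed baseline
for name, (power, delay) in results.items():
    power_cut = reduction_percent(base_power, power)
    delay_cut = reduction_percent(base_delay, delay)
    print(f"{name:18s} power reduction {power_cut:6.2f}%  "
          f"delay reduction {delay_cut:6.2f}%")
```

With this baseline, the PC_PBSC_DDFF_PTL row corresponds to a power reduction of about 98% and a delay reduction of about 23%.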
Fig 13 Output waveform of 4 bit parallel counter with PBSCG and using DFF

Fig 14 Output waveform of 4 bit parallel counter with PBSCG and using DDFF

Fig 15 Output waveform of 4 bit parallel counter with PBSCG and DPL based DDFF

A transient analysis is carried out assuming typical parameters, specifying the supply voltage in volts, a transient analysis window of 0-1000 ns, the clock period, the data period and the delay time. The total power dissipation of the proposed circuits is improved compared with the existing circuits. The delay of each circuit is also measured and compared with the conventional circuits, by as much as 70-90%. A 90 nm process technology is adopted for the analysis of chip integration. The parallel counter is implemented with both the look-ahead design and the PBSCG design, each designed with both the conventional DFF and the proposed DDFF. Therefore the proposed design is very well suited for low power and high performance applications.

VI. Conclusion

This work proposed an auto gated flip-flop design implemented in a parallel counter, and PBSCG with modified flip-flops. The circuit has been designed and verified using the TSPICE compiler. The computation is characterized by the power and delay timing constraints of each clock pulse. The target matching is pursued through the digital processing of each clock cycle for the auto gated flip-flops. The proposed design follows this approach in the survey and evaluation of the circuits. The partial bus specific structures also enable the data from the inputs and the modified outputs, in accordance with the integrated-level simulation. The analysis has been simulated and verified. The performance of the proposed design is improved compared with the conventional circuit.

References

[1] V. G. Oklobdzija, Digital System Clocking: High-Performance and Low-Power Aspects. New York, NY, USA: Wiley, 2003.
[2] L. Benini, A. Bogliolo, and G. De Micheli, A survey on design techniques for system-level dynamic power management, IEEE Trans. VLSI Syst., vol. 8, no. 3, pp. 299-316, Jun. 2000.
[3] M. S. Hosny and W. Yuejian, Low power clocking strategies in deep submicron technologies, in Proc. IEEE Int. Conf. Integr. Circuit Design Technol., ICICDT 2008, pp. 143-146.
[4] C. Chunhong, K. Changjun, and S. Majid, Activity-sensitive clock tree construction for low power, in Proc. ISLPED, 2002, pp. 279-282.
[5] A. Farrahi, C. Chen, A. Srivastava, G. Tellez, and M. Sarrafzadeh, Activity-driven clock design, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., vol. 20, no. 6, pp. 705-714, Jun. 2001.
[6] W. Shen, Y. Cai, X. Hong, and J. Hu, Activity and register placement aware gated clock network design, in Proc. ISPD, 2008, pp. 182-189.
[7] Synopsys Design Compiler, Version E-2010.12-SP2.
[8] S. Wimer and I. Koren, The optimal fan-out of clock network for power minimization by adaptive gating, IEEE Trans. VLSI Syst., vol. 20, no. 10, pp. 1772-1780, Oct. 2012.
[9] M. Donno, E. Macii, and L. Mazzoni, Power-aware clock tree planning, in Proc. ISPD, 2004, pp. 138-147.
[10] S. Wimer and I. Koren, Design flow for flip-flop grouping in data driven clock gating, IEEE Trans. VLSI Syst., to be published.
[11] M. Muller, S. Simon, H. Gryska, A. Wortmann, and S. Buch, Low power synthesizable register files for processor and IP cores, INTEGRATION, The VLSI J., vol. 39, pp. 131-155, 2006.
[12] A. G. M. Strollo and D. De Caro, Low power flip-flop with clock gating on master and slave latches, Electron. Lett., vol. 36, no. 4, pp. 294-295, Feb. 2000.
[13] C. E. Stroud, R. R. Munoz, and D. A. Pierce, Behavioral model synthesis with Cones, IEEE Design Test Comput., vol. 5, no. 3, pp. 22-30, Jun. 1988.
[14] J. A. Bondy and U. S. R. Murty, Graph Theory. Springer, 2008.
[15] V. Kolmogorov, Blossom V: A new implementation of a minimum cost perfect matching algorithm, Math. Prog. Comp., pp. 43-67, 2009.