Characterizing Dynamic and Leakage Power Behavior in Flip-Flops

R. Ramanarayanan, N. Vijaykrishnan and M. J. Irwin
Dept. of Computer Science and Engineering
Pennsylvania State University, PA 1682

Abstract

This paper presents a detailed analysis of power consumption in a variety of flip-flop designs, including scannable latches. The analysis was performed by implementing and simulating the different designs in a 7nm, 1V CMOS technology. First, we characterize in detail the dynamic power consumed by output transitions and the power consumed by clock and data transitions when there is no output transition. We then characterize the leakage behavior of each flip-flop design and, specifically, the dependence of leakage on the input state.

Index Terms: Flip-flops, Power characterization, Leakage power.

I. INTRODUCTION

Power and performance analysis of flip-flops has always been critical because of their use in data path designs. With rapid technology scaling, it is important to gain better insight into power consumption, especially that due to leakage current. This analysis is useful for data path power analysis, as flip-flops are an important component of current pipelined architectures. Further, the benefits of leakage management techniques such as input vector control [8] can be modeled accurately by accounting for the variation of flip-flop leakage with the input state.

In this paper, we present a detailed analysis of the dynamic and leakage power consumption of different styles of flip-flops designed in a 7nm, 1V supply voltage CMOS process. The designs evaluated encompass three widely used design styles: master-slave, pulse-triggered and sense-amplifier based flip-flops. In the sense-amplifier based designs, we further consider two variations, one with a scan element and another without a scan element.
This variation is considered because scan mechanisms are increasingly employed to support the testability of designs. In addition to power consumption, we also evaluate important timing metrics for each of the flip-flop designs.

Earlier works on flip-flops focused on characterizing them in terms of energy and energy-delay product during various transitions and presented some power-performance trade-offs [1], [2]. Here, the focus is mainly on characterizing the dynamic power consumed when the output changes and when the output remains the same but the clock or data changes. Our work also performs a detailed characterization of leakage power, which is becoming very important in sub-nm regimes.

II. ANALYSIS

A. Timing Metrics

The timing parameters considered in this work are the basic flip-flop parameters: the clock-to-output propagation delay, the setup time and the hold time. The clock-to-output delay is measured from the active edge of the clock to the output. The setup and hold times are the minimum times for which the data must be held stable before and after the clock transition, respectively. In all our designs except one, the clock period is 1 ns. For the semi-dynamic flip-flop, which turned out to be very slow, a clock period of 2.2 ns is used.

B. Power Metrics

The power metric used here is the average power consumed over a constant time interval for various input states and transitions. We measure four types of power consumption: the transition power, the clock power, the data power and the leakage power. The transition power is defined as the power consumed by the flip-flop when there is an output transition. The clock and data powers are measured (for the entire flip-flop) during the respective transitions while the other signals (including the output) remain steady. The leakage power is consumed regardless of whether there is a transition.
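The average-power metric defined above can be illustrated with a short sketch. It assumes the simulator exports sampled supply-current values over the measurement interval and that power is taken as the supply voltage times that current; the function name, the trapezoidal integration and the sample format are illustrative assumptions, not details given in this paper. Only the 1V supply comes from the text.

```python
from typing import Sequence


def average_power(times_s: Sequence[float],
                  supply_current_a: Sequence[float],
                  vdd_v: float = 1.0) -> float:
    """Average supply power (W) over [times_s[0], times_s[-1]].

    Hypothetical helper: integrates vdd * i(t) with the trapezoidal rule
    and divides by the length of the interval.
    """
    if len(times_s) != len(supply_current_a) or len(times_s) < 2:
        raise ValueError("need two or more matching time/current samples")

    energy_j = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        if dt <= 0:
            raise ValueError("time samples must be strictly increasing")
        # Trapezoid: mean current over the step, times voltage, times dt.
        mean_i = 0.5 * (supply_current_a[k] + supply_current_a[k - 1])
        energy_j += vdd_v * mean_i * dt

    return energy_j / (times_s[-1] - times_s[0])


# Illustrative input only: a triangular current pulse within a 1 ns window.
pulse_power_w = average_power([0.0, 0.5e-9, 1.0e-9], [0.0, 200e-6, 0.0])
```

Running the same measurement once per input state or transition (output change, clock-only change, data-only change, no change) would give the four power types listed above.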
Further, leakage power is state dependent: it differs for different states of the clock, data and output. This state dependency is characterized by the percentage difference between the maximum and minimum leakage states.

III. FLIP-FLOP DESIGNS

A. Master-Slave Latch Pairs

The master-slave flip-flops are designed as a latch pair, where one latch is transparent high and the other is transparent low. The transmission gate flip-flop (TGFF) [1], [2], [3], derived from the PowerPC 603, is one of the fastest and lowest-power flip-flop designs. Here, for a clock period of 1 ns, the setup and hold times are both found to be .1 ns. The worst-case propagation delay is .299 ns. The clock power consumed is very close to the transition power. Further, we observe that the state dependency of the leakage power results in a 7% difference between the minimum and maximum leakage states, which is quite significant. Thus, when flip-flops are not in use, it is necessary not only to gate their clock to reduce dynamic power but also to set their inputs to a low-leakage state.

Table 1. Power Characterization of TGFF (N/A means not applicable)

  Q 0 --> 1              N/A        15.44
  Q 1 --> 0              149.44     N/A
  D 0 --> 1, C = 1       74.14      74.18
  D 1 --> 0, C = 1       55.24      55.21
  D 0 --> 1, C = 0       125.44     125.5
  D 1 --> 0, C = 0       71.51      71.52
  C 0 --> 1, D = 0       N/A        146.82
  C 1 --> 0, D = 0       75.11      74.94
  C 0 --> 1, D = 1       147.15     N/A
  C 1 --> 0, D = 1       74.88      75.2
  C = 0, D = 0           .97        .8
  C = 1, D = 1           .66        .96
  C = 1, D = 0           .94        .78
  C = 0, D = 1           .89        .112

The double edge triggered flip-flop (DETFF) is designed using two TGFF blocks, one active at the positive edges and the other active at the negative edges of the clock. Thus, its timing parameters are similar to those of the previous design. But because of the increased number of transistors, the area is twice that of the TGFF. Furthermore, as can be seen from the results, power consumption increases compared with the TGFF: the power consumed per clock cycle more than doubles. However, the clock frequency can be suitably reduced to achieve greater power efficiency for similar performance [4]. This design is found to have the maximum data power, with the leakage power having a state dependency of around 81%.

The C2MOS [1] flip-flop falls into the category of pseudo-static master-slave flip-flops. These designs, originally dynamic, are made pseudo-static by adding a feedback loop at the end of the master and of the slave. The feedback works in a way similar to the keeper circuits in dynamic logic, simply holding the previous output value when the respective latch is in hold mode.
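The state-dependency percentages quoted for the designs, and the advice to park unused flip-flops in a low-leakage state, can be sketched as follows. The text does not say whether the percentage difference is taken relative to the minimum or the maximum leakage; this sketch assumes the minimum. The function name, the (C, D, Q) state keys and the leakage values are hypothetical and are not taken from the tables.

```python
from typing import Dict, Tuple

State = Tuple[int, int, int]  # (clock C, data D, output Q), an assumed encoding


def leakage_state_dependency(leakage_by_state: Dict[State, float]):
    """Return (percent difference, lowest-leakage state, highest-leakage state).

    Assumption: percent difference = 100 * (max - min) / min.
    """
    low_state = min(leakage_by_state, key=leakage_by_state.get)
    high_state = max(leakage_by_state, key=leakage_by_state.get)
    low = leakage_by_state[low_state]
    high = leakage_by_state[high_state]
    if low <= 0:
        raise ValueError("leakage values must be positive")
    return 100.0 * (high - low) / low, low_state, high_state


# Hypothetical leakage values (arbitrary units) for four static states.
example_leakage = {
    (0, 0, 0): 50.0,
    (0, 1, 0): 60.0,
    (1, 0, 0): 55.0,
    (1, 1, 1): 80.0,
}
percent, park_state, worst_state = leakage_state_dependency(example_leakage)
```

Here `park_state` is the state an input vector control scheme would aim for when the flip-flop is idle, under the assumptions stated above.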
The setup and hold times of the C2MOS flip-flop are found to be .1 and .2 ns respectively, while the worst-case propagation delay is .146 ns. Here the transition and leakage power are quite low compared with the other designs considered in this work, but this design has comparatively high data and clock power. One reason is the large number of clocked transistors in this design. The state dependency of the leakage power is also quite low, with the percentage difference found to be around 43%.

Table 2. Power Characterization of DETFF

  Q 0 --> 1              N/A        158.928
  Q 1 --> 0              158.9      N/A
  D 0 --> 1, C = 1       139.924    139.949
  D 1 --> 0, C = 1       73.347     73.374
  D 0 --> 1, C = 0       139.945    139.972
  D 1 --> 0, C = 0       73.363     73.374
  C 0 --> 1, D = 0       N/A        152.449
  C 1 --> 0, D = 0       N/A        76.477
  C 0 --> 1, D = 1       154.35     N/A
  C 1 --> 0, D = 1       76.46      N/A
  C = 0, D = 0           .133       .78
  C = 1, D = 1           .95        .141
  C = 1, D = 0           .132       .78
  C = 0, D = 1           .95        .141

Table 3. Power Characterization of C2MOSFF

  Q 0 --> 1              N/A        114.232
  Q 1 --> 0              73.669     N/A
  D 0 --> 1, C = 1       91.48      95.525
  D 1 --> 0, C = 1       39.578     39.447
  D 0 --> 1, C = 0       2.75       1.835
  D 1 --> 0, C = 0       4.781      2.13
  C 0 --> 1, D = 0       31.742     131.78
  C 1 --> 0, D = 0       N/A        57.844
  C 0 --> 1, D = 1       132.337    131.983
  C 1 --> 0, D = 1       75.894     N/A
  C = 0, D = 0           .49        .52
  C = 1, D = 1           .69        .7
  C = 1, D = 0           .59        .69
  C = 0, D = 1           .64        .58

B. Pulse-Triggered Flip-Flops

Pulse-triggered flip-flops generate pulses during the active edge of the clock, which in turn would result in a
transition at the output. This technique makes the flip-flop switch faster, but at the cost of higher power dissipation due to glitches in the output and the internal signals. This is especially the case with the hybrid latch flip-flop (HLFF) [1], [2], [5], which has very low propagation delay, setup time and hold time. The setup time is found to be 0 and the hold time is .2 ns, with a worst-case propagation delay of .9 ns. As expected, the transition power consumption is very high. The clock power here is sometimes greater than the transition power, especially when the data line is one. This may be due to spurious transitions at the internal nodes caused by the clock. The relatively low leakage power may be attributed to the large number of stacked transistors in this design. But the state dependency of the leakage power is found to be high, with the percentage difference being 98%.

Table 4. Power Characterization of HLFF

  Q 0 --> 1              N/A        317.573
  Q 1 --> 0              138.83     N/A
  D 0 --> 1, C = 1       .891       5.46
  D 1 --> 0, C = 1       7.43       4.499
  D 0 --> 1, C = 0       6.571      14.267
  D 1 --> 0, C = 0       14.26      13.955
  C 0 --> 1, D = 0       N/A        71.146
  C 1 --> 0, D = 0       63.49      63.434
  C 0 --> 1, D = 1       331.89     N/A
  C 1 --> 0, D = 1       63.857     65.169
  C = 0, D = 0           .86        .83
  C = 1, D = 1           .98        .79
  C = 1, D = 0           .5         .49
  C = 0, D = 1           .14        .99

The semi-dynamic flip-flop (SDFF) [1], [2], [6] is another such design, which works by triggering a pulse at an internal node. This flip-flop is transparent during a time window determined by the delay through two inverters and a NAND gate. However, this feature causes a static-one hazard [6] at the output when both D and C are one, which is a major problem of this topology. It also results in unnecessary additional power consumption. Thus, this flip-flop has the maximum transition power among all designs considered. The setup time is again 0, while the hold time is .1 ns. But the worst-case propagation delay is around .98 ns. Thus the minimum clock period had to be 2.2 ns.

Table 5. Power Characterization of SDFF

  Q 0 --> 1              N/A        352.351
  Q 1 --> 0              196.312    N/A
  D 0 --> 1, C = 1       1.323      7.492
  D 1 --> 0, C = 1       1.689      7.512
  D 0 --> 1, C = 0       .594       6.745
  D 1 --> 0, C = 0       4.817      4.558
  C 0 --> 1, D = 0       N/A        18.9
  C 1 --> 0, D = 0       215.359    88.327
  C 0 --> 1, D = 1       438.227    N/A
  C 1 --> 0, D = 1       217.343    9.434
  C = 0, D = 0           .82        .86
  C = 1, D = 1           .19        .115
  C = 1, D = 0           .17        .16
  C = 0, D = 1           .135       .132

C. Sense-Amplifier Based Flip-Flops and Scannable Latches

The sense-amplifier based flip-flop (SAFF) [2], used in the StrongARM, is evaluated here. The advantage of such flip-flops is that they are highly sensitive to input transitions. The SR latch made of NAND gates at the output stage is the major bottleneck in terms of both power and performance. This design has a propagation delay of .593 ns. Its setup and hold times are similar to those of the HLFF. Further, this design consumes the least transition and leakage power of the designs considered here.

Table 6. Power Characterization of SAFF

  Q 0 --> 1              N/A        53.465
  Q 1 --> 0              52.933     N/A
  D 0 --> 1, C = 1       .855       .845
  D 1 --> 0, C = 1       .845       .856
  D 0 --> 1, C = 0       2.519      3.168
  D 1 --> 0, C = 0       3.168      2.519
  C 0 --> 1, D = 0       N/A        53.85
  C 1 --> 0, D = 0       139.88     139.783
  C 0 --> 1, D = 1       53.855     N/A
  C 1 --> 0, D = 1       139.954    138.895
  C = 0, D = 0           .53        .59
  C = 1, D = 1           .59        .53
  C = 1, D = 0           .32        .37
  C = 0, D = 1           .46        .46

The output SR stage of the SAFF can be suitably modified to get improved performance [1]. This modified
design is implemented in the two scannable designs [6] that are used here. Because scannable designs are popular, it is important to analyze them. The modified sense-amplifier flip-flops can offer additional power reduction while also incorporating scan elements.

The edge-triggered scannable SA latch (ETSA) [6] has very low power overhead due to the scan element. These flip-flops operate in two modes: the normal mode, in which the device acts as an ordinary flip-flop, and the scan mode, in which it acts as a scannable latch used for testing purposes. This design has a propagation delay of .366 ns. Further, this design is found to have the least data and clock power, while the state dependency of its leakage power is the maximum at 2%.

Table 7. Power Characterization of ETSA

  Q 0 --> 1              N/A        55.121
  Q 1 --> 0              54.841     N/A
  D 0 --> 1, C = 1       .183       .171
  D 1 --> 0, C = 1       .183       .172
  D 0 --> 1, C = 0       .188       .597
  D 1 --> 0, C = 0       .593       .176
  C 0 --> 1, D = 0       N/A        55.63
  C 1 --> 0, D = 0       77.686     76.934
  C 0 --> 1, D = 1       55.221     N/A
  C 1 --> 0, D = 1       17.928     76.467
  C = 0, D = 0           .15        .8
  C = 1, D = 1           .5         .52
  C = 1, D = 0           .53        .5
  C = 0, D = 1           .79        .139

The level sensitive scannable device SA latch (LSSDSA) [6] is another approach to incorporating the scan element. Its advantage is that it is race free and more robust. But the increased number of clocks increases both the complexity and the power. This design has timing characteristics similar to those of the SAFF, with a worst-case propagation delay of .168 ns, and it has the maximum clock and leakage power consumption among all designs considered.

Table 8. Power Characterization of LSSDSA

  Q 0 --> 1              N/A        224.643
  Q 1 --> 0              227.625    N/A
  D 0 --> 1, C = 1       1.12       .96
  D 1 --> 0, C = 1       .97        1.22
  D 0 --> 1, C = 0       1.353      2.37
  D 1 --> 0, C = 0       3.661      2.95
  C 0 --> 1, D = 0       N/A        228.358
  C 1 --> 0, D = 0       216.474    22.88
  C 0 --> 1, D = 1       224.22     N/A
  C 1 --> 0, D = 1       217.15     219.647
  C = 0, D = 0           .324       .342
  C = 1, D = 1           .366       .341
  C = 1, D = 0           .372       .349
  C = 0, D = 1           .338       .338

IV. COMPARISON

A detailed study of performance and power has been presented for each design. The transition power of all the designs is compared here (Figure 1). The semi-dynamic flip-flop is found to have the maximum power consumption, due to the glitches produced internally during a transition. The HLFF and LSSDSA latches follow it closely. The SAFF is found to have the least transition power consumption, followed by the ETSA latch.

Figure 1. Comparison of Transition Power (vertical axis: power in microwatts; horizontal axis: type of flip-flop)

Apart from the transition power, the data and clock power are also analyzed here. The data power (Figure 2) is generally higher in master-slave flip-flops than in the pulse-triggered and sense-amplifier based designs. The DETFF consumes the maximum data power, while the ETSA consumes the minimum. The clock power (Figure 3) depends mainly on the number of clocked transistors and how they affect the signal, along with a few other factors. The level sensitive scannable latch clearly consumes the maximum clock power, and the edge-triggered scannable latch the least. Clock gating can provide significant gains for designs with high clock power.
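As a rough illustration of why clock gating pays off most for designs with high clock power, the sketch below estimates the average clock power left after gating. It is a first-order model under stated assumptions: the clock is fully suppressed during idle cycles, and any cost of the gating logic is lumped into one optional overhead term. The function name, parameters and example values are hypothetical and are not measurements from this paper.

```python
def gated_clock_power(clock_power_uw: float,
                      idle_fraction: float,
                      gating_overhead_uw: float = 0.0) -> float:
    """Estimated average clock power (microwatts) with ideal clock gating.

    Assumption: clock power is spent only in the non-idle fraction of
    cycles, plus a constant overhead for the gating logic.
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must lie in [0, 1]")
    return clock_power_uw * (1.0 - idle_fraction) + gating_overhead_uw


# Hypothetical comparison: the same idle fraction saves more absolute
# power in a design with higher ungated clock power.
high_clock_saving = 200.0 - gated_clock_power(200.0, 0.75)
low_clock_saving = 60.0 - gated_clock_power(60.0, 0.75)
```

In this model the absolute saving scales with the ungated clock power, which is the sense in which gating favors the high-clock-power designs.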
The analysis of leakage power (Figure 4) has also become crucial with technology scaling. Hence the designs are compared with respect to leakage power. The SAFF is found to consume the least leakage power, while the LSSDSA latch consumes the most.

Figure 2. Comparison of Data Power (vertical axis: power in microwatts)

Figure 3. Comparison of Clock Power (vertical axis: power in microwatts)

Figure 4. Leakage Power (vertical axis: power in nanowatts; designs shown: TGFF, DETFF, C2MOSFF, HLFF, SDFF, SAFF, ETSA, LSSDSA)

V. CONCLUSION

The characterization of dynamic and leakage power was discussed for different classes of widely used flip-flops. A detailed comparison was also presented. We believe this characterization will be useful to other works in choosing power-efficient flip-flops.

VI. ACKNOWLEDGEMENT

This work was supported in part by grants from GSRC, NSF Awards Career 9385 and 8264.

VII. REFERENCES

[1] Dejan Markovic, Borivoje Nikolic and Robert W. Brodersen, Analysis and Design of Low-Energy Flip-Flops, ISLPED 1, August 21, pp. 52-55.
[2] Vladimir Stojanovic, Vojin G. Oklobdzija and Raminder Bajwa, A Unified Approach in the Analysis of Latches and Flip-Flops for Low-Power Systems, ISLPED 98, August 1-12, 1998, pp. 227-232.
[3] Vladimir Stojanovic and Vojin G. Oklobdzija, Comparative Analysis of Master-Slave Latches and Flip-Flops for High-Performance and Low-Power Systems, IEEE JSSC, April 1999, pp. 536-548.
[4] Antonio G. M. Strollo, Ettore Napoli, and Carlo Cimino, Analysis of Power Dissipation in Double Edge-Triggered Flip-Flops, IEEE Transactions on VLSI Systems, October 2, pp. 624-629.
[5] Nikola Nedovic and Vojin G. Oklobdzija, Hybrid Latch Flip-Flop with Improved Power Efficiency, (SBCCI'), pp. 211-215.
[6] Nikola Nedovic and Vojin G. Oklobdzija, Dynamic Flip-Flop with Improved Power, 26th European Solid-State Circuits Conference, Stockholm, Sweden, 19-21 September 2, pp. 323-326.
[7] V. Zyuban and D. Meltzer, Clocking Strategies and Scannable Latches for Low Power Applications, ISLPED 1, August 21, pp. 346-351.
[8] A. Chandrakasan, W. J. Bowhill, and F. Fox, Design of High-Performance Microprocessor Circuits, IEEE Press, 21.