Design and Analysis of Metastable-Hardened and Soft-Error Tolerant High-Performance, Low-Power Flip-Flops


David Li, David Rennie, Pierce Chuang, David Nairn, Manoj Sachdev
Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada
{d4li, djrennie, pichuang, nairn, msachdev}@uwaterloo.ca

Abstract

In this paper, we give a detailed analysis of the design of metastable-hardened and soft-error tolerant flip-flops that maintain the basic characteristics of low power and high performance. We also propose two new flip-flop designs: the pre-discharge soft-error tolerant flip-flop (PDFF-SE) and the sense-amplifier transmission-gate soft-error tolerant flip-flop (SATG-SE). Following our main design approach, both PDFF-SE and SATG-SE use a cross-coupled inverter on the critical path in the master-stage to achieve good metastability while generating differential signals that facilitate the use of the Quatro cell in the slave-stage to protect against soft-errors. PDFF-SE is designed to achieve very high performance with good metastability, while SATG-SE is a low-power design, also with good metastability. We also introduce two new design metrics, the metastability-delay-product (MDP) and the metastability-power-delay-product (MPDP), to analyze the design tradeoffs between metastability, power, and performance. Simulation results in 65 nm CMOS technology show that both proposed designs achieve a significant reduction in MDP and MPDP compared with the other flip-flop architectures analyzed in this work. Monte Carlo simulation results also show that these flip-flops are robust against process variations and mismatches.

I. INTRODUCTION

Pipelining is a common technique for achieving high-performance datapaths in digital processors. Timing elements such as flip-flops and latches are an important part of this technique because they synchronize the flow of data from one stage to the next.
As the size and complexity of chip designs grow rapidly, reliability is becoming an important factor in the design of nanometer circuits and systems. Two of the main reliability concerns associated with flip-flop design are metastability and soft-errors.

Fig. 1. Illustration of Metastability in Flip-Flop (voltage versus time, showing the initialization, metastable, resolving, and stable phases)

Metastability is a phenomenon in which a bi-stable element enters an undesirable third state where the output sits at an intermediate level between logic 0 and 1. It typically occurs during the synchronization of two signals, in particular the data (D) and the clock (CLK) signals in the case of latches and flip-flops. If the data changes during the aperture defined by the setup and hold times of the flip-flop, the output may become unpredictable and take an unbounded amount of time to settle to a stable level (Fig. 1). Hence, to maintain a reliable system, it is important to avoid metastable outputs that may propagate from stage to stage in a pipeline and ultimately result in system failure. As CMOS technology continues to scale, factors such as process, voltage, and temperature (PVT) variations, rapidly increasing clock frequencies, and shorter intrinsic delays all contribute to the increasing probability of flip-flops producing metastable outputs [1].

Fig. 2. Illustration of Soft-Error in Flip-Flop

Cosmic radiation-induced single-event transients (SETs), also known as soft-errors, have become a major reliability concern in today's integrated circuits (Fig. 2). Factors such as increasing clock frequencies and decreasing node capacitances and supply voltages all contribute to a drastic increase in the susceptibility of both combinational and sequential circuits to alpha-particle and cosmic-neutron strikes.
In combinational circuits, phenomena such as logical masking, electrical masking, and latch-window masking can all mask the glitches caused by soft-errors [2]. Such masking, however, does not exist in sequential elements such as latches and flip-flops, which contribute approximately 50% of the soft-errors observed in various processors [3]. Recently, the use of tolerant cells [4][5][6] has emerged as a more popular technique for soft-error protection in flip-flops than alternatives such as error-correction codes (ECC) and redundancy, because it offers more design robustness with less delay, power, and area overhead. For example, more than 99% of the latches in the system interface are soft-error-hardened in a state-of-the-art microprocessor design [7].

978-1-61284-914-0/11/$26.00 ©2011 IEEE. 12th Int'l Symposium on Quality Electronic Design.

In this work, we analyze the techniques involved in designing high-performance, low-power flip-flops that also address the reliability issues of metastability and soft-errors. Detailed analysis and simulation results are given on the techniques and issues involved in designing reliable and robust flip-flops. We apply the idea of using a cross-coupled inverter and soft-error tolerant cells to various past flip-flop architectures as well as to the two proposed designs. Two new design metrics, the metastability-delay-product (MDP) and the metastability-power-delay-product (MPDP), are also introduced to analyze and characterize the design tradeoffs between metastability, performance, and power.

The rest of this paper is organized as follows. Section II reviews background information on metastability and soft-error tolerant designs and introduces the new design metrics MDP and MPDP. Section III describes the past and proposed flip-flop architectures as well as the experimental setup and simulation procedures. Section IV provides extensive simulation results comparing the different flip-flop designs. Section V concludes with some final remarks.

II. BACKGROUND INFORMATION

A. Metastability Analysis

Fig. 3 shows an ideal plot of CLK-Q delay versus data arrival time for a typical flip-flop. When data arrives at or near $t_{meta}$, the flip-flop enters the metastable region and requires the longest time to resolve its output. Past studies have shown that the flip-flop delay in the metastable region is exponential in nature, and that two parameters (τ and $T_0$) can be extracted from simulation to model the delay in this region [8]. The metastability window, δ, is defined as the period in which a data transition will not be resolved within a given resolution time ($t_r$). It can be calculated using Equation (1):

Fig. 3. Extraction of Flip-Flop Metastability Parameters

$$\delta = T_0\, e^{-t_r/\tau} \tag{1}$$

where $T_0$ is the asymptotic width of the metastability window with no resolution time, and τ is the resolution time constant, which represents the inverse of the gain-bandwidth product of the feedback element in the flip-flop. If the data transitions at a frequency of $f_D$ with respect to a clock of frequency $f_{CLK}$, the mean time between failures (MTBF) is given by Equation (2):

$$\mathrm{MTBF} = \frac{1}{f_D\, f_{CLK}\, T_0\, e^{-t_r/\tau}} \tag{2}$$

As seen from Equation (1), τ has the greatest impact on the metastability window because of the exponential relationship. A small τ results in fast resolution from the metastable region and thus increases the MTBF [9]. For an MTBF of five years, Fig. 4 plots the required τ as a function of clock frequency, assuming $f_D = 0.5 f_{CLK}$, $t_r = 1/(0.5 f_{CLK})$, and $T_0 = 25$ ps. It is clear that for reliable multi-GHz pipeline systems, a τ of less than 50 ps is highly desirable.

Fig. 4. Relationship between τ and Clock Frequency

The analysis of a cross-coupled inverter (Fig. 5) can be extended to the design of metastable-hardened flip-flops because all the dynamic and critical nodes in flip-flops are stabilized by some form of cross-coupled inverter pair. Using small-signal analysis, the τ of a cross-coupled inverter can be modeled by Equation (3) in terms of the parasitic capacitance $C_L$, the Miller capacitance $C_m$, the loop transconductance $G_m$, and the output resistance $R$ [10][11]:

Fig. 5. Metastability Modeling using Cross-Coupled Inverter

$$\tau = \frac{C_L + 4 C_m}{G_m - \frac{1}{R}} \tag{3}$$

From this equation, it is clear that τ can be minimized by either reducing the parasitic and Miller capacitances or increasing the overall transconductance in the inverter loop. Past work has shown that τ can be reduced if the cross-coupled inverter is on the critical path of the flip-flop architecture, because a larger inverter can then be used to achieve both good performance and higher transconductance [12].

Common figures of merit (FOMs) for comparing flip-flop designs include delay, power, power-delay-product (PDP), and area. As the size of the cross-coupled inverter (Fig. 5) increases, the power consumption increases linearly while the delay initially decreases and then increases due to self-loading (Fig. 6(a)). This power-delay behavior results in an optimum point for PDP (Fig. 6(b)). In terms of τ, an initial increase in inverter size reduces its value because the increase in the transconductance $G_m$ dominates the increase in the parasitic capacitances ($C_L$ and $C_m$), as modeled in Equation (3) and illustrated in Fig. 6(a). As the inverter size continues to increase, τ tends to saturate at a constant value because further increases in $G_m$ are offset by increases in $C_L$ and $C_m$. From Fig. 6(a), it is clear that a design tradeoff exists between τ, delay, and power. Hence, we introduce two new FOMs for designing reliable, high-performance, low-power flip-flops: the metastability-delay-product (MDP) and the metastability-power-delay-product (MPDP). MDP is the product of the resolution time constant τ and the propagation delay (τ × delay), and it measures the design tradeoff between metastability and performance. MPDP, the product of τ and PDP (τ × PDP), extends the traditional PDP metric by introducing metastability into flip-flop design. Fig. 6(b) plots the PDP, MDP, and MPDP values as a function of inverter size. Depending on the design objectives, different inverter sizes can be chosen for optimum PDP, MDP, or MPDP.

Fig. 6. Flip-Flop FOM Analysis: (a) Delay, Power, τ; (b) PDP, MDP, MPDP

B. Soft-Error Tolerant Cells

A number of soft-error tolerant cells have been proposed in the past. In this work, we focus on two particular cells: DICE [13] and Quatro [14]. The Dual-Interlocked Cell (DICE) (Fig. 7(a)) stores a logic 0 or 1 as a combination of four node voltages: two nodes hold the original data and two nodes hold its complement. When the value stored at any one node is modified by an SET, the three unaffected nodes help restore the correct value because one transistor of each inverter driving an affected node is itself driven by an unaffected node. The Quatro cell (Fig. 7(b)) also has four storage nodes. Each of these nodes is driven by an NMOS and a PMOS transistor whose gates are connected to two different nodes. If an SET upsets a node voltage, the affected node is restored by the corresponding ON PMOS (or NMOS) transistor that is connected to the node and driven by an unaffected node. A detailed description of the operation of the Quatro cell in flip-flop design, with simulation waveforms, is given in [15].

In soft-error tolerant flip-flops, the critical internal nodes are protected by being written into the tolerant cells. When writing into the DICE cell, the two written nodes must have the same phase, so the data is written into one of the two same-phase node pairs (Fig. 7(a)). Hence, the DICE cell requires the flip-flop architecture to produce identical signals, which is typically accomplished with a duplicated datapath [4]. The Quatro cell, on the other hand, suits many differential flip-flop architectures because it requires differential signals to be written into one of its two complementary node pairs (Fig. 7(b)). In master-slave flip-flops, both DICE and Quatro cells can be added at the critical nodes in both the master and slave stages, or only in the slave stage to protect the output nodes.
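Returning briefly to the metastability model of Section II-A: the τ requirement plotted in Fig. 4 can be reproduced by inverting Equation (2). The sketch below is illustrative only. It assumes the common exponential form MTBF = e^(t_r/τ) / (f_D · f_CLK · T_0) and the parameter choices quoted in the text (f_D = 0.5 f_CLK, t_r = 1/(0.5 f_CLK), T_0 = 25 ps); the function name `required_tau` and the list of clock frequencies are hypothetical and do not come from the paper.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def required_tau(f_clk_hz, mtbf_s, t0_s=25e-12):
    """Largest tau (in seconds) that still meets the target MTBF.

    Assumes f_D = 0.5 * f_CLK and t_r = 1 / (0.5 * f_CLK), as stated in the text.
    """
    f_d = 0.5 * f_clk_hz
    t_r = 1.0 / (0.5 * f_clk_hz)
    # MTBF = exp(t_r / tau) / (f_D * f_CLK * T0)
    #   =>  tau = t_r / ln(MTBF * f_D * f_CLK * T0)
    return t_r / math.log(mtbf_s * f_d * f_clk_hz * t0_s)


if __name__ == "__main__":
    target_mtbf = 5 * SECONDS_PER_YEAR  # five-year MTBF, as in Fig. 4
    for f_ghz in (1, 2, 3, 4):  # illustrative clock frequencies
        tau = required_tau(f_ghz * 1e9, target_mtbf)
        print(f"{f_ghz} GHz: tau must not exceed {tau * 1e12:.1f} ps")
```

Under these assumptions the allowed τ shrinks roughly in proportion to the clock period, which is consistent with the statement that multi-GHz pipelines call for a τ below about 50 ps.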
While the addition of the tolerant cells increases the immunity of the flip-flops against soft-errors, it also degrades performance because the cells resist changes to the values stored at the critical nodes during normal operation. Hence, two additional clock-controlled transistors are added to the DICE cell (M5 and M8) and the Quatro cell (M7 and M8) in order to maintain high performance. Depending on the location of these cells, these transistors are controlled by the clock phase of either the master-stage or the slave-stage. For example, assuming the flip-flop is positive-edge triggered, during the evaluation period of the slave stage the clocked transistors cut off the NMOS path that holds a logic 0 in the hardened cell, which allows the node to be flipped to logic 1. If these two transistors are not present, contention exists between the flip-flop and the hardened cell when changing the node value from 0 to 1, which results in significant performance degradation. Alternatively, two more clocked transistors can be added in the PMOS paths that hold a logic 1; however, the performance degradation without them is less significant when changing the node value from 1 to 0 because PMOS transistors are weaker than NMOS transistors. Although the critical nodes are protected for only half of the clock cycle using this approach, the likelihood of these nodes being upset by an SET is reduced significantly because either the master-stage or the slave-stage is transparent and evaluating during that time. Simulation results have shown that these transistors improve performance by at least 10%, depending on the flip-flop architecture.

Fig. 7. Soft-Error Tolerant Cells: (a) DICE; (b) Quatro

The power consumption of the DICE and Quatro cells is also analyzed in this work. Ideally, there is no phase offset between the signals being written into the DICE cell. Due to PVT variations and transistor mismatches, however, the two signals can have a small static offset of Δ (i.e., one signal arrives earlier than the other), which opens a few static power dissipation paths in the cell for a given data transition. In the Quatro cell, a static offset of Δ exists even without PVT variations and mismatches because of the inverter delay required to generate the differential signal: the transition of the inverted signal always arrives later than that of the signal from which it is generated. If the original signal makes a 0-1 transition, the inverted signal makes a 1-0 transition one inverter delay later. During this period, four paths in the Quatro cell can dissipate static power because both the PMOS and NMOS transistors are turned on simultaneously. The same scenario does not occur when the original signal makes a 1-0 transition and the inverted signal makes a 0-1 transition, because all the NMOS transistors are turned off.

Fig. 8. Power Consumption of Soft-Error Tolerant Cells

A simple test bench was set up to measure the power consumption of the DICE and Quatro cells (Fig. 8) using a data activity of 25%, an equal number of 0-1 and 1-0 data transitions, and input signals with two rise/fall times: 50 ps and 100 ps. The signs +Δ and −Δ denote the two possible arrival orders of the input signal pair in both cells. From the figure, it is evident that the power consumption of the DICE cell is symmetrical about the point where the static phase offset Δ is 0, which means that it depends only on the absolute value of the phase offset and not on the arrival order of the input signals. In the Quatro cell, the power consumption for a rise/fall time of 50 ps is symmetrical about the Δ = 10 ps point, which is roughly one inverter delay for that rise/fall time. The symmetry point moves to 40 ps when the rise/fall time is 100 ps, which suggests that the inverter delay degrades for input signals with longer rise/fall times. Once again, the power consumption of the Quatro cell is independent of the arrival order of the input signals as long as the number of 0-1 and 1-0 data transitions is equal. If the input data vector has more 0-1 transitions, the power dissipated is significantly higher than when there are more 1-0 transitions. In that scenario, the power consumption is no longer symmetrical for +Δ and −Δ offsets. Finally, a faster rise/fall time results in significant power savings in both the DICE and the Quatro cell, as shown by the comparison between the 50 ps and 100 ps rise/fall times in Fig. 8. The effect of a longer rise/fall time is more prominent in the Quatro cell, where both short-circuit and static power dissipation contribute to the overall power consumption. Based on the above analysis, it is clear that the power consumption of the Quatro cell is generally higher than that of the DICE cell.

Fig. 9. Metastable-Hardened, Soft-Error Tolerant Flip-Flop Designs: (a) DICE-C²MOS; (b) Quatro-C²MOS; (c) DICE-Hazucha; (d) Quatro-Hazucha; (e) PDFF-SE; (f) SATG-SE

III. DESIGN AND ANALYSIS

A. Flip-Flops Analyzed

In this work, we have analyzed the use of the DICE and Quatro cells together with the cross-coupled inverter structure in various flip-flop architectures in order to simultaneously achieve good metastability and soft-error protection while maintaining high performance and low power. The main approach is to resolve metastability in the master-stage with a cross-coupled inverter pair while adding the soft-error tolerant cell in the slave-stage to protect the output nodes against possible SETs. For this reason, high-performance flip-flops such as the TSPC and impulse flip-flops are not included in this work: their metastability is extremely poor because their architectures cannot use a cross-coupled inverter to stabilize the critical nodes.

Two C²MOS-based architectures are analyzed in this work (Fig. 9(a) and (b)). In the Quatro-C²MOS configuration, a cross-coupled inverter pair stabilizes the dynamic nodes of T1 and T2 while improving the metastability. The DICE-C²MOS configuration does not produce differential signals, and hence a separate inverter pair is used on each datapath to improve the metastability in the master stage. The value of τ is limited by the size of the feedback inverter, which must be kept close to minimum size to limit the parasitic capacitance at the critical nodes and so maintain good performance and functionality.

A special soft-error robust latch based on a transmission gate and the DICE cell was proposed in [6]. In this work, we modify the design slightly to create a Hazucha flip-flop (Fig. 9(c) and (d)), using either the DICE or the Quatro tolerant cell, by cascading two identical latches. Instead of the traditional cross-coupled inverter in the master-stage, the DICE and Quatro cells themselves are used to improve metastability in the respective designs, because the cross-coupled inverter structure still exists in these cells, although in a different configuration that improves the immunity against soft-errors.

Two new differential flip-flops are proposed in this work: (i) the pre-discharge soft-error tolerant flip-flop (PDFF-SE, Fig. 9(e)) and (ii) the sense-amplifier transmission-gate soft-error tolerant flip-flop (SATG-SE, Fig. 9(f)). Both designs achieve good metastability with a cross-coupled inverter in the master-stage and are protected against soft-errors by the Quatro cell in the slave-stage. The cross-coupled inverter structure in the master-stage can be sized up to achieve good performance and metastability simultaneously, while the differential nature of the designs facilitates the use of the Quatro cell in the slave-stage. PDFF-SE is targeted at very high performance with good metastability, while SATG-SE is designed for low power consumption, also with good metastability. The master-stage of PDFF-SE is identical to the flip-flop previously proposed in [16]. Its slave-stage is modified to minimize the power consumption by balancing the arrival times of the input signals written into the Quatro cell. In SATG-SE, the master-stage uses a cross-coupled inverter structure similar to that of the sense-amplifier flip-flop. Differential data is written into the flip-flop through two NMOS pass transistors, plus two additional discharge paths that improve the overall performance. The slave-stage of SATG-SE consists of a differential path using transmission-gate switches to facilitate the use of the Quatro cell. With careful design, the power consumption of SATG-SE can be reduced significantly with reasonable performance despite the use of the Quatro cell.

B. Analysis Procedure and Experimental Setup

All the flip-flops analyzed in this work are designed to be positive-edge triggered. An iterative transistor-sizing process was used to achieve the optimum-MPDP design for each flip-flop, giving the best possible combination of delay, power, and metastability (τ). Minimum-sized transistors are used in all the DICE and Quatro cells. All simulations are performed in the Cadence environment using a 65 nm bulk CMOS technology with a nominal supply voltage of 1 V at 27 °C.
The simulation test bench shown in Fig. 10 is used to measure the delay, power, and τ of the flip-flops. Both the data (D) and the clock signals are buffered to ensure that realistic waveforms with 50 ps rise/fall times are fed into the flip-flops. For a fair comparison, the output buffer of each flip-flop architecture is sized identically to drive an output load of 20 fF.

Fig. 10. Simulation Test Bench

The performance of the flip-flops in this work is characterized by the D-Q delay measured at the 50% points. The power consumption is measured at two data activity factors, 10% and 50%. The power measurement includes the total power dissipated in the flip-flop as well as the local data and clock power dissipated in the shaded inverters shown in Fig. 10 [17]. PDP is calculated as the product of the D-Q delay and the total power dissipation for a given data activity. The extraction of τ from the simulation results is identical to the method described in [12][18][19]. The metastability-delay-product (MDP) is calculated as the product of τ and the D-Q delay, while the metastability-power-delay-product (MPDP) is the product of τ and PDP. All values of D-Q delay, τ, PDP, MDP, and MPDP given in this work are the worst-case values over the 0-1 and 1-0 data transitions.

IV. SIMULATION RESULTS

TABLE I summarizes the simulation results for all the metastable-hardened and soft-error tolerant flip-flops analyzed in this work. The addition of minimum-sized cross-coupled inverters on the dynamic nodes of the C²MOS flip-flops to enhance metastability significantly degrades their performance. Without these inverters, however, the value of τ can be as much as 40 times higher. The performance of Quatro-Hazucha is worse than that of DICE-Hazucha because the Quatro cell is more resistant to being written for a 1-0 transition, which results in a higher setup time when it is used in the master-stage.
TABLE I. SIMULATION RESULTS OF METASTABLE-HARDENED, SOFT-ERROR TOLERANT FLIP-FLOPS

Flip-Flop        Delay   Power@10%  Power@50%  τ      MDP      PDP@10%  PDP@50%  MPDP@10%  MPDP@50%
                 (ps)    (μW)       (μW)       (ps)   (ps²)    (fJ)     (fJ)     (fJ·ps)   (fJ·ps)
DICE-C²MOS       79.34   3.61       6.95       24.26  1614.97  0.286    0.551    5.829     13.367
Quatro-C²MOS     73.55   3.88       7.95       27.49  1865.13  0.285    0.584    7.239     16.054
DICE-Hazucha     52.57   3.89       6.88       25.2   1324.76  0.204    0.362    5.153     9.114
Quatro-Hazucha   89.58   4.19       8.32       28.12  1605.09  0.375    0.745    6.720     20.949
PDFF-SE          39.68   5.66       8.05       19.2   761.86   0.225    0.319    4.312     6.133
SATG-SE          56.29   2.97       6.48       20.86  1174.10  0.167    0.365    3.488     7.609

Based on the earlier analysis, the power consumption of Quatro-C²MOS and Quatro-Hazucha is shown to be higher, especially at higher data activity, than that of the same flip-flop architectures using the DICE cell. Minimum transistor sizing is used for the cross-coupled inverter structure in the C²MOS and Hazucha architectures, and therefore their respective τ values are very similar, with the differences due to the parasitic capacitance surrounding the critical node. The proposed PDFF-SE achieves at least a 17% performance improvement over the other flip-flop architectures, but the pre-discharging of the internal nodes during every clock cycle makes its power consumption higher than that of the other flip-flops, especially at low data activity. The proposed SATG-SE maintains performance comparable to the other analyzed flip-flops while achieving a minimum power reduction of 18% and 6% for 10% and 50% data activity respectively. The cross-coupled inverter in both PDFF-SE and SATG-SE can be sized significantly larger than minimum size because it is on the critical path, which results in a lower value of τ. The τ of PDFF-SE and SATG-SE is at least 21% and 14% lower than that of the other flip-flops respectively. The PDP of SATG-SE and PDFF-SE is the lowest among all flip-flops for data activities of 10% and 50% respectively. The PDP of DICE-Hazucha is also small for both data activity factors.
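The product metrics in TABLE I can be checked directly from the tabulated delay, power, and τ. The following sketch is a minimal illustration, not the authors' tooling: the dictionary `TABLE_I` and the helpers `metrics` and `reduction` are hypothetical names, and only three rows are copied from the table. These rows were chosen because their tabulated products follow from the tabulated columns; since the paper reports worst-case values over the 0-1 and 1-0 transitions, the products for some other rows may not factor in this simple way.

```python
# Rows copied from TABLE I:
# (delay in ps, power in uW at 10% activity, power in uW at 50% activity, tau in ps)
TABLE_I = {
    "DICE-Hazucha": (52.57, 3.89, 6.88, 25.20),
    "PDFF-SE": (39.68, 5.66, 8.05, 19.20),
    "SATG-SE": (56.29, 2.97, 6.48, 20.86),
}


def metrics(delay_ps, power_uw, tau_ps):
    """Return (PDP in fJ, MDP in ps^2, MPDP in fJ*ps) for one flip-flop."""
    pdp_fj = delay_ps * power_uw * 1e-3  # ps * uW = 1e-18 J = 1e-3 fJ
    mdp_ps2 = tau_ps * delay_ps          # MDP = tau * delay
    mpdp_fj_ps = tau_ps * pdp_fj         # MPDP = tau * PDP
    return pdp_fj, mdp_ps2, mpdp_fj_ps


def reduction(new, ref):
    """Fractional reduction of `new` relative to `ref`."""
    return 1.0 - new / ref


if __name__ == "__main__":
    for name, (delay, p10, p50, tau) in TABLE_I.items():
        pdp10, mdp, mpdp10 = metrics(delay, p10, tau)
        _, _, mpdp50 = metrics(delay, p50, tau)
        print(f"{name}: MDP={mdp:.2f} ps^2, PDP@10%={pdp10:.3f} fJ, "
              f"MPDP@10%={mpdp10:.3f}, MPDP@50%={mpdp50:.3f} fJ*ps")

    mdp_pdff = metrics(*TABLE_I["PDFF-SE"][:2], TABLE_I["PDFF-SE"][3])[1]
    mdp_dice = metrics(*TABLE_I["DICE-Hazucha"][:2], TABLE_I["DICE-Hazucha"][3])[1]
    print(f"MDP reduction, PDFF-SE vs DICE-Hazucha: {reduction(mdp_pdff, mdp_dice):.1%}")
```

Keeping the unit conversion in one place (ps × μW = 10⁻³ fJ) makes it easy to extend the dictionary with additional rows or with per-transition measurements if they are available.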
Since PDFF-SE exhibits both the best performance and the lowest τ, its MDP value is significantly lower than that of the other flip-flops (for example, 43% lower than DICE-Hazucha), indicating a well-balanced design tradeoff between performance and metastability. While higher than that of PDFF-SE, the MDP of SATG-SE is still at least 12% lower than that of the other flip-flops. For 10% data activity, the MPDP of PDFF-SE and SATG-SE is 16% and 32% lower, respectively, than that of the DICE-Hazucha flip-flop. At 50% data activity, the minimum MPDP reduction of PDFF-SE and SATG-SE relative to the other flip-flops is 33% and 17%, respectively.

In this work, we also analyze the robustness of each flip-flop architecture against process variations and mismatches when it is operating near the metastable region. For each flip-flop, the data arrival time at which the flip-flop first fails to capture the correct data was determined. This point, at which the flip-flop is operating in the metastable region, is referred to as t_meta. A Monte Carlo simulation of 2000 iterations with both process variations and mismatches was performed to determine the number of clock cycles in which the correct data was sampled. The data arrival time of the flip-flop is then relaxed from t_meta by a certain value, and another set of Monte Carlo simulations is performed. This procedure (Fig. 11) is repeated for a number of data arrival time values until the sampled data is 100% correct. Based on previous studies and simulation results, a 20 ps flip-flop metastable region from t_meta is assumed.

[Fig. 11. Waveform for Monte Carlo Simulation. Labels: relaxed arrival time (logic 1 correctly sampled); metastable region (20 ps); t_meta (logic 1 incorrectly sampled as logic 0).]

Fig. 12 shows the Monte Carlo simulation results of each flip-flop architecture for both 0-1 and 1-0 data transitions at various data arrival times.
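The sweep procedure described above can be sketched with a toy statistical model. This is not the circuit-level simulation used in the paper: it assumes, purely for illustration, that process variation and mismatch shift the failure point by a zero-mean Gaussian offset, and the names `fraction_correct`, `sweep_arrival_time`, `sigma_ps`, and `step_ps` are hypothetical.

```python
import random


def fraction_correct(relax_ps, sigma_ps, iterations, rng):
    """Fraction of Monte Carlo runs that sample the data correctly.

    Toy assumption: each run shifts the failure point by a zero-mean
    Gaussian offset (standard deviation sigma_ps); the run is correct
    when the relaxed arrival time still clears the shifted failure point.
    """
    correct = sum(
        1 for _ in range(iterations) if rng.gauss(0.0, sigma_ps) < relax_ps
    )
    return correct / iterations


def sweep_arrival_time(sigma_ps=3.0, step_ps=2.0, max_relax_ps=20.0,
                       iterations=2000, seed=1):
    """Relax the data arrival time from t_meta until sampling is 100% correct.

    Returns a list of (relaxation from t_meta in ps, fraction correct).
    The 2000 iterations and the 20 ps window follow the text; sigma_ps and
    step_ps are illustrative values, not taken from the paper.
    """
    rng = random.Random(seed)
    results = []
    relax = 0.0
    while relax <= max_relax_ps:
        frac = fraction_correct(relax, sigma_ps, iterations, rng)
        results.append((relax, frac))
        if frac == 1.0:
            break  # every run sampled the correct data; stop relaxing
        relax += step_ps
    return results


for relax, frac in sweep_arrival_time():
    print(f"t_meta + {relax:4.1f} ps: {100.0 * frac:5.1f}% correct")
```

In this model the fraction of correct samples starts near 50% at t_meta and rises as the arrival time is relaxed; a smaller `sigma_ps` plays a role loosely analogous to a faster-resolving flip-flop, although the mapping to τ is not established here.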
At t_meta for each respective flip-flop, the percentage of correctness is approximately 50%, which suggests total randomness when the flip-flop enters metastability [20]. As the data arrival time is relaxed, the percentage gradually increases at different rates for different flip-flops, depending on their resolving time constant. It is interesting to note that the flip-flops with a lower τ value have an overall higher percentage of correctness, and thus are more robust against process variations and mismatches. For example, PDFF-SE and SATG-SE have an overall correctness of 83% and 81%, respectively, in the metastable region for the 0-1 data transition, and 81% and 78% for the 1-0 data transition. Quatro-C²MOS and Quatro-Hazucha, on the other hand, have the highest τ values and consequently yield the lowest overall percentages: 75% and 74% for the 0-1 data transition and 75% and 71% for the 1-0 data transition, respectively.

V. CONCLUSIONS

In this work, we have analyzed the design of metastable-hardened and soft-error tolerant master-slave flip-flops and proposed two new flip-flop designs. The main approach is to resolve metastability in the master-stage with a cross-coupled inverter pair while adding the soft-error tolerant cell in the slave-stage to protect the output nodes against possible soft errors.

To achieve good metastability, it is desirable to have the cross-coupled inverter on the critical path of the flip-flop in order to increase the overall loop gain and lower the value of τ, the resolving time constant. Two new design metrics, MDP and MPDP, are introduced to characterize the design tradeoff between performance, power, and metastability.

The DICE and Quatro cells are the two soft-error tolerant cells used in the flip-flop designs to provide protection against soft errors. The former requires the flip-flop to generate in-phase signals to be written into the cell, while the latter requires a differential signal. Compared to the traditional designs, additional clocked transistors are added to both cells in this work in order to maintain high performance. The power dissipation of the Quatro cell is higher than that of the DICE cell due to the inverter delay in generating the differential path as well as more leakage paths.

The proposed flip-flops, PDFF-SE and SATG-SE, use a cross-coupled inverter on the critical path in the master-stage to achieve good metastability while generating differential signals to facilitate the use of the Quatro cell in the slave-stage to protect against soft errors. PDFF-SE is designed to achieve very high performance with good metastability, while SATG-SE is a low-power design also with good metastability. Simulation results have shown that both designs achieve a significant reduction in MDP and MPDP when compared to the other flip-flop architectures analyzed in this work. Monte Carlo simulation results demonstrate that the two proposed flip-flops are very robust against process variations and mismatches.

[Fig. 12. Flip-Flop Robustness against Process Variations and Mismatches: (a) 0-1 Transition, (b) 1-0 Transition.]

REFERENCES

[1] K. Bowman et al., "Energy-Efficient and Metastability-Immune Resilient Circuits for Dynamic Variation Tolerance," Journal of Solid-State Circuits, vol. 44, pp. 49-63, January 2009.
[2] Y. Dhillon, A. Diril, A. Chatterjee, and A. Singh, "Analysis and Optimization of Nanometer CMOS Circuits for Soft-Error Tolerance," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, pp. 514-524, May 2006.
[3] S. Mitra, N. Seifert, M. Zhang, Q. Shi, and K. Kim, "Robust System Design with Built-In Soft-Error Resilience," Computer, vol. 38, pp. 43-52, February 2005.
[4] R. Naseer and J. Draper, "DF-DICE: a Scalable Solution for Soft Error Tolerant Circuit Design," International Symposium on Circuits and Systems, pp. 3890-3893, May 2006.
[5] W. Wang and H. Gong, "Edge Triggered Pulse Latch Design with Delayed Latching Edge for Radiation Hardened Application," IEEE Transactions on Nuclear Science, vol. 51, pp. 3626-3630, December 2004.
[6] P. Hazucha et al., "Measurements and Analysis of SER-Tolerant Latch in a 90nm Dual-VT CMOS Process," Journal of Solid-State Circuits, vol. 39, pp. 1536-1543, September 2004.
[7] D. Krueger, E. Francom, and J. Langsdorf, "Circuit Design for Voltage Scaling and SER Immunity on a Quad-core Itanium Processor," ISSCC, Digest of Technical Papers, pp. 94-95, February 2008.
[8] M. Baghini and M. Desai, "Impact of Technology Scaling on Metastability Performance of CMOS Synchronizing Latches," 7th Asia and South Pacific Design Automation Conference and 15th International Conference on VLSI Design, pp. 317-322, January 2002.
[9] F. Rosenberger and T. Chaney, "Flip-Flop Resolving Time Test Circuit," Journal of Solid-State Circuits, vol. SC-17, pp. 731-738, August 1982.
[10] T. Kacprzak and A. Albicki, "Analysis of Metastable Operation in RS CMOS Flip-Flops," Journal of Solid-State Circuits, vol. SC-22, pp. 57-64, February 1987.
[11] L. Kim and R. Dutton, "Metastability of CMOS Latch/Flip-Flop," Journal of Solid-State Circuits, vol. 25, pp. 942-951, August 1990.
[12] D. Li, P. Chuang, and M. Sachdev, "Comparative Analysis and Study of Metastability on High-Performance Flip-Flops," 11th International Symposium on Quality Electronic Design, pp. 853-860, March 2010.
[13] T. Calin, M. Nicolaidis, and R. Velazco, "Upset Hardened Memory Design for Submicron CMOS Technology," IEEE Transactions on Nuclear Science, vol. 43, pp. 2874-2878, December 1996.
[14] S. Jahinuzzaman, D. Rennie, and M. Sachdev, "A Soft Error Tolerant 10T SRAM Bit-Cell with Differential Read Capability," IEEE Transactions on Nuclear Science, vol. 56, pp. 3768-3773, December 2009.
[15] S. Jahinuzzaman, D. Rennie, and M. Sachdev, "Soft Error Robust Impulse and TSPC Flip-Flops in 90nm CMOS," 2nd Microsystems and Nanoelectronics Research Conference, pp. 45-48, October 2009.
[16] D. Li, P. Chuang, and M. Sachdev, "Design of a Novel High-Performance Pre-Discharge Flip-Flop (PDFF)," 8th Northeast Workshop on Circuits and Systems Conference, pp. 42-45, June 2010.
[17] V. Stojanovic and V. G. Oklobdzija, "Comparative Analysis of Master-Slave Latches and Flip-Flops for High-Performance and Low-Power Systems," Journal of Solid-State Circuits, vol. 34, pp. 536-548, April 1999.
[18] C. Portmann and T. Meng, "Metastability in CMOS Library Elements in Reduced Supply and Technology Scaled Applications," Journal of Solid-State Circuits, vol. 30, pp. 39-46, January 1995.
[19] U. Ko and P. Balsara, "High-Performance Energy-Efficient D-Flip-Flop Circuits," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 8, pp. 94-98, February 2000.
[20] C. Tokunaga, D. Blaauw, and T. Mudge, "True Random Number Generator With a Metastability-Based Quality Control," Journal of Solid-State Circuits, vol. 43, pp. 78-85, January 2008.