Product Level MTBF Calculation


2014 Fifth International Conference on Intelligent Systems, Modelling and Simulation

Ang Boon Chong
eASIC Corp
bang@easic.com

Abstract: Synchronizers are used to sample asynchronous data in digital circuits and protect chips from metastability failure. Mean time between failures (MTBF) degrades with technology scaling, while chip performance, the number of clock domains on chip and the use of synchronizer chains all increase, so MTBF requirements are getting tougher to meet. The objective of this paper is to share the calculation of the proper number, N, of synchronizer stages, the derivation of the product level MTBF, and the caveats of the traditional product level MTBF estimation. Hopefully the sharing will benefit the readers.

Keywords: Metastability, Synchronization, Chip

I. INTRODUCTION

In a synchronous design, all sequential elements have to satisfy setup and hold requirements to ensure that a valid state propagates. Metastability happens when asynchronous signals are transferred and the resulting output goes into an undetermined state. A common example is data violating the setup and hold requirement of a sequential element. It is illustrated by the asynchronous transfer in Figure 1 and the flip-flop's behavior during metastability in Figure 2.

Figure 1 Asynchronous Transfer [1]-[3]

Figure 2 Metastability FF Behavior

When a flip-flop is in a metastable state, its output hovers at a voltage level between high and low, which delays the output transition beyond the specified clock-to-output delay ($T_{co}$), as shown in Figure 2. During metastability, extra resolution time ($T_W$) is needed after the normally specified clock-to-output delay $T_{co}$. If it is not accounted for with extra timing slack, system failures may occur.

The duration of the metastable condition is probabilistic, so there is no guaranteed maximum time. Entering a metastable state is a probabilistic function of the clock frequency, the transition frequency of the asynchronous data signal and a constant that defines the window in which a transition can cause metastability. Metastability can appear as a flip-flop that switches late or does not switch at all. It can also present a brief pulse or oscillations at the flip-flop output. Any of these conditions can cause system failure. Once the flip-flop is in a metastable state, the value to which it resolves cannot be determined. It is analogous to a ball rolling over a hill, as shown in Figure 3. Each side of the hill represents a stable state and the top of the hill represents the metastable state.

Figure 3 Metastability Analogy

The problem with metastable events is not merely their occurrence, but that an event can cause inconsistent values to be latched into subsequent flip-flops if it is not synchronized properly, as shown in Figure 4.

2166-0662/14 $31.00 © 2014 IEEE. DOI 10.1109/ISMS.2014.137

Figure 4 Metastability Propagation and Mitigation

In this example, if one flip-flop latches a value of 1 while another latches a value of 0, the design can become unpredictable and may fail. This can occur because the two paths shown could have different routing delays. The evolution of MTBF models is summarized in Table 1.

TABLE 1 SUMMARY OF MULTISTAGE SYNCHRONIZER'S MTBF MODELS [4]-[12]

A typical flip-flop is shown in Figure 5.

Figure 5 Flip-Flop Schematic

Based on Figure 5, consider how the flip-flop enters metastability. Assume the clock is low, node A is at logic 1 and input D transitions from low to high. As a result, node A is falling and node B is rising. When the clock, CLK2, rises, it disconnects the input from node A and closes the A-B loop. If A and B happen to be around their metastable levels, it takes them a long time to diverge towards legal digital values. During metastability, the voltage levels of nodes A and B of the master latch are roughly midway between logic 1 and logic 0. The exact voltage levels depend on transistor sizing and process variations, and are not necessarily equal for the two nodes. The settling time a flip-flop needs to leave metastability can be derived by plotting the data arrival time at node D versus the clock-to-output delay at node B for fixed clock and data slew, as shown in Table 2.

TABLE 2 DATA ARRIVAL VERSUS CLKTOOUT DELAY SPICE SIMULATION

The scope of this paper is the product level MTBF formulation and the caveats of traditional product level MTBF estimation, together with a brief introduction to MTBF measurement techniques and metaharden flip-flop analysis.

II. MTBF MEASUREMENT TECHNIQUE

To accurately determine the settling time, $T_W$, Table 2 can be further processed into Table 3.

TABLE 3 PROCESSED DATA

In Table 3, $T_{dc}$ is the data arrival time at which the clock-to-output delay is longer than the usual clock-to-output delay, $T_{co}$, for fixed data and clock slew. $T_{extra}$ is the extra clock-to-output delay required. The negative value of $T_{dc}$ is due to the setup violation as the data shifts from the left towards the leading edge of the clock transition. From Table 3, when the data arrival happens at 4.96 ns, the clock-to-output delay starts to increase. $T_{dc}$ is extracted by deducting the 4.96 ns from the data arrival time at which no data is captured at the flop's output. The plot of $T_{dc}$ versus $T_{extra}$ is shown in Figure 6.

Figure 6 $T_{dc}$ versus $T_{extra}$

From Figure 6, as $T_{dc}$ approaches 0, $T_{extra}$ increases exponentially, while as the magnitude of $T_{dc}$ increases, $T_{extra}$ reduces to 0. The relation can be expressed in the following manner [4]:

$$T_{extra} = -\tau \ln\left(\frac{|T_{dc}|}{T_W}\right) \ \text{where } |T_{dc}| \le T_W, \ \text{else } T_{extra} = 0 \qquad (1)$$

The value of $T_W$ can be derived from equation (1) using the sampling points of $T_{dc}$ at -33 ps and -13 ps.

During metastability exit, the two inverters operate in the linear region of their transfer function. This can be modeled in small signal with the inverters as negative amplifiers of gain $-A$, each driving an output resistance $R$ and a capacitive load $C$, as shown in Figure 7.

Figure 7 Master Latch Small Signal Model

Based on Figure 7, the model can be written as

$$\frac{-A V_B - V_A}{R} = C \frac{dV_A}{dt} \qquad (2)$$

$$\frac{-A V_A - V_B}{R} = C \frac{dV_B}{dt} \qquad (3)$$

Assuming symmetric cross-coupled inverters (the same $A$, $R$ and $C$ for both), and subtracting equation (3) from equation (2),

$$\frac{(A - 1)(V_A - V_B)}{R} = C \frac{d(V_A - V_B)}{dt} \qquad (4)$$

Redefining $V = V_A - V_B$, equation (4) simplifies to

$$\frac{dV}{dt} = \frac{A - 1}{RC} V \qquad (5)$$

Assuming $A \gg 1$, equation (5) simplifies to

$$\frac{dV}{dt} = \frac{V}{\tau} \qquad (6)$$

where $\tau = RC/A$. Hence the solution is

$$V(t) = K e^{t/\tau} \qquad (7)$$

where $K$ is a constant set by the initial voltage difference.

The simulation setup for the circuit exiting metastability is shown in Figure 8. In the simulation, the switch starts closed with a small supply chosen to match the non-equilibrium state based on the voltage transfer curve. Once the required equilibrium voltage is determined, it is defined as the DC supply value. The switch is closed (shorted) initially and then opened at around 1 ns to allow the latch to resolve.

The anticipated voltage transfer curve is shown in Figure 9.
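The two parameter extractions described in this section can be sketched in Python. This is a minimal illustration, assuming equation (1) has the form $T_{extra} = -\tau \ln(|T_{dc}|/T_W)$ and equation (7) has the form $V(t) = K e^{t/\tau}$. The function names and the numeric values (a $\tau$ of 20 ps and a $T_W$ of 60 ps) are hypothetical and are not taken from the paper's SPICE data.

```python
import math


def extra_delay(t_dc, tau, t_w):
    """Extra clock-to-output delay from the assumed form of equation (1)."""
    d = abs(t_dc)
    if d == 0:
        return math.inf      # data lands exactly on the metastable point
    if d > t_w:
        return 0.0           # outside the metastability window
    return -tau * math.log(d / t_w)


def fit_tau_tw(t_dc_1, t_extra_1, t_dc_2, t_extra_2):
    """Solve equation (1) for tau and T_W from two (T_dc, T_extra) samples."""
    d1, d2 = abs(t_dc_1), abs(t_dc_2)
    # Subtracting the two instances of equation (1) eliminates T_W.
    tau = (t_extra_1 - t_extra_2) / math.log(d2 / d1)
    # Back-substitute into equation (1) to recover T_W.
    t_w = d1 * math.exp(t_extra_1 / tau)
    return tau, t_w


def tau_from_voltage_samples(t1, v1, t2, v2):
    """tau from two samples of the exponential V(t) = K * exp(t / tau)."""
    return (t2 - t1) / math.log(v2 / v1)


# Hypothetical example: synthesize two samples at T_dc = -33 ps and -13 ps
# from known parameters, then check that the fit recovers them.
TAU_PS, TW_PS = 20.0, 60.0
sample_1 = (-33.0, extra_delay(-33.0, TAU_PS, TW_PS))
sample_2 = (-13.0, extra_delay(-13.0, TAU_PS, TW_PS))
tau_fit, tw_fit = fit_tau_tw(*sample_1, *sample_2)
print(f"tau = {tau_fit:.2f} ps, T_W = {tw_fit:.2f} ps")
```

With real data, the two samples would come from Table 3 and the voltage samples from the latch simulation; fitting over more than two points (for example by least squares on the log-linear form) would be less sensitive to noise in any single point.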

Figure 8 Exiting Metastability Simulation Circuit Setup

Figure 9 Voltage Transfer Curve

From Figure 9, it is observed that the voltage transfer curve is not symmetric. Hence, an initial voltage supply of 0.037 V is required to bring the voltages of nodes A and B to the equilibrium state. If the metastable state values are equal, the latch is symmetric. For a symmetric latch, the initial voltage supply, v

The plot of the natural log of the voltage difference versus time is shown in Figure 10.

Figure 10 Output Voltage Difference Versus Time

Based on equation (7), at two sampling times $t_1$ and $t_2$,

$$V(t_1) = K e^{t_1/\tau} \qquad (8)$$

$$V(t_2) = K e^{t_2/\tau} \qquad (9)$$

Dividing equation (9) by equation (8),

$$\frac{V(t_2)}{V(t_1)} = e^{(t_2 - t_1)/\tau} \qquad (10)$$

Hence the value of $\tau$ can be derived from the linear slope of Figure 10.

III. FLOP LEVEL MTBF REQUIREMENT

To derive the flop level mean time between failures, MTBF(S), for the typical 2 stage synchronizer in Figure 1, we simplify the notation to Figure 11.

Figure 11 Async Flop Transfer

From equation (1), we can derive the failure rate of a register in the presence of a data source whose transition times are uncorrelated to the clock input. Let $T_{slack}$ be the amount of timing slack given to the synchronizer registers to resolve the metastable state. When $T_{extra}$ is greater than the available timing slack, $T_{slack}$, errors can propagate to downstream logic and cause a system error.

Hence the probability of failure after the sampling edge of the capturing clock can be expressed as:

$$p(\text{failure}) = p(\text{enter metastability}) \times p(T_{extra} > T_{slack}) \qquad (11)$$

Substituting equation (1) into equation (11), with $f_{clk}$ the frequency of the capturing clock,

$$p(\text{failure}) = T_W f_{clk} \, e^{-T_{slack}/\tau} \qquad (12)$$

Taking into account the rate, $f_{data}$, of the asynchronous data entering metastability, as shown in Figure 11, the rate of expected failures is:

$$\text{rate}(\text{failure}) = T_W f_{clk} f_{data} \, e^{-T_{slack}/\tau} \qquad (13)$$

For an N stage synchronizer, where N is greater than 2 flip-flops, the expected failure rate can be written as:

$$\text{rate}(\text{failure}) = T_W f_{clk} f_{data} \, e^{-(N-1) T_{slack}/\tau} \qquad (14)$$

The mean time between failures, MTBF, for N stage synchronizers is the inverse of the expected failure rate:

$$MTBF(S) = \frac{e^{(N-1) T_{slack}/\tau}}{T_W f_{clk} f_{data}} \qquad (15)$$

From equation (15), it is observed that for N stage synchronizers, the cumulative slack, $(N-1) T_{slack}$, is critical for a better MTBF(S) of the synchronizer chain.
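The flop level calculation, and the choice of the number of stages N mentioned in the abstract, can be sketched as follows. This is a minimal sketch, assuming equation (15) takes the standard form $MTBF(S) = e^{(N-1) T_{slack}/\tau} / (T_W f_{clk} f_{data})$. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def synchronizer_mtbf(n_stages, t_slack, tau, t_w, f_clk, f_data):
    """MTBF in seconds of an N stage chain, assumed form of equation (15)."""
    exponent = (n_stages - 1) * t_slack / tau
    return math.exp(exponent) / (t_w * f_clk * f_data)


def min_stages(target_mtbf, t_slack, tau, t_w, f_clk, f_data):
    """Smallest N (at least 2) whose MTBF meets target_mtbf (seconds)."""
    # Invert equation (15): (N - 1) * t_slack / tau >= ln(target * T_W * f_clk * f_data)
    needed = tau * math.log(target_mtbf * t_w * f_clk * f_data) / t_slack
    return max(2, math.ceil(needed) + 1)


# Hypothetical operating point (not from the paper).
params = dict(
    t_slack=1e-9,    # slack per stage, seconds
    tau=20e-12,      # resolution time constant, seconds
    t_w=50e-12,      # metastability window, seconds
    f_clk=500e6,     # capturing clock frequency, Hz
    f_data=50e6,     # asynchronous data toggle rate, Hz
)
target = 1e9 * SECONDS_PER_YEAR          # a target of one billion years
n_required = min_stages(target, **params)
mtbf_years = synchronizer_mtbf(n_required, **params) / SECONDS_PER_YEAR
print(f"N = {n_required}, MTBF = {mtbf_years:.3g} years")
```

Because the slack appears in the exponent, each added stage multiplies the MTBF by $e^{T_{slack}/\tau}$, which is why a small change in N or in slack moves the result by many orders of magnitude.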

IV. METAHARDENING FLOP TRADE OFF

A comparison of the metahardened flop versus the normal flop is shown in Table 4.

TABLE 4 METAHARDEN FLOP COMPARISON
             Setup   Hold   Setup + Hold   %diff   Clk2Out   %diff   Tw   Tau     Leakage   Dynamic
Original     61      4      65                     187               78   1.055   0.298     7.019
Metaharden   58      4      62             4.62%   182       2.67%   84   0.338   0.732     7.666

From Table 4, it is observed that although the metaharden flop provides better settling time, the clock-to-output delay is degraded as the loading from the master latch increases. To improve the clock-to-output delay, further optimization of the slave latch is required. An increase in leakage power from metaharden flip-flop usage is expected, as the best metaharden flop settling time, $T_W$, is obtained by using low threshold voltage (LVT) cells.

V. CHIP LEVEL MTBF REQUIREMENT

Equation (15) provides a formula for a single synchronizer chain. As a design may consist of M synchronizer chains, denoted as $S_1, S_2, \ldots, S_M$, the effective mean time between failures of the entire chip can be derived as [4]:

$$MTBF(C) = \frac{1}{\sum_{i=1}^{M} \frac{1}{MTBF(S_i)}} \qquad (16)$$

Equation (16) is valid under the hypothesis of independent failures, which may not always be the case. Equation (16) shows that for a chip with M synchronizer chains, the effective mean time between failures corresponds to 1/M of the harmonic mean of all synchronizer chain MTBFs, and the chip's MTBF is dominated by the worst chain in the design.

The traditional method derives the chip mean time between failures, MTBF(C), from the worst synchronizer chain's MTBF divided by the total number of synchronizer chains, M, in the chip. This results in an extremely pessimistic chip level MTBF(C), as the synchronizer chain conditions in Table 5 show.

TABLE 5 CHIP'S MEAN TIME BETWEEN FAILURE
Destination clock   Chains per clock domain   MTBF single chain (Year)   MTBF per clock domain (Year)
clock A             200                       3.19E+06                   1.60E+04
clock B             100                       1.13E+06                   1.13E+04
clock C             20                        5.65E+03                   2.83E+02
clock D             100                       1.40E+06                   1.40E+04
clock E             10                        1.40E+06                   1.40E+05
total               430                       chip MTBF                  2.65E+02

From Table 5, based on the traditional method, the calculated chip MTBF(C) would be 5.65E+03 / 430 = 13 years, while the actual chip MTBF is 265 years. Hence the traditional chip MTBF(C) calculation is pessimistic. It leads to overdesign, either through unnecessary metaharden flops or through an unnecessarily high number of synchronizer stages per chain, which degrades system performance.

VI. TOTAL PRODUCTS MTBF REQUIREMENT

For a product that consists of L unique chips with different chip level MTBFs, denoted as $MTBF(C_1), MTBF(C_2), \ldots, MTBF(C_L)$, the product's mean time between failures, MTBF(P), can be derived as:

$$MTBF(P) = \frac{1}{\sum_{j=1}^{L} \frac{1}{MTBF(C_j)}} \qquad (17)$$

Equation (17) is valid under the hypothesis of independent failures, which may not always be the case. Equation (17) shows that for a total of L chips in a product, the effective mean time between failures corresponds to 1/L of the harmonic mean of all chip MTBFs, and the product's MTBF is dominated by the worst chip's MTBF in the product. An example of the product MTBF is shown in Table 6.

TABLE 6 PRODUCT'S MEAN TIME BETWEEN FAILURE
chip     chip count   single chip MTBF (Year)   MTBF per chip per product (Year)
chip A   1            3.19E+03                  3.19E+03
chip B   2            1.13E+03                  5.65E+02
chip C   3            5.65E+03                  1.88E+03
chip E   1            1.40E+03                  1.40E+03
total    7            product MTBF              3.00E+02

From Table 6, it is shown that the effective chip MTBF(C) reduces when more than one instance of the chip is used in a product.

If a total of K units of the same product are operating in the field, the total product mean time between failures, MTBF(TP), can be derived as:

$$MTBF(TP) = \frac{MTBF(P)}{K} \qquad (18)$$

Assuming a total of 80 products delivered and operating in the field, the effective chip MTBF per total product is shown in Table 7.
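The chip level and product level aggregation can be checked numerically against Tables 5 and 6. The sketch below assumes equations (16) and (17) take the usual form for independent failures, $1/MTBF = \sum_i 1/MTBF_i$. The helper name `combined_mtbf` is hypothetical, and the input figures are the ones printed in the tables.

```python
def combined_mtbf(mtbfs):
    """Reciprocal of the summed failure rates (assumed form of eq. (16)/(17))."""
    return 1.0 / sum(1.0 / m for m in mtbfs)


# Table 5: destination clock -> (chains in the domain, single-chain MTBF in years)
table5 = {
    "clock A": (200, 3.19e6),
    "clock B": (100, 1.13e6),
    "clock C": (20, 5.65e3),
    "clock D": (100, 1.40e6),
    "clock E": (10, 1.40e6),
}

# 'count' identical chains in one domain combine to single_mtbf / count.
per_domain = [mtbf / count for count, mtbf in table5.values()]
chip_mtbf = combined_mtbf(per_domain)

# Traditional estimate: worst single chain divided by the total chain count.
total_chains = sum(count for count, _ in table5.values())
worst_chain = min(mtbf for _, mtbf in table5.values())
traditional_mtbf = worst_chain / total_chains

# Table 6: chip -> (instances per product, single-chip MTBF in years)
table6 = {
    "chip A": (1, 3.19e3),
    "chip B": (2, 1.13e3),
    "chip C": (3, 5.65e3),
    "chip E": (1, 1.40e3),
}
product_mtbf = combined_mtbf(mtbf / count for count, mtbf in table6.values())

print(f"chip MTBF        = {chip_mtbf:.0f} years")
print(f"traditional MTBF = {traditional_mtbf:.1f} years")
print(f"product MTBF     = {product_mtbf:.0f} years")
```

The gap between the two chip level figures comes from the traditional method treating all 430 chains as if each were as weak as the 20 chains in clock C's domain.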

TABLE 7 PRODUCT'S MEAN TIME BETWEEN FAILURE
chip     chip count   single chip MTBF (Year)   MTBF per chip per product (Year)   MTBF per total chip in total product (Year)
chip A   1            3.19E+03                  3.19E+03                           3.99E+01
chip B   2            1.13E+03                  5.65E+02                           7.06E+00
chip C   3            5.65E+03                  1.88E+03                           2.35E+01
chip E   1            1.40E+03                  1.40E+03                           1.75E+01
total    7            product MTBF              3.00E+02                           3.76E+00

From Table 7, it is observed that chip B has the worst chip level MTBF of all the chips. The effective total product MTBF is merely 3.76 years for a total of 80 products shipped and operating in the field. This implies that, on average, one product will experience a mysterious failure in the field every 3.76 years.

During the chip design phase, an IC supplier typically has visibility of the chip consumption in the targeted market segment as well as the required operating time in each market segment. The required operating time in various market segments is shown in Table 8.

TABLE 8 RELIABILITY REQUIREMENT FOR DIFFERENT MARKET SEGMENT

If the chip's MTBF(C) is improved by a factor of the total chip count in the total product, the resulting MTBF for the total product is shown in Table 9.

TABLE 9 IMPROVED PRODUCT'S MEAN TIME BETWEEN FAILURE
chip     chip count   single chip MTBF (Year)   MTBF per chip per product (Year)   MTBF per total chip in total product (Year)
chip A   1            2.55E+05                  2.55E+05                           3.19E+03
chip B   2            1.81E+05                  9.04E+04                           1.13E+03
chip C   3            1.36E+06                  4.52E+05                           5.65E+03
chip E   1            1.12E+05                  1.12E+05                           1.40E+03
total    7            product MTBF              3.83E+04                           4.79E+02

From Table 9, it is observed that if the chip's MTBF(C) is improved by a factor of the total chip count in the total product, the total product MTBF(TP) improves to 479 years. Hence a product will experience a mysterious failure in the field once every 479 years. With the improved MTBF value, the product is safe from asynchronous transfer reliability concerns.
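The fleet level figures in Tables 7 and 9 can be reproduced with a short calculation. The sketch below assumes equation (18) is $MTBF(TP) = MTBF(P)/K$ and that the product MTBF combines per-chip failure rates as reciprocals. The function names and the `scale_by_total_count` flag are hypothetical; the flag models the Table 9 scenario, where each single-chip MTBF is multiplied by that chip's total count across all shipped products.

```python
def combined_mtbf(mtbfs):
    """Reciprocal of the summed failure rates (independent failures assumed)."""
    return 1.0 / sum(1.0 / m for m in mtbfs)


def fleet_mtbf(chips, units_in_field, scale_by_total_count=False):
    """Total product MTBF in years for 'units_in_field' shipped products.

    chips maps a chip name to (instances per product, single-chip MTBF in years).
    """
    per_product = []
    for count, single_mtbf in chips.values():
        if scale_by_total_count:
            # Table 9 scenario: improve each chip by its total count in the field.
            single_mtbf = single_mtbf * count * units_in_field
        per_product.append(single_mtbf / count)
    return combined_mtbf(per_product) / units_in_field


# Figures from Table 6 and the 80-unit assumption in the text.
chips = {
    "chip A": (1, 3.19e3),
    "chip B": (2, 1.13e3),
    "chip C": (3, 5.65e3),
    "chip E": (1, 1.40e3),
}
UNITS = 80

baseline_years = fleet_mtbf(chips, UNITS)
improved_years = fleet_mtbf(chips, UNITS, scale_by_total_count=True)
print(f"baseline fleet MTBF = {baseline_years:.2f} years")
print(f"improved fleet MTBF = {improved_years:.0f} years")
```

A design team could run the same calculation in reverse, starting from the operating time required in a market segment and the expected shipment volume, to obtain the per-chip MTBF target that the synchronizer chains must meet.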
VII. CONCLUSION

The summaries of product level mean time between failures are:

- The chip level MTBF needs to account for the total number of chips shipped across all products, to ensure that the product can operate reliably in the field.
- Deriving a single chip's MTBF from the worst synchronizer chain's MTBF divided by the total synchronizer chain count results in overdesign, through metaharden flip-flops or a higher number of synchronizer stages, which degrades system performance.

ACKNOWLEDGEMENTS

Thanks to Lai Kok Keong and Massimo Verita for the support given.

REFERENCES

[1] D. Kinniment, K. Heron and G. Russell, "Measuring Deep Metastability," ASYNC, 10 pp.-11, 2006.
[2] C. Dike and E. Burton, "Miller and noise effects in synchronizing flip-flop," JSSC, 34(6):849-855, 1999.
[3] S. Beer, R. Ginosar, M. Priel, R. Dobkin and A. Kolodny, "An on-chip metastability measurement circuit to characterize synchronization behavior in 65nm," ISCAS, pp. 2593-2596, 2011.
[4] D. Chen, D. Singh et al., "A comprehensive approach to modeling, characterizing and optimizing for metastability in FPGAs," FPGA, 2010.
[5] L. Kleeman and A. Cantoni, "Metastable behavior in Digital Systems," IEEE Design and Test of Computers, 4(6), 4-19, 1987.
[6] C. Brown and K. Feher, "Measuring metastability and its effect on communication signal processing systems," IEEE Transactions on Instrumentation and Measurement, 46(1), 1997.
[7] D. Kinniment, Synchronization and Arbitration in Digital Systems, Wiley, 2007.
[8] S. Beer, R. Ginosar et al., "The Devolution of Synchronizers," ASYNC, 2010.
[9] Terrence Mak, "Truncation Error Analysis of MTBF Computation for Multi-Latch Synchronizers," Elsevier, Microelectronics Journal, pp. 1-10, 2011.
[10] T.J. Gabara, G.J. Cyr and C.E. Stroud, "Metastability of CMOS Master-Slave flip-flops," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 734-740, 1992.
[11] C. Myers, E. Mercer and H. Jacobson, "Verifying synchronization strategies," in Formal Methods for Globally Asynchronous Locally Synchronous (GALS) Architecture, 2003.
[12] I.W. Jones, S. Yang and M. Greenstreet, "Synchronizer Behavior and Analysis," ASYNC, pp. 117-126, 2009.