DESIGN AND SIMULATION OF A CIRCUIT TO PREDICT AND COMPENSATE PERFORMANCE VARIABILITY IN SUBMICRON CIRCUIT

Sripriya. B.R, Student of M.Tech, Dept. of ECE, SJB Institute of Technology, Bangalore
Dr. Nataraj. K. R, Head of the Department, Dept. of ECE, SJB Institute of Technology, Bangalore

ABSTRACT

We present a technique for compensating process, voltage and temperature variations, which arise from manufacturing and environmental variability in submicron circuits, using a canary flip-flop. The canary flip-flop predicts a timing error before it actually occurs, and the circuit performance is then compensated so that system operation is not affected. We design a 16-bit Brent-Kung adder in 45-nm CMOS technology whose performance is controlled by supply voltage scaling. We show that this technique can compensate process, supply voltage, and temperature variations and improve the energy efficiency of submicron circuits. We also discuss how to determine design parameters such as the insertion location and the buffer delay of the canary FF.

Key words: Manufacturing variability, Timing error prediction, Brent-Kung adder, Speed control unit, Canary flip-flop.

I. INTRODUCTION

The manufacturing processes needed to make an integrated circuit (IC) are among the most sophisticated ever invented. They are inherently sensitive to all kinds of disturbances and, as a result, the manufactured IC components exhibit large variations in their electrical parameters [1]. Process-related variability of semiconductor device characteristics has always been one of the hardest problems for IC process engineers and circuit designers. In nanometer-scale devices, variability approaches the limits set by the discrete nature of solid matter and cannot be reduced by improvements in manufacturing processes and equipment.

A good example is the doping concentration. The electrical characteristics of MOS transistors are controlled by introducing dopant atoms into the electrically active regions of these devices. The number of dopant atoms in the channel region of a deep submicron device is of the order of tens or even single atoms; for example, the estimated average number of boron atoms in a 32 nm device is 3.5. This means that one device will have 3 atoms and another 4, a 25% variation of the dopant dose.

As CMOS transistors are scaled to nanometer feature sizes, variations in transistor performance and leakage have become a critical challenge. There are three main sources of variation: process variation, supply voltage variation and operating temperature variation. These variations cannot be avoided and lead to unavoidable variations of the device parameters. Therefore it is crucial to estimate these variations and to account for them in the circuit design process.

II. CANARY FLIP-FLOP

In a canary FF, each FF (main FF) is augmented with a delay buffer and a redundant FF named the shadow FF, as shown in Figure 1. The shadow FF is used like a canary in a coal mine to help detect whether a timing error is about to occur [2]. Timing errors are predicted by comparing the main FF value with that of the shadow FF, which runs into the timing error a little before the main FF. The resulting alert signal triggers voltage or frequency control.

Figure 1. Block diagram of Canary flip-flop

Utilizing canary FFs has the following three advantages.

1. Elimination of the delayed clock: Using a single-phase clock simplifies clock tree design. It also eliminates the short path problem [3] of the Razor FF, so its minimum-path length constraint need not be considered.

2. Protection offered against timing errors: As explained above, in Canary, the shadow FF protects the main FF against timing errors. This freedom from timing errors eliminates any complex recovery mechanism. Hence, Canary is applicable to common LSIs as well as to modern microprocessors that have a recovery mechanism for branch mispredictions. If the canary FF predicts a timing error, the supply voltage is increased to satisfy the timing constraints.

3. Robustness for variations: The canary FF is variation resilient. The delay buffer always has a positive delay, even though parameter variations affect it. Hence, the shadow FF always encounters a timing error before the main FF [5].

Figure 2 shows the configurable canary flip-flop, in which the canary flip-flop is augmented with delay buffers. The buffer delay is calculated by:

F1 = 1/(Dc + Dd)
F2 = 1/Dc
Delay = F1 - F2

where Dc is the circuit delay and Dd is the buffer delay.
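The prediction mechanism described in this section can be illustrated with a small behavioral sketch. It is not the authors' circuit: the class name CanaryFF, the parameter names and the idealized sampling rule (a value is captured only if it arrives within the clock period, with setup time ignored) are assumptions made for illustration. The main FF sees the data after the circuit delay Dc, the shadow FF sees it after Dc + Dd, and a warning is raised when the two captured values differ.

```python
from dataclasses import dataclass


@dataclass
class CanaryFF:
    """Idealized behavioral model of a canary flip-flop (illustrative only)."""
    clock_period: float   # clock period T, in the same unit as the delays
    buffer_delay: float   # Dd, delay of the buffer in front of the shadow FF

    def sample(self, new_value: int, old_value: int, circuit_delay: float):
        """Model one clock edge and return (main, shadow, warning).

        circuit_delay is Dc, the arrival time of new_value at the main FF.
        A FF that misses the edge keeps old_value (assumed behavior).
        """
        # Main FF captures the new value if the data path met timing.
        main = new_value if circuit_delay <= self.clock_period else old_value
        # Shadow FF sees the same data delayed by Dd, so it fails first.
        shadow_arrival = circuit_delay + self.buffer_delay
        shadow = new_value if shadow_arrival <= self.clock_period else old_value
        # Warning: main and shadow disagree, so the margin is below Dd.
        warning = main != shadow
        return main, shadow, warning
```

With a clock period of 10 and Dd = 2 (arbitrary units), a path delay of 7 gives no warning, while a path delay of 9 gives a warning although the main FF still holds the correct value. A path delay of 11 is already an actual timing error in this model, which is why the buffer delay must be large enough for the warning to arrive first.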

Figure 2. Configurable Canary flip-flop

III. SPEED CONTROL UNIT

The speed control unit alters the speed of the Brent-Kung adder whenever the warning signal goes high. Figure 3 shows a schematic of the speed control unit. Four speed levels are provided by applying four different supply voltages.

Figure 3. Block diagram of Speed control unit

VDD1, VDD2, VDD3, and VDD4 are selected according to the speed level stored in a two-bit register; for instance, when the stored value in the speed level register is three, VDD3 is selected. Circuit operation starts at the maximum speed level. When the timer signal is asserted, the speed control unit decrements the speed level by one and the circuit is slowed down. In contrast, when the warning signal is asserted, the speed control unit immediately increments the speed level by one.

IV. BRENT-KUNG ADDER

The Brent-Kung adder is a parallel prefix adder. Parallel prefix adders are a special class of adders based on the use of generate and propagate signals. The simpler Brent-Kung adder has been proposed to address the disadvantages of Kogge-Stone adders: its cost and wiring complexity are greatly reduced. However, the logic depth of a Brent-Kung adder increases to 2 log2(n) - 1, so its speed is lower. The block diagram of a 4-bit Brent-Kung adder is shown in Figure 4. To achieve a high-speed circuit with a less complex design, the Brent-Kung adder is a good candidate and is suitable for parallel adder designs with wide inputs. The propagate and generate functions are implemented with an Exclusive-OR gate and an AND gate, respectively.

Figure 4. Brent-Kung adder
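To make the generate/propagate structure concrete, the following is a minimal software sketch of a Brent-Kung prefix tree. It is a behavioral model, not the transistor-level design of this paper; the function name brent_kung_add and the returned level count are introduced here for illustration only. The up-sweep builds group generate and propagate signals over blocks of 2, 4, 8, ... bits, and the down-sweep fills in the remaining prefixes.

```python
def brent_kung_add(a: int, b: int, n: int = 16, cin: int = 0):
    """Add two n-bit integers with a Brent-Kung prefix tree (behavioral sketch).

    Returns (sum, carry_out, levels), where levels counts prefix-tree levels.
    n must be a power of two.
    """
    assert n >= 1 and n & (n - 1) == 0, "n must be a power of two"
    # Bitwise propagate (XOR) and generate (AND) signals.
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    G, P = g[:], p[:]
    G[0] |= P[0] & cin          # fold the carry-in into bit 0
    levels = 0

    # Up-sweep: combine blocks of size d into blocks of size 2d.
    d = 1
    while d < n:
        for i in range(2 * d - 1, n, 2 * d):
            G[i] |= P[i] & G[i - d]
            P[i] &= P[i - d]
        d *= 2
        levels += 1

    # Down-sweep: complete the prefixes skipped by the up-sweep.
    d = n // 4
    while d >= 1:
        for i in range(3 * d - 1, n, 2 * d):
            G[i] |= P[i] & G[i - d]
            P[i] &= P[i - d]
        d //= 2
        levels += 1

    # G[i] is now the carry out of bit i; sum bit i is p[i] XOR carry into i.
    carries = [cin] + G[:-1]
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, G[-1], levels
```

For n = 16 the sketch uses 4 up-sweep and 3 down-sweep levels, which matches the 2 log2(n) - 1 depth quoted above.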

V. PROPOSED BLOCK DIAGRAM

Figure 5. Block diagram of test circuit: 16-bit Brent-Kung adder with speed control unit and canary flip-flop

The structure of the circuit is depicted in Figure 5. It consists of the configurable canary flip-flop, the speed control unit and the Brent-Kung adder. A 16-bit Brent-Kung adder is adopted as the circuit whose performance is controlled by the speed control unit. The circuit speed is controlled digitally, and the term speed level is used to describe how fast or slow the circuit is set to run. A higher speed level means the circuit is controlled for faster operation.

The output of the Brent-Kung adder consists of the sum bits S[0] to S[15], of which only S[0] to S[7] are connected to the configurable canary flip-flop. The technique pads the datapath with a delay element and samples the delayed datapath signal in another flip-flop, called the canary flip-flop, as shown in Figure 5. Each flip-flop (main flip-flop) is augmented with a delay buffer and a redundant flip-flop (shadow flip-flop). Timing errors are predicted by comparing the main flip-flop value with that of the shadow flip-flop, which runs into the timing error a little before the main flip-flop. The alert signal triggers voltage control.

The warning signal is monitored during a specified period. Once a warning signal generated by the canary flip-flop is detected, the circuit is controlled to speed up, that is, the speed is increased by one level. If no warning signals are observed during the monitoring period, the circuit is slowed down by one level for power reduction. Adjacent speed levels differ by 0.1 V.

The speed control unit, shown in Figure 3, consists of a decoder, a 4-bit counter and a voltage switch. If the warning signal from the configurable canary flip-flop is high and the timer signal is high, these two bits are given to the decoder. The decoder output is given to the counter, which increments the value by one, that is, the speed is increased by one level. The counter value is then given to the voltage switch, which consists of four PMOS transistors, each connected to a different supply: vdd0, vdd1, vdd2 and vdd3. Depending on the output of the counter, one of the four supplies is selected.

VI. MEASUREMENT RESULTS AND WAVEFORM

The block diagram in Figure 5 is designed using the Cadence Virtuoso tool and simulated with Synopsys HSPICE. Figure 6 shows an operation example with a monitoring period of 5 sec, where the circuit is operated at 2 MHz with a 1 V supply voltage. Each speed level of the speed control unit corresponds to 0.1 V. For measurement purposes, three SPICE models for slow, fast and nominal speeds, which are available in HSPICE, are used.

Figure 6. Measurement of timing error with monitoring period of 5 sec

Figure 7. Waveforms of Configurable canary flip-flop with Speed control unit
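The control policy of Sections III and V (start at the maximum speed level, step up on a warning, step down after a warning-free monitoring period) can be sketched behaviorally as follows. This is a hedged illustration, not the decoder/counter/voltage-switch implementation: the class name SpeedControlUnit, the supply values in the usage note, the saturation at the lowest and highest levels, and the choice to restart the monitoring period after a warning are all assumptions that the text does not specify.

```python
class SpeedControlUnit:
    """Behavioral sketch of the adaptive speed control policy (illustrative)."""

    def __init__(self, vdd_levels, monitor_period):
        self.vdd_levels = list(vdd_levels)      # supplies, lowest to highest
        self.monitor_period = monitor_period    # cycles without warning before slowing
        self.level = len(self.vdd_levels) - 1   # operation starts at maximum speed
        self.quiet_cycles = 0                   # cycles since the last warning

    def tick(self, warning: bool) -> float:
        """Advance one clock cycle and return the selected supply voltage."""
        if warning:
            # A warning immediately raises the speed level (saturating, assumed).
            self.level = min(self.level + 1, len(self.vdd_levels) - 1)
            self.quiet_cycles = 0               # restart monitoring (assumed)
        else:
            self.quiet_cycles += 1
            if self.quiet_cycles == self.monitor_period:
                # Timer expired with no warning: slow down by one level.
                self.level = max(self.level - 1, 0)
                self.quiet_cycles = 0
        return self.vdd_levels[self.level]
```

As a usage example with hypothetical supplies of 0.7 V to 1.0 V in 0.1 V steps and a monitoring period of three cycles, six warning-free cycles lower the supply from 1.0 V to 0.8 V, and two subsequent warnings bring it back to 1.0 V.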

Figure 8. Measurement of power dissipation at various supply voltages at 2 MHz frequency

Figure 9. Measurement of power dissipation at various temperature conditions at 2 MHz frequency

Figure 10. Comparison of worst-case and performance-compensating designs

Figures 8 and 9 show the variation of power dissipation with supply voltage and temperature, respectively. Three SPICE models are used: slow, fast and nominal.

VII. CONCLUSION

We presented a performance compensation technique using a canary flip-flop for submicron circuits. The canary flip-flop, the configurable canary flip-flop and the speed control unit were designed in 45-nm technology. A 16-bit Brent-Kung adder, whose performance was controlled by the speed control unit, was designed in a 45-nm CMOS process using the Cadence Virtuoso tool. Three SPICE models, namely FF, TT and SS, were used as design corners to represent fast, nominal and slow chips, respectively. Measurement results showed that the adaptive control compensated manufacturing and environmental variability and reduced power dissipation compared to traditional worst-case design. Simulation results indicated that it is appropriate to adjust the buffer delay to attain a higher mean time between failures; inserting canary FFs with a buffer delay sufficient to cover a wider manufacturing variability space is the most practical approach.

REFERENCES

[1] J. W. Tschanz, J. T. Kao, S. G. Narendra, R. Nair, D. A. Antoniadis, A. P. Chandrakasan, and V. De, "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1396-1402, Nov. 2002.

[2] S. Das, D. Roberts, S. Lee, S. Pant, D. Blaauw, T. Austin, K. Flautner, and T. Mudge, "A self-tuning DVS processor using delay-error detection and correction," IEEE J. Solid-State Circuits, vol. 41, no. 4, pp. 792-804, Apr. 2006.

[3] M. Agarwal, B. C. Paul, M. Zhang, and S. Mitra, "Circuit failure prediction and its application to transistor aging," in Proc. VLSI Test Symp. (VTS), 2007, pp. 277-286.

[4] S. Das, C. Tokunaga, S. Pant, W. H. Ma, S. Kalaiselvan, K. Lai, D. M. Bull, and D. Blaauw, "Razor II: In situ error detection and correction for PVT and SER tolerance," IEEE J. Solid-State Circuits, vol. 44, no. 1, pp. 32-48, Jan. 2009.

[5] H. Fuketa, M. Hashimoto, Y. Mitsuyama, and T. Onoye, "Trade-off analysis between timing error rate and power dissipation for adaptive speed control with timing error prediction," IEICE Trans. Fund., vol. E92-A, no. 12, pp. 3094-3102, Dec. 2009.

[6] T. Nakura, K. Nose, and M. Mizuno, "Fine-grain redundant logic using defect-prediction flip-flops," in Int. Solid-State Circuits Conf. papers, 2007, pp. 402-403.