A Low Power Delay Buffer Using Gated Driver Tree

IOSR Journal of VLSI and Signal Processing (IOSR-JVSP), ISSN: 2319-4200, ISBN No.: 2319-4197, Volume 1, Issue 4 (Nov.-Dec. 2012), pp. 26-30

Kokkilagadda Sarath Babu (1), K. Rajasekhar (2)
(1) Electronics and Communication Department, ASR College / JNTUniversity, India
(2) Electronics and Communication Department, ASR College / JNTUniversity, India

Abstract: This project presents the circuit design of a low-power delay buffer. The proposed delay buffer uses several new techniques to reduce its power consumption. Since delay buffers are accessed sequentially, it adopts a ring-counter addressing scheme. In the ring counter, double-edge-triggered (DET) flip-flops are used to halve the operating frequency, and a C-element gated-clock strategy is proposed. A novel gated-clock-driver tree is then applied to further reduce the activity along the clock distribution network. Moreover, the gated-driver-tree idea is also employed in the input and output ports of the memory block to decrease their loading, thus saving even more power.

The simplest way to implement a delay buffer is to use shift registers. If the buffer length is N and the word length is b, then a total of Nb D-type flip-flops (DFFs) are required, which can occupy a large area if a standard-cell DFF is used. In addition, this approach can consume a large amount of power, since on average Nb/2 binary signals make transitions in every clock cycle. As a result, this implementation is usually used only in short delay buffers, where area and power are of less concern. Although some power is indeed saved by gating the clock signal in inactive blocks, the extra RS flip-flops still load the clock signal and demand more clock power than necessary. We propose to replace the RS flip-flop with a C-element and to use tree-structured clock drivers with gating so as to greatly reduce the loading on active clock drivers.
Additionally, DET flip-flops are used to halve the clock rate and thus also reduce the power consumed by the clock signal. The proposed ring counter uses hierarchical clock gating together with its control logic. Each block contains one C-element to control the delivery of the local clock signal CLK to the DET flip-flops, and only the CKE signals along the path from the global clock source to the local clock signal are active. The gate signal (CKE) can also be derived from the outputs of the DET flip-flops in the ring counter.

Keywords - C-element, delay buffer, first-in-first-out, gated clock, ring counter.

I. INTRODUCTION

Portable multimedia and communication devices have experienced explosive growth recently. Longer battery life is one of the crucial factors in the widespread success of these products. As such, low-power circuit design for multimedia and wireless communication applications has become very important. In many such products, delay buffers (line buffers, delay lines) make up a significant portion of the circuitry. Such serial-access memory is needed for temporary storage of signals that are being processed, e.g., delaying one line of video signals, delaying signals within a fast Fourier transform (FFT) architecture, and delaying signals in a delay correlator.

Currently, most circuits adopt static random access memory (SRAM) plus some control/addressing logic to implement delay buffers. For shorter delay buffers, shift registers can be used instead. The former approach is convenient since SRAM compilers are readily available and are optimized to generate memory modules with low power consumption, high operating speed, and a compact cell size. The latter approach is also convenient since shift registers can be easily synthesized, though they may consume much power due to unnecessary data movement. Previously, a simplified and thus lower-power sequential addressing scheme for SRAM application in delay buffers was proposed.
A ring counter, made up of an array of D-type flip-flops (DFFs) triggered by a global clock signal, is used to point to the target words. In this paper, we propose to use double-edge-triggered (DET) flip-flops instead of traditional DFFs in the ring counter to halve the operating clock frequency. A novel approach that uses C-elements instead of RS flip-flops in the control logic for generating the clock-gating signals is adopted to avoid increasing the loading of the global clock signal. In addition to gating the clock signal going to the DET flip-flops in the ring counter, we also propose to gate the drivers in the clock tree. This technique greatly decreases the loading on the clock distribution network of the ring counter and thus the overall power consumption.

The same technique is applied to the input driver and output driver of the memory part of the delay buffer. In a delay buffer based on an SRAM cell array, the read/write circuitry works through the bit lines, which act as data buses. In the proposed delay buffer, we use a tree hierarchy for the read/write circuitry of the memory module. For the write circuitry, in each level of the driver tree, only one driver along the path leading to the addressed memory word is activated. Similarly, a tree of multiplexers and gated drivers makes up the read circuitry of the proposed delay buffer. Simulation results show the effectiveness of the above techniques in power reduction.

II. CONVENTIONAL DELAY BUFFERS

The simplest way to implement a delay buffer is to use shift registers, as shown in Fig. 1. If the buffer length is N and the word length is b, then a total of Nb DFFs are required, which can occupy a large area if a standard-cell DFF is used. In addition, this approach can consume a large amount of power, since on average Nb/2 binary signals make transitions in every clock cycle. As a result, this implementation is usually used only in short delay buffers, where area and power are of less concern.

RAM-based delay buffers are more popular for long delay buffers because of the compact SRAM cell size and small total area. Their power consumption is also much lower than that of shift registers because only two words are accessed in each clock cycle: one for write-in and the other for read-out. A binary counter can be used for address generation since the memory words are accessed sequentially. Although SRAM-based delay buffers do away with many data transitions, there can still be considerable power consumption in the SRAM address decoder and the read/write circuits.

In fact, since the memory words are accessed sequentially, we can use a ring counter with only one rotating active cell to point to the words for write-in and read-out. This method is known as the pointer-based scheme (Fig. 2). The ring counter of D-type flip-flops is initialized with only one 1 (the active cell), and all the other DFFs are kept at 0. When a clock edge triggers the DFFs, this 1 is propagated forward. Consequently, the traditional binary address decoder can be replaced by this unary-coded ring counter. Compared to shift-register delay buffers, this approach propagates only one 1 in the ring counter instead of propagating N b-bit words.
Clearly, with far fewer data transitions, pointer-based delay buffers can save a lot of power.

Figure 1 Delay Buffer Implemented by Shift Register
Figure 2 Pointer-Based Scheme
Figure 3 Ring Counter with RS Flip-Flop
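The contrast between the two conventional schemes can be illustrated with a minimal behavioral sketch in Python. It is not taken from the paper: the class names, the `word_moves` counter, and the `run` helper are hypothetical, and the counter tallies stored words rewritten per cycle rather than bit transitions or power.

```python
class ShiftRegisterDelay:
    """Delay line built from a register chain; every stored word moves each cycle."""

    def __init__(self, length: int) -> None:
        self.cells = [0] * length
        self.word_moves = 0  # hypothetical metric: stored words rewritten so far

    def step(self, word: int) -> int:
        out = self.cells[-1]                    # oldest word leaves the chain
        self.cells = [word] + self.cells[:-1]   # all words shift by one position
        self.word_moves += len(self.cells)
        return out


class PointerDelay:
    """Delay line built from a memory addressed by a one-hot ring counter."""

    def __init__(self, length: int) -> None:
        self.memory = [0] * length
        self.ring = [1] + [0] * (length - 1)    # exactly one active cell
        self.word_moves = 0

    def step(self, word: int) -> int:
        index = self.ring.index(1)              # the active cell selects the word
        out = self.memory[index]                # read the oldest word
        self.memory[index] = word               # overwrite it with the new word
        self.word_moves += 1
        self.ring = [self.ring[-1]] + self.ring[:-1]  # propagate the single 1
        return out


def run(buffer, samples):
    """Feed samples through a delay buffer and collect the outputs."""
    return [buffer.step(s) for s in samples]
```

Under these assumptions both classes return the input delayed by `length` samples, but the shift register rewrites `length` words per cycle while the pointer-based buffer rewrites one word and moves a single 1 in the ring counter.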

Observing that only one of the DFFs in the ring counter is active at a time, a gated-clock technique has been proposed for the DFFs. In that approach, every eight DFFs in the ring counter are grouped into one block. A gate signal is then computed for each block to gate the frequently toggling clock signal while the block is inactive, saving the power that would otherwise be wasted on unnecessary clock transitions.

III. PROPOSED DELAY BUFFER

In the proposed delay buffer, several power reduction techniques are adopted. These circuit techniques are designed mainly to decrease the loading on high fan-out nets, e.g., the clock and the read/write ports.

A. Gated-Clock Ring Counter

Although some power is indeed saved by gating the clock signal in inactive blocks, the extra RS flip-flops still load the clock signal and demand more clock power than necessary. We propose to replace the RS flip-flop with a C-element and to use tree-structured clock drivers with gating so as to greatly reduce the loading on active clock drivers. Additionally, DET flip-flops are used to halve the clock rate and thus also reduce the power consumed by the clock signal. The proposed ring counter with hierarchical clock gating is shown in Fig. 5. Each block contains one C-element to control the delivery of the local clock signal CLK to the DET flip-flops, and only the CKE signals along the path from the global clock source to the local clock signal are active. The gate signal (CKE) can also be derived from the outputs of the DET flip-flops in the ring counter.

The C-element is an essential element in asynchronous circuits for handshaking. The logic of the C-element is given by

$$C^{+} = AB + AC + BC$$

where $A$ and $B$ are its two inputs and $C^{+}$ and $C$ are the next and current outputs, respectively. If $A = B$, the next output is the same as $A$. Otherwise, $A \neq B$ and the output remains unchanged, i.e., $C^{+} = C$.

Figure 4 Logic Circuit of C-element
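The behavior described above can be checked with a small Python model of the standard C-element majority function, $C^{+} = AB + AC + BC$. The function names `c_element` and `trace` are hypothetical and serve only as an illustration; this is a logic-level sketch, not a circuit-level model.

```python
def c_element(a: int, b: int, c: int) -> int:
    """Return the next output C+ = AB + AC + BC for inputs a, b and current output c."""
    return (a & b) | (a & c) | (b & c)


def trace(inputs, c: int = 0):
    """Apply a sequence of (a, b) input pairs and return the output after each pair."""
    outputs = []
    for a, b in inputs:
        c = c_element(a, b, c)  # output only changes when a == b
        outputs.append(c)
    return outputs
```

For example, `trace([(1, 0), (1, 1), (0, 1), (0, 0)])` holds the output while the inputs disagree and switches it only once both inputs agree, which is the property that makes the element suitable for generating a clock-gating signal.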
Since the output of the C-element can change only when A = B, it avoids the possibility of glitches, a crucial property for a clock-gating signal. To reduce power further, we replace the DFFs with double-edge-triggered flip-flops and operate the ring counter at half the clock rate. With these changes, the clock-gating control mechanism is different. When the input of the last DET flip-flop in the previous block changes to 1, making both inputs of the C-element the same, the clock signal in the current block is turned on. When the output of the first DET flip-flop in the current block is asserted, both inputs of the C-element in the previous block go to 0 and the clock for the previous block is disabled. To further reduce the loading on the global clock signal (CLK), we propose to use a driver-tree distribution network for the global clock and to activate only those drivers on the path from the global clock source to the active block.

Figure 5 Diagram of Ring Counter with Clock Gated by C-elements
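The block-level turn-on and turn-off rule described above can be sketched behaviorally in Python. This is an illustrative model under stated assumptions, not the authors' circuit: the function name `simulate_gated_ring` is hypothetical, each block's enable is modeled as a stored bit (standing in for the C-element output), and the result counts flip-flops that receive a clock per step. It does not reproduce the loading ratios estimated in the paper, which also account for drivers and gating logic.

```python
def simulate_gated_ring(n_cells: int = 512, block_size: int = 8,
                        steps: int = 1024) -> float:
    """Return the average number of flip-flops clocked per step in a gated ring counter."""
    n_blocks = n_cells // block_size
    enabled = [False] * n_blocks
    enabled[0] = True            # assumption: the token starts in block 0
    token = 0                    # position of the single 1 in the ring
    clocked_total = 0
    for _ in range(steps):
        # Only flip-flops in enabled blocks see this clock edge.
        clocked_total += block_size * sum(enabled)
        # The block holding the token must be clocked for the token to advance.
        assert enabled[token // block_size]
        token = (token + 1) % n_cells
        block, offset = divmod(token, block_size)
        if offset == block_size - 2:
            # Input of the last flip-flop in this block is now 1: turn on the next block.
            enabled[(block + 1) % n_blocks] = True
        if offset == 0:
            # First flip-flop of this block is asserted: turn off the previous block.
            enabled[(block - 1) % n_blocks] = False
    return clocked_total / steps
```

In this simplified model, at most two adjacent blocks are clocked at any time, compared with all `n_cells` flip-flops in an ungated ring counter.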

IV. SIMULATION RESULTS

[Figure: Circuit Design and Simulation of C-Element]

[Figure: Design and Simulation of Proposed Ring Counter of Double-Edge-Triggered Flip-Flops with Clock Gated by C-Elements]

V. ANALYSIS OF GATED DRIVER TREE USING D AND DET FLIP FLOP

Ring counter structure (N=512, D=8, M=64)       Estimated loading ratio by Eqs. (3)-(5)
Traditional ring counter                        1024
Gated-clock ring counter with RS flip-flop      208
Proposed gated-clock ring counter               31

Table 1 Estimated Loading Ratio of Three Different Ring Counters

Proposed ring counter structure, clock gated with C-element      Simulated power @ 0.12 µm
Double-edge-triggered flip-flop                                  1.948 mW
Pulse-triggered flip-flop                                        0.767 mW
Improved pulse-triggered flip-flop                               0.582 mW

Table 2 Comparison of Power Using Different D Flip-Flops in the Delay Buffer Ring Counter

Input driver tree structure (N=512, M=64, D=8)      Simulated power @ 1.8 V, 50 MHz, 0.18 µm      Estimated loading ratio by Eqs. (6)-(7)
Without gated driver tree                           520 µW                                        512
With gated driver tree                              44.2 µW                                       44

Table 3 Power of the Input Driver Tree with and without the Gating Strategy

VI. CONCLUSION

In this project, we presented a low-power delay buffer architecture that adopts several novel techniques to reduce power consumption. The ring counter with its clock gated by C-elements can effectively eliminate excessive data transitions without increasing the loading on the global clock signal. The gated-driver-tree technique used for the clock distribution network can eliminate the power wasted on drivers that need not be activated. A gated-demultiplexer tree and a gated-multiplexer tree are used for the input and output driving circuitry, respectively, to decrease the loading of the input and output data buses. All gating signals are easily generated by a C-element taking inputs from DET flip-flop outputs of the ring counter. We believe that with more experienced layout techniques, the cell size of the proposed delay buffer can be further reduced, making it very useful in all kinds of multimedia and communication signal processing ICs.
REFERENCES

[1] Eberle et al., "80-Mb/s QPSK and 72-Mb/s 64-QAM flexible and scalable digital OFDM transceiver ASICs for wireless local area networks in the 5-GHz band," IEEE J. Solid-State Circuits, vol. 36, no. 11, pp. 1829-1838, Nov. 2001.
[2] M. L. Liou, P. H. Lin, C. J. Jan, S. C. Lin, and T. D. Chiueh, "Design of an OFDM baseband receiver with space diversity," IEE Proc. Commun., vol. 153, no. 6, pp. 894-900, Dec. 2006.
[3] G. Pastuszak, "A high-performance architecture for embedded block coding in JPEG 2000," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 9, pp. 1182-1191, Sep. 2005.
[4] W. Li and L. Wanhammar, "A pipeline FFT processor," in Proc. Workshop Signal Process. Syst. Design Implement., 1999, pp. 654-662.
[5] N. Shibata, M. Watanabe, and Y. Tanabe, "A current-sensed high-speed and low-power first-in-first-out memory using a wordline/bitline-swapped dual-port SRAM cell," IEEE J. Solid-State Circuits, vol. 37, no. 6, pp. 735-750, Jun. 2002.
[6] R. Hossain, L. D. Wronski, and A. Albicki, "Low power design using double edge triggered flip-flops," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 2, no. 2, pp. 261-265, 1994.