A HIGH SPEED CMOS INCREMENTER/DECREMENTER CIRCUIT WITH REDUCED POWER DELAY PRODUCT

P. BALASUBRAMANIAN, DR. R. CHINNADURAI
Department of Electronics and Communication Engineering
National Institute of Technology (Deemed University)
Tiruchirappalli 620 015, INDIA
Web: http://www.nitt.edu

Abstract: A novel circuit topology for the CMOS based Incrementer/Decrementer circuit is presented in this paper. The design methodology is extensively based on Domino logic and it utilizes a simple two level look-ahead structure. The highly parallel, regular structure of the proposed 8-bit decision module (DM) macro cell makes this design especially advantageous for constructing higher order versions, facilitating an easier layout and test mechanism. For a 32-bit Incrementer/Decrementer circuit based upon the proposed design, the savings in power delay product (PDP) are of the order of 65 % and 38 %, with a significant reduction in the number of transistors, in comparison with the best decision module designs reported in [1] and [2], based on a similar MOSIS 0.6 µm CMOS technology.

Key-Words: - Decision Module (DM), Incrementer/Decrementer, Proposed Priority Resolver (PPR), High Speed (HS), High Speed Low Power (HSLP), Power Delay Product (PDP)

1 Introduction
The Incrementer/Decrementer is a basic building block in digital systems, which counts up or down by one step in each clock cycle. It finds applications as a program counter [4], a frequency divider [5], etc. A general Incrementer/Decrementer circuit can be designed based on the Adder/Subtractor module or the Priority Resolution Module (PRM) block, as illustrated in [10]. The speed of operation of the Incrementer/Decrementer circuit is likely to fall considerably as the bit-width increases. The speed limitation of a conventional Incrementer/Decrementer circuit, based on the Adder/Subtractor module, comes from the carry signal propagation. To improve the speed, we can resort to a high-speed adder structure, such as the Carry Look-ahead Adder (CLA), but this is possible only at a greater cost in silicon area. On the other hand, if the Up/Down counter is used as an Incrementer/Decrementer, the speed limitation still springs from the carry propagation. This is because the design of the Up/Down counter relies upon the addition mechanism, although it utilizes a half adder circuit in place of a full adder structure. The PRM based Incrementer/Decrementer circuit design is somewhat area efficient, but its operating speed is limited by the delay associated with the propagation of the priority token.

The rest of this paper is organized as follows. In Section 2, the conventional design approaches for the Incrementer/Decrementer circuit and their inherent limitations are described briefly. In Section 3, the basic circuit operation and a brief description of the decision module design methodologies proposed in [2] are presented. In Section 4, we present our proposed 8-bit macro cell design of the decision module and the associated design equations, and explain its operation. The performance evaluation results and comparisons are presented in Section 5. Finally, we make the concluding remarks in Section 6.

2 Conventional design approaches

2.1 Adder based Increment/Decrement Module
Fig.1 shows a 4-bit CLA based Incrementer/Decrementer [7]. The PG generator and the look-ahead carry generator are the root causes for the circuit complexity, though they are necessary for speeding up the operation of the circuit.

Fig.1 : 4-bit CLA based Incrementer/Decrementer

If only the increment or the decrement operation is required, then the circuit logic can be simplified. However, if both operations are required, then the adder/subtractor modules have to remain, which prevents the circuit from being simplified. To increase the bit-width, that is, to construct a higher order Incrementer/Decrementer, many such 4-bit units have to be cascaded, which degrades the speed of operation due to the associated lengthy carry propagation chain.

2.2 Counter based Incrementer/Decrementer circuit
A 32-bit counter based Increment/Decrement module is shown in Fig.2 [4]. Since the half adder circuit is basically used for the function realization of the Incrementer/Decrementer, this circuit possesses a very regular structure. This circuit has a long critical path, illustrated by the gray portion [1]. The critical path delay is directly and linearly proportional to the bit width of the circuit and this has an adverse impact on the speed of operation.

Fig.2 : A 32-bit counter based Inc./Dec. circuit

3 Basic circuit operation
In this section, we first explain the operation of the Incrementer/Decrementer circuit and then briefly discuss the existing high-speed decision module (DM) blocks cited in [2].

3.1 Circuit Operation

Fig.3 : Block Diagram of the circuit

The Decision Module (DM) is the main functional block in the Incrementer/Decrementer circuit. Its operation is similar to that of a Priority Encoder (PE) or a Priority Resolver (PR). Hereafter the terms DM, PE and PR will be used synonymously. Consider an increment, and let the input data be 10111000. Each input bit should first be complemented.
With the input data 10111000, the data presented to the Decision Module (DM) will be 01000111 and the output of the DM will be 00000001, as illustrated in [2]. After the operation performed by the data-out selector, the output of the DM is XORed with the input data and the final output is obtained as 10111001, which is the desired result. The decrement operation is similar and is explained in detail in [1] and [2].

3.2 High Speed 8-bit PR module

Fig.4 : A High Speed Decision Module 8-bit macro

This high speed (HS) macro cell is implemented in NP Domino CMOS logic. During the precharge phase (Clock=0), LA_inter and LA_out are 1 and 0 respectively and all the outputs are precharged to 1. la0~la2 comprise the first level look-ahead signals. LA_inter is used to realize the second level look-ahead functions, while LA_in and LA_out are used to realize the third level look-ahead function between two 8-bit macro cells. Though the circuit achieves high speed performance, its power consumption is high. All the output bits are precharged to logic 1 in the precharge phase. During the evaluation phase (Clock=1), only one output bit remains in the high state, while all the others are pulled low. The resulting high switching probability leads to higher power dissipation, as mentioned in [3].

3.3 High Speed and Low Power 8-bit PR module
Fig.5 shows the high speed and low power (HSLP) decision module macro. In comparison with the circuit in Fig.4, this circuit has two modifications. Firstly, for the first level look-ahead functions, n-type dynamic gates with a series-connected circuit structure replace the p-type dynamic gates with an equivalent parallel-connected circuit structure. Secondly, several PMOS transistors Mrf0~Mrf7, controlled by the look-ahead signals, are added to prevent the circuit from going into an erroneous state.

Fig.5 : High speed and Low power DM macro cell

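This circuit still preserves the high speed characteristics, because of a similar multilevel look-ahead structure. Lower power consumption is possible because, although all the outputs evaluate in the evaluation phase, only one output changes the state it obtained in the precharge phase. The power reduction is achieved with a slight trade-off in area, and even with the introduction of the multilevel look-ahead and multilevel folding techniques reported in [2], there is only a slight speed improvement. Moreover, this requires complex look-ahead signal routing, making the resulting circuit design and test mechanism difficult.

4 Proposed circuit design

4.1 Proposed 8-bit PR Module
Fig.6 shows the novel design of an 8-bit Priority Resolver macro cell (PPR), based on Domino CMOS logic. Fig.7 gives the circuit diagram for the 8-bit Incrementer/Decrementer macro, incorporating the proposed decision module block. This circuit design has a highly regular and parallel architecture with a simple two level look-ahead structure. The proposed cell structure allows the critical data path corresponding to the output EP7 and the look-ahead signal for the next stage, referred to as LA_inter, to share an evaluation chain of n-type transistors. This is in sharp contrast to the separate transistor chains used for these signals in the previous designs. It has enabled us to greatly reduce the transistor count for higher order versions, along with an improvement in performance.

Fig.6 : Proposed 8-bit DM Macro cell

4.2 Design Equations for the Macro cell
The design equations for the cell have been framed with an active high look-ahead or enable signal, in contrast to the active low signals adopted in the earlier designs. This facilitates an easier and more regular realization of the logic functionality for the different outputs, as indicated in Eqn. (1). This circuit implements the following simplified functions, where an overbar denotes a complemented input:

$$
\begin{aligned}
\mathrm{EP0} &= \mathrm{LA\_in} \cdot \mathrm{D0} \\
\mathrm{EP1} &= \mathrm{LA\_in} \cdot \overline{\mathrm{D0}} \cdot \mathrm{D1} \\
\mathrm{EP2} &= \mathrm{LA\_in} \cdot \overline{\mathrm{D0}} \cdot \overline{\mathrm{D1}} \cdot \mathrm{D2} \\
\mathrm{EP3} &= \mathrm{LA\_in} \cdot \overline{\mathrm{D0}} \cdot \overline{\mathrm{D1}} \cdot \overline{\mathrm{D2}} \cdot \mathrm{D3} \\
\mathrm{EP4} &= \mathrm{LA\_in} \cdot \overline{\mathrm{D0}} \cdot \overline{\mathrm{D1}} \cdot \overline{\mathrm{D2}} \cdot \overline{\mathrm{D3}} \cdot \mathrm{D4} \\
\mathrm{EP5} &= \mathrm{LA\_in} \cdot \overline{\mathrm{D0}} \cdot \overline{\mathrm{D1}} \cdot \overline{\mathrm{D2}} \cdot \overline{\mathrm{D3}} \cdot \overline{\mathrm{D4}} \cdot \mathrm{D5} \\
\mathrm{EP6} &= \mathrm{LA\_in} \cdot \overline{\mathrm{D0}} \cdot \overline{\mathrm{D1}} \cdot \overline{\mathrm{D2}} \cdot \overline{\mathrm{D3}} \cdot \overline{\mathrm{D4}} \cdot \overline{\mathrm{D5}} \cdot \mathrm{D6} \\
\mathrm{EP7} &= \mathrm{LA\_in} \cdot \overline{\mathrm{D0}} \cdot \overline{\mathrm{D1}} \cdot \overline{\mathrm{D2}} \cdot \overline{\mathrm{D3}} \cdot \overline{\mathrm{D4}} \cdot \overline{\mathrm{D5}} \cdot \overline{\mathrm{D6}} \cdot \mathrm{D7}
\end{aligned}
\tag{1}
$$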
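The design equations of Section 4.2 can be sketched behaviourally in a few lines of Python. This is only an illustrative model, not the authors' circuit: it assumes D0 is the least significant bit and that every input below the selected one enters the product complemented (EPi = LA_in · ~D0 · ... · ~D(i-1) · Di), which is what makes the outputs one-hot. The function names `priority_resolver` and `increment` are hypothetical, and the data-out selector of Section 3.1 is modelled, as an assumption, as a block that turns the one-hot DM output into a mask of all bit positions at or below the selected one.

```python
def priority_resolver(d, la_in=1, width=8):
    """Behavioural model of the one-hot outputs EP0..EP(width-1) and LA_inter.

    Assumes D0 is the LSB of d and that EPi goes high only when LA_in is
    high, Di is 1 and all lower-order inputs are 0.
    """
    ep = 0
    chain = la_in  # running product LA_in . ~D0 . ~D1 ... (the shared chain)
    for i in range(width):
        bit = (d >> i) & 1
        if chain and bit:
            ep |= 1 << i          # EPi is the single output that goes high
        chain &= 1 - bit          # extend the chain with ~Di
    la_inter = chain              # high only if no input bit was set
    return ep, la_inter


def increment(x, width=8):
    """Increment x using the resolver (data-out selector is an assumed model)."""
    full = (1 << width) - 1
    ep, la_inter = priority_resolver(~x & full, la_in=1, width=width)
    # Assumed selector behaviour: flip the selected bit and every bit below it;
    # if no bit was selected (x is all ones), flip the whole word.
    mask = full if la_inter else (ep | (ep - 1))
    return x ^ mask


if __name__ == "__main__":
    ep, la = priority_resolver(0b01000111)
    print(format(ep, "08b"), la)
    print(format(increment(0b10111000), "08b"))
```

For the input 10111000 used in Section 3.1, the complemented word 01000111 selects only the lowest output, and XORing the resulting mask with the input gives 10111001.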

4.3 Operation of the PR Macro cell
In the PR macro, during the precharge phase of the clock, all the output bits EP0~EP7 are predischarged to the logic low state via the p-type transistors mp0~mp7. When the clock signal becomes high, the circuit enters the evaluation phase and the outputs are evaluated from the inputs present, in accordance with the design equations of Eqn. (1). As a result, only one of the output bits switches to the high state, while the other bits retain their previous state. In the priority resolver macro, LA_in acts as the look-ahead or enable signal for this stage, which forms the first level look-ahead, while LA_inter acts as the look-ahead signal for the next higher stage and is an internal signal of a higher order circuit. This forms the second level look-ahead function: LA_inter goes high only when LA_in is high and none of the inputs D0~D7 is set, as described by the equation

$$
\mathrm{LA\_inter} = \overline{\mathrm{D0}} \cdot \overline{\mathrm{D1}} \cdot \overline{\mathrm{D2}} \cdot \overline{\mathrm{D3}} \cdot \overline{\mathrm{D4}} \cdot \overline{\mathrm{D5}} \cdot \overline{\mathrm{D6}} \cdot \overline{\mathrm{D7}} \cdot \mathrm{LA\_in}
\tag{2}
$$

4.4 8-bit Incrementer/Decrementer Macro cell
The operation of the Incrementer/Decrementer circuit is explained in detail, and the complete block diagram for the 32-bit version is available, in [1]. The circuit, when incorporating the proposed DM block, achieves greater power savings and area efficiency, as demonstrated by the simulation results obtained using Mentor Graphics design tools.

Fig.7 : Basic Incrementer/Decrementer macro

5. Simulation results

Fig.8 : Waveform corresponding to Critical Path

Table 1: Post-layout performance comparison (32-bit DM macro cell)

Design   Critical Path Delay (ns)   Max. Freq. (MHz)   PDP    Device Count
HS       4.98                       100.40             0.42   372
HSLP     4.11                       121.65             0.17   396
PPR      3.99                       125.31             0.11   322

Table 2: Post-layout performance comparison (32-bit Inc./Dec. designs)

Design         Max. Freq. (MHz)   Power Dissipation (mW/MHz)   PDP
HS version     118.54             0.108                        0.91
HSLP version   145.61             0.076                        0.52
PPR version    158.79             0.051                        0.32

6. Conclusion and Ongoing work
A novel topology for the 8-bit Decision Module macro cell is presented in this paper.
The proposed design is especially suitable for realizing higher order Incrementer/Decrementer circuits. The 32-bit Incrementer/Decrementer circuit based on the proposed DM cell was designed in MOSIS 3V, 0.6 µm CMOS technology on a Sun Solaris platform, using an industry standard BSIM device model. The simulation results are very encouraging: they show significant savings in power dissipation and area, together with a modest improvement in speed. Further higher order Incrementer/Decrementer circuits are being designed with the proposed strategy, to evaluate its feasibility and highlight its efficacy.

Acknowledgement
The authors wish to thank the higher authorities and the faculty of the ECE department of their institution for their support and encouragement.

References: -
[1] Chung-Hsun Huang, Jinn-Shyan Wang and Yan-Chao Huang, A High-Speed CMOS Incrementer/Decrementer, Proc. IEEE Intl. Symp. on Circuits and Systems, Vol.4, May 2001, pp.88-91
[2] Chung-Hsun Huang, J.-S. Wang and J S. Huang, Design of High Performance CMOS Priority Encoders and Incrementer/Decrementer using Multilevel Lookahead and Multilevel Folding Techniques, IEEE J. SSC., Vol.37, No.1, Jan. 2002
[3] J.-S. Wang and C.-H. Huang, High-Speed and Low Power CMOS Priority Encoders, IEEE J. Solid State Ckts., Vol.35, Oct. 2000, pp.1511-1514
[4] Stan M.R et al., Long and fast Up/Down counters, IEEE Tran. on Comp., July 98, pp.722-735
[5] Lutz D.R et al., Programmable modulo-k counters, IEEE Tran. on Circuits and Systems: Fundamental Theory and Appl., Nov. 96, pp.939-941
[6] N.Weste and K.Eshraghian, Principles of CMOS VLSI Design A Design Perspective, Pearson Education, 2nd Edition, 1993
[7] K.Hwang, Computer Arithmetic: Principles, Architecture and Design, John Wiley and Sons, 1979
[8] J.-S. Wang and C.-S. Huang, A high-speed single-phase-clocked CMOS priority encoder, IEEE ISCAS, vol.5, May 2000, pp.537-540
[9] www.mosis.org
[10] R.Hashemian, Highly parallel increment/decrement using CMOS technology, Proc. of IEEE Midwest Symp. on Circuits and Systems, vol.2, 1991
[11] User's and Reference Manuals, ASIC Tools, Mentor Graphics Corporation, USA
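
As a supplementary cross-check on the post-layout figures of Section 5, the short Python sketch below recomputes the derived columns of Tables 1 and 2. It rests on two assumptions that the paper does not state explicitly but that the tabulated numbers appear to satisfy: the maximum frequency of the DM macro equals 1/(2 × critical path delay), and the PDP of the 32-bit designs equals the power figure in mW/MHz multiplied by the clock period in ns. The function names are hypothetical.

```python
# Values copied from Table 1: critical path delay (ns), max. frequency (MHz)
TABLE1 = {"HS": (4.98, 100.40), "HSLP": (4.11, 121.65), "PPR": (3.99, 125.31)}
# Values copied from Table 2: max. frequency (MHz), power (mW/MHz), PDP
TABLE2 = {
    "HS": (118.54, 0.108, 0.91),
    "HSLP": (145.61, 0.076, 0.52),
    "PPR": (158.79, 0.051, 0.32),
}


def max_freq_mhz(delay_ns):
    """Assumed relation: one clock period spans two critical path delays."""
    return 1000.0 / (2.0 * delay_ns)


def pdp(freq_mhz, power_mw_per_mhz):
    """Assumed relation: PDP = (power per MHz) x (clock period in ns)."""
    return power_mw_per_mhz * (1000.0 / freq_mhz)


def saving_percent(reference, proposed):
    """Relative saving of the proposed design over a reference, in percent."""
    return 100.0 * (1.0 - proposed / reference)


if __name__ == "__main__":
    for name, (delay, freq) in TABLE1.items():
        print(name, round(max_freq_mhz(delay), 2), "vs tabulated", freq)
    for name, (freq, power, tabulated) in TABLE2.items():
        print(name, round(pdp(freq, power), 2), "vs tabulated", tabulated)
    ppr = TABLE2["PPR"][2]
    print(round(saving_percent(TABLE2["HS"][2], ppr)),
          round(saving_percent(TABLE2["HSLP"][2], ppr)))
```

Under these assumptions the recomputed values agree with the tables to two decimal places, and the PDP savings of the PPR version over the HS and HSLP versions round to the 65 % and 38 % quoted in the abstract.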