Design and Implementation of Modified SQRT Carry Select Adder on FPGA


Ch. Pavan Kumar #1, V. Narayana Reddy *2, R. Sravanthi *3
# Dept. of ECE, PBR VIT, Kavali, A.P., India
*2 Associate Professor, Department of ECE, VEC, Kavali, A.P., India

Abstract

The Carry Select Adder (CSLA) is one of the fastest adders and is used in many data-processing processors to perform fast arithmetic functions. It increases the speed of a parallel adder at the expense of area. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum. However, the CSLA is not area efficient, because it uses multiple pairs of Ripple Carry Adders (RCA) to generate the partial sums and carries, which are then selected by multiplexers. The Square Root (SQRT) CSLA is constructed by equalising the delay through the two carry chains and the block multiplexer signal from the previous stage. It is an extension of the linear CSLA that greatly improves the delay: the time spent waiting for the carry bit is used to calculate an extra input bit in each stage. The main disadvantage of the SQRT CSLA is the duplication of adders, which makes it larger than a standard ripple adder. This disadvantage is overcome by using a Binary to Excess-1 Converter (BEC) in place of the RCA with Cin=1 to optimise area and delay. The modified design reduces area and power compared with the regular SQRT CSLA, with only a slight increase in delay. Based on this modification, 8-, 16-, 32-, 64-, and 128-b SQRT CSLA architectures are developed, simulated, and compared with the regular SQRT CSLA.

Key words- MIMO, Broadcast channels, and array signal processing, feedback communication, co channel interference, diversity methods.
Index Terms Keywords: CRC, lookup table, Fast update

I. INTRODUCTION

In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated sequentially, only after the previous bit position has been summed and a carry propagated into the next position. This ripple carry architecture is simple and area-efficient. However, the computation is slow because each full adder can start its operation only once the previous carry-out signal is ready. On the other hand, Carry Look-ahead Adders (CLAs) are the fastest adders, but they are the worst from the area point of view. Carry Select Adders have been considered a compromise between RCAs and CLAs because they offer a good trade-off between the compact area of RCAs and the short delay of CLAs.

Reduced-area and high-speed data path logic systems are among the main areas of research in VLSI system design. High-speed addition and multiplication have always been a fundamental requirement of high-performance processors and systems. Many adder designs are available (Ripple Carry Adder, Carry Look-Ahead Adder, Carry Save Adder, Carry Skip Adder), each with its own advantages and disadvantages. The major speed limitation in any adder is the production of carries, and many authors have considered the addition problem.

The CSLA was developed to address the carry propagation delay, which it reduces substantially. However, the regular CSLA is not area efficient, because it uses multiple pairs of Ripple Carry Adders (RCA) to generate partial sums and carries for each possible carry input. The final sum and carry are selected by multiplexers (mux). The use of two independent RCAs increases the area, which in turn leads to an increase in delay.
To overcome this problem, the proposed work uses n-bit Binary to Excess-1 Converters (BEC) to improve the speed of addition. This logic replaces the RCA with Cin=1, which further improves the speed and thus reduces the delay. Using a BEC instead of an RCA in the regular CSLA achieves lower area and delay, which speeds up the addition operation. The main advantage of the BEC logic is that it needs fewer logic gates than the Full Adder (FA) structure.

II. RELATED WORK

Basic adder blocks

This section explains how delay and area are calculated theoretically. The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Fig. 1.

ISSN: 2231-2803 http://www.ijcttjournal.org Page63
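The gate-counting idea for the XOR gate can be sketched in a few lines of Python. This is only an illustrative model, not the authors' tooling: it assumes that every AND, OR, and inverter gate costs one unit of delay and one unit of area, and it assumes the common AOI decomposition y = (a AND NOT b) OR (NOT a AND b). The netlist layout and the names `XOR_AOI`, `area`, and `delay` are hypothetical.

```python
# Hypothetical AOI netlist for XOR: y = (a AND NOT b) OR (NOT a AND b).
# Each entry maps a gate's output name to (gate type, list of input names).
XOR_AOI = {
    "na": ("NOT", ["a"]),
    "nb": ("NOT", ["b"]),
    "t0": ("AND", ["a", "nb"]),
    "t1": ("AND", ["na", "b"]),
    "y": ("OR", ["t0", "t1"]),
}


def area(netlist):
    """Area = total number of AOI gates, assuming 1 unit per gate."""
    return len(netlist)


def delay(netlist, node):
    """Delay = number of gates on the longest path that ends at `node`."""
    if node not in netlist:  # primary input, contributes no gate delay
        return 0
    _gate, inputs = netlist[node]
    return 1 + max(delay(netlist, i) for i in inputs)


print(delay(XOR_AOI, "y"), area(XOR_AOI))  # 3 5
```

Under these assumptions the XOR gate has a delay of 3 units (inverter, AND, OR on the longest path) and an area of 5 gates. The same two functions can be applied to any block that is written as a netlist of AOI gates.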

Fig. 1. Delay and area evaluation of an XOR gate

The gates drawn between the dotted lines operate in parallel, and the number on each gate indicates the delay contributed by that gate. All gates are considered to be made up of AND, OR, and Inverter gates, each with a delay of 1 unit and an area of 1 unit. The delay of a logic block is obtained by adding up the number of gates in its longest path. The area is obtained by counting the total number of AOI gates required for the block. Based on this approach, the CSLA adder blocks of 2:1 mux, Half Adder (HA), and FA are evaluated and listed in Table I.

Table 1: Delay and area evaluation of CSLA

As discussed above, the main idea of this work is to use a BEC instead of the RCA with Cin=1 in order to reduce the area and power consumption of the regular CSLA. To replace an n-bit RCA, an (n+1)-bit BEC is required. The structure and the function table of a 2-b BEC are shown in Fig. 5 and Table II, respectively. Fig. 2 illustrates how the basic function of the CSLA is obtained by using the 8-bit BEC together with the mux. One input of the 16:8 mux receives the input bits B directly, and the other input of the mux is the BEC output. This produces the two possible partial results in parallel, and the mux selects either the BEC output or the direct inputs according to the control signal Cin.

The Boolean expressions of the 8-bit BEC are as follows (functional symbols: ~ NOT, & AND, ^ XOR):

X0 = ~B0
X1 = B0 ^ B1
X2 = B2 ^ (B0 & B1)
X3 = B3 ^ (B0 & B1 & B2)
X4 = B4 ^ (B0 & B1 & B2 & B3)
X5 = B5 ^ (B0 & B1 & B2 & B3 & B4)
X6 = B6 ^ (B0 & B1 & B2 & B3 & B4 & B5)
X7 = B7 ^ (B0 & B1 & B2 & B3 & B4 & B5 & B6)

Table 2: BEC functional table

B[5:0]    X[5:0]
0000000   0000001
0000001   0000010
...       ...
1111111   0000000

Ripple Carry Adder: The ripple carry adder is constructed by cascading full adder (FA) blocks in series.
One full adder is responsible for the addition of two binary digits at any stage of the ripple carry. The carry-out of one stage is fed directly to the carry-in of the next stage. A serious drawback of this adder is that the delay increases linearly with the bit length.

4-bit ripple carry adder.

Carry-Select Adder: The basic idea of the carry-select adder is to use blocks of two ripple-carry adders, one of which is fed with a constant 0 carry-in while the other is fed with a constant 1 carry-in. Both adders can therefore calculate in parallel. When the actual carry-in signal for the block arrives, multiplexers select the correct one of the two precalculated partial sums. The resulting carry-out is also selected and propagated to the next carry-select block. Because both candidate sums are ready before the carry-in arrives, the block does not have to wait for the carry to ripple through it, which improves speed.

III. PROPOSED SYSTEM

16-bit SQRT carry select adder

Fig. 2. 8-bit BEC with 16:8 mux

A carry-select adder is divided into sectors, each of which, except for the least significant, performs two additions in parallel, one assuming a carry-in of zero and the other a carry-in of one. Within the sector, there are two 4-bit ripple-carry adders receiving the same data inputs but different Cin. The upper adder has a carry-in of zero, the lower adder a carry-in of one. The actual Cin from the preceding sector selects one of the two adders. If the carry-in is zero, the sum and carry-out of the upper adder are selected. If the carry-in is one, the sum and carry-out of the lower adder are selected. Logically, the result is no different from that of a single ripple-carry adder.

The structure of the regular 16-bit SQRT CSLA has five groups of different-size RCAs, as shown in Fig. 3.

Fig. 3. Regular 16-b SQRT CSLA

Fig. 4. Delay and area evaluation of regular SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. F is a Full Adder.

The delay and area evaluation of each group is shown in Fig. 4, in which the numerals within [] specify the delay values, e.g., sum2 requires 10 gate delays.

Table III: Delay and area count of regular SQRT CSLA groups

GROUP    DELAY   AREA
Group2   11      57
Group3   13      87
Group4   16      117
Group5   19      147

Modified SQRT carry select adder:

A carry-select adder achieves speeds 40% to 90% faster than a ripple-carry adder by performing additions in parallel and reducing the maximum carry path. The modified carry select adder is similar to the regular 16-bit SQRT CSLA. The only change is that, in each basic block with two ripple-carry adders, the ripple-carry adder fed with a constant 1 carry-in is replaced by a BEC, which has fewer gates than an RCA. In group 2, for example, the second 2-b RCA with Cin=1 is replaced by a 3-b BEC, which adds one to the output of the 2-b RCA with Cin=0. The output of the group is selected by a multiplexer.
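The behaviour of a BEC-based carry-select group can be sketched as a bit-level Python model. This is a functional illustration only, not the authors' HDL: the function names are hypothetical, and the group widths (2, 2, 3, 4, 5) are an assumption chosen so that five groups of different sizes cover 16 bits. Each group after the first runs one RCA with Cin=0, feeds the (n+1)-bit result into a BEC that adds one, and lets the incoming carry select between the two results.

```python
def rca(a_bits, b_bits, cin):
    """Ripple-carry add two little-endian bit lists; return (sum_bits, cout)."""
    out, c = [], cin
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return out, c


def bec(bits):
    """Binary to excess-1: X0 = ~B0, Xi = Bi ^ (B0 & ... & B(i-1))."""
    out, lower_all_ones = [], 1
    for b in bits:
        out.append(b ^ lower_all_ones)
        lower_all_ones &= b
    return out


def csla_group(a_bits, b_bits, cin):
    """One modified group: RCA with Cin=0, (n+1)-bit BEC, and a mux on cin."""
    s0, c0 = rca(a_bits, b_bits, 0)
    cin0_result = s0 + [c0]          # sum and carry assuming Cin=0
    cin1_result = bec(cin0_result)   # same value plus one, i.e. Cin=1
    chosen = cin1_result if cin else cin0_result
    return chosen[:-1], chosen[-1]


def sqrt_csla16(a, b, cin=0, sizes=(2, 2, 3, 4, 5)):
    """16-bit adder built from groups; `sizes` is an assumed partition."""
    a_bits = [(a >> i) & 1 for i in range(16)]
    b_bits = [(b >> i) & 1 for i in range(16)]
    total, carry, pos = [], cin, 0
    for k, n in enumerate(sizes):
        xa, xb = a_bits[pos:pos + n], b_bits[pos:pos + n]
        if k == 0:
            s, carry = rca(xa, xb, carry)      # group 1 is a plain RCA
        else:
            s, carry = csla_group(xa, xb, carry)
        total += s
        pos += n
    return sum(bit << i for i, bit in enumerate(total)), carry


print(sqrt_csla16(0x5678, 0xA3B5))  # (64045, 0), i.e. sum 0xFA2D, no carry
```

The model only checks functional equivalence with ordinary addition; it says nothing about gate delay or area, which depend on the gate-level structure of each block.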

Fig. 5. Modified 16-b SQRT CSLA. The parallel RCA with Cin=1 is replaced with a BEC.

The structure of the proposed 16-b SQRT CSLA, which uses a BEC in place of the RCA with Cin=1 to optimize area and power, is shown in Fig. 5. The structure is again split into five groups. The delay and area estimation of each group is shown in Fig. 6. The steps leading to the evaluation are given here.

Fig. 6. Delay and area evaluation of modified SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. H is a Half Adder.

Group 1 has only one 2-bit RCA. Group 2 has one 2-b RCA, made of 1 FA (full adder) and 1 HA (half adder); for Cin=1, a 3-b BEC is used, which adds one to the output of the 2-b RCA. Thus sum3 and the final c3 (outputs of the mux) depend on the mux together with s3 and the partial c3 (inputs to the mux), respectively. The sum2 output depends on c1 and the mux.

Table IV: Delay and area count of modified SQRT CSLA groups

GROUP    DELAY   AREA
Group2   13      43
Group3   16      61
Group4   19      84
Group5   22      107

IV. RESULTS AND ANALYSIS
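As a quick cross-check of the theoretical estimates, the per-group counts from the regular and modified group tables can be totalled in a short Python sketch. The dictionary and function names are illustrative. Group 1 is left out because its delay and area are not listed in those tables, so the totals cover groups 2 to 5 only.

```python
# (delay, area) per group in AOI gate units, copied from the two group tables.
REGULAR = {"Group2": (11, 57), "Group3": (13, 87),
           "Group4": (16, 117), "Group5": (19, 147)}
MODIFIED = {"Group2": (13, 43), "Group3": (16, 61),
            "Group4": (19, 84), "Group5": (22, 107)}


def total_area(groups):
    """Sum the gate counts over all listed groups."""
    return sum(area for _delay, area in groups.values())


reg_area = total_area(REGULAR)
mod_area = total_area(MODIFIED)
saving = 100.0 * (reg_area - mod_area) / reg_area
extra_delay = [MODIFIED[g][0] - REGULAR[g][0] for g in REGULAR]

print(reg_area, mod_area)             # 408 295
print(f"{saving:.1f}% fewer gates")   # 27.7% fewer gates
print(extra_delay)                    # [2, 3, 3, 3]
```

On these gate-count estimates, the modified groups use about 28% fewer gates in total, while each group's estimated delay grows by two or three gate units.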

Fig. 7. 16-bit ripple carry simulation timing diagram

The simulation of the ripple carry adder is shown in Fig. 7. As can be seen from the simulation output, the addition was performed between FAC2 and ACDA with a carry-in of 0. The resulting sum was A79C and a carry of 1 was generated.

Fig. 8. Multiplexer simulation timing diagram

The simulation of the multiplexer is shown in Fig. 8. As can be seen from the simulation output, the multiplexing operation was performed between m1 and m2, which were given as 10 and 11, with the selection signal s set to 1; the selected output was 11.

Fig. 9. 16-bit regular SQRT CSLA simulation timing diagram

The simulation of the regular SQRT CSLA is shown in Fig. 9. As can be seen from the simulation output, the addition was performed between two 16-bit numbers with cin as 1, and the output was a 16-bit number without any carry generation. This model offers more delay and more power consumption.

Fig. 9. 16-bit carry select adder simulation timing diagram

The simulation of the carry-select adder is shown in Fig. 9. As can be seen from the simulation output, the addition was performed between 5678 and a3b5. The resulting sum was fa2d and no carry was generated. The signals cc1, cc2, and cc3 appear in the diagram. Based on the signal cc1, the carry was selected from cc2 and cc3. Similarly, based on cc1, the sum was selected from sum1 and sum2.

Fig. 10. 16-bit modified SQRT CSLA simulation timing diagram

The simulation of the modified SQRT CSLA is shown in Fig. 10. As can be seen from the simulation output, the addition was performed between two 16-bit numbers with cin as 1, and the output was a 16-bit number without any carry generation. This model offers less delay and less power consumption.

CONCLUSION

The ripple carry adder in the SQRT CSLA was successfully replaced with a BEC.
The SQRT CSLA was designed using BECs to overcome the limitations of the ripple carry adder, and the two were compared on the basis of delay. This project presents an efficient implementation of the SQRT CSLA using BECs. The working of the two SQRT CSLAs was compared by implementing each of them separately, one using RCAs and one using BECs. The BEC-based design performs better in terms of delay: the adder accomplishes the addition by adding small portions of bits and then selecting the correct outputs using multiplexers. The delay is reduced from 17.281 ns to 13.619 ns (about 21%).

FUTURE SCOPE

The SQRT CSLA using BECs can be used in many processors in order to achieve fast performance. The area and power can be reduced further. This addition technique can also be implemented for larger bit widths.

BIO DATA

Ch. Pavan Kumar is presently pursuing an M.Tech in the Department of Electronics and Communications at PBR VITS, Kavali, A.P., India.

V. Narayana Reddy is presently working as an Associate Professor in the Department of Electronics and Communications at VEC, Kavali, A.P., India.

Mrs. R. Sravanthi, M.Tech, is an Associate Professor at VEC, Kavali, A.P., India.