Design And Implementation Of Modified Sqrt Carry Select Adder On FPGA

Ch. Pavan Kumar #1, V. Narayana Reddy *2, R. Sravanthi *3
# Dept. of ECE, PBR VIT, Kavali, A.P, India
#2 Associate Professor, Department of ECE, VEC, Kavali, A.P, India

Abstract
The Carry Select Adder (CSLA) is one of the fastest adders and is used in many data-processing processors to perform fast arithmetic functions. It increases the speed of a parallel adder at the expense of area. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum. However, the CSLA is not area efficient, because it uses multiple pairs of Ripple Carry Adders (RCAs) to generate the partial sums and carries, which are then selected by multiplexers. The Square Root (SQRT) CSLA is constructed by equalising the delay through the two carry chains and the block multiplexer signal from the previous stage. It is an extension of the linear CSLA that greatly improves the delay: the time spent waiting for the carry bit is used to process an extra input bit in each stage. The main disadvantage of the SQRT CSLA is the duplication of adders, which makes the adder larger than a standard ripple adder. This disadvantage is overcome by using a Binary to Excess-1 Converter (BEC) in place of the RCA with Cin = 1 to optimise the area and delay. The modified design reduces area and power compared with the regular SQRT CSLA, with only a slight increase in delay. Based on this modification, 8-, 16-, 32-, 64-, and 128-b SQRT CSLA architectures are developed, simulated, and compared with the regular SQRT CSLA.

Key words: MIMO, broadcast channels, array signal processing, feedback communication, co-channel interference, diversity methods.
Index Terms: CRC, lookup table, fast update

I. INTRODUCTION
In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated sequentially, only after the previous bit position has been summed and a carry propagated into the next position. The circuit architecture of such a Ripple Carry Adder (RCA) is simple and area-efficient. However, the computation is slow, because each full adder can start its operation only after the previous carry-out signal is ready. On the other hand, Carry Look-ahead Adders (CLAs) are the fastest adders, but they are the worst from the area point of view. Carry Select Adders (CSLAs) have been considered a compromise between RCAs and CLAs, because they offer a good trade-off between the compact area of RCAs and the short delay of CLAs.

Reduced-area and high-speed data path logic systems are among the main areas of research in VLSI system design. High-speed addition and multiplication have always been a fundamental requirement of high-performance processors and systems. Many adder designs are available (Ripple Carry Adder, Carry Look-ahead Adder, Carry Save Adder, Carry Skip Adder), each with its own advantages and disadvantages. The major speed limitation in any adder is the production of carries, and many authors have considered the addition problem. The CSLA was developed to address the carry propagation delay, which it reduces to a great extent. However, the regular CSLA is not area efficient, because it uses multiple pairs of RCAs to generate the partial sum and carry for each possible carry input. The final sum and carry are selected by multiplexers (mux). The use of two independent RCAs increases the area, which in turn leads to an increase in delay.
To overcome the above problem, the basic idea of the proposed work is to use n-bit Binary to Excess-1 Converters (BEC) to improve the speed of addition. This logic can replace the RCA with Cin = 1 to further improve the speed and thus reduce the delay. Using a BEC instead of an RCA in the regular CSLA achieves lower area and delay, which speeds up the addition operation. The main advantage of the BEC logic is that it needs fewer logic gates than the Full Adder (FA) structure.

II. RELATED WORK
Basic adder blocks
We explain how to calculate delay and area theoretically. The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Fig. 1.

ISSN: 2231-2803 http://www.ijcttjournal.org Page63
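As a rough illustration of this kind of theoretical evaluation, the sketch below counts gates and the longest gate path for an XOR built only from AND, OR, and inverter gates, with each gate costing one unit of delay and one unit of area. The netlist is the standard sum-of-products form, A ^ B = (A & ~B) | (~A & B). This form is an assumption, since the actual circuit of Fig. 1 is not reproduced in the text, and the names `XOR_AOI`, `area`, `delay`, and `evaluate` are hypothetical.

```python
# Unit-gate model: every AND, OR and inverter costs 1 delay unit and 1 area unit.
# Assumed AOI form of XOR: A ^ B = (A & ~B) | (~A & B).

# Netlist: node name -> (gate type, list of input node names).
# "A" and "B" are primary inputs and do not appear as keys.
XOR_AOI = {
    "nA": ("NOT", ["A"]),
    "nB": ("NOT", ["B"]),
    "t1": ("AND", ["A", "nB"]),
    "t2": ("AND", ["nA", "B"]),
    "y": ("OR", ["t1", "t2"]),
}


def area(netlist):
    """Area is the total number of AOI gates in the block."""
    return len(netlist)


def delay(netlist, node):
    """Delay is the number of gates on the longest path ending at `node`."""
    if node not in netlist:  # primary input contributes no delay
        return 0
    _, inputs = netlist[node]
    return 1 + max(delay(netlist, i) for i in inputs)


def evaluate(netlist, node, values):
    """Logic value of `node` for the given primary-input values."""
    if node in values:
        return values[node]
    kind, inputs = netlist[node]
    args = [evaluate(netlist, i, values) for i in inputs]
    if kind == "NOT":
        return 1 - args[0]
    if kind == "AND":
        return int(all(args))
    return int(any(args))  # OR


if __name__ == "__main__":
    print("area :", area(XOR_AOI))
    print("delay:", delay(XOR_AOI, "y"))
```

Under these assumptions the XOR costs 5 area units and 3 delay units (inverter, then AND, then OR). The same two functions could be applied to netlists of a mux, half adder, or full adder.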

Fig. 1. Delay and area evaluation of an XOR gate

The gates shown between the dotted lines operate in parallel, and the number on each gate indicates the delay contributed by that gate. The evaluation assumes that all gates are made up of AND, OR, and Inverter gates, each with a delay of 1 unit and an area of 1 unit. The delay of a logic block is found by adding up the number of gates on its longest path. The area is found by counting the total number of AOI gates required for the block. Based on this approach, the CSLA adder blocks of the 2:1 mux, Half Adder (HA), and FA are evaluated and listed in Table 1.

Table 1: Delay and area evaluation of CSLA

As discussed above, the main idea of this work is to use a BEC instead of the RCA with Cin = 1 in order to reduce the area and power consumption of the regular CSLA. To replace an n-bit RCA, an (n + 1)-bit BEC is required. A structure and the function table of a 2-b BEC are shown in Fig. 5 and Table 2, respectively. Fig. 2 illustrates how the basic function of the CSLA is obtained by using the 8-bit BEC together with the mux. One input of the 16:8 mux receives the direct inputs B, and the other input is the BEC output. The two possible partial results are thus produced in parallel, and the mux selects either the BEC output or the direct inputs according to the control signal Cin. The Boolean expressions of the 8-bit BEC are as follows (functional symbols: ~ NOT, & AND, ^ XOR):

X0 = ~B0
X1 = B0 ^ B1
X2 = B2 ^ (B0 & B1)
X3 = B3 ^ (B0 & B1 & B2)
X4 = B4 ^ (B0 & B1 & B2 & B3)
X5 = B5 ^ (B0 & B1 & B2 & B3 & B4)
X6 = B6 ^ (B0 & B1 & B2 & B3 & B4 & B5)
X7 = B7 ^ (B0 & B1 & B2 & B3 & B4 & B5 & B6)

B (input)    X (output)
0000000      0000001
0000001      0000010
...          ...
1111111      0000000

Table 2: BEC functional table

Ripple Carry Adder: The ripple carry adder is constructed by cascading full adder (FA) blocks in series.
One full adder is responsible for the addition of two binary digits at each stage of the ripple carry chain. The carry-out of one stage is fed directly to the carry-in of the next stage.

[Figure: 4-bit ripple carry adder]

A serious drawback of this adder is that the delay increases linearly with the bit length.

Carry-Select Adder: The basic idea of the carry-select adder is to use blocks of two ripple-carry adders, one fed with a constant 0 carry-in and the other with a constant 1 carry-in. Both blocks can therefore calculate in parallel. When the actual carry-in signal for the block arrives, multiplexers select the correct one of the two pre-calculated partial sums. The resulting carry-out is selected in the same way and propagated to the next carry-select block. The time spent waiting for the carry before computing the sum is thus avoided, which improves speed.

III. PROPOSED SYSTEM
16-bit SQRT carry select adder

Fig. 2. 8-bit BEC with 16:8 mux

A carry-select adder is divided into sectors, each of which, except for the least significant, performs two additions in parallel: one assuming a carry-in of zero, the other a carry-in of one. Within a sector there are two 4-bit ripple-carry adders that receive the same data inputs but different Cin. The upper adder has a carry-in of zero, the lower adder a carry-in of one. The actual Cin from the preceding sector selects one of the two adders. If the carry-in is zero, the sum and carry-out of the upper adder are selected. If the carry-in is one, the sum and carry-out of the lower adder are selected. Logically, the result is the same as if a single ripple-carry adder were used. The structure of the regular 16-bit SQRT CSLA has five groups of different-size RCAs, as shown in Fig. 3.

Fig. 3. Regular 16-b SQRT CSLA

Fig. 4. Delay and area evaluation of regular SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. F is a Full Adder.

The delay and area evaluation of each group is shown in Fig. 4, in which the numerals within [] specify the delay values, e.g., sum2 requires 10 gate delays.

GROUP     DELAY   AREA
Group2    11      57
Group3    13      87
Group4    16      117
Group5    19      147

Delay and area count of regular SQRT CSLA groups

Modified SQRT carry select adder:
A carry-select adder achieves speeds 40% to 90% faster by performing additions in parallel and reducing the maximum carry path. The modified carry select adder is similar to the regular 16-bit SQRT CSLA. The only change is that, in the basic blocks having two ripple-carry adders, the ripple-carry adder fed with a constant 1 carry-in is replaced by a BEC. A BEC has fewer gates than an RCA. Instead of a second 2-b RCA with Cin = 1, a 3-b BEC is used, which adds one to the output of the 2-b RCA. The output of the group is selected by a multiplexer.
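The claim that a BEC fed by the Cin = 0 adder can stand in for the Cin = 1 adder can be checked with a small bit-level sketch. It uses the BEC expressions X0 = ~B0 and Xi = Bi ^ (B0 & ... & B(i-1)) and compares an (n + 1)-bit BEC applied to the RCA output against a second RCA with Cin = 1. The helper names (`bec`, `rca`, `to_bits`, `from_bits`, `check_group`) are hypothetical and this is a behavioural model only, not the authors' HDL.

```python
def to_bits(value, width):
    """Integer -> list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (LSB first) -> integer."""
    return sum(b << i for i, b in enumerate(bits))


def bec(bits):
    """Binary to Excess-1 converter on a list of bits (LSB first).

    X0 = ~B0 and Xi = Bi ^ (B0 & ... & B(i-1)); the result wraps around
    to all zeros when every input bit is one.
    """
    out = []
    lower_all_ones = 1  # AND of all lower-order input bits (empty AND = 1)
    for b in bits:
        out.append(b ^ lower_all_ones)
        lower_all_ones &= b
    return out


def rca(a_bits, b_bits, cin):
    """Ripple carry adder: returns the sum bits followed by the carry-out."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out + [carry]


def check_group(width):
    """True if BEC(RCA with Cin=0) equals RCA with Cin=1 for all inputs."""
    for a in range(2 ** width):
        for b in range(2 ** width):
            a_bits, b_bits = to_bits(a, width), to_bits(b, width)
            if bec(rca(a_bits, b_bits, 0)) != rca(a_bits, b_bits, 1):
                return False
    return True


if __name__ == "__main__":
    # A 2-b RCA followed by a 3-b BEC, as in group 2 of the modified adder.
    print(check_group(2))
```

The n-bit RCA produces n sum bits plus a carry, which is why the BEC needs n + 1 bits. The sum of two n-bit numbers is at most 2^(n+1) - 2, so adding one never overflows the BEC.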

Fig. 5. Modified 16-b SQRT CSLA. The parallel RCA with Cin = 1 is replaced with a BEC.

The structure of the proposed 16-b SQRT CSLA, which uses a BEC in place of the RCA with Cin = 1 to optimize area and power, is shown in Fig. 5. As with the regular adder, the structure is split into five groups. The delay and area estimation of each group is shown in Fig. 6. The steps leading to the evaluation are given here.

Fig. 6. Delay and area evaluation of modified SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. H is a Half Adder.

Group 1 has only one 2-bit RCA. Group 2 has one 2-b RCA, consisting of 1 FA (full adder) and 1 HA (half adder); instead of a second 2-b RCA with Cin = 1, a 3-b BEC is used, which adds one to the output of the 2-b RCA. Thus sum3 and the final c3 (output from the mux) depend on s3 and the mux, and on the partial c3 (input to the mux) and the mux, respectively. The sum2 depends on c1 and the mux.

GROUP     DELAY   AREA
Group2    13      43
Group3    16      61
Group4    19      84
Group5    22      107

Table IV: Delay and area count of modified SQRT CSLA

IV. RESULTS AND ANALYSIS
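The operand values quoted for the simulations in this section can be cross-checked with a short behavioural sketch of the modified adder. Several things here are assumptions rather than details confirmed by the text: the 16 bits are split into groups of 2, 2, 3, 4, and 5 bits (LSB group first), the BEC is modelled as an integer "+1" on the (w + 1)-bit output of the Cin = 0 adder, and the name `modified_sqrt_csla` is hypothetical. It models function only and says nothing about delay, area, or power.

```python
GROUP_WIDTHS = (2, 2, 3, 4, 5)  # assumed split of the 16 bits, LSB group first


def modified_sqrt_csla(a, b, cin=0, widths=GROUP_WIDTHS):
    """Behavioural model of a SQRT CSLA whose Cin=1 adders are BECs.

    Returns (sum, carry_out) for the operand width given by `widths`.
    """
    total, carry, shift = 0, cin, 0
    for index, w in enumerate(widths):
        mask = (1 << w) - 1
        ga, gb = (a >> shift) & mask, (b >> shift) & mask
        if index == 0:
            # Group 1: a single RCA that uses the real carry-in directly.
            result = ga + gb + carry
        else:
            # RCA with Cin = 0: w sum bits plus a carry (w + 1 bits in all).
            r0 = ga + gb
            # (w + 1)-bit BEC: adds one to the RCA output.
            r1 = (r0 + 1) & ((1 << (w + 1)) - 1)
            # Mux: the carry from the previous group picks the candidate.
            result = r1 if carry else r0
        total |= (result & mask) << shift
        carry = result >> w  # selected carry-out feeds the next group's mux
        shift += w
    return total, carry


if __name__ == "__main__":
    s, c = modified_sqrt_csla(0xFAC2, 0xACDA, 0)
    print(f"{s:04X} carry={c}")
    s, c = modified_sqrt_csla(0x5678, 0xA3B5, 0)
    print(f"{s:04X} carry={c}")
```

With these inputs the model gives A79C with carry 1 and FA2D with carry 0, which agrees with the sums reported for the ripple carry and carry-select simulations.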

Fig. 7. 16-bit ripple carry simulation timing diagram

The simulation of the ripple carry adder is shown in Fig. 7. As can be seen from the simulation output, the addition was performed between FAC2 and ACDA with a carry-in of 0. The resulting sum was A79C and a carry of 1 was generated.

Fig. 8. Multiplexer simulation timing diagram

The simulation of the multiplexer is shown in Fig. 8. As can be seen from the simulation output, the multiplexing operation was performed between m1 and m2, which were given as 10 and 11, with the selection signal s set to 1; the selected output was 11.

Fig. 9. 16-bit regular SQRT CSLA simulation timing diagram

The simulation of the regular SQRT CSLA is shown in Fig. 9. As can be seen from the simulation output, the addition was performed between two 16-bit numbers with cin set to 1, giving a 16-bit output without any carry generation. This model has higher delay and higher power consumption.

Fig. 9. 16-bit carry select adder simulation timing diagram

The simulation of the carry-select adder is shown in Fig. 9. As can be seen from the simulation output, the addition was performed between 5678 and a3b5. The resulting sum was fa2d and no carry was generated. The signals cc1, cc2, and cc3 appear in the diagram: based on cc1, the carry was selected from cc2 and cc3, and similarly, based on cc1, the sum was selected from sum1 and sum2.

Fig. 10. 16-bit modified SQRT CSLA simulation timing diagram

The simulation of the modified SQRT CSLA is shown in Fig. 10. As can be seen from the simulation output, the addition was performed between two 16-bit numbers with cin set to 1, giving a 16-bit output without any carry generation. This model has lower delay and lower power consumption.

CONCLUSION
Replacing the ripple carry adder with a BEC in the SQRT CSLA was achieved.
The SQRT CSLA was designed using a BEC to overcome the limitations of the ripple carry adder. The comparison between the RCA and the BEC is made on the basis of delay. This project presents an efficient implementation of the SQRT CSLA using BECs. We compared the two SQRT CSLAs by implementing each of them separately, one using RCAs and the other using BECs. The BEC-based design performs better in terms of delay: the adder accomplishes the addition by adding small portions of bits (each of equal size) and then selects the correct outputs using multiplexers. The delay is reduced from 17.281 ns to 13.619 ns (about 21%).

FUTURE SCOPE
The SQRT CSLA using BECs can be used in many data-processing processors to achieve fast performance. The area and power can be reduced. We also conclude that this addition technique can be implemented for larger bit widths.

BIO DATA
Ch. Pavan Kumar is presently pursuing an M.Tech in the Department of Electronics and Communications at PBR VITS, Kavali, A.P, India.

V. Narayana Reddy is presently working as an Associate Professor in the Department of Electronics and Communications at VEC, Kavali, A.P, India.

Mrs. R. Sravanthi, M.Tech, Assoc. Professor, VEC, Kavali, A.P, India.