Research Article Low Power 256-bit Modified Carry Select Adder


Research Journal of Applied Sciences, Engineering and Technology 8(10): 1212-1216, 2014
DOI: 10.19026/rjaset.8.1086
ISSN: 2040-7459; e-ISSN: 2040-7467
© 2014 Maxwell Scientific Publication Corp.
Submitted: March 29, 2014; Accepted: July 01, 2014; Published: September 15, 2014

Research Article
Low Power 256-bit Modified Carry Select Adder

P. Ramani, G. Priya, Murala Chandana, T. Sharmila, Seeram Tejaswi and M. Manjushri
Department of ECE, SRM University, Chennai, India

Abstract: The Carry Select Adder (CSLA) is one of the high-speed adders used in many computational systems to perform fast arithmetic operations. Compared with the earlier Ripple Carry Adder and Carry Look-Ahead Adder, the Regular CSLA (R-CSLA) is observed to provide optimized results in terms of area. This study proposes an efficient method that replaces the Ripple Carry Adder (RCA) with a Binary to Excess-1 Converter (BEC). The modified CSLA architecture has been developed using a gate-level modification to significantly reduce the delay and power of the CSLA. Based on this modification, 8-, 16-, 32-, 64- and 128-bit Square-Root CSLA (SQRT CSLA) architectures have been developed and compared with the regular SQRT CSLA architecture. The proposed 256-bit design has reduced power and delay compared with the regular SQRT CSLA. The designs were developed as structural Verilog modules, synthesized using the Xilinx ISE simulator and implemented in the Cadence RTL Compiler using 0.18 µm technology. For 256-bit addition, the proposed simple gate-level modification reduces the power by 19.4% compared with the R-CSLA. The result analysis shows that the proposed architecture achieves a twofold advantage in terms of delay and power.

Keywords: BEC, Cadence, CSLA

INTRODUCTION

Power and area play a major role in the design of integrated circuits because of the increasing popularity of portable systems and the rapid growth of power density in VLSI circuits.
Addition strongly influences the overall performance of digital systems and is a crucial arithmetic function. Adders are among the most widely used components in electronic applications. For example, microprocessors perform millions of instructions per second. The increasing portability of devices such as mobile phones and laptops demands longer battery life. Low-power (Edison and Manikandababu, 2012) and area-efficient addition and multiplication have always been fundamental requirements of high-performance processors and systems. Designing an efficient adder is one of the most difficult problems in VLSI design.

Carry Select Adders are used in high-speed applications because they reduce the propagation delay. The basic operation of the Carry Select Adder (CSLA) is parallel computation: the CSLA generates multiple carries and partial sums, and the final sum and carry are selected by multiplexers (mux). The CSLA structure (Mitra and Dutta, 2012) uses multiple pairs of Ripple Carry Adders (RCA); hence, the CSLA is not area efficient. The proposed method uses a Binary to Excess-1 Converter (BEC) instead of the RCA with Cin = 1 in the regular CSLA. The main goal of the BEC logic is to use fewer logic gates than the n-bit full adder, so that the modified CSLA architecture has lower area and power consumption. In the modified CSLA, the input bits are given in a linear manner to achieve low power. This study is implemented for higher-order bits (up to 256 bits), and the comparison between the regular SQRT CSLA and the modified SQRT CSLA is discussed below.

Corresponding Author: P. Ramani, Department of ECE, SRM University, Chennai, India
This work is licensed under a Creative Commons Attribution 4.0 International License (URL: http://creativecommons.org/licenses/by/4.0/).

Binary to Excess-1 Converter: The main idea of this study is to use a BEC instead of the RCA with Cin = 1 in order to reduce the area and power consumption of the regular CSLA. To replace the n-bit RCA, an (n+1)-bit BEC is required. The modified CSLA architecture has been developed using the BEC. Figure 1 illustrates how the basic function of the CSLA is obtained by using the 4-bit BEC together with the mux.

Fig. 1: BEC with mux

The XOR gate in the BEC of the modified CSLA is replaced with the optimized XOR gate, built from AND-OR-Invert (AOI) logic, of the modified area-efficient CSLA. Replacing the n-bit RCA with an (n+1)-bit BEC reduces the number of gates. When the optimized XOR gate is used in the modified CSLA, a large further reduction in the number of gates is verified.

The multiplexer (mux) selects either the BEC output or the inputs given directly to the BEC circuit of the next block. In this design, the major function of the mux is to derive the adder speed. According to the control signal Cin (Subha and Durga, 2013), the mux selects its output from the inputs (the input bits as per the block size and the BEC output). The importance of the BEC logic stems from the large silicon area reduction achieved when CSLAs with a large number of bits are designed. The Boolean expressions of the 4-bit BEC (Pandey et al., 2013) are listed below (functional symbols: ~ NOT, & AND, ^ XOR):

X0 = ~B0
X1 = B0 ^ B1
X2 = B2 ^ (B0 & B1)
X3 = B3 ^ (B0 & B1 & B2)

Fig. 2: Modified 32-bit SQRT CSLA

Regular SQRT carry select adder: In this part, the operation of the regular SQRT CSLA and its delay calculation are discussed. A SQRT carry select adder is constructed using the conventional 4-bit Ripple Carry Adder (RCA). The RCA uses multiple full adders to perform the addition. Each full adder takes a carry-in, which is the carry-out of the preceding adder. The CSLA divides the words to be added into blocks and forms two sums for each block in parallel, one with an assumed carry-in (Cin) of 0 and the other with a Cin of 1. The carry-out from one stage of the 4-bit RCA is used as the select signal for the multiplexer, which selects the corresponding sum bits from the next block.
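The BEC expressions and the carry-select principle described above can be checked with a small bit-level model. The following is a minimal behavioral sketch in Python, not the authors' structural Verilog. The function names (`bec4`, `ripple_carry_add`, `csla_group3`, `verify`) and the choice of a 3-bit group feeding the 4-bit BEC are assumptions made for illustration only; the paper's own group 2 uses a 2-bit RCA with a 3-bit BEC.

```python
from itertools import product


def bec4(b0, b1, b2, b3):
    """4-bit Binary to Excess-1 Converter, using the Boolean expressions above."""
    x0 = 1 - b0                    # X0 = ~B0
    x1 = b0 ^ b1                   # X1 = B0 ^ B1
    x2 = b2 ^ (b0 & b1)            # X2 = B2 ^ (B0 & B1)
    x3 = b3 ^ (b0 & b1 & b2)       # X3 = B3 ^ (B0 & B1 & B2)
    return x0, x1, x2, x3


def to_bits(value, width):
    """Little-endian bit list of a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Chain of full adders; returns (sum_bits, carry_out)."""
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def csla_group3(a, b, cin):
    """Hypothetical 3-bit carry-select group: one RCA (Cin = 0), a 4-bit BEC, a mux."""
    s0, c0 = ripple_carry_add(to_bits(a, 3), to_bits(b, 3), cin=0)
    # The BEC sees n + 1 = 4 bits: the three sum bits plus the carry-out.
    x = bec4(s0[0], s0[1], s0[2], c0)
    s1, c1 = list(x[:3]), x[3]
    # Mux: the incoming carry selects the BEC result or the plain RCA result.
    return (s1, c1) if cin else (s0, c0)


def verify():
    """Exhaustively compare the model with integer arithmetic."""
    bec_ok = all(from_bits(bec4(*to_bits(v, 4))) == (v + 1) % 16 for v in range(16))
    group_ok = all(
        from_bits(csla_group3(a, b, cin)[0] + [csla_group3(a, b, cin)[1]]) == a + b + cin
        for a, b, cin in product(range(8), range(8), (0, 1))
    )
    return bec_ok and group_ok
```

Calling `verify()` exercises all 16 BEC inputs and all 128 operand and carry combinations of the group. The sketch only models the logic function; it says nothing about gate count, delay or power, which the paper obtains from synthesis.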
This speeds up the computation of the adder. Thus, the carry select adder achieves a higher speed of operation at the cost of an increased number of devices in the circuit, which in turn increases the area and power consumed by circuits of this type.

Delay evaluation methodology of regular 16-bit SQRT CSLA: The structure of the 16-bit regular Square-Root Carry Select Adder (SQRT CSLA) has five groups of RCAs of different sizes. Only the delay evaluation of group 2 is discussed. Group 2 has two sets of 2-bit RCAs. The regular CSLA structure (Mitra and Dutta, 2012) has two RCAs, one used with initial carry Cin = 0 and the other with Cin = 1. The sum 3 (t = 11) is the summation of s3 and mux (t = 3), and sum 2 (t = 10) is the summation of c1 and mux, based on the delay values stated earlier (Mitra and Dutta, 2012) and thereby their respective arrival times. Except for group 2, the arrival time of the mux selection input (Nair, 2013) is always greater than the arrival time of the data outputs from the RCAs. Thus, the delays of groups 3 to 5 are determined (Mitra and Dutta, 2012; Pandey et al., 2013), respectively, as follows:

{c6, sum[6:4]} = c3 [t = 10] + mux
{c10, sum[10:7]} = c6 [t = 13] + mux
{c15, sum[15:11]} = c10 [t = 16] + mux

In group 2, the 2-bit RCA for Cin = 1 has 2 FAs and the 2-bit RCA for Cin = 0 has 1 FA and 1 HA. The area, expressed as the total number of gates, is calculated as follows:

Gate count = 57 (FA + HA + mux)
FA = 39 (3 * 13) (FA: Full Adder)
HA = 6 (1 * 6) (HA: Half Adder)
Mux = 12 (3 * 4)

Proposed SQRT carry select adder: In this type of adder, the block of Ripple Carry Adders with an input carry of 1 is replaced with a block of Binary to Excess-1 Converters (BEC), as shown in Fig. 2. This is done in order to reduce the area and power requirements of the conventional Carry Select Adder. The maximum estimated areas of each group of the modified and regular SQRT CSLA are given in Table 1. The area of each group is calculated manually from the number of gates used in the structure. The percentage reductions of the CSLA for different word sizes are given in Table 2.

Table 1: Area count of CSLA
Group no.   Regular   Modified
2           64        50
3           94        73
4           124       96
5           154       119
6           184       142
7           214       165

Delay evaluation methodology of modified SQRT CSLA: The 32-bit modified SQRT CSLA structure is given in Fig. 2. The steps leading to the delay evaluation are given in Tables 1 and 3. The second group has a 2-bit RCA. Instead of another 2-bit RCA with Cin = 1, a 3-bit BEC is used, which adds 1 to the output of the 2-bit RCA. Based on the delay values, the arrival time of the selection input c1 of the 6:3 mux is earlier than those of s3 and c3 and later than that of s2. Thus, sum 3 depends on s3 and the mux, and the final c3 (output from the mux) depends on the partial c3 (input to the mux) and the mux. An area count of the CSLA is given in Table 1. The modified partial CSLA structures of group 2 and group 5 are given in Fig. 3a and b. For the remaining groups, the arrival time of the mux selection input is always greater than the arrival time of the data inputs from the BECs. Thus, the delay of the remaining groups depends on the arrival time of the mux selection input and the mux delay. Comparing the earlier models with the proposed model, the reductions in area, power and delay are given in Table 2. The implementation results in terms of leakage, switching and total power are given in Table 3. These results were obtained with the Cadence RTL Compiler.

Fig. 3: (a) Group 2 (modified CSLA), (b) group 5 (modified CSLA)

Fig. 4: Percentage of delay overhead

Fig. 5: Percentage reduction of product parameter

Table 2: Percentage reduction
Word size (bit)   Area    Area-delay   Power   Power-delay
8                 17.30   5.50         12.10   0.46
16                19.02   10.30        13.81   4.34
32                20.20   13.20        16.87   8.41
64                20.69   18.10        17.84   15.05
128               20.80   19.50        18.78   17.40
256               21.58   22.28        19.40   21.10

Table 3: Implementation result
Word size   CSLA       Delay (ns)   Area (µm²)   Leakage power (µW)   Switching power (µW)   Total power* (µW)   Power-delay product (10^-15)   Area-delay product (10^-21)
8-bit       Regular    0.700        1134         0.043                96.021                 96.065              67.2455                        793.80
8-bit       Modified   0.800        938          0.034                84.411                 84.446              67.5568                        750.40
16-bit      Regular    1.212        2472         0.095                226.311                226.407             274.4052                       2996.06
16-bit      Modified   1.343        2002         0.070                195.305                195.370             260.5180                       2688.56
32-bit      Regular    1.926        5246         0.199                504.093                504.293             971.2680                       10103.79
32-bit      Modified   2.095        4188         0.152                424.497                424.649             889.6390                       8773.86
64-bit      Regular    3.500        10694        0.407                997.391                997.798             3492.2930                      37429.00
64-bit      Modified   3.616        8482         0.307                820.153                820.461             2966.7980                      30670.91
128-bit     Regular    5.644        21961        0.826                2176.423               2177.249            12279.6300                     123860.00
128-bit     Modified   5.736        17394        0.634                1837.932               1768.567            10144.5050                     99771.98
256-bit     Regular    9.804        44684        1.672                4546.005               4547.678            44588.4350                     438081.90
256-bit     Modified   9.598        35044        1.253                3665.546               3666.800            35193.9490                     336352.30
*: Total power = Leakage power + Internal power + Switching power

RESULTS AND DISCUSSION

The design proposed in this study was developed using Verilog HDL and synthesized in the Cadence RTL Compiler using typical libraries of TSMC 180 nm technology. The CSLA designs were developed as structural Verilog modules and synthesized using the Xilinx ISE simulator, version 10.1, and the implementation was done in the Cadence RTL Compiler. The percentage reductions in the total power dissipation and the delay, with respect to the worst path of the flow, are given in Table 2.
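The percentage reductions in Table 2 can be recomputed from the Table 3 figures. The sketch below does this for the 256-bit row only, as an illustration. The helper name `percent_reduction` and the dictionary layout are assumptions, and the power-delay product is recomputed here as total power times delay, so small rounding differences from the published values are to be expected.

```python
def percent_reduction(regular, modified):
    """Reduction of the modified design relative to the regular one, in percent."""
    return 100.0 * (regular - modified) / regular


# 256-bit row of Table 3 (delay in ns, area in square micrometres, power in microwatts).
regular_256 = {"delay": 9.804, "area": 44684.0, "total_power": 4547.678}
modified_256 = {"delay": 9.598, "area": 35044.0, "total_power": 3666.800}

area_reduction = percent_reduction(regular_256["area"], modified_256["area"])
power_reduction = percent_reduction(
    regular_256["total_power"], modified_256["total_power"]
)

# Power-delay product: microwatts times nanoseconds gives femtojoules (10^-15 J).
pdp_regular = regular_256["total_power"] * regular_256["delay"]
pdp_modified = modified_256["total_power"] * modified_256["delay"]
pdp_reduction = percent_reduction(pdp_regular, pdp_modified)

if __name__ == "__main__":
    print(f"Area reduction:        {area_reduction:.2f} %")
    print(f"Power reduction:       {power_reduction:.2f} %")
    print(f"Power-delay reduction: {pdp_reduction:.2f} %")
```

Under these assumptions the three values agree with the 256-bit row of Table 2 (21.58, 19.40 and 21.10) to within about 0.1 percentage points.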
The analysis shows a considerable decrease in power and delay with a slight increase in area compared with the earlier work (Ramkumar and Kittur, 2012). The implementation results are shown in Table 3. The percentage of delay overhead is shown in Fig. 4. The percentage reductions in the cell area, total power, power-delay product and area-delay product as a function of the bit size are shown in Fig. 5.

CONCLUSION

After comparing the different parameters of various adders with those of the proposed modified SQRT CSLA, it is evident that the power dissipation has been reduced to the desired extent with a slight increase in area. The proposed model provides a good tradeoff between time and power consumption. Hence, the modified 256-bit CSLA is more efficient for VLSI hardware implementation. Further work is to be done on reducing the area and on higher-order adders (512-bit), thus improving the overall system performance.

REFERENCES

Edison, A.J. and C.S. Manikandababu, 2012. An efficient CSLA architecture for VLSI hardware implementation. IJMIE, 2(5), ISSN: 2249-0558.
Mitra, P. and D. Dutta, 2012. Low power high speed SQRT carry select adder. IOSR J. VLSI Signal. Proc. (IOSR-JVSP), 1: 46-51.
Nair, V.V., 2013. Modified low power and area efficient carry select adder using D-latch. Int. J. Eng. Sci. Innov. Technol., 2(4), ISSN: 2319-5967.
Pandey, S.S., A. Bakshi and V. Sharma, 2013. 128 bit low power and area efficient carry select adder. Int. J. Comput. Appl., 69(6): 29-33.
Ramkumar, B. and H.M. Kittur, 2012. Low-power and area-efficient carry select adder. IEEE T. VLSI Syst., 20(2): 371-375.
Subha, R. and G. Durga, 2013. Design of digital filter using low power and area efficient SQRT CSLA. IJCA Proceedings on National Conference on VLSI and Embedded Systems (NCVES), 1: 14-17.