Design and Implementation of Low-Power and Area-Efficient for Carry Select Adder (Csla)

Vol. 4 Issue 1 August 2014, ISSN: 2319-1058

M. Deepika
Department of Electronics and Communication Engineering, NITS, Hyderabad, AP, India.

K. Srinivasa Reddy
Department of Electronics and Communication Engineering, NITS, Hyderabad, AP, India.

U. Srinivasa Rao
Department of Electronics and Communication Engineering, NITS, Hyderabad, AP, India.

Abstract

Carry Select Adder (CSLA) is one of the fastest adders used in many data-processing processors to perform fast arithmetic functions. From the structure of the CSLA, it is clear that there is scope for reducing its area and power consumption. This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSLA. Based on this modification, 8-, 16-, 32-, and 64-bit square-root CSLA (SQRT CSLA) architectures have been developed and compared with the regular SQRT CSLA architecture. The proposed design has reduced area and power compared with the regular SQRT CSLA, with only a slight increase in delay. This work evaluates the performance of the proposed designs in terms of delay, area, power, and their products, by hand with logical effort and through custom design and layout in 0.18-µm CMOS process technology. The analysis of the results shows that the proposed CSLA structure is better than the regular SQRT CSLA.

Index Terms: Application-specific integrated circuit (ASIC), area-efficient, CSLA, low power.

I. INTRODUCTION

The design of area- and power-efficient high-speed systems is one of the most substantial areas of research in VLSI system design. In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated sequentially, only after the previous bit position has been summed and a carry propagated into the next position.

The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum. However, the CSLA is not area efficient, because it uses multiple pairs of Ripple Carry Adders (RCA) to generate the partial sum and carry by considering carry input cin=0 and cin=1; the final sum and carry are then selected by the multiplexers (mux). The basic idea of this work is to use a Binary to Excess-1 Converter (BEC) instead of the RCA with cin=1 in the regular CSLA to achieve lower area and power consumption [2][4]. The main advantage of this BEC logic comes from its smaller number of logic gates compared with the n-bit Full Adder (FA) structure.

This brief is structured as follows. The details of the BEC logic are discussed in Section II. Section III deals with the delay and area evaluation methodology of the basic adder blocks. Sections IV and V evaluate the regular and the modified 16-b SQRT CSLA, respectively, and Section VII concludes the work.
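The carry-select principle described in the introduction can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the function names (`ripple_carry_add`, `carry_select_block`, `to_bits`, `from_bits`) are hypothetical and do not come from the paper, and bits are stored least-significant first.

```python
def ripple_carry_add(a_bits, b_bits, cin):
    """Add two equal-length little-endian bit lists; return (sum_bits, carry_out)."""
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_select_block(a_bits, b_bits, cin):
    """One regular CSLA group: both results are precomputed, then a mux picks one."""
    sum0, cout0 = ripple_carry_add(a_bits, b_bits, 0)  # RCA that assumes cin = 0
    sum1, cout1 = ripple_carry_add(a_bits, b_bits, 1)  # RCA that assumes cin = 1
    return (sum1, cout1) if cin else (sum0, cout0)     # mux controlled by the real cin


def to_bits(value, width):
    """Integer to little-endian bit list."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Little-endian bit list to integer."""
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    s, c = carry_select_block(to_bits(9, 4), to_bits(7, 4), 1)
    print(from_bits(s), c)  # 9 + 7 + 1 = 17, i.e. sum bits 1 with carry-out 1
```

In hardware the two ripple-carry computations run in parallel, so the real carry-in only has to pass through the mux; the price is the duplicated adder, which is the area overhead this work targets.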

II. BEC

As stated above, the main idea of this work is to use a BEC instead of the RCA with cin=1 in order to reduce the area and power consumption of the regular CSLA; that is, the RCA with cin=1 in the regular CSLA is replaced by a BEC. Fig 4.2 illustrates how the basic function of the CSLA is obtained by using the 4-bit BEC together with the mux. One input of the 8:4 mux gets (B3, B2, B1, and B0) as its input, and the other input of the mux is the BEC output. This produces the two possible partial results in parallel, and the mux selects either the BEC output or the direct inputs according to the control signal Cin. The importance of the BEC logic stems from the large silicon area reduction when CSLAs with a large number of bits are designed. The Boolean expressions of the 4-bit BEC are listed below (~ denotes NOT, & denotes AND, and ^ denotes XOR):

X0 = ~B0
X1 = B0 ^ B1
X2 = (B0 & B1) ^ B2
X3 = (B0 & B1 & B2) ^ B3

Fig 4.1 4-Bit BEC
Fig 4.2 4-Bit BEC With 8:4 MUX

III. DELAY AND AREA EVALUATION METHODOLOGY OF THE BASIC ADDER BLOCKS

The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Fig. 1. The gates between the dotted lines perform their operations in parallel, and the numeral on each gate indicates the delay contributed by that gate. The delay and area evaluation methodology considers all gates to be made up of AND, OR, and Inverter gates, each having a delay of 1 unit and an area of 1 unit. We then add up the number of gates in the longest path of a logic block, which gives the maximum delay. The area evaluation is done by counting the total number of AOI gates required for each logic block. Based on this approach, the CSLA adder blocks of 2:1 mux, Half Adder (HA), and FA are evaluated and listed in Table I.

IV. DELAY AND AREA EVALUATION METHODOLOGY OF REGULAR 16-B SQRT CSLA

The structure of the 16-b regular SQRT CSLA is shown in Fig 3. It has five groups of RCAs of different sizes. The delay and area evaluation of each group are shown in Fig. 5, in which the numerals within [] specify the delay values, e.g., sum2 requires 10 gate delays. The steps leading to the evaluation are as follows.

Fig 3 16-b Regular SQRT CSLA

The group2 [see Fig. 5(a)] has two sets of 2-b RCA. Based on the delay values of Table I, the arrival time of the selection input c1 [time(t)=7] of the 6:3 mux is earlier than s3 [t=8] and later than s2 [t=6]. Thus, sum3 [t=11] is the summation of s3 and mux [t=3], and sum2 [t=10] is the summation of c1 and mux.

Except for group2, the arrival time of the mux selection input is always greater than the arrival time of the data outputs from the RCAs. Thus, the delay of each of group3 to group5 is determined by the arrival time of its mux selection input plus the mux delay.
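The arrival times quoted for group2 can be reproduced with a small unit-gate timing model. The sketch below is illustrative only and rests on assumptions that are not spelled out in the text: the XOR is taken as a five-gate, three-level AOI network, the 2:1 mux as a four-gate, three-level network, and the FA as two XORs, two ANDs, and one OR. All names (`full_adder_times`, `ripple_times`, and the constants) are hypothetical.

```python
# Unit-gate (AOI) model: every AND, OR, and NOT gate has delay 1 and area 1.
GATE = 1

# Assumed AOI decompositions (the text only states the unit-gate rule):
#   XOR = 2 NOT + 2 AND + 1 OR, three gate levels deep
#   MUX = 1 NOT + 2 AND + 1 OR, three gate levels deep
XOR_DELAY, XOR_AREA = 3 * GATE, 5 * GATE
MUX_DELAY, MUX_AREA = 3 * GATE, 4 * GATE

# Areas that follow from the assumed structures above.
HA_AREA = XOR_AREA + GATE          # sum = a ^ b, carry = a & b
FA_AREA = 2 * XOR_AREA + 3 * GATE  # two XORs, two ANDs, one OR


def full_adder_times(t_a, t_b, t_cin):
    """Return (sum, carry) arrival times of a full adder under the unit-gate model."""
    t_p = max(t_a, t_b) + XOR_DELAY      # p = a ^ b
    t_sum = max(t_p, t_cin) + XOR_DELAY  # sum = p ^ cin
    t_g = max(t_a, t_b) + GATE           # g = a & b
    t_pc = max(t_p, t_cin) + GATE        # p & cin
    t_carry = max(t_g, t_pc) + GATE      # carry = g | (p & cin)
    return t_sum, t_carry


def ripple_times(width, t_cin=0):
    """Sum and carry arrival times of an all-FA RCA whose operands arrive at t = 0."""
    sums, carries = [], []
    t_c = t_cin
    for _ in range(width):
        t_s, t_c = full_adder_times(0, 0, t_c)
        sums.append(t_s)
        carries.append(t_c)
    return sums, carries


# Group1 is a 2-b RCA; its carry-out c1 is the select input of group2's mux.
_, group1_carries = ripple_times(2)
t_c1 = group1_carries[-1]

# Slower (two-FA) RCA of group2; its fixed carry input is available at t = 0.
(t_s2, t_s3), _ = ripple_times(2)

# A mux output is ready one mux delay after its latest input (data or select).
t_sum2 = max(t_s2, t_c1) + MUX_DELAY
t_sum3 = max(t_s3, t_c1) + MUX_DELAY

print(t_c1, t_s2, t_s3, t_sum2, t_sum3)  # 7 6 8 10 11
```

Under these assumptions the model returns c1 at t=7, s2 at t=6, s3 at t=8, sum2 at t=10, and sum3 at t=11, which agrees with the values given in the text for group2.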

Fig 3b 3-Bit RCA With 8:4 MUX

One set of 2-b RCA in group2 has 2 FA for cin=1, and the other set has 1 FA and 1 HA for cin=0. Based on the area count of Table I, the total gate count of group2 is determined by summing the gate counts of its FAs, HA, and mux.

Similarly, the estimated maximum delay and area of the other groups in the regular SQRT CSLA are evaluated and listed in Table 3.

V. DELAY AND AREA EVALUATION METHODOLOGY OF MODIFIED 16-B SQRT CSLA

The structure of the proposed 16-b SQRT CSLA, which uses a BEC in place of the RCA with cin=1 to optimize the area and power, is shown in Fig 4. We again split the structure into five groups. The delay and area estimation of each group are shown in Figs 4a to 4d. The steps leading to the evaluation are given here.

Fig 4 Modified 16-bit SQRT CSLA
Fig 4a 2-Bit RCA And 3-Bit BEC With 6:3 MUX
Fig 4b 3-Bit RCA And 4-Bit BEC With 8:4 MUX

Table 4 Delay and area of the 16-bit modified CSLA

Group     Delay   Area
Group2    13      43
Group3    16      61
Group4    19      84
Group5    22      167

1) The group2 [see Fig 4a] has one 2-b RCA, which has 1 FA and 1 HA for Cin=0. Instead of another 2-b RCA with Cin=1, a 3-b BEC is used, which adds one to the output of the 2-b RCA. Based on the delay values of Table I, the arrival time of the selection input c1 [time(t)=7] of the 6:3 mux is earlier than s3 [t=9] and c3 [t=10] and later than s2 [t=4]. Thus, sum3 depends on s3 and the mux, and the final c3 (output from the mux) depends on the partial c3 (input to the mux) and the mux. The sum2 depends on c1 and the mux.

Fig 4c 4-Bit RCA And 5-Bit BEC With 10:5 MUX
Fig 4d 5-Bit RCA And 6-Bit BEC With 12:6 MUX

2) For the remaining groups, the arrival time of the mux selection input is always greater than the arrival time of the data inputs from the BECs. Thus, the delay of the remaining groups depends on the arrival time of the mux selection input and the mux delay. The area count of group2 is determined by summing the gate counts of its FA, HA, mux, and BEC.

3) Similarly, the estimated maximum delay and area of the other groups of the modified SQRT CSLA are evaluated and listed in Table 4.

ADVANTAGES
Area efficiency (less complexity)
Fewer gates
Low power
More speed compared with the regular CSLA
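The BEC-based structure of Section V can be sketched as a behavioural model. This is an illustration under assumptions, not the authors' implementation: the group sizes (2, 2, 3, 4, 5) are inferred from the figure captions, group1 is assumed to be a plain 2-b RCA, and all function names are hypothetical. The delay chain at the end uses only values quoted in the text (c1 at t=7, partial c3 at t=10, mux delay 3).

```python
def ripple_carry_add(a_bits, b_bits, cin):
    """Add two equal-length little-endian bit lists; return (sum_bits, carry_out)."""
    carry, sum_bits = cin, []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def bec(bits):
    """Binary to Excess-1 Converter: X0 = ~B0, Xi = Bi ^ (B0 & ... & B(i-1))."""
    out, all_lower_ones = [], 1
    for b in bits:
        out.append(b ^ all_lower_ones)
        all_lower_ones &= b
    return out


def modified_group(a_bits, b_bits, cin):
    """One modified CSLA group: n-bit RCA with cin=0, (n+1)-bit BEC, and a mux."""
    sum0, cout0 = ripple_carry_add(a_bits, b_bits, 0)
    direct = sum0 + [cout0]
    plus_one = bec(direct)               # replaces the RCA with cin = 1
    chosen = plus_one if cin else direct  # mux controlled by the incoming carry
    return chosen[:-1], chosen[-1]


GROUP_SIZES = (2, 2, 3, 4, 5)  # assumed from the figure captions; sums to 16 bits


def sqrt_csla_add(a, b, cin=0):
    """Return (16-bit sum, carry_out) of the modelled modified SQRT CSLA."""
    result, carry, pos = 0, cin, 0
    for index, width in enumerate(GROUP_SIZES):
        a_bits = [(a >> (pos + i)) & 1 for i in range(width)]
        b_bits = [(b >> (pos + i)) & 1 for i in range(width)]
        if index == 0:
            sum_bits, carry = ripple_carry_add(a_bits, b_bits, carry)
        else:
            sum_bits, carry = modified_group(a_bits, b_bits, carry)
        for i, bit in enumerate(sum_bits):
            result |= bit << (pos + i)
        pos += width
    return result, carry


# Delay chain, using values quoted in the text.
MUX_DELAY = 3
T_C1 = 7           # select input of group2
T_PARTIAL_C3 = 10  # slowest BEC output in group2

group_delays = {"Group2": max(T_C1, T_PARTIAL_C3) + MUX_DELAY}
t = group_delays["Group2"]
for name in ("Group3", "Group4", "Group5"):
    t += MUX_DELAY  # select arrives last, so each group adds one mux delay
    group_delays[name] = t

print(sqrt_csla_add(0xFFFF, 0x0001))  # (0, 1)
print(group_delays)  # {'Group2': 13, 'Group3': 16, 'Group4': 19, 'Group5': 22}
```

The loop in `bec` reduces to the four Boolean expressions of the 4-bit BEC when given four input bits, and the computed delay chain matches the Delay column of Table 4.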

DISADVANTAGES
Larger delay

APPLICATIONS
ALU operations
High-speed multiplication
Advanced microprocessor design
Digital signal processing

SOFTWARE REQUIREMENTS: XILINX Version.
HARDWARE REQUIREMENTS: FPGA Spartan-3E

SIMULATION RESULT

Simulation Result-1

Simulation Result-2

VII. CONCLUSION

A simple approach is proposed in this paper to reduce the area and power of the SQRT CSLA architecture. The reduced number of gates in this work offers a great advantage in reducing the area and also the total power. The compared results show that the modified SQRT CSLA has a slightly larger delay (only 3.76%), but the area and power of the 64-b modified SQRT CSLA are significantly reduced, by 17.4% and 15.4%, respectively. The power-delay product and the area-delay product of the proposed design both decrease for the 16-, 32-, and 64-b sizes, which indicates the success of the method and not a mere tradeoff of delay for power and area. The modified CSLA architecture is therefore low area, low power, simple, and efficient for VLSI hardware implementation. It would be interesting to test the design of the modified 128-b SQRT CSLA.

ACKNOWLEDGMENT

The authors would like to thank S. Sivanantham, P. MageshKannan, and S. Ravi of the VLSI Division, VIT University, Vellore, India, for their contributions to this work.