FPGA Implementation of DA Algorithm for FIR Filter


International Journal of Computational Engineering Research, Vol. 03, Issue 8

1 Solmanraju Putta, 2 J Kishore, 3 P. Suresh
1 M.Tech student, Assoc. Prof., Professor, Dept. of ECE, Sunflower College of Engg. & Technology, Challapalli, Krishna (D), A.P - 521131

ABSTRACT

A multiply-and-accumulate (MAC) unit is composed of an adder, a multiplier and an accumulator. The adders are usually implemented as Carry-Select or Carry-Save adders, because speed is of utmost importance in DSP. The multiplier can be implemented, for example, as a parallel array multiplier. The inputs of the MAC are fetched from memory and fed to the multiplier block, which performs the multiplication and passes the result to the adder; the adder accumulates the result and stores it in a memory location. This entire process is to be completed in a single clock cycle. The MAC unit designed in this work consists of one 16-bit register, one 16-bit Modified Booth multiplier and a 32-bit accumulator. A Modified Booth multiplier is used to multiply A and B instead of a conventional multiplier because it increases the speed of the MAC unit and reduces the complexity of the multiplication. An SPST adder is used to add the partial products, and a register is used for accumulation. Each product A_i × B_i is fed back into the 32-bit accumulator and added to the next product A_i × B_i. The MAC unit can therefore multiply and add to the previous product consecutively over multiple iterations.

I. INTRODUCTION

Because FIR filters are used intensively in video and communication systems, high performance in speed, area and power consumption is demanded. Digital filters are used to modify the characteristics of signals in the time and frequency domains and are recognized as a primary digital signal processing operation.
In DSP, design methods have mainly focused on multiplier-based architectures to implement the multiply-and-accumulate (MAC) blocks that form the central part of FIR filters and several other functions. The FIR digital filter is given by

$$y[n] = \sum_{k=0}^{N-1} c[k]\, x[n-k] \tag{1}$$

where y[n] is the FIR filter output, x[n-k] is the input data, c[k] represents the filter coefficients and N is the number of coefficients. Equation (1) shows that multiplier-based filter implementations may become highly expensive in terms of area and speed, since every output sample requires N multiplications. This issue has been partially solved by the new generation of low-cost FPGAs that have embedded DSP blocks. The advantages of the FPGA approach to digital filter implementation include higher sampling rates than are available from traditional DSP chips, lower costs than an ASIC for moderate-volume applications, and more flexibility than the alternative approaches.

II. DISTRIBUTED ARITHMETIC (DA)

Distributed arithmetic (DA) is an important FPGA technology. It is extensively used in computing the sum of products

$$y = \sum_{n=0}^{N-1} c[n]\, x[n]$$

A DA system assumes that the variable x[n] is represented in B bits by

$$x[n] = \sum_{b=0}^{B-1} x_b[n]\, 2^b, \qquad x_b[n] \in \{0, 1\}$$

where x_b[n] denotes bit b of x[n]. If c[n] are the known coefficients of the FIR filter, then the output of the FIR filter in bit-level form is

$$y = \sum_{n=0}^{N-1} c[n] \sum_{b=0}^{B-1} x_b[n]\, 2^b$$

ISSN 2250-3005, August 2013, Page 78
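The bit-level sum of products can be checked numerically. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names (`fir_direct`, `build_lut`, `fir_da`) are hypothetical, the coefficients are integers, and the input samples are unsigned B-bit values. It relies on the observation that the inner sum over the coefficients depends on just one bit from each sample, so all 2^N possible inner sums can be precomputed in a table and the multipliers replaced by table reads, shifts and additions.

```python
def fir_direct(coeffs, window):
    """Multiplier-based reference: y = sum_n c[n] * x[n]."""
    return sum(c * x for c, x in zip(coeffs, window))


def build_lut(coeffs):
    """Precompute sum_n c[n] * bit_n for every N-bit address.

    Bit n of the address selects whether coefficient c[n] is included.
    """
    n_taps = len(coeffs)
    lut = []
    for address in range(1 << n_taps):
        lut.append(sum(c for n, c in enumerate(coeffs) if (address >> n) & 1))
    return lut


def fir_da(coeffs, window, num_bits):
    """Distributed-arithmetic evaluation for unsigned num_bits-wide samples.

    For each bit position b, the b-th bits of all samples form a LUT
    address; the LUT value is weighted by 2**b and accumulated.
    """
    lut = build_lut(coeffs)
    acc = 0
    for b in range(num_bits):
        address = 0
        for n, x in enumerate(window):
            address |= ((x >> b) & 1) << n
        acc += lut[address] << b  # shift instead of multiply
    return acc


if __name__ == "__main__":
    coeffs = [2, 3, 1]   # illustrative coefficients (assumption)
    window = [1, 3, 7]   # three unsigned 3-bit samples (assumption)
    print(fir_direct(coeffs, window), fir_da(coeffs, window, num_bits=3))
```

Both functions return the same value for any unsigned inputs that fit in `num_bits`; the table grows as 2^N, which is why large filters need the table to be split.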

In distributed arithmetic form, the order of the two summations is exchanged:

$$y = \sum_{b=0}^{B-1} 2^b \sum_{n=0}^{N-1} c[n]\, x_b[n] \tag{5.5}$$

where x_b[n] denotes bit b of the input x[n]. In Equation (5.5), the second summation term is realized as one LUT. The use of this LUT or ROM eliminates the multipliers [9]. For signed two's complement numbers, the output of the FIR filter can be computed as

$$y = -2^{B-1} f\big(c[n], x_{B-1}[n]\big) + \sum_{b=0}^{B-2} 2^b f\big(c[n], x_b[n]\big), \qquad f\big(c[n], x_b[n]\big) = \sum_{n=0}^{N-1} c[n]\, x_b[n]$$

where B represents the total number of bits used. Fig 4.1 shows the distributed architecture for the FIR filter and how it differs from the MAC architecture. When x[n] < 0, the binary representation of the input is [10]

$$x[n] = -2^{B-1} x_{B-1}[n] + \sum_{b=0}^{B-2} x_b[n]\, 2^b$$

and the output in distributed arithmetic form is

$$y = -2^{B-1} \sum_{n=0}^{N-1} c[n]\, x_{B-1}[n] + \sum_{b=0}^{B-2} 2^b \sum_{n=0}^{N-1} c[n]\, x_b[n]$$

If the number of coefficients N is too large to implement the full word with a single LUT (input LUT bit width = number of coefficients), then partial tables can be used and their results added, as shown in Fig 4.2. If pipeline registers are also added, this modification does not reduce the speed but can dramatically reduce the size of the design.

III. IMPLEMENTATION

High-level specifications are the requirements needed to understand and begin the design. In this stage the designer's main aim is to capture the behavior of the design, mostly using the behavioral constructs of the HDL. After the functionality of the design has been captured, the next step is to partition the design and write synthesizable code that infers the primitives available in the library. Here a mixed style of modeling is mostly used, and only synthesizable constructs of the HDL are used. Then comes the synthesis step, which is target driven; here the target device is an FPGA. Implementation is the process of placing and routing the design on the FPGA. It is mostly tool driven and requires no manual intervention by the designer, who only needs to specify a constraint file for the design, if any. The flow chart below shows this in brief.

Fig: Design Flow Chart

Fig: Bit File Burning

The final stage, after place and route has completed successfully and the mapping and other reports are satisfactory, is bit file generation. The bit file has to be downloaded onto the FPGA via a JTAG cable. The figure above shows a simple setup for this.

IV. TARGET DEVICE: Xilinx Spartan-3E XC3S100E

The figure below shows the board with all the peripherals connected to it. The FPGA used belongs to the Spartan-3E family, and the device is the XC3S100E.

Fig: Spartan kit

V. PROGRAMMING

After an FPGA design has been compiled successfully with the Xilinx development software, it can be downloaded using the iMPACT programming software and the USB cable. To begin programming, connect the USB cable to the starter kit board and apply power to the board. Then double-click Configure Device (iMPACT) from within Project Navigator, as shown in the figure.

Fig: Configure Device

If the board is connected properly, the iMPACT programming software automatically recognizes the three devices in the JTAG programming chain, as shown in Figure 5.7. If not already prompted, click the first device in the chain, the Spartan-3E FPGA, to highlight it. Right-click the FPGA and select Assign New Configuration File. Select the desired FPGA configuration file and click OK, as shown in Figure 5.7.

If the original FPGA configuration file used the default startup clock source, CCLK, iMPACT issues the warning message shown in Figure 5.8. This message can be safely ignored: when downloading via JTAG, the iMPACT software must change the startup clock source to the TCK JTAG clock.

Fig: Assigning Configuration File

In the figure above, the generated bitstream file must be assigned to the device: locate the path of the bit file on the system and then check the box.

Fig: iMPACT Issues a Warning if the Startup Clock Was Not CCLK

This warning is acknowledged and ignored so that the bit file can be downloaded to the FPGA.

Fig: Program FPGA

When the FPGA has been programmed successfully, the iMPACT software indicates success, as shown in the figure below. The FPGA application is now executing on the board, and the DONE pin LED lights up.

Fig: Successfully Programmed

VI. SIMULATION

Fig: Iteration 1

Fig: Iteration 2

Fig: Iteration 3

Fig: Wallace Tree

Fig: DA Algorithm

VII. CONCLUSION

The implemented DA algorithm consumes 0.10 W, compared with 0.30 W for a recent implementation based on a Wallace tree; the distributed algorithm therefore achieves a power saving of 0.20 W. The proposed method has been implemented for a 3-bit multiplier, and the results were obtained without any computation error.

In general, a multiplier uses Booth's algorithm and an array of full adders (FAs), or a Wallace tree instead of the array of FAs; that is, the multiplier mainly consists of three parts: a Booth encoder, a tree to compress the partial products (such as a Wallace tree), and a final adder. Because the Wallace tree adds the partial products from the encoder as much in parallel as possible, its operation time is proportional to O(log2 N), where N is the number of inputs. It uses the fact that counting the number of 1's among the inputs reduces the number of outputs to log2 N. In a real implementation, many counters are used to reduce the number of outputs in each pipeline step.
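The tree compression of partial products mentioned in the conclusion can be illustrated with a small sketch. It is not the design evaluated in this paper: the function names are hypothetical, the partial products are generated by plain AND-and-shift rather than by a Booth encoder, and the 3:2 counters (full adders) are applied to whole words at once, carry-save style. The sketch groups the rows in threes at each level until two rows remain and then performs one final addition, which is why the number of levels grows only slowly with the number of inputs.

```python
def compress_3_to_2(a, b, c):
    """Apply a full adder to every bit column of three non-negative words.

    Returns a sum word and a carry word whose total equals a + b + c.
    """
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def wallace_reduce(rows):
    """Reduce a list of words to at most two rows, then add them.

    Returns (total, levels), where levels counts the compression stages.
    """
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        next_rows = []
        full_groups = len(rows) // 3
        for g in range(full_groups):
            s, c = compress_3_to_2(*rows[3 * g:3 * g + 3])
            next_rows += [s, c]
        next_rows += rows[3 * full_groups:]  # leftover rows pass through
        rows = next_rows
        levels += 1
    return sum(rows), levels  # final carry-propagate addition


def multiply(a, b, width):
    """Unsigned multiply: AND-and-shift partial products, tree reduction."""
    partial_products = [(a << i) if (b >> i) & 1 else 0 for i in range(width)]
    return wallace_reduce(partial_products)


if __name__ == "__main__":
    product, levels = multiply(5, 7, width=3)  # 3-bit operands
    print(product, levels)
```

For 3-bit operands there are three partial products, so a single compression level is followed by the final adder.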

REFERENCES

[1] Fast Multiplication Algorithms and Implementation
[2] Implementation of High Speed FIR Filter using Serial and Parallel Distributed Arithmetic Algorithm
[3] FPGA Implementation of Digital FIR Filter
[4] L. Zhao, W. H. Bi, F. Liu, Design of digital FIR band pass filter using distributed algorithm based on FPGA, Electronic Measurement Technology, 2007, vol. 30
[5] P. Girard, O. Héron, S. Pravossoudovitch, and M. Renovell, Delay Fault Testing of Look-Up Tables in SRAM-Based FPGAs, Journal of Electronic Testing, 2005, vol. 21
[6] H. Chen, C. H. Xiong, S. N. Zhong, FPGA-based efficient programmable polyphase FIR filter, Journal of Beijing Institute of Technology, 2005, vol. 14
[7] Y. T. Xu, C. G. Wang, J. L. Wang, Hardware Implementation of FIR Filter Based on DA Algorithm, Journal of PLA University of Science and Technology, 2003, vol. 4
[8] D. Wu, Y. H. Wang, H. Z. Lu, Distributed Arithmetic and its Implementation in FPGA, Journal of National University of Defense Technology, 2000, vol. 22
[9] L. Wei, R. J. Yang, X. T. Cui, Design of FIR filter based on distributed arithmetic and its FPGA implementation, Chinese Journal of Scientific Instrument, 2008, vol. 29
[10] W. Zhu, G. M. Zhang, Z. M. Zhang, Design of FIR Filter Based on Distributed Algorithm with Parallel Structure, Journal of Electronic Measurement and Instrument, 2007, vol. 21
[11] W. Wang, M. N. S. Swamy, M. O. Ahmad, Novel Design and FPGA Implementation of DAFIR Filters, Journal of Systems and Computers, 2004, vol. 13
[12] M. Nagamatsu et al., A 15ns 32 x 32-bit CMOS Multiplier with an Improved Parallel Structure, Proc. CICC, pp. 10.3.1, 1989