An Efficient Viterbi Decoder Architecture

IOSR Journal of VLSI and Signal Processing (IOSR-JVSP), Volume 2, Issue 3 (May-Jun. 2013), pp. 46-50. e-ISSN: 2319-4200, p-ISSN: 2319-4197

Kalpana. R 1, Arulanantham. D 2, Dr. Marimuthu. C. N 3
1 PG Scholar, ECE, Nandha Engineering College, Anna University, Chennai, India
2 AP, ECE, Nandha Engineering College, Anna University, Chennai, India
3 Dean, ECE, Nandha Engineering College, Anna University, Chennai, India

Abstract: A breadth-first trellis decoding algorithm is introduced for application to sequence estimation in digital transmission. The decoding effort adapts to the prevailing noise conditions to yield a low average effort. This paper presents the design of a Viterbi decoder (VD) for trellis coded modulation (TCM) that corrects errors in noisy channels while reducing power dissipation at a small cost in decoding speed. The VD is used to decode data in digital communication and storage devices without any performance loss. A Viterbi decoder uses the Viterbi algorithm to decode a bit stream that has been encoded using forward error correction based on a convolutional code. Other algorithms exist for decoding a convolutionally encoded stream, including punctured codes. The advantage of the VD lies in error correction for digital communication. The pre-computation VD could reduce the power consumption by 70% with only an 11% reduction of the maximum decoding speed.

Keywords: Convolutional codes, Trellis coded modulation (TCM), Threshold, Viterbi decoder, VLSI.

I. Introduction
The Viterbi algorithm is a well-known maximum-likelihood algorithm for decoding convolutional codes. The approach is suitable for both hardware and software implementation. Convolutional codes are used to correct errors in noisy channels. Redundant information is added to the user's data, and errors are then corrected using this information; this is described in Section II.
Trellis coded modulation (TCM) is a bandwidth-efficient scheme that uses a high-rate convolutional code. The Viterbi decoder in a TCM decoder works with a code of moderate constraint length. TCM is used in deep-space communication, and the power of its decoder can be reduced by lowering the supply voltage, with or without overscaling. The encoder generates two output codeword bits as a function of the input information bits and the encoder state. The output bits are transmitted over a noisy channel. The evolution of the encoder state and the decoding process can be represented by a trellis; these are discussed in Section III.

The input to the branch metric unit is the received channel signal, which is convolutionally coded so that errors caused by noise can be corrected. The convolutional encoder is built from a shift register and handles serial and parallel states with scarce state transitions. When a punctured code is used, the decoder should ignore the erased bits in the branch metric calculations. The branch metric unit is discussed in Section IV-4.1.

The bit error rate in the metrics passed to the ACSU is estimated from the inherent drift between the path metrics. The clock speed of the VD is limited by its critical path, which is handled by synchronizing clock cycles. The set or reset of a data value carries information to the input of the next stage. The branch metrics (BMs) are fed into the add-compare-select unit (ACSU), which recursively computes the path metrics (PMs) and outputs decision bits for each possible state transition. The decision bits are then stored in and retrieved from the survivor memory unit (SMU) in order to decode the source bits along the final survivor path. The PMs of the current iteration are stored in the path metric unit (PMU). The ACSU loop for calculating the optimal PM and the puncturing states is discussed in Section IV-4.2.

The adaptive decoder dynamically discards states with high path metrics during the decoding process. A scarce state transition scheme is used for multimedia mobile communication.
The scheme employs a simple pre-decoder followed by a pre-encoder to minimize signal transitions at the input of the convolutional decoder, which leads to lower dynamic power dissipation. The decision bits pass through the SMU, and the pre-encoding step converts the result into the decoded output. The register exchange method and the trace back method for implementing the SMU, which differ in how data is moved in the memories every cycle, are discussed in Section IV-4.3.

II. Viterbi Decoder
A Viterbi decoder uses the Viterbi algorithm to decode a bit stream that has been encoded using forward error correction based on a convolutional code. A conventional Viterbi decoder contains three major parts. The first is a branch metric unit (BMU), which calculates the branch metrics. The second is an add-compare-select unit (ACSU), which recursively accumulates the branch metrics as the path metrics (PMs), compares the incoming path metrics, selects the most likely state transition for each state of the trellis, and generates the corresponding decision bits [1].

The third part is a survivor memory unit (SMU), which stores the decision bits and helps to generate the decoded output. Among these three units, the ACSU and SMU consume most of the power of the decoder. There are two known methods for implementing the SMU: the register exchange (RE) method and the trace back (TB) method. In general, RE has the advantage of high speed, low latency, and simple control, but it consumes more power than the TB mechanism since it needs to move the data among the memories in every cycle. Therefore, the TB mechanism is commonly used for the implementation of the SMU [3].

III. Viterbi Decoder Algorithm
The convolutional encoder adds redundancy to the input signal, and the encoded output symbols are transmitted over a noisy channel. The input to the convolutional decoder, that is, to the Viterbi decoder, is the encoded symbols contaminated by noise. The decoder tries to extract the original information from the received sequence and generates an estimate. The path chosen in the trellis for the channel output r is the one that maximizes the log-likelihood function ln[Pr(r | x_m)]. This function serves as the metric of the decoder and is accumulated over the branches of the trellis. Finding the trellis path with the largest metric corresponds to maximum-likelihood decoding. For a hard-decision decoder, maximizing this function is equivalent to minimizing the Hamming distance between the trellis codeword and the received sequence [3].

3.1 BMU Calculation
The method of branch metric calculation differs between hard-decision and soft-decision decoders. For a hard-decision decoder, the branch metric is the Hamming distance between the received pair of bits and the ideal pair, so a branch metric can take the values 0, 1, and 2. Thus, for every input pair there are 4 branch metrics, one for each ideal pair. For a soft-decision decoder, the branch metric is measured using the Euclidean distance, where x is the first received bit in the pair, y is the second, and x_0 and y_0 are the ideal values.
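The two branch metric definitions in Section 3.1 can be illustrated with a minimal Python sketch. The function names and the bipolar ideal values (+1, -1) are assumptions made for illustration only; the paper does not specify a signal mapping, and the squared form of the Euclidean distance is used here.

```python
from itertools import product

def hamming_branch_metrics(received_pair):
    """Hard decision: Hamming distance from the received pair to each ideal pair."""
    return {
        ideal: sum(r != i for r, i in zip(received_pair, ideal))
        for ideal in product((0, 1), repeat=2)
    }

def euclidean_branch_metric(x, y, x0, y0):
    """Soft decision: squared Euclidean distance to the ideal pair (x0, y0)."""
    return (x - x0) ** 2 + (y - y0) ** 2

# Received hard bits (1, 0): one metric per ideal pair, each in {0, 1, 2}.
hard_metrics = hamming_branch_metrics((1, 0))

# Received soft values (0.8, -1.1) against the assumed ideal pair (+1, -1).
soft_metric = euclidean_branch_metric(0.8, -1.1, 1.0, -1.0)
```

For the hard-decision case, the dictionary holds the 4 branch metrics for one received pair, and the ideal pair that matches the received bits gets metric 0.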
For the soft-decision decoder, the branch metric is given by Equation (1):

M_b = (x - x_0)^2 + (y - y_0)^2    (1)

M_b = (x^2 - 2 x x_0 + x_0^2) + (y^2 - 2 y y_0 + y_0^2)    (2)

M_b^* = M_b - x^2 - y^2 = (x_0^2 - 2 x x_0) + (y_0^2 - 2 y y_0)    (3)

Note that the second form, M_b^*, can be calculated without hardware multiplication: x_0^2 and y_0^2 can be pre-calculated, and the multiplications of x by x_0 and y by y_0 in Equations (2) and (3) are multiplications by known constants. Subtracting x^2 + y^2 does not change which branch has the smallest metric, because that term is the same for every branch of a given received pair. M_b^* is a signed variable and is calculated in 2's complement format [3].

3.2 ACSU Calculation
The ACSU takes the branch metrics, including those of punctured codes, and computes the transition metrics for the serial and parallel state transitions. The 64 states and PMs are labeled from 0 to 63. The minimum value of each BM group can be calculated in the BMU, the threshold generator calculates (PM_opt + T), and the synchronized speed is determined through the ACSU [1].

PM_opt(n) = min{ min(cluster0(n-2)) + min(BMG0(n-1)),
                 min(cluster1(n-2)) + min(BMG1(n-1)),
                 min(cluster2(n-2)) + min(BMG2(n-1)),
                 min(cluster3(n-2)) + min(BMG3(n-1)) }

3.3 Path Metric Calculation
Path metrics are calculated using a procedure called ACS (add-compare-select). This procedure is repeated for every encoder state. Because the path metrics keep growing as decoding proceeds, they can eventually overflow. There are two ways of dealing with this problem. Since the absolute values of the path metrics do not actually matter, an identical value can be subtracted from the metric of every path at any time. This is usually done when all path metrics exceed a chosen threshold. This method is simple, but not very efficient when implemented in hardware. The second approach allows overflow, but uses a sufficient number of bits to be able to detect whether the overflow took place or not. In the compare and select steps there are two paths ending in a given state, and the one with the larger metric is dropped. As there are 2^(K-1) encoder states, there are 2^(K-1) survivor paths at any given time [7].

3.4 SMU Calculation
The decoded data is obtained with a 64-to-6 priority encoder built from three 4-to-2 priority encoders. The encoders determine index[1:0]. The MUX selects one group of flags based on index[1:0], and the input of the priority encoder at level 2 can be computed from the output of the MUX by OR operations [1].

IV. Low Power High Speed Viterbi Decoder Design

4.1 BMU Design
The BM module generates the branch metrics for the ACS module from the received channel symbols. It provides the branch information for branch metric computation according to the selected constraint length. The channel symbols entering the BMU are convolutionally coded. The incoming channel is noisy, and the convolutional code (CC) is used to correct the resulting errors. The number of rows in a puncturing matrix is equal to the number of output polynomials. The elements of this matrix are 1s and 0s, where a 0 marks an output bit that is deleted. The puncturing matrix is applied to the output stream, and the BMU takes the puncturing into account so that the remaining errors can still be corrected [7]. The serial bit stream in the BMU is handled by shift registers. The outputs of the register cells are combined according to the polynomials, whose length is set by the constraint length [10].

4.2 ACSU Design
The ACSU is designed so that it can be operated at a reduced supply voltage. The ACSU loop calculates the optimal PM and the puncturing states. To evaluate the effect of process variations on the BER, the delay distributions of the various metrics in the ACSU are examined. Rotating synchronization is used to bring the VD up to the synchronizing clock frequency. The enable signal is activated for one clock period at the end of the frame clock pulses for synchronization [1]. For each state in the trellis of the Viterbi decoder, the current path metric is obtained from the current branch metrics and the path metrics of the previous states that lead to the current state, by executing addition, comparison, and selection operations. The design aims to speed up this module.
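The add-compare-select recursion and the subtract-the-minimum normalization described in Section 3.3 can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the paper's hardware design: the 4-state trellis, the predecessor table, and the toy branch metric values are all hypothetical.

```python
def acs_step(prev_pm, predecessors, branch_metric):
    """One add-compare-select iteration over all trellis states.

    prev_pm: path metrics of the previous trellis stage, indexed by state.
    predecessors: predecessors[s] holds the two states that lead to state s.
    branch_metric: dict mapping (previous_state, state) to a branch metric.
    Returns (new_pm, decisions); decisions[s] is the index (0 or 1) of the
    surviving predecessor of state s.
    """
    new_pm, decisions = [], []
    for state, preds in enumerate(predecessors):
        # Add: extend both incoming paths by their branch metrics.
        candidates = [prev_pm[p] + branch_metric[(p, state)] for p in preds]
        # Compare and select: keep the path with the smaller metric.
        decision = 0 if candidates[0] <= candidates[1] else 1
        new_pm.append(candidates[decision])
        decisions.append(decision)
    return new_pm, decisions

def normalize(path_metrics):
    """Subtract the same value (the minimum) from every path metric."""
    smallest = min(path_metrics)
    return [pm - smallest for pm in path_metrics]

# Hypothetical 4-state trellis: state s is reached from states 2*(s & 1)
# and 2*(s & 1) + 1.
PREDECESSORS = [(0, 1), (2, 3), (0, 1), (2, 3)]
# Toy branch metrics chosen only to exercise the code.
TOY_BM = {(0, 0): 0, (1, 0): 1, (2, 1): 0, (3, 1): 1,
          (0, 2): 2, (1, 2): 0, (2, 3): 2, (3, 3): 0}

new_pm, decisions = acs_step([0, 3, 6, 1], PREDECESSORS, TOY_BM)
normalized = normalize([5, 7, 9, 6])
```

Because only the differences between path metrics affect the compare step, normalizing before or after an ACS iteration leaves the decisions unchanged.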
The first architecture is based on permitting limited decision errors in order to decrease the critical path delay of the ACSU so that it can be operated at a reduced supply voltage [3].

4.3 SMU Design
The two methods are the register exchange (RE) method and the trace back (TB) method. In general, RE has the advantage of high speed, low latency, and simple control, but it consumes more power than the TB mechanism since it needs to move the data among the memories in every cycle [1]. After the addition, the best branch metrics are used to select the decision bits, from which the encoder input is recovered. The decoded output bits come from the SMU and are synchronized to the clock pulses. A register is assigned to each state, and the length of a register is equal to the frame length in the decoder. The corrected output sequence is produced by tracing the decision vectors. The trace back module is used to decide the final output. A Viterbi decoder using the trace back approach saves about 68.8% of power compared with a conventional Viterbi decoder [3].

Table 1. Cell area, power dissipation, and speed comparison of the proposed method

Type                              Cell area (mm^2)   Power dissipation (mW)   Speed of operation (MHz)
Full-trellis VD                   0.58               1.473 (100%)             505
Conventional T-algorithm          0.685              0.91                     3
VD with 2-step pre-computation    0.68               0.069                    446.4
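The trace back method of Section 4.3 can be illustrated with a short Python sketch. It assumes a hypothetical 4-state trellis in which the most significant bit of a state equals the input bit that led into it; the predecessor table, the stored decision bits, and the function name are illustrative and are not taken from the paper.

```python
def trace_back(decisions, predecessors, end_state):
    """Follow stored decision bits backwards to recover the survivor path.

    decisions[t][s] is the decision bit stored at trellis stage t for state s,
    that is, the index of the surviving predecessor of s.
    Returns the visited states ordered from the first stage to the last.
    """
    state = end_state
    path = [state]
    for stage in reversed(decisions):
        state = predecessors[state][stage[state]]
        path.append(state)
    path.reverse()
    return path

# Hypothetical 4-state trellis: state s is reached from states 2*(s & 1)
# and 2*(s & 1) + 1, and its most significant bit is the input bit.
PREDECESSORS = [(0, 1), (2, 3), (0, 1), (2, 3)]

# Decision bits stored by the ACSU for three trellis stages (toy values).
stored_decisions = [[0, 0, 0, 0], [0, 1, 0, 1], [1, 0, 0, 1]]

survivor_path = trace_back(stored_decisions, PREDECESSORS, end_state=1)
decoded_bits = [state >> 1 for state in survivor_path[1:]]
```

In this model only the decision bits are stored and a single pointer walks backwards, whereas a register exchange design would copy whole survivor sequences between registers in every cycle, which is the data movement the text identifies as the source of its higher power consumption.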

V. Simulation Results
The full-trellis VD is described in VHDL, and its input and output data are simulated using Xilinx version 9.2. The serial state transitions and the punctured polynomials in the branch metric calculations are shown in Fig. 1 and Fig. 2. The BER evaluation in the ACSU, where the incoming branch metrics are normalized and passed through the ACSU to select the decision bits among the different path metrics, is shown in Fig. 3. The synchronization of the clock speed with the encoding error rate is shown in Fig. 4. The ACSU chooses the decision bits from the different paths, and the final decoded output of the Viterbi decoder comes from the SMU.

Fig. 1 Serial bits in BMU
Fig. 2 Punctured codes
Fig. 3 VD BER threshold in ACSU. The bit error rate in the ACSU is calculated from the matrices of the BER threshold.
Fig. 4 Rotate synchronize clock speed in path metrics to ACSU

VI. Conclusion
A design starts with the development of a behavioral VHDL description. If the target throughput is moderately high, the proposed architecture can operate at a lower supply voltage, which leads to a quadratic power reduction compared with the conventional scheme. ASIC synthesis and power estimation results show that, compared with the full-trellis VD without a low-power scheme, the pre-computation VD could reduce the power consumption by 70% with only an 11% reduction of the maximum decoding speed. The use of error-correcting codes has proven to be an effective way to overcome data corruption in digital communication channels. The VD reduces the errors in the channel symbols, so that future designs need to correct fewer coding error bits. The aim is to avoid overscaling and thereby limit the loss of speed and the noise in the data. In trellis coded modulation, the best metrics of the encoded bit streams are obtained after the noise has been filtered out. The decision logic for the decision bits avoids unencoded puncturing states, which improves execution speed at low supply power dissipation.

References
[1]. Jinjin He, Huaping Liu, Zhongfeng Wang, Xinming Huang, and Kai Zhang, "High-Speed Low-Power Viterbi Decoder Design for TCM Decoders," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, Vol. 20, No. 4, April 2012.
[2]. J. He, Z. Wang, and H. Liu, "An efficient 4-D 8PSK TCM decoder architecture," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., Vol. 18, No. 5, pp. 808-817, May 2010.
[3]. "A Low Power and High Speed Viterbi Decoder Based on Deep Pipelined, Clock Blocking and Hazards Filtering," Int. J. Communications, Networks and Sciences, 2009, 6, 575-582.
[4]. Rami A. Abdullah and Naresh R. Shanbhag, "Error-Resilient Low-Power Viterbi Decoder Architectures," December 2009.
[5]. Charan Langton, "Trellis Coded Modulation (TCM)," Intuitive Guide to Principles of Communications, 2004.
[6]. "Bandwidth-efficient modulations," Consultative Committee for Space Data Systems, Matera, Italy, CCSDS 40(3.3.6) Green Book, Issue 1, Apr. 2003.
[7]. R. Henning and C. Chakrabarti, "Low power approach to decoding convolutional codes with adaptive Viterbi algorithm approximations," in Proceedings, IEEE/ACM International Symposium on Low Power Electronics and Design, Monterey, CA, Aug. 2002, pp. 68-71.
[8]. F. Chan and D. Haccoun, "Adaptive Viterbi decoding of convolutional codes over memoryless channels," IEEE Transactions on Communications, Vol. 45, No. 11, pp. 1389-1400, Nov. 1997.
[9]. J. B. Anderson and E. Offer, "Reduced-state sequence detection with convolutional codes," IEEE Trans. Inf. Theory, Vol. 40, No. 3, pp. 965-972, May 1994.
[10]. G. Fettweis and H. Meyr, "High-speed parallel Viterbi decoding algorithm and VLSI-architecture," IEEE Communications Magazine, Vol. 29, No. 5, pp. 46-55, May 1991.
[11]. Stanley J. Simmons, "Breadth First Trellis Decoding with Adaptive Effort," IEEE Trans. on Communications, Vol. 38, No. 1, January 1990.
[12]. S. J. Simmons, "Breadth-first trellis decoding of channel convolutional codes," presented at the Princeton Conf. Inform. Sci. Syst., Princeton, Mar. 1986.