Hardware Implementation of Viterbi Decoder for Wireless Applications


Bhupendra Singh 1, Sanjeev Agarwal 2 and Tarun Varma 3
Deptt. of Electronics and Communication Engineering
1 Amity School of Engineering and Technology, Noida, India. Email: 1 bsingh.tech@gmail.com
2,3 Malaviya National Institute of Technology, Jaipur, India. Email: 2 san@mnit.ac.in, 3 tarun.varma.jaipur@gmail.com

Abstract In 2G mobile terminals, the Viterbi decoder (VD) accounts for approximately one third of the power consumption of a baseband mobile transceiver. Thus, in 3G mobile systems, it is essential to reduce the power consumption of the VD. In this report the register exchange (RE) method, adopting a pointer concept, is used to implement the survivor memory unit (SMU) of the VD. For the implementation, the MemoryLess Viterbi Decoder (MLVD) is synthesized with Synopsys Design Compiler using the UMC 180 nm library.

Index Terms Viterbi Decoder, SMU, ACSU, RE, MLVD

I. INTRODUCTION

The register exchange (RE) method, adopting a pointer concept, is used to implement the survivor memory unit (SMU) of the VD. The method entails assigning a pointer to each register or memory location. The content of the pointer, which points to one register, is altered to point to a second register, instead of copying the contents of the first register to the second. When the pointer concept is applied to the RE's SMU implementation [2], there is no need to copy the contents of the SMU and rewrite them, but one row of memory is still needed for each state of the VD. Thus, the VDs in CDMA systems require only 256 rows of memory, which reduces the VD's power consumption.

Also, if the initial state of the convolutional encoder is known, the entire SMU is reduced to only one row. Because the decoded data is generated in the required order, even this row of memory is dispensable. This zero-memory architecture is called the MemoryLess Viterbi Decoder (MLVD) [6], and it reduces power consumption.

Another problem of the VD addressed in this report is the Add Compare Select Unit (ACSU), which is composed of 128 butterfly ACS modules. The ACSU's high parallelism has previously been addressed by using a bit serial implementation. The 8-bit First In First Out (FIFO) register, needed for the storage of each path metric (PM), is at the heart of the single bit serial ACS butterfly module. A new, simply controlled shift register is designed at the circuit level and integrated into the ACS module.

A. Structure of Viterbi Decoder

In terms of implementation, the VD consists of four functional blocks: the branch metric unit (BMU), the add-compare-select unit (ACSU), the feedback unit (FBU) and the survivor memory unit (SMU).

Fig.1 Functional Block of VD

The BMU calculates the branch metric of each branch according to the maximum likelihood of the received data. The ACSU adds the branch and path metrics, then compares the sums and selects the survivor path metric and the decision bit. The FBU stores the survivor path metric for the ACSU to use in the next cycle. The SMU produces the decoded data based on the decision bit and the survivor path metric. The SMU, marked by boldfaced letters in Fig. 1, significantly influences latency, power and chip area in a VD.

B. Viterbi Decoding Algorithms

In 1967, Viterbi developed the Viterbi Algorithm (VA) as a method to decode convolutional codes [1]. The VA [4] uses the trellis diagram to decode an input sequence; Fig. 2 shows the VA with a hard decision format. A node is assigned to each state for each time stage. The transition between two states is represented by a branch, which is assigned a weight, referred to as a branch metric (BM). The BM is a measure of the likelihood of the transition, given the noisy observations. The BMs that are accumulated along a path form a path metric (PM).
For the two branches entering the same state, the branch with the smaller PM survives, and the other one is discarded. Two methods can then be used to extract the decoded bits: trace back (TB) or register exchange (RE) [3].
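The add-compare-select recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' hardware design: it assumes a toy K = 3, rate 1/2 code with generators (7, 5) octal instead of the K = 9 code of this report, a hard-decision (Hamming distance) branch metric, and an encoder that starts in S0. The names step, encode, branch_metric and acs_stage are hypothetical.

```python
K = 3                      # toy constraint length (assumption, not the K = 9 of this report)
GENS = (0b111, 0b101)      # generator polynomials (7, 5) octal (assumption)
N_STATES = 1 << (K - 1)
INF = float("inf")

def step(state, bit):
    """Return (next_state, output_bits) for one encoder transition."""
    reg = (bit << (K - 1)) | state            # new bit enters at the MSB
    out = tuple(bin(reg & g).count("1") & 1 for g in GENS)
    return reg >> 1, out

def encode(bits):
    """Encode a bit list, starting from state S0."""
    state, symbols = 0, []
    for b in bits:
        state, sym = step(state, b)
        symbols.append(sym)
    return symbols

def branch_metric(received, expected):
    """Hard-decision BM: Hamming distance between two symbols."""
    return sum(r != e for r, e in zip(received, expected))

def acs_stage(pm, received):
    """One trellis stage: add BMs to PMs, compare, select the survivor.

    Returns the new PMs and, for each state, the surviving
    (previous_state, input_bit) pair (None if the state is unreachable).
    """
    new_pm = [INF] * N_STATES
    survivor = [None] * N_STATES
    for s in range(N_STATES):
        if pm[s] == INF:
            continue                           # state not reachable yet
        for bit in (0, 1):
            nxt, expected = step(s, bit)
            cand = pm[s] + branch_metric(received, expected)   # add
            if cand < new_pm[nxt]:                             # compare
                new_pm[nxt] = cand                             # select
                survivor[nxt] = (s, bit)
    return new_pm, survivor

# Noise-free usage: the PM of the true final state stays at zero.
message = [1, 0, 1, 1, 0, 0]
pm = [0] + [INF] * (N_STATES - 1)
for sym in encode(message):
    pm, survivor = acs_stage(pm, sym)
```

With a soft-decision input, as in the specification of this design, only branch_metric would change; the add, compare and select steps keep the same shape.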

C. Trace Back (TB) Method

At the last stage of the trellis diagram in Fig. 2, the TB method extracts the decoded bits, beginning from the state with the minimum PM, S0. Starting from this state and tracing backwards in time along the survivor path that originally contributed to the current PM, a unique path is identified.

Fig.2 Trace Back (TB) Viterbi Decoding

D. Register Exchange (RE) Method

In the RE approach, a register is assigned to each state. The register records the decoded output sequence along the path from the initial state to the final state, as depicted in Fig. 3. At the last stage, the decoded output sequence is the one stored in the survivor path register, that is, the register assigned to the state with the minimum PM.

Fig.3 Register Exchange (RE) Method

II. IMPLEMENTATION

A. Viterbi Decoder Implementation

The operation of the Viterbi decoder is outlined by the flow chart in Fig. 5. It is described in more detail through the micro architecture of the hardware: the Next State ROM, BMU block, ACS block, trace-back block and decode-data block are introduced one by one, as shown in Fig. 6. The Finite State Machine (FSM) of our Viterbi decoder comprises 5 states and 11 possible conditions, as shown in Fig. 4.

Fig.4 The Finite State Machine of Viterbi decoder

Fig.5 Flow Chart of the Viterbi decoder
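To make the difference between the TB and RE extraction methods described above concrete, the following Python sketch decodes the same received sequence both ways. It is a software illustration under stated assumptions, not the hardware of this report: a toy K = 3, rate 1/2 code with generators (7, 5) octal, hard decisions, and a known start state S0. All function names are hypothetical.

```python
K = 3                      # toy constraint length (assumption)
GENS = (0b111, 0b101)      # generators (7, 5) octal (assumption)
N_STATES = 1 << (K - 1)
INF = float("inf")

def step(state, bit):
    reg = (bit << (K - 1)) | state
    out = tuple(bin(reg & g).count("1") & 1 for g in GENS)
    return reg >> 1, out

def encode(bits):
    state, symbols = 0, []
    for b in bits:
        state, sym = step(state, b)
        symbols.append(sym)
    return symbols

def acs(pm, received):
    """Return new PMs and the surviving (prev_state, bit) per state."""
    new_pm, surv = [INF] * N_STATES, [None] * N_STATES
    for s in range(N_STATES):
        if pm[s] == INF:
            continue
        for bit in (0, 1):
            nxt, expected = step(s, bit)
            cand = pm[s] + sum(r != e for r, e in zip(received, expected))
            if cand < new_pm[nxt]:
                new_pm[nxt], surv[nxt] = cand, (s, bit)
    return new_pm, surv

def decode_traceback(symbols):
    """TB: store every stage's decisions, then walk back from the min-PM state."""
    pm, history = [0] + [INF] * (N_STATES - 1), []
    for sym in symbols:
        pm, surv = acs(pm, sym)
        history.append(surv)
    state, bits = pm.index(min(pm)), []
    for surv in reversed(history):
        state, bit = surv[state]
        bits.append(bit)
    return bits[::-1]

def decode_register_exchange(symbols):
    """RE: each state carries a register holding its decoded sequence so far."""
    pm = [0] + [INF] * (N_STATES - 1)
    regs = [[] for _ in range(N_STATES)]
    for sym in symbols:
        pm, surv = acs(pm, sym)
        new_regs = []
        for s in range(N_STATES):
            if surv[s] is None:
                new_regs.append([])
            else:
                prev, bit = surv[s]
                new_regs.append(regs[prev] + [bit])   # copy survivor register, append bit
        regs = new_regs
    return regs[pm.index(min(pm))]

# Usage: one channel bit error in the third symbol is corrected by both methods.
message = [1, 0, 1, 1, 0, 0, 1, 0]
received = encode(message)
received[2] = (received[2][0] ^ 1, received[2][1])
tb_bits = decode_traceback(received)
re_bits = decode_register_exchange(received)
```

The copy in decode_register_exchange (regs[prev] + [bit]) is the data movement that the pointer concept of this report is meant to avoid; the sketch shows only the conventional RE behaviour.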

Fig.6 Function allocations in micro architecture of VD

When the reset signal is set high, the Viterbi decoder initializes the Next State ROM. The architecture of the Next State ROM is shown in Fig. 7.

Fig.7 The Next State ROM hardware implementation

Fig. 8 shows the ACS module of the Viterbi decoder.

Fig.8 The ACS Module

The overall operation of the decoder can be summarized as follows:
1) The BM Unit (BMU) calculates the BMs;
2) The Add Compare Select Unit (ACSU) adds the BMs to the corresponding PMs, compares the new PMs, and then stores the selected PMs in the Path Metric Memory (PMM); at the same time, the ACSU stores the associated survivor path decisions in the Survivor Memory Unit (SMU);
3) The SMU stores the survivor path decisions; the TB mechanism is then applied to the SMU.

B. Architecture of MLVD

The diagram in Fig. 9 shows the RE approach with pointer implementation (the upper register carries the pointer and the lower register carries the decoded bits) [5]. The first row of memory records the decoded data if an initial state S0 is assumed, the last row records the decoded data if an initial state S255 is assumed, and so on. At the end of the decoding process, the row with the lowest PM is chosen as the decoder output.

Fig.9 New RE approach with pointer implementation.
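As a rough software analogue of the Next State ROM initialized at reset, the table below maps every (state, input bit) pair to its next state for the K = 9 code of Table I, and groups the 256 states into the 128 ACS butterflies mentioned in the introduction. The shift direction (new bit entering at the MSB of the state) and the names build_next_state_rom and butterflies are assumptions made for illustration; the actual ROM layout of the design is not given in the text.

```python
K = 9                          # constraint length from Table I
N_STATES = 1 << (K - 1)        # 256 states

def build_next_state_rom(k=K):
    """rom[state][bit] -> next state (shift direction is an assumption)."""
    n_states = 1 << (k - 1)
    return [[(bit << (k - 2)) | (s >> 1) for bit in (0, 1)]
            for s in range(n_states)]

def butterflies(rom):
    """Group states that share the same pair of successor states."""
    groups = {}
    for state, successors in enumerate(rom):
        groups.setdefault(tuple(successors), []).append(state)
    return groups

rom = build_next_state_rom()
groups = butterflies(rom)
# Under this convention, states 2j and 2j+1 both lead to states j and j+128,
# so each butterfly connects two source states to two destination states.
```

Each butterfly can be served by one ACS module, which is consistent with an ACSU built from 128 butterfly modules for 256 states.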

If the initial state is known, there is no need to store any row other than the one associated with that state. The new VD implementation is called the MemoryLess Viterbi Decoder (MLVD) [6].

Fig.10 MLVD approach with pointer implementation

The MLVD is an extremely low power design for a VD. Its only restriction is that the encoder register must be reset after every L encoded data bits, with the necessary synchronization provided. The block diagram of the MLVD, designed in VHDL, is shown in Fig. 11 [6].

Fig.11 MLVD block diagram

Table I VD Specification
Constraint Length        K = 9
Coding Rate              r = 1/3
Generator Polynomials    G0 = 557, G1 = 633, G2 = 711
Decision Level           3-bit Soft Decision
Path Metric              8-Module Arithmetic
Target Speed             2 Mbps

III. SIMULATIONS AND RESULTS

To estimate power, cost values for the MLVD operations are provided in Table II. It shows that the ACSU block consumes the most power and also occupies the largest area. For the hardware implementation, we follow an ASIC flow and synthesize the design using Synopsys Design Compiler. Area and power reports were generated to summarize the hardware resources required by the decoder.

Fig.12 SNR vs BER for MLVD

Fig.13 BER For Different Rates 3/4, 2/3 and 1/2

In order to have a built-in self-test design, a Linear Feedback Shift Register (LFSR) and a comparator are added. The LFSR produces the random input for the encoder, whereas the comparator compares the delayed version of the LFSR output with the output of the MLVD.
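The built-in self-test arrangement described above (an LFSR as stimulus and a comparator working on a delayed copy of the LFSR output) can be mimicked in software as follows. This is a sketch under assumptions: the text does not give the LFSR length or taps, so a 4-bit LFSR with polynomial x^4 + x^3 + 1 is used purely for illustration, and the encoder, channel and MLVD chain is replaced by a hypothetical device under test that simply delays its input by a fixed latency.

```python
from collections import deque

def lfsr_bits(seed=0b0001, n_bits=4, taps=(3, 2)):
    """Fibonacci LFSR (illustrative x^4 + x^3 + 1); yields one bit per clock."""
    state, mask = seed, (1 << n_bits) - 1
    while True:
        out = (state >> (n_bits - 1)) & 1          # output the MSB
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) & mask) | feedback
        yield out

class DelayLine:
    """Fixed-latency shift register; used for the reference path."""
    def __init__(self, latency):
        self.buf = deque([0] * latency)
    def push(self, bit):
        self.buf.append(bit)
        return self.buf.popleft()

def run_bist(dut, latency, n_clocks):
    """Count mismatches between the DUT output and the delayed LFSR bits."""
    source, reference, errors = lfsr_bits(), DelayLine(latency), 0
    for clk in range(n_clocks):
        bit = next(source)
        decoded = dut(bit)                  # stand-in for encoder + MLVD
        expected = reference.push(bit)      # delayed version of the LFSR bit
        if clk >= latency and decoded != expected:
            errors += 1                     # comparator flags a mismatch
    return errors

# Usage: a fault-free stand-in DUT passes, an inverting one fails every check.
good_dut = DelayLine(5)
errors_good = run_bist(good_dut.push, latency=5, n_clocks=100)
bad_dut = DelayLine(5)
errors_bad = run_bist(lambda b: 1 ^ bad_dut.push(b), latency=5, n_clocks=100)
```

In a real test bench the latency argument would have to match the decoding delay of the encoder and decoder pair, which is why the comparator works on a delayed version of the LFSR sequence.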

Table II Synthesis of MLVD using Synopsys Design Compiler

CONCLUSIONS

Among the several architectures available to realize the VD, the RE method with the pointer concept is conceptually the simplest, fastest, and most commonly used in VDs with only small values of k. By enforcing the initial state of the convolutional encoder and synchronizing the VD with the resetting procedure, the MLVD, the design with the highest power reduction, is realized. The new MLVD is a memoryless, high speed, low latency, and low power variant of the VD, with an approximate BER of 10^-5 at an SNR of 5 dB. The MLVD, along with a convolutional encoder, is implemented on a Xilinx FPGA using Xilinx ISE to demonstrate both the design's functionality and the feasibility of implementation. Design synthesis results show the hardware utilization and power consumption of the resources on the FPGA and the chip.

REFERENCES

[1] A. J. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," IEEE Transactions on Information Theory, vol. IT-13, no. 2, pp. 260-269, April 1967.
[2] Dalia A. El-Dib and M. I. Elmasry, "Modified Register-Exchange Viterbi Decoder for Low-Power Wireless Communications," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 2, pp. 371-378, February 2004.
[3] S. B. Wicker, "Error Control Systems for Digital Communication and Storage," Prentice Hall, 1995.
[4] G. D. Forney, "The Viterbi algorithm," Proceedings of the IEEE, vol. 61, no. 3, pp. 268-278, March 1973.
[5] Dalia A. El-Dib and M. I. Elmasry, "Low power register-exchange Viterbi decoder for high speed wireless communications," IEEE International Symposium on Circuits and Systems, May 2002, pp. 737-740.
[6] S A. El-Dib and M. I. Elmasry, "Memoryless Viterbi Decoder: an extremely low power Viterbi Decoder," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 3, pp. 371-378, February 2004.