Design of Polar List Decoder using 2-Bit SC Decoding Algorithm V Priya 1 M Parimaladevi 2


IJSRD - International Journal for Scientific Research & Development, Vol. 3, Issue 03, 2015, ISSN (online): 2321-0613

V Priya 1, M Parimaladevi 2
1 Master of Engineering, 2 Assistant Professor
1,2 Department of Electronics and Communication Engineering
1,2 Velalar College of Engineering and Technology, Erode

Abstract
A number of decoder architecture techniques have been introduced for wireless communication channels. These techniques aim to improve latency, area and complexity. This paper presents an FPGA-based list decoder architecture that improves the latency, area and throughput of the system. A 2-bit Successive Cancellation (SC) decoding algorithm is used in the polar list decoder architecture, and the use of a CRC in the list decoder increases the throughput. The architecture is implemented on a Xilinx Spartan 3E FPGA device. The results show that latency is reduced by up to 64% compared with the conventional SC decoder, and that the proposed design reduces area by 88% and delay by 17.3% relative to the conventional SC decoder.

Key words: Polar Codes, Successive Cancellation, Polar List Decoder, 2-Bit SC Decoding Algorithm

I. INTRODUCTION
Polar codes, introduced by Erdal Arıkan in [1], are a new class of error-correcting codes notable for their capacity-achieving property. They are the first codes with an explicit construction that provably achieve the channel capacity of symmetric binary-input discrete memoryless channels (B-DMC). Significant investigation has been directed toward code construction and toward improving the error correction performance of polar codes of short and moderate length [2]-[4]. The error correction performance of the SC algorithm is worse than that of Turbo codes and low-density parity-check (LDPC) codes, and further research on polar decoder design is still required. The successive cancellation algorithm [5] leads to a very long decoding latency, and the complexity of the SC decoder is $O(N \log N)$, where $N = 2^n$ is the code length.
In [6], an FPGA implementation of a belief propagation (BP) decoder was described. The BP decoder has particular advantages for parallel design but is not attractive for practical applications because of its large number of processing elements. To guarantee good performance, the polar code length $N$ needs to be greater than $2^{10}$. Future real-time applications therefore require low-latency SC decoder designs. A polar list decoder is proposed that needs fewer clock cycles than the SC algorithm.

This paper presents a two-bit decision approach that reduces the latency of conventional SC decoders. First, the component of the list decoder is reformulated using the 2-bit SC decoding algorithm, which decodes 2 bits simultaneously at the intermediate step. Then the reformulated component is applied in the polar list decoder architecture, reducing the overall latency from 14 clock cycles to 5 clock cycles. Based on both algorithmic and architectural enhancements, our decoder architecture achieves better error performance and higher area efficiency than the decoder architecture in [7]. In [8], a list decoder architecture was presented whose processing unit array handles a single bit at a time, which leads to high latency and area overhead. In this paper an efficient processing unit (PU) is proposed. For the proposed polar list decoder architecture, a 2-bit reformulated decoding algorithm is used to determine the minimum size of each input message for each PU so that there is no message overflow. Using the algorithm for each PU, the overall latency and area of the PUs are reduced.

The remainder of this brief is organized as follows. Section II presents related work on the SC algorithm and the 2-bit SC algorithm. Section III presents the proposed polar list decoder. Section IV presents the experimental results and a comparison of latency and area. The final section concludes this brief.

II. RELATED WORK
This paper extends the work initiated in [1].
The 2-bit reformulated SC algorithm reduces latency relative to the conventional SC algorithm. Its last stage is reformulated through a p node, which requires a minimal number of processing elements. The rest of this section covers the polar encoder, the conventional SC decoder and the 2-bit SC decoder.

A. Polar Encoder
An $(N, k)$ polar code contains $k$ information bits and $(N-k)$ frozen bits. The $(N-k)$ frozen bits are assigned to the bad (unreliable) positions and the information bits to the good (reliable) positions. An information bit is also called a non-frozen bit. The polar encoder considered here takes 8 input bits and converts them into a codeword. The inputs depend on one another [1], and the codeword is formed by passing them through the three stages of the polar encoder.

The input of the polar encoder is the $N$-bit source data vector $u = (u_1, u_2, \ldots, u_N)$, where $u_i = c_j$ if $i$ is the $j$-th free position and $u_i = 0$ if $i$ is a frozen position. Here $c$ denotes the original $k$-bit information message. After the source vector $u$ is obtained, the codeword $x = (x_1, x_2, \ldots, x_N)$ is computed with the generator matrix $G$. The encoded output is given by

$$x = uG$$

The generator matrix is constructed as the $n$-th Kronecker power of the matrix $F$, that is, $G = F^{\otimes n}$ with

$$F = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$$

For example, for $n = 3$,

$$F^{\otimes 3} = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}$$

The equivalent implementation of the polar encoder with $N = 8$ is shown in Figure 1. For reliable transmission, the information bits and frozen bits are fixed at the good and bad positions respectively. Finally, the encoded output is transmitted through the channel.

All rights reserved by www.ijsrd.com 2282
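The encoding rule x = uG with G equal to the n-th Kronecker power of F can be illustrated with a short sketch. It is only an illustration under stated assumptions: the helper names (`kron_gf2`, `generator_matrix`, `build_source_vector`, `polar_encode`) are hypothetical, indices are 0-based, no bit-reversal permutation is applied, and the free positions {3, 5, 6, 7} for N = 8 are chosen for the example rather than taken from the paper.

```python
F = [[1, 0], [1, 1]]  # 2x2 polarization kernel


def kron_gf2(a, b):
    """Kronecker product of two 0/1 matrices stored as lists of rows."""
    return [
        [x & y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def generator_matrix(n):
    """Return the n-th Kronecker power of F, a 2^n x 2^n matrix over GF(2)."""
    g = [[1]]
    for _ in range(n):
        g = kron_gf2(g, F)
    return g


def build_source_vector(message, free_positions, length):
    """Place message bits at the free positions; frozen positions stay 0."""
    u = [0] * length
    for bit, pos in zip(message, sorted(free_positions)):
        u[pos] = bit
    return u


def polar_encode(u):
    """Compute x = u * G over GF(2), where len(u) must be a power of two."""
    length = len(u)
    n = length.bit_length() - 1
    if 1 << n != length:
        raise ValueError("code length must be a power of two")
    g = generator_matrix(n)
    return [sum(u[i] & g[i][j] for i in range(length)) % 2
            for j in range(length)]


# Illustrative (8, 4) example; the free positions are an assumption.
u = build_source_vector([1, 0, 1, 1], {3, 5, 6, 7}, 8)
x = polar_encode(u)
print(u)  # [0, 0, 0, 1, 0, 0, 1, 1]
print(x)  # [1, 0, 1, 0, 0, 1, 0, 1]
```

A hardware encoder would normally use the staged XOR network of Figure 1 instead of an explicit matrix product; the matrix form is used here only because it mirrors the equation directly.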

Fig. 1: The Polar Encoder Implementation With N=8

B. Existing Conventional SC Decoder
The receiver recovers the original information bits by decoding the codeword. Because the transmitted codeword is corrupted by channel noise, the received word is no longer $x$ but $y = (y_1, y_2, \ldots, y_N)$. Using a recursive computation procedure, the SC algorithm uses the likelihood ratios (LRs) of $y$ to estimate the decoder output. The SC algorithm proposed by Arıkan is shown in Algorithm 1. The estimated output is denoted by $\hat{u}$. The bits are decoded in order: first $\hat{u}_0$, then $\hat{u}_1$, and finally $\hat{u}_{N-1}$. Each decoded bit $\hat{u}_i$ is obtained from the decision function $h(\cdot)$:

$$\hat{u}_i = \begin{cases} h\big(LR[y, \hat{u}_0^{i-1}]\big), & \text{if } i \text{ is not a frozen position} \\ 0, & \text{if } i \text{ is a frozen position} \end{cases} \quad (1)$$

where $\hat{u}_0^{i-1}$ denotes the previously decoded bits and

$$h(LR) = \begin{cases} 0, & \text{if } LR \ge 1 \\ 1, & \text{otherwise} \end{cases} \quad (2)$$

Algorithm 1: Successive Cancellation (SC) Decoding [1]
  for i = 0, 1, 2, ..., N-1 do
    if i is a frozen position then û_i = u_i;   // frozen bits (known value, 0)
    else û_i = h(LR[y, û_0^{i-1}]);             // non-frozen bits
  return û;

For $N = 8$, the conventional SC decoder computes the likelihood ratios in three stages that depend on each other, using two functions $f$ and $g$ to calculate $LR[y, \hat{u}_0^{i-1}]$. The value in stage 3 is calculated from stage 2, and stage 2 needs the messages from stage 1. These intermediate propagating messages are also LR values. The $f$ and $g$ functions are defined by

$$f(a, b) = \frac{1 + ab}{a + b} \quad (3)$$

$$g(a, b, \hat{u}_{sum}) = a^{1 - 2\hat{u}_{sum}}\, b \quad (4)$$

In equation (4), $\hat{u}_{sum}$ is the modulo-2 sum of part of the previously decoded bits. This term is what makes the SC algorithm successive. To illustrate this, the conventional SC decoder is shown in Figure 2. As the figure shows, the decision on the current bit depends strongly on the previously decoded bits, so the bits can only be computed successively. The decision unit then resolves each decoded output to either 0 or 1.
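The successive structure described above can be sketched as a small recursive decoder in the LR domain. This is a minimal illustration, not the hardware architecture of the paper. It assumes the standard LR-domain forms f(a, b) = (1 + ab)/(a + b) and g(a, b, s) = a^(1-2s) b, a code built from the Kronecker power of F without bit reversal, 0-based indices, LR = P(y|0)/P(y|1), and a hypothetical frozen set {0, 1, 2, 4} for N = 8.

```python
def f(a, b):
    """LR of the XOR of two bits whose individual LRs are a and b."""
    return (1.0 + a * b) / (a + b)


def g(a, b, u_sum):
    """LR of the second bit once the partial sum u_sum (0 or 1) is known."""
    return b / a if u_sum else a * b


def sc_decode(lrs, frozen, offset=0):
    """Decode one block of LRs recursively.

    Returns (u_hat, x_hat): the decided source bits for indices
    offset .. offset + len(lrs) - 1 and their re-encoded partial sums.
    """
    length = len(lrs)
    if length == 1:
        # Decision function h: frozen bits are 0, otherwise compare LR with 1.
        bit = 0 if (offset in frozen or lrs[0] >= 1.0) else 1
        return [bit], [bit]
    half = length // 2
    top, bottom = lrs[:half], lrs[half:]
    # First half of the bits only needs the f function.
    u_a, x_a = sc_decode([f(a, b) for a, b in zip(top, bottom)],
                         frozen, offset)
    # Second half needs the partial sums of the bits decoded so far.
    u_b, x_b = sc_decode([g(a, b, s) for a, b, s in zip(top, bottom, x_a)],
                         frozen, offset + half)
    x_hat = [p ^ q for p, q in zip(x_a, x_b)] + x_b
    return u_a + u_b, x_hat


# Codeword of u = [0,0,0,1,0,0,1,1]; the frozen set is an assumption.
frozen = {0, 1, 2, 4}
x = [1, 0, 1, 0, 0, 1, 0, 1]
lrs = [9.0 if bit == 0 else 1.0 / 9.0 for bit in x]
lrs[1] = 0.5  # one weakly wrong channel observation
u_hat, x_hat = sc_decode(lrs, frozen)
print(u_hat)  # [0, 0, 0, 1, 0, 0, 1, 1]
```

The second recursive call cannot start until the first has returned its partial sums, which is the dependency that produces the long latency of the conventional SC decoder.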
Using the SC algorithm in a polar decoder increases the latency and complexity.

Fig. 2: Conventional SC Decoder

C. Existing 2-bit SC Decoder
The 2-bit SC algorithm is shown in Algorithm 2. In the 2-bit SC decoder, the last stage is reformulated through a p node. The p node produces the same output as the conventional SC decoder, but decodes 2 bits in the same clock cycle; the remaining operations are the same as in the conventional SC decoder. The 2-bit SC decoder is shown in Figure 3. In Figure 3, the p node computes $u_{2i-1}$ and $u_{2i}$ in the same clock cycle. The operation of the p node is summarized in Algorithm 2. Because 2 bits are decoded in the same clock cycle, the overall latency is lower than that of the conventional SC decoder: the number of clock cycles is reduced from 14 to 10.

Algorithm 2: 2-bit Successive Cancellation (SC) Decoding
Input: Log-likelihood ratios LLR(c) and LLR(d) from stage m-1.
Determine whether $u_{2i-1}$ and $u_{2i}$ are frozen bits.
(1) Case 1: Neither $u_{2i-1}$ nor $u_{2i}$ is a frozen bit.
    Find the largest element among {LLR(c) + LLR(d), 0, LLR(c), LLR(d)}.
    If the largest element is LLR(c) + LLR(d): $u_{2i-1} = 0$, $u_{2i} = 0$.
    If the largest element is 0: $u_{2i-1} = 0$, $u_{2i} = 1$.
    If the largest element is LLR(d): $u_{2i-1} = 1$, $u_{2i} = 0$.
    If the largest element is LLR(c): $u_{2i-1} = 1$, $u_{2i} = 1$.
(2) Case 2: Both $u_{2i-1}$ and $u_{2i}$ are frozen bits.
    $u_{2i-1} = 0$, $u_{2i} = 0$.
(3) Case 3: Only $u_{2i-1}$ is a frozen bit.
    $u_{2i-1} = 0$;
    $u_{2i} = 0$ if LLR(c) + LLR(d) ≥ 0, and $u_{2i} = 1$ if LLR(c) + LLR(d) < 0.
(4) Case 4: Only $u_{2i}$ is a frozen bit.
    $u_{2i} = 0$;
    $u_{2i-1} = 0$ if sign(LLR(c)) sign(LLR(d)) ≥ 0, and $u_{2i-1} = 1$ if sign(LLR(c)) sign(LLR(d)) < 0.
Output: $u_{2i-1}$, $u_{2i}$
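The four cases of Algorithm 2 translate almost line for line into a decision function. The sketch below is illustrative only: the function and argument names are hypothetical, and ties in Case 1 are broken in the order the candidates are listed, which the algorithm as stated does not specify.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def two_bit_decision(llr_c, llr_d, first_frozen, second_frozen):
    """Decide the bit pair produced by the p node.

    llr_c, llr_d  : LLRs delivered by stage m-1
    first_frozen  : True if the first bit of the pair is frozen
    second_frozen : True if the second bit of the pair is frozen
    Returns (first_bit, second_bit).
    """
    if first_frozen and second_frozen:      # Case 2
        return 0, 0
    if first_frozen:                        # Case 3
        return 0, (0 if llr_c + llr_d >= 0 else 1)
    if second_frozen:                       # Case 4
        return (0 if sign(llr_c) * sign(llr_d) >= 0 else 1), 0
    # Case 1: pick the bit pair whose metric is largest.
    candidates = [
        (llr_c + llr_d, (0, 0)),
        (0.0, (0, 1)),
        (llr_d, (1, 0)),
        (llr_c, (1, 1)),
    ]
    # max() keeps the first candidate on ties (an assumed tie-break rule).
    return max(candidates, key=lambda item: item[0])[1]


print(two_bit_decision(2.0, 3.0, False, False))    # (0, 0)
print(two_bit_decision(-2.0, -3.0, False, False))  # (0, 1)
print(two_bit_decision(1.0, -3.0, True, False))    # (0, 1)
```

Both bits come out of a single evaluation, which is why the p node saves one clock cycle for every pair of bits compared with deciding them one after the other.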

Fig. 3: 2-bit SC decoder

III. PROPOSED WORK

A. Proposed List Decoder
In the proposed method, the 2-bit SC decoding algorithm is implemented in a list decoder architecture, which reduces latency and area. The resulting decoder is the polar list decoder. The component of the list decoder is shown in Figure 4, and the overall architecture is shown in Figure 5. With the 2-bit SC decoding algorithm, the polar list decoder decodes 2 bits simultaneously at the intermediate step.

The list decoder first loads its input from the channel as likelihood values for 0 and 1. The values then pass through the successive stages. The last stage is reformulated using the 2-bit SC decoder: the original stage m is replaced with two units, the metric computation unit (MCU) and the zero-forcing unit (ZFU). The MCU computes the metrics of the different input combinations from the messages a(0), a(1), b(0) and b(1) output by stage (m-1). Because the MCU output covers two bits in the same clock cycle, the overall latency is lower than that of the conventional SC decoder.

Fig. 4: Reformulated component of list decoder

The output of the MCU is passed to the ZFU, which eliminates unqualified paths, because the values of $u_{2i-1}$ and $u_{2i}$ depend not only on the corresponding path metrics but also on whether they are frozen bits. Finally, M(00), M(01), M(10) and M(11) are the outputs of the component list decoder.

The component list decoder is integrated into the list decoder architecture through the processing unit array. The architecture comprises the following blocks:
1) Message Memory Architecture (MMA): stores the channel messages and the results of internal computations.
2) Processing Unit Array (PUA): the processing units implement the f, g and p nodes, including the metric computation unit and the zero-forcing unit.
3) Partial Sum Network (PSU): speeds up the computation of the partial sums.
4) Cyclic Redundancy Check (CRC): identifies the correct output of the list decoder.

Fig. 5: Architecture of Polar List Decoder

IV. RESULT ANALYSIS
Polar coding is used where a higher channel capacity is to be achieved. It improves the utilization of the channel, and the decoding method reduces the complexity of the processing elements while the information bits are transmitted from sender to receiver. In general, error detection and correction are achieved by adding redundancy bits to the message, which the receiver uses to check the reliability of the transmitted data. In the systematic method, the transmitter uses a deterministic algorithm to send the original data along with check bits. If the check bits recomputed at the receiver do not match the received ones, an error has occurred at some point during transmission. In the non-systematic method, by contrast, the message itself is encoded. The communication channel plays a major role in error control performance.

Initially, the streaming data are loaded into the polar encoder. The encoded data are forwarded to the receiver via the communication channel. At the receiver, the data are loaded into the processing elements and decoded using the SC and 2-bit SC algorithms. Figures 7, 8 and 9 show the design summaries of the conventional SC decoder, the 2-bit SC decoder and the polar list decoder. The outputs of the conventional SC decoder and the 2-bit SC decoder are shown in Figures 10 and 11. Comparing these decoders, the 2-bit SC decoder reduces latency relative to the conventional SC decoder. Implementing the 2-bit SC algorithm in the list decoder greatly reduces latency and area. The RTL view of the polar list decoder is shown in Figure 6.

Fig. 6: RTL View of Reformulated Component of List Decoder
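The reformulated component described in Section III (a metric computation unit followed by a zero-forcing unit) can be sketched in software as follows. The paper does not give the metric formula, so this sketch assumes that a and b are likelihood pairs for the two stage-(m-1) outputs, that those outputs correspond to the bit pair (u1 XOR u2, u2), and that the zero-forcing unit sets the metric of any candidate conflicting with a frozen bit to zero. All names are hypothetical.

```python
def metric_computation_unit(a, b):
    """Compute M(00), M(01), M(10), M(11) from likelihood pairs.

    a = (a(0), a(1)) and b = (b(0), b(1)) are the stage-(m-1) messages.
    Assumed rule: M(u1 u2) = a(u1 XOR u2) * b(u2).
    """
    metrics = {}
    for u1 in (0, 1):
        for u2 in (0, 1):
            metrics[f"{u1}{u2}"] = a[u1 ^ u2] * b[u2]
    return metrics


def zero_forcing_unit(metrics, first_frozen, second_frozen):
    """Zero the metric of every candidate that sets a frozen bit to 1."""
    forced = dict(metrics)
    for key in forced:
        if (first_frozen and key[0] == "1") or (second_frozen and key[1] == "1"):
            forced[key] = 0.0
    return forced


# Example values chosen to be exactly representable in binary floating point.
m = metric_computation_unit((0.75, 0.25), (0.125, 0.875))
print(m)  # {'00': 0.09375, '01': 0.21875, '10': 0.03125, '11': 0.65625}
print(zero_forcing_unit(m, first_frozen=False, second_frozen=True))
# {'00': 0.09375, '01': 0.0, '10': 0.03125, '11': 0.0}
```

Under these assumptions the four metrics are consistent with Algorithm 2: dividing each by M(01) and taking logarithms gives LLR(c) + LLR(d), 0, LLR(d) and LLR(c) for the candidates 00, 01, 10 and 11.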

When likelihood values from the channel are applied to the list decoder, it decodes 2 bits simultaneously at the intermediate step. Because these 2 bits are decoded in the same clock cycle, the latency is reduced to 5 clock cycles. The output of the polar list decoder is shown in Figure 12. The proposed polar list decoder architecture described in the previous section was implemented in Verilog and simulated in Xilinx ISE 9.2i. Table 1 shows the implementation results of the existing and proposed decoders. Among the implemented decoders, the polar list decoder achieves better latency and area than the SC decoder.

Fig. 7: Design Summary of Conventional SC Decoder
Fig. 8: Design Summary of 2-Bit SC Decoder
Fig. 9: Design Summary of Polar List Decoder
Fig. 10: Output of Conventional SC Decoder
Fig. 11: Output of 2-bit SC Decoder
Fig. 12: Output of Polar List Decoder

Parameter                        Existing conventional   Existing 2-bit   Proposed polar
                                 SC decoder              SC decoder       list decoder
Combinational path delay (ns)    30.483                  26.450           25.206
Latency (clock cycles)           14                      10               5
LUT                              1408                    704              181
Gate count                       9996                    5004             1194

Table 1: Implementation Results of Latency and Clock Cycles Analysis

V. CONCLUSION
Latency is an important factor in VLSI design because high latency reduces the efficiency of the system. To increase efficiency, latency must be reduced. The 2-bit SC decoder achieves lower latency because its series-connected p node decodes 2 bits simultaneously instead of a single bit. The proposed system replaces the processing elements at the last stage with simpler elements. This reduces the overall latency by 64% relative to the conventional SC algorithm (from 14 to 5 clock cycles) and reduces the area by 88% (gate count from 9996 to 1194). Simulations were carried out over various input data bits, and the decoded output corresponding to the original source data vector was obtained using the Xilinx Spartan 3E FPGA kit.

REFERENCES
[1] E. Arıkan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051-3073, 2009.
[2] S. B. Korada, E. Sasoglu, and R. Urbanke, "Polar codes: Characterization of exponent, bounds, and constructions," IEEE Trans. Inf. Theory, vol. 56, no. 12, pp. 6253-6264, 2010.
[3] R. Mori and T. Tanaka, "Performance of polar codes with the construction using density evolution," IEEE Commun. Lett., vol. 13, no. 7, pp. 519-521, Jul. 2009.
[4] I. Tal and A. Vardy, "How to construct polar codes," May 2011, arXiv:1105.6164v1.
[5] Bo Yuan and Keshab K. Parhi, "Low-latency successive cancellation polar decoder architectures using 2-bit decoding," IEEE Transactions on Circuits and Systems, vol. 61, no. 4, pp. 1241-1253, April 2014.
[6] A. Pamuk, "An FPGA implementation architecture for decoding of polar codes," in Proc. 8th Int. Symp. on Wireless Commun. Syst. (ISWCS), pp. 437-441, Nov. 2011.
[7] A. Balatsoukas-Stimming, A. J. Raymond, W. J. Gross, and A. Burg, "Hardware architecture for list successive cancellation decoding of polar codes," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 61, no. 8, pp. 609-613, Aug. 2014.
[8] Jun Lin and Zhiyuan Yan, "An efficient list decoder architecture for polar codes," in IEEE Trans. Very Large Scale Integration (VLSI) Systems.