COM-7003SOFT Turbo code encoder/decoder VHDL source code overview / IP core


Overview

The COM-7003SOFT is an error correction turbo code encoder/decoder written in generic VHDL. The entire VHDL source code is deliverable.

Target Hardware

The code is written in generic standard VHDL so that it can be ported to a variety of FPGAs. The code was developed and tested on a Xilinx 7-series FPGA but is expected to work similarly on other targets.

Key features and performance:
- Flexible dynamic (i.e. at runtime) user-selected configuration:
  o Block length up to 8000 bits
  o Puncturing patterns for rates 1/3, 1/2, 2/3, 3/4, 4/5, 5/6, 6/7, 7/8
- Frame error rate examples:
  o 2032-bit frame, rate 1/3, 5-bit soft quantization, 15 iterations:
    FER = 10^-2 @ Eb/No = 1.4 dB
    FER = 10^-3 @ Eb/No = 1.6 dB
  o 768-bit frame, rate 3/4, 5-bit soft quantization, 15 iterations:
    FER = 10^-2 @ Eb/No = 3.1 dB
    FER = 10^-3 @ Eb/No = 3.5 dB
- Provided with IP core:
  o VHDL source code
  o Matlab .m file for simulating the encoding and decoding algorithms, for generating stimulus files for VHDL simulation and for end-to-end BER performance analysis at various signal to noise ratios
  o VHDL testbench
- Complies with ETSI EN 301 545-2, DVB-RCS2.

MSS, 845-N Quince Orchard Boulevard, Gaithersburg, Maryland 20878 U.S.A.
Telephone: (240) 631-1111 Facsimile: (240) 631-1676 www.comblock.com
MSS 2016. Issued 9/28/2016

Configuration

Synthesis-time configuration parameters

The following constants are user-defined in the decoder component generic section prior to synthesis. These parameters generally define the size of the decoder.

NQBITS
  Number of soft-quantized bits at the decoder input.
  Typical value: 4. A minor performance improvement can be achieved with 5 bits.

NADDRBITS
  log2 of the maximum payload size in Bytes, rounded up.
  While the actual payload size is user-programmable at run-time, the maximum payload size is an important factor that affects the number of RAM blocks used in the FPGA. For the maximum payload size of 1000 Bytes, set NADDRBITS = 10.

Run-time configuration parameters

The user can set and modify the following controls at run-time through the top level component interface:

BURST_PAYLOAD_SIZE (frame size)
  Uncoded/decoded frame size, expressed in Bytes. Valid range: 1 to 1000 Bytes.
  Constraints:
  - When using puncturing, BURST_PAYLOAD_SIZE * 4 must be an integer multiple of the puncturing period.
  - BURST_PAYLOAD_SIZE * 4 must NOT be an integer multiple of 15.

R1/R2 (encoding rate)
  Valid values are 1/3, 1/2, 2/3, 3/4, 4/5, 5/6, 6/7, 7/8.

I/Os

General

CLK: input. The synchronous clock. The user must provide a global clock (use BUFG). The CLK timing period must be constrained in the .xdc file associated with the project.

SYNC_RESET: input. Synchronous reset. The reset MUST be exercised at least once to initialize the internal variables. It must also be exercised whenever a control parameter is changed.

Encoder/Decoder controls

Users can define the encoder and decoder controls at one of two levels of abstraction: simple and detailed.

The simple form consists of the payload size BURST_PAYLOAD_SIZE and the code rate R1/R2, as described in the run-time configuration section above.

The detailed form consists of several low-level parameters: BURST_PAYLOAD_SIZE, P, Q0, Q1, Q2, Q3, Y_PUNCTURING_PERIOD, Y_PUNCTURING_PATTERN and W_PUNCTURING_PATTERN, defined in Table A-1 of [1]. To simplify operation, a VHDL component (TC_DECODER_DVB_RCS2.VHD) and a Matlab table1.m program are provided to look up the optimum detailed configuration from just the payload size BURST_PAYLOAD_SIZE and the code rate R1/R2.
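The sizing rule for NADDRBITS and the two BURST_PAYLOAD_SIZE constraints can be checked before synthesis or before writing the run-time controls. The following Python sketch is only an illustration: the function names are hypothetical and are not part of the delivered IP core or its Matlab scripts.

```python
def naddrbits(max_payload_bytes: int) -> int:
    """log2 of the maximum payload size in Bytes, rounded up (integer math only)."""
    if max_payload_bytes < 1:
        raise ValueError("maximum payload size must be at least 1 Byte")
    return (max_payload_bytes - 1).bit_length()


def payload_size_problems(payload_bytes: int, puncturing_period=None) -> list:
    """Return a list of constraint violations for BURST_PAYLOAD_SIZE (empty if valid).

    puncturing_period is None when no puncturing is used.
    """
    problems = []
    if not 1 <= payload_bytes <= 1000:
        problems.append("payload size outside the 1 to 1000 Byte range")
    scaled = payload_bytes * 4  # the quantity the datasheet constraints refer to
    if scaled % 15 == 0:
        problems.append("BURST_PAYLOAD_SIZE * 4 is an integer multiple of 15")
    if puncturing_period is not None and scaled % puncturing_period != 0:
        problems.append("BURST_PAYLOAD_SIZE * 4 is not a multiple of the puncturing period")
    return problems


if __name__ == "__main__":
    print(naddrbits(1000))                # matches the NADDRBITS = 10 example
    print(payload_size_problems(100, 4))  # no violations
    print(payload_size_problems(15))      # violates the multiple-of-15 rule
```

Since 4 and 15 share no common factor, the multiple-of-15 rule rejects exactly the payload sizes that are themselves multiples of 15.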

Encoder

[Block diagram: TC_ENCODER_DVB_RCS2]
  General: CLK, SYNC_RESET
  Input bits: DATA_IN(1:0), DATA_IN_VALID, SOF_IN, CTS_OUT
  Encoded bits: DATA_OUT(1:0), DATA_OUT_VALID, SOF_OUT, EOF_OUT, CTS_IN
  Controls: BURST_PAYLOAD_SIZE(9:0), R1(2:0), R2(3:0), P(6:0), Q0(3:0), Q1(3:0), Q2(3:0), Q3(2:0), Y_PUNCTURING_PERIOD(4:0), Y_PUNCTURING_PATTERN(27:0), W_PUNCTURING_PATTERN

DATA_IN(1:0): input. Input data is read two bits at a time: A (bit 0) and B (bit 1).

DATA_IN_VALID: input. 1 CLK-wide pulse indicating that DATA_IN is valid.

SOF_IN: input. Start Of Frame. 1 CLK-wide pulse. The SOF is aligned with DATA_IN_VALID. Note that there is no need for an end of frame as the input frame size is defined as a control parameter.

CTS_OUT: output. Clear-To-Send flow control. '1' indicates that the encoder is ready to accept another input dibit.
IMPORTANT: relying on CTS_OUT for flow control may not be sufficient because of latency in stopping the flow. NEVER send the next SOF_IN when CTS_OUT = '0'. This implies that the sender must count the data symbols in a frame, stop at N, and wait at least 2 CLKs before checking CTS_OUT again.

The encoder outputs mirror its inputs: DATA_OUT(1:0), DATA_OUT_VALID, SOF_OUT, EOF_OUT, CTS_IN.
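The sender-side rule described above (count the symbols, stop at the end of the frame, wait at least 2 CLKs, never raise SOF_IN while CTS_OUT = '0') can be modeled clock by clock. This Python sketch is a behavioral illustration only, not testbench code from the IP core. The helper names are hypothetical, and the order in which the four dibits are taken from each byte (least significant pair first) is an assumption that the datasheet does not specify.

```python
def bytes_to_dibits(payload: bytes) -> list:
    """Split each byte into four 2-bit symbols.

    ASSUMPTION: least significant bit pair first; the datasheet does not state the order.
    Within a dibit, A is bit 0 and B is bit 1, as in the DATA_IN(1:0) description.
    """
    dibits = []
    for byte in payload:
        for shift in (0, 2, 4, 6):
            dibits.append((byte >> shift) & 0b11)
    return dibits


def drive_frame(dibits, cts_out):
    """Yield one (DATA_IN, DATA_IN_VALID, SOF_IN) tuple per clock for a single frame.

    cts_out is a hypothetical callable returning the CTS_OUT level seen on each clock.
    """
    for index, dibit in enumerate(dibits):
        # Hold off (idle clocks) while the encoder is not ready. For index 0 this
        # also guarantees that SOF_IN is never raised while CTS_OUT = '0'.
        while not cts_out():
            yield (0, 0, 0)
        yield (dibit, 1, 1 if index == 0 else 0)
    # Frame complete: wait at least 2 clocks before looking at CTS_OUT again.
    for _ in range(2):
        yield (0, 0, 0)


if __name__ == "__main__":
    for clock, pins in enumerate(drive_frame(bytes_to_dibits(b"\x1b"), lambda: True)):
        print(clock, pins)
```

A caller would run `drive_frame` once per frame; the two trailing idle clocks are what keep the next SOF_IN from being issued before CTS_OUT has settled.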

Decoder

[Block diagram: TC_DECODER_DVB_RCS2]
  General: CLK, SYNC_RESET
  Input samples: DATA_A_IN(NQBITS-1:0), DATA_B_IN(NQBITS-1:0), SAMPLE_CLK_IN, SOF_IN, EOF_IN, CTS_OUT
  Decoded output: DATA_OUT(1:0), SAMPLE_CLK_OUT, SOF_OUT, CTS_IN
  Controls: BURST_PAYLOAD_SIZE(NADDRBITS-1:0), P(6:0), Q0(3:0), Q1(3:0), Q2(3:0), Q3(2:0), Y_PUNCTURING_PERIOD(4:0), Y_PUNCTURING_PATTERN(27:0), W_PUNCTURING_PATTERN, N_ITER(3:0)

DATA_A_IN / DATA_B_IN: inputs. Two soft-quantized input samples. The precision (NQBITS) is selectable at the time of synthesis. A 4-bit soft-quantization is considered a good trade-off between decoding performance and FPGA occupancy. A 5-bit soft-quantization may yield a minor performance improvement.
Usage: it is expected that the demodulator preceding this decoder will normalize the demodulated samples prior to soft-quantization by using an AGC loop. The AGC target level is important in maximizing the decoder BER performance.

DATA_IN_VALID: input. 1 CLK-wide pulse indicating that the input samples are valid.

SOF_IN / EOF_IN: inputs. Start Of Frame and End Of Frame. 1 CLK-wide pulses, aligned with DATA_IN_VALID.

CTS_OUT: output. Clear-To-Send flow control. '1' indicates that the decoder is ready to accept another pair of input samples.

N_ITER(3:0): input. Number of decoder iterations. MUST be an odd number between 1 and 15. The more iterations, the lower the BER. However, the decoder latency is nearly proportional to the number of iterations. 7 is a good trade-off between performance and latency.

The decoder outputs mirror its inputs: DATA_OUT(1:0), DATA_OUT_VALID, SOF_OUT, CTS_IN.
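To make the roles of NQBITS, the AGC target level and N_ITER concrete, here is a small Python sketch of a saturating soft quantizer and an N_ITER check. It is an illustration under stated assumptions, not the delivered Matlab model: the signed integer representation, the rounding rule and the `full_scale` parameter (standing in for the AGC target level) are all assumptions, since the datasheet does not define the sample format.

```python
import math


def quantize_soft(sample: float, nqbits: int = 4, full_scale: float = 1.0) -> int:
    """Quantize one AGC-normalized sample to a signed NQBITS-bit level with saturation.

    ASSUMPTIONS: signed levels, round-half-up, and full_scale mapping to the
    largest positive level. None of these is specified in the datasheet.
    """
    q_max = 2 ** (nqbits - 1) - 1
    q_min = -(2 ** (nqbits - 1))
    level = math.floor(sample / full_scale * q_max + 0.5)
    return max(q_min, min(q_max, level))  # clip samples that exceed the range


def is_valid_n_iter(n_iter: int) -> bool:
    """N_ITER must be an odd number between 1 and 15."""
    return 1 <= n_iter <= 15 and n_iter % 2 == 1


if __name__ == "__main__":
    print([quantize_soft(x) for x in (-2.0, -0.5, 0.0, 0.5, 2.0)])
    print(is_valid_n_iter(7), is_valid_n_iter(8))
```

The clipping step shows why the AGC target matters: if the target is set too high, many samples saturate, and if it is set too low, only a few of the available levels are used.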

Performance

Encoder throughput

The maximum encoder throughput is as follows:
- Encoded output: 2 * f_clk bits/s
- Uncoded input: 2 * f_clk * R bits/s, where R is the encoding rate and f_clk is the FPGA clock frequency.

Decoder latency

The decoder can only handle one frame at a time. The latency between input SOF and decoded output EOF is a function of BURST_PAYLOAD_SIZE, the coding rate R1/R2 and the selected number of decoding iterations N_ITER:

Latency (in processing clocks CLK) = (BURST_PAYLOAD_SIZE * 4 + 25) * (2 * N_ITER + 1/(R1/R2))

For example, in the case of a 1000-Byte payload, rate 3/4 and 7 iterations, the latency is 61717 clocks (including 5333 clocks for encoded input samples and 4000 clocks for output decoded bits).

Frame Error Rate

The decoded errors are somewhat bursty in nature, with many error-free decoded frames followed by an occasional erroneous frame with many bit errors. Therefore, we prefer to measure the decoder performance in terms of frame error rate (FER).

Frame error rate examples:
- 2032-bit frame, rate 1/3, 5-bit soft quantization, 15 iterations:
  FER = 10^-2 @ Eb/No = 1.4 dB
  FER = 10^-3 @ Eb/No = 1.6 dB
- 768-bit frame, rate 3/4, 5-bit soft quantization, 15 iterations:
  FER = 10^-2 @ Eb/No = 3.1 dB
  FER = 10^-3 @ Eb/No = 3.5 dB
- 472-bit frame, rate 1/2, 5-bit soft quantization, 15 iterations:
  FER = 10^-2 @ Eb/No = 1.9 dB
  FER = 10^-3 @ Eb/No = 2.2 dB
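The throughput and latency formulas above are easy to evaluate for a given configuration. The Python sketch below reproduces the 1000-Byte, rate 3/4, 7-iteration example; the function names are hypothetical, and rounding the result to the nearest clock is an assumption made to match the 61717 figure quoted in the text.

```python
from fractions import Fraction


def decoder_latency_clocks(payload_bytes: int, r1: int, r2: int, n_iter: int) -> int:
    """Latency = (BURST_PAYLOAD_SIZE * 4 + 25) * (2 * N_ITER + 1/(R1/R2)), in CLK periods."""
    rate = Fraction(r1, r2)
    latency = (payload_bytes * 4 + 25) * (2 * n_iter + 1 / rate)
    return round(latency)  # ASSUMPTION: round to the nearest whole clock


def encoder_throughput_bps(f_clk_hz: float, r1: int, r2: int):
    """Return (encoded output, uncoded input) maximum throughput in bits/s."""
    encoded = 2 * f_clk_hz
    return encoded, encoded * r1 / r2


if __name__ == "__main__":
    clocks = decoder_latency_clocks(1000, 3, 4, 7)
    print(clocks)  # 61717, the example from the text
    # Hypothetical 100 MHz processing clock, used only to express the latency in time.
    print(clocks / 100e6, "seconds at 100 MHz")
    print(encoder_throughput_bps(100e6, 3, 4))
```

Because the decoder handles one frame at a time, dividing the payload bits by this latency gives a rough upper bound on sustained decoded throughput for a chosen N_ITER.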

Software Licensing

The COM-7003SOFT is supplied under the following key licensing terms:
1. A nonexclusive, nontransferable license to use the VHDL source code internally, and
2. An unlimited, royalty-free, nonexclusive transferable license to make and use products incorporating the licensed materials, solely in bit stream format, on a worldwide basis.

The complete VHDL/IP Software License Agreement can be downloaded from http://www.comblock.com/download/softwarelicense.pdf

Configuration Management

The current software revision is 3.

Directory   Contents
/doc        Specifications, user manual, implementation documents
/src        .vhd source code, .pkg packages, .xdc constraint files (Xilinx). One component per file.
/sim        VHDL test benches
/matlab     Matlab .m file for simulating the encoding and decoding algorithms, for generating stimulus files for VHDL simulation and for end-to-end BER performance analysis at various signal to noise ratios
/bin        .bit configuration files (for use with the ComBlock COM-1800 FPGA development platform)

Project files:
- Xilinx ISE 14 project file: com-7003.xise
- Xilinx Vivado v2015.2 project file: project_1.xpr

VHDL development environment

The VHDL software was developed using the following development environment:
(a) Xilinx ISE 14.7 for synthesis, place and route
(b) Xilinx Vivado 2015.2 for synthesis, place and route and VHDL simulation

The entire project fits easily within a Xilinx Artix7-100T. Therefore, the ISE project can be processed using the free Xilinx WebPack tools.

Device Utilization Summary

Device: Xilinx Artix7-100T

The encoder size is fixed (not parameterized).

Encoder           Used     % of Xilinx Artix7-100T
Registers         721      0.6%
LUTs              996      1.6%
Block RAM/FIFO    3.5      2.6%
DSP48             1        0.4%
GCLKs             1        3.1%

The decoder size depends essentially on two key parameters defined in the generic section of tc_decoder_dvb_rcs2.vhd, namely:
- The maximum payload size, defined by the constant NADDRBITS
- The number of soft-quantized bits at the decoder input, NQBITS

Decoder, 4-bit soft-quantization, frame size < 2048 bits

                  Used     % of Xilinx Artix7-100T
Registers         3558     2.8%
LUTs              10652    16.8%
Block RAM/FIFO    15       11.1%
DSP48             1        0.4%
GCLKs             1        3.1%

Decoder, 4-bit soft-quantization, frame size 8000 bits

                  Used     % of Artix7-100T
Registers         3591     2.8%
LUTs              10726    16.9%
Block RAM/FIFO    39.5     29.3%
DSP48             1        0.4%
GCLKs             1        3.1%

Clock and decoding speed

The entire design uses a single global clock CLK. Typical maximum clock frequencies for various FPGA families are listed below:

Device family                    Encoder    Decoder
Xilinx Artix-7, -2 speed grade   212 MHz    155 MHz
Xilinx Kintex-7, -2 speed grade  294 MHz    230 MHz

Ready-to-use Hardware

The COM-7003SOFT was developed on, and is therefore ready to use on, the following commercial off-the-shelf hardware platform:

FPGA development platform: COM-1800 FPGA (XC7A100T) + ARM + DDR3 SODIMM socket + GbE LAN development platform

Xilinx-specific code

The VHDL source code is written in generic VHDL and thus can be ported to FPGAs from various vendors. No Xilinx core or Xilinx primitive is used.

VHDL components overview

Top level

TC_CODEC_CONFIG.vhd generates detailed configuration parameters for the encoder and decoder. The user enters the burst payload size (in Bytes) and the coding rate R1/R2. This component looks up the optimum detailed configuration in Table A-1 of [1].

TC_ENCODER_DVB_RCS2.vhd is the encoder top component.

The ARITH.vhd component performs minor arithmetic operations to compute the initial permutation indices 3, (4Q1+3) modulo N, (4Q2+3 + 4Q0P) modulo N, (4Q3+3 + 4Q0P) modulo N.

BRAM_DP2.vhd is a generic dual-port memory, used as input and output elastic buffers. Memory is inferred (no Xilinx primitive is used).

TC_DECODER_DVB_RCS2.vhd is the decoder top component. It processes one frame at a time, i.e. the input flow must be stopped until the entire frame is decoded.
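The four initial permutation indices attributed to ARITH.vhd above can be written out directly. This Python sketch only evaluates the four expressions quoted in the text. The parameter values in the example are hypothetical placeholders, not entries from Table A-1 of [1], and the reading of N as BURST_PAYLOAD_SIZE * 4 is an assumption.

```python
def initial_permutation_indices(n: int, p: int, q0: int, q1: int, q2: int, q3: int):
    """Evaluate the four expressions listed for ARITH.vhd.

    n is the modulus N (ASSUMED here to be BURST_PAYLOAD_SIZE * 4).
    """
    return (
        3,
        (4 * q1 + 3) % n,
        (4 * q2 + 3 + 4 * q0 * p) % n,
        (4 * q3 + 3 + 4 * q0 * p) % n,
    )


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the arithmetic.
    print(initial_permutation_indices(n=400, p=23, q0=2, q1=1, q2=3, q3=0))
```

In a real configuration, P and Q0 to Q3 would come from the lookup performed by TC_CODEC_CONFIG.vhd or the Matlab table1.m program.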

PERMUTATION_TABLE.vhd generates the permutation and inverse permutation lookup tables.

BM.vhd generates the 16 branch metric values, based on the received samples A, B, Y, W and the associated erasure information (when puncturing is enabled).

FORWARD_STATE_GEN.vhd generates forward state metrics a(k+1,s) from the previous state metrics a(k,s').

BACKWARD_STATE_GEN.vhd generates backward state metrics b(k,s) from the next state metrics b(k+1,s).

LLR.vhd generates the log likelihood ratio (LLR) from a(k), bm(s,s'), b(k+1). See Matlab turbo.m.

COM7003_TOP.vhd is mostly a usage example for the case where the turbo codec is implemented on a ComBlock COM-1800 FPGA development platform. This component includes the encoder, the decoder, the detailed codec configuration, clock generation and the interface to a supervisory microcontroller (8-bit address/data bus to exchange control registers REG and status registers SREG). CLK_P is the main processing clock.

INFILE2SIM.vhd reads an input file. This component is used by the testbench to read a soft-quantized encoded bit stream generated by the turbo.m Matlab program for various Eb/No cases.

SIM2OUTFILE.vhd writes three 12-bit data variables to a tab-delimited file which can subsequently be read by Matlab (load command) for plotting or analysis.

Matlab simulation

The turbo.m program:
- generates a stimulus file fecdecin.txt for use as input to the decoder VHDL simulation. The file includes a frame of pseudo-random (PRBS11) data bits, turbo code encoding, additive white Gaussian noise and soft-quantization.
- performs end-to-end BER performance analysis of the turbo codec over a noisy (AWGN) channel.

The turbo.m program uses treillis_diagram.m to generate the trellis state diagram (input state, input data, output state, output parity bits).

The tc_dec_ber.m program reads a file of decoded data tcout.txt generated by the VHDL simulation and compares it with the original PRBS11 test sequence. It counts the number of bit errors.

[Figure: simulation flow. Matlab turbo.m (PRBS11 test sequence, encoder, AWGN) -> fecdecin.txt sample file -> VHDL tb_decoder / tc_decoder.vhd -> tcout.txt file -> Matlab tc_dec_ber.m (BER)]

Reference documents

[1] ETSI EN 301 545-2, Digital Video Broadcasting (DVB); Second Generation DVB Interactive Satellite System (DVB-RCS2); Part 2: Lower Layers for Satellite standard. Section 7.3.5.1, Turbo FEC Encoder.
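The test flow described above relies on a PRBS11 sequence as the reference data and on a bit-by-bit comparison to count errors. The Python sketch below shows both steps in miniature. It is not a port of turbo.m or tc_dec_ber.m: the feedback taps (an 11-bit register with taps on stages 11 and 9, a common PRBS11 choice), the all-ones seed and the function names are assumptions, and the polynomial actually used by the Matlab scripts is not stated in this document.

```python
def prbs11(n_bits: int, seed: int = 0x7FF) -> list:
    """Generate n_bits of a PRBS11 sequence from an 11-bit linear feedback shift register.

    ASSUMPTION: feedback from stages 11 and 9; the IP core's scripts may differ.
    """
    state = seed & 0x7FF
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the register stays stuck at 0")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 10) ^ (state >> 8)) & 1
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7FF
    return bits


def count_bit_errors(reference, decoded) -> int:
    """Count positions where the decoded bits differ from the reference bits."""
    return sum(r != d for r, d in zip(reference, decoded))


if __name__ == "__main__":
    reference = prbs11(2032)           # e.g. one 2032-bit frame
    decoded = list(reference)
    decoded[10] ^= 1                   # inject two artificial bit errors
    decoded[500] ^= 1
    errors = count_bit_errors(reference, decoded)
    print(errors, errors / len(reference))
```

A frame counts as erroneous for FER purposes whenever this count is non-zero, which matches the bursty error behavior described in the Frame Error Rate section.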

Implementation Overview

Turbo Code Encoder

Encoding requires four passes of the input block through an encoder core:
- pass #1: natural order. Determines the circulation state C1.
- pass #2: natural order, starting at encoder state C1.
- pass #3: interleaved order. Determines the circulation state C2.
- pass #4: interleaved order, starting at encoder state C2.

For maximum throughput, two encoder cores are used in parallel according to the sequencing below:

[Figure: sequencing of encoder core 1 and encoder core 2 from SOF and input blocks to output blocks. Recoverable labels: "Input data block", "Encode to find C1", "Save block in buffer1 (natural order) and buffer2 (interleaved order)", "Initialize encoder at state C1", "Re-encode block (now in buffer), natural order", "Encode interleaved block to find C2", "Initialize encoder at state C2", "Re-encode block (now in buffer), interleaved order", "Ready for next input frame".]

ComBlock Ordering Information

COM-7003SOFT: Turbo code encoder/decoder, VHDL source code / IP core

Contact Information

MSS, 845-N Quince Orchard Boulevard, Gaithersburg, Maryland 20878-1676 U.S.A.
Telephone: (240) 631-1111
Facsimile: (240) 631-1676
E-mail: info@comblock.com
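The notion of a circulation state (an initial encoder state that the encoder returns to after processing the whole block) can be illustrated with a toy example. The Python sketch below is NOT the DVB-RCS2 encoder and does not use the method of the IP core: it uses a hypothetical 3-bit recursive register with one input bit per step, and it finds the circulation state by brute force over all 8 states rather than by a first pass followed by a lookup. It only shows why a dedicated pass is needed before the actual encoding pass.

```python
def step(state: int, u: int) -> int:
    """Advance a toy 3-bit recursive shift register by one input bit (hypothetical encoder)."""
    feedback = u ^ ((state >> 2) & 1) ^ ((state >> 1) & 1)
    return ((state << 1) & 0b111) | feedback


def run(bits, state: int) -> int:
    """Feed the whole block through the register and return the final state."""
    for u in bits:
        state = step(state, u)
    return state


def circulation_states(bits) -> list:
    """Brute force: all initial states for which the final state equals the initial state."""
    return [s for s in range(8) if run(bits, s) == s]


if __name__ == "__main__":
    block = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]   # 10 input bits
    (c,) = circulation_states(block)          # exactly one circulation state here
    print("circulation state:", c)
    print("final state when starting at it:", run(block, c))
```

This toy register has a period of 7, so a unique circulation state exists whenever the block length is not a multiple of 7. The 16-state code has an analogous restriction, which is consistent with the rule that BURST_PAYLOAD_SIZE * 4 must not be a multiple of 15.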