Investigation on Technical Feasibility of Stronger RS FEC for 400GbE

Mark Gustlin (Xilinx), Xinyuan Wang (Huawei), Tongtong Wang (Huawei), Martin Langhammer (Altera), Gary Nicholl (Cisco), Dave Ofelt (Juniper), Bill Wilkie (Xilinx), Jeff Slavick (Avago), Zhongfeng Wang (Broadcom)

Introduction and Background
- This presentation investigates the technical feasibility of stronger RS FEC options.
- BCH FEC options perform differently for random and burst errors, with poor burst performance.
- BCH FEC implementations require more logic resources than RS FECs, even without KES duplication.
References: anslow_3bs_02_1114, langhammer_3bs_01_1114

400GbE Stronger FEC Tradeoff
- Overhead vs. SerDes rate and technology feasibility.
- Latency in sensitive applications such as finance and data centers (DC), especially for short-reach solutions (100/500 m).
- Higher hardware complexity leads to higher power and makes integration into a host ASIC or FPGA harder.
- Higher complexity/power can impact optical modules if the FEC is integrated into the module.

History of Ethernet Latency
- Existing low-latency Ethernet switches show latencies as low as 250-350 ns (for 10GE). These switches use cut-through switching, and this figure is the total latency including switching time.
- High-frequency trading (HFT) in financial applications and high-performance computing (HPC) in the DC are especially sensitive to latency.
- Latency in the DC is incurred by upper-layer protocols (TCP windows, flow control, etc.) and carries a significant cost in server implementation, especially memory.
- Our proposed FEC latency target for 400GbE is <250 ns. It was 100 ns for the 802.3bj KR4/KP4 FEC.

Coding Gain Calculation of RS(n,k,t,m) FEC
- CG/NCG is based on the 802.3bs BER objective of 1E-13.
- For ease of analysis, these slides assume random errors from white Gaussian noise only. Burst errors add some further penalty to CG/NCG.
- Coding Gain (CG) is the reduction from the raw BERin to the required BERpost value within the information signal.
- Net Coding Gain (NCG) is the CG corrected for the increased noise due to the bandwidth expansion needed for the FEC bits.
- Code rate R is the ratio of the bit rate without FEC to the bit rate with FEC.
- Transcoding lowers the over-clock and improves the Net Coding Gain.

$$\mathrm{CG} = 20\log_{10}\left[\operatorname{erfc}^{-1}(2\,\mathrm{BER}_{post})\right] - 20\log_{10}\left[\operatorname{erfc}^{-1}(2\,\mathrm{BER}_{in})\right]$$

$$\mathrm{NCG} = 20\log_{10}\left[\operatorname{erfc}^{-1}(2\,\mathrm{BER}_{post})\right] - 20\log_{10}\left[\operatorname{erfc}^{-1}(2\,\mathrm{BER}_{in})\right] + 10\log_{10} R$$
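The CG/NCG definitions above can be checked numerically. The following is a minimal sketch, not the authors' tool: it assumes R = k/n (no transcoding credit) and obtains the inverse erfc from the standard normal quantile in Python's standard library. The function names are illustrative.

```python
import math
from statistics import NormalDist

BER_POST = 1e-13  # 802.3bs BER objective quoted on the slide


def erfc_inv(y: float) -> float:
    """Inverse complementary error function.

    Uses erfc(x) = 2*Q(sqrt(2)*x), so erfc_inv(y) = -Phi_inv(y/2)/sqrt(2),
    where Phi_inv is the standard normal quantile function.
    """
    return -NormalDist().inv_cdf(y / 2.0) / math.sqrt(2.0)


def coding_gain_db(ber_in: float, ber_post: float = BER_POST) -> float:
    """Coding gain in dB for a given raw input BER."""
    return (20.0 * math.log10(erfc_inv(2.0 * ber_post))
            - 20.0 * math.log10(erfc_inv(2.0 * ber_in)))


def net_coding_gain_db(ber_in: float, n: int, k: int,
                       ber_post: float = BER_POST) -> float:
    """Net coding gain in dB, assuming code rate R = k/n."""
    return coding_gain_db(ber_in, ber_post) + 10.0 * math.log10(k / n)


if __name__ == "__main__":
    # BERin values taken from the options table for KR4 and KP4.
    for n, k, ber_in in [(528, 514, 3.92e-5), (544, 514, 3.09e-4)]:
        cg = coding_gain_db(ber_in)
        ncg = net_coding_gain_db(ber_in, n, k)
        print(f"RS({n},{k}): CG={cg:.2f} dB, NCG={ncg:.2f} dB")
```

With the BERin values listed in the options table, this reproduces the tabulated CG and NCG to within rounding, which suggests the table uses R = k/n for the NCG correction.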

Latency Estimation of RS(n,k,t,m) FEC
- These slides use the 100 Gb/s KR4 FEC at 644 MHz in an ASIC as the baseline.
- Latency is estimated from the RS FEC correction ability t and the parallelism (p1/p2) of each sub-block in the diagram.
- The FEC decoder performs error detection with error correction, as in CL 91.5.3.3, aka Mode A in 802.3bj.

Decoder pipeline (from diagram): Syndrome Computation -> KES (BM Algorithm) -> Chien Search Algorithm + Forney Algorithm, with Controller + Frame Buffer

t_syndrome = n/p1, with p1 = 16 for the KR4/KP4 FEC implementation in these slides
t_KES = x * 2t (if t_KES > t_syndrome, the KES is duplicated in these slides), with x = 1 for t <= 15 and x = 2 for t > 15. For longer RS FECs, the level of pipelining in the iterative calculation may increase due to the longer critical path.
t_chien + t_forney = n/p2 + 1, with p2 = 66/68 for the KR4/KP4 FEC implementation in these slides, p2 p1

FEC decode latency = ~(t_syndrome + t_KES + t_chien + t_forney)
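The latency model above reduces to a few lines of arithmetic. This is a rough sketch under stated assumptions: rounding n/p1 and n/p2 up to whole cycles is an assumption (the slide does not say how fractions are handled), and the function names are hypothetical.

```python
import math


def decode_latency_cycles(n: int, t: int, p1: int = 16, p2: int = 66) -> int:
    """Estimated decoder latency in clock cycles for RS(n, k, t, m)."""
    t_syndrome = math.ceil(n / p1)        # syndrome computation, p1 symbols/cycle
    x = 1 if t <= 15 else 2               # extra pipelining factor for larger t
    t_kes = x * 2 * t                     # key equation solver (BM algorithm)
    t_chien_forney = math.ceil(n / p2) + 1  # Chien search plus one Forney cycle
    return t_syndrome + t_kes + t_chien_forney


def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1e3 / clock_mhz


if __name__ == "__main__":
    cycles = decode_latency_cycles(n=528, t=7)  # KR4 baseline
    print(cycles, round(cycles_to_ns(cycles, 644.0), 1))
```

For the KR4 baseline, RS(528,514,7,10) with p1 = 16 and p2 = 66 at 644 MHz, the terms are 33 + 14 + 9 cycles, matching the ~87 ns baseline figure. Latencies tabulated for the stronger codes differ by a few cycles from this simple model, so treat it as an approximation.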

Area Estimation of RS(n,k,t,m) FEC
- For the area estimation method, refer to langhammer_3bs_01_1114.
- With modifications for the low-latency target and a larger permitted area, the KR4 FEC ASIC area ratio is Syndrome : KES : (Chien+Forney) = 20% : 40% : 40%.
- If t_KES > t_syndrome, the KES block is duplicated to match the throughput of the syndrome block. This increases area cost significantly for longer-block RS FECs.

Summary of RS FEC Options Considered

RS FEC(n,k,t,m)      CG (dB)  NCG* (dB)  BERin     Overhead  SerDes (Gb/s)  Block time  Latency**  Area ratio

Group 1: Similar RS FEC as KR4 FEC
RS(528,514,7,10)     5.39     5.28       3.92E-05  0%        25.78125       51.2ns      ~87ns      1X
RS(544,514,15,10)    6.64     6.39       3.09E-04  3.03%     26.5625        51.2ns      ~112ns     2.9X
RS(560,514,23,10)    7.3      6.93       7.60E-04  6.06%     27.34375       51.2ns      ~208ns     14.5X
RS(576,514,31,10)    7.76     7.26       1.30E-03  9.09%     28.125         51.2ns      ~258ns     33.4X

Group 2: Large Block RS FEC
RS(1056,1028,14,11)  6.07     5.95       1.29E-04  0%        25.78125       102.4ns     ~172ns     2.6X
RS(1088,1028,30,11)  7.12     6.88       6.06E-04  3.03%     26.5625        102.4ns     ~315ns     16.7X
RS(1120,1028,46,11)  7.7      7.33       1.20E-03  6.06%     27.34375       102.4ns     ~414ns     54.8X
RS(1152,1028,62,11)  8.11     7.61       1.90E-03  9.09%     28.125         102.4ns     ~514ns     129.5X

Group 3: RS(255,239)-like RS FEC
RS(255,239,8,8)      6.12     5.83       1.39E-04  6.7%      27.5           18.9ns      ~49ns      1.1X
RS(510,478,16,9)     6.85     6.57       4.21E-04  6.7%      27.5           42.5ns      ~162ns     5.3X
RS(1020,956,32,10)   7.34     7.06       7.95E-04  6.7%      27.5           93.1ns      ~304ns     27.2X

Group 4: 256/257b coding friendly RS FEC***
RS(800,771,14,10)    6.29     6.13       1.83E-04  1.01%     26.04          76.8ns      ~140ns     2.6X
RS(816,771,22,10)    6.95     6.71       4.84E-04  3.03%     26.5625        76.8ns      ~232ns     9.4X
RS(840,771,34,10)    7.58     7.22       1.10E-03  6.06%     27.34375       76.8ns      ~306ns     30.6X
RS(864,771,46,10)    8.02     7.53       1.80E-03  9.09%     28.125         76.8ns      ~379ns     72.1X

The latency and area ratios are based on current RS FEC ASIC implementations and could decrease with an optimized FEC algorithm or implementation.
*: NCG does not include the 0.12 dB gain from 256/257 transcoding.
**: Added latency for FEC only.
***: Needs dummy bits to support FEC lane distribution.
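The Overhead and SerDes rate columns appear to follow from n and k alone. The sketch below rests on an inferred assumption, not a statement from the slides: for the 256/257-transcoded groups the zero-overhead codeword length is taken as k*264/257 symbols (since 256b/257b transcoding packs four 66-bit blocks into 257 bits), while Group 3 is treated as untranscoded with overhead n/k - 1. Function names are illustrative.

```python
from fractions import Fraction

BASE_RATE_GBPS = Fraction(2578125, 100000)  # 25.78125 Gb/s lane rate at 0% overhead


def overclock(n: int, k: int, transcoded: bool = True) -> Fraction:
    """Fractional over-clock of RS(n, k) relative to the 64b/66b line rate."""
    if transcoded:
        # 256b/257b transcoding: 264 bits of 66b blocks become 257 bits,
        # so the codeword length that needs no over-clock is k*264/257.
        baseline_n = Fraction(k * 264, 257)
    else:
        baseline_n = Fraction(k)
    return Fraction(n) / baseline_n - 1


def serdes_rate_gbps(n: int, k: int, transcoded: bool = True) -> Fraction:
    """Per-lane SerDes rate implied by the over-clock."""
    return BASE_RATE_GBPS * (1 + overclock(n, k, transcoded))


if __name__ == "__main__":
    for n, k in [(528, 514), (544, 514), (560, 514), (576, 514), (840, 771)]:
        oc = float(overclock(n, k)) * 100
        print(f"RS({n},{k}): {oc:.2f}% -> {float(serdes_rate_gbps(n, k))} Gb/s")
```

Using exact fractions keeps the rates free of floating-point noise, which matters when they are later checked against an integer reference clock multiplier.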

Generic Rules for RS(n,k,t,m) FEC in Logic Layer with i FEC Lanes
- Rule 1: Prefer to keep the 16384*66bit*20 AM spacing.
Reference: gustlin_02a_0511

Generic Rules for RS(n,k,t,m) FEC in Logic Layer with i FEC Lanes (cont'd)
- Rule 2: The alignment marker must uniquely identify each FEC lane and be friendly to idle deletion (64 bit) for IPG adjustment. Generally, the AM length should be at least the LCM (least common multiple) of m, i and 64.
- Rule 3: FEC information block: k*m should be divisible by the transcoded block length if no dummy bits are added, e.g. 257 bit for 256/257 TC/DC, 65 bit for 64/65 TC or 513 bit for 512/513 TC.
- Rule 4: FEC block: n*m should be divisible by i*m, for example i=4 in the KR4/KP4 FEC.
- Rule 5: A feasible RCM (integer Reference Clock Multiplier) with 156.25 MHz. For example, KP4 FEC with 3% over-clocking gives RCM=170 for 26.5625 Gbps.
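Rules 2 to 5 are simple divisibility checks and can be sketched as below. The helper names are hypothetical, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
import math
from fractions import Fraction

REF_CLOCK_MHZ = Fraction(15625, 100)  # 156.25 MHz reference clock


def min_am_length_bits(m: int, i: int) -> int:
    """Rule 2: smallest AM length that is a multiple of m, i and 64."""
    return math.lcm(m, i, 64)


def rule3_ok(k: int, m: int, tc_block_bits: int) -> bool:
    """Rule 3: k*m information bits hold a whole number of transcoded blocks."""
    return (k * m) % tc_block_bits == 0


def rule4_ok(n: int, i: int) -> bool:
    """Rule 4: n*m divisible by i*m, i.e. n symbols split evenly over i lanes."""
    return n % i == 0


def rcm(rate_gbps: str) -> Fraction:
    """Rule 5: reference clock multiplier for a lane rate given as a decimal string."""
    return Fraction(rate_gbps) * 1000 / REF_CLOCK_MHZ


if __name__ == "__main__":
    print(min_am_length_bits(10, 4))   # KP4-style code, m=10, i=4
    print(rule3_ok(514, 10, 257))      # 5140 = 257 * 20
    print(rule4_ok(544, 4))
    for rate in ["25.78125", "26.5625", "27.34375", "28.125"]:
        r = rcm(rate)
        print(rate, r, r.denominator == 1)
```

An RCM whose `denominator` is 1 is an integer multiplier and therefore satisfies Rule 5.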

RS(576/560/544/528,514,31/23/15/7,10)
- 4096 FEC blocks in the AM period with 9.09%/6.06%/3.03%/0% over-clocking (gustlin_400_02a_1113).
- AM = 320 bit.
- FEC information block = 5140 bit = 257*20 with 256/257 TC/DC.
- FEC block = (576/560/544/528)*10 = (144/140/136/132)*4*10.
- RCM = 180/175/170/165 @ 156.25 MHz.

RS(1152/1120/1088/1056,1028,62/46/30/14,11)
- Not an integer number of FEC blocks in the AM spacing! Change the AM distance? Or overlap the 1st FEC block with part of the AM area? Not a good option for coupling the AM with FEC blocks.
- AM = 319 bit with 1 dummy bit.
- FEC information block = 1028*11 bit = 257*44 with 256/257 TC/DC.
- FEC block = (1152/1120/1088/1056)*11 = (288/280/272/264)*4*11.
- RCM = 180/175/170/165 @ 156.25 MHz.

RS(1020,956,32,10)
- Not an integer number of FEC blocks in the AM spacing!
- AM = 320 bit.
- FEC information block = 9560 bit, not an integer multiple of 65, 66, 257 or 513 bit. Change to 9570 bit to adapt to the 66-bit block.
- FEC block = 1020*10 = 255*4*10.
- RCM: not an integer @ 156.25 MHz.

RS(840,771,34,10)
- Extend the FEC block to 840*m with 10 dummy bits for easy implementation.
- Not an integer number of FEC blocks in the AM spacing!
- AM = 320 bit.
- FEC information block = 771*10 bit = 257*30 with 256/257 TC/DC.
- FEC block = 840*10 = 210*4*10.
- RCM = 175 @ 156.25 MHz. Same over-clock as RS(560,514,23,10).
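The "integer number of FEC blocks in the AM spacing" observations on the preceding slides can be reproduced with exact arithmetic. This sketch assumes the AM period of Rule 1 (16384*66 bit*20) is measured before 256/257 transcoding and shrinks by 257/264 afterwards, and that the AM bits are carried inside the FEC information blocks. Both are assumptions for illustration.

```python
from fractions import Fraction

AM_PERIOD_66B_BITS = 16384 * 66 * 20  # Rule 1 AM spacing in 66b-encoded bits


def fec_blocks_per_am_period(k: int, m: int) -> Fraction:
    """Number of RS(n, k) information blocks that fit in one AM period.

    Assumes 256/257 transcoding (264 bits in, 257 bits out) and k*m
    information bits per FEC block.
    """
    transcoded_bits = Fraction(AM_PERIOD_66B_BITS * 257, 264)
    return transcoded_bits / (k * m)


if __name__ == "__main__":
    for k, m in [(514, 10), (1028, 11), (771, 10)]:
        blocks = fec_blocks_per_am_period(k, m)
        status = "integer" if blocks.denominator == 1 else "not an integer"
        print(f"k={k}, m={m}: {float(blocks):.2f} blocks ({status})")
```

Under these assumptions only the k=514, m=10 family yields a whole number of blocks (4096), consistent with the slides' conclusion that the larger-block options would need a different AM spacing.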

Comparison of Possible Stronger RS FECs for 400GbE
We can pick some candidate stronger RS FECs with latency < ~250 ns and area < ~30X KR4 FEC.

RS FEC(n,k,t,m)      CG (dB)  NCG (dB)   BERin     Overhead  SerDes (Gb/s)  Block time  Latency    Area ratio  Hardware complexity
RS(528,514,7,10)     5.39     5.28       3.92E-05  0%        25.78125       51.2ns      ~87ns      1X          802.3bj
RS(544,514,15,10)    6.64     6.39       3.09E-04  3.03%     26.5625        51.2ns      ~112ns     2.9X        802.3bj
RS(560,514,23,10)    7.3      6.93       7.60E-04  6.06%     27.34375       51.2ns      ~208ns     14.5X       Implementation compatible with 802.3bj; costs more logic resources
RS(576,514,31,10)    7.76     7.26       1.30E-03  9.09%     28.125         51.2ns      ~258ns     33.4X       Implementation compatible with 802.3bj; costs significant logic resources
RS(1088,1028,30,11)  7.12     6.88       6.06E-04  3.03%     26.5625        102.4ns     ~315ns     16.7X       Costs more logic resources and requires changing the AM spacing of 16384; Rule 1 not satisfied
RS(1020,956,32,10)   7.34     7.06       7.95E-04  6.7%      27.5           93.1ns      ~304ns     27.2X       Costs too much logic resource and requires changing the AM spacing of 16384; Rules 1, 2 and 5 not satisfied
RS(840,771,34,10)    7.58     7.22       1.10E-03  6.06%     27.34375       76.8ns      ~306ns     30.6X       Costs too much logic resource and requires changing the AM spacing of 16384; Rule 1 not satisfied

Comparison of 4X100G/1X400Gbps RS(528,514) FEC in 400GbE Logic Layer

RS(528,514,7,10) (100Gbps), 160bit@644MHz (ASIC)
Block                         Area     Latency (cycles)
1. Syndrome (16 parallel)     0.2a     33
2. KES (BM)                   0.4a     14
3. Chien (66 parallel)        0.15a    8
4. Forney                     0.25a    1
TOTAL                         a        56 cycles (~87ns)

RS(528,514,7,10) (400Gbps), 660bit@624MHz (ASIC)
Block                         Area     Latency (cycles)
1. Syndrome (66 parallel)     0.825a   8
2. KES (BM) (X2 duplication)  0.8a     14
3. Chien (66 parallel)        0.15a    8
4. Forney                     0.25a    1
TOTAL                         2.025a   31 cycles (~49ns)

- The exact comparison is affected by process node, combinational logic, etc.
- To meet our low latency criteria, a 1x400G RS FEC at ~49 ns is around 2x the size of a 1x100G RS FEC at ~87 ns.
- For a real implementation of high parallelism in a 400G RS FEC, a reasonable area for the 1x400G RS FEC is larger than 2.5x the size of a 1x100G RS FEC.

Proposal of 400GbE Logic Layer with RS FEC
- RS FEC in the PCS to provide a single FEC in the system.
Diagram labels: MAC/RS, PCS, PMA, CDAUI-n, PMA, PMD, Medium, MDI
Reference: gustlin_3bs_02a_1114

Summary
- RS FEC seems like a good fit for this project: it is less complex to implement and gives better gain in the face of burst errors than a BCH code.
- There are several good RS FEC candidates in this presentation. We need to make the right tradeoff between latency, complexity and gain for the PMDs in order to select the best FEC code.
- Further work: deeper analysis of the FEC model and algorithm to refine the gain/latency/area of stronger RS FECs.
- Provide RS FEC candidates for the PMD discussion.

Thank you