100Gb/s Single-lane SERDES Discussion. Phil Sun, Credo Semiconductor IEEE New Ethernet Applications Ad Hoc May 24, 2017



Introduction
This contribution shares thoughts on 100Gb/s single-lane SERDES development and invites discussion on these topics:
- 100Gb/s SERDES opportunities and challenges
- Modulation choices: PAM4 vs. PAM8
- BER requirement and FEC
- Lower-power architecture for 100Gb/s long reach SERDES
- TX FIR training
  - Real-time tuning
  - TX training time

100Gb/s SERDES Opportunities and Challenges
- Higher speed SERDES is desired for higher throughput interconnect. On the other hand, it requires faster and more complex circuits.
- SERDES design takes advantage of faster process nodes to solve design challenges and meet power constraints.
- 100Gb/s short reach SERDES has been demoed on 28nm. Lower power may be achieved on 16nm and 7nm.
- 100Gb/s long reach has higher complexity.
[Figure: Generations of Process Nodes and SERDES Speeds. Process node (nm) and single-lane SERDES speed (Gb/s) versus year, 1995 to 2025. Nodes shown: 0.18um, 0.13um, 90nm, 65nm, 55nm, 40nm, 28nm, 16nm, 10nm, 7nm. Speeds shown: 1G, 2.5G, 10G, 25G, 50G, and 100G (year marked as open). Source: Goergen_nea_01a_0317]

100G Short Reach Design Results
- A 100Gb/s PAM4 SERDES for short reach has been developed and demoed.
- On a 28nm process node, the TX eye is clean.
- A multiple-tap TX FIR was applied for the TX eye measurement.
[Figures: TX Eye Diagram; Eye Monitor; Test Setup]

Modulation Choices: PAM4 vs. PAM8
- Considering the high clock rate of SERDES and the power/latency constraints, some hardware-costly equalization and FEC schemes are unlikely to be used for 100GE. This contribution compares two modulation schemes assuming SERDES RX DFE (at least 1 tap) may be used and FEC power/latency should not be dramatically increased.
- From PAM4 to PAM8, the bandwidth reduction is 1/3. This is less than the 1/2 bandwidth reduction from NRZ to PAM4.
- The PAM8 eye height is 7.4dB lower. Therefore it is more sensitive to residual ISI and circuit distortion.
- For PAM8, the DFE error propagation rate is higher (7/8 vs. 3/4), and each FEC symbol covers fewer (2/3 as many) PAM8 symbols. The burst error penalty is worse for FEC (e.g. Reed Solomon FEC).
- For the FEC schemes shown later, the DER requirement for PAM8 and PAM4 is 2.6E-8 and 3.8E-5 respectively to achieve an FLR equivalent to BER 1E-15. The corresponding SNR is 27.9dB and 18.8dB (9.1dB higher for PAM8). Note this still assumes the FEC for PAM8 has more latency and complexity.
- PAM8 results in higher DFE complexity.
[Figures: PAM4 eye; PAM8 eye]
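The PAM4 and PAM8 ratios quoted on this slide can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes equal peak-to-peak swing for both modulations, a 10-bit FEC symbol, and an (M-1)/M form for the worst-case DFE error propagation rate, which is inferred from the 3/4 and 7/8 values above rather than stated in the slides. The function name `pam_summary` is hypothetical.

```python
import math

def pam_summary(bit_rate_gbps, levels, fec_symbol_bits=10):
    """Back-of-envelope figures for an M-level PAM link (assumed model)."""
    bits_per_symbol = math.log2(levels)
    return {
        # Symbol rate needed to carry the bit rate.
        "baud_gbd": bit_rate_gbps / bits_per_symbol,
        # Eye height relative to NRZ for the same peak-to-peak swing:
        # the swing is shared by (levels - 1) stacked eyes.
        "eye_height_db": 20 * math.log10(1 / (levels - 1)),
        # Assumed worst-case DFE error propagation rate, (M-1)/M.
        "error_propagation": (levels - 1) / levels,
        # Number of PAM symbols spanned by one FEC symbol.
        "pam_symbols_per_fec_symbol": fec_symbol_bits / bits_per_symbol,
    }

if __name__ == "__main__":
    pam4 = pam_summary(100, 4)
    pam8 = pam_summary(100, 8)
    print("bandwidth reduction PAM4 -> PAM8:",
          1 - pam8["baud_gbd"] / pam4["baud_gbd"])
    print("eye height change (dB):",
          pam8["eye_height_db"] - pam4["eye_height_db"])
    print("FEC symbol coverage ratio:",
          pam8["pam_symbols_per_fec_symbol"] / pam4["pam_symbols_per_fec_symbol"])
```

Under these assumptions the eye height change is 20*log10(3/7), which rounds to the 7.4dB figure on the slide.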

PAM4 and PAM8 Performance Comparison
Maximum SNR at the decision point can be computed by the Salz SNR, which is:

$$\mathrm{SNR}_{\mathrm{salz}} = 10\log_{10}\left[\exp\left(\frac{1}{F_N}\int_0^{F_N}\ln\left(1+\frac{S(f)}{N(f)}\right)df\right)\right] \approx \frac{1}{F_N}\int_0^{F_N} 10\log_{10}\frac{S(f)}{N(f)}\,df = \mathrm{AVG}_{0\le f\le F_N}\left[\mathrm{SNR}_{dB}(f)\right] \approx \mathrm{TX\ SNR} - \frac{IL_{NY}}{2} = \frac{PT}{2N_0} - \frac{IL_{NY}}{2}$$

where $F_N$ is the Nyquist frequency, $P$ is the TX signal power, and $IL_{NY}$ is the insertion loss at $F_N$. The TX SNR term $PT/(2N_0)$ is taken in dB.
- For simplicity, system noise is assumed to be AWGN, and the channel is assumed to be dielectric loss (linear phase) dominant.
- For PAM8, T and $IL_{NY}$ are both 2/3 of PAM4.
- PAM4 performs better for channels with IL less than 50.8dB at the PAM4 Nyquist frequency. Considering skin loss, PAM4 performs better on even higher loss channels.
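The step from the Salz integral to "TX SNR minus half the Nyquist loss" can be checked numerically. The sketch below is a minimal illustration, assuming AWGN and a channel whose loss in dB grows linearly with frequency (the dielectric-loss case named above). The function names and the example values of 60 dB TX SNR and 30 dB Nyquist loss are hypothetical and not taken from the slides.

```python
import math

def salz_snr_db(tx_snr_db, il_nyquist_db, steps=10000):
    """Salz SNR by midpoint-rule integration over 0..F_N.

    Assumes S(f)/N(f) in dB equals tx_snr_db - il_nyquist_db * (f / F_N).
    """
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) / steps                      # normalized frequency f / F_N
        snr_linear = 10 ** ((tx_snr_db - il_nyquist_db * x) / 10)
        total += math.log(1.0 + snr_linear)
    mean_ln = total / steps
    # 10*log10(exp(mean_ln)) written without the exp to avoid overflow.
    return 10 * mean_ln / math.log(10)

def approx_snr_db(tx_snr_db, il_nyquist_db):
    """High-SNR approximation: average of SNR in dB over the band."""
    return tx_snr_db - il_nyquist_db / 2

if __name__ == "__main__":
    print(salz_snr_db(60.0, 30.0), approx_snr_db(60.0, 30.0))
```

The two values agree closely when the SNR stays well above 0 dB across the band. The approximation becomes pessimistic once the SNR near Nyquist approaches 0 dB, because ln(1 + x) is then noticeably larger than ln(x).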

DER Requirement and FEC
- For PAM4 modulation with KP4 FEC, the worst DFE error propagation rate (a) is 0.75. In this case, DER needs to be 2.9E-5 to achieve a frame loss ratio equivalent to BER 1E-12 (FLR=6.2E-10). The raw BER requirement is 5.8E-5.
- If the DFE error propagation rate can be limited to 0.6, the DER and BER requirements can be relaxed to 2.1E-4 and 3.2E-4.
- The raw BER requirement needs to be lower (shared) if there are multiple links.
- If 1E-15 post-FEC BER is required for some applications, the burst error penalty is very high and needs to be controlled.

DER Requirement

| BER Target | FLR     | a=0.75 | a=0.6  | a=0    |
|------------|---------|--------|--------|--------|
| 1E-12      | 6.2E-10 | 2.9E-5 | 2.1E-4 | 7.6E-4 |
| 1E-15      | 6.2E-13 | 2.5E-7 | 6.0E-5 | 5.0E-4 |

[Figure: KP4 FEC Performance for PAM4]
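For the a=0 column, where errors are independent, the link between DER and post-FEC failure can be sketched with a binomial tail. The example below is an illustration under stated assumptions, not the method used for the table: KP4 is taken as RS(544,514) over 10-bit symbols correcting up to 15 symbol errors, each RS symbol is assumed to span 5 PAM4 symbols, and the codeword failure probability is used as a rough proxy for FLR. The function names are hypothetical.

```python
import math

def rs_symbol_error_prob(der, pam_symbols_per_rs_symbol=5):
    """Probability that a 10-bit RS symbol contains at least one PAM4 error,
    assuming independent symbol errors (no DFE error propagation)."""
    return 1.0 - (1.0 - der) ** pam_symbols_per_rs_symbol

def codeword_failure_prob(p, n=544, t=15):
    """Probability that more than t of n RS symbols are in error (binomial tail)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for k in range(t + 1, n + 1):
        # Work in the log domain so large binomial coefficients do not overflow.
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_term)
    return total

if __name__ == "__main__":
    for der in (7.6e-4, 5.0e-4):
        print(der, codeword_failure_prob(rs_symbol_error_prob(der)))
```

With the a=0 DER values from the table, this simple model lands in the same decade as the corresponding FLR targets. It does not model burst errors, so it says nothing about the a=0.6 and a=0.75 columns.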

DER Requirement and Interleaved FEC
- Considering that a 1+D precoder is only effective on certain burst patterns, symbol interleaving is a more reliable way to treat burst errors.
- Assuming no interleaving for NRZ, 2-way interleaving for PAM4, and 3-way interleaving for PAM8, KP4 FEC net coding gain is much less for PAM8 than for NRZ and PAM4. 3-way interleaving also results in longer latency and higher complexity.
- Lane multiplexing schemes are not decided and may further degrade FEC coding gain to some extent.
- Preliminary simulation results in the following slides indicate the PAM4 DER requirement is reasonable. PAM8 needs a stronger FEC and/or THP if 1E-15 is required for some applications.

DER Requirement for Interleaved FEC

| BER Target | FLR     | NRZ    | PAM4   | PAM8   |
|------------|---------|--------|--------|--------|
| 1E-12      | 6.2E-10 | 2.3E-4 | 1.1E-4 | 7.8E-6 |
| 1E-15      | 6.2E-13 | 1.2E-4 | 3.8E-5 | 2.6E-8 |

FEC Latency

| Interleave        | no    | 2-way | 3-way |
|-------------------|-------|-------|-------|
| 100GE FEC Latency | 110ns | 160ns | 210ns |

[Figure: Interleaved KP4 FEC Performance. No interleaving for NRZ, 2-way interleaving for PAM4, 3-way for PAM8]
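The reason symbol interleaving helps against bursts can be shown with a small counting example. The sketch below assumes a simple round-robin interleaver that deals consecutive RS symbols to the codewords in turn. The actual interleaving scheme is not specified in the slides, and the function name is hypothetical.

```python
def max_errors_per_codeword(burst_start, burst_len, ways):
    """Worst-hit codeword when a burst of consecutive RS symbol errors
    is spread over `ways` round-robin interleaved codewords."""
    counts = [0] * ways
    for idx in range(burst_start, burst_start + burst_len):
        counts[idx % ways] += 1     # symbol idx belongs to codeword idx mod ways
    return max(counts)

if __name__ == "__main__":
    for ways in (1, 2, 3):
        print(ways, "way:", max_errors_per_codeword(0, 6, ways))
```

A burst that costs one codeword six symbol errors without interleaving costs each codeword at most three with 2-way and two with 3-way interleaving, at the price of the extra latency shown in the table.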

Power Challenge of 100Gb/s LR SERDES
- Given the same channel, reflections may span twice as many UIs because the baud rate doubles.
- FFE or DFE is commonly used for equalization and consumes a big portion of SERDES power (usually 25% to 50% depending on architecture).
- FFE or DFE needs twice the number of taps and twice the throughput compared to 50Gb/s. Power of the RX FFE or DFE theoretically will be up to 4x on the same process node!
- Because throughput or bandwidth doubles, power of other major components (ADC, TX, CTLE) theoretically doubles as well.
- For a switch ASIC with 128 or 256 ports, this power increase is significant! Solutions need to be found!
[Figure: Single Bit Response of a 100Gb/s channel]
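The scaling argument on this slide can be written out as a small calculation. The sketch below is a first-order model only: it assumes equalizer power scales with taps times throughput and all other blocks scale with throughput, on the same process node. The function names are hypothetical.

```python
def equalizer_power_ratio(tap_ratio=2.0, throughput_ratio=2.0):
    """Assumed first-order model: FFE/DFE power scales with taps x throughput."""
    return tap_ratio * throughput_ratio

def serdes_power_ratio(equalizer_share, throughput_ratio=2.0, tap_ratio=2.0):
    """Total 100Gb/s power relative to 50Gb/s on the same process node.

    equalizer_share is the fraction of 50Gb/s SERDES power spent in FFE/DFE
    (25% to 50% per the slide). The remaining blocks are assumed to scale
    with throughput only.
    """
    eq = equalizer_share * equalizer_power_ratio(tap_ratio, throughput_ratio)
    rest = (1.0 - equalizer_share) * throughput_ratio
    return eq + rest

if __name__ == "__main__":
    for share in (0.25, 0.5):
        print(share, serdes_power_ratio(share))
```

Under this model, a SERDES that spends 25% to 50% of its power on equalization would need roughly 2.5x to 3x the power at 100Gb/s before any architectural or process improvements.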

Lower-Power 100Gb/s Architecture Opportunity
[Figures: Conventional SERDES Architecture; A Low-power SERDES Architecture with simpler RX (moves FFE to TX)]
- The proposed SERDES moves the FFE to the TX. Therefore, the receiver can be much simpler, for example a CTLE and a 1-tap DFE.
- A TX FFE is much less expensive than an RX FFE because the input bit width is much smaller and multipliers can be avoided.
- ADC power can be reduced as well, because the dynamic range is reduced.
- TX-centric equalization is not new. It is commonly used to save receiver power and manage interference. In the SERDES case, a TX FIR costs much less than an RX FFE/DFE. Interoperation and test experience can be borrowed from these projects.
- About 30% SERDES power reduction compared to the conventional architecture!
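One way to see why a TX FFE can avoid multipliers is that its input is a PAM4 symbol with only four possible values, so each tap product can be precomputed. The sketch below illustrates this with a lookup table per tap. The tap values, level mapping, and function names are hypothetical and are not taken from the presented design.

```python
PAM4_LEVELS = (-3, -1, 1, 3)   # assumed level mapping for symbol indices 0..3

def build_tap_luts(taps):
    """Precompute tap * level for every tap, so the datapath only looks up and adds."""
    return [[c * level for level in PAM4_LEVELS] for c in taps]

def tx_fir_lut(symbol_indices, luts):
    """FIR output using per-tap lookup tables instead of multipliers."""
    out = []
    for n in range(len(symbol_indices)):
        acc = 0.0
        for k, lut in enumerate(luts):
            if n - k >= 0:
                acc += lut[symbol_indices[n - k]]
        out.append(acc)
    return out

def tx_fir_direct(symbol_indices, taps):
    """Reference FIR with explicit multiplications, for comparison."""
    out = []
    for n in range(len(symbol_indices)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * PAM4_LEVELS[symbol_indices[n - k]]
        out.append(acc)
    return out

if __name__ == "__main__":
    taps = [-0.1, 0.8, -0.1]          # hypothetical 3-tap example
    data = [0, 3, 1, 2, 2, 0, 3]
    print(tx_fir_lut(data, build_tap_luts(taps)))
```

An RX FFE, by contrast, operates on multi-bit ADC samples, so each tap needs a full multiplier at the sample rate.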

Noise and Distortion Analysis
Noise and distortion sources:
- TX noise
- CTLE noise
- ADC noise
- FEXT
- NEXT
- Signal distortion and ADC dynamic range
[Figure: FFE2 is moved from RX to TX]

Noise and Distortion Analysis cont.
- Signal: the same at the slicer input if the system is linear.
- Distortion and noise:
  - Reflection ISI can be better cancelled because more FFE taps can be implemented on the TX side with lower power.
  - Easier RX and better linearity. For example, the CTLE output signal dynamic range is smaller, with less distortion. This is important for a PAM4 signal.
  - The ADC needs less dynamic range, and there is no noise enhancement by an RX FFE.
  - XTALK: aggressors have lower PSD. The XTALK impact is the same from aggressors using the same structure.
  - NEXT and CTLE noise are relatively boosted. The difference can be controlled if TX FIR post cursors are only used to cancel reflections (the RX takes care of material loss).
- The noise enhancement and distortion tradeoff is application dependent. The advantage of a heavy TX FIR scheme is that it cancels reflections and alleviates distortion with significantly less power.
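The statement that the signal at the slicer input is unchanged when the FFE moves from RX to TX follows from the fact that linear filters commute. A minimal sketch, assuming an ideal noiseless linear channel and using hypothetical tap and channel values:

```python
def convolve(a, b):
    """Plain discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def slicer_input_tx_ffe(symbols, ffe, channel):
    """FFE applied at the transmitter, then the channel."""
    return convolve(convolve(symbols, ffe), channel)

def slicer_input_rx_ffe(symbols, ffe, channel):
    """Channel first, then the same FFE applied at the receiver."""
    return convolve(convolve(symbols, channel), ffe)

if __name__ == "__main__":
    symbols = [3, -1, 1, -3, 3, 1]
    ffe = [-0.1, 1.0, -0.25]          # hypothetical equalizer taps
    channel = [0.1, 0.7, 0.3, 0.05]   # hypothetical sampled pulse response
    print(slicer_input_tx_ffe(symbols, ffe, channel))
    print(slicer_input_rx_ffe(symbols, ffe, channel))
```

The two outputs match to rounding error. What differs in a real link is where noise and nonlinearity enter relative to the filter, which is the subject of the bullets above.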

Performance simulation setup
- Test channel: 33.28dB IL at 26.5625GHz including package, with bad reflections.
- 5 FEXT and 3 NEXT channels. NEXT noise dominates.
- TX SNDR: 34dB.
- Jitter: RJ 0.01 UI RMS, Even/Odd: 0.02 UI p2p.

Performance simulation result
- Scheme 1: Traditional RX equalization with a 19-tap RX FFE. BER is 2.2E-4.
- Scheme 2: TX-FIR equalization with a 29-tap TX FIR. BER is 7.5E-6.
* Both schemes have a CTLE and a 1-tap RX DFE.

Performance Analysis
- Scheme 1 performance is limited by residual ISI. It would burn a lot of power for an RX FFE/DFE to have 25 post cursors.
- Scheme 2 has much less residual ISI. NEXT noise is relatively boosted. Overall, scheme 2 has better performance and costs less power.
- In scheme 2, 25% of the TX FIR tap weights are for reflections. The scheme 2 TX FIR frequency response is about 2dB lower. This channel has quite severe reflections.
- For a smoother channel, 10% of the TX FIR tap weights are normally used for reflections. The TX signal power penalty is only about 1dB. Overall performance will be better because of less distortion on the RX.

TX FIR Real-Time Adaptation
- A startup training mechanism is defined in IEEE 802.3 Clause 94 and 136. Its purpose is to adapt the TX FIR.
- Channels vary considerably with temperature or humidity.
- More TX FIR taps are needed for 100G, so the TX FIR has more impact on performance.
- Is TX FIR real-time adaptation desired for optimal performance and a simpler RX? This kind of adaptation rate can be low because channel variation is slow.
- How can training information be passed to the remote TX during normal data traffic? Training info includes control and status, and needs to travel in both directions.
[Figure: two link partners, each with a TX FIR and an RX, carrying normal traffic in both directions together with status info and control info]

Finding Back Channel
- An Alignment Marker (AM) is inserted for lane alignment and FEC boundary (e.g. IEEE 802.3 Clause 133), and is mandatory for links with PAM4 signaling.
- Back channel mechanism:
  - The TX adds a status and control field (from the local RX) into the alignment marker.
  - The RX has detection logic to lock to the alignment marker and fetch status and control commands (for the local TX).

TX FIR Update Rate
AM spacing:

| Speed | AM spacing (66b blocks) | Spacing on PAM4 SERDES | Time interval between AM (us) |
|-------|-------------------------|------------------------|-------------------------------|
| 50GE  | 20480x4                 | 16384x5                | 104.86                        |
| 100GE | 16384x20                | 16384x5                | 104.86                        |
| 200GE | 81920x4                 | 16384x5                | 104.86                        |
| 400GE | 163840x4                | 16384x5                | 104.86                        |

The update rate is about 10000 times per second, enough to track temperature/humidity variations.
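The 104.86 us interval can be reproduced from the AM spacing. The sketch below assumes the spacing of 16384x5 blocks of 66 bits is carried on a lane running at 51.5625 Gb/s, the 64b/66b-coded rate of a 50 Gb/s lane. That line rate is an assumption chosen because it reproduces the table value, and the function names are hypothetical.

```python
def am_interval_us(blocks_between_am, line_rate_gbps, bits_per_block=66):
    """Time between alignment markers in microseconds."""
    bits = blocks_between_am * bits_per_block
    nanoseconds = bits / line_rate_gbps      # bits / (Gb/s) gives ns
    return nanoseconds / 1e3

def updates_per_second(interval_us):
    """Back channel update opportunities per second, one per alignment marker."""
    return 1e6 / interval_us

if __name__ == "__main__":
    interval = am_interval_us(16384 * 5, 51.5625)   # assumed lane rate
    print(interval, updates_per_second(interval))
```

Under this assumption the interval works out to about 104.86 us, which gives somewhat under 10000 update opportunities per second.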

Back Channel Mechanism
[Figure: Back Channel Diagram. TX0/RX0 and TX1/RX1, with "AM lock, add Training Info" on the transmit side and "AM lock, fetch Training Info" on the receive side]
- This AM lock is done in the SERDES for repeater applications. The logic only locks to the AM without doing alignment, so the hardware cost is trivial.
- For implementations with a FEC layer, the logic could be shared.

Training Info Field
- To get a back channel, some bits in the AM can be reserved or reused for each FEC lane. For example, reserve some bits in 2 FEC lanes as shown in the following table.
- Reliability can be guaranteed by error detection protocols.
- The RX knows whether training info should be expected in the AM during frame training or via MDIO.

A possible approach of back channel bits allocation

| FEC Lane | Reed Solomon Symbols                                               | Back channel field |
|----------|--------------------------------------------------------------------|--------------------|
| Lane 0   | amp_tx_0(0:63), amp_tx_4(0:63), amp_tx_8(0:63), amp_tx_12(0:63)    | command field      |
| Lane 1   | amp_tx_1(0:63), amp_tx_5(0:63), amp_tx_9(0:63), amp_tx_13(0:63)    | status field       |
| Lane 2   | amp_tx_2(0:63), amp_tx_6(0:63), amp_tx_10(0:63), amp_tx_14(0:63)   |                    |
| Lane 3   | amp_tx_3(0:63), amp_tx_7(0:63), amp_tx_11(0:63), amp_tx_15(0:63)   |                    |

TX Training Time
- Current TX FIR training updates only one coefficient per training frame.
- More TX FIR taps are needed for 100G, which results in more TX training work. Can a longer training time be tolerated by the upper layer?
- Can we update multiple coefficients simultaneously to speed up training? This would need an extended control/status field structure with dedicated bits for each coefficient.
- It may be useful to add status information, such as the number of unused drivers and each coefficient weight.
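A control field with dedicated bits per coefficient could look like the sketch below. This is a purely hypothetical encoding for illustration: 2 bits per tap with hold, increment, and decrement codes. It is not taken from the slides or from any IEEE 802.3 clause, and all names are invented.

```python
# Hypothetical 2-bit request codes, one code per TX FIR coefficient.
HOLD, INCREMENT, DECREMENT = 0, 1, 2

def pack_requests(requests):
    """Pack one request per coefficient into a single control word."""
    word = 0
    for i, req in enumerate(requests):
        if req not in (HOLD, INCREMENT, DECREMENT):
            raise ValueError(f"invalid request {req!r} for tap {i}")
        word |= req << (2 * i)
    return word

def unpack_requests(word, num_taps):
    """Recover the per-coefficient requests from a control word."""
    return [(word >> (2 * i)) & 0b11 for i in range(num_taps)]

def apply_requests(coeffs, requests, step):
    """Apply all requests in one training frame instead of one tap per frame."""
    delta = {HOLD: 0.0, INCREMENT: step, DECREMENT: -step}
    return [c + delta[r] for c, r in zip(coeffs, requests)]

if __name__ == "__main__":
    reqs = [INCREMENT, HOLD, DECREMENT, HOLD]
    word = pack_requests(reqs)
    print(bin(word), unpack_requests(word, len(reqs)))
    print(apply_requests([0.0, 1.0, 0.0, 0.0], reqs, 0.5))
```

With such a field, a frame that previously moved one coefficient by one step could move every coefficient by one step, so the number of training frames would scale with the largest adjustment rather than with the number of taps.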

Summary
- 100Gb/s PAM4 SERDES is desired for higher speed interconnect and is being shown on silicon.
- Two modulation schemes are compared. PAM4 is preferable to PAM8 considering the joint performance of FEC and SERDES.
- The DER requirement for interleaved KP4 FEC is studied. With the development of channels and SERDES, there will be more information on whether a stronger FEC is needed.
- SERDES power may increase dramatically due to the equalization challenge and the speed of a 100Gb/s electrical link, resulting in a significant ASIC power increase.
- A low-power architecture opportunity for 100Gb/s LR SERDES: a standard supporting heavy TX FFE will enable remarkable SERDES power reduction!
- Real-time TX training and a faster adaptation mechanism are introduced for robust SERDES performance.

Thanks!