LOW POWER DIGITAL EQUALIZATION FOR HIGH SPEED SERDES. Masum Hossain University of Alberta



Outline
- Why an ADC-based receiver?
- Challenges in ADC-based receivers
- ADC-DSP based receiver
- Reducing the impact of quantization noise
- Variable-resolution ADC
- Low-latency, high-resolution TDC-based timing recovery
- Implemented prototype and measured results

Conventional Mixed-Signal Link
- Tx FIR filter: peak-power constrained; limited by supply voltage.
- Peaking equalizer: analog, does not scale well; limited by supply voltage; sensitive to PVT variation.
- Decision feedback equalizer: latency constrained; difficult for multilevel signaling.
- Existing equalization strategies do not scale well with technology, channel loss and data rate.

Mixed-Signal vs ADC-Based Link
[Figure: analog mixed-signal link compared with an ADC-based high-speed link with a digital back end]
Benefits of DSP-based equalization:
- Scales well with technology.
- Frequency response can be well controlled.
- Can equalize both pre-cursors and post-cursors.
Challenges of DSP-based equalization:
- ADC-DSP is power hungry.
- Higher loop latency makes timing recovery difficult.

PAM-4 Digital Receiver Architecture
- Variable-resolution predictive ADC
- 8-tap digital FFE: 3 taps in a look-up table, 5 taps implemented conventionally
- Timing recovery: 3-bit TDC

Variable Resolution ADC: 12 dB loss
[Figure: normalized step response and comparator references vs. time (bit periods), with 4 fixed references and the transient, data and edge samples marked]
- Between two consecutive samples the signal changes a lot.
- The references need to cover the entire dynamic range: 4 fixed references.

Variable Resolution ADC: 25 dB loss
[Figure: normalized step response and comparator references vs. time (bit periods), with the transient, data and edge samples marked]
- Between two consecutive samples the signal changes by around 20% to 30%.
- The references need to cover only a portion of the entire dynamic range: reference switching.

Variable Resolution ADC: 25 dB loss
[Figure: normalized step response and comparator references vs. time (bit periods), with the transient, data and edge samples marked]
- The edge comparator output defines the next probable location of the references.

Variable Resolution ADC: 25 dB loss
[Figure: normalized step response and comparator references vs. time (bit periods), with 2 edge references, the fine reference, and the transient, data and edge samples marked]
- Fine references are carried over to the middle of two coarse references.
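The reference-switching idea on the preceding slides can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the circuit from the talk: the function names, the window width of 0.4, the 4 bins, and the sample values are all hypothetical. It compares a quantizer with fixed references spanning the full range against one whose references follow the previous estimate.

```python
def uniform_quantize(x, n_levels=4):
    """Quantize x in [0, 1] with n_levels fixed bins; return the bin centre."""
    step = 1.0 / n_levels
    idx = min(int(x / step), n_levels - 1)
    return (idx + 0.5) * step


def predictive_quantize(samples, n_levels=4, window=0.4, start=0.5):
    """Quantize each sample with n_levels bins placed in a window of width
    `window` centred on the previous estimate (hypothetical model of
    reference switching)."""
    estimate = start
    estimates = []
    step = window / n_levels
    for x in samples:
        # Slide the reference window to follow the previous estimate,
        # keeping it inside the [0, 1] input range.
        lo = min(max(estimate - window / 2, 0.0), 1.0 - window)
        idx = min(max(int((x - lo) / step), 0), n_levels - 1)
        estimate = lo + (idx + 0.5) * step
        estimates.append(estimate)
    return estimates


# Slowly varying samples, as expected on a high-loss channel (made-up values).
samples = [0.50, 0.55, 0.62, 0.70, 0.78, 0.85]
fixed_err = max(abs(x - uniform_quantize(x)) for x in samples)
pred_err = max(abs(x - e) for x, e in zip(samples, predictive_quantize(samples)))
print(f"fixed: {fixed_err:.3f}, predictive: {pred_err:.3f}")
```

With the same number of bins, the windowed quantizer has a smaller step because it only spans the part of the range the next sample can plausibly reach. The model breaks down if a sample jumps outside the window, which is the low-loss case where fixed references are needed.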

Variable Resolution ADC: Sample and Hold
[Figure: sample-and-hold block diagram with coarse, fine and edge paths, odd and even slices, quad and octal clocks, PGEN blocks, a /2 divider with matched delay, and a 3.5 GHz clock]
- The quad and octal clocks are retimed with the original quad clock.

ADC Offset Correction (Ref: [2])
- Unbalance the capacitive load attached to the input of the strong-arm latch.
- Store the bit decisions in a 6T SRAM to reduce the area.

Measured ADC Performance


Timing Recovery Challenge for ADC-based Receiver
[Figure: timing recovery loop with digital FFE, MM phase detector and digital filter; labels Φ_Q and Φ_N]
- MM-based phase detection is not as robust as a 2x-sampled (i.e. data and edge) CDR.
- Bang-bang, or 1-bit, phase quantization at the phase detector increases in-band jitter.
- Lowering the loop bandwidth increases the VCO phase noise contribution.
- Loop latency makes it difficult to achieve a wider loop bandwidth.

Effect of Timing Noise on SNR
- The effect of timing noise on SNR is smaller once channel loss is taken into account.
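One way to see why channel loss softens the jitter penalty is a first-order model in which the voltage error caused by jitter equals the signal slope times the timing error. The sketch below is an illustration, not the analysis from the talk: the two-tone signal, the tone frequencies, the 25 dB attenuation and the 0.5 ps rms jitter are hypothetical choices. Loss attenuates the high-frequency tone, which has the steepest slope, so jitter-induced noise drops faster than signal power.

```python
import math


def jitter_limited_snr_db(tones, sigma_t):
    """SNR in dB of a sum of sinusoids sampled with rms timing jitter sigma_t.

    tones: list of (frequency_hz, amplitude) pairs.
    Small-jitter approximation: voltage error = slope * timing error.
    """
    signal_power = sum(a * a / 2 for _, a in tones)
    noise_power = sum((2 * math.pi * f * a * sigma_t) ** 2 / 2 for f, a in tones)
    return 10 * math.log10(signal_power / noise_power)


SIGMA_T = 0.5e-12            # rms jitter in seconds (assumed)
F_LOW, F_NYQ = 0.7e9, 7e9    # low-frequency and Nyquist tones in Hz (assumed)
LOSS_DB = 25.0               # attenuation applied to the Nyquist tone (assumed)

flat = [(F_LOW, 1.0), (F_NYQ, 1.0)]
lossy = [(F_LOW, 1.0), (F_NYQ, 10 ** (-LOSS_DB / 20))]

print(f"flat channel:  {jitter_limited_snr_db(flat, SIGMA_T):.1f} dB")
print(f"lossy channel: {jitter_limited_snr_db(lossy, SIGMA_T):.1f} dB")
```

For a single full-scale tone the function reduces to the familiar jitter limit, SNR = -20 log10(2π f σ_t).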

Phase Tracking vs Blind ADC-Based [Clifford et al., JSSC, 2013]
Phase tracking:
- Simple, but latency sensitive.
- The ADC benefits from jitter tracking.
Blind ADC-based:
- Less latency sensitive.
- The ADC does not benefit from jitter tracking.

Low-latency Timing Recovery
[Figure: diagram divided into Region 0, Region 1, Region 2 and Region 3]

Low-latency Timing Recovery
[Figure: SAR TDC operation and the proposed CDR]
Advantages:
- ADC bypass significantly reduces latency.
- The 3b SAR TDC reduces bang-bang dithering by 4x.
- Wider loop BW effectively filters VCO phase noise.
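A successive-approximation quantizer resolves one bit per step by halving the search interval. The sketch below is a behavioral illustration only: the function names and the choice of a [0, 1) UI input range are assumptions, and it does not model the actual TDC circuit. It also shows one reading of the 4x figure: a 3-bit code has an LSB four times smaller than a 1-bit (bang-bang) decision over the same range.

```python
def sar_tdc(phase, bits=3):
    """Quantize a phase in [0, 1) UI to a `bits`-bit code by binary search."""
    lo, hi = 0.0, 1.0
    code = 0
    for _ in range(bits):
        mid = (lo + hi) / 2
        bit = 1 if phase >= mid else 0
        code = (code << 1) | bit
        # Keep the half of the interval that contains the input phase.
        if bit:
            lo = mid
        else:
            hi = mid
    return code


def lsb(bits):
    """Quantization step in UI for a `bits`-bit quantizer over [0, 1) UI."""
    return 1.0 / (1 << bits)


print(sar_tdc(0.3), sar_tdc(0.9))   # 3-bit codes
print(lsb(1) / lsb(3))              # step ratio, 1-bit vs 3-bit
```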

CDR Performance
[Figure: phase noise, free-running vs. locked; integrated jitter = 0.5 ps, in-band phase noise = -90 dBc/Hz]
[Figure: jitter tolerance (UIpp) vs. frequency (MHz) with a 2^7-1 pattern; equipment limit marked]


Noise Sources in ADC-based Receiver
[Figure: receiver chain with noise sources N_LEQ, N_ADC, N_QZ and Φ_N (timing recovery) feeding the digital FFE]

Noise source | Constraint            | Transfer gain
N_LEQ        | Power / gain / BW     | LEQ + FFE
Φ_N          | Power and latency     | FFE
N_ADC        | Power / settling time | FFE
N_QZ         | ADC resolution        | FFE

[Figure: power (mW) vs. ADC resolution (number of bits) for a flash ADC, Fs = 14 GS/s]

Quantization Noise Impact
[Figure: channel pulse response with cursors h_pre, h_main and h_post, followed by the ADC quantization noise N_QZ and the FFE with weights W]

$N_{QZ,out}^2 = N_{Q,Pre}^2 W_{Pre}^2 + N_{Q,Main}^2 W_{Main}^2 + N_{Q,Post}^2 W_{Post}^2 = \sum_{x \in \{Pre,\,Main,\,Post\}} N_{Q,x}^2 W_x^2$

[Figure: amplitude (dB) vs. analog input frequency (GHz), showing the simulated FFT at the ADC output, the simulated FFT at the FFE output, the theoretical ADC quantization noise floor, and the theoretical quantization noise floor at the FFE output]
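The noise amplification through the FFE can be checked numerically. The sketch below assumes the textbook model of quantization noise (uniform, white, power Δ²/12) and made-up tap weights; none of these numbers come from the talk. Under that model each tap scales its noise sample by its weight, and the powers add.

```python
import math


def quantization_noise_power(full_scale, bits):
    """Noise power of an ideal uniform quantizer: step^2 / 12."""
    step = full_scale / (1 << bits)
    return step * step / 12


def ffe_noise_gain(weights):
    """Power gain seen by white input noise through an FIR filter:
    the sum of squared tap weights."""
    return sum(w * w for w in weights)


WEIGHTS = (-0.2, 1.0, -0.4)   # hypothetical pre, main and post tap weights

nq_in = quantization_noise_power(full_scale=1.0, bits=5)
nq_out = nq_in * ffe_noise_gain(WEIGHTS)
gain_db = 10 * math.log10(ffe_noise_gain(WEIGHTS))
print(f"noise in: {nq_in:.3e}, noise out: {nq_out:.3e}, gain: {gain_db:.2f} dB")
```

Larger equalization (larger pre and post weights relative to the main tap) raises the sum of squares and therefore the quantization noise floor at the FFE output.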

How to Reduce ADC Quantization Noise Impact?
[Figure: N-bit ADC output with quantization noise N_QZ passing through Z^-1 delays and taps h_main and h_post, producing 2N-bit products with noise terms N_QMain and N_QPost]
- Although the digital FFE output can be 2N bits wide, we are still limited by the ADC's N-bit resolution.
- If the FFE can be moved ahead of the ADC, then we can minimize the ADC's quantization noise penalty.
- How can we build a digital FFE with resolution better than the ADC?

Reducing Quantization Noise Impact
[Figure: LUT FFE and conventional FFE; 5-bit inputs feed an address decoder, with 9-bit outputs]
- Implementing the first three taps in a LUT reduces the quantization noise impact.
- Taps 3 to 8 do not significantly amplify quantization noise.
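The mechanics of a look-up-table FFE stage can be sketched as follows. This is a generic illustration, not the implementation from the talk: the tap weights, the number of fractional bits and the address packing are hypothetical. Three 5-bit ADC codes form a 15-bit address, and each table entry holds the precomputed weighted sum with extra fractional bits, so no rounding happens per multiplier.

```python
from itertools import product

BITS = 5                        # ADC code width
LEVELS = 1 << BITS
WEIGHTS = (-0.15, 1.0, -0.35)   # hypothetical weights for the first three taps
FRAC_BITS = 4                   # assumed extra fractional bits per LUT entry
SCALE = 1 << FRAC_BITS


def address(x0, x1, x2):
    """Pack three 5-bit codes into one 15-bit LUT address."""
    return (x0 << (2 * BITS)) | (x1 << BITS) | x2


def build_lut(weights):
    """Precompute the 3-tap weighted sum for every input combination."""
    lut = [0] * (LEVELS ** 3)
    for x0, x1, x2 in product(range(LEVELS), repeat=3):
        acc = weights[0] * x0 + weights[1] * x1 + weights[2] * x2
        lut[address(x0, x1, x2)] = round(acc * SCALE)  # round once, at the end
    return lut


LUT = build_lut(WEIGHTS)


def lut_ffe(x0, x1, x2):
    """Look up the 3-tap FFE output for three consecutive ADC codes."""
    return LUT[address(x0, x1, x2)] / SCALE


direct = WEIGHTS[0] * 3 + WEIGHTS[1] * 17 + WEIGHTS[2] * 30
print(lut_ffe(3, 17, 30), direct)
```

The table has 2^15 entries for three 5-bit inputs, which is consistent with the area growth discussed for the LUT approach; extending the table to more taps would grow it exponentially.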

Reducing Quantization Noise Impact
[Figure: power vs. number of taps and tap resolution for the 8-tap conventional FFE]
[Figure: power vs. number of taps and tap resolution for the 3-tap LUT + 5-tap conventional FFE]
- The proposed approach consumes 30% less power than the conventional FIR implementation.

Area Impact of the Proposed Solution
[Figure: layouts of the 8-tap conventional FFE and the 3-tap LUT + 5-tap conventional FFE, with dimension labels 500 µm, 1000 µm, 500 µm and 1300 µm]
- Area increases by 4x, but standard-cell SRAM will reduce it by 25%.
- Area will scale significantly with technology.

Implemented Prototype in 65nm CMOS
- Implemented in TSMC 65nm.
[Figure: power breakdown charts with slices DSP, Analog, TDC, Digital (T-to-B, mode selection, retimer) and Clk. Gen + Buffer. Long reach values, in transcription order: 30 mW, 40 mW, 33 mW, 29 mW, 28 mW. Medium reach values, in transcription order: 26 mW, 35 mW, 24 mW, 26 mW, 23 mW.]
[Figure: chip block diagram with high-BW amplifier, passive equalizer, reference generator, P0 HR (fine S/H), P0 (coarse S/H), P315 (edge S/H), 3.5 GHz clock generator, even and odd TDC, T-to-B converters, mode selection, channels CH0, CH90, CH180 and CH270, digital interface, and DSP on FPGA]

Implemented Prototype in 65nm CMOS
[Figure: prototype chip with outputs to the FPGA]
- Heavily digital solution.
- The input needs only 7 GHz of bandwidth.

Experimental Setup
[Figure: test setup with matched SMA cables, test PCB, FPGA interface, input clock, and Cyclone V FPGA]
- Channel loss is varied by cascading SMA cables.

Input Eye in Digital Domain
[Figure: frequency responses of the LR, MR and SR channels]
[Figure: linear equalizer output eye and the digital eye reconstructed from the ADC output, ADC code (0 to 31) vs. time (UI), for the MR and LR channels]
- Tx has 6 dB equalization.
- Linear equalizer boost: 6 to 14 dB.

Link Margin at 28 Gb/s, 30 dB Channel
[Figure: BER and occurrence vs. equalized output code (-3, -1, 1, 3) for the 3-tap LUT + 5-tap conventional FFE and for the 8-tap conventional FFE]
- The FPGA gives the distribution of the bins.
- The distribution is converted to log scale.
- A Gaussian fit is used to extract the BER.
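The BER extraction step can be sketched in a few lines. Note the swap: the slide describes fitting in the log domain, whereas this simplified example fits the Gaussian from the histogram's mean and variance. The function names and the example histogram are hypothetical, and a real PAM-4 estimate would combine the tails of all four levels across the three decision thresholds.

```python
import math


def fit_gaussian(bins, counts):
    """Estimate mean and standard deviation from a histogram (moment fit)."""
    total = sum(counts)
    mean = sum(b * c for b, c in zip(bins, counts)) / total
    var = sum(c * (b - mean) ** 2 for b, c in zip(bins, counts)) / total
    return mean, math.sqrt(var)


def gaussian_tail(mean, sigma, threshold):
    """Probability that a N(mean, sigma^2) sample exceeds `threshold`."""
    return 0.5 * math.erfc((threshold - mean) / (sigma * math.sqrt(2)))


# Made-up histogram of equalized output codes around the PAM-4 level +1.
bins = [0.6, 0.8, 1.0, 1.2, 1.4]
counts = [5, 120, 750, 120, 5]

mean, sigma = fit_gaussian(bins, counts)
# Error probability against the upper decision threshold at +2.
p_err = gaussian_tail(mean, sigma, threshold=2.0)
print(f"mean={mean:.3f}, sigma={sigma:.3f}, tail={p_err:.2e}")
```

The fit lets the tail be extrapolated to error rates far below what the captured sample count could show directly, which is why the accuracy of the Gaussian assumption matters.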

Link Margin Test and Energy Efficiency
- Data rate: 28 Gb/s PAM-4.
[Figure: BER and power (mW) at 28 Gb/s vs. channel loss (dB), with power split into FFE, TDC and ADC; energy-efficiency labels 4.6 pJ/bit, 5.7 pJ/bit, 2.1 pJ/bit, 2.1 pJ/bit and 3.25 pJ/bit]
- The receiver can achieve a BER as low as 10^-9.

Comparison with State of the Art

                          | Shafik ISSCC 2015 [4]     | Frans VLSI 2016 [5] | Cui ISSCC 2016 [3] | Rylov ISSCC 2016 [6] | This Work
Technology                | 65 nm CMOS                | 16 nm FinFET        | 28 nm CMOS         | 32 nm CMOS           | 65 nm CMOS
Data rate (Gb/s)          | 10 NRZ                    | 56 PAM-4            | 32 PAM-4           | 25 NRZ               | 28 PAM-4
ADC architecture          | 32x TI SAR ADC            | 32x TI SAR ADC      | 32x TI SAR ADC     | 4x Flash ADC         | 4x Flash ADC
ENOB @ Nyquist            | 4.74                      | 4.9                 | 5.85               | 4                    | 4.1
Timing recovery           | N/A                       | Baud-rate           | Baud-rate          | Baud-rate            | Edge & data sampled
Tracking BW               | ---                       | ---                 | ---                | ---                  | 10+ MHz
Jitter tolerance          | ---                       | ---                 | ---                | ---                  | 0.2 UIpp @ 50 MHz
Channel loss equalization | 36.4 dB @ 5 GHz           | 25 dB @ 14 GHz      | 32 dB @ 8 GHz      | 40 dB @ 12 GHz       | 30 dB @ 7 GHz
Power (mW)                | 79 (w/o DSP), 87 (w/ DSP) | 410 (w/o DSP)       | 320                | 453                  | 130 @ 30 dB and 45 @ 15 dB (w/o DSP); 160 @ 30 dB and 60 @ 15 dB (with DSP)
FOM (pJ/bit)              | 8.7                       | 7.32                | 10                 | 18.12                | 5.71 @ 30 dB and 2.14 @ 15 dB (with DSP)
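The FOM row follows from dividing power by data rate, since 1 mW per Gb/s equals 1 pJ/bit. The short check below uses the power and data-rate figures from the comparison; the function name is illustrative.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy efficiency in pJ/bit: mW divided by Gb/s."""
    return power_mw / data_rate_gbps


# (label, power in mW, data rate in Gb/s), taken from the comparison table.
entries = [
    ("Shafik ISSCC 2015 (with DSP)", 87, 10),
    ("Frans VLSI 2016 (w/o DSP)", 410, 56),
    ("Cui ISSCC 2016", 320, 32),
    ("Rylov ISSCC 2016", 453, 25),
    ("This work, 30 dB (with DSP)", 160, 28),
    ("This work, 15 dB (with DSP)", 60, 28),
]

for label, power_mw, rate_gbps in entries:
    print(f"{label}: {energy_per_bit_pj(power_mw, rate_gbps):.2f} pJ/bit")
```

Note that the entries are not directly comparable in every case: some power figures include the DSP and others do not, and the channel loss differs between works.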

Summary of ADC-Based Receiver
- ADC-DSP based receivers are the future for multilevel signaling in advanced CMOS, but their power has to be reduced.
- DSP needs to be more information efficient. Non-uniform quantization is a simple way to improve effective resolution.
- An ADC for wireline is different from a general-purpose ADC. A general-purpose ADC treats each sample as uncorrelated, but in reality channel ISI makes the samples correlated. A predictive ADC is a simple way to take advantage of that.
- Timing recovery is as important as data recovery. A multibit TDC and lower latency are an effective way to improve the timing recovery loop and meet the jitter requirement of the ADC.