Half-Rate Decision-Feedback Equalization: Di-Bit Response Analysis and Evaluation


DesignCon 2008

Half-Rate Decision-Feedback Equalization: Di-Bit Response Analysis and Evaluation

Jihong Ren, Rambus Inc., jren@rambus.com
Brian Leibowitz, Rambus Inc.
Dan Oh, Rambus Inc.
Jared Zerbe, Rambus Inc.

Abstract

High-speed links are typically required to operate at BERs lower than 10^-12. To meet such a low BER under tight power constraints, equalizers such as DFE are widely used together with half-rate or quarter-rate sampling receive architectures [1,2,3]. In this paper, we present a statistical performance analysis method based on di-bit (two-bit) response analysis to accurately analyze the performance of links with half-rate receive architectures. Compared with single-bit response analysis, di-bit response analysis can naturally model link non-idealities such as duty-cycle distortion (DCD) and odd/even path mismatch, as well as half-rate DFE. We use this method to compare the performance of full-rate DFE [1] and half-rate DFE [2,3] receive architectures. Our simulation results show that, compared with half-rate DFE, full-rate DFE provides better ISI cancellation at the edge sample time, improving timing margin and clock recovery. Our analysis also shows that, by using partial-response DFE (PrDFE) [1,4] to handle the first post-cursor at both data and edge times, and given its ability to handle DCD and odd/even path mismatches, half-rate DFE could provide better overall performance at modest hardware cost.

Author(s) Biography

Jihong Ren received her Ph.D. in Computer Science from the University of British Columbia, Canada, in 2006, where she worked on optimal equalization for chip-to-chip high-speed buses. She has been with Rambus since January 2006, where she has worked on equalization algorithms and link performance analysis.

Brian Leibowitz received a B.S. in Electrical Engineering from Columbia University in 1998. He received a Ph.D. in Electrical Engineering and Computer Science from UC Berkeley in 2004, where he developed an integrated CMOS imaging receiver for free-space optical communication. His Ph.D. research was funded by a fellowship from the Fannie and John Hertz Foundation. Since 2004 he has been at Rambus working on mixed-signal circuit design for high-speed serial links.

Kyung Suk (Dan) Oh is Senior Engineering Manager at Rambus Inc. He received the B.S., M.S., and Ph.D. in electrical engineering from the University of Illinois at Urbana-Champaign in 1990, 1992, and 1995, respectively. His doctoral research was in the area of computational electromagnetics applied to transmission line modeling and simulation. Since 2000, he has been with Rambus Inc., Los Altos, CA. His group is responsible for providing signal integrity analysis for various products including serial, parallel, and memory interfaces. Additional responsibilities of his group include advanced CAD tool development for high-speed link simulation. His current interests include advanced signal and power integrity modeling and simulation techniques, optimization of channel designs for various standard or proprietary I/O links, and application of signaling techniques to high-speed digital links. He has published over 45 papers and holds 2 patents and 6 patent applications in areas of high-speed link design.

Jared Zerbe joined Rambus Inc. in 1992, where he has since specialized in the design of high-speed I/O, PLL/DLL clock-recovery, and SerDes circuits. He has authored multiple papers and patents in the area of high-speed clocking and data transmission. He has guest lectured and taught courses at Berkeley and Stanford in link design and is currently a Technical Director, where he is focused on development of future signaling technologies.

1. Introduction

Backplane interconnects present great challenges to achieving multi-Gb/s signaling rates. Signal integrity issues such as dispersion and reflections result in inter-symbol interference (ISI), which severely limits the channel bandwidth. To mitigate the impact of ISI on link performance and push high data rates over off-chip interconnects, equalization techniques such as transmitter equalization and receiver decision-feedback equalization (DFE) are widely used. Because of the transmitter peak-power constraint, transmitter equalization typically flattens the channel response by attenuating its low-frequency content. In contrast, DFE mitigates ISI without amplifying noise or reducing the received signal strength. Moreover, for high-speed signaling over backplanes, it is necessary to adapt the equalizer settings to the channel characteristics. The information needed for equalizer adaptation is readily available at the receiver side, while transmitter equalization requires a back-channel for adaptation. For these reasons, DFE has become increasingly popular in high-speed link design over backplanes [1,2,3,4,5].

This paper proposes a statistical performance analysis method based on di-bit response analysis to accurately analyze the performance of links for half-rate receivers with DFE. Compared with single-bit response analysis [6,7,8], di-bit response analysis can naturally model link non-idealities such as duty-cycle distortion (DCD). We use this method to compare the performance of full-rate DFE and half-rate DFE receive architectures. We focus on receivers that sample the data at twice the baud rate for clock and data recovery (CDR), as shown in Figure 1.1. In steady state, one sample time occurs at the transitions of the data eye (edge sampler) while the other occurs at the center of the eye opening (data sampler). This paper also investigates the performance of the CDR in different DFE receive architectures.
Figure 1.1 Baseband high-speed link architecture with receiver decision-feedback equalization and 2x oversampled clock-data recovery.

The rest of this paper is organized as follows. Section 2 introduces full-rate DFE and half-rate DFE architectures and discusses the pros and cons of each design. Section 3 introduces di-bit response analysis. Section 4 briefly describes LinkLab, an in-house statistical link performance analysis tool. We embed the di-bit analysis method in LinkLab and use it to estimate the system bit error rate (BER) for different channels and equalization architectures. In Section 5, we show that full-rate DFE partially removes edge ISI, which improves CDR performance compared with no DFE. In contrast, half-rate DFE incorrectly equalizes edge ISI and adds noise to the edge samples, which degrades CDR performance even compared with no DFE. We show that a combination of PrDFE with half-rate DFE achieves the best overall performance. It correctly removes the biggest edge ISI component and therefore improves CDR performance. Moreover, by using different equalizer coefficients for the odd and even paths, it can significantly reduce the impact of DCD on link BER with a slight increase in hardware cost. Finally, we draw conclusions in Section 6.

2. Half-Rate Receiver with Full-Rate DFE vs. Half-Rate DFE

High-speed links are typically required to operate at BERs lower than 10^-12. To meet such a low BER under tight power constraints, equalizers such as DFE are widely used together with half-rate or quarter-rate sampling receive architectures [1,2,3]. In this paper, we focus on the half-rate receiver. The analysis developed here can be easily extended to quarter-rate receivers. As shown in Figure 2.1, in a half-rate receiver a half-rate clock drives two duplicate paths at opposite clock phases, generating even and odd bit sequences. At the end, the even and odd bit sequences are multiplexed back into a full-data-rate sequence. The half-rate clocking technique allows the receiver to operate at half the data rate at twice the hardware cost. However, by allowing the circuitry to operate at a lower speed, it can significantly reduce power consumption. Moreover, as we will see later in the paper, by incorporating partial-response DFE (PrDFE), a half-rate receiver relaxes the critical timing path for DFE equalization. For a half-rate receiver, there are two popular DFE architecture choices: full-rate [1] or half-rate [5].
For the rest of this section, we introduce half-rate receivers with full-rate DFE and half-rate DFE, and discuss the pros and cons of each architecture.

Half-Rate Receiver with Conventional Full-Rate DFE

Figure 2.1 shows a simplified block diagram of a half-rate receiver with a conventional 3-tap full-rate DFE. The purpose of the DFE loop is to cancel the ISI generated by previously received bits. Because the DFE feedback removes ISI by direct subtraction, each DFE weight α1 to α3 is an estimate of the ISI contribution from the corresponding bit. Analog pulses synthesized from previous symbol decisions and the DFE weights are current-summed with the received signal before the receiver branches into the odd and even paths. The resulting equalized signal is the input to the data samplers, which make future symbol decisions. It is important to note that in the conventional full-rate DFE architecture, both even and odd samplers sample the same equalized waveform, although at different times. This makes single-bit response analysis sufficient for a full-rate DFE receiver, neglecting any DCD or deviation between the samplers.

The clock to the DFE is ideally adjusted so that the transitions of the DFE correction pulses are phase aligned with the transitions of the input data [1,9]. With such alignment, the DFE correction pulses reach nearly full swing at the eye center and add half swing at the edges of the data. Figure 2.2 shows the alignment of the DFE correction pulses relative to the single bit response of the channel. Subtracting the two gives the single bit response of the DFE-equalized channel. Note that if the channel single bit response can be well approximated by a piecewise-linear function, good ISI cancellation can be achieved at both data and edge locations at the same time. When this piecewise-linear assumption does not hold, perfect cancellation at data (edge) times only partially corrects, or possibly even introduces additional, ISI at edge (data) times. Figure 2.3 shows the received eye diagram (left) and the full-rate DFE-equalized eye diagram (right) for an example channel. Clearly, full-rate DFE not only cleans up the ISI at the data sampling time but also reduces data-dependent deterministic jitter by partially removing edge ISI.

Figure 2.1 A half-rate receiver with a conventional full-rate DFE.

Figure 2.2 Full-rate DFE timing calibration. Left: Decision feedback path modeled as a simple RC path. Right: DFE feedback pulse adds half of the DFE weights to the edges. (Plot of the channel single bit response versus time, with sample times t_s - T, t_s - T/2, t_s, t_s + T/2, t_s + T, t_s + 3T/2, t_s + 2T; correction levels α1 and α2 at data times and α1/2, (α1 + α2)/2, (α2 + α3)/2 at edge times.)
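The edge levels labeled in Figure 2.2 can be reproduced with a few lines of Python. This is only a sketch: the function names and the post-cursor values are hypothetical, and the DFE pulse is idealized as passing through the midpoint of adjacent tap levels at each edge time. Under the piecewise-linear assumption stated above, the same weights that cancel data-time ISI also cancel the edge-time ISI between post-cursors.

```python
def full_rate_dfe_corrections(taps):
    """Corrections applied by a phase-aligned full-rate DFE (idealized).

    taps: [alpha1, alpha2, ...], the DFE weights.
    Returns (data_corr, edge_corr):
      data_corr[k] is the level at the data sample (k + 1) UI after the cursor;
      edge_corr[k] is the level at the edge sample half a UI before that,
      taken as the midpoint of the two neighbouring levels (0 before tap 1).
    """
    levels = [0.0] + list(taps)
    data_corr = list(taps)
    edge_corr = [(levels[k] + levels[k + 1]) / 2 for k in range(len(taps))]
    return data_corr, edge_corr


def residual(samples, corrections):
    """ISI left after subtracting the DFE correction sample by sample."""
    return [s - c for s, c in zip(samples, corrections)]


# Hypothetical post-cursor samples of a single bit response (arbitrary units).
post_cursors = [0.30, 0.20, 0.10]   # data times t_s + T, t_s + 2T, t_s + 3T
edges_between = [0.25, 0.15]        # edge times t_s + 3T/2, t_s + 5T/2

data_corr, edge_corr = full_rate_dfe_corrections(post_cursors)
data_residual = residual(post_cursors, data_corr)
# edge_corr[0] sits between the cursor and the first post-cursor; skip it here.
edge_residual = residual(edges_between, edge_corr[1:])
```

Because the hypothetical edge samples lie exactly halfway between adjacent post-cursors, both residual lists come out at zero up to floating-point rounding. Moving an edge sample off the midpoint leaves a nonzero edge residual, which is the partial-correction case described in the text.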

Figure 2.3 Left: received eye diagram for a 12 legacy backplane channel with 6 traces on daughter cards at 6.4Gbps. Right: equalized eye diagram with 5-tap full-rate DFE. (Vertical axes: voltage.)

Half-Rate Receiver with Half-Rate DFE

Figure 2.4 shows a half-rate receiver with a 3-tap half-rate DFE. The green dotted lines indicate the blocks on the odd data clock phase. Solid blocks are on the even data clock phase. The DFE feedback path is cross-coupled between the odd and even paths, and the DFE correction is applied to the even and odd paths separately. Compared with full-rate DFE, all circuitry in this receiver architecture is clocked at half rate. Note that even though half-rate DFE operates with half-rate clocking, its requirement for the settling time of the feedback signals at the summing amplifier output node is still 1UI, the same as for full-rate DFE. However, it removes the double-data-rate up-conversion multiplexer from the DFE critical timing path, easing the stringent timing requirement. In Section 5, we will discuss a combination of PrDFE and half-rate DFE that further relaxes the critical timing path. Furthermore, unlike full-rate DFE, there is no separate DFE clock domain. This greatly simplifies the receiver design by removing clock domain crossings and the need for DFE phase calibration, even though it doubles the number of DFE drivers.

As shown in Figure 2.5, DFE correction pulses last for 2UI and are different for the even and odd paths. Therefore, in this architecture, even and odd samplers sample different equalized waveforms. As shown in Figure 2.6, at node V_E the correct DFE correction is applied at even sampling times, which results in a wide-open eye, but an incorrect correction is applied at odd sampling times, which results in a closed eye. Similarly, the odd sampler sees an equalized odd eye but an incorrectly equalized even eye. Single bit response analysis no longer applies.
In the next section, we show how single bit response analysis can be extended to di-bit response analysis to analyze half-rate DFE.

Unlike the conventional DFE architecture with DFE clock calibration, the edge correction seen by the even edge sampler depends on the delay of the DFE feedback as well as the timing relationship between the even edge clock and the even data clock. In Figure 2.5, the dashed line shows the even data sampling times and the green dotted line that follows it shows the even edge sampling times. This is only one of the possible timing relationships between the data sampling times and the edge sampling times. For example, the edge sampling times can lead the data sampling times, depending on the implementation details. In this paper, we focus on the first case, where the edge phase follows the data phase. Note that in this case, the edge sampler sees the same amount of DFE correction as the data sampler. If the piecewise-linear assumption discussed earlier holds, the amount of DFE correction at edge time is incorrect. The incorrect edge equalization leads to large data-dependent deterministic jitter, as shown in Figure 2.6, and large CDR phase uncertainty.

Figure 2.4 Half-rate receiver with half-rate DFE of 3 taps.

Figure 2.5 Half-rate DFE timing. Each DFE correction pulse lasts for 2 UI. The dashed line shows the even data sampling times and the dotted line shows the even edge sampling times. (Timing diagram signals: data D with bits d4 to d8; clocks dclke and dclko; dfee = α1·d5 + α2·d4 + α3·d3, then α1·d7 + α2·d6 + α3·d5; dfeo = α1·d4 + α2·d3 + α3·d2, then α1·d6 + α2·d5 + α3·d4.)
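The correction levels listed for dfee and dfeo in Figure 2.5 follow one rule: the level intended for bit n is α1·d(n-1) + α2·d(n-2) + α3·d(n-3), routed to the even path when n is even and to the odd path when n is odd. A minimal Python sketch of that bookkeeping follows; the function name, the bit pattern, and the tap values are hypothetical, and the 2UI hold and feedback delay are not modeled.

```python
def half_rate_dfe_levels(bits, taps):
    """Correction level targeted at each bit, split by receiver path.

    bits: sequence of +1/-1 decisions d[0], d[1], ...
    taps: [alpha1, alpha2, ...] DFE weights.
    Returns {"even": {n: level}, "odd": {n: level}}, where level is the
    correction intended for the sample of bit n (held for 2 UI on that path
    in the real circuit; the hold itself is not modeled here).
    """
    n_taps = len(taps)
    levels = {"even": {}, "odd": {}}
    for n in range(n_taps, len(bits)):
        # alpha_k multiplies the decision made k bits earlier.
        level = sum(a * bits[n - k] for k, a in enumerate(taps, start=1))
        levels["even" if n % 2 == 0 else "odd"][n] = level
    return levels


# Hypothetical decisions d0..d8 and 3-tap weights.
decisions = [1, -1, -1, 1, 1, -1, 1, 1, -1]
weights = [0.3, 0.2, 0.1]
levels = half_rate_dfe_levels(decisions, weights)
```

With this indexing, `levels["even"][6]` equals α1·d5 + α2·d4 + α3·d3 and `levels["odd"][5]` equals α1·d4 + α2·d3 + α3·d2, matching the first dfee and dfeo entries of the figure.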

Figure 2.6 Left: received eye diagram for a 12 legacy backplane channel with 6 traces on daughter cards at 6.4Gbps. Right: equalized eye diagram at node V_E with half-rate DFE. The eye diagram at node V_O is the same but time shifted by 1UI. (Vertical axes: voltage.)

3. Di-Bit Analysis

As discussed in Section 2, a link with a half-rate DFE receiver no longer has a valid continuous-time DFE-equalized single bit response, since the even and odd samplers are sampling different waveforms. In this section, we present di-bit response analysis, which can be naturally used to analyze a half-rate DFE receiver. First, to simplify the presentation, we assume the ideal case where the even and odd paths are completely symmetric (for example, same DFE tap weights) and there is no duty-cycle distortion. In this case, the eye diagram at V_O is simply a time-shifted version of the one at V_E for truly random input data streams. Therefore, we can focus on one path and estimate the link BER based on that path. To deal with non-ideal cases with duty-cycle distortion and path mismatches, we look at the eye diagrams at both the V_O and V_E nodes. The final BER is the average of the two paths.

Di-Bit Responses and Half-Rate DFE

Since the DFE correction pulses last for 2UI, let us look at channel responses in terms of the symbols A = (1,-1) and B = (1,1). Note that any NRZ bit stream can be decomposed into a concatenation of A, -A, B and -B. The di-bit responses of a channel are the channel responses to symbols A and B, as shown in Figure 3.1. Similar to single-bit response analysis, the channel response to any input pattern is the sum or difference of appropriately time-shifted (by multiples of 2UI) di-bit responses. For symbol (1,-1), the DFE correction pulse applied at node V_E 2UI later is (-α1 + α2), where -α1 is the correction from the odd path for the -1 bit in symbol (1,-1) and α2 is the correction from the even path for the 1 bit in symbol (1,-1).

Note that this provides correct equalization at time t_s + 2UI, but erroneous equalization at time t_s + 1UI, which is not sampled by the even sampler. Figure 3.2 shows the di-bit responses (unequalized and equalized) for a 12 legacy backplane channel. The blue dots indicate the even data sampling times. Note that after equalization, there is no ISI at even sampling times in these di-bit responses, although ISI at other times is potentially increased.
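The decomposition into di-bit symbols and the 2UI superposition can be checked numerically. The sketch below is an illustration under stated assumptions: the single bit response samples are made up, `spu` (samples per UI) is a hypothetical parameter, and each di-bit response is built here from the single bit response, which corresponds to the ideal symmetric case without DCD. In the non-ideal case the di-bit responses would be obtained directly from the channel instead.

```python
def to_dibits(bits):
    """Group an even-length +1/-1 sequence into (first, second) di-bit symbols."""
    if len(bits) % 2:
        raise ValueError("bit sequence must have even length")
    return [(bits[i], bits[i + 1]) for i in range(0, len(bits), 2)]


def dibit_response(sbr, symbol, spu):
    """Response to one di-bit: the single bit response scaled by the first
    bit plus a copy delayed by 1 UI (spu samples) scaled by the second bit."""
    first, second = symbol
    out = [0.0] * (len(sbr) + spu)
    for i, v in enumerate(sbr):
        out[i] += first * v
        out[i + spu] += second * v
    return out


def superpose(responses, shift):
    """Sum equal-length responses, each delayed by `shift` samples more than
    the previous one."""
    length = shift * (len(responses) - 1) + len(responses[0])
    wave = [0.0] * length
    for k, resp in enumerate(responses):
        for i, v in enumerate(resp):
            wave[k * shift + i] += v
    return wave


# Hypothetical single bit response sampled at 2 samples per UI.
spu = 2
sbr = [0.0, 0.5, 1.0, 0.6, 0.3, 0.1]
bits = [1, -1, 1, 1, -1, -1]

# Bit-by-bit superposition, shifting by 1 UI per bit.
by_bits = superpose([[b * v for v in sbr] for b in bits], spu)
# Di-bit superposition, shifting by 2 UI per symbol.
by_dibits = superpose([dibit_response(sbr, s, spu) for s in to_dibits(bits)],
                      2 * spu)
```

Here `to_dibits(bits)` yields A, B and -B, and the two waveforms agree sample for sample up to rounding, confirming that in the symmetric case the di-bit view carries the same information as the single-bit view.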

Figure 3.1 A: channel response to symbol (1,-1) and the corresponding DFE corrections applied at node V_E. B: channel response to symbol (1,1) and the corresponding DFE corrections applied at node V_E. (Each panel plots the channel di-bit response versus time from t_s - 2T to t_s + 4T, together with the half-rate DFE pulses that remove the ISI resulting from that symbol at the even sampling times.)

Figure 3.2 Di-bit responses (unequalized and equalized) for a 12 legacy backplane channel with 6 traces on daughter cards at 6.4Gbps. The blue dots indicate the half-rate even samples.

Di-Bit Responses and Received Signal Distribution

Assuming the transmitted data stream is uncoded and random, the received signal is simply a sum of the transmitted symbols weighted by the corresponding di-bit responses. Therefore, similar to statistical link analysis based on the single bit response [6,7,8], we can calculate the probability distribution of the received signal by convolving the PMFs of the weighted random variables, as illustrated in Figure 3.3. In the case of di-bit responses, for the upper eye there are two choices for the main signal, one from the (1,-1) response and the other from the (1,1) response. For ISI, there are four possibilities, from the (1,-1), (-1,1), (1,1) and (-1,-1) responses respectively, as shown in Figure 3.3. The received signal probability distribution is simply the convolution of the main signal probability distribution and the ISI probability distributions. Once the received signal probability distributions are calculated, we feed them into our in-house statistical link performance analysis tool, LinkLab, and estimate the link BER. In the following section, we briefly introduce LinkLab.

Figure 3.3 Received signal probability distribution based on di-bit responses.

4. Simulation Environment

We embed the di-bit analysis method in an in-house statistical link performance analysis tool, LinkLab, to estimate system BER [6,7]. This section briefly describes the tool. LinkLab takes into account the link architecture, the various noise sources present in the link, and different equalization algorithms to evaluate the performance of the system using a BER criterion. For example, the simulator models various passive channel components (e.g. package, vias, PCB trace, and connectors) and noise sources (e.g. Tx/Rx jitter, phase offset, receiver voltage offset, quantization noise, and power supply noise). It also accurately models the interaction between equalization adaptation and the CDR.

Given the received signal distributions, it adds the noise sources and then computes the probability for the data sampler to detect an error at each time slice. By computing this probability for each phase and for different voltage offsets relative to the nominal data sampling level, the simulator effectively constructs a statistical eye diagram (Figure 4.1 a). The statistical eye diagram is a BER contour that plots the BER as a function of voltage and time. It shows how much voltage and timing margin the system has if we can ideally determine the sampling location. For example, the timing bathtub is the horizontal slice of the BER contour at zero additional voltage margin.

LinkLab models a 2x oversampled CDR commonly used in serial links. The CDR uses data samples to detect transitions and edge samples to detect timing errors. LinkLab models the CDR with a first-order Markov chain phase-state model [6]. This CDR model has been shown to correlate well with lab measurements [10]. CDR phase dithering is modeled as the steady-state CDR phase distribution, as shown in Figure 4.1 b. To include the impact on BER of the uncertainty in the CDR phase, we convolve the statistical eye diagram with the CDR phase distribution (Figure 4.1 b). The final BER contour is a comprehensive visual display of the performance of the link (Figure 4.1 c). For example, the voltage margin for a given BER is simply the minimum distance between the BER contour in question and the threshold at zero. It should be emphasized that this measurement of voltage margin includes the impact of the CDR phase distribution on the final BER.

Figure 4.1 A: Statistical eye diagram with CDR hold; B: CDR steady state phase distribution; C: BER contour conditioned with CDR distribution [6].
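The PMF convolution described in Section 3 can be sketched in a few lines of Python. This is not LinkLab's implementation; the function names, the dictionary representation of a PMF, and the rounding used to merge equal voltages are all assumptions made for illustration. The main signal takes one of two values with equal probability, and each interfering di-bit position contributes one of four equally likely values.

```python
from collections import defaultdict


def convolve_pmf(p, q, ndigits=9):
    """PMF of the sum of two independent discrete variables.
    PMFs are dicts mapping voltage -> probability; voltages are rounded
    so that equal sums land in the same bin."""
    out = defaultdict(float)
    for x, px in p.items():
        for y, py in q.items():
            out[round(x + y, ndigits)] += px * py
    return dict(out)


def uniform_pmf(values):
    """Equal probability for each listed value; repeated values accumulate."""
    pmf = defaultdict(float)
    for v in values:
        pmf[v] += 1.0 / len(values)
    return dict(pmf)


def received_pmf(main_a, main_b, isi_pairs):
    """Upper-eye received-signal PMF from di-bit response samples.

    main_a, main_b: cursor samples of the (1,-1) and (1,1) responses.
    isi_pairs: [(a_k, b_k), ...] samples of the same two responses at the
    2UI-spaced instants where neighbouring symbols interfere.
    """
    pmf = uniform_pmf([main_a, main_b])
    for a_k, b_k in isi_pairs:
        # The interfering symbol is A, -A, B or -B with probability 1/4 each.
        pmf = convolve_pmf(pmf, uniform_pmf([a_k, -a_k, b_k, -b_k]))
    return pmf


# Hypothetical samples: equal main cursors and a single ISI position.
pmf = received_pmf(1.0, 1.0, [(0.1, 0.0)])
```

For this input the received signal is 1.0 with probability 0.5 and 0.9 or 1.1 with probability 0.25 each. A practical tool would more likely bin voltages on a fixed grid so that the PMF size stays bounded as more ISI terms are convolved in.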

5. Simulation Results

As shown in Figure 2.6, the biggest concern with the half-rate DFE architecture is the increased data-dependent deterministic jitter and its impact on CDR performance. In this section, we first present simulation results comparing the CDR performance of full-rate DFE, half-rate DFE and no-DFE receivers. Then we introduce an alternative receiver architecture that combines partial-response DFE (PrDFE) and half-rate DFE to improve the CDR performance and relax the critical path timing requirement. Finally, we discuss the impact of transmitter duty-cycle distortion (DCD) on link performance for different architectures.

Full-rate DFE vs. Half-rate DFE

Figure 5.1 shows the simulated CDR phase distribution for receivers with 5-tap full-rate DFE, 5-tap half-rate DFE and no DFE, assuming identical CDR filtering in the three cases. The channel is again the 12 legacy backplane channel with 6 traces on daughter cards. As discussed in Section 2, since full-rate DFE provides partial equalization at edge times, it partially removes data-dependent deterministic jitter and improves CDR performance. The CDR phase distribution for full-rate DFE is the tightest of the three cases. Due to incorrect edge equalization, half-rate DFE has an even wider CDR phase distribution than no DFE. Both are much wider than that of full-rate DFE. This confirms our earlier discussion of the weakness of the half-rate DFE architecture.

Figure 5.1 Simulated CDR phase distribution for 5-tap full-rate DFE, 5-tap half-rate DFE and no DFE, with equal CDR filtering in all cases. The channel is a 12 legacy backplane channel with 6 traces on daughter cards. Data rate is 6.4Gbps.

Table 5.1 compares the simulated voltage and timing margins at 10^-15 BER for different receive architectures in the example channel. Note that both full-rate DFE and half-rate DFE correctly remove the ISI at the data sampling time. Therefore, the difference in final link performance comes from the impact of CDR phase uncertainty on BER.
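The way CDR phase uncertainty degrades BER, described in Section 4 as convolving the statistical eye with the CDR phase distribution, can be illustrated on a single timing bathtub. The Python sketch below is hypothetical throughout: the function name, the bathtub values, and the two phase distributions are invented, and phases that fall outside the tabulated range are clamped to the end bins as a simplifying assumption.

```python
def ber_with_cdr(ber_vs_phase, cdr_pmf):
    """Average a timing bathtub over the CDR phase distribution.

    ber_vs_phase: BER at each phase bin for an ideally placed sampler.
    cdr_pmf: dict mapping phase offset (in bins) -> probability.
    Offsets that fall outside the bathtub are clamped to its end bins.
    """
    n = len(ber_vs_phase)
    out = []
    for i in range(n):
        total = 0.0
        for offset, prob in cdr_pmf.items():
            j = min(max(i + offset, 0), n - 1)
            total += prob * ber_vs_phase[j]
        out.append(total)
    return out


# Hypothetical bathtub with the same data-time equalization in both cases.
bathtub = [0.5, 1e-3, 1e-15, 1e-3, 0.5]
tight_cdr = {0: 1.0}                       # no phase wander
wide_cdr = {-1: 0.25, 0: 0.5, 1: 0.25}     # phase wanders by one bin

ber_tight = ber_with_cdr(bathtub, tight_cdr)
ber_wide = ber_with_cdr(bathtub, wide_cdr)
```

With the tight distribution the bathtub is unchanged, while the wide one raises the BER at the eye center from 1e-15 to about 5e-4 even though the underlying eye is identical. This is the mechanism by which two architectures with equal data-time ISI cancellation can end up with different margins.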

For this channel, there is no margin at all without equalization. Full-rate DFE opens the eye and provides 112mV voltage margin and 91ps timing margin. In contrast, even though half-rate DFE correctly removes the ISI at the data sampling time, it provides only 70mV voltage margin and 75ps timing margin due to the large CDR phase uncertainty.

                                     No DFE   Full-rate DFE   Half-rate DFE   PrDFE + Full-rate DFE   PrDFE + Half-rate DFE
Voltage margin at 10^-15 BER (mV)    0        112             70              110                     110
Timing margin at 10^-15 BER (ps)     0        91              75              91                      90

Table 5.1 Simulated voltage margin and timing margin at 10^-15 BER for five receiver DFE architectures: no DFE, 5-tap full-rate DFE, 5-tap half-rate DFE, PrDFE for the first tap with full-rate DFE for taps 2 to 5, and PrDFE for the first tap with half-rate DFE for taps 2 to 5. The channel is a 12 legacy backplane channel with 6 traces on daughter cards. Data rate is 6.4Gbps.

An Alternative DFE Architecture: Partial Response DFE and Half-Rate DFE

Even though half-rate DFE uses half-rate clocking, the critical timing path requirement is still 1UI: from the odd (even) sampling instant to when the first-tap feedback signal to the even (odd) sampler settles. This stringent timing requirement has always been the limitation of the conventional DFE architecture for high-speed link applications. PrDFE relaxes this timing requirement by employing a speculative sampling technique, as shown in Figure 5.2. By sampling the data twice at the same instant, once assuming the previous bit is +1 and once assuming it is -1, we can select the correct decision later, once the previous bit is actually available. This increases the total time from sampling to DFE feedback summation to 2UI. Note that there is still a 1UI feedback path around the multiplexers, but this purely logical path is easier to implement and improves rapidly with technology scaling. Moreover, by employing PrDFE on the edge samplers [1,9] as well, we can effectively remove the first edge ISI, which is usually the biggest ISI component at edge time.
Thus, PrDFE edge sampling can potentially resolve the main drawback of the half-rate DFE architecture, namely its large edge ISI. Figure 5.3 shows the PrDFE + half-rate DFE eye diagram for the 12" legacy backplane channel. The PrDFE eyes are highlighted with thick yellow lines. Note that the incorrect equalization of the first tap at the edge time has been resolved, so there is less deterministic jitter than in the half-rate DFE equalized eye diagram shown in Figure 2.6. Figure 5.3 also compares the CDR phase distribution for three receiver architectures: PrDFE with full-rate DFE, PrDFE with half-rate DFE, and no equalization. Both DFE architectures improve the CDR performance dramatically compared with no DFE, and the PrDFE with half-rate DFE performance nearly matches that of the PrDFE with full-rate DFE case.
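The speculative sampling idea can be illustrated with a small behavioral sketch. It is not the circuit of Figure 5.2 and it does not model the 1UI/2UI timing paths; it only shows that two slicers with offset thresholds plus a multiplexer cancel the first post-cursor. The channel (main cursor `h0`, a single post-cursor `h1`), the idle symbol of -1 before the first bit, and all function names are assumptions made for this illustration.

```python
import random

def transmit(bits, h0=1.0, h1=0.6):
    """Received data samples for symbols in {-1, +1}; h0 and h1 are hypothetical cursors."""
    prev = -1  # assumed idle symbol before the first bit
    samples = []
    for b in bits:
        samples.append(h0 * b + h1 * prev)
        prev = b
    return samples

def prdfe_receive(samples, h1=0.6):
    """First-tap PrDFE: slice against both thresholds, then select with the previous bit."""
    prev = -1  # must match the idle symbol assumed by the transmitter model
    decisions = []
    for y in samples:
        d_if_pos = 1 if y > +h1 else -1  # speculative decision, previous bit assumed +1
        d_if_neg = 1 if y > -h1 else -1  # speculative decision, previous bit assumed -1
        d = d_if_pos if prev == 1 else d_if_neg  # multiplexer resolves the speculation
        decisions.append(d)
        prev = d
    return decisions

def worst_case_margins(samples, decisions, h1=0.6):
    """Smallest distance to the zero threshold and to the selected PrDFE threshold."""
    prev = -1
    raw, equalized = [], []
    for y, d in zip(samples, decisions):
        raw.append(abs(y))
        equalized.append(abs(y - h1 * prev))
        prev = d
    return min(raw), min(equalized)

random.seed(0)
bits = [random.choice((-1, 1)) for _ in range(1000)]
samples = transmit(bits)
decisions = prdfe_receive(samples)
raw_margin, prdfe_margin = worst_case_margins(samples, decisions)
print(decisions == bits, round(raw_margin, 3), round(prdfe_margin, 3))
```

In this toy channel the unequalized worst-case sample sits at `h0 - h1` from the zero threshold, while the selected speculative slicer sees the full `h0`, which is the same first-tap cancellation a direct-feedback DFE would provide without needing the feedback to settle within 1UI.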

Table 5.1 also compares the simulated voltage and timing margins at 10^-15 BER for these two receiver architectures. By employing PrDFE for the first tap, PrDFE with half-rate DFE improves on plain half-rate DFE and achieves roughly the same performance as full-rate DFE, with the added benefits of relaxed critical-path timing and half-rate clocking.

Figure 5.2 Alternative DFE architecture employing PrDFE for the first tap and half-rate DFE for the other taps.

Figure 5.3 Left: Equalized PrDFE eye diagram (PrDFE for the first tap and half-rate DFE for taps 2 to 5) at node VE for a 12" legacy backplane channel. Right: CDR phase distribution for three receiver DFE architectures: PrDFE for the first tap + full-rate DFE for taps 2 to 5, PrDFE for the first tap + half-rate DFE for taps 2 to 5, and no DFE.

Impact of Transmitter Duty-Cycle Distortion (DCD)

Duty-cycle distortion (DCD) contributes to deterministic jitter and degrades channel performance, especially for high-speed links. Transmitter DCD is particularly harmful because it directly modulates the width of the transmitted pulses. With transmitter DCD, the odd and even bits have different bit times and hence different pulse responses, which affect later bits differently. In this case, the odd and even paths are no longer completely symmetric, and the odd and even samplers in the receiver sample different eyes. By using different equalization settings on the odd and even paths, one can reduce the impact of DCD on link performance. PrDFE and half-rate DFE can naturally support asymmetric equalization with modest additional hardware cost since half-rate DFE already features

separate odd/even feedback drivers. By considering two bits at a time and modeling the odd and even paths separately, di-bit response analysis can naturally model DCD and asymmetric odd/even path equalization. Figure 5.4 shows the S-parameters for two example channels. Channel 2 has less ISI than channel 1. Figure 5.5 shows the impact of transmitter DCD on link performance for different receive architectures. Different DFE tap settings are used for the odd and even paths in the cases of odd/even PrDFE and odd/even half-rate DFE. The plot on the left is for channel 1, which has more severe ISI than channel 2 (right). Simulation results show that transmitter DCD severely degrades link performance and increases the link BER by orders of magnitude. With odd/even PrDFE, the impact of transmitter DCD is significantly reduced. Odd/even half-rate DFE further improves BER, especially for channel 1, where there is more ISI.

Figure 5.4 S-Parameters for two example backplane channels (channel 1 and channel 2) versus frequency in GHz, with the Nyquist frequency marked.

Figure 5.5 Impact of transmitter DCD on link performance (log10 BER versus DCD in %) for different receive architectures and two channels: PrDFE + full-rate DFE, odd/even PrDFE + full-rate DFE, and odd/even PrDFE + odd/even half-rate DFE. Left: channel 1, Right: channel 2.

6. Conclusion

In this paper, we presented di-bit (two-bit) response analysis to accurately analyze the performance of links with half-rate receiver architectures. Compared with single-bit

response analysis, di-bit response analysis can naturally model link non-idealities such as DCD and odd/even path mismatch, as well as half-rate DFE. We compared the performance of full-rate DFE and half-rate DFE receive architectures. Our simulation results show that full-rate DFE provides better ISI cancellation at the edge sample time than half-rate DFE, resulting in better overall link performance. An alternative architecture, which uses PrDFE to handle the first post-cursor at both data and edge times and half-rate DFE for the later post-cursors, could provide performance similar to full-rate DFE at modest hardware cost, with the added benefits of a relaxed critical timing requirement, a lower clock rate (lower power consumption), and the ability to handle DCD and odd/even path mismatches.
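The point that DCD makes the odd and even pulse responses differ, and therefore calls for a two-bit view, can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration: the channel is a hypothetical single-pole low-pass filter with time constant `tau`, a DCD of `dcd` stretches odd bits to (1 + dcd) UI and shrinks even bits to (1 - dcd) UI, and the receiver samples on a uniform 1UI grid at the nominal end of each bit. It is not the simulation flow used for Figure 5.5.

```python
import math

def pulse_response(t, width, tau):
    """Single-pole low-pass response at time t to a unit pulse of the given width starting at t = 0."""
    if t <= 0.0:
        return 0.0
    if t <= width:
        return 1.0 - math.exp(-t / tau)  # still charging while the pulse is high
    peak = 1.0 - math.exp(-width / tau)
    return peak * math.exp(-(t - width) / tau)  # exponential decay after the pulse ends

def odd_even_cursors(dcd, ui=1.0, tau=0.5, n_post=2):
    """Main cursor and post-cursors of the odd and even pulses on a uniform 1UI sampling grid."""
    w_odd = (1.0 + dcd) * ui   # assumed DCD model: odd bits stretched
    w_even = (1.0 - dcd) * ui  # even bits shortened so the pair still spans 2UI
    # The odd pulse starts at t = 0 and is sampled at UI, 2UI, 3UI, ...
    odd = [pulse_response((k + 1) * ui, w_odd, tau) for k in range(n_post + 1)]
    # The even pulse starts at t = w_odd and is sampled at 2UI, 3UI, 4UI, ...
    even = [pulse_response((k + 2) * ui - w_odd, w_even, tau) for k in range(n_post + 1)]
    return odd, even

odd, even = odd_even_cursors(dcd=0.05)
print("odd  cursors:", [round(c, 4) for c in odd])
print("even cursors:", [round(c, 4) for c in even])
```

With `dcd=0` the two lists coincide and a single-bit response is sufficient. With nonzero DCD they differ, and because the first post-cursor of an odd bit lands on the even sample, the even-path first tap would be set from `odd[1]` and the odd-path first tap from `even[1]`, which is the kind of asymmetric odd/even equalization discussed in Section 5.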

7. References

[1] B. Leibowitz, et al., A 7.5Gb/s 10-Tap DFE Receiver with First Tap Partial Response, Spectrally Gated Adaptation, and 2nd-Order Data-Filtered CDR, Solid-State Circuits Conference, 2007.
[2] J. F. Bulzacchelli, et al., A 10-Gb/s 5-Tap DFE/4-Tap FFE Transceiver in 90-nm CMOS Technology, IEEE Journal of Solid-State Circuits, Vol. 41, No. 12, Dec. 2006.
[3] K.-L. Wong, A. Rylyakov, and C.-K. Yang, A 5-mW 6-Gb/s Quarter-Rate Sampling Receiver with a 2-Tap DFE Using Soft Decisions, IEEE Symposium on VLSI Circuits, June 15-17, 2006, pp. 90-191.
[4] R. S. Kajley, et al., A Mixed-Signal Decision-Feedback Equalizer that uses a Look-Ahead Architecture, IEEE J. Solid-State Circuits, pp. 450-459, Mar. 1997.
[5] K.-L. Wong and C.-K. Yang, A Serial-Link Transceiver with Transition Equalization, ISSCC 2006.
[6] V. Stojanovic, Channel-Limited High-Speed Links: Modeling, Analysis and Design, ProQuest / UMI, 2006.
[7] D. Oh, et al., Accurate Method for Analyzing High-Speed I/O System Performance, DesignCon 2007.
[8] B. K. Casper, M. Haycock, and R. Mooney, An accurate and efficient analysis method for multi-Gb/s chip-to-chip signaling schemes, IEEE Symposium on VLSI Circuits, June 2002, pp. 54-57.
[9] J. Ren, et al., Performance Comparison of Edge-based and Data-based Equalization, DesignCon 2007.
[10] F. Lambrecht, et al., Accurate System Voltage and Timing Margin Simulation in CDR Based High Speed Designs, EPEP, October 2006.