A novel digital phase interpolation control for clock and data recovery circuit

This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented.
IEICE Electronics Express, Vol.* No.*,*-*

Huihua Liu 1,2 a), Lei Li 3, Ping Li 2 and Jun Zhang 3
1 School of Electronic Engineering, University of Electronic Science and Technology of China, No. 2006, Xiyuan Avenue, High-Tech West Zone, Chengdu 611731, Sichuan, China
2 State Key Laboratory of Electronic Thin Films and Integrated Devices, University of Electronic Science and Technology of China, No. 4, Section 2, North Jianshe Road, Chengdu 610054, China
3 Research Institute of Electronic Science and Technology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Avenue, High-Tech West Zone, Chengdu 611731, Sichuan, China
a) lhhua@uestc.edu.cn

Abstract: In this letter, we present a new architecture of a digital phase interpolation (PI) controller with clock and data loops, which can greatly reduce the jitter of the recovered clock by reducing the probability of coarse phase jumping and by interpolating among several fine phases. A demo design was implemented in 0.13 μm CMOS technology for verification. The simulation results demonstrate that the recovered clock of the presented architecture has a peak-to-peak jitter of no more than 29 ps with 2.5 Gbps received data, which shows that no coarse phase dithering occurs. The area of the proposed PI controller is only 0.1 mm².
Keywords: Clock and data recovery, digital phase interpolation, PI controller, data interpolation
Classification: Integrated circuits

IEICE 2015
DOI: 10.1587/elex.12.20150617
Received July 13, 2015
Accepted August 17, 2015
Publicized September 4, 2015

[1] Ming-ta Hsieh: Ph.D. thesis, University of Minnesota, Minnesota (2008).
[2] Rainer Kreienkamp, Ulrich Langmann, Christoph Zimmermann, Takuma Aoyama, and Hubert Siedhoff: IEEE J. Solid-State Circuits 40 (2005) 736.
[3] Wei Xueming, Wang Yiwen, Li Ping, and Luo Heping: Journal of Semiconductors 32 (2011) 125009-1. DOI: 10.1088/1674-4926/32/12/125009

[4] Behrooz Abiri, Ravi Shivnaraine, Ali Sheikholeslami, Hirotaka Tamura, and Masaya Kibune: ISSCC Dig. Tech. Papers (2011) 154.
[5] Kunal Desai and Vijay Krishna: VLSID, 24th International Conference on VLSI Design (2011) 41.
[6] Jinn-Yeh Chien: U.S. Patent 0098203 A1 (2010).
[7] Yu-Hsin Tseng and Wen-Ching Hsing: U.S. Patent 7795926 B2 (2010).
[8] Jri Lee, Kenneth S. Kundert, and Behzad Razavi: IEEE J. Solid-State Circuits 39 (2004) 1571.
[9] A. Rezayee and K. Martin: Proc. 29th European Solid-State Circuits Conference (2003) 683.
[10] D. Li, P. Chuang, and M. Sachdev: IEEE 11th ISQED (2010) 853.
[11] Jin-gook Kim, Seung-jun Bae, and Kwang-il Park: U.S. Patent 10102523 A1 (2009).

1. Introduction

High-speed serial links have gradually displaced parallel links in modern communications [1]. A high-speed serial transmission usually needs a clock and data recovery (CDR) circuit. CDR architectures can be classified according to the phase relationship between the received input data and the local clock at the receiver. A CDR based on a phase interpolator is commonly used when the recovered data rate is below 10 Gbps [2], and more and more hybrid PIs that benefit from digitalization have been presented [3, 4, 5, 6].

A block diagram of a conventional CDR based on phase interpolation is plotted in Fig. 1. The CDR circuit consists of delay elements, a phase interpolation core and a phase interpolation controller. The delay elements, implemented in either a PLL or a DLL, generate complementary multi-phase clocks, namely CK_1, CK_1_, ..., CK_N and CK_N_. Every clock cycle is divided into 2N phases, which are called coarse phases ψ in this letter. A pair of adjacent coarse phases is selected by the calibration state machine (CSM), which means that an interval of 180/N degrees is interpolated. The PI delay cells are called fine phases ϕ in this letter, and the corresponding fine phases are selected by the bi-directional shift registers (BDSR), which include W shift cells.

Fig. 1. Conventional CDR structure based on PI.
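As a rough numerical illustration of the phase granularity described above, the following Python sketch computes the coarse step (180/N degrees) and a fine step, assuming one coarse step is divided evenly among the W fine cells, and combines K coarse steps with M fine steps into one phase. The function names, the wrap to 360 degrees, and the values N = 4 and W = 8 are hypothetical choices made for this sketch, not parameters taken from the design.

```python
def phase_steps(n_pairs, w_cells):
    """Return (coarse_step_deg, fine_step_deg) for N phase pairs and W fine cells."""
    coarse = 180.0 / n_pairs   # a cycle split into 2N coarse phases gives 180/N degrees each
    fine = coarse / w_cells    # assumption: one coarse step spans W equal fine steps
    return coarse, fine


def total_phase_deg(k, m, n_pairs, w_cells):
    """Phase reached after K coarse steps plus M fine steps, wrapped to [0, 360)."""
    coarse, fine = phase_steps(n_pairs, w_cells)
    return (k * coarse + m * fine) % 360.0


if __name__ == "__main__":
    # N = 4 and W = 8 are hypothetical values chosen only for illustration.
    print(phase_steps(4, 8))
    print(total_phase_deg(2, 3, 4, 8))
```

With these illustrative values a single fine step is one eighth of a coarse step, which gives a feel for how much larger a coarse-phase jump is than a fine-phase one.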

A coarse phase can be given as:

$$\psi = W \cdot \phi \tag{1}$$

Therefore, the total recovered clock phase can be given as:

$$P = K \cdot \psi + M \cdot \phi \tag{2}$$

where ψ is 180°/N, M is the number of PI cells selected by the W shift cells, $K \le N$ and $M \le W$.

Due to the nonlinear binary phase detector (PD), data dithering, flip-flop metastability and quantization errors, the recovered clock may bang-bang among several fine phases, between two adjacent coarse phases, or among combinations of both. What is worse, the control signal that travels through the CSM and the N:2 MUX generally arrives at the PI at a different time from the one that travels through the BDSR module, because it is very difficult to match the delays of these two paths. Therefore, the recovered clock may jump among coarse phases with a comparatively high probability. When this happens, a large phase jitter called the coarse-phase jitter is generated [3, 5, 6], which degrades the performance of the design.

This letter proposes a novel architecture to resolve the problem described above. In this architecture, a Data Interpolator Loop (DIL) is employed to address the problem. The simulation results demonstrate that the new architecture can greatly reduce the probability that large jitter occurs.

2. Topology description

As described in Section 1, coarse-phase jitter is generated when the selected phases jump back and forth between two coarse phase edges, which also degrades the quality of the recovered data. This section first analyzes the cause of coarse-phase jitter and then presents the proposed architecture.

2.1 Cause of coarse-phase jitter

As shown in Fig. 1, the CSM chooses two adjacent differential signals, namely $\{CK_M, CK_{M\_};\ CK_{M+1}, CK_{M+1\_};\ 1 \le M \le N-1\}$, and sends them to the phase interpolator module. After interpolation, four quadrature clock signals are sent to the PD module and compared with the data.
After that, the phase information indicating whether the data leads or lags the clock is sent to a digital filter, and the filter output controls the BDSR module. The BDSR shifts one bit right when an E_F pulse is received, and one more fine phase is interpolated; it shifts one bit left when an L_F pulse arrives, and the PI decreases by one fine phase. When all BDSR control bits are high, the module sends a carry pulse Incr; when all are low, it generates a borrow pulse Dec. The generated {Incr, Dec} signals are fed into the CSM module. In this way, the negative feedback regulation mechanism is established.

The coarse phase dithering is shown in Fig. 2 (a) and (b). Let us define the total jitter as:

$$T_j = T_f + T_c \tag{3}$$

where $T_f$ and $T_c$ represent the fine-phase and coarse-phase jitter, respectively.

Fig. 2. (a) Coarse phase dithering. (b) Jitter happens between coarse and fine phases.

A case of coarse-phase jitter is described in Fig. 2. At some instant, the W control bits of the bi-directional shift registers are all ones and the carry and borrow signals are both zero:

$$\{C_1, C_2, \ldots, C_W\} = \{1, 1, \ldots, 1\}, \quad Incr = 0,\ Dec = 0 \tag{4}$$

At the next instant, if a left shift signal comes, the state of the bi-directional shift registers changes to:

$$\{C_1, C_2, \ldots, C_W\} = \{0, 0, \ldots, 0\}, \quad Incr = 1,\ Dec = 0 \tag{5}$$

Then, if a right shift signal comes, the state of the bi-directional shift registers changes to:

$$\{C_1, C_2, \ldots, C_W\} = \{1, 1, \ldots, 1\}, \quad Incr = 0,\ Dec = 1 \tag{6}$$

As mentioned above, a CDR based on a binary phase detector ultimately bang-bangs among several phases rather than settling at a certain phase. Therefore, if the above phenomenon continues, the jitter of the recovered clock increases greatly. To resolve this problem, a thermal code generator is introduced in the patent [6], and redundant control logic is implemented in [3]. However, neither work analyzes the cause of coarse-phase jitter further, and all of these solutions focus on the clock path.

2.2 Proposed topology

The proposed topology is shown in Fig. 3. Compared with the conventional phase interpolator controller in Fig. 1, a Data Interpolator Loop (DIL), shown inside the box, is introduced to address the above problem, and a simple finite state machine (FSM) replaces the complex CSM module. In addition, a clock management device (CMD) provides clocks of different frequencies to all the digital modules.

Fig. 3. Simplified topology of the proposed PI controller.

Fig. 4. (a) Fine phase dithering. (b) Jitter happens inside fine phases.

Fig. 4 (a) and (b) show an example of fine-phase jitter. When the CDR is locked, the carry and borrow signals {Incr, Dec} settle at one of the three states {1, 0}, {0, 1} or {0, 0}. In our analysis, we suppose that they are in the state {0, 0} and that the fine-phase controllers output M high bits. The signal C_k in Fig. 4(a) is one of these M control bits. Unlike in Fig. 2, the final recovered clock swings only among several fine phases and, under certain conditions, avoids jumping between two coarse phases. Thus the total jitter can be reduced remarkably.

3. Circuits design

3.1. Improved PD

Bang-bang phase detectors are widely used in clock synchronization and data recovery because they provide high gain, work at high speed and are less sensitive to process variation [8]. The half-rate Alexander configuration is one of the classical bang-bang PDs; it tracks a data signal and needs only half the clock frequency of a full-rate PD. Reference [9] proposed a phase detector for half-rate bang-bang CDR circuits, as shown in Fig. 5.

Fig. 5. Half-rate Alexander phase detector.

The phase detector uses four quadrature clocks to sample the data at 0°, 90°, 180° and 270°, and outputs the signals D_0, D_90, D_180 and D_270, respectively. Similar to a full-rate Alexander phase detector, the logic of the phase detector can be given as follows:

$$UP_1 = D_0 \oplus D_{90}, \quad DN_1 = D_{90} \oplus D_{180} \tag{7}$$

$$UP_2 = D_{180} \oplus D_{270}, \quad DN_2 = D_0 \oplus D_{270} \tag{8}$$

The phase detector adopts a completely symmetric design. However, when the PD operates at a very high bit rate, if the sum of the clock-to-output delay (CK-Q delay) of the D flip-flops and the delay of the XOR gates exceeds 1/4 of the clock cycle, an unpredictable glitch is generated in the instruction signals {UP, DN} [7]. In other words, the PD generates wrong phase-difference instruction signals {UP_1, DN_1} or {UP_2, DN_2}, which may degrade CDR performance. This letter presents an improved PD, which eliminates the mismatch and also guarantees the PD's reliability when it runs at very high speed.
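The XOR logic of Eqs. (7) and (8) and the quarter-cycle timing condition can be sketched in Python as follows. This is a minimal behavioural model under stated assumptions, not the circuit itself: the function names are hypothetical, the four samples are treated as a static snapshot (so which clock cycle the D_0 sample in DN_2 belongs to is not modelled), and the UP/DN multiplexer of Fig. 5 is omitted.

```python
def half_rate_pd(d0, d90, d180, d270):
    """Combine four quadrature data samples (0 or 1) as in Eqs. (7) and (8)."""
    return {
        "UP1": d0 ^ d90,     # Eq. (7)
        "DN1": d90 ^ d180,   # Eq. (7)
        "UP2": d180 ^ d270,  # Eq. (8)
        "DN2": d0 ^ d270,    # Eq. (8), written with D_0 as in the text
    }


def glitch_risk(t_ckq_ps, t_xor_ps, clock_period_ps):
    """True if CK-Q delay plus XOR delay exceeds one quarter of the clock cycle."""
    return (t_ckq_ps + t_xor_ps) > clock_period_ps / 4.0


if __name__ == "__main__":
    # Hypothetical sample pattern: the 90-degree sample already shows the next bit.
    print(half_rate_pd(0, 1, 1, 1))
    # The 200 ps / 100 ps split is hypothetical; only their sum matters here.
    print(glitch_risk(200.0, 100.0, 800.0))
```

For an 800 ps clock period the quarter-cycle budget is 200 ps, so any combined flip-flop and XOR delay above that value falls into the glitch-prone region the text describes.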
As seen in Fig. 6, four D flip-flops are added to resample D_0, D_90, D_180 and D_270, which ensures that the phase instruction signals {UP_1, DN_1} are produced at the same time, as are the signals {UP_2, DN_2}. Similarly, a selecting device includes four D flip-flops to sample the XOR output signals. The proposed PD thus guarantees the timing sequence of all critical paths.

When the clock lags the data, ideally the UP signal toggles and the DN signal stays low. Suppose that the delay mentioned in [7] occurs, with a delay time t = 300 ps (which exceeds 1/4 of the clock cycle, where one cycle period is 800 ps). Theoretical analysis under this condition shows that the half-rate Alexander PD has glitches. Fig. 7 shows the simulation of the half-rate Alexander PD and of the improved PD, where the solid lines {UP_1, DN_1} and {UP_2, DN_2} are the ideal signals and the dashed lines are the actual signals. The conventional structure generates glitches, while there are no glitches in the novel PD. It can be concluded that the improved structure prevents the generation of glitches.

Fig. 6. Improved half-rate PD.

Fig. 7. (a) Simulated by the conventional PD. (b) Simulated by the improved PD.

3.2. Finite State Machine Module

The FSM module consists of a Gray-code counter and its decoder. Gray code is a binary numeral system in which two successive values differ by only one bit; it was originally designed to prevent spurious output from electromechanical switches. After the FSM module is initialized, the Gray-code counter is set to {000}. The counter then receives the indicator signals {Incr, Dec} from the BDSR and changes by one bit on their rising edge. If the signals {Incr, Dec} change from {0, 0} to {1, 0}, the counter adds one. If the signals change from {0, 0} to {0, 1}, the counter subtracts one. While the signals remain unchanged, the counter does not count. Note that the signals {Incr, Dec} cannot be {1, 1}, since the BDSR module avoids this state. The decoder is associated with the Gray-code counter and produces the differential clock phase control signals according to the counter value. These signals are sent to the multiplexer to choose the phases of the PLL or DLL, and the chosen phases are then interpolated by the PI module. Table I lists the relation among the Gray-code counter, the decoder sequence and the coarse phases; the complementary coarse phases are not listed.

Table I. Relation of coding/decoding and coarse phase.

Gray Code Counter   Decoder Sequence   Quad Decoder Sequence   Coarse Phase Range
000                 1000               1000                    0°, 45°
001                 0010               1000                    45°, 90°
011                 0010               0010                    90°, 135°
010                 0100               0010                    135°, 180°
110                 0100               0100                    180°, 225°
111                 0001               0100                    225°, 270°
101                 0001               0001                    270°, 315°
100                 1000               0001                    315°, 0°

3.3. Data interpolator Module

When the BDSR sends a carry or borrow signal, the coarse phase shifts by π/N radians. If the coarse phases have gone through a full 2π rotation and the indicator signals continue to bang-bang after the reference clock is locked, we may conclude that the recovered clock has fallen onto the boundary between two coarse phases and cannot leave this state. In that case, the module interpolates some delay into the data, which changes the initial phase difference between the reference clock and the received data and makes the recovered clock track the delayed data again.
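The Gray-code counter behaviour of Subsection 3.2 and Table I can be sketched as a small Python model. It is a hedged illustration only: the class and attribute names are hypothetical, the wrap-around between 100 and 000 is inferred from the last row of Table I, and a direct change from {1, 0} to {0, 1} without passing through {0, 0} is assumed not to count because the text does not describe that case.

```python
# Counter states in the order listed in Table I.
GRAY_SEQUENCE = ["000", "001", "011", "010", "110", "111", "101", "100"]
# Coarse phase ranges in degrees, row by row as in Table I.
COARSE_RANGE_DEG = [(0, 45), (45, 90), (90, 135), (135, 180),
                    (180, 225), (225, 270), (270, 315), (315, 0)]


class GrayCounter:
    """Behavioural model: counts on a change of {Incr, Dec} away from {0, 0}."""

    def __init__(self):
        self.index = 0        # the counter is initialized to 000
        self.prev = (0, 0)    # previous {Incr, Dec} value

    def step(self, incr, dec):
        if (incr, dec) == (1, 1):
            raise ValueError("{1, 1} is excluded by the BDSR module")
        if self.prev == (0, 0):
            if (incr, dec) == (1, 0):
                self.index = (self.index + 1) % len(GRAY_SEQUENCE)
            elif (incr, dec) == (0, 1):
                self.index = (self.index - 1) % len(GRAY_SEQUENCE)
        self.prev = (incr, dec)
        return GRAY_SEQUENCE[self.index]

    def coarse_range(self):
        return COARSE_RANGE_DEG[self.index]


if __name__ == "__main__":
    counter = GrayCounter()
    for signals in [(1, 0), (0, 0), (1, 0), (0, 0), (0, 1)]:
        print(signals, counter.step(*signals), counter.coarse_range())
```

Because adjacent entries of the sequence differ in exactly one bit, each carry or borrow changes a single counter bit, which is the property the text relies on to avoid spurious decoder outputs.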
Based on the form of the signals {Incr, Dec} and the number of pulses, the module produces four control signals {CTR1, CTR2, CTR3, CTR4} to interpolate the corresponding delay into the data. Clock and data recovery proceeds in two stages. First, the recovered clock tracks the phase of the data. Second, once the CDR stabilizes, the recovered clock bang-bangs among several phases. During the first stage the DIL does not operate, so that it does not affect normal operation, and two main methods are adopted to ensure this. The first is that, in the design, the signal CTR1 takes effect about 5 μs later than the time at which the coarse phases have rotated 2π after the reference clock (PLL or DLL) is locked. As described in Subsection 3.2, when a high pulse is generated on the signal Incr or Dec, the coarse phase rotates by π/N radians. Therefore it is easy to set a suitable start time for the data delay control signals. The second is to make the time gap between {CTR1, CTR2}, {CTR2, CTR3} and {CTR3, CTR4} longer than a full 2π coarse-phase rotation, so that the loops have enough time to recover when the data phase changes.

The DIL need not interpolate a very accurate delay into the data, because its main function is to prevent the CDR from falling into the dead zone; changing the initial condition lets the CDR jump over it. In the example simulation waveform of Fig. 8, when the data interpolator control signals {CTR1, CTR2, CTR3} change from low to high, the corresponding delays {t1, t2, t3} are interpolated into the signal DATA_IN. In this case, the CDR stabilizes after three interpolations, so the data need not be delayed further and the signal CTR4 remains unchanged.

Fig. 8. Simulation of the delay control.

4. Simulation results

To verify the dual-loop PI controller architecture, a SerDes circuit has been implemented in a 0.13 μm CMOS process. Fig. 9 shows the layout of the circuit. The data interpolator module occupies only about ten percent of the CDR controller layout area of 0.1 mm².

Fig. 9. SerDes layout including CDR controller.

The CDR is simulated with a 1.25 GHz reference clock provided by a PLL with four differential stages and with 2.5 Gbps received data, and various initial phase differences between the received data and the reference clock are studied. In the demo design, CTR1, CTR2, CTR3 and CTR4 respectively interpolate delays of 50 ps, 70 ps, 100 ps and 130 ps into the received data.

The CDR system recovers the clock and data by comparing the phase difference between the reference clock and the data. However, the initial phase difference is uncertain, because the serial data streams are sent without an accompanying clock signal. As analyzed in the sections above, large dithering occurs when the CDR falls into the dead zone and jumps among coarse phases. Therefore, to compare the presented CDR with the conventional structure, different initial phase differences are studied. By sweeping the initial phase difference over a full clock cycle in steps of 50 ps, the large dithering can be found. For example, suppose that the initial phase difference between the reference clock and the received data is 50 ps, and compare the result with that of a single-loop PI controller. Fig. 10 (a) and (b) plot the corresponding results. Simulation shows that the presented CDR causes only 28 ps of dithering, while the single-loop PI causes a peak-to-peak recovered clock jitter of 205.5 ps in the worst case, where coarse phase dithering evidently occurs.

Furthermore, different initial phases over a full clock cycle are simulated in Fig. 11, where the x-coordinate represents the initial phase difference between the received data and the reference clock, and the y-coordinate represents the corresponding peak-to-peak jitter of the recovered clock. In Fig. 11 the small dots denote the clock jitter of the single loop, while the small triangles denote that of the dual loop. From the figure, the maximum jitter is no more than 29 ps in our design, whereas for the single-loop PI there are four cases in which the peak-to-peak recovered clock jitter is about 200 ps under the worst conditions.

Fig. 10. (a) Recovery clock jitter of single-loop controller. (b) Recovery clock jitter of dual-loop controller.
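The sweep of initial phase differences described above can be enumerated with a short Python sketch. It assumes an 800 ps clock period (the period of the 1.25 GHz reference clock) and a 50 ps step; the function name and the choice to start at 0 ps and stop one step before a full cycle are assumptions of this sketch, since the text does not state the sweep end points.

```python
import math


def initial_phase_sweep(period_ps=800.0, step_ps=50.0):
    """Return (offset_ps, offset_rad) pairs covering one clock cycle."""
    n_points = int(round(period_ps / step_ps))
    points = []
    for i in range(n_points):
        offset_ps = i * step_ps
        # Convert the time offset to a phase angle within one cycle.
        offset_rad = 2.0 * math.pi * offset_ps / period_ps
        points.append((offset_ps, offset_rad))
    return points


if __name__ == "__main__":
    for offset_ps, offset_rad in initial_phase_sweep():
        print(f"{offset_ps:6.1f} ps  {offset_rad:.3f} rad")
```

Expressing each offset in radians matches the x-axis used for the jitter plot over a full clock cycle.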

Fig. 11. Simulated jitter of the recovery clock. Legend: DLrecCK-jitter, SLrecCK-jitter. x-axis: initial phase difference between the received data and the reference clock (radian). y-axis: jitter in the recovered clock (ps).

As mentioned in this letter, coarse phase dithering occurs in the CDR only when the system meets certain conditions. It can be concluded that the new architecture further reduces the probability of its occurrence.

5. Conclusion

In conclusion, a dual-loop PI controller is presented in this letter. It reduces the probability of coarse phase jumping, and the recovered clock is interpolated among several fine phases. Thus, the peak-to-peak jitter of the interpolated clock is reduced. Since the proposed PI controller is based on digital logic operations, it can be used in many CDR architectures.

Acknowledgments

The authors thank the ASIC Team for providing advice and discussion.