VLSI Clock Domain Crossing
Giorgos Dimitrakopoulos
Electrical and Computer Engineering, Democritus University of Thrace
dimitrak@ee.duth.gr
Clock relationships
- Asynchronous: the clock domains are completely unrelated
- Plesiochronous: the two clock frequencies are nominally equal but differ slightly, so their relative phase drifts slowly
- Mesochronous: same frequency; the phase offset is constant or changes slowly
Send data across clock domains: Metastability
Time to resolve metastability
If the flip-flop output transition happens significantly later than the nominal clock-to-Q (C-Q) delay, then we know that the flip-flop was metastable.
Main Effect of Metastability
- Normal operation
- Long delay (FF1) may lead to failure
[Figure: timing diagrams of Clock, R, R1, R2, R3 for a path from FF1 through logic CL to FF2, with clock period T.]
2009 Ran Ginosar
Another Effect of Metastability
- Fork: short and long paths
[Figure: circuit with FF1, FF2, FF3 and signals R, R1, R2, R3, R4, plus a timing diagram of Clock, R, R1, R2, R3, R4 with labels T and CL.]
Failure means that a flip-flop became metastable after the clock's sampling edge and that it is still metastable a time S later.
- A flip-flop can become metastable when its input D changes within a short window T_W around the clock's sampling edge (a sort of setup-and-hold time).
- The probability that D changes within the T_W window is T_W / T_C = T_W * F_C (F_C is the clock frequency).
- D may not change every cycle; if it changes at a rate F_D, the rate of entering metastability is T_W * F_C * F_D.
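The rate above can be turned into a rough mean-time-between-failures estimate. The sketch below assumes the commonly used exponential resolution model, in which the probability of still being metastable after a time S is exp(-S / tau). The time constant tau and all numeric values are assumptions for illustration and do not come from the slides.

```python
import math

def metastability_entry_rate(t_w, f_c, f_d):
    """Rate (events per second) at which D changes inside the window T_W."""
    return t_w * f_c * f_d

def mtbf(s, tau, t_w, f_c, f_d):
    """Mean time between failures in seconds.

    Assumes (not stated in the slides) that metastability resolves
    exponentially with time constant tau, so a failure after settling
    time s has probability exp(-s / tau) per metastable event.
    """
    return math.exp(s / tau) / metastability_entry_rate(t_w, f_c, f_d)

if __name__ == "__main__":
    # Hypothetical values: 20 ps window, 1 GHz clock, 100 MHz data rate,
    # 10 ps time constant, and about one clock period of settling time.
    print(mtbf(s=0.9e-9, tau=10e-12, t_w=20e-12, f_c=1e9, f_d=100e6))
```

Because S sits in the exponent, each extra clock period of settling time multiplies the estimate by exp(T / tau), which is why adding a flip-flop stage is so effective.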
The Two-FF Synchronizer: PREVIEW
- Control synchronizer
- Use OUT after one clock cycle
- Metastable X settles after time S = T - t_SU
- Failure rate too high? Add another FF / another cycle
[Figure: IN -> FF1 -> X -> FF2 -> OUT, and the three-flip-flop variant IN -> FF1 -> X -> FF2 -> FF3 -> OUT, clocked by Clock, with settling time S marked.]
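The Two-FF Synchronizer: PREVIEW
- Data too? Synchronize only the control line, not the data!
- The synchronized control is used as the load-enable of the data register.
- With one synchronizer flip-flop: S = T - t_CL - t_SU
- With two synchronizer flip-flops: S = 2T - t_CL - 2 t_SU
[Figure: RDY -> FF1 -> enable of REG (data input), and RDY -> FF1 -> FF2 -> enable of REG, both clocked by Clock B, with settling time S marked.]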
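As a small worked example of the two settling-time expressions, the sketch below evaluates both for one set of timing values. The function names and the picosecond figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def settling_time_one_ff(t_period, t_cl, t_su):
    """S = T - t_CL - t_SU for the single-flip-flop enable path."""
    return t_period - t_cl - t_su

def settling_time_two_ff(t_period, t_cl, t_su):
    """S = 2T - t_CL - 2*t_SU for the two-flip-flop enable path."""
    return 2 * t_period - t_cl - 2 * t_su

if __name__ == "__main__":
    # Hypothetical timing in picoseconds: 1 ns clock, 200 ps logic, 50 ps setup.
    T, T_CL, T_SU = 1000, 200, 50
    print(settling_time_one_ff(T, T_CL, T_SU))   # 750
    print(settling_time_two_ff(T, T_CL, T_SU))   # 1700
```

The second flip-flop adds T - t_SU of settling time, at the cost of one more clock cycle of latency.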
a) Q1 could switch at the beginning of clock cycle 1, and Q2 will copy that on clock cycle 2.
b) Q1 could completely miss D1. It will surely rise on cycle 2, and Q2 will rise one cycle later.
c) FF1 could become metastable, but its output stays low. It later resolves so that Q1 rises (the bold rising edge). This will happen before the end of the cycle. Then Q2 rises in cycle 2.
d) FF1 could become metastable, its output stays low, and when it resolves, the output still stays low. This appears the same as case (b). Q1 is forced to rise in cycle 2, and Q2 rises in cycle 3.
e) FF1 goes metastable, and its output goes high. Later, it resolves to low (we see a glitch on Q1). By the end of cycle 1, Q1 is low. It rises in cycle 2, and Q2 rises in cycle 3.
f) FF1 goes metastable, its output goes high, and it later resolves to high. Q1 appears the same as case (a). Q2 rises in cycle 2.
Once the input signal is stable, output Q2 resolves in 2 or 3 cycles, depending on how the first flip-flop resolves.
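The six cases collapse into two outcomes for Q2, which a very small behavioral model can reproduce. This is a sketch under simplifying assumptions: D rises close to the sampling edge of cycle 1, FF1 always resolves before the end of that cycle, and the only thing that varies is whether it resolves high or low. The function name and the cycle numbering are illustrative.

```python
def q2_rise_cycle(resolves_high, n_cycles=5):
    """Return the clock cycle in which Q2 goes high.

    resolves_high=True  models cases (a), (c), (f): Q1 is high by the end of cycle 1.
    resolves_high=False models cases (b), (d), (e): Q1 is low at the end of cycle 1.
    """
    q1 = q2 = 0
    for cycle in range(1, n_cycles + 1):
        next_q2 = q1                      # FF2 samples the previous value of Q1
        if cycle == 1:
            next_q1 = 1 if resolves_high else 0   # ambiguous sample of D
        else:
            next_q1 = 1                   # D is stable high from cycle 2 on
        q1, q2 = next_q1, next_q2
        if q2 == 1:
            return cycle
    raise RuntimeError("Q2 did not rise within n_cycles")

if __name__ == "__main__":
    print(q2_rise_cycle(True))    # 2
    print(q2_rise_cycle(False))   # 3
```

The model does not represent the residual chance that FF1 is still metastable when FF2 samples it; that is the failure case the MTBF analysis covers.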
2FF Synchronizer Handshake
Requests must be held long enough for the receiver to capture them (the "three-edge requirement").
Control Synchronizer
[Figure: block diagram. TX FSM (signals V, F, A2, R) and RX FSM (signals R2, R3, A, D), with an edge detector on the RCV clock deriving D from R2 and R3.]
-- TRANSMITTER (inputs V, A; output R)
process (tx_clock)
begin
  if rising_edge(tx_clock) then
    A1 <= A;   A2 <= A1;          -- 2-FF synchronizer for A
    A3 <= A2;  F  <= A3 xor A2;   -- one-shot (edge detector)
    case tx_fsm_state is
      when idle =>
        if V = '1' then
          tx_fsm_state <= req;  R <= '1';
        end if;
      when req =>
        if A2 = '1' then
          tx_fsm_state <= waiting;  R <= '0';
        end if;
      when waiting =>
        if A2 = '0' then
          tx_fsm_state <= idle;
        end if;
      when others =>
        tx_fsm_state <= idle;  R <= '0';
    end case;
  end if;
end process;

-- RECEIVER (input R; output A)
process (rx_clock)
begin
  if rising_edge(rx_clock) then
    R1 <= R;   R2 <= R1;          -- 2-FF synchronizer for R
    R3 <= R2;  D  <= R3 xor R2;   -- one-shot (edge detector)
    case rx_fsm_state is
      when idle =>
        if R2 = '1' then
          rx_fsm_state <= ack;  A <= '1';
        end if;
      when ack =>
        if R2 = '0' then
          rx_fsm_state <= idle;  A <= '0';
        end if;
      when others =>
        rx_fsm_state <= idle;  A <= '0';
    end case;
  end if;
end process;
How much latency?
- Worst case R -> R2 is two Rx cycles (best: ~1)
- One more Rx cycle for R2 -> ACK
- Worst case A -> A2 is two Tx cycles
- One Tx cycle for A2 -> R- (R deasserted)
- Three Rx cycles for R- -> A- (best: 2)
- Three Tx cycles for A- -> next R
- Total worst case: 6 Tx cycles + 6 Rx cycles
- Total best case: 5 + 5
[Figure: FSM state diagrams. TX: IDLE, REQ (R=1), WAIT, with transitions on V and A2. RX: IDLE, ACK (A=1), with transitions on R2.]
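The cycle counts above translate directly into time per transfer once the two clock periods are known. The following is a minimal sketch of that arithmetic; the function name and the example periods are hypothetical, and the default counts are the worst-case totals from the list.

```python
def handshake_time(t_tx, t_rx, tx_cycles=6, rx_cycles=6):
    """Time for one full four-phase handshake.

    t_tx, t_rx: clock periods of the transmitter and receiver domains.
    tx_cycles, rx_cycles: cycles spent in each domain (worst case 6 + 6,
    best case 5 + 5 according to the totals above).
    """
    return tx_cycles * t_tx + rx_cycles * t_rx

if __name__ == "__main__":
    # Equal periods of 1 time unit: 12 units worst case, 10 units best case.
    print(handshake_time(1, 1))
    print(handshake_time(1, 1, tx_cycles=5, rx_cycles=5))
    # Hypothetical mismatched clocks: 2 ns transmitter, 5 ns receiver.
    print(handshake_time(2, 5))
```

With equal clock periods the two results are 12 and 10 cycles per transferred item, so throughput is bounded by one transfer per 10 to 12 cycles.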
Longer Latency (in-phase mesochronous clocks)
[Figure: waveforms of CLK-TX, CLK-RX, R, R1, R2, A, A1, A2.]
Data cycle = 12 cycles. R, A1, A2 are in the CLK-TX domain, and R1, R2, A are in the CLK-RX domain.
Short Latency (out-of-phase mesochronous clocks)
[Figure: waveforms of CLK-TX, CLK-RX, R, R1, R2, A, A1, A2.]
Data cycle = 10 cycles.
Sync FIFO behavior
- Once new data is written, the read side needs 2 RCV cycles to detect it, plus 1 RCV cycle to read it out.
- When the FIFO is full and a read takes place, the send side needs 2 SND cycles to detect the free space, plus 1 SND cycle to write a new word into the FIFO.
- Total: 3 x rcv_cycle + 3 x snd_cycle.
- More sync stages would increase the latency.
- If rcv_cycle can be equal to snd_cycle (frequencies may change dynamically), at least 6 FIFO slots are needed.
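The slot count follows from the round-trip latency of the FIFO control loop. The sketch below reproduces that arithmetic and parameterizes the number of synchronizer stages. Generalizing beyond two stages is an assumption extrapolated from the "more sync stages" remark, and the function names are hypothetical.

```python
def fifo_round_trip(rcv_cycle, snd_cycle, sync_stages=2):
    """Round-trip latency: (sync_stages + 1) cycles in each clock domain.

    With the default of 2 stages this is 3 x rcv_cycle + 3 x snd_cycle.
    """
    per_side = sync_stages + 1   # synchronizer stages plus one cycle to read or write
    return per_side * rcv_cycle + per_side * snd_cycle

def min_slots_equal_clocks(sync_stages=2):
    """Minimum FIFO slots to keep the sender busy when rcv_cycle == snd_cycle.

    Assumes the sender writes one word per cycle, so it needs one slot per
    cycle of round-trip latency.
    """
    return fifo_round_trip(1, 1, sync_stages)

if __name__ == "__main__":
    print(fifo_round_trip(1, 1))        # 6 cycles
    print(min_slots_equal_clocks())     # 6 slots
    print(min_slots_equal_clocks(3))    # 8 slots with a third sync stage
```

A FIFO shallower than this still works correctly but stalls the sender while it waits for the free-space indication to cross back.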