Synchronous Sequential Design

SMD098 Computation Structures, Lecture 4

Synchronous sequential systems

Almost all digital systems have some concept of state: the outputs of a system depend on the past values of its inputs as well as the present values. Such systems are known as sequential systems, as opposed to combinational systems.

A sequential system can be:

- Asynchronous. The state is updated as soon as the next state changes (there is no clock signal).
- Synchronous. The state is only updated when a clock signal changes.

[Figure: simple model of a synchronous sequential system]

Why synchronous?

- Signals are sampled at well-defined time intervals. Problems with speed variation through different paths of logic, short and long paths, can easily be avoided.
- Glitches caused by dynamic and static hazards have no effect, since the data is sampled only after the glitches have had a chance to settle out. Asynchronous designs do not operate with a clock. They rely on handshaking between logic blocks and are sensitive to glitches and to the ordering of signals.
- Synchronous designs work well under variations of temperature, voltage and process. Changing an asynchronous design from a 0.5 micron process to a 0.18 micron process will change the timing of the design. The result will most likely be that the new design does not work as intended.
- Interfacing two synchronous blocks is simple. Interfacing asynchronous blocks is not simple.
- Synthesis and other tools do not handle asynchronous logic very well. Designing asynchronously puts much higher demands on the designer.

Never asynchronous?

- Asynchronous designs can sometimes be motivated. An asynchronous design can be faster and consume less power if designed correctly. But it is difficult: "Only idiots and geniuses design asynchronously." I don't consider myself to be an idiot or a genius.
- Interfacing asynchronous input signals is unavoidable in many designs.
- Interfacing clock domains that are asynchronous relative to each other is common. See lab 3.3, asynchronous FIFO.

Synchronous is simple: keep it simple! Asynchronous interfaces need special attention.

Finite State Machines - FSM

Moore FSM: outputs are a function of the current state.

[Figure: Moore FSM. Inputs -> Next State -> State Register -> Output -> Moore outputs]

Mealy FSM: outputs are a function of the current state and the inputs. We have combinational paths through the FSM.

[Figure: Mealy FSM. Inputs -> Next State -> State Register -> Output -> Mealy outputs, with the inputs also feeding Output]

State machines

Combined Mealy / Moore FSM.

[Figure: combined Mealy / Moore FSM. Inputs -> Next State -> State Register, followed by one Output block for the Mealy outputs and one Output block for the Moore outputs]

Registered output FSM: outputs are registered, which prevents output glitches. Output glitches may not matter though.

[Figure: registered output FSM. Inputs -> Next State -> State Register, and Next Output -> Output Register -> Registered outputs]
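The four structures can be summarised with one next-state function and one output function. The symbols below are notation introduced here for illustration and do not appear on the slides: $s[n]$ is the state in clock cycle $n$, $x[n]$ the inputs, $y[n]$ the outputs, $\delta$ the next-state logic and $\lambda$ the output logic.

```latex
\begin{aligned}
\text{State register (all types):} \quad & s[n+1] = \delta\bigl(s[n],\, x[n]\bigr) \\
\text{Moore outputs:}              \quad & y[n]   = \lambda\bigl(s[n]\bigr) \\
\text{Mealy outputs:}              \quad & y[n]   = \lambda\bigl(s[n],\, x[n]\bigr) \\
\text{Registered outputs:}         \quad & y[n+1] = \lambda\bigl(s[n],\, x[n]\bigr)
\end{aligned}
```

Only the Mealy form has $x[n]$ on the right-hand side of an unregistered output, which is the combinational path through the FSM mentioned above.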

Example from the text book

Low-end traffic light controller: a combined Mealy/Moore FSM.

[Figure: state diagram annotated with input, Mealy output, state name and Moore outputs]

The diagram shown is probably what you are used to seeing from the Digital Design course, but here you will also learn the Algorithmic State Machine (ASM) chart representation.

Algorithmic State Machines (ASM)

- State box. A state takes exactly one clock cycle to complete. If a signal is listed with no value associated to it (Y), it is logic one in that state and logic zero elsewhere. The notation Z ← 1 means that the signal is assigned at the end of the state, during the next clock cycle, and holds its value until otherwise set elsewhere.
- Decision box. Must follow and be associated with a state box. The decision is based upon one or more input signals in the same cycle as the other actions of the state.
- Conditional output box. Must follow a decision box. The outputs are asserted in the same clock cycle as those in the state box to which it is attached. The output signals are Mealy outputs since they depend on the present state as well as the inputs.

ASM chart for traffic light controller

[Figure: ASM chart for the traffic light controller; each state box spans 1 clock cycle]

In the digital course you should have learned how to manually obtain a hardware implementation of an FSM from a chart representation. If you have not, or have forgotten, read chapter 5.4.1 in the text book.

State encoding

State encoding is the way binary numbers are assigned to states. You may define your own encoding or let the synthesis tool define it.

Common encoding formats:

    No.  Binary  Gray  One-hot
     0   0000    0000  0000000000000001
     1   0001    0001  0000000000000010
     2   0010    0011  0000000000000100
     3   0011    0010  0000000000001000
     4   0100    0110  0000000000010000
     5   0101    0111  0000000000100000
     6   0110    0101  0000000001000000
     7   0111    0100  0000000010000000
     8   1000    1100  0000000100000000
     9   1001    1101  0000001000000000
    10   1010    1111  0000010000000000
    11   1011    1110  0000100000000000
    12   1100    1010  0001000000000000
    13   1101    1011  0010000000000000
    14   1110    1001  0100000000000000
    15   1111    1000  1000000000000000

There is no known method for determining in advance which state assignment/encoding is best in the sense of giving the simplest next-state logic.
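The Gray column of the table can be derived from the Binary column: the most significant bit is copied, and every other Gray bit is the XOR of two neighbouring binary bits. A minimal sketch of that rule as a VHDL function follows; the package name GrayPkg and the function name BinToGray are made up for this example and are not part of the course material.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package GrayPkg is
  -- Hypothetical helper: converts a binary code word to its Gray code word.
  function BinToGray(B : std_logic_vector) return std_logic_vector;
end package GrayPkg;

package body GrayPkg is
  function BinToGray(B : std_logic_vector) return std_logic_vector is
    -- Normalise the index range so the function works for any (non-empty) vector.
    alias Bn   : std_logic_vector(B'length - 1 downto 0) is B;
    variable G : std_logic_vector(B'length - 1 downto 0);
  begin
    -- The most significant bit is copied unchanged.
    G(G'high) := Bn(Bn'high);
    -- Every other bit is the XOR of the binary bit and its left neighbour.
    for i in G'high - 1 downto 0 loop
      G(i) := Bn(i + 1) xor Bn(i);
    end loop;
    return G;
  end function BinToGray;
end package body GrayPkg;
```

For row 5 of the table, BinToGray("0101") gives "0111": bit 3 is copied (0), then 0 xor 1 = 1, 1 xor 0 = 1 and 0 xor 1 = 1.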

State assignment guidelines

- We should always provide some means of initializing the state machine when power is applied. This can be done with asynchronous sets or resets. However, do not use asynchronous sets or resets that are sourced from internal combinational logic (more about this later).
- Minimizing the number of flip-flops is not necessarily good. The states may have some particular meaning. For instance, a state variable may be set in one state but in no others. This may result in simple output logic, but a non-minimal number of flip-flops.
- One-hot encoding allows very simple and fast next-state logic. In FPGAs there is a large number of flip-flops. If the flip-flops are not used they are wasted, so one-hot encoding is often used in FPGA implementations.
- For some reason (for example power minimization, or reducing the probability of entering an erroneous state for an FSM with asynchronous inputs) we may want to minimize the number of state bits that change between states. Then Gray coding should be chosen. Note that this is not meaningful if we, for instance, have an FSM where each state can transition to any other state.

Safety critical system

For a safety critical system there should be logic that detects if an FSM has entered an illegal state. When an illegal state is entered the FSM should be reset. For instance, we may have an FSM with five states and binary encoding (2^3 = 8). Then there are three illegal states that the FSM should never enter (but it may, for instance due to alpha particles).

What is a safety critical system?
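The five-state example above can be sketched with constants for the state codes, so that the three unused codes are visible to the case statement. This is only an illustration under assumptions: the entity name SafeFsm, the input Go, the output Illegal and the state sequence are invented here, and a synthesis tool may still re-encode or optimise the states unless its FSM optimisation is turned off.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity SafeFsm is
  port (
    Clk, Reset : in  std_logic;
    Go         : in  std_logic;   -- hypothetical input that starts the sequence
    Illegal    : out std_logic);  -- high while an unused state code is held
end entity SafeFsm;

architecture Rtl of SafeFsm is
  subtype StateVec is std_logic_vector(2 downto 0);
  -- Five legal states with binary encoding; "101", "110" and "111" are illegal.
  constant Idle : StateVec := "000";
  constant S1   : StateVec := "001";
  constant S2   : StateVec := "010";
  constant S3   : StateVec := "011";
  constant S4   : StateVec := "100";
  signal PresentState, NextState : StateVec;
begin
  NxtState : process (Go, PresentState)
  begin
    Illegal <= '0';  -- default output
    case PresentState is
      when Idle =>
        if Go = '1' then
          NextState <= S1;
        else
          NextState <= Idle;
        end if;
      when S1 => NextState <= S2;
      when S2 => NextState <= S3;
      when S3 => NextState <= S4;
      when S4 => NextState <= Idle;
      when others =>
        -- Illegal state detected: flag it and force the FSM back to Idle.
        Illegal   <= '1';
        NextState <= Idle;
    end case;
  end process NxtState;

  FFs : process (Clk, Reset)
  begin
    if Reset = '1' then
      PresentState <= Idle;
    elsif rising_edge(Clk) then
      PresentState <= NextState;
    end if;
  end process FFs;
end architecture Rtl;
```

With an enumerated state type the others branch would have nothing to cover, which is one reason the constants are used in this sketch.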

State encoding in VHDL

Using an enumerated type, the synthesis tool will decide the encoding:

    architecture Enc1 of StateMachine is
      type State is (Idle, S1, S2, S3, S4, S5);
      signal PresentState, NextState : State;
      ...
    end architecture Enc1;

Using constants to represent states. If this is done, make sure you turn off any synthesis FSM optimization, since the synthesis tool may re-encode the states.

    architecture Enc2 of StateMachine is
      -- One-hot: six states need six bits
      constant Idle : std_logic_vector(5 downto 0) := "000001";
      constant S1   : std_logic_vector(5 downto 0) := "000010";
      constant S2   : std_logic_vector(5 downto 0) := "000100";
      constant S3   : std_logic_vector(5 downto 0) := "001000";
      constant S4   : std_logic_vector(5 downto 0) := "010000";
      constant S5   : std_logic_vector(5 downto 0) := "100000";
      signal PresentState, NextState : std_logic_vector(5 downto 0);
      ...
    end Enc2;

Example Moore FSM

    library ieee;
    use ieee.std_logic_1164.all;

    entity StateMachine is
      port (
        Clk, Reset : in  std_logic;
        A, B       : in  std_logic;
        Y          : out std_logic_vector(2 downto 0));
    end StateMachine;

    architecture Moore of StateMachine is
      type State is (Idle, S1, S2, S3, S4, S5);
      signal PresentState, NextState : State;
      -- rest of code goes here
    end architecture Moore;

[Figure: Moore FSM. Inputs A, B -> Next State (NextState) -> State Register (PresentState) -> Output -> Moore outputs Y]

Example Moore FSM

Learn both types of representations!

Moore FSM three-process model

    NxtState : process (A, B, PresentState)
    begin
      case PresentState is
        when Idle =>
          NextState <= S1;
        when S1 =>
          if A = '1' then
            NextState <= S2;
          else
            NextState <= S1;
          end if;
        when S2 =>
          if A = '0' and B = '1' then
            NextState <= S3;
          elsif A = '1' and B = '1' then
            NextState <= S4;
          elsif A = '1' and B = '0' then
            NextState <= S5;
          else
            NextState <= S2;
          end if;
        when S3 =>
          NextState <= S4;
        when S4 =>
          NextState <= S5;
        when S5 =>
          NextState <= Idle;
      end case;
    end process;

    FFs : process (Clk, Reset)
    begin
      if Reset = '1' then
        PresentState <= Idle;
      elsif rising_edge(Clk) then
        PresentState <= NextState;
      end if;
    end process;

    Output : process (PresentState)
    begin
      case PresentState is
        when Idle => Y <= "000";
        when S1   => Y <= "101";
        when S2   => Y <= "111";
        when S3   => Y <= "011";
        when S4   => Y <= "110";
        when S5   => Y <= "000";
      end case;
    end process;

[Figure: Inputs -> Next State -> State Register -> Output -> Moore outputs; one process per block]

Moore FSM two-process model ver. 1

    -- Next-state logic and state register in one clocked process
    process (Clk, Reset)
    begin
      if Reset = '1' then
        PresentState <= Idle;
      elsif rising_edge(Clk) then
        case PresentState is
          when Idle =>
            PresentState <= S1;
          when S1 =>
            if A = '1' then
              PresentState <= S2;
            else
              PresentState <= S1;
            end if;
          when S2 =>
            if A = '0' and B = '1' then
              PresentState <= S3;
            elsif A = '1' and B = '1' then
              PresentState <= S4;
            elsif A = '1' and B = '0' then
              PresentState <= S5;
            else
              PresentState <= S2;
            end if;
          when S3 =>
            PresentState <= S4;
          when S4 =>
            PresentState <= S5;
          when S5 =>
            PresentState <= Idle;
        end case;
      end if;
    end process;

    -- Output logic
    process (PresentState)
    begin
      case PresentState is
        when Idle => Y <= "000";
        when S1   => Y <= "101";
        when S2   => Y <= "111";
        when S3   => Y <= "011";
        when S4   => Y <= "110";
        when S5   => Y <= "000";
      end case;
    end process;

[Figure: Inputs -> Next State -> State Register -> Output -> Moore outputs; Next State and State Register share one process]

Moore FSM two-process model ver. 2

    -- Next-state logic and output logic in one combinational process
    Comb : process (A, B, PresentState)
    begin
      case PresentState is
        when Idle =>
          NextState <= S1;
          Y <= "000";
        when S1 =>
          if A = '1' then
            NextState <= S2;
          else
            NextState <= S1;
          end if;
          Y <= "101";
        when S2 =>
          if A = '0' and B = '1' then
            NextState <= S3;
          elsif A = '1' and B = '1' then
            NextState <= S4;
          elsif A = '1' and B = '0' then
            NextState <= S5;
          else
            NextState <= S2;
          end if;
          Y <= "111";
        when S3 =>
          NextState <= S4;
          Y <= "011";
        when S4 =>
          NextState <= S5;
          Y <= "110";
        when S5 =>
          NextState <= Idle;
          Y <= "000";
      end case;
    end process;

    StateFFs : process (Clk, Reset)
    begin
      if Reset = '1' then
        PresentState <= Idle;
      elsif rising_edge(Clk) then
        PresentState <= NextState;
      end if;
    end process StateFFs;

[Figure: Inputs -> Next State -> State Register -> Output -> Moore outputs; Next State and Output share one process]

Use one of the three different models for describing your FSMs.

Combined Mealy/Moore FSM coding

    process (Car, Timed, PresentState)
    begin
      StartTimer <= '0';  -- Default output
      case PresentState is
        when S1 =>
          MajorGreen <= '1';
          MinorGreen <= '0';
          if Car = '1' then
            StartTimer <= '1';  -- Mealy output
            NextState <= S2;
          else
            NextState <= S1;
          end if;
        when S2 =>
          MajorGreen <= '0';
          MinorGreen <= '1';
          if Timed = '1' then
            NextState <= S1;
          else
            NextState <= S2;
          end if;
      end case;
    end process;

    process (Clk, Reset)
    begin
      if Reset = '1' then
        PresentState <= S1;  -- this FSM only has the states S1 and S2
      elsif rising_edge(Clk) then
        PresentState <= NextState;
      end if;
    end process;

[Figure: combined Mealy / Moore FSM. Inputs -> Next State -> State Register, followed by one Output block for the Mealy outputs and one Output block for the Moore outputs]

Synplify and FSM encoding

Synplify Pro has an FSM compiler. It automatically detects state machines in the source code. The FSMs are implemented with either sequential, gray or one-hot encoding.

    architecture Synplify of StateMachine is
      type State is (Idle, S1, S2, S3, S4, S5);
      signal PresentState, NextState : State;
      attribute syn_encoding : string;
      attribute syn_encoding of PresentState : signal is "onehot";
      ...
    end architecture Synplify;

To implement safe FSMs the attribute should be changed to

    architecture SynplifySafe of StateMachine is
      type State is (Idle, S1, S2, S3, S4, S5);
      signal PresentState, NextState : State;
      attribute syn_encoding : string;
      attribute syn_encoding of PresentState : signal is "onehot, safe";
      ...
    end architecture SynplifySafe;

Synplify FSM Compiler and Explorer

- Use the FSM viewer in Synplify! You may even do so before simulation.
- If FSM Explorer is enabled, Synplify tries to find the optimal state encoding. The FSM Compiler uses a default encoding.
- In simulation, make sure you trace the state signal in the waveform view. You will see the name of the enumerated value (e.g. S3) in the waveform viewer.

FSM partitioning and linked FSMs

Do not design FSMs that are too large. Break down large, complicated control paths into more manageable pieces. For instance, you may have one master FSM and one or more slave FSMs, or two or more FSMs that execute serially.

Note that the example shown is not a standard solution. There exist many more configurations for partitioned FSMs. As in many other cases it is very hard to find the optimal solution. But there are bad and good designs.

Data path / control path partitioning

It is not sufficient to describe a sequential system with one or more FSMs. It is common practice to partition a system into control and data paths.

Advantages:

- A better structural and logical decomposition of a system
- Design reuse of common data path blocks
- Efficient for CAD tools. Simplifies synthesis, place and route.

Simple example of data/control path partitioning

[Figure: three views of the same design: data and control path combined; extracted control path; data and control path structurally separated. Separate VHDL entities! Signals shown: X[3:0], X0, Reset, MuxSel, ShiftLeft, ShiftRight, Round, B[3:0]; blocks shown: FSM (control path) and data path]

A more complex data path

The pipelined MIPS architecture used in SMD082. Notice the clear definition of synchronizers (D flip-flops) that makes it easy to analyze the data path. Feedback paths are always synchronized!

[Figure: pipelined MIPS data path with instruction memory, register file, zero/sign extension, branch logic, ALU (inputs A and B), adders and data memory]

The SMD082 models of memories (register file, instruction and data memory) are simplified models. Don't try to implement this data path directly!

What to do before you start to code!

- What is the design specification? Make sure you fully understand it.
- Next, sit down and plan your design. Take a piece of paper and sketch your solution.
- Partition the design into data and control paths.
- Define which units are clocked and which are combinational.
- Use hierarchy when needed.
- When this is done, start coding in VHDL, not before!

Coding VHDL for synthesis is not like using any other programming language. Remember that it is hardware you want to implement, so think hardware! This is much easier if you plan your design.

Synchronous design timing

- Combinational loops
- Long paths, short paths, false paths, multicycle paths
- Clock skew, race hazards
- Global clock and clock enables
- Asynchronous reset, synchronous reset or no reset?
- Asynchronous inputs, synchronizers and metastability
- Crossing clock domains

Combinational loops

Combinational loops are a no-no in a synchronous digital design. Unexpected behavior may result. There may be a parasitic latch or even an oscillator. The result is simply unpredictable.

Fix: break every combinational loop with a synchronizer, i.e. a flip-flop.

How would you do it in this case?

Timing of a positive edge triggered D flip-flop

- Setup time, $t_{setup}$: the time the D input must be stable before the rising edge of the clock (assuming a positive edge triggered flip-flop).
- Hold time, $t_{hold}$: the time the D input must be stable after the rising edge of the clock.

[Figure: timing diagram of D, Clk and Q showing $t_{setup}$ and $t_{hold}$ around the clock edge; Q is marked "1, 0 or metastable!" when D changes inside the window]

If the setup or hold time parameters are violated, the Q output will be either logic 0 or logic 1, or the flip-flop will enter a metastable state from which it will eventually settle to a valid logic level.

Positive edge triggered flip-flop

[Figure: timing diagram of D, Clk and Q showing $t_{setup}$, $t_{hold}$, $t_{plh(cq)}$ and $t_{phl(cq)}$; Q is marked "1, 0 or metastable!" when D changes inside the setup/hold window]

Clock-to-output delay, $t_p$: the delay of a low-to-high transition and the delay of a high-to-low transition may be different. We use the worst case when analyzing the timing of circuits.

What is the maximum clock frequency?

[Figure: FF1 (D, Q) driving FF2 (D, Q) through logic, both clocked by Clk]

$$ f = \frac{1}{t_{pFF1} + t_{CL} + t_{setupFF2}} $$

Here $t_{CL}$ denotes the delay of the logic between FF1 and FF2. Routing delays and clock skew are not taken into account.
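As a quick numeric illustration of the maximum clock frequency, assume a clock-to-output delay of 2 ns for FF1, 6 ns of logic delay between the flip-flops (written $t_{CL}$ here) and a setup time of 2 ns for FF2. These three values are made up for the example and are not taken from the slides.

```latex
f = \frac{1}{t_{pFF1} + t_{CL} + t_{setupFF2}}
  = \frac{1}{(2 + 6 + 2)\,\text{ns}}
  = \frac{1}{10\,\text{ns}}
  = 100\,\text{MHz}
```

Every nanosecond added to the path between the two flip-flops lowers this limit, which is why the longest path sets the clock frequency.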

Timing constraints for a synchronous design

When we constrain a design for timing, we rely on the tools to make sure that we don't violate the setup and hold times. The constraints should at least be:

- The desired clock period (or frequency)
- Input arrival times relative to the clock
- Maximum output delay relative to the clock

[Figure: Xilinx FPGA with IOB flip-flops (IOBFF) and delay elements at the pins, CLB flip-flops (CLBFF) separated by combinatorial logic (CL1, CL2), and a BUFG clock buffer. CL = combinatorial logic]

For Xilinx FPGAs, always use the IOB flip-flops and always use the global buffers for clock routing.

Long path and short path

The long path, or critical path, of a design determines the maximum achievable clock frequency. But why do we have to worry about the short path?

[Figure: flip-flops connected through CL1 (5 ns) and CL2 (1 ns). Long path: 6 ns. Short path: 1 ns. The short path runs from FF1 to FF2, with clock skew $t_{skew}$ between their clock inputs]

For proper operation the following must be satisfied:

$$ t_{pFF1} + t_{short} > t_{skew} + t_{hFF2} $$

or else we will get hold time violations for FF2. As you can see, the relationship does not depend on the clock period.
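To see how little skew the short path tolerates, take the 1 ns short path from the figure and assume, purely for the sake of the example, a clock-to-output delay of 1 ns for FF1 and a hold time of 0.5 ns for FF2. Those two flip-flop values are not from the slides.

```latex
t_{skew} < t_{pFF1} + t_{short} - t_{hFF2}
         = (1 + 1 - 0.5)\,\text{ns}
         = 1.5\,\text{ns}
```

With these numbers, any skew of 1.5 ns or more gives a hold time violation at FF2, and slowing the clock down does not help because the clock period does not appear in the condition.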

Clock skew and race hazards

[Figure: IN -> FF1 (clocked by Clk1, output Q1) -> FF2 (clocked by Clk2, output Q2); Clk2 is Clk1 after a delay]

For proper function the following must be satisfied:

$$ t_{pFF(min)} + t_{p(min)} - t_{hold} - t_{skew(max)} > 0 $$

The short path is important!

[Figure: timing diagram of IN, Clk1, Q1, Clk2 and Q2 where Q2 captures the new value of Q1 one cycle too early. Ooops!]

Clock skew

What if we route the clock in the opposite direction of the data flow? Then for proper operation we must have:

$$ t_{pFF1} + t_{short} + t_{skew} > t_{hFF2} $$

[Figure: FF1 -> short path ($t_{short}$) -> FF2, with the clock reaching FF2 first and FF1 after $t_{skew}$]

So this is better, but in most designs you will have feedback paths, so it is not possible to route the clock signal in the opposite direction of the data flow. Also, clock skew will limit the maximum achievable clock frequency.

So what is the solution? Ensure that the clock skew between communicating registers is bounded. For ASIC design this requires careful design of the clock network. For FPGA design with Xilinx it is simple: use the global low-skew clock nets. This ensures that there will be no hold time violations, assuming you have one global clock.

Use clock enable instead of multiple clocks!

Clock trees

[Figure: H-tree and balanced tree clock distribution, each fed from a single clock source]

The delay from the clock source to each tree node should be matched as closely as possible in order to reduce the clock skew.

(Unless you really know what you are doing.)

False paths

The paths A1 → B2 and B1 → A2 are false paths. These false paths can be excluded when performing timing driven optimization, such as synthesis and implementation. Depending on what tools are used, a timing constraint should be set to indicate that the paths are in fact false paths.

[Figure: registers A1 and B1 feed shared combinatorial logic (CL) whose output feeds registers A2 and B2; A1 and A2 are enabled by EnA, B1 and B2 by EnB]

Multi cycle paths

Consider this non-loadable pre-scaled counter.

[Figure: 2 bit fast counter whose Carry Out drives the Enable input of a wide counter]

The registers in the wide counter are enabled at a rate that is one fourth of the clock rate. Hence the timing constraint for the wide counter can be set to a clock rate corresponding to $f_{clock}/4$. The combinatorial paths in the wide counter are multi cycle paths. The design tools do not automatically detect that this is the case, so a timing constraint indicating the multi cycle paths must be set.
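The pre-scaled counter can be sketched in VHDL as below. The entity name, the generic Width and the port names are assumptions made for this example. The multi cycle constraint itself is tool specific and is not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity PrescaledCounter is
  generic (Width : positive := 16);  -- hypothetical width of the wide counter
  port (
    Clk, Reset : in  std_logic;
    Count      : out std_logic_vector(Width + 1 downto 0));
end entity PrescaledCounter;

architecture Rtl of PrescaledCounter is
  signal Fast   : unsigned(1 downto 0);          -- 2 bit fast counter
  signal Wide   : unsigned(Width - 1 downto 0);  -- wide counter
  signal Enable : std_logic;
begin
  -- Carry out of the fast counter: high in one clock cycle out of four.
  Enable <= '1' when Fast = "11" else '0';

  process (Clk, Reset)
  begin
    if Reset = '1' then
      Fast <= (others => '0');
      Wide <= (others => '0');
    elsif rising_edge(Clk) then
      Fast <= Fast + 1;  -- wraps from "11" to "00"
      if Enable = '1' then
        -- The wide counter registers only change every fourth clock cycle,
        -- so the paths between them have four cycles to settle.
        Wide <= Wide + 1;
      end if;
    end if;
  end process;

  -- The complete count is the wide counter followed by the fast counter.
  Count <= std_logic_vector(Wide & Fast);
end architecture Rtl;
```

Note that only the paths that start and end in the Wide registers get four cycles. The path from Fast through Enable to the Wide registers is still a single cycle path.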

Synchronization at the system level

A Delay-Locked Loop (DLL) can align internal and external clocks. It effectively eliminates the on-chip clock distribution delay. This maximizes the achievable I/O speed.

[Figure: Chip 1 and Chip 2 exchanging data between flip-flops; each chip has a DLL (comparator, error, delay) in front of its clock distribution]

The Virtex FPGAs have DLLs. The DLLs can also be used to divide or double the incoming clock rate.

Asynchronous reset/preset is dangerous...

Never glitch an asynchronous reset or preset. Use a synchronous reset/preset if you generate the reset/preset signal from combinatorial logic.

[Figure: Counter with outputs Q1 and Q0 feeding a combinatorial block "SomeThing" that drives the counter's asynchronous reset]

Assume Q0: 1 → 0 and Q1: 0 → 1 (Q changes from 01 to 10). Variations in routing delay may cause a glitch.

[Figure: timing diagram of Q0, Q1 and the asynchronous reset, with a glitch on the reset]
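A sketch of the recommended alternative: the decoded signal is used as a synchronous clear, so it is only looked at on the clock edge, after any glitch has settled. The modulo-6 counter, its entity name and its signal names are invented for this illustration. The external Reset is assumed to be a clean power-up reset.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity Mod6Counter is
  port (
    Clk   : in  std_logic;
    Reset : in  std_logic;  -- external reset, assumed glitch free
    Q     : out std_logic_vector(2 downto 0));
end entity Mod6Counter;

architecture Rtl of Mod6Counter is
  signal Cnt   : unsigned(2 downto 0);
  signal Clear : std_logic;
begin
  -- Decoded from the counter outputs. During 011 -> 100 all three bits change,
  -- and unequal routing delays can make Cnt look like 101 for a short moment,
  -- so Clear may glitch.
  Clear <= '1' when Cnt = "101" else '0';

  process (Clk, Reset)
  begin
    if Reset = '1' then
      Cnt <= (others => '0');
    elsif rising_edge(Clk) then
      if Clear = '1' then
        -- Synchronous clear: Clear is sampled only on the clock edge.
        Cnt <= (others => '0');
      else
        Cnt <= Cnt + 1;
      end if;
    end if;
  end process;

  Q <= std_logic_vector(Cnt);
end architecture Rtl;
```

Had Clear been connected to the asynchronous reset of the flip-flops instead, a glitch during the 011 to 100 transition could have cleared the counter two counts early.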

Do all memory elements need a reset?

Short answer: Nope!

- FSMs must have a reset, so that when the chip is powered up the FSM can enter a predefined state.
- Data path registers usually do not need a reset. The control path knows when the contents of the data path registers are valid.

Never synchronize an asynchronous input in more than one place

[Figure: Bad! The asynchronous input feeds two separate synchronizer flip-flops that both drive the FSM. Good! The asynchronous input feeds a single synchronizer flip-flop that drives the FSM]

Why?

Asynchronous inputs

Asynchronous inputs are unavoidable in many applications. When an asynchronous input is synchronized to a clock there is always a risk that the synchronizing flip-flop will enter a metastable state, since the asynchronous input may change inside the setup/hold time window.

In the metastable state the output of the flip-flop is undefined. The flip-flop will eventually settle to logic 1 or logic 0, but this must have happened before the next flip-flop samples the signal, or else we have a failure.

[Figure: flip-flop output over time t, undefined at first and then resolving to either 1 or 0]

Metastability analysis

The Mean Time Between Failures of a synchronizer is determined by

$$ \mathrm{MTBF} = \frac{\exp(t_r/\tau)}{T_0 \, f_{in} \, f_{clock}} $$

where $t_r$ is the metastability resolution time, the maximum time the output can remain metastable without causing synchronizer failure. $\tau$ and $T_0$ are constants that depend on the electrical characteristics of the flip-flop. $f_{in}$ is the frequency of the asynchronous input and $f_{clock}$ is the frequency of the sampling clock.

Metastability analysis - example

Assume we have two identical flip-flops with $\tau = 1$ ns and $T_0 = 5 \cdot 10^{5}$ s. Both flip-flops are clocked at 10 MHz and the synchronizing flip-flop is sampling an asynchronous 3 kHz input.

[Figure: Asynch input -> synchronizer flip-flop -> CL (13 ns) -> flip-flop in design with $t_{su}$ = 2 ns]

Resolution time:

$$ t_r = \frac{1}{f_{clock}} - t_{CL} - t_{su} = (100 - 13 - 2)\ \text{ns} = 85\ \text{ns} $$

$$ \mathrm{MTBF} = \frac{\exp(85)}{5 \cdot 10^{5} \cdot 10 \cdot 10^{6} \cdot 3 \cdot 10^{3}} = 5.5 \cdot 10^{20}\ \text{s} = 1.74 \cdot 10^{13}\ \text{years} $$

Not that bad!

Metastability example

What happens if we increase the system clock to 20 MHz?

$$ t_r = \frac{1}{f_{clock}} - t_{CL} - t_{su} = (50 - 13 - 2)\ \text{ns} = 35\ \text{ns} $$

$$ \mathrm{MTBF} = \frac{\exp(35)}{5 \cdot 10^{5} \cdot 20 \cdot 10^{6} \cdot 3 \cdot 10^{3}} = 0.053\ \text{s} $$

Not good at all!
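The dramatic difference between the 10 MHz and 20 MHz cases comes from the exponential term. A short derivation, using only the MTBF formula and the example's value of 1 ns for $\tau$, shows how much resolution time one decade of MTBF costs:

```latex
\frac{\mathrm{MTBF}(t_r + \Delta t)}{\mathrm{MTBF}(t_r)} = \exp\!\left(\frac{\Delta t}{\tau}\right),
\qquad
\exp\!\left(\frac{\Delta t}{\tau}\right) = 10
\;\Longleftrightarrow\;
\Delta t = \tau \ln 10 \approx 2.3\,\text{ns} \quad (\tau = 1\,\text{ns})
```

So every 2.3 ns of lost resolution time costs a factor of ten in MTBF. Going from 85 ns to 35 ns loses 50 ns, a factor of $\exp(50) \approx 5.2 \cdot 10^{21}$, and doubling the clock frequency costs another factor of two. The product, about $10^{22}$, matches the ratio between $5.5 \cdot 10^{20}$ s and 0.053 s.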

Cascaded synchronizer

It is possible to increase the MTBF by using cascaded flip-flops as synchronizers.

[Figure: Asynch input -> synchronizer FF1 -> routing delay (2 ns) -> synchronizer FF2 -> CL (13 ns) -> flip-flop in design]

The critical input frequency to FF2 is the mean failure frequency of FF1, i.e. the inverse of the MTBF for FF1:

$$ f_{in2} = \frac{5 \cdot 10^{5} \cdot 20 \cdot 10^{6} \cdot 3 \cdot 10^{3}}{\exp(50 - 2 - 2)} \approx 3.2 \cdot 10^{-4}\ \text{Hz} $$

$$ \mathrm{MTBF} = \frac{\exp(35)}{5 \cdot 10^{5} \cdot 20 \cdot 10^{6} \cdot f_{in2}} \approx 5.0 \cdot 10^{5}\ \text{s} \approx 5.8\ \text{days} $$

Changed from 0.05 s to about 6 days, but this is still not good enough. What can we do to further increase the MTBF?

Metastability Recovery - XAPP094 (1997)

[Figure 2 (X5986): Mean Time Between Failure for various IOB and CLB flip-flop outputs when synchronizing a ~1 MHz asynchronous input with a 10 MHz clock. Vertical axis: MTBF in log seconds, with 1 minute, 1 hour, 1 day, 1 year, 1,000 years and 1 million years marked. Horizontal axis: acceptable extra delay (ns). Curves: XC4005E-3 CLB, XC4005E-3 IOB, XC5206-5 CLB, XC3142A-09 IOB, XC4005-6 IOB, XC4005-6 CLB, XC3142A-09 CLB, XC3042-70 IOB, XC3042-70 CLB]
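The cascaded synchronizer analysed above is simply two flip-flops in series on the sampling clock. A minimal sketch follows; the entity and signal names are chosen for this example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Synchronizer is
  port (
    Clk     : in  std_logic;
    AsyncIn : in  std_logic;   -- asynchronous input
    SyncOut : out std_logic);  -- the only version of the input the design should use
end entity Synchronizer;

architecture Rtl of Synchronizer is
  signal FF1, FF2 : std_logic;
begin
  process (Clk)
  begin
    if rising_edge(Clk) then
      FF1 <= AsyncIn;  -- may go metastable when AsyncIn changes near the clock edge
      FF2 <= FF1;      -- FF1 has had almost a full clock period to resolve
    end if;
  end process;

  SyncOut <= FF2;
end architecture Rtl;
```

Keeping FF1 and FF2 physically close, with no logic between them, keeps the routing delay small and leaves as much of the clock period as possible for FF1 to resolve. The price is one extra clock cycle of latency.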

Crossing clock domains

When clock A and clock B are totally independent, we are in trouble. There are no completely safe methods to transfer data between two unrelated clock domains. But there are good and bad solutions.

[Figure: Block A (domain A, clock A) and Block B (domain B, clock B) connected by HandShake A-B, Data A-B, HandShake B-A and Data B-A]

One good solution: carefully synchronize the handshaking signals at both ends. Use a toggle exchange protocol.

[Figure: timing diagram of HandShake A-B, Data A-B (valid) and HandShake B-A]

In lab 3.3 you will use an asynchronous FIFO to synchronize sampled data from one clock domain to another. You will generate the FIFO with Xilinx CoreGenerator. The tricky part with the FIFO is the implementation of the two FIFO flags, Empty and Full.

[Figure X5887: asynchronous FIFO block diagram. Dual-port memory (XC4000E CLBs) with data in and data out, write enable and write clock, read enable and read clock; 4-bit write and read counters supply the write and read addresses; direction logic, full logic and empty logic generate the FULL and EMPTY flags]
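One possible building block for the toggle exchange protocol mentioned above is sketched here for the receiving side. It rests on assumptions that are not spelled out in the slides: domain A flips the handshake signal once per transfer and keeps the data stable until it is acknowledged, and the handshake signal is 0 after reset. The entity and signal names are invented for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ToggleReceiver is
  port (
    ClkB    : in  std_logic;
    ResetB  : in  std_logic;
    ToggleA : in  std_logic;   -- handshake from domain A, flips once per transfer
    ValidB  : out std_logic);  -- high for one ClkB cycle per flip
end entity ToggleReceiver;

architecture Rtl of ToggleReceiver is
  -- Sync(0) and Sync(1) form a cascaded synchronizer,
  -- Sync(2) is a one-cycle delayed copy used to detect a change.
  signal Sync : std_logic_vector(2 downto 0);
begin
  process (ClkB, ResetB)
  begin
    if ResetB = '1' then
      Sync <= (others => '0');
    elsif rising_edge(ClkB) then
      Sync <= Sync(1 downto 0) & ToggleA;
    end if;
  end process;

  -- Any change of the synchronized handshake, rising or falling, marks new data.
  ValidB <= Sync(2) xor Sync(1);
end architecture Rtl;
```

Because the handshake is a level that toggles rather than a pulse, it cannot be missed by the receiving clock no matter how the two clock frequencies relate. The same structure, mirrored, can carry the acknowledge back to domain A.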