Digital Integrated Circuit Design II
ECE 426/526, Chapter 10
$Date: 2016/04/07 00:50:16 $
Professor R. Daasch
Department of Electrical and Computer Engineering
Portland State University, Portland, OR 97207-0751 (daasch@ece.pdx.edu)
Course Website: http://ece.pdx.edu/%7eecex26
[Note: links are parsed by Adobe Reader but may not be parsed by browser viewers]
R. Daasch, Portland State University, April 2016
Chapter 10 introduces feedback into a CMOS circuit to realize bistable elements.
Bistable elements form the basis for restoring (active) data storage.
[Figure: classification of logic circuits. Regenerative = No: combinational circuits. Regenerative = Yes: sequential circuits, divided into bistable and monostable.]
The elements discussed in Chapter 10 synchronize logic state transitions and store the states.
Their role is to separate, in space and time, the current and next states of a finite state machine.
Additional sequencing delay (overhead) is unavoidable, e.g. setup and hold time requirements.
CMOS data storage has one or two stable states.
A circuit with one (single) stable state is monostable.
After a small perturbation on either the input or the output, the output returns to the original state.
A static CMOS logic gate is monostable.
A circuit with two stable states is bistable.
It is more dependent on the operating point than a monostable circuit.
Small perturbations can toggle it from state A to state B.
All regenerative storage is bistable.
Not all data storage is bistable.
Not all data storage is regenerative.
The simplest storage is a cross-coupled inverter pair.
It is the basic element of a RAM cell and, effectively, of the D-latch.
Storing a random value is easy: just turn it on.
Storing complementary values on specific nodes is harder in a RAM cell (easier in a D-latch).
Harder still is reading the complementary values and having them stay there.
Retains a token.
Memories are discussed in Chapter 12.
The stable states are the intersections of the two inverter DC VTCs where the gain is near 0, the low-energy points.
Small changes are attenuated (reduced), hence a preferred state.
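The attenuation of small changes at a stable state can be illustrated numerically. The sketch below is not from the slides: it assumes an idealized tanh-shaped inverter VTC, a 1 V supply and a switching-point gain of 8 (all hypothetical values), and lets a starting voltage circulate around the two-inverter loop. Starting points near a rail settle back to that rail, while a start slightly off the midpoint, where the gain is high, is driven to one rail or the other.

```python
import math

VDD = 1.0    # assumed supply voltage in volts (illustrative)
GAIN = 8.0   # assumed gain magnitude at the switching point (illustrative)

def inverter(v_in):
    """Idealized inverter VTC: a smooth step centered at VDD/2 (not a device model)."""
    return 0.5 * VDD * (1.0 - math.tanh(2.0 * GAIN * (v_in - 0.5 * VDD) / VDD))

def settle(v_start, passes=50):
    """Start node Q at v_start, then let the value circulate around the loop."""
    v = v_start
    for _ in range(passes):
        v = inverter(inverter(v))  # two inversions bring the signal back to node Q
    return v

if __name__ == "__main__":
    for start in (0.05, 0.95, 0.5 * VDD - 0.01, 0.5 * VDD + 0.01):
        print(f"start {start:.2f} V -> settles at {settle(start):.4f} V")
```

With these assumed values a 50 mV disturbance from either rail dies out, and a 10 mV offset from the midpoint grows until the loop reaches a rail.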
The third intersection is at maximum gain, and any change forces the circuit to one of the stable states.
For CMOS the maximum difference in voltage is limited to the supply voltage: $\Delta V_{max} = V_{OH,max} - V_{OL,min} = V_{DD}$
For this reason memories use clocks and additional inputs to control feedback.
Storage (memory) elements synchronize data to a (global) clock.
Clock period $T$ (cycle time $T_c$): the time between successive sampling edges of the clock signal.
Three common clock schemes:
Single phase (see later): requires edge-triggered F-F to avoid races.
Two non-overlapping phases: two active clocks with intentional gaps between the active phases; (simple) design with latches.
Pulsed clock: the middle ground between single phase and two non-overlapping phases, with an intentionally small (short duty cycle) active phase.
Setup time ($T_s$) is the time data (D) must be stable before the clock edge.
Hold time ($T_h$) is the time data (D) must be stable after the clock edge.
Clock-to-Q delay ($T_q$) is the delay from the sampling clock edge to new data.
All the typical control types (D, SR, T, JK) have CMOS circuit implementations.
The CMOS latch is the most common bistable storage circuit.
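The setup and hold definitions amount to a keep-out window around the sampling edge. As a minimal sketch (the function name, argument names and example times are hypothetical, not taken from the slides), a list of data transition times can be checked against that window:

```python
def timing_ok(data_change_times, clock_edge, t_setup, t_hold):
    """Return True if D has no transition inside the setup/hold window.

    The window runs from (clock_edge - t_setup) to (clock_edge + t_hold).
    All times are in the same arbitrary unit (for example, ns).
    """
    window_start = clock_edge - t_setup
    window_end = clock_edge + t_hold
    return not any(window_start < t < window_end for t in data_change_times)

if __name__ == "__main__":
    # Sampling edge at t = 5.0, assumed setup 0.5 and hold 0.3.
    print(timing_ok([1.0, 9.0], 5.0, 0.5, 0.3))  # transitions well outside the window
    print(timing_ok([4.8], 5.0, 0.5, 0.3))       # transition inside the setup window
    print(timing_ok([5.2], 5.0, 0.5, 0.3))       # transition inside the hold window
```

Only the first call passes; the other two place a data transition inside the window and would be flagged as setup and hold violations respectively.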
An SR F-F is assembled from NAND or NOR gates.

SR latch (NAND)
S  R  Q
0  0  Undefined
0  1  1
1  0  0
1  1  Q

SR latch (NOR2)
S          R          $Q_{n+1}$   $\bar{Q}_{n+1}$   Operation
0 (V_OL)   0 (V_OL)   1 (V_OH)    0 (V_OL)          M1/M4 off, M2 on
0 (V_OL)   0 (V_OL)   0 (V_OL)    1 (V_OH)          M1/M4 off, M3 on
1          0          1           0                 M1/M2 on, M3/M4 off
0          1          0           1                 M1/M2 off, M3/M4 on

Compare the operations using Q.
Transitions from 1 -> 0 and 0 -> 1 can be estimated from the gate delay model.
$\tau_{rise,q} = \tau_{rise,q(nor)} + \tau_{fall,q(nor)}$
A latch or flip-flop (F-F) can be clocked or unclocked.
Weste & Harris define a latch to be level-sensitive.
Weste & Harris define a flip-flop to be edge-triggered.
The clocked version typically disables the input signals (e.g. D, or S and R).
The AND gate in the logic design is realized as simply as two transistors in series, with a 2,1 AND-OR-INVERT (AOI) replacing each NOR.
When CLK = 0 (AOI), the latch is reduced to the two cross-coupled inverters.
A latch or F-F can include synchronous and asynchronous (re)sets.
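The clocked NOR-based SR latch can be sketched at the logic level. This is an illustrative model only: the function names are hypothetical, the AND-gated inputs stand in for the AOI structure, and gate delays are replaced by repeated evaluation until the outputs stop changing.

```python
def nor(a, b):
    """Two-input NOR on 0/1 integers."""
    return int(not (a or b))

def clocked_sr_latch(s, r, clk, q, q_bar, max_passes=4):
    """Settle a NOR-based SR latch whose S and R inputs are ANDed with CLK.

    (q, q_bar) is the stored state before the inputs are applied.
    Returns the settled (q, q_bar).
    """
    s_gated, r_gated = s & clk, r & clk
    for _ in range(max_passes):
        new_q = nor(r_gated, q_bar)
        new_q_bar = nor(s_gated, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break  # outputs are stable
        q, q_bar = new_q, new_q_bar
    return q, q_bar

if __name__ == "__main__":
    print(clocked_sr_latch(1, 0, 1, q=0, q_bar=1))  # set
    print(clocked_sr_latch(0, 1, 1, q=1, q_bar=0))  # reset
    print(clocked_sr_latch(1, 1, 0, q=1, q_bar=0))  # CLK = 0: inputs ignored, state held
```

In the set case the model needs two evaluation passes before Q rises (first the complementary output falls, then Q follows), which matches the two-gate sum in the rise-time estimate. With CLK = 0 the gated inputs are both 0 and the two NORs act as cross-coupled inverters.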
The asynchronous (re)set signal ARST is realized with an additional OR branch parallel to the CLK.
The JK F-F is common in TTL; it is a catch-all that covers the set, reset, T (toggle) and D (latch) functions.

JK
J  K  Q
0  0  Q
0  1  0
1  0  1
1  1  $\bar{Q}$

The T (toggle) F-F is a restricted form of the JK.
Simple function (commonly used in TTL counters).
Hard to test (race/hazard conditions).
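The JK table and its toggle restriction can be written as next-state functions. This is a behavioral sketch with hypothetical function names; it models only the state after the active clock edge, not the circuit or its race conditions.

```python
def jk_next(j, k, q):
    """Next state of a JK flip-flop after the active clock edge."""
    if j and k:
        return 1 - q  # J = K = 1: toggle
    if j:
        return 1      # J = 1, K = 0: set
    if k:
        return 0      # J = 0, K = 1: reset
    return q          # J = K = 0: hold

def t_next(t, q):
    """T flip-flop as the restricted JK case J = K = T."""
    return jk_next(t, t, q)

if __name__ == "__main__":
    for j in (0, 1):
        for k in (0, 1):
            print(f"J={j} K={k}: Q=0 -> {jk_next(j, k, 0)}, Q=1 -> {jk_next(j, k, 1)}")
```

Tying J and K together leaves only the hold and toggle rows, which is the T behavior.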
T  Q
0  Q
1  $\bar{Q}$

In CMOS the T F-F is replaced with a D-latch and a multiplexer.
Latch design is common because of its simplicity and versatility.
The latch output reflects the input during some part (typically 1/2) of the clock period.
Inputs must be stable for a setup and a hold time.
A latch can be as simple as two inverters and a multiplexer.
There are many latch styles: dynamic versus static.
Dynamic latches effectively store bits on the input capacitance of an inverter.
They are sensitive to clock timing (temporal) and circuit noise.
There are many variations on regenerative styles.
Some are cross-coupled inverters.
Some are full-function cross-coupled NANDs or NORs.
Simple embedded logic: resets, sets, clocked.
Full-function blocks (many of these are dynamic).
True single-phase latches and flip-flops eliminate the complementary clock.
Reduced transistor count (good).
Reduced clock capacitance, $P_{dyn} = C V^2 f$ (good).
Output hazards (bad).
Master-slave configurations are common.
They reduce or eliminate race and hazard noise from the input to the output of the latch.
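How a master-slave pair blocks input-to-output races can be sketched behaviorally. The model below is an assumption-laden illustration, not a circuit from the slides: it uses two ideal level-sensitive latches on complementary clocks, with the master transparent while CLK = 0 and the slave transparent while CLK = 1. The class names are hypothetical.

```python
class DLatch:
    """Ideal level-sensitive D latch: transparent while enable is 1, opaque otherwise."""

    def __init__(self):
        self.q = 0

    def update(self, d, enable):
        if enable:
            self.q = d
        return self.q


class MasterSlave:
    """Two latches on complementary clocks (assumed polarity: master open at CLK = 0)."""

    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d, clk):
        m = self.master.update(d, 1 - clk)  # master follows D while CLK = 0
        return self.slave.update(m, clk)    # slave passes the master value while CLK = 1


if __name__ == "__main__":
    ff = MasterSlave()
    for d, clk in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]:
        print(f"D={d} CLK={clk} -> Q={ff.update(d, clk)}")
```

In the third step D changes while CLK = 1, but the master is opaque, so the output keeps the value captured before the clock rose. At no point are both latches transparent, which is what removes the direct path from input to output.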
Edge-triggered design can be used to reduce some of the complexity of master-slave.
The transmission-gate multiplexer is often shown.
More robust implementations are possible and frequently used in practice.
Edge-triggered registers can be a combination of two level-sensitive latches in a master-slave configuration.
Conventionally the master is the first latch (i.e. it samples the input data); the slave is the second latch (i.e. it transfers new data to the output).
The internal node (output of master/input of slave) is not usable as an input/output.
The clock relation is typically one inverter (i.e. 180 degrees).
Positive edge-triggered means data is sampled when the clock goes high (CLK = 1).
Negative edge-triggered means data is sampled when the clock goes low (CLK = 0).
Finite state machines require sequencing of previous outputs as inputs (i.e. feedback).
Unlike in the combinational domino circuit, feedback will likely result in logic race conditions.
Latches are distinguished from flip-flops by their transparency to input transitions.
Transparency in latches is also known as level sensitivity.
Flip-flops are familiar and generally easy to design with.
Flip-flops increase design overhead (area, power, delay).
Latches allow for time borrowing (more on that later).
There are gray areas between flip-flops and latches.
A shift from transparency to opacity in a narrow time window looks like an edge-triggered flip-flop.
Pulsed sequencing is based on latches controlled by a single clock with a narrow window.
Transparent sequencing is based on latches with multiple clocks.
Flip-flop sequencing uses a single clock and an edge-triggered element, or at least one with the functional appearance of edge triggering.
A complex edge-triggered flip-flop and back-to-back half-cycle latches appear the same at the input and output terminals.
Setup/hold times limit when transitions on the latch/flip-flop inputs can occur.
For a single-phase clock the inputs and feedback are synchronized by clock $\phi$.
[Figure: combinational logic block (Comb Logic) with storage elements clocked by $\phi$.]
Allow for the finite delay of the combinational logic $\tau_{CL}$, the setup time of the storage $\tau_{DC}$ and the storage clock delay $\tau_{CQ}$:
$T_H < \tau_{CL} + \tau_{DC} + \tau_{CQ} < T_c$
where the clock duty cycle divides the period, $T_H + T_L = T_c$.
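The two-sided single-phase constraint can be checked with a few lines of arithmetic. The function name and the example delays below are hypothetical; the inequality itself is the one stated on the slide, with $T_c$ computed from $T_H + T_L$.

```python
def single_phase_ok(t_cl, t_dc, t_cq, t_high, t_low):
    """Check T_H < tau_CL + tau_DC + tau_CQ < T_c for a single-phase clock.

    All arguments are in the same time unit. T_c is taken as t_high + t_low.
    """
    t_c = t_high + t_low
    total = t_cl + t_dc + t_cq
    return t_high < total < t_c

if __name__ == "__main__":
    # Assumed clock: 2.0 high, 3.0 low, so T_c = 5.0.
    print(single_phase_ok(3.0, 0.5, 0.5, 2.0, 3.0))  # total 4.0: inside both bounds
    print(single_phase_ok(0.5, 0.5, 0.5, 2.0, 3.0))  # total 1.5: violates the minimum
    print(single_phase_ok(5.0, 0.5, 0.5, 2.0, 3.0))  # total 6.0: violates the maximum
```

The second case shows why a single phase imposes a minimum delay: a path faster than the high time can race around the loop within one active phase.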
Circuits with both minimum and maximum constraints on the delay are the most susceptible to clock skew failures.
Multi-clock domino circuits can trade a small performance penalty for increased tolerance of skew.
The most common multi-clock method is two-phase, non-overlapping $\phi_1$ and $\phi_2$.
[Figure: timing diagram of $\phi_1$ and $\phi_2$ over one cycle $T_c$, divided into intervals $T_1$, $T_2$, $T_3$, $T_4$.]
Non-overlapping means $T_2 \ge 0$ and $T_4 \ge 0$ (W&H assume $T_2 = T_4$ and call them $T_{nonoverlap}$).
Inputs and feedback are controlled by different clock phases.
[Figure: combinational logic block (Comb Logic) with storage elements clocked by $\phi_1$ and $\phi_2$.]
On input, the feedback path is broken by $\phi_1$, and the logic delay must satisfy
$\tau_{CL} + \tau_{DC} + \tau_{CQ} < T_3 + T_4 + T_1$
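The two-phase bound can be restated in terms of the cycle time. This is a sketch under one assumption that the slides show only graphically: the four intervals $T_1$ to $T_4$ tile exactly one clock cycle.

```latex
\begin{align*}
T_c &= T_1 + T_2 + T_3 + T_4 && \text{(assumption: the four intervals fill one cycle)} \\
\tau_{CL} + \tau_{DC} + \tau_{CQ} &< T_3 + T_4 + T_1 = T_c - T_2
\end{align*}
```

Under that assumption, the only part of the cycle unavailable to this logic path is the gap $T_2$.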
[Figure: two-phase pipeline with combinational logic blocks C and L and storage elements clocked by $\phi_1$ and $\phi_2$.]
Two-phase data is organized into stable, valid and qualified by each clock.
Stable data: the final value is reached before the beginning clock transition (typically rising).
Valid data: the final value is reached before the ending clock transition (typically falling).
Qualified data: tracks the clock edges or is quiescent.
A maximum constraint only means any minimum delay is allowed.
Hard-edge timing and its more flexible cousin, time borrowing, differ in the implementation of the synchronizing element.
Edge-triggered elements (flip-flops) impose strict timing between elements: hard edge.
With transparent latches, data enters on one transition and exits on the opposite edge: time borrowing.
[Figure: timing requirements for a 2-phase non-overlapping clock, plotted as $\tau_L$ against $\tau_C$ with limits max $\tau_L$ and max $\tau_C$; the axis marks reference $T_c$, $T_2$ and $T_4$. Any combination of delays in the white space is allowed; the region marked "Combined Delay Too Long" is not.]
Time borrowing is limited by feedback (fixed point of return).
Time borrowing is a flexible approach to system timing that eases the requirements on estimating delay.
[Figure: valid delays under time borrowing, plotted as CL 1 delay against CL 2 delay. Region labels: Valid Delays; Exceeds T; Exceeds T1 + T2 + T3; Exceeds T1 + T3 + T4.]
Accounting for clock skew is simplified.
The more aggressive the circuit design, the greater the skew sensitivity.
Clock skew is a key concern for domino circuits.
For the most part the clock signals are global, and the delay paths from driver to receiver differ.
Ideally, signaling a dynamic circuit to switch from precharge to evaluate is simultaneous across the chip.
Different finite delays in the clock lines blur the transition.
Clock skew that exceeds the (precharge/evaluate) duty cycle causes delay fails that appear as transient logic fails.
Domino gates can have multiple outputs.
All have to meet precharge, evaluate and skew requirements.
Each precharged node can be a separate output.
Precharging internal nodes also reduces the charge-sharing problem.
Each output has to have a dedicated buffer (again, typically an inverter).
Delay tradeoffs shift the relative arrival of the data or of the clock to the data.
Domino logic is a high-performance logic family; transistor placement and sizing are key.
Reduce the input capacitance as much as possible.
Use predicted input signal delays to optimize transistor order.
Charge sharing of internal (and likely not precharged) capacitance can erode the output signal.
Allowable domino delay for no skew, single-phase delays:
$t_{pd} = T_c - 2 t_{pdq}$
Single phase with skew (overlapping), delay for domino:
$t_{pd} = T_c - 2(t_{pdq} + t_{skew})$
Skew-tolerant designs eliminate the latch between phases while allowing phases to overlap.
Erroneous data appear when signals do not overlap.
Output buffers can be modified to provide a weak static output.
A single feedback transistor provides a small current to hold the output at the precharged logic value during evaluation.
The transistor is weak (i.e. drawn long) to limit the unavoidable addition of a switching current.
Multiple phases and local clock generation support a wealth of design styles.
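The two single-phase domino delay budgets given at the top of this slide differ only by the skew term. A minimal sketch of the arithmetic follows; the function name and the example numbers (in picoseconds) are hypothetical.

```python
def domino_logic_budget(t_c, t_pdq, t_skew=0.0):
    """Allowable domino logic delay: t_pd = T_c - 2 * (t_pdq + t_skew).

    With t_skew = 0 this reduces to the no-skew case t_pd = T_c - 2 * t_pdq.
    All values are in the same time unit.
    """
    return t_c - 2.0 * (t_pdq + t_skew)

if __name__ == "__main__":
    # Assumed values in ps: 1000 ps cycle, 50 ps latch delay, 25 ps skew.
    print(domino_logic_budget(1000.0, 50.0))        # no skew
    print(domino_logic_budget(1000.0, 50.0, 25.0))  # with skew
```

For these assumed numbers the skew costs twice its own value in usable logic time, because it is charged once per half cycle.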
Interfaces between logic styles must meet different latch hold and setup requirements.
The static-to-domino interface requires stable static inputs to the first layer of domino gates.
The domino-to-static interface requires that the latch (flip-flop) capture the final value before precharge.