1 CAD for VLSI Design - I Lecture 38 V. Kamakoti and Shankar Balachandran
2 Overview: Commercial FPGA Architectures; Lookup Table Based Architectures; Routing Architectures; FPGA CAD Flow Revisited
3 Xilinx Spartan Devices. Four resource groups: I/O connectivity (IOBs), memory resources (RAM blocks), logic and routing (CLBs), and system clock management (digital delay-locked loops, DLLs). [Figure: device floorplan showing IOBs, DLLs, RAM blocks, and the CLB array.]
4 Xilinx Spartan Logic & Routing. Configurable Logic Block (CLB): the primary building block; configurable for simple to complex logic; allows functions of up to 6 inputs in one logic level. Look-Up Table (LUT) versatility: flexible for logic or distributed RAM implementation. Fast arithmetic operations: specialized carry logic for arithmetic operations; fast DSP functions.
5 CLB Structure. Each slice has 2 LUT-FF pairs with associated carry logic. Two 3-state buffers (BUFT) are associated with each CLB and are accessible by all CLB outputs. [Figure: a CLB with two slices; each slice has two look-up tables (inputs G4-G1 and F4-F1), carry and control logic (CIN, COUT), and two flip-flops sharing CLK and CE.]
6 CLB Slice Structure. Each slice contains two sets of the following. Four-input LUT: any 4-input logic function, or a 16-bit x 1 synchronous RAM, or a 16-bit shift register. Carry and control: fast arithmetic logic, multiplier logic, multiplexer logic. Storage element: latch or flip-flop; set and reset; true or inverted inputs; synchronous or asynchronous control.
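The shift-register mode listed above can be pictured with a small behavioral model. This is only a sketch: the class name `ShiftRegister16`, its method names, and the assumption that the four LUT inputs select which of the 16 stored bits is read out are illustrative choices, not details taken from the slide.

```python
class ShiftRegister16:
    """Behavioral sketch of a 4-input LUT used as a 16-bit shift register."""

    def __init__(self):
        self.bits = [0] * 16  # bits[0] holds the most recently shifted-in value

    def clock(self, data_in, clock_enable=1):
        # On an enabled clock edge, every stored bit moves one position along
        # and the new bit enters at position 0; the oldest bit is dropped.
        if clock_enable:
            self.bits = [data_in & 1] + self.bits[:-1]

    def read(self, address):
        # Assumption: the 4 LUT inputs act as a tap address (0 to 15).
        return self.bits[address & 0xF]


sr = ShiftRegister16()
for bit in (1, 0, 1, 1):
    sr.clock(bit)
print([sr.read(i) for i in range(4)])  # newest bit first
```

Reading at address `n` returns the value that was shifted in `n + 1` enabled clock edges ago, which is how a single LUT can stand in for a delay line of up to 16 cycles.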
7 Detailed View
8 Four-Input LUT. Implements combinatorial logic: any 4-input logic function. LUTs can be cascaded for wide-input functions. A 4-input logic function of A, B, C, D with output Z is stored in the LUT as its truth table:
Inputs (ABCD)   Output (Z)
0000            0
0001            0
0010            0
0011            1
...             ...
1110            1
1111            0
9 LUT as a Universal Gate. An n-LUT is a direct implementation of a function's truth table. Each latch location holds the value of the function for one input combination.
Example, 2-LUT:
INPUTS   AND   OR
00       0     0
01       0     1
10       0     1
11       1     1
Example, 4-LUT:
INPUTS   STORED VALUE
0000     F(0,0,0,0)   (store in 1st latch)
0001     F(0,0,0,1)   (store in 2nd latch)
0010     F(0,0,1,0)
0011     F(0,0,1,1)
0100 through 1111: the remaining entries follow the same pattern.
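The truth-table view of a LUT can be sketched in a few lines of Python. The class name `LUT` and its `read` method are illustrative, and the convention that the first input is the most significant address bit is an assumption chosen to match the row order of the tables above.

```python
from itertools import product


class LUT:
    """A k-input lookup table: 2**k stored bits, one per input combination."""

    def __init__(self, k, func):
        self.k = k
        # Fill the storage locations in truth-table order: 00...0 first, 11...1 last.
        self.bits = [int(bool(func(*inputs))) for inputs in product((0, 1), repeat=k)]

    def read(self, *inputs):
        if len(inputs) != self.k:
            raise ValueError(f"expected {self.k} inputs, got {len(inputs)}")
        # The inputs form the address; the first input is the most significant bit.
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.bits[address]


and2 = LUT(2, lambda a, b: a & b)
or2 = LUT(2, lambda a, b: a | b)
print(and2.bits)  # [0, 0, 0, 1]
print(or2.bits)   # [0, 1, 1, 1]
```

The same hardware implements AND or OR purely through the stored bits; evaluating the function is just a memory read.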
10 Boolean Functions With 2 Inputs. Each column of the table is one function of A and B, ranging from constant False to constant True:
A B   False ... True
0 0   0 1 0 1 0 1 0 1 1
0 1   0 0 1 1 0 0 1 1 1
1 0   0 0 0 0 1 1 1 1 1
1 1   0 0 0 0 0 0 0 0 1
A 2-input LUT implements any function of 2 inputs. How many of these are there? How many functions of n inputs?
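One way to explore these questions is to enumerate the functions directly. In the sketch below, a function of n inputs is represented as a tuple of output bits, one per truth-table row; the helper names `all_functions` and `count_functions` are illustrative.

```python
from itertools import product


def all_functions(n):
    """Return every Boolean function of n inputs as a tuple of output bits."""
    rows = 2 ** n  # number of input combinations (truth-table rows)
    return list(product((0, 1), repeat=rows))


def count_functions(n):
    """Each of the 2**n rows can independently hold 0 or 1."""
    return 2 ** (2 ** n)


funcs2 = all_functions(2)
print(len(funcs2))            # 16
print(funcs2[0], funcs2[-1])  # (0, 0, 0, 0) (1, 1, 1, 1)
for n in range(1, 5):
    print(n, count_functions(n))
```

The first and last tuples are the constant False and constant True functions; for n = 4 the count is already 65536, which is why a 4-LUT is described as able to implement any 4-input function.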
11 Dedicated Expansion Multiplexers. MUXF5 combines 2 LUTs to create a 4x1 multiplexer, or any 5-input function (LUT5), or selected functions of up to 9 inputs. MUXF6 combines 2 slices to form an 8x1 multiplexer, or any 6-input function (LUT6), or selected functions of up to 19 inputs. Dedicated muxes are faster and more space efficient than multiplexers built from LUTs. [Figure: a CLB with two slices; the two LUTs in each slice feed a MUXF5, and the two MUXF5 outputs feed a MUXF6.]
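The claim that two 4-input LUTs plus MUXF5 realize any 5-input function follows from Shannon expansion: fix one input at 0 and at 1, store the two resulting 4-input functions in the two LUTs, and let the fifth input drive the multiplexer select. The sketch below illustrates this; the function names and the choice of the first argument as the select input are assumptions for illustration.

```python
from itertools import product


def split_for_muxf5(func5):
    """Split a 5-input function into two 16-entry LUT tables."""
    lut_low = [func5(0, *rest) & 1 for rest in product((0, 1), repeat=4)]
    lut_high = [func5(1, *rest) & 1 for rest in product((0, 1), repeat=4)]
    return lut_low, lut_high


def eval_with_muxf5(lut_low, lut_high, select, b, c, d, e):
    """Both LUTs see the same four inputs; the mux picks one output."""
    address = (b << 3) | (c << 2) | (d << 1) | e
    return lut_high[address] if select else lut_low[address]


def majority5(a, b, c, d, e):
    """Example 5-input function: 1 when at least three inputs are 1."""
    return int(a + b + c + d + e >= 3)


low, high = split_for_muxf5(majority5)
matches = all(
    eval_with_muxf5(low, high, *x) == majority5(*x)
    for x in product((0, 1), repeat=5)
)
print(matches)  # True
```

Applying the same split once more, with a second select input choosing between two MUXF5 outputs, gives the 6-input case handled by MUXF6.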
12 More than 4 inputs
13 Dedicated Arithmetic
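The figure for this slide did not survive in the text, so the following is only a sketch of how a LUT plus dedicated carry logic can form a fast adder. It assumes a common arrangement in which the LUT computes the propagate signal (a XOR b), a dedicated multiplexer passes either the incoming carry or one operand bit up the chain, and a dedicated XOR produces the sum bit. The function names are illustrative.

```python
def carry_chain_add(a_bits, b_bits, carry_in=0):
    """Add two equal-length bit lists, least significant bit first."""
    sum_bits = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        propagate = a ^ b                    # computed in the LUT
        sum_bits.append(propagate ^ carry)   # dedicated XOR gate
        carry = carry if propagate else a    # dedicated carry multiplexer
    return sum_bits, carry


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


s, cout = carry_chain_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s), cout)  # 1 1  (11 + 6 = 17 = 16 + 1)
```

When the two operand bits are equal, the carry out equals either of them regardless of the incoming carry, so the multiplexer can forward `a`; when they differ, the incoming carry simply propagates. Only the multiplexer sits on the carry path, which is what keeps the chain fast.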
14 Distributed RAM. A CLB LUT is configurable as distributed RAM: one LUT equals a 16x1 RAM. Implements single-port and dual-port RAM. Cascade LUTs to increase RAM size. Writes are synchronous; reads are synchronous or asynchronous, with the accompanying flip-flops used for a synchronous read. [Figure: one LUT as RAM16X1S (D, WE, WCLK, A0-A3, O); two LUTs as RAM32X1S (D, WE, WCLK, A0-A4, O), RAM16X2S (D0, D1, WE, WCLK, A0-A3, O0, O1), or RAM16X1D (D, WE, WCLK, A0-A3, DPRA0-DPRA3, SPO, DPO).]
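A behavioral sketch of a LUT used as a 16x1 RAM, and of two such LUTs cascaded into a 32x1 RAM, may help here. The class and method names are illustrative; the model assumes that a write takes effect on a clock edge only when write enable is high, that a read is a plain combinational lookup, and that the fifth address bit selects between the two LUTs in the cascaded case.

```python
class Ram16x1:
    """One LUT as a 16 x 1 single-port RAM."""

    def __init__(self):
        self.cells = [0] * 16

    def read(self, address):
        # Asynchronous read: the output follows the address with no clock.
        return self.cells[address & 0xF]

    def clock(self, address, data, write_enable):
        # Synchronous write: called once per write-clock edge.
        if write_enable:
            self.cells[address & 0xF] = data & 1


class Ram32x1:
    """Two LUTs cascaded; the fifth address bit selects the LUT."""

    def __init__(self):
        self.banks = [Ram16x1(), Ram16x1()]

    def read(self, address):
        return self.banks[(address >> 4) & 1].read(address)

    def clock(self, address, data, write_enable):
        self.banks[(address >> 4) & 1].clock(address, data, write_enable)


ram = Ram32x1()
ram.clock(address=21, data=1, write_enable=1)
ram.clock(address=5, data=1, write_enable=0)  # ignored: write enable is low
print(ram.read(21), ram.read(5))  # 1 0
```

A synchronous read would simply register the value returned by `read` in the flip-flop that accompanies the LUT.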
15 Spartan Routing Architecture. Local routing: direct connections. General Routing Matrix (GRM): single lines, long lines, hex lines. Dedicated routing: internal 3-state bus. Global routing: primary clock buffer lines, secondary lines. [Figure: a CLB with two slices, local feedback, carry chains, and direct connections, attached to a switch matrix (GRM) that drives single-length lines, buffered hex lines, long lines and global lines, and the internal 3-state busses.]
16 Local Routing. Interconnect among LUTs, FFs, and the GRM. CLB feedback path for connections to LUTs in the same CLB. Direct path between horizontally adjacent CLBs.
17 General Purpose Routing. 24 single-length lines route GRM signals to adjacent GRMs in each of the four directions. 96 buffered hex lines route GRM signals to other GRMs six blocks away in each of the four directions. 12 buffered long lines route across top and bottom, left and right. [Figure: a CLB and its switch matrix with single-length, hex, and long lines, internal 3-state busses, carry chains, and direct connections.]
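The different line lengths trade reach against flexibility. As a rough illustration only, the sketch below estimates how many segments a connection needs along one axis if it uses hex lines (six blocks each) for as much of the distance as possible and single-length lines (one block each) for the remainder. The function name and the greedy policy are assumptions; a real router also weighs long lines, direct connections, and congestion.

```python
def segments_for_distance(blocks):
    """Greedy estimate: hex lines span six blocks, single lines span one."""
    hex_lines, single_lines = divmod(abs(blocks), 6)
    return hex_lines, single_lines


for distance in (1, 6, 15):
    print(distance, segments_for_distance(distance))
# 1 (0, 1)
# 6 (1, 0)
# 15 (2, 3)
```

Covering 15 blocks with single-length lines alone would take 15 segments and pass through 15 switch matrices; with hex lines the same connection needs 5 segments under this estimate.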
18 FPGA Architecture and CAD. The flow, in order: Synthesis; Technology Mapping; Placement/Floorplanning; Routing; Generate Programming Data; Programming the Device.
19 Lookup Table Based Technology Mapping. A k-input lookup table is a digital memory that can implement any Boolean function of k input variables. It can implement $2^{2^k}$ functions. Library-based technology mapping techniques are not efficient. New approaches are needed.
20 Why Do FPGA-Based Computing? We need to understand how costly (big) a solution is, how it compares to the alternatives, and the cost and benefit of flexibility. This requires a complete implementation of our application for each architectural alternative, in the same implementation technology, with multiple area-time points. Then we can start sorting out custom vs. configurable, and spatial configurable vs. temporal.
21 Questions and Answers Thank You