EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 9 Field Programmable Gate Arrays (FPGAs)


Read before class: Chapter 3 from textbook

Overview
  FPGA Devices
    ASIC vs. FPGA
  FPGA architecture
    CLB, RAM
    IO, Interconnects
  FPGA Design Flow
    Synthesis
    Place
    Route

Evolution of implementation technologies
  Logic gates (1950s-60s)
  Regular structures for two-level logic (1960s-70s)
    muxes and decoders, PLAs
  Programmable sum-of-products arrays (1970s-80s)
    PLDs, complex PLDs
  Programmable gate arrays (1980s-90s)
    densities high enough to permit an entirely new class of applications, e.g., prototyping, emulation, acceleration
  Trend: toward higher levels of integration

ASIC vs. FPGA
  ASIC: Application Specific Integrated Circuit
    designed all the way from behavioral description to physical layout
    designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
  FPGA: Field Programmable Gate Array
    no physical layout design; the design ends with a bitstream used to configure a device
    bought off the shelf and reconfigured by the designers themselves

Which way to go?
  ASICs: high performance, low power, low cost in high volumes
  FPGAs: off-the-shelf, low development cost, short time to market, reconfigurability

Why FPGAs?
  Custom ICs are sometimes designed to replace a large amount of glue logic:
    reduced system complexity and manufacturing cost, improved performance.
  However, custom ICs are very expensive to develop, and they delay the introduction of a product to market (time to market) because of increased design time.
  Note: need to worry about two kinds of costs:
    1. cost of development, sometimes called non-recurring engineering (NRE)
    2. cost of manufacture
  A tradeoff usually exists between NRE cost and manufacturing cost.
  [Figure: total cost versus number of units manufactured (volume) for two options, A and B; the intercept on the cost axis is the NRE]
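The NRE versus manufacturing-cost tradeoff can be sketched as two straight lines, total cost = NRE + unit cost x volume, that cross at a break-even volume. The following is a minimal sketch only; the dollar figures and the names `total_cost` and `break_even_volume` are hypothetical and not taken from the lecture.

```python
def total_cost(nre, unit_cost, volume):
    """Total cost of one option: one-time NRE plus per-unit manufacturing cost."""
    return nre + unit_cost * volume


def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Volume at which options A and B cost the same, or None if they never cross."""
    if unit_a == unit_b:
        return None
    volume = (nre_b - nre_a) / (unit_a - unit_b)
    return volume if volume >= 0 else None


# Hypothetical numbers: option A is FPGA-like (low NRE, high unit cost),
# option B is ASIC-like (high NRE, low unit cost).
FPGA_NRE, FPGA_UNIT = 50_000, 40.0
ASIC_NRE, ASIC_UNIT = 1_000_000, 2.0

crossover = break_even_volume(FPGA_NRE, FPGA_UNIT, ASIC_NRE, ASIC_UNIT)
print(crossover)
```

Below the crossover volume the low-NRE option is cheaper; above it the low-unit-cost option wins, which is why the custom IC approach suits very high volume products.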

Why FPGAs?
  The custom IC approach is viable for products that are
    very high volume (where NRE can be amortized),
    not time-to-market sensitive.
  FPGAs were introduced as an alternative to custom ICs for implementing glue logic:
    improved density relative to discrete SSI/MSI components (within around 10x of custom ICs)
    with the aid of computer-aided design (CAD) tools, circuits can be implemented in a short amount of time (no physical layout process, no mask making, no IC manufacturing), relative to ASICs
      lowers NRE
      shortens TTM
  Because of Moore's law, the density (gates/area) of FPGAs continued to grow through the '80s and '90s to the point where major data processing functions can be implemented on a single FPGA.

Applications of FPGAs
  Implementation of random logic
    easier changes at system level (one device is modified)
    can eliminate the need for full-custom chips
  Prototyping
    ensemble of gate arrays used to emulate a circuit to be manufactured
    get more/better/faster debugging done than is possible with simulation
  Reconfigurable hardware
    one hardware block used to implement more than one function
    functions must be mutually exclusive in time
    can greatly reduce cost while enhancing flexibility
    RAM-based FPGAs are the only option
  Special-purpose computation engines
    hardware dedicated to solving one problem (or class of problems)
    accelerators attached to general-purpose computers

Major FPGA Vendors
  SRAM-based FPGAs
    Xilinx, Inc. and Altera Corp. (share about 90% of the market)
    Atmel
    Lattice Semiconductor
  Flash & antifuse FPGAs
    Actel Corp.
    Quick Logic Corp.

Xilinx FPGA Families
  Old families
    XC3000, XC4000, XC5200
    Old 0.5 µm, 0.35 µm and 0.25 µm technology. Not recommended for modern designs.
  High-performance families
    Virtex (220 nm)
    Virtex-E, Virtex-EM (180 nm)
    Virtex-II, Virtex-II PRO (130 nm)
    Virtex-4 (90 nm)
    Virtex-5 (65 nm)
    Virtex-6
  Low-cost family
    Spartan/XL: derived from XC4000
    Spartan-II: derived from Virtex
    Spartan-IIE: derived from Virtex-E
    Spartan-3 (90 nm)
    Spartan-3E (90 nm): logic optimized
    Spartan-3A (90 nm): I/O optimized
    Spartan-3AN (90 nm): non-volatile
    Spartan-3A DSP (90 nm): DSP optimized
    Spartan-6

Altera FPGA Families
  High & medium density FPGAs
    Stratix II, Stratix, APEX II, APEX 20K, & FLEX 10K
  Low-cost FPGAs
    Cyclone & ACEX 1K
  FPGAs with clock data recovery
    Stratix GX & Mercury
  CPLDs
    MAX 7000 & MAX 3000
  Embedded processor solutions
    Nios, Excalibur
  Configuration devices
    EPC

What is an FPGA?
  [Figure: FPGA floorplan with Configurable Logic Blocks (CLBs), Block RAMs, and I/O Blocks]

What is an FPGA?
  [Figure: programmable logic blocks connected by programmable interconnect]

Example of Xilinx CLB
  [Figure: a configurable logic block (CLB) contains slices; each slice contains logic cells]

Simplified view of a Xilinx Logic Cell
  [Figure: a 4-input LUT (also usable as a 16-bit shift register or 16x1 RAM) with inputs a, b, c, d; a mux with input e; a flip-flop with clock, clock enable, and set/reset; outputs y and q]

Idealized Configurable Logic Block (CLB)
  [Figure: logic block with INPUTS, a 4-LUT, an FF, and OUTPUT; a latch is set by the configuration bit-stream]
  4-input look-up table (LUT)
    implements combinational logic functions
  Register
    optionally stores the output of the LUT

How could you build a generic Boolean logic circuit?
  Memories as LUTs
    An N-bit address selects one of 2^N memory words
    A 1-bit memory word holds a Boolean value
    The address is the vector of Boolean input values
    The contents encode a Boolean function
    Read out the logical value (column) for the associated row
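The "memories as LUTs" idea can be modeled behaviorally in a few lines. This is a sketch under stated assumptions, not the device's actual circuitry: the class name `LUT`, the helper `program_lut`, the choice of the first input as the most significant address bit, and the example function are all hypothetical.

```python
from itertools import product


class LUT:
    """Behavioral model of an n-input LUT: a 2**n x 1-bit memory."""

    def __init__(self, n, contents):
        if len(contents) != 2 ** n:
            raise ValueError("an n-LUT needs exactly 2**n stored bits")
        self.n = n
        self.contents = [bit & 1 for bit in contents]

    def read(self, *inputs):
        # The input vector is used as the memory address (first input = MSB).
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.contents[address]


def program_lut(n, func):
    """Fill the memory by evaluating func on every input combination, in address order."""
    return LUT(n, [func(*bits) for bits in product((0, 1), repeat=n)])


def example(a, b, c, d):
    # Hypothetical 4-input function used only for illustration.
    return (a & b) | (c ^ d)


lut4 = program_lut(4, example)
print(lut4.read(1, 1, 0, 0))
```

Once programmed, the LUT never evaluates the logic expression again: every read is a plain memory lookup, which is why any function of the inputs costs the same.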

LUT as general logic gate
  An n-LUT is a direct implementation of a function truth table.
  Each latch location holds the value of the function corresponding to one input combination.

  Example: 2-LUT
    INPUTS   AND   OR
    0 0       0     0
    0 1       0     1
    1 0       0     1
    1 1       1     1
  Can be used to implement any function of 2 inputs.
  How many of these are there? How many functions of n inputs?

  Example: 4-LUT
    INPUTS
    0 0 0 0   F(0,0,0,0)   store in 1st latch
    0 0 0 1   F(0,0,0,1)   store in 2nd latch
    0 0 1 0   F(0,0,1,0)
    0 0 1 1   F(0,0,1,1)
    ...

LUT as general logic gate
  [Figure: a LUT with inputs x1, x2, x3, x4 and output y, shown next to truth tables for y]
  Look-up tables are the primary elements for logic implementation.
  Each LUT can implement any function of 4 inputs.
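The two questions on the slide can be answered by counting truth tables: an n-input function is fixed by its 2^n stored bits, so there are 2^(2^n) distinct functions. The sketch below enumerates every 2-LUT configuration to confirm the count; the names `count_functions` and `all_2lut_tables` are hypothetical.

```python
from itertools import product


def count_functions(n):
    """Number of distinct Boolean functions of n inputs: one per possible truth table."""
    return 2 ** (2 ** n)


# Every possible content of a 2-LUT: 4 stored bits, listed for inputs 00, 01, 10, 11.
all_2lut_tables = list(product((0, 1), repeat=4))

AND_TABLE = (0, 0, 0, 1)
OR_TABLE = (0, 1, 1, 1)

print(len(all_2lut_tables), count_functions(2), count_functions(4))
```

AND and OR are just two of the 16 possible 2-LUT configurations; a 4-LUT has 65,536.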

5-input functions implemented using two LUTs
  [Figure: inputs X5, X4, X3, X2, X1; two LUTs whose outputs are combined into the output Y (OUT)]

Recall: Multiplexer/Demultiplexer
  Multiplexer: route one of many inputs to a single output
  Demultiplexer: route a single input to one of many outputs
  [Figure: multiplexer and demultiplexer, each with a control input; 4x4 switch]
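One way to read the two-LUT construction is as a Shannon expansion about one input: each 4-LUT stores the function for one value of that input, and a 2:1 multiplexer picks between them. The sketch below assumes X5 is the selecting input (the slide does not say which one is used); the helper names and the example function `majority5` are hypothetical.

```python
from itertools import product


def make_lut4(func):
    """Truth table of a 4-input function, indexed by x4 x3 x2 x1 read as a binary number."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=4)]


def read_lut4(table, x4, x3, x2, x1):
    return table[(x4 << 3) | (x3 << 2) | (x2 << 1) | x1]


def split_into_two_luts(f5):
    """Shannon expansion about x5: one LUT holds f with x5=0, the other f with x5=1."""
    lut_x5_0 = make_lut4(lambda x4, x3, x2, x1: f5(0, x4, x3, x2, x1))
    lut_x5_1 = make_lut4(lambda x4, x3, x2, x1: f5(1, x4, x3, x2, x1))
    return lut_x5_0, lut_x5_1


def evaluate(luts, x5, x4, x3, x2, x1):
    lut_x5_0, lut_x5_1 = luts
    low = read_lut4(lut_x5_0, x4, x3, x2, x1)
    high = read_lut4(lut_x5_1, x4, x3, x2, x1)
    return high if x5 else low  # 2:1 mux controlled by x5


def majority5(x5, x4, x3, x2, x1):
    # Hypothetical 5-input function: 1 when at least three inputs are 1.
    return int(x5 + x4 + x3 + x2 + x1 >= 3)


luts = split_into_two_luts(majority5)
print(evaluate(luts, 1, 1, 1, 0, 0))
```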

Multiplexers/Selectors: to implement logic
  2:1 mux: Z = A' I0 + A I1
  4:1 mux: Z = A'B' I0 + A'B I1 + AB' I2 + AB I3
  8:1 mux: Z = A'B'C' I0 + A'B'C I1 + A'BC' I2 + A'BC I3 + AB'C' I4 + AB'C I5 + ABC' I6 + ABC I7
  [Figure: 2:1 mux (inputs I0-I1, select A), 4:1 mux (inputs I0-I3, selects A, B), and 8:1 mux (inputs I0-I7, selects A, B, C), each with output Z]

Multiplexers as LUTs
  A 2^n:1 multiplexer implements any function of n variables
    with the variables used as control inputs and
    the data inputs tied to 0 or 1
  In essence, a look-up table
  Example: F(A,B,C) = m0 + m2 + m6 + m7
                    = A'B'C' + A'BC' + ABC' + ABC
                    = A'B'(C') + A'B(C') + AB'(0) + AB(1)
  [Figure: 8:1 MUX with data inputs 0 to 7 tied to 1, 0, 1, 0, 0, 0, 1, 1; select inputs S2, S1, S0 driven by A, B, C; output F]
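The 8:1 multiplexer example can be checked with a short behavioral model: the data inputs are tied to the constants given by the minterm list m0, m2, m6, m7, and A, B, C drive the select lines. The function names `mux` and `f_via_mux8` are hypothetical; the sketch only illustrates the idea.

```python
from itertools import product


def mux(data, *select_bits):
    """2**n:1 multiplexer: the select bits (MSB first) pick one data input."""
    index = 0
    for bit in select_bits:
        index = (index << 1) | bit
    return data[index]


# Data inputs I0..I7 tied to constants: 1 exactly at minterms 0, 2, 6, 7.
DATA = [1, 0, 1, 0, 0, 0, 1, 1]


def f_via_mux8(a, b, c):
    return mux(DATA, a, b, c)


def f_reference(a, b, c):
    # F = A'B'C' + A'BC' + ABC' + ABC, written directly as sum of products.
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & nb & nc) | (na & b & nc) | (a & b & nc) | (a & b & c)


print([f_via_mux8(*bits) for bits in product((0, 1), repeat=3)])
```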

Multiplexers as LUTs (cont'd)
  A 2^(n-1):1 mux can implement any function of n variables
    with n-1 variables used as control inputs and
    the data inputs tied to the last variable or its complement
  Example: F(A,B,C) = m0 + m2 + m6 + m7
                    = A'B'C' + A'BC' + ABC' + ABC
                    = A'B'(C') + A'B(C') + AB'(0) + AB(1)
  [Figure: the 8:1 MUX implementation (data inputs 1, 0, 1, 0, 0, 0, 1, 1; selects A, B, C) next to the equivalent 4:1 MUX with data inputs 0 to 3 tied to C', C', 0, 1; select inputs S1, S0 driven by A, B; output F]

Cascading Multiplexers
  Large multiplexers are implemented by cascading smaller ones
  [Figure: 8:1 mux built from two 4:1 muxes (inputs I0-I3 and I4-I7, selects B and C) followed by a 2:1 mux (select A) driving Z]
    control signals B and C simultaneously choose one of I0, I1, I2, I3 and one of I4, I5, I6, I7
    control signal A chooses which of the upper or lower mux's output to gate to Z
  [Figure: alternative implementation: four 2:1 muxes (select C) followed by a 4:1 mux (selects A, B) driving Z]
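Both ideas on these slides, the half-size multiplexer with literal data inputs and the cascaded 8:1 multiplexer, can be verified with a small model. This is a sketch; the function names are hypothetical, and the data-input values C', C', 0, 1 follow from the factored form of F above.

```python
from itertools import product


def mux2(i0, i1, s):
    """2:1 multiplexer."""
    return i1 if s else i0


def mux4(inputs, s1, s0):
    """4:1 multiplexer; s1 is the more significant select bit."""
    return inputs[(s1 << 1) | s0]


def f_via_mux4(a, b, c):
    # A and B drive the selects; data inputs are tied to C', C', 0, 1.
    not_c = 1 - c
    return mux4([not_c, not_c, 0, 1], a, b)


def mux8_cascaded(inputs, a, b, c):
    # B and C pick one input in each half; A picks the upper or lower half.
    upper = mux4(inputs[0:4], b, c)
    lower = mux4(inputs[4:8], b, c)
    return mux2(upper, lower, a)


MINTERMS = {0, 2, 6, 7}  # F(A,B,C) = m0 + m2 + m6 + m7
print([f_via_mux4(*bits) for bits in product((0, 1), repeat=3)])
```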

4-LUT Implementation
  [Figure: 16 latches feeding a 16 x 1 mux; the INPUTS drive the mux control and the mux output is the OUTPUT; the latches are programmed as part of the configuration bit-stream]
  An n-bit LUT is implemented as a 2^n x 1 memory:
    Inputs choose one of 2^n memory locations.
    Memory locations (latches) are normally loaded with values from the user's configuration bit stream.
    Inputs to the mux control are the CLB inputs.
  The result is a general-purpose "logic gate".
    An n-LUT can implement any function of n inputs!

Example: Xilinx Virtex-E Floorplan
  Configurable Logic Blocks
    4-input function generators
    buffers
    flip-flop
  Input/Output Blocks
    combinational, latch, and flip-flop output
    sampled inputs
  Block RAM
    4096 bits each
    every 12 CLB columns

Virtex-E Configurable Logic Block (CLB)
  CLB = 4 logic cells (LC) in two slices
  LC: 4-input function generator, carry logic, storage element
  80 x 120 CLB array on the 2000E
  [Figure: CLB schematic; annotations: 16x1 synchronous RAM; FF or latch]

Details of Virtex-E Slice
  [Figure: slice schematic; annotations: implements any two 4-input functions; 4-input function; 3-input function; registered]

Details of Virtex-E Slice
  [Figure: slice schematic; annotations refer to a 6-input function formed with the other slice]

Distributed RAM
  CLB LUT configurable as distributed RAM
    A single LUT equals a 16x1 RAM
    Two LUTs implement single- and dual-port RAMs
    Cascade LUTs to increase RAM size
  Synchronous write
  Synchronous/asynchronous read
    Accompanying flip-flops used for synchronous read
  [Figure: LUT-based RAM primitives and their ports]
    RAM16X1S: D, WE, WCLK, A0-A3; output O
    RAM32X1S: D, WE, WCLK, A0-A4; output O
    RAM16X2S: D0, D1, WE, WCLK, A0-A3; outputs O0, O1
    RAM16X1D: D, WE, WCLK, A0-A3, DPRA0-DPRA3; outputs SPO, DPO
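The distributed RAM behavior described above (synchronous write, asynchronous read, optional flip-flop for a synchronous read) can be modeled as follows. This is a behavioral sketch only: the class `Ram16x1` and its method names are hypothetical, and the 16x1 size is an assumption based on a 4-input LUT holding 16 bits.

```python
class Ram16x1:
    """Behavioral model of a 4-input LUT used as a 16 x 1-bit single-port RAM."""

    def __init__(self):
        self.mem = [0] * 16      # the 16 LUT storage bits
        self.registered_out = 0  # accompanying flip-flop for synchronous read

    def read_async(self, addr):
        # Asynchronous read: the output follows the address with no clock involved.
        return self.mem[addr & 0xF]

    def clock_edge(self, addr, d, we):
        """Rising clock edge: register the read value, then write if WE is high."""
        self.registered_out = self.mem[addr & 0xF]
        if we:
            self.mem[addr & 0xF] = d & 1


ram = Ram16x1()
ram.clock_edge(addr=5, d=1, we=1)    # write 1 to address 5
print(ram.read_async(5))             # visible immediately on the asynchronous port
ram.clock_edge(addr=5, d=0, we=0)    # no write; registers the stored value
print(ram.registered_out)
```

In this sketch the registered output shows the value stored before the clock edge; real devices define their own read-during-write behavior, which this model does not attempt to reproduce.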

Shift Register
  Each LUT can be configured as a shift register
    Serial in, serial out
  Dynamically addressable delay of up to 16 cycles
  For programmable pipelines
  Cascade for greater cycle delays
  Use CLB flip-flops to add depth
  [Figure: LUT configured as a chain of D flip-flops with clock enable (IN, CE, CLK); DEPTH[3:0] selects which stage drives OUT]

[Figure: slice detail: two look-up tables (inputs G4-G1 and F4-F1, output O), each followed by carry and control logic and a D flip-flop (D, CK, EC, S, R, Q); slice signals COUT, CIN, YB, Y, XB, X, BY, F5IN, SR, CLK, CE]
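A LUT used as a dynamically addressable shift register can be modeled as a 16-bit shift chain with a selectable tap. The sketch below assumes that tap 0 gives one cycle of delay and tap 15 gives sixteen; the class name `ShiftRegisterLUT` and its methods are hypothetical.

```python
class ShiftRegisterLUT:
    """Behavioral model of a LUT configured as a 16-stage serial shift register."""

    def __init__(self, length=16):
        self.bits = [0] * length

    def clock_edge(self, data_in, ce=1):
        # On a clock edge with clock enable high, shift by one stage.
        if ce:
            self.bits = [data_in & 1] + self.bits[:-1]

    def out(self, depth):
        # DEPTH selects which stage drives the output (assumed: delay = depth + 1 cycles).
        return self.bits[depth]


srl = ShiftRegisterLUT()
for value in (1, 0, 0, 0):   # shift in a single 1 followed by zeros
    srl.clock_edge(value)

print([srl.out(d) for d in range(4)])
```

Changing `depth` at run time changes the delay without reloading the contents, which is what makes the structure useful for programmable pipelines.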

Fast Carry Logic
  [Figure: carry logic routing through the CLBs, from LSB to MSB]
  Each CLB contains separate logic and routing for the fast generation of sum and carry signals
    Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
  Carry logic is independent of normal logic and routing resources

Accessing Carry Logic
  All major synthesis tools can infer carry logic for arithmetic functions
    Addition (SUM <= A + B)
    Subtraction (DIFF <= A - B)
    Comparators (if A < B then ...)
    Counters (count <= count + 1)
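The reason a dedicated carry chain speeds up arithmetic can be seen in a bit-level model of an adder: the LUT only has to compute a per-bit propagate signal, while a small dedicated multiplexer and XOR per bit handle the carry and sum. The split shown here is a simplified assumption modeled on typical FPGA carry chains, not a description of a specific device; the function name is hypothetical.

```python
def add_with_carry_chain(a, b, width, carry_in=0):
    """Ripple addition using a propagate signal and a carry mux per bit."""
    carry = carry_in
    total = 0
    for i in range(width):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        propagate = a_bit ^ b_bit               # computed in the LUT
        sum_bit = propagate ^ carry             # dedicated XOR
        carry = carry if propagate else a_bit   # dedicated carry multiplexer
        total |= sum_bit << i
    return total, carry


print(add_with_carry_chain(9, 7, width=4))
```

When the two operand bits differ the incoming carry is passed along; when they are equal the carry out is simply that common bit, so no general-purpose routing is needed between bit positions.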

Basic I/O Block (IOB) Structure
  [Figure: IOB with three flip-flops, each with D, EC (enable), SR (set/reset), and Q pins, sharing the clock and set/reset signals]
    Three-state FF on the three-state control path
    Output FF on the output path
    Input FF on the input path, which provides a direct input and a registered input

IOB Functionality
  The IOB provides the interface between the package pins and the CLBs
  Each IOB can work as a uni- or bi-directional I/O
  Outputs can be forced into high impedance
  Inputs and outputs can be registered
    advised for high-performance I/O
  Inputs can be delayed

Example: Virtex-E IOB detail
  [Figure: Virtex-E IOB schematic]

Interconnects: Routing
  Logic blocks embedded in a sea of connection resources
    CLB = logic block
    IOB = I/O buffer
    PSM = programmable switch matrix
  Interconnections are critical
    Transmission gates on paths
  Flexibility
    Connect any LB to any other
  but
    Much slower than connections within a logic block
    Much slower than long lines on an ASIC
  [Figure: routing fabric; every connection point is a transmission gate, and the switch matrix is a mass of transmission gates too]

Programmable switch matrix
  [Figure: diamond switches where the vertical routing channels cross a horizontal routing (interconnect) channel]
  PSM: Programmable Switch Matrix, for making connections between the interconnects of different channels.
  The structure shown only allows i-to-i connections: track i in one channel can connect only to track i in another.

Example: SRAM-type FPGA
  [Figure: SRAM-type FPGA detail showing a diamond switch, an FF, an interconnection cell, the connection matrix (CCM), and the PSM]

Configuring an FPGA
  Millions of SRAM cells hold the LUT contents and the interconnect routing information
  Volatile memory: the configuration is lost when board power is turned off
  Keep the bit pattern describing the SRAM cells in non-volatile memory, e.g., ROM or a digital camera card
  Configuration takes on the order of seconds
  [Figure: programming bit file loaded through the JTAG port; configuration data in and configuration data out pass through a chain of SRAM cells; legend: I/O pin/pad, SRAM cell; JTAG is also used for testing]

FPGA Generic Design Flow
  Design entry: create your design files using
    a schematic editor, or
    a hardware description language (VHDL, Verilog)
  Design implementation on FPGA:
    partition, place, and route to create the bit-stream file
  Design verification:
    use a simulator to check function
    load onto the FPGA device (a cable connects the PC to the development board)
    check operation at full speed in the real environment

[Figure: design flow, top to bottom]
  VHDL description (your source files)
  Functional simulation
  Synthesis
  Post-synthesis simulation
  Implementation
  Timing simulation
  Configuration
  On-chip testing

  Example VHDL description:

    library IEEE;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity RC5_core is
      port(
        clock, reset, encr_decr : in  std_logic;
        data_input  : in  std_logic_vector(31 downto 0);
        data_output : out std_logic_vector(31 downto 0);
        out_full    : in  std_logic;
        key_input   : in  std_logic_vector(31 downto 0);
        key_read    : out std_logic  -- no semicolon after the last port
      );
    end RC5_core;  -- must match the entity name

Logic Synthesis
  [Figure: VHDL description -> circuit netlist]

    architecture MLU_DATAFLOW of MLU is
      signal A1 : STD_LOGIC;
      signal B1 : STD_LOGIC;
      signal Y1 : STD_LOGIC;
      signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
    begin
      -- optional inversion of the inputs and of the output
      A1 <= A  when (NEG_A = '0') else not A;
      B1 <= B  when (NEG_B = '0') else not B;
      Y  <= Y1 when (NEG_Y = '0') else not Y1;

      -- the four candidate logic operations
      MUX_0 <= A1 and  B1;
      MUX_1 <= A1 or   B1;
      MUX_2 <= A1 xor  B1;
      MUX_3 <= A1 xnor B1;

      -- L1, L0 select the operation
      with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
    end MLU_DATAFLOW;

Implementation
  After synthesis, the entire implementation process is performed by FPGA vendor tools

Translation
  [Figure: inputs and output of the translation step]
    Synthesis produces the circuit netlist (EDIF, Electronic Design Interchange Format) and timing constraints (NCF, Native Constraint File)
    The Constraint Editor produces the UCF (User Constraint File)
    Translation produces the NGD (Native Generic Database) file

Pin Assignment
  [Figure: ports of top_level_design (CLOCK, CONTROL(0) to CONTROL(2), RESET, SEGMENTS(0) to SEGMENTS(6)) assigned to FPGA package pins]

Mapping
  [Figure: circuit netlist mapped onto LUTs (LUT0 to LUT5) and flip-flops (FF1, FF2)]

Placement
  [Figure: mapped logic placed into CLB slices on the FPGA]

Routing
  [Figure: placed logic connected through the FPGA's programmable connections]

Configuration
  Once a design is implemented, you must create a file that the FPGA can understand
    This file is called a bitstream: a BIT file (.bit extension)
  The BIT file can be downloaded directly to the FPGA, or it can be converted into a PROM file, which stores the programming information

Map report
  Design Summary
  --------------
  Number of errors:
  Number of warnings:
  Logic Utilization:
    Number of Slice Flip Flops:                        3 out of 26,624   %
    Number of 4 input LUTs:                           38 out of 26,624   %
  Logic Distribution:
    Number of occupied Slices:                        33 out of 13,312   %
    Number of Slices containing only related logic:   33 out of 33       %
    Number of Slices containing unrelated logic:         out of 33       %
      *See NOTES below for an explanation of the effects of unrelated logic
  Total Number 4 input LUTs:                          62 out of 26,624   %
    Number used as logic:                             38
    Number used as a route-thru:                      24
  Number of bonded IOBs:                                 out of 22      4%
    IOB Flip Flops:                                    7
  Number of GCLKs:                                       out of 8       2%

Place & route report
  Asterisk (*) preceding a constraint indicates it was not met. This may be due to a setup or hold violation.

  Constraint                               Requested   Actual    Logic    Absolute   Number of
                                                                 Levels   Slack      errors
  ---------------------------------------------------------------------------------------------
  * TS_CLOCK = PERIOD TIMEGRP "CLOCK"      5.000ns     5.140ns   4        -0.140ns   5
    5 ns HIGH 50%
  ---------------------------------------------------------------------------------------------
    TS_genHz_ClockHz = PERIOD TIMEGRP      5.000ns     4.137ns   2        0.863ns
    "genhz_clockhz" 5 ns HIGH 50%
  ---------------------------------------------------------------------------------------------

Post layout timing report
  Clock to Setup on destination clock CLOCK
  ---------------+---------+---------+---------+---------+
                 | Src:Rise| Src:Fall| Src:Rise| Src:Fall|
  Source Clock   |Dest:Rise|Dest:Rise|Dest:Fall|Dest:Fall|
  ---------------+---------+---------+---------+---------+
  CLOCK          |    5.140|         |         |         |
  ---------------+---------+---------+---------+---------+

  Timing summary:
  ---------------
  Timing errors: 9  Score: 543
  Constraints cover 574 paths, nets, and 87 connections
  Design statistics:
    Minimum period: 5.140ns (Maximum frequency: 194.553MHz)
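The figures in these timing reports are related by simple arithmetic: slack is the requested period minus the achieved period, and the maximum frequency is the reciprocal of the minimum period. The sketch below assumes the two constraints read 5.000 ns requested with 5.140 ns and 4.137 ns actual, which is the reading consistent with the reported 194.553 MHz; the function names are hypothetical.

```python
def slack_ns(requested_ns, actual_ns):
    """Positive slack: constraint met. Negative slack: constraint violated."""
    return round(requested_ns - actual_ns, 3)


def max_frequency_mhz(min_period_ns):
    """A period of T nanoseconds corresponds to 1000 / T megahertz."""
    return round(1000.0 / min_period_ns, 3)


# Assumed values from the place & route report.
violated = slack_ns(5.000, 5.140)   # first constraint, marked with an asterisk
met = slack_ns(5.000, 4.137)        # second constraint
fmax = max_frequency_mhz(5.140)

print(violated, met, fmax)
```

A negative slack on the main clock means the design cannot run at the requested 200 MHz (5 ns period); it either needs optimization or a relaxed clock constraint.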

Summary
  FPGAs are more and more prevalent!
  They offer a flexible platform for increasingly complex systems
  Design automation tools take care of the entire design process, from VHDL to the configuration bitstream file

Appendix A: other FPGA architectures

Virtex-II
  [Figure: Virtex-II floorplan]
    Block SelectRAM resource
    I/O Blocks (IOBs)
    Dedicated multipliers
    Programmable interconnect
    Configurable Logic Blocks (CLBs)
    Clock Management (DCMs, BUFGMUXes)
  The Virtex-II architecture's core voltage is 1.5 V

Slices and CLBs
  Each Virtex-II CLB contains four slices
    Local routing provides feedback between slices in the same CLB, and it provides routing to neighboring CLBs
    A switch matrix provides access to general routing resources
  [Figure: CLB with a switch matrix, slices S0, S1, S2, and S3, local routing, BUFT buffers, a SHIFT chain, and CIN/COUT carry chains]

Dedicated Multiplier Blocks
  18-bit two's complement signed operation
  Optimized to implement multiply-and-accumulate functions
  Multipliers are physically located next to the block SelectRAM memory
  [Figure: 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and Output (36 bits); supported signed sizes include 4 x 4, 8 x 8, 12 x 12, and 18 x 18]
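The relationship between operand width and output width in the dedicated multiplier can be checked with a small model of a signed two's complement multiply. The sketch assumes 18-bit operands, which is consistent with the 36-bit output on the slide; the helper names are hypothetical.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mult18x18(a_raw, b_raw):
    """Signed 18 x 18 multiply; returns the 36-bit two's complement result pattern."""
    product = to_signed(a_raw, 18) * to_signed(b_raw, 18)
    return product & ((1 << 36) - 1)


# 0x3FFFF is -1 in 18-bit two's complement; 0x20000 is the most negative value.
minus_two = to_signed(mult18x18(0x3FFFF, 2), 36)
largest = mult18x18(0x20000, 0x20000)

print(minus_two, largest == 1 << 34)
```

The largest possible magnitude, (-2^17) x (-2^17) = 2^34, fits in 36 signed bits, so the full product never overflows the output.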

Virtex-4 Architecture
  RocketIO Multi-Gigabit Transceivers: 622 Mbps to 10.3 Gbps
  Advanced CLBs: 200K logic cells
  Smart RAM: new block RAM/FIFO
  Xesium Clocking Technology: 500 MHz
  XtremeDSP Technology Slices: 256 18x18 GMACs
  PowerPC 405 with APU Interface: 450 MHz, 680 DMIPS
  Tri-Mode Ethernet MAC: 10/100/1000 Mbps
  1 Gbps SelectIO: ChipSync source synchronous, XCITE active termination

Choose the Platform that Best Fits the Application!
  Resource        LX                FX                SX
  Logic           14K-200K LCs      12K-140K LCs      23K-55K LCs
  Memory          0.9-6 Mb          0.6-10 Mb         2.3-5.7 Mb
  DCMs            4-12              4-20              4-8
  DSP Slices      32-96             32-192            128-512
  SelectIO        240-960           240-896           320-640
  RocketIO        N/A               0-24 channels     N/A
  PowerPC         N/A               1 or 2 cores      N/A
  Ethernet MAC    N/A               2 or 4 cores      N/A
