L12: Reconfigurable Logic Architectures

Acknowledgements: Materials in this lecture are courtesy of the following sources and are used with permission.
- Frank Honore
- Prof. Randy Katz (Unified Microelectronics Corporation Distinguished Professor in Electrical Engineering and Computer Science at the University of California, Berkeley) and Prof. Gaetano Borriello (University of Washington Department of Computer Science & Engineering). From Chapter 2 of R. Katz, G. Borriello. Contemporary Logic Design. 2nd ed. Prentice-Hall/Pearson Education, 2005.

History of Computational Fabrics
- Discrete devices: relays, transistors (1940s-50s)
- Discrete logic gates (1950s-60s)
- Integrated circuits (1960s-70s), e.g. TTL packages: Data Book for 100s of different parts
- Gate arrays (IBM, 1970s): transistors are pre-placed on the chip, and place-and-route software puts the chip together automatically; only the interconnect is programmed (mask programming)
- Software-based schemes (1970s-present): run instructions on a general-purpose core
- Programmable logic (1980s-present): a chip that can be reprogrammed after it has been fabricated. Examples: PALs, EPROM, EEPROM, PLDs, FPGAs. Excellent support for mapping from Verilog
- ASIC design (1980s-present): turn Verilog directly into layout using a library of standard cells. Effective for high-volume production and efficient use of silicon area

Reconfigurable Logic
- Logic blocks: implement combinational and sequential logic
- Interconnect: wires to connect inputs and outputs to logic blocks
- I/O blocks: special logic blocks at the periphery of the device for external connections

Key questions:
- How to make logic blocks programmable (after the chip has been fabbed)?
- What should the logic granularity be?
- How to make the wires programmable (after the chip has been fabbed)?
- Specialized wiring structures for local vs. long-distance routes?
- How many wires per logic block?

[Figure: logic block with n inputs, configurable logic feeding a D flip-flop, and m outputs]

Programmable Array Logic (PAL)
- Based on the fact that any combinational logic can be realized as a sum of products
- PALs feature an array of AND-OR gates with programmable interconnect

[Figure: input signals feed the AND array (programming of product terms), which feeds the OR array (programming of sum terms), which drives the output signals]
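
The sum-of-products idea can be sketched as a small behavioral model in Python. This is only an illustration, not a description of any particular device: the function names and the dictionary encoding of a product term are assumptions made for this example. Each product term lists the literals wired into one AND gate, and the output is the OR of all programmed terms.

```python
from itertools import product


def pal_output(product_terms, inputs):
    """Evaluate one PAL output: the OR of its programmed AND (product) terms.

    product_terms: list of dicts mapping input name -> required value
                   (True = true literal, False = complemented literal).
    inputs:        dict mapping input name -> bool.
    """
    return any(
        all(inputs[name] == want for name, want in term.items())
        for term in product_terms
    )


def truth_table(product_terms, names):
    """List the output for every input combination, in counting order."""
    rows = []
    for values in product([False, True], repeat=len(names)):
        rows.append(pal_output(product_terms, dict(zip(names, values))))
    return rows


# Hypothetical programming: XOR as the sum of two product terms, a·b' + a'·b.
XOR_TERMS = [{"a": True, "b": False}, {"a": False, "b": True}]

print(truth_table(XOR_TERMS, ["a", "b"]))  # [False, True, True, False]
```

Changing the contents of XOR_TERMS plays the role of reprogramming the AND array; the OR stage stays fixed.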

Inside the 22V10 PAL
- Each input pin (and its complement) is sent to the AND array
- The OR gate for each output can take 8-16 product terms, depending on the output pin
- A macrocell block provides additional output flexibility

[Image removed due to copyright restrictions.]

Cypress PALCE22V10
- Outputs may be registered or combinational, positive or inverted

[Image removed due to copyright restrictions.] From Lattice Semiconductor. Images courtesy of Lattice Semiconductor Corporation. Used with permission.

Anti-Fuse-Based Approach (Actel)
- Rows of programmable logic building blocks + rows of interconnect
- Anti-fuse technology: program once
- Use anti-fuses to build up long wiring runs from short segments
- 8-input, single-output combinational logic blocks
- FFs constructed from discrete cross-coupled gates

[Figure: rows of logic modules alternating with wiring tracks, bordered on all four sides by I/O buffers, programming and test logic]

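Actel Logic Module
- The combinational block does not have an output FF

[Figure: example gate mapping onto the multiplexer-based logic module (inputs A, B, C, D, E; data inputs selected by codes 00, 01, 10, 11; output Y)]
[Figure: S-R flip-flop built from the same module (inputs S and R, some data inputs tied to GND and VDD, output Q)]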
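
The exact wiring in the slide's figure is not recoverable from the transcription, so the following Python sketch only illustrates the general principle behind a multiplexer-based logic module: tying the data inputs of a 4:1 multiplexer to constants or to another signal turns it into a gate. The function names and the specific gates chosen are assumptions for illustration.

```python
def mux4(d0, d1, d2, d3, s1, s0):
    """4:1 multiplexer: the select pair (s1, s0) picks one data input."""
    return (d0, d1, d2, d3)[(s1 << 1) | s0]


def and_gate(a, b):
    # Data inputs tied to constants: only select code 11 passes a 1.
    return mux4(0, 0, 0, 1, a, b)


def or_gate(a, b):
    # Every select code except 00 passes a 1.
    return mux4(0, 1, 1, 1, a, b)


def and3(a, b, c):
    # Putting a signal on a data input adds a third variable:
    # the output follows c only when a = b = 1.
    return mux4(0, 0, 0, c, a, b)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```

The same trick is why a module with no dedicated flip-flop can still build one: feeding the output back to a data input gives the cross-coupled structure the previous slide mentions.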

Actel Routing & Programming
- Programming is permanent (one time)

[Figure: routing around a logic module, showing inputs and outputs, input segments, output segments, a horizontal channel, and long vertical tracks]
[Figure: programming an antifuse. Labels: precharge phase, tracks at Vpp/2, Vpp and Gnd applied, antifuse shorted]
Courtesy of Actel. Used with permission.

RAM Based Field Programmable Logic - Xilinx
- Configurable Logic Blocks (CLBs)
- Programmable interconnect, with a switch matrix between CLBs
- I/O Blocks (IOBs)

[Figure: IOB with slew rate control, passive pull-up/pull-down, output buffer, input buffer with delay, input and output flip-flops, and pad]
[Figure: CLB with function generators F (inputs F1-F4), G (inputs G1-G4), and H (input H1), control inputs C1-C4 (H1, DIN, S/R, EC), clock K, S/R control, two D flip-flops with enable (EC), set (SD), and reset (RD), and outputs X and Y]
Courtesy of Xilinx. Used with permission.

The Xilinx 4000 CLB
[Figure] Courtesy of Xilinx. Used with permission.

Two 4-Input Functions, Registered Output, and a Two-Input Function
[Figure] Courtesy of Xilinx. Used with permission.

5-Input Function, Combinational Output
[Figure] Courtesy of Xilinx. Used with permission.

LUT Mapping
- An N-LUT is a direct implementation of a truth table: it can realize any function of N inputs
- An N-LUT requires 2^N storage elements (latches)
- The N inputs select one latch location (like a memory)
- The latches are set by the configuration bitstream
- Why latches and not registers?

[Figure: 4-LUT example, with the inputs selecting one latch that drives the output] Courtesy of Xilinx. Used with permission.
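
A minimal Python sketch of the idea, under stated assumptions: the class name, the bit ordering (input i is address bit i), and the example function are all chosen for this illustration and are not taken from the Xilinx documentation. The constructor fills the 2^N storage bits the way a configuration bitstream would, and a read simply uses the inputs as an address.

```python
class LUT:
    """Behavioral model of an N-input look-up table."""

    def __init__(self, n, func):
        self.n = n
        # 2**n storage bits, one per truth-table row.
        self.bits = [int(bool(func(*self._unpack(addr)))) for addr in range(2 ** n)]

    def _unpack(self, addr):
        # Assumed ordering: input i is bit i of the address.
        return [(addr >> i) & 1 for i in range(self.n)]

    def read(self, *inputs):
        # The inputs select exactly one stored bit, like a memory read.
        addr = sum(bit << i for i, bit in enumerate(inputs))
        return self.bits[addr]


# Hypothetical 4-input function used only to fill the table.
example = LUT(4, lambda a, b, c, d: (a & b) ^ (c | d))

print(len(example.bits))         # 16 storage elements for a 4-LUT
print(example.read(1, 1, 0, 0))  # (1 & 1) ^ (0 | 0) = 1
```

Any other 4-input function costs exactly the same 16 bits, which is why the logic delay through a LUT does not depend on the function it implements.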

Configuring the CLB as a RAM
- Memory is built using latches, not FFs
- 16x2 configuration
- A read is the same as a LUT function!

[Figure] Courtesy of Xilinx. Used with permission.
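
The 16x2 configuration can be pictured as two 16-entry LUT storage arrays that share one 4-bit address. The Python sketch below is a behavioral illustration only; the class name and the assignment of data bit 0 to one array and bit 1 to the other are assumptions, and the clocking of writes is not modeled.

```python
class LutRam16x2:
    """Two 16x1 LUT storage arrays sharing a 4-bit address (assumed layout)."""

    def __init__(self):
        self.f = [0] * 16  # holds data bit 0 (assumption)
        self.g = [0] * 16  # holds data bit 1 (assumption)

    def write(self, addr, data):
        # A write changes the stored bits at run time,
        # instead of only at configuration time.
        self.f[addr] = data & 1
        self.g[addr] = (data >> 1) & 1

    def read(self, addr):
        # Identical to a LUT evaluation: the address selects one bit per array.
        return (self.g[addr] << 1) | self.f[addr]


ram = LutRam16x2()
ram.write(5, 0b10)
print(ram.read(5))  # 2
print(ram.read(6))  # 0, never written
```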

Xilinx 4000 Interconnect
[Figure] Courtesy of Xilinx. Used with permission.

Xilinx 4000 Interconnect Details
- Wires are not ideal!

[Figure] Courtesy of Xilinx. Used with permission.

Xilinx 4000 Flexible IOB
- Outputs through FF or bypassed
- Adjust transition time
- Adjust the sampling edge

[Figure] Courtesy of Xilinx. Used with permission.

Add Bells & Whistles
- Hard processor
- Gigabit serial I/O
- Multiplier (18 bit x 18 bit inputs, 36 bit result)
- Programmable termination (impedance control)
- BRAM
- Clock management

Courtesy of David B. Parlour, ISSCC 2004 Tutorial, "The Reality and Promise of Reconfigurable Computing in Digital Signal Processing," and Xilinx. Used with permission.

The Virtex II CLB (Half Slice Shown)
[Figure] Courtesy of Xilinx. Used with permission.

Adder Implementation
- LUT: computes A ⊕ B
- Sum: Y = A ⊕ B ⊕ Cin
- Dedicated carry logic passes Cin in and produces Cout
- 1 half-slice = 1-bit adder

[Figure] Courtesy of Xilinx. Used with permission.

Carry Chain
- 1 CLB = 4 slices = two 4-bit adders
- 64-bit adder: 16 CLBs
- CLBs must be in the same column

[Figure: A[63:0] + B[63:0] = Y[63:0]. CLB0 takes A[3:0], B[3:0] and produces Y[3:0]; CLB1 takes A[7:4], B[7:4] and produces Y[7:4]; and so on up to CLB15, which takes A[63:60], B[63:60] and produces Y[63:60] plus the carry-out Y[64]]
Courtesy of Xilinx. Used with permission.
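
The adder and carry-chain slides can be tied together with a short behavioral sketch in Python. The slides only say "dedicated carry logic"; modeling it as a multiplexer that forwards Cin when A and B differ, and forwards A otherwise, is an assumption made here, as are the function names. The figure of 4 bits per CLB is read off the slide's diagram (CLB0 produces Y[3:0]).

```python
def half_slice(a, b, cin):
    """One bit of the adder: a LUT plus dedicated carry logic."""
    p = a ^ b               # LUT output: A xor B
    y = p ^ cin             # sum bit: A xor B xor Cin
    cout = cin if p else a  # assumed carry mux: propagate Cin, or generate from A
    return y, cout


def carry_chain_add(a, b, width=64):
    """Ripple the carry through `width` half-slices, lowest bit first."""
    carry, total = 0, 0
    for i in range(width):
        y, carry = half_slice((a >> i) & 1, (b >> i) & 1, carry)
        total |= y << i
    return total | (carry << width)  # bit `width` holds the final carry-out


BITS_PER_CLB = 4  # from the figure: each CLB handles a 4-bit group of the chain


def clbs_needed(width):
    """Number of CLBs in the column, rounding up."""
    return -(-width // BITS_PER_CLB)


print(carry_chain_add(2 ** 64 - 1, 1) == 2 ** 64)  # True: carry ripples to bit 64
print(clbs_needed(64))                             # 16
```

Because the carry moves only through the dedicated logic from one bit to the next, the chain needs its CLBs stacked in one column, which is the placement constraint the slide states.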

Virtex II Features
- Double Data Rate registers
- Digital Clock Manager
- Embedded multiplier
- Block SelectRAM

[Figure] Courtesy of Xilinx. Used with permission.

The Latest Generation: Virtex-II Pro
- FPGA fabric
- Embedded memories
- Embedded PowerPC
- Hardwired multipliers
- High-speed I/O

[Figure] Courtesy of Xilinx. Used with permission.

FPGA Evolution Summary [Parlour04]
[Figure: transistor count (x 10^6, logarithmic axis from 0.1 to 1000) versus year (1980 to 2005). Features added over time: logic + FF, distributed RAM, arithmetic support, DSP system design tools, block RAM, hard MAC, hard CPU, high-speed serial I/O. Roles over time: glue logic, core functionality, logic platform, system platform, domain-specific platform] Courtesy of Xilinx. Used with permission.

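Design Flow - Mapping
- Technology mapping: schematic/HDL to physical logic units
- Compile functions into basic LUT-based groups (a function of the target architecture)

[Figure: gates combining inputs a, b, c and b, d, followed by a D flip-flop, are mapped onto one LUT followed by a D flip-flop]

    // Registered sum of products with an asynchronous, active-low reset.
    always @(posedge Clock or negedge Reset)
    begin
      if (!Reset) q <= 0;
      else        q <= (a & b & c) | (b & d);
    end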
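
To make the mapping step concrete, the Python sketch below computes the 16 configuration bits a single 4-LUT would need for the combinational part of the example. It assumes the two product terms are ORed together, and it assumes input a is address bit 0 through d as address bit 3; real tools choose their own ordering. The function names are hypothetical.

```python
def lut_init(func, n=4):
    """Pack the truth table of an n-input function into one integer.

    Bit `addr` of the result is the LUT content at address `addr`,
    with input i taken from bit i of the address (assumed ordering).
    """
    init = 0
    for addr in range(2 ** n):
        ins = [(addr >> i) & 1 for i in range(n)]
        if func(*ins):
            init |= 1 << addr
    return init


def next_q(a, b, c, d):
    # Combinational part of the slide's example, assuming an OR of the terms.
    return (a & b & c) | (b & d)


init = lut_init(next_q)
print(f"{init:#06x}")  # 0xcc80
```

The flip-flop in the example is not part of the table: after mapping, the LUT output simply feeds the D input of the register that sits next to it in the logic block.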

Design Flow - Placement & Route
- Placement: assign each logic block a location on a particular device
- Routing: an iterative process to connect CLB inputs/outputs and IOBs. Optimizes critical path delay; can take hours or days for large, dense designs
- Satisfy timing? If timing is not met, iterate placement; once it is met, generate the bitstream to configure the device
- Challenge: cannot use the full chip at reasonable speeds (wires are not ideal). Typically no more than 50% utilization.
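
The iterate-until-good-enough flavor of placement can be illustrated with a toy Python sketch. This is not how any vendor tool works: it assumes two-pin nets, uses total Manhattan wire length as a stand-in for timing, and improves the placement by greedy pairwise swaps. All names and the 2x2 example are hypothetical.

```python
from itertools import combinations


def wirelength(placement, nets):
    """Sum of Manhattan distances over two-pin nets."""
    total = 0
    for src, dst in nets:
        (x0, y0), (x1, y1) = placement[src], placement[dst]
        total += abs(x0 - x1) + abs(y0 - y1)
    return total


def improve_placement(placement, nets):
    """Swap pairs of blocks until no single swap shortens the wiring."""
    placement = dict(placement)
    improved = True
    while improved:
        improved = False
        for u, v in combinations(sorted(placement), 2):
            before = wirelength(placement, nets)
            placement[u], placement[v] = placement[v], placement[u]
            if wirelength(placement, nets) < before:
                improved = True  # keep the swap
            else:
                placement[u], placement[v] = placement[v], placement[u]  # undo
    return placement


# Hypothetical design: four blocks chained A-B-C-D on a 2x2 grid.
nets = [("A", "B"), ("B", "C"), ("C", "D")]
start = {"A": (0, 0), "B": (1, 1), "C": (0, 1), "D": (1, 0)}
final = improve_placement(start, nets)
print(wirelength(start, nets), wirelength(final, nets))
```

Greedy swapping can stall in a local minimum on larger designs, which hints at why production placement and routing is slow and iterative.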

Example: Verilog to FPGA

    module adder64 (a, b, sum);
      input  [63:0] a, b;
      output [63:0] sum;
      assign sum = a + b;  // 64-bit result; the carry-out is discarded
    endmodule

Flow: Synthesis, Tech Map, Place & Route

[Figure: 64-bit adder example placed on a Virtex II XC2V2000] Courtesy of Xilinx. Used with permission.

How are FPGAs Used?
- Prototyping
  - Ensemble of gate arrays used to emulate a circuit to be manufactured
  - Get more/better/faster debugging done than with simulation
- Reconfigurable hardware
  - One hardware block used to implement more than one function
- Special-purpose computation engines
  - Hardware dedicated to solving one problem (or class of problems)
  - Accelerators attached to general-purpose computers (e.g., in a cell phone!)

Summary
- FPGAs provide a flexible platform for implementing digital computing
- A rich set of macros and I/Os is supported (multipliers, block RAMs, ROMs, high-speed I/O)
- A wide range of applications, from prototyping (to validate a design before ASIC mapping) to high-performance spatial computing
- Interconnects are a major bottleneck (physical design and locality are important considerations)

"College students will study concurrent programming instead of C as their first computing experience." -- David B. Parlour, ISSCC 2004 Tutorial