L11/12: Reconfigurable Logic Architectures

Acknowledgements: Materials in this lecture are courtesy of the following people and used with permission.
- Randy H. Katz (University of California, Berkeley, Department of Electrical Engineering & Computer Science)
- Gaetano Borriello (University of Washington, Department of Computer Science & Engineering, http://www.cs.washington.edu/370)
- Frank Honore

History of Computational Fabrics
- Discrete devices: relays, transistors (1940s-50s)
- Discrete logic gates (1950s-60s)
- Integrated circuits (1960s-70s)
  - e.g. TTL packages: data book for 100s of different parts
- Gate arrays (IBM, 1970s)
  - Transistors are pre-placed on the chip, and place-and-route software puts the chip together automatically
  - Only the interconnect is programmed (mask programming)
- Software-based schemes (1970s to present)
  - Run instructions on a general-purpose core
- ASIC design (1980s to present)
  - Turn Verilog directly into layout using a library of standard cells
  - Effective for high volume and efficient use of silicon area
- Programmable logic (1980s to present)
  - A chip that can be reprogrammed after it has been fabricated
  - Examples: PALs, EPROM, EEPROM, PLDs, FPGAs
  - Excellent support for mapping from Verilog

Reconfigurable Logic
- Logic blocks: implement combinational and sequential logic
- Interconnect: wires to connect inputs and outputs to logic blocks
- I/O blocks: special logic blocks at the periphery of the device for external connections
- Key questions:
  - How to make logic blocks programmable? (after the chip has been fabbed!)
  - What should the logic granularity be?
  - How to make the wires programmable? (after the chip has been fabbed!)
  - Specialized wiring structures for local vs. long-distance routes?
  - How many wires per logic block?

[Figure: logic block with n inputs, configuration bits, combinational logic feeding a D flip-flop, and m outputs]

Programmable Array Logic (PAL)
- Based on the fact that any combinational logic can be realized as a sum of products
- PALs feature an array of AND-OR gates with programmable interconnect

[Figure: input signals -> AND array (programming of product terms) -> OR array (programming of sum terms) -> output signals]
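The sum-of-products idea can be sketched as a small behavioral model in Python. This is only an illustration: the function `pal_output`, the dictionary encoding of product terms, and the example function f = a·b + a'·c are assumptions made for the sketch, not the programming format of any real device.

```python
from itertools import product

def pal_output(inputs, and_plane, or_plane):
    """Evaluate an AND-OR array (behavioral sketch, not a device model).

    inputs    : tuple of 0/1 input values
    and_plane : list of product terms; each term maps an input index to the
                value it must have (1 = true literal, 0 = complemented literal);
                inputs absent from the dict are not connected to that AND gate
    or_plane  : one entry per output, listing the product-term indices it ORs
    """
    terms = [all(inputs[i] == v for i, v in term.items()) for term in and_plane]
    return tuple(int(any(terms[t] for t in out)) for out in or_plane)

# Hypothetical programming for f = a*b + (not a)*c, with inputs ordered (a, b, c)
AND_PLANE = [{0: 1, 1: 1}, {0: 0, 2: 1}]
OR_PLANE = [[0, 1]]

def reference(a, b, c):
    """Direct Boolean expression used to check the array programming."""
    return int((a and b) or ((not a) and c))

# Exhaustively compare the programmed array against the reference function
for a, b, c in product((0, 1), repeat=3):
    assert pal_output((a, b, c), AND_PLANE, OR_PLANE) == (reference(a, b, c),)
```

Changing the function only means changing `AND_PLANE` and `OR_PLANE`, which is the software analogue of reprogramming the interconnect while the gates stay fixed.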

Inside the 22V10 Macrocell
- Outputs may be registered or combinational, positive or inverted
- Registered output may be fed back to the AND array for FSMs, etc.

[Figure: macrocell block diagram. Courtesy of Lattice Semiconductor Corporation. Used with permission.]

Anti-Fuse-Based Approach (Actel)
- Rows of programmable logic building blocks + rows of interconnect
- Anti-fuse technology: program once
- Use anti-fuses to build up long wiring runs from short segments
- 8-input, single-output combinational logic blocks
- FFs constructed from discrete cross-coupled gates

[Figure: rows of logic modules separated by wiring tracks, bordered by I/O buffers, programming and test logic]

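Actel Logic Module
- Combinational block does not have an output FF

[Figure: example gate mapping onto the logic module (inputs A, B, C, D, E and GND; multiplexer select codes 00, 01, 10, 11; output Y) and an S-R flip-flop built from the same module (inputs S, R, GND, VDD; output Q)]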

Actel Routing & Programming

[Figure: logic modules in a horizontal channel with input segments, output segments and long vertical tracks]
[Figure: programming an antifuse. Labels: precharge phase with segments at Vpp/2; Vpp and Gnd applied across the antifuse being shorted. Courtesy of Actel. Used with permission.]

RAM Based Field Programmable Logic - Xilinx

[Figure: array of Configurable Logic Blocks (CLBs) connected through switch matrices and programmable interconnect, with I/O Blocks (IOBs) at the periphery]
[Figure: IOB with pad, output buffer with slew rate control, input buffer with delay, passive pull-up/pull-down, and D flip-flops on the input and output paths]
[Figure: CLB with function generators F (inputs F1-F4), G (inputs G1-G4) and H (input H1), control inputs C1-C4 (H1, DIN, S/R, EC), clock K, two D flip-flops with S/R control and clock enable (EC), and outputs X and Y]

The Xilinx 4000 CLB

Two 4-Input Functions, Registered Output, and a Two-Input Function

5-Input Function, Combinational Output

LUT Mapping
- N-LUT: direct implementation of a truth table, so it can realize any function of N inputs
- An N-LUT requires 2^N storage elements (latches)
- The N inputs select one latch location (like a memory)
- Latches are set by the configuration bitstream
- Why latches and not registers?

[Figure: 4-LUT example, with the inputs addressing the latches and a single output]
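A minimal Python sketch of the LUT idea: the configuration is just the 2^N truth-table bits, and evaluating the function is a memory read addressed by the inputs. The helper names `make_lut` and `lut_read`, and the choice of input i as address bit i, are assumptions made for illustration.

```python
def make_lut(func, n):
    """Return the 2**n configuration bits (truth table) for an n-input function."""
    bits = []
    for addr in range(2 ** n):
        # Input i is taken as address bit i (an arbitrary convention for this sketch)
        args = [(addr >> i) & 1 for i in range(n)]
        bits.append(int(bool(func(*args))))
    return bits

def lut_read(bits, *inputs):
    """Evaluate the LUT: the inputs form an address that selects one stored bit."""
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return bits[addr]

# A 4-LUT configured as a 4-input XOR (parity) function
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
```

Any other 4-input function uses the same 16 storage bits and the same read path; only the stored contents change.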

Configuring the CLB as a RAM
- Memory is built using latches, not FFs
- 16x2 configuration
- Read is the same as a LUT function!
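The point that a RAM read is the same operation as a LUT lookup can be sketched in Python. The class below assumes the 16x2 memory is formed from two 16-entry tables that share one 4-bit address, one table per data bit; the class and method names are hypothetical.

```python
class LutRam16x2:
    """Behavioral sketch of a 16x2 RAM built from two 16-entry LUT tables."""

    def __init__(self):
        self.bit0 = [0] * 16  # storage for data bit 0 (assumed: one LUT)
        self.bit1 = [0] * 16  # storage for data bit 1 (assumed: a second LUT)

    def write(self, addr, data):
        """Writing changes the stored latch contents at the addressed location."""
        self.bit0[addr] = data & 1
        self.bit1[addr] = (data >> 1) & 1

    def read(self, addr):
        """Reading is an ordinary LUT lookup: the address bits select one entry."""
        return (self.bit1[addr] << 1) | self.bit0[addr]

ram = LutRam16x2()
ram.write(5, 0b10)
```

The only difference from a logic LUT is that the contents can change during operation instead of being fixed by the configuration bitstream.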

Xilinx 4000 Interconnect

Xilinx 4000 Interconnect Details
- Wires are not ideal!

Add Bells & Whistles
- Hard processor
- Gigabit serial I/O
- Multiplier (18 bit x 18 bit inputs, 36 bit product)
- Programmable termination and impedance control (VCCIO)
- BRAM
- Clock management

Courtesy of David B. Parlour. Used with permission. ISSCC 2004 Tutorial, "The Reality and Promise of Reconfigurable Computing in Digital Signal Processing"

Xilinx 4000 Flexible IOB
- Outputs through FF or bypassed
- Adjust transition time
- Adjust the sampling edge

The Virtex II CLB (Half Slice Shown)

Adder Implementation
- LUT computes A ⊕ B
- Dedicated carry logic produces Cout and the sum Y = A ⊕ B ⊕ Cin
- 1 half-slice = 1-bit adder
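A hedged Python sketch of a one-bit adder split between a LUT and dedicated carry logic. It assumes the usual arrangement in which the LUT output acts as a propagate signal, a dedicated XOR forms the sum, and a dedicated multiplexer forms the carry-out; the function name `half_slice_add` is hypothetical.

```python
def half_slice_add(a, b, cin):
    """One-bit full adder split the way a LUT-plus-carry-logic cell might do it."""
    p = a ^ b               # LUT output: propagate signal A xor B
    y = p ^ cin             # dedicated XOR: sum bit Y = A xor B xor Cin
    cout = cin if p else a  # dedicated carry mux: pass Cin when propagating,
                            # otherwise A (A equals B here, so A is the carry)
    return y, cout

# Exhaustive check against integer addition
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            y, cout = half_slice_add(a, b, cin)
            assert a + b + cin == y + 2 * cout
```

Keeping the carry path out of the LUT is what lets the carry ripple through fast dedicated wiring instead of the general interconnect.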

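Carry Chain
- 1 CLB = 4 slices = two 4-bit adders
- 64-bit adder: 16 CLBs, each contributing 4 bits to one carry chain
- CLBs must be in the same column

[Figure: A[63:0] + B[63:0] -> Y[63:0] with carry-out Y[64]. CLB0 takes A[3:0], B[3:0] and produces Y[3:0]; CLB1 takes A[7:4], B[7:4] and produces Y[7:4]; ...; CLB15 takes A[63:60], B[63:60] and produces Y[63:60] and Y[64]]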
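The 16-stage chain can be sketched in Python as sixteen 4-bit adders whose carries ripple from stage 0 to stage 15. The function names `add4` and `adder64_chain` are hypothetical, and each 4-bit stage stands in for the portion of the chain inside one CLB.

```python
def add4(a, b, cin):
    """4-bit adder stage: returns (4-bit sum, carry-out)."""
    total = a + b + cin
    return total & 0xF, total >> 4

def adder64_chain(a, b):
    """64-bit add built from 16 chained 4-bit stages; returns the 65-bit result Y[64:0]."""
    carry = 0
    y = 0
    for k in range(16):                   # stage k handles bits [4k+3 : 4k]
        ak = (a >> (4 * k)) & 0xF
        bk = (b >> (4 * k)) & 0xF
        s, carry = add4(ak, bk, carry)    # the carry ripples into the next stage
        y |= s << (4 * k)
    return y | (carry << 64)              # the final carry-out becomes Y[64]
```

Because each stage waits for the carry from the stage below it, the stages need a short, fixed connection between them, which is why the blocks are placed in one column.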

Virtex II Features
- Double Data Rate registers
- Digital Clock Manager
- Embedded Multiplier
- Block SelectRAM

The Latest Generation: Virtex-II Pro
- FPGA fabric
- Embedded memories
- Embedded PowerPC
- Hardwired multipliers
- High-speed I/O

Design Flow - Mapping
- Technology mapping: schematic/HDL to physical logic units
- Compile functions into basic LUT-based groups (a function of the target architecture)

[Figure: gates on inputs a, b, c, d driving a D flip-flop, mapped to one LUT followed by a D flip-flop]

    always @(posedge Clock or negedge Reset)
    begin
      if (!Reset)
        q <= 0;                        // asynchronous, active-low reset
      else
        q <= (a & b & c) | (b & d);    // next-state logic, fits in one 4-input LUT
    end
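A small Python sketch of what mapping this logic onto one 4-input LUT amounts to: enumerate the truth table of a 4-input function and pack it into a 16-bit configuration word. The function shown assumes the two product terms are ORed together, and the bit ordering (a as address bit 0 up to d as address bit 3) and the names `next_q`, `lut_init`, and `clocked_q` are illustrative choices, not a vendor format.

```python
def next_q(a, b, c, d):
    """Next-state logic, assuming the product terms (a&b&c) and (b&d) are ORed."""
    return (a & b & c) | (b & d)

def lut_init(func, n=4):
    """Pack the truth table of an n-input function into an integer, one bit per address."""
    value = 0
    for addr in range(2 ** n):
        args = [(addr >> i) & 1 for i in range(n)]  # a = bit 0, b = bit 1, ...
        value |= func(*args) << addr
    return value

INIT = lut_init(next_q)  # 16 configuration bits for the single LUT

def clocked_q(reset_n, a, b, c, d):
    """Value the flip-flop takes: 0 while the active-low reset is asserted,
    otherwise the LUT output addressed by (d, c, b, a)."""
    if not reset_n:
        return 0
    return (INIT >> (a | (b << 1) | (c << 2) | (d << 3))) & 1
```

With this ordering the table has ones only at addresses 7, 10, 11, 14 and 15, so `INIT` is `0xCC80`.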

Design Flow - Placement & Route
- Placement: assign logic to locations on a particular device
- Routing: iterative process to connect CLB inputs/outputs and IOBs
  - Optimizes critical path delay; can take hours or days for large, dense designs
- Iterate placement if timing is not met
- Timing satisfied? Generate the bitstream to configure the device
- Challenge! Cannot use the full chip at reasonable speeds (wires are not ideal). Typically no more than 50% utilization.

[Figure: LUTs placed on the device and routed]

Example: Verilog to FPGA

    module adder64 (a, b, sum);
      input  [63:0] a, b;
      output [63:0] sum;
      assign sum = a + b;
    endmodule

Flow: Synthesis -> Tech Map -> Place & Route

[Figure: 64-bit adder example on a Virtex II XC2V2000]

How are FPGAs Used?
- Prototyping
  - Ensemble of gate arrays used to emulate a circuit to be manufactured
  - Get more/better/faster debugging done than with simulation
- Reconfigurable hardware
  - One hardware block used to implement more than one function
- Special-purpose computation engines
  - Hardware dedicated to solving one problem (or class of problems)
  - Accelerators attached to general-purpose computers (e.g., in a cell phone!)

Summary
- FPGAs provide a flexible platform for implementing digital computing
- A rich set of macros and I/Os is supported (multipliers, block RAMs, ROMs, high-speed I/O)
- A wide range of applications, from prototyping (to validate a design before ASIC mapping) to high-performance spatial computing
- Interconnects are a major bottleneck (physical design and locality are important considerations)

"College students will study concurrent programming instead of C as their first computing experience." -- David B. Parlour, ISSCC 2004 Tutorial