PLD Synthesis Algorithms

Professor Jason Cong
Computer Science Department, University of California, Los Angeles, Los Angeles, CA
6/22/2001 DAC 2001 Tutorial: Jason Cong

What to Synthesize
- Structured logic
  - Examples: datapath, register files
  - Best provided by FPGA vendors as libraries and functional generators
- Random logic
  - Examples: control circuits, finite state machines
  - Good candidates to be synthesized by automatic tools

Programmable Logic Blocks (PLBs) in FPGAs
- Lookup-table based
  - Altera APEX and FLEX devices
  - Lucent Technologies ORCA devices
  - Xilinx Virtex and XC4K devices
- MUX-based
  - Actel ACT1 and ACT2
  - QuickLogic Eclipse
- PLA-based (CPLD)
  - Altera MAX7000
  - Cypress CY37000 and CY…

Focus of This Talk
- Synthesis for random logic
  - Needs a high degree of automation
  - Much room for optimization
  - Extensive research
- Synthesis for SRAM-based (LUT-based) FPGAs
  - Has the largest share of the FPGA market
  - Synthesis-friendly
  - Reconfigurability provides many potential applications

Formulation of LUT-Based Synthesis Problems
- Logic optimization (network transformation)
  - Transform the input network into another network that is more suitable for mapping into LUT networks
- Technology mapping (LUT covering)
  - Cover the optimized network with LUTs for one or more objectives

Logic Optimization Operations
- Example: decomposition
  - Structural: $abcd = ((ab)c)d$
  - Functional: $f(a,b,c,d) = g(y_1(a,b,c),\ y_2(a,b,c),\ d)$
[Figure: f realized as g driven by the sub-functions y1 and y2.]

Logic Optimization Operations (Cont'd)
- Extraction: $f = ac + bc$, $g = ad + bd$; then $f = ec$, $g = ed$, $e = a + b$
- Substitution: $f = a + bc$, $h = bc$; then $f = a + h$
- Elimination: $f = a + bc$, $b = d + e$; then $f = a + cd + ce$
- Critical path re-synthesis
- ...

Technology Mapping for K-LUT
- Cover the network using K-LUTs
- Duplication-free vs. duplicated mapping (K = 3)
[Figure: original circuit, duplication-free mapping, and mapping with duplication.]
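The rewrites listed above are ordinary Boolean identities, so they can be checked exhaustively on small functions. The following is a minimal sketch, not part of the tutorial; the helper name `equivalent` is hypothetical, and it compares two functions over every input assignment.

```python
from itertools import product

def equivalent(f, g, n):
    """True if f and g agree on every assignment of n Boolean inputs."""
    return all(f(*bits) == g(*bits) for bits in product([False, True], repeat=n))

# Structural decomposition: abcd = ((ab)c)d
decomposition_ok = equivalent(
    lambda a, b, c, d: a and b and c and d,
    lambda a, b, c, d: ((a and b) and c) and d,
    4,
)

# Extraction: f = ac + bc becomes f = ec with the shared divisor e = a + b
extraction_ok = equivalent(
    lambda a, b, c: (a and c) or (b and c),
    lambda a, b, c: (a or b) and c,
    3,
)

# Elimination: f = a + bc with b = d + e collapses to f = a + cd + ce
elimination_ok = equivalent(
    lambda a, c, d, e: a or ((d or e) and c),
    lambda a, c, d, e: a or (c and d) or (c and e),
    4,
)

print(decomposition_ok, extraction_ok, elimination_ok)
```

Exhaustive comparison is only practical for a handful of inputs; it is used here purely to illustrate that each operation preserves the function while changing the network structure.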

Outline
- Early results
- Recent advances
- New challenges

Outline: Architecture and Synthesis
- Architecture
  - Simple, homogeneous LUTs (e.g. XC2K, Flex8K)
  - Heterogeneous FPGAs: embedded memory blocks, complex PLBs
  - Million-gate FPGAs: field-programmable system-on-a-chip
- Synthesis
  - Homogeneous K-LUT mapping for depth and area minimization; focus on combinational circuits
  - Heterogeneous FPGA mapping; mapping for EMBs; Boolean matching; simultaneous mapping + retiming
  - Layout-driven synthesis; use of IP blocks; synthesis for FPSOC


Outline
- Early results
  - Developed for homogeneous LUTs
  - Focus on combinational circuits
- Recent advances
- New challenges

Early Results: Depth Minimization
- Optimal mapping for trees
  - Chortle-d [Francis, Rose, Vranesic, ICCAD 91]
- Optimal mapping for general networks
  - FlowMap [Cong & Ding, ICCAD 92]

Early Result: FlowMap
- Depth-optimal technology mapping [Cong & Ding, TCAD 94]
- Basic approach
  - Compute a label for each node; the label of a node is the minimum possible depth of that node in any mapping solution
  - Dynamic programming: starting from the PI nodes, compute node labels in topological order, deriving each node's label from the labels of its predecessors
  - The labels of the PO nodes give the depth of the optimal mapping solution

Cuts in a Network
- Given a cut $(X, \bar{X})$ and a label $l(v)$ on each node $v$:
  - Node cut-size: $n(X, \bar{X}) = |\{v \in X : (v, u) \text{ is a cut edge}\}|$
  - K-feasible cut: $n(X, \bar{X}) \le K$
  - Height of a cut: $h(X, \bar{X}) = \max\{l(v) : v \in X\}$
[Figure: a network cut separating the source s (in $X$) from the sink t (in $\bar{X}$).]
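The three cut quantities defined above can be computed directly on a small example. This is an illustrative sketch only: the network (primary inputs a to d, internal nodes u and w, root t), the edge-list representation, and the function names are assumptions chosen to resemble the K = 3 situation in the slides, not data from the tutorial.

```python
# Edges (driver, sink) of a small fanin cone rooted at t; hypothetical example.
edges = [("a", "u"), ("b", "u"), ("c", "w"), ("d", "w"), ("u", "t"), ("w", "t")]
# Labels already computed for every node except the root t.
label = {"a": 0, "b": 0, "c": 0, "d": 0, "u": 1, "w": 1}

def node_cut_size(edges, x_bar):
    """Number of nodes in X that drive at least one node in X-bar."""
    return len({u for (u, v) in edges if u not in x_bar and v in x_bar})

def is_k_feasible(edges, x_bar, k):
    """A cut is K-feasible when its node cut-size is at most K."""
    return node_cut_size(edges, x_bar) <= k

def cut_height(edges, x_bar, label):
    """Height of the cut: the largest label among the nodes on the X side."""
    nodes = {n for edge in edges for n in edge}
    return max(label[v] for v in nodes - x_bar)

for x_bar in ({"t"}, {"t", "u", "w"}):
    print(sorted(x_bar), node_cut_size(edges, x_bar),
          is_k_feasible(edges, x_bar, 3), cut_height(edges, x_bar, label))
```

With K = 3, the cut that places only t in X-bar is feasible with height 1, while the cut that also pulls u and w into X-bar has height 0 but needs four inputs and is therefore infeasible.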

Label Computation in FlowMap
- Dynamic programming: compute each node label (the optimal mapping depth) by computing a min-height K-feasible cut
- A min-height K-feasible cut can be computed in O(Km) time using flow computation
[Figure: example network with primary inputs and LUT input size K = 3, showing an infeasible cut with h = 0 and a K-feasible cut with h = 1.]

FlowMap Algorithm: Summary
- Phase 1: Label computation
    Process each node t in topological order, starting from the PIs:
      compute a minimum-height K-feasible cut $(X_t, \bar{X}_t)$ in $N_t$;
      $l(t) = h(X_t, \bar{X}_t) + 1$;
- Phase 2: Generate the necessary K-LUTs
    L = list of POs;
    WHILE $L \neq \emptyset$ DO
      remove a node t from L;
      LUT(t) = $\bar{X}_t$;
      $L = L \cup \{\text{non-PI inputs to LUT}(t)\}$
    END.
- Produces a depth-optimal mapping for any K-bounded network in O(Kmn) time, where m is the number of edges and n is the number of nodes in the network
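The two phases summarized above can be sketched in a few lines. Note the substitution: FlowMap finds each min-height cut with a network-flow computation, whereas this sketch enumerates K-feasible cuts explicitly, which is simpler to read but can take exponential time. The function names, the dictionary-based network representation, and the example chain are all hypothetical.

```python
from itertools import product

def label_nodes(fanins, order, k):
    """Phase 1: label each node with its minimum mapping depth.

    fanins maps a node to its list of fanin nodes (absent or empty for PIs);
    order is a topological order. Assumes a K-bounded network.
    """
    cuts, label, best = {}, {}, {}
    for v in order:
        ins = fanins.get(v, [])
        if not ins:                      # primary input
            cuts[v], label[v] = [frozenset([v])], 0
            continue
        # Merge one cut per fanin and keep the merges with at most k leaves.
        merged = set()
        for combo in product(*(cuts[u] for u in ins)):
            c = frozenset().union(*combo)
            if len(c) <= k:
                merged.add(c)
        # Min-height cut; ties broken by size, then by name for determinism.
        best[v] = min(merged, key=lambda c: (max(label[u] for u in c), len(c), sorted(c)))
        label[v] = max(label[u] for u in best[v]) + 1
        cuts[v] = list(merged) + [frozenset([v])]
    return label, best

def generate_luts(best, outputs):
    """Phase 2: walk back from the POs, creating one LUT per needed node."""
    luts, worklist = {}, list(outputs)
    while worklist:
        t = worklist.pop()
        if t in luts or t not in best:   # already mapped, or a primary input
            continue
        luts[t] = best[t]
        worklist.extend(best[t])
    return luts

# Hypothetical network: the chain ((ab)c)d built from 2-input gates, K = 3.
fanins = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g2", "d"]}
order = ["a", "b", "c", "d", "g1", "g2", "g3"]
label, best = label_nodes(fanins, order, 3)
luts = generate_luts(best, ["g3"])
print(label["g3"], {t: sorted(c) for t, c in luts.items()})
```

On this chain the sketch needs two 3-LUTs and reaches depth 2: g2 is covered together with g1 by a LUT over a, b, c, and g3 is a LUT over g2 and d.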

Early Results: Area Minimization
- Optimal mapping for trees with bounded or unbounded fanins
  - Chortle-crf [Francis, Rose, Vranesic, DAC 91]
- Optimal mapping without logic duplication for general networks
  - DF-Map [Cong & Ding, DAC 93]
- NP-hard for general networks with possible logic duplication
  - [Farrahi & Sarrafzadeh, TCAD 94]

Early Results: Combined Synthesis with Mapping
- Extension of traditional logic optimization techniques + covering & functional decomposition
  - MIS-pga and MIS-pga-delay [Murgai et al., DAC 90, ICCAD 91]
- Use of functional decomposition to generate a LUT network directly
  - FGSyn [Lai, Pedram, Vrudhula, DAC 93], IMODEC [Wurth et al., DAC 95], BoolMap-D [Legl et al., DAC 96]
- Mapping with re-synthesis
  - FlowSYN [Cong & Ding, ICCAD 93], ALTO [Huang, Jou, Shen, ICCAD 96]

Outline
- Early results
- Recent advances
  - Optimization and mapping for sequential circuits
  - Synthesis for heterogeneous FPGAs
  - Synthesis for FPGAs with embedded memory blocks
  - Use of Boolean matching instead of pattern matching
  - Combined decomposition and mapping
  - UCLA RASP FPGA synthesis system
- New challenges

Direct Optimization and Mapping for Sequential Circuits
- Traditional approaches
  - Assume the positions of the FFs are fixed
  - Map each combinational subcircuit separately
- The optimal solutions for all subcircuits may not lead to the optimal solution for the entire circuit
[Figure: original circuit mapped with 3-LUTs; clock period Φ = 2 without retiming, Φ = 1 with retiming.]

Difficulties and Solutions
- Difficulties
  - When to retime?
    - Before mapping? The delay is unknown for retiming
    - After mapping? The FF positions are fixed during mapping
  - How to compute an equivalent initial state?
- Solutions
  - Simultaneous mapping with retiming [Pan & Liu, DAC 96] [Cong & Wu, ICCD 96]
  - Optimal mapping + forward retiming [Cong & Wu, DAC 98]

Simultaneous Mapping with Retiming
- Key idea: the expanded circuit
  - A DAG rooted at a node, in which every path from a given node to the root has the same number of FFs
  - Usage: to form all possible LUTs under retiming
[Figure: original circuit with nodes a, b, c, and its expanded circuit covered by a 3-LUT.]

Simultaneous Mapping + Retiming (Cont'd)
- Polynomial-time optimal algorithm for mapping + retiming
  - First proposed in SeqMapII [Pan & Liu, DAC 96]
  - Significant speed-up (over 2000x) achieved by TurboMap [Cong & Wu, ICCD 96]
- Automatic pipelining, with re-synthesis used to reduce the maximum loop delay-to-register ratio
  - TurboSYN [Cong & Wu, DAC 97]

Experimental Results: Mapping + Retiming + Pipelining
- 16 MCNC FSMs and ISCAS sequential benchmarks with 30 to 10,000 simple gates
- Flows compared
  - TurboSYN: resynthesis + retiming + pipelining
  - TurboMap: mapping + retiming
  - FlowMap + retiming: separate mapping with retiming
[Figure: bar charts of average clock period, average #LUTs, and average #flip-flops for the three flows.]

Retiming with Initial States
- Many sequential circuits have initial states
- Retiming will change the initial state!
- Equivalent initial state computation for (backward) retiming is NP-hard
[Figure: moving FFs forward across a gate f (FRT) sets the new state to y = f(x), so an initial state is guaranteed; moving FFs backward (BRT) asks whether there exists an x with f(x) = y, which is NP-complete.]

Conventional Approaches
[Flowchart: original circuit; compute a retiming; can an equivalent initial state be found? If yes, finish; if no, compute another retiming.]
- Initial-state computation for a given retiming is NP-hard
- The iteration may not find a feasible retiming solution
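The asymmetry between the two retiming directions can be seen on a single gate. The sketch below is an illustration under simplifying assumptions: it handles one gate in isolation, whereas the NP-hardness stated above concerns whole circuits, and the function names are hypothetical.

```python
from itertools import product

def forward_init(f, x):
    """Forward retiming: FFs holding x on the gate inputs move to the output,
    which must then hold f(x). This value always exists."""
    return f(*x)

def backward_init(f, n_inputs, y):
    """Backward retiming: an FF holding y on the gate output moves to the inputs.
    Search for input values x with f(x) = y; return None if there are none."""
    for x in product([0, 1], repeat=n_inputs):
        if f(*x) == y:
            return x
    return None

and2 = lambda a, b: a & b
const0 = lambda a, b: 0          # a gate whose output can never be 1

print(forward_init(and2, (1, 0)))    # new output state for inputs (1, 0)
print(backward_init(and2, 2, 1))     # some input state that produces 1
print(backward_init(const0, 2, 1))   # no equivalent state exists
```

Forward retiming only evaluates the gate, while backward retiming has to solve f(x) = y, and that equation may have no solution at all.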

Optimal Mapping with Forward Retiming
- Optimal mapping with forward retiming (FRT) in polynomial time, which guarantees initial state computation
  - TurboMap-frt [Cong & Wu, DAC 98]
- New flow for retiming
  - Step 1: move FFs backward as much as possible, to create large freedom for mapping + FRT
  - Step 2: optimal mapping + FRT, for clock period minimization with guaranteed equivalent initial states

Experimental Results of Optimal Mapping with Forward Retiming
- 18 benchmarks with 30 to 10,000 simple gates
- For 10 out of 18 TurboMap solutions, the initial states cannot be computed
- Flows compared
  - TurboMap-frt: mapping + forward retiming
  - TurboMap: mapping + retiming
  - FlowMap-frt: separate mapping with forward retiming
[Figure: bar chart of average clock period for the three flows.]

Experimental Results of Optimal Mapping with Forward Retiming (Cont'd)
[Figure: bar charts of average #5-LUTs and average #flip-flops for TurboMap-frt, TurboMap, and FlowMap-frt on the same 18 benchmarks.]

Technology Mapping for FPGAs with Heterogeneous LUTs
- Almost all recent FPGA architectures support heterogeneous LUTs
  - One size fits all is not good enough
- Examples
  - Xilinx XC4000: 1 CLB = 2 x 4-LUTs = 1 x 5-LUT
  - Lucent ORCA2C: 1 PFU = 4 x 4-LUTs = 2 x 5-LUTs = 1 x 6-LUT

XC4000 Block Diagram
[Figure: XC4000 CLB; 1 CLB = 2 x 4-LUTs = 1 x 5-LUT.]

ORCA2C Block Diagram
[Figure: ORCA2C PFU; 1 PFU = 4 x 4-LUTs = 2 x 5-LUTs = 1 x 6-LUT.]

Mapping for Heterogeneous FPGAs: Problem Formulation
- Problem
  - Heterogeneous LUTs have different delays and areas
  - Two types of heterogeneous LUT-based FPGAs
    - Fully configurable, with no fixed ratio between the different types of LUTs
    - Fixed combination of several different LUTs in a PLB (discussed later using Boolean matching)
- Objective: delay or area minimization

Mapping for Heterogeneous FPGAs: Solutions
- Compute multiple cuts at each node in the network
  - Network-flow computation
  - Cut enumeration
- Select the most appropriate LUT implementation
- Depth minimization
  - HeteroMap [Cong & Xu, DAC 98]: polynomial-time, delay-optimal for general networks
- Area minimization
  - Optimal for trees [Korupolu, Lee, Wong, DAC 98]
  - Heuristic for general networks [He & Rose, FPGA 94]

Heterogeneous vs. Homogeneous Mapping
- XC4000, with Delay(4-LUT) : Delay(5-LUT) = 1 : 1.5
[Figure: comparison ratio between FlowMap(5) and HeteroMap(5,4) on XC4000 series FPGAs for mapping delay, post-layout delay, and #PLB; chart annotations: -19%, -7%, +2%. [Cong & Xu, DAC 98]]

Comparison between Homogeneous and Heterogeneous FPGAs
- Delay(3-LUT) : Delay(4-LUT) : Delay(5-LUT) : Delay(6-LUT) = 1 : 1.3 : 1.7 : 2
[Figure: comparison ratio of mapping delay and memory-cell area for 3-LUT, 4-LUT, 5-LUT, and 6-LUT FPGAs and a heterogeneous-LUT FPGA.]

Mapping for FPGAs with Embedded Memory Blocks
- Embedded memory blocks (EMBs) can be used as
  - On-chip memories
  - Logic functions
[Figure: FLEX10K device block diagram.]

Problem Formulation
- Limited number of EMBs in one chip
- Configuration flexibility of EMBs
  - E.g. each EMB in FLEX10K has 2K cells and can be configured as a 2Kx1, 1Kx2, 512x4, or 256x8 memory
- Minimize delay and/or area
[Figure: an unmapped circuit and the corresponding mapped circuit built from LUTs and EMBs.]
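The four FLEX10K configurations quoted above translate directly into limits on the logic an EMB can absorb when used as a lookup memory: a depth of 2^n words gives n address inputs, and the word width gives the number of outputs. The sketch below assumes this ROM-style use of the EMB; the names `EMB_CONFIGS`, `address_bits`, and `fitting_configs` are hypothetical.

```python
# (depth, width) pairs for one 2K-cell EMB, as listed in the slide.
EMB_CONFIGS = [(2048, 1), (1024, 2), (512, 4), (256, 8)]

def address_bits(depth):
    """Number of address inputs of a memory with a power-of-two depth."""
    return depth.bit_length() - 1

def fitting_configs(n_inputs, n_outputs):
    """Configurations able to hold a logic block with the given I/O counts."""
    return [(d, w) for d, w in EMB_CONFIGS
            if n_inputs <= address_bits(d) and n_outputs <= w]

for depth, width in EMB_CONFIGS:
    print(f"{depth}x{width}: {address_bits(depth)} inputs, {width} outputs")

print(fitting_configs(9, 3))    # only the 512x4 configuration qualifies
print(fitting_configs(12, 1))   # too many inputs for any configuration
```

Every configuration uses the same 2048 cells, so the choice trades input count against output count; a mapper has to pick sub-circuits whose shape matches one of these options.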

Solutions
- EMB_Pack [Cong & Xu, FPGA 98]
  - Uses EMBs to minimize the circuit area
  - Maintains the circuit delay
  - Post-mapping processing and pre-mapping processing
- Smap [Wilton, FPGA 98]
  - Uses EMBs to minimize the circuit area
  - Post-mapping processing

Results of EMB_Pack
[Figure: comparison ratio of #LUTs and layout delay between CutMap [Cong & Hwang, FPGA 95] and CutMap followed by EMB_Pack, on MCNC benchmarks for the FLEX10K device family.]

Boolean Matching for Complex PLBs
- PLB: Programmable Logic Block
- Example: given a 9-input function f (complement bars are not preserved in this transcript):
  f = x1 x2 + x2 x3 + x2 x3 x8 + x5 x6 a + x5 x7 a + x4 x5 x6 x7 + x5 x6 x7 a + x5 x6 x7 a
  a = x0 x4 + x0 x4
- Target: Xilinx XC4K FPGAs
  - LUT covering + packing: 4 CLBs
  - Boolean matching: 1 CLB
- Advantage: significant area and delay reduction

Direct Mapping to Programmable Logic Blocks (PLBs)
[Figure: XC4K CLB with LUTs F, G, and H implementing f(x).]
- Benefits
  - May give significant area and delay reduction
- Difficulties
  - Need to perform Boolean matching: given an arbitrary function f and a PLB, determine whether the PLB can implement f

Example: Boolean Matching for XC4K CLB
- Functional decomposition forms
  - $f(X) = H(F(X_1),\ G(X_2))$
  - $f(X) = H(F(X_1),\ G(X_2),\ x)$
  - $f(X) = H(F(X_1, x),\ G(X_2),\ x)$
  - $f(X) = H(F(X_1, x),\ G(X_2, x),\ x)$
- Conditions: F and G input sizes $\le 4$
[Figure: XC4K CLB with LUTs F, G, and H implementing f(x).]

Boolean Matching Results for MCNC Benchmarks
- Experiment: enumerate all K-input functions
- An XC4K CLB can implement
  - 98% of 6-input functions
  - 88% of 7-input functions
[Table: per-circuit results for 5-input, 6-input, and 7-input functions (circuits include 9sym, alu, and des); the values are not preserved in this transcript.]

Application to Technology Mapping (for XC4000 and XC5200 FPGAs)
- Compared to LUT mapping results, the PLB mapping obtains
  - For XC5200 FPGAs: 7% depth reduction, 13% area reduction
  - For XC4000 FPGAs: 17% depth reduction, 3% area increase

Application to Architecture Evaluation (logic capability vs. silicon area)
[Figure: PLB structures compared: XC4K CLB, 40 memory cells (> 5 inputs); XC5K (> 4 or 5 inputs); XC4K(0,4,3), 24 memory cells (> 4 inputs); XC4K(3/4,4,2), 28 or 36 memory cells (> 4 inputs).]

Architecture Evaluation (for wide function implementation)
- Metric: # implementable functions / # memory cells for each type of PLB
[Figure: chart of the metric for 5-input and 6-input functions across the PLB types (4,4,MUX), (0,4,3), (3,4,2), (4,4,2), XC4K, and XC5K.]

Combined Decomposition with Mapping
[Figure: (a) initial 5-bounded network with inputs a to g; (b) best mapping after dmig: depth 3, area 5.]

Impact of Decomposition
[Figure: (a) initial 5-bounded network with inputs a to g; (c) optimal decomposition: depth 2, area 3.]

Problem Formulation
- Structural Gate Decomposition in a W-bounded network for K-LUT mapping (W-SGD/K)
- Goal: find a decomposition with minimum depth after mapping
- The W-SGD/K problem is NP-hard for $W \ge K \ge 5$ [Cong & Hwang, DAC 96]

Available Solutions
- Simultaneous decomposition and mapping for trees (Chortle-crf or Chortle-d algorithms)
- Combine bin-packing with flow computation (for computing the min-height cuts): DOGMA [Cong & Hwang, DAC 96]
- Use a mapping graph to encode all possible (or a large class of) decompositions, and compute a mapping solution on it: SLDMap [Chen & Cong, FPGA 01]

Mapping Graph Definition
- A modified AND2/INV network that encodes a set of circuit structures in a single graph [Lehman et al., ICCAD 95]
  - Choice nodes (logical equivalence)
  - Ugates (two choice nodes and fanins)
  - Cycles
- Reduction
  - Unique choice node
  - Unique INV and AND2 nodes

Mapping Graph Example
[Figure: mapping graph example, shown over several successive build steps.]

Mapping Graph Example (Cont'd)
[Figure: final build step of the mapping graph example.]

Overview of SLDMap [Chen & Cong, FPGA 01]
- Initial W-bounded network generation
- Mapping graph construction
- Depth-optimal labeling
- Label relaxation and area minimization
- Decomposition selection
- Fixed decomposition mapping

Experimental Flow
- MCNC circuit set, UCLA RASP package, CUDD
- dmig [Chen et al., IEEE Design & Test 92]
- dogma [Cong, DAC 96]
[Flowchart elements: initial W-bounded network; dmig, dogma, or sldmap; cutmap; greedy_pack; Xilinx Foundation 3.1 P&R.]

Depth/Area Comparison
[Figure: depth and area comparison of dmig, dogma, and sldmap.]

Post-layout Delay
[Figure: post-layout delay (axis up to 100 ns) of dogma and sldmap for k2 (V), 9sym (S), i3 (V), x1 (S), and C499 (3K). Legend: V = Virtex, S = Spartan, 3K = XC3K.]

Outline
- Early results
- Recent advances
  - Synthesis and optimization for sequential circuits
  - Synthesis for heterogeneous FPGAs
  - Synthesis for FPGAs with embedded memory blocks
  - Use of Boolean matching instead of pattern matching
  - Combined decomposition with mapping
  - UCLA RASP FPGA synthesis system
- New challenges

UCLA RASP Synthesis System
[Flowchart elements: HDL design; EDIF netlist; internal netlist; LUT mapping engine; LUT netlist; PLB mapping engine; vendor-specific netlist (Xilinx, Altera, ORCA); placement; routing; chip programming information.]

Objective 1: A Flexible and Efficient FPGA Synthesis Engine
- Delay-optimal mapping: FlowMap/HeteroMap, FlowSYN, TurboMap, TurboSYN
- PLB mapping: PDDmap, PDDSYN, Match-4K/3K, EAB-pack
- Delay/area trade-off: FlowMap-r, CutMap, CutSyn
- Area-optimal mapping: DF-map, CutMap-E, MarkMap
- Gate decomposition for mapping: DMIG, DOGMA, SLDmap

Objective 2: FPGA Architecture Evaluation
[Figure: architecture evaluation diagram.]

Outline
- Early results
- Recent advances
- New challenges
  - Integration of synthesis and layout
  - Field-programmable system-on-a-chip

Logic vs. Interconnect Delays
- Example: Altera FPGA (EPF8282A, A-2 speed grade, Altera Data Book 98)
  - LE delay: 2.4 ns
  - Connection between LEs in the same LAB: 0.5 ns
  - Connection between LEs in the same row, different LABs: 4.7 ns
  - Connection between LEs in different rows: 7.2 ns
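The figures above show why placement matters to synthesis: the same logic depth can yield very different path delays depending on where the LEs land. A small worked calculation, using only the four delay values quoted in the slide; the additive path model and the names below are assumptions made for illustration.

```python
LE_DELAY_NS = 2.4
WIRE_DELAY_NS = {"same_lab": 0.5, "same_row": 4.7, "other_row": 7.2}

def path_delay_ns(hops):
    """Delay of a path through len(hops) + 1 LEs joined by the given connections."""
    return LE_DELAY_NS * (len(hops) + 1) + sum(WIRE_DELAY_NS[h] for h in hops)

# Three LEs in series, placed in the best and the worst way.
local = path_delay_ns(["same_lab", "same_lab"])
spread = path_delay_ns(["other_row", "other_row"])
print(f"all in one LAB:     {local:.1f} ns")
print(f"all in other rows:  {spread:.1f} ns")
```

With identical logic depth, the spread-out placement comes to 21.6 ns against 8.2 ns for the local one, and in the spread case two thirds of the delay is interconnect.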

Layout-Driven Synthesis
- Iterative design flow
  - Construct-by-correction
  - Needs to guarantee convergence
- Concurrent design flow
  - Correct-by-construction
  - Needs to handle design abstraction, constraint propagation, and design refinement
- Best candidate: a combination of the iterative and concurrent design approaches
  - Concurrent synthesis, layout planning, and solution refinement
  - A limited number of iterations within the same or adjacent levels to correct unacceptable estimation errors

Layout-Driven Synthesis Flow of ADT
- HDL design, or netlist from a third-party synthesis tool
- Global logic optimization and interconnect planning
- Placement-driven synthesis and architecture embedding
- FPGA vendor P&R tool
Source: Courtesy of Aplus Design Technologies, Inc. (ADT)

Use of IP Blocks
- Classification of IP blocks
  - Soft
  - Hard
  - Firm
- Challenges
  - IP representation and characterization
  - Interface with synthesis tools
  - IP protection

Field-Programmable System-on-a-Chip (FPSOC)
- General-purpose FPSOC: processor, memory, programmable logic
- Application-specific FPSOC: processor, memory, ASIC, programmable logic

Design Challenges
- Integration of
  - Embedded operating systems
  - Compilers
  - Synthesis tools
  - Layout tools
- Need for architecture evaluation
  - Explore and choose the best embedded FPGA architecture (for the given application domain)

ArchEvaluator: ADT's PLD Architecture Evaluation Tool
- Evaluation of programmable logic blocks
  - Sizes
  - Configurations
- Evaluation of on-chip hierarchy
  - Number of levels
  - Sizes, configurations, and delays at each level
- Evaluation of heterogeneous architectures
  - Multiple sizes and/or configurations of the same type of logic block
  - Multiple types of logic blocks
  - Different kinds of resources on the same chip
- Embedded array configuration
  - Array aspect ratio
  - Single vs. multiple arrays
Source: Courtesy of Aplus Design Technologies, Inc. (ADT)

Conclusions
- Synthesis and technology mapping for homogeneous LUTs is a well-understood problem (with some room left for area minimization)
- Recent advances in FPGA synthesis enable many new architecture innovations
  - Embedded memory blocks
  - Heterogeneous LUT-based FPGAs
  - Architectures for efficient retiming and pipelining
- New FPGA synthesis tools and algorithms have to
  - Consider layout design
  - Support efficient IP re-use
- Field-programmable logic will be an important component of system-on-a-chip designs

Acknowledgments
- Contributions from
  - Current and former graduate students from my group: Michael Chen (UCLA), Eugene Ding (Agere), Yean-Yow Hwang (Altera), John Peck (AMD), Chang Wu (ADT), and Songjie Xu (ADT)
  - Other colleagues: Peichen Pan (ADT)
- Support from the National Science Foundation
- Support from Actel, Altera, Lucent Technologies, Quickturn, Vantis/Lattice, and Xilinx under the California MICRO Program

Further Information
- Visit [URL omitted in source] for
  - An updated copy of the slides of this talk
  - A survey/tutorial paper on FPGA synthesis: Cong and Ding, ACM TODAES, 1996
  - Recent research publications and software on FPGA synthesis from UCLA

Speaker Bio
Jason Cong received his B.S. degree in computer science from Peking University in 1985, and his M.S. and Ph.D. degrees in computer science from the University of Illinois at Urbana-Champaign in 1987 and 1990, respectively. He is currently a Professor and Co-Director of the VLSI CAD Laboratory in the Computer Science Department of the University of California, Los Angeles. His research interests include layout synthesis and logic synthesis for high-performance, low-power VLSI circuits, design and optimization of high-speed VLSI interconnects, and synthesis and architecture design for FPGAs. Dr. Cong is a Fellow of the IEEE and serves as a consultant or advisory board member for several semiconductor and EDA companies. In 1998, Dr. Cong founded Aplus Design Technologies, Inc., which provides layout-driven synthesis solutions and architecture evaluation solutions for both stand-alone FPGAs/CPLDs and embedded FPGAs for SOC designs.


More information

Improving FPGA Performance with a S44 LUT Structure

Improving FPGA Performance with a S44 LUT Structure Improving FPGA Performance with a S44 LUT Structure Wenyi Feng, Jonathan Greene Microsemi Corporation SOC Products Group, San Jose {wenyi.feng, jonathan.greene}@microsemi.com ABSTRACT FPGA performance

More information

OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS

OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS IMPLEMENTATION OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS 1 G. Sowmya Bala 2 A. Rama Krishna 1 PG student, Dept. of ECM. K.L.University, Vaddeswaram, A.P, India, 2 Assistant Professor,

More information

288 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 12, NO. 3, MARCH 2004

288 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 12, NO. 3, MARCH 2004 288 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 12, NO. 3, MARCH 2004 The Effect of LUT and Cluster Size on Deep-Submicron FPGA Performance and Density Elias Ahmed and Jonathan

More information

Interconnect Planning with Local Area Constrained Retiming

Interconnect Planning with Local Area Constrained Retiming Interconnect Planning with Local Area Constrained Retiming Ruibing Lu and Cheng-Kok Koh School of Electrical and Computer Engineering Purdue University,West Lafayette, IN, 47907, USA {lur, chengkok}@ecn.purdue.edu

More information

A Method to Decompose Multiple-Output Logic Functions

A Method to Decompose Multiple-Output Logic Functions 27. A Method to Decompose Multiple-Output Logic Functions Tsutomu Sasao Kyushu Institute of Technology 68-4 Kawazu Iizuka 82-852, Japan Munehiro Matsuura Kyushu Institute of Technology 68-4 Kawazu Iizuka

More information

Power-Driven Flip-Flop p Merging and Relocation. Shao-Huan Wang Yu-Yi Liang Tien-Yu Kuo Wai-Kei Tsing Hua University

Power-Driven Flip-Flop p Merging and Relocation. Shao-Huan Wang Yu-Yi Liang Tien-Yu Kuo Wai-Kei Tsing Hua University Power-Driven Flip-Flop p Merging g and Relocation Shao-Huan Wang Yu-Yi Liang Tien-Yu Kuo Wai-Kei Mak @National Tsing Hua University Outline Introduction Problem Formulation Algorithms Experimental Results

More information

DC Ultra. Concurrent Timing, Area, Power and Test Optimization. Overview

DC Ultra. Concurrent Timing, Area, Power and Test Optimization. Overview DATASHEET DC Ultra Concurrent Timing, Area, Power and Test Optimization DC Ultra RTL synthesis solution enables users to meet today s design challenges with concurrent optimization of timing, area, power

More information

Post-Routing Layer Assignment for Double Patterning

Post-Routing Layer Assignment for Double Patterning Post-Routing Layer Assignment for Double Patterning Jian Sun 1, Yinghai Lu 2, Hai Zhou 1,2 and Xuan Zeng 1 1 Micro-Electronics Dept. Fudan University, China 2 Electrical Engineering and Computer Science

More information

RELATED WORK Integrated circuits and programmable devices

RELATED WORK Integrated circuits and programmable devices Chapter 2 RELATED WORK 2.1. Integrated circuits and programmable devices 2.1.1. Introduction By the late 1940s the first transistor was created as a point-contact device formed from germanium. Such an

More information

Lecture 2: Basic FPGA Fabric. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 2: Basic FPGA Fabric. James C. Hoe Department of ECE Carnegie Mellon University 18 643 Lecture 2: Basic FPGA Fabric James. Hoe Department of EE arnegie Mellon University 18 643 F17 L02 S1, James. Hoe, MU/EE/ALM, 2017 Housekeeping Your goal today: know enough to build a basic FPGA

More information

ESE534: Computer Organization. Previously. Today. Previously. Today. Preclass 1. Instruction Space Modeling

ESE534: Computer Organization. Previously. Today. Previously. Today. Preclass 1. Instruction Space Modeling ESE534: Computer Organization Previously Instruction Space Modeling Day 15: March 24, 2014 Empirical Comparisons Previously Programmable compute blocks LUTs, ALUs, PLAs Today What if we just built a custom

More information

Timing with Virtual Signal Synchronization for Circuit Performance and Netlist Security

Timing with Virtual Signal Synchronization for Circuit Performance and Netlist Security Timing with Virtual Signal Synchronization for Circuit Performance and Netlist Security Grace Li Zhang, Bing Li, Ulf Schlichtmann Chair of Electronic Design Automation Technical University of Munich (TUM)

More information

Automated Design for Current-Mode Pass-Transistor Logic Blocks

Automated Design for Current-Mode Pass-Transistor Logic Blocks Automated Design for Current-Mode Pass-Transistor Logic Blocks Matthew David Pierson Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2007-70

More information

Raising FPGA Logic Density Through Synthesis-Inspired Architecture

Raising FPGA Logic Density Through Synthesis-Inspired Architecture 1 Raising FPGA Logic Density Through ynthesis-inspired Architecture Jason H. Anderson, Member, IEEE, Qiang Wang, Member, IEEE, and Chirag Ravishankar, tudent Member, IEEE Abstract We leverage properties

More information

High Performance Carry Chains for FPGAs

High Performance Carry Chains for FPGAs High Performance Carry Chains for FPGAs Matthew M. Hosler Department of Electrical and Computer Engineering Northwestern University Abstract Carry chains are an important consideration for most computations,

More information

Novel Pulsed-Latch Replacement Based on Time Borrowing and Spiral Clustering

Novel Pulsed-Latch Replacement Based on Time Borrowing and Spiral Clustering Novel Pulsed-Latch Replacement Based on Time Borrowing and Spiral Clustering NCTU CHIH-LONG CHANG IRIS HUI-RU JIANG YU-MING YANG EVAN YU-WEN TSAI AKI SHENG-HUA CHEN IRIS Lab National Chiao Tung University

More information

Chapter 7 Memory and Programmable Logic

Chapter 7 Memory and Programmable Logic EEA091 - Digital Logic 數位邏輯 Chapter 7 Memory and Programmable Logic 吳俊興國立高雄大學資訊工程學系 2006 Chapter 7 Memory and Programmable Logic 7-1 Introduction 7-2 Random-Access Memory 7-3 Memory Decoding 7-4 Error

More information
