ISPD 2017 Contest Clock-Aware FPGA Placement

1 ISPD 2017 Contest Clock-Aware FPGA Placement Stephen Yang, Chandra Mulpuri, Sainath Reddy, Meghraj Kalase, Srinivasan Dasasathyan, Mehrdad E. Dehkordi, Marvin Tom, Rajat Aggarwal

2 Acknowledgement
  - Xilinx Vivado Management Team
  - Support from Dr. Sudip Nag and Dr. Salil Raje
  - Support from Xilinx Lab

3 Outline
  - Background
  - Top-5 Team Presentations
  - Benchmarking Results
  - Award Ceremony

4 Last Year: Routability-Driven FPGA Placement
  - First FPGA-related contest
  - Latest FPGA architecture
  - Vivado: industrial flow for evaluation
  - Academic benchmark format: bookshelf
  - Focus: FPGA legalization rules and routing congestion

5 This Year: Clock-Aware FPGA Placement
  - Continued effort on the FPGA placement problem
  - Clock legalization: key constraint in FPGA placement
  - Wirelength as the primary metric
  - Reduced difficulty on routability, reduced runtime factor

6 Contest Timeline
  - Oct 2016: Problem definition and contest planning
  - Nov 2016: Contest announcement
  - Dec 12, 2016: Sample benchmarks ready
  - Jan 15, 2017: Registration deadline
  - Feb 3, 2017: Evaluation flow ready
  - Feb 15, 2017: Alpha submission
  - Mar 9, 2017: Final submission
  - Mar 10-12, 2017: Benchmarking
  - Mar 22, 2017: Announce winners at ISPD

7 Registration: 13 Teams
  Team          Affiliation                                          Region
  VDAplacer     National Chiao Tung University                       Asia
  UTPlaceF2.0   University of Texas at Austin                        North America
  WicilPlacer   University of Wisconsin-Madison                      North America
  RippleFPGA    Chinese University of Hong Kong                      Asia
  Uni-Placer    Ulsan National Institute of Science and Technology   Asia
  CECA_Placer   Peking University                                    Asia
  NTUfplace     National Taiwan University                           Asia
  GPlace        University of Guelph                                 North America
  BMTIplacer    Beijing Microelectronics and Technology Institute    Asia
  AggiePlace    Texas A&M University                                 North America
  UFRGSPlace    Universidade Federal do Rio Grande do Sul            South America
  POCA Tool     Politecnico di Torino, Torino, Italy                 Europe
  Kapees        Indian Institute of Technology, Guwahati             Asia

8 Final Submission: 9 Teams
  Team          Affiliation                                          Region
  VDAplacer     National Chiao Tung University                       Asia
  UTPlaceF2.0   University of Texas at Austin                        North America
  WicilPlacer   University of Wisconsin-Madison                      North America
  RippleFPGA    Chinese University of Hong Kong                      Asia
  CECA_Placer   Peking University                                    Asia
  NTUfplace     National Taiwan University                           Asia
  GPlace        University of Guelph                                 North America
  BMTIplacer    Beijing Microelectronics and Technology Institute    Asia
  UFRGSPlace    Universidade Federal do Rio Grande do Sul            South America
  Congratulations!

9 Target FPGA: Xilinx UltraScale VU095
  - 20nm technology
  - 1.2M logic cells

10 Clock Routing Architecture

11 Clock Region Rule
  - 24 distinct clocks per region

12 Half Column Rule
  - 12 distinct clocks per half column

13 (Hidden) Benchmark Statistics
  Design     #LUTs        #FFs         #BRAMs       #DSPs       #I/O   #Clocks
  Design1    215K (40%)   236K (22%)   170 (10%)    75 (10%)
  Design2    215K (40%)   236K (22%)   170 (10%)    75 (10%)
  Design3    242K (45%)   270K (25%)   255 (15%)    112 (15%)
  Design4    268K (50%)   300K (28%)   340 (20%)    150 (20%)
  Design5    295K (55%)   325K (30%)   425 (25%)    187 (25%)
  Design6    322K (60%)   354K (33%)   510 (30%)    225 (30%)
  Design7    350K (65%)   384K (36%)   595 (35%)    262 (35%)
  Design8    376K (70%)   414K (38%)   680 (40%)    300 (40%)
  Design9    392K (73%)   431K (40%)   765 (45%)    337 (45%)
  Design10   408K (76%)   449K (42%)   850 (50%)    375 (50%)
  Design11   424K (79%)   450K (43%)   900 (53%)    397 (53%)
  Design12   440K (82%)   484K (45%)   950 (56%)    420 (56%)
  Design13   456K (85%)   503K (47%)   1000 (59%)   442 (59%)
  Largest: 1.0M instances, 57 clocks

14 Placer Evaluation Flow
  - Design (bookshelf) -> Contest Placer -> .pl file
  - Design (Xilinx DB) -> Vivado: Load Design -> Read Placement -> Clock and Legality Check -> Routing -> Routed WL

15 Evaluation Metrics and Ranking
  - Score = Routed-WL * (1 + Runtime_Factor)
  - Runtime factor: 20% runtime -> 1% QoR, bounded by +/- 2.5%
  - Failures: Routing-Failures > Legalization-Failures > Placer-Failures
  - Ranking per design: 1, 2, 3, ..., n
  - Sum-of-the-rankings of each team
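The scoring rule can be sketched as follows. The slide does not say which runtime the 20% is measured against, so `reference_runtime` is a hypothetical parameter here (for example a median over teams), and failure handling is left out entirely; treat this as an illustration of the arithmetic, not the contest's evaluation script.

```python
import math


def runtime_factor(runtime, reference_runtime, cap=0.025):
    """Map relative runtime difference to a QoR factor: 20% runtime -> 1%.

    reference_runtime is an assumption; the slide does not define it.
    The result is clamped to +/- cap (2.5% on the slide).
    """
    relative = (runtime - reference_runtime) / reference_runtime
    factor = relative * (0.01 / 0.20)
    return max(-cap, min(cap, factor))


def score(routed_wl, runtime, reference_runtime):
    """Score = Routed-WL * (1 + Runtime_Factor); lower is better."""
    return routed_wl * (1.0 + runtime_factor(runtime, reference_runtime))


def sum_of_rankings(scores_by_design):
    """scores_by_design: {design: {team: score}}. Returns {team: summed rank}."""
    totals = {}
    for team_scores in scores_by_design.values():
        ordered = sorted(team_scores, key=team_scores.get)
        for rank, team in enumerate(ordered, start=1):
            totals[team] = totals.get(team, 0) + rank
    return totals
```

A placer that is 20% slower than the reference pays a 1% wirelength penalty, and no runtime difference can move the score by more than 2.5% in either direction.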

16 Top-5 Team Presentation

17 Top-5 Teams (In Alphabetical Order) GPlace, University of Guelph, Ziad Abuowaimer NTUfplace, National Taiwan University, Yun-Chih Kuo RippleFPGA, Chinese University of Hong Kong, Gengjie Chen UTPlaceF2.0, University of Texas, Austin, Wuxi Li VDAplacer, National Chiao Tung University, Chen Chen

19 GPlace 2.0: Clock-Aware Placement Tool for UltraScale FPGAs
  Ziad Abuowaimer, Shawki Areibi, Anthony Vannelli, Gary Grewal
  University of Guelph, March 22, 2017

20 GPlace 2.0 Flow
  - Preplacement
  - Global Placement (WL-Driven): Star+ Solver; Site & Clock Legalization
  - Congestion Estimation: Adjust Global Routing Grid; NCTU-gr 2.0; LUT inflation
  - Clock-Signals Partitioning: Clock-Loads Center of Gravity; Bbox of Center of Gravity; Clock-Loads Assignment
  - Global Placement (Congestion-Driven): Star+ Solver; Site & Clock Legalization
  - Decision: Overlap Bbox of Clock Signals <= 24 (YES / NO)
  - Output: placement.pl

21 Preplacement
  - Pin-Propagation Preplacement (similar to GPlace 1.0)

23 Global Placement (WL-Driven): Star+ Solver
  - Analytical Placement (Star+ and Jacobi)
  - [Equations not recoverable from the extracted slide]

24 Global Placement (WL-Driven): FF Legalization
  - Objective is WL minimization.
  - Use bipartition legalization in three levels:
    1. Clock-Region Bipartition: partition the FPGA into clock regions and recursively bipartition FFs into those clock regions.
    2. Half-Column Bipartition: partition each clock region into half-columns and recursively bipartition FFs into those half-columns.
    3. Site Bipartition: partition each half-column into sites and recursively bipartition FFs into those sites.
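One level of this recursive bipartitioning might look like the sketch below. It is one-dimensional, checks site capacity only (clock and control-set capacity are omitted), and the cut rule, splitting at the midpoint between the two halves and then clamping to capacity, is an assumption rather than GPlace's actual criterion. The names `cells` and `bins` are illustrative.

```python
def bipartition_assign(cells, bins):
    """Recursively split cells between two halves of an ordered list of bins.

    cells: list of (name, x) from global placement.
    bins:  list of (bin_id, x_center, capacity), sorted by x_center.
    Returns {name: bin_id}. Only site capacity is enforced in this sketch.
    """
    if not cells:
        return {}
    total_capacity = sum(cap for _, _, cap in bins)
    if len(cells) > total_capacity:
        raise ValueError("not enough capacity for all cells")
    if len(bins) == 1:
        return {name: bins[0][0] for name, _ in cells}

    mid = len(bins) // 2
    left, right = bins[:mid], bins[mid:]
    cap_left = sum(cap for _, _, cap in left)
    cap_right = total_capacity - cap_left

    # Geometric cut between the two halves, so cells stay near their
    # global-placement position when capacity allows.
    cut = (left[-1][1] + right[0][1]) / 2.0
    ordered = sorted(cells, key=lambda c: c[1])
    n_left = sum(1 for _, x in ordered if x < cut)
    # Clamp so that neither half exceeds its capacity.
    n_left = max(len(ordered) - cap_right, min(cap_left, n_left))

    result = bipartition_assign(ordered[:n_left], left)
    result.update(bipartition_assign(ordered[n_left:], right))
    return result
```

Running the same routine over clock regions, then half-columns within a region, then sites within a half-column would give the three levels listed on the slide.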

25 FF Legalization, Level 1: Clock-Region Bipartition
  - Create a recursive bipartitioning tree data structure for the 40 clock regions.
  - Each node in the tree contains: site capacity, clock capacity.
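A node of such a tree could be represented roughly as below. The class and field names are hypothetical; the slides only state that a node carries a site capacity and a clock capacity, and that the clock-region tree records the number of clocks and their IDs.

```python
from dataclasses import dataclass, field


@dataclass
class RegionNode:
    """Hypothetical tree node: capacities plus what has been placed so far."""
    site_capacity: int
    clock_capacity: int
    used_sites: int = 0
    clocks: set = field(default_factory=set)      # clock IDs present in this node
    children: list = field(default_factory=list)  # two children for a bipartition

    def can_accept(self, clock_id, n_cells=1):
        """True if n_cells driven by clock_id fit without breaking either capacity."""
        sites_ok = self.used_sites + n_cells <= self.site_capacity
        clocks_ok = clock_id in self.clocks or len(self.clocks) < self.clock_capacity
        return sites_ok and clocks_ok

    def accept(self, clock_id, n_cells=1):
        """Record the placement, or raise if it would violate a capacity."""
        if not self.can_accept(clock_id, n_cells):
            raise ValueError("capacity exceeded")
        self.used_sites += n_cells
        self.clocks.add(clock_id)
```

Keeping the clock set on each node is what lets later DSP and BRAM legalization reuse and update the same tree.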

26 FF Legalization, Level 1: tree structures
  - One tree structure maintains site and control-set capacity constraints (levels: #Slices, #Groups, #Sub-groups, #FFs).
  - One tree structure maintains clock-signal capacity constraints.
  - [Figure: example trees with nodes RG0, CR0/CR1, CE0/CE1 and RG0, CS0/CS1 holding 9 FFs and 17 FFs]

27 FF Legalization, Level 1: Clock-Region Bipartition
  - FPGA-Clock-Region-Tree: a tree data structure that stores the number of clocks and the clock IDs at each node after FF legalization level 1.

28 FF Legalization, Level 2: Half-Column Bipartition
  - Create a recursive bipartitioning tree data structure of the half-columns within each clock region. (Only 3 trees are actually needed, since there are 3 different patterns.)
  - Each node in the tree contains: site capacity, clock capacity.

29 FF Legalization, Level 2: tree structures
  - [Figure: clock-capacity tree and site & control-set capacity tree]

30 FF Legalization, Level 2: Half-Column Bipartition
  - FPGA-Half-Column-Tree: a tree data structure that stores the number of clocks and the clock IDs at each node after FF legalization level 2.

31 FF Legalization, Level 3: Site Bipartition
  - Create a recursive bipartitioning tree data structure of the sites within each half-column.
  - Each node in the tree contains: site capacity.
  - [Figure: site & control-set capacity tree]

32 Global Placement (WL-Driven): DSP Legalization
  - Similar to FF legalization but without control-set constraints.
  - Use bipartition legalization in three levels:
    1. Clock-Region Bipartition: partition the FPGA into clock regions and recursively bipartition DSPs into those clock regions. (Use and update FPGA-Clock-Region-Tree.)
    2. Half-Column Bipartition: partition each clock region into half-columns and recursively bipartition DSPs into those half-columns. (Use and update FPGA-Half-Column-Tree.)
    3. Site Bipartition: partition each half-column into sites and recursively bipartition DSPs into those sites.

33 Global Placement (WL-Driven): BRAM Legalization
  - Similar to DSP legalization.
  - Use bipartition legalization in three levels:
    1. Clock-Region Bipartition: partition the FPGA into clock regions and recursively bipartition BRAMs into those clock regions. (Use and update FPGA-Clock-Region-Tree.)
    2. Half-Column Bipartition: partition each clock region into half-columns and recursively bipartition BRAMs into those half-columns. (Use and update FPGA-Half-Column-Tree.)
    3. Site Bipartition: partition each half-column into sites and recursively bipartition BRAMs into those sites.

34 Congestion Estimation
  - Adjust the global routing grid capacity.
  - Run the NCTU-gr 2.0 global router to get the congestion estimation.
  - Inflate LUTs based on both the number of pins and the congestion value. The ratio is based on the congestion value. [Inflation formula not recoverable from the extracted slide]

36 Clock-Signals Partitioning: Clock-Loads Center of Gravity
  - Calculate the center of gravity for each clock signal based on the positions of its clock loads. (Ignore the two global clock signals ControlSig0 and ControlSig1.)

37 Clock-Signals Partitioning: Bbox of Center of Gravity
  - Find a bounding box that contains all center-of-gravity points.

38 Clock-Signals Partitioning: Clock-Loads Assignment
  - Assign each clock's loads to the closest corner based on the distance of its center of gravity to that corner.
  - Limit each partition to at most 20 different clocks.

39 Clock-Signals Partitioning: Clock-Loads Assignment
  - Place each partition at the corresponding FPGA corner.
  - Place the inflated LUTs in the middle of the FPGA.
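The partitioning steps above can be sketched as follows. The order in which clocks pick corners and the fallback to the next-closest corner when a partition is full are assumptions, since the slides only give the 20-clock limit. The caller is expected to have already dropped the two global control signals.

```python
from math import hypot


def partition_clocks(clock_loads, max_clocks=20):
    """Assign each clock to one of four bounding-box corners.

    clock_loads: {clock: [(x, y), ...]} with at least one load per clock.
    Returns {clock: corner_index}, corners ordered
    (xmin, ymin), (xmin, ymax), (xmax, ymin), (xmax, ymax).
    """
    if not clock_loads:
        return {}
    if len(clock_loads) > 4 * max_clocks:
        raise ValueError("more clocks than four partitions can hold")

    # Center of gravity of each clock's loads.
    cog = {
        clk: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for clk, pts in clock_loads.items()
    }
    # Bounding box of all centers of gravity.
    xs = [p[0] for p in cog.values()]
    ys = [p[1] for p in cog.values()]
    corners = [(min(xs), min(ys)), (min(xs), max(ys)),
               (max(xs), min(ys)), (max(xs), max(ys))]

    def dist(clk, i):
        return hypot(cog[clk][0] - corners[i][0], cog[clk][1] - corners[i][1])

    counts = [0] * 4
    assignment = {}
    # Assumption: clocks nearest to some corner choose first.
    for clk in sorted(cog, key=lambda c: min(dist(c, i) for i in range(4))):
        for i in sorted(range(4), key=lambda i: dist(clk, i)):
            if counts[i] < max_clocks:
                assignment[clk] = i
                counts[i] += 1
                break
    return assignment
```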

40 Global Placement (Congestion-Driven)
  - Similar to Global Placement (WL-Driven) but with inflated LUTs.

45 NTUfplace: Clock-Aware FPGA Placement
  Yun-Chih Kuo, Chau-Chin Huang, Shih-Chun Chen, Chun-Han Chiang, Yao-Wen Chang, and Sy-Yen Kuo
  National Taiwan University, Mar. 22, 2017

46 Outline
  - Introduction
  - Proposed Approach
  - Experimental Results
  - Demo

48 Analytical Placement Formulation
  - Given the chip region and block dimensions, determine (x, y) for all movable blocks:
      \min W(x, y) \quad \text{s.t.} \quad D_b(x, y) \le M_b \ \text{for each bin } b
    where W(x, y) is the wirelength function, D_b is the density for bin b, and M_b is the max density for bin b (density = A_block / A_bin).
  - Relax the constraints into the objective function (penalty):
      \min W(x, y) + \lambda \sum_b \left( \max( D_b(x, y) - M_b, 0 ) \right)^2
  - Apply differentiable wirelength and density models.
  - Use the gradient method to solve the optimization problem.
  - Increase \lambda gradually to meet the density constraints.
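As a small numeric illustration of the relaxed objective, the sketch below evaluates the penalized cost for given bin densities. It assumes the penalty is summed over bins, and the function and argument names are illustrative only; a real placer would compute W and D_b from differentiable models and follow their gradients.

```python
def penalized_cost(wirelength, bin_density, max_density, lam):
    """W + lambda * sum_b max(D_b - M_b, 0)^2 for precomputed W and densities.

    bin_density and max_density are equal-length sequences, one entry per bin.
    Bins at or below their max density contribute nothing to the penalty.
    """
    overflow = sum(max(d - m, 0.0) ** 2 for d, m in zip(bin_density, max_density))
    return wirelength + lam * overflow


# Raising lambda makes the same overflow progressively more expensive,
# which is what pushes blocks out of over-dense bins over the iterations.
costs = [penalized_cost(100.0, [0.5, 1.2], [1.0, 1.0], lam) for lam in (1.0, 10.0, 100.0)]
```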

49 Differentiable Wirelength and Density Models
  - Log-sum-exp wirelength model [Naylor et al., 2001]: an effective smooth and differentiable function for HPWL approximation; this model achieves exact HPWL when γ → 0.
  - Bell-shaped density model [Kahng et al., ICCAD 04]
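The log-sum-exp model approximates each net's half-perimeter wirelength with a smooth function. The sketch below uses the standard form γ(log Σ exp(x/γ) + log Σ exp(-x/γ)) per axis, written with a max shift for numerical stability; the function names are illustrative and not taken from NTUfplace.

```python
import math


def lse_max(values, gamma):
    """Smooth upper bound on max(values); tight as gamma -> 0."""
    m = max(values)
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def lse_span(coords, gamma):
    """Smooth approximation of max(coords) - min(coords)."""
    return lse_max(coords, gamma) + lse_max([-c for c in coords], gamma)


def lse_wirelength(pins, gamma):
    """Log-sum-exp wirelength of one net; pins is a list of (x, y)."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return lse_span(xs, gamma) + lse_span(ys, gamma)


def hpwl(pins):
    """Exact half-perimeter wirelength of one net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))
```

The smooth value never falls below the exact HPWL, and the gap shrinks as γ is reduced, at the cost of a less smooth gradient.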

50 Multilevel Global Placement
  - Cluster the blocks based on connectivity/size to reduce the problem size.
  - Initial placement.
  - Iteratively decluster the clusters and further refine the placement.
  - [Figure: clustering, initial placement, declustering & refinement; clustered blocks inside the chip boundary]

52 Clock-Aware Multilevel Global Placement
  - Cluster blocks with the clock constraint.
  - Initial placement.
  - Declustering & refinement.
  - [Figure: clustered blocks inside the chip boundary; blocks within the same clock domain]

53 Mismatch between GP and LG
  - The analytical model for global placement gives continuous solutions, while legalization pulls blocks to discrete and scattered legal locations.
  - Displacement of blocks is large.
  - [Figure: I/O block, DSP, CLB, and RAM columns]

54 Heterogeneous Cost Function
  - Add the cost of a complex-block-alignment function G(x), a smoothed cost around the DSP columns.
  - Therefore, we can solve this with the gradient method:
      \min W(x, y) + \lambda_1 \sum_b \left( \max( D_b(x, y) - M_b, 0 ) \right)^2 + \lambda_2 G(x)

55 Clocking Resource Constraint
  - We formulate the clocking resource constraint in clock regions as a cost in the placement stages.
  - Therefore, we can resolve the clocking resource constraint by moving blocks out of resource-lacking regions.

57 Experimental Results
  - We ran our program on an Intel Xeon E CPU with 32GB memory.
  - [Table: Design, #nodes, #nets, Routed-WL, Runtime for five clk_design benchmarks; values not recoverable from the extracted slide]

59 Demo 59

60 Thank You! 60

62 CUHK - RippleFPGA Gengjie Chen, Chak-Wa Pui, Evangeline F. Y. Young, Bei Yu March 22, 2017

63 Outline
  - Background
  - Our Flow
  - How We Handle Clock Rules
    - Clock region
    - Half column

64 Background: Heterogeneous FPGA
  - [Figure: I/O, CLB, RAM, DSP, switch box]

65 Background: Configurable Logic Block (CLB)
  - A CLB contains Basic Logic Elements BLE 0 to BLE 7, with LUT 0 to LUT 15 and FF 0 to FF 15.
  - Upper half (BLE 0 to BLE 3) uses CK0, SR0, CE0/1.
  - Lower half (BLE 4 to BLE 7) uses CK1, SR1, CE2/3.

67 Flows in Previous Work
  - Conventional flow (pack-place): flat netlist -> packing (LUT/FF -> BLE -> CLB) -> placement -> placed design
  - Packing based on physical information (place-pack-place): Un/DoPack [ICCAD 06], HDPack [FPL 07], UTPlaceF [ICCAD 16], GPlace-pack [ICCAD 16]
  - Flat placement followed by legalization (place-pack): GPlace-flat [ICCAD 16]

68 Our Flow
  - flat netlist -> flat GP -> soft BLE packing -> BLE GP -> CLB physical packing (LG) -> two-level DP -> slot assignment in CLB -> placed design
  - Granularity moves from LUT/FF to BLE to CLB as packing and placement interleave.

69 Our Flow
  Features
  - Stair-step flow which interleaves packing and placement
  - Implicit CLB packing similar to ASIC LG (Tetris)
  Strengths
  - Quick feedback
  - Iteratively improve other metrics (congestion, timing, power, etc.)
  - Approximate analytical GP directly
  - Smoothly control packing density
  - Easily embed other metrics
  - Easily consider some constraints (e.g., clock rules)

71 Clock Rules
  - Clock region: ~32x60 sites => global
    - A clock occupies a clock region if its bounding box (BB) does
    - <= 24 clocks in each
  - Half column: 2x30 sites => local
    - <= 12 clocks in each
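The clock-region rule can be checked with a sketch like the one below. It assumes a uniform grid of regions (the slide says roughly 32x60 sites) and illustrative names; note that a clock is charged to every region its bounding box covers, even regions where it has no cells.

```python
def clock_region_demand(cells, region_w=32, region_h=60):
    """Return {(col, row): set of clocks} using each clock's bounding box.

    cells: list of (clock, x, y) in site coordinates.
    The uniform region size is an assumption for illustration.
    """
    boxes = {}
    for clk, x, y in cells:
        cx, cy = x // region_w, y // region_h
        if clk in boxes:
            x0, y0, x1, y1 = boxes[clk]
            boxes[clk] = (min(x0, cx), min(y0, cy), max(x1, cx), max(y1, cy))
        else:
            boxes[clk] = (cx, cy, cx, cy)

    demand = {}
    for clk, (x0, y0, x1, y1) in boxes.items():
        for cx in range(x0, x1 + 1):
            for cy in range(y0, y1 + 1):
                demand.setdefault((cx, cy), set()).add(clk)
    return demand


def overflowed_regions(demand, limit=24):
    """Return {(col, row): excess clock count} for regions above the limit."""
    return {r: len(c) - limit for r, c in demand.items() if len(c) > limit}
```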

72 Clock Region
  - Clock region: ~32x60 sites => global, <= 24 clocks in each
  - Solution:
    - Plan clock regions
    - Apply the plan to GP, LG, DP

73 Clock Region Planning
  - Clock bounding box (CBB): restrict the movement of cells of the same clock to a bounding box
  - Shrinking: reduce overflow in clock regions iteratively until no overflow remains
  - Expanding: reduce cell density in CBBs iteratively until impossible

74 Clock Region Planning (example)
  - Assume 3x3 clock regions, <= 2 clocks in each clock region, and 4 clocks
  - [Figure: the CBB of each clock on the 3x3 grid]

78 Clock Region Planning (example)
  - Assume 3x3 clock regions, <= 2 clocks in each clock region, and 4 clocks
  - Overflow: #clk = 4 > 2

79 Clock Region Planning: Shrinking
  - Reduce overflow in clock regions iteratively until no overflow remains
  - For the clock region with max overflow:
    - Calculate the total cell displacement when shrinking
    - Select the CBB and direction with min displacement and shrink

82 Clock Region Planning: Shrinking
  - Reduce overflow in clock regions iteratively until no overflow remains
  - It's legal now!

83 Clock Region Planning: Expanding
  - Reduce cell density in CBBs iteratively until impossible
  - For the unmarked CBB with max cell density: try expanding, and mark it if it cannot expand

88 Clock Region Planning: Expanding
  - Reduce cell density in CBBs iteratively until impossible
  - It's exhausted now!

89 Clock Region Plan clock region Apply it to GP, LG, DP GP: add box constraints (not implemented) LG/DP: only consider sites within CBB

90 Half Column Half column 2x30 sites => local <= 12 clocks in each Solution Resolve overflow after normal LG Forbid movement causing overflow in DP

91 Half Column Resolve overflow after normal LG For a half column with overflow Select the clock with fewest cells Move cells to neighboring overflow-free half columns with min displacement
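A rough sketch of this overflow resolution is shown below. Measuring displacement as the distance between half-column indices, and the dictionary layout used for a half column, are simplifying assumptions made for illustration only.

```python
def clocks_to_evict(cells_by_clock, limit=12):
    """Pick the clocks with the fewest cells until the half column is legal.

    cells_by_clock: {clock: number of cells of that clock in this half column}.
    """
    excess = len(cells_by_clock) - limit
    if excess <= 0:
        return []
    return sorted(cells_by_clock, key=cells_by_clock.get)[:excess]


def pick_target(clock, source_index, half_columns, limit=12):
    """Return the index of the nearest half column that can take `clock`.

    half_columns: list of {"clocks": set, "free_sites": int}, left to right.
    A target must have a free site and must stay within the clock limit
    after receiving the clock. Returns None if no target qualifies.
    """
    best = None
    for i, hc in enumerate(half_columns):
        if i == source_index or hc["free_sites"] <= 0:
            continue
        if clock not in hc["clocks"] and len(hc["clocks"]) >= limit:
            continue  # moving here would create a new overflow
        if best is None or abs(i - source_index) < abs(best - source_index):
            best = i
    return best
```

Evicting the clock with the fewest cells keeps the number of moved cells, and so the total displacement, small.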

94 Half Column
  - Resolve overflow after normal LG
  - It's legal now!

95 Summary Background Our Flow How We Handle Clock Rules Clock region Plan clock region Apply it to GP, LG, DP Half column Resolve overflow after normal LG Forbid movement causing overflow in DP

97 UTPlaceF 2.0: ISPD 2017 Clock-Aware FPGA Placement Contest
  Wuxi Li, David Z. Pan
  ECE Department, University of Texas at Austin

98 Team Introduction
  - Wuxi Li, Ph.D. student, UT-Austin
  - David Z. Pan, Professor, UT-Austin
  - UT Design Automation Lab

99 Outline
  - Original UTPlaceF Flow
  - Clock Constraints
    - Clock Region Constraint
    - Half Column Constraint
  - Clock Region Assignment
  - UTPlaceF 2.0 Flow

100 Original UTPlaceF Flow
  Input: circuit netlist
  Wirelength-driven phase
  - Flat Initial Placement (FIP): Quadratic Programming + Rough Legalization, repeated until almost converged
  - Legalize DSP, RAM, I/O
  - Quadratic Programming + Rough Legalization, repeated until converged (FIP done)
  Routability-driven phase
  - Cell Inflation
  - Packing
  - Global Placement
  - Legalization
  - Detailed Placement

101 Clock Region Constraint The FPGA is divided into 5 by 8 clock regions Clock demand of each clock region

102 Half Column Constraint Each clock region is divided into half column regions Clock demand of each half column region

103 Clock Region Assignment Problem
  Inputs
  - A rough legalized placement
  Outputs
  - Cell-to-clock-region assignment with minimized total cell movement
  - Capacity constraint is satisfied for each clock region
  - Clock demand <= 24 for each clock region

104 Problem Transformation 104

105 Algorithm Overview 105

106 Min-Cost-Max-Flow Based Assignment 106

107 UTPlaceF 2.0 Flow
  Input: circuit netlist
  Wirelength-driven phase
  - Flat Initial Placement (FIP): Quadratic Programming + Rough Legalization, repeated until almost converged
  - Legalize DSP, RAM, I/O
  - Quadratic Programming + Clock Region Assign. + Rough Legalization, repeated until converged (FIP done)
  Routability & clock driven phase
  - Cell Inflation
  - Clock-Aware Packing
  - Clock Region Assign. + Global Placement
  - Clock Region Assign. + Half Column Assign. + Legalization
  - Clock-Aware Detailed Placement

108 Thanks! 108

110 VDAplacer ISPD 2017 Contest Clock-Aware FPGA Placement Presenter: Chen Chen Advisor: Prof. Hung-Ming Chen Dept. of Electronic Engineering, National Chiao Tung University 2017/3/22 Department of Electronics Engineering, National Chiao Tung University VLSI Design Automation LAB 110

111 Outline
  - Problem Formulation
    - FPGA Packing Problem
    - Clock-Aware Heterogeneous Placement
  - Proposed Algorithm
    - Dynamic Packing with Physical Information
    - Global Placement
    - Placement Migration
    - Legalization and Detailed Placement

113 FPGA Packing Problem
  - The FPGA packing problem is to cluster LUTs and FFs into groups to minimize the total number of blocks and block interconnections while satisfying the limitations of the FF control signals and the fracturable LUT constraints.
  - A configurable logic block (CLB) contains 8 fracturable LUTs, 16 FFs, 2 clock inputs (CLK), 2 set/reset inputs (SR), and 4 clock enables (CE).
  - The CEs are independent for { FF0, FF2, FF4, FF6 }, { FF1, FF3, FF5, FF7 }, { FF8, FF10, FF12, FF14 }, { FF9, FF11, FF13, FF15 }.
  - [Figure: a configurable logic block (CLB)]
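The FF control-signal limits can be expressed as a legality check over one CLB, sketched below. The mapping of one clock and one set/reset per half (slots 0-7 and 8-15) follows the CLB description in the RippleFPGA background slide; the slot-to-signal dictionary is a hypothetical representation chosen for this example.

```python
# The four clock-enable groups listed on the slide.
CE_GROUPS = [range(0, 8, 2), range(1, 8, 2), range(8, 16, 2), range(9, 16, 2)]
# Assumed mapping: each half of the CLB shares one CLK and one SR.
HALVES = [range(0, 8), range(8, 16)]


def clb_ffs_legal(ffs):
    """Check FF control-set legality for one CLB.

    ffs: {slot: (clk, sr, ce)} with slot in 0..15; unused slots are absent.
    """
    if any(slot not in range(16) for slot in ffs):
        return False
    for half in HALVES:
        used = [ffs[s] for s in half if s in ffs]
        # At most one clock and one set/reset signal per half.
        if len({clk for clk, _, _ in used}) > 1:
            return False
        if len({sr for _, sr, _ in used}) > 1:
            return False
    for group in CE_GROUPS:
        # At most one clock-enable signal per CE group.
        if len({ffs[s][2] for s in group if s in ffs}) > 1:
            return False
    return True
```

A packer can call such a check before committing an FF to a slot, which is how the "# of Control Sets" packing factor on the later slide would be enforced.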

114 FPGA Packing Problem
  A fracturable LUT has three modes of operation:
  - Mode (1): a single K-input LUT (K from 1 to 6)
  - Mode (2): two 5-input (or fewer input) LUTs with separate outputs but common inputs
  - Mode (3): two 3-input (or fewer input) LUTs irrespective of common inputs

115 Clock-Aware Heterogeneous Placement
  - The FPGA placement problem: given a heterogeneous FPGA and a circuit, determine the position of each movable block to minimize the routed wirelength, such that each block lies in its specified region and no blocks overlap.

116 Clock-Aware Heterogeneous Placement
  Clock-aware placement constraints:
  - The number of global clocks in each clock region is at most 24.
  - Within each clock region, each half column has at most 12 clocks.
  - Each clock should be constrained to a continuous rectangular area.
  - [Figure: 5x8 clock regions; (14~18)x2 half columns]

118 Dynamic Packing with Physical Information
  - Apply the POLAR [1] framework.
  - Increase the force of anchor nets in the initial placement stage and decrease it in the dynamic packing stage.
  - Packing factors: # of clocks, # of control sets (C/R/CE), distance, # of common nets.
  Initial Placement (repeat until upper bound and lower bound converge)
  - Solve the quadratic objective function using the B2B model and obtain the lower-bound HPWL placement using CG
  - Obtain the upper-bound HPWL placement using Look-Ahead Legalization (LAL)
  - Density-aware global move
  - Legalized locations serve as pseudo anchors; add anchors to the quadratic objective function
  Dynamic Packing (x5, repeat until there is no more good packing)
  - Solve the quadratic objective function using the B2B model and obtain the lower-bound HPWL placement using CG
  - Obtain the upper-bound HPWL placement using Look-Ahead Legalization (LAL)
  - Density-aware global move
  - Packing
  - Legalized locations serve as pseudo anchors; add anchors to the quadratic objective function
  Next stage: Global Placement
  [1]: T. Lin, C. Chu, J. R. Shinnerl, I. Bustany, and I. Nedelchev. POLAR: Placement based on novel rough legalization and refinement. ICCAD '13.

119 Global Placement
  - Lower density around fixed nodes
  - HPWL-driven global placement
    - B2B wirelength model
    - Lower-bound placement from solving the quadratic objective function
    - Upper-bound placement from look-ahead legalization
  - Density-aware global move
    - Move to the optimal region with consideration of density and wirelength
    - Move to a clock-valid location (after clock selection)
  - Clock selection
    1. Select an initial clock region for each clock
    2. Expand each clock's area gradually, considering the amount of uncovered nodes
    3. Unpack CLBs that cannot find any valid location
  Loop (repeat until upper bound and lower bound converge, then go to Placement Migration)
  - Solve the quadratic objective function using the B2B model and obtain the lower-bound HPWL placement using CG
  - Obtain the upper-bound HPWL placement using Look-Ahead Legalization (LAL)
  - Density-aware global move
  - Routing congestion estimation and congestion-driven packing (near convergence)
  - Legalized locations serve as pseudo anchors; add anchors to the quadratic objective function

120 Global Placement
  - Routing congestion estimation
    - Apply NCTUgr for estimation
  - Congestion-driven packing
    - Apply further packing for overlapped but routing-congestion-free areas
    - Apply unpacking for routing-congested areas

121 Placement Migration
  - For closing the gap between global placement and legalization: modify the three-force balance system from Kraftwerk2 [2]
    - Hold force: preserve the integrity of the original placement result
    - Net force: model the wirelength of the netlist
    - Move force: perturb the placement and smooth the transition from global placement to legalization
  - Obtain the move force by calculating the cell density gradient (the cells' surface model is obtained by Gaussian blurring)
  - Obtain the target step size for each cell
  - Repeat while there is density overflow; otherwise continue to Legalization & Detailed Placement
  [2]: P. Spindler, U. Schlichtmann, and F. M. Johannes. Kraftwerk2: A fast force-directed quadratic placement approach using an accurate net model. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 27(8).

122 Legalization and Detailed Placement (1/2) Minimize displacement in legalization 1. Apply bipartite matching to each clock region for legalization 2. Select Clocks for every half column 3. Apply another bipartite matching to fit half column constraints. Legalization & Detailed Placement Legalization using bipartite matching Wirelength-driven detailed placement Placement Result 2017/3/22 Department of Electronics Engineering, National Chiao Tung University VLSI Design Automation LAB 122

123 Legalization and Detailed Placement (2/2) Detailed Placement Perform the Global Swap [3] to reduce the wirelength Identify a good swap pair or a space for each cell After swapping the cell would be in the position that gives the best wirelength while all other cells are treated as fixed Legalization & Detailed Placement Legalization using bipartite matching Wirelength-driven detailed placement Placement Result [3]: M. Pan, N. Viswanathan, and C. Chu. An efficient and effective detailed placement algorithm. In IEEE/ACM International Conference on Computer-Aided Design, pages 48 55, Nov /3/22 Department of Electronics Engineering, National Chiao Tung University VLSI Design Automation LAB 123

124 Thank you!

125 Benchmarking Results

126 Top-5 Results: Place/Route Completion

Designs     Placer-A  Placer-B  Placer-C  Placer-D  Placer-E
CLK-FPGA01  PASS      PASS      PASS      PASS      FAIL
CLK-FPGA02  PASS      PASS      PASS      PASS      PASS
CLK-FPGA03  PASS      PASS      PASS      PASS      FAIL
CLK-FPGA04  PASS      PASS      PASS      PASS      FAIL
CLK-FPGA05  PASS      PASS      PASS      PASS      FAIL
CLK-FPGA06  PASS      PASS      PASS      PASS      FAIL
CLK-FPGA07  PASS      PASS      PASS      PASS      PASS
CLK-FPGA08  PASS      PASS      PASS      PASS      PASS
CLK-FPGA09  PASS      PASS      PASS      PASS      PASS
CLK-FPGA10  PASS      PASS      PASS      PASS      FAIL
CLK-FPGA11  PASS      PASS      PASS      PASS      FAIL
CLK-FPGA12  PASS      PASS      PASS      PASS      PASS
CLK-FPGA13  PASS      PASS      PASS      PASS      PASS

127 Top-4 Placers: Total Routed Wirelength
[Table. Columns: Designs, Placer-A, Placer-B, Placer-C, Placer-D. Rows: 13 CLK-FPGA designs. Numeric values missing from this transcription.]

128 Total Routed Wirelength (Normalized)
[Table. Columns: Designs, Placer-A, Placer-B, Placer-C, Placer-D. Rows: 13 CLK-FPGA designs plus Average. Numeric values missing from this transcription.]

129 Placer Runtime (seconds)
[Table. Columns: Designs, Fastest, 2nd, 3rd, 4th. Rows: 13 CLK-FPGA designs. Numeric values missing from this transcription.]
Less than 10 minutes for the largest design!

130 Placer Runtime (Normalized)
[Table. Columns: Designs, Fastest, 2nd-fastest, 3rd-fastest, 4th-fastest. Rows: 13 CLK-FPGA designs plus Average. Numeric values missing from this transcription.]

131 Final Results with Runtime Factor
[Table. Columns: Designs, Placer-A, Placer-B, Placer-C. Rows: 13 CLK-FPGA designs plus Average. Numeric values missing from this transcription.]

132 Award Ceremony

133 Fifth Place goes to

134 5th: GPlace 2.0: Clock-Aware Placement Tool for UltraScale FPGAs. Ziad Abuowaimer, Shawki Areibi, Anthony Vannelli, Gary Grewal. University of Guelph. March 22, 2017

135 Fourth Place goes to

136 4th: VDAplacer. ISPD 2017 Contest: Clock-Aware FPGA Placement. Presenter: Chen Chen. Advisor: Prof. Hung-Ming Chen. Dept. of Electronic Engineering, National Chiao Tung University. 2017/3/22

137 Third Place goes to

138 3rd (Fastest Placer): CUHK RippleFPGA. Gengjie Chen, Chak-Wa Pui, Evangeline F. Y. Young, Bei Yu. March 22, 2017

139 Second Place goes to

140 2nd: NTUfplace: Clock-Aware FPGA Placement. Yun-Chih Kuo, Chau-Chin Huang, Shih-Chun Chen, Chun-Han Chiang, Yao-Wen Chang, and Sy-Yen Kuo. National Taiwan University. Mar. 22, 2017

141 First Place goes to

142 1st (two years in a row!): UTPlaceF 2.0. ISPD 2017 Clock-Aware FPGA Placement Contest. Wuxi Li, David Z. Pan. UT DA, ECE Department, University of Texas at Austin

143 Final Results with Runtime Factor
[Table. Columns: Designs, UTPlaceF2.0, NTUfplace, RippleFPGA. Rows: 13 CLK-FPGA designs plus Average. Numeric values missing from this transcription.]

144 Congratulations!


More information

Timing Error Detection: An Adaptive Scheme To Combat Variability EE241 Final Report Nathan Narevsky and Richard Ott {nnarevsky,

Timing Error Detection: An Adaptive Scheme To Combat Variability EE241 Final Report Nathan Narevsky and Richard Ott {nnarevsky, Timing Error Detection: An Adaptive Scheme To Combat Variability EE241 Final Report Nathan Narevsky and Richard Ott {nnarevsky, tomott}@berkeley.edu Abstract With the reduction of feature sizes, more sources

More information

VLSI Chip Design Project TSEK06

VLSI Chip Design Project TSEK06 VLSI Chip Design Project TSEK06 Project Description and Requirement Specification Version 1.1 Project: High Speed Serial Link Transceiver Project number: 4 Project Group: Name Project members Telephone

More information

FPGA Design. Part I - Hardware Components. Thomas Lenzi

FPGA Design. Part I - Hardware Components. Thomas Lenzi FPGA Design Part I - Hardware Components Thomas Lenzi Approach We believe that having knowledge of the hardware components that compose an FPGA allow for better firmware design. Being able to visualise

More information

Improving FPGA Performance with a S44 LUT Structure

Improving FPGA Performance with a S44 LUT Structure Improving FPGA Performance with a S44 LUT Structure Wenyi Feng, Jonathan Greene Microsemi Corporation SOC Products Group, San Jose {wenyi.feng, jonathan.greene}@microsemi.com ABSTRACT FPGA performance

More information

Testing of Cryptographic Hardware

Testing of Cryptographic Hardware Testing of Cryptographic Hardware Presented by: Debdeep Mukhopadhyay Dept of Computer Science and Engineering, Indian Institute of Technology Madras Motivation Behind the Work VLSI of Cryptosystems have

More information

Lecture 23 Design for Testability (DFT): Full-Scan

Lecture 23 Design for Testability (DFT): Full-Scan Lecture 23 Design for Testability (DFT): Full-Scan (Lecture 19alt in the Alternative Sequence) Definition Ad-hoc methods Scan design Design rules Scan register Scan flip-flops Scan test sequences Overheads

More information

Power-Aware Placement

Power-Aware Placement Power-Aware Placement Yongseok Cheon, Pei-Hsin Ho, Andrew B. Kahng, Sherief Reda, Qinke Wang Advanced Technology Group, Synopsys, Inc. CSE Department, University of California at San Diego {cheon,pho}@synopsys.com,

More information

FPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique

FPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique FPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique Dr. Dhafir A. Alneema (1) Yahya Taher Qassim (2) Lecturer Assistant Lecturer Computer Engineering Dept.

More information

Iterative Deletion Routing Algorithm

Iterative Deletion Routing Algorithm Iterative Deletion Routing Algorithm Perform routing based on the following placement Two nets: n 1 = {b,c,g,h,i,k}, n 2 = {a,d,e,f,j} Cell/feed-through width = 2, height = 3 Shift cells to the right,

More information

TKK S ASIC-PIIRIEN SUUNNITTELU

TKK S ASIC-PIIRIEN SUUNNITTELU Design TKK S-88.134 ASIC-PIIRIEN SUUNNITTELU Design Flow 3.2.2005 RTL Design 10.2.2005 Implementation 7.4.2005 Contents 1. Terminology 2. RTL to Parts flow 3. Logic synthesis 4. Static Timing Analysis

More information

Distributed Arithmetic Unit Design for Fir Filter

Distributed Arithmetic Unit Design for Fir Filter Distributed Arithmetic Unit Design for Fir Filter ABSTRACT: In this paper different distributed Arithmetic (DA) architectures are proposed for Finite Impulse Response (FIR) filter. FIR filter is the main

More information

Optimization of Multi-Channel BCH Error Decoding for Common Cases. Russell Dill Master's Thesis Defense April 20, 2015

Optimization of Multi-Channel BCH Error Decoding for Common Cases. Russell Dill Master's Thesis Defense April 20, 2015 Optimization of Multi-Channel BCH Error Decoding for Common Cases Russell Dill Master's Thesis Defense April 20, 2015 Bose-Chaudhuri-Hocquenghem (BCH) BCH is an Error Correcting Code (ECC) and is used

More information

FPGA Design with VHDL

FPGA Design with VHDL FPGA Design with VHDL Justus-Liebig-Universität Gießen, II. Physikalisches Institut Ming Liu Dr. Sören Lange Prof. Dr. Wolfgang Kühn ming.liu@physik.uni-giessen.de Lecture Digital design basics Basic logic

More information

Low Power Illinois Scan Architecture for Simultaneous Power and Test Data Volume Reduction

Low Power Illinois Scan Architecture for Simultaneous Power and Test Data Volume Reduction Low Illinois Scan Architecture for Simultaneous and Test Data Volume Anshuman Chandra, Felix Ng and Rohit Kapur Synopsys, Inc., 7 E. Middlefield Rd., Mountain View, CA Abstract We present Low Illinois

More information

IN A SERIAL-LINK data transmission system, a data clock

IN A SERIAL-LINK data transmission system, a data clock IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 53, NO. 9, SEPTEMBER 2006 827 DC-Balance Low-Jitter Transmission Code for 4-PAM Signaling Hsiao-Yun Chen, Chih-Hsien Lin, and Shyh-Jye

More information

Sequencing. Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall,

Sequencing. Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall, Sequencing ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C. Fall, 2013 ldvan@cs.nctu.edu.tw http://www.cs.nctu.edu.tw/~ldvan/ Outlines Introduction Sequencing

More information

CDA 4253 FPGA System Design FPGA Architectures. Hao Zheng Dept of Comp Sci & Eng U of South Florida

CDA 4253 FPGA System Design FPGA Architectures. Hao Zheng Dept of Comp Sci & Eng U of South Florida CDA 4253 FPGA System Design FPGA Architectures Hao Zheng Dept of Comp Sci & Eng U of South Florida FPGAs Generic Architecture Also include common fixed logic blocks for higher performance: On-chip mem.

More information