Minimizing FPGA Reconfiguration Data at Logic Level


Southern Illinois University Carbondale, OpenSIUC
Conference Proceedings, Department of Electrical and Computer Engineering, 3-2006

Krishna Raghuraman, Southern Illinois University Carbondale
Haibo Wang, Southern Illinois University Carbondale, haibo@engr.siu.edu
Spyros Tragoudas, Southern Illinois University Carbondale

Published in Raghuraman, K., Wang, H., & Tragoudas, S. (2006). Minimizing FPGA reconfiguration data at logic level. Proceedings of the 7th International Symposium on Quality Electronic Design (ISQED 06), 224. doi: 10.1109/ISQED.2006.87

© 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Recommended Citation: Raghuraman, Krishna; Wang, Haibo; and Tragoudas, Spyros, "Minimizing FPGA Reconfiguration Data at Logic Level" (2006). Conference Proceedings. Paper 43. http://opensiuc.lib.siu.edu/ece_confs/43

Krishna Raghuraman, Haibo Wang, and Spyros Tragoudas
Southern Illinois University, Carbondale, IL 62901

Abstract

A framework that relates the size of FPGA reconfiguration data to the number of minterms of a specially constructed function is presented. Three techniques, variable mapping optimization, circuit don't-care modification, and look-up table input permutation, are developed to minimize the minterms of this function. The method for integrating the proposed techniques into an FPGA design automation flow is discussed, and experimental results are presented.

1. Introduction

Reconfigurable systems provide a number of advantages and are steadily gaining popularity in various applications. Currently, most reconfigurable systems are implemented on FPGA platforms. For such systems, an important design concern is to minimize FPGA reconfiguration bitstreams, and this problem has been widely investigated at the high-level design stage. Studies in [1, 2, 3, 4, 5] present different algorithms to perform temporal partitioning with the objective of reusing functional units in different temporal partitions. Meanwhile, the reuse of FPGA routing patterns is investigated in [6]. Relocation and defragmentation techniques are presented in [7, 8]. The work in [9] minimizes reconfiguration cost by both using coarse-grain logic blocks and optimizing scheduling and allocation schemes. Additionally, other techniques proposed in the literature include configuration caching [10], configuration compression [11], and a column-based configuration method [12].

Differing from previous approaches, this work addresses the problem of minimizing reconfiguration data at the logic level. The techniques developed in this work take advantage of two facts. First, FPGA configuration data are partitioned into frames, which are the smallest data units that can be individually accessed by configuration commands [13]. Second, a frame contains configuration data for identical hardware located in an FPGA column.
To conveniently track the size of reconfiguration data, we introduce a framework that links reconfiguration frames to minterms of a specially constructed function, which is referred to as the difference function of a look-up table (LUT) column. Based on this framework, three techniques, variable mapping optimization, circuit don't-care modification, and LUT input order permutation, are proposed to minimize the minterms of LUT-column difference functions.

The rest of the paper is organized as follows. Section 2 explains FPGA configuration frames and describes how to link reconfiguration frames to minterms of LUT-column difference functions. Motivational examples are also given in this section to elucidate the proposed techniques. Section 3 develops procedures to efficiently implement the proposed techniques. Section 4 illustrates how to integrate the proposed techniques into an FPGA design automation flow and reports experimental results. The paper is concluded in Section 5.

2. Preliminaries

In many LUT-based FPGAs, configuration data are partitioned into frames [13, 14]. A frame contains configuration data for hardware located in an FPGA column. The structure of frames is explained using the FPGA LUT column shown in Figure 1. Assume that there are N LUTs in the column and each LUT has 16 memory locations. The 16 memory locations of any LUT in the column belong to 16 different frames. In addition, each frame contains N bits, corresponding to the same memory location in the N LUTs of the column. Since a frame is the smallest block of configuration data that can be accessed by configuration commands, the entire frame has to be written into the FPGA even if we just want to change a single bit of an LUT during partial reconfiguration. This arrangement lessens the burden of addressing LUT locations, consequently simplifying hardware design and reducing the size of configuration bitstreams.

As frames are the primitive units of FPGA reconfiguration data, reducing the size of FPGA reconfiguration bitstreams is equivalent to minimizing the number of reconfiguration frames. The latter minimization problem can be addressed from two perspectives.
First, it is desirable to have each LUT require fewer frames during reconfiguration. This leads to minimizing the difference between the data stored in each LUT before and after reconfiguration. This problem can be tackled by both optimizing variable mapping and modifying LUT don't-care locations. Before and after reconfiguration, an LUT may implement two different functions that depend on two sets of logic variables. Variable mapping refers to the rule that dictates which two variables (one an input of the first function and the other an input of the second function) should be mapped to the same LUT address input. Meanwhile, LUT don't-care locations are memory locations whose addresses correspond to circuit don't-cares. Data stored in don't-care locations can be altered without changing circuit functionality.

[Figure 1. Virtex configuration frames. An LUT column holds LUT 1, LUT 2, ..., LUT N, each with memory locations 1 to 16. Frame k (k = 1, ..., 16) collects the configuration bit for memory location k of every LUT in the column.]

The second perspective on minimizing reconfiguration frames is to maximize the efficiency of each frame, which is measured by how many bits of the frame contain data that truly update LUT locations. For a given number of LUT locations that need to be updated, higher frame efficiencies result in fewer frames. The efficiency of frames can be improved by permuting LUT input orders, which relocates LUT locations that need to be updated into common frames.

We first introduce the notation used in the paper. We refer to the logic functions implemented on an LUT before and after reconfiguration as its initial and final functions, respectively. For a given LUT, denoted as LUT $i$, we use $f_i$ and $h_i$ to represent its initial and final functions. When it is not necessary to identify individual LUTs, the subscripts of $f_i$ and $h_i$ are omitted for the sake of conciseness. Furthermore, for any given logic function $l$, we use $l^{on}$, $l^{dc}$, and $l^{off}$ to represent its on, don't-care, and off sets, respectively.

Three examples are given to illustrate how variable mapping (Example 1), don't-care locations (Example 2), and LUT input orders (Example 3) can be utilized to reduce reconfiguration frames. Without loss of generality, three-input LUTs are used.

Example 1: For an LUT, assume $f = ab + c$ and $h = x + yz$.
If the variable mapping is selected as {a↔x, b↔y, c↔z} (the symbol ↔ indicates which two variables are mapped to the same LUT address input), two frames (indicated by asterisks) are needed for this LUT, as shown in Figure 2. However, if the variable mapping is changed to {a↔y, b↔z, c↔x}, no frames are needed.

    Address a(x) b(y) c(z):  000 001 010 011 100 101 110 111
    Initial LUT content:      0   1   0   1   0   1   1   1
    Final LUT content:        0   0   0   1   1   1   1   1
    Changed:                      *           *

Figure 2. LUT data without variable mapping optimization.

Example 2: For an LUT, assume $f = ab$ and $h = \bar{a}\bar{c}$. As shown in Figure 3, four frames are needed for this LUT. However, if both functions have suitable don't-care sets $f^{dc}$ and $h^{dc}$, then the don't-care locations can be set so that the modified initial and final functions coincide, $f_{new} = h_{new}$ (the expressions given for $f^{dc}$, $h^{dc}$, and the common function are illegible in the source). No frames are needed after the modification. In this example, both the initial and final functions depend on the same set of logic variables. After the variable mapping is fixed, $f$ and $h$ can have either the same or different support sets.

    Address of LUT locations: 000 001 010 011 100 101 110 111
    Initial LUT content:       0   0   0   0   0   0   1   1
    Final LUT content:         1   0   1   0   0   0   0   0
    Changed:                   *       *               *   *

Figure 3. LUT data without don't-care modification.

Example 3: Assume LUT 1 and LUT 2 are in the same column and $f_1 = a(b + c)$, $h_1 = a + b$, $f_2 = ab + c$, $h_2 = (a + b)c$. If the input orders for both LUTs are {a↔A3, b↔A2, c↔A1}, five frames are needed, as shown in Figure 4(a). However, if the input order for LUT 2 is changed so that c drives A3 while a and b drive A2 and A1, only three frames are required, as shown in Figure 4(b). Note that LUT input order permutation is performed with fixed variable mappings. During the permutation, the LUT input orders for both the initial and final functions are changed in the same way.

    Address of LUT locations: 000 001 010 011 100 101 110 111
    LUT 1  Initial:            0   0   0   0   0   1   1   1
    LUT 1  Final:              0   0   1   1   1   1   1   1
    LUT 1  Changed:                    *   *   *
    LUT 2  Initial:            0   1   0   1   0   1   1   1
    LUT 2  Final:              0   0   0   1   0   1   0   1
    LUT 2  Changed:                *                   *

(a) Before LUT input permutation.

    Address of LUT locations: 000 001 010 011 100 101 110 111
    LUT 1  Initial:            0   0   0   0   0   1   1   1
    LUT 1  Final:              0   0   1   1   1   1   1   1
    LUT 1  Changed:                    *   *   *
    LUT 2  Initial:            0   0   0   1   1   1   1   1
    LUT 2  Final:              0   0   0   0   0   1   1   1
    LUT 2  Changed:                        *   *

(b) After LUT input permutation.

Figure 4. LUT data with input permutation.
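The frame counts quoted in Examples 1 and 3 can be reproduced with a few lines of Python. The sketch below is illustrative only and is not the authors' implementation. The helper names (`truth_table`, `lut_frames`, `column_frames`, `remap`) are hypothetical, the example functions are read as $f = ab + c$, $h = x + yz$, $f_1 = a(b+c)$, $h_1 = a+b$, $f_2 = ab+c$, $h_2 = (a+b)c$, and addresses are assumed to be ordered A3 A2 A1 with A3 as the most significant bit, as in Figures 2 to 4.

```python
from itertools import permutations

def truth_table(func, n=3):
    """LUT content indexed by address; the first argument of func is the MSB (A_n)."""
    return [int(bool(func(*[(addr >> (n - 1 - k)) & 1 for k in range(n)])))
            for addr in range(2 ** n)]

def lut_frames(initial, final):
    """Addresses whose stored bit changes; each one forces a frame to be rewritten."""
    return {addr for addr, (i, f) in enumerate(zip(initial, final)) if i != f}

def column_frames(luts):
    """Frames needed by a column: union of the changed addresses of its LUTs."""
    frames = set()
    for initial, final in luts:
        frames |= lut_frames(initial, final)
    return frames

def remap(func, order):
    """Feed logic input k of func from address input order[k] (0 = A3, 1 = A2, 2 = A1)."""
    return lambda *addr: func(*(addr[i] for i in order))

# Example 1: try every variable mapping of f against a fixed h.
f = lambda a, b, c: (a and b) or c
h = lambda x, y, z: x or (y and z)
h_tab = truth_table(h)
ex1_default = lut_frames(truth_table(f), h_tab)
ex1_best = min(len(lut_frames(truth_table(remap(f, o)), h_tab))
               for o in permutations(range(3)))

# Example 3: two LUTs in one column, before and after permuting LUT 2.
f1 = lambda a, b, c: a and (b or c)
h1 = lambda a, b, c: a or b
f2 = lambda a, b, c: (a and b) or c
h2 = lambda a, b, c: (a or b) and c
lut1 = (truth_table(f1), truth_table(h1))
before = column_frames([lut1, (truth_table(f2), truth_table(h2))])
order = (1, 2, 0)  # a <- A2, b <- A1, c <- A3
after = column_frames([lut1, (truth_table(remap(f2, order)),
                              truth_table(remap(h2, order)))])

print(sorted(ex1_default), ex1_best, len(before), len(after))  # [1, 4] 0 5 3
```

Representing each LUT by its 8-entry content list keeps the check close to the figures: a frame is needed exactly where the initial and final rows differ in at least one LUT of the column.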

In the quest for solutions to the proposed minimization problem, we are more interested in how logic values (0 or 1) are stored in LUTs than in what actual functions are implemented on the LUTs. For this reason, we introduce the concept of the LUT mapping function. LUT mapping functions are LUT functions expressed in terms of LUT address variables. For an LUT whose implemented logic function is given, we can obtain its mapping function by substituting the logic variables with their associated address variables. For example, the initial function of LUT 1 in Figure 4 is $a(b + c)$. Substituting the logic variables with their associated LUT address variables, we obtain its mapping function as $A_3(A_2 + A_1)$. The mapping function of an LUT represents all the LUT locations that store logic 1. Since each LUT is associated with two logic functions ($f$ and $h$), there are two mapping functions for each LUT as well. Due to the close relation between LUT logic functions and their corresponding mapping functions, we also use $f$ and $h$ to represent the initial and final mapping functions of an LUT, respectively. Based on LUT mapping functions, we define the LUT difference function (LUT subscript omitted) as:

$$D = f \oplus h \qquad (1)$$

In addition, the difference function of an LUT column is defined as:

$$D = \bigcup_{i=1}^{N} D_i \qquad (2)$$

where $N$ is the total number of LUTs in the given column and $D_i$ is the LUT difference function of LUT $i$. In the computation of $D$, address variables with the same name but located in different LUTs (e.g. $A_1$ of LUT $i$ and LUT $j$) are treated as the same variable, since they function as coordinates that indicate LUT locations containing logic 1. Therefore, the column function $D$ depends on only $p$ variables, $A_p, A_{p-1}, \ldots, A_1$, where $p$ is the number of inputs of the LUTs in the column. It is easy to see that the number of minterms in $D$ is equal to the number of frames required for reconfiguring the entire LUT column. For this reason, the phrase "minimizing LUT difference functions" is used in the rest of the paper as a convenient synonym for minimizing the number of minterms in LUT difference functions.

3. Proposed Techniques

As discussed earlier, FPGA reconfiguration data can be minimized by optimizing variable mapping, modifying LUT don't-care locations, and permuting LUT input orders. The problem of finding optimal variable mappings is easy since it can be solved separately for each LUT. Techniques to perform the other two optimization procedures are discussed in the following.

3.1 Modifying LUT don't-care locations

In general, the expressions for $f$ and $h$ of an LUT contain their entire on sets ($f^{on}$ and $h^{on}$) and portions of their don't-care sets ($f^{dc}$ and $h^{dc}$). We use $f^{dc}_{in}$ and $f^{dc}_{ex}$ to distinguish the don't-cares of $f$ that are included in and excluded from the expression of $f$ (the source's original marks for this distinction were lost in extraction). Similar notation applies to function $h$. Then, we have $f = f^{on} + f^{dc}_{in}$ and $h = h^{on} + h^{dc}_{in}$. The LUT difference function can be written as:

$$D = f\bar{h} + \bar{f}h = f^{on}h^{off} + f^{on}h^{dc}_{ex} + f^{dc}_{in}\bar{h} + h^{on}f^{off} + h^{on}f^{dc}_{ex} + h^{dc}_{in}\bar{f} \qquad (3)$$

Obviously, $f^{on}h^{off} + h^{on}f^{off}$ constitutes the lower bound of the difference between $f$ and $h$. The other terms on the right-hand side of Equation 3 can be eliminated by assigning proper values to LUT don't-care locations. This is formally stated by the following corollary.

Corollary 1. The number of minterms of an LUT difference function is minimized if the initial and final functions of the LUT are modified as follows:

$$f_{new} = f + f^{dc}_{ex} \cdot h - f^{dc}_{in} \cdot \bar{h} - f^{dc} \cdot h^{dc} \qquad (4)$$

$$h_{new} = h + h^{dc}_{ex} \cdot f - h^{dc}_{in} \cdot \bar{f} - f^{dc} \cdot h^{dc} \qquad (5)$$

In the above equations, the symbols $+$, $\cdot$, and $-$ represent set union, intersection, and subtraction operations, applied from left to right. For an LUT, adding a minterm to its function implies changing the value stored in the LUT location that corresponds to the minterm to logic 1. Meanwhile, subtracting a minterm is the same as putting logic 0 into the corresponding LUT location. It is easy to show that $f_{new} \oplus h_{new} = f^{on}h^{off} + h^{on}f^{off}$ and, hence, prove the corollary.
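Corollary 1 can be checked mechanically by treating each function as a set of LUT addresses. The following Python sketch is an illustration under stated assumptions, not the authors' code: the function name `modify` and the 16-address test instance are hypothetical, the don't-care set of each function is split into an "included" part (currently stored as 1) and an "excluded" part (currently stored as 0), and the last term of Equations 4 and 5 is read as the intersection of the two full don't-care sets. The instance is built so that every combination of (on, included DC, excluded DC, off) for $f$ and $h$ occurs at exactly one address.

```python
ALL = set(range(16))  # addresses of a 4-input LUT

def modify(f_on, f_in, f_ex, h_on, h_in, h_ex):
    """Apply Equations 4 and 5; union, intersection and subtraction act on address sets."""
    f, h = f_on | f_in, h_on | h_in          # currently stored functions
    both_dc = (f_in | f_ex) & (h_in | h_ex)  # addresses that are don't-care for f and h
    f_new = ((f | (f_ex & h)) - (f_in & (ALL - h))) - both_dc
    h_new = ((h | (h_ex & f)) - (h_in & (ALL - f))) - both_dc
    return f_new, h_new

# Address k gets f-class k // 4 and h-class k % 4, with classes
# 0 = on, 1 = included don't-care, 2 = excluded don't-care, 3 = off.
def cls(k, c, of_f):
    return (k // 4 if of_f else k % 4) == c

f_on, f_in, f_ex = ({k for k in ALL if cls(k, c, True)} for c in range(3))
h_on, h_in, h_ex = ({k for k in ALL if cls(k, c, False)} for c in range(3))
f_off = ALL - f_on - f_in - f_ex
h_off = ALL - h_on - h_in - h_ex

d_before = (f_on | f_in) ^ (h_on | h_in)
f_new, h_new = modify(f_on, f_in, f_ex, h_on, h_in, h_ex)
d_after = f_new ^ h_new
lower_bound = (f_on & h_off) | (h_on & f_off)

print(len(d_before), sorted(d_after))  # 8 [3, 12]
```

On this instance the difference shrinks from eight addresses to the two that are unavoidable (on versus off), and both modified functions still contain their on sets and stay within on set plus don't-care set, which is what makes the modification legal.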
By performing function modification according to the above corollary, the minterms added to $f$ are:

$$\mu^{+} = f^{dc}_{ex} \cdot h - f^{dc} \cdot h^{dc} \cdot \bar{f} \qquad (6)$$

Similarly, the minterms that are subtracted from $f$ can be expressed as:

$$\mu^{-} = f^{dc}_{in} \cdot \bar{h} + f^{dc} \cdot h^{dc} \cdot f \qquad (7)$$

The total set of LUT locations that are altered can be expressed by their corresponding minterms as:

$$\mu = \mu^{+} + \mu^{-} \qquad (8)$$

Here $f^{dc}_{in}$ and $f^{dc}_{ex}$ denote the don't-cares of $f$ that are included in and excluded from its expression, and $f^{dc}$ is their union. Note that a similar set of equations applies to function $h$.

For an LUT, its don't-cares consist of controllability don't-cares (CDCs) and observability don't-cares (ODCs). CDCs are signal patterns that never appear at the LUT inputs. Meanwhile, ODCs are defined as LUT input patterns representing scenarios in which the LUT output cannot be observed at the circuit primary outputs. Because the CDC sets of different LUTs are independent of each other, modifying the LUT locations addressed by CDC patterns can be performed individually for each LUT. This simple process always leads to the globally optimized solution when only CDCs are under consideration. In contrast, modifying ODC locations is a complicated process. When the ODC locations of an LUT are modified, the ODCs of other LUTs may change. Although it is theoretically possible to re-compute the ODCs for the rest of the LUTs after each LUT is modified, this approach is practically unattractive due to its computational complexity.

To avoid repeated re-computation of LUT ODCs, this section presents an efficient method to compute LUT ODCs that can be simultaneously modified, which are referred to as compatible ODCs (CODCs). To address a similar problem in logic synthesis, several techniques [15, 16, 17, 18] have been proposed. The method presented here is similar to the approaches discussed in [16, 17] from the perspective of computing CODC upper bounds. However, it differs from the previous approaches in the following two aspects. First, the ODCs covered by their upper bounds are further restricted according to Equation 8. Second, a heuristic method is utilized to determine the order in which LUTs are processed.

The simultaneous optimization of multiple vertices (gates or LUTs), denoted as $y_1, y_2, \ldots, y_n$, can be modeled by $n$ perturbation variables $\delta_1, \delta_2, \ldots, \delta_n$ [15]. In this application, $\delta_i$ represents the ODCs that are added to or subtracted from the function of LUT $i$. Let $\mathbf{DC}^{ext}$ represent the external don't-cares, $\mathbf{ODC}_{y_i}$ denote the ODCs at vertex $y_i$, and the symbol $|$ represent generalized cofactor operations. A sufficient condition for the equivalence between the perturbed and original circuits is [16]:

$$\delta_i \mathbf{1} \subseteq \mathbf{DC}^{ext} + \mathbf{ODC}_{y_i}\big|_{\delta_1, \ldots, \delta_{i-1}}, \quad i = 1, 2, \ldots, n \qquad (9)$$

In the above expression, don't-cares with respect to different primary outputs are represented in vector format and $\mathbf{1} = (1, 1, \ldots, 1)$. The above condition gives a series of upper bounds (with respect to different primary outputs) for $\delta_i$, which depend on $\mathbf{ODC}_{y_i}$ and the previous perturbations.
Let $m$ denote the number of circuit primary outputs, and let $DC^{ext}_{j}$ and $ODC_{y_i, j}$ denote the external and observability don't-care sets at vertex $y_i$ with respect to primary output $j$, respectively. The global upper bound, which is in scalar format, can be obtained as:

$$\zeta_i(\delta_1, \ldots, \delta_{i-1}) = \prod_{j=1}^{m} \left( DC^{ext}_{j} + ODC_{y_i, j}\big|_{\delta_1, \ldots, \delta_{i-1}} \right), \quad i = 1, 2, \ldots, n \qquad (10)$$

As the FPGA reconfiguration data for an FPGA column depend on all the LUT functions of the column, it is imperative to simultaneously optimize all the LUT functions of a column. In addition, LUT difference functions with large numbers of minterms are likely to affect the overall number of reconfiguration frames. Therefore, such LUTs should be given high priority during the optimization. Based on this observation, the proposed procedure first ranks all the LUTs according to the number of minterms in their difference functions. LUTs whose difference functions contain more minterms are given higher ranks. Following the descending order of LUT ranks, ODCs are pruned in accordance with two constraints. The first constraint is Equation 8, which eliminates ODCs that do not minimize LUT difference functions. The second constraint is the upper bound given in Equation 10, which is used to guarantee the correctness of the resulting circuit.

The proposed procedure is further elaborated as follows. For convenience of description, we re-label the LUTs after ranking such that LUTs with higher ranks are given smaller index numbers. For example, N LUTs arranged in the descending order of their ranks are listed with their new labels as LUT 1, LUT 2, ..., LUT N. Thus, LUT 1 is the first LUT to be processed. When the initial function of LUT 1 is under consideration, the LUT locations whose values are desired to be altered are:

$$\delta^{f}_{1} = \mu^{f}_{1} \qquad (11)$$

In the above and following equations, we use superscripts to indicate the function on which $\delta$ and $\mu$ are defined. Also, we use subscripts to indicate the LUT that $\delta$ and $\mu$ are associated with. Since LUT 1 is the first LUT to be processed, $\delta^{f}_{1}$ is not subject to the second constraint. However, when LUT $k$ ($k > 1$) is processed, we have to apply both constraints.
This leads to:

$$\delta^{f}_{k} = \mu^{f}_{k} \cdot \zeta^{f}_{k}(\delta_1, \ldots, \delta_{k-1}) \qquad (12)$$

The pseudo-code of the proposed CODC computation procedure is given in Figure 5. Note that the CODCs for both the LUT initial and final functions are computed simultaneously in the procedure.

    ODC_OPT(LUTs) {
    1  Compute ODCs for all LUTs regarding their initial and final functions
    2  Rank all LUTs and re-label them according to the descending order of their ranking
    3  δ^f_1 = μ^f_1; δ^h_1 = μ^h_1
    4  for k = 2 to N
    5      δ^f_k = μ^f_k · ζ^f_k(δ^f_1, ..., δ^f_{k-1})
    6      δ^h_k = μ^h_k · ζ^h_k(δ^h_1, ..., δ^h_{k-1})
    }

Figure 5. CODC computation procedure.

3.2 Permuting LUT input orders

By defining the LUT-column difference function $D$, we relate the number of reconfiguration frames to the number of minterms in $D$. Thus, the optimal LUT input orders should minimize the minterms in the corresponding column difference function. Although it is possible to solve this problem through exhaustive enumeration, the large search space of this problem makes such an approach impractical. This paper presents a search procedure based on a greedy algorithm. With the assumptions that each LUT has $p$ inputs and $N$ LUTs are in the given column, the major steps of the procedure are described in Figure 6. It first constructs the LUT difference functions (line 3) and, concurrently, finds the LUT that requires the least number of reconfiguration frames (lines 4-5). The input order of that LUT will not be permuted, and is used as a reference when permuting other LUT input orders. The function MintermCount used in line 3 counts the number of minterms of its operand. After the reference LUT is selected, the algorithm sequentially picks an unprocessed LUT and permutes its inputs. The permutation procedure is sketched in lines 9 to 18. It exhaustively tries all the possible permutations and picks the one that results in the smallest increase in the number of minterms of the newly constructed union function ($D_{tmp}$). The time complexity of the proposed procedure is $(p!) \times (N - 1)$, which is significantly smaller than the time complexity of the exhaustive enumeration method.

    1  min_tmp = 2^p
    2  for i = 1 to N
    3      D[i] = f_i ⊕ h_i; min = MintermCount(D[i])
    4      if min < min_tmp
    5          min_tmp = min; min_index = i; D = D[i]
    6  for i = 1 to N
    7      if i ≠ min_index
    8          D = permute(D, D[i])
    9  permute(D, D[i]) {
    10     min_tmp = 2^p
    11     for each permutation order of LUT i
    12         derive new function D'[i] according to the new input order
    13         D_tmp = D ∪ D'[i]
    14         min = MintermCount(D_tmp)
    15         if min < min_tmp
    16             min_tmp = min; D_min = D_tmp
    17             Order[LUT i] = current permutation order
    18     return D_min }

Figure 6. LUT input permutation procedure.

4. Experimental Results

This section describes how the proposed techniques can be integrated into an FPGA design automation flow, and reports experimental results. The current FPGA design automation flow is sketched by the solid arrows in Figure 7(a).
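As a brief aside on the greedy input-permutation procedure of Figure 6 (Section 3.2), the steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation, which operates on decision diagrams: here each LUT is a pair of content lists (initial, final) indexed by address with the first address input as the most significant bit, the names `permute_table`, `difference`, and `greedy_input_orders` are hypothetical, and the best candidate is tracked with `None` instead of the initial value $2^p$ used in Figure 6.

```python
from itertools import permutations

def permute_table(table, order):
    """Re-address a LUT so that logic input k is driven by address input order[k]."""
    p = len(order)
    out = []
    for addr in range(2 ** p):
        bits = [(addr >> (p - 1 - k)) & 1 for k in range(p)]
        old_addr = 0
        for k in range(p):
            old_addr = (old_addr << 1) | bits[order[k]]
        out.append(table[old_addr])
    return out

def difference(initial, final):
    """Minterms (addresses) of the LUT difference function."""
    return {a for a, (x, y) in enumerate(zip(initial, final)) if x != y}

def greedy_input_orders(luts):
    """Greedy search in the spirit of Figure 6: returns (column difference, input orders)."""
    p = (len(luts[0][0]) - 1).bit_length()
    diffs = [difference(f, h) for f, h in luts]
    # Reference LUT: fewest changed locations; its input order is kept.
    ref = min(range(len(luts)), key=lambda i: len(diffs[i]))
    column = set(diffs[ref])
    orders = {ref: tuple(range(p))}
    for i, (f, h) in enumerate(luts):
        if i == ref:
            continue
        best = None
        for order in permutations(range(p)):  # p! candidates per LUT
            d = difference(permute_table(f, order), permute_table(h, order))
            candidate = column | d
            if best is None or len(candidate) < len(best[0]):
                best = (candidate, order)
        column, orders[i] = best
    return column, orders

# The two LUTs of Example 3, with contents taken from Figure 4(a).
luts = [([0, 0, 0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1, 1, 1]),
        ([0, 1, 0, 1, 0, 1, 1, 1], [0, 0, 0, 1, 0, 1, 0, 1])]
column, orders = greedy_input_orders(luts)
unpermuted = difference(*luts[0]) | difference(*luts[1])
print(len(unpermuted), len(column))  # 5 3
```

On this data the sketch keeps LUT 2 as the reference (it has the fewest changed locations) and permutes LUT 1, whereas Example 3 permutes LUT 2. Both choices reach three frames, and the loop structure makes the $(p!) \times (N - 1)$ candidate count visible.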
For reconfiguration applications, FPGA implementations of both the initial and final circuits are generated following the same flow. The reconfiguration bitstreams, which change the FPGA hardware from the initial circuit to the final circuit, are produced by comparing the initial and final FPGA implementations. The proposed optimization procedures can be added into the design flow between the placement and routing steps, as shown in Figure 7(b). After the placement phases of both the initial and final circuits, the initial and final functions of all the LUTs become available. Hence, the proposed techniques can be applied to optimize variable mappings, modify LUT don't-care locations, and find optimal LUT input orders. After this, FPGA routing can be performed accordingly.

[Figure 7. Integrating the proposed techniques into FPGA design flow. (a) Circuit description -> logic synthesis & technology mapping -> placement & routing -> generating bitstreams -> FPGA, with the proposed optimization attached to the flow. (b) Logic synthesis & technology mapping -> placement -> proposed optimization -> routing -> generating bitstreams.]

It is often difficult to have direct access to the results produced by the FPGA placement procedure. In this case, our method can be integrated as indicated by the dashed arrows in Figure 7(a). After the placement and routing (P&R) phases of both the initial and final circuits, we let the FPGA tool write the P&R results into structural VHDL files. The basic components in these VHDL files are LUTs. In addition, we let the FPGA tool generate location constraints for each LUT in the VHDL files according to the P&R results. The VHDL files along with the constraint files provide information about which LUTs are in the same column and about their initial and final functions. After applying the proposed optimization procedures, the LUT init values (which represent the LUT locations storing logic 1) are updated and new constraints regarding LUT input orders are added into the constraint files. The updated VHDL and constraint files are fed to the P&R module in the FPGA tool to re-route the FPGA circuits.

We experimented with the latter integration scenario. Due to the lack of suitable partial reconfiguration benchmark circuits, we use ISCAS85 benchmark circuits as initial FPGA circuits.
We derive the final FPGA circuits by performing random function modification on the initial circuits. In this process, we first define a set of functions, denoted as $g_1, g_2, \ldots, g_i$, which depend on variables $A_4, A_3, A_2, A_1$ (since four-input LUTs are used in our experiments). Then, we derive the final LUT functions by performing either a COMPOSE or an INTERSECT operation, using the original LUT function and one function selected from $g_1, g_2, \ldots, g_i$ as operands. COMPOSE and INTERSECT are function manipulation operations defined in the CUDD package, which is used in the implementation of our optimization procedures. The selection of the operation (COMPOSE or INTERSECT) and of the operand function ($g_1, g_2, \ldots, g_i$) is totally randomized.

The experiments are conducted on a Xilinx Virtex 1000 platform. The obtained results are summarized in Table 1. The second column of the table lists the number of LUTs assigned to each column. Several column configurations are investigated in the experiment. The third column records the required number of frames without performing any of the proposed optimizations. The fourth column summarizes the number of frames contained in the reconfiguration data when only the LUT input order permutation technique is applied. The percentage of frame reduction is given in the fifth column. With both the don't-care modification and LUT input order permutation techniques being utilized, the resulting numbers of reconfiguration frames and their corresponding savings (in percent) are summarized in the sixth and seventh columns, respectively. The results show that the proposed techniques can reduce reconfiguration frames by more than 20% on average.

Table 1. Comparing reconfiguration frames.

    Circuit  #LUT  W/o Opt.  Inp. Perm. Only   DC Opt. & Inp. Perm.
                   #Frm.     #Frm.   R(%)      #Frm.   R(%)
    C432       3     274      244    11%        236    14%
    C432       4     238      212    11%        208    13%
    C432       8     166      136    18%        136    18%
    C1355      3     142      137     4%        117    18%
    C1355      6     124      111    10%         99    20%
    C1355      9     106       95    10%         83    22%
    C1908      3     255      239     6%        143    44%
    C1908      6     198      175    12%        119    40%
    C1908      9     172      141    18%         91    47%
    C2670      3     430      389    10%        322    25%
    C2670      6     334      286    14%        251    25%
    C2670      9     276      232    16%        204    26%
    C3540      6     771      659    15%        580    25%
    C3540      9     632      506    20%        452    28%
    C3540     12     567      409    28%        377    34%
    C5315      9     769      617    20%        529    31%
    C5315     12     626      574     8%        505    19%
    C5315     15     542      440    19%        402    26%
    C6288     12    1168      964    17%        849    27%
    C6288     15     986      826    16%        786    20%
    C6288     18     852      712    16%        672    21%
    C7552     12     967      780    19%        686    29%
    C7552     15     814      660    19%        611    25%
    C7552     18     693      570    18%        539    22%

5. Concluding Remarks

This paper presents a comprehensive methodology to minimize FPGA reconfiguration data at the logic level.
The methodology is based on a framework that links the size of reconfiguration data to the number of minterms contained in LUT-column difference functions. It comprises three techniques: variable mapping optimization, don't-care location modification, and LUT input order permutation. To efficiently implement the proposed techniques, two heuristic algorithms are developed for computing compatible don't-care locations and for finding optimal LUT input orders in a large search space. The developed techniques can be combined with other methods that minimize FPGA reconfiguration data at high levels to further reduce FPGA reconfiguration cost.

References

[1] J. M. Cardoso, "On Combining Temporal Partitioning and Sharing of Functional Units in Compilation for Reconfigurable Architectures," IEEE Trans. on Computers, vol. 52, no. 10, pp. 1362-1375, 2003.
[2] M. Meribout and M. Motomura, "Efficient Metrics and High-Level Synthesis for Dynamically Reconfigurable Logic," IEEE Trans. on VLSI, vol. 12, no. 6, 2004.
[3] M. Kaul and R. Vemuri, "Temporal Partitioning Combined with Design Space Exploration for Latency Minimization of Run-Time Reconfigured Designs," in Proc. DATE, pp. 202-209, 1999.
[4] M. Kaul and R. Vemuri, "An Automated Temporal Partitioning and Loop Fission Approach for FPGA Based Reconfigurable Synthesis of DSP Applications," in Proc. DAC, pp. 616-622, 1999.
[5] K. M. GajjalaPurna and D. Bhatia, "Partitioning in time: a paradigm for reconfigurable computing," in Proc. ICCD, pp. 340-345, 1998.
[6] D. Rakhmatov and S. B. K. Vrudhula, "Minimizing routing configuration cost in dynamically reconfigurable FPGAs," in Proc. Parallel and Distributed Processing Symp., pp. 1481-1488, 2001.
[7] K. Compton, J. Cooley, and S. Knol, "Configuration relocation and defragmentation for reconfigurable computing," in Proc. IEEE Symp. FPGA Custom Computing Machines, pp. 79-80, 2000.
[8] K. Compton, Z. Li, S. Knol, and S. Hauck, "Configuration relocation and defragmentation for reconfigurable computing," IEEE Trans. on VLSI, vol. 10, pp. 209-220, 2002.
[9] Z. Huang and S. Malik, "Managing dynamic reconfiguration overhead in SoC design using reconfigurable datapaths and optimized interconnect networks," in Proc. DATE, pp. 13-16, 2001.
[10] Z. Li, K. Compton, and S. Hauck, "Configuration Caching for FPGAs," in Proc. IEEE Symp. FPGA Custom Computing Machines, pp. 22-36, 2000.
[11] S. Hauck, Z. Li, and E. Schwabe, "Configuration Compression for the Xilinx XC6200 FPGA," in Proc. FPGA Custom Computing Machines, 1998.
[12] S. Mitra, W. Huang, N. Saxena, S. Yu, and E. J. McCluskey, "Reconfigurable Architecture for Autonomous Self-Repair," IEEE Design and Test of Computers, vol. 21, no. 2, pp. 228-240, 2004.
[13] XILINX Inc., Virtex Series Configuration Architecture User Guide, 2003.
[14] XILINX Inc., Two Flows for Partial Reconfiguration: Module Based or Small Bit Manipulations, 2002.
[15] G. De Micheli, Synthesis and Optimization of Digital Circuits. McGraw-Hill, Inc., 1994.
[16] M. Damiani and G. De Micheli, "Don't Care set Specifications in Combinational and Synchronous Logic Circuits," IEEE Trans. on CAD, vol. 12, no. 3, pp. 365-388, 1993.
[17] H. Savoj and R. Brayton, "The use of Observability and External Don't cares for the Simplification of Multi-Level Networks," in Proc. DAC, pp. 297-301, 1990.
[18] S. Yamashita, H. Sawada, and A. Nagoya, "SPFD: A New Method to Express Functional Flexibility," IEEE Trans. on CAD, vol. 19, no. 8, pp. 840-849, 2000.