MODELING OF BLOCK-BASED DSP SYSTEMS Dong-Ik Ko and Shuvra S. Bhattacharyya

Department of Electrical and Computer Engineering, and Institute for Advanced Computer Studies
University of Maryland, College Park, 20742, USA

ABSTRACT

Modeling semantics based on dataflow graphs are used widely in design tools for digital signal processing (DSP). This paper develops efficient techniques for representing and manipulating block-based operations in dataflow-based DSP design tools. In this context, a block refers to a finite-length sequence of data items, such as a sequence of speech samples, an image, or a group of video frames, as part of an enclosing data stream. We develop in this paper a meta-modeling technique called blocked dataflow (BLDF) for augmenting DSP design tools with more effective blocked data support in an efficient and general manner. We compare BLDF against alternative modeling approaches through a detailed case study of an MPEG 2 video encoder system.

1. INTRODUCTION

In the digital signal processing (DSP) domain, rapid prototyping tools based on coarse-grain dataflow semantics are widely used [2]. One important requirement in these tools is support for block-based processing, such as that involved in image and video applications. We develop in this paper a blocked dataflow (BLDF) modeling approach for efficient handling of block-based data in dataflow-based DSP design tools. BLDF combines meta-modeling, block-based processing, multidimensional representation, and dynamic parameter reconfiguration in a single, unified framework that leads to more efficient dataflow graphs for scheduling and software synthesis.

In this paper, by a dataflow model of computation (dataflow MoC), we mean a programming model based on dataflow semantics. Programs in a dataflow MoC are thus represented as directed graphs in which vertices, called dataflow actors, represent computational tasks, and edges represent logical FIFO communication channels between tasks. A decidable dataflow model is one in which deadlock and unbounded buffer accumulation can be determined in finite time for every specification in the model. Examples of decidable dataflow models are CSDF [3], SDF [8], MDSDF [9] and SSDF [12].
For consistent specifications in each of these models, there is a unique, integer-valued repetitions vector that is indexed by the graph actors and gives the number of times each actor needs to be invoked to form a minimal periodic schedule for the graph.

A number of efforts have examined block processing at the level of individual actors. The objective in such vectorization is to improve throughput and reduce context-switching overhead by executing actors many times in succession. The scalable synchronous dataflow (SSDF) [12] model formalized this concept in the context of multirate dataflow graphs, and algorithms have been developed to extract the maximum vectorization potential from an SSDF graph [11]. More recently, retiming techniques have been explored for manipulating homogeneous dataflow graphs (graphs in which the production and consumption parameters are all equal to one) to improve vectorizability [6]. BLDF differs from these approaches in its applicability beyond the level of individual actors, and to arbitrary subsystems at any level of the modeling hierarchy. BLDF also differs in its close integration with parameterized dataflow semantics [1], which allows for powerful dynamic reconfiguration capabilities.

As dataflow modeling alternatives emerge further, it is highly desirable to identify new modeling features that can be achieved through novel applications of existing models rather than defining a totally new dataflow variant for each new extension. This promotes reuse and integration rather than reinvention of the growing body of knowledge on established dataflow styles. BLDF adheres to this approach by defining general mechanisms that can be used to augment existing dataflow models with systematic data grouping capabilities. It is in this sense that we refer to BLDF as a meta-model. BLDF can be used with the well-known decidable dataflow models, SDF, CSDF, MDSDF, and SSDF, as described above. Its use with other, more dynamic models such as boolean dataflow [4] and SBF [5] may be possible, although efficient application to such models requires further investigation.

2. BLOCKED DATAFLOW

Blocked dataflow builds on parameterized dataflow semantics [1].
In a blocked dataflow subsystem, blocks of input data are treated as subsystem parameters, and the initialization graphs (the subinit or init graphs, as described below) are used in-between processing of successive blocks to change the value of the associated block parameter. Thus successive blocks of data are translated into successive reconfigurations of block-parameter values.

For example, consider an image processing system that performs a given filtering operation on a stream of input images. A blocked dataflow representation might define the processing of a single image using a dataflow graph G_c. The graph G_c operates on input from a special image source actor that is parameterized with an image I. The image source actor simply transfers its image parameter to its output according to the desired protocol. The transfer protocol involves both rasterization aspects, and may also involve sub-blocking (e.g., outputting the image as a sequence of row blocks). Such sub-blocking can be used to define nested BLDF subsystems.

BLDF inherits most features of parameterized dataflow [1]. Thus, a BLDF specification (or subsystem) also consists of three distinct graphs: 1) the init graph i; 2) the subinit graph s; and 3) the body graph b. Intuitively, the body graph models the main functional behavior of the subsystem, whereas the init and subinit graphs control the behavior of the body graph by appropriately configuring the body graph parameters. The init graph is invoked prior to each invocation of the associated (hierarchical) parent subsystem, while the subinit graph is invoked prior to each invocation of the associated body subsystem b, thus allowing for two distinct frequency levels of reconfiguration control [1].
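The relationship between the subinit and body graphs can be illustrated with a minimal Python sketch. This is not the authors' Ptolemy II implementation; the class `BlockedSubsystem` and the functions `image_subinit` and `filter_body` are hypothetical names introduced only to show each incoming block being turned into parameter values before the body runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class BlockedSubsystem:
    """Hypothetical BLDF-style subsystem: subinit sets parameters, body uses them."""
    subinit: Callable[[list], Dict[str, object]]   # block -> parameter values
    body: Callable[[Dict[str, object]], object]    # parameters -> result
    params: Dict[str, object] = field(default_factory=dict)

    def run(self, blocks: Iterable[list]) -> List[object]:
        results = []
        for block in blocks:
            # subinit runs before every body invocation and reconfigures parameters
            self.params = self.subinit(block)
            results.append(self.body(self.params))
        return results


def image_subinit(image):
    # The block itself becomes a parameter, alongside values derived from it.
    return {"image": image, "rows": len(image), "cols": len(image[0])}


def filter_body(params):
    # Stand-in for the filtering operation: sum of all pixels in the image parameter.
    return sum(sum(row) for row in params["image"])


subsystem = BlockedSubsystem(subinit=image_subinit, body=filter_body)
totals = subsystem.run([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
```

Each element of `totals` corresponds to one block, so successive blocks appear as successive parameter reconfigurations rather than as tokens on a body input port.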

2.1 Iteration control

The major enhancement in BLDF is the delivery method of data tokens to body graphs. In BLDF, blocked data tokens such as sequential MPEG2 video streams are delivered via the parameter value updating process of init or subinit graphs. An init or subinit graph can therefore extract the information relevant to the associated body graph from the raw data tokens delivered, and then convert the raw data tokens, together with the extracted information, into sets of new parameter values for the body graph. Thus, raw data tokens are delivered to the associated body graph as parameters, along with other parameters extracted from them, before the body graph starts running.

Figure 1 shows the mechanism by which BLDF builds on parameterized dataflow semantics. Since the body graph of Figure 1(a) takes image frames directly from the outside, without any parameterization process within an init or subinit graph, it is not possible to extract important information such as the iterations of the associated body graph, nor to define the detailed operation of each actor within that body graph by setting iteration limits. In Figure 1(b), on the other hand, image frames are transferred to the subinit graph and then converted into a block of parameters, which are set as parameters of each relevant actor in the associated body graph. Figure 1(b) allows dynamic configuration of body graph parameters, such as image resolution and block size as basic processing units, along with other provisional parameters at the stage of the subinit graph, which directs the detailed operation of the associated body graph before that body graph starts an invocation. At the same time, the iterations of each actor within a body graph can be obtained along with the other parameters.

Suppose, for example, that an init or subinit graph takes a Z pixel frame from its input port. The init or subinit graph can obtain Z / N^2 iterations of the associated body graph actor by setting the block size parameter for the body graph to N, by which image frames are divided into sub-image frames. Each actor within the body graph then operates on the basis of sub-image frames for high throughput and more parallelism.
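As a worked instance of the Z / N^2 computation, the following Python sketch derives an iteration count from a frame size and a block size. The function name `body_iterations` and the 720 x 480 resolution are assumptions chosen for illustration; they do not come from the paper.

```python
def body_iterations(frame_pixels: int, block_size: int) -> int:
    """Number of N x N sub-image blocks in a frame of Z pixels (Z / N^2)."""
    block_pixels = block_size * block_size
    if frame_pixels % block_pixels != 0:
        raise ValueError("frame does not divide evenly into N x N blocks")
    return frame_pixels // block_pixels


width, height = 720, 480          # assumed example resolution
iterations = body_iterations(width * height, block_size=16)
print(iterations)                 # 1350 blocks of 16 x 16 pixels
```

A subinit graph would perform this kind of calculation before the body graph is invoked, so the count is known when the schedule for that invocation is formed.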
Iteration numbers may be used further as factors in a quasi-static looped schedule by a BLDF scheduler. Obtaining parameters relevant to the scheduling of the associated body graph before it runs, and reconfiguring those parameters dynamically based on the relevant payloads of tokens delivered at runtime, gives an application developer enhanced flexibility and efficiency in the design phase.

2.2 Token delivery

One of the advantages of BLDF is its efficiency in token delivery. First, in token delivery, BLDF enables us to reduce buffers for delivering tokens among actors. This is because tokens can be delivered from parent graphs to nested body graphs by parameterization. Figure 2 shows how BLDF reduces buffering requirements in this way. In Figure 2, the D actor requires both a and b tokens, while the A, B and C actors require only token a. Here, suppose also that a sample rate change from A to D exists in the specification. Then in Figure 2(a), the A, B and C actors must have additional input/output ports only for delivering token b to D with sample rate consistency. This in turn causes redundant or extra buffers between intermediate actors. However, in Figure 2(b), the subinit graph s converts input data into two parameters a and b, and then token a is set to actor A as a parameter while token b is set to the actor D directly as a parameter, while maintaining sample rate consistency. This parameterization process enables us to remove redundant connections and buffers between actors in BLDF.

2.3 Data tokens with nested headers

Most multimedia data tokens consist of a header part and a payload part. The header part has the information for handling the payload. However, the payload also may have sub-header and sub-payload components. Therefore, each level of composite actors implemented hierarchically or heterogeneously may process a different area of a packetized multimedia data token. BLDF provides an efficient way for delivering data tokens to composite actors of lower hierarchical levels by parameterization. Only the relevant part needs to be decoded for configuration and the remaining parts can be encapsulated as parameters for composite actors of lower hierarchical levels in the dataflow specification.
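The level-by-level handling of nested headers can be sketched in Python as follows. The token layout (a dictionary with `header` and `payload` keys) and the function names are assumptions made for this illustration only; real MPEG2 streams are bit-packed and are not modeled here.

```python
def decode_level(token: dict) -> tuple:
    """Read only this level's header; return it with the untouched payload."""
    return token["header"], token["payload"]


def run_level(token: dict, depth: int = 1, trace=None) -> list:
    """Configure one hierarchical level, then hand the payload down as a parameter."""
    trace = [] if trace is None else trace
    header, payload = decode_level(token)
    trace.append((depth, header))             # configuration uses this header only
    if isinstance(payload, dict) and "header" in payload:
        run_level(payload, depth + 1, trace)  # nested level decodes the next header
    else:
        trace.append((depth, payload))        # innermost payload is processed here
    return trace


stream_token = {
    "header": {"kind": "sequence"},
    "payload": {
        "header": {"kind": "picture", "type": "I"},
        "payload": [16, 32, 48],
    },
}
trace = run_level(stream_token)
```

Each level touches only its own header, and the remaining payload is passed down unopened, which mirrors the encapsulation of undecoded parts as parameters for lower levels.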
Figure 3 shows how data tokens with nested headers can be handled in BLDF. Decoding headers sequentially according to the need for the associated header information allows us to implement each module within an application consistently, which is easy to understand for future code reuse. This approach also reduces the number of connections and buffers between actors by parameterization.

Figure 1. PSDF and BLDF: a) PSDF approach; b) BLDF approach. A: major data tokens (e.g. image frames); B: general data tokens for parameterization.

Figure 2. BLDF and SDF: a) SDF; b) BLDF. param(): parameterization; s: subinit graph; b: body graph; a, b: tokens being delivered.

Figure 3. Data tokens with nested headers.

3. APPLICATION EXAMPLE

3.1 Brief review of MPEG2 video streams

The MPEG2 specification has been widely selected as a standard for coding/decoding moving picture frames. Therefore, many modern embedded systems handling multimedia integrate MPEG2 decoders. This paper has selected MPEG2 as one example of a real field application for an embedded system. The MPEG2 specification roughly consists of three parts: the video, audio and system parts. In this paper, we focus on the video part to show differences in efficiency, flexibility and extensibility among alternative modeling formats.

Moving pictures are made from combinations of consecutive image frames. Each image frame is composed of pixels and each pixel has its own value representing the degree of RGB or YCrCb. Pixel values are not independent but are correlated with their neighbors. Therefore, the value of a pixel is predictable, given the values of neighboring pixels. Image frames usually have redundant information in view of image compression, which can be categorized into two redundancies: spatial redundancy and temporal redundancy, based on whether they are exploited in relation with neighboring frames or not. Spatial redundancy is redundant information lying in an intra frame while temporal redundancy is redundant information lying between inter-frames.

The MPEG2 specification separates image frames into three different types (I, P and B frames). I frames exploit only spatial redundancy, while P and B frames exploit both spatial redundancy and temporal redundancy. Thus, an I frame does not refer to neighboring image frames for reducing redundant information within itself and plays the role of an anchor frame to separate groups of pictures from continuous image frames. Even though the P and the B frames exploit both spatial redundancy and temporal redundancy, there are different features between P and B frames in view of control flow. The P frame reduces redundant information by referring to a previous I or P image frame as a reference frame, differentiating pixel values between the current P frame and the reference frame, and exploiting spatial redundancy like the I frame. In contrast, the B frame requires two reference frames (a previous I or P frame and a future I or P frame) as reference frames for reducing temporal redundancy.
The difference in the number of reference frames among frame types makes it difficult to express an MPEG2 encoder in pure SDF form.

3.2 Problems in design of an MPEG video encoder with SDF

The problems in designing an MPEG2 video encoder using only SDF semantics arise from the dynamic changes in MPEG2 video streams. Some actors inside the MPEG2 encoder dynamically change their operation based on the content of the data tokens being delivered to them, while other actors maintain their operation consistently. Also, motion compensation demands that image frames are encoded in a different order from the order in which they are transferred to the encoder. More specifically, the problems in designing an MPEG2 video encoder under SDF are as follows.

P1. Control problem. Every actor under SDF must consume and produce at least one token, which means that every connection between actors has to deliver at least one token during one invocation of the enclosing system. However, it is possible that some actors need special tokens from their input ports only in special cases and in other cases do not need any token. This situation arises in actors of an MPEG2 video encoder.

P2. Consistent schedule problem. Data tokens can be categorized into two sub-classes: major data tokens every actor is concerned with, and additional data tokens that are relevant for proper subsets of actors. Some actors of an MPEG2 video encoder require additional input or output ports that are only for delivering additional tokens. Those tokens have features of parameters and are usually used for setting the internal state of actors. With such additional input or output ports only for delivering tokens to other actors, as the layout of applications gets more and more complex, the possibility of introducing sample rate inconsistency to the dataflow signal processing increases. SPDF (Synchronous Piggybacked Data Flow) [10] suggested a piggybacked way to solve this problem. However, [10] also cannot avoid unnecessary and redundant delivery of the information, even if the methods of [10] are used to reduce buffers in a piggybacked way, which delivers only a pointer to an entry in the global state table.

P3. Iteration counts.
Obtaining actor iteration counts at compile time is a major advantage in SDF. It reduces the overhead of scheduling problems at runtime. However, in general, the invocations of each actor can vary dynamically based on data being delivered. Such scenarios are not handled by SDF. Also, an application developer may wish to manually set or dynamically change iteration numbers of special actors for low power requirements or quick user response time, which will affect iteration counts of subsequent actors. Such situations are also not permitted in SDF. However, in BLDF, iteration numbers of subsequent actors can be determined at the init or subinit stage by extracting corresponding information from data tokens delivered and reconfiguring the associated parameters, while allowing for low overhead quasi-static scheduling, as in parameterized dataflow [1]. This is possible through blocked parameter delivery in BLDF, which takes a block of input tokens, e.g. image frames at the init or subinit stage, and then converts them into blocked parameters along with other parameters. At the same time, important configuration information such as the resolution of an image frame and basic processing unit size (block size) can be used for dynamically calculating iteration counts of relevant actors in the associated body graph.

P4. Saving buffers and reducing unnecessary delivery. BLDF allows us to optimize data token delivery by parameterization. By parameterization, low overhead, low frequency connections between actors can be used. As mentioned in P2, we have two kinds of data tokens: tokens every actor requires and tokens that are relevant for individual actors. The second type of tokens can be directly delivered to the associated actors by parameter settings processed at the init or subinit stage. This allows us to remove unnecessary data delivery as well as unnecessary buffering requirements, as will be demonstrated in Section 4.

4. EXPERIMENTS

We have prototyped a preliminary version of BLDF semantics in Ptolemy II [7], a widely-used tool for developing and integrating models of computation.

4.1 MPEG2 Video encoder implementation

We have implemented an MPEG2 Video encoder under the Ptolemy II environment in three different ways, including using BLDF, and have compared the resulting models in efficiency and flexibility.

Method 1. FSM and SDF combination

An application developer often considers FSMs (Finite State Machines) when designing an application with nontrivial control flow. An MPEG video encoder clearly has features of dataflow, along with nontrivial control flow. In this method of implementation, we have used the two combined models of computation, SDF and FSM, in a heterogeneous and hierarchical way, using the heterogeneous modeling capabilities of Ptolemy II. Figure 4 illustrates our resulting design. Our FSM representation within the MPEG2 video encoder has three states, each refined to a different SDF subgraph depending on the type of image frame: I, P or B. Since an I frame is coded by exploiting only spatial redundancy, the SDF graph shown in Figure 4(c) for I frame processing does not have a motion compensator module. The SDF graph shown in Figure 4(d) for P frame processing, which refers to only a previous I or P frame, has one motion compensator module, while the SDF graph shown in Figure 4(e) for B frame processing, which refers to both previous and future I or P frames, has two motion compensator modules. Here, it is useful to focus on two special functional blocks: MPEGQuantizer and ReferenceFrame, which help to distinguish our alternative encoder implementations.

MPEGQuantizer. This block needs a picture ID token to identify what image frames are delivered to it. MPEGQuantizer is placed after several preceding actors that are not concerned about the picture ID token. In implementation method 1 and method 2 (introduced below), the picture ID token must go through all preceding actors to the target actor, MPEGQuantizer, which, due to sample rate changes through the preceding actors, consumes that token to avoid an inconsistent schedule.

ReferenceFrame. This block operates differently, depending on the type of image frame delivered, and uses dummy tokens with "0" values:

Case 1: When an I frame comes, ReferenceFrame produces "0" values to the output ports both for the previous and for the future reference frame. This is because an I image frame does not perform motion compensation. ReferenceFrame consumes the I frame from its input port and updates its reference frame with the I frame. Here, ReferenceFrame has initial tokens as with a delay actor, for it is connected within a feedback loop.

Case 2: When a P frame comes, ReferenceFrame produces the previous I or P frame, which was saved in the previous cycle, for the previous reference frame and a "0" value for the future reference frame. As with an I frame, the P frame is also saved as a reference frame inside ReferenceFrame.

Case 3: When a B frame comes, ReferenceFrame produces the two saved reference frames (P and I frames) to the output ports. However, since a B frame is not used as a reference frame, it is discarded and not used for updating the reference frames inside ReferenceFrame.

Figure 4. FSM and SDF combination: a) MPEG2 Encoder (Top); b) Inside the FSM; c) I Frame encoder; d) P Frame encoder; e) B Frame encoder.

In summary, this implementation method (Method 1) can satisfy problem P1; however, P2, P3 and P4 remain unsolved.

Method 2. SDF

In this method, we have implemented an MPEG2 Video encoder without integrating the FSM model of computation. All functional blocks inside are the same as in method 1. However, method 2 does not have separated I, P and B sub-encoders, so all image frames go through two motion compensators with real values or dummy values depending upon the image frames. This implementation simplifies the design of an MPEG2 Video encoder. However, it still has the same problems (P2, P3 and P4) unsolved, as with method 1.

Method 3. BLDF

In this method, we separate the functional blocks of an MPEG2 video encoder into two parts: subinit and body graph. The actors configuring the body subsystem are placed in the subinit graph, and the actors actually processing image frames are placed in the body subsystem.
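The three ReferenceFrame cases listed under Method 1, which Method 2 also relies on to feed real or dummy values to its two motion compensators, can be sketched in Python as below. This is an interpretation rather than the authors' actor: it assumes the block keeps the two most recently saved I or P frames and that frames arrive in encoding order, so a B frame arrives after its future reference. The class layout and the string frame labels are hypothetical.

```python
class ReferenceFrame:
    """Sketch of the three cases; 0 stands for the dummy token."""

    def __init__(self):
        # Initial tokens, as with a delay actor in a feedback loop.
        self.older = 0
        self.newer = 0

    def fire(self, frame_type: str, frame):
        """Return (previous_reference, future_reference) for the incoming frame."""
        if frame_type == "I":
            outputs = (0, 0)                    # no motion compensation
            self._save(frame)
        elif frame_type == "P":
            outputs = (self.newer, 0)           # last saved I or P frame only
            self._save(frame)
        elif frame_type == "B":
            outputs = (self.older, self.newer)  # both saved frames; B is discarded
        else:
            raise ValueError(f"unknown frame type: {frame_type}")
        return outputs

    def _save(self, frame):
        # Keep the two most recent reference frames (assumption of this sketch).
        self.older, self.newer = self.newer, frame


ref = ReferenceFrame()
outputs = [ref.fire(t, f) for t, f in [("I", "I0"), ("P", "P1"), ("B", "B2")]]
```

Because every firing produces a value on both outputs, real or dummy, the block keeps fixed token rates on its ports regardless of frame type.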
First, the subinit graph obtains information for configuring the body subsystem from data tokens delivered to itself and then converts the image data tokens themselves into blocked parameters for the body subsystem, along with other parameters, such as block size and picture ID, obtained from the image data tokens. In parameterized dataflow, blocked data tokens such as image frames go directly to the body graph. An init or subinit graph manipulates only data tokens with parameter features for the body subsystem. Therefore, an init or subinit graph cannot obtain parameters such as image resolution or block size for manipulating iteration numbers of the actors in the associated body graph. Early knowledge of the iteration count of each functional block for the body subsystem gives more efficiency and flexibility in manipulating and predicting actors of the associated body graph. Above all, an iteration count acts as a factor in a looped schedule of quasi-static scheduling in BLDF. Thus, a more efficient quasi-static schedule of the associated body graph can be established, while keeping much of the advantage (the predictability) of SDF in the schedule. The name BLDF originates from this feature: a block of data tokens is packaged as parameters and then delivered to the associated body subsystem.

Blocked data token delivery in BLDF enables us to reduce the dimensions of MDSDF [9] by processing multidimensional data tokens dimension by dimension with blocked data processing of nested BLDF subsystems. At the same time, BLDF can be used in conjunction with MDSDF, with BLDF parameter control used to define the boundaries of processing to be performed using MDSDF semantics.

Figure 5 shows the iteration counts of the functional blocks in the associated body subsystem and how iteration counts are used as factors in a looped quasi-static schedule of the MPEG2 video encoder application. Here, the init subsystem contains the following three actors.

ImageFrameParameterizer. This actor delivers image frames to the ImagePropagator actor of the body subsystem as BLDF parameter values.

MPEGHeaderGenerator. This actor generates the picture ID for the associated body subsystem. The parameterized token delivery of the picture ID relieves the associated body graph of the complicated meshed layout of an MPEG2 video encoder and of the inconsistent scheduling problem (P2).

BlockSize. This actor sets the block size parameter value for the associated body subsystem, which is the basic processing unit by which a full image frame is divided into groups of sub-images for high throughput and more parallelism. Each functional block in the associated body subsystem processes an image frame on the basis of sub-images defined in this manner.

In the body subsystem, it is useful to focus on two functional blocks: the MPEGQuantizer and ReferenceFrame. These two actors have additional input ports for the picture ID token in methods 1 and 2, but in BLDF no additional input port for the picture ID token is needed any longer, since the tokens are delivered to these actors as parameters, not tokens. The parameterized token delivery simplifies the layout of the MPEG2 video encoder. It also removes the redundant connections from all preceding actors to the target actor that actually consumes that information, and with them the inconsistent schedule problem. Also, this method allows dynamic configuration of parameters at runtime. The subinit graph analyzes the tokens delivered to itself and then sets parameters of the associated body subsystem based on the runtime need for parameter value delivery. Parameters maintain their value consistently during one iteration of the associated body graph. Figure 6 shows our implementation of the MPEG2 video encoder application under BLDF.
4.2 Comparison

Method 1 (FSM + SDF combination) has three different SDF graphs, to which the three states of the FSM are refined. However, each refined SDF graph shares most of its actors with the other refined graphs, so redundant copies of actors exist across the refined SDF graphs.

Method 2 (SDF) merges the three sub-encoders of method 1 into one common encoder, and thus removes the problem of redundant (duplicated) actors. However, it still leaves problems P2, P3, and P4 unsolved. Unnecessary connections for picture ID delivery must be established through preceding actors, most of which do not need the picture ID, in order to avoid an inconsistent schedule when the sample rate of tokens changes.

Method 3 (BLDF) has a layout similar to that of method 2, except that the connections for delivering the picture ID are removed due to parameterized token delivery. This makes the layout of the encoder much simpler than in method 2. In addition, since parameters of the body subsystem are set dynamically by the subinit graph, method 3 provides more flexibility and extensibility in the design and maintenance of the application, especially by making room for future changes of the specification, along with improved efficiency in the design by reducing connections between functional blocks.

To illustrate this efficiency advantage, Table 1 shows how many buffers and connections can be saved in BLDF as the application complexity increases. In the MPEG2 application, we have two actors, named MPEGQuantizer and InverseMPEGQuantizer, that require additional tokens for internal setting of values. The number of connections and the number of buffers can be calculated by multiplying the number of preceding actors by the number of tokens for parameters.

preceding actors : n
tokens for parameters : m
connections : n*m
buffers : n*m

Therefore, in general, n*m unnecessary connections and buffers between preceding actors can be saved in BLDF, compared with alternative modeling formats.

Figure 5. Blocked data delivery in BLDF

Figure 6. MPEG2 Encoder under BLDF: a) MPEG2 Encoder in BLDF (Top); b) subinit graph; c) body graph

5. CONCLUSIONS

This paper has developed blocked dataflow (BLDF) modeling semantics for augmenting dataflow-based DSP design tools with integrated capabilities for meta-modeling, block-based processing, multidimensional representation, and dynamic parameter reconfiguration. BLDF builds on parameterized dataflow semantics, and is compatible with decidable dataflow models such as CSDF, MDSDF, SDF, and SSDF. This paper has described the semantics of BLDF, and illustrated its efficiency through a case study of an MPEG 2 video encoder system. Useful directions for further study include optimized synthesis, hardware/software partitioning algorithms, and automated verification from BLDF specifications.

ACKNOWLEDGEMENTS

This research was supported by the Advanced Sensors Collaborative Technology Alliance, and by DARPA (contract number MDA972-00-1-0023, through Brown University).

REFERENCES

[1] B. Bhattacharya and S. S. Bhattacharyya. Parameterized dataflow modeling for DSP systems. IEEE Transactions on Signal Processing, 49(10):2408-2421, October 2001.
[2] S. S. Bhattacharyya, R. Leupers, and P. Marwedel. Software synthesis and code generation for DSP. IEEE Transactions on Circuits and Systems -- II: Analog and Digital Signal Processing, 47(9):849-875, September 2000.
[3] G. Bilsen, M. Engels, R. Lauwereins, and J. A. Peperstraete. Cyclo-static dataflow. IEEE Transactions on Signal Processing, 44(2):397-408, February 1996.
[4] J. T. Buck. Static scheduling and code generation from dynamic dataflow graphs with integer-valued control systems. In Proceedings of the IEEE Asilomar Conference on Signals, Systems, and Computers, pages 508-513, October 1994.
[5] B. Kienhuis and E. F. Deprettere. Modeling stream-based applications using the SBF model of computation. In Proceedings of the IEEE Workshop on Signal Processing Systems, pages 385-394, September 2001.
[6] K. N. Lalgudi, M. C. Papaefthymiou, and M. Potkonjak. Optimizing computations for effective block-processing. ACM Transactions on Design Automation of Electronic Systems, 5(3):604-630, July 2000.
[7] E. A. Lee. Overview of the Ptolemy project. Technical Report UCB/ERL M01/11, Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, March 2001.
[8] E. A. Lee and D. G. Messerschmitt. Static scheduling of synchronous dataflow programs for digital signal processing. IEEE Transactions on Computers, February 1987.
[9] P. K. Murthy and E. A. Lee.
Multidimensional synchronous dataflow. IEEE Transactions on Signal Processing, 50(8):2064-2079, August 2002.
[10] C. Park, J. Chung, and S. Ha. Efficient dataflow representation of MPEG-1 audio (Layer III) decoder algorithm with controlled global states. IEEE Workshop on Signal Processing Systems (SiPS): Design and Implementation, Taiwan, ROC, October 1999.
[11] S. Ritz, M. Pankert, and H. Meyr. Optimum vectorization of scalable synchronous dataflow graphs. In Proceedings of the International Conference on Application Specific Array Processors, October 1993.
[12] S. Ritz, M. Pankert, and H. Meyr. High level software synthesis for signal processing systems. In Proceedings of the International Conference on Application Specific Array Processors, August 1992.

Table 1. Comparison of three methods in "Buffer memory" and "Token delivery"

Notation: #B = number of buffers; #W = number of words; #WpB = number of words per buffer; #W = #B * #WpB. The picture ID occupies 1 word per buffer (#WpB = 1).

                              SDF + FSM                   SDF                 BLDF

MPEGQuantizer actor
  # of preceding actors       3 (I), 4 (P), 5 (B)         5                   5
  # of tokens for parameters  1                           1                   1
  connections                 I subencoder: 3*1 = 3       5*1 = 5             0
                              P subencoder: 4*1 = 4
                              B subencoder: 5*1 = 5
  buffers                     I subencoder: 3*1 = 3       5*1 = 5             0
                              P subencoder: 4*1 = 4
                              B subencoder: 5*1 = 5

InverseMPEGQuantizer actor
  # of preceding actors       1                           1                   1
  # of tokens for parameters  1                           1                   1
  connections                 I subencoder: 1*1 = 1       1*1 = 1             0
                              P subencoder: 1*1 = 1
                              B subencoder: 1*1 = 1
  buffers                     I subencoder: 1*1 = 1       1*1 = 1             0
                              P subencoder: 1*1 = 1
                              B subencoder: 1*1 = 1

Total
  #B                          (3+4+5)+(1+1+1) = 15        (5)+(1) = 6         0 buffers
                              buffers                     buffers
  #W = #B * #WpB              15 * 1 = 15 words           6 * 1 = 6 words     0 words
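The totals in Table 1 follow directly from the n*m rule of Section 4.2. The short Python sketch below reproduces that arithmetic from the per-actor counts listed in the table. It is a worked check, not part of the original paper: the names `PRECEDING_ACTORS`, `routed_links`, `total_buffers`, and `total_words` are hypothetical, and BLDF is modeled as an empty list of routed graphs on the assumption that parameterized delivery needs no picture ID connections at all.

```python
from typing import Dict, List

WORDS_PER_BUFFER = 1  # Table 1: the picture ID takes 1 word per buffer
PARAM_TOKENS = 1      # m: tokens for parameters (only the picture ID)

# n: number of preceding actors in front of each parameter-consuming actor,
# one entry per encoder graph in which the picture ID must be routed as a token.
# SDF + FSM has three sub-encoders (I, P, B); SDF has one common encoder;
# BLDF is modeled with no routed graph (assumption stated in the lead-in).
PRECEDING_ACTORS: Dict[str, Dict[str, List[int]]] = {
    "SDF + FSM": {"MPEGQuantizer": [3, 4, 5], "InverseMPEGQuantizer": [1, 1, 1]},
    "SDF": {"MPEGQuantizer": [5], "InverseMPEGQuantizer": [1]},
    "BLDF": {"MPEGQuantizer": [], "InverseMPEGQuantizer": []},
}


def routed_links(n: int, m: int = PARAM_TOKENS) -> int:
    """Connections (and, equally, buffers) needed to route m tokens past n actors."""
    return n * m


def total_buffers(method: str) -> int:
    """Sum n*m over every actor and every encoder graph of the given method."""
    return sum(routed_links(n)
               for counts in PRECEDING_ACTORS[method].values()
               for n in counts)


def total_words(method: str) -> int:
    """#W = #B * #WpB."""
    return total_buffers(method) * WORDS_PER_BUFFER
```

With these inputs, `total_buffers` gives 15 for "SDF + FSM", 6 for "SDF", and 0 for "BLDF", which matches the Total rows of Table 1.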