CSE 237A FALL 2006
PROF. TAJANA SIMUNIC ROSING

Midterm

NAME:
ID:
Solutions

Problem   Max. Points   Points
1         20
2         20
3         30
4         25
5         25
6         30
Total     150

INSTRUCTIONS:
1. There are 6 problems on 11 pages worth a total of 150 points. Please take a moment to make sure that your test is complete and readable.
2. SHOW YOUR WORK. Credit will not be given for answers with no work shown.
3. Open book, open notes, but closed computers, cell phones, PDAs and friends.
4. Please sign the statement below before you begin.

HONOR CODE: By signing my name below I hereby certify that I have neither given, nor received assistance in completing this examination.

CSE237a Midterm exam Prof. Tajana Simunic Rosing
1. [20 pts] Write if the following statements are True (T) or False (F).

Part  T/F  Statement
-     F    Example: My CSE237a final project is done.
1     T    Causality analysis checks if Esterel code is deadlock free.
2     F    Cyclic executive RTOS uses preemptive, dynamic schedulers.
3     F    Kahn processes use nonblocking FIFOs for communication.
4     T    Synchronous data flow graphs are a special case of Petri nets.
5     F    StateCharts can be used to model distributed systems.
6     T    TinyOS has rich and efficient concurrency support.
7     F    Sampling at 10 MHz can fully reconstruct a waveform with max freq of 20 MHz.
8     F    Reachability is defined for transitions in a Petri net.
9     F    SDL timers are great for modeling hard real time deadlines.
10    T    RT-CORBA bounds the priority inversion time.
11    T    In Verilog there can be events that happen at times shorter than a clock tick.
12    T    TTP uses TDMA.
13    F    Scratchpad memory is used in addition to L1 cache in processors.
14    F    VxWorks is a good example of a microkernel RTOS.
15    F    Lamport's logical clocks are strongly consistent.
16    F    Motes are capable of image detection in still photos for security applications.
17    F    DSPs are typically RISC processors.
18    F    Increasing cache's set associativity normally lowers its power consumption.
19    F    CAN protocol needs arbitration to resolve priority issues.
20    T    Priority arbiter and DMA can operate together to transfer data from peripherals.
2. [20 pts] A WFQ scheduler has four queues, A, B, C and D, with weights 4, 1, 3 and 2 respectively. The outgoing link speed is 10 bits/sec. At time 0 the queues have the following status (packet sizes in bits):

Queue  Weight  P1  P2  P3
A      4       1   3   5
B      1       5   3   -
C      3       3   3   -
D      2       5   3   5

a) [10 pts] At what real time in seconds do packets P3 in A & D and P2 in B & C finish?

While all 4 queues are full, the round rate is 1/(4+1+3+2) = 1/10.
The total number of bits to transmit is 14 + 12 + 10 = 36 (the sums of the P1, P2 and P3 columns), so at 10 bits/sec the last packet is out at 3.6 sec.

b) [10 pts] At what round number is packet P2 done transmitting from queue A?
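The arithmetic in part a) can be checked with a short sketch. It assumes, as the solution does, that all packets are already queued at time 0 and that the link never idles, so the last packet leaves once every queued bit has been sent. The names `queues`, `LINK_SPEED` and `last_departure_time` are illustrative only.

```python
# Packet sizes in bits per queue, taken from the table in problem 2.
# The weights do not affect when the LAST packet leaves a work-conserving link.
queues = {
    "A": {"weight": 4, "packets": [1, 3, 5]},
    "B": {"weight": 1, "packets": [5, 3]},
    "C": {"weight": 3, "packets": [3, 3]},
    "D": {"weight": 2, "packets": [5, 3, 5]},
}
LINK_SPEED = 10  # bits per second


def last_departure_time(queues, link_speed):
    """Return (total bits queued, time in seconds at which the link drains)."""
    total_bits = sum(sum(q["packets"]) for q in queues.values())
    return total_bits, total_bits / link_speed


total_bits, finish_time = last_departure_time(queues, LINK_SPEED)
round_rate = 1 / sum(q["weight"] for q in queues.values())
print(total_bits, finish_time, round_rate)
```

The weights only enter through the round rate, which is why the finish time of the final packet depends on the total backlog alone.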
3. [30 pts] Your job is to design a sensing platform whose block diagram is shown below. SRAM has a 16 bit address and a 16 bit data port. The sensing/detection sequence starts with DSP sending an address of a digitized wave (W) to ADC. Digital to analog conversion (DA) is performed on five 16 bit samples stored in memory. It is followed by the actuation of the wave with one of the sensors (A). Another device senses the wave samples (S), which are then digitized (AD) into five 16 bit values. To detect feature 1 the system has to run first filter 1 (F1), followed by filter 2 (F2). Detecting feature 2 requires first filter 3 (F3) followed by filter 1 (F1). The results of both detections (16 bits of data each) are stored. Performance of the tasks in ms is listed in the tables below.

Performance (ms)
Label  Task              DSP  XScale
W      Wave creation     1    -
F1     Filter 1          2    10
F2     Filter 2          2    -
F3     Filter 3          -    3
D1     Detect feature 1  1    -
D2     Detect feature 2  -    2

Label  Task                          Device  Perf. (ms)
A      Actuate (all samples)         Sensor  2
S      Sense (all samples)           Sensor  2
AD     Analog to digital (all data)  ADC     1
DA     Digital to analog (all data)  ADC     1
MW     Write (16 bits)               SRAM    1
MR     Read (16 bits)                SRAM    1
B      Bus transfer (16 bits)        Bus     1
3. cont.

a) [15 pts] Draw the task graph for this sensing platform; pay careful attention to dependencies between tasks and hardware components. You may assume that DSP and XScale have large enough data caches to hold their input data and the results of every task they perform. ADC can execute AtoD and DtoA conversions in parallel. Use the task labels provided in the tables. Start with the wave creation task (W) as shown below.
3. cont.

b) [15 pts] Show the minimum latency schedule for simultaneous detection of both features. Wave creation task starts at time 0 ms as shown in the table below. Data processing is done on five 16 bit data samples. Time is in ms.

HW / Time 0-19 (entries per resource, in time order):
Sensor 1   A A
Sensor 2   S S
ADC        DA AD
DSP        W
XScale     -
SRAM       MR MR MR MR MR
Bus        B B B B B B B B B B

HW / Time 20-39 (entries per resource, in time order):
Sensor 1   -
Sensor 2   -
ADC        -
DSP        F1 F1 F2 F2 D1
XScale     F3 F3 F3 F1 F1 F1 F1 F1 F1 F1 F1 F1 F1 D2 D2
SRAM       MW MW
Bus        B B

Comments:
1. Memory read/write involves bus transfer. In the case of a read, we first read, then transfer the read data over the bus. The opposite occurs with data writes. The cost of memory accesses is proportional to the amount of data to R/W.
2. Sensing and actuation can happen in parallel, but DA/AD can't, due to the sequence between creating a wave and starting to detect it.
3. Once data is sensed, it is transferred to the on-chip cache of DSP/XScale, so there is no memory access needed here.
4. Although DSP executes filter 1 much faster than XScale, the cost of data transfer between the two is higher than just executing on XScale. The data transfer would have to happen twice, since only XScale runs the final detection step for feature 2 (D2).
4. [25 pts] An SDF is shown below (edge labels are in dark grey):

a) [3 pts] Define its incidence matrix

G = [ -2   0   1 ]
    [  2   0  -1 ]
    [  3  -1   0 ]
    [  0   2  -3 ]

b) [2 pts] What is its rank?

Rank is 2.
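The rank in part b) can be confirmed with a small Gaussian elimination over exact fractions. This is only a sketch: it assumes the 12 matrix entries form a 4 x 3 matrix (one row per edge, one column per actor A, B, C), and `matrix_rank` is a helper written for this illustration.

```python
from fractions import Fraction

# Incidence matrix from part a), assumed layout: rows = edges, columns = A, B, C.
G = [
    [-2, 0, 1],
    [2, 0, -1],
    [3, -1, 0],
    [0, 2, -3],
]


def matrix_rank(rows):
    """Rank by Gauss-Jordan elimination on exact rational numbers."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


rank = matrix_rank(G)
print(rank)
```

With three actors, a rank of 2 (one less than the number of actors) is the usual consistency condition for an SDF graph to admit a periodic schedule.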
4. cont.

c) [10 pts] System constraints specify that task A has to execute before task B which has to execute before task C. Is there a PASS schedule? If not, change the SDF so that there is a PASS. Derive the PASS schedule.

6A - 2B = 0 and 2B - 3C = 0  =>  3A = B, 2B = 3C
Smallest q = [1 3 2]
PASS = {A BBB CC}

d) [5 pts] Derive the initial condition for the schedule defined in part c)

Buf(0) = [2 0 0 0]

e) [5 pts] Derive the buffer sizes for the schedule from part c)

[2 0 0 0] -> [0 2 3 0] -> [0 2 2 2] -> [0 2 0 6] -> [1 1 0 3] -> [2 0 0 0]
Buffer sizes: [2 2 3 6]
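The buffer trace in part e) can be replayed mechanically: firing an actor adds its column of the incidence matrix to the buffer vector. The sketch below assumes the 4 x 3 matrix from part a) (rows are edges, columns are actors A, B, C); `run_schedule` is an illustrative helper, not part of the exam solution.

```python
# Incidence matrix from part a), assumed layout: rows = edges, columns = A, B, C.
G = [
    [-2, 0, 1],
    [2, 0, -1],
    [3, -1, 0],
    [0, 2, -3],
]
ACTORS = "ABC"


def run_schedule(schedule, initial):
    """Fire actors in order; return the buffer trace and per-edge peak occupancy."""
    buf = list(initial)
    peak = list(initial)
    trace = [tuple(buf)]
    for actor in schedule:
        col = ACTORS.index(actor)
        buf = [b + row[col] for b, row in zip(buf, G)]
        if min(buf) < 0:
            raise ValueError(f"actor {actor} fired without enough tokens: {buf}")
        peak = [max(p, b) for p, b in zip(peak, buf)]
        trace.append(tuple(buf))
    return trace, peak


trace, peak = run_schedule("ABBBCC", [2, 0, 0, 0])
for state in trace:
    print(state)
print("buffer sizes:", peak)
```

The replay shows every individual firing, so it has one more intermediate state than the trace above, which collapses the second and third firings of B into a single step.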
5. [25 pts] Consider the following one-player game: there is an urn with black and red balls. In each round the player takes two balls:
- If both are black, then the player returns one black.
- If both are red, then the player returns one red.
- If one is red and one is black, then the player returns one red.

a) [10 pts] Assume that the urn initially has Nr red balls and Nb black balls. Draw a Petri net that models this game.

b) [10 pts] Is it possible to play the game in such a way that the urn is empty at the end of the game for some Nr and Nb?

This question is equivalent to finding all initial markings M0 such that (0,0,Nr+Nb) is reachable from M0. It is not possible: every round needs two balls and returns one, so all firing sequences from M0 lead to either (1,0,*) or (0,1,*), where exactly one ball remains.

c) [5 pts] Is this Petri net live? Show why.

The Petri net is not live because there are markings in which all transitions are dead, such as (1,0,*) and (0,1,*).
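The reachability argument in parts b) and c) can be checked by exhaustive search over small urns. The sketch below makes two assumptions: it tracks only the two urn places as a marking `(red, black)` and omits the third place that the solution writes as `*`, and it models the three rules as transitions that each remove one ball net. The function names are illustrative.

```python
def successors(marking):
    """Markings reachable in one round from (red, black)."""
    red, black = marking
    result = []
    if black >= 2:                 # two black drawn, one black returned
        result.append((red, black - 1))
    if red >= 2:                   # two red drawn, one red returned
        result.append((red - 1, black))
    if red >= 1 and black >= 1:    # one of each drawn, the red returned
        result.append((red, black - 1))
    return result


def dead_markings(initial):
    """All reachable markings in which no transition is enabled."""
    seen = {initial}
    stack = [initial]
    dead = set()
    while stack:
        marking = stack.pop()
        nxt = successors(marking)
        if not nxt:
            dead.add(marking)
        for m in nxt:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return dead


print(dead_markings((3, 2)))
print(dead_markings((0, 4)))
```

For every non-empty starting urn the search ends only in markings with a single ball, which matches the claim that the urn cannot be emptied and that the net has dead markings.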
6. [30 pts] A periodic control task C is executed on a CPU, which also executes two other tasks, A and B. Assume period = deadline. The tasks have the following characteristics:

Task  WCET  Period
A     1     5
B     2     10
C     2     x

a) [10 pts] Suppose 19% of the CPU utilization is reserved for other activities. Derive the minimum task period for the control task C that guarantees schedulability of A, B and C with RM. Show the schedule in the table below.

With the RM scheduling condition, CPU utilization should be less than n(2^(1/n) - 1); with n = 3, we get 78% utilization to guarantee RM schedule feasibility. Thus:
0.78 - 0.19 = 0.59 = 1/5 + 2/10 + 2/x  =>  2/x = 0.19, x = 10.5, rounded up to x = 11
Therefore, the scheduling priorities are: A, B, C

(- marks an idle slot)
Time   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
Task   A  B  B  C  C  A  -  -  -  -  A  B  B  C  C  A  -  -  -  -

Time  20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
Task   A  B  B  C  C  A  -  -  -  -  A  B  B  C  C  A  -  -  -  -
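The period derivation in part a) can be reproduced numerically. The sketch below uses the Liu and Layland utilization bound n(2^(1/n) - 1) and rounds the resulting period up to the next integer, as the solution does; `rm_bound` and `min_period` are names chosen for this illustration.

```python
import math


def rm_bound(n):
    """Sufficient RM utilization bound for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)


def min_period(wcet, other_utilization, reserved, n):
    """Smallest integer period keeping total utilization within the RM bound."""
    available = rm_bound(n) - reserved - other_utilization
    return math.ceil(wcet / available)


# Tasks A (1/5) and B (2/10) are fixed; 19% of the CPU is reserved.
other = 1 / 5 + 2 / 10
x = min_period(wcet=2, other_utilization=other, reserved=0.19, n=3)
print(round(rm_bound(3), 4), x)
```

Because the bound is sufficient rather than necessary, a shorter period might still be schedulable, but it would no longer be guaranteed by this test.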
6. cont.

b) [10 pts] Due to special design constraints task C has to be executed every 10 time units. Assume that start times for tasks A, B and C are 2, 1 and 0 respectively. From that point on they repeat with the period shown in the table (e.g. if task A has the highest priority, it would be scheduled at time 2, 7, 12 etc.). Schedule the tasks with EDF.

Since task C executes every 10 time units, the schedule will now repeat with period 10.

Deadlines:
A  7  12  17
B  11  21
C  10  20

(- marks an idle slot)
Time   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
Task   C  C  A  B  B  -  -  A  -  -  C  C  A  B  B  -  -  A  -  -

Time  20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
Task   C  C  A  B  B  -  -  A  -  -  C  C  A  B  B  -  -  A  -  -

c) [10 pts] Task C has a period of 15 time units. All tasks start at time 0. Their characteristics are shown in the table above. Slowing down any of the tasks by 50% results in saving 75% in terms of power consumption. The cost of speeding back up is one time unit. Derive a minimum power consumption schedule that still meets all the deadlines. How much power is saved?

The schedule repeats after 15 time units. First see how EDF does with no DVS:

Time   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
Task   A  B  B  C  C  A  -  -  -  -  A  B  B  -  -  A  B  B  C  C

Given this schedule, out of 15 time units, there are 6 idle ones. Given the task availability times, the system will have to speed up at most twice, at a cost of 2 time units. Therefore we can slow down at time 3, so task C runs 4 time units and task A two. If we continue with the slowdown, we can run task A in 2 time units and then speed up to meet the constraints for task B. S stands for speedup and it is at max power. The final schedule is:

Time   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
Task   A  B  B  C  C  C  C  A  A  -  A  A  S  B  B  A  B  B  C  C

High power schedule is for: 0-2 and 12-14 time units, for a total of 6 time units.
Low power schedule is for: 3-11, a total of 9 time units out of 15.
Normalized power consumption at slow schedule: (6*1 + 9*0.25)/15 = 55%
Total savings are 45%.
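The EDF schedule of part b) and the power figure of part c) can both be reproduced with a few lines. This is a sketch under stated assumptions: time advances in unit slots, each job's deadline is its release time plus its period, and ties are broken by task name. `edf_schedule` and `normalized_power` are illustrative helpers.

```python
def edf_schedule(tasks, horizon):
    """tasks maps name -> (wcet, period, start). Returns one symbol per time slot."""
    jobs = []       # each job is [absolute deadline, task name, remaining time]
    timeline = []
    for t in range(horizon):
        for name, (wcet, period, start) in tasks.items():
            if t >= start and (t - start) % period == 0:
                jobs.append([t + period, name, wcet])
        jobs = [job for job in jobs if job[2] > 0]
        if jobs:
            job = min(jobs)          # earliest deadline first
            job[2] -= 1
            timeline.append(job[1])
        else:
            timeline.append("-")     # idle slot
    return "".join(timeline)


def normalized_power(high_slots, low_slots, low_power):
    """Average power relative to running every slot at full power."""
    return (high_slots * 1.0 + low_slots * low_power) / (high_slots + low_slots)


part_b = edf_schedule({"A": (1, 5, 2), "B": (2, 10, 1), "C": (2, 10, 0)}, 20)
power = normalized_power(high_slots=6, low_slots=9, low_power=0.25)
print(part_b)
print(f"normalized power {power:.2f}, savings {1 - power:.2f}")
```

The simulator covers only the fixed-speed EDF case of part b); the slowed-down schedule of part c) is taken from the solution, and the code merely repeats its power arithmetic.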