Chapter 2: Basics
Chapter 3: Multimedia Systems Communication Aspects and Services
Chapter 4: Multimedia Systems Storage Aspects
  - Optical Storage Media
  - Multimedia File Systems
  - Multimedia Database Systems
Chapter 5: Multimedia Usage

4.2: Multimedia File Systems
  - Traditional File Systems
  - Multimedia File Systems
  - Disk Scheduling in Traditional and Multimedia File Systems
  - Data Structuring
  - System Architecture

Why Multimedia File Systems?
- Heterogeneous data types, including digital audio, animations and video
- These data consume enormous storage space
- Media are delay-sensitive: when a user plays out or records a time-dependent multimedia data object, the system must consume or produce data at a constant rate
- High demands on access to the hard disk
A new multimedia-enabled file system is needed in two respects:
- Organization of media content on the server
- Scheduling strategies for access to the data

Disk Layout
Lehrstuhl für Informatik 4
The layout of a disk determines:
- the way in which content is addressed
- how much storage space on the medium is actually addressable and usable
- the density of stored content on the medium
Tracks and sectors:
- A hard disk has one or more heads
- A hard disk is divided into tracks and further into sectors (512 bytes)
- The same track under all heads is called a cylinder
- A file is stored in units of sectors; unused space in a sector is wasted
- Easy mapping of file location information to head movement and disk rotation
- Constant angular velocity (CAV), i.e. same access time to inner and outer tracks
- A sector is accessed by a movable disk arm

Disk Layout: Zone Bit Recording
- With the conventional layout, a sector at an outer radius holds the same amount of data as an inner sector but has more raw capacity, so space is wasted in principle
- The current solution is zone bit recording: different read/write speeds, depending on the radius, allow a uniform sector size
- Place more popular media (movies) on an outer track to reduce average seek time, and less popular media on an inner track. This saves disk arm movements.
Now: how to place files on such a disk?

Traditional File Systems
The file system manages the data organization on a disk and consists of:
1. Files: program code, data
2. Directory structure: organizes files (usually in a tree structure) and provides information, e.g.
   -rwxr-xr-x 1 root other 55160 Jul 7 2004 gcc*
Traditional files:
- Executables of programs
- Numeric data
- Text
- Objects, etc.
[Figure: directory tree with a root directory and files]
Goals:
- Provide a comfortable interface for file access
- Make efficient use of the storage medium (in terms of space and also of access time to sectors)

Use of Storage Medium
Important: reduce read and write times by
- fewer seek operations
- lower rotational delay (latency)
- high actual data transfer rate (cannot be improved by placement)
Method: store data in a specific pattern
- Divide the file into blocks (a block can be a byte or larger)
- Store the blocks in certain patterns
Larger block size:
- Fewer seek operations
- Smaller number of requests
- But higher loss of storage space due to internal fragmentation (on average, the last block of a file is only 50% used)

Traditional File Systems - File Structure
How to place the records of a file?
[Figure: contiguous placement versus non-contiguous placement of the blocks of a 1st, 2nd and 3rd file]

Performance Consideration of File Structure
Contiguous placement:
- Disk access time for reading and writing is minimized
- Major disadvantage: file creation, deletion and size modification make this sequential storing difficult
Non-contiguous placement (two main approaches):
1. Linked allocation: pointers are used to address the next block
   - Fine for sequential access
   - Random access is costly
   - Long seek operations during playback
2. Indexed allocation: links are stored in an index block
   - Complex
   - Performance depends on the index structure and the size of the file
[Figure: blocks 1 to 8 on disk; linked allocation with a beginning pointer (first block is 1, next block is 2); indexed allocation with an index block listing the blocks of a file]

Traditional File Systems: Disk Management
Disk access is slow and costly: it is the major bottleneck.
Techniques for reducing overall disk access time:
- Block caches
  - Keep blocks in memory for future use
  - Reduces the number of disk accesses
- Reduce disk arm motion
  - Blocks to be accessed in sequence are placed on the same cylinder (heads may read in parallel)
  - Reduces the time for one disk access
  - Take rotation into account by placing consecutive blocks in an interleaved manner
- Placement of mapping tables
  - Mapping tables are placed in the middle of the disk
  - Tables and the corresponding blocks are placed on the same cylinder
[Figure: interleaved storage versus non-interleaved storage]

Traditional File Systems: Disk Scheduling
In traditional file systems, efficient usage of storage capacity is the main goal.
The total time to service a request to a file in such a system consists of:
- Seek time: head positioning to the appropriate track (diameter)
- Latency (rotation time): time to find the block in the track
- Actual data transfer time

Delay              Technique to reduce delay
Seek operation     Scheduling algorithms
Latency            File allocation methods

Next, we consider strategies for minimizing the seek time, i.e. the time to position the head on the appropriate track. Tracks are numbered 0, ..., N - 1. Here, 0 is the innermost and N - 1 the outermost track.

Disk Scheduling: First-Come-First-Served (FCFS)
Serve requests in order of arrival.
Queue (head starts at track 51): 108, 173, 31, 130, 13, 133, 63, 69
[Figure: head movement over tracks 0 to 198 for requests arriving at times i, i+1, i+2, served in the order of successive requests]
+ Easy to implement
+ Fair algorithm
- Not optimal: high average seek time
Overall movement, counted in number of tracks visited, for FCFS in this example scenario: 676

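The FCFS movement count can be checked with a few lines of Python. This is a minimal sketch; the function name `fcfs_movement` is hypothetical, and the head start position 51 and the queue are taken from the example as listed. Summing the individual distances for that queue gives 676 tracks.

```python
def fcfs_movement(start, requests):
    """Total head movement (in tracks) when requests are served in arrival order."""
    total, head = 0, start
    for track in requests:
        total += abs(track - head)  # distance travelled for this request
        head = track
    return total


queue = [108, 173, 31, 130, 13, 133, 63, 69]  # arrival order from the example
print(fcfs_movement(51, queue))  # 676
```
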
Disk Scheduling: Shortest-Seek-Time-First (SSTF)
SSTF = serve the nearest request first.
Queue (head starts at track 51): 108, 173, 31, 130, 13, 133, 63, 69
[Figure: head movement over tracks 0 to 198 for SSTF]
Optimal overall movement: 198
SSTF movement: 234
+ Substantial improvement over FCFS
- Still not optimal
- Starvation (of some tracks, if there is always a track with a shorter seek time available)

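The greedy choice of SSTF can be sketched as follows. The function name `sstf` is hypothetical; head position and queue are those of the example as listed. For that queue the greedy order is 63, 69, 31, 13, 108, 130, 133, 173 with a total movement of 234 tracks, compared with 198 for the optimal order (first down to 13, then up to 173).

```python
def sstf(start, requests):
    """Serve the pending request closest to the current head position first."""
    pending = list(requests)
    head, order, total = start, [], 0
    while pending:
        nearest = min(pending, key=lambda t: abs(t - head))
        pending.remove(nearest)
        total += abs(nearest - head)
        head = nearest
        order.append(nearest)
    return order, total


order, total = sstf(51, [108, 173, 31, 130, 13, 133, 63, 69])
print(order, total)
```
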
Disk Scheduling: SCAN
SCAN = serve requests in one direction, then reverse the movement.
- Move from one end to the other, serving each request on the way
Queue (head starts at track 51): 108, 173, 31, 130, 13, 133, 63, 69
[Figure: head movement over tracks 0 to 198 for SCAN]
Overall movement SCAN: 224

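A minimal sketch of SCAN follows. It assumes that the arm runs to the edge of the disk before reversing and that the tracks range from 0 to 198 (the largest tick in the figure); the function name `scan` and its parameters are hypothetical. With the head at 51 initially moving towards track 0, this reproduces the 224 tracks stated for the example.

```python
def scan(start, requests, moving_up=False, min_track=0, max_track=198):
    """Serve requests in the current direction, run to the edge, then reverse."""
    below = sorted((t for t in requests if t < start), reverse=True)
    above = sorted(t for t in requests if t >= start)
    first, second = (above, below) if moving_up else (below, above)
    edge = max_track if moving_up else min_track
    order = first + second
    if second:
        # travel to the edge, then back to the farthest request on the other side
        total = abs(start - edge) + abs(edge - second[-1])
    elif first:
        total = abs(start - first[-1])
    else:
        total = 0
    return order, total


order, total = scan(51, [108, 173, 31, 130, 13, 133, 63, 69])
print(order, total)
```

Starting in the upward direction instead (`moving_up=True`) costs 332 tracks for the same queue, which shows how much the initial direction matters.
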
Disk Scheduling: Circular-SCAN (C-SCAN)
C-SCAN is similar to SCAN but returns immediately to the beginning when the end is reached; there is one idle head movement from one edge to the other between two consecutive scans.
Queue (head starts at track 51): 108, 173, 31, 130, 13, 133, 63, 69
[Figure: head movement over tracks 0 to 198 for C-SCAN]
+ Fair service: more uniform waiting time
+ Middle tracks don't get better service than edge tracks (as they do with SCAN or SSTF)
- Performance not as good as SCAN
Overall movement C-SCAN: 376

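C-SCAN can be sketched in the same style. The assumptions are that the head sweeps upwards, that tracks range from 0 to 198, and that the idle return movement is counted in the total; the function name `c_scan` is hypothetical. Under these assumptions the example yields the stated 376 tracks (147 up to the edge, 198 for the return, 31 up to the last request).

```python
def c_scan(start, requests, min_track=0, max_track=198):
    """Sweep upwards only; after reaching the edge, jump back and sweep again."""
    ahead = sorted(t for t in requests if t >= start)
    wrapped = sorted(t for t in requests if t < start)
    order = ahead + wrapped
    if wrapped:
        total = ((max_track - start)            # finish the current sweep
                 + (max_track - min_track)      # idle return to the first track
                 + (wrapped[-1] - min_track))   # second sweep up to the last request
    elif ahead:
        total = ahead[-1] - start
    else:
        total = 0
    return order, total


order, total = c_scan(51, [108, 173, 31, 130, 13, 133, 63, 69])
print(order, total)
```
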
Multimedia File System
Requirements of continuous data:
- File size:
  - Highly structured data units (e.g. video and associated audio)
  - New organization policies for the data on disk
  - Efficient usage of limited storage is necessary
- Multiple data streams:
  - For example, retrieval of a movie requires the processing and synchronization of audio and video data
- File access:
  - High, continuous throughput
  - Short maximum (not average) response times
- Real-time characteristic:
  - Stream play-out at a constant, gap-free rate requires additional buffers

Real-Time Characteristics: Buffers
Definitions:
- $R(t)$: total number of samples read up to time $t$
- $t_r$: time at which reading is completed
- $t_0$: time at which playback starts
- $t_w$: minimal starting time without gaps
- $r_C$: rate at which samples are consumed
- $C(t, t_0) = r_C \, (t - t_0)$: number of samples consumed up to time $t$
- $B(t, t_0) := R(t) - C(t, t_0)$: buffer needed if play-out is started at time $t_0$
[Figure: number of samples over time $t$, with the reading curve $R(t)$, the playback curve $C(t, t_0)$ and the buffer $B(t, t_0)$, which is emptied after reading is completed at $t_r$]
- Consumption order and physical placement are known: optimal for buffer requirements and throughput
- $R(t) \ge C(t, t_0)$ for all $t$ is required for a gap-free playback
- We want playback to begin as early as possible and with a small buffer size
Problem: find the minimum buffer size such that the buffer never gets empty

Real-Time Characteristics: Buffers
The earlier the first sample is played out, the lower the buffer requirement.
Theorem 1: The solution using the minimum start time for playback also requires the least buffer space at any point in time.
Proof: Let $t_w$ be the minimum start time. For any possible start time $y$ of a playback without gaps we have
$$B(t, y) = \begin{cases} R(t) & \text{if } t < y \\ R(t) - r_C \, (t - y) & \text{if } t \ge y \end{cases}$$
and it follows for gap-free play-out:
$$R(t) - r_C \, t + r_C \, y \ge 0$$
This must also hold for $y = t_w$, and since $t_w$ is the minimum starting time we have
$$r_C \, t_w \le r_C \, y \quad \text{and} \quad B(t, t_w) \le B(t, y)$$
[Figure: number of samples over time $t$, with $R(t)$, $B(t, t_0)$, $t_0$ and $t_r$]

Real-Time Characteristics: Buffers
Theorem 2: The minimum value for $t_0$ is found as follows. Consider $B(t, 0) = R(t) - r_C \, t$.
1. If $B(t, 0) \ge 0$ for $0 \le t \le t_r$, then $t_0 = 0$ is the solution
2. Otherwise let $-m := \min(B(t, 0))$, attained at time $t_{min}$: $B(t_{min}, 0) = -m$
3. $m$ is the number of missing buffers if we began the play-out immediately, i.e. at time $t = 0$
Then the start time $t_i$ of a gap-free play-out is the intersection of $r_C \, t - m$ with the $t$-axis. Alternatively, $t_i$ is the $t$-axis value where $R(t)$ and $B(t, 0) + m$ intersect.
Proof:
1. If $B(t, 0)$ is a solution, then it must be optimal according to Theorem 1
2. $r_C \, t - m$ is the highest (i.e. earliest starting) linear curve which is below $R(t)$ for all $t$. At the intersection point of $R(t)$ and $B(t, 0) + m$ the necessary buffers, namely $m$, are available for the first time, which makes a gap-free play-out possible.
[Figure: $R(t)$, $B(t, 0)$ with minimum $-m$, and $B(t, 0) + m$ over time $t$, with the start time $t_i$ marked on the $t$-axis]

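Theorem 2 translates directly into a small computation. The sketch below assumes that $R(t)$ is given as a list of `(t, R(t))` breakpoints of a piecewise linear curve, so that the minimum of $B(t, 0)$ is attained at one of the breakpoints; the function name `min_start_time` and the sample values are hypothetical.

```python
def min_start_time(read_curve, r_c):
    """Return (m, t_i): missing samples at t = 0 and earliest gap-free start time.

    read_curve: breakpoints (t, R(t)) of a piecewise linear reading curve.
    r_c: consumption rate in samples per time unit.
    """
    # B(t, 0) = R(t) - r_c * t, evaluated at every breakpoint
    lowest = min(r - r_c * t for t, r in read_curve)
    m = max(0, -lowest)   # m = 0 means play-out may start immediately
    return m, m / r_c     # r_c * t - m crosses the t-axis at t = m / r_c


# Hypothetical reading curve: slow start, then a burst between t = 2 and t = 3
curve = [(0, 0), (1, 1), (2, 2), (3, 8), (4, 8)]
print(min_start_time(curve, r_c=2))  # (2, 1.0)
```

In this example $B(t, 0)$ takes the values 0, -1, -2, 2, 0 at the breakpoints, so $m = 2$ and play-out can start at $t_i = 1$.
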
Real-Time Characteristics: Buffers
- Usually samples are read in sector units of e.g. 512 bytes
- In addition to the data, a sector contains other information such as error correction codes. Thus the data of a sector is valid only when all of its bytes have been received.
- As a consequence, $R(t)$ is a staircase function
- The minimum play-out curve is the parallel to $r_C \, t$ which meets the staircase only at one (or more) lower edges
[Figure: number of samples over time $t$; number of samples read versus number of samples usable (staircase with a step height of one sector size); play-out curves with and without usable samples, and the start times $t_i$ with and without sector reading]

Multimedia Disk Scheduling Algorithms
Restrictions of data placement: how to place media blocks?
Parameters:
- $G$: size of a media block (granularity parameter)
- $S$: separation between successive blocks, in number of blocks (scattering parameter)
- $r_D$: data transfer rate from disk
- $r_C$: playback rate
[Figure: media blocks of size $G$ on a track, separated by gaps of size $S$, with period $S + G$]
Continuity requirement:
$$\frac{S + G}{r_D} \le \frac{G}{r_C}$$
i.e. the time to skip over a gap and to read the next media block is smaller than or equal to the duration of the playback.
Example: $G = 3$, $r_D = 2$, $r_C = 0.5$ (a playback rate of 0.5 blocks/ms, i.e. a playback duration of 6 milliseconds for 3 blocks of data) results in
$$\frac{S + G}{2} \le \frac{G}{0.5} = 6 \;\Rightarrow\; S + G \le 12 \;\Rightarrow\; S \le 9$$

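Solving the continuity requirement for $S$ gives $S \le G \, (r_D / r_C - 1)$. A minimal sketch of this check, with hypothetical function names and the values of the example:

```python
def is_continuous(s, g, r_d, r_c):
    """Continuity requirement: (S + G) / r_D <= G / r_C."""
    return (s + g) / r_d <= g / r_c


def max_scattering(g, r_d, r_c):
    """Largest scattering S that still satisfies the continuity requirement."""
    return g * r_d / r_c - g


print(max_scattering(3, 2, 0.5))    # 9.0
print(is_continuous(9, 3, 2, 0.5))  # True
print(is_continuous(10, 3, 2, 0.5)) # False
```
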
Disk Scheduling Algorithms
To fulfill the requirements of multimedia data, scheduling has a different focus than in traditional file systems.
Goals in traditional file systems:
- Reduce cost of seek time (effective utilization of the disk arm)
- Achieve fair throughput
- Provide fair disk access
- Achieve short average response times
Goals in multimedia file systems are different:
- Meet the deadlines of all time-critical tasks
- Keep the necessary buffer space requirements low
- Find a balance between time constraints and efficiency

Disk Scheduling: Earliest Deadline First (EDF)
- In EDF the block with the nearest deadline is read first
- Equal deadlines: FCFS
- Poor throughput due to excessive seek time
- Only deadlines are taken into account, not track numbers
- Very similar to FCFS: inefficient
- Does not reflect the geographical position of tracks
[Figure: EDF request queue over time $t$; each request is a (deadline, track no.) pair: (1, 12), (1, 45), (2, 42), (3, 50), (2, 16), (2, 40), (1, 22), (3, 30), (3, 24)]

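EDF with FCFS tie-breaking amounts to a stable sort by deadline. The sketch below uses some of the (deadline, track) pairs from the figure; the arrival order, the assumption that all requests are queued at the start, and the head start position 0 are hypothetical.

```python
def edf(requests):
    """Order (deadline, track) requests by deadline; ties keep arrival order.

    Python's sorted() is stable, which gives the FCFS tie-break for free.
    """
    return sorted(requests, key=lambda req: req[0])


def movement(start, tracks):
    """Total head movement in tracks for a given service order."""
    total, head = 0, start
    for track in tracks:
        total += abs(track - head)
        head = track
    return total


queue = [(1, 12), (1, 45), (2, 42), (3, 50), (2, 16), (2, 40), (1, 22)]
tracks = [track for _, track in edf(queue)]
print(tracks, movement(0, tracks))
```

For this queue EDF serves the tracks in the order 12, 45, 22, 42, 16, 40, 50, so the head changes direction within every deadline group; this is the seek overhead the slide refers to.
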
Disk Scheduling: SCAN-EDF
SCAN-EDF is a combination of:
- Deadline scheduling (as in EDF, earlier deadlines are served first)
- Scanning (tasks with the same deadline are served according to the actual scan direction)
Problem: SCAN (i.e. the use of the scanning direction as a tie-break among equal deadlines) does not make much sense if too many different deadlines exist.
Thus: it has to be enforced that many requests have the same deadline.
- To do so, all requests are collected in a few groups which can be scanned together
- We require that the deadlines $D_i$ are multiples of a common period $p$, i.e. in units of $p$: $D_i \in \{1, 2, 3, \ldots\}$
- Requests whose deadlines fall into the same period can then be grouped and served together by SCAN

Disk Scheduling: SCAN-EDF
Implementation of SCAN-EDF by perturbation of deadlines (in order to apply EDF):
- Let $D_i$ be the deadline of task $i$ and $N_i$ its track number ($0 \le N_i < N_{max}$, e.g. $N_{max} = 100$)
- Assume that $D_i \in \mathbb{N}$
- Modify $D_i$ to $D_i'$ (the perturbed deadline):
$$D_i' = D_i + f(N_i)$$
- $f(N_i)$ converts the track number of task $i$ into a small perturbation of the deadline such that for equal deadlines the scanning is applied automatically
- If we choose (for example)
$$f(N_i) = \frac{N_i}{N_{max}}, \qquad 0 \le f(N_i) < 1$$
  then for a task on track 42 with deadline 3 the perturbed deadline is
$$3 + \frac{42}{100} = 3.42$$
- This deadline is given to the task at arrival time

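A minimal sketch of the perturbation with $f(N_i) = N_i / N_{max}$: after perturbing the deadlines, plain EDF sorting yields the SCAN order within each deadline. The request list, the head start position 0 and the function names are hypothetical.

```python
N_MAX = 100  # number of tracks, as in the example on the slide


def perturbed_deadline(deadline, track, n_max=N_MAX):
    """D' = D + f(N) with f(N) = N / N_max, so 0 <= f(N) < 1."""
    return deadline + track / n_max


def scan_edf(requests):
    """Apply EDF to the perturbed deadlines of (deadline, track) requests."""
    return sorted(requests, key=lambda req: perturbed_deadline(*req))


def movement(start, tracks):
    """Total head movement in tracks for a given service order."""
    total, head = 0, start
    for track in tracks:
        total += abs(track - head)
        head = track
    return total


queue = [(1, 12), (1, 45), (2, 42), (3, 50), (2, 16), (2, 40), (1, 22)]
tracks = [track for _, track in scan_edf(queue)]
print(tracks, movement(0, tracks))
```

For this queue the service order becomes 12, 22, 45, 16, 40, 42, 50, i.e. an ascending scan within each deadline, and the head travels 108 tracks from track 0. Serving the same queue by deadline alone, in arrival order within each deadline, would take 148 tracks.
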
Disk Scheduling: SCAN-EDF
- The request with the earliest deadline is served
- Among requests with the same deadline, SCAN is applied
- Sensible only for a large number of requests
- The optimization only applies to requests with the same deadline before the decimal point
- Increase this probability by grouping the requests
[Figure: SCAN-EDF request queue over time $t$; each request is a (perturbed deadline, track number) pair: (2.16, 16), (3.50, 50), (2.42, 42), (1.45, 45), (1.12, 12), (2.40, 40), (1.22, 22); requests with deadline 1 lie in the interval [1:2], requests with deadline 2 in [2:3]]

Disk Scheduling: EDF, SCAN-EDF
[Figure: head position (tracks 0 to 50) over successive deadlines for EDF and for SCAN-EDF]

Disk Scheduling
A small variation of deadline perturbation: the actual deadline given to the task is refined by
- taking into account the actual movement of the head at arrival time (i.e. upwards from 0 to $N_{max} - 1$ or downwards from $N_{max} - 1$ to 0)
- considering the actual position $N$ of the head
The perturbed deadline for a task which resides on track $N_i$ is given by $D_i' = D_i + f(N_i)$, where
$$f(N_i) = \begin{cases}
\dfrac{N_i - N}{N_{max}} & \text{if } N_i \ge N \text{ and head moves upwards} \\
\dfrac{N_{max} - N_i}{N_{max}} & \text{if } N_i < N \text{ and head moves upwards} \\
\dfrac{N_i}{N_{max}} & \text{if } N_i > N \text{ and head moves downwards} \\
\dfrac{N - N_i}{N_{max}} & \text{if } N_i \le N \text{ and head moves downwards}
\end{cases}$$
This allows new requests to be served as soon as possible.

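The four cases can be sketched as one function. This assumes the piecewise definition reads as $(N_i - N)/N_{max}$ and $(N_{max} - N_i)/N_{max}$ for an upward-moving head, and $N_i/N_{max}$ and $(N - N_i)/N_{max}$ for a downward-moving head; the function name, the head position 30 and the request tracks are hypothetical.

```python
def perturbation(track, head, moving_up, n_max=100):
    """Direction-aware f(N_i): requests ahead of the head get the smallest values."""
    if moving_up:
        if track >= head:
            return (track - head) / n_max   # ahead of the head: nearest first
        return (n_max - track) / n_max      # behind: served on the way back
    if track > head:
        return track / n_max                # behind a downward-moving head
    return (head - track) / n_max           # ahead of a downward-moving head


tracks = [12, 22, 45, 50]  # hypothetical requests, all with the same deadline
up = sorted(tracks, key=lambda t: perturbation(t, head=30, moving_up=True))
down = sorted(tracks, key=lambda t: perturbation(t, head=30, moving_up=False))
print(up)    # [45, 50, 22, 12]
print(down)  # [22, 12, 45, 50]
```

With the head at track 30, the requests in the current direction of travel are served first, and the remaining ones follow in the order of the return sweep.
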
Group Sweeping Scheduling (GSS)
- Requests are served in cycles, in a round-robin manner
- In one cycle, requests are divided into groups. A group is served according to SCAN.
- Service within a group may be in ascending or in descending order, depending on the other groups
- Thus a smoothing buffer may be needed (to assure continuity)
[Figure: one cycle over time $t$ with requests given as (deadline, track) pairs: (3.4, 24), (3.3, 30), (2.0, 16), (3.3, 50), (2.2, 42), (1.2, 45), (1.4, 12), (2.4, 40), (1.1, 22)]
- Group 1 (deadline 1.1): SCAN 12, 22, 45 (ascending order; in the next cycle: descending order)
- Group 2 (deadline 2.0): SCAN 42, 40, 16 (descending order)
- Group 3 (deadline 3.3): SCAN 24, 30, 50 (ascending order)

Group Sweeping Scheduling (GSS)
- A particular stream can be the first one in its group in a given cycle, but the last one in its group in the next cycle
- This happens if the scan order is reversed, i.e. if we have an odd number of groups
- Thus we need a smoothing buffer in order to achieve continuity of play-out
- GSS is a trade-off between optimization of buffer space and of arm movements

Group Sweeping Scheduling (GSS) - Mixed Strategy
The mixed strategy is a compromise between:
- Shortest seek ("greedy")
- Balanced strategy
Data retrieved from disk are placed into buffers. Different queues are used for different data streams.
- Shortest seek serves the stream whose data block is nearest
- Balanced serves the stream which has the lowest utilization of buffers (since this stream risks running out of data)

Group Sweeping Scheduling (GSS) - Mixed Strategy
- The filling status of the buffers indicates when to switch from SSTF to Balanced and vice versa
- Urgency criterion:
$$\text{Urgency} = \sum_{\text{all streams } i} \frac{1}{\text{Fullness}_i}$$
- $\text{Fullness}_i$ small ⇒ Urgency high ⇒ the balanced strategy should be used

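The switch between the two strategies might be sketched as follows. This assumes that the urgency is the sum of the reciprocals of the buffer fullness of all streams and that the switch happens at a fixed threshold; the threshold value, the function names and the stream data are hypothetical, and every fullness value is assumed to be greater than zero.

```python
def urgency(fullness):
    """Sum of reciprocal buffer fullness over all streams (fullness > 0 assumed)."""
    return sum(1.0 / level for level in fullness.values())


def next_stream(fullness, next_block_track, head, threshold):
    """Pick the stream to serve next under the mixed strategy."""
    if urgency(fullness) > threshold:
        # Balanced: serve the stream whose buffer is closest to running empty
        return min(fullness, key=fullness.get)
    # Shortest seek: serve the stream whose next block is nearest to the head
    return min(next_block_track, key=lambda s: abs(next_block_track[s] - head))


blocks = {"a": 10, "b": 90, "c": 40}  # track of each stream's next block
print(next_stream({"a": 0.8, "b": 0.5, "c": 0.9}, blocks, head=35, threshold=6.0))  # c
print(next_stream({"a": 0.8, "b": 0.1, "c": 0.9}, blocks, head=35, threshold=6.0))  # b
```

In the first call all buffers are reasonably full, so the nearest block (stream c) is served; in the second call stream b is almost empty, the urgency exceeds the threshold, and b is served despite the long seek.
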
Storage Devices: Data Structuring
Continuous data are characterized by consecutive, time-dependent logical data units.
Basic data unit:
- Video: frame (single video image)
- Audio: sample
The design of the data structure is guided by two requirements:
- Time continuum of media: media units convey their meaning only when they are presented continuously
- Synchronization between media: temporal coordination of different media components is necessary
Continuity requirement:
- Define a continuously recorded sequence of media units (video, audio or both) as a strand
- Strands must be partitioned into blocks and stored on the disk
- To provide direct access to any of the media blocks of the strand, a hierarchical index structure is used
- A strand will generally include headers and other information (e.g. about compression)

Storage Devices: Data Structuring
Components of a strand:
- Media Blocks (MB): placed according to a placement model
- Primary Blocks (PB): contain a sequence of (MB, disk-location) pairs
- Secondary Blocks (SB): contain pointers to PBs
- Header Block (HB): root of the strand (pointers to all SBs, recording length, rate)
[Figure: index hierarchy of a strand; the HB points to the SBs, each SB points to PBs, and each PB points to MBs]

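The index hierarchy of a strand can be sketched with a few data classes. The class and field names are hypothetical, and the lookup walks the index linearly for clarity; an actual file system would more likely compute the index slot from the block number.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PrimaryBlock:
    entries: List[Tuple[int, int]]  # (media block number, disk location) pairs


@dataclass
class SecondaryBlock:
    primaries: List[PrimaryBlock]   # pointers to primary blocks


@dataclass
class HeaderBlock:
    secondaries: List[SecondaryBlock]  # pointers to all secondary blocks
    recording_length: float            # e.g. in seconds (unit assumed)
    recording_rate: float              # media units per second


def locate(header: HeaderBlock, media_block: int) -> int:
    """Walk HB -> SB -> PB to find the disk location of a media block."""
    for sb in header.secondaries:
        for pb in sb.primaries:
            for number, location in pb.entries:
                if number == media_block:
                    return location
    raise KeyError(media_block)


strand = HeaderBlock(
    secondaries=[SecondaryBlock([PrimaryBlock([(0, 4711), (1, 4720)]),
                                 PrimaryBlock([(2, 5100)])])],
    recording_length=0.1,
    recording_rate=30.0,
)
print(locate(strand, 2))  # 5100
```
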
Storage Devices: Data Structuring
Multimedia rope: all media strands which constitute a logical entity (e.g. the video and associated audio of a movie)
[Figure: a rope consisting of audio strand 1, video strand 1, audio strand 2 and video strand 2; the continuity requirement holds along each strand, the synchronization requirement holds between audio and video strands]
- At the time of recording, temporal relationships among strands (which may be recorded at different sites or times) are represented by RTS = Relative Time Stamps
- During playback, all media units must be played according to their RTS in relation to the other strands

Data Structure of a Multimedia Rope

Field                                          Description
MultimediaRopeID                               Unique ID
Creator                                        Identification of the creator
Length                                         Length of the rope
PlayAccess, EditAccess                         List of users or group ID
List of [
  List of [
    MediaStrandID                              Unique ID of media strand
    <StrandIntervalStart, StrandIntervalEnd>   Strand interval, in media units
    Media type                                 Medium of the strand
    EncodingFormat                             Strand encoding format: MPEG, JPEG, etc.
    MediaRecordingRate                         Rate of recording, in media units/s
    StorageGranularity                         Media granularity, in media units/s
    List of [ <MediaUnitID, RTS> ]             Media unit and its corresponding RTS (RTS = Relative Time Stamp)
    MaximumAsynchrony                          Tolerable asynchrony threshold
  ]
  List of [ RTSvalue ]                         Discrete synchronization points; identities of synchronization points
]

Storage Devices: Data Structuring
- Editing operations on ropes manipulate pointers to strands
- Intervals of strands can be shared by different ropes
- Strands that are not referenced by any rope can be deleted, and their storage can be reclaimed
[Figure: INSERT; Rope1 and Rope2, each with an audio and a video strand, before and after an interval of Rope2 is inserted into Rope1]

Storage Devices: Data Structuring
[Figure: REPLACE; Rope1 and Rope2, each with an audio and a video strand, before and after an interval of Rope1 is replaced by an interval of Rope2]
Operations:
- INSERT [baserope, position, media, withrope, withinterval]
- REPLACE [baserope, position, media, withrope, withinterval]
- SUBSTRING [baserope, media, interval]
- CONCATENATE [mmropeid1, mmropeid2]
- DELETE [baserope, media, interval]

Storage Devices: Data Structuring
Further operations:
- RECORD [media] → [requestid, mmropeid]
  Records a multimedia rope, represented by mmropeid and consisting of several media strands, until a STOP operation is issued
- PLAY [mmropeid, interval, media] → requestid
- STOP [requestid]

Multimedia File System Architecture
- Multimedia systems must coexist with the conventional data processing of the Non-Real-Time Environment (NRTE)
- Many operating systems provide extensions to support multimedia applications: the Real-Time Environment (RTE)
- The application itself never really touches the data
- Data take the shortest possible path from the source to the sink

System Architecture
- NRTE control deals with all data that have no timing requirements
- The RTE schedules the processes according to the timing requirements
- Stream handlers manage the RTE data flow in accordance with the control operations of the NRTE
- Applications access stream handlers by establishing (creating) sessions
[Figure: application(s) send events through the stream control interface(s) to the stream management system(s); in the RTE, data flow from the source stream handler to the sink stream handler]

System Architecture - UNIX-based Systems
- Applications make use of system calls in the NRTE
- Extensions to the operating system (i.e. the RTE) are part of the kernel space
[Figure: applications in user space; the operating system in kernel space, with the NRTE (e.g. traditional scheduler) and the RTE (e.g. deadline scheduler)]

System Architecture - IBM OS/2
Multimedia Presentation Manager/2 (MMPM/2):
- Part of IBM's Operating System/2 (OS/2)
- Well suited for multimedia: supports preemptive multitasking, priority scheduling, demand-paged virtual memory storage, etc.
Media Control Interface (device-independent programming interface):
- Open, close, status of device (for all devices)
- Play, record, resume, stop (playback, record)
- Set cue point (allows for synchronization)
- Get table of contents of a CD-ROM (device-specific command)
Stream Programming Interface:
- Implementation of data streaming and synchronization
- Access to the Sync/Stream Manager (coordinates and manages the buffers)
Ease of use on several levels:
- Style guide for applications, unified graphical interfaces
- Application developers and device providers can integrate their own I/O processes, stream handlers, etc.

System Architecture - IBM OS/2
[Figure: OS/2 Multimedia Presentation Manager/2 architecture, split into a Non-Real-Time Environment (NRTE) and a Real-Time Environment (RTE); components: Media Control Interface, Media Device Manager, media driver, Stream Programming Interface, Sync/Stream Manager, stream manager helpers, stream data buffers, source stream handler with source HW device driver, sink stream handler with sink HW device driver]