
Supporting Random Access on Real-time Retrieval of Digital Continuous Media

Jonathan C.L. Liu, David H.C. Du and James A. Schnepf
Distributed Multimedia Center (footnote 1) & Department of Computer Science
University of Minnesota
Minneapolis, MN 55455

Abstract

In addition to the large data size and real-time constraints of continuous media, future video applications such as video editing demand random access capability at the video-frame level. This paper introduces our study on effective buffering control for the real-time retrieval of jitter-free digital video. We adopt a video-frame level approach to maintain flexibility of placement while analyzing the efficiency of the buffering schemes. Our goal is an integrated solution that offers efficient buffering schemes and flexible storage placement to support random access. We present two buffering schemes: the two-buffer scheme and the k-buffer compensation scheme. The two-buffer scheme requires only that all the frames in a block be stored consecutively, while providing random access between blocks. However, this intuitive buffering scheme potentially requires a large block size and buffer space. The k-buffer compensation scheme is proposed to resolve this large buffer space requirement by using more than two buffers and requiring a minimal number of blocks to be randomly placed in each cylinder. This scheme differs from the contiguous placement scheme because individual blocks can be stored anywhere in each cylinder. Compared to the two-buffer scheme, the k-buffer compensation scheme requires less buffer space, has higher disk utilization, and has finer granularity of disk data transfer. The placement requirements are more flexible and implementable than those of the contiguous and storage pattern placement schemes. Experimental measurement results reveal significant improvements in buffer-size reduction and placement flexibility from using the k-buffer compensation scheme.
Keywords: Multimedia, Real-time Retrieval, Physical Database Design, Buffering Scheme, Disk Placement

To appear in Journal of Computer Communications: Special Issue on Multimedia Storage and Databases, Dec. 1994

Footnote 1: The Distributed Multimedia Center (DMC) is sponsored by US West, Honeywell, IVI Publishing, Computing Devices International and Network Systems Corporation.

1 Introduction

The introduction of audio and video media (so-called continuous media) into the distributed computing environment has engendered a new research area called distributed multimedia computing. Rapid advances in computer hardware and communication networks make this new area more feasible than ever. The integration of audio and video media with traditional elements such as text and images makes possible a wide range of distributed multimedia applications, spanning business, education, simulation, entertainment, training and medicine. For these applications to be feasible and effective, a major research effort to determine appropriate solutions for multimedia storage and retrieval is necessary. Among all media types, continuous media, which include video and audio, are the most demanding.

While many research efforts have addressed the straight playback of digital continuous media, very few proposed schemes were designed to support video editing. Video editing is a fundamental functionality for all distributed multimedia applications. From the standpoint of video editing, video segments can ideally be written anywhere on the disk. Video segments might have different video qualities (explained later) and different playback lengths. Because these video segments might be created at different times, they are scattered randomly on the disk. Through video editing, these segments should be integrated as one unit without any jitter. However, as we will point out later in this paper, this integration of digital continuous media through video editing is not a simple task. In this section, we introduce the unique characteristics of continuous media, our model of a continuous media retrieval system, and the nature of the problem. Previous solutions and our proposed solutions are then presented in brief.
There are two major characteristics which distinguish continuous media from traditional text data. First, continuous media, particularly video, involve very large amounts of data. Table 1 and Table 2 list the typical characteristics and storage requirements of three different kinds of video quality (footnote 2). Second, the retrieval of continuous media must be executed under real-time constraints. To meet the real-time constraint, compression of video data is usually required. Although compression reduces the storage size and makes the real-time retrieval of video media from secondary memory devices possible (i.e., the transfer speed is faster than the display speed), the data placement on the secondary memory devices and the limited buffer size in main memory still present problems, namely jitter and anomaly in real-time retrieval and display.

Footnote 2: The authors are aware that current MPEG-1 and JPEG compression schemes only offer VCR-quality video, and that future MPEG standards for NTSC and HDTV video quality are under investigation. The ideal compression ratios used in this paper, which produce fixed-size compressed video frames, are chosen to simplify the discussion on buffering and placement.

Video Quality   Display Speed   Display Duration   Required Resolution
                (frame/sec)     (msec/frame)       (Width*Height)*(bits/pixel)
Animation       10              100                512*480*8 = 1.9 Mb/frame
NTSC            30              33.33              512*480*24 = 5.9 Mb/frame
HDTV            60              16.66              1248*960*24 = 28.7 Mb/frame

Table 1: Requirement for the animation, NTSC and HDTV video qualities

Video Quality   Required Resolution            Compression   Compressed Size   Storage Size
                (Width*Height)*(bits/pixel)    Ratio         (Kb/frame)        (sector/frame)
Animation       512*480*8 = 1.9 Mb/frame       82            100               25
NTSC            512*480*24 = 5.9 Mb/frame      61            100               25
HDTV            1248*960*24 = 28.7 Mb/frame    147           200               50

Table 2: Storage requirements for animation, NTSC and HDTV video qualities

In order to illustrate the jitter and anomaly problems, we describe the basic model for a continuous media retrieval system in Figure 1. The system we model has one CPU with a large main memory space, an I/O subsystem and a display subsystem. Compressed continuous media data are stored on a hard disk. A decompression VLSI chip in the display subsystem is available for on-the-fly real-time decompression before the continuous media are displayed via display devices such as X terminals and speakers. The data flow to retrieve the compressed continuous media can be divided into four steps. In this paper, we assume that contention only exists in allocating memory space as transfer buffers (i.e., there is no contention for the CPU and other resources). Although only the video medium is illustrated in the rest of this paper, the same proposed schemes are applicable to the audio medium.

[Figure 1: Model of a retrieval system for continuous media. Components: host CPU, memory, hard disk, I/O subsystem, display subsystem, display and speaker; the arrows are numbered 1 to 4 after the steps below.]

Step 1: Host CPU sends the retrieval request to I/O subsystem.
Step 2: I/O subsystem moves compressed data from disk to memory.

Step 3: Host CPU decompresses the compressed data.
Step 4: Host CPU waits for the ready signal from display subsystem, and moves the decompressed data from memory to display device and speakers via display subsystem.

The jitter problem in the real-time retrieval of continuous media can be depicted by the following scenario. Consider a compressed NTSC video stream, edited from several video segments, that is stored randomly on a hard drive with multiple read/write heads. Each individual video frame is stored as a block. Consider retrieving this compressed video using an ideal buffer size of two video frames, while satisfying the real-time constraints inherent in the nature of continuous media. The next video frame must be retrieved, decompressed and available before the display system finishes with the current frame. This can be achieved only if the display duration of the current frame (i.e., 33.33 msec) allows the system to seek to any particular cylinder, rotate to any sector, transfer the compressed frame to memory, and decompress it. The seek time for a typical hard drive usually ranges from 5 to 40 milliseconds. The worst latency time for a hard disk with a typical spinning speed of 3600 rpm is 16.66 milliseconds. The transfer time is the time for the data to actually be transmitted from disk to main memory, and is proportional to the amount of data transmitted. The video decompression time is usually negligible with a hardware-based implementation. Unfortunately, as Figure 2 depicts, this is infeasible because the disk access time might exceed the display duration, causing jitter. The situation becomes even worse when supporting fast-forward operation without dropping frames, because the video frame display duration is shorter.
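The timing argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original system: the function names are hypothetical, and the 5 msec transfer time per compressed frame is an assumed, illustrative value. The seek and latency figures are the ones quoted in the text.

```python
NTSC_DISPLAY_MS = 33.33  # display duration of one NTSC frame (30 frames/sec)

def frame_fetch_time_ms(seek_ms, latency_ms, transfer_ms):
    """Time to fetch one randomly placed frame: seek + rotational latency + transfer."""
    return seek_ms + latency_ms + transfer_ms

def is_jitter_free(fetch_ms, display_ms=NTSC_DISPLAY_MS):
    """The next frame must be in memory before the current frame finishes displaying."""
    return fetch_ms <= display_ms

# Worst case from the text: 40 msec seek and 16.66 msec latency (3600 rpm).
# The 5 msec transfer time is an assumption made for illustration.
worst_case_ms = frame_fetch_time_ms(40.0, 16.66, 5.0)
# Favourable case: shortest quoted seek (5 msec) and no rotational delay.
best_case_ms = frame_fetch_time_ms(5.0, 0.0, 5.0)

print(f"worst case: {worst_case_ms:.2f} msec, jitter-free: {is_jitter_free(worst_case_ms)}")
print(f"best case:  {best_case_ms:.2f} msec, jitter-free: {is_jitter_free(best_case_ms)}")
```

With a two-frame buffer, a guarantee has to hold for the worst case, which is why the favourable case alone is not sufficient.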
[Figure 2: Infeasibility of real-time retrieval of video frames using random placement. Input time axis: seek time (~20 msec), latency time (~10 msec) and transfer time (~5 msec), ending at T. Display time axis: display of one video frame (~33 msec) starting at t = 0, with jitter before t = T + 33 msec.]

The cause of jitter stems from the following three conflicting goals:

1. Real-time constraint described above.

2. Placement: ideally these frames may be placed anywhere on the disk in order to fully utilize disk space and flexibility. However, a random distribution may result in worst-case seek and latency times when finding the next frame.

3. Buffering: we wish to minimize the buffer size (e.g., two frames) to accomplish this task. However, the system may not have sufficient time to seek, rotate and transfer the next frame before finishing with the current frame.

One way of solving the above problem is to place a constraint on the disk placement in order to reduce the unnecessary components of disk access time. An initially proposed solution is to place the continuous media contiguously in either clustered or non-clustered allocation [2, 3, 7, 12]. This contiguous placement scheme does solve the problem but suffers from inflexibility in dynamic allocation and replacement, especially for video editing, where users would be allowed to delete or insert video frames anywhere in a video stream. The contiguous placement scheme may require a large amount of file copying and merging during this on-line operation. An anomaly also arises: because the available buffer size is limited, the system cannot keep transferring these continuous media at all times. Figure 3 depicts this anomaly when there are three video frames stored contiguously in one track with a two-video-frame buffer.

[Figure 3: Transfer anomaly with contiguous placement and two-frame buffer. The system starts displaying the current block while transferring the next block from disk; the disk then rotates for 10 msec.]

Therefore, even though continuous media can be stored contiguously, we have to skip transferring data in order to avoid buffer overflow. Although most current implementations use contiguous allocation for continuous media, it is usually left to the synchronization protocols to skip transferring data when buffer overflow might occur.
In traditional text-based storage/retrieval systems, text data should be retrieved as fast as possible, or skipped when the buffer is full. Continuous media, especially video, must instead be retrieved under rigid timing. To meet this rigid timing constraint, only an appropriate amount of audio or video data should be retrieved into the buffer, no earlier and no later than a pre-specified time.

Introducing a gap (the so-called storage pattern (M,G) in the literature) between units of continuous media has been proposed by [20, 14, 16]. These gaps let the system wait (or transfer other data streams if multiple data streams are retrieved concurrently) and then transfer the next frame into the memory buffer while displaying the current video frame. The underlying assumption of this storage pattern scheme is the limitation imposed by a small and fixed buffer size. Gaps are inserted to reduce the required buffer size. However, merging techniques are required to merge several streams in order to increase disk utilization. The scheme also suffers the same disadvantage as the contiguous placement scheme in video editing applications. In addition, a large-capacity disk is usually shared with other data, and the existence of bad sectors and other data makes this particular storage pattern approach less promising.

In contrast to previous approaches, we adopt a novel video-frame level approach. We analyze the efficiency of the buffering schemes while maintaining the highest degree of flexibility in placement. It is our belief that the frame-level model is more appropriate for video than for audio, because a video frame usually cannot be displayed until all the data composing that frame has been transferred completely. This observation is important for video applications because features like video editing require support for random access to individual video frames. This feature highlights the need for flexibility in data placement. For example, erasing a couple of video frames in a video file should not cause any jitter in the real-time retrieval of the edited video file. The contiguous placement and storage pattern placement schemes do not appear to be able to fulfill this requirement without large file copying and merging on the disk.
To the best of our knowledge, this level of modeling and analysis has not been investigated, particularly in terms of efficient buffering while maintaining the flexibility of placement. We intend to provide an integrated solution which offers more flexible storage placement and efficient buffering schemes. We present two buffering schemes in this paper, the two-buffer scheme and the k-buffer compensation scheme.

The first buffering scheme that we will introduce, the two-buffer scheme, requires only that a small group of sequential video frames be stored consecutively (i.e., clustered) in each cylinder. We call these contiguous frames a block. This scheme eliminates disk seek and latency times within the block. Although this scheme is simple and can be easily implemented, it still suffers from transfer anomaly (to be discussed later). It also requires a larger buffer space, has low disk utilization and has a coarse granularity of disk data transfer (footnote 3).

We improve this method further by proposing the k-buffer compensation scheme, which uses more than two buffers and requires some blocks to be placed randomly in the same cylinder. Note that although each block is internally contiguous, all individual blocks can be stored anywhere in each cylinder. This distinguishes this scheme from contiguous placement. The constraint of placing blocks randomly on the same cylinder increases the slack time (to be defined later) further by eliminating the disk seek time between blocks in the same cylinder.

Footnote 3: To simplify the discussion in this article, the granularity unit of transferring to/from the video buffer is assumed to be frame-based.

As we will demonstrate, by increasing the number of blocks in each cylinder, the size of the block can be reduced to the smallest values for video media of different qualities (e.g., one for NTSC). Analyses and examples are developed to illustrate the feasibility of these two schemes.

The rest of this paper is organized as follows. We introduce the two-buffer scheme in Section 2 and the k-buffer compensation scheme in Section 3. Further discussion on the flexibility of placement is included in Section 4. The experimental performance measurement and a discussion of optimality are described in Section 5. A survey of related work is covered in Section 6. Finally, the conclusion and future directions are included in Section 7 and Section 8.

2 Two-buffer Scheme

Advances in compression techniques make it possible to have VLSI chips perform decompression on the fly when displaying video [8, 5, 10, 13]. Management of the buffers for compressed video and audio data has an advantage over uncompressed data because of the reduced size of compressed continuous media and the decreased bandwidth requirement. To provide a better illustration of the k-buffer compensation scheme, we first introduce the two-buffer scheme, which is a special case using only two buffers with no compensation feature. This two-buffer scheme makes it easier for readers to understand the transition to the efficient k-buffer compensation scheme. We will also use the two-buffer scheme as a baseline for comparison with the k-buffer compensation scheme.

Table 3 lists the definitions and parameters that we use in this article. These typical values are from [1], and we typically use CDC's PA8G1 3600-rpm disk drive and Seagate's Elite-1 5400-rpm disk drive in numerical results. For most video compression schemes, tens of sectors are still required to store one compressed video frame. Random access on video data implies random access to the beginning of any particular frame.
However, from Figure 2 and [12, 15], it is impossible to provide random access between individual frames while satisfying the real-time requirements. Therefore, we define a block as one individual unit that consists of multiple frames of video. We also assume that random access on continuous media should have a granularity of blocks instead of sectors. We define this particular random access on continuous media as follows:

Definition 1 (constrained random access): Constrained random access on continuous media is a special case of random access such that although the system has random access between different blocks (i.e., the next block can be anywhere on disk), it only has sequential access within each block.

Symbol      Definition                                               Units            Typical value(s)
s           size of disk sector                                      kb/sector        4 kb
f           number of tracks in one cylinder                         track/cylinder   15
n           number of sectors on one track                           sector/track     64
T_ws        worst seek time                                          msec             40 (CDC), 15 (Seagate)
T_wl        worst latency time                                       msec             16.66 (CDC), 8.33 (Seagate)
R_dt        data transfer rate                                       Mb/sec           15 (CDC), 30 (Seagate)
S_vf        size of compressed video frame                           kbit/frame       100-200
T_display   display duration of video frame                          msec/frame       100 (animation), 33.33 (NTSC), 16.66 (HDTV)
x           block size required for the two-buffer scheme            frame/block      1-20 (two-buffer scheme),
            and the k-buffer compensation scheme                                      mostly 1 (k-buffer compensation scheme)
k           number of buffers required for the k-buffer              block            2-10
            compensation scheme
p           number of blocks required in each cylinder for the       block/cylinder   3-40
            k-buffer compensation scheme
B_total     total size of buffer required                            kbits            2 x S_vf (two-buffer scheme),
                                                                                      k x S_vf (k-buffer compensation scheme)

Table 3: Definitions and parameters used in this article

Constrained random access is critical when supporting the real-time retrieval of continuous media. For example, since a compressed video frame takes tens of sectors of storage, one cylinder of a hard disk can only store a limited number of video frames (e.g., 40 frames/cylinder). Even when video frames are displayed sequentially, the system still has to move the disk arm to another cylinder every 40 frames. The nature of video editing also imposes the need for constrained random access, since an integrated video presentation might consist of several different video segments stored randomly.

It is known that the concurrent pipelining of retrieval and display of continuous media requires prefetching and at least two buffers [18, 15]. One buffer is for retrieval of the next block of data while the other is being displayed. When the block size is just one video frame, and frames are distributed fully at random, the real-time retrieval of continuous media can be achieved by satisfying equation (1) for each video frame.
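The cylinder capacity mentioned above follows directly from the disk geometry in Table 3. The sketch below is illustrative only: the function name is hypothetical, and it assumes that every frame occupies a whole number of sectors, which the text does not state explicitly.

```python
def frames_per_cylinder(tracks_per_cyl, sectors_per_track, sector_kb, frame_kb):
    """Number of whole compressed frames that fit in one cylinder.

    Assumption (not from the paper): each frame occupies a whole number of
    sectors, so a partially used sector is not shared between frames.
    """
    sectors_per_frame = -(-frame_kb // sector_kb)  # ceiling division on integers
    sectors_per_cylinder = tracks_per_cyl * sectors_per_track
    return sectors_per_cylinder // sectors_per_frame

# Table 3 geometry: f = 15 tracks/cylinder, n = 64 sectors/track, s = 4 kb/sector.
print(frames_per_cylinder(15, 64, 4, 100))  # 100 kb frames (animation, NTSC)
print(frames_per_cylinder(15, 64, 4, 200))  # 200 kb frames (HDTV)
```

For 100 kb frames this gives 38 frames per cylinder, close to the round figure of 40 used in the text, so the disk arm must move roughly once per second of NTSC playback even for sequentially stored video.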
Since the decompression time and the transfer time on the bus are small (in the range of microseconds) compared to the disk access time (in the range of milliseconds), we will ignore these two parameters and concentrate on the disk access time. Throughout this paper, our analyses are based on the worst-case seek and latency times necessary to guarantee jitter-free delivery with the maximum flexibility of placement. Thus, on average, the system may perform better than our predictions. We also assume that no disk scheduling policies are implemented in the hard disk. This assumption is valid because, in reality, many disk drives do not have any disk scheduling policies implemented on the disk controller.

$$T_{ws} + T_{wl} + \frac{S_{vf}}{R_{dt}} \le T_{display} \qquad (1)$$

As we pointed out earlier, a total buffer size of two frames is not sufficient for video on hard disks with random access. Since $T_{display}$ only lasts 33 milliseconds for an NTSC video frame, jitter-free real-time retrieval cannot be guaranteed. Therefore, to support constrained random access on real-time retrieval and display of video, the simplest solution is to enlarge the size of each buffer from one frame to one block containing $x$ frames. This enlargement gives the system sufficient time to seek, locate, and transfer the next block before it finishes displaying the current block. This scheme achieves more flexibility in placement than the contiguous and storage pattern approaches. The following inequality describes this relaxation:

$$T_{ws} + T_{wl} + x \frac{S_{vf}}{R_{dt}} \le x \, T_{display} \qquad (2)$$

Figure 4 illustrates the logical model for the two-buffer scheme. Throughout this paper, we assume that there are fast switches which enable these buffers to read data from the disk and then send data to the display devices. We also assume that all frames in one block are stored consecutively in one cylinder. Thus, for any particular hard disk, there is an upper bound for $x$, which is $\lfloor f n / S_{vf} \rfloor$ (with $S_{vf}$ measured in sectors per frame).

[Figure 4: Logical diagram of the two-buffer scheme, showing the disk, the CPU and memory with two buffers of x basic units (1 block) each, and the network subsystem.]

We assume that $S_{vf} / R_{dt} < T_{display}$ always holds; otherwise it would not be possible to support real-time continuous media retrieval. Equation (3) shows the required value of $x$ which supports constrained random access using the two-buffer scheme:

$$x = \left\lceil \frac{T_{ws} + T_{wl}}{T_{display} - \frac{S_{vf}}{R_{dt}}} \right\rceil \qquad (3)$$

The total buffer size required is:

$$B_{total} = 2 \, S_{vf} \, x$$

Table 4 lists the required $x$ and $B_{total}$ values in the two-buffer scheme for different video qualities on different disk drives. For the CDC 3600-rpm disk drive, animation-quality video only requires 10 frames per second. This requirement can be easily satisfied by $x = 1$. NTSC-quality video requires a larger block size since the display rate is higher. With a three-frame block size, the system is able to provide constrained random access on a hard drive for the real-time retrieval of NTSC video. HDTV-quality video requires a very large block size because the frame size is large and the display rate is high. By setting $x = 18$, the system is still able to provide constrained random access on a hard drive while satisfying the real-time retrieval of HDTV video. Generally speaking, the block sizes required by the two-buffer scheme are considered large, especially for HDTV-quality video (e.g., $B_{total} = 2 \times 18 \times 200 = 7200$ kb).

Future multimedia applications require integrated processing of continuous and non-continuous (i.e., traditional) data. The large block size required by the two-buffer scheme results in low disk utilization because many segments of free sectors are smaller than the block size and thus cannot be utilized. Since the bandwidth of disk transfer is limited, the large transfer granularity of pipelining concurrency in the two-buffer scheme is also a disadvantage when transferring other data. These issues are addressed by the k-buffer compensation scheme in the next section.
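The block sizes discussed above can be reproduced numerically from equation (3) and the buffer-size formula. This is a minimal sketch under stated assumptions: the function names are hypothetical, the transfer rate of 15 Mb/sec is taken as 15 kb per msec, and only the CDC drive parameters from Table 3 are used.

```python
import math

# CDC PA8G1 parameters from Table 3 (msec, msec, kb/msec).
# Assumption: 15 Mb/sec is interpreted as 15 kb per msec.
T_WS, T_WL, R_DT = 40.0, 16.66, 15.0

# (compressed frame size in kb, display duration in msec) per video quality.
QUALITIES = {"animation": (100, 100.0), "NTSC": (100, 33.33), "HDTV": (200, 16.66)}

def two_buffer_block_size(t_ws, t_wl, s_vf, r_dt, t_display):
    """Smallest block size x (in frames) satisfying inequality (2), i.e. equation (3)."""
    transfer_ms = s_vf / r_dt
    if transfer_ms >= t_display:
        raise ValueError("frame transfer is not faster than display; no block size works")
    return max(1, math.ceil((t_ws + t_wl) / (t_display - transfer_ms)))

def two_buffer_total_kb(x, s_vf):
    """Total buffer space of the two-buffer scheme: two blocks of x frames each."""
    return 2 * x * s_vf

results = {}
for name, (s_vf, t_display) in QUALITIES.items():
    x = two_buffer_block_size(T_WS, T_WL, s_vf, R_DT, t_display)
    results[name] = (x, two_buffer_total_kb(x, s_vf))
    print(f"{name:9s} x = {x:2d}  B_total = {results[name][1]} kb")
```

The HDTV case is large because the denominator of equation (3), the per-frame margin between display time and transfer time, shrinks to about 3.3 msec.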
Disk Drive                   T_ws        T_wl         Video quality   x    B_total
CDC PA8G1 (3600 rpm)         40 msec     16.66 msec   animation       1    200 kb
(R_dt = 15 Mb/sec)                                    NTSC            3    600 kb
                                                      HDTV            18   7200 kb
Seagate Elite-1 (5400 rpm)   22.5 msec   11.12 msec   animation       1    200 kb
(R_dt = 24 Mb/sec)                                    NTSC            2    400 kb
                                                      HDTV            5    2000 kb

Table 4: Required block size x in the two-buffer scheme for different video qualities on different disk drives

3 Improved Scheme: K-buffer Compensation Scheme

It is beneficial for the system's memory utilization if only the minimum necessary amount of buffer space is used, while constrained random access on real-time retrieval of continuous media is still preserved. The total buffer size $B_{total}$ required by the two-buffer scheme is still considered too large, especially for HDTV-quality video or for supporting fast-motion operation without dropping frames for NTSC-quality video.

Another disadvantage of the two-buffer scheme is the granularity of data transfer during pipelining. The system has to lock one buffer while displaying the current block and simultaneously transferring the next block. We assume that the transfer granularity for this concurrency is the block in the two-buffer scheme. Although the current display buffer might have space to accommodate more video frames after some frames in the display block have been displayed, no more frames can be transferred into this available space. Therefore, the reserved buffer space is not fully utilized. Figure 5 shows how this transfer anomaly can occur when $x = 3$ in the two-buffer scheme. Part (a) depicts the time instant when the system starts displaying the current block while the disk starts accessing the next block. Part (b) demonstrates that after two frames have been displayed, buffer (2) is already filled and the I/O system cannot transfer more frames into the locked buffer (1). As the block size increases, more memory is allocated, most of which is not accessed for extended periods.

[Figure 5: Transfer anomaly in the two-buffer scheme with x = 3. Part (a): the system starts displaying the current block from buffer (1) while transferring the next block from disk into buffer (2). Part (b): the state after two frames have been displayed.]

Low disk utilization is also a penalty that the two-buffer scheme has to pay. Because all the video frames in the same block must be stored contiguously, the disk utilization is low when the block size is large.

To avoid the disadvantages of large buffer size, large transfer granularity and low disk utilization, we introduce the k-buffer compensation scheme in this section. Let us define the slack time, $T_{slack}$, as the time the system gains from selective placement on the disk compared to random placement. The two-buffer scheme does not take full advantage of disk placement. The slack time gained by the two-buffer scheme with block size $x$ is

$$T_{slack} = (x - 1)(T_{ws} + T_{wl})$$

We can improve (i.e., increase) the slack time and decrease the block size by requiring some successive blocks to be placed randomly in the same cylinder. By placing $p$ consecutive blocks randomly within each cylinder, we can avoid the seek time between block accesses within the same cylinder. Thus, more slack time can be achieved between blocks in the same cylinder while maintaining the desired placement flexibility. Equation (4) becomes the new placement requirement which provides constrained random access on the real-time retrieval of continuous media:

$$T_{ws} + p \left( T_{wl} + x \frac{S_{vf}}{R_{dt}} \right) \le p \, x \, T_{display} \qquad (4)$$

Because of the fixed buffer size constraint, and the nature of constrained random access for video editing, the system requires $T_{wl}$ latency time for each of these $p$ video blocks in the worst case. The slack time saved becomes

$$T_{slack} = (p x - 1) \, T_{ws} + p (x - 1) \, T_{wl}$$

The proper values of $x$ and $p$ (together called a group) in this two-layer placement are the key to saving sufficient slack time and requiring a smaller total buffer size with the highest degree of flexibility in placement. Consider again Figure 5 in the two-buffer scheme with $x = 3$. The disk placement assumed in this buffering scheme is equivalent to $x = 3$ and $p = 1$, and the total required buffer size is six video frames. Since we have a transfer anomaly after we display two video frames, it would be better to have three buffers with $x = 2$ and $k = 3$ to relieve this anomaly. Furthermore, using $x = 1$ and $3 \le k \le 6$ we can achieve the same goal. The idea of the k-buffer compensation scheme then becomes clear: by constraining $p$ sequential blocks to be placed randomly in each cylinder, a potentially smaller buffer size is required. Since we now have two layers, $x$ and $p$, the definition of constrained random access is revised as Definition 2.

Definition 2 (revised constrained random access): Constrained random access on continuous media is a special case of random access such that by limiting the placement of some blocks (i.e., a group) on the same cylinder, the system can have random access (i.e., the next block can be anywhere on disk) between groups.
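The two slack-time expressions and placement requirement (4) can be compared numerically. The sketch below is illustrative: the function names are hypothetical, the transfer rate is taken as 15 kb per msec (an interpretation of 15 Mb/sec), and the CDC drive and NTSC values come from Table 3.

```python
def slack_two_buffer_ms(x, t_ws, t_wl):
    """Slack gained by storing the x frames of one block contiguously."""
    return (x - 1) * (t_ws + t_wl)

def slack_grouped_ms(x, p, t_ws, t_wl):
    """Slack gained when p blocks of x frames each share one cylinder."""
    return (p * x - 1) * t_ws + p * (x - 1) * t_wl

def satisfies_eq4(x, p, t_ws, t_wl, s_vf, r_dt, t_display):
    """Placement requirement (4): one seek, then p latencies and p block transfers."""
    return t_ws + p * (t_wl + x * s_vf / r_dt) <= p * x * t_display

# CDC drive, NTSC video: T_ws = 40, T_wl = 16.66, S_vf = 100 kb, T_display = 33.33.
T_WS, T_WL, S_VF, R_DT, T_DISPLAY = 40.0, 16.66, 100, 15.0, 33.33

# With p = 1 the grouped expression reduces to the two-buffer expression.
print(slack_two_buffer_ms(3, T_WS, T_WL), slack_grouped_ms(3, 1, T_WS, T_WL))

# Single-frame blocks gain slack only through the shared cylinder.
print(slack_grouped_ms(1, 8, T_WS, T_WL))

for p in (3, 5):
    print(p, satisfies_eq4(1, p, T_WS, T_WL, S_VF, R_DT, T_DISPLAY))
```

Setting $p = 1$ recovers the two-buffer case, which is a quick consistency check on the two formulas; with $x = 1$ all of the slack comes from avoided seeks.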
The problem of providing jitter-free video delivery with flexible placement can thus be transformed into the following equivalent problem: how do we find a proper design of $x$ and $p$ in placement, and their associated $k$ parameter in buffering, such that the revised constrained random access can still be achieved? The following subsections describe our k-buffer compensation scheme and its derivation.

3.1 Compensation in the k-buffer compensation scheme

The logical model of the k-buffer compensation scheme is depicted in Figure 6.

[Figure 6: The logical model of the k-buffer compensation scheme, showing the disk, the CPU and memory, and the network subsystem.]

The idea of compensation is motivated by the fact that the disk data transfer rate varies: it is faster when video frames are placed close together and slower when more seek and latency time is required. In order to overcome this uncertainty and maintain jitter-free continuity of display, we have to reserve (i.e., pre-fetch) a sufficient number of video frames to accommodate this variation. This unique compensation feature is a major reason that the k-buffer compensation scheme performs better than the two-buffer scheme.

The idea of compensation is depicted in Figure 7. During normal operation, the system reserves $(k - 1)$ buffers for future constrained random access. While the system is displaying the current video frame, the next video frame is transferred into the buffers (i.e., from (a) to (b) in Figure 7). This stage can be described by the following inequality:

$$T_{ws} + T_{wl} + x \frac{S_{vf}}{R_{dt}} \le (k - 1) \, x \, T_{display} \qquad (5)$$

The system continues this operation until the next constrained random access occurs. When the disk head moves to a different cylinder before transferring the next video frame (i.e., from (b) to (c) in Figure 7), the system uses the reserved $(k - 1)$ buffers in order to continue displaying video frames. While this happens, the system might use up to $k - 1$ reserved buffers because of disk access, as we depict in (c). However, after the constrained random access, the retrieval operation tries to compensate for the loss of reserved buffers by transferring more data (i.e., from (c) to (d) and from (d) to (e) in Figure 7) in the current cylinder. In order to make this compensation feasible, the following inequality needs to be enforced after moving to a new cylinder:

$$x \frac{S_{vf}}{R_{dt}} + T_{wl} \le x \, T_{display} \qquad (6)$$
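Inequalities (5) and (6) can each be solved for the smallest integer that satisfies them: (6) bounds the block size $x$ from below, and (5) then bounds the number of buffers $k$. The following sketch does this numerically. The function names are hypothetical, the transfer rate of 15 Mb/sec is taken as 15 kb per msec, and the parameters are the CDC drive values from Table 3.

```python
import math

def min_block_size(t_wl, s_vf, r_dt, t_display):
    """Smallest integer x with x*S_vf/R_dt + T_wl <= x*T_display (inequality 6)."""
    return max(1, math.ceil(t_wl / (t_display - s_vf / r_dt)))

def min_buffers(x, t_ws, t_wl, s_vf, r_dt, t_display):
    """Smallest integer k with T_ws + T_wl + x*S_vf/R_dt <= (k-1)*x*T_display (inequality 5)."""
    return math.ceil((t_ws + t_wl + x * s_vf / r_dt) / (x * t_display)) + 1

# CDC PA8G1 from Table 3; 15 Mb/sec interpreted as 15 kb per msec (assumption).
T_WS, T_WL, R_DT = 40.0, 16.66, 15.0

for name, s_vf, t_display in [("NTSC", 100, 33.33), ("HDTV", 200, 16.66)]:
    x = min_block_size(T_WL, s_vf, R_DT, t_display)
    k = min_buffers(x, T_WS, T_WL, s_vf, R_DT, t_display)
    print(f"{name}: x = {x}, k = {k}, buffer = {k * x * s_vf} kb")
```

Because (6) omits the seek time, the block size it yields is much smaller than the one the two-buffer scheme needs; the seek is instead absorbed by the $(k - 1)$ reserved buffers.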

After these compensation operations, the system resumes the steady state, which reserves another $k - 1$ buffers for the next constrained random access before leaving the current cylinder. Since we have $p$ blocks allocated in each cylinder in any order, we have to reserve another $(k - 1)$ video frames before the next constrained random access occurs. This requirement is equivalent to transferring the $p - 1$ video blocks in the current cylinder while only $(p - 1) - (k - 1) = (p - k)$ are displayed in this period of time. The following inequality describes this constraint:

$$(p - 1) \left[ x \frac{S_{vf}}{R_{dt}} + T_{wl} \right] \le (p - k) \, x \, T_{display} \qquad (7)$$

[Figure 7: Compensation in the k-buffer compensation scheme with k = 4. Panels (a) to (e): the system reserves (k-1) buffers; after displaying one block, the disk head starts moving; after the head has moved, compensation takes place; the system is back to normal operation. Legend: reserved video frames, transferred video frames.]

Based on the compensation feature, the buffering and placement requirements for the k-buffer compensation scheme are described by inequalities (5)-(7). For the best results in terms of disk utilization and the transfer granularity of pipelining concurrency, the k-buffer compensation scheme should always make $x$ as small as possible. This optimality is discussed in the next section. Solving these inequalities for $x$, $k$, and $p$ and minimizing these values, we can identify the parameters as follows.

From equation (6),

$$x = \left\lceil \frac{T_{wl}}{T_{display} - \frac{S_{vf}}{R_{dt}}} \right\rceil \qquad (8)$$

From equation (5),

$$k = \left\lceil \frac{T_{ws} + T_{wl} + \frac{x S_{vf}}{R_{dt}}}{x \, T_{display}} \right\rceil + 1 \qquad (9)$$

From equation (7),

$$p = \left\lceil \frac{k\, x\, T_{display} - \frac{x S_{vf}}{R_{dt}} - T_{wl}}{x\, T_{display} - \frac{x S_{vf}}{R_{dt}} - T_{wl}} \right\rceil \qquad (10)$$

where
$x$ is the block size in the memory buffer and in placement,
$k$ is the required number of buffers, and
$p$ is the minimal number of blocks on the same cylinder.

3.2 Numerical Results

Table 5 lists the values of x, k and p using the k-buffer compensation scheme for different video qualities on different disks. When x is large enough, the k-buffer compensation scheme reduces to the two-buffer scheme with k = 2. We also list the typical values from the two-buffer scheme in this table for comparison purposes. Two examples are illustrated in this section.

Disk drive                   T_ws        T_wl         Video quality   x    k    p    x·k   x·p
CDC PA8G1 (3600 rpm,         40 msec     16.66 msec   animation       1    2    1    2     2
  R_dt = 15 Mb/sec)                                   NTSC            1    3    8    3     8
                                                                      3    2    1    6     2
                                                      HDTV            6    3    62   18    372
                                                                      18   2    1    36    18
Seagate Elite-1 (5400 rpm,   22.5 msec   11.12 msec   animation       1    2    1    2     2
  R_dt = 24 Mb/sec)                                   NTSC            1    3    5    3     5
                                                                      2    2    1    4     2
                                                      HDTV            2    3    14   6     28
                                                                      5    2    1    10    5

Table 5: Required x, k and p values in the k-buffer compensation scheme for different video qualities on different disk drives

Example 1 (Video stream of NTSC quality on a CDC hard disk):

step 1: evaluate equation (8):

$$x = \left\lceil \frac{16.66}{33.33 - \frac{100}{15}} \right\rceil = 1$$

step 2: determine the value of k using equation (9) and x = 1:

$$k = \left\lceil \frac{40 + 16.66 + \frac{100}{15}}{1 \times 33.33} \right\rceil + 1 = 2 + 1 = 3$$

step 3: from equation (10), the required placement p is

$$p = \left\lceil \frac{3 \times 33.33 - \frac{100}{15} - 16.66}{33.33 - \frac{100}{15} - 16.66} \right\rceil = 8$$

Note that the total buffer space required to support NTSC's $T_{display}$ in the k-buffer scheme is only 300 kb. This is a factor of two less than the 600 kb required by the two-buffer scheme. The transfer granularity of pipelining improves from three video frames to one video frame.

Example 2 (Video stream of HDTV quality on a CDC hard disk):

step 1: evaluating equation (8), we find that

$$x = \left\lceil \frac{16.66}{16.66 - \frac{200}{15}} \right\rceil = 6$$

step 2: determining the value of k using equation (9) and x = 6 results in

$$k = \left\lceil \frac{40 + 16.66 + \frac{6 \times 200}{15}}{6 \times 16.66} \right\rceil + 1 = 2 + 1 = 3$$

step 3: from equation (10), the new required placement p is

$$p = \left\lceil \frac{3 \times 6 \times 16.66 - \frac{6 \times 200}{15} - 16.66}{6 \times 16.66 - \frac{6 \times 200}{15} - 16.66} \right\rceil = 62$$

Consider the total buffer space required to support HDTV's $T_{display}$. The k-buffer scheme requires only 3600 kb. Again, this is a factor of two [footnote 4] less than the two-buffer scheme's 7200 kb. The concurrent access granularity improves from eighteen video frames to six video frames. However, the placement requirement may be too large to be accommodated in one cylinder.

Footnote 4: The identical reduction factors in the two examples are coincidental. Different values in the parameter setup can produce different reduction factors.

There are two approaches to solving this lack-of-space problem. The first is to use a disk drive with a higher rotational speed. As Table 5 shows, the required block size and group size are reduced significantly by using Seagate's 5400-rpm disk drive. The other approach is to extend the placement constraint with a third-layer zone. The current two-layer placement requirements do not limit the ordering of allocated cylinders. The track-to-track seek time between adjacent cylinders is much smaller than the worst-case latency time for disk rotation. Since we used the worst-case latency time in the design and analysis of the k-buffer compensation scheme, we can put video frames in adjacent cylinders as long as they are in the same zone and the sum of the track-to-track seek time and the rotation time is less than or equal to the worst-case latency time.

4 On the Flexibility of Placement

Previous work on the storage and retrieval of continuous media usually assumed contiguous placement (clustered or non-clustered) or (M,G) storage pattern placement on the disk drive. M represents the disk sectors required to store a video block, and G denotes the sectors that are left unused for this video stream. In the contiguous and storage pattern placement schemes, the major underlying assumption is that the order of storage placement is exactly the order in which the system retrieves the video medium. For example, contiguous placement always places the next video frame starting from the next disk sector, and storage pattern placement always places the next video frame after skipping G sectors.
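As a numerical cross-check on the sizing rules that the placement discussion builds on, equations (8)-(10) can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the function name `size_k_buffer`, its argument names, the error handling, and the unit convention (times in msec, frame size in kb, transfer rate in Mb/sec, with 1 Mb taken as 1000 kb so that the quotient is in msec) are all assumptions made for illustration.

```python
import math


def size_k_buffer(t_ws, t_wl, t_display, s_vf, r_dt):
    """Return (x, k, p) from equations (8)-(10).

    Times are in msec, s_vf in kb and r_dt in Mb/sec, so s_vf / r_dt is in
    msec (assuming 1 Mb = 1000 kb). Names are illustrative only.
    """
    t_frame = s_vf / r_dt  # transfer time of one video frame (msec)
    if t_display <= t_frame:
        raise ValueError("a frame cannot be transferred within its display time")

    # Equation (8): smallest block size satisfying constraint (6).
    x = math.ceil(t_wl / (t_display - t_frame))

    # Equation (9): number of buffers derived from constraint (5).
    k = math.ceil((t_ws + t_wl + x * t_frame) / (x * t_display)) + 1

    # Equation (10): blocks per cylinder derived from constraint (7).
    slack = x * t_display - x * t_frame - t_wl
    if slack <= 0:
        raise ValueError("no slack per block, so compensation is impossible")
    p = math.ceil((k * x * t_display - x * t_frame - t_wl) / slack)
    return x, k, p


# Parameters of the two worked examples (CDC drive, 15 Mb/sec).
print(size_k_buffer(40, 16.66, 33.33, 100, 15))  # NTSC
print(size_k_buffer(40, 16.66, 16.66, 200, 15))  # HDTV
```

Under these assumptions the two calls should reproduce the (x, k, p) triples of Example 1 and Example 2, and the same function can be pointed at the Seagate parameters in Table 5.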
A key advantage of the k-buffer compensation scheme over the storage pattern and contiguous placement schemes is that it allows greater flexibility in data placement on the disk drive to support advanced features like video editing and multimedia integration. Any placement scheme satisfying the following assumptions will work with the k-buffer compensation scheme when retrieving these continuous media:

- all x video frames in the same block should be stored consecutively.
- p different sequential blocks in the same group can be stored anywhere in one cylinder.
- different video groups can be stored anywhere on the disk.

The two-buffer scheme itself is a special case of the k-buffer compensation scheme with k = 2 and p = 1. However, because k = 2, the compensation feature does not exist at the video-frame level. Our placement requirement still differs from the storage pattern (M,G) and contiguous placement schemes because the next block could be anywhere in the same cylinder. Our placement requirement only needs to preserve a 'logical partial' ordering on the storage placement: within a group, p blocks are stored on the same cylinder, and the storage order of these blocks does not need to match the order of retrieval. Different groups can be on different cylinders, and the same principle applies between groups.

We believe that since multiple-head hard drives are now available with fast rotation speeds, the storage ordering can vary from the retrieval ordering to some degree. This will certainly introduce more seek time between cylinders and more latency time within a cylinder. However, we argue that the achieved placement flexibility will be more beneficial when we consider a general solution for future video editing and multimedia integration applications. Therefore, it is our belief that this two-layer (x, p) placement imposes fewer restrictions on placement flexibility than the contiguous and storage pattern placement schemes.

In particular, we use the following two restriction factors in addition to our two-layer (x, p) placement to model other placement schemes proposed in the research literature: (1) whether the retrieval order is preserved in the ordering of storage placement, and (2) the existence of randomness in the placement of the next video frame:

- contiguous placement can be modeled as x nf S vf and p = 1 with (1) order-preserving, and (2) the next video frame placed on the next sector in the current cylinder, or in the adjacent cylinder when the current cylinder is full.
- storage pattern placement can be modeled as x = M S vf and p = nf (M+G) with (1) order-preserving, and (2) the next video frame placed on the first sector after skipping G sectors on the current or the adjacent cylinders.
- random placement can be modeled as x = 1 and p can be any number with (1) no order-preserving, and (2) the next video frame placed starting from any sector on any cylinder.

5 Performance Measurements

To validate the proposed k-buffer compensation scheme, a series of experiments was performed to measure the jitters experienced with different values of x in the two-buffer scheme. We expected that, with a sufficiently large video block, constrained random access could be maintained without jitter. The measured results for the two-buffer scheme support this expectation and match the previous analysis.

The same measurements were also performed on the k-buffer compensation scheme with different values of x, p and k. The k-buffer compensation scheme was expected to reduce both the required buffer size and the number of jitters. The experimental measurements confirmed a significant improvement in these two performance metrics with the k-buffer compensation scheme. They showed that it is feasible to provide constrained random access with a small amount of buffer memory for both NTSC and HDTV-quality video.

5.1 Experiment Setup

5.1.1 Hardware Platform

The platform that we adopted is a Sun SPARC-10 machine with a dedicated 2GB SCSI HP C3010 hard disk. This particular HP hard disk contains 19 data surfaces, and each sector has a 512-byte capacity. There are 2325 cylinders in this drive; however, only 2255 data cylinders are available to store data. The drive uses ZBR (zone bit recording) coding, so that there are three regions. The outer region has 1500 data cylinders, and each cylinder has 1824 sectors. Both the middle and inner regions have 377 data cylinders. Each cylinder in the middle region has 1672 sectors, and each cylinder in the inner region has 1444 sectors. NTSC and HDTV video frames require 100 and 200 kbit respectively, and thus 25 and 50 sectors on this particular disk drive. Therefore, in the worst case, each cylinder in this disk drive can store at least $\lfloor 1444/25 \rfloor = 57$ NTSC video frames or $\lfloor 1444/50 \rfloor = 28$ HDTV video frames. These two numbers were used as the upper-bound values for the placement requirements in our experiments.

5.1.2 Software

A generic video retrieval program was developed for this series of experiments. Video data was retrieved from the HP C3010 disk drive and transferred into a shared buffer space for display. The shared buffer was locked during display. The I/O buffering itself is a kernel task, whose implementation would involve revising the device firmware, the device driver and the associated OS kernel.
Because we lacked the source code of SunOS on the SPARC-10, we emulated this buffering scheme in user processes with minimal OS overhead. The workload of the SPARC-10 machine was kept to a minimum. Disk access was monitored to ensure that the disk was accessed exclusively by a single process. It was the performance improvement of the k-buffer compensation scheme over the two-buffer scheme that we wanted to validate. Therefore, the measurement results obtained from user processes should still reflect the trend.
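The per-cylinder capacity bounds quoted for the hardware platform in Section 5.1.1 can be re-derived with a few lines of arithmetic. This is only an illustrative sketch: the function names are hypothetical, and it assumes that 1 kbit means 1024 bits, which is the reading under which 100 kbit and 200 kbit frames occupy exactly 25 and 50 sectors of 512 bytes.

```python
SECTOR_BYTES = 512      # sector capacity stated for the HP C3010
BITS_PER_KBIT = 1024    # assumption: 1 kbit = 1024 bits

# Sectors per cylinder in the three ZBR regions described in Section 5.1.1.
ZONES = {"outer": 1824, "middle": 1672, "inner": 1444}


def sectors_per_frame(frame_kbit):
    """Sectors needed to hold one frame, rounded up to whole sectors."""
    frame_bits = frame_kbit * BITS_PER_KBIT
    sector_bits = SECTOR_BYTES * 8
    return -(-frame_bits // sector_bits)  # ceiling division on integers


def frames_per_cylinder(cylinder_sectors, frame_kbit):
    """Whole frames that fit in one cylinder of the given size."""
    return cylinder_sectors // sectors_per_frame(frame_kbit)


for zone, sectors in ZONES.items():
    ntsc = frames_per_cylinder(sectors, 100)
    hdtv = frames_per_cylinder(sectors, 200)
    print(f"{zone}: {ntsc} NTSC frames, {hdtv} HDTV frames per cylinder")
```

The inner region is the worst case, so its figures are the ones that bound the placement parameter p in the experiments.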

5.1.3 Video Medium

Because of the wide adoption of the SCSI interface for connecting hard disks to host computers, it is hard to place the video data onto exactly the desired sectors. The SCSI interface presents a large linear logical array that is a 'close' mapping of the actual layout of physical sectors. The management of spare and bad sectors is usually handled by the SCSI adaptor and is totally hidden from user processes. Because of this limitation, we carefully traced the exact disk characteristics (e.g., the number of sectors in one cylinder) of the HP C3010 hard disk. Low-level raw I/O operations such as open, lseek, write, and read were then used to bypass the overhead of the existing file system. The randomness was obtained by getting the current time (e.g., time(tvec)), initializing the seed with srand(tvec[0]), and then using the rand() random generator to choose the placement. A video file with 9000 video frames (i.e., 300-second NTSC video or 150-second HDTV video) was placed using this random process. For the two-buffer scheme, each block of x video frames was placed contiguously, and different blocks were placed randomly on the disk drive. This randomness was achieved by using rand()* total-frames-of-disk. The k-buffer compensation scheme requires that at least p video blocks be placed in the same cylinder; however, the order within the cylinder was random for placement flexibility. This randomness was achieved by using rand()* total-frames-per-cylinder for these p video blocks. The cylinder itself was chosen purely at random by using rand()* total-cylinder-of-disk.

5.2 Measurement Results

5.2.1 Two-buffer scheme

As we analyzed in the previous sections, the block size is the key factor when using the two-buffer scheme. Figure 8 depicts the number of jitters that occurred as the block size is increased for both NTSC and HDTV video streams.
The HP C3010 disk drive has a 4002-rpm disk rotation speed, and thus has $T_{ws}$, $T_{wl}$ and $R_{dt}$ parameters in between those of the 3600-rpm CDC and 5400-rpm Seagate disk drives used in the numerical examples. Figure 8 shows the infeasibility of random access with a block size of x = 1, as we illustrated in the Introduction. The number of jitters is large: for NTSC video, it resulted in 8862 jitters among 9000 video frames (i.e., 98.5%), and for HDTV video it introduced 8900 jitters (i.e., 99%). With a block size of x = 2, the number of jitters for NTSC video is reduced to 62 among the 4500 video blocks (i.e., 1.4%). Jitter-free NTSC video quality is maintained by setting the block size x equal to or greater than 3 video frames. This measurement result matches the analysis in the previous section.

Figure 8: Measured number of jitters for different block sizes using the two-buffer scheme (x-axis: size of block, x frames; y-axis: number of jitters occurred; series: NTSC video-quality, HDTV video-quality)

For HDTV video, Table 6 lists the jitter reductions obtained by using larger block sizes. When the block size is less than 5 video frames, the jitter ratio is virtually 100%. By using a block size of x > 10, jitter-free HDTV video quality is maintained. This measurement result also matches our previous analysis. Notice that to maintain this jitter-free quality, the required memory size is at least $2 x S_{vf} = 2 \times 10 \times 200$ kb = 4 Mb.

block size (x)   # of measurements   jitters   ratio
1                9000                8900      98.5%
2                4500                4460      99.1%
3                3000                2959      98.6%
4                2250                1856      82.5%
5                1800                650       36.1%
6                1500                45        3.0%
8                1125                8         0.7%
9                1000                6         0.6%
10               900                 4         0.4%

Table 6: The improvement of jitter reduction using different block sizes in the two-buffer scheme for the HDTV video

5.2.2 K-buffer compensation scheme

As we analyzed for the k-buffer compensation scheme, three parameters affect the quality of video retrieval: the block size x, the number of buffers k, and the required placement of p blocks in a cylinder. To explore the maximal flexibility of placement, the block size x was set to 1 for NTSC video and 2 for HDTV video. Therefore, there are two parameters that we are particularly interested in for the experiments. The first is the placement requirement p, and the second is the number of buffers k.

Increasing the value of p potentially introduces slack time in the current cylinder, and thus makes it more likely to achieve compensation as described in the analysis. Increasing the value of k directly increases the amount of compensation, and thus the allowable display time. Figure 9 depicts the number of jitters that occurred as the placement p is increased for both NTSC and HDTV video streams using k = 3 buffers. Figure 10 shows the same measurement with k = 4 buffers. It is worth noting that the maximal numbers of jitters in both NTSC and HDTV video retrievals were reduced significantly. More specifically, the maximal number of jitters was reduced from over 8500 in the two-buffer scheme to less than 140 using the k-buffer compensation scheme.

Unlike the steady reduction curve in the measurement of the two-buffer scheme, we sometimes experienced slightly more jitters when using a larger p value (e.g., the HDTV video with k = 3 and p = 10). This was caused by the approximate mapping from SCSI's logical layout to the physical placement. A block that we assumed to be stored contiguously in the logical array might span two nonadjacent physical cylinders. Another possibility is that the cylinder size we assumed sometimes spans two physical cylinders. The results are more sensitive to this situation when few jitters are experienced; however, the general trend still demonstrates the efficiency of the k-buffer compensation scheme. By adopting a kernel process and bypassing the SCSI mapping, it should be possible to achieve a smooth performance curve.
Figure 9: Measured number of jitters by increasing the number of video frames in each cylinder (i.e., p) using the k(3)-buffer compensation scheme (x-axis: p video frames in each cylinder; y-axis: number of jitters occurred; series: NTSC video-quality (k=3), HDTV video-quality (k=3))

For the NTSC video retrieval using k = 3 buffers, Figure 9 shows that the number of jitters decreases as the p value increases. Using k = 4 shows the same trend, with a much smaller maximal number of jitters for the same p placement. Using p = 6 and k = 4 maintains jitter-free NTSC video quality. To maintain the same jitter-free quality, the k-buffer compensation scheme requires only $4 \times 100$ kb = 400 kb of memory, while the two-buffer scheme needs $2 \times 3 \times 100$ kb = 600 kb.
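The NTSC memory comparison above is simply the number of buffers times the block size times the frame size. A small sketch of that arithmetic follows; the helper name `buffer_memory_kb` and its parameter names are hypothetical, and the inputs are the NTSC values quoted in the text (frame size 100 kb, x = 1 for the k-buffer scheme, x = 3 for the two-buffer scheme).

```python
def buffer_memory_kb(k, x, s_vf_kb):
    """Total buffer space in kb: k buffers, each holding one block of x frames."""
    return k * x * s_vf_kb


# NTSC values quoted in the text (frame size 100 kb).
two_buffer = buffer_memory_kb(k=2, x=3, s_vf_kb=100)
k_buffer = buffer_memory_kb(k=4, x=1, s_vf_kb=100)

# Fraction of buffer memory saved by the k-buffer compensation scheme.
saving = 1 - k_buffer / two_buffer
print(two_buffer, k_buffer, f"{saving:.0%}")
```

The same helper gives the two-buffer HDTV requirement quoted earlier when called with k = 2, x = 10 and a 200 kb frame.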

Figure 10: Measured number of jitters by increasing the number of video frames in each cylinder (i.e., p) using the k(4)-buffer compensation scheme (x-axis: p video frames in each cylinder; y-axis: number of jitters occurred; series: NTSC video-quality (k=4), HDTV video-quality (k=4))

For HDTV video, Table 7 lists the jitter reductions obtained by using k = 3 and k = 4. Notice that the jitter ratio is improved significantly, to 3%, simply by adopting more than two buffers and a minimal placement requirement. Using k = 4 and p > 6, jitter-free quality is maintained. As measured in this section, the two-buffer scheme needs 4 Mb of memory, while the k-buffer compensation scheme requires only $4 \times 200$ kb = 800 kb to maintain this jitter-free quality.

k   placement (p)   # of measurements   jitters   ratio
3   4               4500                133       3.0%
3   5               4500                117       2.6%
3   6               4500                36        0.8%
3   7               4500                29        0.6%
3   8               4500                13        0.3%
3   9               4500                14        0.3%
4   4               4500                3         0.1%
4   5               4500                3         0.1%
4   6               4500                1         0.0%

Table 7: The improvement using the k-buffer compensation scheme for the HDTV video

5.3 On the Optimality of Performance

Figure 11 depicts the relationship between k and p when the block size x is increased. It can be shown that by increasing the block size x, the number of blocks in a group for each cylinder (i.e., p) can be decreased. Notice that there is a drop when the block size x = 18. The reason for this sharp drop is that the k-buffer compensation scheme