IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 5, MAY 2011

Spatial-Random-Access-Enabled Video Coding for Interactive Virtual Pan/Tilt/Zoom Functionality

Aditya Mavlankar, Member, IEEE, and Bernd Girod, Fellow, IEEE

Abstract: High-spatial-resolution videos offer the possibility of viewing an arbitrary region-of-interest (RoI) interactively. Zoom functionality enables watching high-resolution content even on displays of lower spatial resolution. If arbitrary regions corresponding to arbitrary zoom factors can be served to the users, the transmission and/or decoding of the entire high-spatial-resolution video can be avoided. Moreover, if the video content can be encoded such that arbitrary RoIs corresponding to different zoom factors can be simply extracted from the compressed bitstream, we can avoid dedicated video encoding for each user. We propose such a video coding scheme that is vital in allowing the system to scale to large numbers of remote users as well as to encode and store the content for subsequent repeated playback. Apart from generating a multi-resolution representation, our coding scheme uses P slices from H.264/AVC. We study the tradeoff in the choice of slice size. A larger slice size enables higher coding efficiency for representing the entire scene but increases the number of pixels that have to be transmitted. The optimal slice size achieves the best tradeoff and minimizes the expected transmission bitrate. Experimental results confirm the optimality of our predicted slice size for various test cases. Furthermore, we propose an improvement based on background extraction and long-term memory motion-compensated prediction. Experiments indicate up to 85% bitrate reduction while retaining efficient random access capability.

Index Terms: Interactive video streaming, pan/tilt/zoom, region-of-interest.

I. Introduction

HIGH-spatial-resolution digital video will be widely available at low cost in the near future.
This development is driven by the increasing spatial resolution offered by digital imaging sensors and the increasing capacities of storage devices. Furthermore, there exist algorithms for stitching a comprehensive high-resolution view from multiple cameras [1], [2]. Certain current products stitch a large panoramic view in real time [3]. Also, image acquisition on spherical, cylindrical, or hyperbolic image planes via multiple cameras can record scenes with a wide field-of-view while the recorded data can be warped later to the desired viewing format [4]. An example of such an acquisition device is [5]. Despite the availability of high-resolution video, challenges in delivering this high-resolution content to the client are posed by the limited resolution of displays and/or limited data rate for communication. If the user were made to watch a spatially downsampled version of the entire video scene, then he might not be able to watch a local region-of-interest (RoI) with the recorded high resolution. To overcome this problem, we propose interactive virtual pan/tilt/zoom functionality while viewing the video.

Manuscript received May 21, 2009; revised October 22, 2009 and July 30, 2010; accepted October 18. Date of publication March 17, 2011; date of current version May 4. This paper was recommended by Associate Editor I. Ahmad. A. Mavlankar was with Stanford University, Stanford, CA USA. He is now with Tely Labs, Inc., Menlo Park, CA USA (e-mail: aditya.mavlankar@ieee.org). B. Girod is with the Department of Electrical Engineering, Stanford University, Stanford, CA USA (e-mail: bgirod@stanford.edu). Color versions of one or more of the figures in this paper are available online. Digital Object Identifier /TCSVT. /$26.00 © 2011 IEEE
Some practical scenarios where this kind of interactivity is well-suited are: interactive playback of a high-resolution video from a locally stored file, interactive TV for watching content captured with very high detail (e.g., interactive viewing of sports events), providing virtual pan/tilt/zoom within a wide-angle and high-resolution scene from a surveillance camera, and streaming instructional videos captured with high spatial resolution (e.g., panel discussions, lecture videos). A video clip that showcases interactive viewing of soccer in a TV-like setting can be seen here [6].

In a streaming scenario, our proposed video coding scheme allows transmitting user-selected RoIs, thus eliminating the need to transmit the entire spatial extent of the scene in full resolution. The encoding can either take place live or offline beforehand. Additionally, our scheme allows limiting the load of encoding irrespective of the number of users. The entire recorded field-of-view can be encoded once, possibly with multiple resolution layers to support different zoom factors. Spatial resolution layers are coded using P slices¹ of H.264/AVC. This one-time encoding generates a repository of slices, and relevant slices can be served to several users depending on their individual RoIs. Thus, the coding scheme allows the system to scale to large numbers of users; it avoids a dedicated encoder for each user's individual RoI sequence. Another benefit is that requested RoIs can be extracted from the bitstream even inside or at the edge of the network, closer to the client-node.

Ideally, the video delivery system should be able to react to the user's changing RoI with as little latency as possible. The proposed coding scheme enables access to a new region, with an arbitrary zoom factor, during any frame interval instead of having to wait for the end of a group of pictures (GoP) or having to transmit extra slices from previous frames.
¹ We employ the following terminology: slice refers to a rectangular portion of a video frame, whereas tile refers to the sequence of slices from the same resolution layer and at the same position in each video frame.

The spatial random access approach developed in this paper is also relevant for the design of systems that employ image-based rendering (IBR) [7], [8] and manipulate the transmitted imagery further to yield a novel view, e.g., teleimmersive systems [9] and free viewpoint TV [10].

This paper is structured as follows. Section II reviews related work and discusses the challenges in providing random access. Section III presents the coding scheme and discusses how to optimize the slice size. The optimal slice size minimizes transmission bitrate by striking the best compromise between compression efficiency and superfluous pixel transmission. Section IV presents an improvement of the coding scheme based on background extraction and long-term memory motion-compensated prediction. Experiments indicate that the proposed improvement can reduce bitrate by up to 85% while retaining efficient random access capability.

II. Related Work

Taubman et al. [11] proposed a solution for interactive browsing of images using JPEG2000. The multi-resolution representation of an image using wavelets is leveraged to provide pan/tilt/zoom. JPEG2000 encodes blocks of wavelet transform coefficients independently. Consequently, every coded block has influence on the reconstruction of a limited number of pixels of the image. Moreover, the coding of each block results in an independent, embedded bitstream, which allows streaming any given block with a desired degree of fidelity. Taubman et al. also developed the JPEG2000 over Internet Protocol, for communication between client and server that supports remote interactive browsing of JPEG2000 coded images [12]. The server can keep track of the RoI trajectory of the client as well as the parts of the bitstream that have already been streamed to the client. Given a rate of transmission for the current time interval, the server solves an optimization problem to determine which parts of the bitstream should be sent in order to maximize the quality of the current RoI.
This is similar to the packet scheduling algorithms proposed in [13] for streaming of video. It should be noted, however, that an accurate model for the distortion reduction due to successful delivery of any particular packet is necessary.

Video coding for spatial random access presents a special challenge. To achieve good compression efficiency, video compression schemes typically employ motion-compensated interframe prediction for exploiting correlation among successive frames [14]-[16]. However, the coding dependencies among successive frames make it difficult to provide random access for spatial browsing within the scene. The decoding of a block of pixels requires that other reference frame blocks used by the predictor have previously been decoded. These reference frame blocks might lie outside the RoI and might not have been transmitted or decoded earlier.

Coding, transmission, and rendering of high-resolution panoramic videos using MPEG-4 is proposed in [17]. A limited part of the entire scene is transmitted to the client depending on the chosen viewpoint. In [17], only intraframe coding is used to allow random access. The scene is subdivided into slices which are coded independently. The authors also considered interframe coding to improve compression efficiency. However, they noted that this involves transmitting slices from the past if the current slice requires those for its decoding. A longer intraframe period entails significant transmission overhead for slices from the latter frames in the GoP, as this dependency chain grows. Besides the transmission overhead, the reference frame blocks also entail a growing overhead of decoding.

Coding and streaming of images from an IBR representation also entails the random access issues associated with interframe coding. This applies both when the captured scene is static and when it is evolving in time. Interactive streaming of static light fields has been studied by Ramanathan et al. in [18] and [19].
The abovementioned growing dependency chain is avoided by using multiple representations coding based on two new picture types defined in the H.264/AVC standard, the SP and SI picture types [20]. Ramanathan et al. also extended rate-distortion optimized packet scheduling, based on the framework in [13], to multiple representations coding for light fields. However, in their setup, only entire pictures from the light field data-set are streamed and there is no provision of spatial random access within a picture.

Compression and streaming of static light fields using distributed source coding has been investigated in [21] and [22]. If adequate rate is spent for signaling the non-key frames, then identical reconstruction is guaranteed independent of the reference block used as side information at the receiver. Although this simplifies random access, the coding efficiency is lower than that of hybrid video coding and the problem of rate estimation while streaming is challenging.

Bauermann et al. conducted a detailed analysis of the decoding complexity and the mean transmission bitrate for remote access to arbitrary parts of compressed image-based scene representations encoded using hybrid video coding [23], [24]. Their work, however, does not include a multi-resolution representation of the image data-set and is restricted to static imagery. Also, for saving transmission bitrate, apart from knowing which pixel blocks are currently required, the server also needs to know which pixel blocks have already been transmitted to the user. The server uses this information to stream a burst of reference pixel blocks. The resulting variations of instantaneous bitrate and decoding load are undesirable.

Recently, Kurutepe et al. [25] proposed live interactive 3DTV based on dynamic light fields. They employed application-layer peer-to-peer (P2P) multicast and delivered a subset of views to a peer from a set of multiview videos of the scene. Similar to [18] and [19], entire views are either selected or dropped according to the peer's viewpoint.
Random access to arbitrary views is provided by encoding the views independently. Multicasting lowers the bandwidth requirement at the server; however, the coded representation should consist of logical substreams for which multicast groups can be formed. Efficient random access is highly desirable since it simplifies the peer's task of deciding which multicast groups to subscribe to. Similar to [18] and [19], entire frames from the data-set are streamed or not, and there is no provision of spatial random access within a picture.

Background extraction for motion-compensated prediction has been proposed in [26]. Sprite coding defined in MPEG-4 Visual (MPEG-4 Part 2) allows coding the background either fully or partly for subsequent use as reference in predictive coding. The term sprite is more general and covers any transmitted video object that can be warped and/or cropped in certain ways for use by the motion predictor. However, unlike our proposed scheme, the compression schemes in the literature employing background extraction are not designed to provide virtual pan/tilt/zoom functionality.

III. Spatial-Random-Access-Enabled Video Coding

We have developed a graphical user interface which allows the user to select the RoI while watching the video. The RoI location and zoom factor are controlled by operating the mouse. The application supports continuous zoom to provide smooth control of the zoom factor. In addition to the RoI, we also display a thumbnail overview with an overlaid rectangle indicating the location of the RoI. Screenshots of the client display are shown in Fig. 1.

Fig. 1. Graphical user interface. The client display shows the thumbnail and the RoI. The effect of changing the zoom factor can be seen by comparing the two screenshots. Each screenshot shows a frame of the panoramic Cardgame video sequence used in our experiments.

Fig. 2. Video coding scheme. The thumbnail video constitutes a base layer and is coded with H.264/AVC using I, P, and B pictures. The reconstructed base layer video frames are upsampled by a suitable factor and used as prediction signal for encoding video corresponding to the higher resolution layers. Higher resolution layers are coded using P slices.

A. Coding Scheme Based on Upward Prediction and Slices

Fig. 2 shows the video coding scheme. The thumbnail overview constitutes a base layer video and is coded with H.264/AVC using I, P, and B pictures. The reconstructed base layer video frames are upsampled by a suitable factor and used as prediction signal for encoding video corresponding to the higher resolution layers.
Each frame belonging to a higher resolution layer is coded using a grid of rectangular P slices. Employing upward prediction from only the thumbnail enables efficient random access to local regions within any spatial resolution. For a given frame interval, the display of the client is rendered by transmitting the corresponding frame from the base layer and a few P slices from exactly one higher resolution layer. We transmit slices from that resolution layer which corresponds closest to the user's current zoom factor. At the client's side, the corresponding RoI from this resolution layer is resampled to correspond to the user's zoom factor. We may store only a few spatial resolution layers at the server but can still render smooth zoom control. If a required enhancement layer P slice is unavailable at the client, for example, due to loss in the network, we perform error concealment by upsampling portions of the thumbnail video.

In our experiments, the spatial resolution layers stored at the server are dyadically spaced. Hence, the reconstructed thumbnail frame needs to be upsampled by powers of two horizontally and vertically to generate the corresponding prediction signal. For upsampling the luminance component, we employ the six-tap filter having the coefficients (1, −5, 20, 20, −5, 1)/32 as defined in H.264/AVC. For chroma, we employ a simple two-tap filter with equal coefficients. The upsampling procedure is repeated an appropriate number of times depending on the resolution layer. Although we choose these parameters for our experiments, our design can incorporate arbitrarily spaced resolution layers and also arbitrary procedures for upsampling the reconstructed base layer. Also, at the client's side, for resampling the corresponding RoI from the chosen resolution layer, any technique can be accommodated. In our experiments, we use bilinear interpolation.

B. Comparison with Current Video Compression Standards

The coding scheme proposed above uses H.264/AVC building blocks but itself is not standard compliant. State-of-the-art video compression standards, H.264/AVC and SVC, provide tools like slices but no straightforward method for spatial random access since their main focus has been compression efficiency of full-frame video and resilience to losses. SVC supports both slices as well as spatial resolution layers. Alas, SVC allows only single-loop decoding whereas upward prediction from intercoded base-layer frames implies multiple-loop decoding, and hence is not supported by the standard. If the base layer frame is intercoded, then SVC allows predicting the motion-compensation residual at the higher-resolution layer from the residual at the base layer. However, interframe prediction dependencies across tiles belonging to a high-resolution layer hamper spatial random access. Note that for employing SVC, the motion vectors (MVs) can be chosen to avoid inter-tile dependencies. Also note that instead of SVC, AVC could be employed separately for the high-resolution layers with the MVs similarly restricted to eliminate inter-tile dependencies. This is very similar to treating the tiles as separate video sequences. An obvious drawback is the redundancy between the high-resolution tiles and the base layer. A second drawback is that after an RoI change, a newly needed tile can only be decoded starting from an intracoded slice. However, note that B slices could also be employed for the high-resolution layer.
Prior work on view random access, discussed in Section II, employs multiple representations for coding an image. Similarly, we can use multiple representations for coding a high-resolution slice. This will allow us to use interframe coding among successive high-resolution layer frames and to transmit the appropriate representation for a slice depending on the slices that have been transmitted earlier. Some representations will exploit inter-tile correlation, thus lowering the transmission bitrate. However, more storage will be required for multiple representations. The benefit of the scheme in Fig. 2 is that knowing the current RoI is enough to decide which data need to be transmitted, unlike the case of multiple representations where the decision is conditional on prior transmitted data.

In our proposed scheme, motion compensation among successive frames is performed at the base layer. We also employ displacement compensation with a small search range of about four pixels to find the best match relative to the upsampled base layer frame while coding the high-resolution P slices. The total encoding load is determined by the maximum resolution and the number of layers and can be estimated to be roughly 1.3 times the load of encoding just the highest resolution layer using standard motion-compensated hybrid video coding.

Fig. 3. Depending on the slice size and the location of the RoI within the given resolution layer, there is an overhead of pixels that are transmitted but not used for rendering the client's display. The shaded portion depicts the pixel overhead in this example.

Fig. 4. Sequence of pixels is divided into 1-D slices. In this example, the length of each slice is $s = 4$. The length of the 1-D region-of-interest is $R = 3$.

C. Minimization of Mean Transmission Bitrate

For the coding scheme shown in Fig. 2, the slice size for each resolution layer can be independently optimized given the prediction residual for that layer. The strategy proposed here can be independently used for all layers.
Given a resolution layer, we assume that the slices form a regular rectangular grid, so that every slice is $s_w$ pixels wide and $s_h$ pixels tall. The slices on the boundaries can have smaller dimensions due to the layer dimensions not being integer multiples of the slice dimensions. The number of bits transmitted to the client, or decoded for local playback, depends on the slice size as well as the user's RoI trajectory over the interactive viewing session. The quality of the decoded video depends on the quantization parameter (QP) used for encoding the slices. However, it should be noted that for the same QP, almost the same quality is obtained for different slice sizes, even though the number of bits is different. Hence, given the QP, our goal is to choose the slice size that minimizes the expected number of bits transmitted and/or decoded per rendered pixel.

The smaller the slice size, the worse is the coding efficiency. This is because of the increased number of slice headers, the lack of context continuation across slices for context adaptive coding, and the inability to exploit interpixel correlation across slices. On the other hand, a smaller slice size entails lower pixel overhead. The pixel overhead consists of pixels that have to be transmitted and/or decoded because of the coarse slice division, but are not used to render the client's display. For example, the shaded pixels in Fig. 3 show the pixel overhead for the shown slice grid and location of the RoI. In the following analysis, we assume that the RoI location can be changed with a granularity of one pixel both horizontally and vertically. Also, every location is equally likely to be selected. Depending on the application scenario, the slices might be put in different transport layer packets. The packetization overhead of layers below the application layer, for example RTP/UDP/IP, has not been taken into account but can be easily incorporated into the proposed optimization framework.

1) Pixel Overhead: To simplify the analysis, we first consider the 1-D case and then extend it to 2-D.
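The pixel overhead described above can be made concrete with a small sketch. The function names and the example numbers are hypothetical, and the sketch assumes an unbounded regular grid anchored at the origin, so the smaller slices at the layer boundaries are ignored.

```python
def covering_slices(x0, y0, roi_w, roi_h, slice_w, slice_h):
    """Return the column and row index ranges of the slices overlapping the RoI.

    (x0, y0) is the top-left RoI pixel; all arguments are in pixels.
    Assumption: regular grid anchored at (0, 0), no smaller boundary slices.
    """
    cols = range(x0 // slice_w, (x0 + roi_w - 1) // slice_w + 1)
    rows = range(y0 // slice_h, (y0 + roi_h - 1) // slice_h + 1)
    return cols, rows


def pixel_overhead(x0, y0, roi_w, roi_h, slice_w, slice_h):
    """Pixels that are transmitted but not used to render the RoI."""
    cols, rows = covering_slices(x0, y0, roi_w, roi_h, slice_w, slice_h)
    transmitted = len(cols) * slice_w * len(rows) * slice_h
    return transmitted - roi_w * roi_h


if __name__ == "__main__":
    # Hypothetical example: a 6x4 RoI at (5, 3) on a grid of 4x4 slices.
    print(pixel_overhead(5, 3, 6, 4, 4, 4))
```

Moving the same RoI by a few pixels changes how many slices it straddles, which is why the analysis averages the overhead over all RoI locations.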

a) Analysis in 1-D: Imagine an infinitely long sequence of pixels. This sequence is divided into slices of length $s$. For example, in Fig. 4, $s = 4$. Also given is the length of the region-of-interest, denoted by $R$. Assume $R = 3$ in this example. To calculate the pixel overhead, we are interested in the probability distribution of the number of 1-D slices that need to be transmitted. This can be obtained by testing for locations within one slice, since the pattern repeats every slice. For RoI locations w and x, we would need to transmit a single slice, whereas for locations y and z, we would need to transmit two slices. Let $N$ be the random variable representing the number of slices to be transmitted. Given $s$ and $R$, we can uniquely choose $m, R^{*} \in \mathbb{N}$ such that $m \ge 0$ and $1 \le R^{*} \le s$ and also the following relationship holds:

$$R = m s + R^{*}. \tag{1}$$

By inspection, we find the p.m.f. of random variable $N$

$$\Pr\{N = m+1\} = \frac{s - (R^{*} - 1)}{s}, \qquad \Pr\{N = m+2\} = \frac{R^{*} - 1}{s}$$

and zero everywhere else. From the p.m.f. of $N$

$$E\{N\} = (m+1)\,\frac{s - (R^{*} - 1)}{s} + (m+2)\,\frac{R^{*} - 1}{s} = (m+1) + \frac{R^{*} - 1}{s}. \tag{2}$$

Let $P$ be the random variable which denotes the number of pixels that need to be transmitted

$$E\{P\} = s\,E\{N\} = (m+1)\,s + R^{*} - 1 = R + s - 1. \tag{3}$$

The expected pixel overhead is $s - 1$. It increases monotonically with slice length $s$ and surprisingly is independent of the length $R$ of the region-of-interest. Alas, the result is that simple only for 1-D. If $R$ itself is a random variable, then for a given value of $R = r$, (3) can be rewritten as

$$E\{P \mid R = r\} = r + s - 1. \tag{4}$$

b) Analysis in 2-D: We define two new random variables, $P_w$, the number of columns to be transmitted, and $P_h$, the number of rows to be transmitted. Similarly, $R_w$ and $R_h$ are random variables denoting the number of columns and rows (among those transmitted) required to render the RoI, respectively. From the 1-D analysis, we obtain

$$E\{P_w \mid R_w = r_w\} = r_w + s_w - 1, \qquad E\{P_h \mid R_h = r_h\} = r_h + s_h - 1.$$

The number of transmitted pixels is also a random variable, $P = P_w P_h$.
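The 1-D result, an expected transmitted length of the RoI length plus the slice length minus one, can be checked by enumerating every RoI offset within one slice. This is a minimal sketch with hypothetical function names; it assumes, as in the analysis, that all offsets are equally likely. The same per-dimension reasoning applies to the column and row counts in 2-D.

```python
from fractions import Fraction


def slices_needed(offset, roi_len, slice_len):
    """Number of 1-D slices overlapped by an RoI starting at `offset`."""
    first = offset // slice_len
    last = (offset + roi_len - 1) // slice_len
    return last - first + 1


def expected_pixels_1d(roi_len, slice_len):
    """E{P}: mean transmitted pixels over all equally likely RoI offsets."""
    counts = [slices_needed(o, roi_len, slice_len) for o in range(slice_len)]
    expected_slices = Fraction(sum(counts), slice_len)  # E{N}
    return expected_slices * slice_len                  # E{P} = s * E{N}


if __name__ == "__main__":
    # The enumeration agrees with R + s - 1 for a few (R, s) pairs.
    for roi_len, slice_len in [(3, 4), (10, 4), (7, 16)]:
        assert expected_pixels_1d(roi_len, slice_len) == roi_len + slice_len - 1
    print(expected_pixels_1d(3, 4))
```

For the example of Fig. 4 (slice length 4, RoI length 3), two of the four offsets need one slice and two need two slices, matching the p.m.f. found by inspection.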
Since $P_w$ and $P_h$ can be assumed to be conditionally independent given $R_w, R_h$, we can write

$$E\{P \mid R_w = r_w, R_h = r_h\} = (r_w + s_w - 1)(r_h + s_h - 1). \tag{5}$$

While $R_w R_h$ is the number of pixels among those transmitted which are rendered in the RoI window, it is not the size of the RoI window. The array of $R_w \times R_h$ pixels is resampled to fit the fixed size $d_w \times d_h$ of the RoI display window. Recall that this allows us to support arbitrary zoom factors with a small number of discretely spaced resolution layers. Random variable $Z_C$ denotes the continuous zoom factor controlled by user input. Its value determines the value of the discrete random variable $Z_D$, which is the zoom factor rounded to a power of two. For example

$$Z_D = \begin{cases} 1, & \text{if } 1 \le Z_C < 1.5 \\ 2, & \text{if } 1.5 \le Z_C < 4. \end{cases} \tag{6}$$

To render the RoI at some zoom factor $Z_C$, we round to the discrete zoom factor $Z_D$ and retrieve the resolution layer $\log_2(Z_D) + 1$. The mismatch $Z_C / Z_D$ is made up by resizing the transmitted video after decoding. For our analysis, we need to model the conditional pdf of $Z_C$ given the layer number. In our modeling below, we assume that, given the layer number, $Z_C$ is uniformly distributed. For example, if the optimization is being carried out for the second layer in the example above, then we assume that $Z_C$ is uniformly distributed between 1.5 and 4. Note that the distribution of the user-selected zoom factor in practice might depend on the sizes of certain salient objects in the video. Nevertheless, we make the assumption about $Z_C$ without performing any video content analysis.

Let $d_w$ and $d_h$ be constants denoting the width and height of the RoI display portion on the client's display, respectively. The random variables $R_w$ and $R_h$ are determined by $Z_C$ as follows:

$$R_w = d_w \frac{Z_D}{Z_C}, \qquad R_h = d_h \frac{Z_D}{Z_C}. \tag{7}$$

The expected values of $R_w$ and $R_h$ are given by

$$E\{R_w\} = d_w Z_D\, E\left\{\frac{1}{Z_C}\right\}, \qquad E\{R_h\} = d_h Z_D\, E\left\{\frac{1}{Z_C}\right\}$$

since the analysis is carried out given the layer number and hence the discrete zoom factor, $Z_D$.
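As a hedged illustration of the zoom model, the sketch below maps a continuous zoom factor to a layer using the example thresholds of (6) and evaluates the expected RoI extent under the uniform assumption. The closed form for the mean of the inverse zoom, $\ln(b/a)/(b-a)$ on $[a, b]$, is the standard result for a uniform variable rather than a formula quoted from the text; the function names and the display width of 480 pixels are hypothetical.

```python
import math


def discrete_zoom(z_c):
    """Discrete zoom factor Z_D for the two-layer example thresholds in (6)."""
    if 1.0 <= z_c < 1.5:
        return 1
    if 1.5 <= z_c < 4.0:
        return 2
    raise ValueError("zoom factor outside the range covered by (6)")


def layer_index(z_d):
    """Resolution layer log2(Z_D) + 1 retrieved for discrete zoom Z_D."""
    return int(math.log2(z_d)) + 1


def expected_inverse_zoom(z_lo, z_hi):
    """E{1/Z_C} for Z_C uniform on [z_lo, z_hi]."""
    return math.log(z_hi / z_lo) / (z_hi - z_lo)


def expected_roi_extent(display_len, z_d, z_lo, z_hi):
    """E{R_w} or E{R_h}: display_len * Z_D * E{1/Z_C}."""
    return display_len * z_d * expected_inverse_zoom(z_lo, z_hi)


if __name__ == "__main__":
    # Hypothetical display width of 480 pixels, second layer (Z_D = 2).
    print(layer_index(discrete_zoom(3.0)))
    print(expected_roi_extent(480, 2, 1.5, 4.0))
```

Because the mean of $1/Z_C$ exceeds $1/E\{Z_C\}$, using the mean zoom directly would underestimate the expected number of accessed columns and rows.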
Now, we can apply iterated expectation on (5) to yield

$$E\{P\} = (E\{R_w\} + s_w - 1)(E\{R_h\} + s_h - 1). \tag{8}$$

2) Optimal Slice Size: The average number of bits per pixel for coding the prediction residual of a given resolution layer, denoted by $\eta(s_w, s_h)$, is a function of the slice size $(s_w, s_h)$. We also define the number of pixels transmitted per rendered pixel as the relative pixel overhead $\psi(s_w, s_h) = E\{P\} / (d_w d_h)$, where $E\{P\}$ is given by (8). The optimal slice size minimizes the expected number of bits transmitted per rendered pixel and is given by

$$(s_w^{\text{opt}}, s_h^{\text{opt}}) = \arg\min_{(s_w, s_h)} \eta(s_w, s_h)\, \psi(s_w, s_h). \tag{9}$$

One way to obtain the function $\eta(s_w, s_h)$ is through sample encodings of the prediction residual by varying the slice size. Alternatively, $\eta(s_w, s_h)$ could also be predicted by an analytical model to reduce the number of sample encodings. Either way, (9) can be used to find the optimal slice size.

Fig. 5. Model prediction versus empirical values for pixels transmitted per rendered pixel, $\psi(s_w, s_h)$, shown for the three sequences, Cardgame, Making Sense, and Soccer. The empirical values are obtained by averaging over 100 user-interaction trajectories for each sequence. The second y-axis shows the bits per pixel for coding the residual of the high-resolution layer, $\eta(s_w, s_h)$. The slice width and slice height in number of pixels are denoted by $s_w$ and $s_h$, respectively. (a) Cardgame sequence, layer 1 (PSNR 38.7 dB). (b) Cardgame sequence, layer 2 (PSNR 39.2 dB). (c) Making Sense sequence, layer 1 (PSNR 39.0 dB). (d) Making Sense sequence, layer 2 (PSNR 39.6 dB). (e) Soccer sequence, layer 1 (PSNR 35.5 dB). (f) Soccer sequence, layer 2 (PSNR 37.0 dB).

We now present experimental results to demonstrate that our model predicts the optimal slice size accurately without the need to capture user-interaction trajectories. In our experiments, we obtain $\eta(s_w, s_h)$ through a sample encoding of about 30 frames for each tested slice size configuration $(s_w, s_h)$. We use three video sequences for our experiments. The width × height of the Cardgame² and Making Sense² sequences is … pixels. For the Soccer³ sequence, it is … pixels. The RoI display is … pixels.

² Stanford Center for Innovation and Learning, Stanford, CA, generously provided these sequences.

For all three sequences, the thumbnail video is obtained by spatially downsampling the original by 4 both horizontally and vertically. There are two high-resolution layers; the first layer sequence is obtained by downsampling the original by 2 both horizontally and vertically, while the second layer sequence is simply the original video. All sequences are 25 frames/s. Cardgame and Making Sense have 298 frames and Soccer has 598 frames.
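The slice-size selection of (8) and (9) amounts to a small search over candidate slice sizes. The sketch below is illustrative only: the bits-per-pixel table `eta` is made-up data standing in for the sample encodings, and the expected RoI extents and display size are hypothetical values, not figures from the experiments.

```python
def relative_pixel_overhead(e_rw, e_rh, s_w, s_h, d_w, d_h):
    """psi(s_w, s_h) = E{P} / (d_w * d_h), with E{P} taken from (8)."""
    expected_pixels = (e_rw + s_w - 1) * (e_rh + s_h - 1)
    return expected_pixels / (d_w * d_h)


def optimal_slice_size(eta, e_rw, e_rh, d_w, d_h):
    """Return the (s_w, s_h) in `eta` minimizing eta * psi, as in (9).

    `eta` maps candidate slice sizes to bits per pixel of the residual.
    """
    def bits_per_rendered_pixel(size):
        s_w, s_h = size
        return eta[size] * relative_pixel_overhead(e_rw, e_rh, s_w, s_h, d_w, d_h)

    return min(eta, key=bits_per_rendered_pixel)


if __name__ == "__main__":
    # Made-up eta values: larger slices code more efficiently.
    eta = {(16, 16): 0.30, (64, 64): 0.20, (256, 256): 0.18}
    # Hypothetical expected RoI extents and display size, in pixels.
    print(optimal_slice_size(eta, e_rw=300, e_rh=300, d_w=300, d_h=300))
```

With these invented numbers the middle size wins: the smallest slices pay too much in coding efficiency and the largest ones too much in pixel overhead, which is the tradeoff the optimization captures.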
We encode the thumbnail video with an intraframe period of 15 frames using two consecutive B frames between anchor frames. The PSNR at the corresponding bitrate for Cardgame, Making Sense, and Soccer is 39.1 dB at 162 kb/s, 39.6 dB at 201 kb/s, and 35.3 dB at 355 kb/s, respectively. For Cardgame and Making Sense, we choose the QP to yield a PSNR of … dB for the high-resolution layers. For Soccer, the QP yields a PSNR of … dB.

³ Fraunhofer Heinrich-Hertz Institute, Berlin, Germany, generously provided this sequence.

Fig. 5 shows the relative pixel overhead, $\psi(s_w, s_h)$, for the three sequences. We compare the model prediction against empirical values averaged over 100 user-interaction trajectories for each sequence. The trajectories were recorded while interactively viewing the sequences using the graphical user interface described in Section III. Each trajectory starts at a random location with a random zoom factor and is 1 min long, and the set of frames of the original sequence is looped to play for 1 min. The user's zoom factor, $Z_C$, is allowed to vary between 1 and 6. The thresholds given by (6) determine the high-resolution layer for rendering the RoI. Fig. 6 shows the bits transmitted per rendered pixel for the three sequences. For a given sequence and resolution layer, the comparison in Figs. 5 and 6 for different slice sizes is made for the same QP and hence similar PSNR. Although the model predicts the optimal slice size fairly accurately, it can underestimate or overestimate the transmitted bitrate. This is because the popular slices that constitute the salient objects in the video could entail high or low bitrate compared to the average. Also, the location of the objects can bias the pixel overhead to the high or low side, whereas the model uses the average overhead.

Fig. 6. Model prediction versus empirical values for bits transmitted per rendered pixel, shown for the three sequences, Cardgame, Making Sense, and Soccer. The empirical values are obtained by averaging over 100 user-interaction trajectories for each sequence. The slice width and slice height in number of pixels are denoted by $s_w$ and $s_h$, respectively. (a) Cardgame sequence, layer 1 (PSNR 38.7 dB). (b) Cardgame sequence, layer 2 (PSNR 39.2 dB). (c) Making Sense sequence, layer 1 (PSNR 39.0 dB). (d) Making Sense sequence, layer 2 (PSNR 39.6 dB). (e) Soccer sequence, layer 1 (PSNR 35.5 dB). (f) Soccer sequence, layer 2 (PSNR 37.0 dB).

For certain zoom factors chosen by the user, the accessed/transmitted pixels could be fewer than the number of rendered pixels. This can be seen in Fig. 5 where the relative pixel overhead, $\psi(s_w, s_h)$, goes below one. Hence, we also compute the zoom-adjusted relative pixel overhead, $\varphi(s_w, s_h) = E\left\{\frac{P_w P_h}{R_w R_h}\right\}$. This quantity is always greater than one and is given by

$$\varphi(s_w, s_h) = \left[(s_w - 1)\, E\left\{\frac{1}{R_w}\right\} + 1\right]\left[(s_h - 1)\, E\left\{\frac{1}{R_h}\right\} + 1\right]$$

where

$$E\left\{\frac{1}{R_w}\right\} = \frac{E\{Z_C\}}{d_w Z_D}, \qquad E\left\{\frac{1}{R_h}\right\} = \frac{E\{Z_C\}}{d_h Z_D}.$$

Fig. 7 shows the zoom-adjusted relative pixel overhead, $\varphi(s_w, s_h)$, for the Making Sense sequence. We observed that the model prediction is close to the empirical value for all three sequences. Thus, the analysis presented in this section enables estimating various quantities related to accessed portions from the scene representation without recording user-interaction trajectories and measuring these quantities from long bitstreams encoded for various slice sizes. This helps system dimensioning of an interactive video transmission system.

Fig. 7. Model prediction versus empirical values for zoom-adjusted relative pixel overhead, $\varphi(s_w, s_h)$, shown for the Making Sense sequence. The empirical values are obtained by averaging over 100 user-interaction trajectories. The second y-axis shows the bits per pixel for coding the residual of the high-resolution layer, $\eta(s_w, s_h)$. The slice width and slice height in number of pixels are denoted by $s_w$ and $s_h$, respectively. (a) Making Sense sequence, layer 1 (PSNR 39.0 dB). (b) Making Sense sequence, layer 2 (PSNR 39.6 dB).

IV. Background Extraction and Long-Term Memory Motion-Compensated Prediction

Fig. 8. Improvement based on background extraction. Each high-resolution layer frame has two references to choose from, the frame obtained by upsampling the reconstructed thumbnail frame and the background frame from the same layer in the background pyramid.

The coding scheme proposed in Section III exploits temporal correlation by performing motion compensation among successive frames of the thumbnail video. Temporal prediction among successive frames of the high-resolution layers is avoided to enable efficient random access. Although it enables efficient random access, upward prediction using the reconstructed thumbnail frames might result in substantial residual energy for high spatial frequencies. In this section, we propose creating a background frame [27], [28] for each high-resolution layer and employing long-term memory motion-compensated prediction (LTM MCP) [29] to exploit the correlation between this frame and each high-resolution frame to be encoded. The background frame is intracoded. As shown in Fig. 8, high-resolution P slices have two references to choose from, upward prediction and the background frame.
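The choice between the two references can be pictured with a toy sketch. The text does not specify how the encoder decides; the sum of absolute differences (SAD) used below is an assumption made purely for illustration, whereas a real H.264/AVC encoder would typically apply a rate-distortion criterion together with a displacement search. The function names and sample blocks are hypothetical.

```python
def sad(block, reference):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, reference)
        for a, b in zip(row_a, row_b)
    )


def choose_reference(block, upsampled_thumbnail, background):
    """Pick the reference with the smaller SAD (assumed criterion).

    Ties go to upward prediction, which needs no background I slices.
    """
    candidates = {"upward": upsampled_thumbnail, "background": background}
    return min(candidates, key=lambda name: sad(block, candidates[name]))


if __name__ == "__main__":
    # Hypothetical 2x2 luminance blocks at the same position.
    block = [[10, 12], [11, 13]]
    upsampled = [[9, 12], [11, 15]]
    background = [[10, 12], [11, 14]]
    print(choose_reference(block, upsampled, background))
```

Blocks covering static scene content tend to match the sharp background frame, while blocks covering moving objects fall back to the blurrier but up-to-date upsampled thumbnail.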
If a transmitted high-resolution P slice refers to the background frame, then relevant I slices from the background frame are transmitted only if they have not been transmitted earlier. This is different from [26], in which the encoder uses only those parts of the background for prediction that exist in the decoder's multi-resolution background pyramid. The encoder mimics the decoder in [26], which builds a background pyramid out of all previously received frames. Background extraction algorithms as well as detection and update of changed background portions have been previously studied, for example in [30], and are not the focus of this paper.

Since a moving camera might hamper the spatial browsing experience, the camera is static in our sequences. A simple temporal median operator [27] yields a plausible background frame. Out of the first 150 frames, we include every fifth frame for the median operation. Fig. 9 shows the result for Cardgame, Making Sense, and Soccer. Although some stationary objects remain in the background frame, this helps the coding efficiency. In our experiments, the background frame is not updated after its creation at the start. This is typical with a static camera. For example, in a soccer game, the background typically changes due to illumination changes, which happen infrequently. The background frame is intracoded with the same slice structure as the other frames from the layer. Fig. 10 shows the coding bitrate reduction due to this approach. The figure is shown for slice sizes of $(s_w/16 \times s_h/16) = 4 \times 16$ for layer 1 and $(s_w/16 \times s_h/16) = 6 \times 4$ for layer 2 of Cardgame and Making Sense. For Soccer, the slice size is $(s_w/16 \times s_h/16) = 4 \times 4$ for both layers. For Cardgame, Fig. 11 shows the resulting transmission

Fig. 9. Sample frame and background frame for layer 1 of the Cardgame, Making Sense, and Soccer sequences.

Fig. 10. Bitrate reduction through background extraction (BE) and long-term memory motion-compensated prediction (LTM MCP), shown for the Cardgame, Making Sense, and Soccer sequences. For both Cardgame and Making Sense, the slice size is $(s_w/16 \times s_h/16) = 4 \times 16$ for layer 1 and $(s_w/16 \times s_h/16) = 6 \times 4$ for layer 2. For Soccer, the slice size is $(s_w/16 \times s_h/16) = 4 \times 4$ for both layers. (a) Cardgame sequence, layer 1. (b) Making Sense sequence, layer 1. (c) Soccer sequence, layer 1. (d) Cardgame sequence, layer 2. (e) Making Sense sequence, layer 2. (f) Soccer sequence, layer 2.
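The background frames of Fig. 9 are described in the text as the output of a simple temporal median over every fifth frame out of the first 150. A minimal sketch of that operation is given here, assuming the frames of one layer are available as equally sized NumPy arrays; the function name `median_background` and its parameter names are hypothetical, and the paper's actual implementation may differ in detail.

```python
import numpy as np


def median_background(frames, num_frames=150, step=5):
    """Estimate a static background as the per-pixel temporal median.

    frames: sequence of equally sized arrays (one per video frame).
    Only every `step`-th frame among the first `num_frames` is used,
    which mirrors the subsampling described in the text.
    """
    subset = np.stack(frames[:num_frames:step], axis=0)
    # np.median returns floats; with an even number of samples it averages
    # the two middle values, so the cast back truncates any fractional part.
    return np.median(subset, axis=0).astype(subset.dtype)
```

A pixel recovers the true background value as long as it is unoccluded in more than half of the sampled frames, which is also why objects that stay put for most of the sampling window remain in the background frame.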

Fig. 11. Transmission bitrate is reduced after employing background extraction (BE) and long-term memory motion-compensated prediction (LTM MCP), here shown for the two layers of Cardgame. The slice width and slice height in number of pixels are denoted by $s_w$ and $s_h$, respectively. Transmission bitrate values are obtained by counting the bits required to transmit relevant high-resolution slices. The values are averaged over 100 user-interaction trajectories. (a) Cardgame sequence, layer 1. (b) Cardgame sequence, layer 2.

Fig. 12. Number of I and P slices transmitted over the streaming session when background extraction (BE) and long-term memory motion-compensated prediction (LTM MCP) are employed. The data are plotted for a single user-interaction trajectory. Slice sizes are as in Fig. 10. For Cardgame and Making Sense, we choose the QP to yield around 40.6 dB PSNR for both layers. For Soccer, the PSNR is around 37.3 dB for layer 1 and 38.5 dB for layer 2. (a) Cardgame sequence. (b) Making Sense sequence. (c) Soccer sequence.

Fig. 13. Model prediction versus empirical values for bits transmitted per rendered pixel, shown for the Making Sense sequence, encoded using background extraction (BE) and long-term memory motion-compensated prediction (LTM MCP). The empirical values are obtained by averaging over 100 user-interaction trajectories. The slice width and slice height in number of pixels are denoted by $s_w$ and $s_h$, respectively. (a) Making Sense sequence, layer 1 (PSNR 40.6 dB). (b) Making Sense sequence, layer 2 (PSNR 40.6 dB).

bitrate reduction. For Fig. 10, the slice sizes chosen are either optimal or close to optimal. If the mean transmission bitrates corresponding to two slice sizes are close, we prefer the larger slice size for reasons noted in Section V. For the high-resolution layers, Fig. 12 shows the number of transmitted I slices from the background pyramid and the number of transmitted P slices. It shows the numbers for a single user-interaction trajectory.
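The counts plotted in Fig. 12 follow from simple bookkeeping: P slices overlapping the RoI are sent for every frame, while a background I slice is sent only the first time it is needed. The sketch below illustrates this under two simplifying assumptions that are not taken from the paper: every transmitted P slice is treated as referring to the background frame, and it needs only the co-located background slice. The names `slices_covering` and `BackgroundTracker` are hypothetical.

```python
def slices_covering(roi, slice_w, slice_h):
    """Return the set of (col, row) slice indices overlapped by an RoI.

    roi is (x, y, width, height) in pixels of the high-resolution layer.
    """
    x, y, w, h = roi
    cols = range(x // slice_w, (x + w - 1) // slice_w + 1)
    rows = range(y // slice_h, (y + h - 1) // slice_h + 1)
    return {(c, r) for r in rows for c in cols}


class BackgroundTracker:
    """Per-client record of which background I slices were already sent."""

    def __init__(self):
        self.sent_background = set()

    def slices_to_send(self, roi, slice_w, slice_h):
        """Return (P slices, newly required background I slices) for one frame."""
        p_slices = slices_covering(roi, slice_w, slice_h)
        # Assumption: each P slice depends on its co-located background slice.
        i_slices = p_slices - self.sent_background
        self.sent_background |= i_slices
        return p_slices, i_slices
```

Under these assumptions the first frame of a session yields equally many I and P slices, and later frames need I slices only when the RoI moves onto slices the client has not visited before.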
For the first frame of the streaming session, roughly equal numbers of I and P slices are transmitted. Subsequently, I slices need to be transmitted only sporadically in time and generally in fewer numbers than at the start. Although not shown here, when averaged over 100 trajectories, the profiles of the transmitted I and P slices appear smoother; the number of P slices is almost constant and matches the expected number of transmitted P slices that can be computed from an analysis similar to Section III-C. The average number of transmitted I slices is highest at the start and is about 1% of the number of transmitted P slices thereafter. We model the bits transmitted per rendered pixel as before. However, for simplicity, the cost of transmitting I slices is counted in the coding bitrate, $\eta(s_w, s_h)$, but not in the number of pixels transmitted per rendered pixel, $\psi(s_w, s_h)$. As shown in Fig. 13, the model matches closely with the empirical

values for the Making Sense sequence. The model matches well for the other two sequences as well. It should be noted that the change in the optimal slice size after employing the background frame is small, and the slice size that is optimal for the earlier scheme still yields a mean transmission bitrate very close to that corresponding to the new optimal slice size. Hence, we choose the same slice sizes for comparing the coding bitrates of the two schemes in Fig. 10.

V. Conclusion and Further Work

We proposed a spatial-random-access-enabled video coding scheme that eliminates the need to transmit and/or decode the entire video scene in high spatial resolution. The RoI can be switched during any frame interval without waiting for the end of the GoP or having to transmit extra slices from the past. The coding scheme allows the system to scale with the number of clients; it avoids encoding each client's RoI sequence individually. Another benefit is that requested RoIs can be extracted from the bitstream even inside or at the edge of the network, closer to the client node. The random access aspects presented in this paper also apply to the design of other IBR-based interactive streaming systems. We optimized the slice size to minimize the transmission bitrate. Our model accurately predicts the optimal slice size without requiring the capture of user-interaction trajectories. We proposed an improvement of the coding scheme based on background extraction and long-term memory motion-compensated prediction. Experiments indicate that both the coding bitrate and the transmission bitrate can be reduced by up to 85% while retaining efficient random access capability. This improvement, however, entails transmitting some I slices from the background pyramid that might be required for decoding the current high-resolution P slices. Nevertheless, the cost of doing this is amortized over the streaming session.
For reducing latency in a streaming scenario, we proposed predicting the user's RoI in advance [31], [32] and pre-fetching relevant data. A bigger slice size adds robustness against inaccurate RoI prediction, although it might increase the transmission bitrate. Also, if the packetization overhead associated with layers below the application layer is considered, for example when each slice needs to be put in a different transport layer packet, then a bigger slice size might be optimal. A sample scenario is application-layer P2P multicast to a population of peers where each peer can subscribe/unsubscribe requisite tiles according to its RoI. In [33] and [34], we proposed forming a multicast group for each slice. In this scenario, data from disjoint slices are preferably transmitted/forwarded in different transport layer packets. In the RoI P2P system, the peer's task of deciding which multicast groups to subscribe to is simplified thanks to the efficient random access of the underlying video coding scheme.

Acknowledgment

The authors would like to thank Dr. P. Baccichet, Dr. D. Varodayan, and K. Chono for useful discussions.

References

[1] C. Fehn, C. Weissig, I. Feldmann, M. Mueller, P. Eisert, P. Kauff, and H. Bloss, "Creation of high-resolution video panoramas of sport events," in Proc. 8th IEEE ISM, Dec. 2006, pp
[2] J. Kopf, M. Uyttendaele, O. Deussen, and M. F. Cohen, "Capturing and viewing gigapixel images," in Proc. ACM SIGGRAPH, vol. 26, no. 3. Aug. 2007, pp
[3] Hewlett-Packard. (2009, Sep. 16). Halo: Video Conferencing Product by Hewlett-Packard [Online]. Available: html
[4] A. Smolic and D. McCutchen, "3DAV exploration of video-based rendering technology in MPEG," IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 3, pp , Mar
[5] Immersive Media. (2009, Sep. 16). Dodeca 2360: An Omni-Directional Video Camera Providing Over 100 Million Pixels per Second by Immersive Media [Online]. Available:
[6] Video Clip Showcasing Interactive TV with Pan/Tilt/Zoom (2009, Sep. 26) [Online]. Available: Ko9jcIjBXnk
[7] H.-Y. Shum, S. B. Kang, and S.-C.
Chan, "Survey of image-based representations and compression techniques," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 11, pp , Nov
[8] M. Levoy and P. Hanrahan, "Light field rendering," in Proc. ACM SIGGRAPH, Aug. 1996, pp
[9] P. Kauff and O. Schreer, "Virtual team user environments: A step from tele-cubicles towards distributed tele-collaboration in mediated workspaces," in Proc. IEEE ICME, vol. 2. Aug. 2002, pp
[10] M. Tanimoto, "Overview of FTV (free-viewpoint television)," in Proc. ICME, Jul. 2009, pp
[11] D. Taubman and R. Rosenbaum, "Rate-distortion optimized interactive browsing of JPEG2000 images," in Proc. IEEE ICIP, Sep. 2000, pp
[12] D. Taubman and R. Prandolini, "Architecture, philosophy and performance of JPIP: Internet protocol standard for JPEG2000," Proc. SPIE Intl. Symp. VCIP, vol. 5150, no. 1, pp , Jul
[13] P. Chou and Z. Miao, "Rate-distortion optimized streaming of packetized media," IEEE Trans. Multimedia, vol. 8, no. 2, pp , Apr
[14] B. Girod, "The efficiency of motion-compensating prediction for hybrid coding of video sequences," IEEE J. Sel. Areas Commun., vol. 5, no. 7, pp , Aug
[15] B. Girod, "Motion-compensating prediction with fractional-pel accuracy," IEEE Trans. Commun., vol. 41, no. 4, pp , Apr
[16] B. Girod, "Efficiency analysis of multihypothesis motion-compensated prediction for video coding," IEEE Trans. Image Process., vol. 9, no. 2, pp , Feb
[17] S. Heymann, A. Smolic, K. Mueller, Y. Guo, J. Rurainsky, P. Eisert, and T. Wiegand, "Representation, coding and interactive rendering of high-resolution panoramic images and video using MPEG-4," in Proc. PPW, Feb
[18] P. Ramanathan and B. Girod, "Rate-distortion optimized streaming of compressed light fields with multiple representations," in Proc. 14th Packet Video Workshop, Dec
[19] P. Ramanathan and B. Girod, "Random access for compressed light fields using multiple representations," in Proc. IEEE 6th Int. Workshop MMSP, Sep. 2004, pp
[20] M. Karczewicz and R. Kurceren, "The SP- and SI-frames design for H.264/AVC," IEEE Trans. Circuits Syst. Video Technol., vol.
13, no. 7, pp , Jul
[21] X. Zhu, A. Aaron, and B. Girod, "Distributed compression for large camera arrays," in Proc. IEEE Workshop Statist. Signal Process., Sep. 2003, pp
[22] A. Aaron, P. Ramanathan, and B. Girod, "Wyner-Ziv coding of light fields for random access," in Proc. IEEE 6th Workshop MMSP, Sep. 2004, pp
[23] I. Bauermann and E. Steinbach, "RDTC optimized compression of image-based scene representations (part I): Modeling and theoretical analysis," IEEE Trans. Image Process., vol. 17, no. 5, pp , May
[24] I. Bauermann and E. Steinbach, "RDTC optimized compression of image-based scene representations (part II): Practical coding," IEEE Trans. Image Process., vol. 17, no. 5, pp , May
[25] E. Kurutepe, M. R. Civanlar, and A. M. Tekalp, "Interactive transport of multi-view videos for 3DTV applications," J. Zhejiang Univ. Sci. A, vol. 7, no. 5, pp , May
[26] J. Bernstein, B. Girod, and X. Yuan, "Hierarchical encoding method and apparatus employing background references for efficiently communicating image sequences," U.S. Patent , Oct
[27] M. Massey and W. Bender, "Salient stills: Process and practice," IBM Syst. J., vol. 35, no. 3-4, pp ,
[28] D. Farin, P. de With, and W. Effelsberg, "Robust background estimation for complex video sequences," in Proc. IEEE ICIP, vol. 1. Sep. 2003, pp
[29] T. Wiegand, X. Zhang, and B. Girod, "Long-term memory motion-compensated prediction," IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 1, pp , Feb
[30] D. Hepper, "Efficiency analysis and application of uncovered background prediction in a low bit rate image coder," IEEE Trans. Commun., vol. 38, no. 9, pp , Sep
[31] A. Mavlankar, D. Varodayan, and B. Girod, "Region-of-interest prediction for interactively streaming regions of high resolution video," in Proc. IEEE 16th Packet Video Workshop, Nov. 2007, pp
[32] A. Mavlankar and B. Girod, "Pre-fetching based on video analysis for interactive region-of-interest streaming of soccer sequences," in Proc. IEEE ICIP, Nov. 2009, pp
[33] A. Mavlankar, J. Noh, P. Baccichet, and B. Girod, "Peer-to-peer multicast live video streaming with interactive virtual pan/tilt/zoom functionality," in Proc. IEEE ICIP, Oct. 2008, pp
[34] A. Mavlankar, J. Noh, P. Baccichet, and B. Girod, "Optimal server bandwidth allocation for streaming multiple streams via P2P multicast," in Proc. IEEE 10th Workshop MMSP, Oct. 2008, pp

Aditya Mavlankar (S'99-M'09) received the B.E. degree in electronics and telecommunication from the University of Pune, Pune, India, the M.S. degree in communications engineering from the Technical University of Munich, Munich, Germany, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA. He is currently with Tely Labs, Inc., Menlo Park, CA. He has published over 30 conference and journal papers, book chapters, and patents. His current research interests include scalable video coding, interactive video delivery, and peer-to-peer video streaming. Dr.
Mavlankar was the recipient of the Edison Prize Bronze Medal awarded by IIE Europe in conjunction with the GE Foundation for his Master's thesis in 2006, was a co-recipient of the Best Student Paper Award at the IEEE Workshop on Multimedia Signal Processing, Victoria, BC, Canada, and a co-recipient of the Best Student Paper Award at the European Signal Processing Conference, Poznan, Poland. He won the Student Travel Grant Award for his paper at the 16th International Packet Video Workshop, Lausanne, Switzerland. Papers co-authored by him have been nominated multiple times for best paper awards at international conferences.

Bernd Girod (M'80-SM'97-F'98) received the M.S. degree from the Georgia Institute of Technology, Atlanta, and the Engineering Doctorate degree from the University of Hannover, Hannover, Germany. He has been a Professor of electrical engineering and (by courtesy) computer science with the Information Systems Laboratory, Stanford University, Stanford, CA, since . Previously, he was a Professor of telecommunications with the Department of Electrical Engineering, University of Erlangen-Nuremberg, Erlangen/Nuremberg, Germany. He has published over 400 conference and journal papers, as well as 5 books. His current research interests include the areas of video compression and networked media systems. Prof. Girod received the EURASIP Signal Processing Best Paper Award in 2002, the IEEE Multimedia Communication Best Paper Award in 2007, the EURASIP Image Communication Best Paper Award in 2008, as well as the EURASIP Technical Achievement Award in . As an entrepreneur, he has been involved with several startup ventures as the founder, director, investor, or advisor, among them Polycom (Nasdaq: PLCM), Vivo Software, 8x8 (Nasdaq: EGHT), and RealNetworks (Nasdaq: RNWK). He is a EURASIP fellow and a member of the German National Academy of Sciences (Leopoldina).

Grouping and Retrieval Schemes for Stored MPEG. Video. Senthil Sengodan, Victor O. K. Li. University of Southern California

Grouping and Retrieval Schemes for Stored MPEG. Video. Senthil Sengodan, Victor O. K. Li. University of Southern California Grouping and Retrieval Scheme for Stored MPEG Video Senthil Sengodan, Victor O. K. Li Communication Science Intitute Department of Electrical Engineering Univerity of Southern California Lo Angele, CA

More information

Long-Term Mechanical Properties of Smart Cable Based on FBG Desensitized Encapsulation Sensors

Long-Term Mechanical Properties of Smart Cable Based on FBG Desensitized Encapsulation Sensors PHOTONIC SENSORS / Vol. 4, No. 3, 2014: 236 241 Long-Term Mechanical Propertie of Smart Cable Baed on Deenitized Encapulation Senor Sheng LI 1* and Min ZHOU 2 1 National Engineering Laboratory for Fiber

More information

Characterization of Traditional Thai Musical Scale

Characterization of Traditional Thai Musical Scale Characterization of Traditional Thai Muical Scale ATTAKITMONGCOL, K., CHINVETKITVANIT, R., and SUJITJORN, S. School of Electrical Engineering, Intitute of Engineering Suranaree Univerity of Technology

More information

(12) Patent Application Publication (10) Pub. No.: US 2008/ A1

(12) Patent Application Publication (10) Pub. No.: US 2008/ A1 (19) United tate (12) Patent Application Publication (10) Pub. No.: U 2008/0231544A1 Cooper et al. U 20080231544A1 (43) Pub. Date: ep. 25, 2008 (54) (75) (73) (21) (22) (60) YTEMAND METHOD FOR AUTOMATED

More information

Module 8 VIDEO CODING STANDARDS. Version 2 ECE IIT, Kharagpur

Module 8 VIDEO CODING STANDARDS. Version 2 ECE IIT, Kharagpur Module 8 VIDEO CODING STANDARDS Lesson 27 H.264 standard Lesson Objectives At the end of this lesson, the students should be able to: 1. State the broad objectives of the H.264 standard. 2. List the improved

More information

Ausroc III Telemetry System

Ausroc III Telemetry System 1 Auroc III Telemetry Sytem Steven S. Pietrobon 6 Firt Avenue, Payneham South SA 5070, Autralia teven@world.com.au 9th Annual ASRI Conference (ASRI 99) Canberra, Autralia 3 5 December 1999 2 Introduction

More information

Skip Length and Inter-Starvation Distance as a Combined Metric to Assess the Quality of Transmitted Video

Skip Length and Inter-Starvation Distance as a Combined Metric to Assess the Quality of Transmitted Video Skip Length and Inter-Starvation Distance as a Combined Metric to Assess the Quality of Transmitted Video Mohamed Hassan, Taha Landolsi, Husameldin Mukhtar, and Tamer Shanableh College of Engineering American

More information

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005.

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /ISCAS.2005. Wang, D., Canagarajah, CN., & Bull, DR. (2005). S frame design for multiple description video coding. In IEEE International Symposium on Circuits and Systems (ISCAS) Kobe, Japan (Vol. 3, pp. 19 - ). Institute

More information

Project Proposal: Sub pixel motion estimation for side information generation in Wyner- Ziv decoder.

Project Proposal: Sub pixel motion estimation for side information generation in Wyner- Ziv decoder. EE 5359 MULTIMEDIA PROCESSING Subrahmanya Maira Venkatrav 1000615952 Project Proposal: Sub pixel motion estimation for side information generation in Wyner- Ziv decoder. Wyner-Ziv(WZ) encoder is a low

More information

Old. New Strand # New. New Standard. New Strand

Old. New Strand # New. New Standard. New Strand Crowalk: Grade 4 (DRAFT) The new Reading and Language Art tandard have been approved by the State Board of Education. Thi draft crowalk ha been developed to ait Florida teacher in identifying connection

More information

Free Viewpoint Switching in Multi-view Video Streaming Using. Wyner-Ziv Video Coding

Free Viewpoint Switching in Multi-view Video Streaming Using. Wyner-Ziv Video Coding Free Viewpoint Switching in Multi-view Video Streaming Using Wyner-Ziv Video Coding Xun Guo 1,, Yan Lu 2, Feng Wu 2, Wen Gao 1, 3, Shipeng Li 2 1 School of Computer Sciences, Harbin Institute of Technology,

More information

Overview: Video Coding Standards

Overview: Video Coding Standards Overview: Video Coding Standards Video coding standards: applications and common structure ITU-T Rec. H.261 ISO/IEC MPEG-1 ISO/IEC MPEG-2 State-of-the-art: H.264/AVC Video Coding Standards no. 1 Applications

More information

WITH the rapid development of high-fidelity video services

WITH the rapid development of high-fidelity video services 896 IEEE SIGNAL PROCESSING LETTERS, VOL. 22, NO. 7, JULY 2015 An Efficient Frame-Content Based Intra Frame Rate Control for High Efficiency Video Coding Miaohui Wang, Student Member, IEEE, KingNgiNgan,

More information

Analysis of Packet Loss for Compressed Video: Does Burst-Length Matter?

Analysis of Packet Loss for Compressed Video: Does Burst-Length Matter? Analysis of Packet Loss for Compressed Video: Does Burst-Length Matter? Yi J. Liang 1, John G. Apostolopoulos, Bernd Girod 1 Mobile and Media Systems Laboratory HP Laboratories Palo Alto HPL-22-331 November

More information

Systematic Lossy Error Protection of Video based on H.264/AVC Redundant Slices

Systematic Lossy Error Protection of Video based on H.264/AVC Redundant Slices Systematic Lossy Error Protection of based on H.264/AVC Redundant Slices Shantanu Rane and Bernd Girod Information Systems Laboratory Stanford University, Stanford, CA 94305. {srane,bgirod}@stanford.edu

More information

Multimedia Communications. Image and Video compression

Multimedia Communications. Image and Video compression Multimedia Communications Image and Video compression JPEG2000 JPEG2000: is based on wavelet decomposition two types of wavelet filters one similar to what discussed in Chapter 14 and the other one generates

More information

CODING EFFICIENCY IMPROVEMENT FOR SVC BROADCAST IN THE CONTEXT OF THE EMERGING DVB STANDARDIZATION

CODING EFFICIENCY IMPROVEMENT FOR SVC BROADCAST IN THE CONTEXT OF THE EMERGING DVB STANDARDIZATION 17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 CODING EFFICIENCY IMPROVEMENT FOR SVC BROADCAST IN THE CONTEXT OF THE EMERGING DVB STANDARDIZATION Heiko

More information

New Strand # New Strand. Process. Process. Process. Process

New Strand # New Strand. Process. Process. Process. Process Crowalk: Grade 5 (DRAFT) The new Reading and Language Art tandard have been approved by the State Board of Education. Thi draft crowalk ha been developed to ait Florida teacher in identifying connection

More information

Selective Intra Prediction Mode Decision for H.264/AVC Encoders

Selective Intra Prediction Mode Decision for H.264/AVC Encoders Selective Intra Prediction Mode Decision for H.264/AVC Encoders Jun Sung Park, and Hyo Jung Song Abstract H.264/AVC offers a considerably higher improvement in coding efficiency compared to other compression

More information

AN IMPROVED ERROR CONCEALMENT STRATEGY DRIVEN BY SCENE MOTION PROPERTIES FOR H.264/AVC DECODERS

AN IMPROVED ERROR CONCEALMENT STRATEGY DRIVEN BY SCENE MOTION PROPERTIES FOR H.264/AVC DECODERS AN IMPROVED ERROR CONCEALMENT STRATEGY DRIVEN BY SCENE MOTION PROPERTIES FOR H.264/AVC DECODERS Susanna Spinsante, Ennio Gambi, Franco Chiaraluce Dipartimento di Elettronica, Intelligenza artificiale e

More information

Color Quantization of Compressed Video Sequences. Wan-Fung Cheung, and Yuk-Hee Chan, Member, IEEE 1 CSVT

Color Quantization of Compressed Video Sequences. Wan-Fung Cheung, and Yuk-Hee Chan, Member, IEEE 1 CSVT CSVT -02-05-09 1 Color Quantization of Compressed Video Sequences Wan-Fung Cheung, and Yuk-Hee Chan, Member, IEEE 1 Abstract This paper presents a novel color quantization algorithm for compressed video

More information

Aalborg Universitet. Published in: I E E E Transactions on Power Delivery. DOI (link to publication from Publisher): /TPWRD.2010.

Aalborg Universitet. Published in: I E E E Transactions on Power Delivery. DOI (link to publication from Publisher): /TPWRD.2010. Aalborg Univeritet Method to Minimize Zero-Miing Phenomenon Silva, Filipe Miguel Faria da; Bak, Clau Leth; Gudmunddottir, Unnur Stella; Wiechowki, W.; Knardrupgård, M.. Publihed in: I E E E Tranaction

More information

Modeling and Optimization of a Systematic Lossy Error Protection System based on H.264/AVC Redundant Slices

Modeling and Optimization of a Systematic Lossy Error Protection System based on H.264/AVC Redundant Slices Modeling and Optimization of a Systematic Lossy Error Protection System based on H.264/AVC Redundant Slices Shantanu Rane, Pierpaolo Baccichet and Bernd Girod Information Systems Laboratory, Department

More information

WYNER-ZIV VIDEO CODING WITH LOW ENCODER COMPLEXITY

WYNER-ZIV VIDEO CODING WITH LOW ENCODER COMPLEXITY WYNER-ZIV VIDEO CODING WITH LOW ENCODER COMPLEXITY (Invited Paper) Anne Aaron and Bernd Girod Information Systems Laboratory Stanford University, Stanford, CA 94305 {amaaron,bgirod}@stanford.edu Abstract

More information

Dual Frame Video Encoding with Feedback

Dual Frame Video Encoding with Feedback Video Encoding with Feedback Athanasios Leontaris and Pamela C. Cosman Department of Electrical and Computer Engineering University of California, San Diego, La Jolla, CA 92093-0407 Email: pcosman,aleontar

More information

Multimedia Communications. Video compression

Multimedia Communications. Video compression Multimedia Communications Video compression Video compression Of all the different sources of data, video produces the largest amount of data There are some differences in our perception with regard to

More information

Adaptive Key Frame Selection for Efficient Video Coding

Adaptive Key Frame Selection for Efficient Video Coding Adaptive Key Frame Selection for Efficient Video Coding Jaebum Jun, Sunyoung Lee, Zanming He, Myungjung Lee, and Euee S. Jang Digital Media Lab., Hanyang University 17 Haengdang-dong, Seongdong-gu, Seoul,

More information

Robust 3-D Video System Based on Modified Prediction Coding and Adaptive Selection Mode Error Concealment Algorithm

Robust 3-D Video System Based on Modified Prediction Coding and Adaptive Selection Mode Error Concealment Algorithm International Journal of Signal Processing Systems Vol. 2, No. 2, December 2014 Robust 3-D Video System Based on Modified Prediction Coding and Adaptive Selection Mode Error Concealment Algorithm Walid

More information

KVM IN MOBILE PRODUCTION

KVM IN MOBILE PRODUCTION IHSE KVM SOLUTIONS FOR OUTSIDE BROADCAST KVM IN MOBILE PRODUCTION STREAMLINE THE BROADCAST WORKFLOW DRACO ESSENTIAL TO THE BROADCAST WORKFLOW IHSE KVM SWITCHES IN OB VANS In the fat-paced environment of

More information

SCALABLE video coding (SVC) is currently being developed

SCALABLE video coding (SVC) is currently being developed IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 16, NO. 7, JULY 2006 889 Fast Mode Decision Algorithm for Inter-Frame Coding in Fully Scalable Video Coding He Li, Z. G. Li, Senior

More information

Chapter 10 Basic Video Compression Techniques

Chapter 10 Basic Video Compression Techniques Chapter 10 Basic Video Compression Techniques 10.1 Introduction to Video compression 10.2 Video Compression with Motion Compensation 10.3 Video compression standard H.261 10.4 Video compression standard

More information

Constant Bit Rate for Video Streaming Over Packet Switching Networks

Constant Bit Rate for Video Streaming Over Packet Switching Networks International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Constant Bit Rate for Video Streaming Over Packet Switching Networks Mr. S. P.V Subba rao 1, Y. Renuka Devi 2 Associate professor

More information

An Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions

An Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions 1128 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 10, OCTOBER 2001 An Efficient Low Bit-Rate Video-Coding Algorithm Focusing on Moving Regions Kwok-Wai Wong, Kin-Man Lam,

More information

Reduced complexity MPEG2 video post-processing for HD display

Reduced complexity MPEG2 video post-processing for HD display Downloaded from orbit.dtu.dk on: Dec 17, 2017 Reduced complexity MPEG2 video post-processing for HD display Virk, Kamran; Li, Huiying; Forchhammer, Søren Published in: IEEE International Conference on

More information

Research Topic. Error Concealment Techniques in H.264/AVC for Wireless Video Transmission in Mobile Networks

Research Topic. Error Concealment Techniques in H.264/AVC for Wireless Video Transmission in Mobile Networks Research Topic Error Concealment Techniques in H.264/AVC for Wireless Video Transmission in Mobile Networks July 22 nd 2008 Vineeth Shetty Kolkeri EE Graduate,UTA 1 Outline 2. Introduction 3. Error control

More information

COMP 249 Advanced Distributed Systems Multimedia Networking. Video Compression Standards

COMP 249 Advanced Distributed Systems Multimedia Networking. Video Compression Standards COMP 9 Advanced Distributed Systems Multimedia Networking Video Compression Standards Kevin Jeffay Department of Computer Science University of North Carolina at Chapel Hill jeffay@cs.unc.edu September,

More information

IMPORTANT SAFETY INSTRUCTIONS DETAILED SAFETY INSTRUCTIONS: 1) Read thee intruction 2) Keep thee intruction 3) Heed all warning 4) Follow all intructi

IMPORTANT SAFETY INSTRUCTIONS DETAILED SAFETY INSTRUCTIONS: 1) Read thee intruction 2) Keep thee intruction 3) Heed all warning 4) Follow all intructi Uer Manual Verion 1 2 June 2004 ENGLISH IMPORTANT SAFETY INSTRUCTIONS DETAILED SAFETY INSTRUCTIONS: 1) Read thee intruction 2) Keep thee intruction 3) Heed all warning 4) Follow all intruction CAUTION:

More information

The H.26L Video Coding Project

The H.26L Video Coding Project The H.26L Video Coding Project New ITU-T Q.6/SG16 (VCEG - Video Coding Experts Group) standardization activity for video compression August 1999: 1 st test model (TML-1) December 2001: 10 th test model

More information

Chapter 2 Introduction to

Chapter 2 Introduction to Chapter 2 Introduction to H.264/AVC H.264/AVC [1] is the newest video coding standard of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The main improvements

More information

PACKET-SWITCHED networks have become ubiquitous

PACKET-SWITCHED networks have become ubiquitous IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 13, NO. 7, JULY 2004 885 Video Compression for Lossy Packet Networks With Mode Switching and a Dual-Frame Buffer Athanasios Leontaris, Student Member, IEEE,

More information

OBJECT-BASED IMAGE COMPRESSION WITH SIMULTANEOUS SPATIAL AND SNR SCALABILITY SUPPORT FOR MULTICASTING OVER HETEROGENEOUS NETWORKS

OBJECT-BASED IMAGE COMPRESSION WITH SIMULTANEOUS SPATIAL AND SNR SCALABILITY SUPPORT FOR MULTICASTING OVER HETEROGENEOUS NETWORKS OBJECT-BASED IMAGE COMPRESSION WITH SIMULTANEOUS SPATIAL AND SNR SCALABILITY SUPPORT FOR MULTICASTING OVER HETEROGENEOUS NETWORKS Habibollah Danyali and Alfred Mertins School of Electrical, Computer and

More information

Comparative Study of JPEG2000 and H.264/AVC FRExt I Frame Coding on High-Definition Video Sequences

Comparative Study of JPEG2000 and H.264/AVC FRExt I Frame Coding on High-Definition Video Sequences Comparative Study of and H.264/AVC FRExt I Frame Coding on High-Definition Video Sequences Pankaj Topiwala 1 FastVDO, LLC, Columbia, MD 210 ABSTRACT This paper reports the rate-distortion performance comparison

More information
