1C.4.1. Modeling of Motion Classified VBR Video Codecs. Ferit Yegenoglu, Bijan Jabbari, Ya-Qin Zhang. INFOCOM '92


Modeling of Motion Classified VBR Video Codecs

Ferit Yegenoglu, Bijan Jabbari
George Mason University, Fairfax, Virginia

Ya-Qin Zhang
GTE Laboratories, Waltham, Massachusetts

ABSTRACT

Variable Bit Rate (VBR) video coding is emerging as a means to support full motion video services in broadband packet networks. In this paper, we use a motion adaptive VBR video codec and propose a motion classified model to represent the characteristics of various classes of motion activities. The codec switches between interframe, motion compensated, and intraframe coding corresponding to low, medium, and high motions and scene changes, respectively. Our model captures the motion of various video scenes and the codec structure by providing the statistics of VBR-coded video traffic through a first order composite autoregressive process with three motion classes. The parameters of this model are derived from a VBR-coded sample video sequence such that the bit rate distribution and the autocorrelation in bit rates of two successive frames are matched. We verify the validity and accuracy of the model by comparing certain statistics of the video sample with those of the model. Using this model, we then present and discuss the characteristics of aggregated traffic sources.

I. INTRODUCTION

Efficient handling of video services will be one of the key factors for the successful commercialization of future broadband integrated networks. With the advances in transmission and switching technologies, digital signal processing techniques and VLSI technology, an increasing number of digital video services ranging from videophone to high definition TV (HDTV) will likely be provided. Without any bandwidth reducing techniques for video transmission, the multiplexed traffic from subscriber access lines could easily overload the high speed B-ISDN channels. Therefore, encoding of video signals to reduce the necessary bandwidth for transmission and switching would most likely be required.
The ability to provide VBR coding and transmission is one of the promising advantages that ATM-based broadband integrated networks can offer for video services. VBR coding allows a consistent quality, in contrast to constant bit rate (CBR) coding, where the quality varies to match the constant nature of the channel. ATM networks use statistical multiplexing and provide a variable rate transmission according to the data rate from the coded video source. However, due to the nature of statistical multiplexing and packet switching in ATM networks, cell loss probabilities and transmission delays become important issues to be investigated for real-time video services. Computation of these performance measures requires accurate models for VBR video traffic based on its statistical characteristics. The important statistical characteristics of VBR video traffic can be summarized as follows:

i) there is a strong correlation among the bit rates of successive frames due to the nature of actual video scenes and interframe coding,
ii) the bit rate of the coded video depends on the motion activity in the scene,
iii) during each motion state (low, medium and high motion), the bit rate, burstiness and autocorrelation are significantly different,
iv) the highest bit rates arise during scene changes and last only one or two frames depending on the coding algorithm, and
v) the bit rates resulting from high motion activities are typically lower than the peak bit rates in scene changes, but the high motion activity periods have a longer duration.

II. OVERVIEW OF THE METHOD

In this paper, first, a motion-classified video coding scheme is used which switches between stages of interframe coding, motion compensated coding, and intraframe coding, adapting to different levels of motion activities.
Based on the histogram measurement of the output rates of this codec for a sample full motion video sequence, we have observed that the actual distribution can be well represented by a combination of three Gaussian distributions. This observation differs from previous research, in which a single Gaussian distribution is generally assumed [1,2,3]. The difference arises because the codec used here is motion-adaptive and full-motion TV signals are employed as test sources. Motivated by the composite Gaussian distribution, a first order autoregressive (AR) model with random parameters is proposed here. The parameters of the model can take three possible values depending on the state of the codec. A method is developed to estimate the parameters of the AR model. The VBR video rate is emulated according to this model by generating frames matching certain statistics of the actual video source, and the validity and accuracy of the model are verified by comparing its bit rate histogram and first through fourth order statistics with the actual sample measurement. Finally, the characteristics of the aggregate video traffic are studied by using the traffic model that is developed for a single source.

III. MOTION CLASSIFIED VIDEO CODEC

In VBR coding schemes, some type of interframe prediction scheme is often employed to reduce the temporal domain redundancy between successive frames. Video information can be segmented into two classes. The first class can be predicted from previous frames (or from following frames in two-way codecs), and the other class contains new information, which cannot be predicted from other frames. Examples of interframe coding schemes include conditional replenishment and motion-compensated coding. Conditional replenishment is an interframe predictive coding scheme based on coding and transmitting differences between the present frame signal and the previous frame signal.
When there is no motion or only slow motion in the scene, this scheme is efficient. However, when an object in the picture moves to some extent between successive frames, coding efficiency is greatly improved if the motion vector can be estimated in some way. Motion compensation schemes estimate and thereby compensate for this motion. Most existing motion compensation schemes are efficient for motion due to translation. When successive frames present high motion activities rather than simple translation, or involve a complete scene change, motion compensation will not help. Moreover, even interframe prediction tends to perform poorly due to the low correlation between frames. In this case, a simple intraframe coding scheme will probably provide the best coding performance. Therefore, the choice of coding scheme should be adapted to the scene, which can be best described by the motion activity between successive frames. In video teleconferencing or videophone applications, most scenes involve low-to-medium motion and scene changes rarely occur. A simple motion compensated interframe coder will suffice for such applications.

INFOCOM '92, 1C.4.1. CH3133-6/92/0000-0105 $3.00 © 1992 IEEE

According to the above arguments, three schemes are employed depending on the motion activity: i) low motion (interframe DPCM), ii) medium motion (interframe motion-compensated DPCM), and iii) high motion (intraframe coding). In the adaptive inter/intraframe coding scheme that is considered here [4,5], the motion classifier first detects the motion activity of the incoming frame and chooses the appropriate coding scheme. In case of low motion, the frame difference is coded using a block Discrete Cosine Transform (DCT) technique and quantized. The quantization process takes advantage of human visual characteristics, and coefficients are weighted prior to the uniform quantization. In case of medium motion, motion compensation schemes are employed, and the above procedure works on the motion-compensated frame difference. Motion vectors are noiselessly coded and transmitted. When high motion activity is identified, only an intraframe coding scheme is used. All the quantization and coding procedures remain the same.

IV. THE VIDEO MODEL

Modeling of VBR video traffic from a single source by an autoregressive process was first done by Maglaris et al. [1] for a picturephone type scene. Since such a scene does not exhibit various motion activities, the bit rate has a bell-shaped probability density around the mean, and there is a strong correlation between the bit rates of successive frames. In their model, these characteristics are represented by the autoregressive process

$\lambda(n) = a\,\lambda(n-1) + b\,w(n)$ (eq. 1)

where $\lambda(n)$ is the bit rate of the coded video during the nth frame, $w(n)$ is a Gaussian process with variance 1, and $a$ and $b$ are constants. The parameters of this model are found by matching the expected value of the bit rate and the discrete autocovariance of the model with those of a sequence of VBR coded video frames. The bit rate of the frames generated by this process has a normal distribution, and it matches the bell-shaped distribution of the bit rate from picturephone type scenes. This model, however, would not closely match the bit rate distribution of video with various and distinct motion classes. Figure 2 shows such a distribution.

A model for full motion VBR video traffic with scene changes was also given in [6]. There, a VBR video source is modeled as a superposition of two independent first order autoregressive processes to capture the autocorrelation function accurately, and a third process to incorporate the extra bits generated during scene changes. The sum of these three processes gives the number of bits generated during scene changes.

The model that is presented here has three motion activity classes that drive the adaptive coding algorithm described in the last section. This model has a set of parameters which can be estimated from a sample video frame sequence so that the characteristics of the coded video can be closely matched. Different motion classes such as low, medium and high motion and scene changes are closely modeled.
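
The single-class autoregressive process of Maglaris et al. can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: the values of `A`, `B` and `W_MEAN` are hypothetical, and the non-zero mean of w(n) is an assumption made here so that the generated bit rate has a positive mean (the text above only states that w(n) has variance 1).

```python
import random
import statistics

def simulate_ar1(a, b, w_mean, n_frames, seed=0):
    """Generate lambda(n) = a*lambda(n-1) + b*w(n) with w(n) ~ N(w_mean, 1)."""
    rng = random.Random(seed)
    # Start at the stationary mean so that the initial transient is short.
    lam = b * w_mean / (1.0 - a)
    rates = []
    for _ in range(n_frames):
        lam = a * lam + b * rng.gauss(w_mean, 1.0)
        rates.append(lam)
    return rates

def lag1_autocorrelation(x):
    """Sample correlation between successive values of the sequence x."""
    m = statistics.fmean(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

# Hypothetical parameter values, chosen only for illustration.
A, B, W_MEAN = 0.8, 1000.0, 10.0
rates = simulate_ar1(A, B, W_MEAN, n_frames=50_000)

theory_mean = B * W_MEAN / (1.0 - A)      # stationary mean of the process
theory_std = B / (1.0 - A * A) ** 0.5     # stationary standard deviation
```

For a long run, the sample mean, standard deviation and lag-1 autocorrelation of `rates` should approach `theory_mean`, `theory_std` and `A`, which is the kind of moment matching the parameter fitting relies on.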
The duration of each motion class, the mean and variance of the bit rate for each class, the autocorrelation between two successive frames, the steady state probability of being in each motion activity class, and the transition probabilities from one class to another are observed from a VBR video sample. These statistics are input to an estimator, and the model parameters are estimated. Besides matching the above statistics, the bit rate of the video traffic obtained by this model has a probability distribution function very similar to that of the actual video data.

The bit rate at the output of the coder depends on the coding algorithm and the motion activity of the video scene. For a low motion scene such as in a videophone, the distribution of the bit rate is close to normal. However, in full motion video applications the scene passes through different stages of motion activity. The bit rate distribution in this case no longer has a bell-shaped density. Since different coding schemes are used during each motion class, the bit rate distribution and autocorrelation function are expected to differ from class to class. The VBR video model, therefore, has a different set of parameters for each class.

The traffic generated at each frame is represented by a first order autoregressive process belonging to one of the three motion activity classes. Therefore, three autoregressive processes with different parameters are present in our model. The duration of each class, expressed in number of frames, has a geometric distribution. Therefore, the process that describes the motion activity classes of the encoded video is a discrete time Markov process. If the duration of each state is long enough, the distribution of the bit rate in each state will be close to a normal distribution. The total distribution will be a mixture of Gaussian distributions and can closely match that of real video by changing the number of activity levels and the parameters of each class.
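
The link between geometric class durations and a discrete time Markov process can be checked numerically. The sketch below is illustrative only: `STAY_PROB` is a hypothetical per-frame probability of remaining in the current class, not a value from the paper. Keeping a class with a fixed probability at every frame boundary yields a geometrically distributed duration whose mean is 1 / (1 - STAY_PROB).

```python
import random
import statistics

def class_durations(stay_prob, n_visits, seed=0):
    """Sample how many consecutive frames a class lasts when it is kept
    with probability stay_prob at every frame boundary."""
    rng = random.Random(seed)
    durations = []
    for _ in range(n_visits):
        k = 1                          # a visit lasts at least one frame
        while rng.random() < stay_prob:
            k += 1
        durations.append(k)
    return durations

STAY_PROB = 0.9                        # hypothetical value for illustration
durations = class_durations(STAY_PROB, n_visits=20_000)
mean_duration = statistics.fmean(durations)
expected_mean = 1.0 / (1.0 - STAY_PROB)   # mean of the geometric distribution
```

With many visits, `mean_duration` should be close to `expected_mean`, so a longer mean class duration corresponds to a stay probability closer to 1.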
Let $\lambda_i(n)$ denote the number of bits generated from the coded video during the nth frame of class i (i=1: low motion, i=2: medium motion, i=3: high motion). Then:

$\lambda_i(n) = a_i\,\lambda_i(n-1) + G_i(n)$ (eq. 2)

where $a_i$ is a random coefficient that takes one of three possible values, and $G_i(n)$ is a normal random variable with mean $\mu_i$ and variance $\sigma_i^2$. Let k be the random variable denoting the duration of a class, with mean $1/\beta_i$. Then the density function for k is given by the following geometric distribution:

$f_i(k) = \frac{\beta_i}{1-\beta_i}\,(1-\beta_i)^{k}, \quad k = 1, 2, \ldots$ (eq. 3)

Let $\pi_{ij}$ denote the probability that the next class is j, given that the current class is i (note that $\pi_{ij}$ and $1/\beta_i$ can be easily estimated from the VBR video sample). Then the transition probability matrix P, whose $p_{ij}$ entry gives the probability of being in class j at the next frame given that the present class is i, is the following:

$P = \begin{pmatrix} 1-\beta_1 & \beta_1\pi_{12} & \beta_1\pi_{13} \\ \beta_2\pi_{21} & 1-\beta_2 & \beta_2\pi_{23} \\ \beta_3\pi_{31} & \beta_3\pi_{32} & 1-\beta_3 \end{pmatrix}$ (eq. 4)
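
The frame-level transition matrix follows mechanically from the class-leaving probabilities and the class-change probabilities. The sketch below assumes that a class is left with probability beta[i] at each frame and that, on leaving, class j is chosen with probability pi[i][j]. The off-diagonal values in `PI` are the ones reported in Section VI, with the diagonal taken as zero (an assumption); the values in `BETA` are hypothetical, since the mean durations are not legible in this transcription.

```python
import random

def transition_matrix(beta, pi):
    """Frame-level matrix: stay with probability 1 - beta[i]; otherwise
    leave (probability beta[i]) and move to class j with probability pi[i][j]."""
    n = len(beta)
    return [[(1.0 - beta[i]) if i == j else beta[i] * pi[i][j]
             for j in range(n)]
            for i in range(n)]

def simulate_classes(P, n_frames, start=0, seed=0):
    """Generate a sequence of motion classes (0, 1, 2), one per frame."""
    rng = random.Random(seed)
    state, path = start, []
    for _ in range(n_frames):
        path.append(state)
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return path

BETA = [0.05, 0.10, 0.08]              # hypothetical: mean durations 20, 10, 12.5 frames
PI = [[0.00, 0.80, 0.20],              # off-diagonal values as reported in Section VI
      [0.44, 0.00, 0.56],
      [0.17, 0.83, 0.00]]

P = transition_matrix(BETA, PI)
path = simulate_classes(P, n_frames=500)
```

Each row of `P` sums to one because each row of `PI` does, so `P` is a valid stochastic matrix for any leaving probabilities between 0 and 1.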

The 3-class video model is completely described by $a_i$, $\mu_i$, $\sigma_i^2$, $\beta_i$, and $\pi_{ij}$. In the next section, estimation of these parameters is explained.

V. ESTIMATION OF THE MODEL PARAMETERS

Let $\gamma_1$ define the boundary between the distributions of low and medium motion activity and $\gamma_2$ define the boundary between the medium and high motion activity frames, as seen in Figure 2. Then, a frame of the given video sample belongs to the low motion class if $\lambda_i(n) \le \gamma_1$, the medium motion class if $\gamma_1 < \lambda_i(n) \le \gamma_2$, and the high motion class if $\lambda_i(n) > \gamma_2$. Once the class of each frame is identified, the following statistics are obtained:

i) Expected value of the number of bits per frame for class i, $\eta_i = E[\lambda_i(n)]$
ii) Variance of the number of bits per frame during class i, $\mathrm{Var}(\lambda_i) = E[(\lambda_i(n) - \eta_i)^2]$
iii) A measure of correlation of the bit rates of two successive frames, $D_i^2 = E[(\lambda_i(n) - \lambda_i(n-1))^2]$ (eq. 5)
iv) Expected value of the duration of each state, $1/\beta_i$
v) Transition probabilities among classes, $\pi_{ij}$

The parameters that describe the autoregressive process for each class can be derived from the first three statistics above, $\eta_i$, $\mathrm{Var}(\lambda_i)$, and $D_i^2$. To simplify the expressions that give $\eta_i$, $\mathrm{Var}(\lambda_i)$, and $D_i^2$ for our model, a modification is made such that the first frame generated after a class transition does not depend on the bit rate of the previous frame. The bit rate for the first frame is randomly generated according to the mean and variance of the bit rate for that class. With this modification, it can be assumed that a steady state has been reached during each class.

The effect of randomly generating the first frame for each class is to loosen the correlation between the two frames before and after a class change. However, if the mean duration of each state is long, this does not happen frequently and the degradation in the autocorrelation is not significant.
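
The classification and measurement step can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the frame sizes are made up, the thresholds are the 44 and 55 Kbits/frame values quoted in Section VI, and counting a successive-frame difference toward class i only when both frames fall in class i is an assumption about how eq. 5 is evaluated.

```python
import math
import statistics

def classify(bits, gamma1, gamma2):
    """Return 0 (low), 1 (medium) or 2 (high motion) for one frame."""
    if bits <= gamma1:
        return 0
    if bits <= gamma2:
        return 1
    return 2

def class_statistics(frames, gamma1, gamma2):
    """Per-class mean, variance and RMS successive-frame difference D."""
    labels = [classify(b, gamma1, gamma2) for b in frames]
    stats = {}
    for c in range(3):
        values = [b for b, lab in zip(frames, labels) if lab == c]
        if not values:
            continue
        # Assumption: a difference counts toward class c only when both
        # frames n-1 and n were assigned to class c.
        sq_diffs = [(frames[n] - frames[n - 1]) ** 2
                    for n in range(1, len(frames))
                    if labels[n] == c and labels[n - 1] == c]
        stats[c] = {
            "mean": statistics.fmean(values),
            "var": statistics.pvariance(values),
            "D": math.sqrt(statistics.fmean(sq_diffs)) if sq_diffs else None,
        }
    return labels, stats

# Made-up frame sizes in bits, used only to exercise the functions.
frames = [40_000, 42_000, 41_000, 50_000, 52_000,
          70_000, 90_000, 80_000, 43_000]
labels, stats = class_statistics(frames, gamma1=44_000, gamma2=55_000)
```

Mean class durations and class-change frequencies could be read off `labels` in the same pass by counting run lengths and the class that follows each run.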
Now $\eta_i$, $\mathrm{Var}(\lambda_i)$, and $D_i^2$ are given by the following equations (eqs. 6-8):

$\eta_i = \frac{\mu_i}{1 - a_i}, \qquad \mathrm{Var}[\lambda_i] = \frac{\sigma_i^2}{1 - a_i^2}, \qquad D_i^2 = \frac{2\sigma_i^2}{1 + a_i}$

Solving these equations, the parameters of interest are obtained as (eqs. 9-11):

$a_i = 1 - \frac{D_i^2}{2\,\mathrm{Var}[\lambda_i]}, \qquad \sigma_i^2 = (1 - a_i^2)\,\mathrm{Var}[\lambda_i], \qquad \mu_i = (1 - a_i)\,\eta_i$

Finally, the quantities $1/\beta_i$ and $\pi_{ij}$ that describe the transition matrix of the Markov chain for motion activity classes can be directly obtained from the sample.

VI. APPLICATION OF THE MODEL AND RESULTS

The test sequence consists of 500 frames and contains different degrees of motion activity and detail. It is a full-motion color video sequence with 720 by 480 pixels per frame, 16 bits per pixel, and 30 frames per second. The picture quality at this resolution is equivalent to or better than that of NTSC broadcast color TV. The bit rate of the coded video data sample used here is shown in Figure 1. It is obtained by applying the adaptive coding scheme described in [4,5] to a full motion video of 500 frames. The average and standard deviation of the bit rate are 55.2 Kbits/frame and 17.7 Kbits/frame, respectively (1.66 Mbps and 0.531 Mbps). The peak to mean ratio (PMR) is roughly 21. The bit rate histogram for this data is shown in Figure 2. The thresholds that define the motion activity classes were chosen as $\gamma_1$ = 44 Kbits/frame and $\gamma_2$ = 55 Kbits/frame.

The statistics measured for each motion activity class from this data are shown in table 1, all in bits/frame. $M_{3,i}$ and $M_{4,i}$ are the third and fourth order central moment statistics for class i, and are given as:

$M_{3,i} = [E(\lambda_i(n) - \eta_i)^3]^{1/3}$ (eq. 12)

$M_{4,i} = [E(\lambda_i(n) - \eta_i)^4]^{1/4}$ (eq. 13)

motion activity    $\eta_i$    STD($\lambda_i$)    $D_i$     $M_{3,i}$    $M_{4,i}$
low motion         37,482      2,401               2,283     1,241        3,221
medium motion      49,203      2,461               1,506     984          3,009
high motion        71,108      13,238              7,814     13,970       18,042
composite          55,247      17,755                        16,900       23,740

Table 1. Statistics measured from a VBR coded video sample

The transition probabilities among classes are $\pi_{12}$ = 0.80, $\pi_{13}$ = 0.20, $\pi_{21}$ = 0.44, $\pi_{23}$ = 0.56, $\pi_{31}$ = 0.17, and $\pi_{32}$ = 0.83.

The model parameters derived using the statistics in table 1 and equations 9, 10 and 11 are shown in table 2.
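
The parameter estimation can be worked through numerically. The sketch below rests on two assumptions: that each class follows a stationary first order autoregressive process, so that Var = sigma^2 / (1 - a^2) and D^2 = 2 sigma^2 / (1 + a), and that the third numeric column of Table 1 is the successive-frame difference measure D. The symbol names `mu` and `sigma` for the mean and standard deviation of the Gaussian driving term are chosen here for illustration. The resulting numbers are therefore an illustration and are not claimed to reproduce Table 2.

```python
import math

def ar1_parameters(eta, std, d):
    """Solve the stationary AR(1) moment equations for (a, mu, sigma).

    eta : measured mean bit rate of the class (bits/frame)
    std : measured standard deviation of the class (bits/frame)
    d   : RMS difference between successive frames (bits/frame)
    """
    var = std * std
    a = 1.0 - d * d / (2.0 * var)            # from D^2 / Var = 2 (1 - a)
    sigma = math.sqrt(var * (1.0 - a * a))   # from Var = sigma^2 / (1 - a^2)
    mu = eta * (1.0 - a)                     # from eta = mu / (1 - a)
    return a, mu, sigma

# (eta, STD, D) per class as read from Table 1; the column assignment
# for D is an assumption.
TABLE_1 = {
    "low":    (37_482, 2_401, 2_283),
    "medium": (49_203, 2_461, 1_506),
    "high":   (71_108, 13_238, 7_814),
}
params = {name: ar1_parameters(*row) for name, row in TABLE_1.items()}
```

Under these assumptions the estimated coefficient `a` equals the lag-1 autocorrelation within a class, so a small D relative to the standard deviation indicates strongly correlated successive frames.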

Table 2. 3-class model parameters

Figure 3 shows the bit rate of 500 video frames emulated by the 3-state model using these parameters. The bit rate histogram is presented in Figure 4. From these figures it can be seen that the bit rate pattern and the bit rate histogram of the traffic generated by the 3-state model are very close to those of the original video sample. The statistics gathered from the emulated data are listed in table 3.

Table 3. Statistics measured from the model

The aggregate traffic characteristics are depicted in Figures 5 and 6, which show the bit rate pattern and the bit rate histogram for aggregate video traffic from 10 sources, each with the model parameters given above. From these figures it is seen that the distribution for aggregate traffic is narrower and is closer to a normal distribution. The peak-to-mean ratio reduces to 1.44, and the standard deviation-to-mean ratio reduces from 0.32 for a single source to 0.10.

CONCLUSIONS

In this paper, motion classified coded traffic was modeled by three Gaussian distributions and consequently a three-class autoregressive process was developed. The codec and the model used here capture the motion activities present in full motion video sources. A method to estimate the parameters of the underlying composite AR model was presented. A comparison was made between video rates emulated according to this model and the actual sample measurement, based on the PDF and first through fourth order statistics. The main conclusions drawn from this study are:

1) The output rate distribution of the VBR coded full-motion video can be well represented by a composite Gaussian PDF.
2) The output rate can be accurately modeled by a motion-classified composite AR process. A three-class AR model suffices for the adaptive coding scheme employed here.
3) The aggregated traffic obtained from this model tends to approach a single AR process as the number of traffic sources increases.

REFERENCES

[1] B. Maglaris, D. Anastassiou, P. Sen, G. Karlsson, J. D. Robbins (1988). "Performance Models of Statistical Multiplexing in Packet Video Communications," IEEE Transactions on Communications, VOL. 36, No. 7, July, pp. 834-843.

[2] W. Verbiest, L. Pinnoo, B. Voeten (1988). "The Impact of the ATM Concept on Video Coding," IEEE Journal on Selected Areas in Communications, VOL. 6, No. 9, December, pp. 1623-1632.

[3] M. Nomura, T. Fujii, N. Ohta (1989). "Basic Characteristics of Variable Rate Video Coding in ATM Environment," IEEE Journal on Selected Areas in Communications, VOL. 7, No. 5, June, pp. 752-760.

[4] Y. Q. Zhang, W. W. Wu, K. S. Kim, R. L. Pickholtz, and J. Ramasastry (1991a). "Variable-Bit-Rate Video Transmission in the Broadband-ISDN Environment," Proceedings of the IEEE, VOL. 79, No. 2, February, pp. 214-222.

[5] Y. Q. Zhang, R. Pickholtz, and M. Loew (1991b). "A Combined Transform Coding (CTC) Scheme for Image Data Compression," IEEE Transactions on Consumer Electronics, February, pp. 45-50.

[6] G. Ramamurthy, B. Sengupta (1990). "Modeling and Analysis of a Variable Bit Rate Video Multiplexer," Proceedings of 7th International Teletraffic Congress Seminar.

Figure 1. Bit Rate of VBR Coded Video Frames (Kbits/frame versus frame number)

Figure 2. Bit Rate Histogram for 500 VBR Coded Video Frames (number of frames versus Kbits/frame)

Figure 4. Bit Rate Histogram of VBR Video Frames by the Model (number of frames versus Kbits/frame)

Figure 5. Aggregate Video Traffic from 10 Video Sources, Mean: 16.65 Mbps, STD: 1.62 Mbps (Mbits/sec versus frame number)

Figure 6. Bit Rate Histogram for Aggregate Traffic Generated by 10 VBR Video Sources (number of frames)
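
The aggregate figures quoted for 10 sources can be sanity-checked with simple arithmetic. The sketch below assumes the sources are statistically independent and identical, so that means add and standard deviations grow with the square root of the number of sources; it starts from the composite mean and standard deviation of the measured sample (55,247 and 17,755 bits/frame) rather than from the emulated data, so small differences from the values in the Figure 5 caption are expected.

```python
import math

def aggregate_stats(mean, std, n_sources):
    """Mean and standard deviation of the sum of n independent,
    identically distributed sources."""
    return n_sources * mean, math.sqrt(n_sources) * std

FRAME_RATE = 30                        # frames per second, as in Section VI
mean_bps = 55_247 * FRAME_RATE         # single-source mean in bits/second
std_bps = 17_755 * FRAME_RATE          # single-source standard deviation

agg_mean, agg_std = aggregate_stats(mean_bps, std_bps, n_sources=10)
single_ratio = std_bps / mean_bps      # standard deviation-to-mean, one source
agg_ratio = agg_std / agg_mean         # standard deviation-to-mean, 10 sources
```

Under the independence assumption the standard deviation-to-mean ratio shrinks by a factor of sqrt(10), which is consistent with the reported drop from 0.32 to 0.10.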