Evaluation of video quality metrics on transmission distortions in H.264 coded video

Iñigo Sedano, Maria Kihl, Kjell Brunnström and Andreas Aurelius

Abstract

The development of high-speed access networks has enabled a variety of video delivery alternatives over the Internet, for example IPTV and peer-to-peer based video services such as Voddler. Consequently, the development of real-time video QoE monitoring methods is receiving considerable attention in the research community. We believe that well-performing objective metrics that use reference information could speed up the development of real-time video QoE monitoring methods. In this paper we therefore study the accuracy of full-reference objective methods for assessing the quality degradation caused by transmission distortions. We evaluated several well-known, publicly available full-reference objective metrics on the freely available EPFL-PoliMI (Ecole Polytechnique Fédérale de Lausanne and Politecnico di Milano) video quality assessment database, which was specifically designed for the evaluation of transmission distortions. Full-reference metrics are usually evaluated against an uncompressed reference. Instead, we study the performance of the metrics when the reference videos are lightly compressed to ensure high quality.

Index Terms: Objective evaluation techniques, Performance evaluation, IPTV & Internet TV

I. INTRODUCTION

The development of high-speed access networks has enabled a variety of video delivery alternatives over the Internet, for example IPTV and peer-to-peer based video services such as Voddler [1]. A major problem is that the Quality of Experience (QoE) of the video can be severely affected even by a low packet loss rate. Consequently, it is important to allocate the network resources needed to minimize the loss of video information. Doing so requires monitoring and estimating the QoE delivered to the user.
Accordingly, the development of video quality metrics is receiving considerable attention in the research community. Currently, the methods that use reference information, which are suited to off-line, non-monitoring purposes, have shown much better performance in predicting perceived video quality than the no-reference methods that could be used for monitoring. Subjective assessments involving a panel of observers remain the most accurate way to measure video QoE and are also necessary for the development of video quality metrics. However, subjective tests are expensive and time consuming. Instead, well-performing objective metrics that use reference information could be used to speed up the development of real-time video QoE monitoring methods. To understand their performance, and how data compression and packet loss affect the QoE of video, we believe that the evaluation of already developed full-reference metrics is valuable.

Manuscript received December 3, 2010. This work was developed inside the Future Internet project supported by the Basque Government within the ETORTEK Programme and the FuSeN project supported by the Spanish Ministerio de Ciencia e Innovación, under grant from Fundación Centros Tecnológicos Iñaki Goenaga and Tecnalia Research & Innovation. The work was also partly financed by the CELTIC project IPNQSIS, with the Swedish Governmental Agency for Innovation Systems (VINNOVA) supporting the Swedish contribution. I. Sedano is with Tecnalia Research & Innovation, Zamudio, E-48170 Spain (e-mail: isedano@robotiker.es). M. Kihl is with the Dept. of Electrical and Information Technology, Lund University, SE-221 00 Sweden. K. Brunnström and A. Aurelius are with Acreo AB, Networking and Transmission Laboratory (Netlab), Kista, SE-164 40 Sweden.
The accuracy of full-reference objective methods for assessing the quality degradation caused by compression is already well studied, although independent evaluations are scarce. Independent evaluations of the impact of packet loss are rarer still. A notable exception is the work of the Video Quality Experts Group (VQEG) [2]. Until recently, however, there was a lack of freely available video quality databases containing transmission distortions, which limited research progress in the area.

Moorthy et al. (2010) [3] evaluated publicly available full-reference video quality assessment algorithms on the LIVE Wireless Video Quality Assessment database. The database was designed specifically for H.264/AVC compressed videos transmitted over a wireless channel and consisted of 10 reference videos and 160 distorted videos. A large-scale subjective study involving over 30 subjects was conducted to assess the quality of the distorted videos in the dataset. The LIVE Wireless Video Quality Assessment database is no longer publicly available.

In this paper, we evaluate several full-reference objective metrics on the freely available EPFL-PoliMI (Ecole Polytechnique Fédérale de Lausanne and Politecnico di Milano) video quality assessment database [4] [5] [6], which has been specifically designed for the evaluation of transmission distortions. The database contains 78 video sequences at 4CIF spatial resolution (704 × 576 pixels). The distorted videos are created from five 10-second and one 8-second uncompressed video sequences in planar I420 raw progressive format. The reference videos are lightly compressed to ensure high video quality in the absence of packet losses. The references are thus similar in quality to the uncompressed originals. In real network deployments the uncompressed original sequence is usually not available. We therefore believe it is of interest to evaluate the performance of the metrics when a compressed reference of slightly lower quality is used instead. The transmission distortions are simulated at six packet loss rates (PLR) (0.1%, 0.4%, 1%, 3%, 5%, 10%), and two different channel realizations are selected for each PLR. The H.264/AVC reference software, adopting the High Profile, is used for both encoding and decoding the videos. The compressed videos in the absence of packet losses are used as the reference for the computation of the DMOS (Differential Mean Opinion Score) values. Forty naive subjects took part in the subjective tests.

We have evaluated the performance of the following well-known, publicly available video quality algorithms: Peak Signal to Noise Ratio (PSNR), Structural SIMilarity (SSIM) index [11], Multi-scale SSIM (MS-SSIM) [13], Video Quality Metric (VQM) [15], Visual Signal to Noise Ratio (VSNR) [16], MOtion-based Video Integrity Evaluation (MOVIE) [18], Spatial MOVIE [18] and Temporal MOVIE [18]. The performance of the objective models is evaluated using the Spearman Rank Order Correlation Coefficient, the Pearson Linear Correlation Coefficient, the Root Mean Square Error (RMSE) and the Outlier Ratio. A non-linear regression is done using a monotonic cubic polynomial function with four parameters, as recommended by the VQEG [7]. The performance of the different metrics is compared by means of a statistical significance analysis based on the Pearson, RMSE and Outlier Ratio coefficients.

II. METHODOLOGY

A. EPFL-PoliMI video database

Three of the reference videos have a frame rate of 25 fps (obtained by cropping HD-resolution video sequences down to 4CIF (704 × 576 pixels) resolution and downsampling the original content from 50 fps to 25 fps), while the other three have a frame rate of 30 fps.
As already stated, the reference videos are lightly compressed to ensure high quality in the absence of packet losses. To this end, a fixed Quantization Parameter between 28 and 32 was selected for each sequence. The sequences were encoded in H.264/AVC [8] High Profile. B-pictures and context-adaptive binary arithmetic coding (CABAC) were enabled for coding efficiency. Each frame was divided into a fixed number of slices, where each slice consists of a full row of macroblocks. The error patterns were generated at six different PLRs (0.1%, 0.4%, 1%, 3%, 5%, 10%) with a two-state Gilbert model with an average burst length of 3 packets. For each PLR and content, two realizations were selected, producing a total of 72 distorted sequences. The subjective evaluation used the 5-point ITU continuous scale in the range [0, 5] [9]. Twenty-one subjects participated in the evaluation at the PoliMI lab and 19 at the EPFL lab. More details about the subjective evaluation can be found in [4] [5] [6].

B. Processing of the subjective scores

In [5] the Shapiro-Wilk test was used to verify the normality of the score distributions across subjects, and the results showed that the distributions are normal or close to normal. Although the raw subjective scores were already processed in the EPFL-PoliMI database, we processed them in a different way. A Student's t-test on the overall mean and standard deviation of the raw individual scores of each lab showed that, at the 95% confidence level, the data from the two labs can be merged. As an additional verification, the DMOS and confidence interval (CI) values (in this case after normalization, screening and re-scaling) were calculated for each content and distortion type and compared between the two labs, confirming that the data from the two labs can be merged.

First, we calculated the difference scores by subtracting the score of each degraded video from the score of the corresponding reference video.
The difference scores for the reference videos are 0 and are therefore removed. Accordingly, a lower difference score indicates higher quality. Afterwards, the Z-scores were computed for each subject separately by means of the MATLAB zscore function. The Z-scores transform the original distribution into one with zero mean and unit standard deviation. Each subject may have used the rating scale differently and with a different offset; this normalization reduces the gain and offset differences between subjects. Subsequently, the outliers were detected according to the guidelines described in Section 2.3.1 of Annex 2 of [9] and removed.

Next, the Z-scores were re-scaled to the range [0, 5]. The Z-scores are assumed to follow a standard Gaussian distribution, so more than 99% of the scores will lie in the range [-3, 3]. In our study 100% of the scores lay in that range. The re-scaling linearly maps the range [-3, 3] to the range [0, 5] using the following formula, where z is a Z-score and z' its re-scaled value:

\[ z' = \frac{5\,(z + 3)}{6} \]

Finally, the Difference Mean Opinion Score (DMOS) of each video was computed as the mean of the re-scaled Z-scores from the 36 subjects that remained after rejection. The confidence intervals were also computed.

C. Objective assessment algorithms

The performance of the following video quality algorithms was evaluated on the EPFL-PoliMI video quality assessment database:

Peak Signal to Noise Ratio (PSNR): PSNR is computed using the mean of the MSE vector, which contains the Mean Square Error of each frame. The implementation used is based on the "PSNR of YUV videos" program (yuvpsnr.m) by Dima Pröfrock, available in the MATLAB Central file repository [10]. Only the luminance values were considered.
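The PSNR computation described above (one PSNR value obtained from the mean of the per-frame MSE values, luminance only) can be sketched in a few lines of Python. This is a minimal illustration and not the yuvpsnr.m program used in the study: the function names `read_luma_frames`, `frame_mse` and `sequence_psnr` are hypothetical, and an 8-bit peak value of 255 is assumed. The reader relies only on the fact that each planar I420 frame stores the full-resolution luminance plane first, followed by two quarter-size chroma planes.

```python
import math


def read_luma_frames(path, width, height):
    """Read the luminance (Y) plane of every frame in a raw planar I420 file.

    An I420 frame occupies width*height*3/2 bytes: the Y plane comes first,
    followed by the subsampled U and V planes, which are skipped here.
    """
    y_size = width * height
    frame_size = y_size * 3 // 2
    frames = []
    with open(path, "rb") as f:
        while True:
            buf = f.read(frame_size)
            if len(buf) < frame_size:  # end of file (or truncated last frame)
                break
            frames.append(buf[:y_size])
    return frames


def frame_mse(ref, dist):
    """Mean squared error between two equally sized luminance frames."""
    if len(ref) != len(dist) or len(ref) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    return sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)


def sequence_psnr(ref_frames, dist_frames, peak=255.0):
    """PSNR in dB computed from the mean of the per-frame MSE vector."""
    if len(ref_frames) != len(dist_frames) or len(ref_frames) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    mse_vector = [frame_mse(r, d) for r, d in zip(ref_frames, dist_frames)]
    mean_mse = sum(mse_vector) / len(mse_vector)
    if mean_mse == 0:
        return math.inf  # identical sequences
    return 10.0 * math.log10(peak ** 2 / mean_mse)
```

For a 4CIF sequence the frames would be read with `read_luma_frames(path, 704, 576)`. Averaging the MSE before taking the logarithm, as the text describes, is not the same as averaging per-frame PSNR values, so the two approaches generally give different numbers.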

Structural SIMilarity (SSIM): SSIM is computed for each frame and the per-frame values are then averaged. The implementation used is an improved version of the original [11] in which the appropriate scale is estimated. The implementation, named ssim.m, can be downloaded from the author's implementation home page [12]. Only the luminance values were considered.

Multi-scale SSIM (MS-SSIM): MS-SSIM [13] is computed for each frame and the per-frame values are then averaged. The implementation used was downloaded from the Laboratory for Image & Video Engineering (LIVE) at the University of Texas at Austin [14]. Only the luminance values were considered.

Video Quality Metric (VQM): Software version 2.2 for Linux was used, downloaded from the authors' implementation home page [15]. The parameters used were: parsing type none; spatial, valid, gain and temporal calibration automatic; temporal algorithm sequence; temporal valid uncertainty false; alignment uncertainty 15; calibration frequency 15; and video model general model. The files were converted to the format required by VQM (Big-YUV file format, 4:2:2) using ffmpeg.

Visual Signal to Noise Ratio (VSNR): VSNR [16] is computed using the total signal and noise values of the sequence. We modified the authors' implementation available at [17] to extract the signal and noise values in order to sum them separately. Only the luminance values were considered.

MOtion-based Video Integrity Evaluation (MOVIE): MOVIE [18] includes three different versions: the Spatial MOVIE index, the Temporal MOVIE index and the MOVIE index. The MOVIE Index version 1.0 for Linux was used and can be downloaded from [14]. The optional parameters framestart, frameend and frameint were not used.

The default parameter values were used for all the metrics. No registration problems occur in the dataset.

D. Statistical analysis

To test the performance of the objective algorithms we compute the Spearman Rank Order Correlation Coefficient (SROCC), the Pearson correlation coefficient, the Root Mean Square Error (RMSE) and the Outlier Ratio (OR). The Spearman coefficient assesses how well the relationship between two variables can be described using a monotonic function. The Pearson coefficient measures the linear relationship between a model's predictions and the subjective data. The RMSE provides a measure of the prediction accuracy. Lastly, the consistency of the objective metric is evaluated by the Outlier Ratio.

The Pearson, RMSE and Outlier Ratio are computed after a non-linear regression. The regression is done using a cubic polynomial function with four parameters, constrained to be monotonic:

\[ DMOS_p = a x^3 + b x^2 + c x + d \]

In the above equation DMOS_p is the predicted value, x is the objective score produced by the metric, and a, b, c and d are the four fitted parameters. The four parameters are obtained using the MATLAB function nlinfit. The performance of the metrics is compared by means of a statistical significance analysis based on the Pearson, RMSE and Outlier Ratio coefficients [7].

III. RESULTS

We include the scatter plots of the objective metric scores vs. DMOS for all the videos in the EPFL-PoliMI video quality database. The fitting function is also plotted.

Fig. 1. Scatter plot PSNR
Fig. 2. Scatter plot SSIM
Fig. 3. Scatter plot MS-SSIM
Fig. 4. Scatter plot VQM
Fig. 5. Scatter plot VSNR
Fig. 6. Scatter plot MOVIE
Fig. 7. Scatter plot SPATIAL MOVIE
Fig. 8. Scatter plot TEMPORAL MOVIE
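As a rough illustration of the statistical analysis in Section II.D, the sketch below fits a four-parameter cubic polynomial to (objective score, DMOS) pairs and then computes the Pearson and Spearman coefficients, the RMSE and the Outlier Ratio. It rests on several assumptions that are not taken from the paper: the fit is an ordinary least-squares polynomial fit without the monotonicity constraint (the study used MATLAB nlinfit), the RMSE divides by the number of videos rather than by the degrees of freedom, a prediction counts as an outlier when its error exceeds the confidence-interval half-width of the subjective score, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(v):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_cubic(x, y):
    """Least-squares fit of y ~ a*x^3 + b*x^2 + c*x + d; returns [a, b, c, d]."""
    rows = [[xi ** 3, xi ** 2, xi, 1.0] for xi in x]
    # Augmented normal equations (V^T V | V^T y).
    m = [[sum(r[i] * r[j] for r in rows) for j in range(4)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(4)]
    for col in range(4):  # Gaussian elimination with partial pivoting
        piv = max(range(col, 4), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 4):
            f = m[r][col] / m[col][col]
            for c in range(col, 5):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * 4
    for i in range(3, -1, -1):  # back substitution
        s = m[i][4] - sum(m[i][j] * coef[j] for j in range(i + 1, 4))
        coef[i] = s / m[i][i]
    return coef


def predict(coef, x):
    """Evaluate the fitted cubic to obtain the predicted DMOS values."""
    a, b, c, d = coef
    return [a * xi ** 3 + b * xi ** 2 + c * xi + d for xi in x]


def rmse(pred, dmos):
    """Root mean square prediction error (assumption: divides by N)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, dmos)) / len(dmos))


def outlier_ratio(pred, dmos, ci):
    """Share of videos whose error exceeds the CI half-width (assumed rule)."""
    return sum(abs(p - t) > c for p, t, c in zip(pred, dmos, ci)) / len(dmos)


if __name__ == "__main__":
    # Synthetic values for illustration only; not data from the study.
    scores = [25.0, 28.0, 30.0, 33.0, 36.0, 40.0]
    dmos = [4.1, 3.6, 3.0, 2.2, 1.5, 0.9]
    ci = [0.2] * len(dmos)
    pred = predict(fit_cubic(scores, dmos), scores)
    print(spearman(scores, dmos), pearson(pred, dmos),
          rmse(pred, dmos), outlier_ratio(pred, dmos, ci))
```

The Spearman coefficient is computed on the raw metric scores because it depends only on rank order, whereas the Pearson coefficient, the RMSE and the Outlier Ratio are computed on the predicted values after the regression, as the text specifies.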

Table I shows the values of the coefficients for all the metrics.

TABLE I
OBJECTIVE QUALITY ASSESSMENT ALGORITHMS

Metric            Pearson   Spearman   RMSE     Outlier Ratio
PSNR              0.9586    0.9614     0.2195   0.6250
SSIM              0.9595    0.9696     0.2173   0.5972
MS-SSIM           0.9642    0.9781     0.2046   0.5972
VSNR              0.9744    0.9735     0.1733   0.4722
VQM               0.9619    0.9603     0.2109   0.5417
MOVIE             0.9650    0.9622     0.2023   0.6250
SPATIAL MOVIE     0.9814    0.9787     0.1480   0.4583
TEMPORAL MOVIE    0.9243    0.9142     0.2944   0.6111

IV. DISCUSSION

The statistical significance analysis based on the RMSE shows that, at the 95% confidence level, all the metrics are statistically better than Temporal MOVIE. On the other hand, Spatial MOVIE is statistically better than all the other metrics except VSNR. VSNR is statistically better than PSNR and SSIM. The statistical significance analysis based on the Pearson coefficient confirms the lower performance of Temporal MOVIE, while the analysis based on the Outlier Ratio does not reveal differences between the performances of the metrics. The packet loss did not induce registration problems, which partly explains the high correlation values obtained.

V. CONCLUSIONS

The results show that when the lightly compressed sequences without packet losses are taken as the reference instead of the uncompressed videos, the correlation of the selected objective algorithms is high: the lowest Pearson correlation coefficient after non-linear regression in our study is 0.9243. However, we believe the overall correlation would be lower if the models were evaluated over databases containing more sequences, registration problems, different coding parameters (e.g. flexible macroblock ordering) and error concealment strategies. The statistical analysis based on the RMSE shows that, at the 95% confidence level, the Spatial MOVIE index has the highest performance and Temporal MOVIE the lowest among the studied metrics.

REFERENCES

[1] Voddler home page, http://www.voddler.com.
[2] Video Quality Experts Group page, http://www.its.bldrdoc.gov/vqeg
[3] A. K. Moorthy, K. Seshadrinathan, R. Soundararajan and A. C. Bovik, "Wireless Video Quality Assessment: A study of subjective scores and objective algorithms," IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, no. 4, pp. 587-599, April 2010.
[4] F. De Simone, M. Naccari, M. Tagliasacchi, F. Dufaux, S. Tubaro and T. Ebrahimi, "Subjective assessment of H.264/AVC video sequences transmitted over a noisy channel," in Proc. International Workshop on Quality of Multimedia Experience (QoMEX), San Diego, California, U.S.A, July 2009.
[5] F. De Simone, M. Tagliasacchi, M. Naccari, S. Tubaro and T. Ebrahimi, "A H264/AVC video database for the evaluation of quality metrics," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, Texas, U.S.A, March 2010.
[6] EPFL-PoliMI video quality assessment database [Online]. Available: http://vqa.como.polimi.it
[7] Final report from the video quality experts group on the validation of objective models of multimedia quality assessment, phase I [Online]. Available: ftp://vqeg.its.bldrdoc.gov/documents/vqeg_approved_final_reports/VQEG_MM_Report_Final_v2.6.pdf
[8] H.264/AVC reference software version JM14.2, Tech. Rep., Joint Video Team (JVT) [Online]. Available: http://iphome.hhi.de/suehring/tml/download/old_jm/
[9] ITU-T Recommendation P.910, "Subjective video quality assessment methods for multimedia applications," September 1999.
[10] MATLAB Central File Exchange [Online]. Available: http://www.mathworks.com/matlabcentral/fileexchange/
[11] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April 2004.
[12] The Structural SIMilarity (SSIM) index author's home page [Online]. Available: http://www.ece.uwaterloo.ca/~z70wang/research/ssim/
[13] Z. Wang, E. P. Simoncelli and A. C. Bovik, "Multi-scale structural similarity for image quality assessment," IEEE Asilomar Conference on Signals, Systems and Computers, November 2003.
[14] Laboratory for Image & Video Engineering [Online]. Available: http://live.ece.utexas.edu/research/quality/index.htm
[15] Video Quality Metric (VQM) Software [Online]. Available: http://www.its.bldrdoc.gov/vqm/
[16] D. M. Chandler and S. S. Hemami, "VSNR: A Wavelet-Based Visual Signal-to-Noise Ratio for Natural Images," IEEE Transactions on Image Processing, vol. 16, no. 9, pp. 2284-2298, September 2007.
[17] VSNR implementation from the authors [Online]. Available: http://foulard.ece.cornell.edu/dmc27/vsnr/vsnr.html
[18] K. Seshadrinathan and A. C. Bovik, "Motion Tuned Spatio-temporal Quality Assessment of Natural Videos," IEEE Transactions on Image Processing, vol. 19, no. 2, pp. 335-350, February 2010.