Video Quality Evaluation with Multiple Coding Artifacts

L. Dong, W. Lin*, P. Xue
School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore
* Laboratories of Information Technology, Singapore

Abstract: - This study aims to identify the effects of different coding artifacts on video quality in various situations, via a comprehensive analysis of the VQEG test results. It confirms that each coding artifact (e.g., blockiness or edge damage) may play a different role in different video sequences. This finding leads to a proposed strategy for a new distortion evaluation scheme.

Key-Words: - Perceptual visual metrics, Subjective tests, Coding artifacts, Human visual system, Blockiness, Edge

1 Introduction

It is widely acknowledged that peak signal-to-noise ratio (PSNR) does not always correlate well with perceived picture quality. In order to develop quality metrics that better match human vision, various approaches [1-10] have been tested both inside and outside the VQEG (Video Quality Experts Group) initiatives. Much effort has been directed at modeling human vision characteristics with temporal and spatial decomposition [1-4], and at detecting features (e.g., blockiness, correlated error, etc.) [5-9] in the images. However, due to the incomplete understanding of the human visual system and the lag in incorporating physiological/psychological findings, the performance of the second class of metrics is still far from our expectations [10-11]. Among the ten metrics (including PSNR) under VQEG evaluation, none outperforms the others on all test sequences (with 16 test combinations each) in terms of Pearson correlation (the prediction accuracy measure) and Spearman rank-order correlation (the prediction monotonicity measure).
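The two correlation measures named above can be illustrated with a minimal sketch. It assumes NumPy is available; the function names and the sample scores are hypothetical and are not taken from the VQEG data. Spearman rank-order correlation is computed here as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
import numpy as np


def pearson(x, y):
    """Pearson linear correlation (used as the prediction accuracy measure)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (the prediction monotonicity measure)."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical metric outputs and subjective scores (not VQEG data):
# the relation is monotonic but not linear.
metric_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
subjective_scores = [1.0, 4.0, 9.0, 16.0, 25.0]
accuracy = pearson(metric_scores, subjective_scores)
monotonicity = spearman(metric_scores, subjective_scores)
```

In this toy case the monotonicity measure is perfect while the accuracy measure is not, which is why the two are reported separately.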
In this paper, we present some of the findings from a comprehensive investigation and comparative analysis of the VQEG source sequences, the associated decoded results, the difference images, and the DMOS (Difference Mean Opinion Scores). Important issues that cause inconsistency between objective measures and subjective DMOS are highlighted. A new scheme for gauging distortion is hence proposed.

2 Material & Sequence Classification for this Study

VQEG conducted subjective tests [12] on a wide spectrum of decoded video sequences (both 50 and 60 Hz). The 16 source sequences used in the study are shown in Figure 1, while their corresponding test conditions are listed in Table 1. DMOS values typically range from 0 (the highest subjective quality) to 100 (the lowest subjective quality).

For this study, the test sequences are divided into 3 groups according to the nature of the visual signal. Those with fast object motion are classified as Group 1, including SRC5, SRC6, SRC9 & SRC19. The second group consists of those with no fast object motion but intensive edges in the images, including SRC1, SRC2, SRC3, SRC10, SRC14, SRC15, SRC18 & SRC22. The last (third) group contains those with neither fast object motion nor intensive edges, including SRC4, SRC16, SRC20 & SRC21.

[Figure 1 panels: Src1_ref_625, Src2_ref_625, Src3_ref_625, Src4_ref_625, Src5_ref_625, Src6_ref_625, Src7_ref_625, Src8_ref_625, Src9_ref_625, Src10_ref_625, Src13_ref_525, Src14_ref_525, Src15_ref_525, Src16_ref_525, Src17_ref_525, Src18_ref_525, Src19_ref_525, Src20_ref_525, Src21_ref_525, Src22_ref_525]
Figure 1. Test Sequences.

| NUMBER | BIT RATE | RES | METHOD | COMMENTS |
|---|---|---|---|---|
| 16 | 1.5 Mb/s | CIF | H.263 | Full Screen |
| 15 | 768 kb/s | CIF | H.263 | Full Screen |
| 14 | 2 Mb/s | ¾ | mp@ml | This is horizontal resolution reduction only |
| 13 | 2 Mb/s | ¾ | sp@ml | |
| 12 | 4.5 Mb/s | | mp@ml | With errors TBD |
| 11 | 3 Mb/s | | mp@ml | With errors TBD |
| 10 | 4.5 Mb/s | | mp@ml | |
| 9 | 3 Mb/s | | mp@ml | |
| 8 | 4.5 Mb/s | | mp@ml | Composite NTSC and/or PAL |
| 7 | 6 Mb/s | | mp@ml | |
| 6 | 8 Mb/s | | mp@ml | Composite NTSC and/or PAL |
| 5 | 8 & 4.5 Mb/s | | mp@ml | Two codecs concatenated |
| 4 | 19/PAL(NTSC)-12 Mb/s | | 422p@ml | PAL or NTSC, 3 generations |
| 3 | 50-50-...-50 Mb/s | | 422p@ml | 7th generation with shift / I frame |
| 2 | 19-19-12 Mb/s | | 422p@ml | 3rd generation |
| 1 | n/a | | n/a | Multi-generation Betacam with drop-out (4 or 5, composite/component) |

Table 1: Test conditions (HRCs)
3 Three Major Causes of Inconsistent Distortion Measure

For the first group of sequences (all are sport sequences), the highest DMOS values (i.e., the lowest subjective quality) are mostly generated by HRC11, HRC13, HRC14, and HRC15. Among them, HRC11, HRC13, and HRC14 generate a significant blocking effect, and HRC15 does not. Figure 2 shows SRC6_HRC11, in which the blocking effect is more visible than in SRC6_HRC15. In Figure 3, it can be seen that HRC11, HRC13, and HRC14 have much higher PSNR than HRC15 for all sequences in Group 1 (i.e., SRC5, SRC6, SRC9 and SRC19), but their subjective quality is no better than that of HRC15. It is observed that blockiness appears as the major damage to visual quality when fast motion occurs.

In the second group of sequences, the SRCs contain much high-frequency information in each frame, as well as some slow object movement or camera zooming. One can see that blocking effects also exist in these SRCs (for example, SRC8_HRC13); however, they are less annoying than those in Group 1. For this group, the worst DMOS values belong to the sequences with blurred edges. Edge blurring occurs when information is lost in the vicinity of edges. This can be understood because high-frequency components, particularly those carrying edge information, are most important to the human visual system. In Figure 4 (a) and (b), a frame is shown for each of two decoded video sequences: the test combination SRC10_HRC15 exhibits more severe edge damage than SRC10_HRC1 (it is obvious in the calendar, animals, fences and other areas). This can also be seen in Figure 4 (c) and (d), where more edge energy can be found in the error image of SRC10_HRC15 than in that of SRC10_HRC1. As expected, the DMOS for SRC10_HRC1 is significantly lower than that for SRC10_HRC15 (22 and 68, respectively), although the average PSNR of the two test combinations is very close (in fact, that of SRC10_HRC15 is even 0.5 dB higher).
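The SRC10 comparison can be mimicked on synthetic data. The sketch below assumes NumPy and uses hypothetical helper names; the paper does not state how edge energy in the error image is computed, so the gradient-based edge mask and its relative threshold are assumptions made only for illustration. It builds two distortions of a step-edge image that have the same mean squared error, and therefore the same PSNR, but differ in how much of the error energy falls on the edge.

```python
import numpy as np


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a test frame."""
    err = np.asarray(ref, dtype=float) - np.asarray(test, dtype=float)
    mse = np.mean(err ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


def edge_mask(ref, rel_threshold=0.5):
    """Assumed edge detector: gradient magnitude above a fraction of its maximum."""
    gy, gx = np.gradient(np.asarray(ref, dtype=float))
    mag = np.hypot(gx, gy)
    if mag.max() == 0:
        return np.zeros(mag.shape, dtype=bool)
    return mag > rel_threshold * mag.max()


def edge_error_share(ref, test, rel_threshold=0.5):
    """Fraction of the squared-error energy located on edge pixels of the reference."""
    err2 = (np.asarray(ref, dtype=float) - np.asarray(test, dtype=float)) ** 2
    total = err2.sum()
    if total == 0:
        return 0.0
    return float(err2[edge_mask(ref, rel_threshold)].sum() / total)


# Synthetic 16x16 reference frame with one vertical step edge (not VQEG data).
ref = np.full((16, 16), 50.0)
ref[:, 8:] = 200.0

# Distortion A: the same small error on every pixel (squared error 16 each).
uniform = ref + 4.0

# Distortion B: the same total error energy, concentrated on the two edge columns.
edgy = ref.copy()
edgy[:, 7:9] += np.sqrt(128.0)

psnr_uniform, psnr_edgy = psnr(ref, uniform), psnr(ref, edgy)
share_uniform = edge_error_share(ref, uniform)
share_edgy = edge_error_share(ref, edgy)
```

Here `psnr_uniform` and `psnr_edgy` agree to within rounding, while `share_edgy` is far larger than `share_uniform`, so PSNR alone cannot separate the two cases.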
The conclusion is that it is the concentration of error energy on edges that causes the degradation of perceptual quality. Similar phenomena occur with the other sequences in the group.

In Group 3, since neither blockiness nor intensively damaged edges occur, luminance and chrominance errors become the major factor in distortion perception in most test combinations.

(a) SRC6_HRC11 (b) SRC6_HRC15
Figure 2: Frame in SRC6_HRC11 (a) and SRC6_HRC15 (b)
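The observations of this section can be condensed into a small lookup from content type to dominant artifact. The sketch below is illustrative only: the group membership lists are those given in Section 2, while the function names and the two boolean content flags are hypothetical, and how fast motion or intensive edges would actually be detected is left open.

```python
# Group membership as listed in Section 2.
GROUPS = {
    1: ["SRC5", "SRC6", "SRC9", "SRC19"],
    2: ["SRC1", "SRC2", "SRC3", "SRC10", "SRC14", "SRC15", "SRC18", "SRC22"],
    3: ["SRC4", "SRC16", "SRC20", "SRC21"],
}

# Dominant distortion factor observed for each group in Section 3.
DOMINANT_ARTIFACT = {
    1: "blockiness",
    2: "edge blurring",
    3: "luminance/chrominance error",
}


def group_of(fast_motion, intensive_edges):
    """Classify content by the two properties used in Section 2."""
    if fast_motion:
        return 1
    if intensive_edges:
        return 2
    return 3


def dominant_artifact(fast_motion, intensive_edges):
    """Dominant artifact for content with the given (assumed known) properties."""
    return DOMINANT_ARTIFACT[group_of(fast_motion, intensive_edges)]


def dominant_artifact_for_sequence(src):
    """Dominant artifact for a named source sequence, or None if it is not grouped."""
    for group, members in GROUPS.items():
        if src in members:
            return DOMINANT_ARTIFACT[group]
    return None
```

For example, `dominant_artifact_for_sequence("SRC6")` gives the Group 1 entry, and a sequence name that Section 2 does not assign to any group gives `None`.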
Figure 3: PSNR vs. Subjective Rating (DMOS).

(a) SRC10_HRC1 (PSNR = 23.4 dB; DMOS = 22)
(b) SRC10_HRC15 (PSNR = 23.9 dB; DMOS = 68)
(c) Error image for SRC10_HRC1
(d) Error image for SRC10_HRC15
Figure 4. Comparison of SRC10_HRC1 and SRC10_HRC15

4 Proposed Strategies for Perceptual Visual Metric

The main difficulty in real-world quality assessment is that many factors affect the overall perception in the human visual system and, in general, there is no prior knowledge of the dominant distortion. However, detecting multiple error sources simultaneously is time consuming. It also degrades the measurement accuracy in practice when certain types of error are not present. In [9], a switch is devised between blockiness detection and normal error evaluation. From the analysis in Section 3, blockiness is the major distortion factor in fast-moving video (Group 1), while damaged edges contribute significantly to the quality degradation in Group 2. A double switching mechanism is proposed for fast motion detection and, subsequently, strong edge detection, as shown in Figure 5. Since a general VQM (visual quality metric) (e.g., [1-4]) could not
predict perceptual distortion effectively in all cases, the blockiness error (once fast motion is detected) or the edge error (when there is no fast motion but intensive edges are present) must be incorporated alongside the VQM.

Figure 5. Proposed Scheme for Distortion Evaluation Based upon Multiple Error Detection. S1~S3 are three switches. For S1, if no fast motion is detected, processing continues with Strong Edge Detecting. For S2, if the Strong Edge Detector result is true, the Edge Blurring Error is added into the Distortion Measure. S3 is the same as S2.

5 Concluding Remarks

We have conducted an analysis of the VQEG test sequences together with the subjective test results, and identified the major factors affecting perceptual quality according to the nature of the visual signals. There is no doubt that various artifacts co-exist in pictures, but they degrade the perceptual quality in different ways depending on the nature of the video sequences viewed. Based upon these findings, a new distortion metric scheme has been proposed, with a switching strategy among different error detections.

References
[1] Watson, A.B., and Solomon, J.A., A model of visual contrast gain control and pattern masking, Journal of the Optical Society of America A, Vol. 14, No. 9, 1997, pp. 2379-2391.
[2] Watson, A.B., et al., DVQ: A digital video quality metric based on human vision, Journal of Electronic Imaging, Vol. 10, No. 1, 2001, pp. 20-29.
[3] Winkler, S., Vision models and quality metrics for image processing applications, Ecole Polytechnique Fédérale de Lausanne (EPFL), Swiss Federal Institute of Technology, Thesis No. 2313, Lausanne, Switzerland, Dec. 2000.
[4] Winkler, S., A perceptual distortion metric for digital color video, SPIE Proc. Human Vision and Elect. Imaging IV, Vol. 3644, pp. 175-184, 1999.
[5] Wu, H.R. and Yuen, M., A generalized block-edge impairment metric for video coding, IEEE Sig. Proc. Lett., Vol. 4(11), pp. 317-320, 1997.
[6] Karunasekera, S. A. and Kingsbury, N.
G., A distortion measure for blocking artifacts in images based on human visual sensitivity, IEEE Trans. Ima. Proc., Vol. 4(6), pp. 713-724, 1995.
[7] Miyahara, M., Kotani, K., and Algazi, V.R., Objective picture quality scale (PQS) for image coding, IEEE Trans. Communications, Vol. 46(9), pp. 1215-1225, 1998.
[8] Yu, Z., Wu, H.R., Winkler, S. and Chen, T., Vision-model-based impairment metric to evaluate blocking artifacts in digital video, Proc. IEEE, Vol. 90(1), pp. 154-169, 2002.
[9] Tan, K.T. and Ghanbari, M., A multimetric objective picture-quality measurement model for MPEG video, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 10, No. 7, Oct. 2000, pp. 1208-1213.
[10] VQEG (Video Quality Experts Group), Final Report from the Video Quality Experts Group on the validation of Objective Models of Video Quality Assessment, www.vqeg.org, March 2000.
[11] Wang, Z., et al., Why Is Image Quality Assessment So Difficult?, Proc. IEEE International Conference on Acoustics, Speech, & Signal Processing, May 2002.
[12] VQEG (Video Quality Experts Group), Evaluation of new methods for objective testing of video quality: objective test plan, ITU-T/COM-T/COM12/C, www.vqeg.org, September 1998.