KEY INDICATORS FOR MONITORING AUDIOVISUAL QUALITY

Proceedings of the Seventh International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM 2013), January 30-February 1, 2013, Scottsdale, Arizona

Mikołaj Leszczuk, Mateusz Hanusiak, Ignacio Blanco (AGH University), Emmanuel Wyckens (Orange Labs), Silvio Borer (SwissQual)

September 14, 2012

Abstract

Automated quality checking is currently based on finding major video and audio artefacts. The Monitoring Of Audiovisual quality by key Indicators (MOAVI) subgroup of the Video Quality Experts Group (VQEG) is an open collaborative project for developing no-reference models for monitoring audiovisual service quality. MOAVI is a complementary, industry-driven alternative to Quality of Experience (QoE) models that measures audiovisual quality automatically by using simple indicators of perceived degradation. The goal is to develop a set of key indicators describing service quality in general, and to select subsets for each potential application. The indicators include blocking effects, blurring effects, freeze/jerkiness effects, ghosting effects, slice video stripe errors, aspect ratio problems, field order problems, photosensitive epilepsy flashing effects, silence, and clipping. The list is not final, but it includes the most important artefacts. The MOAVI project therefore concentrates on models based on key indicators, in contrast to models that predict overall quality.

1 INTRODUCTION

Current No-Reference (NR) Quality of Experience (QoE) models, such as those reported in related research work [1], measure the quality of networked multimedia using objective parametric models. These models may have some difficulty predicting overall audiovisual QoE. A complementary, industry-driven alternative that measures quality automatically by using simple indicators of perceived degradation can therefore be proposed.
Consequently, the Monitoring Of Audiovisual quality by key Indicators (MOAVI) [2] subgroup of the Video Quality Experts Group (VQEG) [3], an open collaborative project for developing NR models for monitoring audiovisual service quality, is developing such a set of key indicators.

This paper is organized as follows. Section 2 describes the limitations of existing standardized models. Section 3 presents MOAVI's key indicators. Section 4 concludes the paper and outlines future work.

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement 218086 (INDECT).

2 BACKGROUND AND STATE-OF-THE-ART

This section presents the limitations of state-of-the-art Full-Reference (FR), Reduced-Reference (RR) and NR metrics for standardized models. Most of the models in the recommendations have been validated under one of the following hypotheses:

- Frame freezes last no longer than 2 seconds.
- No degradation is allowed at the beginning or at the end of the video sequence.
- No frames are skipped.
- The video reference is clean (no spatial or temporal distortions).
- Only a minimal delay between the video reference and the processed video is supported (sometimes the delay must be constant).
- Up-scaling and down-scaling operations are not always taken into account.

Most models measure conventional blurriness, blockiness and jerkiness artefacts in order to produce predicted Mean Opinion Scores (MOS), and most of the algorithms that produce the MOS combine the blur, block and jerkiness metrics. The weighting of the indicators may be a simple mathematical function. If one of the indicators is incorrect, the global predicted score can be completely wrong. The other indicators considered in MOAVI (e.g. ghosting, slice errors) are not taken into account in the MOS.

The history of the ITU-T Recommendations is shown in Table 1, while the metrics based on the video signal only are shown in Table 2.

Table 1: History of ITU-T Recommendations.

Type of model   Format          Recommendation   Year
FR              SD              J.144 [4]        2004
FR              QCIF/CIF/VGA    J.247 [5]        2008
RR              QCIF/CIF/VGA    J.246 [6]        2008
RR              SD              J.249 [7]        2010
FR              HD              J.341 [8]        2011
RR              HD              J.342 [9]        2011
Bitstream       VGA/HD          In progress      Expected 2013
Hybrid          VGA/HD          In progress      Expected 2013

The related research work [10] addresses measuring multimedia quality in mobile networks with an objective parametric model. Current standardization activity at ITU-T SG12 on models for multimedia and IPTV based on bit-stream information is also closely related. SG12 is now working on models for IPTV.
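
The sensitivity of a weighted combination to a single faulty indicator, described earlier in this section, can be illustrated with a minimal sketch. The linear form, the weights, the [0, 1] impairment scale and the function name combined_mos are all assumptions made for illustration; none of them is taken from the Recommendations listed in Table 1.

```python
# Illustrative only: the weights, the scale and the linear form are assumptions,
# not the method of any ITU-T Recommendation.

def combined_mos(blur, block, jerk, weights=(0.4, 0.4, 0.2)):
    """Map three impairment levels in [0, 1] to a MOS-like score in [1, 5]."""
    w_blur, w_block, w_jerk = weights
    impairment = w_blur * blur + w_block * block + w_jerk * jerk
    impairment = min(max(impairment, 0.0), 1.0)  # clamp to the valid range
    return 5.0 - 4.0 * impairment


# Same sequence scored twice; in the second call the blockiness
# indicator is assumed to have been mis-measured (0.9 instead of 0.1).
good = combined_mos(blur=0.1, block=0.1, jerk=0.0)    # about 4.68
faulty = combined_mos(blur=0.1, block=0.9, jerk=0.0)  # about 3.40

print(f"correct indicators: {good:.2f}")
print(f"faulty blockiness:  {faulty:.2f}")
```

With these assumed weights, one mis-measured indicator lowers the combined score by more than one MOS point although the other two indicators are unchanged, and the single output value gives no hint as to which indicator was responsible.
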
Q.14/12 (Question 14 of SG12) is responsible for these modelling projects, provisionally called P.NAMS (non-intrusive parametric model for assessment of performance of multimedia streaming) and P.NBAMS (non-intrusive bitstream model for assessment of performance of multimedia streaming). P.NAMS uses packet-header information only (e.g. from IP through MPEG2-TS), while P.NBAMS can also use the payload information (i.e. the coded bit-stream) [11]. However, this work has so far focused on overall quality (in MOS units), while MOAVI focuses on Key Performance Indicators (KPIs). The MOAVI project could be used to study human behaviour over a longer period, and to propose an adapted model with enhanced SSCQE methods.

Table 2: Synthesis of the FR, RR and NR MOS models.

              Type of ITU-T model
Resolution    FR           RR           NR
HDTV          J.341 [8]    n/a          n/a
SDTV          J.144 [4]    n/a          n/a
VGA           J.247 [5]    J.246 [6]    n/a
CIF           J.247 [5]    J.246 [6]    n/a
QCIF          J.247 [5]    J.246 [6]    n/a

Most of the recommended models are based on a global quality evaluation of the video sequences, as in the P.NAMS and P.NBAMS projects. The predicted score is correlated with the subjective score obtained using a global evaluation method (SAMVIQ, DSCQS, ACR, etc.). Generally, the duration of the video sequences is limited to 10 s or 15 s in order to avoid a forgiveness effect (the observer cannot assess the video correctly after 30 s, and is prone to giving more weight to artefacts occurring at the end of the sequence). When a single model is used for monitoring video services, the global scores are provided for fixed temporal windows, without taking the previous scores into account.

3 MOAVI'S KEY INDICATORS FOR AUTOMATED QUALITY CHECKING

Automated quality checking is currently based on finding major video and audio artefacts. The processing is performed on the video signal and/or the bit-stream. Quality checking can be conducted before, during, and/or after the encoding process. However, in MOAVI, no MOS is provided. MOAVI key artefact indicators are classified into four categories based on their origins:

1. Capturing
2. Processing
3. Transmission
4. Displaying
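
As a rough illustration of reporting indicators individually rather than as a MOS, the four origin categories listed above could be represented as follows. The names Origin, IndicatorEvent and group_by_origin, the event fields and the sample events are hypothetical; the sketch also assumes that the origin of each detected artefact is supplied by the caller (for example, from the position of the probe in the delivery chain), which MOAVI does not prescribe.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    """The four origin categories for key artefact indicators."""
    CAPTURING = 1
    PROCESSING = 2
    TRANSMISSION = 3
    DISPLAYING = 4


@dataclass(frozen=True)
class IndicatorEvent:
    """One detected artefact (hypothetical record layout)."""
    name: str        # e.g. "blocking", "blackout"
    origin: Origin   # assumed to be known to the caller
    start_s: float   # start time in seconds
    end_s: float     # end time in seconds


def group_by_origin(events):
    """Return a per-category report; no overall score is computed."""
    report = {origin: [] for origin in Origin}
    for event in events:
        report[event.origin].append(event)
    return report


# Made-up events, for illustration only.
events = [
    IndicatorEvent("blocking", Origin.TRANSMISSION, 12.0, 13.5),
    IndicatorEvent("blackout", Origin.CAPTURING, 40.0, 42.0),
]
report = group_by_origin(events)
for origin, found in report.items():
    print(origin.name, [event.name for event in found])
```

Keeping each event separate preserves the information that a combined score would discard: which artefact occurred, when, and at which stage of the chain.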

3.1 Capturing

Artefacts are introduced during video recording. Images and video are captured using cameras comprising an optical system and a sensor with processing circuitry. Capture artefacts affect both analogue and digital systems because they arise at the front end of image acquisition. Reflected light from the object or scene forms an image on the sensor [12].

Artefacts include: blurring, flickering, exposure time distortions, ghosting, mute, shaking, rainbow effect, lip sync, blackout, clipping, and vignetting.

3.2 Processing

Processing is required to meet constraints such as the bandwidth limitations imposed by the medium, and to provide immunity against medium noise. There are many coding techniques for removing the redundancies in images and video. Coding can introduce artefacts such as reduced spatial and temporal resolution, which are the most common and dominant undesirable visible effects [12].

Artefacts include: blocking, blurring, flickering, freezing/jerkiness (jerky motion), ghosting, ringing/mosquito, colour bleeding, lip sync, clipping, and framing (pillar-boxing/letter-boxing).

3.3 Transmission

When data is transmitted through a medium, some of the data may be lost or distorted, or may arrive in multiple copies due to reflections. When data arrives through many paths in addition to the direct path, the distortion is known as multipath distortion; it affects both analogue and digital communications [12].

Artefacts include: blocking, blurring, flickering, freezing/jerkiness (jerky motion), ghosting, ringing/mosquito, mute, block missing, stripe noise, colour bleeding, lip sync, and blackout.

3.4 Displaying

As display technology developed, different display systems came to offer different subjective quality at the same resolution. With the latest display screens, the differences between OLED, LCD and SED technologies have been reduced to a minimum.
Artefacts include: block missing, stripe noise, aspect ratio error, photosensitive epilepsy flashing effect, lip sync, blackout, and framing (pillar-boxing/letter-boxing).

4 CONCLUSIONS AND NEXT STEPS

This project is still in its infancy. Questions should be addressed to the MOAVI Co-Chairs. Nine people have been involved in this activity so far. In the next step, methods for measuring distortions will be analysed. Psychophysical experiments will then be conducted for those distortions for which quantitative thresholds are missing. The resulting thresholds will be contributed to the research community by means of a published scientific paper.

REFERENCES

[1] R. Venkatesh Babu, Ajit S. Bopardikar, Andrew Perkis, and Odd Inge Hillestad, "No-reference metrics for video streaming applications," 2002.
[2] Emmanuel Wyckens, Silvio Borer, and Mikołaj Leszczuk, "MOAVI (Monitoring of Audio Visual Quality by Key Indicators) Project," VQEG, July 2012, http://www.its.bldrdoc.gov/vqeg/projects/moavi/moavi.aspx.
[3] VQEG, "The Video Quality Experts Group," July 2012, http://www.vqeg.org/.
[4] ITU-T J.144, "Objective perceptual video quality measurement techniques for digital cable television in the presence of a full reference," 2004.
[5] ITU-T J.247, "Objective perceptual multimedia video quality measurement in the presence of a full reference," 2008.
[6] ITU-T J.246, "Perceptual visual quality measurement techniques for multimedia services over digital cable television networks in the presence of a reduced bandwidth reference," 2008.
[7] ITU-T J.249, "Perceptual video quality measurement techniques for digital cable television in the presence of a reduced reference," 2010.
[8] ITU-T J.341, "Objective perceptual multimedia video quality measurement of HDTV for digital cable television in the presence of a full reference," 2011.
[9] ITU-T J.342, "Objective multimedia video quality measurement of HDTV for digital cable television in the presence of a reduced reference signal," 2011.
[10] J. Gustafsson, G. Heikkila, and M. Pettersson, "Measuring multimedia quality in mobile networks with an objective parametric model," in 15th IEEE International Conference on Image Processing (ICIP 2008), Oct. 2008, pp. 405-408.
[11] Akira Takahashi, Kazuhisa Yamagishi, and Ginga Kawaguti, "Global standardization activities: recent activities of QoS/QoE standardization in ITU-T SG12," NTT Technical Review, vol. 6, no. 9, pp. 1-5, 2008.
[12] Amal Punchihewa and Donald G. Bailey, "Artefacts in Image and Video Systems: Classification and Mitigation," in Image and Vision Computing New Zealand, 2002.