Lecture/Lab Session 2
Inputs and Outputs
May 4, 2009

Outline
  Review
  Inputs of Encoders: Formats
  Outputs of Decoders: Perceptual Quality Issue
  MATLAB Exercises
    Reading and showing images and video sequences
    Getting familiar with color spaces
    Examining spatial and temporal redundancies
    Examining psychovisual redundancies
    Measuring perceptual quality with PSNR

Review

Image and video coding: A big picture
  [Diagram: block chart of the coding chain with the labels Input, Pre-Processing, Lossy, Predictive, Lossless, Post-Processing, Encoded (110 11001), Decoded, and Visual Quality Measurement]
Some terms
  Luminance = Brightness = Lightness = Intensity = Value
  Hue:
  Saturation = Colorfulness = Chroma = Vividness = Purity
  Chromaticity = Hue + Saturation, so it is a 2-tuple.
  Chromaticity diagram: a 2-D diagram showing chromaticities
  Color = Luminance + Chromaticity, so it is a 3-tuple.
  Chrominance (Chroma) = Color difference
  Color space: a 3-D space of colors
  Color mixing systems (light) vs. color appearance systems (perception)
  Gamut: range of colors (in a color space)
  JND: Just-Noticeable Difference (50% accuracy)

HSV/HSL (HSI/HSB) color spaces
  [Figures: HSV color space; HSL color space]

sRGB vs. CIE 1931 XYZ color spaces

Inputs of Encoders: Formats
Color conversion in image/video encoding
  Color conversion: R'G'B' => Y'CbCr
  [Diagram: Input -> Pre-Processing (A/D Conversion, Color Space Conversion, Chroma Subsampling) -> Other Parts of Encoding Process -> Encoded (001110101001); color spaces along the chain: RGB, R'G'B', Y'UV / Y'PbPr, Y'CbCr]

  R'G'B' => Y'PbPr
    R', G', B' in [0,1]; k_r, k_b in (0,1); k_r + k_b < 1
    $$Y' = k_r R' + (1 - k_r - k_b)\,G' + k_b B' \in [0, 1]$$
    $$P_b = 0.5\,\frac{B' - Y'}{1 - k_b} \in [-0.5, 0.5]$$
    $$P_r = 0.5\,\frac{R' - Y'}{1 - k_r} \in [-0.5, 0.5]$$
    The values of k_b and k_r differ between standards:
      ITU-R BT.601 (SDTV): k_r = 0.299, k_b = 0.114
      ITU-R BT.709 (HDTV): k_r = 0.2126, k_b = 0.0722
      ANSI/SMPTE 240M-1995 (HDTV): k_r = 0.212, k_b = 0.087

  Y'PbPr => Y'CbCr (taking MPEG-2 as an example)
    $$Y'_{\text{digital}} = 219\,Y' + 16 \in [16, 235]$$
    $$C_b = 224\,P_b + 128 \in [16, 240]$$
    $$C_r = 224\,P_r + 128 \in [16, 240]$$

Chroma subsampling formats
  Example: a 4×2 image
    P(1,1) P(1,2) P(1,3) P(1,4)
    P(2,1) P(2,2) P(2,3) P(2,4)

  Format  Y  Cb  Cr   Y sampling locations   Cb sampling locations          Cr sampling locations
  4:4:4   4  4   4    ALL                    ALL                            ALL
  4:2:2   4  2   2    ALL                    P(1,1) P(1,3) P(2,1) P(2,3)    P(1,1) P(1,3) P(2,1) P(2,3)
  4:2:0   4  1   1    ALL                    P(1,1) P(1,3)                  P(1,1) P(1,3)
  4:1:1   4  1   1    ALL                    P(1,1) P(2,1)                  P(1,1) P(2,1)

Progressive vs. Interlacing
  Progressive mode: each row of a frame is scanned one by one.
  Interlacing mode: a frame is divided into two fields; odd rows are scanned first and even rows later.
    Benefit: bandwidth saving close to 1/2 (figures below use 1 G = 2^30)
      1920×1080, 60 Hz interlacing HDTV: 1920/2 × 1080 × 60 × 24 = 1492992000 ≈ 1.4 G bits/second
      1280×720, 60 Hz progressive HDTV: 1280 × 720 × 60 × 24 = 1327104000 ≈ 1.236 G bits/second
    Problem: [illustrated by a figure on the slide]
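The raw bit-rate arithmetic and the odd/even row split of interlacing can be checked with a few lines of MATLAB. This is only a sketch: the toy 6×4 frame and all variable names are made up for illustration, and the top field is assumed to hold rows 1, 3, 5, ... (1-based).

```matlab
% Raw bit rates from the slide, at 24 bits per pixel.
bitsPerPixel = 24;

% 1920x1080 interlaced at 60 Hz: every field carries half of the rows,
% so each of the 60 fields per second holds 1920*(1080/2) pixels.
rateInterlaced  = 1920 * (1080/2) * 60 * bitsPerPixel;   % bits per second
% 1280x720 progressive at 60 Hz: 60 full frames per second.
rateProgressive = 1280 * 720 * 60 * bitsPerPixel;        % bits per second

% The slide's "G" figures correspond to dividing by 2^30.
fprintf('1080i60: %d bit/s = %.3f G bit/s\n', rateInterlaced,  rateInterlaced  / 2^30);
fprintf('720p60 : %d bit/s = %.3f G bit/s\n', rateProgressive, rateProgressive / 2^30);

% Splitting a frame into two fields (toy 6x4 frame, hypothetical data).
frame       = reshape(1:24, 4, 6)';   % row r holds the values 4*(r-1)+1 ... 4*r
topField    = frame(1:2:end, :);      % odd rows 1, 3, 5
bottomField = frame(2:2:end, :);      % even rows 2, 4, 6

% Weaving the two fields back together restores the original frame.
woven = zeros(size(frame));
woven(1:2:end, :) = topField;
woven(2:2:end, :) = bottomField;
```

Each field has half the rows of the frame, which is where the bandwidth saving of close to 1/2 comes from when fields are sent at the rate at which full frames would otherwise be sent.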
Interlacing mode: Frames vs. Fields
  A frame may be divided into two fields: top field + bottom field.

Digital image formats
  No standard spatial resolutions
    For research purposes: 2^n × 2^n, such as 128×128, 256×256, 512×512, 1024×1024
  Uncompressed images
    BMP (*.bmp): fileheader + infoheader + [palette] + data (row by row, bottom-up)
    Netpbm formats: PBM/PGM/PPM (*.pbm/*.pgm/*.ppm)
  Losslessly compressed images
    TIFF = Tagged Image File Format (*.tiff/*.tif)
    PNG = Portable Network Graphics (*.png)
    GIF = Graphics Interchange Format (*.gif), up to 256 colors
  Lossily compressed images
    JPEG (*.jpg), JPEG2000 (*.jp2/*.j2k)
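To make the BMP layout (fileheader + infoheader + [palette] + data) more concrete, the sketch below writes a small test image with imwrite and then reads a few header fields back with fread. The byte offsets are those of the common Windows BMP header; the test image, the temporary file name and the variable names are assumptions for illustration only.

```matlab
% Write a small 8-bit gradient image to a temporary BMP file.
img   = uint8(repmat(0:15:255, 12, 1));     % 12 rows x 18 columns
fname = [tempname '.bmp'];
imwrite(img, fname);

% BMP stores multi-byte fields in little-endian byte order.
fid = fopen(fname, 'r', 'ieee-le');
signature  = fread(fid, 2, 'uint8=>char')'; % fileheader: 'BM'
fileSize   = fread(fid, 1, 'uint32');       % fileheader: total file size in bytes
fseek(fid, 10, 'bof');
dataOffset = fread(fid, 1, 'uint32');       % fileheader: offset of the pixel data
infoSize   = fread(fid, 1, 'uint32');       % infoheader: size of the infoheader itself
width      = fread(fid, 1, 'int32');        % infoheader: number of columns
height     = fread(fid, 1, 'int32');        % infoheader: number of rows (> 0: bottom-up)
fseek(fid, 28, 'bof');
bitCount   = fread(fid, 1, 'uint16');       % infoheader: bits per pixel
fclose(fid);
delete(fname);

fprintf('%s: %d x %d, %d bits/pixel, data starts at byte %d\n', ...
        signature, width, abs(height), bitCount, dataOffset);
```

For an image with a palette, the bytes between the end of the infoheader and dataOffset hold the palette entries.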
Digital video formats
  CIF = Common Intermediate Format (since H.261)
    CIF (Full CIF = FCIF) = 352×288
    QCIF (Quarter CIF) = 176×144
    SQCIF (Sub Quarter CIF) = 128×96
    4CIF = 4 × CIF = 704×576
    16CIF = 4 × 4CIF = 1408×1152
  SIF = Source Input Format (since MPEG-1)
    625/50 (TV: PAL/SECAM) = 352×288 / 360×288
    525/59.94 (TV: NTSC) = 352×240 / 360×240
    Sub-SIF (Computers) = 320×288 or 384×288

YUV video file format (.yuv/.cif/.qcif/.sif/...)
  Planar formats
    YUV = YV12 = I420 = IYUV (4:2:0)
      Note: YV12 stores the V plane before the U plane; I420/IYUV store U before V.
    YV16 (4:2:2)
  Packed formats
    UYVY = UYNV = Y422 (4:2:2)
    YUY2 = YUNV = V422 = YUYV (4:2:2)
  More info available at http://www.fourcc.org/yuv.php

YUV4MPEG2 format (.y4m)
  File Header
    File signature: YUV4MPEG2
    Parameters
      Width, height and frame rate: Wxxx Hyyy Fa:b
      Interlacing: Ip (progressive), It (top field first), Ib (bottom field first), Im (mixed mode, detailed in frame headers)
      Aspect ratio: Aa:b
      Color space (chroma format): C4xx...
      Comment: X...
  Frames
    Frame Header: FRAME + a number of parameters (optional) + 0x0A
    Frame (YUV planar format)
  More information is available at http://wiki.multimedia.cx/index.php?title=yuv4mpeg2

Multimedia container/wrapper formats
  AVI = Audio Video Interleave (*.avi)
  FLV = Flash Video (*.flv)
  ASF = Advanced Systems Format (*.asf)
  MPEG-TS (Transport Stream) & MPEG-PS (Program Stream) (*.mpg/*.ts/*.ps)
  MP4 = MPEG-4 Part 14 (*.mp4) => 3GP (*.3gp/*.3g2)
  MOV (QuickTime) (*.mov)
  RealMedia (*.rm/*.rmvb)
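The YUV4MPEG2 file header described above is a single line of space-separated parameters, so it can be parsed with ordinary string functions. The sketch below works on a hypothetical header string (in a real file it would be the text up to the first 0x0A byte); the struct field names are made up, and only the W, H, F, I and C parameters are interpreted.

```matlab
% Hypothetical header line of a QCIF stream (not taken from a real file).
headerLine = 'YUV4MPEG2 W176 H144 F30000:1001 Ip A1:1 C420jpeg';

tokens = strsplit(headerLine, ' ');
assert(strcmp(tokens{1}, 'YUV4MPEG2'), 'Not a YUV4MPEG2 stream');

info = struct('width', [], 'height', [], 'fps', [], ...
              'interlacing', '', 'chroma', '');
for k = 2:numel(tokens)
    tag   = tokens{k}(1);       % first character selects the parameter
    value = tokens{k}(2:end);   % the rest is its value
    switch tag
        case 'W'
            info.width = str2double(value);
        case 'H'
            info.height = str2double(value);
        case 'F'                % frame rate given as a ratio a:b
            ab = sscanf(value, '%d:%d');
            info.fps = ab(1) / ab(2);
        case 'I'
            info.interlacing = value;
        case 'C'
            info.chroma = value;
        otherwise
            % A (aspect ratio), X (comment) and others are ignored in this sketch.
    end
end

% For 4:2:0 planar data, a frame holds one full-size Y plane and two
% quarter-size chroma planes, i.e. width*height*3/2 bytes at 8 bits per sample.
if strncmp(info.chroma, '420', 3)
    frameBytes = info.width * info.height * 3 / 2;
end
```

A player would then repeatedly read a frame header line (starting with FRAME and ending with 0x0A) followed by frameBytes bytes of planar data.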
Outputs of Decoders: Perceptual Quality Issue

Image and video coding: Quality issue
  [Diagram: Input image/video -> Encoder -> Encoded image/video -> Decoder -> Decoded image/video; Visual Quality Measurement (quality metrics) is applied to the input and the decoded image/video]

Visual quality measurement: Subjective
  DSIS (Double Stimulus Impairment Scale)
  DSCQS (Double Stimulus Continuous Quality Scale)
  SSCQE (Single Stimulus Continuous Quality Evaluation)
  Some measurement methods have been standardized: ITU-R BT.500, ITU-R BT.710, ITU-T P.910

Visual quality measurement: Objective
  Two images of size M×N: the original one f(x,y) and the decoded one f'(x,y); L is the peak signal value (e.g. 255 for 8-bit images).
  MSE = Mean Squared Error
    $$\mathrm{MSE} = \frac{1}{MN} \sum_{x,y} \bigl(f'(x,y) - f(x,y)\bigr)^2$$
  SNR = Signal-to-Noise Ratio
    $$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{x,y} f'(x,y)^2}{\sum_{x,y} \bigl(f'(x,y) - f(x,y)\bigr)^2}$$
  PSNR = Peak Signal-to-Noise Ratio
    $$\mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}} = 10 \log_{10} \frac{MN\,L^2}{\sum_{x,y} \bigl(f'(x,y) - f(x,y)\bigr)^2}$$
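The three objective measures can be computed directly from their definitions, with PSNR taken as 10·log10(L²/MSE). The following is a minimal sketch on synthetic data: the gradient image, the ±2 checkerboard distortion and the variable names are made up for illustration, and L = 255 assumes 8-bit samples.

```matlab
% Synthetic "original" image f and "decoded" image fd (hypothetical data).
% When working with real uint8 images, convert to double first:
% uint8 arithmetic saturates, so fd - f would lose all negative errors.
f = double(repmat(10:240, 64, 1));
[x, y] = meshgrid(1:size(f, 2), 1:size(f, 1));
fd = f + 2 * (-1).^(x + y);            % error of +2 or -2 at every pixel

L   = 255;                             % peak value for 8-bit samples
err = fd - f;

mseVal  = mean(err(:).^2);                                % mean squared error
snrVal  = 10 * log10(sum(fd(:).^2) / sum(err(:).^2));     % SNR in dB
psnrVal = 10 * log10(L^2 / mseVal);                       % PSNR in dB

fprintf('MSE = %.3f, SNR = %.2f dB, PSNR = %.2f dB\n', mseVal, snrVal, psnrVal);
```

Because PSNR depends only on the MSE and the peak value, two decoded images with very different kinds of distortion can receive the same PSNR, which is one reason it does not always agree with perceived quality.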
Visual quality measurement: Objective
  [Figure: the original image; PSNR = 32.7 dB; PSNR = 37.5 dB]
  SSIM = Structural Similarity Index
    $$\mathrm{SSIM} = \frac{\bigl(2\,E(f(x,y))\,E(f'(x,y)) + C_1\bigr)\bigl(2\sigma_{f,f'} + C_2\bigr)}{\bigl(E(f(x,y))^2 + E(f'(x,y))^2 + C_1\bigr)\bigl(\sigma_f^2 + \sigma_{f'}^2 + C_2\bigr)}$$
  VQM = Video Quality Metric
  MPQM = Moving Pictures Quality Metric
  NQM = Noise Quality Measure
  Research question: PSNR does not reflect visual quality faithfully; more advanced metrics that take the HVS (human visual system) into account are wanted.
  Some objective metrics are being standardized by the ITU-T.

MATLAB Exercises

Reading and Showing
  Read an image. Show an image.
    f=imread('Images/lena_color.bmp');
    imshow(f);
  Read a frame from a YUV video. Show a video frame.
    f=yuvread('Video/news.qcif',1);
    imshow(ycbcr2rgb(yuv4xx_444(f)));
  Play back a YUV video.
    yuvplay('Video/news.qcif','All');
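yuvread, yuv4xx_444 and yuvplay are not built-in MATLAB functions, so they are presumably supplied with the course material. As a rough illustration of what reading one frame of a raw 4:2:0 planar file involves, the sketch below uses only fopen/fread. It is not the course's yuvread: the synthetic one-frame file, the I420 plane order (Y, then U, then V) and all variable names are assumptions.

```matlab
width = 176; height = 144;                 % QCIF
fname = [tempname '.yuv'];

% Create a synthetic one-frame I420 file. Raw video is stored row by row,
% whereas MATLAB matrices are column-major, hence the transposes.
Ysrc = uint8(repmat(linspace(0, 255, width), height, 1));   % horizontal ramp
Usrc = uint8(64  * ones(height/2, width/2));
Vsrc = uint8(192 * ones(height/2, width/2));
fid = fopen(fname, 'w');
fwrite(fid, Ysrc', 'uint8');
fwrite(fid, Usrc', 'uint8');
fwrite(fid, Vsrc', 'uint8');
fclose(fid);

% Read frame number frameIndex (1-based) back from the file.
frameIndex = 1;
frameBytes = width * height * 3 / 2;       % Y plane + two quarter-size chroma planes
fid = fopen(fname, 'r');
fseek(fid, (frameIndex - 1) * frameBytes, 'bof');
Y = fread(fid, [width,   height],   'uint8=>uint8')';
U = fread(fid, [width/2, height/2], 'uint8=>uint8')';
V = fread(fid, [width/2, height/2], 'uint8=>uint8')';
fclose(fid);
delete(fname);
```

To display such a frame in color, the chroma planes first have to be upsampled to the size of the Y plane (the role yuv4xx_444 appears to play in the exercise) before converting to RGB.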
Getting familiar with color spaces
  Show YUV planes of a video.
    f=yuvread('test.cif',1);
    subplot(2,2,1:2); imshow(f.y);
    subplot(2,2,3); imshow(f.u);
    subplot(2,2,4); imshow(f.v);
  Try to add pseudo-colors to U- and V-planes.
    Tip: assume Y=0.5 and the other chroma channel is 0.
  Try to read an RGB image and convert it to YUV color space.

YUV video player => Y4M video player
  Read the code of the YUV video player.
    Two files: YUVplayer.m and YUVplayer.fig.
    Run "guide YUVplayer.fig" to open the second file.
    Read the MATLAB help documents to learn how to design a GUI with MATLAB.
  Try to implement a Y4M video player.
    This can be a take-home assignment.
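As a possible starting point for the RGB-to-YUV exercise, the sketch below applies the conversion formulas from the color conversion slide (R'G'B' to Y'PbPr, then to 8-bit Y'CbCr with the MPEG-2 scaling) and keeps the 4:2:0 chroma samples at P(1,1) and P(1,3). It assumes the BT.601 coefficients k_r = 0.299 and k_b = 0.114; the 4×2 test image (red, green, blue and white columns) and the variable names are made up, and a real image loaded with imread would first be scaled to [0,1], e.g. with im2double.

```matlab
% A 4x2 test image (4 columns, 2 rows): red, green, blue, white columns.
R = [1 0 0 1; 1 0 0 1];
G = [0 1 0 1; 0 1 0 1];
B = [0 0 1 1; 0 0 1 1];

kr = 0.299;  kb = 0.114;          % BT.601 (SDTV) coefficients

% R'G'B' in [0,1] -> Y' in [0,1], Pb and Pr in [-0.5,0.5]
Y  = kr*R + (1 - kr - kb)*G + kb*B;
Pb = 0.5 * (B - Y) / (1 - kb);
Pr = 0.5 * (R - Y) / (1 - kr);

% Y'PbPr -> 8-bit Y'CbCr (MPEG-2 ranges: Y' in [16,235], Cb/Cr in [16,240])
Yd = round(219*Y  + 16);
Cb = round(224*Pb + 128);
Cr = round(224*Pr + 128);

% 4:2:0 subsampling by keeping the chroma samples at P(1,1) and P(1,3),
% i.e. every second row and every second column.
Cb420 = Cb(1:2:end, 1:2:end);
Cr420 = Cr(1:2:end, 1:2:end);

disp([Yd(1,:); Cb(1,:); Cr(1,:)]);   % rows: Y', Cb, Cr for red, green, blue, white
```

Simply dropping chroma samples is the easiest way to subsample; averaging each 2×2 block before decimation is a common alternative that reduces aliasing.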