COMP 9519: Tutorial 1

1. An RGB image is converted to YUV 4:2:2 format. The YUV 4:2:2 version of the image is of lower quality than the RGB version of the image. Is this statement TRUE or FALSE? Give reasons for your answer.

2. Shown below are two 8x8 image blocks. Calculate the entropy associated with each block.

10 10 10 10 11 11 13 13
10 10 10 11 12 13 13 15
10 11 11 12 13 13 15 15
11 11 12 13 13 15 15 16
11 12 12 14 14 15 16 16
12 12 14 14 15 16 16 17
12 14 14 15 16 17 17 17
14 14 16 16 17 17 17 17

Block A Block B
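For question 2, the entropy calculation can be sketched as follows. This is a minimal sketch, assuming the intended measure is the first-order (zero-memory) entropy of the pixel-value histogram, H = -sum(p_i * log2(p_i)), in bits per pixel. The names `BLOCK` and `entropy_bits` are illustrative only, and `BLOCK` holds the one set of 64 values listed in the question.

```python
import math
from collections import Counter

# The 64 pixel values listed in question 2, read as 8 rows of 8.
BLOCK = [
    [10, 10, 10, 10, 11, 11, 13, 13],
    [10, 10, 10, 11, 12, 13, 13, 15],
    [10, 11, 11, 12, 13, 13, 15, 15],
    [11, 11, 12, 13, 13, 15, 15, 16],
    [11, 12, 12, 14, 14, 15, 16, 16],
    [12, 12, 14, 14, 15, 16, 16, 17],
    [12, 14, 14, 15, 16, 17, 17, 17],
    [14, 14, 16, 16, 17, 17, 17, 17],
]


def entropy_bits(block):
    """First-order entropy in bits per pixel: H = -sum(p * log2(p))."""
    pixels = [value for row in block for value in row]
    counts = Counter(pixels)          # histogram of pixel values
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


print(f"{entropy_bits(BLOCK):.4f} bits/pixel")
```

The same function can be applied to any other block by passing a different list of rows. Note that this measure ignores spatial correlation between neighbouring pixels.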
3. Based on 2x2 macroblocks (MB), the X motion estimation algorithm (X-MEN) searches for the best matching motion vector at the following locations: {0,0}, {-2,-2}, {+2,-2}, {-2,+2}, {+2,+2}. These locations are relative to the current MB and correspond to the center, top-left, top-right, bottom-left and bottom-right areas. Assume a search range of {-2,+2}, with the current MB in the current frame (frame[n]) as:

113 119
238 107

And the reference frame (frame[n-1]) is:

216  77  87  44 149 203  82  94 150  31 215  41  40 251  49  95
134 138 136 250 108  15 245 147  15 115  44 223  49 141 231 108
 52  38 185  69 131 154 185 115  94 183  44  61 108 102 145 152
171 178  79  64  85  13 105  11 161 228 254 165 218  51 161 144
214  96 214 223 110 106 190   7 183  70 112 247 125 159  60 183
  5 219 145 188  58  78  68  80 177  65  87 170 208 187 140 130
174 218  94  35 148 223 112   3  21 221  80 222 117  96 238 198
 97 151 179   3 194   4 238  98 116  59  93   3 117   3  85 125
212 127 139 228 135 196 174 174 113 205 100  35 115 107 167  47
128 229 113  51 163 248  54  24  90 232 151 209 105 192 100 179
181 210 177  76  53 252 214   9  39  59  31 110 230 202 160 251
109 164 158 169  97 201 160 156 172  61  10 227   1 235 178 206
 78 209 203  73 200 112  34 155 178  13 117 187  76 215 101 179
 48 168 244 120 174 127  53   4 186  20 222 175  13  94 105 124
 49  87 133  17 118  55 155   4 122 163 238  88 177 158 167  29
174  74 224 252 145 164 161  48 141  49  67  42 166 186 214 170

Where the shaded MB is the relative position of the current MB in frame[n] on the reference frame (also called the co-located MB).

Answer the following questions:
i) Calculate the sum of absolute differences at each search location of X-MEN.
ii) What will be the best motion vector given by X-MEN? Justify your answer.
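The search described in question 3 can be sketched as below. This is a rough sketch under stated assumptions, not a worked answer: the reference frame is read as 16 rows of 16 values, a motion vector {x, y} is taken to mean x columns to the right and y rows down, and the co-located MB position is NOT recoverable from the text (the shading is lost), so `TOP` and `LEFT` are arbitrary placeholders that must be replaced with the shaded position. The names `sad`, `xmen_search`, `TOP` and `LEFT` are hypothetical.

```python
REF_TEXT = """
216 77 87 44 149 203 82 94 150 31 215 41 40 251 49 95
134 138 136 250 108 15 245 147 15 115 44 223 49 141 231 108
52 38 185 69 131 154 185 115 94 183 44 61 108 102 145 152
171 178 79 64 85 13 105 11 161 228 254 165 218 51 161 144
214 96 214 223 110 106 190 7 183 70 112 247 125 159 60 183
5 219 145 188 58 78 68 80 177 65 87 170 208 187 140 130
174 218 94 35 148 223 112 3 21 221 80 222 117 96 238 198
97 151 179 3 194 4 238 98 116 59 93 3 117 3 85 125
212 127 139 228 135 196 174 174 113 205 100 35 115 107 167 47
128 229 113 51 163 248 54 24 90 232 151 209 105 192 100 179
181 210 177 76 53 252 214 9 39 59 31 110 230 202 160 251
109 164 158 169 97 201 160 156 172 61 10 227 1 235 178 206
78 209 203 73 200 112 34 155 178 13 117 187 76 215 101 179
48 168 244 120 174 127 53 4 186 20 222 175 13 94 105 124
49 87 133 17 118 55 155 4 122 163 238 88 177 158 167 29
174 74 224 252 145 164 161 48 141 49 67 42 166 186 214 170
"""
# Assumption: the 256 reference values form a 16x16 frame.
REF = [[int(v) for v in line.split()] for line in REF_TEXT.strip().splitlines()]
CUR = [[113, 119], [238, 107]]  # current 2x2 MB from frame[n]

# X-MEN candidate vectors as (dx, dy): center, top-left, top-right,
# bottom-left, bottom-right. Assumption: negative dy means "up".
CANDIDATES = [(0, 0), (-2, -2), (2, -2), (-2, 2), (2, 2)]


def sad(ref, cur, top, left):
    """Sum of absolute differences between cur and the ref block at (top, left)."""
    return sum(
        abs(cur[r][c] - ref[top + r][left + c])
        for r in range(len(cur))
        for c in range(len(cur[0]))
    )


def xmen_search(ref, cur, top, left):
    """Evaluate the five candidates; return (best vector, all SADs)."""
    sads = {}
    for dx, dy in CANDIDATES:
        t, l = top + dy, left + dx
        # Skip candidates whose block would fall outside the frame.
        if 0 <= t <= len(ref) - len(cur) and 0 <= l <= len(ref[0]) - len(cur[0]):
            sads[(dx, dy)] = sad(ref, cur, t, l)
    # min() keeps the first minimum, so a tie favours (0, 0), which is listed first.
    best = min(sads, key=sads.get)
    return best, sads


# PLACEHOLDER position only: replace with the shaded co-located MB position.
TOP, LEFT = 2, 2
best_mv, all_sads = xmen_search(REF, CUR, TOP, LEFT)
for mv, value in all_sads.items():
    print(mv, value)
print("best (dx, dy) for the placeholder position:", best_mv)
```

Breaking ties in favour of {0,0} is a design choice made here, on the reasoning that a zero vector is the cheapest to code; the question itself does not specify a tie rule.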
4. The video encoding process blocks are shown below:

Answer the following questions:
i) In which block(s) will information loss occur?
ii) Which block(s) contain the decoded version of the previous frame (Frame[N-1])?
iii) Which block(s) contain the motion compensated version of the current frame (Frame[N])?
iv) Assume the current frame is inter-coded (i.e. coded using motion estimation). What will the coded video consist of?
5. An 8x8 image block is given below. (i) Transform this block using the 2D DCT, (ii) perform quantization using a step size of 8 for all transformed coefficients, (iii) perform zig-zag scanning of the quantized coefficients to obtain (run, level) pairs, (iv) perform inverse quantization, (v) perform 2D IDCT, (vi) calculate MSE of the final inverse quantized, inverse transformed block.

10 10 10 10 10 10 10 10
12 12 12 12 12 12 12 12
20 22 24 26 28 30 32 34
22 24 26 28 30 32 34 36
22 24 26 28 30 32 34 38
26 28 30 32 34 36 38 40
28 30 32 34 36 38 40 42
30 32 34 36 38 40 50 60

The 1D DCT matrix that can be used for this question is given below.

0.3536  0.3536  0.3536  0.3536  0.3536  0.3536  0.3536  0.3536
0.4904  0.4157  0.2778  0.0975 -0.0975 -0.2778 -0.4157 -0.4904
0.4619  0.1913 -0.1913 -0.4619 -0.4619 -0.1913  0.1913  0.4619
0.4157 -0.0975 -0.4904 -0.2778  0.2778  0.4904  0.0975 -0.4157
0.3536 -0.3536 -0.3536  0.3536  0.3536 -0.3536 -0.3536  0.3536
0.2778 -0.4904  0.0975  0.4157 -0.4157 -0.0975  0.4904 -0.2778
0.1913 -0.4619  0.4619 -0.1913 -0.1913  0.4619 -0.4619  0.1913
0.0975 -0.2778  0.4157 -0.4904  0.4904 -0.4157  0.2778 -0.0975

Note that the 2D DCT can be performed by:

C X C^T
where C is the DCT matrix given above, the superscript T denotes a transpose, and X denotes the image block being transformed. To calculate the 2D inverse DCT (2D IDCT), remember that

C C^T = I

where I denotes the identity matrix; you can deduce the 2D IDCT from this relationship. Quantization can be performed by a simple integer division of the transformed coefficients by the quantization step size (i.e. integer division by 8 in this question). Alternatively, you can use the mid-tread quantizer given in Lecture 2 (with a step size of 8).

6. Using the VLC scheme detailed in the lecture slides, can you try to code some of the (run, level) pairs obtained in the previous question?

7. Assume you have a video sequence coded in the following pattern:

IBBPBBPBBPBBIBBPBBPBBPBBI

(i) If the second I frame is corrupted with error (i.e. the I frame in the middle of the above sequence), how many other frames can be degraded due to error propagation? You can assume, for example, that a portion of the data for the second I frame is missing (i.e. due to lost packets in a streaming application).
(ii) Similarly, what would be the effect of error propagation if the first B frame is corrupted with error?
(iii) How can such error propagation be limited for an MPEG-4 coded bit stream?
(iv) How can scalable coding help with error resilience in a video streaming application? Assume you have spatial scalable coding with two layers (base layer and enhancement layer) and that video is being streamed live (i.e. IPTV).
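Returning to question 5, the chain of steps (i) to (vi) can be sketched in plain Python as below. This is a hedged sketch under several assumptions that are not stated in the question: "simple integer division" is taken to truncate toward zero, the DC coefficient is included in the (run, level) pairs, trailing zeros are dropped, the zig-zag order is the conventional JPEG/MPEG-style scan, and the MSE is measured against the original block. The DCT matrix is generated from the orthonormal DCT-II formula, which rounds to the table given in the question. All function names are illustrative.

```python
import math

N = 8
X = [
    [10, 10, 10, 10, 10, 10, 10, 10],
    [12, 12, 12, 12, 12, 12, 12, 12],
    [20, 22, 24, 26, 28, 30, 32, 34],
    [22, 24, 26, 28, 30, 32, 34, 36],
    [22, 24, 26, 28, 30, 32, 34, 38],
    [26, 28, 30, 32, 34, 36, 38, 40],
    [28, 30, 32, 34, 36, 38, 40, 42],
    [30, 32, 34, 36, 38, 40, 50, 60],
]


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that C * C^T = I."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(x, c):
    return matmul(matmul(c, x), transpose(c))      # Y = C X C^T


def idct2(y, c):
    return matmul(matmul(transpose(c), y), c)      # X = C^T Y C, since C C^T = I


def quantize(y, step):
    # Assumption: integer division truncates toward zero (int(), not //).
    return [[int(v / step) for v in row] for row in y]


def dequantize(q, step):
    return [[level * step for level in row] for row in q]


def zigzag_order(n=N):
    """Index pairs along anti-diagonals, alternating direction."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))


def run_level_pairs(q):
    """(zero run before coefficient, nonzero level); trailing zeros dropped."""
    pairs, run = [], 0
    for i, j in zigzag_order(len(q)):
        if q[i][j] == 0:
            run += 1
        else:
            pairs.append((run, q[i][j]))
            run = 0
    return pairs


def mse(a, b):
    n = len(a) * len(a[0])
    return sum((a[i][j] - b[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0]))) / n


C = dct_matrix()
STEP = 8
Q = quantize(dct2(X, C), STEP)
RECON = idct2(dequantize(Q, STEP), C)
print("(run, level) pairs:", run_level_pairs(Q))
print("MSE vs original block:", mse(X, RECON))
```

Swapping `int(v / step)` for `round(v / step)` gives a rounding quantizer instead; whether that matches the mid-tread quantizer from Lecture 2 depends on the lecture's definition, which is not reproduced here.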