Research Topic
Error Concealment Techniques in H.264/AVC for Wireless Video Transmission in Mobile Networks
July 22nd, 2008
Vineeth Shetty Kolkeri, EE Graduate, UTA
Outline
1. Introduction
2. Error control
3. Error concealment techniques
4. Implementation and quality assessment metric
5. Future work
1. Introduction
Figure 1: Typical Situation of 3G/4G cellular telephony
Figure 1.1: Basic Coding Structure for H.264 [1]
1.1 H.264 Encoder
Figure 1.2: Encoder [10]
1.2 H.264 Decoder
Figure 1.3: Decoder [10]
Purpose of H.264 / MPEG-4 Part 10
1. Higher coding efficiency than previous standards (MPEG-1, MPEG-2, MPEG-4 Part 2, H.261, H.263)
2. Simple syntax specifications
3. Seamless integration of video coding into all current protocols
4. More error robustness
5. Various applications like video broadcasting, video streaming, video conferencing, D-Cinema, HDTV
6. Network friendliness
7. Balance between coding efficiency, implementation complexity and cost, based on the state of the art in VLSI design technology
Better image quality at the same compressed bitrate, or a lower compressed bitrate for the same image quality.
Figure 1.4: PSNR (between original and reconstructed pictures) and bit rate saving results of Tempete CIF 15Hz sequence for the video streaming application [10]
2. Error Control
Figure 2.1: Wireless video applications MMS, PSS and PCS: differentiation by real-time or offline processing for encoding, transmission and decoding [11]
Figure 2.2: Packet transmission in the wireless medium
Goal of error control: Overcome the effects of errors that occur while video frames are transmitted over the wireless medium, e.g. packet loss on a packet network or a wireless network.
Method used for error control: Error concealment
3. Error Concealment
1. Problem: Transmission errors may result in lost information.
2. Goal: Estimate the lost information in order to conceal the fact that an error has occurred.
3. Error concealment is performed at the decoder.
4. Observation: Video exhibits a significant amount of correlation along the spatial and temporal dimensions.
5. Basic approach: Perform some form of spatial/temporal concealment to estimate the lost information from the correctly received data.
Error Concealment (cont.)
Consider the case where a single macroblock (16x16 block of pixels) is lost. Three examples of error concealment:
1. Spatial concealment: Estimate the missing pixels by smoothly extrapolating the surrounding pixels. Correctly recovering the missing pixels is extremely difficult; however, even correctly estimating the DC (average) value is very helpful.
2. Temporal concealment: Copy the pixels at the same spatial location in the previous frame. Effective when there is no motion; potential problems when there is motion.
3. Motion-compensated temporal concealment: Estimate the missing block as the motion-compensated block from the previous frame. Can use the coded motion vector, a neighboring motion vector, or a newly computed motion vector.
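The three strategies for a single lost macroblock can be sketched as follows. This is a minimal illustration in plain Python, not code from the JM decoder: frames are assumed to be lists of rows of 8-bit luma samples, and the names `conceal_spatial_dc` and `conceal_temporal` are hypothetical. Passing `mv=(0, 0)` gives plain temporal copy; a nonzero `mv` gives motion-compensated concealment.

```python
from typing import List, Tuple

Frame = List[List[int]]  # assumed layout: frame[row][col], 8-bit luma samples


def conceal_spatial_dc(cur: Frame, top: int, left: int, size: int = 16) -> None:
    """Spatial concealment (DC only): fill the lost block with the mean
    of the correctly received pixels that border it."""
    h, w = len(cur), len(cur[0])
    border = []
    for c in range(left, left + size):
        if top > 0:
            border.append(cur[top - 1][c])       # row above the block
        if top + size < h:
            border.append(cur[top + size][c])    # row below the block
    for r in range(top, top + size):
        if left > 0:
            border.append(cur[r][left - 1])      # column to the left
        if left + size < w:
            border.append(cur[r][left + size])   # column to the right
    dc = round(sum(border) / len(border)) if border else 128  # mid-grey fallback
    for r in range(top, top + size):
        for c in range(left, left + size):
            cur[r][c] = dc


def conceal_temporal(cur: Frame, prev: Frame, top: int, left: int,
                     size: int = 16, mv: Tuple[int, int] = (0, 0)) -> None:
    """Temporal concealment: copy the block from the previous frame,
    displaced by mv = (dy, dx). Coordinates are clamped to the frame."""
    h, w = len(prev), len(prev[0])
    dy, dx = mv
    for r in range(size):
        for c in range(size):
            rr = min(max(top + r + dy, 0), h - 1)
            cc = min(max(left + c + dx, 0), w - 1)
            cur[top + r][left + c] = prev[rr][cc]
```

The DC fill only restores the average brightness, which matches the slide's point that even a correct DC estimate helps. Choosing `mv` (coded, neighboring, or newly searched) is left to the caller.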
3.1 Motion Vector Extrapolation (MVE)
1. Compensate the missing MB by extrapolating each MV stored for the previously decoded frame.
2. The process operates on 8x8 sub-blocks.
3. For each sub-block, the MV with the largest overlap is selected. If there is no overlap, the zero MV is used.
Figure 3.1: Motion vectors from the previous frame [4]
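The overlap rule of MVE can be sketched as below. This is an illustration under stated assumptions rather than the method of [4]: each MV is taken to be `(dy, dx)` pointing from a macroblock in the previous frame to its reference, so a macroblock at position `p` is assumed to continue its motion and land at `p - mv` in the lost frame. The function names and the dictionary layout of `prev_mvs` are hypothetical.

```python
from typing import Dict, Tuple

Pos = Tuple[int, int]  # (top, left) of a block
MV = Tuple[int, int]   # assumed convention: (dy, dx) pointing to the reference


def overlap_area(a_top: int, a_left: int, a_size: int,
                 b_top: int, b_left: int, b_size: int) -> int:
    """Area in pixels shared by two square blocks (0 if they do not meet)."""
    rows = min(a_top + a_size, b_top + b_size) - max(a_top, b_top)
    cols = min(a_left + a_size, b_left + b_size) - max(a_left, b_left)
    return max(rows, 0) * max(cols, 0)


def extrapolate_mvs(prev_mvs: Dict[Pos, MV], mb_top: int, mb_left: int,
                    mb_size: int = 16, sub: int = 8) -> Dict[Pos, MV]:
    """Pick one MV per 8x8 sub-block of the lost macroblock.

    Every macroblock of the previous frame is projected along its own
    motion into the lost frame. Each sub-block takes the MV of the
    projected macroblock that covers most of it, or the zero MV when
    nothing overlaps."""
    chosen: Dict[Pos, MV] = {}
    for sy in range(mb_top, mb_top + mb_size, sub):
        for sx in range(mb_left, mb_left + mb_size, sub):
            best_mv, best_area = (0, 0), 0
            for (py, px), (dy, dx) in prev_mvs.items():
                ey, ex = py - dy, px - dx  # extrapolated position in the lost frame
                area = overlap_area(sy, sx, sub, ey, ex, mb_size)
                if area > best_area:
                    best_mv, best_area = (dy, dx), area
            chosen[(sy, sx)] = best_mv
    return chosen
```

The returned MVs can then drive motion-compensated copying of each sub-block from the previous frame.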
4. Error Concealment: MB Missing
Zero MV
1. Replaces the missing MV with (0,0).
2. Copies a macroblock from the previously reconstructed reference slice at the exact same position.
Figure 4.1: Zero MV concealment in dispersed FMO slices
4.1 Error Concealment: Frame Missing
1. Temporal Replacement
Copy an MB/frame from the previously reconstructed reference slice at the exact same position.
2. Motion Vector Copy
Exploits the MVs of a few past frames.
Estimate the MV of each pixel in the last successfully decoded frame.
Project the last frame onto an estimate of the missing frame.
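Both whole-frame strategies can be sketched in a few lines. The version below is a simplification and an assumption, not the JM implementation: it reuses one MV per block of the last decoded frame instead of estimating an MV per pixel, and the name `conceal_lost_frame` and the `last_mvs` layout are hypothetical.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]  # assumed layout: frame[row][col]


def conceal_lost_frame(last: Frame,
                       last_mvs: Dict[Tuple[int, int], Tuple[int, int]],
                       block: int = 16,
                       use_motion: bool = True) -> Frame:
    """Estimate a completely lost frame from the last decoded frame.

    use_motion=False: temporal replacement (frame copy).
    use_motion=True:  motion vector copy, where each block reuses the
                      MV (dy, dx) that the co-located block had in the
                      last frame and fetches its pixels from there."""
    h, w = len(last), len(last[0])
    out = [row[:] for row in last]  # frame copy is the starting point
    if not use_motion:
        return out
    for (top, left), (dy, dx) in last_mvs.items():
        for r in range(top, min(top + block, h)):
            for c in range(left, min(left + block, w)):
                rr = min(max(r + dy, 0), h - 1)  # clamp to the frame
                cc = min(max(c + dx, 0), w - 1)
                out[r][c] = last[rr][cc]
    return out
```

Blocks without an entry in `last_mvs` keep their frame-copy content, so the function degrades to temporal replacement when no motion information is available.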
4.1.1 Temporal Replacement - Frame Copy
Figure 4.2: Frames #5, 6 and 7 of the original sequence
Figure 4.3: Decoded sequence. Frame #5 was decoded successfully and frame #6 was lost. Frame #6 was reconstructed by frame copy. Frame #7 is degraded.
"Inter" temporal prediction: block-based motion estimation and compensation
1. Multiple reference pictures
2. Reference P pictures
3. Arbitrary referencing order
4. Variable block sizes for motion compensation. Seven block sizes: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4
5. 1/4-sample luma interpolation (1/4 or 1/8-sample chroma interpolation in 4:2:0 format)
6. Weighted prediction
7. Frame or field based motion estimation for interlaced scanned video
4.1.2 Motion Vector Copy
Figure 4.4: Frames #5, 6 and 7 of the original sequence
Figure 4.5: Decoded sequence. Frame #5 was decoded successfully and frame #6 was lost. Frame #6 was reconstructed by the motion copy algorithm. Frame #7 is degraded.
Figure 4.7: Frame divided into multiple macroblocks of variable size (16 x 16, 8 x 8, 4 x 4) to represent coding profiles
No. of bits in I and P frames
Figure 4.8: Size of the different I and P frames obtained after encoding 19 frames of the Foreman QCIF video sequence. The green line shows the average number of bits lost when the encoded video sequence is passed through the lossy algorithm.
Figure 4.9: Comparison of the recovered frames using Frame copy using SSIM [plot: vertical axis 0.89 to 0.97, horizontal axis 1 to 20; Series1, Series2]
Figure 4.10: Comparison of the recovered frames using Motion Estimation using SSIM [plot: vertical axis 0.9 to 0.97, horizontal axis 1 to 20; Series1, Series2]
Figure 4.11: Comparison of the recovered frames using Frame copy using PSNR [plot: vertical axis 28 to 38, horizontal axis 1 to 20; Series1, Series2]
Figure 4.12: Comparison of the recovered frames using Motion Estimation using PSNR [plot: vertical axis 0 to 40, horizontal axis 1 to 20; Series1, Series2]
Figure 4.13: Comparison of the recovered frames using Frame copy using MSE [plot: vertical axis 0 to 60, horizontal axis 1 to 20; Series1, Series2]
Figure 4.14: Comparison of the recovered frames using Frame copy using MSE [plot: vertical axis 0 to 60, horizontal axis 1 to 20; Series1, Series2, Series3]
5. Different Error Concealment Techniques
Ref: I. C. Todoli, Performance of Error Concealment Methods for Wireless Video, Diploma Thesis, Vienna University of Technology, 2007 [5]
Figure panels: Original; Error; Copy-paste; Weighted average; Boundary matching; Block matching; Decode without residuals; Decode I frame without residuals
Weighted averaging: A frequently used method. Each pixel of a missing block is interpolated as a linear combination of the nearest pixels in the boundaries.
Decode I frame without residuals: If the residuals are lost but the type of prediction is correctly received, the simplest approach is to decode the missing block with the missing residuals set to zero. This scenario occurs if data partitioning is used for H.264/AVC and motion vectors are better protected than the residuals. Decoding without residuals performs well if the missing residuals were small.
Copy-paste: The missing blocks of one frame are replaced by the spatially corresponding blocks from the previous frame.
Boundary matching: The motion vectors of the missing block as well as those of its neighbors are unknown. The block is concealed by searching, within the search area in the previous frame, for the coordinates of the best match to the boundary of the missing block.
Block matching: Better results can be obtained by searching for the best match using the correctly received macroblocks.
Decode without residuals: If the residuals are lost but the motion vectors are correctly received, the simplest approach is to decode the missing block with the missing residuals set to zero.
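Two of the techniques above, weighted averaging and boundary matching, can be sketched in plain Python. Both functions are illustrations under assumptions, not the implementations evaluated in [5]: the weighting (each boundary pixel weighted by the distance to the opposite boundary), the sum of absolute differences over a one-pixel-wide ring, the search range, and all function names are choices made for this sketch.

```python
from typing import List, Tuple

Frame = List[List[int]]  # assumed layout: frame[row][col]


def conceal_weighted_average(cur: Frame, top: int, left: int, size: int = 16) -> None:
    """Interpolate each lost pixel from the nearest boundary pixel above,
    below, left and right; the closer boundary gets the larger weight."""
    h, w = len(cur), len(cur[0])
    for r in range(size):
        for c in range(size):
            num = total = 0
            if top > 0:                      # nearest pixel above
                num += (size - r) * cur[top - 1][left + c]; total += size - r
            if top + size < h:               # nearest pixel below
                num += (r + 1) * cur[top + size][left + c]; total += r + 1
            if left > 0:                     # nearest pixel to the left
                num += (size - c) * cur[top + r][left - 1]; total += size - c
            if left + size < w:              # nearest pixel to the right
                num += (c + 1) * cur[top + r][left + size]; total += c + 1
            # Integer rounding; mid-grey if the block fills the whole frame.
            cur[top + r][left + c] = (num + total // 2) // total if total else 128


def ring(top: int, left: int, size: int, h: int, w: int) -> List[Tuple[int, int]]:
    """Coordinates of the one-pixel-wide boundary around a block, clipped to the frame."""
    return [(r, c)
            for r in range(top - 1, top + size + 1)
            for c in range(left - 1, left + size + 1)
            if not (top <= r < top + size and left <= c < left + size)
            and 0 <= r < h and 0 <= c < w]


def conceal_boundary_matching(cur: Frame, prev: Frame, top: int, left: int,
                              size: int = 16, search: int = 4) -> Tuple[int, int]:
    """Find the displacement whose ring in the previous frame best matches
    the ring around the lost block, then copy the block it encloses."""
    h, w = len(cur), len(cur[0])
    pts = ring(top, left, size, h, w)
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > h or l + size > w:
                continue  # candidate block leaves the frame
            if any(not (0 <= r + dy < h and 0 <= c + dx < w) for r, c in pts):
                continue  # candidate ring leaves the frame
            cost = sum(abs(cur[r][c] - prev[r + dy][c + dx]) for r, c in pts)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    dy, dx = best_mv
    for r in range(size):
        for c in range(size):
            cur[top + r][left + c] = prev[top + dy + r][left + dx + c]
    return best_mv
```

Weighted averaging needs only the current frame, so it also applies to I frames. Boundary matching needs a previous frame but can follow motion.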
6. Implementation and Video Quality Analysis of the Received Sequences
1. Tested frame copy and motion estimation in the decoder.
2. Implemented a lossy algorithm for creating the error slices in the frame.
3. Implementing the error concealment algorithms in the decoder of JM 13.2.
4. Compare the frames recovered by each error concealment technique using the following metrics:
MSE: The mean squared error measures the difference between two images. It can be applied to digital video by averaging the results for each frame.
PSNR: The most commonly used objective quality metric is the Peak Signal to Noise Ratio (PSNR). For a video sequence of frames, it is derived from the MSE; for 8-bit samples, PSNR = 10 log10(255^2 / MSE) dB.
SSIM: This approach emphasizes that the Human Visual System (HVS) is highly adapted to extract structural information from visual scenes. Therefore, a measurement of structural similarity (or difference) should provide a good approximation to perceptual image quality.
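The three metrics can be sketched in plain Python as follows. This is a minimal illustration, not the measurement code used for the figures: frames are assumed to be lists of rows of 8-bit samples, the function names are hypothetical, and `ssim_global` is a simplification that computes the SSIM formula once over the whole frame instead of averaging it over local windows as the SSIM index normally does. The constants 0.01 and 0.03 are the usual SSIM defaults.

```python
import math
from typing import List, Sequence

Frame = List[List[int]]  # assumed layout: frame[row][col], 8-bit samples


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two frames of equal size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def sequence_mse(seq_a: Sequence[Frame], seq_b: Sequence[Frame]) -> float:
    """MSE of a video sequence: the per-frame MSE averaged over all frames."""
    return sum(mse(a, b) for a, b in zip(seq_a, seq_b)) / len(seq_a)


def psnr(a: Frame, b: Frame, peak: int = 255) -> float:
    """PSNR in dB; infinite when the two frames are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)


def ssim_global(a: Frame, b: Frame, peak: int = 255) -> float:
    """Single-window SSIM over the whole frame (simplified sketch)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) * (x - mx) for x in xs) / n          # variance of a
    vy = sum((y - my) * (y - my) for y in ys) / n          # variance of b
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2        # stabilizing constants
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx * mx + my * my + c1) * (vx + vy + c2))
```

MSE and PSNR compare pixels one by one, while SSIM compares means, variances and covariance, which is why it tends to track perceived quality more closely.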
Future Work
1. Implement the various error concealment algorithms using the JM 13.2 software.
2. Evaluate the quality of the recovered frames using various techniques.
3. Compare the computational complexity of the different algorithms.
References
1. T. Stockhammer, M. M. Hannuksela and T. Wiegand, "H.264/AVC in Wireless Environments", IEEE Trans. Circuits and Systems for Video Technology, vol. 13, pp. 657-673, July 2003.
2. Soon-kak Kwon, A. Tamhankar and K. R. Rao, "Overview of H.264 / MPEG-4 Part 10", J. Visual Communication and Image Representation, vol. 17, pp. 186-216, April 2006.
3. S. Wenger, "H.264/AVC over IP", IEEE Trans. Circuits and Systems for Video Technology, vol. 13, pp. 645-656, July 2003.
4. M. Wada, "Selective Recovery of Video Packet Loss using Error Concealment", IEEE Journal on Selected Areas in Communication, vol. 7, pp. 807-814, June 1989.
5. I. C. Todoli, "Performance of Error Concealment Methods for Wireless Video", Diploma Thesis, Vienna University of Technology, 2007.
6. Video Trace research group at ASU, YUV video sequences, http://trace.eas.asu.edu/yuv/index.html.
7. A. B. Watson, "Toward a perceptual video quality metric", SPIE Human Vision, Visual Processing, and Digital Display VIII, 3299, pp. 139-147, 1998.
8. F. Xiao, "DCT-based video quality evaluation", Final Project for EE392J, Stanford Univ., 2000. http://compression.ru/video/quality_measure/vqm.pdf
9. Z. Wang, "The SSIM index for image quality assessment", http://www.cns.nyu.edu/zwang/files/research/ssim/.
10. Soon-kak Kwon, A. Tamhankar and K. R. Rao, "Overview of H.264 / MPEG-4 Part 10", J. Visual Communication and Image Representation, vol. 17, pp. 186-216, April 2006.
11. T. Stockhammer, M. M. Hannuksela and T. Wiegand, "H.264/AVC in Wireless Environments", IEEE Trans. Circuits and Systems for Video Technology, vol. 13, pp. 657-673, July 2003.
12. Z. Wang, et al., "Image Quality Assessment: From Error Visibility to Structural Similarity", IEEE Trans. Image Processing, vol. 13, pp. 600-612, April 2004.