Overview: Video Coding Standards
- Video coding standards: applications and common structure
- ITU-T Rec. H.261
- ISO/IEC MPEG-1
- ISO/IEC MPEG-2
- State-of-the-art: H.264/AVC
Video Coding Standards no. 1
Applications of Video Compression
- Efficient and flexible video compression standard needed
Adapted from [Srinivasan et al., 2004]
Applications of Video Compression

Application                          Bit-rate                            Standards
Digital television broadcasting      2...6 Mbps (10...20 Mbps for HD)    MPEG-2 (H.264/AVC)
DVD video                            5...8 Mbps                          MPEG-2
Blu-ray Disc                         up to 40 Mbps                       MPEG-2, H.264/AVC, VC-1 (up to 1080p)
Internet video streaming             100...2000 kbps                     MPEG-1, H.264/AVC, VC-1, or similar proprietary
Videoconferencing, videotelephony    20...2000 kbps                      H.261, H.263, H.264/AVC
Video over 3G wireless               100...500 kbps                      H.263, MPEG-4, H.264/AVC
Motion-compensated Hybrid Coding
- Used by H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC
[Block diagram: coder control; transform/quantizer; dequantizer/inverse transform; intra/inter switch (0 for intra); motion-compensated predictor; motion estimator; entropy coding; embedded decoder. Signals to the entropy coder: control data, quantized transform coefficients, motion data.]
Video Compression Standards: Hierarchical Syntax
ITU-T Rec. H.261
- International standard for ISDN picture phones and for video conferencing systems (1990)
- Image format: CIF (352 x 288 Y samples) or QCIF (176 x 144 Y samples), frame rate 7.5...30 fps
- Bit-rate: multiple of 64 kbps (= ISDN channel), typically 128 kbps including audio
- Picture quality: acceptable at 128 kbps with limited motion in the scene
- Stand-alone videoconferencing system or desktop videoconferencing system integrated with a PC
Macroblocks
- Macroblock (MB) of 16x16 pixels
- Sampling format: 4:2:0
- MB consists of 4 luminance and 2 chrominance blocks
- 16x16 luminance samples: blocks 0, 1 (top row) and 2, 3 (bottom row)
- 8x8 Cb samples: block 4
- 8x8 Cr samples: block 5
H.261 Motion-Compensated Prediction
- Integer-pel accuracy
- One displacement vector per macroblock
- Maximum displacement vector range ±15 horizontally and vertically
- Adaptive loop filter, separable into 1-D horizontal and vertical filters with impulse response [¼, ½, ¼]
- Differential encoding of motion vectors
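The separable loop filter can be sketched in a few lines. This is a minimal illustration, not a conforming implementation: it assumes the filter is applied to one 8x8 block at a time, that samples on the block edge are passed through unfiltered (taps [0, 1, 0]), and that the 2-D result is rounded once at the end with halves rounded up. The function names are made up for this example.

```python
def filter_1d(samples):
    """Apply integer taps [1, 2, 1] (4 x [1/4, 1/2, 1/4]).

    Edge samples use [0, 4, 0], so every output is scaled by 4.
    """
    n = len(samples)
    out = []
    for i, s in enumerate(samples):
        if i == 0 or i == n - 1:
            out.append(4 * s)  # assumption: edge samples are not smoothed
        else:
            out.append(samples[i - 1] + 2 * s + samples[i + 1])
    return out


def loop_filter(block):
    """Separable 2-D filter: horizontal pass, vertical pass, one final rounding."""
    rows = [filter_1d(r) for r in block]              # scaled by 4
    cols = [filter_1d(list(c)) for c in zip(*rows)]   # scaled by 16, transposed
    filtered = [list(r) for r in zip(*cols)]          # transpose back
    return [[(v + 8) >> 4 for v in r] for r in filtered]


flat = [[100] * 8 for _ in range(8)]
impulse = [[0] * 8 for _ in range(8)]
impulse[3][3] = 16
smoothed = loop_filter(impulse)
```

A flat block passes through unchanged, while an isolated impulse is spread over its 3x3 neighborhood with weights 1-2-1 in each direction.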
H.261 Residual Coding
- 8x8 DCT
- Quantization
  - Uniform quantizer (Δ = 8) for intra-mode DC coefficients
  - Uniform threshold quantizer (Δ = 2, 4, ..., 62) for AC coefficients in intra-mode and all coefficients in inter-mode
- Zig-zag scan
- Run-level coding for entropy coding
  - (zero-run, value) symbols
  - zero-run: the number of coefficients quantized to zero since the last nonzero coefficient
  - value: the amplitude of the current nonzero coefficient
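The zig-zag scan and the (zero-run, value) symbols can be illustrated with a short sketch. It is a simplification under stated assumptions: the block is already quantized, the intra DC coefficient is not treated separately, and trailing zeros are left to an end-of-block symbol that is not modeled here. The function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Anti-diagonals are visited in order of r + c; the direction alternates:
    # odd diagonals run top-right to bottom-left, even ones the other way.
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_level_pairs(block):
    """Convert a quantized block into (zero-run, value) pairs."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # trailing zeros are implied by an end-of-block symbol


quantized = [[0] * 8 for _ in range(8)]
quantized[0][0] = 5
quantized[0][1] = -2
quantized[2][0] = 1
pairs = run_level_pairs(quantized)
```

For this block the scan visits (0,0), (0,1), (1,0), (2,0), so one zero separates the second and third nonzero coefficients.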
H.261 Macroblock Types (VLC Table)

Prediction      MQUANT   MVD   CBP   TCOEFF   VLC
Intra                                X        0001
Intra           X                    X        0000 001
Inter                          X     X        1
Inter           X              X     X        0000 1
Inter+MC                 X                    0000 0000 1
Inter+MC                 X     X     X        0000 0001
Inter+MC        X        X     X     X        0000 0000 01
Inter+MC+FIL             X                    001
Inter+MC+FIL             X     X     X        01
Inter+MC+FIL    X        X     X     X        0000 01
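Because no codeword in this table is a prefix of another, a decoder can read bits one at a time until the accumulated string matches an entry. The sketch below shows that idea only; the codewords are taken from the table, the assignment of the X marks to columns follows a reading of the flattened table and should be checked against the standard, and `decode_mtype` is a hypothetical helper.

```python
# Codeword -> (prediction, fields that follow in the macroblock header).
# Spaces in the table's codewords are only for readability.
MTYPE_TABLE = {
    "0001": ("Intra", ("TCOEFF",)),
    "0000 001": ("Intra", ("MQUANT", "TCOEFF")),
    "1": ("Inter", ("CBP", "TCOEFF")),
    "0000 1": ("Inter", ("MQUANT", "CBP", "TCOEFF")),
    "0000 0000 1": ("Inter+MC", ("MVD",)),
    "0000 0001": ("Inter+MC", ("MVD", "CBP", "TCOEFF")),
    "0000 0000 01": ("Inter+MC", ("MQUANT", "MVD", "CBP", "TCOEFF")),
    "001": ("Inter+MC+FIL", ("MVD",)),
    "01": ("Inter+MC+FIL", ("MVD", "CBP", "TCOEFF")),
    "0000 01": ("Inter+MC+FIL", ("MQUANT", "MVD", "CBP", "TCOEFF")),
}
MTYPE_VLC = {code.replace(" ", ""): entry for code, entry in MTYPE_TABLE.items()}


def decode_mtype(bits):
    """Read one macroblock type from a string of '0'/'1' characters.

    Returns (prediction, fields, number_of_bits_consumed).
    """
    code = ""
    for bit in bits:
        code += bit
        if code in MTYPE_VLC:
            prediction, fields = MTYPE_VLC[code]
            return prediction, fields, len(code)
    raise ValueError("no macroblock type codeword found")


prediction, fields, used = decode_mtype("000000001" + "1011")
```

The shortest codeword (a single bit) goes to the inter type without motion compensation, which suggests it was expected to be the most frequent type.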
MPEG-1/2: GOP Structure
- Group of Pictures = GOP
[Figure: pictures of a GOP with the labels 1 3 4 2 6 7 8 5]
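One way to read the labels 1 3 4 2 6 7 8 5 is as the coding order of pictures shown in display order, since every B-picture needs its future reference to be coded first. The sketch below works under that assumption; the picture-type pattern "IBBPBBBP" is inferred from the labels and is not stated on the slide, and `coding_positions` is a hypothetical helper.

```python
def coding_positions(display_types):
    """Return, for each picture in display order, its 1-based coding position.

    Each I- or P-picture is coded before the B-pictures that precede it
    in display order, because those B-pictures reference it.
    """
    coding_order = []   # display indices, listed in coding order
    pending_b = []      # B-pictures waiting for their future reference
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            pending_b.append(index)
        else:
            coding_order.append(index)
            coding_order.extend(pending_b)
            pending_b = []
    coding_order.extend(pending_b)  # trailing B-pictures, if any

    positions = [0] * len(display_types)
    for position, index in enumerate(coding_order, start=1):
        positions[index] = position
    return positions


positions = coding_positions("IBBPBBBP")
```

With this pattern the result reproduces the labels in the figure, which is why the encoder needs a picture reordering stage ahead of the coding loop.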
MPEG-1/2 Encoder
[Block diagram. Forward path: video in; preprocessing; picture reordering; subtraction of the prediction; 8x8 DCT; weighting; quantization; VLC; video multiplex; buffer; bitstream. Reconstruction loop: inverse quantization; inverse weighting; inverse 8x8 DCT; picture store 1; picture store 2; motion compensation, with prediction choices labeled zero, 1, 1/2, and 2. Side information to the multiplex: motion vectors, macroblock info, start codes.]
MPEG-1: coding of I-pictures
- I-pictures: intraframe coded
- 8x8 DCT
- Arbitrary weighting matrix for coefficients
- Differential coding of DC coefficients
- Uniform quantization
- Zig-zag scan, run-level coding
- Entropy coding
- Unfortunately, not quite JPEG
MPEG-1: coding of P-pictures
- Motion-compensated prediction from an encoded I-picture or P-picture (DPCM)
- Half-pel accuracy of motion compensation, bilinear interpolation
- One displacement vector per macroblock
- Differential coding of displacement vectors
- Coding of prediction error with 8x8 DCT, uniform threshold quantization, zig-zag scan as in I-pictures
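Half-pel prediction by bilinear interpolation amounts to averaging the two or four nearest integer-position samples. A minimal sketch, assuming coordinates are given in half-pel units, the reference picture is a list of rows, all needed neighbors lie inside the picture, and averages are rounded to the nearest integer with halves rounded up; `sample_half_pel` is a hypothetical name.

```python
def sample_half_pel(ref, y2, x2):
    """Fetch a prediction sample at position (y2 / 2, x2 / 2) in the reference."""
    y, x = y2 >> 1, x2 >> 1          # integer part
    half_y, half_x = y2 & 1, x2 & 1  # 1 if the position lies between samples
    a = ref[y][x]
    if not half_y and not half_x:
        return a                                  # integer position
    if not half_y:
        return (a + ref[y][x + 1] + 1) >> 1       # horizontal half-pel
    if not half_x:
        return (a + ref[y + 1][x] + 1) >> 1       # vertical half-pel
    # Diagonal half-pel: average of the four surrounding samples.
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) >> 2


reference = [[10, 20],
             [30, 40]]
center = sample_half_pel(reference, 1, 1)
```

A motion vector in half-pel units is simply added to twice the sample position before calling the function for each sample of the macroblock.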
MPEG-1: coding of B-pictures
- Motion-compensated prediction from two consecutive P- or I-pictures, using one of:
  - only forward prediction (1 vector/macroblock)
  - only backward prediction (1 vector/macroblock)
  - average of forward and backward prediction = interpolation (2 vectors/macroblock)
- Half-pel accuracy of motion compensation, bilinear interpolation
- Coding of prediction error with 8x8 DCT, uniform quantization, zig-zag scan as in I-pictures
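The interpolation mode can be sketched as a sample-wise average of the two motion-compensated predictions. The rounding rule (halves rounded up) is an assumption of this example, and `interpolate_prediction` is a hypothetical name; both inputs are assumed to be equally sized blocks that have already been motion compensated.

```python
def interpolate_prediction(forward, backward):
    """Average a forward and a backward prediction block sample by sample."""
    return [
        [(f + b + 1) >> 1 for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(forward, backward)
    ]


forward_block = [[10, 20], [30, 40]]
backward_block = [[20, 21], [30, 44]]
averaged = interpolate_prediction(forward_block, backward_block)
```

The encoder would compare the prediction error of this average with the forward-only and backward-only errors and pick the mode per macroblock.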
MPEG-2 vs. MPEG-1
- Efficiently compress interlaced digital video at broadcast quality
- Frame pictures or field pictures
- Adaptive frame/field prediction
- Adaptive frame/field DCT
- Improved coding efficiency by different quantization, VLC tables, and additional coefficient scan patterns
- Spatial, temporal and SNR scalability profiles (rarely used)
[Figures: Field 1; Field 2; Frame = both fields combined, with regions marked "Frame-based better" and "Field-based better"]
Adaptive Frame/Field DCT
Adaptive Frame/Field Motion Compensation
[Figure: frame prediction and field prediction]
[source: G. Sullivan, VCEG]
H.264/AVC Coder
[Block diagram: input video signal split into macroblocks (16x16 pixels); coder control; transform/scaling/quantization; scaling and inverse transform; intra/inter switch; intra-frame prediction; motion compensation; motion estimation; deblocking filter; entropy coding; embedded decoder producing the output video signal. Signals to the entropy coder: control data, quantized transform coefficients, motion data.]
[source: G. Sullivan, VCEG]
Common Elements with other Standards
- Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
- Block-wise motion compensation
- Variable block-size motion compensation
- Block transform of prediction error
- Scalar quantization
- I, P, and B coding types
[source: G. Sullivan, VCEG]
H.264 Motion Compensation Accuracy
[H.264/AVC coder block diagram repeated; the annotations specific to this slide follow]
- Macroblock types (partitions): 16x16, 16x8, 8x16, 8x8
- 8x8 types (sub-partitions): 8x8, 8x4, 4x8, 4x4
- Motion vector accuracy 1/4 (6-tap filter)
[source: G. Sullivan, VCEG]
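The slide names a 6-tap filter without giving its taps. The sketch below uses the taps (1, -5, 20, 20, -5, 1)/32 that H.264/AVC specifies for luma half-sample positions, and forms a quarter-sample value as the rounded average of two neighboring values. It covers only the 1-D case on 8-bit samples, and the function names are hypothetical.

```python
def clip_8bit(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(e, f, g, h, i, j):
    """Half-sample value between g and h from six neighboring integer samples."""
    return clip_8bit((e - 5 * f + 20 * g + 20 * h - 5 * i + j + 16) >> 5)


def quarter_sample(a, b):
    """Quarter-sample value as the rounded average of two neighboring values."""
    return (a + b + 1) >> 1


row = [0, 0, 0, 255, 255, 255]        # a sharp edge between the 3rd and 4th sample
half = half_sample(*row)              # halfway between samples 0 and 255
quarter = quarter_sample(row[2], half)  # a quarter of the way from 0 toward 255
```

The negative taps let the filter overshoot near sharp edges, which is why the result is clipped back to the valid sample range.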
H.264 Multiple Reference Frames
[H.264/AVC coder block diagram repeated; the annotations specific to this slide follow]
- Multiple reference frames
- Generalized B frames
- Weighted prediction
[source: G. Sullivan, VCEG]
H.264 Intra Prediction
[H.264/AVC coder block diagram repeated; the annotations specific to this slide follow]
- Directional spatial prediction (9 types for luma, 1 chroma)
- Neighboring samples (Q, A...H, I...L) and the samples a...p of the 4x4 block:

  Q A B C D E F G H
  I a b c d
  J e f g h
  K i j k l
  L m n o p

- e.g., Mode 3: diagonal down/right prediction
  a, f, k, p are predicted by (A + 2Q + I + 2) >> 2
[Figure: prediction directions labeled 0 to 8]
[source: G. Sullivan, VCEG]
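The diagonal down/right example generalizes to the whole 4x4 block: every sample is a 1-2-1 weighted average of three consecutive neighbors along the left column, corner, and top row. The sketch below follows the slide's labels (Q for the corner, A...D above, I...L to the left); the indexing scheme and the name `predict_diagonal_down_right` are choices made for this example.

```python
def predict_diagonal_down_right(top, left, corner):
    """Predict a 4x4 block from top = [A, B, C, D], left = [I, J, K, L], corner = Q."""
    # Lay the neighbors out along one line: L K J I Q A B C D (Q at index 4).
    edge = list(reversed(left)) + [corner] + list(top)
    block = []
    for y in range(4):
        row = []
        for x in range(4):
            c = 4 + x - y  # samples on the same down/right diagonal share c
            row.append((edge[c - 1] + 2 * edge[c] + edge[c + 1] + 2) >> 2)
        block.append(row)
    return block


pred = predict_diagonal_down_right(top=[10, 20, 30, 40],
                                   left=[50, 60, 70, 80],
                                   corner=100)
diagonal = [pred[k][k] for k in range(4)]  # samples a, f, k, p
```

The samples a, f, k, p all receive (A + 2Q + I + 2) >> 2, matching the formula on the slide.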
H.264 4x4 Transform
[H.264/AVC coder block diagram repeated; the annotations specific to this slide follow]
- 4x4 block integer transform
- Repeated transform of DC coeffs for 8x8 chroma and some 16x16 intra luma blocks
[source: G. Sullivan, VCEG]
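The slide does not show the transform matrix. The sketch below uses the 4x4 integer core transform matrix commonly given for H.264/AVC and computes Y = C X C^T with integer arithmetic only. The normalization that the standard folds into quantization is omitted, and the helper names are hypothetical.

```python
# Core forward transform matrix (an integer approximation of the 4x4 DCT).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Compute Y = CF * X * CF^T for a 4x4 residual block X (no scaling)."""
    return matmul(matmul(CF, block), transpose(CF))


flat = forward_core_transform([[3] * 4 for _ in range(4)])
ramp = forward_core_transform([[0, 1, 2, 3] for _ in range(4)])
```

A flat block produces a single DC coefficient, and a horizontal ramp produces energy only in the first row, which is the behavior expected from a DCT-like transform.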
Deblocking Filter
[Figure: one-dimensional visualization of a 4x4 block edge, with samples p2, p1, p0 on one side and q0, q1, q2 on the other]
- Filtering of p0 and q0 only takes place if:
  1. |p0 - q0| < α(QP)
  2. |p1 - p0| < β(QP)
  3. |q1 - q0| < β(QP)
  where β(QP) is considerably smaller than α(QP)
- Filtering of p1 or q1 takes place if additionally:
  1. |p2 - p0| < β(QP) or |q2 - q0| < β(QP)
  (QP = quantization parameter)
[source: G. Sullivan, VCEG]
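The filtering conditions translate directly into a decision function. This sketch only decides which samples would be filtered; it does not perform the filtering. The thresholds are passed in as plain numbers because the α(QP) and β(QP) tables are not reproduced here, the absolute differences are the usual reading of the conditions, and treating |p2 - p0| as the condition for p1 and |q2 - q0| as the condition for q1 is an interpretation of the slide's "or". The function name is hypothetical.

```python
def deblocking_decisions(p, q, alpha, beta):
    """Decide which samples across a block edge are filtered.

    p = (p0, p1, p2) and q = (q0, q1, q2), ordered outward from the edge.
    """
    p0, p1, p2 = p
    q0, q1, q2 = q
    filter_edge = (
        abs(p0 - q0) < alpha     # step across the edge is small enough
        and abs(p1 - p0) < beta  # both sides are smooth near the edge
        and abs(q1 - q0) < beta
    )
    return {
        "p0_q0": filter_edge,
        "p1": filter_edge and abs(p2 - p0) < beta,
        "q1": filter_edge and abs(q2 - q0) < beta,
    }


blocking_artifact = deblocking_decisions((60, 58, 57), (70, 72, 73), alpha=15, beta=4)
real_edge = deblocking_decisions((60, 58, 57), (120, 122, 123), alpha=15, beta=4)
```

A small step between smooth regions is treated as a blocking artifact and filtered, while a large step is kept as a genuine image edge.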
Deblocking: Subjective Result for Intra
- Highly compressed first decoded intra picture at 0.28 bit/sample
[Figure: without filter vs. with H.264/AVC deblocking]
[source: G. Sullivan, VCEG]
Deblocking: Subjective Result for Inter
- Highly compressed decoded inter picture
[Figure: without filter vs. with H.264/AVC deblocking]
[source: G. Sullivan, VCEG]
[Figures not reproduced; source: Wiegand et al., 2003]
Further reading
- Ming Liou, "Overview of the px64 kbit/s video coding standard," Communications of the ACM, vol. 34, no. 4, pp. 59-63, April 1991.
- D. LeGall, "MPEG: a video compression standard for multimedia applications," Communications of the ACM, vol. 34, no. 4, pp. 46-58, April 1991.
- IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on the H.264/AVC Video Coding Standard, July 2003.
[Figure: average picture size by picture type. I: 156 kbits; P: 62 kbits; B: 15 kbits]
[Figure: Bidirectional motion compensation. Labels: past picture, current picture, future picture; best matching macroblock (in the past and in the future picture); MV: motion vector (x, y); forward prediction error; backward prediction error; interpolative prediction error; prediction error]
Profiles and levels
Profiles: Simple (4:2:0) and Main (4:2:0) are nonscalable; Main+ (4:2:0) and Next (4:2:2) are scalable.

High level
  Max. resolution/rate (Hz):  Simple N/A | Main 1920 x 1152/60 | Main+ N/A | Next 1920 x 1152/60
  Min. resolution/rate (Hz):  Simple N/A | Main N/A | Main+ N/A | Next 960 x 576/30
  Bitrate (Mbit/s):           Simple N/A | Main 80 | Main+ N/A | Next 100 (all layers), 80 (base+mid), 25 (base layer)

High-1440 level
  Max. resolution/rate (Hz):  Simple N/A | Main 1440 x 1152/60 | Main+ 1440 x 1152/60 | Next 1440 x 1152/60
  Min. resolution/rate (Hz):  Simple N/A | Main N/A | Main+ 720 x 576/30 | Next 720 x 576/30
  Bitrate (Mbit/s):           Simple N/A | Main 60 | Main+ 60 (all layers), 40 (base+mid), 15 (base layer) | Next 80 (all layers), 60 (base+mid), 20 (base layer)

Main level
  Max. resolution/rate (Hz):  Simple 720 x 576/30 | Main 720 x 576/30 | Main+ 720 x 576/30 | Next 720 x 576/30
  Min. resolution/rate (Hz):  Simple N/A | Main N/A | Main+ N/A | Next 352 x 288/30
  Bitrate (Mbit/s):           Simple 15 | Main 15 | Main+ 15 (all layers), 10 (base layer) | Next 20 (all layers), 15 (base+mid), 4 (base layer)

Low level
  Max. resolution/rate (Hz):  Simple N/A | Main 352 x 288/30 | Main+ 352 x 288/30 | Next N/A
  Min. resolution/rate (Hz):  Simple N/A | Main N/A | Main+ N/A | Next N/A
  Bitrate (Mbit/s):           Simple N/A | Main 4 | Main+ 4 (all layers), 3 (base layer) | Next N/A