Contents

List of figures xv
List of tables xxi
Preface xxiii
Acknowledgements xxiv

1 Introduction 1
References 4

2 Digital video 5
2.1 Introduction 5
2.2 Analogue television 5
2.3 Interlace 7
2.4 Picture sizes 14
2.4.1 Standard definition television formats (SDTV) 14
2.4.2 Standard input format (SIF) 14
2.4.3 Common intermediate format (CIF) 15
2.4.4 Quarter common intermediate format (QCIF) 15
2.4.5 Video graphics array (VGA) 15
2.4.6 Quarter video graphics array (QVGA) 15
2.4.7 High definition television 16
2.5 Chrominance formats 16
2.6 Bit depth and uncompressed video bit rates 19
2.7 Concluding remarks 19
2.8 Summary 20
References 20

3 Picture quality assessment 21
3.1 Introduction 21
3.2 Subjective viewing tests 21
3.3 Analogue video quality measurements 23
3.3.1 Linear distortions 23
3.3.2 Non-linear distortions 23
3.3.3 Noise 23
3.3.4 Measurement 24
3.4 Digital video quality 24
3.4.1 Peak signal-to-noise ratio (PSNR) 24
3.4.2 Objective perceptual video quality measurement techniques 27
3.4.3 Picture quality monitoring using watermark signals 28
3.4.4 Testing picture quality analysers 29
3.4.5 Video quality measurements inside encoders or decoders 30
3.4.5.1 Double-ended video quality measurements in encoders 30
3.4.5.2 Single-ended video quality measurements on compressed bit streams 30
3.5 Concluding remarks 30
3.6 Summary 31
References 31

4 Compression principles 33
4.1 Introduction 33
4.2 Basic audio compression techniques 33
4.2.1 Frequency response 33
4.2.2 Frequency masking 34
4.2.3 Temporal masking 35
4.2.4 Transformation and quantisation 35
4.3 Introduction to video compression 37
4.4 Redundancy in video signals 37
4.5 Exploiting spatial redundancy 38
4.5.1 Example transformation and quantisation 39
4.6 Exploiting statistical redundancy 41
4.7 Exploiting temporal redundancy 43
4.8 Motion estimation 44
4.8.1 Example transformation of a predicted block 44
4.9 Block diagrams 46
4.10 Quantisation control 46
4.11 Bi-directional prediction 48
4.12 Mode decision 49
4.13 Rate control 50
4.14 Concluding remarks 51
4.15 Summary 51
References 51

5 MPEG video compression standards 53
5.1 Introduction 53
5.2 MPEG-1 53
5.3 MPEG-2 55
5.3.1 Interlace coding tools 56
5.3.2 Film coding tools 57
5.3.3 Low-delay mode 58
5.3.4 Profiles and levels 59
5.3.5 Block diagram 60
5.4 MPEG-4 61
5.4.1 MPEG-4 Part 2 (Visual) 62
5.4.1.1 Motion prediction 63
5.4.1.2 Intra-prediction 64
5.4.1.3 Object coding 64
5.4.2 MPEG-4 Part 10 (AVC) 64
5.4.2.1 New transforms 65
5.4.2.2 Entropy coding 66
5.4.2.3 Intra-predictions 66
5.4.2.4 Motion prediction 67
5.4.2.5 In-loop filter 69
5.4.2.6 Rate-distortion optimisation (RDO) 71
5.4.2.7 Block diagram 71
5.5 Structure of MPEG bit streams 72
5.5.1 Elementary streams 72
5.5.1.1 Sequence 72
5.5.1.2 Group of pictures 72
5.5.1.3 Picture 74
5.5.1.4 Slice 74
5.5.1.5 Macroblock 75
5.5.1.6 Block 75
5.5.1.7 Differences in MPEG-4 (AVC) 75
5.5.2 Transport stream 76
5.5.2.1 Timing synchronisation 77
5.5.2.2 Adaptation fields 77
5.5.2.3 Program-specific information (PSI) 79
5.6 Current MPEG activities 79
5.7 Concluding remarks 80
5.8 Summary 80
Exercise 5.1 Encoder evaluation 81
Exercise 5.2 Bit-rate saving 81
Exercise 5.3 PSNR limit 81
References 82

6 Non-MPEG compression algorithms 85
6.1 Introduction 85
6.2 VC-1 SMPTE 421M 85
6.2.1 Adaptive block-size transform 86
6.2.2 Motion compensation 86
6.2.3 Advanced entropy coding 87
6.2.4 De-blocking filter 87
6.2.5 Advanced B frame coding 87
6.2.6 Low-rate tools 87
6.2.7 Fading compensation 87
6.3 Audio Video Coding Standard (AVS) 87
6.3.1 Transform 88
6.3.2 Intra-predictions 88
6.3.3 Motion compensation 88
6.3.4 Entropy coding 88
6.3.5 De-blocking filter 88
6.3.6 B frame coding modes 88
6.4 Wavelet compression 91
6.5 JPEG2000 92
6.6 Dirac 94
6.6.1 Dirac Pro 95
6.7 Concluding remarks 95
6.8 Summary 96
Exercise 6.1 Clip encoding 96
Exercise 6.2 Codec comparison 96
References 96

7 Motion estimation 99
7.1 Introduction 99
7.2 Block-matching algorithms in the spatial domain 100
7.2.1 Exhaustive motion estimation 100
7.2.2 Hierarchical motion estimation 104
7.2.3 Other block-matching methods 107
7.2.3.1 Spiral search 107
7.2.3.2 Sub-sampling method 108
7.2.3.3 Telescopic search 108
7.2.3.4 Three-step search 109
7.3 Correlation algorithms in the frequency domain 109
7.3.1 Phase correlation 110
7.3.2 Cross-correlation 111
7.4 Sub-block motion estimation 112
7.5 Sub-pixel motion refinement 113
7.6 Concluding remarks 113
7.7 Summary 114
Exercise 7.1 Motion estimation search range 114
References 114

8 Pre-processing 117
8.1 Introduction 117
8.2 Picture re-sizing 117
8.2.1 Horizontal down-sampling 117
8.2.2 Vertical down-sampling in SDTV systems 119
8.2.2.1 Temporal sub-sampling 120
8.2.2.2 Intra-field down-sampling 120
8.2.2.3 Vertical-temporal down-sampling 120
8.3 De-interlacing 120
8.3.1 Motion-adaptive de-interlacing 121
8.3.2 Motion-compensated de-interlacing 121
8.3.3 De-interlacing and compression 122
8.4 Noise reduction 122
8.4.1 Introduction 122
8.4.2 Types of noise 123
8.4.3 Bit-rate demand of noisy video signals 123
8.4.4 Noise reduction algorithms 124
8.4.4.1 Temporal noise reduction algorithms 125
8.4.4.2 Spatial noise reduction algorithms 126
8.4.4.3 Noise-level measurement 126
8.4.4.4 Impulse noise reduction 127
8.4.5 De-blocking and mosquito filters 127
8.4.5.1 De-blocking filters 127
8.4.5.2 Mosquito noise filters 128
8.5 Forward analysis 129
8.5.1 Introduction 129
8.5.2 Film mode detection 129
8.5.3 Scene cuts, fades and camera flashes 129
8.5.4 Picture criticality measurement 130
8.5.5 Global coding mode decisions 130
8.5.6 Detection of previously coded picture types 130
8.6 Concluding remarks 130
8.7 Summary 131
Exercise 8.1 De-interlacing 131
Exercise 8.2 Vertical down-sampling 131
Exercise 8.3 HD to SD conversion 131
Exercise 8.4 Encoder optimisation 132
References 132

9 High definition television (HDTV) 135
9.1 Introduction 135
9.2 Compression of HDTV 136
9.3 Spatial scalability 136
9.4 Progressive versus interlaced 138
9.5 Compression of 1080p50/60 140
9.6 Bit-rate requirement of 1080p50 140
9.6.1 Test sequences 140
9.6.2 Test configuration 141
9.6.3 Bit-rate results 142
9.6.4 Qualification of these results 143
9.7 Concluding remarks 144
9.8 Summary 145
Exercise 9.1 HDTV contribution 145
References 146

10 Compression for mobile devices 149
10.1 Introduction 149
10.2 Compression algorithms for mobile applications 149
10.3 Pre-processing for TV to mobile 150
10.4 Compression tools for TV to mobile 151
10.4.1 CABAC 152
10.4.2 B frames 152
10.4.3 Main Profile 152
10.4.4 Decode time 153
10.5 Scalable video coding (SVC) 154
10.6 Transmission of text to mobile devices 155
10.7 Concluding remarks 156
10.8 Summary 157
Exercise 10.1 Compression for mobile devices 157
References 157

11 MPEG decoders and post-processing 159
11.1 Introduction 159
11.2 Decoding process 159
11.3 Channel change time 161
11.4 Bit errors and error concealment 162
11.5 Post-processing 164
11.6 Concluding remarks 165
11.7 Summary 166
Exercise 11.1 Channel change time 166
Exercise 11.2 Filter design 166
References 166

12 Statistical multiplexing 169
12.1 Introduction 169
12.2 Analysis of statistical multiplex systems 171
12.3 Bit-rate saving with statistical multiplexing 173
12.4 Statistical multiplexing of MPEG-4 (AVC) bit streams 174
12.4.1 Comparison of bit-rate demand of MPEG-2 and MPEG-4 (AVC) 175
12.4.2 Bit-rate demand of MPEG-4 (AVC) HDTV 177
12.5 System considerations 178
12.5.1 Configuration parameters 178
12.5.2 Noise reduction 179
12.5.3 Two-pass encoding 180
12.5.4 Opportunistic data 180
12.6 Concluding remarks 181
12.7 Summary 181
Exercise 12.1 Statistical multiplexing at median bit-rate demand 182
Exercise 12.2 Large statistical multiplex system 182
References 182

13 Compression for contribution and distribution 183
13.1 Introduction 183
13.2 MPEG-2 4:2:2 Profile 184
13.2.1 Bit-rate requirement of 4:2:2 Profile 184
13.2.2 Optimum GOP structure with high-bit-rate 4:2:2 encoding 184
13.2.3 VBI handling in MPEG-2 4:2:2 encoding 185
13.3 MPEG-4 (AVC) 188
13.3.1 CABAC at high bit rates 190
13.3.2 MPEG-4 (AVC) profiles 190
13.3.3 Fidelity range extension 191
13.3.4 4:2:2 MPEG-4 (AVC) 192
13.3.5 Comparison of 8 and 10 bit coding 194
13.3.6 Low-delay MPEG-4 (AVC) 196
13.4 Concluding remarks 196
13.5 Summary 197
Exercise 13.1 Distribution link 197
References 197

14 Concatenation and transcoding 199
14.1 Introduction 199
14.2 Concatenation model 200
14.3 Detection of MPEG frame types in decoded video signals 203
14.4 Bit-rate saving 204
14.5 MPEG-2/MPEG-4 (AVC) concatenation 205
14.6 Fixed bit rate into variable bit rate 206
14.6.1 MPEG-2 to MPEG-2 concatenation 206
14.6.2 MPEG-2 to MPEG-4 (AVC) concatenation 208
14.6.3 MPEG-4 (AVC) to MPEG-4 (AVC) concatenation 210
14.6.4 Concatenation of HDTV 211
14.6.5 MPEG-4 (AVC) to MPEG-2 concatenation 212
14.7 Variable bit rate into fixed bit rate 212
14.7.1 MPEG-2 downstream encoder 213
14.7.2 MPEG-4 (AVC) downstream encoder 213
14.8 Concluding remarks 214
14.9 Summary 215
Exercise 14.1 Concatenation with statistical multiplexing 215
Exercise 14.2 Concatenation of DTH signals into MPEG-4 (AVC) encoder 215
Exercise 14.3 Concatenation of DTH signals into MPEG-2 encoder 215
Exercise 14.4 Concatenation of 4:2:0 signals 216
References 216

15 Bit-stream processing 217
15.1 Introduction 217
15.2 Bit-rate changers 217
15.3 Transcoding 219
15.4 Splicing 219
15.4.1 Splicing and statistical multiplexing 221
15.4.2 Insertion of pre-compressed video clips 222
15.5 Concluding remarks 223
15.6 Summary 223
Exercise 15.1 Bit-stream processing 223
Exercise 15.2 Splicing between different bit rates 224
References 224

16 Concluding remarks 225
References 226

Appendix A: Test sequences referred to in this book 227
A.1 SDTV test sequences 227
A.2 HDTV test sequences 228

Appendix B: RGB/YUV conversion 229
B.1 SDTV conversion from RGB to YUV 229
B.2 SDTV conversion from YUV to RGB 229
B.3 HDTV conversion from RGB to YUV 229
B.4 HDTV conversion from YUV to RGB 230

Appendix C: Definition of PSNR 231

Appendix D: Discrete cosine transform (DCT) and inverse DCT 232

Appendix E: Introduction to wavelet theory 233
E.1 Introduction 233
E.2 Relationship to windowed Fourier transformation 233
E.2.1 Orthonormal wavelets 234
E.2.2 Wavelets based on splines 235
References 236

Appendix F: Comparison between phase correlation and cross-correlation 237

Appendix G: Polyphase filter design 239
G.1 Down-sampling by N 239
G.1.1 Numerical example 239
G.2 Up-sampling by M 240
G.2.1 Numerical example 240
G.3 Up-sampling by M/N 240
G.3.1 Numerical example 241
G.4 Down-sampling by M/N 241
G.4.1 Numerical example 241

Appendix H: Expected error propagation time 243

Appendix I: Derivation of the bit-rate demand model 245

Bibliography 247
Useful websites 249
Glossary 251
Example answers to exercises 257
Index 263