
A Big Umbrella
- Content Creation: produce the media, compress it to a format that is portable/deliverable
- Distribution: how the message arrives is often as important as what the message is
- Search: finding the information you need
- Protection: we care about privacy and security, ownership and digital rights
The four are tangled together.

Goal of This Course
- Understand various aspects of a modern multimedia pipeline: content creation and editing, distribution, search & mining, protection
- Hands-on experience with current media trends
A Multimedia System

Digital Data Acquisition
Source: analog; output: digital (analog → digital). Two steps:
- Sampling: take samples at times nT. T: sampling period; fs = 1/T: sampling frequency. Example: fs = 10 Hz → T = 0.1 second.
- Quantization: map amplitude values into a set of discrete values.
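The two steps can be sketched in a few lines of Python (a toy example; the 2 Hz sine, the 8-level quantizer, and the variable names are illustrative, not from the slides):

```python
import numpy as np

# Sample a 2 Hz sine at fs = 10 Hz (T = 0.1 s), then quantize to 8 levels.
fs = 10.0                      # sampling frequency (Hz)
T = 1.0 / fs                   # sampling period (s)
n = np.arange(20)              # 20 samples -> 2 seconds of signal
x = np.sin(2 * np.pi * 2.0 * n * T)   # samples taken at times t = nT

# Uniform quantization: map amplitudes in [-1, 1] to 8 discrete levels.
levels = 8
step = 2.0 / levels
xq = np.clip(np.round(x / step) * step, -1.0, 1.0)

# Quantization error is bounded by half a step (0.125 here).
max_err = np.max(np.abs(x - xq))
print(f"T = {T} s, max quantization error = {max_err:.3f}")
```

With more levels the step shrinks and the quantization error falls, at the cost of more bits per sample.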

Sampling Theorem
A signal can be reconstructed from its samples if the original signal has no frequencies above 1/2 the sampling frequency. The minimum sampling rate for a band-limited function is called the Nyquist rate.
This means T (or fs) depends on the signal's frequency range: a fast-varying signal should be sampled more frequently. Speech: fs > 8 kHz; music: fs > 44 kHz.
Before and After Sampling: a signal band-limited to fM, once sampled with period T, has its spectrum (scaled by 1/T) duplicated at every multiple of fs = 1/T in the frequency domain.

Reconstruction (Frequency-Domain View)
- If fs >= 2fM: the spectral copies do not overlap. Ideal reconstruction multiplies the sampled spectrum by a low-pass filter of gain T with cutoff fs/2, and the reconstructed signal equals the original.
- If fs < 2fM: the spectral copies overlap, so after the same low-pass filtering the reconstructed signal does not equal the original. This distortion is an alias, due to an insufficient sampling rate.
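Aliasing is easy to demonstrate numerically (a toy sketch; the 7 Hz tone and fs = 10 Hz are illustrative): a tone above fs/2 produces exactly the same samples as a lower-frequency alias, so the two are indistinguishable after sampling.

```python
import numpy as np

fs = 10.0                       # sampling rate (Hz); Nyquist limit is fs/2 = 5 Hz
n = np.arange(100)
t = n / fs                      # sample times t = n*T

# A 7 Hz tone violates fs > 2*fM. Its samples coincide with those of the
# alias at fs - 7 = 3 Hz: cos(2*pi*7*n/10) = cos(2*pi*n - 2*pi*3*n/10).
x_bad = np.cos(2 * np.pi * 7.0 * t)
alias = np.cos(2 * np.pi * (fs - 7.0) * t)
print(np.allclose(x_bad, alias))   # True: 7 Hz looks exactly like 3 Hz
```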

Definition of an Image
Think of an image as a function f from R^2 to R: f(x, y) gives the intensity at position (x, y). Realistically, we expect the image to be defined only over a rectangle, with a finite range: f: [a,b] x [c,d] → [0,1].
A color image is just three functions pasted together: the (R, G, B) components.

24-bit Color Image
Each pixel is represented by three bytes, one each for the R, G, and B components, giving 256 x 256 x 256 = 16,777,216 possible colors. Such flexibility does carry a storage penalty: a 640x480 24-bit color image requires 921.6 kB of storage without any compression.
Define Colors via RGB
Trichromatic color-mixing theory: any color can be obtained by mixing three primary colors in the right proportions. The primary colors for illuminating sources are red, green, and blue (RGB); a CRT works by exciting red, green, and blue phosphors with separate electron guns, and R + G + B = white. RGB is the representation used in digital images.
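The storage figure is easy to verify (using 1 kB = 1000 bytes, which matches the slide's 921.6):

```python
# Uncompressed storage for a 640x480 image at 24 bits (3 bytes) per pixel.
width, height, bytes_per_pixel = 640, 480, 3
size_bytes = width * height * bytes_per_pixel
num_colors = 256 ** 3              # combinations of the three 8-bit components
print(size_bytes / 1000, num_colors)   # 921.6 (kB), 16777216
```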

A Multimedia System
Redundancy in Media Data
Media (speech, audio, image, video) are not random collections of signals; they exhibit similar structure within local neighborhoods:
- Temporal redundancy: the current and next signals are very similar (smooth media: speech, audio, video)
- Spatial redundancy: pixel intensities and colors in local regions are very similar
- Spectral redundancy: when the data is mapped into the frequency domain, a few frequencies dominate over the others

Lossless Compression
Compresses the signal but can reproduce the exact original signal. Used for archival purposes, and often for medical imaging and technical drawings. Assigns new binary codes to the symbols based on their frequency of occurrence in the message.
- Example 1: run-length encoding (BMP, PCX). BBBBEEEEEEEECCCCDAAAAA → 4B8E4C1D5A
- Example 2: Lempel-Ziv-Welch (LZW): adaptive dictionary; dynamically builds a dictionary of strings to represent messages efficiently. Used in GIF and TIFF.
- Example 3: Huffman coding: the length of the codeword representing a symbol (or a value) scales inversely with the probability of the symbol's appearance. Used in PNG, MNG, TIFF.
Lossy Compression
The compressed signal, after decompression, does not match the original signal: compression introduces some distortion. Suitable for natural images such as photos, in applications where minor (sometimes imperceptible) loss of fidelity is acceptable in exchange for a substantial reduction in bit rate. Types:
- Color space reduction: reduce 24 → 8 bits via a color lookup table
- Chrominance subsampling: from 4:4:4 to 4:2:2, 4:1:1, or 4:2:0. The eye perceives spatial changes in brightness more sharply than changes in color, so some of the chrominance information is averaged or dropped.
- Transform coding (or perceptual coding): a Fourier-type transform (DCT, wavelet) followed by quantization and entropy coding. Today's focus.
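The run-length example from the slide can be reproduced directly (a minimal sketch; `rle_encode`/`rle_decode` are illustrative helper names):

```python
from itertools import groupby

def rle_encode(s: str) -> str:
    """Run-length encode: 'BBBBEEEEEEEECCCCDAAAAA' -> '4B8E4C1D5A'."""
    return "".join(f"{len(list(g))}{ch}" for ch, g in groupby(s))

def rle_decode(s: str) -> str:
    out, count = [], ""
    for ch in s:
        if ch.isdigit():
            count += ch            # run counts may span multiple digits
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

encoded = rle_encode("BBBBEEEEEEEECCCCDAAAAA")
print(encoded)                     # 4B8E4C1D5A
assert rle_decode(encoded) == "BBBBEEEEEEEECCCCDAAAAA"
```

Note that RLE only wins when the data has long runs; on text without repetition the "encoded" form can be longer than the original.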

A Typical Image Compression System
- Transformation: transform the original data into a new representation that is easier to compress (for images: DCT, plus zigzag ordering)
- Quantization: use a limited number of levels to represent the signal values (scalar quantization)
- Binary encoding: find an efficient way to represent these levels using binary bits (run-length coding, Huffman coding; DC coefficients: prediction + Huffman; AC coefficients: run-length + Huffman)
Coding Colored Images
Color images are typically stored in (R,G,B) format, and the JPEG standard can be applied to each component separately. But this makes no use of the correlation between the color components, nor of the lower sensitivity of the human eye to chrominance samples.
Alternate approach: convert the (R,G,B) representation to a YCbCr representation (Y: luminance; Cb, Cr: chrominance) and down-sample the two chrominance components, because the peak response of the eye to the luminance component occurs at a higher frequency than its response to the chrominance components.
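The RGB → YCbCr conversion can be sketched with the standard BT.601 full-range coefficients used by JPEG (the function name is illustrative):

```python
import numpy as np

def rgb_to_ycbcr(rgb: np.ndarray) -> np.ndarray:
    """BT.601 full-range RGB -> YCbCr (as used by JPEG), 8-bit values."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return np.stack([y, cb, cr], axis=-1)

# Pure white maps to (Y, Cb, Cr) = (255, 128, 128): full luminance, neutral chroma.
white = np.array([[[255.0, 255.0, 255.0]]])
print(rgb_to_ycbcr(white)[0, 0])   # [255. 128. 128.]
```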

Chrominance Subsampling
Key Concepts of Video Compression
- Temporal prediction (INTER mode): predict a new frame from a previous frame and only specify the prediction error. The prediction error is coded using an image coding method (e.g., DCT-based, as in JPEG); prediction errors have smaller energy than the original pixel values and can be coded with fewer bits.
- Motion compensation to improve prediction: use motion-compensated temporal prediction to account for object motion.
- INTRA frame coding (INTRA mode): regions that cannot be predicted well are coded directly using a DCT-based method.
- Spatial prediction: use spatial directional prediction to exploit spatial correlation (H.264).
- Work on each macroblock (MB) (16x16 pixels) independently for reduced complexity: motion compensation is done at the MB level, and DCT coding of the error at the block level (8x8 pixels or smaller).
Together this is block-based hybrid video coding.
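Motion-compensated prediction searches the reference frame for the best match to each block. A brute-force sketch (toy frame sizes; the `motion_search` name, SAD criterion, and ±4-pixel search window are illustrative, not a standard's algorithm):

```python
import numpy as np

def motion_search(ref: np.ndarray, block: np.ndarray, top: int, left: int,
                  search: int = 4) -> tuple:
    """Exhaustive block matching: find the motion vector (dy, dx) that
    minimizes the sum of absolute differences (SAD) against `ref`."""
    n = block.shape[0]
    best, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                continue
            sad = np.abs(ref[y:y+n, x:x+n] - block).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv

# Current frame = reference shifted right by 2 pixels.
ref = np.arange(256, dtype=float).reshape(16, 16)
cur = np.empty_like(ref)
cur[:, 2:] = ref[:, :-2]
cur[:, :2] = ref[:, :2]

block = cur[4:8, 4:8]                        # a 4x4 block of the current frame
mv = motion_search(ref, block, top=4, left=4)
print(mv)   # (0, -2): the block came from 2 pixels to the left in the reference
```

Real encoders use fast search patterns and sub-pixel refinement rather than an exhaustive scan, but the prediction idea is the same: send the motion vector plus the (small) residual instead of the raw block.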

Different Prediction Modes
- Intra: coded directly
- Predictive: predicted from a previous frame
- Bidirectional: predicted from a previous frame and a following frame
Prediction can be done at the frame or block level.

MPEG Frame Arrangement
A Typical Video Compression System
- Transformation: transform the original data into a new representation that is easier to compress (temporal prediction with motion compensation for P and B frames; spatial prediction for I frames)
- Quantization: use a limited number of levels to represent the signal values (scalar quantization, vector quantization)
- Binary encoding: find an efficient way to represent these levels using binary bits (fixed length or variable length: run-length coding, Huffman coding)

A Typical Speech Compression System
- Transformation: transform the original data into a new representation that is easier to compress (temporal prediction)
- Quantization: use a limited number of levels to represent the signal values (scalar quantization, vector quantization)
- Binary encoding: find an efficient way to represent these levels using binary bits (fixed length or variable length: run-length coding, Huffman coding)
Compressing Speech via Temporal Prediction

Demo Results
Comparing the original signal and its histogram with the difference signal and its histogram: the difference signal has a much smaller range → easier to encode.
Your Ear as a Filterbank
The auditory system can be roughly modeled as a filterbank consisting of 25 overlapping bandpass filters, from 0 to 20 kHz. The ear cannot distinguish sounds within the same band that occur simultaneously; each such band is called a critical band. The bandwidth of each critical band is about 100 Hz for signals below 500 Hz, and increases linearly above 500 Hz up to 5000 Hz. 1 bark = the width of 1 critical band.
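The demo's point, that the first-order difference of a smooth signal has a much smaller range than the signal itself, is easy to check (the synthetic "speech-like" signal below is illustrative):

```python
import numpy as np

# A slowly varying signal: consecutive samples are similar, so the
# first-order difference d[n] = x[n] - x[n-1] has a much smaller range.
n = np.arange(1000)
x = np.sin(2 * np.pi * n / 200) + 0.5 * np.sin(2 * np.pi * n / 77)

d = np.diff(x)
print(x.max() - x.min(), d.max() - d.min())  # the difference range is far smaller
```

A quantizer applied to d therefore needs fewer bits for the same fidelity, which is exactly why predictive speech coders transmit prediction residuals.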

Threshold in Quiet
The audible level at various frequencies: the minimum sound level that an average ear with normal hearing can detect with no other sound present. A frequency band only needs to be coded if its sound level is above the corresponding threshold.
Frequency Masking
When two sound frequencies are present in the signal simultaneously, the presence of one can hide the perception of the other (also known as simultaneous masking). A weak noise (the maskee) can be made inaudible by a simultaneously occurring stronger signal (the masker), e.g., a pure tone, if the masker and maskee are close enough to each other in frequency. For example, a 1 kHz tone at 60 dB raises a masking threshold around 1 kHz well above the threshold in quiet.
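The threshold-in-quiet curve has a widely used analytic approximation due to Terhardt (this formula comes from the psychoacoustics literature, not from the slides):

```python
import numpy as np

def threshold_in_quiet_db(f_hz: np.ndarray) -> np.ndarray:
    """Terhardt's approximation of the threshold in quiet, in dB SPL."""
    f = f_hz / 1000.0   # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * np.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# The ear is most sensitive (lowest threshold, even below 0 dB SPL)
# around 3-4 kHz; the threshold rises steeply at both spectrum ends.
freqs = np.array([100.0, 1000.0, 4000.0, 15000.0])
print(np.round(threshold_in_quiet_db(freqs), 1))
```

A perceptual audio coder compares each band's signal level against this curve (plus any masking thresholds) and skips or coarsely quantizes bands that fall below it.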

A Multimedia System
Application Architectures (2: Application Layer)
- Client-server, including data centers / cloud computing
- Peer-to-peer (P2P)
- Hybrid of client-server and P2P

Ways to Distribute Videos
- Single server, single (or many) clients: not scalable
- IP multicast: requires uniform router hardware
- Content delivery networks (CDNs): $$$$; serve small-size, highly popular data
- Application end points (pure/hybrid P2P): unstable, popularity driven
Client-Server Architecture
- Server: an always-on host with a permanent IP address; server farms for scaling
- Clients: communicate with the server; may be intermittently connected; may have dynamic IP addresses; do not communicate directly with each other

Pure P2P Architecture
No always-on server; arbitrary end systems communicate directly; peers are intermittently connected and change IP addresses. Highly scalable, but difficult to manage.
Hybrid of Client-Server and P2P
- Skype, a voice-over-IP P2P application: a centralized server finds the address of the remote party; the client-client connection is direct (not through the server).
- Instant messaging: chatting between two users is P2P, but presence detection/location is a centralized service. A user registers its IP address with a central server when it comes online, and contacts the central server to find the IP addresses of buddies.

Media over IP (Internet): Making It Work
- Use UDP to avoid TCP congestion control and the delay associated with it; required for time-sensitive media traffic.
- Use RTP/UDP to enable QoS monitoring: sender and receiver can record the number of packets sent/received and adjust their operations accordingly.
- The client side uses an adaptive playout delay to compensate for network delay (and jitter).
- The server side matches the stream bandwidth to the available client-to-server path bandwidth: choose among pre-encoded stream rates, or encode at a dynamic rate.
- Error recovery (on top of UDP): FEC and/or interleaving; retransmissions (time permitting); unequal error protection (duplicate important parts); error concealment (interpolate from nearby data).
Image and Video Are Vulnerable to Losses
Assuming a conventional MPEG-like system (MC-prediction, block DCT, run-length and Huffman coding), losses create two types of problems:
- Loss of bit-stream synchronization: the decoder does not know which bits correspond to which parameters (e.g., an error in a Huffman codeword).
- Incorrect state and error propagation: the decoder's state differs from the encoder's, leading to incorrect predictions and error propagation (e.g., an error in MC-prediction or DC-coefficient prediction).
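The client's adaptive playout delay is commonly computed as an exponentially weighted moving average of the observed delays plus a safety margin for jitter (a sketch of the classic textbook estimator; the constant U and the timestamps below are illustrative):

```python
# Adaptive playout delay: d estimates the average network delay, v its
# variation (jitter); the first packet of a talk spurt is scheduled for
# playout at t_send + d + 4*v so that late packets still arrive in time.
U = 0.01   # EWMA weight: small -> smooth, slow-adapting estimates

def update(d: float, v: float, t_send: float, t_recv: float):
    delay = t_recv - t_send
    d = (1 - U) * d + U * delay
    v = (1 - U) * v + U * abs(delay - d)
    return d, v

d, v = 0.0, 0.0
# simulated (send, receive) timestamps in milliseconds, with jitter
for t_send, t_recv in [(0, 50), (20, 75), (40, 88), (60, 112)]:
    d, v = update(d, v, t_send, t_recv)

playout_offset = d + 4 * v
print(round(playout_offset, 2))   # playout delay applied at the receiver
```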

Layered Solution
Use a layered representation; receivers decide. Layers are added and dropped to adjust to the appropriate target rate.
Error Concealment for Video
- Repeat pixels from the previous frame: effective when there is no motion; potential problems when there is motion.
- Interpolate pixels from neighboring regions: correctly recovering missing pixels is extremely difficult, but even correctly estimating the DC (average) value is very helpful.
- Interpolate motion vectors from the previous frame: can use the coded motion vector, a neighboring motion vector, or compute a new motion vector.
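The simplest concealment strategy, repeating pixels from the previous frame, can be sketched as follows (toy 4x4 frames; the function and variable names are illustrative):

```python
import numpy as np

def conceal_copy(prev: np.ndarray, cur: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Conceal lost pixels (mask == True) by copying from the previous frame."""
    out = cur.copy()
    out[mask] = prev[mask]
    return out

prev = np.full((4, 4), 100.0)      # previous, correctly decoded frame
cur = np.full((4, 4), 102.0)       # current frame
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True              # a lost 2x2 block in the current frame

fixed = conceal_copy(prev, cur, mask)
print(fixed[1, 1], fixed[0, 0])    # 100.0 (copied), 102.0 (intact)
```

This works well for static content; for moving content, the copied block is misaligned, which is why motion-vector interpolation is the better option when motion is present.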

A Multimedia System
What Is a Watermark?
A watermark is a secret message that is embedded into a cover message. Usually, only knowledge of a secret key allows us to extract the watermark. It has a mathematical property that allows us to argue that its presence is the result of deliberate action. The effectiveness of a watermark is a function of its stealth, resilience, and capacity.

Watermarking Encoding: the encoder embeds the watermark S into the original image under the user key K, producing the watermarked image.
Watermarking Decoding: the decoder takes the watermarked image (and the original image) together with the user key K, extracts a watermark X, and checks whether S = X.

Various Categories of Watermarks
- Based on method of insertion: additive; quantize-and-replace
- Based on domain of insertion: transform domain; spatial domain
- Based on method of detection: private (requires the original image); public, or oblivious (does not require the original)
- Based on security type: robust (survives image manipulation); fragile (detects manipulation; used for authentication)
Embedding Watermarks, Method 1: Spatial-Domain Least Significant Bit (LSB) Modification
Simple but not robust: replace the least significant bit of an image pixel's value with your watermark bit (0 or 1).
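LSB modification in code (a minimal sketch; the function names are illustrative): the watermark bit replaces the lowest bit, so each pixel value changes by at most 1, which is imperceptible but easily destroyed by any re-quantization.

```python
import numpy as np

def embed_lsb(pixels: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Replace each pixel's least significant bit with a watermark bit."""
    return (pixels & np.uint8(0xFE)) | bits.astype(np.uint8)

def extract_lsb(pixels: np.ndarray) -> np.ndarray:
    return pixels & np.uint8(1)

img = np.array([200, 37, 54, 129], dtype=np.uint8)
wmk = np.array([1, 0, 1, 1], dtype=np.uint8)

marked = embed_lsb(img, wmk)
print(extract_lsb(marked))                                    # [1 0 1 1]
print(np.max(np.abs(marked.astype(int) - img.astype(int))))  # 1
```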

Spatial-Domain Robust Watermarking
Pseudo-randomly (based on a secret key) select n pairs of pixels; in pair i, let a_i and b_i be the values of the two pixels. The expected value of sum_i (a_i - b_i) is 0. To embed, increase each a_i by 1 and decrease each b_i by 1; the expected value of sum_i (a_i - b_i) is now 2n. To detect the watermark, check sum_i (a_i - b_i) on the watermarked image.
Frequency-Domain Robust Watermark: Spread-Spectrum Watermark
Spread spectrum transmits a narrowband signal over a much larger bandwidth, so the signal energy present in any single frequency is much smaller. Applied to watermarking: the watermark is spread over many frequency bins, so the (change of) energy in any one bin is very small and almost undetectable. Watermark extraction combines these many weak signals into a single, stronger output; this works because the watermark verification process knows the location and content of the watermark. To destroy such a watermark would require adding high-amplitude noise to all frequency bins.
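The pairwise scheme can be sketched as follows (the secret key is modeled by a seeded random generator; the array sizes and value ranges are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)        # stands in for the secret key
pixels = rng.integers(50, 200, size=10000).astype(float)

n = 2000
idx = rng.choice(pixels.size, size=2 * n, replace=False)
a_idx, b_idx = idx[:n], idx[n:]        # n secret pixel pairs (a_i, b_i)

# Embed: a_i += 1, b_i -= 1; the detection statistic shifts by exactly 2n.
marked = pixels.copy()
marked[a_idx] += 1
marked[b_idx] -= 1

stat_unmarked = (pixels[a_idx] - pixels[b_idx]).sum()
stat_marked = (marked[a_idx] - marked[b_idx]).sum()
print(stat_marked - stat_unmarked)     # 4000.0, i.e. 2n
```

Detection compares the statistic against a threshold: near 0 for unmarked images (in expectation), near 2n for marked ones. Without the key, an attacker does not know which pairs to test.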

Spread-Spectrum Watermark: Cox et al. (UMCP ENEE631 slides, created by M. Wu based on research talks 1998-2004)
What to use as the watermark, and where to put it?
- Place the watermark in the perceptually significant spectrum (for robustness), modifying coefficients by a small amount below the just-noticeable difference (JND).
- Use a long, random, noise-like vector as the watermark, for robustness/security against jamming and removal, and for imperceptibility.
- Embedding: v'_i = v_i + alpha*w_i, or v'_i = v_i(1 + alpha*w_i).
- Perform a DCT on the entire image and embed the watermark in the DCT coefficients: choose the N = 1000 largest AC coefficients and scale {v_i} by a random factor. (Pipeline: full-frame 2D DCT; a seeded random vector generator produces the watermark; v' = v(1 + alpha*w); full-frame IDCT together with the other coefficients yields the marked image.)
Detection:
- Subtract the original image from the test image before feeding it to the detector ("non-blind detection").
- Correlation-based detection: a correlator normalized by Y in the Cox et al. paper. To decide whether X' = X + W + N (marked) or X' = X + N (unmarked): preprocess the original and test images, take the DCT of each, select the N largest coefficients, compute the similarity with the watermark, and threshold to reach a decision.
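A toy version of the embed-and-detect loop (the coefficient values, alpha, and detector details are illustrative; a real system would use the image's actual DCT coefficients and the exact similarity measure from the Cox et al. paper):

```python
import numpy as np

rng = np.random.default_rng(0)         # key-dependent in a real system
alpha = 0.1
v = rng.uniform(10, 100, size=1000)    # stand-in for the N largest DCT coeffs
w = rng.standard_normal(1000)          # long random noise-like watermark

v_marked = v * (1 + alpha * w)         # embedding: v'_i = v_i (1 + alpha w_i)

# Non-blind detection: subtract the original coefficients, then correlate
# the extracted signal with the known watermark.
x = (v_marked - v) / (alpha * v)       # recovers w in this noise-free toy case
sim = x @ w / np.sqrt(x @ x)           # normalized correlation

x_other = rng.standard_normal(1000)    # extraction from an unmarked image
sim_other = x_other @ w / np.sqrt(x_other @ x_other)
print(float(sim), float(sim_other))    # large vs. near zero
```

The marked image gives a similarity far above any threshold, while an unmarked one stays near zero: the many per-coefficient changes are individually tiny, but the correlator sums them coherently.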

A Multimedia System
Final Exam
- Covers everything up to this lecture; use the lecture slides and book readings
- Place and time: June 9th, 9am-11am (rather than 8am-11am)
- Closed book, closed notes
- Two more office hours: Friday May 4th, 3-5pm at HFH 1121; next Monday May 7th, 11am-noon at HFH 1121