
Video

In filmmaking and video production, footage is the raw, unedited material as originally captured by a film or video camera, which must be edited to create a motion picture, video clip, television show or similar completed work. Video signals are separated into several channels for recording and transmission. There are different methods of color channel separation, depending on the video format and its historical origins.

For example, broadcast video devices were originally designed for black-and-white video, and color was added later. This is still evident in today's video formats, which break image information into separate black-and-white and color information. On the other hand, image and video processing on computers developed later and is more flexible, so a three-color RGB model was adopted instead of a luma-chroma model.

Video signal formats

NTSC An NTSC television channel occupies a total bandwidth of 6 MHz. The actual video signal is transmitted between 500 kHz and 5.45 MHz above the lower bound of the channel. The video carrier is 1.25 MHz above the lower bound of the channel, the color subcarrier is 3.579545 MHz above the video carrier, and the main audio carrier is 4.5 MHz above the video carrier.
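As a quick worked sketch (assuming, purely for illustration, North American channel 2 with a 54 MHz lower bound), the absolute carrier frequencies follow directly from these offsets:

```python
# Hypothetical example: NTSC carrier placement for channel 2 (54-60 MHz).
lower_bound = 54.0                            # MHz, lower edge of the channel

video_carrier = lower_bound + 1.25            # 55.25 MHz
color_subcarrier = video_carrier + 3.579545   # 58.829545 MHz
audio_carrier = video_carrier + 4.5           # 59.75 MHz

print(f"video carrier:    {video_carrier:.6f} MHz")
print(f"color subcarrier: {color_subcarrier:.6f} MHz")
print(f"audio carrier:    {audio_carrier:.6f} MHz")
```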

PAL Phase Alternating Line is a colour encoding system for analogue television, used in broadcast television systems in most countries. PAL uses a subcarrier carrying the chrominance information, added to the luminance video signal to form a composite video baseband signal. The frequency of this subcarrier is 4.43361875 MHz. The name "Phase Alternating Line" describes the way the phase of part of the colour information on the video signal is reversed with each line, which automatically corrects phase errors in the transmission of the signal by cancelling them out, at the expense of vertical frame colour resolution.

PAL The 4.43361875 MHz frequency of the colour carrier is a result of 283.75 colour clock cycles per line plus a 25 Hz offset to avoid interference.
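Using the PAL line frequency (625 lines x 25 frames/s = 15625 Hz), the arithmetic works out exactly:

$$f_{sc} = 283.75 \times 15625\,\text{Hz} + 25\,\text{Hz} = 4\,433\,618.75\,\text{Hz} = 4.43361875\,\text{MHz}$$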

SECAM (Sequential Color with Memory) SECAM differs from the other color systems in the way the R-Y and B-Y signals are carried. First, SECAM uses frequency modulation to encode chrominance information on the subcarrier. Second, instead of transmitting the red and blue information together, it sends only one of them at a time and uses the information about the other color from the preceding line. An analog delay line, a memory device, stores one line of color information; this justifies the "Sequential, With Memory" name.

SECAM Because SECAM transmits only one color-difference signal at a time, it is free of the color artifacts present in NTSC and PAL that result from the combined transmission of both signals. The trade-off is that the vertical color resolution is halved relative to NTSC. Because the FM modulation of SECAM's color subcarrier is insensitive to phase (or amplitude) errors, phase errors do not cause loss of color saturation in SECAM. SECAM uses the YDbDr color model; FM encoding suits this scheme, which transmits only one color-difference signal at a time.

SECAM SECAM transmissions are more robust over longer distances than NTSC or PAL.

Comparison of the three systems:

NTSC: 525 lines, 30 fps. Resolutions: 720x480, 704x480, 352x480, 352x240. Also called "composite video" because all the video information (synchronization, luminance, and color) is combined into a single analog signal. Has some color distortions.

PAL: 625 lines, 25 fps. Resolutions: 720x576, 704x576, 352x576, 352x288. By reversing the relative phase of the color signal components on alternate scanning lines, this system avoids the color distortion that appears in NTSC.

SECAM: 625 lines, 25 fps. Resolution: 720x576. The color information is transmitted sequentially (R-Y followed by B-Y, etc.) for each line and conveyed by a frequency-modulated subcarrier, which avoids the distortion arising during NTSC transmission.

Video transmission standards: EDTV, CCIR, CIF, SIF, HDTV

Common concepts Interlacing: Interlacing was invented as a way to reduce flicker in CRT video displays without increasing the number of complete frames per second, which would have sacrificed image detail to remain within the limitations of a narrow bandwidth. Progressive scan: Each refresh period updates all scan lines in each frame in sequence. When displaying a natively progressive broadcast or recorded signal, the result is optimum spatial resolution of both the stationary and moving parts of the image.

EDTV Enhanced-definition television, or extended-definition television (EDTV), is an American Consumer Electronics Association (CEA) marketing shorthand term for certain digital television formats and devices. Specifically, the term covers formats that deliver a picture superior to that of SDTV but not as detailed as HDTV: devices capable of displaying 480-line or 576-line signals in progressive scan. EDTV signals require more bandwidth than their interlaced counterparts (due to frame doubling).

EDTV EDTV broadcasts use less digital bandwidth than HDTV, so TV stations can broadcast several EDTV channels at once. EDTV signals are broadcast with non-square pixels. Progressive displays (such as plasma displays and LCDs) can show EDTV signals without the need to interlace them first, which can reduce motion artifacts. However, to achieve this, most progressive displays require the broadcast to be frame-doubled (i.e., 25 to 50 and 30 to 60) to avoid the same motion flicker issues that interlacing fixes.

HDTV High-definition television (HDTV) is a digital television broadcasting system with a significantly higher resolution than traditional formats (NTSC, SECAM, PAL): it transmits widescreen pictures with more detail and quality than standard analog television or other digital television formats. Any scan line count greater than 480 is generally considered "High Definition"; even 480 lines transmitted as progressive scan is considered a "High Definition" image. The top of the heap is the 1080-line HDTV standard, which several broadcasters have elected to support.

CCIR CCIR is the Consultative Committee for International Radio; one of the most important standards it has produced is CCIR-601, for component digital video. It defines digital video specifications, all with an aspect ratio of 4:3. The CCIR-601 standard uses an interlaced scan, so each field has only half the vertical resolution of a full frame.

CIF A format used to standardize the horizontal and vertical resolutions (in pixels) of YCbCr sequences in video signals, commonly used in video teleconferencing systems. CIF stands for Common Intermediate Format, specified by the CCITT (International Telegraph and Telephone Consultative Committee). The idea of CIF is to specify a format for lower bit rates; CIF is 352x288 pixels. QCIF stands for Quarter-CIF: to have one fourth of the area, as "quarter" implies, the height and width of the frame are halved, giving 176x144 pixels.
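A rough sketch of why these formats target lower bit rates: the raw 4:2:0 data rate scales with frame area, so QCIF carries a quarter of the data of CIF (the 30 fps figure below is an assumption for illustration):

```python
# Raw (uncompressed) 4:2:0 data rates for CIF and QCIF.
# 12 bits/pixel = 8 bits luma + 4 bits shared chroma.
def raw_bitrate_mbps(width, height, fps=30, bits_per_pixel=12):
    return width * height * bits_per_pixel * fps / 1e6

for name, (w, h) in {"CIF": (352, 288), "QCIF": (176, 144)}.items():
    print(f"{name}: {raw_bitrate_mbps(w, h):.1f} Mbit/s uncompressed")
# CIF: 36.5 Mbit/s, QCIF: 9.1 Mbit/s -- one quarter, as "quarter" implies
```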

Digitization of video The basic process used to digitize images to create video sequences is the sampling of image elements (pixels) for intensity and color. For color video, each element contains intensity (brightness) and color components (red, green, and blue - RGB). These components are periodically sampled and converted into a digital format. Analog video digitization involves analyzing each scan line of video, separating the color and intensity levels and digitizing each component.

For digital video captured from optical sensors (such as video recorders with CCD sensors), each pixel element is converted into color components (red, green, and blue), each with an intensity level (brightness). Converting video signals at 30 frames per second into digital streams of data results in large amounts of data. For color images, each line of the image is divided (filtered) into its color components (red, green and blue). Each position on the filtered image is scanned or sampled and converted to a level, and each sampled level is converted into a digital signal.
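A minimal sketch of the sampling step just described, quantizing analog intensity samples (normalized here to 0.0..1.0, an assumption for illustration) into 8-bit digital levels:

```python
# Quantize an analog level in [0.0, 1.0] to an integer code 0..255.
def quantize(sample, levels=256):
    return min(int(sample * levels), levels - 1)

scan_line = [0.0, 0.25, 0.5, 0.99, 1.0]      # analog samples along one line
print([quantize(s) for s in scan_line])      # [0, 64, 128, 253, 255]
```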

Video file formats: MOV, RealVideo, H.261, H.263, Cinepak, Nero Digital

MOV MOV is a multimedia container file format used by Apple's QuickTime program and the structural basis of the MPEG-4 container. MOV files use Apple's proprietary compression algorithms. Apple introduced the MOV file format in 1991 with QuickTime. The format specifies a multimedia container file that contains one or more tracks, each of which stores a particular type of data: audio, video, effects, or text (e.g. for subtitles). MOV and MP4 files are similar and can both be played by QuickTime. However, MP4 is an international standard and is more widely supported than MOV.

RealVideo RealVideo is a suite of proprietary video compression formats developed by RealNetworks. It is supported on many platforms, including Windows, Mac, Linux, Solaris, and several mobile phones. RealVideo codecs are identified by four-character codes: RV10 and RV20 are H.263-based codecs, while RV30 and RV40 are RealNetworks' proprietary codecs.

RealVideo can be played from a RealMedia file or streamed over the network using the Real Time Streaming Protocol (RTSP). However, RealNetworks uses RTSP only to set up and manage the connection. The actual video data is sent with their own proprietary Real Data Transport (RDT) protocol.

H.261 H.261 is an ITU-T video compression standard. It is the first member of the H.26x family of video coding standards in the domain of the ITU-T Video Coding Experts Group (VCEG), and was the first video codec that was useful in practical terms.

H.261 was originally designed for transmission over ISDN lines, on which data rates are multiples of 64 kbit/s. The coding algorithm was designed to operate at video bit rates between 40 kbit/s and 2 Mbit/s, and it is widely used for video conferencing in the 128 kbit/s to 384 kbit/s range. It is a block-based Discrete Cosine Transform method. The H.261 standard actually specifies only how to decode the video. Encoder designers were left free to design their own encoding algorithms, as long as their output was constrained properly to allow it to be decoded by any decoder made according to the standard.

Encoders are also left free to perform any pre-processing they want to their input video, and decoders are allowed to perform any post-processing they want to their decoded video prior to display. One effective post-processing technique that became a key element of the best H.261-based systems is called deblocking filtering. This reduces the appearance of block-shaped artifacts caused by the block-based motion compensation and spatial transform parts of the design.

1. A preprocessor converts the video at the output of a camera to a new format.
2. The coding parameters of the compressed video signal are multiplexed and then combined with the audio, data and end-to-end signaling for transmission.
3. The transmission buffer controls the bit rate, either by changing the quantizer step size at the encoder or, in more severe cases, by requesting a reduction in frame rate, to be carried out at the preprocessor.

Nero Digital Nero Digital is a brand name applied to a suite of MPEG-4-compatible video and audio compression codecs developed by Nero AG of Germany and Ateme of France. The audio codecs are integrated into the Nero Digital Audio+ audio encoding tool for Microsoft Windows, and the audio and video codecs are integrated into Nero's Recode DVD-ripping software. The video streams generated by Nero Digital can be played back on some stand-alone hardware players and in software media players such as the company's own Nero Showtime.

Cinepak is a lossy video codec developed by Peter Barrett at SuperMac Technologies, released in 1991 with the Video Spigot and then in 1992 as part of Apple Computer's QuickTime video suite. One of the first video compression tools to achieve full-motion video on CD-ROM, it was designed to encode 320x240 video at 1x (150 kbyte/s) CD-ROM transfer rates. The original name of this codec was CompactVideo, which is why its FourCC identifier is CVID. The codec was ported to the Microsoft Windows platform in 1993.

Cinepak is based on vector quantization, a significantly different algorithm from the DCT used by most current codecs. This permitted implementation on relatively slow CPUs (video encoded in Cinepak will usually play fine even on a 25 MHz Motorola 68030), but Cinepak files tend to be about 70% larger than similar-quality MPEG-4 Part 2 files. Cinepak uses two codebooks, V1 and V4. Each codebook entry is a 2x2 pixel block (1 block = 4 luma values, or 4 luma and 2 chroma values), with values quantized to the range 0...255. V1 entries are applied to 4x4 pixel blocks; V4 entries are applied to 2x2 sub-blocks.

For processing, Cinepak divides a video into key (intra-coded) images and inter-coded images. In key images, the codebooks are transmitted from scratch; in inter-coded images, codebook entries are selectively updated. Each image is further divided into a number of horizontal bands, and the codebooks can be updated on a per-band basis. Each band is divided into 4x4 pixel blocks, and each block can be coded either from the V1 or from the V4 codebook.

When coding from the V1 codebook, one codebook index per 4x4 block is written to the bit stream, and the corresponding 2x2 codebook entry is upscaled to 4x4 pixels. When coding from the V4 codebook, four codebook indices per 4x4 block are written to the bit stream, one for each 2x2 sub-block. Alternatively, a 4x4 block in an inter-coded image can be skipped; a skipped block is copied unchanged from the previous frame in a conditional-replenishment fashion. The data rate can be controlled by adjusting the rate of key frames and by adjusting the permitted error in each block.
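A minimal sketch of this V1/V4 reconstruction logic (the array shapes follow the description above; the codebook contents and function names are made up, and the real Cinepak bitstream layout is not modeled):

```python
import numpy as np

def decode_v1(entry_2x2):
    """V1: one codebook index per 4x4 block; the 2x2 entry is upscaled."""
    return np.kron(entry_2x2, np.ones((2, 2), dtype=entry_2x2.dtype))

def decode_v4(entries_2x2):
    """V4: four codebook indices per 4x4 block, one per 2x2 sub-block."""
    top = np.hstack([entries_2x2[0], entries_2x2[1]])
    bottom = np.hstack([entries_2x2[2], entries_2x2[3]])
    return np.vstack([top, bottom])

codebook = np.random.randint(0, 256, (256, 2, 2), dtype=np.uint8)  # toy
block_v1 = decode_v1(codebook[17])                  # one index  -> 4x4 block
block_v4 = decode_v4(codebook[[3, 9, 42, 77]])      # four indices -> 4x4 block
```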

Android video formats Android-supported video formats/codecs, with their supported file types/container formats:
H.263: 3GPP (.3gp), MPEG-4 (.mp4)
H.264 AVC (Baseline Profile): 3GPP (.3gp), MPEG-4 (.mp4), MPEG-TS (.ts, AAC audio only, not seekable, Android 3.0+)
MPEG-4 SP: 3GPP (.3gp)
VP8: WebM (.webm), Matroska (.mkv, Android 4.0+); streamable only in Android 4.0 and above

The 3GP and 3G2 file formats are both structurally based on the ISO base media file format defined in ISO/IEC 14496-12 (MPEG-4 Part 12). 3GP and 3G2 are container formats similar to MPEG-4 Part 14 (MP4), which is also based on MPEG-4 Part 12. The 3GP and 3G2 file formats were designed to decrease storage and bandwidth requirements to accommodate mobile phones. They are similar standards, but with some differences: the 3GPP file format was designed for GSM-based phones and may have the filename extension .3gp, while the 3GPP2 file format was designed for CDMA-based phones and may have the filename extension .3g2. Some cell phones use the .mp4 extension for 3GP video.

The Matroska Multimedia Container (.mkv) is an open standard free container format, a file format that can hold an unlimited number of video, audio, picture, or subtitle tracks in one file. It is intended to serve as a universal format for storing common multimedia content, like movies or TV shows.

Video editing

DVD Formats DVD (also known as "Digital Versatile Disc" or "Digital Video Disc") is a popular optical disc storage media format mainly used for video and data storage. Most DVDs are of the same dimensions as compact discs (CDs) but store more than 6 times the data. DVD-ROM has data which can only be read and not written, DVD-R can be written once and then functions as a DVD-ROM, and DVD-RAM or DVD-RW holds data that can be re-written multiple times. DVD-Video and DVD-Audio discs respectively refer to properly formatted & structured video and audio content. Other types of DVD discs, including those with video content, may be referred to as DVD-Data discs.

DVD Technology DVD uses a 650 nm wavelength laser diode, as opposed to 780 nm for CD. This permits a smaller spot on the media surface: 1.32 μm for DVD versus 2.11 μm for CD. Writing speeds for DVD were 1x, that is 1350 kB/s (1318 KiB/s), in the first drive and media models; more recent models at 18x or 20x have 18 or 20 times that speed. Note that for CD drives, 1x means 153.6 kB/s (150 KiB/s), about 9 times slower.

DVD recordable and rewritable HP initially developed recordable DVD media from the need to store data for back-up and transport. DVD recordables are now also used for consumer audio and video recording. Three formats were developed: DVD-R/RW (minus/dash), DVD+R/RW (plus), DVD-RAM.

Dual layer recording Dual Layer recording allows DVD-R and DVD+R discs to store significantly more data, up to 8.5 Gigabytes per side, per disc, compared with 4.7 Gigabytes for single layer discs. The drive with Dual Layer capability accesses the second layer by shining the laser through the first semi-transparent layer. The layer change mechanism in some DVD players can show a noticeable pause, as long as two seconds by some accounts.

DVD-Video DVD-Video is a standard for storing video content on DVD media. DVD-Video discs use either 4:3 or 16:9 aspect ratio MPEG-2 video, stored at a resolution of 720x480 (NTSC) or 720x576 (PAL) at 24, 30, or 60 FPS. Audio is commonly stored using the Dolby Digital (AC-3) or Digital Theater Systems (DTS) formats, ranging from 16-bit/48 kHz to 24-bit/96 kHz, with monaural to 7.1-channel "Surround Sound" presentation, and/or MPEG-1 Layer 2. DVD-Video also supports features like menus, selectable subtitles, multiple camera angles, and multiple audio tracks.

DVD-Audio DVD-Audio is a format for delivering high-fidelity audio content on a DVD. It offers many channel configuration options (from mono to 7.1 surround sound) at various bit depths and sampling frequencies (up to 24-bit/192 kHz). Compared with the CD format, the much higher capacity of DVD enables the inclusion of considerably more music (with respect to total running time and quantity of songs) and/or far higher audio quality (reflected by higher linear sampling rates and higher vertical bit-rates, and/or additional channels for spatial sound reproduction).

MPEG MPEG video compression is used in many current and emerging products. It is at the heart of digital television set-top boxes, DSS, HDTV decoders, DVD players, video conferencing, Internet video, and other applications. These applications benefit from video compression in that they may require less storage space for archived video information, less bandwidth for the transmission of the video information from one point to another, or both.

The Moving Picture Experts Group developed the specifications under ISO and IEC, the International Electrotechnical Commission. "MPEG video" actually consists of two finalized standards, MPEG-1 and MPEG-2, with a third standard, MPEG-4, in the process of being finalized at the time this paper was written. The MPEG-1 and MPEG-2 standards are similar in basic concepts: both are based on motion-compensated, block-based transform coding techniques, while MPEG-4 uses software image construct descriptors for target bit-rates in the very low range, below 64 kbit/s.

MPEG-1 Finalized in 1991, MPEG-1 uses what is referred to as source input format (SIF) video. It was originally optimized to work at video resolutions of 352x240 pixels at 30 frames/sec (NTSC-based) or 352x288 pixels at 25 frames/sec (PAL-based), although MPEG-1 resolution may go as high as 4095x4095 at 60 frames/sec. The bit-rate is optimized for applications of around 1.5 Mbit/sec, but it can be used at higher rates if required. MPEG-1 is defined for progressive frames only and has no direct provision for interlaced video applications, such as broadcast television.

MPEG-2 MPEG-2 addressed issues directly related to digital television broadcasting, such as the efficient coding of field-interlaced video and scalability. The target bit-rate was raised to between 4 and 9 Mbit/sec, giving very high quality video. MPEG-2 consists of profiles and levels, which specify bit-stream scalability, color-space resolution, image resolution, and the maximum bit-rate per profile. Example: Main Profile, Main Level (MP@ML) with 720x480 resolution video at 30 frames/sec, at bit-rates up to 15 Mbit/sec, for NTSC.

MPEG Video Layers MPEG video is broken up into a hierarchy of layers to help with error handling, random search and editing, and synchronization, for example with an audio bitstream.
Video sequence layer: a self-contained bitstream, e.g. a coded movie or advertisement.
Group of pictures layer: composed of one or more groups of intra (I) pictures and non-intra (P and B) pictures.
Picture layer: the picture itself.
Slice layer: each slice is a contiguous sequence of raster-ordered macroblocks.

Each slice consists of macroblocks, which are 16x16 arrays of luminance pixels (picture data elements) with two 8x8 arrays of associated chrominance pixels. The macroblocks can be further divided into distinct 8x8 blocks for further processing, such as transform coding. Each of these layers has its own unique 32-bit start code, defined in the syntax to consist of 23 zero bits followed by a one, followed by 8 bits for the actual start code. These start codes may have as many zero bits as desired preceding them.
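Since such a start code appears in a byte-aligned stream as the byte pattern 00 00 01 followed by one code byte, a simple scanner can locate layer boundaries. A sketch, assuming the stream is already buffered in memory:

```python
# Scan a buffered bitstream for the byte pattern 00 00 01 xx.
def find_start_codes(data: bytes):
    """Yield (offset, code) for each 32-bit start code found."""
    i = data.find(b"\x00\x00\x01")
    while i != -1 and i + 3 < len(data):
        yield i, data[i + 3]                   # the 8-bit start code value
        i = data.find(b"\x00\x00\x01", i + 1)

sample = b"\x00\x00\x01\xb3..." + b"\x00\x00\x01\x00..."
for offset, code in find_start_codes(sample):
    print(f"offset {offset}: start code 0x{code:02x}")
# 0xb3 is the sequence header code, 0x00 the picture start code
```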

An MPEG "film" is a sequence of three kinds of frames:
I-frames (intra-coded)
P-frames (inter-coded)
B-frames (inter-coded)

Video Filter MPEG uses the YCbCr color space to represent the data values instead of RGB, where Y is the luminance signal, Cb is the blue color difference signal, and Cr is the red color difference signal. A macroblock can be represented in several different manners when referring to the YCbCr color space such as 4:4:4, 4:2:2, and 4:2:0 video. 4:2:0 contains one quarter of the chrominance information. Although MPEG-2 has provisions to handle the higher chrominance formats for professional applications, most consumer level products will use the normal 4:2:0 mode.

The 4:2:0 representation allows an immediate data reduction from 12 blocks/macroblock to 6 blocks/macroblock, or 2:1 compared to full bandwidth representations such as 4:4:4 or RGB. To generate this format without generating color aliases or artifacts requires that the chrominance signals be filtered.
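A sketch of the chrominance subsampling step, using a simple 2x2 average as a crude stand-in for the proper anti-alias filtering mentioned above (the 480x720 plane size is just an example):

```python
import numpy as np

def subsample_420(chroma):
    """Reduce a full-resolution Cb or Cr plane to one sample per 2x2 block."""
    h, w = chroma.shape
    c = chroma.astype(np.float32).reshape(h // 2, 2, w // 2, 2)
    return c.mean(axis=(1, 3))           # average each 2x2 neighbourhood

cb = np.random.randint(0, 256, (480, 720)).astype(np.uint8)
cb_420 = subsample_420(cb)               # 240 x 360: a quarter of the samples
```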

DCT 8x8 block values are coded by means of the discrete cosine transform. Consider the following 8x8 block of pixel values:

120 108  90  75  69  73  82  89
127 115  97  81  75  79  88  95
134 122 105  89  83  87  96 103
137 125 107  92  86  90  99 106
131 119 101  86  80  83  93 100
117 105  87  72  65  69  78  85
100  88  70  55  49  53  62  69
 89  77  59  44  38  42  51  58

The normal way is to determine the brightness of each of the 64 pixels and to scale them to some limits, say from 0 to 255, whereby "0" means "black" and "255" means "white".

But you can define all the 64 values by only 5 integers if you apply the following formula, called the discrete cosine transform (DCT):

$$F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}, \qquad C(0)=\frac{1}{\sqrt{2}},\; C(k)=1 \text{ otherwise}$$

Applied to the block above, it produces the coefficient matrix

700  90 100   0   0   0   0   0
 90   0   0   0   0   0   0   0
-89   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

The decoder can reconstruct the pixel values by the following formula, called the inverse discrete cosine transform (IDCT):

$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\,C(v)\,F(u,v)\cos\frac{(2x+1)u\pi}{16}\cos\frac{(2y+1)v\pi}{16}$$
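As a compact numeric check (a sketch, not part of the original slides), the same pair of transforms can be written with an orthonormal DCT matrix D, so that F = D X Dᵀ and X = Dᵀ F D; for N = 8 this matches the C(u)C(v)/4 normalization of the formulas above:

```python
import numpy as np

N = 8
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
D = np.sqrt(2 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
D[0, :] = np.sqrt(1 / N)              # row 0 carries the C(0) = 1/sqrt(2)

def dct2(block):                      # forward: F = D X D^T
    return D @ block @ D.T

def idct2(coeff):                     # inverse: X = D^T F D
    return D.T @ coeff @ D

x = np.random.randint(0, 256, (8, 8)).astype(float)
assert np.allclose(idct2(dct2(x)), x)   # exact reconstruction
```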

Quantization: This operation is used to force as many of the DCT coefficients to zero, or near zero, as possible within the boundaries of the prescribed bit-rate and video quality parameters. Run Length VLC: Considerable savings can be had by representing the fairly large number of zero coefficients in a more effective manner, and that is the purpose of run-length amplitude coding of the quantized coefficients. But before that process is performed, more efficiency can be gained by reordering the DCT coefficients.

Scanning of the example coefficients in a zigzag pattern results in a sequence of numbers as follows: 8, 4, 4, 2, 2, 2, 1, 1, 1, 1, (12 zeroes), 1, (41 zeroes). This sequence is then represented as a run-length (representing the number of consecutive zeroes) and an amplitude (coefficient value following a run of zeroes). These values are then looked up in a fixed table of variable length codes, where the most probable occurrence is given a relatively short code, and the least probable occurrence is given a relatively long code.
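A sketch of the zigzag reordering and (run, amplitude) pairing just described; the end-of-block code that signals the trailing run of zeroes is left out for brevity:

```python
def zigzag_order(n=8):
    """Coordinates of an n x n block in zigzag order (by anti-diagonal)."""
    coords = [(i, j) for i in range(n) for j in range(n)]
    return sorted(coords, key=lambda p: (p[0] + p[1],
                                         p[0] if (p[0] + p[1]) % 2 else -p[0]))

def run_length(block):
    """Quantized 8x8 coefficients -> list of (zero-run, amplitude) pairs."""
    pairs, run = [], 0
    for i, j in zigzag_order():
        v = block[i][j]
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs      # each pair is then mapped to a variable-length code
```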

Video Buffer and Rate Control The encoder buffer provides a constant bit-rate output from the variable-rate coder; rate control must prevent buffer underflow or overflow without severe quality penalties such as the repeating or dropping of entire video frames.
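A sketch of such buffer-feedback control: the quantizer step size is raised as the buffer fills and lowered as it drains. All constants here (buffer size, thresholds, frame sizes) are illustrative, not taken from any standard:

```python
BUFFER_SIZE = 1_835_008                 # bits; an assumed VBV-style capacity

def update_quantizer(q_step, fullness):
    """Return the next quantizer step (1..31) given buffer fullness in bits."""
    ratio = fullness / BUFFER_SIZE
    if ratio > 0.8:                     # nearly full: coarser quantization,
        return min(q_step + 2, 31)      # fewer bits per macroblock
    if ratio < 0.2:                     # nearly empty: finer quantization
        return max(q_step - 1, 1)
    return q_step

fullness, q = 0, 16
for frame_bits in (90_000, 160_000, 220_000, 40_000):   # toy frame sizes
    fullness = max(0, fullness + frame_bits - 150_000)   # channel drains 150k
    q = update_quantizer(q, fullness)
    print(f"fullness={fullness:>7} bits -> q={q}")
```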

Inter-frame construction Imagine an I-frame showing a triangle on a white background. A following P-frame shows the same triangle but at another position. Prediction means to supply a motion vector which declares how to move the triangle on the I-frame to obtain the triangle in the P-frame. This motion vector is part of the MPEG stream and is divided into a horizontal and a vertical part; each part can be positive (motion to the right or downwards) or negative (motion to the left or upwards).

Now suppose the red rectangle is shifted and also rotated by 5 degrees to the right. A simple displacement of the red rectangle will then cause a prediction error, so the MPEG stream contains a matrix for compensating this prediction error. Thus, the reconstruction of inter-coded frames proceeds in two steps: 1. Apply the motion vector to the referred frame. 2. Add the prediction-error compensation to the result.
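A minimal sketch of these two steps for one 16x16 macroblock (the frame sizes, motion vector and residual values are illustrative):

```python
import numpy as np

def reconstruct_block(reference, top, left, motion_vector, residual):
    """Step 1: shift by the motion vector. Step 2: add the prediction error."""
    dy, dx = motion_vector                    # vertical and horizontal parts
    predicted = reference[top + dy : top + dy + 16,
                          left + dx : left + dx + 16]
    return predicted + residual

reference = np.random.randint(0, 256, (288, 352)).astype(np.int16)
residual = np.zeros((16, 16), dtype=np.int16)   # toy: zero prediction error
block = reconstruct_block(reference, 32, 48, (-3, 5), residual)
```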

The input bitstream buffer consists of memory that operates in the inverse fashion of the buffer in the encoder. For fixed bit-rate applications, the constant rate bitstream is buffered in the memory and read out at a variable rate depending on the coding efficiency of the macroblocks and frames to be decoded.

The VLD is the most computationally expensive portion of the decoder because it must operate on a bit-wise basis, with table look-ups performed at speeds up to the input bit-rate. The inverse quantizer block multiplies the decoded coefficients by the corresponding values of the quantization matrix and the quantization scale factor. The resulting coefficients are clipped to the range -2048 to +2047, then an IDCT mismatch control is applied to prevent long-term error propagation within the sequence.

MPEG-4
1. MPEG-4 uses media objects to represent aural, visual or audiovisual content. These media objects can be combined to form compound media objects.
2. MPEG-4 multiplexes and synchronizes the media objects before transmission to provide QoS, and it allows interaction with the constructed scene at the receiver's machine.
3. MPEG-4 organizes the media objects in a hierarchical fashion, where the lowest level has primitive media objects like still images, video objects and audio objects.
4. MPEG-4 has a number of primitive media objects which can be used to represent 2- or 3-dimensional media objects.
5. MPEG-4 also defines a coded representation of objects for text, graphics, synthetic sound, and talking synthetic heads.
6. MPEG-4 provides a standardized way to describe a scene. Media objects can be placed anywhere in the coordinate system, and transformations can be used to change the geometrical or acoustical appearance of a media object.

The visual part of the MPEG-4 standard describes methods for compression of images and video, compression of textures for texture mapping of 2-D and 3-D meshes, compression of implicit 2-D meshes, and compression of time-varying geometry streams that animate meshes. It also provides algorithms for random access to all types of visual objects, as well as algorithms for spatial, temporal and quality scalability and content-based scalability of textures, images and video. Algorithms for error robustness and resilience in error-prone environments are also part of the standard. For synthetic objects, MPEG-4 has parametric descriptions of the human face and body, and parametric descriptions for animation streams of the face and body.

1. MPEG-4 also describes static and dynamic mesh coding with texture mapping, and texture coding for view-dependent applications.
2. MPEG-4 supports coding of video objects with spatial and temporal scalability.
3. Scalability allows decoding a part of a stream and constructing images with reduced decoder complexity (reduced quality), reduced spatial resolution, reduced temporal resolution, or with equal temporal and spatial resolution but reduced quality. Scalability is desired when video is sent over heterogeneous networks, or when the receiver cannot display the video at full resolution (limited power).

Robustness in error-prone environments is an important issue for mobile communications. MPEG-4 has 3 groups of tools for this:
1. Resynchronization tools enable the resynchronization of the bit-stream and the decoder when an error has been detected.
2. After synchronization, data recovery tools are used to recover the lost data. These tools encode the data in an error-resilient way.
3. Error concealment tools are used to conceal the lost data. Efficient resynchronization is key to good data recovery and error concealment.

[Diagram: MPEG-4 encoding. Scene descriptors, object descriptors, video object planes (VOP1, VOP2, VOP3) and encoded audio are multiplexed (MUX) to storage.]