Essence of Image and Video

1 Essence of Image and Video Wei-Ta Chu 2010/9/23

2 Essence of Image Wei-Ta Chu 2010/9/23 Chapters 2 and 6 of Digital Image Processing by R.C. Gonzalez and R.E. Woods, Prentice Hall, 2nd edition, 2001

Image Sensing and Acquisition 3 Collect the incoming energy and focus it onto an image plane.

A Simple Image Formation Model 4 Denote an image by a 2D function f(x,y), characterized by two components: illumination i(x,y), determined by the illumination source, and reflectance r(x,y), determined by the characteristics of the imaged objects, so that f(x,y) = i(x,y) r(x,y).

Image Sampling and Quantization 5 Sampling Quantization Digitizing the coordinate values Digitizing the amplitude values

Image Sampling and Quantization 6 Continuous image projected onto a sensor array Results of image sampling and quantization

Digital Image Representation 7 Dynamic range The number of discrete gray levels allowed for each pixel Due to processing, storage, and sampling hardware considerations, the number of gray levels typically is an integer power of 2: L = 2^k. We refer to images whose gray levels span a significant portion of the gray scale as having a high dynamic range.

Digital Image Representation 8 Image size For a square image with width (height) N and L = 2^k gray levels, the total number of bits required to represent the image is b = N x N x k.

Spatial Resolution 9 Sampling is the principal factor determining the spatial resolution of an image. 1024x1024 down to 32x32: repeatedly downsampled by a factor of 2.

10 Spatial Resolution Images resampled back to 1024x1024 from 512x512, 256x256, 128x128, 64x64, and 32x32.

Gray-Level Resolution (L = 256, 128, 64, 32, 16, 8, 4, 2) 11

Histogram 12 The histogram of an image with gray levels in the range [0, L-1] is a discrete function h(r_k) = n_k, where r_k is the k-th gray level and n_k is the number of pixels with that gray level. The normalized histogram is p(r_k) = n_k / n, where n is the total number of pixels.

Histogram 13 Useful image statistics Image processing applications Image enhancement Image compression Image segmentation

Color Fundamentals 14 Color spectrum: violet, blue, green, yellow, orange & red Each color in the spectrum blends smoothly into the next The colors perceived in an object are determined by the nature of the light reflected from the object

Color Fundamentals 15 Cones can be divided into three principal sensing categories Due to the absorption characteristics of the human eye, colors are seen as variable combinations of the three primary colors (red, green, blue) Approximately 65% of all cones are sensitive to red light, 33% to green light, and 2% to blue light.

Color Fundamentals 16 Secondary colors of light Magenta (R + B) Cyan (G + B) Yellow (R + G) A primary color of pigments subtracts a primary color of light and reflects the other two.

Color Fundamentals 17 Brightness Embodies the achromatic notion of intensity Hue Attribute associated with the dominant wavelength in a mixture of light waves Dominant color as perceived by an observer Saturation The relative purity, or the amount of white light mixed with a hue Less saturated: e.g. pink (red+white), lavender (violet+white) Hue and saturation taken together are called chromaticity.

Specifying Colors 18 The amounts of red, green, and blue needed to form any particular color are called the tristimulus values and are denoted X, Y, and Z, respectively. A color is then specified by its trichromatic coefficients, defined as x = X/(X+Y+Z), y = Y/(X+Y+Z), z = Z/(X+Y+Z), so that x + y + z = 1. The CIE chromaticity diagram shows color composition as a function of x (red) and y (green).

Specifying Colors 19 The point marked green has approximately 63% green and 25% red content. The composition of blue is approximately 13%.

Color Models (Color Spaces) 20 A color model is a specification of a coordinate system and a subspace within that system where each color is represented by a single point. Hardware-oriented & application-oriented RGB color monitors, color video cameras CMY (cyan, magenta, yellow) color printing CMYK (cyan, magenta, yellow, black) color printing HSI (hue, saturation, intensity) closely matches human perception

The RGB Color Model 21 Based on Cartesian coordinate system Different colors are points on or inside the cube Full color image: 8 bits for each component, total 24 bits

22 The RGB Color Model

The CMY and CMYK Color Models 23 When a surface coated with cyan pigment is illuminated with white light, no red light is reflected from the surface. Cyan subtracts red light Most devices that deposit colored pigments on paper require CMY data input or perform RGB to CMY conversion. Equal amounts of CMY pigments should produce black.

The HSI Color Model 24 RGB/CMY color systems are suited for hardware implementations. RGB system matches nicely with the fact that the human eye is strongly perceptive to red, green, and blue primaries. But RGB and CMY are not well suited for describing colors for human interpretation.

The HSI Color Model 25 We describe a color object by its hue, saturation, and brightness. Hue: color attribute that describes a pure color Saturation: degree to which a pure color is diluted by white light Brightness: measured by intensity The HSI color model decouples the intensity component from the color-carrying information

The HSI Color Model 26 Take the RGB cube, stand on the black vertex, with the white vertex above it. The intensity (gray scale) is along the line joining these two vertices.

The HSI Color Model 27 The dot is an arbitrary color point. The angle from the red axis gives the hue, and the length of the vector is the saturation. The intensity of all colors is given by the position of the plane on the vertical intensity axis.

HSI 28 HSI is also known as HSL or HLS; the HSV color space is closely related.

Converting Colors from RGB to HSI 29 RGB values have been normalized to the range [0,1]. The angle θ is measured with respect to the red axis of the HSI space.
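
As a concrete illustration, here is a minimal Python sketch of the standard RGB-to-HSI formulas (following Gonzalez and Woods); the function name and the sample colors are only illustrative.

import math

def rgb_to_hsi(r, g, b):
    """Convert normalized RGB in [0,1] to HSI (H in degrees, S and I in [0,1])."""
    # Intensity: average of the three components.
    i = (r + g + b) / 3.0
    # Saturation: 1 minus the normalized minimum component.
    total = r + g + b
    s = 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total
    # Hue: angle measured from the red axis.
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den)))) if den > 0 else 0.0
    h = theta if b <= g else 360.0 - theta
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> hue about 0 degrees, full saturation
print(rgb_to_hsi(0.0, 0.5, 0.5))   # cyan -> hue about 180 degrees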

The LAB (CIELAB) Color Models 30 CIELAB (L * a * b * ) color space L*: lightness dimension a*,b*: two chromatic dimensions that are roughly red-green and blue-yellow. L*a*b* color is designed to approximate human vision http://en.wikipedia.org/wiki/lab_color_space http://coatings.specialchem.com.cn/tc/color/index.aspx?id=cielab

Other Color Models 31 YUV, YIQ, YCbCr color spaces YCbCr is widely used in video/image compression schemes such as MPEG and JPEG Please refer to http://en.wikipedia.org/wiki/color_space

Color Histogram 32 A representation of the distribution of colors in an image. Discretize colors into a number of bins, and count the number of pixels with colors in each bin. http://rsb.info.nih.gov/ij/plugins/color-inspector.html
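
A minimal Python sketch of building such a binned color histogram, assuming NumPy and an 8-bit RGB image; the choice of 4 bins per channel (64 bins total) and the random test image are illustrative assumptions.

import numpy as np

def color_histogram(image, bins_per_channel=4):
    """Normalized color histogram for an 8-bit RGB image of shape (H, W, 3)."""
    # Quantize each channel into bins_per_channel levels.
    quantized = (image.astype(np.uint32) * bins_per_channel) // 256
    # Combine the three quantized channels into a single bin index per pixel.
    idx = (quantized[..., 0] * bins_per_channel + quantized[..., 1]) * bins_per_channel + quantized[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins_per_channel ** 3)
    return hist / hist.sum()   # normalize so the bins sum to 1

# Example on a random image (stand-in for real pixel data).
img = np.random.randint(0, 256, size=(240, 320, 3), dtype=np.uint8)
h = color_histogram(img)
print(h.shape, h.sum())        # (64,) 1.0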

Nonuniform Quantization 33 An example in HLS (HSI) space, considering human perception. Lee, et al., Spatial color descriptor for image retrieval and video summarization, IEEE Trans. on Multimedia, 2003.

Characteristics of Histogram 34 The color histogram of an image represents the global statistics (color distribution) of pixel colors. The histogram is one of the most useful features for describing images or serving as the basis for similarity measures.

Histogram-based Difference 35 Bin-wise histogram difference between images I_1 and I_2: D(I_1, I_2) = sum over bins j of |H_1(j) - H_2(j)|.

Short Introduction to Image Features 36 Color features Color histogram Color moments Color coherence vectors (CCV) Color correlogram Ma, et al., Benchmarking image features for content-based image retrieval, Record of the 32nd Asilomar Conf. on Signals, Systems & Computers, vol. 1, 1998.

Short Introduction to Image Features 37 Texture features Tamura features (coarseness, directionality, contrast) Multi-resolution simultaneous auto-regressive model Canny edge histogram Gabor texture feature Pyramid-structured wavelet transform (PWT) feature Tree-structured wavelet transform (TWT) feature Ma, et al., Benchmarking image features for content-based image retrieval, Record of the 32nd Asilomar Conf. on Signals, Systems & Computers, vol. 1, 1998.

Color Moments 38 Contain only the dominant features instead of storing the complete color distributions. Store the first three moments of each color channel of an image in the index: average, variance, and skewness.

Color Moments 39 For color channel i of an image with N pixels p_i1, ..., p_iN: average E_i = (1/N) sum_j p_ij, standard deviation sigma_i = sqrt((1/N) sum_j (p_ij - E_i)^2), and skewness s_i = ((1/N) sum_j (p_ij - E_i)^3)^(1/3).

Color Moments 40 Distance between two images I_1 and I_2: a weighted sum, over the color channels, of the difference of averages, the difference of variances, and the difference of skewness values.
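
A minimal Python sketch of the color-moment feature and a simple weighted L1 distance between two feature vectors; the uniform weights and the use of the standard deviation (rather than the raw variance) are assumptions made for illustration.

import numpy as np

def color_moments(image):
    """First three moments (mean, std, cube root of skew) per channel of an RGB image."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    mean = pixels.mean(axis=0)
    std = pixels.std(axis=0)
    skew = np.cbrt(((pixels - mean) ** 3).mean(axis=0))
    return np.concatenate([mean, std, skew])        # 9-dimensional feature vector

def moment_distance(f1, f2, weights=None):
    """Weighted L1 distance between two color-moment feature vectors."""
    weights = np.ones_like(f1) if weights is None else weights
    return float(np.sum(weights * np.abs(f1 - f2)))

img1 = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
img2 = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
print(moment_distance(color_moments(img1), color_moments(img2)))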

Color Correlogram 41 A color correlogram expresses how the spatial correlation of pairs of colors changes with distance. For an n x n image I whose colors are quantized into m colors c_1, ..., c_m, the histogram is defined as h_{c_i}(I) = n^2 * Pr[a pixel p belongs to I_{c_i}]. The notation |p_1 - p_2| is synonymous with max(|x_1 - x_2|, |y_1 - y_2|). Huang, et al., Image indexing using color correlograms, CVPR, 1997.

Color Correlogram 42 Let a distance d be fixed a priori. Then the correlogram of I is defined for color pairs (c_i, c_j) and k = 1, ..., d as gamma^(k)_{c_i,c_j}(I) = Pr[p_2 is of color c_j, given that p_1 is of color c_i and |p_1 - p_2| = k]. This value gives the probability that a pixel at distance k away from a given pixel of color c_i is of color c_j. The autocorrelogram of I captures spatial correlation between identical colors only: alpha^(k)_c(I) = gamma^(k)_{c,c}(I).

43 Essence of Video Wei-Ta Chu 2010/9/23

Constitution of Digital Video Data 44 A natural video stream is continuous in both spatial and temporal domains. In order to represent and process a video stream digitally it is necessary to sample spatially and temporally. Spatial domain Temporal domain

Video Stream 45 Natural scene -> Camera -> RGB to YC1C2 -> Processing, Storage, Transmission -> YC1C2 to RGB -> Monitor

Video Data Representation 46 RGB is not very efficient for representing real-world images, since equal bandwidths are required to describe all three color components. E.g. 8 bits per component, i.e., 24 bits per pixel. The human eye is more sensitive to luminance. Many image coding standards and broadcast systems use luminance and color difference signals: YUV and YIQ for analog television standards, and YCbCr for their digital versions.

Color Models in Video 47 Largely derive from older analog methods for coding color for TV. Luminance is separated from color information. YIQ is the color space used by the NTSC color TV system, employed mainly in North and Central America, and Japan. In Europe, video tape uses the PAL and SECAM codings, which are based on TV that uses a matrix transform called YUV. Digital video mostly uses a matrix transform called YCbCr that is closely related to YUV.

TV Encoding System 48 PAL, short for Phase Alternating Line, is a color encoding system used in broadcast television systems in large parts of the world. SECAM (French for "Sequential Color with Memory") is an analog color television system first used in France. NTSC is the analog television system in use in the United States, Canada, Japan, South Korea, Taiwan, the Philippines, Mexico, and some other countries.

The YUV Color Model 49 The YUV model defines a color space in terms of one luma (brightness) and two chrominance components. The YUV color model is used in the PAL, NTSC, and SECAM composite color video standards. YUV signals are created from an original RGB source. The weighted values of R, G, and B are added together to produce a single Y signal.

The YUV Color Model 50 The U signal is then created by subtracting Y from the blue signal and scaling; V is created by subtracting Y from the red signal and scaling by a different factor.

The YCbCr Color Model 51 YCbCr is a family of color spaces used in video and digital photography systems. Y is the luma component and Cb and Cr are the blue and red chroma components. Recommendation ITU-R BT.601 specifies 8-bit coding: Y = 16 + 65.481 R + 128.553 G + 24.966 B, Cb = 128 - 37.797 R - 74.203 G + 112.0 B, Cr = 128 + 112.0 R - 93.786 G - 18.214 B, with R, G, B normalized to [0,1].
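
A minimal Python sketch of this 8-bit BT.601 conversion, assuming gamma-corrected R, G, B values normalized to [0,1]; the function name is illustrative.

def rgb_to_ycbcr_601(r, g, b):
    """ITU-R BT.601 8-bit YCbCr from gamma-corrected RGB in [0,1]."""
    y  =  16 +  65.481 * r + 128.553 * g +  24.966 * b   # luma, nominal range 16..235
    cb = 128 -  37.797 * r -  74.203 * g + 112.0   * b   # blue chroma, nominal range 16..240
    cr = 128 + 112.0   * r -  93.786 * g -  18.214 * b   # red chroma, nominal range 16..240
    return round(y), round(cb), round(cr)

print(rgb_to_ycbcr_601(1.0, 1.0, 1.0))   # white -> (235, 128, 128)
print(rgb_to_ycbcr_601(0.0, 0.0, 0.0))   # black -> (16, 128, 128)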

Chroma Subsampling 52 4:2:2 indicates horizontal subsampling of the Cb and Cr signals by a factor of 2. Of four pixels labeled 0 to 3, all four Ys are sent, but only two Cb and two Cr samples are sent: (Y0,Cb0) (Y1,Cr0) (Y2,Cb2) (Y3,Cr2). 4:2:0 subsamples Cb and Cr in both the horizontal and vertical dimensions by a factor of 2.

Examples 53 Given an image resolution of 720x576 pixels represented with 8 bits per component, the bit budget required is: 4:4:4: 720x576x8x3 = about 10 Mbits/frame; 4:2:0: (720x576x8) + (360x288x8)x2 = about 5 Mbits/frame.
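
The same arithmetic can be packaged as a small helper; the function name and interface below are only illustrative.

def bits_per_frame(width, height, bits_per_sample=8, subsampling="4:4:4"):
    """Uncompressed bits per frame for common chroma subsampling schemes."""
    luma = width * height * bits_per_sample
    if subsampling == "4:4:4":
        chroma = 2 * luma                       # Cb and Cr at full resolution
    elif subsampling == "4:2:2":
        chroma = 2 * (luma // 2)                # Cb and Cr halved horizontally
    elif subsampling == "4:2:0":
        chroma = 2 * (luma // 4)                # Cb and Cr halved in both dimensions
    else:
        raise ValueError("unknown subsampling")
    return luma + chroma

print(bits_per_frame(720, 576, 8, "4:4:4") / 1e6)   # about 9.95 Mbits/frame
print(bits_per_frame(720, 576, 8, "4:2:0") / 1e6)   # about 4.98 Mbits/frame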

Motion Estimation 54 Successive video frames may contain the same objects (still or moving). Motion estimation examines the movement of objects in an image sequence to try to obtain vectors representing the estimated motion.
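
A minimal exhaustive (full-search) block-matching sketch using the sum of absolute differences (SAD) as the matching criterion; the block size, search range, and names are assumptions, and practical encoders use much faster search strategies.

import numpy as np

def full_search(ref, cur, block=16, search=8):
    """Exhaustive block matching: one (dy, dx) vector per block of the current frame,
    pointing to the best-matching position in the reference frame."""
    h, w = cur.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            target = cur[by:by + block, bx:bx + block].astype(np.int32)
            best, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue
                    cand = ref[y:y + block, x:x + block].astype(np.int32)
                    sad = np.abs(target - cand).sum()     # sum of absolute differences
                    if best is None or sad < best:
                        best, best_mv = sad, (dy, dx)
            vectors[by // block, bx // block] = best_mv
    return vectors

ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))             # content shifted down 2, right 3
print(full_search(ref, cur, block=16, search=4)[1, 1])    # -> [-2 -3] (back to the reference position)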

Motion Estimation 55 The Essence of Image and Video Compression, by A.C. Kokaram http://www.mee.tcd.ie/~ack/teaching/1e8/lecture3.pdf

Three Typical Types of Coded Pictures 56 I frame (intraframe): encoded without any temporal prediction. P frame (forward predicted frame): interframe encoded using motion prediction from the previous I or P frame. B frame (bidirectionally predicted frame): interframe encoded using interpolated motion prediction between the previous I or P frame and the next I or P frame.

Motion Prediction 57 A typical Group of Pictures (GOP) in MPEG-2

Short Introduction to Video Features 58 Motion-based features Camera motion, object motion Motion activity/magnitude Moving object detection Shot-based features Average shot length/shot change frequency Scene-based features

Motion Type 59 Camera motion (global motion) Zoom-in/Zoom-out Pan Tilt Object motion

Motion Activity/Magnitude 60 Attributes: Intensity of activity Direction of activity Spatial distribution of activity Indication of the number and size of active regions Temporal distribution of activity Variation of activity over the duration of a video segment or shot

Average Shot Length / Shot Change Frequency 61 A statistical measurement that divides the total length of the film by the number of shots, i.e., the average duration of a shot between cuts. Directors often change shots frequently (shorter ASL) to attract the audience, e.g., in commercials. Video segments with longer ASLs usually present peaceful scenes.

62 Video Syntax Analysis Wei-Ta Chu 2010/9/23

Outline 63 Shot boundary detection Scene boundary detection Keyframe selection

Video Structure 64 Shot: a consecutive sequence of frames recorded from a single camera. Scene: a collection of semantically related and temporally adjacent shots, depicting and conveying a high-level concept or story. Hierarchy: Video > Scene > Shot > Frame.

Shot Boundary Detection / Shot Change Detection 65 Shot: a basic unit for advanced access, such as browsing, summarization, and retrieval. Keyframes: representative frame(s) of a shot. Issues: large camera/object motion; editing effects (dissolve, wipe, fade); flashlights.

Types of Shot Change 66 Abrupt change (hard cut): the cut occurs in a single frame, e.g., when stopping and restarting the camera. Gradual transition: Fade-in: gradual increase in intensity starting from a black frame. Fade-out: gradual decrease in intensity resulting in a black frame. Dissolve: transiting from the end of one clip to the beginning of another. Wipe: one image is replaced by another with a distinct edge that forms a shape.

Examples of Shot Changes 67 Cut, dissolve, wipe. Li and Lee, Effective detection of various wipe transitions, IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, no. 6, pp. 663-673, 2007.

Examples of Fade 68 Fade out, fade in. Cernekova, et al., Information theory-based shot cut/fade detection and video summarization, IEEE Trans. on Circuits and Systems for Video Technology, vol. 16, no. 1, pp. 82-91, 2006.

Different Types of Wipe 69 Li and Lee, Effective detection of various wipe transitions, IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, no. 6, pp. 663-673, 2007. Video example: http://en.wikipedia.org/wiki/wipe_%28transition%29

Detection Process 70 Extract features, calculate similarity, and make boundary decisions, segmenting the video into Shot 1, Shot 2, Shot 3, Shot 4, ...

Features 71 Pixel difference Statistical difference Histograms Compression differences Edge Motion

Pixel Difference 72 Count the number of pixels that change in value more than some threshold. May be sensitive to camera motion.

1. Pair-wise Comparison 73 Compare the corresponding pixels in two frames. Problem: sensitive to camera movement, e.g., camera panning. Improvement: smooth each frame with a 3x3 window before comparison. Zhang, et al., Automatic partitioning of full-motion video, Multimedia Systems Journal, vol. 1, pp. 10-28, 1993.
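
A minimal Python sketch of this pair-wise pixel comparison with 3x3 smoothing, assuming NumPy/SciPy and grayscale frames; both thresholds are illustrative values, not those of the cited paper.

import numpy as np
from scipy.ndimage import uniform_filter

def pixel_difference_cuts(frames, pixel_thresh=25, ratio_thresh=0.4):
    """Flag a cut when the fraction of pixels whose value changes by more than
    pixel_thresh exceeds ratio_thresh. frames: sequence of grayscale arrays."""
    cuts = []
    for i in range(1, len(frames)):
        # 3x3 smoothing before comparison, as suggested on the slide,
        # to reduce sensitivity to small camera motion and noise.
        a = uniform_filter(frames[i - 1].astype(np.float64), size=3)
        b = uniform_filter(frames[i].astype(np.float64), size=3)
        changed_ratio = (np.abs(a - b) > pixel_thresh).mean()
        if changed_ratio > ratio_thresh:
            cuts.append(i)   # cut between frame i-1 and frame i
    return cuts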

2. Histogram Comparison 74 Less sensitive to object motion, since it ignores the spatial changes within a frame. Let H_i(j) be the histogram value of the i-th frame for gray level j, with G grey levels in total; the difference between consecutive frames is SD_i = sum over j of |H_i(j) - H_{i+1}(j)|.

2. Histogram Comparison Example 75 Example video sequence The intensity histogram of the first three frames

2. Histogram Comparison 76 Color histogram difference: p_i(r,g,b) is the number of pixels of color (r,g,b) in frame I_i of N pixels, and each color component is discretized to 2^B different values; the difference is the sum over all (r,g,b) bins of |p_i(r,g,b) - p_{i+1}(r,g,b)|.
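
A minimal Python sketch of gray-level histogram-based cut detection over a frame sequence; the bin count and threshold are illustrative assumptions.

import numpy as np

def histogram_cuts(frames, bins=64, thresh=0.5):
    """Declare a cut when the bin-wise difference of normalized gray-level
    histograms between consecutive frames exceeds thresh."""
    hists = [np.histogram(f, bins=bins, range=(0, 256))[0] / f.size for f in frames]
    cuts = []
    for i in range(1, len(frames)):
        sd = np.abs(hists[i] - hists[i - 1]).sum()   # bin-wise L1 difference, in [0, 2]
        if sd > thresh:
            cuts.append(i)
    return cuts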

3. Likelihood Ratio 77 Compare corresponding regions (blocks) in two successive frames based on second-order statistical characteristics of their intensity values. With m_i the mean intensity and S_i the variance of a given region in frame i, the likelihood ratio is lambda = [ (S_i + S_{i+1})/2 + ((m_i - m_{i+1})/2)^2 ]^2 / (S_i * S_{i+1}). A camera break is declared whenever the total number of sample areas whose likelihood ratio exceeds the threshold is sufficiently large. This raises the tolerance to slow and small object motion from frame to frame.
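
A minimal Python sketch of the per-block likelihood ratio; the small epsilon guarding against zero variance is an added assumption.

import numpy as np

def likelihood_ratio(block1, block2):
    """Likelihood ratio between corresponding blocks of two consecutive frames,
    based on their intensity means and variances (values near 1 mean 'similar')."""
    m1, m2 = block1.mean(), block2.mean()
    s1, s2 = block1.var(), block2.var()
    num = ((s1 + s2) / 2.0 + ((m1 - m2) / 2.0) ** 2) ** 2
    return num / (s1 * s2 + 1e-12)   # epsilon avoids division by zero on flat blocks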

4. Edge Change Ratio 78 Zabih, et al., A feature-based algorithm for detecting and classifying scene breaks, Proc. of ACM Multimedia, pp. 189-200, 1995.

4. Edge Change Ratio 79

4. Edge Change Ratio 80 The edge change ratio between frames n-1 and n is ECR_n = max( X_n^in / sigma_n, X_{n-1}^out / sigma_{n-1} ), where sigma_n is the number of edge pixels in frame n, X_n^in the number of entering edge pixels, and X_{n-1}^out the number of exiting edge pixels.
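
A minimal Python sketch of the edge change ratio, assuming SciPy; a Sobel gradient threshold stands in for a proper edge detector (e.g., Canny), the threshold and dilation radius are illustrative assumptions, and any global motion compensation applied before the ratio is omitted here.

import numpy as np
from scipy.ndimage import sobel, binary_dilation

def edge_map(frame, thresh=60):
    """Binary edge map from gradient magnitude (simple stand-in for Canny)."""
    f = frame.astype(np.float64)
    gx, gy = sobel(f, axis=1), sobel(f, axis=0)
    return np.hypot(gx, gy) > thresh

def edge_change_ratio(prev_frame, cur_frame, radius=2):
    """ECR = max(entering edge fraction, exiting edge fraction)."""
    e_prev, e_cur = edge_map(prev_frame), edge_map(cur_frame)
    dil_prev = binary_dilation(e_prev, iterations=radius)
    dil_cur = binary_dilation(e_cur, iterations=radius)
    entering = np.logical_and(e_cur, ~dil_prev).sum() / max(e_cur.sum(), 1)
    exiting = np.logical_and(e_prev, ~dil_cur).sum() / max(e_prev.sum(), 1)
    return max(entering, exiting)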

5. Motion Vectors 81 Use the direction of motion prediction as a cue for shot change detection. Pei, et al., Scene-effect detection and insertion MPEG encoding scheme for video browsing and error concealment, IEEE Trans. on Multimedia, vol. 7, no. 4, pp. 606-614, 2005.

5. Motion Vectors 82 Use motion vector information to filter out false positives. Zhang, et al., Automatic partitioning of full-motion video, Multimedia Systems Journal, vol. 1, pp. 10-28, 1993.

6. Differences in the DCT Domain 83 Discrete Cosine Transform (DCT) coefficients: 1. Select a subset of blocks. 2. Select a subset of DCT coefficients of these blocks. 3. Concatenate the selected coefficients of the selected blocks into a vector. 4. Calculate the similarity of the two coefficient vectors. Arman, et al., Image processing on encoded video sequences, Multimedia Systems Journal, vol. 1, no. 5, pp. 211-219, 1994.
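
A minimal Python sketch of this DCT-coefficient comparison, assuming SciPy; taking the first 15 coefficients in raster order (rather than zig-zag order) and sampling every fourth block are simplifying assumptions.

import numpy as np
from scipy.fft import dctn

def dct_feature(frame, block=8, n_coeffs=15, step=4):
    """Concatenate low-order DCT coefficients from a subset of 8x8 blocks."""
    h, w = frame.shape
    feats = []
    for by in range(0, h - block + 1, block * step):     # take every 'step'-th block
        for bx in range(0, w - block + 1, block * step):
            coeffs = dctn(frame[by:by + block, bx:bx + block].astype(np.float64), norm='ortho')
            feats.append(coeffs.ravel()[:n_coeffs])       # first 15 coefficients (raster order)
    return np.concatenate(feats)

def dct_dissimilarity(f1, f2):
    """1 minus the normalized inner product of the two coefficient vectors."""
    return 1.0 - float(np.dot(f1, f2) / (np.linalg.norm(f1) * np.linalg.norm(f2) + 1e-12))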

Gradual Transition Detection 84 Cuts or abrupt change Gradual transition

1. Twin-Comparison Approach 85 Zhang, et al., Automatic partitioning of full-motion video Multimedia Systems Journal, vol. 1, pp. 10-28, 1993.

2. Edge Change Ratio 86 Lienhart, R., Comparison of automatic shot boundary detection algorithms Proc. of SPIE Storage and Retrieval for Image and Video Databases VII, vol. 3656, pp. 290-301, 1999.

87 2. Edge Change Ratio

3. Characterizing a Wipe Transition 88

Evaluation 89 Precision: the percentage of retrieved items that are desired items. Recall: the percentage of desired items that are retrieved. Precision = # Correctly retrieved items / # All retrieved items = # Correctly retrieved items / (# Correctly retrieved items + # Falsely retrieved items). Recall = # Correctly retrieved items / # All relevant items = # Correctly retrieved items / (# Correctly retrieved items + # Relevant items that are not retrieved).
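
A small helper that computes these two measures from the counts defined above; the example numbers are made up for illustration.

def precision_recall(tp, fp, fn):
    """Precision and recall from counts of true positives, false positives,
    and false negatives (missed items)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Example: 90 boundaries detected correctly, 10 false alarms, 20 missed.
print(precision_recall(90, 10, 20))   # (0.9, 0.818...)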

Evaluation: Other Terms 90 True positive (TP): # correctly retrieved items. False positive (FP): # falsely retrieved items. True negative (TN): # correctly missed items. False negative (FN), or miss: # relevant items that are not retrieved. Confusion matrix: actual positive is split into TP (predicted positive) and FN (predicted negative); actual negative is split into FP (predicted positive) and TN (predicted negative).

Evaluation 91 Relationship between the detected (retrieved) set and the relevant (ground truth) set: their overlap is TP, detected-but-not-relevant is FP, relevant-but-not-detected is FN, and everything else is TN.

Relationship between Precision & Recall 92 Precision-Recall (PR) curve

93 Relationship between True Positive and False Positive Receiver Operator Characteristic (ROC) curve

Using PR or ROC Curves? 94 ROC curves can present an overly optimistic view of an algorithm's performance if there is a large skew in the class distribution, i.e., the number of true negative examples greatly exceeds the number of positive examples. Thus a large change in the number of false positives can lead to only a small change in the false positive rate. Precision compares false positives to true positives and better captures the algorithm's performance. Davis, et al., The relationship between precision-recall and ROC curves, Proc. of International Conference on Machine Learning, pp. 233-240, 2006.

Comparison of Shot Boundary Detection Techniques 95 Methods: histograms, region histograms, running histograms, motion-compensated pixel differences, DCT coefficient differences. Evaluation data (video type: # frames / cuts / gradual transitions): TV: 133204 / 831 / 42; News: 81595 / 293 / 99; Movie: 142507 / 564 / 95; Commercial: 51733 / 755 / 254; Misc.: 10706 / 64 / 16; Total: 419745 / 2507 / 506.

Methods Compared 96 (1) Histogram: 64-bin gray-level histogram difference with a single threshold. (2) Region (block) histogram: 16 blocks, 64-bin gray-scale histograms, a difference threshold for each block, and a count threshold for the number of changed blocks. (3) Running histogram (twin-comparison method): a 64-bin gray-scale histogram for each frame with twin thresholds; motion vectors are computed and gradual changes are rejected if motion is excessive. (4) Motion-compensated pixel difference: 12 blocks per frame with a motion vector for each block; compute the average residual error and declare a cut if it exceeds a high threshold; use cumulative errors to detect gradual changes (similar to the above) and motion vectors to reject false gradual changes. (5) DCT difference: concatenate 15 coefficients at the same locations from different blocks to form a vector and compute 1 minus the inner product of the two vectors from consecutive frames.

PR Curve for TV program 97

PR Curve for News program 98

PR Curve for Movie Videos 99

PR Curve for Commercials 100

PR Curve for All Data 101

PR Curve for All Data Cut Only 102

Observations 103 The histogram-based method is consistent: it produced the first or second best precision and is simple and straightforward. The region algorithm seems to be the best where recall is not the highest priority. The running algorithm seems to be the best where recall is important. Motion vectors are helpful for reducing false positives. DCT is the worst, with a large number of false positives on black frames.

References 104 J.S. Boreczky, et al., "Comparison of video shot boundary detection techniques" Proc. of SPIE Conference on Storage and Retrieval for Image and Video Databases, vol. 2670, 1996. (must read) R. Lienhart, "Comparison of automatic shot boundary detection algorithms" Proc. of SPIE Storage and Retrieval for Image and Video Databases VII, vol. 3656, pp. 290-301, 1999. J. Yuan, et al., "A formal study of shot boundary detection" IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, no. 2, pp. 168-186, 2007. A. Hanjalic, "Shot-boundary detection: unraveled or resolved?" IEEE Trans. on Circuits and Systems for Video Technology, vol. 12, no. 2, pp. 90-105, 2002.