Lecture 5: Clustering and Segmentation Part 1


Lecture 5: Clustering and Segmentation Part 1. Professor Fei-Fei Li, Stanford Vision Lab.

What we will learn today: Segmentation and grouping (Gestalt principles); Segmentation as clustering (k-means, feature space); Probabilistic clustering (Problem Set 1, Q3): mixture of Gaussians, EM.


Image Segmentation. Goal: identify groups of pixels that go together. Slide credit: Steve Seitz, Kristen Grauman

The Goals of Segmentation. Separate the image into coherent objects. [Figure: an image and its human segmentation.] Slide credit: Svetlana Lazebnik

The Goals of Segmentation. Separate the image into coherent objects; group together similar-looking pixels for efficiency of further processing ("superpixels"). X. Ren and J. Malik, Learning a classification model for segmentation, ICCV 2003. Slide credit: Svetlana Lazebnik

Segmentation gives a compact representation for image data in terms of a set of components. Components share common visual properties, and those properties can be defined at different levels of abstraction.

General ideas. Tokens: whatever we need to group (pixels, points, surface elements, etc.). Bottom-up segmentation, the focus of this lecture (#5): tokens belong together because they are locally coherent. Top-down segmentation: tokens belong together because they lie on the same visual entity (object, scene, ...). These two are not mutually exclusive.

What is Segmentation? Clustering image elements that belong together. Partitioning: divide into regions/sequences with coherent internal properties. Grouping: identify sets of coherent tokens in the image. Slide credit: Christopher Rasmussen

What is Segmentation? Why do these tokens belong together?

Basic ideas of grouping in human vision: Gestalt properties and figure-ground discrimination.

Examples of Grouping in Vision. Grouping video frames into shots; determining image regions; object-level grouping; figure-ground. What things should be grouped? What cues indicate groups? Slide credit: Kristen Grauman

Similarity Slide credit: Kristen Grauman Lecture 5 -! 13

Symmetry Slide credit: Kristen Grauman Lecture 5 -! 14

Common Fate Image credit: Arthus- Bertrand (via F. Durand) Slide credit: Kristen Grauman Lecture 5 -! 15

Proximity Slide credit: Kristen Grauman Lecture 5 -! 16

Müller-Lyer Illusion. Gestalt principle: grouping is key to visual perception.

The Gestalt School. Grouping is key to visual perception; elements in a collection can have properties that result from relationships. "The whole is greater than the sum of its parts." Illusory/subjective contours; occlusion; familiar configuration. http://en.wikipedia.org/wiki/gestalt_psychology Slide credit: Svetlana Lazebnik

Gestalt Theory. Gestalt: "whole" or "group". The whole is greater than the sum of its parts; relationships among parts can yield new properties/features. Psychologists identified a series of factors that predispose a set of elements to be grouped (by the human visual system). "I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have '327'? No. I have sky, house, and trees." (Max Wertheimer, 1880-1943, Untersuchungen zur Lehre von der Gestalt, Psychologische Forschung, Vol. 4, pp. 301-350, 1923.) http://psy.ed.asu.edu/~classics/wertheimer/forms/forms.htm

Gestalt Factors. These factors make intuitive sense, but are very difficult to translate into algorithms. Image source: Forsyth & Ponce

Continuity through Occlusion Cues

Continuity through Occlusion Cues. Continuity, explanation by occlusion.

Continuity through Occlusion Cues. Image source: Forsyth & Ponce

Continuity through Occlusion Cues. Image source: Forsyth & Ponce

Figure-Ground Discrimination

The Ultimate Gestalt?

What we will learn today: Segmentation and grouping (Gestalt principles); Segmentation as clustering (k-means, feature space); Probabilistic clustering (mixture of Gaussians, EM); Model-free clustering (mean-shift).

Image Segmentation: Toy Example. An input image with three intensity groups: black pixels, gray pixels, and white pixels (groups 1, 2, 3 on the intensity axis). These intensities define the three groups. We could label every pixel in the image according to which of these primary intensities it is closest to, i.e., segment the image based on the intensity feature. What if the image isn't quite so simple? Slide credit: Kristen Grauman

[Figures: two input images with their pixel-count vs. intensity histograms.] Slide credit: Kristen Grauman

[Figure: input image with its pixel-count vs. intensity histogram.] Now how do we determine the three main intensities that define our groups? We need to cluster. Slide credit: Kristen Grauman

Goal: choose three centers as the representative intensities (e.g., near 0, 190, and 255 on the intensity axis), and label every pixel according to which of these centers it is nearest to. The best cluster centers are those that minimize the sum of squared distances (SSD) between all points and their nearest cluster center $c_i$: $\mathrm{SSD} = \sum_{\text{clusters } i}\ \sum_{p \in \text{cluster } i} \lVert p - c_i \rVert^2$. Slide credit: Kristen Grauman

Clustering. With this objective, it is a chicken-and-egg problem: if we knew the cluster centers, we could allocate points to groups by assigning each to its closest center; if we knew the group memberships, we could get the centers by computing the mean per group. Slide credit: Kristen Grauman

K-Means Clustering. Basic idea: randomly initialize the k cluster centers, and iterate between the two steps we just saw. 1. Randomly initialize the cluster centers c_1, ..., c_K. 2. Given the cluster centers, determine the points in each cluster: for each point p, find the closest c_i and put p into cluster i. 3. Given the points in each cluster, solve for c_i: set c_i to be the mean of the points in cluster i. 4. If the c_i have changed, repeat from Step 2. Properties: it will always converge to some solution, which can be a local minimum; it does not always find the global minimum of the objective function $\mathrm{SSD} = \sum_{\text{clusters } i}\ \sum_{p \in \text{cluster } i} \lVert p - c_i \rVert^2$. Slide credit: Steve Seitz
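The four steps above can be sketched in a few lines of NumPy. This is an illustrative toy implementation under my own naming and toy data, not code from the course:

```python
import numpy as np

def kmeans(points, k, n_iters=50, seed=0):
    """Plain k-means: alternate the assignment and mean-update steps."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick k distinct data points as initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iters):
        # Step 2: assign each point to its closest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each center to the mean of its assigned points.
        new_centers = np.array([
            points[labels == i].mean(axis=0) if np.any(labels == i) else centers[i]
            for i in range(k)
        ])
        # Step 4: stop once the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    ssd = ((points - centers[labels]) ** 2).sum()  # the SSD objective
    return centers, labels, ssd

# Two well-separated 1D "intensity" groups.
data = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
centers, labels, ssd = kmeans(data, k=2)
```

On well-separated data like this the algorithm recovers the two group means; on harder data the local-minimum caveat above applies.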

Segmentation as Clustering (results shown for K=2 and K=3):

    img_as_col = double(im(:));
    cluster_membs = kmeans(img_as_col, K);

    labelim = zeros(size(im));
    for i = 1:K
        inds = find(cluster_membs == i);
        meanval = mean(img_as_col(inds));
        labelim(inds) = meanval;
    end

Slide credit: Kristen Grauman

K-Means Clustering. Java demo: http://home.dei.polimi.it/matteucc/clustering/tutorial_html/appletkm.html

K-Means++. Can we prevent arbitrarily bad local minima? 1. Randomly choose the first center. 2. Pick a new center with probability proportional to its contribution to the total error. 3. Repeat until there are k centers. Expected error = O(log k) × optimal. (Arthur & Vassilvitskii, 2007.) Slide credit: Steve Seitz
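The seeding rule above can be sketched directly in NumPy; the function name and toy data below are illustrative choices of mine:

```python
import numpy as np

def kmeanspp_init(points, k, seed=0):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance to the nearest center chosen
    so far, i.e. its current contribution to the total error."""
    rng = np.random.default_rng(seed)
    # 1. Randomly choose the first center.
    centers = [points[rng.integers(len(points))]]
    while len(centers) < k:
        # Squared distance of every point to its nearest existing center.
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        # 2. Pick the next center with probability proportional to d2.
        probs = d2 / d2.sum()
        centers.append(points[rng.choice(len(points), p=probs)])
    # 3. Repeat until k centers have been chosen.
    return np.array(centers)

data = np.array([[0.0, 0.0], [0.1, 0.0], [9.0, 9.0], [9.1, 9.0]])
centers = kmeanspp_init(data, k=2)
```

Because far-away points get most of the probability mass, the two chosen seeds almost always land in different blobs, which is exactly what protects against the bad local minima described on the slide.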

Feature Space. Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on intensity similarity: the feature space is the intensity value (1D). Slide credit: Kristen Grauman

Feature Space. Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on color similarity: the feature space is the color value (3D). [Figure: sample pixels with RGB values such as R=255, G=200, B=250 and R=15, G=189, B=2 plotted in RGB space.] Slide credit: Kristen Grauman

Feature Space. Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on texture similarity: apply a filter bank of 24 filters F_1, ..., F_24; the feature space is the vector of filter bank responses (e.g., 24D). Slide credit: Kristen Grauman
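As a toy illustration of texture features, the sketch below builds a tiny 3-filter bank (the slide's 24-filter bank is not specified here, so these filters are my own stand-ins) and stacks the per-pixel responses into feature vectors:

```python
import numpy as np

def conv2_same(img, kern):
    """Naive 'same'-size 2D convolution with zero padding (fine for tiny filters)."""
    kh, kw = kern.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=float)
    flipped = kern[::-1, ::-1]  # convolution = correlation with the flipped kernel
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + kh, j:j + kw] * flipped).sum()
    return out

# A toy 3-filter bank: horizontal-edge, vertical-edge, and averaging filters.
bank = [
    np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]], float),  # horizontal edges
    np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], float),  # vertical edges
    np.ones((3, 3)) / 9.0,                                  # local average
]

img = np.zeros((8, 8))
img[:, 4:] = 1.0  # image containing a single vertical step edge
# Per-pixel texture feature = the vector of all filter responses at that pixel.
features = np.stack([conv2_same(img, f) for f in bank], axis=-1)
```

Clustering the rows of `features.reshape(-1, 3)` then groups pixels by local texture rather than raw intensity.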

Smoothing Out Cluster Assignments. Assigning a cluster label per pixel may yield outliers. [Figures: original image, and the image labeled by each cluster center's intensity.] How can we ensure the labels are spatially smooth? Slide credit: Kristen Grauman

Segmentation as Clustering. Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on intensity+position similarity: the feature space is (intensity, x, y), a way to encode both similarity and proximity. Slide credit: Kristen Grauman
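A minimal sketch of building (intensity, x, y) feature vectors; the weighting factor `lam`, which trades spatial proximity off against intensity similarity, is my own illustrative choice, not from the slides:

```python
import numpy as np

h, w = 4, 6
img = np.linspace(0.0, 1.0, h * w).reshape(h, w)  # toy grayscale image

ys, xs = np.mgrid[0:h, 0:w]                        # pixel coordinates
lam = 0.5  # weight on position relative to intensity
feats = np.stack(
    [img.ravel(), lam * xs.ravel() / w, lam * ys.ravel() / h],
    axis=1,
)  # one (intensity, x, y) feature vector per pixel, shape (h*w, 3)
```

Feeding `feats` to any k-means routine clusters pixels that are both similar in intensity and near each other; a larger `lam` weights proximity more heavily.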

K-Means Clustering Results. K-means clustering based on intensity or color is essentially vector quantization of the image attributes; clusters don't have to be spatially coherent. [Figures: image, intensity-based clusters, color-based clusters.] Image source: Forsyth & Ponce

K-Means Clustering Results. K-means clustering based on intensity or color is essentially vector quantization of the image attributes; clusters don't have to be spatially coherent. Clustering based on (r,g,b,x,y) values enforces more spatial coherence. Image source: Forsyth & Ponce

Summary: K-Means. Pros: simple, fast to compute; converges to a local minimum of the within-cluster squared error. Cons/issues: setting k; sensitive to initial centers; sensitive to outliers; detects spherical clusters only; assumes means can be computed. Slide credit: Kristen Grauman

What we will learn today: Segmentation and grouping (Gestalt principles); Segmentation as clustering (k-means, feature space); Probabilistic clustering (Problem Set 1, Q3): mixture of Gaussians, EM.

Probabilistic Clustering. Basic questions: What's the probability that a point x is in cluster m? What's the shape of each cluster? K-means doesn't answer these questions. Basic idea: instead of treating the data as a bunch of points, assume that they are all generated by sampling a continuous function. This function is called a generative model, and it is defined by a vector of parameters θ. Slide credit: Steve Seitz

Mixture of Gaussians. One generative model is a mixture of Gaussians (MoG): K Gaussian blobs, where blob b is defined by a mean μ_b and covariance matrix V_b (dimension d), and is selected with probability α_b. The likelihood of observing x is a weighted mixture of Gaussians: $p(x \mid \theta) = \sum_{b=1}^{K} \alpha_b\, \mathcal{N}(x;\, \mu_b, V_b)$. Slide credit: Steve Seitz
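Under the standard MoG formulation above (the mixture equation is reconstructed from the slide text, and the variable names here are mine), the likelihood of a point can be evaluated directly:

```python
import numpy as np

def gaussian_pdf(x, mu, V):
    """Multivariate normal density N(x; mu, V)."""
    d = len(mu)
    diff = x - mu
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(V))
    return np.exp(-0.5 * diff @ np.linalg.inv(V) @ diff) / norm

def mog_likelihood(x, alphas, mus, Vs):
    """p(x | theta) = sum_b alpha_b * N(x; mu_b, V_b)."""
    return sum(a * gaussian_pdf(x, m, V) for a, m, V in zip(alphas, mus, Vs))

# Two 1D components, equally weighted.
alphas = [0.5, 0.5]
mus = [np.array([0.0]), np.array([5.0])]
Vs = [np.eye(1), np.eye(1)]
p0 = mog_likelihood(np.array([0.0]), alphas, mus, Vs)
```

At x = 0 the first component dominates, so `p0` is very close to half the standard-normal density at its mode.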

Expectation Maximization (EM). Goal: find the blob parameters θ that maximize the likelihood function. Approach: 1. E-step: given the current guess of the blobs, compute the ownership of each point. 2. M-step: given the ownership probabilities, update the blobs to maximize the likelihood function. 3. Repeat until convergence. Slide credit: Steve Seitz

EM Details. E-step: compute the probability that point x is in blob b, given the current guess of θ. M-step: compute the probability that blob b is selected (over the N data points), the mean of blob b, and the covariance of blob b. Slide credit: Steve Seitz
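A compact NumPy sketch of these E- and M-step updates for a mixture of Gaussians. It assumes the standard MoG updates (the slide's formulas are not reproduced in this transcription); the explicit initial means and the tiny ridge added to each covariance are my own stabilizing choices:

```python
import numpy as np

def em_gmm(X, init_mus, n_iters=50):
    """EM for a mixture of Gaussians.
    E-step: responsibilities r[n, b] proportional to alpha_b * N(x_n; mu_b, V_b).
    M-step: alpha_b = mean responsibility; mu_b = responsibility-weighted mean;
    V_b = responsibility-weighted covariance (plus a tiny ridge for stability)."""
    n, d = X.shape
    mus = np.array(init_mus, dtype=float)
    k = len(mus)
    Vs = np.stack([np.eye(d)] * k)
    alphas = np.full(k, 1.0 / k)
    for _ in range(n_iters):
        # E-step: soft ownership of every point by every blob.
        r = np.zeros((n, k))
        for b in range(k):
            diff = X - mus[b]
            maha = np.einsum('nd,de,ne->n', diff, np.linalg.inv(Vs[b]), diff)
            norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Vs[b]))
            r[:, b] = alphas[b] * np.exp(-0.5 * maha) / norm
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights, means, and covariances.
        Nb = r.sum(axis=0)
        alphas = Nb / n
        mus = (r.T @ X) / Nb[:, None]
        for b in range(k):
            diff = X - mus[b]
            Vs[b] = (r[:, b, None] * diff).T @ diff / Nb[b] + 1e-6 * np.eye(d)
    return alphas, mus, Vs

# Two tight, well-separated 1D blobs; one initial mean placed near each.
X = np.concatenate(
    [np.linspace(-0.2, 0.2, 10), np.linspace(4.8, 5.2, 10)]
).reshape(-1, 1)
alphas, mus, Vs = em_gmm(X, init_mus=[[1.0], [4.0]])
```

With this initialization the fitted means land on the two blobs and the mixing weights settle at about 0.5 each, illustrating the soft-assignment behavior discussed in the summary slide below.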

Applications of EM. Turns out this is useful for all sorts of problems: any clustering problem; any model estimation problem; missing data problems; finding outliers; segmentation problems (segmentation based on color, segmentation based on motion, foreground/background separation)... EM demo: http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html Slide credit: Steve Seitz

Segmentation with EM. [Figures: original image and EM segmentation results for k=2, 3, 4, 5.] Image source: Serge Belongie

Summary: Mixtures of Gaussians, EM. Pros: probabilistic interpretation; soft assignments between data points and clusters; generative model, can predict novel data points; relatively compact storage. Cons: local minima; initialization (often a good idea to start with some k-means iterations); need to know the number of components (solutions: model selection (AIC, BIC), Dirichlet process mixture); need to choose the generative model; numerical problems are often a nuisance.

What we have learned today: Segmentation and grouping (Gestalt principles); Segmentation as clustering (k-means, feature space); Probabilistic clustering (Problem Set 1, Q3): mixture of Gaussians, EM.