Lecture 5: Clustering and Segmentation, Part 1. Professor Fei-Fei Li, Stanford Vision Lab
What we will learn today
Segmentation and grouping
  Gestalt principles
Segmentation as clustering
  K-means
  Feature space
Probabilistic clustering (Problem Set 1, Q3)
  Mixture of Gaussians, EM
Image Segmentation
Goal: identify groups of pixels that go together.
Slide credit: Steve Seitz, Kristen Grauman
The Goals of Segmentation
Separate image into coherent "objects".
[Figure: an image and human segmentations of it.]
Slide credit: Svetlana Lazebnik
The Goals of Segmentation
Group together similar-looking pixels for efficiency of further processing ("superpixels").
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
Slide credit: Svetlana Lazebnik
Segmentation
A compact representation of image data in terms of a set of components.
Components share common visual properties.
Properties can be defined at different levels of abstraction.
General ideas
Tokens: whatever we need to group (pixels, points, surface elements, etc.)
Bottom-up segmentation (this lecture, #5): tokens belong together because they are locally coherent.
Top-down segmentation: tokens belong together because they lie on the same visual entity (object, scene...).
These two are not mutually exclusive.
What is Segmentation?
Clustering image elements that "belong together".
Partitioning: divide into regions/sequences with coherent internal properties.
Grouping: identify sets of coherent tokens in the image.
Slide credit: Christopher Rasmussen
What is Segmentation?
Why do these tokens belong together?
Basic ideas of grouping in human vision
Gestalt properties
Figure-ground discrimination
Examples of Grouping in Vision
Grouping video frames into shots
Determining image regions
Object-level grouping
Figure-ground
What things should be grouped? What cues indicate groups?
Slide credit: Kristen Grauman
Similarity. Slide credit: Kristen Grauman
Symmetry. Slide credit: Kristen Grauman
Common Fate. Image credit: Arthus-Bertrand (via F. Durand). Slide credit: Kristen Grauman
Proximity. Slide credit: Kristen Grauman
Müller-Lyer Illusion
Gestalt principle: grouping is key to visual perception.
The Gestalt School
Grouping is key to visual perception.
Elements in a collection can have properties that result from relationships: "The whole is greater than the sum of its parts."
Examples: illusory/subjective contours, occlusion, familiar configuration.
http://en.wikipedia.org/wiki/Gestalt_psychology
Slide credit: Svetlana Lazebnik
Gestalt Theory
Gestalt: "whole" or "group". The whole is greater than the sum of its parts; relationships among parts can yield new properties/features.
Psychologists identified a series of factors that predispose a set of elements to be grouped (by the human visual system).
"I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of colour. Do I have '327'? No. I have sky, house, and trees."
Max Wertheimer (1880-1943), Untersuchungen zur Lehre von der Gestalt, Psychologische Forschung, Vol. 4, pp. 301-350, 1923.
http://psy.ed.asu.edu/~classics/wertheimer/forms/forms.htm
Gestalt Factors
[Figure: examples of Gestalt grouping factors.]
These factors make intuitive sense, but are very difficult to translate into algorithms.
Image source: Forsyth & Ponce
Continuity through Occlusion Cues
Continuity through Occlusion Cues
Continuity, explanation by occlusion.
Continuity through Occlusion Cues. [Figures.] Image source: Forsyth & Ponce
Figure-Ground Discrimination
The Ultimate Gestalt?
What we will learn today
Segmentation and grouping
  Gestalt principles
Segmentation as clustering
  K-means
  Feature space
Probabilistic clustering
  Mixture of Gaussians, EM
Model-free clustering
  Mean-shift
Image Segmentation: Toy Example
[Figure: an input image of black, gray, and white pixels, and its intensity histogram with three peaks labeled 1, 2, 3.]
These intensities define the three groups. We could label every pixel in the image according to which of these primary intensities it is, i.e., segment the image based on the intensity feature.
What if the image isn't quite so simple?
Slide credit: Kristen Grauman
[Figure: two input images and their intensity histograms (pixel count vs. intensity).]
Slide credit: Kristen Grauman
[Figure: input image and its intensity histogram (pixel count vs. intensity).]
Now how do we determine the three main intensities that define our groups? We need to cluster.
Slide credit: Kristen Grauman
[Figure: intensity axis from 0 to 255 with three cluster centers, e.g., near 0, 190, and 255, labeled 1, 2, 3.]
Goal: choose three "centers" as the representative intensities, and label every pixel according to which of these centers it is nearest to.
The best cluster centers are those that minimize the sum of squared distances (SSD) between all points and their nearest cluster center $c_i$:
$$\text{SSD} = \sum_{\text{clusters } i} \; \sum_{p \,\in\, \text{cluster } i} \|p - c_i\|^2$$
Slide credit: Kristen Grauman
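To make the objective concrete, here is a minimal sketch of evaluating the SSD for a given clustering; the variable names (points, centers, labels) are assumptions for illustration.

    % points: N x d data, centers: K x d, labels: N x 1 cluster indices
    ssd = 0;
    for i = 1:size(centers, 1)
        diffs = points(labels == i, :) - centers(i, :);  % implicit expansion (R2016b+)
        ssd = ssd + sum(diffs(:).^2);                    % add this cluster's squared error
    end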
Clustering
With this objective, it is a "chicken and egg" problem:
If we knew the cluster centers, we could allocate points to groups by assigning each to its closest center.
If we knew the group memberships, we could get the centers by computing the mean per group.
Slide credit: Kristen Grauman
K-Means Clustering
Basic idea: randomly initialize the k cluster centers, and iterate between the two steps we just saw.
1. Randomly initialize the cluster centers, c_1, ..., c_K
2. Given the cluster centers, determine the points in each cluster: for each point p, find the closest c_i and put p into cluster i
3. Given the points in each cluster, solve for c_i: set c_i to be the mean of the points in cluster i
4. If any c_i has changed, repeat from Step 2
Properties:
Will always converge to some solution.
Can be a local minimum: it does not always find the global minimum of the objective function
$$\text{SSD} = \sum_{\text{clusters } i} \; \sum_{p \,\in\, \text{cluster } i} \|p - c_i\|^2$$
Slide credit: Steve Seitz
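A from-scratch sketch of this loop (not MATLAB's built-in kmeans), assuming data is an N x d matrix and pdist2 from the Statistics Toolbox is available:

    function [labels, centers] = kmeans_sketch(data, K)
        N = size(data, 1);
        centers = data(randperm(N, K), :);           % Step 1: random initialization
        labels = zeros(N, 1);
        while true
            dists = pdist2(data, centers);           % Step 2: N x K distances to centers
            [~, new_labels] = min(dists, [], 2);     % assign each point to nearest center
            if isequal(new_labels, labels)           % Step 4: stop when nothing changed
                break;
            end
            labels = new_labels;
            for i = 1:K                              % Step 3: recompute centers as means
                if any(labels == i)
                    centers(i, :) = mean(data(labels == i, :), 1);
                end
            end
        end
    end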
Segmentation as Clustering
[Figure: grayscale input image and its k-means segmentations with K=2 and K=3.]

    K = 3;                                   % e.g., as in the K=3 example
    img_as_col = double(im(:));              % flatten image into an N x 1 column
    cluster_membs = kmeans(img_as_col, K);   % cluster pixel intensities

    label_im = zeros(size(im));
    for i = 1:K
        inds = find(cluster_membs == i);
        mean_val = mean(img_as_col(inds));   % mean intensity of cluster i
        label_im(inds) = mean_val;           % recolor pixels by their cluster mean
    end

Slide credit: Kristen Grauman
K-Means Clustering
Java demo: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html
K-Means++
Can we prevent arbitrarily bad local minima?
1. Randomly choose the first center.
2. Pick a new center with probability proportional to its contribution to the total error (its squared distance to the nearest center chosen so far).
3. Repeat until k centers have been chosen.
Expected error = O(log k) × optimal. [Arthur & Vassilvitskii 2007]
Slide credit: Steve Seitz
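A sketch of the seeding step under the same assumptions (data: N x d, pdist2 available); each new center is sampled with probability proportional to its squared distance to the nearest center chosen so far:

    function centers = kmeanspp_seed(data, K)
        N = size(data, 1);
        centers = data(randi(N), :);                   % first center: uniform at random
        for k = 2:K
            d2 = min(pdist2(data, centers).^2, [], 2); % squared distance to nearest center
            probs = d2 / sum(d2);                      % D^2 weighting
            idx = find(rand <= cumsum(probs), 1);      % sample a point index from probs
            centers(k, :) = data(idx, :);
        end
    end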
Feature Space
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on intensity similarity.
Feature space: intensity value (1-D).
Slide credit: Kristen Grauman
Feature Space
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on color similarity.
[Figure: pixels plotted as points in R-G-B space, e.g., (R=255, G=200, B=250), (R=245, G=220, B=248), (R=15, G=189, B=2), (R=3, G=12, B=2).]
Feature space: color value (3-D).
Slide credit: Kristen Grauman
Feature Space
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on texture similarity.
[Figure: a filter bank of 24 filters, F_1 ... F_24, applied to the image.]
Feature space: filter bank responses (e.g., 24-D).
Slide credit: Kristen Grauman
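As a hedged illustration of the feature-space idea, here is how color-based clustering might look; im is an assumed RGB image and K an arbitrary choice. Filter-bank responses would be handled the same way, just with a 24-D feature per pixel instead of 3-D:

    K = 5;                                       % number of color clusters (arbitrary)
    feats = double(reshape(im, [], 3));          % N x 3 matrix of (R,G,B) features
    labels = kmeans(feats, K);                   % built-in kmeans (Statistics Toolbox)
    seg = reshape(labels, size(im, 1), size(im, 2));  % back to image layout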
Smoothing Out Cluster Assignments
Assigning a cluster label per pixel may yield outliers:
[Figure: original image vs. image labeled by each cluster center's intensity; one ambiguous pixel marked "?" among labels 1, 2, 3.]
How can we ensure the labels are spatially smooth?
Slide credit: Kristen Grauman
Segmentation as Clustering
Depending on what we choose as the feature space, we can group pixels in different ways.
Grouping pixels based on intensity + position similarity.
[Figure: pixels plotted in (intensity, X, Y) space.]
This is a way to encode both similarity and proximity (see the sketch below).
Slide credit: Kristen Grauman
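A sketch of building such a feature space; im is an assumed grayscale image, and lambda is an assumed weight (not from the slides) trading off proximity against intensity similarity:

    [h, w] = size(im);
    [X, Y] = meshgrid(1:w, 1:h);                 % pixel coordinates, same layout as im
    lambda = 0.5;                                % spatial weight (tune per image)
    feats = [double(im(:)), lambda * X(:), lambda * Y(:)];  % N x 3: (intensity, x, y)
    seg = reshape(kmeans(feats, 3), h, w);
    % Appending (x, y) to (r, g, b) the same way gives the 5-D (r,g,b,x,y)
    % clustering shown two slides ahead.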
K-Means Clustering Results
K-means clustering based on intensity or color is essentially vector quantization of the image attributes.
Clusters don't have to be spatially coherent.
[Figure: image, intensity-based clusters, color-based clusters.]
Image source: Forsyth & Ponce
K-Means Clustering Results
K-means clustering based on intensity or color is essentially vector quantization of the image attributes; clusters don't have to be spatially coherent.
Clustering based on (r,g,b,x,y) values enforces more spatial coherence.
Image source: Forsyth & Ponce
Summary: K-Means
Pros:
Simple, fast to compute
Converges to a local minimum of the within-cluster squared error
Cons/issues:
Setting k?
Sensitive to initial centers
Sensitive to outliers
Detects spherical clusters only
Assumes means can be computed
Slide credit: Kristen Grauman
What we will learn today
Segmentation and grouping
  Gestalt principles
Segmentation as clustering
  K-means
  Feature space
Probabilistic clustering (Problem Set 1, Q3)
  Mixture of Gaussians, EM
Probabilistic Clustering
Basic questions:
What's the probability that a point x is in cluster m?
What's the shape of each cluster?
K-means doesn't answer these questions.
Basic idea: instead of treating the data as a bunch of points, assume that they are all generated by sampling from a continuous function. This function is called a generative model, defined by a vector of parameters θ.
Slide credit: Steve Seitz
Mixture of Gaussians
One generative model is a mixture of Gaussians (MoG): K Gaussian "blobs" with means $\mu_b$ and covariance matrices $V_b$, in dimension d.
Blob b is defined by:
$$N(x; \mu_b, V_b) = \frac{1}{\sqrt{(2\pi)^d |V_b|}} \exp\!\left(-\frac{1}{2}(x-\mu_b)^\top V_b^{-1}(x-\mu_b)\right)$$
Blob b is selected with probability $\alpha_b$ (its mixing weight).
The likelihood of observing x is a weighted mixture of Gaussians:
$$p(x \mid \theta) = \sum_{b=1}^{K} \alpha_b \, N(x; \mu_b, V_b)$$
Slide credit: Steve Seitz
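A minimal sketch of evaluating this mixture likelihood for one point; the variable names (alphas, mus, Vs) are assumptions, not from the slides:

    function p = mog_likelihood(x, alphas, mus, Vs)
        % x: d x 1 point; alphas: K x 1 weights; mus: K x d means; Vs: d x d x K covariances
        d = numel(x);
        p = 0;
        for b = 1:numel(alphas)
            diff = x(:) - mus(b, :)';
            gauss = exp(-0.5 * (diff' / Vs(:,:,b)) * diff) / ...
                    sqrt((2*pi)^d * det(Vs(:,:,b)));   % N(x; mu_b, V_b)
            p = p + alphas(b) * gauss;                 % weight by P(blob b selected)
        end
    end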
Expectation Maximization (EM)
Goal: find the blob parameters θ that maximize the likelihood function
$$P(\text{data} \mid \theta) = \prod_n p(x_n \mid \theta)$$
Approach:
1. E-step: given the current guess of the blobs, compute the (soft) ownership of each point
2. M-step: given the ownership probabilities, update the blobs to maximize the likelihood function
3. Repeat until convergence
Slide credit: Steve Seitz
EM Details
E-step: compute the probability that point $x_n$ is in blob b, given the current guess of θ:
$$r_{nb} = \frac{\alpha_b \, N(x_n; \mu_b, V_b)}{\sum_j \alpha_j \, N(x_n; \mu_j, V_j)}$$
M-step: given the ownerships (N data points), re-estimate:
the probability that blob b is selected, $\alpha_b = \frac{1}{N} \sum_n r_{nb}$
the mean of blob b, $\mu_b = \frac{\sum_n r_{nb} \, x_n}{\sum_n r_{nb}}$
the covariance of blob b, $V_b = \frac{\sum_n r_{nb} (x_n - \mu_b)(x_n - \mu_b)^\top}{\sum_n r_{nb}}$
Slide credit: Steve Seitz
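A compact sketch of one EM iteration for a 1-D mixture of Gaussians, matching the updates above; the variable names are assumptions (x: N x 1 data; alphas, mus, vars: K x 1 current guesses), and normpdf is from the Statistics Toolbox:

    K = numel(mus);
    r = zeros(numel(x), K);
    for b = 1:K                                  % E-step: unnormalized ownerships
        r(:, b) = alphas(b) * normpdf(x, mus(b), sqrt(vars(b)));
    end
    r = r ./ sum(r, 2);                          % normalize so each row sums to 1
    Nb = sum(r, 1)';                             % M-step: effective counts per blob
    alphas = Nb / numel(x);                      % updated mixing weights
    mus = (r' * x) ./ Nb;                        % updated means
    vars = sum(r .* (x - mus').^2, 1)' ./ Nb;    % updated variances (implicit expansion)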
Applications of EM
It turns out this is useful for all sorts of problems:
Any clustering problem
Any model estimation problem
Missing data problems
Finding outliers
Segmentation problems: segmentation based on color, segmentation based on motion, foreground/background separation, ...
EM demo: http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html
Slide credit: Steve Seitz
Segmentation with EM
[Figure: original image and EM segmentation results for k = 2, 3, 4, 5.]
Image source: Serge Belongie
Summary: Mixtures of Gaussians, EM
Pros:
Probabilistic interpretation
Soft assignments between data points and clusters
Generative model: can predict novel data points
Relatively compact storage
Cons:
Local minima
Initialization (often a good idea to start with some k-means iterations)
Need to know the number of components (solutions: model selection (AIC, BIC), Dirichlet process mixtures)
Need to choose a generative model
Numerical problems are often a nuisance
What we have learned today
Segmentation and grouping
  Gestalt principles
Segmentation as clustering
  K-means
  Feature space
Probabilistic clustering (Problem Set 1, Q3)
  Mixture of Gaussians, EM