CS 1674: Intro to Computer Vision. Face Detection. Prof. Adriana Kovashka University of Pittsburgh November 7, 2016

Today: window-based generic object detection (basic pipeline, boosting classifiers, face detection as case study). Kristen Grauman

Generic category detection: basic framework. Build/train object model: choose a representation; learn or fit parameters of model / classifier. Generate candidates in new image. Score the candidates. Kristen Grauman

Generic category detection: representation choice. Window-based; part-based. Kristen Grauman

Window-based models: building an object model. Consider edges, contours, and (oriented) intensity gradients. Summarize local distribution of gradients with a histogram. Locally orderless: offers invariance to small shifts and rotations. Adapted from Kristen Grauman

Window-based models: building an object model. Given the representation, train a binary classifier. Car/non-car classifier outputs: "Yes, a car." or "No, not a car." Kristen Grauman

Window-based models: generating and scoring candidates. Car/non-car classifier. Kristen Grauman

Window-based object detection: recap. Training: 1. Obtain training data 2. Define features 3. Define classifier. Given new image: 1. Slide window 2. Score by classifier. (Diagram: training examples, feature extraction, car/non-car classifier.) Kristen Grauman
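The "slide window, score by classifier" step in the recap can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the names `sliding_windows`, `detect`, and `mean_brightness` are hypothetical, the image is a plain list of rows, only one window size is scanned, and a mean-brightness score stands in for a trained car/non-car classifier.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[float]]
Box = Tuple[int, int, int, int]  # (row, col, height, width)


def sliding_windows(img_h: int, img_w: int, win: int, stride: int) -> Iterator[Box]:
    """Yield every win x win box that fits inside the image, in raster order."""
    for r in range(0, img_h - win + 1, stride):
        for c in range(0, img_w - win + 1, stride):
            yield (r, c, win, win)


def detect(image: Image, score_fn: Callable[[Image], float],
           win: int, stride: int, threshold: float) -> List[Tuple[Box, float]]:
    """Score each window and keep those whose score reaches the threshold."""
    hits = []
    for (r, c, h, w) in sliding_windows(len(image), len(image[0]), win, stride):
        patch = [row[c:c + w] for row in image[r:r + h]]
        score = score_fn(patch)
        if score >= threshold:
            hits.append(((r, c, h, w), score))
    return hits


def mean_brightness(patch: Image) -> float:
    """Stand-in for a trained classifier (assumption for illustration only)."""
    return sum(map(sum, patch)) / (len(patch) * len(patch[0]))


# Toy 6x6 image with a bright 2x2 blob at rows 2-3, columns 3-4.
toy = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (3, 4):
        toy[r][c] = 1.0

detections = detect(toy, mean_brightness, win=2, stride=1, threshold=0.99)
print(detections)
```

A real detector would repeat the scan over several window scales (or over a resized image pyramid) and merge overlapping detections.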

Face detection and recognition: detection, then recognition ("Sally"). Lana Lazebnik

Challenges of face detection. Sliding window detector must evaluate tens of thousands of location/scale combinations. Faces are rare: 0 to 10 per image. A megapixel image has ~10^6 pixels and a comparable number of candidate face locations. For computational efficiency, we should try to spend as little time as possible on the non-face windows. To avoid having a false positive in every image, our false positive rate has to be less than 10^-6. Lana Lazebnik
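The false-positive requirement follows from simple arithmetic: the expected number of false detections per image is the number of windows times the per-window false positive rate. A small sketch (the function name is hypothetical, and the 10^-3 rate is an assumed comparison point, not a figure from the slides):

```python
def expected_false_positives(num_windows: int, fp_rate_per_window: float) -> float:
    """Expected false detections per image, treating each window as one trial."""
    return num_windows * fp_rate_per_window


windows_per_image = 10**6  # roughly one candidate location per pixel of a megapixel image

# At a per-window rate of 10^-6 we expect about one false positive per image,
# which is why the rate has to be pushed below that.
at_target = expected_false_positives(windows_per_image, 1e-6)

# A classifier that is wrong on only 1 in 1000 windows would still produce
# about a thousand false detections per image.
at_one_in_thousand = expected_false_positives(windows_per_image, 1e-3)

print(at_target, at_one_in_thousand)
```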

Viola-Jones face detector

Discriminative classifier construction. Nearest neighbor (10^6 examples): Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005... Neural networks: LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998. Support Vector Machines: Guyon, Vapnik; Heisele, Serre, Poggio 2001. Boosting: Viola, Jones 2001; Torralba et al. 2004; Opelt et al. 2006. Conditional Random Fields: McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003. Slide adapted from Antonio Torralba

Boosting intuition Weak Classifier 1 Paul Viola

Boosting illustration Weights Increased Paul Viola

Boosting illustration Weak Classifier 2 Paul Viola

Boosting illustration Weights Increased Paul Viola

Boosting illustration Weak Classifier 3 Paul Viola

Boosting illustration Final classifier is a combination of weak classifiers Paul Viola

Boosting: training. Initially, weight each training example equally. In each boosting round: find the weak learner that achieves the lowest weighted training error; raise weights of training examples misclassified by current weak learner. Compute final classifier as linear combination of all weak learners (the weight of each learner increases with its accuracy). Exact formulas for re-weighting and combining weak learners depend on the particular boosting scheme (e.g., AdaBoost). Lana Lazebnik, Kristen Grauman

Main idea: Viola-Jones face detector. Represent local texture with efficiently computable rectangular features within window of interest. Select discriminative features to be weak classifiers. Use boosted combination of them as final classifier. Form a cascade of such classifiers, rejecting clear negatives quickly. Kristen Grauman

Viola-Jones detector: features. Rectangular filters: feature output is difference between adjacent regions. Value = sum(pixels in white area) - sum(pixels in black area). Efficiently computable with integral image: any sum can be computed in constant time. Integral image: value at (x,y) is sum of pixels above and to the left of (x,y). Adapted from Kristen Grauman and Lana Lazebnik

Fast computation with integral images. The integral image computes a value at each pixel (x,y) that is the sum of the pixel values above and to the left of (x,y), inclusive. This can quickly be computed in one pass through the image. Lana Lazebnik

Computing sum within a rectangle. Let A, B, C, D be the values of the integral image at the corners of a rectangle (A at the bottom-right, D at the top-left, B and C at the other two corners). Then the sum of original image values within the rectangle can be computed as: sum = A - B - C + D. Only 3 additions are required for any size of rectangle! Lana Lazebnik
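The corner formula, and the white-minus-black rectangle feature built on it, can be sketched as below. This is an illustrative sketch under stated assumptions: the function names are hypothetical, images are plain lists of rows, and the integral image is padded with an extra zero row and column (a common convenience, not something the slides specify) so that the formula needs no boundary checks.

```python
from typing import List

Image = List[List[int]]


def padded_integral_image(img: Image) -> Image:
    """ii[y][x] = sum of img over rows < y and columns < x (one pass over the image)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def rect_sum(ii: Image, top: int, left: int, height: int, width: int) -> int:
    """Sum of the original pixels inside the rectangle, from four lookups."""
    bottom, right = top + height, left + width
    A = ii[bottom][right]  # bottom-right corner
    B = ii[top][right]     # top-right corner
    C = ii[bottom][left]   # bottom-left corner
    D = ii[top][left]      # top-left corner
    return A - B - C + D


def two_rect_feature(ii: Image, top: int, left: int, height: int, half_width: int) -> int:
    """White (left half) minus black (right half) for a side-by-side rectangle pair."""
    white = rect_sum(ii, top, left, height, half_width)
    black = rect_sum(ii, top, left + half_width, height, half_width)
    return white - black


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = padded_integral_image(img)

print(rect_sum(ii, 1, 1, 2, 2))          # 5 + 6 + 8 + 9 = 28
print(two_rect_feature(ii, 0, 0, 3, 1))  # (1 + 4 + 7) - (2 + 5 + 8) = -3
```

The cost of `rect_sum` is the same four lookups whether the rectangle covers four pixels or the whole image, which is what makes evaluating many features per window affordable.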

Example: source image and resulting integral image. Lana Lazebnik

Viola-Jones detector: features. Which subset of these features should we use to determine if a window has a face? Considering all possible filter parameters (position, scale, and type): 180,000+ possible features associated with each 24 x 24 window. Use AdaBoost both to select the informative features and to form the classifier. Kristen Grauman
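To get a feel for why the feature pool is so large, the following sketch enumerates rectangle features in a 24 x 24 window. The counting convention is an assumption: five base shapes (two-rectangle horizontal and vertical, three-rectangle horizontal and vertical, and one four-rectangle), each scaled by integer multiples and placed at every position where it fits. Under this convention the total is 162,336, somewhat below the 180,000+ quoted on the slide; the exact figure depends on which feature types and scalings are included.

```python
from typing import Iterable, Tuple


def count_features(window: int = 24,
                   base_shapes: Iterable[Tuple[int, int]] = ((1, 2), (2, 1),
                                                             (1, 3), (3, 1),
                                                             (2, 2))) -> int:
    """Count placements of every integer-scaled copy of each (height, width) base shape."""
    total = 0
    for base_h, base_w in base_shapes:
        for h in range(base_h, window + 1, base_h):      # scaled heights
            for w in range(base_w, window + 1, base_w):  # scaled widths
                # Number of top-left positions where an h x w feature fits.
                total += (window - h + 1) * (window - w + 1)
    return total


print(count_features())  # 162336 under the convention described above
```

Either way, the pool is orders of magnitude larger than the 576 pixels in the window, so selecting a small informative subset is essential.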

Viola-Jones detector: AdaBoost. Want to select the single rectangle feature and threshold that best separates positive (faces) and negative (non-faces) training examples, in terms of weighted error. Resulting weak classifier: a threshold on the output of that single rectangle feature. (Figure: outputs of a possible rectangle feature on faces and non-faces.) For next round, reweight the examples according to errors, choose another filter/threshold combo. Kristen Grauman
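Choosing the threshold for one rectangle feature by weighted error can be sketched as a brute-force search. The names `stump_predict` and `best_stump` are hypothetical, labels are assumed to be +1 (face) and -1 (non-face), and the polarity parameter (which side of the threshold counts as "face") is an assumption of this sketch. Running the same search over every feature and keeping the overall minimum gives the weak classifier for the round.

```python
from typing import List, Tuple


def stump_predict(value: float, threshold: float, polarity: int) -> int:
    """Predict `polarity` above the threshold and `-polarity` at or below it."""
    return polarity if value > threshold else -polarity


def best_stump(values: List[float], labels: List[int],
               weights: List[float]) -> Tuple[float, float, int]:
    """Return (weighted error, threshold, polarity) minimizing weighted error."""
    xs = sorted(set(values))
    # Candidate thresholds: one below all values, plus midpoints between neighbors.
    candidates = [xs[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    best = None
    for t in candidates:
        for p in (+1, -1):
            err = sum(w for v, y, w in zip(values, labels, weights)
                      if stump_predict(v, t, p) != y)
            if best is None or err < best[0]:
                best = (err, t, p)
    return best


# Toy feature outputs for five training windows, equally weighted.
values = [1.0, 4.0, 3.0, 8.0, 9.0]
labels = [-1, -1, +1, +1, +1]
weights = [0.2] * 5

error, threshold, polarity = best_stump(values, labels, weights)
print(error, threshold, polarity)
```

On this toy data no single threshold separates the classes, so the best stump still misclassifies one example; that example is the one whose weight goes up in the next round.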

Start with uniform weights on training examples. For M rounds: evaluate weighted error for each weak learner, pick best learner; re-weight the examples (incorrectly classified get more weight, correctly classified get less weight); normalize the weights so they sum to 1. Final classifier is combination of weak ones, weighted according to error they had. Figure from C. Bishop, notes from K. Grauman
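The loop above can be sketched with the standard discrete AdaBoost formulas (learner weight alpha = 0.5 * ln((1 - err) / err), example weights multiplied by exp(-alpha * y * h(x))). The slides do not spell out these formulas, so treat them as an assumption of this sketch. To keep it short, each weak learner is represented by a precomputed row of +1/-1 predictions; the names `adaboost` and `strong_classify` are hypothetical.

```python
import math
from typing import List, Tuple


def adaboost(predictions: List[List[int]], labels: List[int],
             rounds: int) -> Tuple[List[Tuple[float, int]], List[float]]:
    """predictions[j][i] is weak learner j's output (+1 or -1) on example i."""
    n = len(labels)
    weights = [1.0 / n] * n                      # start with uniform weights
    ensemble = []                                # (alpha, learner index) per round
    for _ in range(rounds):
        # Weighted error of every weak learner; pick the best one.
        errors = [sum(w for p, y, w in zip(preds, labels, weights) if p != y)
                  for preds in predictions]
        j = min(range(len(predictions)), key=errors.__getitem__)
        err = min(max(errors[j], 1e-10), 1.0 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, j))
        # Misclassified examples gain weight, correct ones lose weight.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, labels, predictions[j])]
        total = sum(weights)
        weights = [w / total for w in weights]   # normalize so they sum to 1
    return ensemble, weights


def strong_classify(ensemble: List[Tuple[float, int]],
                    predictions: List[List[int]], i: int) -> int:
    """Sign of the alpha-weighted vote of the selected weak learners on example i."""
    score = sum(alpha * predictions[j][i] for alpha, j in ensemble)
    return 1 if score >= 0 else -1


labels = [+1, +1, +1, -1, -1, -1]
predictions = [
    [+1, +1, -1, -1, -1, -1],   # learner 0: wrong on example 2
    [+1, -1, +1, -1, -1, -1],   # learner 1: wrong on example 1
    [+1, +1, +1, +1, -1, -1],   # learner 2: wrong on example 3
]

_, weights_after_one = adaboost(predictions, labels, rounds=1)
ensemble, final_weights = adaboost(predictions, labels, rounds=3)
combined = [strong_classify(ensemble, predictions, i) for i in range(len(labels))]
print(weights_after_one)
print(combined)
```

In this toy run each weak learner makes one mistake, yet the weighted vote of all three classifies every example correctly, and after the first round the single misclassified example carries half of the total weight.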

Boosting for face detection. First two features selected by boosting (shown in figure). This feature combination can yield 100% detection rate and 50% false positive rate. Lana Lazebnik

Boosting: pros and cons. Advantages of boosting: integrates classification with feature selection; complexity of training is linear in the number of training examples; flexibility in the choice of weak learners, boosting scheme; testing is fast; easy to implement. Disadvantages: needs many training examples; often found not to work as well as an alternative discriminative classifier, support vector machine (SVM). Lana Lazebnik

Are we done? Even if the filters are fast to compute, each new image has a lot of possible windows to search. How to make the detection more efficient? Kristen Grauman

Cascading classifiers for detection. Form a cascade with low false negative rates early on. Apply less accurate but faster classifiers first to immediately discard windows that clearly appear to be negative. Kristen Grauman
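The early-rejection idea can be sketched as a loop over stages that stops at the first failure. Everything concrete here is an assumption for illustration: a "window" is just a list of three precomputed feature values, and the stage score functions and thresholds are made up rather than learned.

```python
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[Callable[[Sequence[float]], float], float]  # (score function, threshold)


def cascade_classify(window: Sequence[float], stages: List[Stage]) -> Tuple[bool, int]:
    """Return (is_face, number of stages evaluated); reject at the first failed stage."""
    for k, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, k          # clear negative: stop immediately
    return True, len(stages)         # survived every stage


# Hypothetical stages: cheap first, more features (more work) later.
stages: List[Stage] = [
    (lambda w: w[0], 0.2),
    (lambda w: w[0] + w[1], 1.0),
    (lambda w: sum(w), 2.0),
]

print(cascade_classify([0.1, 0.9, 0.9], stages))  # rejected by the first stage
print(cascade_classify([0.5, 0.3, 2.0], stages))  # rejected by the second stage
print(cascade_classify([0.6, 0.7, 0.9], stages))  # passes all three stages
```

Because the vast majority of windows are non-faces, most of them exit after the first stage or two, so the average cost per window stays close to the cost of the cheapest stage.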

Viola-Jones detector: summary. Train cascade of classifiers with AdaBoost on faces and non-faces; apply the selected features, thresholds, and weights to a new image. Train with 5K positives, 350M negatives. Real-time detector using 38-layer cascade (0.067 s). 6061 features in all layers. Adapted from Kristen Grauman

Viola-Jones detector: summary. A seminal approach to real-time object detection. Training is slow, but detection is very fast. Key ideas: integral images for fast feature evaluation; boosting for feature selection; attentional cascade of classifiers for fast rejection of non-face windows. P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001. P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), 2004. Kristen Grauman

Matlab Demo: https://www.mathworks.com/help/vision/ref/vision.cascadeobjectdetector-class.html#btaovqu

Viola-Jones Face Detector: Results. Kristen Grauman


Detecting profile faces? Can we use the same detector? Kristen Grauman

Viola-Jones Face Detector: Results. Paul Viola, ICCV tutorial. Kristen Grauman

Example using Viola-Jones detector. Frontal faces detected and then tracked, character names inferred with alignment of script and subtitles. Everingham, M., Sivic, J. and Zisserman, A. "Hello! My name is... Buffy" - Automatic naming of characters in TV video, BMVC 2006. http://www.robots.ox.ac.uk/~vgg/research/nface/index.html

Face detection and recognition: detection, then recognition ("Sally"). Lana Lazebnik

Consumer application: iPhoto 2009. Things iPhoto thinks are faces. Lana Lazebnik

Consumer application: iPhoto 2009. Can be trained to recognize pets! http://gizmodo.com/5140703/iphotos-facial-recognition-feature-works-on-cats Slide credit: Lana Lazebnik