1 / 41
CPSC 425: Computer Vision
Instructor: Fred Tung (ftung@cs.ubc.ca)
Department of Computer Science, University of British Columbia
Lecture Notes, 2015/2016 Term 2
Welcome to CPSC 425
Who has heard of Google's self-driving car or Tesla Autopilot?
Image credit: Google; Technology Review, 2015.
Welcome to CPSC 425
For an autonomous car to navigate safely, it must sense its environment:
- detect lane markings and obstacles
- detect and predict the movement of other cars, cyclists, and pedestrians
- interpret road signs and gestures from cyclists
Welcome to CPSC 425
Image credit: CBC, 2015.
Welcome to CPSC 425
How can we design computers (or robots, self-driving cars, ...) that make sense of a complex visual world?
That is the question that computer vision tries to answer.
Menu: January 5, 2016
Topics:
- Introduction
- Course Mechanics
- Course Topics
- Some Introductory Examples
Reading:
- Next: Forsyth & Ponce (2nd ed.), 1.1.1-1.1.3
Handouts:
- Assignment 1: Introduction to Python for Computer Vision
Reminders:
- Complete Assignment 1 by Tuesday, January 12
- www: http://www.cs.ubc.ca/~ftung/cpsc425/
- piazza: https://piazza.com/ubc.ca/winterterm22015/cpsc425/
Who is Fred?
- Fred Tung
- University of Waterloo graduate (2008)
- PhD candidate in Computer Science (supervisor: Jim Little)
My areas of research are:
- Scene parsing of images
- Scene parsing of video
- Large-scale visual search
Scene Parsing
[Figure: scene parsing example. Legend: building, car, fence, mountain, person, road, sidewalk, sky, unlabelled (ground truth only)]
Course Origins
CPSC 425 was originally developed by Bob Woodham and has evolved over the years. Much of the material this year is adapted from material prepared by Bob.
I will also share with you some exciting recent work in computer vision that solves real-world problems such as autonomous navigation, object recognition, and large-scale image search.
Framework for Class Discussion
Come to each class prepared to discuss that day's material at four levels:
- Problem: What is the problem addressed?
- Key Idea(s): What is the key idea (or ideas) behind the approach taken? What assumptions are made? Are there alternative approaches?
- Technical Detail(s): What theory underlies the approach taken? What are important practical aspects of implementation, experimentation and application?
- Gotchas: Are there unexpected features of the approach likely to trip up the inexperienced?
Course Expectations
Students in this class have varying backgrounds, skills, and expectations. Please respect, help and encourage each other.
I will expect you to:
- read assigned textbook sections in advance
- read any additional assigned reading in advance
- ask questions (both in and outside class)
- engage fully in all course activities: lectures, assignments, discussion, and office hours
- complete all assignments on time
- behave ethically
Course Mechanics
There will be:
- 7 assignments (6 marked); we will use Python 2 and four packages:
  - Python Imaging Library (PIL)
  - NumPy
  - Matplotlib
  - SciPy
- one (in-class) midterm exam, tentatively February 11 (the class before reading break)
- a 150-minute final exam, scheduled by the Registrar's office
Course Mechanics
- My office hour: Fridays 10:30-11:30, ICCS 187 (or email me for an appointment)
- TA office hours: TBA
- Course website: http://www.cs.ubc.ca/~ftung/cpsc425/
- Course Piazza group: https://piazza.com/ubc.ca/winterterm22015/cpsc425/
- There will be no extensions to assignment due dates
Course Mechanics (cont'd)
Marks for the course are calculated as follows:
  In class (clicker questions)   10%
  Assignments                    25%
  Midterm exam                   25%
  Final exam                     40%
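As a quick check of the weighting, the course mark is a weighted sum of the four components. The sketch below is illustrative only: the function name `course_mark`, the dictionary keys, and the sample scores are assumptions made for this example; only the weights come from the marking scheme.

```python
# Weights from the marking scheme (fractions of the final mark).
WEIGHTS = {
    "clicker": 0.10,
    "assignments": 0.25,
    "midterm": 0.25,
    "final": 0.40,
}


def course_mark(scores):
    """Return the weighted course mark given component scores in percent.

    `scores` maps each component name in WEIGHTS to a percentage (0-100).
    The key names are hypothetical labels chosen for this example.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, in percent.
example = {"clicker": 100.0, "assignments": 80.0, "midterm": 70.0, "final": 60.0}
print(round(course_mark(example), 2))
```

Because the weights sum to 1, a student who scores the same percentage on every component receives that percentage overall.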
Course Outline
I. Physics of Imaging
- Image formation
- Cameras and lenses
- Colour
II. Early Vision
- Image filtering, correlation/convolution
- Image characterisation
- Edge and corner detection
- Texture analysis
Course Outline
III. Mid-Level Vision
- Feature detection
- Model fitting
- Stereo
- Motion and optical flow
IV. High-Level Vision
- Clustering and classification
- Image classification
- Object detection
- Deep learning in computer vision
- Examples and applications
Questions?
Example 1: Dione and Titan (Moons of Saturn)
Image credit: NASA/JPL-Caltech/Space Science Institute
Example 1 (cont'd): Dione and Titan
Image credit: NASA/JPL-Caltech/Space Science Institute
Example 2: A Full (Earth) Moon
Example 3: Eggs?
Example 3 (cont'd)
Example 3 (cont'd)
Example 4: The dress
Lighting conditions also affect the perception of colour.
Example 5: Rotating Mask Example
Video: rotating mask
- Given a rotating mask, we have difficulty seeing the hollow side
- Our everyday experience tells us that the nose is pointing outwards and not inwards
The associated text for this example read, in part:
  "In solving the ill-posed problem from [sic] recovering 3D form from 2D images our brain makes a priori assumptions about the world. Assumption 1: Faces are convex"
(Original) credit: http://www.kyb.tuebingen.mpg.de/
Example 6: Handwritten Text
Read this!
Example 7: The FedEx Logo
Lindon Leader of Creative Leader designed the FedEx logo.
See the interview with him at http://www.thesneeze.com/mt-archives/000273.php
Example 8: First-Down Line
Image courtesy SporTVision http://www.sportvision.com/
Example 9: Kinect for Xbox 360
How does it work?
Image from the January/February 2011 issue of Technology Review
Example 9: Kinect for Xbox 360
The Kinect uses depth information to recognize the pose of the players.
Image from J. Shotton et al. (2011)
Example 10: Word Lens (Google Translate)
Real-time translation on your mobile device using optical character recognition + augmented reality
Image credit: Google
Example 11: Reverse image search
Search using an image instead of text
Image credit: Google; TinEye.
Example 12: Interactive image search
Pinterest recently introduced a feature that lets you look up product information by drawing boxes in images
Example 13: Amazon delivery drones
Delivering the product you ordered, by unmanned aerial vehicle
Image credit: Amazon
Example 14: Smart traffic systems
Adjusting a network of traffic lights in real time, based on current traffic conditions
Image credit: Miovision
Example 15: Sports video analytics at UBC
Here is a sample video sequence processed based on the combined thesis work of three LCI graduate students:
Video: 1000-frame broadcast hockey sequence
Credit: Kenji Okuma, Wei-Lwun Lu, Ankur Gupta
Example 15 (cont'd): Puck Location and Possession
Andrew Duan's M.Sc. thesis (August, 2011) integrates the determination of puck location and possession into our sports video analysis system.
Here's what Andrew's system does with the same hockey video sequence we saw before:
Video: 1000-frame broadcast hockey sequence
Credit: Xin Duan (Andrew)
Example 16: Basketball
Wei-Lwun Lu's Ph.D. thesis (October, 2011) tracks multiple players while preserving player identity.
Here are a couple of examples from Wei-Lwun's thesis using a basketball video sequence:
Video: Homography estimation
Video: Player identification
Credit: Wei-Lwun Lu http://www.cs.ubc.ca/~vailen/thesis/thesis.shtml
Example 17: Automating camera operation
More recently, Jianhui Chen (one of your TAs this term) has been working on automatic broadcast camera control.
Image credit: J. Chen and P. Carr, 2015.
Example 17: Automating camera operation
Image credit: IEEE Spectrum
Reminders:
- Complete Assignment 1 by Tuesday, January 12
- www: http://www.cs.ubc.ca/~ftung/cpsc425/
- piazza: https://piazza.com/ubc.ca/winterterm22015/cpsc425/