CS 2770: Computer Vision. Introduction. Prof. Adriana Kovashka University of Pittsburgh January 5, 2017

1 CS 2770: Computer Vision Introduction Prof. Adriana Kovashka University of Pittsburgh January 5, 2017

2 About the Instructor Born 1985 in Sofia, Bulgaria Got BA in 2008 at Pomona College, CA (Computer Science & Media Studies) Got PhD in 2014 at University of Texas at Austin (Computer Vision)

3 Course Info Course website: Instructor: Adriana Kovashka Use "CS2770" at the beginning of your Subject Office: Sennott Square 5325 Office hours: Tue/Thu, 3:30pm - 5:30pm

4 TA Keren Ye Office: Sennott Square 5501 Office hours: TBD Do the Doodle by the end of Friday:

5 Textbooks Computer Vision: Algorithms and Applications by Richard Szeliski Visual Object Recognition by Kristen Grauman and Bastian Leibe More resources available on course webpage Your notes from class are your best study material, slides are not complete with notes

6 Course Goals To learn about the basic computer vision tasks and approaches To get experience with some computer vision techniques To learn/apply basic machine learning (a key component of modern computer vision) To think critically about vision approaches, and to see connections between works and potential for improvement

7 Policies and Schedule

8 Should I take this class? It will be a lot of work! But you will learn a lot. Some parts will be hard and require that you pay close attention! But I will have periodic ungraded pop quizzes to see how you're doing. I will also pick on students randomly to answer questions. Use the instructor's and TA's office hours!

9 Questions?

10 Plan for Today Introductions What is computer vision? Why do we care? What are the challenges? What is the current research like? Overview of topics (if time)

11 Introductions What is your name? What one thing outside of school are you passionate about? Do you have any prior experience with computer vision? What do you hope to get out of this class? Every time you speak, please remind me of your name

12 Computer Vision

13 What is computer vision? Done? "We see with our brains, not with our eyes" (Oliver Sacks and others) Kristen Grauman (adapted)

14 What is computer vision? Automatic understanding of images and video Computing properties of the 3D world from visual data (measurement) Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation) Algorithms to mine, search, and interact with visual data (search and organization) Kristen Grauman

15 Vision for measurement Real-time stereo Structure from motion Multi-view stereo for community photo collections NASA Mars Rover Pollefeys et al. Goesele et al. Kristen Grauman Slide credit: L. Lazebnik

16 Vision for perception, interpretation Objects, activities, scenes, locations, text / writing, faces, gestures, motions, emotions [Figure: amusement park photo annotated with labels: The Wicked Twister, ride, Lake Erie, sky, water, Ferris wheel, amusement park, Cedar Point, tree, 12 E, deck, bench, carousel, umbrellas, pedestrians, maxair, people waiting in line, people sitting on ride] Kristen Grauman

17 Visual search, organization Query Image or video archives Relevant content Kristen Grauman

18 Related disciplines Graphics Image processing Artificial intelligence Computer vision Algorithms Machine learning Cognitive science Kristen Grauman

19 Vision and graphics Images Vision Model Graphics Inverse problems: analysis and synthesis. Kristen Grauman

20 Why vision? Images and video are everywhere! 144k hours uploaded to YouTube daily 4.5 mil photos uploaded to Flickr daily 10 bil images indexed by Google Personal photo albums Movies, news, sports Surveillance and security Medical and scientific images Adapted from Lana Lazebnik

21 Why vision? As image sources multiply, so do applications Relieve humans of boring, easy tasks Human-computer interaction Perception for robotics / autonomous agents Organize and give access to visual content Description of image content for the visually impaired Fun applications (e.g. transfer art styles to my photos) Adapted from Kristen Grauman

22 Faces and digital cameras Camera waits for everyone to smile to take a photo [Canon] Setting camera focus via face detection Kristen Grauman

23 Devi Parikh Face recognition

24 Linking to info with a mobile device Situated search Yeh et al., MIT kooaba MSR Lincoln Kristen Grauman

25 Exploring photo collections Snavely et al. Kristen Grauman

26 Special visual effects The Matrix What Dreams May Come Mocap for Pirates of the Caribbean, Industrial Light and Magic Source: S. Seitz Kristen Grauman

27 Yong Jae Lee Interactive systems

28 Video-based interfaces YouTube Link Human joystick NewsBreaker Live Assistive technology systems Camera Mouse Boston College Kristen Grauman

29 Vision for medical & neuroimages fMRI data Golland et al. Image guided surgery MIT AI Vision Group Kristen Grauman

30 Safety & security Navigation, driver safety Monitoring pool (Poseidon) Kristen Grauman Pedestrian detection MERL, Viola et al. Surveillance

31 Healthy eating FarmBot.io YouTube Link Im2calories by Myers et al., ICCV 2015 figure source

32 Self-training for sports? Pirsiavash et al., Assessing the Quality of Actions, ECCV 2014

33 Image generation Reed et al., ICML 2016 Radford et al., ICLR 2016

34 YouTube link Seeing AI

35 Obstacles? Kristen Grauman Read more about the history: Szeliski Sec. 1.2

36 What the computer gets Why is this problematic? Adapted from Kristen Grauman and Lana Lazebnik
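As a rough illustration of "what the computer gets", the sketch below treats an image as nothing more than a grid of intensity values. The tiny 3x4 image and the helper names `brighten` and `count_changed` are made up for this example. It shows one reason raw pixels are problematic: a simple brightness change alters every number even though the scene content is the same.

```python
# A tiny 3x4 grayscale "image": each entry is a pixel intensity in 0..255.
# The values are invented for illustration.
image = [
    [12, 15, 200, 210],
    [10, 18, 205, 215],
    [11, 14, 198, 220],
]


def brighten(img, delta):
    """Return a copy with delta added to every pixel, clipped to 0..255."""
    return [[min(255, max(0, p + delta)) for p in row] for row in img]


def count_changed(a, b):
    """Count positions where two same-sized images differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


brighter = brighten(image, 30)
changed = count_changed(image, brighter)
total = len(image) * len(image[0])
print(f"{changed} of {total} pixel values changed")
```

The same scene under slightly different lighting is a completely different array of numbers, so comparing raw pixel values directly is a fragile way to decide what an image shows.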

37 Why is vision difficult? Ill-posed problem: real world much more complex than what we can measure in images (3D → 2D) Impossible to literally invert image formation process with limited information Need information outside of this particular image to generalize what image portrays (e.g. to resolve occlusion) Adapted from Kristen Grauman
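One way to see why going from 3D to 2D cannot be inverted is a minimal pinhole-camera sketch. It assumes an idealized camera at the origin looking along the Z axis, with a hypothetical `project` helper and a focal length of 1. These choices are assumptions for illustration, not something stated on the slide.

```python
def project(point, focal_length=1.0):
    """Perspective-project a 3D point (X, Y, Z) onto the image plane.

    Assumes an ideal pinhole camera at the origin looking down +Z.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # same viewing ray, three times farther

print(project(near))
print(project(far))
```

Both points land on the same image location, so depth is lost in the projection. Recovering it requires extra information such as a second view or prior knowledge about object sizes.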

38 Challenges: many nuisance parameters Illumination Object pose Clutter Occlusions Intra-class appearance Viewpoint Think again about the pixels Kristen Grauman

39 Challenges: intra-class variation CMOA Pittsburgh slide credit: Fei-Fei, Fergus & Torralba

40 Challenges: importance of context slide credit: Fei-Fei, Fergus & Torralba

41 Challenges: Complexity Thousands to millions of pixels in an image 3,000-30,000 human recognizable object categories 30+ degrees of freedom in the pose of articulated objects (humans) Billions of images indexed by Google Image Search billion smart camera phones sold in 2015 About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991] Kristen Grauman

42 Challenges: Limited supervision Less More Kristen Grauman

43 Challenges: Vision requires reasoning Antol et al., VQA: Visual Question Answering, ICCV 2015

44 Ok, clearly the vision problem is deep and challenging... time to give up? Active research area with exciting progress! How datasets changed: Kristen Grauman

45 Datasets today ImageNet: 22k categories, 14mil images Microsoft COCO: 80 categories, 300k images PASCAL: 20 categories, 12k images SUN: 5k categories, 130k images

46 Some Visual Recognition Problems

47 Recognition: What is this?

48 Recognition: What objects do you see? building street balcony truck carriage horse table person person car

49 Detection: Where are the cars?

50 Activity: What is this person doing?

51 Scene: Is this an indoor scene?

52 Instance: Which city? Which building?

53 Visual question answering: What are all these people participating in?

54 The Latest at CVPR 2016 * CVPR = IEEE Conference on Computer Vision and Pattern Recognition

55 Our ability to detect objects has gone from 34 mAP (mean average precision) in 2008 to 73 mAP at 7 FPS (frames per second) or 63 mAP at 45 FPS in 2016
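For readers new to the metric, here is a simplified sketch of how mean average precision can be computed from ranked detections. It assumes each detection has already been marked correct or incorrect, and it uses plain (non-interpolated) average precision. Detection benchmarks add details such as overlap-based matching and interpolation that are omitted here. The function names and toy data are hypothetical.

```python
def average_precision(ranked_hits, num_positives):
    """Average precision for one class.

    ranked_hits: booleans for detections sorted by decreasing confidence,
                 True where the detection is correct.
    num_positives: number of ground-truth objects of this class.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / num_positives if num_positives else 0.0


def mean_average_precision(per_class):
    """Mean of per-class AP; per_class is a list of (ranked_hits, num_positives)."""
    aps = [average_precision(hits, n) for hits, n in per_class]
    return sum(aps) / len(aps)


cars = ([True, False, True], 2)  # AP = (1/1 + 2/3) / 2
dogs = ([False, True], 1)        # AP = (1/2) / 1
score = mean_average_precision([cars, dogs])
print(f"mAP = {100 * score:.1f}")
```

Scores are often reported on a 0 to 100 scale, which is how figures such as "34 mAP" and "73 mAP" should be read.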

56 Redmon et al., CVPR 2016 You Only Look Once: Unified, Real-Time Object Detection

57 Force from Motion: Decoding Physical Sensation from a First Person Video Park et al., CVPR 2016

58 MovieQA: Understanding Stories in Movies through Question-Answering Tapaswi et al., CVPR 2016

59 Owens et al., CVPR 2016 Visually Indicated Sounds

60 Anticipating Visual Representations from Unlabeled Video Vondrick et al., CVPR 2016

61 Gatys et al., CVPR 2016 Image Style Transfer Using Convolutional Neural Networks

62 DeepArt.io: try it for yourself! (Image Style Transfer Using Convolutional Neural Networks) Images: Styles:

63 DeepArt.io: try it for yourself! (Image Style Transfer Using Convolutional Neural Networks) Results:

64 Thomas and Kovashka, CVPR 2016 Seeing Behind the Camera: Identifying the Authorship of a Photograph

65 Is computer vision solved? Given an image, we can guess with 81% accuracy what object categories are shown (ResNet), but we only answer "why" questions about images with 14% accuracy!

66 Why does it seem that it's solved? Deep learning makes excellent use of massive data (labeled for the task of interest?) But it's hard to understand how it does so. It doesn't work well when massive data is not available and your task is different than tasks for which data is available. Sometimes the manner in which deep methods work is not intellectually appealing, but our smarter / more complex methods perform worse

67 Overview of Topics

68 Overview of topics Lower-level vision: Analyzing textures, edges and gradients in images, without concern for the semantics (e.g. objects) of the image. Higher-level vision: Making predictions about the semantics or higher-level functions of content in images (e.g. objects, attributes, styles, motion, etc.) Involves machine learning; we'll cover some basics of this, then go back to low-level tasks

69 Features and filters Transforming and describing images; textures, colors, edges Kristen Grauman
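As a small taste of what "filters" and "edges" mean in practice, the sketch below applies a central-difference filter along each row of a toy grayscale image. The image values and the `horizontal_gradient` name are invented for illustration. The course itself may use different filters and tools.

```python
def horizontal_gradient(img):
    """Central difference along each row; border columns are left at 0."""
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(1, width - 1):
            out[y][x] = img[y][x + 1] - img[y][x - 1]
    return out


# Dark region on the left, bright region on the right: a vertical edge.
image = [
    [10, 10, 10, 200, 200, 200],
    [10, 10, 10, 200, 200, 200],
]

grad = horizontal_gradient(image)
for row in grad:
    print(row)
```

The response is zero inside the flat regions and large only where the intensity jumps, which is the basic idea behind gradient-based edge detection.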

70 Features and filters Detecting distinctive + repeatable features Describing images with local statistics

71 Indexing and search Matching features and regions across images Kristen Grauman
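Matching features across images is often framed as a nearest-neighbor search over descriptor vectors. The following is a minimal brute-force sketch under that assumption, using made-up 2D descriptors and a hypothetical `match_descriptors` helper. Real descriptors are much higher-dimensional, and practical systems use indexing structures rather than exhaustive comparison.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_descriptors(desc_a, desc_b):
    """For each descriptor in desc_a, find the index of its nearest one in desc_b."""
    matches = []
    for i, a in enumerate(desc_a):
        j = min(range(len(desc_b)), key=lambda k: squared_distance(a, desc_b[k]))
        matches.append((i, j))
    return matches


# Toy descriptors for features found in two images (values are invented).
image1_features = [(0.0, 1.0), (5.0, 5.0)]
image2_features = [(5.1, 4.9), (0.1, 0.9), (9.0, 9.0)]

matches = match_descriptors(image1_features, image2_features)
print(matches)
```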

72 Image formation How does light in the 3D world project to form 2D images? Kristen Grauman

73 Multiple views Multi-view geometry, matching, invariant features, stereo vision Lowe Hartley and Zisserman Fei-Fei Li Kristen Grauman

74 Grouping and fitting Clustering, segmentation, fitting; what parts belong together? Kristen Grauman [fig from Shi et al]
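To make "what parts belong together?" concrete, here is a minimal sketch of clustering pixel intensities with k-means, one common way to group pixels. The pixel values, the initial centers, and the `kmeans_1d` helper are assumptions for illustration. Real segmentation typically clusters richer features such as color, position, or texture.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalar values, starting from the given centers."""
    centers = list(centers)

    def nearest(v):
        return min(range(len(centers)), key=lambda i: abs(v - centers[i]))

    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            clusters[nearest(v)].append(v)
        # Move each center to the mean of its members (keep it if empty).
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    labels = [nearest(v) for v in values]
    return centers, labels


pixels = [12, 15, 10, 200, 210, 205]  # invented intensities: dark vs. bright
centers, labels = kmeans_1d(pixels, centers=[0, 255])
print(centers)
print(labels)
```

Pixels that share a label form one group, here a dark segment and a bright segment.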

75 Visual recognition Recognizing objects and categories, learning techniques Kristen Grauman

76 Object detection Detecting novel instances of objects Classifying regions as one of several categories

77 Attribute-based description Describing the high-level properties of objects Allows recognition of unseen objects

78 Convolutional neural networks State-of-the-art on many recognition tasks Image Prediction Krizhevsky et al. Yosinski et al., ICML DL workshop 2015

79 Recurrent neural networks Sequence processing, e.g. question answering Wu et al., CVPR 2016

80 Motion and tracking Tracking objects, video analysis Tomas Izo Kristen Grauman

81 Pose and actions Automatically annotating human pose (joints) Recognizing actions in first-person video

82 Your Homework Fill out Doodle Read entire course website Do first reading

83 Next Time Linear algebra review Matlab tutorial


More information

IMAGE AESTHETIC PREDICTORS BASED ON WEIGHTED CNNS. Oce Print Logic Technologies, Creteil, France

IMAGE AESTHETIC PREDICTORS BASED ON WEIGHTED CNNS. Oce Print Logic Technologies, Creteil, France IMAGE AESTHETIC PREDICTORS BASED ON WEIGHTED CNNS Bin Jin, Maria V. Ortiz Segovia2 and Sabine Su sstrunk EPFL, Lausanne, Switzerland; 2 Oce Print Logic Technologies, Creteil, France ABSTRACT Convolutional

More information

1) New Paths to New Machine Learning Science. 2) How an Unruly Mob Almost Stole. Jeff Howbert University of Washington

1) New Paths to New Machine Learning Science. 2) How an Unruly Mob Almost Stole. Jeff Howbert University of Washington 1) New Paths to New Machine Learning Science 2) How an Unruly Mob Almost Stole the Grand Prize at the Last Moment Jeff Howbert University of Washington February 4, 2014 Netflix Viewing Recommendations

More information

Representations of Sound in Deep Learning of Audio Features from Music

Representations of Sound in Deep Learning of Audio Features from Music Representations of Sound in Deep Learning of Audio Features from Music Sergey Shuvaev, Hamza Giaffar, and Alexei A. Koulakov Cold Spring Harbor Laboratory, Cold Spring Harbor, NY Abstract The work of a

More information

Browsing News and Talk Video on a Consumer Electronics Platform Using Face Detection

Browsing News and Talk Video on a Consumer Electronics Platform Using Face Detection Browsing News and Talk Video on a Consumer Electronics Platform Using Face Detection Kadir A. Peker, Ajay Divakaran, Tom Lanning Mitsubishi Electric Research Laboratories, Cambridge, MA, USA {peker,ajayd,}@merl.com

More information

This project will work with two different areas in digital signal processing: Image Processing Sound Processing

This project will work with two different areas in digital signal processing: Image Processing Sound Processing Title of Project: Shape Controlled DJ Team members: Eric Biesbrock, Daniel Cheng, Jinkyu Lee, Irene Zhu I. Introduction and overview of project Our project aims to combine image and sound processing into

More information

Smart Traffic Control System Using Image Processing

Smart Traffic Control System Using Image Processing Smart Traffic Control System Using Image Processing Prashant Jadhav 1, Pratiksha Kelkar 2, Kunal Patil 3, Snehal Thorat 4 1234Bachelor of IT, Department of IT, Theem College Of Engineering, Maharashtra,

More information

Digital Image Processing and Pattern Recognition

Digital Image Processing and Pattern Recognition Digital Image Processing and Pattern Recognition Malay K. Pakhira Click here if your download doesn"t start automatically Digital Image Processing and Pattern Recognition Malay K. Pakhira Digital Image

More information

Generating Chinese Classical Poems Based on Images

Generating Chinese Classical Poems Based on Images , March 14-16, 2018, Hong Kong Generating Chinese Classical Poems Based on Images Xiaoyu Wang, Xian Zhong, Lin Li 1 Abstract With the development of the artificial intelligence technology, Chinese classical

More information

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Kyogu Lee

More information

Pedestrian Detection with a Large-Field-Of-View Deep Network

Pedestrian Detection with a Large-Field-Of-View Deep Network Pedestrian Detection with a Large-Field-Of-View Deep Network Anelia Angelova 1 Alex Krizhevsky 2 and Vincent Vanhoucke 3 Abstract Pedestrian detection is of crucial importance to autonomous driving applications.

More information

Visual Communication at Limited Colour Display Capability

Visual Communication at Limited Colour Display Capability Visual Communication at Limited Colour Display Capability Yan Lu, Wen Gao and Feng Wu Abstract: A novel scheme for visual communication by means of mobile devices with limited colour display capability

More information

CHAPTER-9 DEVELOPMENT OF MODEL USING ANFIS

CHAPTER-9 DEVELOPMENT OF MODEL USING ANFIS CHAPTER-9 DEVELOPMENT OF MODEL USING ANFIS 9.1 Introduction The acronym ANFIS derives its name from adaptive neuro-fuzzy inference system. It is an adaptive network, a network of nodes and directional

More information

workbook Listening scripts

workbook Listening scripts workbook Listening scripts 42 43 UNIT 1 Page 9, Exercise 2 Narrator: Do you do any sports? Student 1: Yes! Horse riding! I m crazy about horses, you see. Being out in the countryside on a horse really

More information

Preface. system has put emphasis on neuroscience, both in studies and in the treatment of tinnitus.

Preface. system has put emphasis on neuroscience, both in studies and in the treatment of tinnitus. Tinnitus (ringing in the ears) has many forms, and the severity of tinnitus ranges widely from being a slight nuisance to affecting a person s daily life. How loud the tinnitus is perceived does not directly

More information

Joint bottom-up/top-down machine learning structures to simulate human audition and musical creativity

Joint bottom-up/top-down machine learning structures to simulate human audition and musical creativity Joint bottom-up/top-down machine learning structures to simulate human audition and musical creativity Jonas Braasch Director of Operations, Professor, School of Architecture Rensselaer Polytechnic Institute,

More information

Photo Aesthetics Ranking Network with Attributes and Content Adaptation

Photo Aesthetics Ranking Network with Attributes and Content Adaptation Photo Aesthetics Ranking Network with Attributes and Content Adaptation Shu Kong 1, Xiaohui Shen 2, Zhe Lin 2, Radomir Mech 2, Charless Fowlkes 1 1 UC Irvine {skong2, fowlkes}@ics.uci.edu 2 Adobe Research

More information

SMART VEHICLE SCREENING SYSTEM USING ARTIFICIAL INTELLIGENCE METHODS

SMART VEHICLE SCREENING SYSTEM USING ARTIFICIAL INTELLIGENCE METHODS 1 TERNOPIL ACADEMY OF NATIONAL ECONOMY INSTITUTE OF COMPUTER INFORMATION TECHNOLOGIES SMART VEHICLE SCREENING SYSTEM USING ARTIFICIAL INTELLIGENCE METHODS Presenters: Volodymyr Turchenko Vasyl Koval The

More information

Coal Mines Security System

Coal Mines Security System www.ijcsi.org 419 Coal Mines Security System Ankita Guhe, Shruti Deshmukh, Bhagyashree Borekar, Apoorva Kailaswar,Milind E.Rane Department Electronics Engg. Vishwakarma Institute Technology(VIT), Pune,411037,INDIA

More information

Introduction to Knowledge Systems

Introduction to Knowledge Systems Introduction to Knowledge Systems 1 Knowledge Systems Knowledge systems aim at achieving intelligent behavior through computational means 2 Knowledge Systems Knowledge is usually represented as a kind

More information

arxiv: v2 [cs.cv] 27 Jul 2016

arxiv: v2 [cs.cv] 27 Jul 2016 arxiv:1606.01621v2 [cs.cv] 27 Jul 2016 Photo Aesthetics Ranking Network with Attributes and Adaptation Shu Kong, Xiaohui Shen, Zhe Lin, Radomir Mech, Charless Fowlkes UC Irvine Adobe {skong2,fowlkes}@ics.uci.edu

More information

arxiv: v1 [cs.cv] 9 Apr 2018

arxiv: v1 [cs.cv] 9 Apr 2018 arxiv:1804.03160v1 [cs.cv] 9 Apr 2018 The Sound of Pixels Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick Josh McDermott, and Antonio Torralba Massachusetts Institute of Technology Abstract.

More information

4K Video, Real-Time Analytics, and AI Applications Drive 24G SAS

4K Video, Real-Time Analytics, and AI Applications Drive 24G SAS 4K Video, Real-Time Analytics, and AI Applications Drive 24G SAS Dennis Martin Santa Clara, CA http://www.demartek.com/demartek_presenting_flashmemorysummit_2017-08.html 1 About Demartek Industry Analysis

More information

Large Scale Concepts and Classifiers for Describing Visual Sentiment in Social Multimedia

Large Scale Concepts and Classifiers for Describing Visual Sentiment in Social Multimedia Large Scale Concepts and Classifiers for Describing Visual Sentiment in Social Multimedia Shih Fu Chang Columbia University http://www.ee.columbia.edu/dvmm June 2013 Damian Borth Tao Chen Rongrong Ji Yan

More information

Detecting Bosch IVA Events with Milestone XProtect

Detecting Bosch IVA Events with Milestone XProtect Date: 8 December Detecting Bosch IVA Events with Prepared by: Tim Warren, Solutions Integration Engineer, Content and Technical Development 2 Table of Content 3 Overview 3 Camera Configuration 3 XProtect

More information

Through-Wall Human Pose Estimation Using Radio Signals

Through-Wall Human Pose Estimation Using Radio Signals Through-Wall Human Pose Estimation Using Radio Signals Mingmin Zhao Tianhong Li Mohammad Abu Alsheikh Yonglong Tian Antonio Torralba Dina Katabi MIT CSAIL Hang Zhao Figure 1: The figure shows a test example

More information

Speech Recognition and Signal Processing for Broadcast News Transcription

Speech Recognition and Signal Processing for Broadcast News Transcription 2.2.1 Speech Recognition and Signal Processing for Broadcast News Transcription Continued research and development of a broadcast news speech transcription system has been promoted. Universities and researchers

More information

IMPROVING VIDEO ANALYTICS PERFORMANCE FACTORS THAT INFLUENCE VIDEO ANALYTIC PERFORMANCE WHITE PAPER

IMPROVING VIDEO ANALYTICS PERFORMANCE FACTORS THAT INFLUENCE VIDEO ANALYTIC PERFORMANCE WHITE PAPER IMPROVING VIDEO ANALYTICS PERFORMANCE FACTORS THAT INFLUENCE VIDEO ANALYTIC PERFORMANCE WHITE PAPER Modern video analytic algorithms have changed the way organizations monitor and act on their security

More information

THE FUTURE OF VOICE ASSISTANTS IN THE NETHERLANDS. To what extent should voice technology improve in order to conquer the Western European market?

THE FUTURE OF VOICE ASSISTANTS IN THE NETHERLANDS. To what extent should voice technology improve in order to conquer the Western European market? THE FUTURE OF VOICE ASSISTANTS IN THE NETHERLANDS To what extent should voice technology improve in order to conquer the Western European market? THE FUTURE OF VOICE ASSISTANTS IN THE NETHERLANDS Go to

More information

Image Aesthetics Assessment using Deep Chatterjee s Machine

Image Aesthetics Assessment using Deep Chatterjee s Machine Image Aesthetics Assessment using Deep Chatterjee s Machine Zhangyang Wang, Ding Liu, Shiyu Chang, Florin Dolcos, Diane Beck, Thomas Huang Department of Computer Science and Engineering, Texas A&M University,

More information

Singing voice synthesis based on deep neural networks

Singing voice synthesis based on deep neural networks INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Singing voice synthesis based on deep neural networks Masanari Nishimura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

More information

CS 7643: Deep Learning

CS 7643: Deep Learning CS 7643: Deep Learning Topics: Computational Graphs Notation + example Computing Gradients Forward mode vs Reverse mode AD Dhruv Batra Georgia Tech Administrativia HW1 Released Due: 09/22 PS1 Solutions

More information

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval

DAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Rebecca

More information

Music Genre Classification and Variance Comparison on Number of Genres

Music Genre Classification and Variance Comparison on Number of Genres Music Genre Classification and Variance Comparison on Number of Genres Miguel Francisco, miguelf@stanford.edu Dong Myung Kim, dmk8265@stanford.edu 1 Abstract In this project we apply machine learning techniques

More information

Therapy for Memory: A Music Activity and Educational Program for Cognitive Impairments

Therapy for Memory: A Music Activity and Educational Program for Cognitive Impairments 2 Evidence for Music Therapy Therapy for Memory: A Music Activity and Educational Program for Cognitive Impairments Richard S. Isaacson, MD Vice Chair of Education Associate Prof of Clinical Neurology

More information

Music Information Retrieval with Temporal Features and Timbre

Music Information Retrieval with Temporal Features and Timbre Music Information Retrieval with Temporal Features and Timbre Angelina A. Tzacheva and Keith J. Bell University of South Carolina Upstate, Department of Informatics 800 University Way, Spartanburg, SC

More information