CS 1674: Intro to Computer Vision. Intro to Recognition. Prof. Adriana Kovashka University of Pittsburgh October 24, 2016
|
|
- Emil Jordan
- 5 years ago
- Views:
Transcription
1 CS 1674: Intro to Computer Vision Intro to Recognition Prof. Adriana Kovashka University of Pittsburgh October 24, 2016
2 Plan for today Examples of visual recognition problems What should we recognize? Recognition pipeline Features Data Overview of some methods for classification K-Nearest Neighbors Linear classifiers
3 Some translations Feature vector = descriptor = representation Recognition often involves classification Classes = categories (hence classification = categorization) Training = learning a model (e.g. classifier), happens at training time from training data Classification = prediction, happens at test time
4 Slide credit: D. Hoiem
5 Classification Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Decision boundary Zebra Non-zebra Slide credit: L. Lazebnik
6 Classification Assign input vector to one of two or more classes Any decision rule divides the input space into decision regions separated by decision boundaries Slide credit: L. Lazebnik
7 Example: Spam filter Slide credit: L. Lazebnik
8 Examples of Categorization in Vision Part or object detection E.g., for each window: face or non-face? Scene categorization Indoor vs. outdoor, urban, forest, kitchen, etc. Action recognition Picking up vs. sitting down vs. standing Emotion recognition Happy vs. scared vs. surprised Region classification Label pixels into different object/surface categories Boundary classification Boundary vs. non-boundary Etc, etc. Adapted from D. Hoiem
9 What do you see in this image? Trees Bear Camera Man Can I put stuff in it? Rabbit Grass Forest Slide credit: D. Hoiem
10 Describe, predict, or interact with the object based on visual cues Is it dangerous? Is it alive? How fast does it run? Does it have a tail? Is it soft? Can I poke with it? Slide credit: D. Hoiem
11 Image categorization Two-class (binary): Cat vs Dog Adapted from D. Hoiem
12 Image categorization Multi-class (often): Object recognition Caltech 101 Average Object Images Adapted from D. Hoiem
13 Image categorization Fine-grained recognition Visipedia Project Slide credit: D. Hoiem
14 Image categorization Place recognition Places Database [Zhou et al. NIPS 2014] Slide credit: D. Hoiem
15 Image categorization Dating historical photos [Palermo et al. ECCV 2012] Slide credit: D. Hoiem
16 Image categorization Image style recognition [Karayev et al. BMVC 2014] Slide credit: D. Hoiem
17 Region categorization Layout prediction Assign regions to orientation Geometric context [Hoiem et al. IJCV 2007] Assign regions to depth Make3D [Saxena et al. PAMI 2008] Slide credit: D. Hoiem
18 Region categorization Material recognition [Bell et al. CVPR 2015] Slide credit: D. Hoiem
19 Attribute-based recognition A. Farhadi, I. Endres, D. Hoiem, and D Forsyth. Describing Objects by their Attributes. CVPR 2009.
20 Attribute-based recognition A. Kovashka, S. Vijayanarasimhan, and K. Grauman. Actively Selecting Annotations Among Objects and Attributes. ICCV 2011.
21 Attribute-based search A. Kovashka, D. Parikh and K. Grauman. WhittleSearch: Image Search with Relative Attribute Feedback. CVPR 2012.
22 Generic categorization problem Slide credit: K. Grauman
23 Instance-level recognition problem John s car Which one do you think is harder: Generic or instance-level recognition? Adapted from K. Grauman
24 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Visual Object Categories What stuff should we bother to recognize? Basic Level Categories in human categorization [Rosch 76, Lakoff 87] The highest level at which category members have similar perceived shape The highest level at which a single mental image reflects the entire category The level at which human subjects are usually fastest at identifying category members The first level named and understood by children The highest level at which a person uses similar motor actions for interaction with category members K. Grauman, B. Leibe
25 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Visual Object Categories Basic-level categories in humans seem to be defined predominantly visually. There is evidence that humans (usually) start with basic-level categorization before doing identification. Basic-level categorization is easier and faster for humans than object identification! Abstract levels animal quadruped Basic level dog cat cow German shepherd Doberman K. Grauman, B. Leibe Individual level Fido
26 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Object Categorization Task Description Given a small number of training images of a category, recognize a-priori unknown instances of that category and assign the correct category label. Which categories are feasible visually? Fido German shepherd dog animal living being K. Grauman, B. Leibe
27 How many object categories are there? Source: Fei-Fei Li, Rob Fergus, Antonio Torralba. Biederman 1987
28
29 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Other Types of Categories Functional Categories e.g. chairs = something you can sit on K. Grauman, B. Leibe
30 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Other Types of Categories Ad-hoc categories e.g. something you can find in an office environment K. Grauman, B. Leibe
31 Why recognition? Recognition a fundamental part of perception e.g., robots, autonomous agents Organize and give access to visual content Connect to information Detect trends and themes Slide credit: K. Grauman
32 Why is recognition hard?
33 Recognition: A machine learning approach
34 The machine learning framework Apply a prediction function to a feature representation of the image to get the desired output: f( ) = apple f( ) = tomato f( ) = cow Slide credit: L. Lazebnik
35 The machine learning framework y = f(x) output prediction function image feature Training: given a training set of labeled examples {(x 1,y 1 ),, (x N,y N )}, estimate the prediction function f by minimizing the prediction error on the training set Testing: apply f to a never before seen test example x and output the predicted value y = f(x) Slide credit: L. Lazebnik
36 Training Training Images Steps Image Features Training Labels Training Learned model Testing Test Image Image Features Learned model Prediction Slide credit: D. Hoiem and L. Lazebnik
37 Q: What are good features for recognizing a beach? Slide credit: D. Hoiem
38 Q: What are good features for recognizing cloth fabric? Slide credit: D. Hoiem
39 Q: What are good features for recognizing a mug? Slide credit: D. Hoiem
40 Q: What are good features for fine-grained recognition? Cardigan Welsh Corgi? Pembroke Welsh Corgi What breed is this dog? Slide credit: J. Deng
41 What are the right features? Depend on what you want to know! Object: shape Local shape info, shading, shadows, texture Material properties: albedo, feel, hardness Color, texture Scene: geometric layout linear perspective, gradients, line segments Action: motion Optical flow, tracked points Slide credit: D. Hoiem
42 What kind of things do we compute histograms of? Color L*a*b* color space HSV color space Texture (filter banks or HOG over regions) Slide credit: D. Hoiem
43 What kind of things do we compute histograms of? Histograms of descriptors SIFT [Lowe IJCV 2004] Bag of visual words Slide credit: D. Hoiem
44 Bag of visual words Image patches Cluster patches BoW histogram Slide credit: D. Hoiem
45 Training Training Images Steps Image Features Training Labels Training Learned model Testing Test Image Image Features Learned model Prediction Slide credit: D. Hoiem and L. Lazebnik
46 Recognition training data Images in the training set must be annotated with the correct answer that the model is expected to produce Motorbike Slide credit: L. Lazebnik
47 Datasets today ImageNet: 22k categories, 14mil images Microsoft COCO: 70 categories, 300k images PASCAL: 20 categories, 12k images SUN: 5k categories, 130k images
48 The PASCAL Visual Object Classes Challenge ( ) Challenge classes: Person: person Animal: bird, cat, cow, dog, horse, sheep Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor Dataset size (by 2012): 11.5K training/validation images, 27K bounding boxes, 7K segmentations Slide credit: L. Lazebnik
49 PASCAL competitions Classification: For each of the twenty classes, predicting presence/absence of an example of that class in the test image Detection: Predicting the bounding box (if any) and label of each object from the twenty target classes in the test image Adapted from L. Lazebnik
50 Challenges: robustness Illumination Object pose Clutter Occlusions Intra-class appearance Viewpoint Slide credit: K. Grauman
51 Challenges: robustness Realistic scenes are crowded, cluttered, have overlapping objects. Slide credit: K. Grauman
52 Challenges: importance of context Slide credit: Fei-Fei, Fergus & Torralba
53 Painter identification How would you learn to identify the author of a painting? Goya Kirchner Klimt Marc Monet Van Gogh
54 One way to think about it Training labels dictate that two examples are the same or different, in some sense Features and distances define visual similarity Goal of training is to learn feature weights so that visual similarity predicts label similarity Linear classifier: confidence in positive label is a weighted sum of features What are the weights? We want the simplest function that is confidently correct Adapted from D. Hoiem
55 Nearest neighbor classifier Training examples from class 1 Test example Training examples from class 2 f(x) = label of the training example nearest to x All we need is a distance function for our inputs No training required! Slide credit: L. Lazebnik
56 K-Nearest Neighbors classification For a new point, find the k closest points from training data Labels of the k points vote to classify Black = negative Red = positive k = 5 If query lands here, the 5 NN consist of 3 negatives and 2 positives, so we classify it as negative. Slide credit: D. Lowe
57 1-nearest neighbor x x o o x + x o o o x o + x x x x2 o x1 Slide credit: D. Hoiem
58 3-nearest neighbor x x o o x + x o o o x o + x x x x2 o x1 Slide credit: D. Hoiem
59 5-nearest neighbor x x o o x + x o o o x o + x x x x2 o x1 What are the tradeoffs of having a too large k? Too small k? Slide credit: D. Hoiem
60 A nearest neighbor recognition example: im2gps: Estimating Geographic Information from a Single Image. James Hays and Alexei Efros. CVPR
61 Where in the World? Slides: James Hays
62 Where in the World? Slides: James Hays
63 Where in the World? Slides: James Hays
64 How much can an image tell about its geographic location? Slide credit: James Hays
65 Nearest Neighbors according to gist + bag of SIFT + color histogram + a few others Slide credit: James Hays
66 6+ million geotagged photos by 109,788 photographers Slides: James Hays
67 [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.] Slides: James Hays
68 Slides: James Hays
69 [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.] Slides: James Hays
70 The Importance of Data [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.] Slides: James Hays
71 Discriminative classifiers Learn a simple function of the input features that correctly predicts the true labels on the training set y = f x Training Goals 1. Accurate classification of training data 2. Correct classifications are confident 3. Classification function is simple Slide credit: D. Hoiem
72 What about this line? Linear classifier Find a linear function to separate the classes f(x) = sgn(w 1 x 1 + w 2 x w D x D ) = sgn(w x) Slide credit: L. Lazebnik
73 NN vs. linear classifiers NN pros: + Simple to implement + Decision boundaries not necessarily linear + Works for any number of classes + Nonparametric method NN cons: Need good distance function Slow at test time (large search problem to find neighbors) Storage of data Linear pros: + Low-dimensional parametric representation + Very fast at test time Linear cons: Works for two classes How to train the linear function? What if data is not linearly separable? Adapted from L. Lazebnik
74 Evaluating Classifiers Accuracy # correctly classified / # all test examples Precision/recall Precision = # retrieved positives / # retrieved Recall = # retrieved positives / # positives F-measure = 2PR / (P + R)
CS 1674: Intro to Computer Vision. Face Detection. Prof. Adriana Kovashka University of Pittsburgh November 7, 2016
CS 1674: Intro to Computer Vision Face Detection Prof. Adriana Kovashka University of Pittsburgh November 7, 2016 Today Window-based generic object detection basic pipeline boosting classifiers face detection
More informationGeneric object recognition
Generic object recognition May 19 th, 2015 Yong Jae Lee UC Davis Announcements PS3 out; due 6/3, 11:59 pm Sign attendance sheet (3 rd one) 2 Indexing local features 3 Kristen Grauman Visual words Map high-dimensional
More informationCS 1699: Intro to Computer Vision. Introduction. Prof. Adriana Kovashka University of Pittsburgh September 1, 2015
CS 1699: Intro to Computer Vision Introduction Prof. Adriana Kovashka University of Pittsburgh September 1, 2015 Course Info Course website: http://people.cs.pitt.edu/~kovashka/cs1699 Instructor: Adriana
More informationThe Bias-Variance Tradeoff
CS 2750: Machine Learning The Bias-Variance Tradeoff Prof. Adriana Kovashka University of Pittsburgh January 13, 2016 Plan for Today More Matlab Measuring performance The bias-variance trade-off Matlab
More informationIndexing local features. Wed March 30 Prof. Kristen Grauman UT-Austin
Indexing local features Wed March 30 Prof. Kristen Grauman UT-Austin Matching local features Kristen Grauman Matching local features? Image 1 Image 2 To generate candidate matches, find patches that have
More informationBBM 413 Fundamentals of Image Processing Dec. 11, Erkut Erdem Dept. of Computer Engineering Hacettepe University. Segmentation Part 1
BBM 413 Fundamentals of Image Processing Dec. 11, 2012 Erkut Erdem Dept. of Computer Engineering Hacettepe University Segmentation Part 1 Image segmentation Goal: identify groups of pixels that go together
More informationCS 2770: Computer Vision. Introduction. Prof. Adriana Kovashka University of Pittsburgh January 5, 2017
CS 2770: Computer Vision Introduction Prof. Adriana Kovashka University of Pittsburgh January 5, 2017 About the Instructor Born 1985 in Sofia, Bulgaria Got BA in 2008 at Pomona College, CA (Computer Science
More informationIndexing local features and instance recognition
Indexing local features and instance recognition May 14 th, 2015 Yong Jae Lee UC Davis Announcements PS2 due Saturday 11:59 am 2 Approximating the Laplacian We can approximate the Laplacian with a difference
More informationInstance Recognition. Jia-Bin Huang Virginia Tech ECE 6554 Advanced Computer Vision
Instance Recognition Jia-Bin Huang Virginia Tech ECE 6554 Advanced Computer Vision Administrative stuffs Paper review submitted? Topic presentation Experiment presentation For / Against discussion lead
More informationECS 189G: Intro to Computer Vision March 31 st, Yong Jae Lee Assistant Professor CS, UC Davis
ECS 189G: Intro to Computer Vision March 31 st, 2015 Yong Jae Lee Assistant Professor CS, UC Davis Plan for today Topic overview Introductions Course overview: Logistics and requirements 2 What is Computer
More informationFOIL it! Find One mismatch between Image and Language caption
FOIL it! Find One mismatch between Image and Language caption ACL, Vancouver, 31st July, 2017 Ravi Shekhar, Sandro Pezzelle, Yauhen Klimovich, Aurelie Herbelot, Moin Nabi, Enver Sangineto, Raffaella Bernardi
More informationLecture 5: Clustering and Segmentation Part 1
Lecture 5: Clustering and Segmentation Part 1 Professor Fei Fei Li Stanford Vision Lab 1 What we will learn today Segmentation and grouping Gestalt principles Segmentation as clustering K means Feature
More informationMusic Emotion Recognition. Jaesung Lee. Chung-Ang University
Music Emotion Recognition Jaesung Lee Chung-Ang University Introduction Searching Music in Music Information Retrieval Some information about target music is available Query by Text: Title, Artist, or
More informationDiscriminative and Generative Models for Image-Language Understanding. Svetlana Lazebnik
Discriminative and Generative Models for Image-Language Understanding Svetlana Lazebnik Image-language understanding Robot, take the pan off the stove! Discriminative image-language tasks Image-sentence
More informationInternational Journal of Advance Engineering and Research Development MUSICAL INSTRUMENT IDENTIFICATION AND STATUS FINDING WITH MFCC
Scientific Journal of Impact Factor (SJIF): 5.71 International Journal of Advance Engineering and Research Development Volume 5, Issue 04, April -2018 e-issn (O): 2348-4470 p-issn (P): 2348-6406 MUSICAL
More informationMUSI-6201 Computational Music Analysis
MUSI-6201 Computational Music Analysis Part 9.1: Genre Classification alexander lerch November 4, 2015 temporal analysis overview text book Chapter 8: Musical Genre, Similarity, and Mood (pp. 151 155)
More informationDeepID: Deep Learning for Face Recognition. Department of Electronic Engineering,
DeepID: Deep Learning for Face Recognition Xiaogang Wang Department of Electronic Engineering, The Chinese University i of Hong Kong Machine Learning with Big Data Machine learning with small data: overfitting,
More informationSummarizing Long First-Person Videos
CVPR 2016 Workshop: Moving Cameras Meet Video Surveillance: From Body-Borne Cameras to Drones Summarizing Long First-Person Videos Kristen Grauman Department of Computer Science University of Texas at
More informationHearing Sheet Music: Towards Visual Recognition of Printed Scores
Hearing Sheet Music: Towards Visual Recognition of Printed Scores Stephen Miller 554 Salvatierra Walk Stanford, CA 94305 sdmiller@stanford.edu Abstract We consider the task of visual score comprehension.
More informationTopics in Computer Music Instrument Identification. Ioanna Karydi
Topics in Computer Music Instrument Identification Ioanna Karydi Presentation overview What is instrument identification? Sound attributes & Timbre Human performance The ideal algorithm Selected approaches
More informationEnhancing Semantic Features with Compositional Analysis for Scene Recognition
Enhancing Semantic Features with Compositional Analysis for Scene Recognition Miriam Redi and Bernard Merialdo EURECOM, Sophia Antipolis 2229 Route de Cretes Sophia Antipolis {redi,merialdo}@eurecom.fr
More informationLEARNING AUDIO SHEET MUSIC CORRESPONDENCES. Matthias Dorfer Department of Computational Perception
LEARNING AUDIO SHEET MUSIC CORRESPONDENCES Matthias Dorfer Department of Computational Perception Short Introduction... I am a PhD Candidate in the Department of Computational Perception at Johannes Kepler
More informationLarge scale Visual Sentiment Ontology and Detectors Using Adjective Noun Pairs
Large scale Visual Sentiment Ontology and Detectors Using Adjective Noun Pairs Damian Borth 1,2, Rongrong Ji 1, Tao Chen 1, Thomas Breuel 2, Shih-Fu Chang 1 1 Columbia University, New York, USA 2 University
More informationInstrument Recognition in Polyphonic Mixtures Using Spectral Envelopes
Instrument Recognition in Polyphonic Mixtures Using Spectral Envelopes hello Jay Biernat Third author University of Rochester University of Rochester Affiliation3 words jbiernat@ur.rochester.edu author3@ismir.edu
More informationWHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG?
WHAT MAKES FOR A HIT POP SONG? WHAT MAKES FOR A POP SONG? NICHOLAS BORG AND GEORGE HOKKANEN Abstract. The possibility of a hit song prediction algorithm is both academically interesting and industry motivated.
More informationMusic Mood. Sheng Xu, Albert Peyton, Ryan Bhular
Music Mood Sheng Xu, Albert Peyton, Ryan Bhular What is Music Mood A psychological & musical topic Human emotions conveyed in music can be comprehended from two aspects: Lyrics Music Factors that affect
More information2. Problem formulation
Artificial Neural Networks in the Automatic License Plate Recognition. Ascencio López José Ignacio, Ramírez Martínez José María Facultad de Ciencias Universidad Autónoma de Baja California Km. 103 Carretera
More informationAn Introduction to Deep Image Aesthetics
Seminar in Laboratory of Visual Intelligence and Pattern Analysis (VIPA) An Introduction to Deep Image Aesthetics Yongcheng Jing College of Computer Science and Technology Zhejiang University Zhenchuan
More informationUsing Genre Classification to Make Content-based Music Recommendations
Using Genre Classification to Make Content-based Music Recommendations Robbie Jones (rmjones@stanford.edu) and Karen Lu (karenlu@stanford.edu) CS 221, Autumn 2016 Stanford University I. Introduction Our
More informationDAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval
DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Rebecca
More informationSemantic Image Segmentation via Deep Parsing Network
Semantic Image Segmentation via Deep Parsing Network Ziwei Liu*, Xiaoxiao Li*, Ping Luo, Chen Change Loy, Xiaoou Tang Multimedia Lab, The Chinese University of Hong Kong Problem Problem TV Background Plant
More informationAutomatic Music Clustering using Audio Attributes
Automatic Music Clustering using Audio Attributes Abhishek Sen BTech (Electronics) Veermata Jijabai Technological Institute (VJTI), Mumbai, India abhishekpsen@gmail.com Abstract Music brings people together,
More informationWHAT'S HOT: LINEAR POPULARITY PREDICTION FROM TV AND SOCIAL USAGE DATA Jan Neumann, Xiaodong Yu, and Mohamad Ali Torkamani Comcast Labs
WHAT'S HOT: LINEAR POPULARITY PREDICTION FROM TV AND SOCIAL USAGE DATA Jan Neumann, Xiaodong Yu, and Mohamad Ali Torkamani Comcast Labs Abstract Large numbers of TV channels are available to TV consumers
More informationLearning beautiful (and ugly) attributes
MARCHESOTTI, PERRONNIN: LEARNING BEAUTIFUL (AND UGLY) ATTRIBUTES 1 Learning beautiful (and ugly) attributes Luca Marchesotti luca.marchesotti@xerox.com Florent Perronnin florent.perronnin@xerox.com XRCE
More informationLecture 5: Clustering and Segmenta4on Part 1
Lecture 5: Clustering and Segmenta4on Part 1 Professor Fei- Fei Li Stanford Vision Lab Lecture 5 -! 1 What we will learn today Segmenta4on and grouping Gestalt principles Segmenta4on as clustering K- means
More informationShades of Music. Projektarbeit
Shades of Music Projektarbeit Tim Langer LFE Medieninformatik 28.07.2008 Betreuer: Dominikus Baur Verantwortlicher Hochschullehrer: Prof. Dr. Andreas Butz LMU Department of Media Informatics Projektarbeit
More informationColor in Information Visualization
Color in Information Visualization James Bernhard April 2012 Color serves different purposes in art and in information visualization: In art, color is used for creative and expressive purposes In information
More informationSegment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing
Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing Hamid Izadinia, Fereshteh Sadeghi, Santosh K. Divvala, Hannaneh Hajishirzi, Yejin Choi, Ali Farhadi Presentated by Edward
More informationMachine Vision System for Color Sorting Wood Edge-Glued Panel Parts
Machine Vision System for Color Sorting Wood Edge-Glued Panel Parts Q. Lu, S. Srikanteswara, W. King, T. Drayer, R. Conners, E. Kline* The Bradley Department of Electrical and Computer Eng. *Department
More informationUniversität Bamberg Angewandte Informatik. Seminar KI: gestern, heute, morgen. We are Humor Beings. Understanding and Predicting visual Humor
Universität Bamberg Angewandte Informatik Seminar KI: gestern, heute, morgen We are Humor Beings. Understanding and Predicting visual Humor by Daniel Tremmel 18. Februar 2017 advised by Professor Dr. Ute
More informationAutomatic Extraction of Popular Music Ringtones Based on Music Structure Analysis
Automatic Extraction of Popular Music Ringtones Based on Music Structure Analysis Fengyan Wu fengyanyy@163.com Shutao Sun stsun@cuc.edu.cn Weiyao Xue Wyxue_std@163.com Abstract Automatic extraction of
More informationPredicting Aesthetic Radar Map Using a Hierarchical Multi-task Network
Predicting Aesthetic Radar Map Using a Hierarchical Multi-task Network Xin Jin 1,2,LeWu 1, Xinghui Zhou 1, Geng Zhao 1, Xiaokun Zhang 1, Xiaodong Li 1, and Shiming Ge 3(B) 1 Department of Cyber Security,
More informationVideo-based Vibrato Detection and Analysis for Polyphonic String Music
Video-based Vibrato Detection and Analysis for Polyphonic String Music Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan Audio Information Research Lab University of Rochester The 18 th International
More informationReducing False Positives in Video Shot Detection
Reducing False Positives in Video Shot Detection Nithya Manickam Computer Science & Engineering Department Indian Institute of Technology, Bombay Powai, India - 400076 mnitya@cse.iitb.ac.in Sharat Chandran
More informationThe theory of data visualisation
The theory of data visualisation V2017-10 Simon Andrews, Phil Ewels simon.andrews@babraham.ac.uk phil.ewels@scilifelab.se Data Visualisation A scientific discipline involving the creation and study of
More informationA Framework for Segmentation of Interview Videos
A Framework for Segmentation of Interview Videos Omar Javed, Sohaib Khan, Zeeshan Rasheed, Mubarak Shah Computer Vision Lab School of Electrical Engineering and Computer Science University of Central Florida
More information8K Resolution: Making Hyperrealism a Reality
8K Resolution: Making Hyperrealism a Reality Is 8K worth it? With the first 8K TV being released into consumer markets this year and the growth of 8K content creation and supporting technologies, it s
More information19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007
19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 AN HMM BASED INVESTIGATION OF DIFFERENCES BETWEEN MUSICAL INSTRUMENTS OF THE SAME TYPE PACS: 43.75.-z Eichner, Matthias; Wolff, Matthias;
More informationOutline. Why do we classify? Audio Classification
Outline Introduction Music Information Retrieval Classification Process Steps Pitch Histograms Multiple Pitch Detection Algorithm Musical Genre Classification Implementation Future Work Why do we classify
More informationA Survey of Audio-Based Music Classification and Annotation
A Survey of Audio-Based Music Classification and Annotation Zhouyu Fu, Guojun Lu, Kai Ming Ting, and Dengsheng Zhang IEEE Trans. on Multimedia, vol. 13, no. 2, April 2011 presenter: Yin-Tzu Lin ( 阿孜孜 ^.^)
More informationImageNet Auto-Annotation with Segmentation Propagation
ImageNet Auto-Annotation with Segmentation Propagation Matthieu Guillaumin Daniel Küttel Vittorio Ferrari Bryan Anenberg & Michela Meister Outline Goal & Motivation System Overview Segmentation Transfer
More informationShot Transition Detection Scheme: Based on Correlation Tracking Check for MB-Based Video Sequences
, pp.120-124 http://dx.doi.org/10.14257/astl.2017.146.21 Shot Transition Detection Scheme: Based on Correlation Tracking Check for MB-Based Video Sequences Mona A. M. Fouad 1 and Ahmed Mokhtar A. Mansour
More informationDAY 1. Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval
DAY 1 Intelligent Audio Systems: A review of the foundations and applications of semantic audio analysis and music information retrieval Jay LeBoeuf Imagine Research jay{at}imagine-research.com Kyogu Lee
More informationJoint Image and Text Representation for Aesthetics Analysis
Joint Image and Text Representation for Aesthetics Analysis Ye Zhou 1, Xin Lu 2, Junping Zhang 1, James Z. Wang 3 1 Fudan University, China 2 Adobe Systems Inc., USA 3 The Pennsylvania State University,
More informationVBM683 Machine Learning
VBM683 Machine Learning Pinar Duygulu Slides are adapted from Dhruv Batra, David Sontag, Aykut Erdem Quotes If you were a current computer science student what area would you start studying heavily? Answer:
More informationgresearch Focus Cognitive Sciences
Learning about Music Cognition by Asking MIR Questions Sebastian Stober August 12, 2016 CogMIR, New York City sstober@uni-potsdam.de http://www.uni-potsdam.de/mlcog/ MLC g Machine Learning in Cognitive
More informationFeature-Based Analysis of Haydn String Quartets
Feature-Based Analysis of Haydn String Quartets Lawson Wong 5/5/2 Introduction When listening to multi-movement works, amateur listeners have almost certainly asked the following situation : Am I still
More informationSinger Recognition and Modeling Singer Error
Singer Recognition and Modeling Singer Error Johan Ismael Stanford University jismael@stanford.edu Nicholas McGee Stanford University ndmcgee@stanford.edu 1. Abstract We propose a system for recognizing
More informationINTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION
INTER GENRE SIMILARITY MODELLING FOR AUTOMATIC MUSIC GENRE CLASSIFICATION ULAŞ BAĞCI AND ENGIN ERZIN arxiv:0907.3220v1 [cs.sd] 18 Jul 2009 ABSTRACT. Music genre classification is an essential tool for
More informationLesson 10 November 10, 2009 BMC Elementary
Lesson 10 November 10, 2009 BMC Elementary Overview. I was afraid that the problems that we were going to discuss on that lesson are too hard or too tiring for our participants. But it came out very well
More informationImage Steganalysis: Challenges
Image Steganalysis: Challenges Jiwu Huang,China BUCHAREST 2017 Acknowledgement Members in my team Dr. Weiqi Luo and Dr. Fangjun Huang Sun Yat-sen Univ., China Dr. Bin Li and Dr. Shunquan Tan, Mr. Jishen
More informationAPPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC
APPLICATIONS OF A SEMI-AUTOMATIC MELODY EXTRACTION INTERFACE FOR INDIAN MUSIC Vishweshwara Rao, Sachin Pant, Madhumita Bhaskar and Preeti Rao Department of Electrical Engineering, IIT Bombay {vishu, sachinp,
More informationMidiFind: Fast and Effec/ve Similarity Searching in Large MIDI Databases
1 MidiFind: Fast and Effec/ve Similarity Searching in Large MIDI Databases Gus Xia Tongbo Huang Yifei Ma Roger B. Dannenberg Christos Faloutsos Schools of Computer Science Carnegie Mellon University 2
More informationA QUERY BY EXAMPLE MUSIC RETRIEVAL ALGORITHM
A QUER B EAMPLE MUSIC RETRIEVAL ALGORITHM H. HARB AND L. CHEN Maths-Info department, Ecole Centrale de Lyon. 36, av. Guy de Collongue, 69134, Ecully, France, EUROPE E-mail: {hadi.harb, liming.chen}@ec-lyon.fr
More informationTOWARD AN INTELLIGENT EDITOR FOR JAZZ MUSIC
TOWARD AN INTELLIGENT EDITOR FOR JAZZ MUSIC G.TZANETAKIS, N.HU, AND R.B. DANNENBERG Computer Science Department, Carnegie Mellon University 5000 Forbes Avenue, Pittsburgh, PA 15213, USA E-mail: gtzan@cs.cmu.edu
More informationAudio Feature Extraction for Corpus Analysis
Audio Feature Extraction for Corpus Analysis Anja Volk Sound and Music Technology 5 Dec 2017 1 Corpus analysis What is corpus analysis study a large corpus of music for gaining insights on general trends
More informationLecture 9 Source Separation
10420CS 573100 音樂資訊檢索 Music Information Retrieval Lecture 9 Source Separation Yi-Hsuan Yang Ph.D. http://www.citi.sinica.edu.tw/pages/yang/ yang@citi.sinica.edu.tw Music & Audio Computing Lab, Research
More informationIMAGE AESTHETIC PREDICTORS BASED ON WEIGHTED CNNS. Oce Print Logic Technologies, Creteil, France
IMAGE AESTHETIC PREDICTORS BASED ON WEIGHTED CNNS Bin Jin, Maria V. Ortiz Segovia2 and Sabine Su sstrunk EPFL, Lausanne, Switzerland; 2 Oce Print Logic Technologies, Creteil, France ABSTRACT Convolutional
More informationVISUAL CONTENT BASED SEGMENTATION OF TALK & GAME SHOWS. O. Javed, S. Khan, Z. Rasheed, M.Shah. {ojaved, khan, zrasheed,
VISUAL CONTENT BASED SEGMENTATION OF TALK & GAME SHOWS O. Javed, S. Khan, Z. Rasheed, M.Shah {ojaved, khan, zrasheed, shah}@cs.ucf.edu Computer Vision Lab School of Electrical Engineering and Computer
More informationComparison Parameters and Speaker Similarity Coincidence Criteria:
Comparison Parameters and Speaker Similarity Coincidence Criteria: The Easy Voice system uses two interrelating parameters of comparison (first and second error types). False Rejection, FR is a probability
More informationLarge Scale Concepts and Classifiers for Describing Visual Sentiment in Social Multimedia
Large Scale Concepts and Classifiers for Describing Visual Sentiment in Social Multimedia Shih Fu Chang Columbia University http://www.ee.columbia.edu/dvmm June 2013 Damian Borth Tao Chen Rongrong Ji Yan
More informationEvaluating Melodic Encodings for Use in Cover Song Identification
Evaluating Melodic Encodings for Use in Cover Song Identification David D. Wickland wickland@uoguelph.ca David A. Calvert dcalvert@uoguelph.ca James Harley jharley@uoguelph.ca ABSTRACT Cover song identification
More informationHidden Markov Model based dance recognition
Hidden Markov Model based dance recognition Dragutin Hrenek, Nenad Mikša, Robert Perica, Pavle Prentašić and Boris Trubić University of Zagreb, Faculty of Electrical Engineering and Computing Unska 3,
More informationA TEXT RETRIEVAL APPROACH TO CONTENT-BASED AUDIO RETRIEVAL
A TEXT RETRIEVAL APPROACH TO CONTENT-BASED AUDIO RETRIEVAL Matthew Riley University of Texas at Austin mriley@gmail.com Eric Heinen University of Texas at Austin eheinen@mail.utexas.edu Joydeep Ghosh University
More informationBar Codes to the Rescue!
Fighting Computer Illiteracy or How Can We Teach Machines to Read Spring 2013 ITS102.23 - C 1 Bar Codes to the Rescue! If it is hard to teach computers how to read ordinary alphabets, create a writing
More informationRecognising Cello Performers using Timbre Models
Recognising Cello Performers using Timbre Models Chudy, Magdalena; Dixon, Simon For additional information about this publication click this link. http://qmro.qmul.ac.uk/jspui/handle/123456789/5013 Information
More informationBasic Sight Words - Preprimer
Basic Sight Words - Preprimer a and my run can three look help in for down we big here it away me to said one where is yellow blue you go two the up see play funny make red come jump not find little I
More informationComputational Models of Music Similarity. Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST)
Computational Models of Music Similarity 1 Elias Pampalk National Institute for Advanced Industrial Science and Technology (AIST) Abstract The perceived similarity of two pieces of music is multi-dimensional,
More informationjsymbolic 2: New Developments and Research Opportunities
jsymbolic 2: New Developments and Research Opportunities Cory McKay Marianopolis College and CIRMMT Montreal, Canada 2 / 30 Topics Introduction to features (from a machine learning perspective) And how
More informationDeep Neural Networks Scanning for patterns (aka convolutional networks) Bhiksha Raj
Deep Neural Networks Scanning for patterns (aka convolutional networks) Bhiksha Raj 1 Story so far MLPs are universal function approximators Boolean functions, classifiers, and regressions MLPs can be
More informationDetecting the Moment of Snap in Real-World Football Videos
Detecting the Moment of Snap in Real-World Football Videos Behrooz Mahasseni and Sheng Chen and Alan Fern and Sinisa Todorovic School of Electrical Engineering and Computer Science Oregon State University
More informationAcoustic Scene Classification
Acoustic Scene Classification Marc-Christoph Gerasch Seminar Topics in Computer Music - Acoustic Scene Classification 6/24/2015 1 Outline Acoustic Scene Classification - definition History and state of
More informationBrowsing News and Talk Video on a Consumer Electronics Platform Using Face Detection
Browsing News and Talk Video on a Consumer Electronics Platform Using Face Detection Kadir A. Peker, Ajay Divakaran, Tom Lanning Mitsubishi Electric Research Laboratories, Cambridge, MA, USA {peker,ajayd,}@merl.com
More informationAn Efficient Multi-Target SAR ATR Algorithm
An Efficient Multi-Target SAR ATR Algorithm L.M. Novak, G.J. Owirka, and W.S. Brower MIT Lincoln Laboratory Abstract MIT Lincoln Laboratory has developed the ATR (automatic target recognition) system for
More informationPERCEPTUAL QUALITY OF H.264/AVC DEBLOCKING FILTER
PERCEPTUAL QUALITY OF H./AVC DEBLOCKING FILTER Y. Zhong, I. Richardson, A. Miller and Y. Zhao School of Enginnering, The Robert Gordon University, Schoolhill, Aberdeen, AB1 1FR, UK Phone: + 1, Fax: + 1,
More informationDETECTION OF SLOW-MOTION REPLAY SEGMENTS IN SPORTS VIDEO FOR HIGHLIGHTS GENERATION
DETECTION OF SLOW-MOTION REPLAY SEGMENTS IN SPORTS VIDEO FOR HIGHLIGHTS GENERATION H. Pan P. van Beek M. I. Sezan Electrical & Computer Engineering University of Illinois Urbana, IL 6182 Sharp Laboratories
More informationNatural Scenes Are Indeed Preferred, but Image Quality Might Have the Last Word
Psychology of Aesthetics, Creativity, and the Arts 2009 American Psychological Association 2009, Vol. 3, No. 1, 52 56 1931-3896/09/$12.00 DOI: 10.1037/a0014835 Natural Scenes Are Indeed Preferred, but
More informationStory Tracking in Video News Broadcasts. Ph.D. Dissertation Jedrzej Miadowicz June 4, 2004
Story Tracking in Video News Broadcasts Ph.D. Dissertation Jedrzej Miadowicz June 4, 2004 Acknowledgements Motivation Modern world is awash in information Coming from multiple sources Around the clock
More information1ms Column Parallel Vision System and It's Application of High Speed Target Tracking
Proceedings of the 2(X)0 IEEE International Conference on Robotics & Automation San Francisco, CA April 2000 1ms Column Parallel Vision System and It's Application of High Speed Target Tracking Y. Nakabo,
More informationSpeech Recognition and Signal Processing for Broadcast News Transcription
2.2.1 Speech Recognition and Signal Processing for Broadcast News Transcription Continued research and development of a broadcast news speech transcription system has been promoted. Universities and researchers
More informationDistortion Analysis Of Tamil Language Characters Recognition
www.ijcsi.org 390 Distortion Analysis Of Tamil Language Characters Recognition Gowri.N 1, R. Bhaskaran 2, 1. T.B.A.K. College for Women, Kilakarai, 2. School Of Mathematics, Madurai Kamaraj University,
More informationPedestrian Detection with a Large-Field-Of-View Deep Network
Pedestrian Detection with a Large-Field-Of-View Deep Network Anelia Angelova 1 Alex Krizhevsky 2 and Vincent Vanhoucke 3 Abstract Pedestrian detection is of crucial importance to autonomous driving applications.
More informationClassification of Musical Instruments sounds by Using MFCC and Timbral Audio Descriptors
Classification of Musical Instruments sounds by Using MFCC and Timbral Audio Descriptors Priyanka S. Jadhav M.E. (Computer Engineering) G. H. Raisoni College of Engg. & Mgmt. Wagholi, Pune, India E-mail:
More informationModeling memory for melodies
Modeling memory for melodies Daniel Müllensiefen 1 and Christian Hennig 2 1 Musikwissenschaftliches Institut, Universität Hamburg, 20354 Hamburg, Germany 2 Department of Statistical Science, University
More informationInteractive Classification of Sound Objects for Polyphonic Electro-Acoustic Music Annotation
for Polyphonic Electro-Acoustic Music Annotation Sebastien Gulluni 2, Slim Essid 2, Olivier Buisson, and Gaël Richard 2 Institut National de l Audiovisuel, 4 avenue de l Europe 94366 Bry-sur-marne Cedex,
More informationCS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 3. A Network-Centric View on HPC
CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 3. A Network-Centric View on HPC Intro What did we learn in the last lecture SMM vs. DMM architecture and programming Systolic
More informationSmart Traffic Control System Using Image Processing
Smart Traffic Control System Using Image Processing Prashant Jadhav 1, Pratiksha Kelkar 2, Kunal Patil 3, Snehal Thorat 4 1234Bachelor of IT, Department of IT, Theem College Of Engineering, Maharashtra,
More informationVocabulary Sentences & Conversation Color Shape Math. blue green. Vocabulary Sentences & Conversation Color Shape Math. blue brown
Scope & Sequence Unit 1 Classroom chair colo paper crayon door pencil scissors shelf table A: What do you see? B: I see a book. A: What do you do with scissors? B: I cut with scissors. number 1 I put the
More informationSOCIAL NARRATIVE: GOING TO LIFELINE THEATRE Welcome to Lifeline Theatre! We look forward to having you attend Bunnicula on November 3 rd.
SOCIAL NARRATIVE: GOING TO LIFELINE THEATRE Welcome to Lifeline Theatre! We look forward to having you attend Bunnicula on November 3 rd. LIFELINE THEATRE I am going to Lifeline Theatre to see a play.
More informationMulti-modal Analysis for Person Type Classification in News Video
Multi-modal Analysis for Person Type Classification in News Video Jun Yang, Alexander G. Hauptmann School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave, PA 15213, USA {juny, alex}@cs.cmu.edu,
More information