Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing

Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing. Hamid Izadinia, Fereshteh Sadeghi, Santosh K. Divvala, Hannaneh Hajishirzi, Yejin Choi, Ali Farhadi. Presented by Edward Banner

Outline: What is an SPT? Motivation: what does an SPT enable us to do? How to build an SPT? How to make use of an SPT? Evaluation. Discussion.

What is a segment-phrase table? A one-to-many mapping from phrases to segmentation models. (Figure: phrases and their corresponding segments. Image credit: Izadinia et al.)
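The one-to-many mapping described above can be pictured as a simple lookup structure. The sketch below is only an illustration: the class names `SegmentModel` and `SegmentPhraseTable` and the fields `part_id` and `description` are hypothetical placeholders, not names from Izadinia et al.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SegmentModel:
    """Stand-in for one learned segmentation model (fields are hypothetical)."""
    part_id: int
    description: str


class SegmentPhraseTable:
    """Maps each phrase to the list of segmentation models learned for it."""

    def __init__(self):
        self._table = defaultdict(list)

    def add(self, phrase, model):
        # One phrase may accumulate several models (one-to-many).
        self._table[phrase].append(model)

    def models_for(self, phrase):
        # Return a copy so callers cannot mutate the table; unknown phrases give [].
        return list(self._table.get(phrase, []))


spt = SegmentPhraseTable()
spt.add("horse running", SegmentModel(0, "side view"))
spt.add("horse running", SegmentModel(1, "front view"))
```

A plain dictionary of lists would work equally well; the wrapper class only makes the one-to-many contract explicit.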

Why build a segment-phrase table? Many reasons!

Why build a segment-phrase table? Entailment: if a horse is grazing, is it also standing? Image credit: Izadinia et al.

Why build a segment-phrase table? Paraphrasing: are "horse jumping" and "horse leaping" paraphrases of each other? Image credit: Izadinia et al.

Why build a segment-phrase table? Relative similarity: is "cat standing up" closer to "bear standing up" or "deer standing up"? Image credit: Izadinia et al.

Why build a segment-phrase table? Semantic segmentation Image credit: Izadinia et al.

Considerations in building a segment-phrase table: Human annotators? Human-labeled pixel labels are too expensive to obtain, so opt for a weakly-supervised approach instead.

How do they build it? Three components:
1. Train a webly-supervised detection model for each phrase
2. Model each phrase as a deformable parts model
3. Learn segmentation model for each part

How do they build it? 1. Train a webly-supervised detection model for each phrase, e.g. "running horse".

How do they build it? 2. Model each phrase as a deformable parts model. Concerned about intra-class variation? Key insight: parts of phrases have low intra-class variation. (Figure labels: "horse", "running horse".)

How do they build it? 3. Learn segmentation model for each part. Model superpixels with a GMM and solve with EM and graph cut. Rough initialization with GrabCut and the HOG root filter. (Figure label: "horse running right".)
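To make the "GMM solved with EM from a rough initialization" idea concrete, here is a minimal sketch under strong simplifying assumptions: each superpixel is reduced to a single scalar feature, there are exactly two components (background = 0, foreground = 1), the rough labels are supplied by hand in place of a GrabCut/HOG initialization, and the graph-cut smoothing step is omitted entirely. The function names and the toy data are hypothetical; this is not the authors' implementation.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_class_gmm(features, init_labels, n_iter=50, min_var=1e-4):
    """EM for a two-component 1-D GMM, started from a rough hard labeling."""
    k = 2
    # Hard (one-hot) responsibilities from the rough initial labels.
    resp = [[1.0 if label == j else 0.0 for j in range(k)] for label in init_labels]
    weights, means, variances = [0.5] * k, [0.0] * k, [1.0] * k
    for _ in range(n_iter):
        # M-step: re-estimate each component from the current responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(features)
            means[j] = sum(r[j] * x for r, x in zip(resp, features)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, features)) / nj
            variances[j] = max(var, min_var)  # floor avoids a collapsed component
        # E-step: soft-assign every superpixel feature to the two components.
        new_resp = []
        for x in features:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            new_resp.append([d / total for d in dens] if total > 0.0 else [1.0 / k] * k)
        resp = new_resp
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, variances, labels


# Toy data: three dark and three bright superpixels; the fourth label is wrong on purpose.
features = [0.10, 0.15, 0.20, 0.85, 0.90, 0.95]
rough_labels = [0, 0, 0, 0, 1, 1]
means, variances, labels = fit_two_class_gmm(features, rough_labels)
```

In a fuller pipeline the per-superpixel likelihoods from the two components would typically feed the unary terms of a graph cut, which then enforces spatial smoothness between neighboring superpixels.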

Segment-phrase table built. Results: for each phrase, we have learned a bounding box detector and a segmentation model for each part. What can we do now? (Figure: phrases and their corresponding segments. Image credit: Izadinia et al.)

Semantic segmentation Example: horse Image credit: Izadinia et al.

Semantic segmentation using linguistic constraints. Example: horse. (Figure labels: standing, sitting, kicking, posing. Image credit: Izadinia et al.)

Entailment: does phrase X entail phrase Y? Intuition: X entails Y if, for all segments for which phrase X is a valid description, phrase Y is also a valid description. Example: "horse grazing" and "horse standing".
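The intuition above can be turned into a soft score. As a sketch only, assume each phrase is summarized by the set of segment IDs it validly describes (a simplification introduced here for illustration; the slides do not give the exact scoring rule). The score is then the fraction of X's segments that Y also describes. The function name `entailment_score` and the toy segment IDs are hypothetical.

```python
def entailment_score(x_segments, y_segments):
    """Fraction of the segments described by X that are also described by Y."""
    x_segments = set(x_segments)
    if not x_segments:
        return 0.0  # no evidence for X, so no support for the entailment
    return len(x_segments & set(y_segments)) / len(x_segments)


# Toy example: every grazing-horse segment is also a standing-horse segment,
# but only half of the standing-horse segments show grazing.
grazing = {1, 2, 3}
standing = {1, 2, 3, 4, 5, 6}
forward = entailment_score(grazing, standing)   # 1.0
backward = entailment_score(standing, grazing)  # 0.5
```

The asymmetry between the two directions is the point: "horse grazing" supports "horse standing" much more strongly than the reverse.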

Paraphrasing: are phrase X and phrase Y paraphrases of each other? Strategy: compute the entailment scores for X → Y and Y → X, and say they're paraphrases if they're close. Image credit: Izadinia et al.

Relative Semantic Similarity: is phrase X closer to phrase Y or phrase Z? Strategy: compute the scores for X → Y and X → Z and pick the higher of the two. Image credit: Izadinia et al.
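Both strategies reduce to comparisons over directional scores. The sketch below assumes those scores are already available in a dictionary keyed by `(source, target)` phrase pairs. The score values, the threshold of 0.8, and the reading of "close" as "both directions score highly" are assumptions made for illustration, not details taken from the slides.

```python
def is_paraphrase(scores, x, y, threshold=0.8):
    """One reading of the strategy: X and Y are paraphrases if both
    directional scores are high (threshold is a hypothetical choice)."""
    return min(scores[(x, y)], scores[(y, x)]) >= threshold


def closer_phrase(scores, x, y, z):
    """Return whichever of Y or Z has the higher score with respect to X."""
    return y if scores[(x, y)] >= scores[(x, z)] else z


# Made-up directional scores, for illustration only.
scores = {
    ("horse jumping", "horse leaping"): 0.90,
    ("horse leaping", "horse jumping"): 0.85,
    ("cat standing up", "bear standing up"): 0.60,
    ("cat standing up", "deer standing up"): 0.30,
}

paraphrase = is_paraphrase(scores, "horse jumping", "horse leaping")
nearest = closer_phrase(scores, "cat standing up", "bear standing up", "deer standing up")
```

Taking the minimum of the two directions makes the paraphrase test symmetric, whereas the relative-similarity comparison deliberately uses only the scores that start from X.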

Evaluation: takeaways. Semantic segmentation is at or near the state of the art. The results highlight tradeoffs between an unsupervised approach on large data and supervised approaches on a small dataset. Linguistic constraints help semantic segmentation. The SPT approach beats language-only and vision-only baselines on entailment, paraphrasing, and relative similarity.

Discussion: Leverage supervision. Variable number of part models per phrase. Larger evaluation dataset. Comparison against state-of-the-art entailment and paraphrase systems.