A picture is worth 13.6 words (on average)

Hal Daumé III, me@hal3.name
Yiannis Aloimonos, Tamara Berg, Alex Berg, Jesse Dodge, Amit Goyal, Yejin Choi, Xufeng Han, Alyssa Mensch, Meg Mitchell, Kota Yamaguchi, Karl Stratos, Ching Lik Teo, Yezhou Yang

An on-paper experiment
Write a caption for this image, one sentence in length. (In English.)

People write weird captions
Another dream car to add to the list, this one spotted in Hanbury St.
Shot out my car window while stuck in traffic because people in Cincinatti can't drive in the rain
1. A distorted photo of a man cutting up a large cut of meat in a garage.
2. A man smiling at the camera while carving up meat.
3. A man smiling while he cuts up a piece of meat.
4. A smiling man is standing next to a table dressing a piece of venison.
5. The man is smiling into the camera as he cuts meat.

What I used to think vision did...

Now I know better...

Detecting on a large scale...
bird, boat, bottle, bowl

What do people describe?
1) Given an image, predict what people will describe
two women sitting brunette blonde on bench reading magazine
bench, magazine, grass, skirt, women

Predicting what will be described
What's in this image?
man, baby, sling, ladder, fridge, table, watermelon, chair, boxes, cups, water bottle, wall, pacifier, beard, glasses, shirt
What do people describe?
A bearded man is holding a child in a sling.
A bearded man stands while holding a small child in a green sheet.
A bearded man with a baby in a sling poses.
Man standing in kitchen with little girl in green sack.
Man with beard and baby

Description factors
What factors influence what someone will describe about an image?
Two kinds of factors: compositional and semantic

Compositional factors: size/saliency, location
A sail boat on the ocean.
Two men standing on beach.

Semantic factors: object type, nameable scene, unusualness
girl in the street
kitchen in house
elephant in the beach
A tree in water and a boy with a beard

Using large corpora to compose natural captions (why write your own material when you can just steal it?)

Composing captions
a) monkey playing in the tree canopy, Monte Verde in the rain forest
b) capuchin monkey in front of my window
c) monkey spotted in Apenheul Netherlands under the tree
d) a white-faced or capuchin in the tree in the garden
e) the monkey sitting in a tree, posing for his picture

Captioning with (some) evidence
Caption images where:
We assume some evidence for 1 object (tag: mare, evidence for horse) &
Object detector is confident (high detection score)
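
As a rough sketch of the selection rule on this slide, the snippet below keeps an object category only when a user tag points to it and the detector score for it clears a threshold. The tag-to-category table, the score dictionary, and the 0.5 threshold are illustrative assumptions, not values from the talk.

```python
# Hypothetical lookup from free-form tags to detector categories.
# The slide's own example is the tag "mare" as evidence for "horse";
# the other entries are made up for illustration.
TAG_TO_CATEGORY = {"mare": "horse", "stallion": "horse", "kitten": "cat"}


def captionable_objects(tags, detection_scores, threshold=0.5):
    """Return categories backed by both a tag and a confident detection.

    tags: iterable of tag strings attached to the image.
    detection_scores: dict mapping category name -> detector score.
    threshold: assumed cut-off for calling a detection "confident".
    """
    supported = set()
    for tag in tags:
        # A tag is evidence if it names a category directly or maps
        # to one through the lookup table.
        category = TAG_TO_CATEGORY.get(tag.lower(), tag.lower())
        if detection_scores.get(category, 0.0) >= threshold:
            supported.add(category)
    return sorted(supported)
```

With tags `["mare", "field"]` and scores `{"horse": 0.9, "cow": 0.8}`, only `horse` passes: `cow` has a confident detection but no supporting tag, and `field` has a tag but no detection.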

Generation: Grab 'N Mash
Grab phrases based on image similarity between query and captioned database
Object detection similarity: NPs, VPs
Stuff detection similarity: PPs
Scene similarity: PPs
Mash phrases: compose descriptions using simple rule-based concatenation

Getting NPs: objects
Detect: fruit
Find matching fruit detections by color similarity
Tray of glace fruit in the market at Nice, France
Fresh fruit in the market
A box of oranges was just catching the sun, bringing out detail in the skin.
mandarin oranges in glass bowl
The street market in Santanyi, Mallorca is a must for the oranges and local crafts.
An orange tree in the backyard of the house.

Getting NPs: objects
The muddy elephant; An elephant; small elephant; A very large and seemingly old elephant; musk male elephant; African elephant; the temple elephant
Fushia flower; a flower; a pink zinna flower; This beautiful flower; a roman pink flower; a tiny pink flower; pink bursting flowers; a perfectly pink gerbera daisy
a lonesome duck; a native new zealand duck; The duck; male Mallard duck; several other ducks; a so-called navigation duck; this duck; a duck; duck; mandarin duck

Getting VPs: objects
Detect: cow
Find matching cow detections by shape/pose similarity
theses cows live in the field behind my house
The cow was more interested in eating than looking at me with a camera!
A cow eating flowers in the south of the Netherlands.
While cycling north on Tremaine Road near Milton, this cow gazed across the road intently.

Getting PPs: stuff
Detect: grass
Find matching grass detections by color similarity
green manure in the veg field - Plaw Hatch
I am happy in a field of green Maryland grass
Sheep in a field spotted during a coastal drive from Tramore to
Found on hawthorn in boggy grass field

Getting PPs: scenes
Extract scene descriptor
Find matching images by scene similarity
Pedestrian street in the Old Lyon with stairs to climb up the hill of fourviere
I'm about to blow the building across the street over with my massive lung power.
Only in Paris will you find a bottle of wine on a table outside a bookstore
View from our B&B in this photo
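
The "grab" slides all follow the same recipe: describe the query detection with a feature vector, then pull phrases from the captioned images whose detections look most similar. A minimal sketch of that retrieval step, assuming cosine similarity over small hand-made color histograms. The talk does not say which similarity measure or features were used, so `cosine_similarity`, `grab_phrases`, and the feature values are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def grab_phrases(query_feature, database, k=2):
    """Return the phrases of the k database entries most similar to the query.

    database: list of (feature_vector, phrase) pairs, one per detection
    in the captioned collection.
    """
    ranked = sorted(
        database,
        key=lambda item: cosine_similarity(query_feature, item[0]),
        reverse=True,
    )
    return [phrase for _, phrase in ranked[:k]]


# Toy database: 3-bin (red, green, blue) histograms with made-up values.
# The last phrase is invented for contrast; the first two echo the slides.
FRUIT_DB = [
    ([0.8, 0.5, 0.1], "mandarin oranges in glass bowl"),
    ([0.9, 0.4, 0.0], "a box of oranges"),
    ([0.1, 0.8, 0.2], "green apples on a tray"),
]

orange_query = [0.9, 0.5, 0.1]
nearest = grab_phrases(orange_query, FRUIT_DB, k=2)
```

The same function would serve for VPs (with a shape/pose feature) and for PPs (with a stuff or scene feature); only the feature vector changes.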

Composing captions
object color, object pose, scene, stuff
NP: the sheep
VP: meandered along a desolate road
PP: in the highlands of Scotland
PP: through frozen grass
Various composition patterns: NP VP; NP PP_stuff; NP PP_scene; NP VP PP_scene PP_stuff
the sheep meandered along a desolate road in the highlands of Scotland through frozen grass
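
The "mash" step can be sketched as slot filling over the four patterns listed on the slide. The slot names and the assignment of the two PPs to scene and stuff are assumptions read off the pattern order; `mash` and `compose_all` are hypothetical helper names.

```python
# Composition patterns as listed on the slide.
PATTERNS = [
    ("NP", "VP"),
    ("NP", "PP_stuff"),
    ("NP", "PP_scene"),
    ("NP", "VP", "PP_scene", "PP_stuff"),
]


def mash(phrases, pattern):
    """Concatenate the phrases named by pattern, or return None if a slot is empty."""
    parts = []
    for slot in pattern:
        phrase = phrases.get(slot)
        if not phrase:
            return None
        parts.append(phrase)
    return " ".join(parts)


def compose_all(phrases):
    """Return one candidate caption per pattern whose slots are all filled."""
    candidates = (mash(phrases, pattern) for pattern in PATTERNS)
    return [caption for caption in candidates if caption is not None]


sheep = {
    "NP": "the sheep",
    "VP": "meandered along a desolate road",
    "PP_scene": "in the highlands of Scotland",
    "PP_stuff": "through frozen grass",
}
full_caption = mash(sheep, PATTERNS[3])
```

Because the concatenation never checks that the pieces agree with one another, it produces fluent-looking output whenever the retrieved phrases happen to be compatible and odd output otherwise.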

Good results
A duck was having a bath in the harbor at whitehaven, cumbria, england in the water near Camley St
A female Monarch butterfly was visiting the plant in my front yard in Devon 17/10/10
her flower girl dress designed by Mainbocher in the house
A double-decker bus under some spreading shade trees
Stained glass window depicting Christ and numerous saints in Washington National Cathedral in the Eglise
cat enjoys hiding under the tree

Not so good results
Language issues:
A Moo cow tied up around the city eating grass in various places under the tree at the young tree
male tiger sighting in twelve months of a street
Vision issues:
a girl walking by in a green field in the sun
The silhouetted building and cross stands under water around Loon Mountain
Just plain silly:
bike was left here by an ancient civilization not as sophisticated as our own in the grass of granite
dogs running pic, this time, racing through the sea at Fraisthorpe near Bridlington of Christmas tree in bed

What about 2nd language learning?
Obvious problems:
Assumes knowledge of 1st language
Assumes knowledge of the world
Still don't have a robot...
But we do have software with exercises for SLA
It's hard for people, too!

Aspects of computational 2ndLL
Very specific linguistic variants: number, case, agreement, etc.
Not enough to get the majority case
Focus on subtle visual differences
AI-style reasoning & one-shot learning

What is needed to solve this?
Linguistic model over character sequences (words not okay!) w/o any L-specific background
Pre-trained (?) visual detectors for objects, poses and physical relationships (e.g., gaze)
Ability to reason and generalize from a few examples

Thanks! Questions?