CS114 Lecture 15 Lexical Semantics

CS114 Lecture 15: Lexical Semantics. March 19, 2014. Professor Meteer. Thanks to Jurafsky & Martin and Prof. Pustejovsky for slides.

Assignment 3: Superchunks. Create a new chunker that takes the chunked data and produces better chunks. Your program should consist of (1) a set of declarative rules (taking advantage of Python data structures) and (2) an "interpreter" that is agnostic as to the specific set of rules being applied. Use the dev set from the previous assignment. Focus your efforts on people, places, and organizations.

Example: wsj_0014
Norman Ricken, 52 years old and former president and chief operating officer of Toys "R" Us Inc., and Frederick Deane Jr., 63, chairman of Signet Banking Corp., were elected directors of this consumer electronics and appliances retailing chain. They succeed Daniel M. Rexinger, retired Circuit City executive vice president, and Robert R. Glauber, U.S. Treasury undersecretary, on the 12-member board.
[ Norman/NNP Ricken/NNP ] ,/, [ 52/CD years/NNS ] old/JJ and/CC [ former/JJ president/NN ] and/CC [ chief/NN operating/VBG officer/NN ] of/IN [ Toys/NNPS ] ``/`` [ R/NNP ] ''/'' [ Us/NNP ] [ Inc./NNP ] ,/, and/CC [ Frederick/NNP Deane/NNP Jr./NNP ] ,/, [ 63/CD ] ,/, [ chairman/NN ] of/IN [ Signet/NNP Banking/NNP Corp./NNP ] ,/, were/VBD elected/VBN [ directors/NNS ] of/IN [ this/DT consumer/NN electronics/NNS ] and/CC [ appliances/NNS ] retailing/NN [ chain/NN ] ./.
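One possible shape for the "declarative rules plus agnostic interpreter" design is sketched below. It is only an illustration, not the expected solution: the rule format (the `left`, `glue`, and `right` tag sets), the rule names, and the functions `try_rule` and `superchunk` are all hypothetical. A chunk is represented as a list of (word, tag) pairs and an unchunked token as a single (word, tag) tuple.

```python
# Hypothetical rule format: merge two chunks when the left chunk ends in a tag
# from "left", the right chunk starts with a tag from "right", and every
# unchunked token between them has a tag in "glue" (there may be none).
RULES = [
    {"name": "quoted-name",
     "left": {"NNP", "NNPS"}, "glue": {"``", "''"}, "right": {"NNP", "NNPS"}},
]

def is_chunk(item):
    # Chunks are lists of (word, tag); loose tokens are plain tuples.
    return isinstance(item, list)

def try_rule(rule, items, i):
    """Return (merged_chunk, next_index) if the rule fires at items[i], else None."""
    if not is_chunk(items[i]) or items[i][-1][1] not in rule["left"]:
        return None
    j, glue = i + 1, []
    while j < len(items) and not is_chunk(items[j]) and items[j][1] in rule["glue"]:
        glue.append(items[j])
        j += 1
    if j < len(items) and is_chunk(items[j]) and items[j][0][1] in rule["right"]:
        return items[i] + glue + items[j], j + 1
    return None

def superchunk(items, rules):
    """Apply the rules repeatedly until no rule fires. Knows nothing about
    what the rules mean; it only reads their tag sets."""
    changed = True
    while changed:
        changed = False
        out, i = [], 0
        while i < len(items):
            for rule in rules:
                hit = try_rule(rule, items, i)
                if hit:
                    merged, i = hit
                    out.append(merged)
                    changed = True
                    break
            else:
                out.append(items[i])
                i += 1
        items = out
    return items

# The Toys "R" Us Inc. fragment from the wsj_0014 example.
fragment = [[("Toys", "NNPS")], ("``", "``"), [("R", "NNP")], ("''", "''"),
            [("Us", "NNP")], [("Inc.", "NNP")]]
result = superchunk(fragment, RULES)
```

Because the interpreter only reads tag sets out of each rule, adding rules for places or job titles would mean appending to `RULES` rather than changing `superchunk`.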

Three Perspectives on Meaning
1. Lexical Semantics: the meanings of individual words.
2. Formal Semantics (or Compositional Semantics or Sentential Semantics): how those meanings combine to make meanings for individual sentences or utterances.
3. Discourse or Pragmatics: how those meanings combine with each other and with other facts about various kinds of context to make meanings for a text or discourse. Dialog or Conversation is often lumped together with Discourse.

Outline: Comp Lexical Semantics
Intro to Lexical Semantics: homonymy, polysemy, synonymy; online resources: WordNet.
Computational Lexical Semantics: Word Sense Disambiguation (supervised, semi-supervised); Word Similarity (thesaurus-based, distributional).

Preliminaries. What's a word? Definitions we've used: types, tokens, stems, roots, inflected forms, etc. Lexeme: an entry in a lexicon consisting of a pairing of a form with a single meaning representation. Lexicon: a collection of lexemes.

Relationships between word meanings. What's in a nym? Homonymy, Synonymy, Antonymy, Metonymy. And a sem? Polysemy, Monosemy. And a nom? Hypernomy (hypernym), Hyponomy (hyponym), Meronomy, Holonymy.

Homonymy: Bat and Bass. Homonymy: lexemes that share a form (phonological, orthographic, or both) but have unrelated, distinct meanings. Clear example: bat (wooden stick-like thing) vs. bat (flying scary mammal thing), or bank (financial institution) vs. bank (riverside). Homonyms can be homophones, homographs, or both. Homophones: write and right; piece and peace. Homographs: bass; convert.
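The homophone/homograph distinction comes down to which part of the form two lexemes share. The sketch below makes that explicit; the pronunciation strings are rough, hypothetical ASCII transcriptions chosen only for illustration, and the function `relation` is likewise an assumption of this example. It compares forms only and takes for granted that the meanings are unrelated.

```python
# Each entry is (spelling, pronunciation, gloss). Pronunciations are rough,
# hypothetical transcriptions, not taken from any dictionary.
LEXEMES = [
    ("write", "rait", "put words on paper"),
    ("right", "rait", "correct"),
    ("bass", "beis", "low musical range"),
    ("bass", "baes", "kind of fish"),
    ("bat", "baet", "wooden stick"),
    ("bat", "baet", "flying mammal"),
]

def relation(a, b):
    """Classify two lexemes by which parts of their forms coincide."""
    same_spelling = a[0] == b[0]
    same_sound = a[1] == b[1]
    if same_spelling and same_sound:
        return "homophones and homographs"
    if same_sound:
        return "homophones"
    if same_spelling:
        return "homographs"
    return "no shared form"

for a, b in [(LEXEMES[0], LEXEMES[1]), (LEXEMES[2], LEXEMES[3]), (LEXEMES[4], LEXEMES[5])]:
    print(a[0], "/", b[0], "->", relation(a, b))
```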

Homonymy causes problems for NLP applications. Text-to-speech: same orthographic form but different phonological form (bass vs. bass). Information retrieval: different meanings, same orthographic form (QUERY: bat care). Machine translation. Speech recognition. Why?

Polysemy. "The bank is constructed from red brick." "I withdrew the money from the bank." Are those the same sense? Or consider the following WSJ example: "While some banks furnish sperm only to married women, others are less restrictive." Which sense of bank is this? Is it distinct from (homonymous with) the river bank sense? How about the savings bank sense?

Polysemy. A single lexeme with multiple related meanings (bank the building, bank the financial institution). Most non-rare words have multiple meanings. The number of meanings a word has is related to its frequency. Verbs tend more to polysemy. Distinguishing polysemy from homonymy isn't always easy (or necessary).

Metaphor and Metonymy. Specific types of polysemy. Metaphor: "Germany will pull Slovenia out of its economic slump." "I spent 2 hours on that homework." Metonymy: "The White House announced yesterday." "This chapter talks about part-of-speech tagging." Bank (building) and bank (financial institution).

How do we know when a word has more than one sense? ATIS examples: "Which flights serve breakfast?" "Does America West serve Philadelphia?" The zeugma test: ?Does United serve breakfast and San Jose?

Synonyms. Words that have the same meaning in some or all contexts: filbert / hazelnut, couch / sofa, big / large, automobile / car, vomit / throw up, water / H2O. Two lexemes are synonyms if they can be successfully substituted for each other in all situations. If so, they have the same propositional meaning.

Synonyms. But there are few (or no) examples of perfect synonymy. Why should that be? Even if many aspects of meaning are identical, substitution still may not preserve acceptability, based on notions of politeness, slang, register, genre, etc. Example: water and H2O.

Some more terminology: lemmas and wordforms. A lexeme is an abstract pairing of meaning and form. A lemma or citation form is the grammatical form that is used to represent a lexeme. Carpet is the lemma for carpets. Dormir is the lemma for duermes. Specific surface forms (carpets, sung, duermes) are called wordforms. The lemma bank has two senses: "Instead, a bank can hold the investments in a custodial account in the client's name." "But as agriculture burgeons on the east bank, the river will shrink even more." A sense is a discrete representation of one aspect of the meaning of a word.

Synonymy is a relation between senses rather than words. Consider the words big and large. Are they synonyms? "How big is that plane?" "Would I be flying on a large or small plane?" How about here: "Miss Nelson, for instance, became a kind of big sister to Benjamin." ?"Miss Nelson, for instance, became a kind of large sister to Benjamin." Why? Big has a sense that means being older, or grown up; large lacks this sense.

Antonyms. Senses that are opposites with respect to one feature of their meaning. Otherwise, they are very similar! dark / light, short / long, hot / cold, up / down, in / out. More formally, antonyms can define a binary opposition or be at opposite ends of a scale (long/short, fast/slow), or be reversives (rise/fall, up/down).

Hyponymy. One sense is a hyponym of another if the first sense is more specific, denoting a subclass of the other: car is a hyponym of vehicle; dog is a hyponym of animal; mango is a hyponym of fruit. Conversely, vehicle is a hypernym/superordinate of car; animal is a hypernym of dog; fruit is a hypernym of mango.
superordinate: vehicle, fruit, furniture, mammal
hyponym: car, mango, chair, dog

Hypernymy more formally. Extensional: the class denoted by the superordinate extensionally includes the class denoted by the hyponym. Entailment: a sense A is a hyponym of sense B if being an A entails being a B. Hyponymy is usually transitive (A hypo B and B hypo C entails A hypo C).
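Transitivity means that a hyponymy test can follow a chain of direct hypernym links. The sketch below is a minimal illustration using a hand-built toy table rather than WordNet itself; the table `DIRECT_HYPERNYM`, the link from mammal to animal, and the function names are assumptions of this example. It also simplifies by giving each sense a single direct hypernym.

```python
# Toy table of direct hypernym links (one parent per sense, a simplification).
DIRECT_HYPERNYM = {
    "car": "vehicle",
    "mango": "fruit",
    "chair": "furniture",
    "dog": "mammal",
    "mammal": "animal",   # assumed link, added to show transitivity
}

def hypernym_chain(sense):
    """Return all hypernyms of a sense, nearest first."""
    chain = []
    while sense in DIRECT_HYPERNYM:
        sense = DIRECT_HYPERNYM[sense]
        chain.append(sense)
    return chain

def is_hyponym(a, b):
    """True if a is a (direct or inherited) hyponym of b."""
    return b in hypernym_chain(a)

print(hypernym_chain("dog"))        # dog hypo mammal, mammal hypo animal
print(is_hyponym("dog", "animal"))  # follows from transitivity
```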

II. WordNet. A hierarchically organized lexical database: an on-line thesaurus plus aspects of a dictionary. Versions for other languages are under development.
Category: Unique Forms
Noun: 117,097
Verb: 11,488
Adjective: 22,141
Adverb: 4,601

WordNet. Where it is: http://wordnet.princeton.edu/

Format of WordNet Entries
The noun "bass" has 8 senses in WordNet.
1. bass1 - (the lowest part of the musical range)
2. bass, bass part - (the lowest part in polyphonic music)
3. bass, basso - (an adult male singer with the lowest voice)
4. sea bass, bass - (the lean flesh of a saltwater fish of the family Serranidae)
5. freshwater bass, bass - (any of various North American freshwater fish with lean flesh (especially of the genus Micropterus))
6. bass6, bass voice1, basso - (the lowest adult male singing voice)
7. bass7 - (the member with the lowest range of a family of musical instruments)
8. bass - (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)
The adjective "bass" has 1 sense in WordNet.
1. bass, deep6 - (having or denoting a low vocal or instrumental range) "a deep voice"; "a bass voice is lower than a baritone voice"; "a bass clarinet"

WordNet Noun Relations
Relation (also called): definition. Example.
Hypernym (Superordinate): from concepts to superordinates. breakfast1 -> meal1
Hyponym (Subordinate): from concepts to subtypes. meal1 -> lunch1
Member Meronym (Has-Member): from groups to their members. faculty2 -> professor1
Has-Instance: from concepts to instances of the concept. composer1 -> Bach1
Instance: from instances to their concepts. Austen1 -> author1
Member Holonym (Member-Of): from members to their groups. copilot1 -> crew1
Part Meronym (Has-Part): from wholes to parts. table2 -> leg3
Part Holonym (Part-Of): from parts to wholes. course7 -> meal1
Antonym: opposites. leader1 -> follower1

WordNet Verb Relations
Relation: definition. Example.
Hypernym: from events to superordinate events. fly9 -> travel5
Troponym: from a verb (event) to a specific manner elaboration of that verb. walk1 -> stroll1
Entails: from verbs (events) to the verbs (events) they entail. snore1 -> sleep1
Antonym: opposites. increase1 <=> decrease1

WordNet Hierarchies

How is sense defined in WordNet? The set of near-synonyms for a WordNet sense is called a synset (synonym set); it's their version of a sense or a concept. Example: chump as a noun to mean a person who is gullible and easy to take advantage of. Each of these senses shares this same gloss. Thus for WordNet, the meaning of this sense of chump is this list.

Word Sense Disambiguation (WSD). Given a word in context and a fixed inventory of potential word senses, decide which sense of the word this is. English-to-Spanish MT: the inventory is the set of Spanish translations. Speech synthesis: the inventory is homographs with different pronunciations, like bass and bow. Automatic indexing of medical articles: MeSH (Medical Subject Headings) thesaurus entries.

Two variants of the WSD task. Lexical sample task: a small pre-selected set of target words and an inventory of senses for each word; we'll use supervised machine learning. All-words task: every word in an entire text, with a lexicon that has senses for each word; sort of like part-of-speech tagging, except each lemma has its own tagset.

Supervised Machine Learning Approaches. Supervised machine learning approach: a training corpus of words tagged in context with their sense is used to train a classifier that can tag words in new text, just as we saw for part-of-speech tagging and statistical MT. Summary of what we need: the tag set ("sense inventory"); the training corpus; a set of features extracted from the training corpus; a classifier.

Supervised WSD 1: WSD Tags. What's a tag? A dictionary sense? For example, for WordNet an instance of bass in a text has 8 possible tags or labels (bass1 through bass8).

WordNet Bass
The noun "bass" has 8 senses in WordNet.
1. bass - (the lowest part of the musical range)
2. bass, bass part - (the lowest part in polyphonic music)
3. bass, basso - (an adult male singer with the lowest voice)
4. sea bass, bass - (flesh of lean-fleshed saltwater fish of the family Serranidae)
5. freshwater bass, bass - (any of various North American lean-fleshed freshwater fishes especially of the genus Micropterus)
6. bass, bass voice, basso - (the lowest adult male singing voice)
7. bass - (the member with the lowest range of a family of musical instruments)
8. bass - (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)

Inventory of sense tags for bass
WordNet Sense; Spanish Translation; Roget Category; Target Word in Context
bass4; lubina; FISH/INSECT; ... fish as Pacific salmon and striped bass and ...
bass4; lubina; FISH/INSECT; ... produce filets of smoked bass or sturgeon ...
bass7; bajo; MUSIC; ... exciting jazz bass player since Ray Brown ...
bass7; bajo; MUSIC; ... play bass because he doesn't have to solo ...

Supervised WSD 2: Get a corpus. Lexical sample task: Line-hard-serve corpus (4000 examples of each); Interest corpus (2369 sense-tagged examples). All words: a semantic concordance is a corpus in which each open-class word is labeled with a sense from a specific dictionary/thesaurus. SemCor: 234,000 words from the Brown Corpus, manually tagged with WordNet senses. SENSEVAL-3 competition corpora: 2081 tagged word tokens.

Supervised WSD 3: Extract feature vectors. Weaver (1955): "If one examines the words in a book, one at a time as through an opaque mask with a hole in it one word wide, then it is obviously impossible to determine, one at a time, the meaning of the words. [...] But if one lengthens the slit in the opaque mask, until one can see not only the central word in question but also say N words on either side, then if N is large enough one can unambiguously decide the meaning of the central word. [...] The practical question is: 'What minimum value of N will, at least in a tolerable fraction of cases, lead to the correct choice of meaning for the central word?'"

Feature vectors. A simple representation for each observation (each instance of a target word): vectors of sets of feature/value pairs, i.e. files of comma-separated values. These vectors should represent the window of words around the target.

Two kinds of features in the vectors: collocational features and bag-of-words features. Collocational: features about words at specific positions near the target word; often limited to just word identity and POS. Bag-of-words: features about words that occur anywhere in the window (regardless of position); typically limited to frequency counts.

Examples. Example text (WSJ): "An electric guitar and bass player stand off to one side not really part of the scene, just as a sort of nod to gringo expectations perhaps." Assume a window of +/- 2 from the target.

Collocational. Position-specific information about the words in the window. For "guitar and bass player stand": [guitar, NN, and, CC, player, NN, stand, VB], that is, word(n-2), POS(n-2), word(n-1), POS(n-1), word(n+1), POS(n+1), word(n+2), POS(n+2). In other words, a vector consisting of [position n word, position n part-of-speech, ...].
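As a rough sketch of how such a collocational vector could be built, the function below walks the positions around the target and emits a word and a POS tag for each. The function name `collocational_features`, the `<pad>` placeholder for positions that fall outside the sentence, and the NN tag given to bass are assumptions of this example.

```python
def collocational_features(tagged, target_index, window=2):
    """Return [word, POS, word, POS, ...] for positions n-window .. n+window,
    skipping the target itself. Positions outside the sentence get '<pad>'."""
    feats = []
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # the target word is what we are classifying, not a feature
        j = target_index + offset
        if 0 <= j < len(tagged):
            word, pos = tagged[j]
        else:
            word, pos = "<pad>", "<pad>"
        feats.extend([word, pos])
    return feats

# The five-word window from the example; the tag on "bass" is assumed.
tagged = [("guitar", "NN"), ("and", "CC"), ("bass", "NN"),
          ("player", "NN"), ("stand", "VB")]
features = collocational_features(tagged, target_index=2)
print(features)
```

With a window of 2 the vector always has eight entries, so examples from different sentences line up column by column.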

Bag-of-words. Information about the words that occur within the window. First derive a set of terms to place in the vector. Then note how often each of those terms occurs in a given window.

Co-Occurrence Example. Assume we've settled on a possible vocabulary of 12 words that includes guitar and player but not and and stand. The window "guitar and bass player stand" then gives [0,0,0,1,0,0,0,0,0,1,0,0], which are the counts of words predefined as, e.g., [fish, fishing, viol, guitar, double, cello, ...]
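The counting step can be sketched in a few lines. The vocabulary list on the slide is cut off after six words, so the last six entries below are invented placeholders, with player placed in the tenth slot only so that the result matches the vector shown; they should not be read as the original vocabulary. The function name `bag_of_words_vector` is also hypothetical.

```python
from collections import Counter

# First six terms come from the example; the last six are HYPOTHETICAL
# placeholders standing in for the part of the list that is not shown.
VOCABULARY = ["fish", "fishing", "viol", "guitar", "double", "cello",
              "rod", "pound", "band", "player", "fly", "sound"]

def bag_of_words_vector(window_words, vocabulary):
    """Count how often each vocabulary term occurs in the window,
    ignoring position and any word outside the vocabulary."""
    counts = Counter(w.lower() for w in window_words)
    return [counts[term] for term in vocabulary]

window = ["guitar", "and", "bass", "player", "stand"]
vector = bag_of_words_vector(window, VOCABULARY)
print(vector)
```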

Classifiers. Once we cast the WSD problem as a classification problem, all sorts of techniques are possible: Naïve Bayes (the easiest thing to try first), decision lists, decision trees, neural nets, support vector machines, nearest neighbor methods.
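Since Naïve Bayes is named as the easiest thing to try first, here is a minimal sketch of it over bag-of-words context features. It picks the sense maximizing log P(sense) plus the sum of log P(word | sense), with add-one smoothing. The four training examples are a tiny hypothetical set loosely echoing the bass4/bass7 contexts, and the function names are assumptions of this example, not a reference implementation.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (sense, context_words). Returns raw counts."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def classify(model, context_words):
    """Return the sense with the highest smoothed log-probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)                      # log prior
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context_words:
            if w in vocab:                               # skip unseen words
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Hypothetical toy training data, not drawn from any real corpus.
TRAINING = [
    ("bass4", ["fish", "salmon", "striped"]),
    ("bass4", ["smoked", "filets", "sturgeon"]),
    ("bass7", ["jazz", "player", "guitar"]),
    ("bass7", ["play", "solo", "guitar"]),
]
model = train_naive_bayes(TRAINING)
print(classify(model, ["guitar", "player"]))
print(classify(model, ["smoked", "salmon"]))
```

Working in log space avoids underflow when many context words are multiplied together, and the add-one term keeps a single unseen word-sense pair from zeroing out a sense.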