LING/C SC 581: Advanced Computational Linguistics. Lecture Notes Feb 6th


1 LING/C SC 581: Advanced Computational Linguistics Lecture Notes Feb 6th

2 Administrivia
The Homework Pipeline:
  Homework 2 graded
  Homework 4 not back yet (soon)
  Homework 5 due Weds by midnight
No classes next week: I'm out of town on business
No new homework assigned this week

3 Today's Topics Homework 4 review

4 Homework 4 Review: Question 1
Construct a WSJ text corpus that excludes both words tagged as -NONE- and punctuation words (defined previously).
Show your Python console.
How many words are in the corpus?
How many distinct words?
Plot the cumulative frequency distribution graph.
How many top words do you need to account for 50% of the corpus?

5 Homework 4 Review: Question 1
    import nltk
    from nltk.corpus import ptb
    # Tags to skip: empty elements (-NONE-) and punctuation
    excluded = set(['-NONE-', '-LRB-', '-RRB-', 'SYM', ':', '.', ',', '``', "''"])
    tokens = [x[0] for x in ptb.tagged_words(categories=['news']) if x[1] not in excluded]
    text = nltk.Text(tokens)
    words = set(tokens)
    print('Tokens: {}; #Words: {}'.format(len(text), len(words)))
Tokens: ; #Words:
    print('Lexical diversity: {:.3f}%'.format(len(words)/len(text)))
Lexical diversity: 0.047%
    dist = nltk.FreqDist(text)
    print(dist)
<FreqDist with samples and outcomes>

6 Homework 4 Review: Question 1
    # Rank (word, frequency) pairs from most to least frequent
    ranked = sorted(dist.items(), key=lambda t: t[1], reverse=True)
    half = len(text) / 2.0
    total = 0
    index = 0
    # Accumulate counts until half of the corpus is covered
    while total < half:
        total += ranked[index][1]
        index += 1
    print('No of words: {}; total: {}'.format(index, total))
No of words: 217; total: /2 =

7 Homework 4 Review: Question 1
    # Print the top words that together cover half of the corpus
    print('{:12s} {:5s}'.format('Word', 'Freq'))
    for word, freq in ranked[:index]:
        print('{:12s} {:5d}'.format(word, freq))
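
The 50% coverage computation can be tried without the Penn Treebank installed. The following is a minimal sketch on a toy token list, using only the standard library; the function name words_for_coverage and the toy sentence are illustrative assumptions, not part of the homework solution.

```python
from collections import Counter

def words_for_coverage(tokens, fraction=0.5):
    """Return (number of top word types, tokens covered) needed to reach
    the given fraction of the corpus, taking words by descending frequency."""
    counts = Counter(tokens)
    target = len(tokens) * fraction
    covered = 0
    for n, (word, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered >= target:
            return n, covered
    return len(counts), covered

# Hypothetical toy corpus (13 tokens), standing in for the WSJ tokens
toy = "the cat sat on the mat and the dog sat on the log".split()
n, covered = words_for_coverage(toy)
print('No of words: {}; total: {}'.format(n, covered))
```

On a real corpus the same function would take the filtered token list in place of the toy list.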

8 Homework 4 Review: Question 1

9 Homework 4 Review: Question 2
With case folding:
    tokens = [x[0].lower() for x in ptb.tagged_words(categories=['news']) if x[1] not in excluded]
Tokens: ; #Words:
Lexical diversity: 0.042%
No of words: 176; total: ( /2= )

10 Homework 4 Review: Question 2

11 Colorless green ideas
Examples:
  (1) colorless green ideas sleep furiously
  (2) furiously sleep ideas green colorless
Chomsky (1957): ... It is fair to assume that neither sentence (1) nor (2) (nor indeed any part of these sentences) has ever occurred in an English discourse. Hence, in any statistical model for grammaticalness, these sentences will be ruled out on identical grounds as equally `remote' from English. Yet (1), though nonsensical, is grammatical, while (2) is not.
Idea: (1) is syntactically valid, (2) is word salad.
One piece of supporting evidence:
  (1) is pronounced with normal intonation
  (2) is pronounced like a list of words

12 Background: Language Models and N-grams
Given a word sequence w_1 w_2 w_3 ... w_n
Chain rule: how to compute the probability of a sequence of words
  p(w_1 w_2) = p(w_1) p(w_2 | w_1)
  p(w_1 w_2 w_3) = p(w_1) p(w_2 | w_1) p(w_3 | w_1 w_2)
  ...
  p(w_1 w_2 w_3 ... w_n) = p(w_1) p(w_2 | w_1) p(w_3 | w_1 w_2) ... p(w_n | w_1 ... w_{n-2} w_{n-1})
Note: it's not easy to collect (meaningful) statistics on p(w_n | w_{n-1} w_{n-2} ... w_1) for all possible word sequences.

13 Background: Language Models and N-grams
Given a word sequence w_1 w_2 w_3 ... w_n
Bigram approximation: just look at the previous word only (not all the preceding words)
Markov Assumption: finite length history
1st order Markov Model:
  p(w_1 w_2 w_3 ... w_n) = p(w_1) p(w_2 | w_1) p(w_3 | w_1 w_2) ... p(w_n | w_1 ... w_{n-3} w_{n-2} w_{n-1})
  p(w_1 w_2 w_3 ... w_n) ≈ p(w_1) p(w_2 | w_1) p(w_3 | w_2) ... p(w_n | w_{n-1})
Note: p(w_n | w_{n-1}) is a lot easier to collect data for (and thus estimate well) than p(w_n | w_1 ... w_{n-2} w_{n-1}).
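
To make the bigram approximation concrete, here is a minimal sketch of an unsmoothed (maximum likelihood) bigram model on a toy corpus. Several things are assumptions for illustration and are not from the slides: the function names, the toy sentences, and the use of boundary markers <s> and </s>, so that p(w_1) is modelled as p(w_1 | <s>).

```python
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their left contexts over padded sentences."""
    contexts = Counter()
    bigrams = Counter()
    for sent in sentences:
        words = ["<s>"] + sent + ["</s>"]
        contexts.update(words[:-1])            # every word that has a successor
        bigrams.update(zip(words, words[1:]))  # adjacent pairs (w_{i-1}, w_i)
    return contexts, bigrams

def sentence_prob(sent, contexts, bigrams):
    """p(sentence) as a product of p(w_i | w_{i-1}), with no smoothing."""
    words = ["<s>"] + sent + ["</s>"]
    prob = 1.0
    for prev, cur in zip(words, words[1:]):
        if contexts[prev] == 0:
            return 0.0                         # unseen history
        prob *= bigrams[(prev, cur)] / contexts[prev]
    return prob

# Hypothetical toy training data
corpus = [["green", "ideas", "sleep"],
          ["new", "ideas", "appear"],
          ["green", "ideas", "appear"]]
contexts, bigrams = train_bigram(corpus)
p_seen = sentence_prob(["green", "ideas", "appear"], contexts, bigrams)
p_unseen = sentence_prob(["ideas", "green"], contexts, bigrams)
print(p_seen, p_unseen)
```

Without smoothing, any sentence containing an unseen bigram gets probability zero, which is the situation Chomsky's argument describes: both of his sentences would be ruled out on identical grounds.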

14 Colorless green ideas
Sentences:
  (1) colorless green ideas sleep furiously
  (2) furiously sleep ideas green colorless
Statistical Experiment (Pereira 2002): bigram language model over adjacent word pairs w_{i-1} w_i

15 Part-of-Speech (POS) Tag Sequence
Chomsky's example:
  colorless  green  ideas  sleep  furiously
  JJ         JJ     NNS    VBP    RB         (POS tags)
Similar but grammatical example:
  revolutionary  new  ideas  appear  infrequently
  JJ             JJ   NNS    VBP     RB
LSLT pg. 146

16 Stanford Parser
Stanford Parser: a probabilistic phrase structure (PS) parser trained on the Penn Treebank

17 Stanford Parser
Stanford Parser: a probabilistic phrase structure (PS) parser trained on the Penn Treebank

18 Penn Treebank (PTB) Corpus: word frequencies

  Word           POS    Frequency
  colorless             0
  green          NNP    33
                 JJ     19
                 NN     5
  ideas          NNS    32
  sleep          VB     5
                 NN     4
                 VBP    2
                 NNP    1
  furiously      RB     2

  Word           POS    Frequency
  revolutionary  JJ     6
                 NNP    2
                 NN     2
  new            JJ     1795
                 NNP    1459
                 NNPS   2
                 NN     1
  ideas          NNS    32
  appear         VB     55
                 VBP    41
  infrequently          0
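
A table like this can be tabulated from any list of (word, tag) pairs. The sketch below uses a small hypothetical tagged list rather than the PTB itself; the function name tag_frequencies, the toy data, and the choice to fold case before counting are all assumptions for illustration.

```python
from collections import Counter, defaultdict

def tag_frequencies(tagged_words):
    """Map each case-folded word to a Counter of its POS tags."""
    table = defaultdict(Counter)
    for word, tag in tagged_words:
        table[word.lower()][tag] += 1
    return table

# Hypothetical toy data standing in for the (word, tag) pairs of a tagged corpus
toy_tagged = [("Green", "NNP"), ("green", "JJ"), ("green", "JJ"),
              ("ideas", "NNS"), ("sleep", "VB")]
table = tag_frequencies(toy_tagged)

for word in ["green", "ideas", "colorless"]:
    # .get avoids creating empty entries for unseen words
    tags = table.get(word, Counter()).most_common()
    print('{:12s} {}'.format(word, tags if tags else 0))
```

With NLTK and the full PTB installed, the same function could be fed the tagged words from the corpus reader used in the homework.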

19 Stanford Parser
Structure of NPs: colorless green ideas, revolutionary new ideas

  Phrase             Frequency
  [NP JJ JJ NNS]     1073
  [NP NNP JJ NNS]    61

20 An experiment
Examples:
  (1) colorless green ideas sleep furiously
  (2) furiously sleep ideas green colorless
Question: Is (1) even the most likely permutation of these particular five words?

21 Parsing Data All 5! (=120) permutations of colorless green ideas sleep furiously.
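
For reference, the 120 candidate sentences can be generated with the standard library. This is only a sketch of the enumeration step; how each permutation was scored by the parser is not shown here.

```python
from itertools import permutations

words = "colorless green ideas sleep furiously".split()

# All orderings of the five words, each joined back into a sentence
sentences = [" ".join(p) for p in permutations(words)]

print(len(sentences))   # 5! = 120
print(sentences[0])     # first permutation is the original word order
print(sentences[-1])    # last permutation is the fully reversed order
```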

22 Parsing Data
The winning sentence was:
  #1 furiously ideas sleep colorless green
after training on sections (approx. 40,000 sentences)
  sleep selects for ADJP object with 2 heads
  adverb (RB) furiously modifies noun

23 Parsing Data
The next two highest scoring permutations were:
  #2 Furiously green ideas sleep colorless.
  #3 Green ideas sleep furiously colorless.
sleep takes NP object
sleep takes ADJP object

24 Parsing Data
(Pereira 2002) compared Chomsky's original minimal pair:
  23. colorless green ideas sleep furiously
  36. furiously sleep ideas green colorless
These ranked #23 and #36 respectively out of 120.

25 Parsing Data
But the graph (next slide) shows how arbitrary these rankings are when trained on randomly chosen sections covering 14K-31K sentences.
Example: #36 furiously sleep ideas green colorless outranks #23 colorless green ideas sleep furiously (and the top 3) over much of the training space.
Example: Chomsky's original sentence #23 colorless green ideas sleep furiously outranks both the top 3 and #36 just briefly at one data point.

26 Sentence Rank vs. Amount of Training Data
[Figure: rank (y-axis) against amount of training data (x-axis) for the best three sentences, #1, #2 and #3]

27 Sentence Rank vs. Amount of Training Data
[Figure: rank (y-axis) against amount of training data (x-axis) for #23 colorless green ideas sleep furiously and #36 furiously sleep ideas green colorless]

28 Sentence Rank vs. Amount of Training Data
[Figure: rank (y-axis) against amount of training data (x-axis) for #1, #2, #3, #23 colorless green ideas sleep furiously and #36 furiously sleep ideas green colorless]


Music Genre Classification and Variance Comparison on Number of Genres Music Genre Classification and Variance Comparison on Number of Genres Miguel Francisco, miguelf@stanford.edu Dong Myung Kim, dmk8265@stanford.edu 1 Abstract In this project we apply machine learning techniques

More information

Adverb Phrases & Reasons. Week 7, Wed 10/14/15 Todd Windisch, Fall 2015

Adverb Phrases & Reasons. Week 7, Wed 10/14/15 Todd Windisch, Fall 2015 Adverb Phrases & Reasons Week 7, Wed 10/14/15 Todd Windisch, Fall 2015 Final Draft WRITING PACKET #2 You have 35 minutes to finish your final draft and turn it in to me It is due at 2:50! If it is late,

More information

Musical Creativity. Jukka Toivanen Introduction to Computational Creativity Dept. of Computer Science University of Helsinki

Musical Creativity. Jukka Toivanen Introduction to Computational Creativity Dept. of Computer Science University of Helsinki Musical Creativity Jukka Toivanen Introduction to Computational Creativity Dept. of Computer Science University of Helsinki Basic Terminology Melody = linear succession of musical tones that the listener

More information

MODELING HARMONY WITH SKIP-GRAMS

MODELING HARMONY WITH SKIP-GRAMS MODELING HARMONY WITH SKIP-GRAMS David R. W. Sears Andreas Arzt Harald Frostel Reinhard Sonnleitner Gerhard Widmer Department of Computational Perception, Johannes Kepler University, Linz, Austria david.sears@jku.at

More information

Centre for Economic Policy Research

Centre for Economic Policy Research The Australian National University Centre for Economic Policy Research DISCUSSION PAPER The Reliability of Matches in the 2002-2004 Vietnam Household Living Standards Survey Panel Brian McCaig DISCUSSION

More information

Chapter 14. From Randomness to Probability. Probability. Probability (cont.) The Law of Large Numbers. Dealing with Random Phenomena

Chapter 14. From Randomness to Probability. Probability. Probability (cont.) The Law of Large Numbers. Dealing with Random Phenomena Chapter 14 From Randomness to Probability Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 14-1

More information

Music Recommendation from Song Sets

Music Recommendation from Song Sets Music Recommendation from Song Sets Beth Logan Cambridge Research Laboratory HP Laboratories Cambridge HPL-2004-148 August 30, 2004* E-mail: Beth.Logan@hp.com music analysis, information retrieval, multimedia

More information

STYLISTIC ANALYSIS OF MAYA ANGELOU S EQUALITY

STYLISTIC ANALYSIS OF MAYA ANGELOU S EQUALITY Lingua Cultura, 11(2), November 2017, 85-89 DOI: 10.21512/lc.v11i2.1602 P-ISSN: 1978-8118 E-ISSN: 2460-710X STYLISTIC ANALYSIS OF MAYA ANGELOU S EQUALITY Arina Isti anah English Letters Department, Faculty

More information

Automatic Polyphonic Music Composition Using the EMILE and ABL Grammar Inductors *

Automatic Polyphonic Music Composition Using the EMILE and ABL Grammar Inductors * Automatic Polyphonic Music Composition Using the EMILE and ABL Grammar Inductors * David Ortega-Pacheco and Hiram Calvo Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan

More information

On the Ontological Basis for Logical Metonymy:

On the Ontological Basis for Logical Metonymy: Page 1: OntoLex 2002, May 27th. On the Ontological Basis for : Telic Roles and WORDNET Sandiway Fong NEC Research Institute Princeton NJ USA Eventive verb enjoy: Mary enjoyed the party Mary enjoyed dancing

More information

EEC 116 Fall 2011 Lab #5: Pipelined 32b Adder

EEC 116 Fall 2011 Lab #5: Pipelined 32b Adder EEC 116 Fall 2011 Lab #5: Pipelined 32b Adder Dept. of Electrical and Computer Engineering University of California, Davis Issued: November 2, 2011 Due: November 16, 2011, 4PM Reading: Rabaey Sections

More information

ECE 301 Digital Electronics

ECE 301 Digital Electronics ECE 301 Digital Electronics Counters (Lecture #20) The slides included herein were taken from the materials accompanying Fundamentals of Logic Design, 6 th Edition, by Roth and Kinney, and were used with

More information

ECE 331 Digital System Design

ECE 331 Digital System Design ECE 331 Digital System Design Counters (Lecture #20) The slides included herein were taken from the materials accompanying Fundamentals of Logic Design, 6 th Edition, by Roth and Kinney, and were used

More information

Emotionally-Relevant Features for Classification and Regression of Music Lyrics

Emotionally-Relevant Features for Classification and Regression of Music Lyrics IEEE TRANSACTIONS ON JOURNAL AFFECTIVE COMPUTING, MANUSCRIPT ID 1 Emotionally-Relevant Features for Classification and Regression of Music Lyrics Ricardo Malheiro, Renato Panda, Paulo Gomes and Rui Pedro

More information

An Icelandic Gigaword Corpus

An Icelandic Gigaword Corpus Steinþór Steingrímsson, Sigrún Helgadóttir & Eiríkur Rögnvaldsson The paper describes work in progress to compile an Icelandic Gigaword Corpus (IGC). The initial aim of the project was to compile a large

More information

EE373B Project Report Can we predict general public s response by studying published sales data? A Statistical and adaptive approach

EE373B Project Report Can we predict general public s response by studying published sales data? A Statistical and adaptive approach EE373B Project Report Can we predict general public s response by studying published sales data? A Statistical and adaptive approach Song Hui Chon Stanford University Everyone has different musical taste,

More information

Adjectives - Semantic Characteristics

Adjectives - Semantic Characteristics Adjectives - Semantic Characteristics Prototypical ADJs (inherent, concrete, relatively stable qualities) 1. Size General size: Horizontal extension: Thickness: Vertical extension: Vertical elevation:

More information

CSE 166: Image Processing. Overview. Representing an image. What is an image? History. What is image processing? Today. Image Processing CSE 166

CSE 166: Image Processing. Overview. Representing an image. What is an image? History. What is image processing? Today. Image Processing CSE 166 CSE 166: Image Processing Overview Image Processing CSE 166 Today Course overview Logistics Some mathematics MATLAB Lectures will be boardwork and slides Take written notes or take pictures of the board

More information

CHAPTER I INTRODUCTION

CHAPTER I INTRODUCTION CHAPTER I INTRODUCTION A. Background of the Study The meaning of word, phrase and sentence is very important to be analyzed because it can make something more understandable to be communicated to the others.

More information

Topic: Part of Speech Exam & Sentence Types KEY

Topic: Part of Speech Exam & Sentence Types KEY 09.13.10 Topic: Part of Speech Exam & Sentence Types KEY AFTER THIS CLASS YOU WILL BE ABLE TO: 1. Demonstrate mastery of parts of speech. 2. Identify and use declarative, interrogatory, imperative, and

More information

Learning to translate with source and target syntax. David Chiang, USC Information Sciences Institute

Learning to translate with source and target syntax. David Chiang, USC Information Sciences Institute Learning to translate with source and target syntax David Chiang, USC Information Sciences Institute 14 July 2010 Overview Using source and target syntax Why is it hard? How can we make it better? Let

More information

A computer assisted analysis of literary text: from feature analysis to judgements of literary merit Tess M. E. A. Crosbie

A computer assisted analysis of literary text: from feature analysis to judgements of literary merit Tess M. E. A. Crosbie Title Name A computer assisted analysis of literary text: from feature analysis to judgements of literary merit Tess M. E. A. Crosbie This is a digitised version of a dissertation submitted to the University

More information

Chapter 1 Midterm Review

Chapter 1 Midterm Review Name: Class: Date: Chapter 1 Midterm Review Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A survey typically records many variables of interest to the

More information

How to Write a Term Paper or Thesis

How to Write a Term Paper or Thesis How to Write a Term Paper or Thesis Michael A. Covington Artificial Intelligence Center The University of Georgia Athens, Georgia 30602 http://www.ai.uga.edu/mc Revised June 3, 2005 Abstract This is a

More information