LING/C SC 581: Advanced Computational Linguistics. Lecture Notes Feb 6th
2 Administrivia
The Homework Pipeline:
  Homework 2 graded
  Homework 4 not back yet (soon)
  Homework 5 due Weds by midnight
No classes next week: I'm out of town on business
No new homework assigned this week
3 Today's Topics Homework 4 review
4 Homework 4 Review: Question 1
Construct a WSJ text corpus that excludes both words tagged as -NONE- and punctuation words (defined previously)
Show your Python console.
How many words in the corpus?
How many distinct words?
Plot the cumulative frequency distribution graph
How many top words do you need to account for 50% of the corpus?
5 Homework 4 Review: Question 1
import nltk
from nltk.corpus import ptb
# POS tags to skip: empty elements (-NONE-), brackets, symbols, punctuation
excluded = set(['-NONE-', '-LRB-', '-RRB-', 'SYM', ':', '.', ',', '``', "''"])
tokens = [x[0] for x in ptb.tagged_words(categories=['news']) if x[1] not in excluded]
words = set(tokens)
text = nltk.Text(tokens)
print('Tokens: {}; #Words: {}'.format(len(text), len(words)))
Tokens: ; #Words:
len(words)
print('Lexical diversity: {:.3f}%'.format(len(words)/len(text)))
Lexical diversity: 0.047%
dist = nltk.FreqDist(text)
print(dist)
<FreqDist with samples and outcomes>
6 Homework 4 Review: Question 1
# Sort (word, count) pairs by decreasing count
ranked = sorted(dist.items(), key=lambda t: t[1], reverse=True)
half = len(text) / 2.0
total = 0
index = 0
# Add up the counts of the most frequent words until half the corpus is covered
while total < half:
    total += ranked[index][1]
    index += 1
print('No of words: {}; total: {}'.format(index, total))
No of words: 217; total: /2 =
7 Homework 4 Review: Question 1
print('{:12s} {:5s}'.format('Word', 'Freq'))
for word, freq in ranked[:index]:
    print('{:12s} {:5d}'.format(word, freq))
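The 50% loop can be tried without the Penn Treebank. A minimal sketch, assuming a made-up token list (`toy_tokens`) and a hypothetical helper `top_words_for_coverage`; it uses `collections.Counter` as a stand-in for `nltk.FreqDist`, so it is not the homework solution itself:

```python
from collections import Counter

def top_words_for_coverage(tokens, fraction=0.5):
    """Return (k, covered): how many of the most frequent word types are
    needed to reach `fraction` of all tokens, and how many tokens they cover.
    Hypothetical helper, not part of NLTK."""
    target = len(tokens) * fraction
    covered = 0
    k = 0
    # most_common() yields (word, count) pairs in decreasing count order
    for _, count in Counter(tokens).most_common():
        if covered >= target:
            break
        covered += count
        k += 1
    return k, covered

# Toy data (an assumption for illustration, not WSJ text)
toy_tokens = 'the cat sat on the mat and the dog sat on the rug'.split()
k, covered = top_words_for_coverage(toy_tokens)
print('No of words: {}; total: {}'.format(k, covered))
```

Passing the filtered `tokens` list from the homework in place of `toy_tokens` should give the same kind of count as the `while` loop.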
8 Homework 4 Review: Question 1
9 Homework 4 Review: Question 2
With case folding:
tokens = [x[0].lower() for x in ptb.tagged_words(categories=['news']) if x[1] not in excluded]
Tokens: ; #Words:
Lexical diversity: 0.042%
No of words: 176; total: ( /2= )
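Why case folding lowers both the distinct-word count and the lexical diversity can be seen on a tiny example. A sketch, assuming a made-up word list and a hypothetical helper `lexical_diversity`:

```python
def lexical_diversity(tokens):
    """Distinct word types divided by total tokens (hypothetical helper)."""
    return len(set(tokens)) / len(tokens)

# Toy data (an assumption for illustration): 'The'/'the' and 'market'/'Market'
# count as different types until they are lowercased.
toy_words = ['The', 'market', 'fell', 'the', 'Market', 'rose']
folded = [w.lower() for w in toy_words]

print('without folding: {} types, diversity {:.3f}'.format(
    len(set(toy_words)), lexical_diversity(toy_words)))
print('with folding:    {} types, diversity {:.3f}'.format(
    len(set(folded)), lexical_diversity(folded)))
```

The token count is unchanged by folding; only the number of distinct types drops, which is why fewer top words are needed to reach 50% of the corpus.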
10 Homework 4 Review: Question 2
11 Colorless green ideas
Examples:
(1) colorless green ideas sleep furiously
(2) furiously sleep ideas green colorless
Chomsky (1957): ... It is fair to assume that neither sentence (1) nor (2) (nor indeed any part of these sentences) has ever occurred in an English discourse. Hence, in any statistical model for grammaticalness, these sentences will be ruled out on identical grounds as equally `remote' from English. Yet (1), though nonsensical, is grammatical, while (2) is not.
Idea: (1) is syntactically valid, (2) is word salad
One piece of supporting evidence:
  (1) pronounced with normal intonation
  (2) pronounced like a list of words
12 Background: Language Models and N-grams
Given a word sequence $w_1 w_2 w_3 \ldots w_n$
Chain rule: how to compute the probability of a sequence of words
  $p(w_1 w_2) = p(w_1)\,p(w_2 \mid w_1)$
  $p(w_1 w_2 w_3) = p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_1 w_2)$
  ...
  $p(w_1 w_2 w_3 \ldots w_n) = p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_1 w_2) \cdots p(w_n \mid w_1 \ldots w_{n-2} w_{n-1})$
Note: It's not easy to collect (meaningful) statistics on $p(w_n \mid w_{n-1} w_{n-2} \ldots w_1)$ for all possible word sequences
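As a worked instance, here is the chain rule written out for the five words of sentence (1). This is only the general formula with $n = 5$; no probabilities are estimated:

```latex
\begin{align*}
p(\text{colorless green ideas sleep furiously})
  ={}& p(\text{colorless}) \\
  &\times p(\text{green} \mid \text{colorless}) \\
  &\times p(\text{ideas} \mid \text{colorless green}) \\
  &\times p(\text{sleep} \mid \text{colorless green ideas}) \\
  &\times p(\text{furiously} \mid \text{colorless green ideas sleep})
\end{align*}
```

The last factor conditions on a four-word history, which illustrates why these statistics are hard to collect.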
13 Background: Language Models and N-grams
Given a word sequence $w_1 w_2 w_3 \ldots w_n$
Bigram approximation: just look at the previous word only (not all the preceding words)
Markov Assumption: finite length history
1st order Markov Model
  $p(w_1 w_2 w_3 \ldots w_n) = p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_1 w_2) \cdots p(w_n \mid w_1 \ldots w_{n-3} w_{n-2} w_{n-1})$
  $p(w_1 w_2 w_3 \ldots w_n) \approx p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_2) \cdots p(w_n \mid w_{n-1})$
Note: $p(w_n \mid w_{n-1})$ is a lot easier to collect data for (and thus estimate well) than $p(w_n \mid w_1 \ldots w_{n-2} w_{n-1})$
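The bigram approximation can be sketched with plain counts. This is a minimal illustration, assuming a made-up three-sentence corpus and hypothetical helper names (`train`, `sentence_prob`); probabilities are unsmoothed relative frequencies, so it is not the model used in any experiment cited in these notes:

```python
from collections import Counter

def train(sentences):
    """Count unigrams and adjacent word pairs; each sentence is a list of words."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        unigrams.update(sent)
        bigrams.update(zip(sent, sent[1:]))
    return unigrams, bigrams

def sentence_prob(sent, unigrams, bigrams):
    """p(w_1) * p(w_2 | w_1) * ... * p(w_n | w_{n-1}), with
    p(w_i | w_{i-1}) estimated as count(w_{i-1} w_i) / count(w_{i-1})."""
    total = sum(unigrams.values())
    prob = unigrams[sent[0]] / total
    for prev, word in zip(sent, sent[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen history: no estimate available
        prob *= bigrams[(prev, word)] / unigrams[prev]
    return prob

# Toy corpus (an assumption for illustration)
corpus = [['green', 'ideas', 'sleep'],
          ['green', 'ideas', 'grow'],
          ['new', 'ideas', 'sleep']]
unigrams, bigrams = train(corpus)
print(sentence_prob(['green', 'ideas', 'sleep'], unigrams, bigrams))
print(sentence_prob(['sleep', 'ideas', 'green'], unigrams, bigrams))
```

Without smoothing, any sentence containing an unseen bigram gets probability zero, so two unseen sentences are both "ruled out on identical grounds", which is the situation Chomsky's quote describes.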
14 Colorless green ideas
Sentences:
(1) colorless green ideas sleep furiously
(2) furiously sleep ideas green colorless
Statistical Experiment (Pereira 2002): bigram language model ($w_{i-1}\,w_i$)
15 Part-of-Speech (POS) Tag Sequence
Chomsky's example:
  colorless      green  ideas  sleep   furiously
  JJ             JJ     NNS    VBP     RB            (POS Tags)
Similar but grammatical example:
  revolutionary  new    ideas  appear  infrequently
  JJ             JJ     NNS    VBP     RB
LSLT pg. 146
16 Stanford Parser
Stanford Parser: a probabilistic PS parser trained on the Penn Treebank
17 Stanford Parser
Stanford Parser: a probabilistic PS parser trained on the Penn Treebank
18 Penn Treebank (PTB) Corpus: word frequencies
Word           POS    Frequency
colorless             0
green          NNP    33
               JJ     19
               NN     5
ideas          NNS    32
sleep          VB     5
               NN     4
               VBP    2
               NNP    1
furiously      RB     2

Word           POS    Frequency
revolutionary  JJ     6
               NNP    2
               NN     2
new            JJ     1795
               NNP    1459
               NNPS   2
               NN     1
ideas          NNS    32
appear         VB     55
               VBP    41
infrequently          0
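A word-by-tag frequency table of this kind can be built from any list of (word, tag) pairs. A sketch, assuming a made-up tagged list and a hypothetical helper `tag_frequencies`; lowercasing the words is an assumption here, and the counts below are toy values, not the PTB figures:

```python
from collections import Counter, defaultdict

def tag_frequencies(tagged_words):
    """Map each lowercased word to a Counter of its POS tags
    (hypothetical helper; case folding is an assumption)."""
    table = defaultdict(Counter)
    for word, tag in tagged_words:
        table[word.lower()][tag] += 1
    return table

# Toy data (an assumption for illustration, not PTB counts)
toy_tagged = [('Green', 'NNP'), ('green', 'JJ'), ('green', 'JJ'),
              ('ideas', 'NNS'), ('sleep', 'VB'), ('sleep', 'NN')]
table = tag_frequencies(toy_tagged)

for word in ['colorless', 'green', 'ideas', 'sleep']:
    counts = table[word].most_common()
    # A word that never occurs gets a single row with frequency 0
    for tag, freq in counts or [('', 0)]:
        print('{:12s} {:5s} {:5d}'.format(word, tag, freq))
```

With the corpus installed, the output of `ptb.tagged_words()` could be passed in place of `toy_tagged`.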
19 Stanford Parser
Structure of NPs:
  colorless green ideas
  revolutionary new ideas
Phrase             Frequency
[NP JJ JJ NNS]     1073
[NP NNP JJ NNS]    61
20 An experiment examples (1) colorless green ideas sleep furiously (2) furiously sleep ideas green colorless Question: Is (1) even the most likely permutation of these particular five words?
21 Parsing Data All 5! (=120) permutations of colorless green ideas sleep furiously.
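Enumerating the 120 orderings and finding where one sentence ranks takes only a few lines. A sketch in which `rank_of` and `toy_score` are hypothetical names; `toy_score` is a placeholder and does not reproduce the parser's probabilities:

```python
from itertools import permutations

words = 'colorless green ideas sleep furiously'.split()

# All orderings of the five words, as strings
perms = [' '.join(p) for p in permutations(words)]
print(len(perms))  # 120

def rank_of(sentence, candidates, score):
    """1-based rank of `sentence` when candidates are sorted by decreasing score."""
    ranked = sorted(candidates, key=score, reverse=True)
    return ranked.index(sentence) + 1

def toy_score(sentence):
    # Placeholder score (an assumption): number of words in their original position.
    # In the experiment, the score would come from the trained parser instead.
    return sum(a == b for a, b in zip(sentence.split(), words))

print(rank_of('colorless green ideas sleep furiously', perms, toy_score))
```

Swapping `toy_score` for a function that returns a model's probability for the sentence gives the kind of ranking discussed on the following slides.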
22 Parsing Data
The winning sentence was:
  #1 furiously ideas sleep colorless green.
after training on sections (approx. 40,000 sentences)
  sleep selects for ADJP object with 2 heads
  adverb (RB) furiously modifies noun
23 Parsing Data
The next two highest scoring permutations were:
  #2 Furiously green ideas sleep colorless.
  #3 Green ideas sleep furiously colorless.
sleep takes NP object
sleep takes ADJP object
24 Parsing Data
(Pereira 2002) compared Chomsky's original minimal pair:
  23. colorless green ideas sleep furiously
  36. furiously sleep ideas green colorless
Ranked #23 and #36 respectively out of 120
25 Parsing Data
But the graph (next slide) shows how arbitrary these rankings are when trained on randomly chosen sections covering 14K-31K sentences
Example: #36 furiously sleep ideas green colorless outranks #23 colorless green ideas sleep furiously (and the top 3) over much of the training space
Example: Chomsky's original sentence #23 colorless green ideas sleep furiously outranks both the top 3 and #36 just briefly at one data point
26 Sentence Rank vs. Amount of Training Data
[Figure: rank vs. amount of training data for the best three sentences (#1, #2, #3)]
27 Sentence Rank vs. Amount of Training Data
[Figure: rank vs. amount of training data for #23 colorless green ideas sleep furiously and #36 furiously sleep ideas green colorless]
28 Sentence Rank vs. Amount of Training Data
[Figure: rank vs. amount of training data for #1, #2, #3, #23 colorless green ideas sleep furiously and #36 furiously sleep ideas green colorless]