Learning multi-grained aspect target sequence for Chinese sentiment analysis. H Peng, Y Ma, Y Li, E Cambria Knowledge-Based Systems (2018)


Tutorial 1

Learning multi-grained aspect target sequence for Chinese sentiment analysis. H Peng, Y Ma, Y Li, E Cambria. Knowledge-Based Systems (2018)

Ideas
Task: aspect-term sentiment classification.
Problems, e.g.:
- The red apple released in California was not that interesting.
- The room size is small, but the view is excellent.
Opportunities in Chinese: compositionality, e.g.
- Fire (火) + Vehicle (车) = Train (火车)
- Wood (木) + Wood (木) = Jungle (林); three Woods (木) = Forest (森)

Solutions
- Adaptive word embeddings
- Aspect target sequence modelling
- Attention mechanism
- Sequence modelling: LSTM
- Multi-grained learning
- Fusion of granularities

https://github.com/senticnet

Q1: A biword index over three documents (Doc1, Doc2, Doc3) contains postings for the biwords: angels fear, angels rush, fear fools, fear to, fools rush, rush in, to tread, where angels.
a) Which are the biword Boolean queries generated by the following phrase queries?
1. fools rush in
2. where angels rush in
3. angels fear to tread
b) Which are, if any, the documents retrieved?

Q2: A positional index over Doc1, Doc2, and Doc3 stores, for each of the terms angels, fear, fools, in, rush, to, tread, and where, the positions at which the term occurs in each document (e.g. in Doc1: fools ⟨1, 7, 74, 222⟩, rush ⟨2, 66, 94, 32, 72⟩, in ⟨3, 37, 76, 444, 85⟩). Which document(s), if any, meet each of the following phrase queries, based on the positional index?
(a) fools rush in
(b) where angels rush in
(c) angels fear to tread

Recap: Biword index
Index every consecutive pair of terms in the text as a phrase. E.g., "Friends, Romans, Countrymen" would generate the biwords:
1. friends romans
2. romans countrymen
Longer phrase queries can be broken into a Boolean query on biwords. E.g., "stanford university palo alto" becomes: stanford university AND university palo AND palo alto
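Both steps are mechanical enough to script. A minimal sketch (illustrative function names, not the tutorial's code):

```python
def biwords(text):
    """Return the consecutive term pairs (biwords) of a text."""
    terms = text.lower().replace(",", "").split()
    return [f"{a} {b}" for a, b in zip(terms, terms[1:])]

def phrase_to_boolean(phrase):
    """Break a longer phrase query into a Boolean AND of its biwords."""
    return " AND ".join(biwords(phrase))

print(biwords("Friends, Romans, Countrymen"))
# ['friends romans', 'romans countrymen']
print(phrase_to_boolean("stanford university palo alto"))
# stanford university AND university palo AND palo alto
```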

Recap: Positional index
Extract inverted index entries for each distinct term: to, be, or, not. Merge their doc:⟨position⟩ lists to enumerate all positions of "to be or not to be":
to: ⟨2: 1, 17, 74, 222, 551; 4: 8, 16, 190, 429, 433; 7: 13, 23, 191; …⟩
be: ⟨1: 17, 19; 4: 17, 191, 291, 430, 434; 5: 14, 19, 101; …⟩
The same general method works for proximity searches.
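A sketch of the positional merge for phrase queries, assuming an index shaped as term -> {doc id -> positions}; the toy index holds only the Doc1/Doc3 lists that A2 below uses:

```python
def phrase_match(index, phrase):
    """Return doc ids where the phrase's terms occur at consecutive positions."""
    terms = phrase.split()
    docs = set(index[terms[0]])
    for t in terms[1:]:
        docs &= set(index[t])  # candidate docs must contain every term
    hits = []
    for d in docs:
        # a match starts at position p if term i appears at p + i for every i
        starts = index[terms[0]][d]
        if any(all(p + i in index[t][d] for i, t in enumerate(terms)) for p in starts):
            hits.append(d)
    return hits

# Toy index consistent with A2: "fools rush in" matches Doc1 at positions 1, 2, 3.
index = {
    "fools": {1: [1, 7, 74, 222]},
    "rush":  {1: [2, 66, 94, 32, 72], 3: [4, 6, 44]},
    "in":    {1: [3, 37, 76, 444, 85], 3: [5, 7, 25, 95]},
}
print(phrase_match(index, "fools rush in"))  # [1]
```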

Group discussion

A1.a
fools rush in => fools rush AND rush in
where angels rush in => where angels AND angels rush AND rush in
angels fear to tread => angels fear AND fear to AND to tread

A1.b
fools rush in = Doc1
where angels rush in = Doc1, Doc3
angels fear to tread = null

A2
fools rush in => Doc1
  fools ⟨1, 7, 74, 222⟩, rush ⟨2, 66, 94, 32, 72⟩, in ⟨3, 37, 76, 444, 85⟩ (positions 1, 2, 3 are consecutive)
where angels rush in => Doc3
  where ⟨4, 36, 736⟩, angels ⟨5, 23, 42⟩, rush ⟨4, 6, 44⟩, in ⟨5, 7, 25, 95⟩ (positions 4, 5, 6, 7 are consecutive)
  In Doc1 no positional merge is available: where ⟨67, 24, 393, …⟩, angels ⟨36, 74, 252, 65⟩, rush ⟨2, 66, 94, 32, 72⟩, in ⟨3, 37, 76, 444, 85⟩

Q3: Consider the table of term frequencies for 3 documents, denoted Doc1, Doc2, Doc3, below. Compute the tf-idf weight of the terms car, auto, insurance, and best for each document, using the idf values from the table:
w_{t,d} = (1 + log10 tf_{t,d}) × log10(N / df_t)

term        Doc1   Doc2   Doc3   idf
car         27     10     24     1.65
auto        3      33     0      2.08
insurance   0      33     29     1.62
best        14     0      17     1.5

Sec. 6.2.2
Recall: tf-idf weighting
The tf-idf weight of a term is the product of its tf weight and its idf weight:
w_{t,d} = (1 + log10 tf_{t,d}) × log10(N / df_t)
It is the best-known weighting scheme in information retrieval. Note: the "-" in tf-idf is a hyphen, not a minus sign! Alternative names: tf.idf, tf x idf. The weight increases with the number of occurrences within a document.
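A minimal sketch of this weighting applied to the Q3 table (the idf column is taken as given rather than derived from N and df):

```python
import math

tf = {  # term -> [Doc1, Doc2, Doc3] raw term frequencies from Q3
    "car": [27, 10, 24],
    "auto": [3, 33, 0],
    "insurance": [0, 33, 29],
    "best": [14, 0, 17],
}
idf = {"car": 1.65, "auto": 2.08, "insurance": 1.62, "best": 1.5}

def tf_weight(f):
    """Log-scaled term frequency: 1 + log10(tf), or 0 if the term is absent."""
    return 1 + math.log10(f) if f > 0 else 0.0

for term, freqs in tf.items():
    print(term, [round(tf_weight(f) * idf[term], 2) for f in freqs])
# car [4.01, 3.3, 3.93] ... matching the A3 tables below up to rounding order
```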

Group discussion

A3: w_{t,d} = (1 + log10 tf_{t,d}) × log10(N / df_t)

1 + log10(tf):
            Doc1   Doc2   Doc3
car         2.43   2      2.38
auto        1.48   2.52   0
insurance   0      2.52   2.46
best        2.15   0      2.23

w = (1 + log10 tf) × idf:
            Doc1   Doc2   Doc3
car         4.01   3.3    3.93
auto        3.08   5.24   0
insurance   0      4.08   3.99
best        3.23   0      3.35

Q4: Refer to the tf and idf values for the four terms and three documents of Q3. Compute the two top-scoring documents for the query "best car insurance" under each of the following weighting schemes (in ddd.qqq notation): (i) nnn.atc; (ii) ntc.atc.

Sec. 6.4
Recall: tf-idf example, lnc.ltc
Document: "car insurance auto insurance"; query: "best car insurance" (the df values below imply N = 1,000,000).

Query (ltc):
term        tf-raw   tf-wt   df      idf   wt    normalize
auto        0        0       5000    2.3   0     0
best        1        1       50000   1.3   1.3   0.34
car         1        1       10000   2.0   2.0   0.52
insurance   1        1       1000    3.0   3.0   0.78

Document (lnc):
term        tf-raw   tf-wt   wt    normalize   prod
auto        1        1       1     0.52        0
best        0        0       0     0           0
car         1        1       1     0.52        0.27
insurance   2        1.3     1.3   0.68        0.53

Doc length = √(1² + 0² + 1² + 1.3²) ≈ 1.92
Score = 0 + 0 + 0.27 + 0.53 = 0.8
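The table can be reproduced with a short script; the df values and term counts are the slide's, N = 10⁶ is implied by idf = log10(N/df), and the helper names are mine:

```python
import math

N = 1_000_000
df = {"auto": 5000, "best": 50000, "car": 10000, "insurance": 1000}

def log_tf(counts):
    """l: log-scaled term frequency, 1 + log10(tf) for tf > 0."""
    return {t: 1 + math.log10(c) for t, c in counts.items() if c > 0}

def cosine_normalize(w):
    """c: divide every weight by the vector's Euclidean length."""
    length = math.sqrt(sum(x * x for x in w.values()))
    return {t: x / length for t, x in w.items()}

doc = cosine_normalize(log_tf({"car": 1, "insurance": 2, "auto": 1}))   # lnc
query = cosine_normalize({t: w * math.log10(N / df[t])                  # ltc
                          for t, w in log_tf({"best": 1, "car": 1, "insurance": 1}).items()})

score = sum(q_w * doc.get(t, 0.0) for t, q_w in query.items())
print(round(score, 2))  # 0.8, matching the slide
```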

Group discussion

A4: Find the document vectors: (i) nnn, (ii) ntc.

nnn (raw tf, no idf, no normalization):
            Doc1           Doc2           Doc3
car         27×1×1 = 27    10×1×1 = 10    24×1×1 = 24
auto        3×1×1 = 3      33×1×1 = 33    0
insurance   0              33×1×1 = 33    29×1×1 = 29
best        14×1×1 = 14    0              17×1×1 = 17

ntc (tf × idf, cosine-normalized; vector lengths 49.6, 88.55, 66.5):
            Doc1                           Doc2                            Doc3
car         (27×1.65 = 44.55)/49.6 = 0.9   (10×1.65 = 16.5)/88.55 = 0.19   (24×1.65 = 39.6)/66.5 = 0.6
auto        (3×2.08 = 6.24)/49.6 = 0.13    (33×2.08 = 68.64)/88.55 = 0.78   0
insurance   0                              (33×1.62 = 53.46)/88.55 = 0.6   (29×1.62 = 46.98)/66.5 = 0.7
best        (14×1.5 = 21)/49.6 = 0.42      0                               (17×1.5 = 25.5)/66.5 = 0.38

A4: Find the vector for the query "best car insurance": (i, ii) atc (max tf of the query = 1; length of the at vector = 2.76).

term        tf   a = 0.5 + 0.5×tf/max(tf)   at = a×idf   atc = at/2.76
car         1    0.5 + 0.5×1/1 = 1          1.65         0.6
auto        0    0                          0            0
insurance   1    0.5 + 0.5×1/1 = 1          1.62         0.59
best        1    0.5 + 0.5×1/1 = 1          1.5          0.54

(i) nnn.atc:
            Doc1             Doc2              Doc3
car         27×0.6 = 16.2    10×0.6 = 6        24×0.6 = 14.4
auto        0                0                 0
insurance   0                33×0.59 = 19.47   29×0.59 = 17.11
best        14×0.54 = 7.56   0                 17×0.54 = 9.18
SUM         23.76 (3rd)      25.47 (2nd)       40.69 (1st)

Top two documents for (i): Doc3, Doc2.

A4 (ii) ntc.atc

ntc         Doc1   Doc2   Doc3   atc (query)
car         0.9    0.19   0.6    0.6
auto        0.13   0.78   0      0
insurance   0      0.6    0.7    0.59
best        0.42   0      0.38   0.54

ntc.atc     Doc1               Doc2               Doc3
car         0.9×0.6 = 0.54     0.19×0.6 = 0.11    0.6×0.6 = 0.36
auto        0                  0                  0
insurance   0                  0.6×0.59 = 0.36    0.7×0.59 = 0.42
best        0.42×0.54 = 0.23   0                  0.38×0.54 = 0.21
SUM         0.77 (2nd)         0.47 (3rd)         0.99 (1st)

Top two documents for (ii): Doc3, Doc1.

Q5: Consider the term-document count matrix below.

            Antony and Cleopatra   Julius Caesar   The Tempest
Antony      57                     73
Brutus      28                     57
Caesar      232                    227
Calpurnia
Cleopatra   23                     37
Mercy       5
Worser      2

a) Compute the cosine similarity and the Euclidean distance between the documents and the query "brutus caesar mercy", based on the term-document count matrix above.
b) How does the Euclidean distance change if we normalize the vectors?
NB: compute the vector space using the tf-idf formula of Q3: w_{t,d} = (1 + log10 tf_{t,d}) × log10(N / df_t)

Recall: Euclidean distance
The distance between points (x1, y1) and (x2, y2) is given by:
d = √((x1 - x2)² + (y1 - y2)²)
Unfortunately, this distance is biased by the length of the vectors, so it cannot tell when two vectors have the same term distribution but different lengths.

Recall: Cosine similarity illustrated
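Q5b's point, that normalization removes the length bias of Euclidean distance, can be shown with a sketch (illustrative vectors, not Q5's numbers): for unit vectors, |a - b|² = 2 - 2·cos(a, b), so the two measures rank documents identically.

```python
import math

def norm(v):
    """Scale a vector to unit Euclidean length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    return sum(x * y for x, y in zip(norm(a), norm(b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

doc   = [2.0, 4.0, 1.0]   # same term distribution as the query,
query = [1.0, 2.0, 0.5]   # but twice the length

print(cosine(doc, query))                 # ~1.0: identical distribution
print(euclidean(doc, query))              # ~2.29: dominated by the length gap
print(euclidean(norm(doc), norm(query)))  # ~0 after length-normalization
```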

Group discussion

A5: Compute the vector space (tf-idf weights)

            Antony and Cleopatra   Julius Caesar   The Tempest   Query
Antony      .56                    .5
Brutus      .43                    .56                           .8
Caesar      .59                    .59                           .8
Calpurnia                          .95
Cleopatra   .42                    .45
Mercy       .35                                    .38           .8
Worser      .23                                    .8

A5
                     Antony and Cleopatra   Julius Caesar   The Tempest
Cosine similarity    .57                    .62             .35
Euclidean distance   .9                     .22             .58

A5: Normalized values

            Antony and Cleopatra   Julius Caesar   The Tempest   Query
Antony      .25                    .69
Brutus      .93                    .9                            .33
Caesar      .265                   .2                            .33
Calpurnia                          .322
Cleopatra   .88                    .446
Mercy       .9                                     .376          .33
Worser      .3                                     .78

Euclidean distance (normalized): .49, .47, .67

Tutorial 2

Context-Dependent Sentiment Analysis in User-Generated Videos. Poria, S., Cambria, E., Hazarika, D., Majumder, N., Zadeh, A., & Morency, L. P. (2017). In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 873-883).

Idea
Utterance context influences sentiment. E.g., a movie review of "The Green Hornet": "The Green Hornet did something similar. It engages the audience more, they took a new spin on it, and I just loved it."

Solutions
- Model the order of utterance appearance
- Contextual LSTM
- Fusion of modalities
- Hierarchical framework

https://github.com/senticnet

Q1: Consider the following class-conditioned word probabilities (c0 = non-spam, c1 = spam). For each of the 3 email snippets below, ignoring case, punctuation, and words beyond the known vocabulary, compute the class-conditioned document probabilities for each of the 3 documents (6 in total: P(d1|c0), P(d2|c0), P(d3|c0), P(d1|c1), P(d2|c1), P(d3|c1)) using the Naïve Bayes model.

Sec. 13.2
Recall: Naive Bayes classifier
d = (x1, x2, …, xn)
c_MAP = argmax_{cj ∈ C} P(cj | x1, x2, …, xn)   (the probability of a document d being in class cj)
      = argmax_{cj ∈ C} P(x1, x2, …, xn | cj) P(cj)   (Bayes rule)
      = argmax_{cj ∈ C} P(x1 | cj) P(x2 | cj) ⋯ P(xn | cj) P(cj)   (conditional independence assumption)
Estimates: P̂(cj) = N(C = cj) / N and P̂(xi | cj) = N(Xi = xi, C = cj) / N(C = cj)

Q1: documents
d1: OEM software - throw packing case, leave CD, use electronic manuals. Pay for software only and save 75-90%! Find incredible discounts! See our special offers!
d2: Our Hottest pick this year! Brand new issue Cana Petroleum! VERY tightly held, in a booming business sector, with a huge publicity campaign starting up, Cana Petroleum (CNPM) is set to bring all our readers huge gains. We advise you to get in on this one and ride it to the top!
d3: Dear friend, How is your family? hope all of you are fine, if so splendid. Yaw Osafo-Maafo is my name and former Ghanaian minister of finance. Although I was sacked by President John Kufuor on 28 April 2006 for the fact I signed 29 million book publication contract with Macmillan Education without reference to the Public Procurement Board and without Parliamentary approval.

Q1: Naïve Bayes model
p(dj | ck) = ∏_{i=1}^{t} p(wi | ck)^{f(wi, dj)}
where f(wi, dj) = frequency of word wi in document dj

Hint
d2: Our Hottest pick this year! Brand new issue Cana Petroleum! VERY tightly held, in a booming business sector, with a huge publicity campaign starting up, Cana Petroleum (CNPM) is set to bring all our readers huge gains. We advise you to get in on this one and ride it to the top!
p(dj | ck) = ∏_i p(wi | ck)^{f(wi, dj)}, so:
p(d2 | c0) = p(hottest | c0) × p(brand | c0) × p(new | c0) × p(huge | c0)²
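The computation is mechanical once the word-probability table is known. A sketch, using made-up placeholder values in p_word rather than the exercise's actual table:

```python
import math
from collections import Counter

p_word = {  # placeholder class-conditioned probabilities: word -> (P(w|c0), P(w|c1))
    "hottest": (0.1, 0.98), "brand": (0.3, 0.92), "new": (0.1, 0.9), "huge": (0.2, 0.92),
}

def doc_prob(words, cls):
    """P(d|c) = product of P(w|c)^freq over known-vocabulary words (Naive Bayes)."""
    counts = Counter(w for w in words if w in p_word)
    return math.prod(p_word[w][cls] ** f for w, f in counts.items())

def classify(words, p_spam=0.8):
    """Compare P(c|d) proportional to P(d|c) P(c) for c0 (non-spam) and c1 (spam)."""
    posterior = [doc_prob(words, 0) * (1 - p_spam), doc_prob(words, 1) * p_spam]
    return ("non-spam", "spam")[posterior[1] > posterior[0]], posterior

print(classify(["hottest", "brand", "new", "huge", "huge"]))  # d2's known words
```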

Group discussion

A1
p(d1 | c0) = 0.05 × 0.02 × 0.02 × 0.3 = 6×10⁻⁶
p(d1 | c1) = 0.99 × 0.93 × 0.99 × 0.99 ≈ 0.9
p(d2 | c0) = 0.1 × 0.2² × 0.3 × 0.1 = 1.2×10⁻⁴
p(d2 | c1) = 0.98 × 0.92² × 0.9 ≈ 0.747
p(d3 | c0) = 0.5 × 0.4 = 0.2
p(d3 | c1) = 0.98 × 0.98 ≈ 0.96

Q2: Compute the posterior probabilities of each document in Question 1, given c0 and c1 (6 in total: P(c0|d1), P(c1|d1), P(c0|d2), P(c1|d2), P(c0|d3), P(c1|d3)), assuming that 80% of all email received is spam, i.e., prior class probability P(c1) = 0.8 (from which you can derive P(c0) = 1 - P(c1)), and finally decide whether each document is spam:
p(ck | dj) ∝ p(dj | ck) p(ck)

Group discussion

Q2: P(c0) = 1 - P(c1) = 0.2

A2
P(c0 | d1) ∝ P(d1 | c0) × P(c0) = 6×10⁻⁶ × 0.2 = 1.2×10⁻⁶
P(c1 | d1) ∝ P(d1 | c1) × P(c1) = 0.9 × 0.8 = 0.72   => d1 is spam
P(c0 | d2) ∝ P(d2 | c0) × P(c0) = 1.2×10⁻⁴ × 0.2 = 2.4×10⁻⁵
P(c1 | d2) ∝ P(d2 | c1) × P(c1) = 0.747 × 0.8 = 0.6   => d2 is spam
P(c0 | d3) ∝ P(d3 | c0) × P(c0) = 0.2 × 0.2 = 0.04
P(c1 | d3) ∝ P(d3 | c1) × P(c1) = 0.96 × 0.8 = 0.77   => d3 is spam

Q3 Build a Naïve Bayes classifier using words as features for the training set in Table 2 and use the classifier to classify the test set in the table.

Recall: Bayes probabilities
Prior probability: the probability of expecting class ck before taking any evidence into account.
Likelihood: P(x1, …, xn | ck) factorizes into a product of per-term probabilities only because we make the "naive" conditional independence assumption.
Posterior probability: P(ck | x1, …, xn) ∝ P(x1, …, xn | ck) P(ck)

Recall: Naive Bayes learning
P̂(ck) = (number of documents belonging to class ck) / (total number of documents)
P̂(xi | ck) = (number of occurrences of term xi in docs of class ck + 1) / (number of terms appearing in docs of class ck + |V|), i.e. with add-one smoothing, as used in A3 below.

Recall: MAP classifier
MAP stands for maximum a posteriori: select the class that maximizes the posterior probability. We simply try every class ck.

Group discussion

A3 Prior probability: p(china)=2/4, p(~china)=2/4

A3 (learning)

Doc Id   Terms                   Class
1        Taipei Taiwan           yes
2        Macao Taiwan Shanghai   yes
3        Japan Sapporo           no
4        Sapporo Osaka Taiwan    no

Vocabulary = {Taipei, Taiwan, Macao, Shanghai, Japan, Sapporo, Osaka}, |Vocabulary| = 7
#Terms in docs of class yes: 5; of class no: 5

A3 (learning)
P(Taipei | yes) = (1+1)/(5+7) = 2/12     P(Taipei | no) = (0+1)/(5+7) = 1/12
P(Taiwan | yes) = (2+1)/(5+7) = 3/12     P(Taiwan | no) = (1+1)/(5+7) = 2/12
P(Sapporo | yes) = (0+1)/(5+7) = 1/12    P(Sapporo | no) = (2+1)/(5+7) = 3/12

A3 (classifying)
Doc Id 5, terms: Taiwan Taiwan Sapporo

P(yes | d5) ∝ (2/4) × (3/12)² × (1/12) ≈ 0.0026
P(no | d5) ∝ (2/4) × (2/12)² × (3/12) ≈ 0.0035

Answer: d5 belongs to the class no.
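These numbers are easy to verify in code. A minimal sketch, assuming the training table reconstructed in A3 above; the add-one smoothing mirrors the (count + 1)/(5 + 7) terms:

```python
from collections import Counter

train = [  # (terms, class), as in the A3 learning table
    (["Taipei", "Taiwan"], "yes"),
    (["Macao", "Taiwan", "Shanghai"], "yes"),
    (["Japan", "Sapporo"], "no"),
    (["Sapporo", "Osaka", "Taiwan"], "no"),
]
classes = ("yes", "no")
vocab = {t for terms, _ in train for t in terms}                       # |V| = 7
prior = {c: sum(cls == c for _, cls in train) / len(train) for c in classes}
counts = {c: Counter(t for terms, cls in train if cls == c for t in terms) for c in classes}
total = {c: sum(counts[c].values()) for c in classes}                  # 5 terms per class

def p_term(t, c):
    """Add-one (Laplace) smoothed term probability."""
    return (counts[c][t] + 1) / (total[c] + len(vocab))

def posterior(doc, c):
    p = prior[c]
    for t in doc:
        p *= p_term(t, c)
    return p

d5 = ["Taiwan", "Taiwan", "Sapporo"]
for c in classes:
    print(c, round(posterior(d5, c), 4))  # yes 0.0026, no 0.0035 -> class "no"
```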

Q4: Each of two Web search engines, A and B, generates a large number of pages uniformly at random from its index. 30% of A's pages are present in B's index, while 50% of B's pages are present in A's index. What is the ratio between the number of pages in A's index and the number of pages in B's?

Recall

Group discussion

A4: The random samples estimate the overlap from both sides, so 30% × A = 50% × B, hence A/B = 50/30 = 5/3.