arXiv:1805.01112v1 [cs.CL] 3 May 2018


Binarizer at SemEval-2018 Task 3: Parsing dependency and deep learning for irony detection

Nishant Nikhil
IIT Kharagpur, Kharagpur, India
nishantnikhil@iitkgp.ac.in

Muktabh Mayank Srivastava
ParallelDots, Inc.
muktabh@paralleldots.com

arXiv:1805.01112v1 [cs.CL] 3 May 2018

Abstract

In this paper, we describe the system submitted for SemEval-2018 Task 3 (Irony detection in English tweets), Subtask A, by the team Binarizer. Irony detection is a key task for many natural language processing applications. Our method treats ironic tweets as consisting of smaller parts that carry different emotions. We break tweets down into separate phrases using a dependency parser. We then embed those phrases using an LSTM-based neural network model which is pre-trained to predict emoticons for tweets. Finally, we train a fully connected network to perform the classification.

1 Introduction

The micro-blogging site Twitter has created an abundance of data about opinions and sentiments regarding almost every aspect of daily life. A deeper study of public opinion can be obtained by applying natural language processing techniques to this data. However, the performance of these NLP models is detrimentally affected by irony (Pozzi et al., 2016). As per the Oxford English Dictionary, irony is the expression of one's meaning by using language that normally signifies the opposite, typically for humorous or emphatic effect. This deviation between what is said and what is intended makes irony hard to detect. Being a platform where users are free to communicate and express themselves colloquially, Twitter generates a considerable amount of data that contains irony. Detecting it would allow more accurate sentiment analysis of these tweets.

Prior work on irony detection includes the use of unigrams and emoticons (González-Ibánez et al., 2011; Carvalho et al., 2009; Barbieri et al., 2014). Maynard and Greenwood (2014) describe an unsupervised pattern mining approach in which the sentiment of the hashtag in the tweet is proposed as a key indicator of sarcasm: if the sentiment of the tweet does not match the sentiment of the hashtag, the tweet is predicted to be sarcastic. Riloff et al. (2013) illustrate a semi-supervised approach in which rule-based classifiers look for negative situation phrases and positive verbs in a sentence. Tsur et al. (2010) build pattern-based features that detect the presence of discriminative patterns extracted from a large sarcasm-labelled corpus. N-gram-based approaches have also been used together with sentiment features (Davidov et al., 2010; Ptáček et al., 2014; Joshi et al., 2015). Joshi et al. (2017) use similarity between word embeddings as a feature, and Poria et al. (2016) use convolutional neural networks to extract sentiment, emotion and personality features for sarcasm detection.

SemEval-2018 Task 3 (the 12th workshop on semantic evaluation) specifies two subtasks related to irony detection in English tweets (Van Hee et al., 2018). In Subtask A the goal was to train a binary classifier that detects whether a given tweet is ironic or not. Subtask B was a multi-class classification problem in which four labels were specified to describe the nature of the irony (verbal irony by means of a polarity contrast, situational irony, other verbal irony, and non-ironic). The goal was to assign one of the four labels to each tweet.

We propose a new method which considers ironic tweets to be collections of smaller parts containing different emotions. We break tweets down into these collections using a dependency parser and embed them using DeepMoji (Felbo et al., 2017), which is pre-trained to predict emoticons for tweets. Finally, we train a classifier to detect irony.

The paper is organized as follows: we discuss our method in Section 2. Section 3 contains the details of the experiments and the training data. In Section 4 we discuss the results, and Section 5 concludes the paper with closing remarks.

2 Method

In order to identify the chunks that carry different emotions in an ironic tweet, we split each tweet into phrases using a dependency parser. We use TweeboParser (Kong et al., 2014), a dependency parser for English tweets. The parser is trained on a subset of Tweebank, a labelled corpus of 929 tweets (12,318 tokens) drawn from the POS-tagged tweet corpus of Owoputi et al. (2013). TweeboParser predicts the syntactic structure of the tweet, represented by unlabelled dependencies. Tweets contain multiple sentences or fragments, called utterances, each with its own syntactic root disconnected from the others. Since a tweet often contains more than one utterance, the output of TweeboParser will often be a multi-rooted graph over the tweet. Also, many elements in tweets have no syntactic function. These include, in many cases, hashtags, URLs, and emoticons. TweeboParser attempts to exclude these tokens from the parse tree.

For our purpose, we group the words that descend from the same root into a phrase. Multiple roots therefore create multiple phrases. As we can see from Figure 1, these phrases can convey the different sentiments attached to the different subjects of the tweet.

Figure 1: Parser results

After extracting a set of phrases for the tweet, we embed the phrases into vectors. We use the DeepMoji model (Felbo et al., 2017), which is trained on 1.2 billion tweets with emojis to understand how language is used to express emotions. It encodes the provided phrase into a 2,304-dimensional feature vector. Under the hood, the DeepMoji model projects each word into a 256-dimensional vector space, followed by a hyperbolic tangent activation function. After this, two bidirectional LSTMs with 1,024 hidden units each are used to capture the context of each word. Finally, the model uses skip connections from each layer to an attention layer, and hence the attention layer outputs a 2,304-dimensional (256 + 1,024 + 1,024) vector.

Figure 2: Neural network architecture
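The phrase-extraction step described earlier in this section (grouping the words that share a root) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the parse is already available as a list of 1-based head indices in the CoNLL style, with 0 marking a root and -1 marking a token that the parser excluded from the tree; the function name `split_into_phrases` and the example tweet with its head indices are hypothetical.

```python
def split_into_phrases(tokens, heads):
    """Group tokens that descend from the same syntactic root into phrases.

    tokens: list of token strings, in tweet order.
    heads:  list of ints, one per token (assumed encoding):
            k > 0 -> the head is token number k (1-based),
            0     -> the token is a root,
            -1    -> the token is excluded from the parse tree.
    Returns one phrase string per root, ordered by root position.
    """
    def find_root(i):
        # Follow head links upward until a root (0) or an excluded token (-1).
        seen = set()
        while heads[i] > 0:
            if i in seen:
                raise ValueError("cycle in head indices")
            seen.add(i)
            i = heads[i] - 1
        return i if heads[i] == 0 else None

    phrases = {}
    for i, token in enumerate(tokens):
        root = find_root(i)
        if root is not None:  # skip tokens outside the parse tree
            phrases.setdefault(root, []).append(token)

    # Tokens were appended in tweet order, so each phrase keeps its word order.
    return [" ".join(words) for _, words in sorted(phrases.items())]


# Hypothetical two-utterance tweet with a trailing hashtag.
tokens = ["I", "love", "waiting", "forever", ".", "Best", "day", "ever", "#not"]
heads = [2, 0, 2, 3, -1, 7, 0, 7, -1]
print(split_into_phrases(tokens, heads))
```

With two roots ("love" and "day"), the sketch yields two phrases, and the punctuation mark and hashtag marked as excluded are dropped. Each phrase would then be embedded separately.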
In DeepMoji, this 2,304-dimensional output is connected to a softmax layer for classification. We did not use the final softmax layer but took the 2,304-dimensional vector for each phrase. As the model was trained to predict emoticons, this feature vector contains information about the semantic and sentiment content of the phrases.

To make the predictions we need to account for the sentiment behind every utterance. To this end, we concatenate these vectors and pass the resulting concatenated vector through a fully connected network, as described in Figure 2. Tweets can have a varying number of roots, which implies that they split into a varying number of phrases. Our model considers a maximum of nine roots. A tweet with more than nine roots is truncated to nine. On the other hand, a tweet with fewer than nine roots is zero-padded. The complete process is shown as a flowchart in Figure 3.

Figure 3: Process flowchart

3 Experiments

For Subtask A, we were provided with a dataset consisting of tweets along with a binary class label indicating whether the tweet is ironic (0 for non-ironic tweets and 1 for ironic tweets). The data was collected from the Twitter API by querying tweets using the hashtags #irony, #sarcasm and #not, with subsequent manual annotation to remove noise. 3,833 tweets were provided for training and 784 tweets for testing. The evaluation used accuracy, precision, recall and F1 score:

\[ \text{Accuracy} = \frac{\text{number of correctly predicted instances}}{\text{total number of instances}} \]

\[ \text{Precision} = \frac{\text{number of correctly predicted labels}}{\text{number of predicted labels}} \]

\[ \text{Recall} = \frac{\text{number of correctly predicted labels}}{\text{number of labels in the gold standard}} \]

\[ F_1\ \text{score} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \]

We used the pipeline described in Figure 3. The final step of the process used a fully connected (FC) neural network with four layers. The input layer of the FC network has a dimension of 20,736 (2,304 × 9), the second layer has a dimension of 9,216 (2,304 × 4), the third layer has a dimension of 2,304 and the fourth layer has 256 dimensions. The final layer has 2 dimensions, one for each class. This is depicted in Figure 2. We used the hyperbolic tangent activation function in all of the layers, and stochastic gradient descent with a learning rate of 0.01 and a momentum of 0.5.

Two models were then devised. The difference between these models lies in the input supplied to the FC network. In the first model, this input is the concatenation of the vectors obtained by embedding the phrases. In the second model, the input is the concatenation of the first model's input with a 2,304-dimensional vector representing the embedding of the tweet as a whole.

The results of the experiments on these models are shown in Table 1. α is the first model. The best F1 score for this model was achieved after four epochs, as shown in Table 1. β1 and β2 are the second model trained for five and four epochs respectively.

Method        Accuracy  Precision  Recall  F1 score
Winning team  0.7347    0.6304     0.8006  0.7054
β1            0.6659    0.5527     0.6471  0.5962
β2            0.6390    0.5198     0.6941  0.5944
α             0.6951    0.6197     0.5176  0.5641

Table 1: Results, SemEval Task 3A

4 Results and Discussions

We participated only in Subtask A of Task 3, as the team Binarizer.
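As a rough illustration of the fixed-size network input described in Sections 2 and 3, the sketch below pads or truncates the phrase vectors to nine and concatenates them. It is a sketch under stated assumptions rather than the submitted implementation: the paper does not say which phrases are dropped on truncation or where the zero vectors go, so keeping the first nine phrases, padding at the end, and appending the whole-tweet vector last are all assumptions, and the name `build_network_input` is hypothetical.

```python
PHRASE_DIM = 2304   # size of one DeepMoji feature vector, as stated in the paper
MAX_PHRASES = 9     # maximum number of roots (phrases) the model considers


def build_network_input(phrase_vectors, tweet_vector=None,
                        max_phrases=MAX_PHRASES, dim=PHRASE_DIM):
    """Flatten a variable number of phrase vectors into one fixed-size input.

    phrase_vectors: list of lists of floats, each of length `dim`.
    tweet_vector:   optional whole-tweet embedding of length `dim`
                    (None for the first model, a vector for the second).
    """
    for vec in phrase_vectors:
        if len(vec) != dim:
            raise ValueError("every phrase vector must have length %d" % dim)

    # Assumption: on truncation, keep the first `max_phrases` phrases.
    kept = [list(vec) for vec in phrase_vectors[:max_phrases]]
    # Assumption: zero vectors are appended after the real phrases.
    kept += [[0.0] * dim for _ in range(max_phrases - len(kept))]

    flat = [value for vec in kept for value in vec]

    if tweet_vector is not None:
        if len(tweet_vector) != dim:
            raise ValueError("tweet vector must have length %d" % dim)
        # Assumption: the whole-tweet embedding goes after the phrase vectors.
        flat.extend(tweet_vector)
    return flat


# Three placeholder phrase vectors stand in for real DeepMoji outputs.
three_phrases = [[1.0] * PHRASE_DIM for _ in range(3)]
alpha_input = build_network_input(three_phrases)
beta_input = build_network_input(three_phrases, tweet_vector=[2.0] * PHRASE_DIM)
print(len(alpha_input), len(beta_input))
```

For the first model this gives the 20,736 (2,304 × 9) inputs stated in Section 3. Adding the whole-tweet vector for the second model would give 20,736 + 2,304 = 23,040 inputs, a figure the paper does not state explicitly.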
We came ninth by accuracy and seventeenth by F1 score among the forty-three participating systems. Due to a glitch on our side during submission, the results are based on 446 out of 784 instances in the test data. The models perform better than the baseline system as per the competition leaderboard. This reinforces the notion that separate phrases in a tweet carry information required for irony detection. α has greater precision whereas β has higher recall. An application that prioritizes catching as many ironic tweets as possible would therefore profit more from β. This indicates that the sentiment information of the context provided by the whole tweet is also important.

5 Conclusion and Future works

We have shown how using the sentiments of different segments of tweets can enable irony detection. From the results of our experiments, we conclude that the segments carry sufficient sentiment information for the identification of irony. In future research, we aim to improve the algorithm for extracting these chunks by replacing the dependency parser. More experimentation can also be performed on the last part of the pipeline. As the phrases from the tweets are sequences themselves, we can apply sequence modelling with LSTMs or CNNs.

References

Francesco Barbieri, Horacio Saggion, and Francesco Ronzano. 2014. Modelling sarcasm in twitter, a novel approach. In Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 50-58.

Paula Carvalho, Luís Sarmento, Mário J Silva, and Eugénio De Oliveira. 2009. Clues for detecting irony in user-generated contents: oh...!! it's so easy ;-). In Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion, pages 53-56. ACM.

Dmitry Davidov, Oren Tsur, and Ari Rappoport. 2010. Semi-supervised recognition of sarcastic sentences in twitter and amazon. In Proceedings of the fourteenth conference on computational natural language learning, pages 107-116. Association for Computational Linguistics.

Bjarke Felbo, Alan Mislove, Anders Søgaard, Iyad Rahwan, and Sune Lehmann. 2017. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. In Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1615-1625.

Roberto González-Ibánez, Smaranda Muresan, and Nina Wacholder. 2011. Identifying sarcasm in twitter: a closer look. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers-Volume 2, pages 581-586. Association for Computational Linguistics.

Aditya Joshi, Pushpak Bhattacharyya, and Mark J. Carman. 2017. Automatic sarcasm detection: A survey. ACM Computing Surveys, 50(5):73:1-73:22.

Aditya Joshi, Vinita Sharma, and Pushpak Bhattacharyya. 2015. Harnessing context incongruity for sarcasm detection. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), volume 2, pages 757-762.

Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, and Noah A Smith. 2014. A dependency parser for tweets. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP), pages 1001-1012, Doha, Qatar.

Diana Maynard and Mark Greenwood. 2014. Who cares about Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 4238-4243, Reykjavik, Iceland. European Language Resources Association.

Olutobi Owoputi, Brendan O'Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, and Noah A Smith. 2013. Improved part-of-speech tagging for online conversational text with word clusters. In The 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2013), pages 380-390.

Soujanya Poria, Erik Cambria, and Alexander Gelbukh. 2016. Aspect extraction for opinion mining with a deep convolutional neural network. Knowledge-Based Systems, 108:42-49.

Federico Alberto Pozzi, Elisabetta Fersini, Enza Messina, and Bing Liu. 2016. Sentiment analysis in social networks. Morgan Kaufmann Publishers Inc., San Francisco, CA.

Tomáš Ptáček, Ivan Habernal, and Jun Hong. 2014. Sarcasm detection on czech and english twitter. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 213-223.

Ellen Riloff, Ashequl Qadir, Prafulla Surve, Lalindra De Silva, Nathan Gilbert, and Ruihong Huang. 2013. Sarcasm as contrast between a positive sentiment and negative situation. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 704-714.

Oren Tsur, Dmitry Davidov, and Ari Rappoport. 2010. Icwsm-a great catchy name: Semi-supervised recognition of sarcastic sentences in online product reviews. In Proceedings of International AAAI Conference on Web and Social Media, pages 162-169.

Cynthia Van Hee, Els Lefever, and Véronique Hoste. 2018. Semeval-2018 task 3: Irony detection in english tweets. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018), New Orleans, LA, USA.