An extensive Survey On Sarcasm Detection Using Various Classifiers


Volume 119 No. 12 2018, 13183-13187
ISSN: 1314-3395 (on-line version)
url: http://www.ijpam.eu

An Extensive Survey on Sarcasm Detection Using Various Classifiers

K.R.Jansi*, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, Tamil Nadu, INDIA, jansi.k@ktr.srmuniv.ac.in
Pranit Rao Sajja*, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, Tamil Nadu, INDIA, pranitraosajja@gmail.com
Priyanshu Goyal*, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, Tamil Nadu, INDIA, priyanshu471997@gmail.com

Abstract

Sarcasm is a sophisticated form of language use that acknowledges a gap between the intended meaning and the literal meaning of the words. Sarcasm detection is the task of predicting sarcasm in text. It is a crucial step in sentiment analysis, given the pervasiveness and difficulty of sarcasm in sentiment-bearing text. Sentiment is the feeling or attitude towards something, and sentiment analysis is the evaluation or study of reviews and comments given by users. This paper presents a survey of various approaches for sarcasm detection and for classification of text. We also discuss the types of features that are extracted from the text and how they are used for classification with various classifiers.

Keywords: Sarcasm detection, support vector machines, neural networks, naive Bayes classifier

I. INTRODUCTION

Sentiment analysis is a natural language processing (NLP) and information extraction task that obtains users' feelings, expressed in positive, negative, or neutral comments, questions, and requests, by analyzing large numbers of documents. In general, sentiment analysis aims to determine the attitude of a speaker or writer with respect to some topic, or the overall tonality of a document. People's sentiments can be categorized as positive, negative, or neutral. Sentiment analysis is crucial in business: negative sentiment found on Twitter can be used to improve a business and to benefit its customers.

The word sarcasm is derived from the Greek word sarkasmos, which means "to tear flesh" or "to gnash the teeth". In sarcasm, the literal meaning differs from what the speaker intends to say. Sarcasm can also be defined as a contrast between a positive sentiment and a negative situation [1], and vice versa. It is important to detect sarcasm in opinion mining and sentiment analysis. Sarcasm is an important aspect of social media data analysis because of the absence of face-to-face contact. As social media gains popularity, the problem of sarcasm detection will become even more challenging. This style of expression is common in microblogs and social media, which makes it very difficult for annotators to manually analyze each sentence; hence the need to develop a tool for sarcasm detection. Sarcasm detection is a part of NLP that deals with humor identification in text. The different approaches to detecting sarcasm and the indicators present in comments are shown in Fig. 2.

Fig. 1. Basic features extracted for Sarcasm Detection

II. APPROACHES FOR SARCASM DETECTION

Sentiment classification can be done in two ways, namely machine learning approaches and lexicon-based approaches. Machine learning yields the highest accuracy, while semantic orientation gives better generality. Machine learning can be divided into supervised and unsupervised approaches.
A supervised approach requires two sets of annotated data, one for training and the other for testing. The classifiers used for supervised learning include Decision Tree, Support Vector Machine, Neural Network, Naive Bayes, and Maximum Entropy. The other approach, called lexicon-based, comprises two methods, namely dictionary-based and corpus-based. The dictionary-based approach makes use of an existing dictionary, which is a collection of opinion words along with their positive (+ve) or negative (-ve) sentiment. Dictionaries can be created with or without an ontology. An ontology is defined as an explicit, machine-readable specification of a shared conceptualization [2]. An ontology can be used for new words that are not found in the labeled corpus. The corpus-based technique relies on the probability that an opinion word occurs together with a positive or negative set of words, estimated by searching very large amounts of text, for example through Google search or AltaVista search.

Fig. 2. Approaches and Components used in Sarcasm Detection

A. Supervised Approach

Supervised techniques are implemented by building a classifier. The classifier is trained on examples, which can be manually labeled. The most commonly used supervised algorithms are Support Vector Machines (SVM), the Naive Bayes classifier, and Maximum Entropy [3].

1) Support Vector Machine: A support vector machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression, although it is mostly used in classification problems. Each data item is plotted as a point in n-dimensional space, where n is the number of features, with the value of each feature being the value of a particular coordinate.

2) Naive Bayes: Naive Bayes (NB) is mainly used for text categorization. It is based on Bayes' theorem with the naive assumption of conditional independence between every pair of features given the class. Naive Bayes is often used to predict the probability of sentiments in text.

3) Maximum Entropy: Maximum Entropy is a probabilistic classifier that belongs to the class of exponential models. It supports various natural language tasks, such as language modeling, part-of-speech tagging, and text categorization.

4) Neural Networks: A neural network is a computational model. An artificial neural network (ANN) works in a way loosely modeled on how the human brain processes information.
To process information, an ANN combines a large number of interconnected processing units to produce meaningful results.

Pang et al. [4] used several supervised methods, namely Naive Bayes, Maximum Entropy, and Support Vector Machine, for binary sentiment classification of movie reviews. For their experiments, the authors collected movie reviews from imdb.com. They experimented with different kinds of feature engineering, and SVM gave the highest accuracy, 82.9%, with unigram features.

Dang et al. [5] classified sentiments using SVM with different feature selection methods. They ran experiments on two corpora: one with 305 positive and 307 negative reviews of digital cameras, and the multi-domain dataset from Blitzer et al. [6]. SVM was trained on collections of three feature sets based on domain-independent, domain-dependent, and sentiment features. Information gain was applied to reduce the number of features for different combinations of features. The reduced feature set performed better on the multi-domain dataset than on the digital camera dataset and obtained an accuracy of 84.15%.

Zhang et al. [7] classified sentiment using machine learning for restaurant reviews written in standard Cantonese. The authors studied the effects of feature representation and feature size on classification performance. They ran an experiment on 1,500 positive and 1,500 negative reviews, using feature representations such as unigram, unigram frequency, bigram, bigram frequency, trigram, and trigram frequency, with feature counts ranging from 50 to 1,600. The highest accuracy reported was 95.67%, using the Naive Bayes algorithm with 900 to 1,100 features.

A dynamic artificial neural network (DAN2) and a support vector machine have also been used as multi-class classifiers.
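The supervised setup summarized above, unigram features fed to a probabilistic classifier, can be illustrated with a small multinomial Naive Bayes model. This is a minimal sketch and not the implementation used in any of the surveyed papers: the toy training data, the whitespace tokenizer, the add-one smoothing, and the names `train_nb` and `predict_nb` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for unigram features.
    return text.lower().split()


def train_nb(examples):
    """Collect class and per-class word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest posterior log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # skip words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy reviews, invented for this example.
TRAIN = [
    ("great movie loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring plot awful acting", "neg"),
]

model = train_nb(TRAIN)
print(predict_nb(model, "loved the great acting"))
print(predict_nb(model, "awful boring movie"))
```

The independence assumption shows up in the inner loop: each token contributes its own log-probability term, and the terms are simply added.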
DAN2 was designed to contain multiple hidden layers with four hidden nodes per layer. DAN2 outperformed the Support Vector Machine, with accuracies of 71.3% for strongly positive, 66.7% for mildly positive, 89.9% for mildly negative, and 95.1% for strongly negative.

B. Unsupervised Approach

This approach concerns the analysis of unclassified cases. The system is not provided with any training examples. Unsupervised learning is a method for discovering patterns in data with no information about the outcome of the data. The strength of unsupervised learning lies in finding relationships in the data and showing how the data is organized (Michie et al. [8]).

1) Lexicon based approaches: The lexicon-based approach relies on a sentiment lexicon, a collection of known and precompiled sentiment terms. To extract two-word phrases from reviews, Turney [9] used a set of part-of-speech tag patterns. To determine the semantic orientation of the phrases, the author used PMI-IR (Pointwise Mutual Information and Information Retrieval), issuing queries to a search engine with two reference words, "excellent" and "poor", for positive and negative orientation respectively. For the experiments, 410 reviews from various domains were collected from the website Epinions.com. The highest accuracy, 84%, was achieved on the smallest dataset, which contains only 75 reviews of automobiles, and the lowest accuracy, 65.83%, was achieved for movie reviews.

Subject favorability determination was carried out by Nasukawa and Yi [10], who created a sentiment lexicon of 3,513 sentiment terms. Two factors were considered: syntactic dependencies among the phrases and subject term modifiers. The authors ran experiments on two sets: the first is a multi-domain corpus containing 175 instances of subject terms within context, and the second consists of 2,000 cases from camera reviews. The proposed work was also evaluated on 552,586 web pages and 230,079 news articles, extracting sentiments for 13 defined subject terms, and obtained accuracies of 86% and 88% respectively. An accuracy of 91% was achieved when sentiment extraction was run on 476,126 web pages from the medical domain.

2) Corpus and Dictionary Based Approach: The dictionary-based approach works like an ordinary dictionary lookup: the given opinion words are looked up first, and then their synonyms and antonyms are searched. A small set of opinion words is collected manually. The list is then expanded by searching well-known resources such as WordNet. The polarity strength of each word is also recorded in the dictionary. The corpus-based approach is used to find opinion words with context-specific orientations, which depend on syntactic patterns.
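The dictionary-based expansion described above can be sketched as a breadth-first walk over synonym and antonym links, starting from a few manually labeled seed words. This is a minimal sketch under stated assumptions: the tiny `SYNONYMS` and `ANTONYMS` tables are hypothetical stand-ins for a lexical resource such as WordNet, polarity is reduced to +1 or -1 rather than a graded strength, and the function names `expand_lexicon` and `score` are invented for illustration.

```python
from collections import deque

# Hypothetical toy thesaurus; a real system would query a resource like WordNet.
SYNONYMS = {
    "good": ["great", "fine"],
    "great": ["excellent"],
    "bad": ["poor", "awful"],
}
ANTONYMS = {
    "good": ["bad"],
    "excellent": ["terrible"],
}


def expand_lexicon(seeds):
    """Propagate polarity from seed words: synonyms keep it, antonyms flip it."""
    lexicon = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        polarity = lexicon[word]
        neighbours = [(s, polarity) for s in SYNONYMS.get(word, [])]
        neighbours += [(a, -polarity) for a in ANTONYMS.get(word, [])]
        for other, other_polarity in neighbours:
            if other not in lexicon:  # the first label assigned to a word wins
                lexicon[other] = other_polarity
                queue.append(other)
    return lexicon


def score(text, lexicon):
    """Sum the polarities of known words; the sign is the predicted sentiment."""
    return sum(lexicon.get(tok, 0) for tok in text.lower().split())


lexicon = expand_lexicon({"good": 1})
print(sorted(lexicon.items()))
print(score("great acting but terrible plot and awful pacing", lexicon))
```

Starting from the single seed "good", the walk labels eight words in this toy thesaurus. A sarcastic sentence such as "great, another delay" would still score as positive, which is one reason a purely lexicon-based method struggles with sarcasm.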
Author Name | Approach Used | Dataset Used | End results
Raghavan V M, Mohana Kumar [11] | Code POS tagging, hashtag processor | Facebook posts | 87%
Pushpak Bhattacharyya, Mark Carman [12] | Unigram, word embedding | Google search | F-score: 72%
Meishan Zhang, Yue Zhang and Guohong Fu [13] | Deep neural net | Twitter | Imbalanced: 86%, balanced: 80%
Tamal Ghosh [14] | KNN, LDA | Dictionary | KNN: 60%, LDA: 62%
Erik Forslid, Niklas Wikén [15] | SVM, decision tree, naive Bayes | Amazon, Twitter | Amazon: 87%, Twitter data: 71%
Chun-Che Peng, Mohammad Lakis, Jan Wei Pan [16] | Naive Bayes, one-class SVM | Twitter | Naive Bayes: 62.02%, one-class SVM: 50%
Tomas Ptacek, Ivan Habernal, Jun Hong [17] | Maximum Entropy (MaxEnt) and SVM | Twitter | F1-score: 0.947

III. FINDINGS OF THE STUDY AND DISCUSSION

1) Support Vector Machine: Existing SVM techniques applied in various works lack transparency in their results. If the dimensionality of the dataset is high, SVM might not be able to produce an accurate result. Hinge loss, the loss function used to train the classifier, is often scattered, which affects the accuracy of the results.

2) Decision Trees: A decision tree is simple and easy to use, but it has inherent drawbacks; for example, it works only with known groups. Even small changes in the input data can cause large changes in the decision tree and often lead to redrawing the tree.

3) Naive Bayes: A subtle issue with Naive Bayes is that it works with a small amount of training data, while deep learning works efficiently with large data sets. A rule-based algorithm is very labor-intensive because the rule base is large; consequently, decision making is very complex. When new data is added, new rules must be framed.
4) Lexicon Based: This approach gives better results if it works with a small dictionary and does not focus on a large database.

From the above findings, this study concludes that most of the existing approaches work on small datasets and small dictionaries. Sarcasm detection can be effective only if it is performed on a large set of data, and existing approaches cannot detect sarcasm efficiently because data is collected in large amounts from web-based social media such as Twitter and Facebook. This study therefore concludes that deep learning is a better approach for sarcasm detection.

IV. IMPLEMENTATION FINDINGS

After surveying the various classifiers, we tried to implement two of the supervised approaches, namely Support Vector Machine and Neural Networks. The findings are given in Fig. 3 and Fig. 4. Judging by precision, accuracy, F-values, and recall, the Support Vector Machine classifier outperforms neural-network-based classification on these measures. Including unigram features in a neural network was not possible due to the high dimensionality of the feature vectors involved. This was, however, possible with the LibSVM library.

Fig. 3. Table showing SVM parameters and their obtained results

Fig. 4. Table showing NN parameters and their obtained results

V. CONCLUSION

Sarcasm detection in written text is a nontrivial, challenging task because of the absence of intonation and facial expressions. Many approaches exist for performing sentiment analysis with sarcasm detection. The number of Twitter users is increasing day by day, the volume of comments shared by people is large, and a huge data set is generated. Many techniques have been developed for sentiment analysis, but the problem of sarcasm detection is still not solved. This paper reviews the methodologies used to detect sarcasm in Twitter social media data and analyzes various classifiers, such as Support Vector Machine, Naive Bayes, and lexicon-based methods, together with their accuracy rates. Sarcasm can be detected efficiently only if existing approaches can handle large data sets, but most existing approaches can handle only small datasets. Deep learning is therefore considered an effective approach for detecting sarcasm on large datasets.

REFERENCES

[1] S. K. Bharti, R. Pradhan, K. S. Babu, and S. K. Jena, "Sarcasm analysis on twitter data using machine learning approaches," in Trends in Social Network Analysis. Springer, 2017, pp. 51-76.
[2] M. Simmons, W. M. Nelson, S. Wu, and C. G. Hayes, "Evaluation of the protective efficacy of a recombinant dengue envelope b domain fusion protein against dengue 2 virus infection in mice," The American Journal of Tropical Medicine and Hygiene, vol. 58, no. 5, pp. 655-662, 1998.
[3] S. Bharti, B. Vachha, R. Pradhan, K. Babu, and S. Jena, "Sarcastic sentiment detection in tweets streamed in real time: a big data approach," Digital Communications and Networks, vol. 2, no. 3, pp. 108-121, 2016.
[4] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up?: sentiment classification using machine learning techniques," in Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Volume 10. Association for Computational Linguistics, 2002, pp. 79-86.
[5] Y. Dang, Y. Zhang, and H. Chen, "A lexicon-enhanced method for sentiment classification: An experiment on online product reviews," IEEE Intelligent Systems, vol. 25, no. 4, pp. 46-53, 2010.
[6] J. Blitzer, M. Dredze, and F. Pereira, "Biographies, bollywood, boomboxes and blenders: Domain adaptation for sentiment classification," in Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, 2007, pp. 440-447.
[7] Z. Zhang, Q. Ye, Z. Zhang, and Y. Li, "Sentiment classification of internet restaurant reviews written in cantonese," Expert Systems with Applications, vol. 38, no. 6, pp. 7674-7682, 2011.
[8] R. Henery, D. Michie, D. Spiegelhalter, and C. Taylor, Machine learning, neural and statistical classification, 1994.
[9] P. D. Turney, "Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews," in Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 2002, pp. 417-424.
[10] T. Nasukawa and J. Yi, "Sentiment analysis: Capturing favorability using natural language processing," in Proceedings of the 2nd International Conference on Knowledge Capture. ACM, 2003, pp. 70-77.
[11] R. Sridhar et al., "Emotion and sarcasm identification of posts from facebook data using a hybrid approach," ICTACT Journal on Soft Computing, vol. 7, no. 2, 2017.
[12] A. Joshi, P. Bhattacharyya, and M. J. Carman, "Automatic sarcasm detection: A survey," ACM Computing Surveys (CSUR), vol. 50, no. 5, p. 73, 2017.
[13] M. Zhang, Y. Zhang, and G. Fu, "Tweet sarcasm detection using deep neural network," in Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 2016, pp. 2449-2460.
[14] T. D. Chaudhuri and I. Ghosh, "Artificial neural network and time series modeling based approach to forecasting the exchange rate in a multivariate framework," arXiv preprint arXiv:1607.02093, 2016.
[15] E. Forslid and N. Wikén, "Automatic irony-and sarcasm detection in social media," 2015.
[16] C.-C. Peng, M. Lakis, and J. W. Pan, "Detecting sarcasm in text."
[17] T. Ptáček, I. Habernal, and J. Hong, "Sarcasm detection on czech and english twitter," in Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, 2014, pp. 213-223.

ACKNOWLEDGMENT

This work was supported by the Department of Computer Science and Engineering, School of Computing, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Chennai, Tamil Nadu, INDIA.
