Introduction to Sentiment Analysis Wiltrud Kessler Institut für Maschinelle Sprachverarbeitung Universität Stuttgart 26. April 2011
Outline Organisational Motivation What is Sentiment? Why is it Difficult? Resources Wiltrud Kessler Introduction to Sentiment Analysis 2 / 42
Schedule
Tue 26.4., 9:45, M12.11: Introduction to sentiment analysis
Fri 29.4., 11:15, M12.11: Introduction to sentiment analysis
Tue 3.5., 9:45, M12.11: Assignment of topics and scheduling of the presentation dates.
Specific literature for the individual seminar topics is listed on the handout.
These slides are based mainly on [PL08] and [Liu10].

Motivation
What other people think has always been part of decision making:
  Do you think I should buy this camera?
  Why do you vote for X?
  Do you know a good dentist?
Before the spread of the internet, people asked their friends. Now, with Web 2.0, the internet contains a huge amount of opinions in forums, blogs, review sites, ... (user-generated content).
But it is not always easy to find and analyse the opinions needed for decision making.

Sentiment Analysis
Wilson, Wiebe and Hoffmann [WWH05]: Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations.
Liu [Liu10]: Sentiment analysis or opinion mining is the computational study of opinions, sentiments and emotions expressed in text.
The literature uses a wide variety of terms for the task: sentiment analysis, opinion mining, subjectivity analysis, emotional polarity computation, ...
The subject of the analysis is denoted as: sentiment polarity, sentiment orientation, polarity of opinion, semantic orientation, lexical valence, attitude, ...

Use Cases
Search for opinions about an object or opinions of a specific person ("What do Europeans think of Obama?").
Find how opinions about an object change over time.
Summarize opinions, possibly as several summaries from different viewpoints ("The FDP thinks the proposal is simplistic, while the SPD...").
Find opinions of users as reflected in recommender systems ("other people who liked this item also liked...").
Detect flames (mail, forums, blog comments, ...).
Enhance our understanding of subjectivity to improve HCI (computer emotions, humor, ...).

Commercial Applications
Customer satisfaction: Companies want to know the opinions of their customers about their products, especially new products.
Market survey: Companies want to know how their competitors and the competitors' products are doing.
Reputation management/brand perception: Track and influence the perception of the company and its products, don't sell bad products.
Ad placement/marketing campaigns: Identify places where people talk about the company, identify opinion leaders.
Trend prediction: Will people continue to want a specific type of product or prefer a different one?

Existing Tools ("Social Media Monitoring/Analysis")
Radian 6
Social Mention
Overtone OpenMic
Microsoft Dynamics Social Networking Accelerator
SAS Social Media Analytics
Lithium Social Media Monitoring
RightNow Cloud Monitor
...
[tweetfeel.com] [tweetsentiments.com]

Definitions of Sentiment (1)
WordNet: A personal belief or judgement that is not founded on proof or certainty.
Wikipedia: An opinion is a subjective statement or thought about an issue or topic, and is the result of emotion or interpretation of facts.

Definitions of Sentiment (2)
Merriam-Webster's Online Dictionary: Opinion, view, belief, conviction, persuasion [and] sentiment mean a judgment one holds as true.
  Opinion implies a conclusion thought out yet open to dispute [...].
  View suggests a subjective opinion [...].
  Belief implies often deliberate acceptance and intellectual assent [...].
  Conviction applies to a firmly and seriously held belief [...].
  Persuasion suggests a belief grounded on assurance (as by evidence) of its truth [...].
  Sentiment suggests a settled opinion reflective of one's feelings [...].

Definitions of Sentiment (3)
Hatzivassiloglou and McKeown [HM97]: The semantic orientation or polarity of a word indicates the direction the word deviates from the norm for its semantic group or lexical field. It also constrains the word's usage in the language.
Turney and Littman [TL03]: The evaluative character of a word is called its semantic orientation. Positive semantic orientation indicates praise [...] and negative semantic orientation indicates criticism [...]. Semantic orientation varies in both direction (positive or negative) and degree (mild to strong).

Definitions of Sentiment (4)
Liu [Liu10]: An opinion on a feature $f$ is a positive or negative view, attitude, emotion or appraisal on $f$ from an opinion holder.
An opinion is a quintuple $(o_j, f_{jk}, oo_{ijkl}, h_i, t_l)$, where
  $o_j$ is a target object (a product, person, event, organization, or topic).
  $f_{jk}$ is a feature (aspect) of the object $o_j$ (a component, part or attribute of an object).
  $oo_{ijkl}$ is the sentiment value of the opinion of the opinion holder $h_i$ on feature $f_{jk}$ of object $o_j$ at time $t_l$ (positive, negative, neutral).
  $h_i$ is an opinion holder (the person or organization that expresses the opinion).
  $t_l$ is the time when the opinion is expressed.

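To make the quintuple concrete, it can be written down as a small record type. This is only an illustrative sketch: the class and field names (`Opinion`, `Polarity`, `target`, ...) and the example values are assumptions chosen here, not part of [Liu10].

```python
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    """Sentiment values named on the slide (positive, negative, neutral)."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple (o_j, f_jk, oo_ijkl, h_i, t_l)."""
    target: str          # o_j: the object the opinion is about
    feature: str         # f_jk: feature (aspect) of the target object
    sentiment: Polarity  # oo_ijkl: sentiment value of the opinion
    holder: str          # h_i: person or organization expressing the opinion
    time: str            # t_l: when the opinion is expressed


# Hypothetical example values, not taken from any corpus.
example = Opinion(
    target="camera X",
    feature="picture quality",
    sentiment=Polarity.POSITIVE,
    holder="reviewer A",
    time="2011-04-26",
)
```

Each field corresponds to one extraction subproblem, so a full system would have to fill all five slots from text.
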
The following subproblems are to be solved
Element of the quintuple     Subproblem
Target object $o_j$          Named Entity Extraction
Feature $f_{jk}$             Information Extraction
Sentiment $oo_{ijkl}$        Sentiment Classification
Opinion holder $h_i$         Information/Data Extraction
Time $t_l$                   Data Extraction
All of these problems are far from solved and involve coreference resolution, word sense disambiguation, ...
We will focus on sentiment classification.

Subjectivity and Objectivity
Definition [Liu10]: An objective sentence expresses some factual information about the world, while a subjective sentence expresses some personal feelings or beliefs.
  Objective: I returned the phone yesterday.
  Subjective: The voice on my phone was not so clear.
Not every subjective sentence contains an opinion: I wanted a phone with good voice quality.
Objective sentences can implicitly indicate opinions: The earphone broke in two days.
Telling subjective from objective sentences is the task of subjectivity analysis.
We will consider opinionated sentences for classification, i.e. all sentences that express an explicit or implicit opinion [Liu10].

Direct and Indirect Opinions
Direct opinions are sentiment expressions on some objects: The picture quality of this camera is great.
Indirect opinions are comparisons, i.e. relations expressing similarities or differences (objective or subjective) between two or more objects: Car X is cheaper (better) than car Y.
Finding the latter is the task of comparative sentence extraction.
We will classify only direct opinions.

Beyond Positive/Negative (1)
Apart from only positive and negative, it is conceivable to classify expressions as...
  objective: Lack of opinion (see subjectivity analysis).
  neutral: Sentiment that lies between positive and negative: The movie is mediocre.
  mixed: A mixture of positive and negative language: The tool is usable.

Beyond Positive/Negative (2)
Opinions vary not only in polarity, but also in strength:
  I don't think this is a good phone. (weak)
  This phone is a piece of junk. (strong)
The same holds for emotions: we can feel contented, happy, joyous or ecstatic.
For classification, different levels can be used, e.g. 5 levels from very negative to very positive [ODB+09].

Beyond Positive/Negative (3)
Even when humans classify sentiment, agreement is only moderate:
Source     κ [1]        Annotations   Annotators   Classes
[HM97]     89.15% [2]   adjectives    4            3
[WWH05]    72.0%        expressions   2            4
[ODB+09]   71.2%        documents     7            3
[ODB+09]   46.6%        documents     7            7
[AB06] view sentiment as a fuzzy category, where membership is gradual and some members are more central than others. Words that are less central are more likely to be ambiguous and will also be difficult for human annotators to annotate.
[1] $\kappa = \frac{P(A) - P(E)}{1 - P(E)}$, where $P(A)$ is the proportion of times the judges agreed and $P(E)$ is the proportion of times they would be expected to agree by chance.
[2] Raw agreement; they do not report κ.

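The κ formula in the footnote can be worked through on a toy example. The sketch below assumes the two-annotator case and estimates P(E) from each annotator's own label distribution (Cohen's variant); the slide does not say which variant the cited papers use, and the function name and labels are made up for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """kappa = (P(A) - P(E)) / (1 - P(E)) for two annotators."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # P(A): proportion of items on which both annotators agree.
    p_agree = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # P(E): chance agreement, from each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_agree - p_chance) / (1 - p_chance)


# Hypothetical annotations of four items by two annotators.
annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 3 of 4 items, so P(A) = 0.75, while P(E) = (2·1 + 2·3)/16 = 0.5, giving κ = 0.5. Raw agreement of 75% thus shrinks considerably once chance agreement is discounted.
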
Levels of Analysis
Document. Assumption: Each document (or review) focuses on a single object (not true in many discussion posts) and contains opinions from a single opinion holder.
Sentence. Assumption: A sentence contains only one opinion.
Aspect. Assumption: A document contains only one opinion on every aspect.
Word. Assumption: One word has one orientation that is independent of context.

Sentiment Analysis Tasks
Tasks in sentiment analysis we are not going to investigate:
  Extraction of holder, time and target of the opinion expression.
  Feature/aspect-based opinion classification.
  Extraction/summarization of opinions.
  Subjectivity analysis.
  Comparative sentence extraction.
  Detection of opinion spam (fake/untruthful reviews written to promote or damage a product's reputation).
  Detection of duplicate reviews (two reviews which have very similar contents).
Tasks in sentiment analysis we are going to investigate:
  Classification of sentiment polarity on all levels.

Summary: What is Sentiment
Sentiments have a polarity and are expressed by an opinion holder at a specific time about a feature/aspect of an object.
Expressions can be subjective or objective; both types can contain opinions.
Sentiments can be direct or indirect (comparisons).
Sentiments vary not only in polarity, but also in strength.
Apart from positive and negative sentiment there can be neutral or mixed sentiment.
Sentiments can be analyzed on different levels.

Why not use the same techniques as for facts?
Traditionally, text categorization seeks to classify documents by topic into many possible categories. In sentiment classification we often have relatively few classes representing opposing or ordinal/numerical categories.
Classification of facts works based on keywords: if a keyword is contained in the document, the document talks about the topic related to that keyword [3] ("This text does not talk about cars." is not a frequent statement).
First idea: Find a set of keywords for every sentiment category. Keywords will be mainly adjectives, but can also be nouns ("rubbish"), verbs ("hate") or complete phrases ("cost someone an arm and a leg").
[3] Simplified slightly...

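The "first idea" can be sketched as a keyword counter. The keyword sets below are hypothetical and hand-picked for illustration only, and returning "neutral" on ties or when no keyword matches is an assumption of this sketch.

```python
import re

# Hypothetical hand-picked keyword lists, for illustration only.
POSITIVE_KEYWORDS = {"good", "great", "love", "excellent"}
NEGATIVE_KEYWORDS = {"bad", "hate", "rubbish", "terrible"}


def classify_by_keywords(text):
    """Label a text by counting positive and negative keyword hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE_KEYWORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_KEYWORDS for token in tokens)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"  # tie or no keyword found


label_1 = classify_by_keywords("The picture quality of this camera is great.")
label_2 = classify_by_keywords("I hate this rubbish.")
```

With these lists the first sentence comes out positive and the second negative. The quality of such a classifier depends entirely on the keyword lists.
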
Keyword-based Sentiment Recognition
Coming up with the right set of keywords is less trivial than one might initially think [PLV02].

One Word, Two Polarities (1)
One word may have different polarities in different domains:
unpredictable
  Movie domain: unpredictable plot (positive)
  Automotive domain: unpredictable steering (negative)
funny
  Movie domain: funny movie (positive)
  Food domain: funny taste (negative)

One Word, Two Polarities (2)
One word may have different polarities in the same domain in combination with different targets:
long, camera domain
  The battery life is long. (positive)
  The time taken to focus is long. (negative)
low, finance domain
  The price is low. (positive)
  Their income is low. (negative)
warm, restaurant domain
  They gave me a warm welcome... (positive)
  ... and warm beer. (negative)

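One way to picture target-dependent polarity is a lexicon keyed by the pair (sentiment word, opinion target) rather than by the word alone. The sketch below hard-codes the examples from the slide; the table and function names are hypothetical, and a real system would have to learn such pairs rather than list them by hand.

```python
# Hypothetical lexicon keyed by (sentiment word, opinion target),
# filled with the examples from the slide.
TARGET_POLARITY = {
    ("long", "battery life"): "positive",
    ("long", "time taken to focus"): "negative",
    ("low", "price"): "positive",
    ("low", "income"): "negative",
    ("warm", "welcome"): "positive",
    ("warm", "beer"): "negative",
}


def polarity_for(word, target):
    """Look up the polarity of a word for a given target, if known."""
    return TARGET_POLARITY.get((word, target), "unknown")


battery = polarity_for("long", "battery life")
focus = polarity_for("long", "time taken to focus")
```

A word-only lexicon would be forced to give "long" a single polarity and would get one of the two camera sentences wrong.
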
Sentiment Words without Sentiment, Sentiment without Sentiment Words
Sentiment words do not always express sentiment:
  I am looking for a good insurance for my family.
  With great power comes great responsibility.
It is possible to express sentiment without any obvious sentiment words:
  If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut. (negative)
  The food tasted like my shoes. (negative)

Valence Shifters - Negation
Negations can flip the polarity of an expression:
  I like this book. (positive)
  I don't like this book. (negative)
  John is successful at tennis. (positive)
  John is never successful at tennis. (negative)
  Peter failed the exam. (negative)
  Nobody failed the exam. (positive)
But the presence of a negation in a sentence is no guarantee that the polarity of the sentence is flipped:
  This book is good. (positive)
  No wonder this book is good. (positive)

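The examples above can be run through a deliberately naive rule: sum word polarities and flip the sign if any negation word occurs. The word lists and the function name are hypothetical, and the rule is a strawman to show the failure case, not a recommended method.

```python
# Hypothetical toy lexicons covering only the slide's examples.
NEGATORS = {"not", "no", "never", "nobody", "don't"}
WORD_POLARITY = {"like": 1, "successful": 1, "good": 1, "failed": -1}


def naive_polarity(sentence):
    """Sum word polarities; flip the sign if any negator is present."""
    tokens = sentence.lower().rstrip(".!?").split()
    score = sum(WORD_POLARITY.get(token, 0) for token in tokens)
    if any(token in NEGATORS for token in tokens):
        score = -score
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


flipped = naive_polarity("I don't like this book.")
wrong = naive_polarity("No wonder this book is good.")
```

The rule handles "I don't like this book." and "Nobody failed the exam." as intended, but it labels "No wonder this book is good." as negative, because "no" triggers the flip although it does not negate "good".
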
Valence Shifters - Intensifiers, Diminishers
Some modifiers weaken or strengthen the polarity of the term they modify [PZ04]:
  efficient (positive) → very efficient (more positive)
  suspicious (negative) → deeply suspicious (more negative)
  efficient (positive) → slightly efficient (less positive)
  suspicious (negative) → somewhat suspicious (less negative)

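Intensifiers and diminishers are often pictured as factors that scale a base polarity score. The sketch below follows that picture; the numeric scores and multipliers are arbitrary assumptions for illustration (the source gives only the qualitative direction), and it assumes the sentiment word is the last token of the phrase.

```python
# Hypothetical base scores and modifier weights; the numbers are arbitrary.
BASE_SCORE = {"efficient": 1.0, "suspicious": -1.0}
MODIFIER_WEIGHT = {
    "very": 1.5,      # intensifier
    "deeply": 1.5,    # intensifier
    "slightly": 0.5,  # diminisher
    "somewhat": 0.5,  # diminisher
}


def phrase_score(phrase):
    """Scale the score of the final word by all preceding modifiers."""
    tokens = phrase.lower().split()
    if not tokens:
        return 0.0
    score = BASE_SCORE.get(tokens[-1], 0.0)
    for token in tokens[:-1]:
        score *= MODIFIER_WEIGHT.get(token, 1.0)
    return score


stronger = phrase_score("very efficient")
weaker = phrase_score("somewhat suspicious")
```

Multiplying keeps the sign of the base word, so a modifier changes only the strength and never the direction, which matches the four examples on the slide.
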
Valence Shifters - Presuppositional Items
Presupposition: Any information which is taken for granted in a discourse situation. For instance, the sentence "Did you enjoy your breakfast?" assumes that the interlocutor already had breakfast.
Presuppositional items may swap polarity or assign polarity to otherwise neutral/objective statements [PZ04]:
  The battery lasts 2 hours. (objective)
  The battery only lasts 2 hours. (negative, it should have lasted longer)
  Servings were sufficient. (positive)
  Servings were barely sufficient. (negative, more was expected)
  He succeeded. (positive)
  He failed to succeed. (negative, he was expected to succeed)

Valence Shifters - Modals
Modal operators are used to express possibilities or conditions, or to set up a context in which an attitude is expressed that does not necessarily reflect the actual opinion of the author.
  Mary might be a terrible person.
  If Sony makes good cameras, I will buy one.
  If you are looking for a camera with great picture quality, buy Sony.
  If this restaurant was bad, I would not recommend it.
  If Mary were a terrible person, she would be mean to her dogs.
We need to treat these differently from sentences that express real sentiments of the author.

More Problems
Slang: "imo the ice cream is luuurrrrrrvely."
Figure of speech: "He is not the sharpest knife in the drawer."
Irony: "As much use as a trapdoor on a lifeboat.", "The brilliant organizer failed to solve the problem."
Relevance: A document may contain off-topic passages that might also contain opinions.
Point of view: "Israel defeated Hamas." is positive for Israel, negative for Hamas.
...

Incorporating Discourse Structure
"I hate the Spice Girls. ... [3 things the author hates about them] ... Why I saw this movie is a really, really, really long story, but I did, and one would think I'd despise every minute of it. But... Okay, I'm really ashamed of it, but I enjoyed it. I mean, I admit it's a really awful movie, ... [they] act wacky as hell ... the ninth floor of hell ... a cheap [beep] movie ... the plot is such a mess that it's terrible. But I loved it."
The review contains a large number of negative sentences, but the overall sentiment towards the movie is positive [PL08, PLV02].

Summary: Why is it Difficult
We have a bad intuition as to which sentiment words to choose.
Sentiment words are domain-specific.
Some sentiment words have different polarities depending on the opinion target.
Sentiment words can occur in a context where they don't express sentiment, and sentiment can be expressed without using sentiment words.
Valence shifters like intensifiers or negation can change the polarities of words depending on the sentence structure.
The sources often contain difficult phenomena like presuppositions, modals, slang, figures of speech, irony.
The sentiment expressed may depend on the point of view.
Sentiments are expressed in a discourse.

Some Resources
The Sentiment & Affect Yahoo! Group: http://groups.yahoo.com/group/sentimentai
The General Inquirer: http://www.wjh.harvard.edu/~inquirer
SentiWordNet: http://patty.isti.cnr.it/~esuli/software/sentiwordnet
Movie Review corpus: http://www.cs.cornell.edu/people/pabo/movie-review-data
MPQA opinion corpus: http://www.cs.pitt.edu/mpqa/databaserelease
ICWSM 2010 JDPA corpus for the automotive domain: http://www.icwsm.org/data/

Bibliography
[AB06] Alina Andreevskaia and Sabine Bergler. Mining WordNet for Fuzzy Sentiment: Sentiment Tag Extraction from WordNet Glosses. In Proceedings of the 11th Conference of the European Chapter of the Association for the Computational Linguistics, EACL-2006, pages 209-216, 2006.
[HM97] Vasileios Hatzivassiloglou and Kathleen McKeown. Predicting the semantic orientation of adjectives. In Proceedings of the Joint ACL/EACL Conference, pages 174-181, 1997.
[Liu10] Bing Liu. Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing, 2010.
[ODB+09] Neil O'Hare, Michael Davy, Adam Bermingham, Paul Ferguson, Paraic Sheridan, Cathal Gurrin, and Alan F. Smeaton. Topic-Dependent Sentiment Analysis Of Financial Blogs. In Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion, 2009.
[PL08] Bo Pang and Lillian Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2:1-135, January 2008.
[PLV02] Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. Proceedings of the ACL-02 conference on Empirical methods in natural language processing, 10(July):79-86, 2002.
[PZ04] Livia Polanyi and Annie Zaenen. Contextual lexical valence shifters. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text: Theories and Applications, 2004.
[TL03] Peter D. Turney and Michael L. Littman. Measuring praise and criticism: Inference of semantic orientation from association. ACM Transactions on Information Systems (TOIS), 21(4):315-346, 2003.
[WWH05] Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 347-354, 2005.