From Once Upon a Time to Happily Ever After: Tracking Emotions in Novels and Fairy Tales. Saif Mohammad, National Research Council Canada


Road Map
Introduction and background
Emotion lexicon
Analysis of emotion words in books
Saif Mohammad. Tracking Emotions in Books and Mail.

Emotions
"When your cartoon can get you killed" (Phil, from the San Francisco Chronicle)
  speaker/writer: Phil
  listener/reader
  Event: death threats over South Park episode
  Participants: extremists; Trey Parker, Matt Stone

Words
"When your cartoon can get you killed"
  words associated with joy
  words associated with sadness

Our goal
Create a large word-emotion association lexicon through input from people. Examples:
  vampire is typically associated with fear
  startle is associated with surprise
  bliss is associated with joy
  death is associated with sadness
  eager is associated with anticipation
Use the lexicon to understand the use of emotion words in text.

Which Emotions?

Plutchik, 1980: Eight Basic Emotions
Joy, Trust, Fear, Surprise, Sadness, Disgust, Anger, Anticipation

Using Mechanical Turk for CROWDSOURCING A WORD-EMOTION ASSOCIATION LEXICON

Amazon's Mechanical Turk
Requester
  breaks task into small independent units (HITs)
  specifies compensation for solving each HIT
Turkers
  attempt as many HITs as they wish

Crowdsourcing
Benefits
  Inexpensive
  Convenient and time-saving
  Especially for large-scale annotation
Challenges
  Quality control
  Malicious annotations
  Inadvertent errors

Target n-grams
Must be:
  in Rogetʼs Thesaurus
  high-frequency term in the Google n-gram corpus
Followed the Mohammad and Turney (2010) approach.

Word-Choice Question
Q1. Which word is closest in meaning to shark?
  car   tree   fish   olive
Generated automatically
  Near-synonym taken from thesaurus
  Distractors are randomly chosen
Guides Turkers to desired sense
Aids quality control
  If Q1 is answered incorrectly, the response to Q2 is discarded
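The word-choice question can be sketched in Python as follows. This is only an illustration of the idea on the slide (one near-synonym plus randomly chosen distractors, with Q2 kept only when Q1 is correct). The function names, the vocabulary list, and the choice of three distractors are assumptions, not the authors' implementation.

```python
import random


def make_word_choice_question(target, near_synonym, vocabulary,
                              n_distractors=3, seed=None):
    """Build a Q1-style question: one near-synonym plus random distractors.

    `vocabulary` is a hypothetical pool of candidate distractor words.
    """
    rng = random.Random(seed)
    # Never offer the target itself or the correct answer as a distractor.
    pool = [w for w in vocabulary if w not in (target, near_synonym)]
    options = rng.sample(pool, n_distractors) + [near_synonym]
    rng.shuffle(options)
    return {
        "question": f"Which word is closest in meaning to {target}?",
        "options": options,
        "answer": near_synonym,
    }


def keep_q2_response(question, q1_response):
    """The Q2 response is kept only when Q1 was answered correctly."""
    return q1_response == question["answer"]


# Toy usage with the slide's example words.
q = make_word_choice_question(
    "shark", "fish", ["car", "tree", "olive", "lamp", "fish", "shark"], seed=0
)
```

Because the near-synonym comes from a specific thesaurus entry, the same mechanism that filters careless annotators also points the annotator at the intended sense of the target word.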

Association Questions
Q2. How much is shark associated with the emotion fear? (for example, horror and scary are strongly associated with fear)
  shark is not associated with fear
  shark is weakly associated with fear
  shark is moderately associated with fear
  shark is strongly associated with fear
Eight such questions for the eight emotions.
Two such questions for positive or negative.

Emotion Lexicon
Each word-sense pair is annotated by 5 Turkers
About 10% of the assignments were discarded due to incorrect response to Q1 (gold question)
Targets with fewer than 3 valid assignments removed
NRC Emotion Lexicon
  sense-level lexicon
    word-sense pairs: 24,200
  word-level lexicon
    union of emotions associated with the different senses of a word
    word types: 14,200
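A minimal Python sketch of the filtering and union steps on this slide is given below. The record layout (`word`, `sense`, `q1_correct`) and the toy entries are assumptions made for illustration. The slides do not say how the five annotations are combined into a sense-level label, so that step is deliberately left out and the sense-level lexicon is taken as given.

```python
from collections import defaultdict

MIN_VALID = 3  # targets with fewer valid assignments are removed


def filter_assignments(assignments):
    """Drop assignments with a wrong Q1 answer, then drop sparse targets."""
    by_target = defaultdict(list)
    for a in assignments:
        if a["q1_correct"]:
            by_target[(a["word"], a["sense"])].append(a)
    return {t: rows for t, rows in by_target.items() if len(rows) >= MIN_VALID}


def word_level_lexicon(sense_lexicon):
    """Union of the emotions associated with the different senses of a word."""
    words = defaultdict(set)
    for (word, _sense), emotions in sense_lexicon.items():
        words[word] |= set(emotions)
    return dict(words)


# Toy data (hypothetical): 4 of 5 assignments valid for one target, 2 of 5 for another.
assignments = (
    [{"word": "shark", "sense": "fish", "q1_correct": ok}
     for ok in (True, True, True, True, False)]
    + [{"word": "bark", "sense": "tree", "q1_correct": ok}
       for ok in (True, True, False, False, False)]
)
kept = filter_assignments(assignments)

# Toy sense-level entries (hypothetical labels, not taken from the real lexicon).
sense_lexicon = {
    ("bank", "river"): {"trust"},
    ("bank", "money"): {"trust", "anticipation"},
}
word_lexicon = word_level_lexicon(sense_lexicon)
```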

MOTIVATION: EMOTION ANALYSIS OF BOOKS

Number of Books Published in a Year (source: Wikipedia)

Sources of Digitized Books
Project Gutenberg: more than 34,000 books
Google Books Corpus (GBC): 5.2 million books published from 1600 to 2009
  English portion has 361 billion words
  1-grams, 2-grams, 3-grams, 4-grams, 5-grams

Applications of emotion analysis of books
Search
  Example: Which Brothers Grimm tales are the darkest?
Social Analysis
  Example: How have books portrayed entities over time? (Michel et al. 2011)
Literary Analysis
  Example: Is the distribution of emotion words in fairy tales significantly different from that in novels?
Summarization
  Example: Automatically generate summaries that capture different emotional states of characters in a novel
Analyzing Persuasion Tactics
  Example: How are emotion words used for persuasion? (Mannix, 1992; Bales, 1997)



relative salience of trust words

relative salience of sadness words

Flow of Emotions


Emotion Word Density
Average number of emotion words in every X words.
Brothers Grimm fairy tales ordered as per increasing negative word density. X = 10,000.
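The density measure can be written as (number of emotion words) × X / (number of words in the text). The Python sketch below computes it and orders texts by it. The tokenizer, the function names, and the toy word list are assumptions for illustration. A real run would take the negative words from the emotion lexicon.

```python
import re


def emotion_word_density(text, emotion_words, per=10_000):
    """Average number of emotion words in every `per` words of `text`."""
    # Simple lowercase tokenizer (an assumption; the slides do not specify one).
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in emotion_words)
    return hits * per / len(tokens)


def order_by_density(texts, emotion_words, per=10_000):
    """Return titles sorted by increasing emotion word density."""
    return sorted(
        texts,
        key=lambda title: emotion_word_density(texts[title], emotion_words, per),
    )


# Toy usage with a hypothetical list of negative words.
negative = {"cruel", "cried", "died"}
density = emotion_word_density("the wolf was cruel and the girl cried", negative)
tales = {"A": "the cruel wolf died", "B": "the happy girl sang"}
ranking = order_by_density(tales, negative)
```

Normalizing by text length is what makes short fairy tales and long novels comparable on the same scale.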

Co-occurring Emotion Words
Examined emotion words in proximity of target entities
Used the Google Books Corpus
Looked for emotion words in 5-grams that had the target
Ignored emotion associated with target word
Grouped information into 5-year bins
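The procedure above can be sketched in Python roughly as follows. The row layout `(tokens, year, count)`, the function name, and the toy lexicon are assumptions. The slides also do not state the denominator of the percentage, so this sketch assumes it is the count of all emotion-bearing words that co-occur with the target in the same bin.

```python
from collections import defaultdict


def emotion_share_by_bin(ngram_rows, target, lexicon, emotion, bin_size=5):
    """Percentage of co-occurring emotion words that carry `emotion`, per bin.

    ngram_rows: iterable of (five_gram_tokens, year, count) tuples.
    lexicon: maps a word to the set of emotions associated with it.
    """
    emotion_hits = defaultdict(int)
    all_hits = defaultdict(int)
    for tokens, year, count in ngram_rows:
        if target not in tokens:
            continue
        bin_start = year - year % bin_size  # e.g. 1943 falls in the 1940 bin
        for tok in tokens:
            if tok == target:
                continue  # ignore the emotion associated with the target itself
            emotions = lexicon.get(tok)
            if not emotions:
                continue
            all_hits[bin_start] += count
            if emotion in emotions:
                emotion_hits[bin_start] += count
    return {b: 100.0 * emotion_hits[b] / all_hits[b] for b in all_hits}


# Toy rows and lexicon entries (hypothetical, for illustration only).
rows = [
    (["war", "in", "germany", "brought", "terror"], 1941, 10),
    (["peace", "in", "germany", "was", "welcome"], 1943, 10),
    (["germany", "won", "the", "final", "match"], 1990, 5),
]
toy_lexicon = {
    "war": {"fear", "anger"},
    "terror": {"fear"},
    "peace": {"joy"},
    "welcome": {"joy"},
    "germany": {"fear"},
}
fear_share = emotion_share_by_bin(rows, "germany", toy_lexicon, "fear")
```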

Percentage of fear words in close proximity to occurrences of America, China, Germany, and India in books.

Percentage of anger words in close proximity to occurrences of man and woman in books.

Comparative Analysis: FAIRY TALES VS. NOVELS

Fairy Tales
Archetypal characters: peasant, king, fairy
Clear identification of good and bad
Appeal through emotions (Kast, 1993; Jones, 2002)
Convey concerns, subliminal fears, wishes, and fantasies
Do fairy tales have higher emotion word density than novels?
Is there a difference in the distribution of emotion words?

Corpora
The Fairy Tale Corpus (FTC) (Lobo and Martins de Matos, 2010)
  453 stories
  close to 1 million words
  penned in the 19th century by the Brothers Grimm, Beatrix Potter, and Hans C. Andersen
  taken from Project Gutenberg
Corpus of English Novels (CEN) (compiled by Hendrik de Smet)
  292 novels written between 1881 and 1922 by 25 British and American novelists
  26 million words
  taken from Project Gutenberg

Histogram of texts with different anger word densities. On the x-axis: 1 refers to density between 0 and 100, 2 refers to 100 to 200, and so on. Density is per 10,000 words.
        mean    std. dev.
FTC     749     393
CEN     746     162

Histogram of texts with different joy word densities. On the x-axis: 1 refers to density between 0 and 100, 2 refers to 100 to 200, and so on. Density is per 10,000 words.
        mean    std. dev.
FTC     1417    467
CEN     1164    196

Histogram of texts with different sadness word densities. On the x-axis: 1 refers to density between 0 and 100, 2 refers to 100 to 200, and so on. Density is per 10,000 words.
        mean    std. dev.
FTC     814     443
CEN     785     159

Histogram of texts with different surprise word densities. On the x-axis: 1 refers to density between 0 and 100, 2 refers to 100 to 200, and so on. Density is per 10,000 words.
        mean    std. dev.
FTC     680     325
CEN     628     93
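The x-axis convention and the summary statistics used in these histograms can be sketched in Python as below. The function names are assumptions. Two details are not stated on the slides and are also assumed here: a boundary value such as exactly 100 is placed in the upper bin, and the population standard deviation is used.

```python
from collections import Counter
from statistics import mean, pstdev


def density_bin(density, width=100):
    """Map a density to its x-axis label: 1 for 0-100, 2 for 100-200, and so on."""
    # Assumption: a value on a boundary (e.g. exactly 100) goes to the upper bin.
    return int(density // width) + 1


def summarize(densities, width=100):
    """Mean, standard deviation, and histogram of per-text densities."""
    return {
        "mean": mean(densities),
        # Assumption: population standard deviation; the slides do not say which.
        "std_dev": pstdev(densities),
        "histogram": Counter(density_bin(d, width) for d in densities),
    }


# Toy per-text densities (hypothetical values, per 10,000 words).
summary = summarize([50, 150, 150, 250])
```

Two corpora with similar means can still produce very different histograms, which is why the standard deviation is reported next to the mean for FTC and CEN.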

Summary
Created a large word-emotion association lexicon
Used simple measures and visualizations to quantify and track the use of emotion words in texts
Used the Brothers Grimm fairy tales
  showed texts can be ordered for affect-based search
Used the Google Books Corpus
  tracked emotion associations of entities over time
Used the Fairy Tales and Novels Corpora
  showed how fairy tales tend to have more extreme emotion word densities than novels