Smile and Laughter in Human-Machine Interaction: a study of engagement

Mariette Soury (1,2), Laurence Devillers (1,3)
(1) LIMSI-CNRS, BP 133, 91403 Orsay cedex, France
(2) University Paris 11, 91400 Orsay, France
(3) University Paris Sorbonne 4, 28 rue Serpente, 75005 Paris, France
E-mail: soury@limsi.fr, devil@limsi.fr

Abstract

This article presents a corpus featuring adults playing games in interaction with a machine designed to induce laughter. The corpus was collected during Interspeech 2013 in Lyon to study behavioral differences correlated with personality and culture. We first present the collection protocol, then the corpus obtained, and finally different quantitative and qualitative measures. Smiles and laughs are types of affect bursts, which are defined as short emotional non-speech expressions. Here we correlate smiles and laughs with personality traits and cultural background. Our final objective is to propose a measure of engagement deduced from those affect bursts.

Keywords: affect bursts, personality traits, sense of humor

1 Introduction

While spoken language constitutes a strong communication channel in human-machine interaction, previous work demonstrated that relevant information is simultaneously and spontaneously conveyed nonverbally (Campbell, 2007). The experiment presented in this paper was designed to evaluate the impact of a user's individual characteristics (cultural background, sense of humor, personality traits) on their propensity to express amusement in front of a machine. Specifically, we focused on affect bursts as the expression of the user's emotional state.

The notion of affect burst was introduced by Scherer (Scherer, 1994) as very brief, discrete, nonverbal expressions of affect in both face and voice, triggered by clearly identifiable events. Affect bursts are defined as short emotional non-speech expressions interrupting speech. Examples of expressions exclusively due to push effects are affect bursts (e.g., Krumhuber & Scherer, 2011) or infant grunts. Pursuing the ideas of Johnstone et al. (Johnstone et al., 2000) about prototypes in emotion expression, it may be that some emotions are expressed through prototypical affect bursts, while others are not. A simple criterion for prototypes could be that they are spontaneously produced by speakers and easily identified by listeners. Schröder (Schroeder, 2003) shows that affect bursts, presented without context, can convey a clearly identifiable emotional meaning. This subject, although theoretically described in detail, does not seem to have been extensively studied experimentally.

In this study, we refer to smiles and laughs as affect bursts. Our purpose is to evaluate how these affect bursts can reveal the subject's engagement in the interaction (1). Previous work by Tanaka (Tanaka et al., 2010) investigated the impact of cultural differences on the perception of emotions; our assumption is that cultural differences impact the expression of emotions as well. In order to account for users' individuality in expressive behavior, we needed a large amount of multicultural data. Few existing resources are multicultural. In our previous experiments, we only used French real-life corpora recorded in call centers (Devillers & Vidrascu, 2007) to study negative and positive laughter. In (Schroeder, 2003) the corpus used is in German. We therefore opted to collect a new audio and video corpus during the Interspeech 2013 conference.
The second section of this paper presents the protocol designed to acquire multimodal data in a man-machine interaction, and the resulting corpus is detailed in the third section. The correlations between the participants' profiles and the collected affect bursts are described in the fourth section.

2 Collection protocol

In this study, we used two kinds of tasks for inducing smiles and laughs:
- Passive induction tasks, such as watching funny videos or listening to jokes
- Active induction tasks, such as a tongue-twister game, played by reading sentences on a computer screen or repeating them after the Nao robot (Aldebaran Robotics)

(1) Research partially supported by the Oseo project ROMEO2 and the Feder project E-Therapies.

In the passive task, we intended to observe both reflex contagious laughter and cultural laughter, i.e. laughter at funny scenes with various cultural backgrounds (cognitive amusement, irony). Laughter is a powerful social signal. Laughter is "contagious": seeing or hearing someone laugh can trigger our own amusement in a reflexive manner. Laughter can also be the cognitive result of the interpretation of a ridiculous or funny situation. The sense of humor is both personal and cultural.

We also used a tongue-twister game with the robot Nao in several languages as an active task inducing laughter. Tongue-twisters, which exist in most languages, are sentences specifically designed to be difficult to pronounce by alternating similar phonemes (e.g. "Peter Piper picked a peck of pickled peppers"). They can be used as a diction exercise whose difficulty can provoke embarrassed laughter after erroneous pronunciations or satisfied laughter after good ones. To increase the possibility of erroneous pronunciation, we included sentences in several languages: English, French, Spanish, Italian, German, and Mandarin Chinese. The game was carried out in public (with a small group of persons). This experiment consisted of 4 tasks, further referred to as: video, tongue-twister read, tongue-twister repeat and jokes.

3 Contents of the Inter-LAUGH corpus

3.1 Participants

We collected audio data with a lapel microphone at 16 kHz, and captured the computer part of the experiment using a webcam at 25 frames per second at 640x480 resolution. The participants also filled in psychological questionnaires regarding their OCEAN personality traits (John et al., 1991) and their sense of humor (McGhee, 1996). We recorded 45 participants (31 male and 14 female, between 25 and 60 years old). Interspeech is an international conference and attracts researchers of several nationalities: our participants came from 24 countries. The ratio of the number of participants to the number of distinct countries of origin is low (1.88). It is therefore necessary to gather the participants into larger groups in order to correlate our observations with their community. Belonging to a community could be defined by a common language, place of birth, or place of residence. For the purpose of this paper we decided to gather the participants into 3 meta-regions (America, Europe, Asia), depending on their cultural background, defined by the country where they spent their childhood. This resulted in the following distribution: 6 participants for America (from the USA, Canada, and Mexico), 27 participants for Europe (from France, the UK, Germany, Italy, Spain, Russia, and Eastern and Northern Europe) and 12 participants for Asia (from China, Japan, Thailand, and India).

3.2 Description of the audio corpus

In order to perform an audio analysis of the collected laughs, we manually segmented the recorded sessions a posteriori, and annotated positive (amused) and negative (embarrassed) laugh segments. All the laughs of the subjects were segmented and annotated by two expert coders following an adapted annotation protocol (Devillers & Vidrascu, 2007). The annotation took the interaction context into consideration to evaluate whether laughs were embarrassed or amused. We decided to observe both the number of laughs generated by each task and the relative duration of laughter over the duration of the task. These metrics give us insight into phenomena such as multiple chuckles, or a single but hearty laugh.
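Both metrics can be computed directly from the annotated segments. The following is a minimal sketch of that computation; the segment layout and the function name are illustrative assumptions, not the actual tooling used for the corpus:

    from collections import defaultdict

    def laugh_metrics(segments, task_durations):
        # `segments`: assumed list of (task, start_s, end_s, label) tuples,
        # with label in {"amused", "embarrassed"} (hypothetical layout).
        # `task_durations`: maps each task name to its duration in seconds.
        counts = defaultdict(int)
        laugh_time = defaultdict(float)
        for task, start, end, label in segments:
            counts[(task, label)] += 1
            laugh_time[(task, label)] += end - start
        # Relative duration: total laughter time over total task time.
        relative = {key: time / task_durations[key[0]]
                    for key, time in laugh_time.items()}
        return counts, relative

A task eliciting five short chuckles and a task eliciting one long laugh can yield similar relative durations but very different counts, which is why both measures are reported.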
This analysis, displayed in Table 1, gives us a more accurate evaluation of the reactions of the participants.

Task | embarrassed laughs | duration of embarrassed laughs | amused laughs | duration of amused laughs
All | 197 | - | 339 | -
Videos | 4 | 0.09% | 199 | 4.14%
Tongue-twister (read on screen) | 68 | 2.15% | 23 | 1.09%
Tongue-twister (repeat after the robot) | 93 | 4.79% | 32 | 2.07%
Jokes | 32 | 1.80% | 85 | 5.86%

Table 1: Type of laugh elicited by task - number of laughs generated by each task, and relative duration of laughter over the duration of the task

3.3 Description of the video corpus

In order to perform a video analysis of the collected smiles, we manually segmented the recorded sessions a posteriori, and annotated smiles in the obtained segments. The setting of the experiment only allowed us to capture video while the participants were facing the computer (i.e. for the videos and tongue-twister read tasks). We discriminated between open smiles (where the teeth are visible between the lips) and closed smiles. Positive laughs usually correspond to open smiles. The videos were annotated without sound, to focus only on visual cues. As for the laughs, we extracted both the number of smiles and their length relative to the task duration. These results are visible in Table 2.

Task | closed smiles | duration of closed smiles | open smiles | duration of open smiles
All | 164 | - | 429 | -
Videos | 131 | 9.48% | 251 | 32.78%
Tongue-twister (read on screen) | 33 | 2.70% | 178 | 13.83%

Table 2: Type of smile elicited by task - number of smiles, and their length relative to the task duration

3.4 Co-occurring events in audio and video

Our working hypothesis is that while a laugh is an open expression of emotion, a smile is a more subtle one and might occur more easily in a non-familiar environment. We temporally aligned the annotations of the audio and video channels for the videos and tongue-twister read tasks, and observed the co-occurrence of events. We distinguished 4 cases: no event in either modality, an event only in audio (laugh), an event only in video (smile), and a bimodal event. The latter represents 12.44% of all events observed over the complete session. Results are visible in Table 3.

Task | no event | audio only | video only | bimodal
Complete screen session | 36.88% | 11.82% | 38.87% | 12.44%
Videos | 33.53% | 10.69% | 40.17% | 15.61%
Tongue-twister read | 42.98% | 13.86% | 36.49% | 6.67%

Table 3: Co-occurrence of laughs and smiles by task

If we observe the tasks separately, bimodal events represent 15.61% and 6.67% of the observed events in the videos and tongue-twister read tasks, respectively. The tongue-twister task implied the active participation of the subjects, whereas the videos task allowed for more passive reactions, which could explain the disparity. We can see that events occurred more frequently in video than in audio: on the screen part of the interaction alone, we induced more smiles than laughs.
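The four-way breakdown can be reproduced by sampling the aligned annotations on a common time base. The sketch below classifies each video frame (25 fps, matching the webcam rate); the interval format is an assumed simplification, and since the paper does not specify whether proportions were computed over frames or over events, a frame-based variant is shown:

    def cooccurrence_rates(laughs, smiles, session_end_s, fps=25.0):
        # `laughs`, `smiles`: assumed lists of (start_s, end_s) intervals
        # taken from the audio and video annotations respectively.
        def active(intervals, t):
            return any(start <= t < end for start, end in intervals)

        cases = {"no event": 0, "audio only": 0, "video only": 0, "bimodal": 0}
        n_frames = int(session_end_s * fps)
        for i in range(n_frames):
            t = i / fps
            laughing, smiling = active(laughs, t), active(smiles, t)
            if laughing and smiling:
                cases["bimodal"] += 1
            elif laughing:
                cases["audio only"] += 1
            elif smiling:
                cases["video only"] += 1
            else:
                cases["no event"] += 1
        # Convert frame counts into proportions, as reported in Table 3.
        return {case: count / n_frames for case, count in cases.items()}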

4 Correlations between smiles, laughs and user profile

4.1 With cultural background

In order to evaluate the impact of the cultural background of the participants on their propensity to laugh and smile, we separated the data into 3 categories. We observed the ratio of the amused (resp. embarrassed) laughs produced by each group over the total number of amused (resp. embarrassed) laughs (presented in Figure 1). We proceeded in the same manner for open and closed smiles (presented in Figure 2). The imbalanced representation of the 3 populations was taken into account to correct the skewness of the data.

Figure 1: Laugh type by group

Figure 2: Smile type by group

Among the observed laughs, the participants of the Asia group produced the majority of the embarrassed laughs (56%), while the participants of the US group produced the majority of the amused laughs (47%). For those two groups, the numbers of amused and embarrassed laughs are imbalanced (resp. 23% and 56% for Asia, 47% and 14% for the US), contrary to the EU group, which contributed to both categories equally. Regarding the smiles, the EU group produced the majority of the closed smiles (45%) while the Asia group produced the majority of the open smiles (39%). The US group contributed to both categories in almost equal measure.

4.2 With OCEAN personality traits

During the design of the experiment, we expected the personality of each participant to impact the expression of their amusement. We had been working with the OCEAN personality trait inventory in previous experiments to assess personality traits; we made the assumption that the Openness, Extraversion and Neuroticism traits would have the most impact on affect burst production. We asked participants to fill in a sub-version of the OCEAN test covering those 3 characteristics. We correlated the scores for each trait with the production of laughs and smiles, in number and in duration (absolute and relative to the task duration), for each participant, using R's implementation of Pearson correlations. Contrary to our initial expectations, the Neuroticism trait did not significantly correlate (Pearson, p below 0.05) with any of our measurements. While they did not correlate with any measurements regarding smiles, the Openness and Extraversion traits did significantly correlate with laughs.

Measurement | amused laughs | embarrassed laughs
Laughs | 0.1816 | -0.2800
Duration of laughs | 0.2439 | -0.2980
Duration of laugh event | 0.2966 | -0.2686

Table 4: Pearson correlations between Openness and laugh measurements

Measurement | amused laughs | embarrassed laughs
Laughs | 0.3457 | -0.0500
Duration of laughs | 0.3454 | -0.0319
Duration of laugh event | 0.3823 | -0.0167

Table 5: Pearson correlations between Extraversion and laugh measurements

From Table 4 and Table 5 we can see that the open and extraverted participants produced more and longer amused laughs, and shorter embarrassed laughs.

4.3 With Sense of Humor

To evaluate the impact of individual differences in laughing matters, we asked the participants to fill in a questionnaire after the experiment, to assess their sense of humor. We used McGhee's Sense of Humor Scale (SHS) (McGhee, 1996), which proposes 24 questions to evaluate 6 aspects of humor: Enjoyment of Humor, Laughter, Verbal Humor, Finding Humor in Everyday Life, Laughing at Yourself, and Humor Under Stress. Each aspect is rated between 4 and 28, and the global sense of humor is rated as the sum of the 6 sub-categories. We correlated this global Sense of Humor Score (SHS) with our measurements, following the same method used for the OCEAN traits. The SHS significantly correlated with several of our measurements (Table 6).

Measurement | amused laughs | embarrassed laughs | open smiles | closed smiles
Number of events | 0.4474 | 0.2635 | 0.2613 | -0.2785
Duration of events | 0.3907 | 0.3015 | 0.1952 | -0.3032
Duration per event | 0.3403 | 0.3157 | 0.2248 | -0.3070

Table 6: Pearson correlations between SHS and smile and laugh measurements

From Table 6 we can see that participants with a higher SHS laughed more and longer, and produced shorter closed smiles.
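The analyses in Sections 4.2 and 4.3 follow a single pattern: one profile score per participant, correlated against each laugh or smile measurement, keeping the coefficients whose p-value falls below 0.05. The computation was done with R's implementation of Pearson correlations; the sketch below is a Python equivalent, with an assumed data layout:

    from scipy.stats import pearsonr

    def significant_correlations(profile_scores, measurements, alpha=0.05):
        # `profile_scores`: one score per participant (e.g. Openness or SHS).
        # `measurements`: assumed mapping from a measurement name (e.g.
        # "amused laughs", "duration of laughs") to per-participant values.
        results = {}
        for name, values in measurements.items():
            r, p = pearsonr(profile_scores, values)
            if p < alpha:
                results[name] = r
        return results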
5 Results

This experiment provided data regarding two aspects of man-machine interaction: the multimodal expression of human behavior, and the influence of the individual profile on this expression. The results of the video analysis demonstrate that a non-negligible quantity of information regarding the users' appreciation of the interaction was not perceivable in the audio channel. This reinforces Scherer's definition of affect bursts as multimodal events. Moreover, the noisy conditions of the experiment rendered the audio channel less reliable for an automatic analysis, but the addition of a clean video channel can compensate for this deficiency. The wide variety of participants was an opportunity to observe the impact of individual preferences on the enjoyment of a man-machine interaction. We took into account OCEAN personality traits, sense of humor and cultural influences in our measurements. While the observed sample is too small for strong conclusions, it encourages further investigations in that regard.

6 References

[1] Campbell, N., "On the use of nonverbal speech sounds in human communication", in Proc. of the COST 2102 Workshop on Verbal and Nonverbal Communication Behaviours, 2007.
[2] Devillers, L. & Vidrascu, L., "Positive and negative emotional states behind the laughs in spontaneous spoken dialogs", in Proc. of the ICPhS 2007 Workshop on Laughter, 2007.

[3] Krumhuber, E. & Scherer, K., "Affect bursts: Dynamic patterns of facial expressions", Emotion, vol. 11, no. 4, p. 742, 2011.
[4] John, O. P., Donahue, E. M. & Kentle, R. L., "The Big Five Inventory - Versions 4a and 54", Berkeley: University of California, Berkeley, Institute of Personality and Social Research, 1991.
[5] Johnstone, T. & Scherer, K., "Vocal communication of emotion", in M. Lewis & J. Haviland (Eds.), Handbook of Emotions (pp. 200-235), New York: Guilford Press, 2000.
[6] McGhee, P. E., Humor, Healing, and the Amuse System: Humor as Survival Training, Dubuque, Iowa: Kendall/Hunt Publishing, 1996.
[7] Scherer, K. R., "Affect bursts", in Emotions: Essays on Emotion Theory (S. H. M. van Goozen, N. E. van de Poll & J. A. Sergeant, Eds.), pp. 161-193, Hillsdale, NJ: Lawrence Erlbaum, 1994.
[8] Schröder, M., "Experimental study of affect bursts", Speech Communication (special issue on speech and emotion), vol. 40, no. 1-2, 2003.
[9] Tanaka, A., Koizumi, A., Imai, H., Hiramatsu, S., Hiramoto, E. & de Gelder, B., "I feel your voice: Cultural differences in the multisensory perception of emotion", Psychological Science, vol. 21, no. 9, pp. 1259-1262, 2010.