PDF hosted at the Radboud Repository of the Radboud University Nijmegen


The following full text is a publisher's version. For additional information about this publication, see http://hdl.handle.net/2066/158815. Please be advised that this information was generated on 2019-01-20 and may be subject to change.

Can Tweets Predict TV Ratings?

Bridget Sommerdijk, Eric Sanders and Antal van den Bosch
Centre for Language Studies / Centre for Language and Speech Technology, Radboud University, the Netherlands
{e.sanders, a.vandenbosch}@let.ru.nl

Abstract

We set out to investigate whether TV ratings and mentions of TV programmes on the Twitter social media platform are correlated. If such a correlation exists, Twitter may be used as an alternative source for estimating viewer popularity. Moreover, the Twitter-based rating estimates may be generated during the programme, or even before. We count the occurrences of programme-specific hashtags in an archive of Dutch tweets for eleven popular TV shows broadcast in the Netherlands in one season, and perform correlation tests. Overall we find a strong correlation of 0.82; the correlation remains strong, 0.79, if tweets are counted a half hour before broadcast time. However, the two most popular TV shows account for most of the positive effect; if we leave out the single and second most popular TV shows, the correlation drops to being moderate to weak. Also, within a TV show, correlations between ratings and tweet counts are mostly weak, while correlations between TV ratings of the previous and next shows are strong. In the absence of information on previous shows, Twitter-based counts may be a viable alternative to classic estimation methods for TV ratings. Estimates are more reliable with more popular TV shows.

Keywords: Twitter, TV ratings

1. Introduction

The social media platform Twitter [1] harbors enormous amounts of information, much of which refers to the personal realm. By referring to what they are doing, people provide information that can be used as a basis for research in sociology, demographics, and statistics. In this paper, we focus on TV ratings: how many people watch a certain TV programme. Deller (2011) explores the reasons why it has become popular to use social media, such as Twitter, before and during the watching of TV programmes: to suggest that others watch too, from a desire to talk about what one is doing, and from a desire to be part of a live conversation. Our main research question is: can we use Twitter to predict TV ratings? We present a case study focusing on Dutch TV.

Similar research questions are discussed by Wakamiya et al. (2011), who use Twitter to estimate TV ratings based on textual, spatial, and temporal relevance. Oh et al. (2015) conclude from their study that there is a positive relationship between social media activities and TV ratings. Sanders and Van den Bosch (2013) used a simple method to try to predict the outcome of the 2012 parliamentary elections in the Netherlands, which worked surprisingly well. By counting mentions of the names of political parties and comparing them to polls and actual election results, they achieved a high correlation. Encouraged by this result we set out to apply a similar method to the prediction of TV ratings.

In the remainder of this paper we first explain in Section 2 how we gathered the data we used. In Section 3 we describe the experiments we conducted. In Section 4 we show the results of our experiments and in Section 5 we draw conclusions and discuss them. In the last section we provide some directions for future research.

[1] http://www.twitter.com

2. Data

For our research we focus on Dutch TV programmes associated with a relatively high TV rating, as these programmes have the highest impact both in terms of economic relevance (e.g. for advertisement placement) and in total viewer time. The TV programmes were selected from the top-25 of programmes that are most tweeted about as listed on the website spot.nl. Spot is a foundation for the promotion and optimization of TV commercials oriented at the Dutch TV market.
We only selected programmes broadcast once a week; the programmes are new shows, not replays. Generally speaking, these weekly programmes are also the type of programmes that are tweeted about most, in contrast to daily news broadcasts, one-off documentaries, children's shows, etc., and with these we minimize the risk that a tweet is about the previous or next episode of a programme (Cheng et al., 2013). For our study we selected the top eleven most tweeted about programmes falling into the category of weekly shows. All programmes were broadcast between December 2013 and March 2014. Table 1 lists the eleven programmes.

The TV ratings for all episodes of the eleven shows were obtained from the SKO, Stichting Kijk Onderzoek (English: foundation for TV-ratings). [2] The ratings are determined by acquiring information from devices installed in 1,235 randomly selected Dutch households that together monitor the TV watching behavior of 2,800 people. Every year the viewer panel is refreshed by moving a quarter of the devices to another household. [3]

The numbers of tweets referring to a particular show in a particular week are obtained from the webservice Twiqs.nl. [4] Twiqs archives about 40% of all Dutch tweets (Tjong Kim Sang and Van den Bosch, 2013) and has extensive search options. For our tweet collection we used simple time-specific search with the most commonly used hashtags for the TV programmes. Some programmes had two popular hashtags. Tables 1 and 2 list the hashtags used, the number of episodes, the type of show, the mean TV rating per weekly show or episode, and the mean number of tweets. For reference, the daily number of tweets posted in the Netherlands is in the order of two million; with a population of 17 million inhabitants, the Netherlands has a relatively active Twitter user base with about one million active users. The numbers in Table 2 suggest that only one in several hundreds of viewers is posting about the show during the show's broadcast.

[2] www.kijkonderzoek.nl
[3] http://mens-en-samenleving.infonu.nl/communicatie/104372-hoe-worden-de-kijkcijfers-bepaald.html
[4] http://www.twiqs.nl

| Name | Hashtag | # Episodes | Type |
|---|---|---|---|
| Boer Zoekt Vrouw | #bzv, #boerzoektvrouw | 13 | dating show |
| Wie Is De Mol | #widm | 8 | game show |
| The Voice Of Holland | #tvoh | 16 | talent show |
| Flikken Maastricht | #flikkenmaastricht | 11 | police series |
| Divorce | #divorce | 12 | drama series |
| The Voice Kids | #tvk | 8 | talent show |
| Moordvrouw | #moordvrouw | 10 | police series |
| Ik Vertrek | #ikvertrek | 10 | reality show |
| Alles Mag Op Vrijdag | #amov, #allesmagopvrijdag | 7 | game show |
| Hoeveel Ben Je Waard | #hbjw, #hoeveelbenjewaard | 7 | reality show |
| Proefkonijnen | #proefkonijnen | 4 | game show |

Table 1: Names and hashtags of the eleven Dutch TV shows for which data was gathered in the period December 2013 to March 2014.

3. Method

To investigate whether there is a relation between the number of tweets and TV ratings, the correlation (Pearson's r) was computed for tweets and ratings in various conditions. Deller (2011) states that tweets about TV programmes are mostly posted when people are watching that particular programme. The best correlation is therefore probably the one between the TV ratings and the tweets that were posted during the broadcast. [5] Additionally we counted the tweets posted half an hour before the broadcast and half an hour after the broadcast. These appeared to be typical time slots within which programmes are already or still tweeted about. Table 2 compares the numbers of tweets posted during the half hours before and after the show with the numbers of tweets posted during the show, confirming the observation of Deller (2011).

Correlations were computed in two ways:

1. Over all shows, by taking the numbers of tweets and ratings of all episodes together and computing the correlation over all data pairs. This yields one result.
2. Per programme, by computing the correlation over the data pairs of the individual episodes of one programme. This yields 11 results.

[5] To gather tweets posted exactly during programme broadcasts, we checked the actual starting and end times of the programmes via the website http://www.hebikietsgemist.nl/.

4. Results

Figure 1 displays a scatter plot of the TV ratings against the number of tweets posted during all episodes of all programmes, as well as the best-fitting linear regression line. Pearson's r of this relation is 0.82 (p < 0.01), which is remarkably high. Closer analysis shows that this is for a large part due to the 21 episodes of the two programmes that are viewed by most people, and that are much tweeted about: BZV and TVOH. If we leave out BZV the correlation drops to 0.44 (p < 0.01) and if we leave out both BZV and TVOH the correlation reduces to 0.23 (p < 0.01).

Figure 2 zooms in on the next seven programmes of the top-11 that are viewed by fewer people than the top-4 programmes, i.e. the graph excludes all episodes of the four best watched programmes. From this figure we observe that there is at best a weak relation between the TV ratings and the number of tweets for these programmes.

Figures 3 and 4 display scatter plots of ratings and tweets for tweets that were posted half an hour before the TV programme started, and half an hour after the programme had finished, respectively.
The Pearson's r correlations are 0.79 (p < 0.01) and 0.57 (p < 0.01), respectively, indicating a stronger correlation for tweets posted before a show than for tweets posted after it. If we leave out the numbers for BZV the correlations drop to 0.29 (p < 0.01) and 0.41 (p < 0.01), respectively.

Table 3 provides the correlations between TV ratings and number of tweets, computed over all episodes of only that particular show. In general these correlations are low. Some are even negative, which is contrary to the effect we are looking for. Clearly, the number of tweets is not a good predictor of TV ratings for different episodes of one series.
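The pooled correlation and the effect of leaving out the most watched show can be illustrated with a minimal sketch. As an assumption for illustration, it uses the eleven per-show averages reported in Table 2 (average rating and average number of tweets during the broadcast) instead of the per-episode data pairs behind the reported values, which are not reproduced in this paper; its output is therefore not expected to match 0.82 or 0.44. The names `SHOW_AVERAGES`, `pearson_r`, and `rating_tweet_correlation` are hypothetical.

```python
from math import sqrt

# (average rating, average tweets during broadcast), in the row order of Table 2.
# Assumption: show-level averages stand in for the per-episode pairs used in the paper.
SHOW_AVERAGES = [
    (4111692, 8967),
    (2333250, 1998),
    (2294125, 4442),
    (2218455, 229),
    (1967500, 290),
    (1608125, 813),
    (1570500, 323),
    (1452700, 2044),
    (1338857, 465),
    (773143, 39),
    (662750, 138),
]


def pearson_r(xs, ys):
    """Return Pearson's r for two equally long sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rating_tweet_correlation(rows):
    """Correlate ratings with tweet counts over a list of (rating, tweets) rows."""
    ratings = [rating for rating, _ in rows]
    tweets = [count for _, count in rows]
    return pearson_r(ratings, tweets)


r_all = rating_tweet_correlation(SHOW_AVERAGES)
# Leave out the first row (the most watched show) and recompute.
r_without_top = rating_tweet_correlation(SHOW_AVERAGES[1:])

print(f"all shows: r = {r_all:.2f}")
print(f"without the most watched show: r = {r_without_top:.2f}")
```

Comparing the two printed values shows how strongly a single high-rating, high-volume show can drive a pooled correlation, which is the sensitivity the leave-out analysis is meant to expose.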

Figure 1: Number of tweets during the TV programme related to number of viewers. (Scatter plot "Tweets during broadcast versus TV ratings, all programmes"; labelled shows: BZV, TVOH, WIDM, Ikvertrek, Divorce.)

Figure 2: Number of tweets during the TV programme related to number of watchers excluding the 4 most viewed programmes. (Scatter plot "Tweets during broadcast versus TV ratings, excluding top 4"; labelled show: Divorce.)

Figure 3: Number of tweets half an hour before the TV programme related to number of watchers. (Scatter plot "Tweets half an hour before broadcast versus TV ratings"; labelled shows: BZV, TVOH, WIDM, Ikvertrek, Divorce.)

Figure 4: Number of tweets half an hour after the TV programme related to number of watchers. (Scatter plot "Tweets half an hour after broadcast versus TV ratings"; labelled shows: BZV, TVOH, WIDM, Ikvertrek, Divorce.)
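Figures 1, 3, and 4 rest on hashtag counts in three windows around a broadcast: the half hour before, the broadcast itself, and the half hour after. A minimal sketch of that bucketing follows. The function name `count_by_window`, the sample tweets, and the broadcast times are hypothetical, and both the whitespace-based hashtag matching and the handling of tweets that fall exactly on a window boundary are assumptions, since the paper does not specify them.

```python
from datetime import datetime, timedelta

HALF_HOUR = timedelta(minutes=30)


def count_by_window(tweets, hashtags, start, end, margin=HALF_HOUR):
    """Count hashtag-bearing tweets before, during, and after a broadcast.

    tweets   -- iterable of (timestamp, text) pairs
    hashtags -- lower-case hashtags that identify the programme
    Assumption: windows are half-open, i.e. [start - margin, start),
    [start, end), and [end, end + margin).
    """
    wanted = set(hashtags)
    counts = {"before": 0, "during": 0, "after": 0}
    for timestamp, text in tweets:
        # Simplification: a tweet matches if any whitespace-separated token
        # equals one of the hashtags (case-insensitive).
        if not wanted.intersection(text.lower().split()):
            continue
        if start - margin <= timestamp < start:
            counts["before"] += 1
        elif start <= timestamp < end:
            counts["during"] += 1
        elif end <= timestamp < end + margin:
            counts["after"] += 1
    return counts


# Hypothetical broadcast slot and tweets, for illustration only.
start = datetime(2014, 1, 5, 20, 30)
end = datetime(2014, 1, 5, 21, 30)
sample_tweets = [
    (datetime(2014, 1, 5, 19, 55), "looking forward to #bzv"),   # too early
    (datetime(2014, 1, 5, 20, 10), "almost time for #bzv"),      # before
    (datetime(2014, 1, 5, 20, 29), "one minute to go #BZV"),     # before
    (datetime(2014, 1, 5, 20, 30), "#bzv has started"),          # during
    (datetime(2014, 1, 5, 21, 0), "no hashtag in this tweet"),   # ignored
    (datetime(2014, 1, 5, 21, 0), "#boerzoektvrouw is great"),   # during
    (datetime(2014, 1, 5, 21, 29), "#bzv"),                      # during
    (datetime(2014, 1, 5, 21, 30), "#bzv is over"),              # after
    (datetime(2014, 1, 5, 21, 59), "watching #widm tonight"),    # other show
    (datetime(2014, 1, 5, 21, 59), "that was fun #bzv"),         # after
    (datetime(2014, 1, 5, 22, 0), "#bzv"),                       # too late
]

counts = count_by_window(sample_tweets, {"#bzv", "#boerzoektvrouw"}, start, end)
print(counts)
```

Passing the actual start and end time of each episode, rather than the scheduled slot, keeps the "during" window aligned with the broadcast.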

| Name | Average TV rating | Tweets 30 min before | Tweets during | Tweets 30 min after |
|---|---|---|---|---|
| Boer Zoekt Vrouw | 4,111,692 | 1,011 | 8,967 | 489 |
| Wie Is De Mol | 2,333,250 | 380 | 1,998 | 793 |
| The Voice Of Holland | 2,294,125 | 64 | 4,442 | 102 |
| Flikken Maastricht | 2,218,455 | 79 | 229 | 85 |
| Divorce | 1,967,500 | 24 | 290 | 58 |
| The Voice Kids | 1,608,125 | 11 | 813 | 34 |
| Moordvrouw | 1,570,500 | 162 | 323 | 44 |
| Ik Vertrek | 1,452,700 | 21 | 2,044 | 149 |
| Alles Mag Op Vrijdag | 1,338,857 | 13 | 465 | 16 |
| Hoeveel Ben Je Waard | 773,143 | 2 | 39 | 7 |
| Proefkonijnen | 662,750 | 12 | 138 | 6 |

Table 2: Average TV ratings per show, and average number of tweets 30 minutes before, during, and 30 minutes after a show episode.

| Name | Correlation |
|---|---|
| Boer Zoekt Vrouw | 0.05 |
| Wie Is De Mol | 0.47 |
| The Voice Of Holland | 0.07 |
| Flikken Maastricht | -0.50 |
| Divorce | -0.11 |
| The Voice Kids | 0.10 |
| Moordvrouw | -0.42 |
| Ik Vertrek | -0.07 |
| Alles Mag Op Vrijdag | 0.22 |
| Hoeveel Ben Je Waard | 0.63 |
| Proefkonijnen | 0.11 |

Table 3: Correlations (Pearson's r) per show between TV ratings and numbers of tweets posted during shows.

5. Conclusions and Discussion

We investigated how well TV ratings can be predicted from Twitter by counting hashtags referring to TV programmes. We observed the correlation between the number of Twitter mentions and the ratings of the 11 most popular weekly TV programmes in the Netherlands broadcast between December 2013 and March 2014. For the tweets that were posted during the broadcast of the programme, the correlation (Pearson's r) is 0.82, which can be considered very high. This is, however, for a large part due to the two most popular programmes. If we leave these out, the correlation drops to 0.23. The correlations with the tweets that were posted half an hour before or half an hour after the broadcast show the same pattern, although their numbers are smaller. The interestingly high correlation of 0.79 for all shows for tweets posted a half hour before the shows start indicates that anticipatory tweets, in which people post that they are about to tune into a show, correlate about as well with TV ratings as the larger number of tweets posted during a show.
These results can be interpreted as implying that estimated TV ratings could already be publicized at the start of the show. However, the high correlation drops to medium or low correlation when the single or two most watched shows are left out.

If we zoom in on Figures 1 and 2, we see that for most TV programmes, the different episodes of one programme have similar TV ratings in general. In other words, the number of watchers for a programme is roughly constant during the season. The correlation between the ratings of two consecutive episodes is 0.98. Thus the ratings of a programme are predictable from the ratings of the previous episode to a high degree. The number of tweets about a programme differs a lot between the different episodes. Therefore the correlation over the episodes of a single programme is low in general (Table 3), or even negative for two drama (police) series; the latter may be due to special episodes such as cliffhanger episodes or season finales, which draw roughly the same viewers as other episodes, but trigger more reactions on Twitter.

From these results, we conclude that predicting TV ratings from tweets is not as promising with this simple method as was the prediction of election results with a similar method based on hashtags and counts. The most popular shows stand out with the most tweets as well as the highest ratings, leading to a high correlation for the 11 most popular programmes overall. The larger shows bias this result, since for the other programmes a higher number of tweets does not always go together with higher ratings. Programmes that are less popular than these 11 are not expected to show a more positive result.

6. Future Work

We adopted a simple method to count the relevant tweets for a show; we just counted hashtags. Some improvements over this method are possible. A first step would be to take into account the other contents of the tweets.
We may want to filter tweets based on their contents, in order, for instance, to only take those tweets into account that have a positive sentiment, as negative tweets may indicate the dislike of a show and may be indicative of the poster not watching the show.

Another step from which we expect positive results is to take the genre of programmes into account. In this way we would only compare programmes with each other that are in the same genre, such as talent shows, game shows, drama series, documentaries, etc. This was ignored in these experiments; our relatively small selection featured weekly shows only, with a majority of game and talent shows but also drama (and police) series. We expect that some types of programmes generate a larger amount of tweets from the audience than others. Game shows in which candidates are voted off are known to be much tweeted about (see Buschow et al., 2014). In future experiments we would need to enlarge our data set with more TV programmes and conduct per-genre analyses. We may look at non-weekly programmes as well, both of the daily type (such as the daily news) and the irregular type (such as sports events), as some of these tend to attract massive viewing numbers as well; also for these events tweets may prove to be predictive of viewer ratings ahead of the broadcast.

Finally, we may want to use other types of social media and crowd-generated content, such as internet fora, to complement the Twitter stream as a basis for computing statistics. Not only is the Twitter stream quite sparse when it comes to numbers of tweets per episode of a show (cf. Table 2), the Twitter user demography may also be biased towards certain age groups, and other social media may offer complementary perspectives on TV from differently composed user groups.

7. Bibliographical References

Cheng, Y.-H., Wu, C.-M., Ku, T., and Chen, G.-D. (2013). A predicting model of tv audience rating based on the facebook. In Social Computing (SocialCom), 2013 International Conference on, pages 1034-1037, Sept.

Christopher Buschow, Beate Schneider, S. U. (2014). Tweeting television: Exploring communication activities on twitter while watching tv. Communications, 39(2):129-149.

Deller, R. (2011). Twittering on: Audience research and participation using twitter. Participations, 8(1):216-245.

Oh, C., Yergeau, S., Woo, Y., Wurtsmith, B., and Vaughn, S. (2015). Is twitter psychic? social media analytics and television ratings. In The Second International Conference on Computing Technology and Information Management.

Sanders, E. and Van den Bosch, A. (2013). Relating political party mentions on twitter with polls and election results. In Proceedings of DIR-2013, pages 68-71.

Tjong Kim Sang, E. and Van den Bosch, A. (2013). Dealing with big data: The case of twitter. Computational Linguistics in the Netherlands Journal, 3:121-134, 12/2013.

Wakamiya, S., Lee, R., and Sumiya, K. (2011). Towards better tv viewing rates: Exploiting crowd's media life logs over twitter for tv rating. In Proceedings of the 5th International Conference on Ubiquitous Information Management and Communication, ICUIMC '11, pages 39:1-39:10, New York, NY, USA. ACM.