
STAT 503 Assignment 1: Movie Ratings

SOLUTION NOTES

These are my suggestions on how to analyze this data and organize the results. I've given more questions below than I can address in my analysis, to show the range of questions groups addressed in their reports.

1 Data Description

With the upcoming Academy Awards and the just-finished Golden Globes, we collected data on 14 recent movies from http://movies.yahoo.com/. The movies are Electra, The Aviator, Ocean's Twelve, Coach Carter, Meet the Fockers, Racing Stripes, In Good Company, White Noise, Lemony Snicket, Fat Albert, The Incredibles, Alexander, Shall We Dance, Closer, and Sideways. The movies were selected mostly from the top-10 list on January 14, 2005; several were also suggested by class members. We also collected ratings from 9 critics (Atlanta Journal, Boston Globe, Chicago Sun-Times, Chicago Tribune, E! Online, filmcritic, Hollywood Reporter, New York Times, Seattle Post), along with the average Yahoo user rating. The critics listed are the ones who rated all 14 movies.

The main question we want to answer is: Which movies are consistently rated highly by both critics and users?

Other questions of interest might be:

- Are some critics more generous in their ratings than others?
- Can the critics' ratings accurately predict the average user rating?
- Does the number of users rating a movie accurately predict the average user rating?
- Is there a relationship between the weekly gross of a movie and the average user rating or the average critic rating?
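The Appendix code reads two CSV files and indexes their columns by position, so the critics file is evidently in long format: one row per (movie, critic) pair, with the letter grade in the third column. A hypothetical illustration of that layout (the header names are my own; the two grades shown are ones reported later in these notes):

# movies-critics.csv -- assumed long format, header names hypothetical
# Movie,Critic,Grade
# The Incredibles, Atlanta Journal, C
# The Incredibles, New York Times, B-
# ...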

2 Suggested approaches

Data restructuring: make new rating variables by converting the letter grades to GPAs.
Reason: the letter grade is an ordinal variable, so it needs to be converted to a numeric scale for various calculations.

Summary statistics: tabulate the average critic rating and average user rating for each movie, and the average rating for each critic; calculate confidence intervals for the critics' ratings.
Reason: to compare the numbers for each movie, and for each critic.
Questions addressed: What is the average critic rating for each movie? Does the average user rating compare similarly to the average critic rating?

Dotplots: of the critics' ratings for each movie.
Reason: to examine the distribution of the ratings for each movie.
Questions addressed: Are some movies consistently rated higher by the critics? Do some movies split the critics, drawing dramatically different ratings from different critics?

Pairwise scatterplots: of the average critic rating against the average user rating, and of the user rating against the number of users.
Reason: to explore the relationships between these pairs of variables.
Questions addressed: What is the relationship between the number of users and the average user rating?

Regression: of the average user rating on the critics' ratings, and of the user rating on the number of users.
Reason: to build a numerical model for the relationships between the pairs of variables.
Questions addressed: Can the critics' ratings accurately predict the average user rating? Does the number of users rating a movie accurately predict the average user rating?
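A minimal sketch of the grade-to-GPA conversion (the full version, applied to the data files, is in the Appendix): the grades are mapped to an ordered factor and the level index is divided by 3, so that C = 2.00, B = 3.00, and A- = 3.67, matching the averages in the tables below. The grade values here are illustrative.

levs <- c("F","D-","D","D+","C-","C","C+","B-","B","B+","A-","A","A+")
grades <- c("C","B","A-")
as.numeric(factor(grades, levels=levs, ordered=TRUE))/3   # 2.00 3.00 3.67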

3 Actual Results

3.1 Summary Statistics

Table 1 contains a summary of the data for each movie: the users' average grade and the critics' average grade, along with the standard deviation and a Bonferroni-adjusted confidence interval for the critics' ratings of each movie. The top-rated movie is The Incredibles; the lowest-rated is White Noise. Coach Carter is rated comparatively lower by critics than by users. The other movies rated better by users than critics are Lemony Snicket, Shall We Dance, Fat Albert, Electra and White Noise. Alexander is rated low by the critics, C, but even lower by the users, C-. Closer is another movie rated lower by users, B-, than critics, B. Most of the movies' average critic ratings do not differ significantly from one another; the one exception is that Electra is rated significantly lower than The Incredibles.

Table 1: Summary of movie ratings. The top-rated movie is The Incredibles; the lowest-rated is White Noise. Coach Carter is rated comparatively lower by critics than by users. Alexander is rated low by the critics but even lower by the users. Bonferroni-adjusted confidence intervals for the average critic rating of each movie are provided to assess the significance of the differences. Most of the average critic ratings do not differ significantly; the one exception is that Electra is rated significantly lower than The Incredibles.

Movie              No. users  User Av  User Letter  Critic Av  Critic Letter  Critic SD  Lower CI  Upper CI
The Incredibles        31517     3.67  A-                3.48  B+                  0.69      2.56      4.40
The Aviator             5214     3.00  B                 3.19  B+                  0.67      2.30      4.08
In Good Company         1761     3.00  B                 2.96  B                   0.68      2.06      3.86
Closer                  5625     2.67  B-                2.93  B                   0.91      1.72      4.14
Coach Carter            7320     3.67  A-                2.89  B                   0.24      2.58      3.20
Meet the Fockers       22667     3.00  B                 2.89  B                   0.24      2.58      3.20
Lemony Snicket         16816     3.00  B                 2.78  B-                  0.67      1.89      3.67
Ocean's Twelve         14027     2.67  B-                2.56  B-                  0.44      1.97      3.14
Shall We Dance          6933     2.67  B-                2.44  C+                  0.90      1.25      3.64
Racing Stripes          1320     2.33  C+                2.33  C+                  0.37      1.84      2.83
Alexander              12340     1.67  C-                2.00  C                   0.85      0.87      3.13
Fat Albert              7740     2.67  B-                1.93  C                   0.57      1.17      2.69
Electra                 4191     2.67  B-                1.74  C-                  0.47      1.12      2.36
White Noise             7197     2.33  C+                1.70  C-                  0.81      0.63      2.78
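The Bonferroni-adjusted intervals in Table 1 can be checked from a single row. A sketch for Coach Carter, using 9 critic ratings per movie and 14 simultaneous intervals (so each interval uses level 1 - 0.05/14, i.e., the 0.998 t quantile that appears in the Appendix code):

mn <- 2.89; s <- 0.24; n <- 9              # Coach Carter row of Table 1
tcrit <- qt(1 - 0.05/(2*14), df=n-1)       # approximately qt(0.998, 8)
mn + c(-1, 1)*tcrit*s/sqrt(n)              # about (2.57, 3.21); matches Table 1 up to rounding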

Table 2 contains a summary of the data for each critic. Both Chicago newspapers rate movies higher on average, and the Hollywood Reporter and The New York Times rate movies the lowest on average. Bonferroni-adjusted confidence intervals are given to assess the significance of the differences in average ratings; none of the differences in the critics' averages are significant.

Table 2: Summary of critics' ratings. Both Chicago newspapers rate movies higher on average, and the Hollywood Reporter and The New York Times rate movies the lowest on average. Bonferroni-adjusted confidence intervals are given to assess the significance of the differences in average ratings. The critics' averages do not differ significantly.

Critic              Average    SD  Lower CI  Upper CI
Chicago Tribune        2.98  0.89      2.20      3.76
Chicago Sun-Times      2.88  0.66      2.30      3.46
E! Online              2.71  0.94      1.89      3.54
Boston Globe           2.69  0.70      2.08      3.30
Seattle Post           2.60  0.66      2.02      3.17
Atlanta Journal        2.48  0.60      1.96      3.00
filmcritic             2.33  0.73      1.70      2.97
New York Times         2.19  0.87      1.43      2.96
Hollywood Reporter     2.17  0.97      1.32      3.01

3.2 Dotplots

Figure 1 shows the distribution of the critics' ratings for each movie. Although The Incredibles was rated very highly on average (as reported in Table 1), we can see from these plots that two critics gave it much lower ratings: C from the Atlanta Journal, and B- from the New York Times. Shall We Dance was rated reasonably well, B, but it was given an F by the Hollywood Reporter and a C- by E! Online. The ratings for Alexander are uniformly spread, with one critic, the Chicago Tribune, rating it as high as A-. Coach Carter and Meet the Fockers were consistently rated near B.

Figure 1: Ratings given by 9 critics to 14 movies.

3.3 Scatterplots

Figure 2 shows scatterplots of the users' ratings against each critic's ratings. Some critics' ratings match the user ratings reasonably well: E! Online, filmcritic, and the New York Times. Others don't match the user ratings at all: the Atlanta Journal and the Chicago Tribune.

Figure 2: Scatterplots of users' ratings against critics' ratings, to compare their similarity.
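The similarity visible in Figure 2 is summarized by the correlation annotated in each panel. A one-line sketch for recomputing those correlations, assuming the users.ratings.gpa vector and the critics.ratings.gpa.mat matrix constructed in the Appendix:

# correlation of each critic's ratings with the average user ratings
round(apply(critics.ratings.gpa.mat, 2, cor, y=users.ratings.gpa), 2)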

3.4 Regression

We first fitted a multiple linear model of the average user rating against all the critics' ratings. This model did not explain a significant amount of the variation in the users' ratings, and no coefficients were significant. So we fitted a separate simple regression for each critic. Table 3 summarizes the results. The ratings of E! Online and filmcritic most closely match the users' ratings. The New York Times, Chicago Sun-Times and Seattle Post do a reasonable job of predicting the average user rating. The worst performers are the Chicago Tribune and the Atlanta Journal.

Table 3: Summary of the individual regression models for each critic. The ratings of E! Online and filmcritic most closely match the users' ratings. The New York Times, Chicago Sun-Times and Seattle Post do a reasonable job of predicting the average user rating. The worst performers are the Chicago Tribune and the Atlanta Journal.

Critic              Deviance  Intercept  Slope  p-value (slope)
E! Online               1.37       1.63   0.43            0.001
filmcritic              1.52       1.54   0.53            0.002
New York Times          2.18       2.00   0.36            0.021
Chicago Sun-Times       2.53       1.62   0.41            0.057
Seattle Post            2.58       1.75   0.40            0.065
Hollywood Reporter      2.70       2.24   0.25            0.090
Boston Globe            2.99       2.05   0.27            0.191
Chicago Tribune         3.19       2.29   0.17            0.323
Atlanta Journal         3.21       2.20   0.24            0.343
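A sketch of the per-critic fits behind Table 3, assuming users.ratings.gpa and critics.ratings.gpa.mat from the Appendix. lm is used here in place of the Appendix's Gaussian glm; for a Gaussian model the deviance equals the residual sum of squares, so the fits are identical.

for (i in 1:9) {
  fit <- lm(users.ratings.gpa ~ critics.ratings.gpa.mat[, i])
  # deviance, intercept, slope, and p-value for the slope: the columns of Table 3
  print(c(deviance(fit), coef(fit), summary(fit)$coefficients[2, 4]))
}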

4 Conclusions

Only The Incredibles is rated highly, A-, by both users and critics. The Aviator and In Good Company are rated well by both users and critics, at a B.

The critics' ratings for each movie can vary a lot. Although The Incredibles is rated very highly on average, two critics gave it much lower scores: C from the Atlanta Journal, and B- from the New York Times. Shall We Dance was rated reasonably well, B, by most critics, but it received an F from the Hollywood Reporter and a C- from E! Online.

There is some disagreement between critics and users. Coach Carter is rated poorly by critics but highly by users. Electra was rated poorly by critics, C-, but reasonably by the users, B-. Alexander is a movie that users rated worse than the critics did, and both rated it poorly.

There is some bias in the critics' ratings. The two Chicago newspapers rate movies more highly on average than the other critics. The Hollywood Reporter and The New York Times rate movies lower on average.

The critics whose ratings best predict the users' ratings are E! Online, filmcritic, and the New York Times. The ratings given by the Atlanta Journal and the Chicago Tribune both differ substantially from the users' ratings.

5 References

R Development Core Team (2003). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-00-3. http://www.r-project.org

Yahoo (2005). Yahoo! Movies website, http://movies.yahoo.com. Accessed January 14, 2005.

Appendix

# Read in the data. The code below relies on the string columns being read
# as factors (the default in R versions of this era; recent R needs
# stringsAsFactors=TRUE).
d.movies.critics<-read.csv("movies-critics.csv")
table(d.movies.critics[,1],d.movies.critics[,3])
d.movies.users<-read.csv("movies-users.csv")
table(d.movies.users[,1],d.movies.users[,3])

# Automatically create numerical values for the ratings. The grades in the
# data files carry a leading space, hence the levels below.
critics.ratings<-factor(as.character(d.movies.critics[,3]),
 levels=c(" F"," D-"," D"," D+"," C-"," C"," C+"," B-"," B"," B+"," A-",
 " A"," A+"),ordered=TRUE)
critics.ratings.gpa<-as.numeric(critics.ratings)/3
users.ratings<-factor(as.character(d.movies.users[,3]),
 levels=c(" F"," D-"," D"," D+"," C-"," C"," C+"," B-"," B"," B+"," A-",
 " A"," A+"),ordered=TRUE)
users.ratings.gpa<-as.numeric(users.ratings)/3

# Summarize the ratings by movie. The critical values are Bonferroni-adjusted:
# 0.998 = 1 - 0.05/(2*14) for 14 simultaneous intervals.
options(digits=3,width=100)
movies.summary<-NULL
for (i in 1:14) {
 mn<-mean(critics.ratings.gpa[d.movies.critics[,1]==
   levels(d.movies.critics[,1])[i]])
 sd1<-sd(critics.ratings.gpa[d.movies.critics[,1]==
   levels(d.movies.critics[,1])[i]])
 movies.summary<-rbind(movies.summary,
   cbind(levels(d.movies.critics[,1])[i],
     round(mn,digits=3),round(sd1,digits=3),
     round(mn-sd1*qt(0.998,8)/sqrt(9),digits=3),
     round(mn+sd1*qt(0.998,8)/sqrt(9),digits=3)))
}
dimnames(movies.summary)<-list(NULL,c("Movie","Critics Av","Critics SD",
 "Critics Lower CI","Critics Upper CI"))
movies.summary[order(movies.summary[,2],decreasing=TRUE),]

# Summarize the ratings by critic: 0.997 ~ 1 - 0.05/(2*9) for 9 intervals,
# each based on the critic's 14 movie ratings.
critics.summary<-NULL
for (i in 1:9) {
 mn<-mean(critics.ratings.gpa[d.movies.critics[,2]==
   levels(d.movies.critics[,2])[i]])
 sd1<-sd(critics.ratings.gpa[d.movies.critics[,2]==
   levels(d.movies.critics[,2])[i]])
 critics.summary<-rbind(critics.summary,
   cbind(levels(d.movies.critics[,2])[i],
     round(mn,digits=3),round(sd1,digits=3),
     round(mn-sd1*qt(0.997,13)/sqrt(14),digits=3),
     round(mn+sd1*qt(0.997,13)/sqrt(14),digits=3)))
}

dimnames(critics.summary)<-list(NULL,c("Critic","Av","SD","Lower CI","Upper CI"))
critics.summary[order(critics.summary[,2],decreasing=TRUE),]

# Change the levels of the movie names from alphabetical order to the order
# of the mean critic rating
ordered.movies<-factor(as.character(d.movies.critics[,1]),
 levels=c("The Incredibles","The Aviator","In Good Company","Closer",
 "Coach Carter","Meet the Fockers","Lemony Snicket","Ocean's Twelve",
 "Shall We Dance","Racing Stripes","Alexander","Fat Albert","Electra",
 "White Noise"),ordered=TRUE)

# Graphics for movies (Figure 1)
library(lattice)
dotplot(ordered.movies~jitter(critics.ratings.gpa),
 ylab="Movie",xlab="Rating",aspect=0.8,col=1)

# Regression. The ratings are reshaped into a 14 x 9 matrix, one column per
# critic (the data rows are in movie-major order).
critics.ratings.gpa.mat<-matrix(critics.ratings.gpa,ncol=9,byrow=TRUE)
par(mfrow=c(3,3),mar=c(4,5,3,1))
for (i in 1:9) {
 plot(critics.ratings.gpa.mat[,i],users.ratings.gpa,pch=16,
   xlab="Critic",ylab="User",xlim=c(0,4),ylim=c(0,4),cex=1.5)
 text(0.5,3.5,paste("r=",
   round(cor(critics.ratings.gpa.mat[,i],users.ratings.gpa),digits=2)),
   cex=1.5)
 title(as.character(d.movies.critics[i,2]))
}

# Multiple regression on all critics at once, with an overall test
critics.users.reg<-glm(users.ratings.gpa~critics.ratings.gpa.mat)
summary(critics.users.reg)
1-pchisq(critics.users.reg$null.deviance-critics.users.reg$deviance,9)

# Separate simple regression for each critic: deviance, intercept, slope,
# and p-value for the slope (Table 3)
x<-NULL
for (i in 1:9) {
 y<-summary(glm(users.ratings.gpa~critics.ratings.gpa.mat[,i]))
 x<-rbind(x,c(y$deviance,y$coefficients[c(1,2,8)]))
}