MID-TERM EXAMINATION IN DATA MODELS AND DECISION MAKING 22:960:575


Fall 2017

Instructions:
1. Complete your answers and submit them by email to the TA, with a cc to me, by 11:00 PM today.
2. Provide a single Excel workbook with 5 sheets, one sheet per problem.
3. Solve all problems.
4. Good luck!

Problem 1. Ellen Smith has collected the following data on the amount of time (in minutes) taken by a tax preparation service to complete client interviews:

Client Number    Time to Complete Interview (minutes)
 1                8.0
 2               12.0
 3               26.0
 4               10.0
 5               23.0
 6               21.0
 7               16.0
 8               22.0
 9               18.0
10               17.0
11               36.0
12                9.0

You may assume that these data are a representative sample of all client interview times.

(a) Compute the sample mean and the sample standard deviation of this sample.
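As a cross-check on the spreadsheet work for Problem 1, the sketch below computes the quantities asked for in part (a) and the interval ingredients that parts (b) and (c) build on. It is one possible approach, not an official solution. Two assumptions are not stated in the exam: the t critical value for 11 degrees of freedom is hard-coded from a standard t-table, and the sample-size estimate uses the normal approximation with the sample standard deviation as a planning value.

```python
import math
from statistics import NormalDist, mean, stdev

# Interview times (minutes) for the 12 clients in Problem 1.
times = [8.0, 12.0, 26.0, 10.0, 23.0, 21.0,
         16.0, 22.0, 18.0, 17.0, 36.0, 9.0]

n = len(times)
x_bar = mean(times)   # sample mean
s = stdev(times)      # sample standard deviation (divides by n - 1)

# Assumption: two-sided 99% t critical value for df = 11, taken from a t-table.
T_CRIT_DF11_99 = 3.106
half_width = T_CRIT_DF11_99 * s / math.sqrt(n)
ci_99 = (x_bar - half_width, x_bar + half_width)

# Assumption: normal approximation for the sample-size question,
# treating s as the planning value for the population standard deviation.
z_99 = NormalDist().inv_cdf(0.995)
target_half_width = 8.0
n_required = math.ceil((z_99 * s / target_half_width) ** 2)

print(f"mean = {x_bar:.3f}, sd = {s:.3f}")
print(f"99% CI = ({ci_99[0]:.2f}, {ci_99[1]:.2f})")
print(f"approximate sample size for +/- {target_half_width} minutes: {n_required}")
```

In Excel, the same quantities are available through AVERAGE, STDEV.S, and T.INV.2T.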

(b) Construct a 99% confidence interval for the population mean interview time (confidence level β = 99%).

(c) Approximately how large a sample size is required to obtain a 99% confidence interval whose accuracy is +/- 8.0 minutes?

Problem 2. AXY Marketing has been gathering data on people's television viewing habits in smaller metropolitan areas. Ely Nanda, an analyst at AXY, is trying to predict the number of households that tune in to a given television station at any time during a given calendar week. She has gathered data for 25 different stations/broadcast areas and has run a simple linear regression model in which the dependent variable is the number of households (in 10,000s) that tune in to a station sometime during the week. The independent variable is the number of households (in 10,000s) with televisions in the broadcast area. The resulting regression model output appears below. Ely has looked at the output and is discouraged with the results:

(a) Based on the above regression output, provide at least one reason why this regression might not be a good model.

Ely has decided to give her factors some more thought and has come upon the idea that the number of households that tune in to a particular station during the week might also depend on whether the station's channel is VHF or UHF. For example, most VHF stations are major networks (like ABC, CBS, or NBC), which are viewed more often regardless of the size of the broadcast area. Ely has therefore included a dummy variable for whether a station broadcasts on VHF (VHF = 1, UHF = 0).

The results of her multiple linear regression are as follows:

(b) Write a complete equation for the multiple linear regression model that incorporates the estimated coefficients provided by the second regression model output. Make sure to define in words all the variables used in the equation. Do the signs of the regression coefficients make sense?

Hint: define the variables as follows:
Y  = number of households (in 10,000s) that tune in to the station
X1 = number of households (in 10,000s) in the broadcast area
X2 = 1 if the station broadcasts on VHF, 0 otherwise

Problem 3. A medical test for malaria is subject to some error. Given a person who has malaria, the probability that the test will fail to reveal the malaria is 0.06. Given a person who does not have malaria, the test will correctly identify that the person does not have malaria with probability 0.91. In a particular area, 20% of the population suffers from malaria.

(a) If someone has malaria, what is the probability that the test will identify that person as having malaria?

(b) Copy the following joint probability table to your Excel answer sheet and fill in the missing numbers.

                             Has malaria    Does not have malaria    Total
Test indicates malaria       0.188
Test indicates no malaria
Total

(c) Suppose that Richard Rice, a resident of the area, decides to take the test for malaria. If his test results indicate that he has malaria, what is the probability that he actually has malaria?

(d) Suppose three unrelated individuals who are not infected with malaria take the test. What is the probability that at least one of the three individuals will be identified by the test as having malaria?
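The reasoning in Problem 3 follows from the complement rule, the multiplication rule, and Bayes' rule. The sketch below works through the four parts under one added assumption: in part (d) the three test results are treated as independent, which is how "unrelated individuals" is read here. The variable names are illustrative only.

```python
# Given in the problem statement.
p_malaria = 0.20              # prevalence in the area
p_neg_given_malaria = 0.06    # test fails to reveal malaria
p_neg_given_healthy = 0.91    # test correctly reports no malaria

# (a) Complement rule: probability the test detects malaria when present.
p_pos_given_malaria = 1 - p_neg_given_malaria

# (b) Joint probabilities via the multiplication rule.
p_healthy = 1 - p_malaria
joint = {
    ("malaria", "positive"): p_malaria * p_pos_given_malaria,
    ("malaria", "negative"): p_malaria * p_neg_given_malaria,
    ("healthy", "positive"): p_healthy * (1 - p_neg_given_healthy),
    ("healthy", "negative"): p_healthy * p_neg_given_healthy,
}
p_positive = joint[("malaria", "positive")] + joint[("healthy", "positive")]

# (c) Bayes' rule: probability of malaria given a positive test.
p_malaria_given_pos = joint[("malaria", "positive")] / p_positive

# (d) Assumption: the three results are independent.
# P(at least one false positive) = 1 - P(all three test negative).
p_at_least_one_fp = 1 - p_neg_given_healthy ** 3

for key, value in joint.items():
    print(key, round(value, 3))
print("P(positive) =", round(p_positive, 3))
print("P(malaria | positive) =", round(p_malaria_given_pos, 4))
print("P(at least one false positive in 3) =", round(p_at_least_one_fp, 4))
```

The first joint entry reproduces the 0.188 already given in the table, which is a useful check that the table is being filled in with the right orientation.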

Problem 4. The YUMM cereal company distributes Colored Sugar Cereal. Each box is supposed to contain 450 grams of cereal. They also sell cereal in a 2-pack, where each 2-pack contains 2 boxes of cereal. The 2-packs are supposed to have a total weight of 900 grams. YUMM can choose µ, the actual mean amount of cereal to put in each of the boxes, but their filling process has some inaccuracies. Regardless of the value of µ that they select (typically between 450 grams and 500 grams), the amount of cereal placed in the box by their filling process is Normally distributed with mean µ and standard deviation 10 grams. (The mean µ is the same for all of the boxes.) Since each box is poured by the same machine that has been calibrated to the chosen value of µ, the correlation between the weights of any two boxes is CORR = 0.63.

(a) Suppose that YUMM selects µ = 470 grams. What is the probability that any given box is under the 450 grams that the box is supposed to weigh?

(b) Suppose that YUMM selects µ = 460 grams. What is the expected total weight of the 2 boxes in a given 2-pack? What are the variance and the standard deviation of the total weight of the given 2-pack? What is the distribution of the total weight?

(c) Suppose that µ = 460 grams. What is the probability that the total weight of a given 2-pack is less than 900 grams?

(d) At what value should YUMM set µ so that the probability is 0.95 that the weight in any given single box is at least 450 grams?
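The calculations in Problem 4 rest on the Normal distribution and on the variance of a sum of correlated variables, Var(X1 + X2) = Var(X1) + Var(X2) + 2 · CORR · SD(X1) · SD(X2). The sketch below is one way to check the numbers. It assumes the two box weights are jointly Normal, so that their total is also Normal; the exam states only that each box is Normal and gives the correlation.

```python
import math
from statistics import NormalDist

SIGMA = 10.0    # standard deviation of a single box (grams)
CORR = 0.63     # correlation between the two boxes in a 2-pack
TARGET = 450.0  # labelled weight of a single box (grams)

# (a) P(box < 450) when mu = 470.
p_a = NormalDist(mu=470.0, sigma=SIGMA).cdf(TARGET)

# (b) Mean, variance, and standard deviation of the 2-pack total when mu = 460.
mu_b = 460.0
mean_total = 2 * mu_b
var_total = SIGMA**2 + SIGMA**2 + 2 * CORR * SIGMA * SIGMA
sd_total = math.sqrt(var_total)

# (c) P(total < 900), assuming the total is Normal (joint normality assumption).
p_c = NormalDist(mu=mean_total, sigma=sd_total).cdf(2 * TARGET)

# (d) Choose mu so that P(box >= 450) = 0.95, i.e. P(box < 450) = 0.05.
# (450 - mu) / sigma equals the 5th percentile of the standard Normal.
mu_d = TARGET - SIGMA * NormalDist().inv_cdf(0.05)

print(f"(a) P(box < 450 | mu = 470) = {p_a:.4f}")
print(f"(b) mean = {mean_total:.0f}, variance = {var_total:.0f}, sd = {sd_total:.3f}")
print(f"(c) P(total < 900 | mu = 460) = {p_c:.4f}")
print(f"(d) mu = {mu_d:.2f} grams")
```

The Excel counterparts are NORM.DIST for parts (a) and (c) and NORM.S.INV for part (d).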

Problem 5. Four teams of workers are available to do 4 jobs. The cost required for each team to do each job is given in the table below. We want to assign the teams to the jobs at minimum cost.

(a) Write the problem in the form of an integer programming problem.

(b) Solve the integer programming problem with Excel.
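For part (a), the standard assignment formulation is one natural way to write the model. The symbols below are notation introduced for illustration and do not appear in the exam: c_ij stands for the cost from the table of team i doing job j, and x_ij is a binary decision variable equal to 1 if team i is assigned to job j. The formulation assumes each team does exactly one job and each job is done by exactly one team.

```latex
\begin{aligned}
\min \quad & \sum_{i=1}^{4} \sum_{j=1}^{4} c_{ij}\, x_{ij} \\
\text{subject to} \quad
& \sum_{j=1}^{4} x_{ij} = 1, \qquad i = 1, \dots, 4 \quad \text{(each team does one job)} \\
& \sum_{i=1}^{4} x_{ij} = 1, \qquad j = 1, \dots, 4 \quad \text{(each job gets one team)} \\
& x_{ij} \in \{0, 1\}, \qquad i, j = 1, \dots, 4 .
\end{aligned}
```

In Excel Solver, this corresponds to a 4-by-4 block of changing cells constrained to be binary, with row sums and column sums set equal to 1 and a SUMPRODUCT of costs and changing cells as the objective.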