Chapter 21. Margin of Error. Confidence Intervals. Asymmetric Boxes. Interpretation. Examples.

Part VI: Sampling. Accuracy of Percentages

Context

Previously, we assumed that we knew the contents of the box and argued about chances for the draws based on this knowledge. In survey work, we frequently need to turn this reasoning around and argue from the draws to the box. This is called inference.

Three main new ideas: (1) the bootstrap method to estimate the SE, (2) the margin of error, and (3) confidence intervals.

Example

A simple random sample of 2500 voters is taken. In the sample, 1328 people favor the candidate. This is

$\frac{1328}{2500} \times 100\% \approx 53\%$

The sample percentage (a statistic) is used to estimate the population percentage (a parameter). Should the candidate enter the primary? The crucial question is: how wrong is this estimate likely to be?

Recall the simple random sample of 2500 voters, of whom 1328 favor the candidate. The likely size of the chance error is given by the standard error, and to calculate that we need a box model: a box of 0's and 1's whose composition is unknown.

Population: 100,000 voters in the district
Parameter: percentage of voters who favor the candidate
Sample: 2500 people who were polled
Statistic: percentage of voters in the sample who favor the candidate: 53%

$\text{SD of the box} = (1 - 0) \times \sqrt{(\text{fraction of 1's}) \times (\text{fraction of 0's})}$

Problem: We do not know the composition of the box.

Solution: Substitute the fractions observed in the sample for the unknown fractions in the box. So

$\text{SD of the box} \approx (1 - 0) \times \sqrt{\frac{1328}{2500} \times \frac{1172}{2500}} \approx 0.50$

$\text{SE for the sum} \approx \sqrt{2500} \times 0.5 = 25$

$\text{SE for the sample percentage} \approx \frac{25}{2500} \times 100\% = 1\%$

This technique is called the bootstrap. When sampling from a 0-1 box whose composition is unknown, the SD of the box can be estimated by substituting the fractions of 0's and 1's in the sample for the unknown fractions in the box. This estimate is good when the sample is reasonably large.

Thus, the estimate of 53% is likely to be off by 1% or so. The candidate is very likely to win.
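The bootstrap calculation above can be reproduced in a few lines. This is a minimal sketch, not part of the original slides; the function name `bootstrap_se_percentage` is a hypothetical helper chosen for illustration.

```python
import math


def bootstrap_se_percentage(ones: int, n: int) -> float:
    """Estimate the SE of a sample percentage (in percentage points).

    Bootstrap idea: plug the observed fractions of 1's and 0's into
    the SD formula for a 0-1 box, since the true box is unknown.
    """
    frac_ones = ones / n
    frac_zeros = 1 - frac_ones
    sd_box = (1 - 0) * math.sqrt(frac_ones * frac_zeros)
    se_sum = math.sqrt(n) * sd_box      # SE for the number of 1's drawn
    return se_sum / n * 100             # convert to a percentage


estimate = 1328 / 2500 * 100
se = bootstrap_se_percentage(1328, 2500)
print(f"estimate = {estimate:.1f}%, SE = {se:.2f}%")
```

The unrounded estimate is 53.12% and the SE comes out just under 1 percentage point, which matches the rounded figures on the slide.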

The margin of error (not in textbook)

In the media, a margin of error is commonly reported for polls. This is just twice the standard error.

Confidence intervals

In the poll of 2500 voters, 53% of those sampled were in favor of the candidate. The SE for the percentage was estimated as 1%. How far can the population percentage (parameter) be from 53%?

We know that a chance error of more than 2 SEs is unlikely. So let's go 2 SEs in each direction: (51%, 55%). This interval is called a 95% confidence interval for the population percentage.

Interpretation: we are about 95% confident that this interval captures the percentage of voters in the population who favor the candidate.

We can make confidence intervals with any confidence level. Some common levels are:

estimate ± 1 SE: 68% confidence interval
estimate ± 2 SEs: 95% confidence interval
estimate ± 3 SEs: 99.7% confidence interval

These numbers are based on the normal approximation; the method only works if the normal approximation works. The more asymmetric the box, the larger the sample size we need (because of the Central Limit Theorem, see Ch 18.5).
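As an illustration of the margin of error and the "estimate ± k SEs" intervals, here is a small sketch. The function name `confidence_interval` and its parameter `num_se` are assumptions made for this example; the SE is estimated from the sample fractions, as in the poll of 2500 voters.

```python
import math


def confidence_interval(ones: int, n: int, num_se: float = 2.0):
    """Return (low, high) in percent: estimate ± num_se SEs.

    num_se = 1, 2, 3 correspond to roughly 68%, 95%, 99.7%
    confidence, assuming the normal approximation applies.
    """
    frac = ones / n
    se_pct = math.sqrt(frac * (1 - frac) / n) * 100
    estimate_pct = frac * 100
    return estimate_pct - num_se * se_pct, estimate_pct + num_se * se_pct


low, high = confidence_interval(1328, 2500)      # 95% interval
margin_of_error = (high - low) / 2               # twice the SE
print(f"95% CI: ({low:.1f}%, {high:.1f}%), margin of error = {margin_of_error:.1f}%")
```

With unrounded inputs the interval is about (51.1%, 55.1%), which agrees with the (51%, 55%) obtained from the rounded values 53% and 1%.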

Asymmetric boxes

Consider 10, 100, 1000, or 10,000 draws from the following boxes:

Box 1: 500,000 tickets marked 0 and 500,000 tickets marked 1
Box 2: 990,000 tickets marked 0 and 10,000 tickets marked 1
Box 3: 999,995 tickets marked 0 and 5 tickets marked 1

Interpretation of confidence intervals

What does it mean that we are about 95% confident that the interval captures the population parameter?

Remember that the population percentage is a fixed number. Each time we take a different sample, we get a different sample percentage, and thus also a different estimate for the SE. If we repeated this a million times, then about 95% of the confidence intervals would contain the true population percentage, and about 5% would not.

Problem: after computing a confidence interval, we don't know if it is one that contains the true parameter, or if it is one of the few that do not contain the parameter.

Considerations

1 The methods in this chapter only work for simple random samples. For more complicated sampling methods like cluster sampling, we need more complicated formulas. For non-probability sampling methods, we basically have no formulas.
2 The sample size should be small relative to the population (say < 1/10th), so that we can ignore that we draw without replacement.
3 For the bootstrap method to work, the sample size should be reasonably large.
4 For the normal approximation to work, the sample size should be reasonably large. The more asymmetric the box is, the larger the sample size we need.

Example 2

A survey organization takes a simple random sample of 1500 persons from residents in a large city. Among the sampled persons, 1035 were renters.

Fill in the blanks: We estimate that the percentage of renters in the city is ... This estimate is likely to be off by ... or so.

If possible, also construct a 95% confidence interval for the percentage of renters.
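Returning to the repeated-sampling interpretation and the asymmetric boxes: a simulation can show how often "estimate ± 2 SEs" intervals capture the true percentage. The sketch below is an illustration under stated assumptions, not the slides' own method. It draws with replacement (reasonable when the sample is small relative to the box), uses 2000 repetitions instead of a million, and the names `interval_covers` and `coverage` are hypothetical.

```python
import math
import random


def interval_covers(p: float, n: int, rng: random.Random, num_se: float = 2.0) -> bool:
    """Draw n tickets from a 0-1 box with fraction p of 1's and check
    whether the bootstrap interval (estimate ± num_se SEs) contains p."""
    ones = sum(1 for _ in range(n) if rng.random() < p)
    frac = ones / n
    se = math.sqrt(frac * (1 - frac) / n)   # bootstrap SE of the fraction
    return frac - num_se * se <= p <= frac + num_se * se


def coverage(p: float, n: int, reps: int = 2000, seed: int = 0) -> float:
    """Fraction of repeated samples whose interval contains the true p."""
    rng = random.Random(seed)
    hits = sum(interval_covers(p, n, rng) for _ in range(reps))
    return hits / reps


# Symmetric box (50% 1's) versus an asymmetric box (1% 1's).
for p, n in [(0.5, 100), (0.01, 100), (0.01, 1000)]:
    print(f"fraction of 1's = {p}, draws = {n}: coverage = {coverage(p, n):.3f}")
```

For the symmetric box, 100 draws already give coverage close to 95%. For the box with 1% 1's, 100 draws fall well short: many samples contain no 1's at all, so the estimated SE is zero and the interval collapses to a single point. A larger sample helps, in line with consideration 4.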

Example 3

A simple random sample of 6,000 17-year-olds in school was taken. Only 36.1% of the students in the sample knew that Chaucer wrote The Canterbury Tales, but 95.2% knew that Edison invented the light bulb.

(a) If possible, find a 95% confidence interval for the percentage of all 17-year-olds in school who knew that Chaucer wrote The Canterbury Tales.

(b) If possible, find a 95% confidence interval for the percentage of all 17-year-olds in school who knew that Edison invented the light bulb.