Box Plots. So that I can: look at a large amount of data in condensed form.


LESSON 5: Box Plots

LEARNING OBJECTIVES
Today I am: creating box plots.
So that I can: look at a large amount of data in condensed form.
I'll know I have it when I can: make observations about the data based on the IQR.

Opening Exercise

Consider the following scenario. A television game show, Fact or Fiction, was canceled after nine shows. Many people watched the nine shows and were rather upset when it was taken off the air. A random sample of eighty viewers of the show was selected. Viewers in the sample responded to several questions. The dot plot below shows the distribution of ages of these eighty viewers.

A data distribution that is not symmetrical is described as skewed. In a skewed distribution, data stretch either to the left or to the right. The stretched side of the distribution is called a tail.

1. Would you consider this data set to be skewed? Explain your thinking.

Module 1: Descriptive Statistics

Exploratory Challenge 1: Constructing and Interpreting the Box Plot

2. Using the dot plot in the Opening Exercise, construct a box plot over the dot plot by completing the following steps. Recall that there are 80 data points in the dot plot.
A. Locate the middle 40 observations, and draw a box around these values.
B. Calculate the median, and then draw a vertical line in the box at the location of the median. Median: 60
C. Draw a line that extends from the upper end of the box to the largest observation in the data set.
D. Draw a line that extends from the lower edge of the box to the minimum value in the data set.

3. Recall that the five values used to construct the box plot make up the 5-number summary. What is the 5-number summary for this data set of ages?
Minimum age: 6
Lower quartile or Q1: 40
Median age: 60
Upper quartile or Q3: 70
Maximum age: 75
IQR = 70 − 40 = 30
Range = 75 − 6 = 69

Unit 1: Measuring Distributions, Lesson 5: Box Plots

4.
A. What percent of the data does the box part of the box plot capture? 50%
B. What percent of the data fall between the minimum value and Q1? 25%
C. What percent of the data fall between Q3 and the maximum value? 25%

5. Why do we use the median for a box plot? The possibility of skewed data.

6. What are the advantages and challenges to using a box plot?

Fill in each blank with the appropriate word from the word bank.

7. Each section is called a ____ [quartile], since the data is split into ____ [four] sections (____ [quarters]).
8. The box is also called the ____ [interquartile range] or ____ [IQR].
9. Each ____ [section] holds ____ [one-fourth or 25%] of the data.
10. The IQR can be determined by subtracting the ____ [first] quartile from the ____ [third] quartile.

Word Bank: first, four, Interquartile Range, IQR, one-fourth or 25%, quarters, quartile, section, third

Exploratory Challenge 2: Comparing Data

11. Ron is taking a survey to find out how many pencils each of his friends has. The data is below.
Number of pencils in their pencil pouch: 1, 2, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 6, 7, 8, 10, 11
A. What is the 5-number summary for this data?
Minimum: 1; Q1: 4; Median: 6; Q3: 6.5; Maximum: 11
B. Draw the box plot below.
C. Describe the box plot using SOCS. Shape: unimodal. Center: 6. Spread: IQR = 2.5.

12. Neville joins the group and has 3 pencils in his pencil pouch. The updated data is below.
Number of pencils in their pencil pouch: 1, 2, 3, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 6, 7, 8, 10, 11
A. What is the 5-number summary for this data?
Minimum: 1; Q1: 4; Median: 5.5; Q3: 6; Maximum: 11
B. Draw the box plot below.
C. Describe the box plot using SOCS.
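The 5-number summaries for Problems 11 and 12 can be checked with a short script. This is only a sketch: the function name `five_number_summary` is made up for illustration, and it assumes the common classroom convention in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. Software packages often use other quartile conventions and may give slightly different Q1 and Q3.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves; the middle value is excluded from both halves when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values before the median position
    upper = data[half + n % 2:]    # values after the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Problem 11: Ron's original survey (17 friends)
pencils = [1, 2, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 6, 7, 8, 10, 11]

# Problem 12: Neville joins with 3 pencils (18 friends)
pencils_with_neville = sorted(pencils + [3])

for label, data in [("Problem 11", pencils), ("Problem 12", pencils_with_neville)]:
    low, q1, med, q3, high = five_number_summary(data)
    print(label, "min", low, "Q1", q1, "median", med, "Q3", q3, "max", high)
```

Under this convention the script gives 1, 4, 6, 6.5, 11 for Problem 11 and 1, 4, 5.5, 6, 11 for Problem 12.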

13. Did Neville's data change the box plot significantly? Not really.

14. Hermione joins the group and has 20 pencils in her pencil pouch. Do you think 20 is an outlier for this data set? Explain your thinking.

A data distribution may contain extreme data (unusually large or unusually small relative to the median and the IQR). A box plot can be used to display extreme data values that are identified as outliers. An outlier is defined to be any data value that is more than 1.5 × IQR away from the nearest quartile.

Lower Boundary = Q1 − 1.5 × IQR
Upper Boundary = Q3 + 1.5 × IQR

15. Hermione joins the group and has 20 pencils in her pencil pouch. The updated data is below.
Number of pencils in their pencil pouch: 1, 2, 3, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 6, 7, 8, 10, 11, 20
A. What is the 5-number summary for this data?
Minimum: 1; Q1: 4; Median: 6; Q3: 7; Maximum: 20
B. Calculate the IQR (interquartile range). IQR = 7 − 4 = 3
C. Do you think 20 is an outlier? How can we know for sure? Use the formula.
D. Determine if 20 is an outlier for this data set. 3 × 1.5 = 4.5, and 4.5 + 7 = 11.5. Since 20 > 11.5, 20 is an outlier.
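The 1.5 × IQR rule in Problem 15 can be sketched in a few lines of code. The helper names `quartiles` and `find_outliers` are hypothetical, and the quartiles again assume the median-of-halves convention (middle value excluded when the count is odd), which matches the hand calculation above.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves (assumed convention)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[:half]), median(data[half + n % 2:])


def find_outliers(values):
    """Return (lower boundary, upper boundary, outliers) using the 1.5 x IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_boundary = q1 - 1.5 * iqr
    upper_boundary = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_boundary or v > upper_boundary]
    return lower_boundary, upper_boundary, outliers


# Problem 15: the group after Hermione joins with 20 pencils
pencils = [1, 2, 3, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 6, 7, 8, 10, 11, 20]

lower_boundary, upper_boundary, outliers = find_outliers(pencils)
print("Lower boundary:", lower_boundary)
print("Upper boundary:", upper_boundary)
print("Outliers:", outliers)
```

With Q1 = 4 and Q3 = 7, the boundaries come out to −0.5 and 11.5, so 20 is the only value flagged.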

E. Draw the box plot below.
F. How did the box plot change by adding Hermione's 20 pencils? What parts changed very little? What parts changed significantly?

16. Use the box plots below to answer the following questions about Carl's and Angela's box and whisker plots.
A. Estimate what the lower quartile for Angela is.
B. Who has the higher maximum?
C. Estimate what Carl's range is.
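One way to explore Problem 15F is to compare a few statistics before and after Hermione's 20 pencils are added. The sketch below uses the pencil data from Problems 12 and 15; the `describe` helper is a made-up name, and the mean is included only for comparison, since the worksheet itself does not ask for it.

```python
from statistics import mean, median


def describe(values):
    """Return a small dictionary of center and spread measures."""
    return {
        "mean": round(mean(values), 2),
        "median": median(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


# Problem 12 data (before Hermione) and Problem 15 data (after Hermione)
before = [1, 2, 3, 4, 4, 4, 4, 5, 5, 6, 6, 6, 6, 6, 7, 8, 10, 11]
after = before + [20]

stats_before = describe(before)
stats_after = describe(after)

for name in stats_before:
    print(name, stats_before[name], "->", stats_after[name])
```

The median moves only from 5.5 to 6, while the maximum and the range nearly double. That is why the box changes very little and the right side of the plot changes a great deal.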

17.
A. True or False: Angela's IQR is larger than Carl's IQR.
B. True or False: Carl's median is higher than Angela's median.
C. True or False: About 25% of Carl's sales were between $46 and $63.
D. True or False: About 75% of Angela's sales were between $0 and $40.
E. True or False: Angela's maximum is about $63.

18. Based on the data given, who should win Employee of the Month at Coldstone? Support your answer with statistics.

19. True or False: Angela and Carl sold about the same number of ice creams that day.

Lesson Summary

20. Use the diagram and the word list to identify the five-number summary that makes up a box plot. Then complete the sentences.
Word Bank for Diagram: Lower Quartile, Upper Quartile, Maximum, Median, Minimum

Nonsymmetrical data distributions are referred to as ____.
Left-skewed or skewed to the left means the data spread out (like a tail) on the left side.
Right-skewed or skewed to the right means the data spread out (like a tail) on the right side.
The center of a skewed data distribution is described by the ____.
Variability of a skewed data distribution is described by the interquartile range (____).
The IQR describes variability by specifying the length of the interval that contains the middle ____% of the data values.
Outliers in a data set are defined as those values ____ than 1.5 × IQR ____ box plot.

NAME:          PERIOD:          DATE:

Homework Problem Set

An advertising agency researched the ages of viewers most interested in various types of television ads. Consider the following summaries:

Ages     Target Products or Services
30–45    Electronics, home goods, cars
46–55    Financial services, appliances, furniture
56–72    Retirement planning, cruises, health-care services

1. The mean age of the people surveyed is approximately 50 years old. As a result, the producers of the show decided to obtain advertisers for a typical viewer of 50 years old.
A. According to the table, what products or services do you think the producers will target?
B. Based on the sample, what percent of the people surveyed about the Fact or Fiction show would have been interested in these commercials if the advertising table is accurate?

2. The show failed to generate the interest the advertisers hoped. As a result, they stopped advertising on the show, and the show was canceled. Kristin made the argument that a better age to describe the typical viewer is the median age.
A. What is the median age of the sample?
B. What products or services does the advertising table suggest for viewers if the median age is considered as a description of the typical viewer?
C. What percent of the people surveyed would be interested in the products or services suggested by the advertising table if the median age were used to describe a typical viewer?

3.
A. What percent of the viewers have ages between Q1 and Q3?
B. The difference between Q3 and Q1, or Q3 − Q1, is called the interquartile range, or IQR. What is the IQR for this data distribution?

4. Do you think producers of the show would prefer a show that has a small or large interquartile range? Explain your answer.

5. Do you agree with Kristin's argument that the median age provides a better description of a typical viewer? Explain your answer.

6. Which ages, if any, do you think are outliers for the viewer ages in the box plot below?

Students at Waldo High School are involved in a special project that involves communicating with people in Kenya. Consider a box plot of the ages of 200 randomly selected people from Kenya.

sample, these four ages were considered outliers.

7.

8.
A. What is the median age of the sample of ages from Kenya?
B. What are the approximate values of Q1 and Q3?
C. What is the approximate IQR of this sample?
D. Multiply the IQR by 1.5. What value do you get?
E. Add 1.5 × IQR to the third quartile age (Q3). What do you notice about the four
F. Are there any age values that are less than Q1 − 1.5 × IQR? If so, these ages would also be considered outliers.
G. of the box plot for ages of the people in the sample from Kenya.

Consider the following scenario. Transportation officials collect data on flight delays (the number of minutes a flight takes off after its scheduled time). Consider the dot plot of the delay times in minutes for 60 BigAir flights during December 2012.

9. How many flights left more than 60 minutes late?

10. Why is this data distribution considered skewed?

11. Is the tail of this data distribution to the right or to the left? How would you describe several of the delay times in the tail?

12. Draw a box plot over the dot plot of the flights for December.

13. What is the interquartile range, or IQR, of this data set?

14. The mean of the 60 flight delays is approximately 42 minutes. Do you think that 42 minutes is typical of the number of minutes a BigAir flight was delayed? Why or why not?

15. Based on the December data, write a brief description of the BigAir flight distribution for December.

16. Calculate the percentage of flights with delays of more than 1 hour. Were there many flight delays of more than 1 hour?

17. BigAir later indicated that there was a flight delay that was not included in the data. The flight not reported was delayed for 48 hours. If you had included that flight delay in the box plot, how would you have represented it? Explain your answer.

18.
A. Consider a dot plot and the box plot of the delay times in minutes for 60 BigAir flights during January 2013. How is the January flight delay distribution different from the one summarizing the December flight delays? In terms of flight delays in January, did BigAir improve, stay the same, or do worse compared to December? Explain your answer.
B. Do you think this data set contains any outliers? Explain your thinking.

Spiral Review: Histograms

19. How many students took the algebra test?

20. Which grade has the most test scores?

21. Which grades have the same number of test scores?

22. How many more students earned 85–89 than earned 80–84?

23. How is this histogram different from the ones you studied in Lessons 2 and 3?