Comparing Distributions of Univariate Data


Chapter 3: Comparing Distributions of Univariate Data

Topic 9 covers comparing data and constructing multiple univariate plots.

Topic 9: Multiple Univariate Plots

Example: Building heights in Philadelphia, PA were stored in list phily and folder BLDTALL in Topic 1. Store Seattle building heights (buildings 400 or more feet tall) in list seattle, and New York City building heights (the 24 tallest buildings) in list nyc. Store the following data, in the order listed, in lists seattle and nyc in folder BLDTALL.

seattle: 500 605 609 487 466 514 454 456 543 409 574 943 493 730 580 743 722 448

nyc: 792 927 1046 1250 741 951 850 813 808 730 750 750 1368 1362 915 716 752 739 778 814 745 757 866 861

(Source: Reprinted with permission from the World Almanac and Book of Facts 2000. © 2000 World Almanac Education Group, Inc. All rights reserved.)

ADVANCED PLACEMENT STATISTICS WITH THE TI-89

1. Press APPS, select 1:Flash Apps, and then select the Stats/List Editor.
2. Create the list seattle by highlighting the list1 heading. Press 2nd INS and type the name seattle.
3. Repeat step 2 to insert the name nyc in place of list2.
4. Enter the seattle and nyc data values listed above under the appropriate headings (screen 1).

Parallel Boxplots

Parallel boxplots are the quickest way to get a pictorial overview of the comparison between data lists on the TI-89.

1. From the Stats/List Editor and folder BLDTALL, press Plots, and select 1:Plot Setup.
2. Highlight Plot 1, and press F1 Define to define Plot 1 as a modified boxplot with X List: nyc (screen 2).
3. Press ENTER twice to return to the Plot Setup screen.
4. Repeat steps 2 and 3 for Plot 2 defined for list seattle and Plot 3 defined for list phily (screen 3).
5. From the Plot Setup screen, press ZoomData. After the plots are displayed, press Trace and then press the right arrow key four times (screen 4).
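A modified boxplot draws the five-number summary and marks outliers separately. The sketch below works through that computation for the two lists given in this section. It assumes the usual conventions: quartiles are the medians of the lower and upper halves of the sorted data, and a point is an outlier when it lies more than 1.5 × IQR beyond a quartile. The function names are illustrative only and are not TI-89 commands.

```python
from statistics import median

seattle = [500, 605, 609, 487, 466, 514, 454, 456, 543, 409,
           574, 943, 493, 730, 580, 743, 722, 448]
nyc = [792, 927, 1046, 1250, 741, 951, 850, 813, 808, 730, 750, 750,
       1368, 1362, 915, 716, 752, 739, 778, 814, 745, 757, 866, 861]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) with median-of-halves quartiles."""
    data = sorted(values)
    half = len(data) // 2          # the middle value is excluded when n is odd
    lower, upper = data[:half], data[-half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


def outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [x for x in sorted(values) if x < low_fence or x > high_fence]


for name, data in (("seattle", seattle), ("nyc", nyc)):
    print(name, five_number_summary(data), outliers(data))
```

Under these assumptions the flagged values are 943 for seattle and 1250, 1362, and 1368 for nyc.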

All the distributions are skewed to the right with at least one outlier. New York City (P1) has three outliers of 1250, 1362, and maxX = 1368 feet (the Empire State Building, Two World Trade Center, and One World Trade Center, respectively). The most obvious difference is that New York City has taller buildings (its center is shifted to the right). Seventy-five percent of NYC's 24 tallest buildings are at or above Q1 = 750 feet, while Seattle has only one building that tall (the outlier), and Philadelphia has three buildings that tall (including the two outliers). Philadelphia buildings (minus the outliers) have the greatest overall spread, but NYC's interquartile range (the spread of the central 50% of the data, shown by the box) is the largest, and its center box also has the most skewness. Seattle's middle 50% is almost symmetric (the median line is almost in the center of the box).

1-VarStats for Multiple Lists

1. From the Home screen, press CATALOG, and then press Flash Apps.
2. You are in alpha mode, so you do not press the alpha key. Press the letter O (screen 5). Note the syntax at the bottom of the screen when the pointer is next to OneVar(. NUM is the number of lists designated as x1, x2, ..., x20.
3. Press ENTER, and tistat.onevar( is pasted in the input line of the Home screen. Note: Lists do not need to be of equal length.
4. Type and/or paste 3, phily, seattle, nyc) and then press ENTER to complete the operation (screen 6). (Done is displayed.)
5. Press 2nd VAR-LINK, scroll down to highlight the STATVARS folder, and press the right arrow key to expand the folder and highlight mat1var.
6. Press ENTER to paste mat1var to the Home screen input line.
7. Press ENTER (screen 7).
8. To view the entire matrix of values, press the up arrow key once to highlight the matrix. Press the right or left arrow key to go right or left, and the up or down arrow key to go up or down. (The key is to the right of 2nd.)

Below is a table summary of seven key variables for each of the three cities. As a reminder:

x̄ = mean
sx = standard deviation
n = sample size
Med = median
Q3 = third quartile (75% value)
Q1 = first quartile (25% value)
IQR = interquartile range

         phily   seattle   nyc
x̄        539     571       878
sx       151     133       188
n        24      18        24
Med      489     529       811
Q3       579     609       921
Q1       426     466       750
IQR      153     143       171

Summary measures without outliers:

         phily   seattle   nyc
x̄0       507     549       814
s0       109     101       85
n0       22      17        21
Med0     485     514       792
IQR0     155     146       116
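The effect of removing outliers on these summary measures can be checked away from the calculator. The sketch below recomputes the mean, median, quartiles, and IQR for seattle and nyc, first with all values and then with the outliers dropped. The phily list is not reproduced in this section, so it is left out. The sketch assumes median-of-halves quartiles and the 1.5 × IQR rule, and the helper names are illustrative, not TI-89 functions.

```python
from statistics import mean, median

seattle = [500, 605, 609, 487, 466, 514, 454, 456, 543, 409,
           574, 943, 493, 730, 580, 743, 722, 448]
nyc = [792, 927, 1046, 1250, 741, 951, 850, 813, 808, 730, 750, 750,
       1368, 1362, 915, 716, 752, 739, 778, 814, 745, 757, 866, 861]


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])


def summary(values):
    """Collect the measures shown in the table (standard deviation omitted)."""
    q1, q3 = quartiles(values)
    return {"mean": mean(values), "n": len(values), "med": median(values),
            "q1": q1, "q3": q3, "iqr": q3 - q1}


def drop_outliers(values, k=1.5):
    """Keep only the values within k * IQR of the quartiles."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return [x for x in values if q1 - k * iqr <= x <= q3 + k * iqr]


for name, data in (("seattle", seattle), ("nyc", nyc)):
    print(name, summary(data))
    print(name, "without outliers", summary(drop_outliers(data)))
```

Applying drop_outliers a second time to the reduced nyc list also removes 1046, because the fences are recomputed from the narrower quartiles.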

The summary measures in the first table confirm what you observed from the modified boxplots, but the values calculated without the outliers emphasize the extreme nature of the New York outliers: New York's variability changes from the greatest of the three cities to the least (compare sx and IQR with s0 and IQR0). Screen 8 shows what the boxplot looks like if you delete the outlier values from the data set and regraph. Compare screen 8 with screen 4. With the reduced data set, the Chrysler Building in New York City (1046 feet) becomes a possible outlier.

Multiple Dotplots

The TI-89 has no built-in dotplot function. In Topic 2 you did the plot by hand because dotplots and stemplots are most effective for small to moderate size data lists (histograms work best for longer lists). It will be helpful, however, to build multiple dotplots on the TI-89 using the following method to aid in making comparisons.

1. Copy lists phily, seattle, and nyc to lists list1, list2, and list3 respectively, and sort them in ascending order. (See Chapter 1, Topic 2, Putting Data in Order section.) The Stats/List Editor should resemble screen 9.
2. Replace list4, list5, and list6 with new names t1, t2, and t3 respectively. (See the Do This First chapter, Inserting a New List Name section.)
3. Fill lists t1, t2, and t3 with 1's, 2's, and 3's respectively, using commands seq(1,x,1,24), seq(2,x,1,18), and seq(3,x,1,24). (See the Do This First chapter, Using seq( to Generate a List section.)
4. The screen should resemble screen 10.
5. Change the second 1 in list t1 to 1.1. (This corresponds to the repeated value of 400 in list1.)
6. Press 2nd and the down arrow key to continue down list t1, and change the 8th and 18th t1 values to 1.1.
7. List seattle has no repeats, but in list3 (nyc) there are two 750's in positions 6 and 7, so make the 7th value in t3 equal 3.1.

8. Using Plots, select 1:Plot Setup and F1 Define to create three plots with the specifications shown in the table and in screen 11.

             Plot 1    Plot 2    Plot 3
   Type:     Scatter   Scatter   Scatter
   Mark:     Dot       Dot       Dot
   X List:   list1     list2     list3
   Y List:   t1        t2        t3

9. Set up the window using WINDOW with the following entries (see screen 12):
   xmin = 350, xmax = 1400, xscl = 100
   ymin = -1, ymax = 7, yscl = 0
   xres = 1
10. Press GRAPH (screen 13).
11. If the graph is difficult to see, go back to the Plot Setup screen (step 8) and change the mark in Plot 1, Plot 2, and Plot 3 to + (plus) (screen 14).

You looked at the dotplot for Philadelphia buildings in Topic 2. What the multiple dotplots add to the parallel boxplots is a cluster of three buildings in Seattle around 700 feet, separated by a gap of over 100 feet from the shorter buildings. New York City has a fourth possible outlier at 1046 feet (the Chrysler Building).
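The dotplot construction above amounts to pairing each sorted height with a constant row value and raising repeated heights by 0.1 so they do not overprint. A minimal sketch of that idea follows. The function name dotplot_points and its step parameter are illustrative assumptions, not TI-89 commands, and only the seattle and nyc lists from this section are used.

```python
seattle = [500, 605, 609, 487, 466, 514, 454, 456, 543, 409,
           574, 943, 493, 730, 580, 743, 722, 448]
nyc = [792, 927, 1046, 1250, 741, 951, 850, 813, 808, 730, 750, 750,
       1368, 1362, 915, 716, 752, 739, 778, 814, 745, 757, 866, 861]


def dotplot_points(values, level, step=0.1):
    """Return (x, y) pairs for one dotplot row.

    y starts at `level` and is raised by `step` for each earlier
    occurrence of the same x, mirroring the manual edits to t1 and t3.
    """
    seen = {}
    points = []
    for x in sorted(values):
        repeats = seen.get(x, 0)
        points.append((x, round(level + step * repeats, 10)))
        seen[x] = repeats + 1
    return points


row2 = dotplot_points(seattle, 2)   # plays the role of (list2, t2)
row3 = dotplot_points(nyc, 3)       # plays the role of (list3, t3)

# The second 750 in nyc is the 7th sorted value and is lifted to 3.1.
print(row3[5], row3[6])

# The largest gap between neighbouring Seattle heights below 800 feet.
xs = [x for x, _ in row2 if x < 800]
print(max(b - a for a, b in zip(xs, xs[1:])))
```

The pairs can be handed to any scatter plotting tool. The gap printed at the end is the one between 609 and 722 feet that separates Seattle's cluster near 700 feet from its shorter buildings.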

Back-to-Back Stemplots

Use the sorted values in list1, list2, and list3 to create the following stemplots as you did in Topic 2.

Note: The back-to-back stemplots are modified to include a third list of data.

Key: 4 | 1 = 410 ft

Philadelphia   Stem   Seattle   Stem   New York City   Notes
    44221100    4     1
     9999885    *     556799
           0    5     014
         977    *     78                               City Hall (Philadelphia)
                6     11                               Space Needle (Seattle)
                *
          40    7     234        7     2344
           9    *                *     5555689
                8                8     111
           5    *                *     567
                9     4          9     23              Seattle's Columbia Seafirst Center
           5    *                *     5               One Liberty Place (Philadelphia)
               10               10
                *                *     5               Chrysler Bldg.
               11               11
                *                *
               12               12
                *                *     5               Empire State Bldg.
               13               13
                *                *     67              Two & One World Trade Center
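The leaves in a split stemplot like this one can be generated mechanically. The sketch below assumes each height is rounded to the nearest ten feet (halves rounded up), the hundreds digits form the stem, leaves 0 to 4 go on the plain stem, and leaves 5 to 9 go on the starred stem. The function name split_stemplot is illustrative, and only the seattle and nyc lists from this section are used.

```python
seattle = [500, 605, 609, 487, 466, 514, 454, 456, 543, 409,
           574, 943, 493, 730, 580, 743, 722, 448]
nyc = [792, 927, 1046, 1250, 741, 951, 850, 813, 808, 730, 750, 750,
       1368, 1362, 915, 716, 752, 739, 778, 814, 745, 757, 866, 861]


def split_stemplot(values, unit=10):
    """Map each stem label ('7' or '7*') to its string of leaves."""
    rows = {}
    for x in sorted(values):
        tens = (x + unit // 2) // unit      # round to the nearest ten, halves up
        stem, leaf = divmod(tens, 10)       # hundreds part and tens digit
        label = f"{stem}*" if leaf >= 5 else str(stem)
        rows[label] = rows.get(label, "") + str(leaf)
    return rows


for name, data in (("seattle", seattle), ("nyc", nyc)):
    print(name)
    for label, leaves in split_stemplot(data).items():
        print(f"  {label:>3} | {leaves}")
```

Stems with no leaves are simply absent from the returned dictionary, so a full display would still need the empty rows filled in.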

The previous stemplots show all the data to the nearest ten feet. All cities' lists are skewed toward taller values, with New York City having the majority of the taller buildings and Philadelphia the majority of the shorter buildings. The variability, clusters, gaps, and outliers are consistent with what you observed in the dotplots and modified boxplots.

Multiple (Sparse) Histograms

To combine the advantages of both histograms and dotplots, you will compare histograms with many cells. With too many cells, a Plot Setup error will occur. Bucket widths of 25 feet will work. Using this width, the maximum frequency in any cell is 6 for the phily data, 4 for the nyc data, and 3 for the seattle data. Allowing one extra unit of height gives 6 + 1 = 7 per histogram, and 7 × 3 = 21, so a window with ymax − ymin = 21 fits three histograms on one graph screen.

1. From the Stats/List Editor, press Plots, 1:Plot Setup and F1 Define to create the following three plots (see screen 15):

                   Plot 1      Plot 2      Plot 3
   Type:           Histogram   Histogram   Histogram
   X:              nyc         seattle     phily
   Bucket width:   25          25          25

2. Highlight Plot 2 and Plot 3 and press ✓ to deselect the plots. Observe in screen 15 that Plot 1 is the only one checked and active.
3. Set up the window using WINDOW with the following entries (see screen 16; the histogram is the top third of the graph screen):
   xmin = 350, xmax = 1400, xscl = 100
   ymin = -14, ymax = 7, yscl = 0
   xres = 1

4. Press GRAPH (screen 17).
5. Press F1 Tools and select 2:Save Copy As (screen 18).
6. Select Type: Picture and Folder: BLDTALL. In the Variable: field, type histo. Press ENTER.
7. Return to the Plot Setup screen. Highlight Plot 1 and press ✓ to deselect it.
8. Select Plot 2 (✓) with seattle data and change the window (WINDOW) to the following entries (see screen 19):
   xmin = 350, xmax = 1400, xscl = 100
   ymin = -7, ymax = 14, yscl = 0
   xres = 1
9. Press GRAPH for the middle histogram (screen 20).
10. Press F1 Tools, select 1:Open picture histo, and then select Type: Picture.
11. Press ENTER and the top two graphs are displayed (screen 21).
12. Repeat steps 5 and 6 corresponding to screen 18 to save these graphs in place of the old histogram.

13. From the Plot Setup menu, deselect Plot 2, select Plot 3 with phily data, and change the window (WINDOW) to the following entries (see screen 22):
    xmin = 350, xmax = 1400, xscl = 100
    ymin = 0, ymax = 21, yscl = 0
    xres = 1
14. Press GRAPH for the bottom histogram.
15. Press F1 Tools, select 1:Open picture histo, and then select Type: Picture.
16. Press ENTER to view all three histograms (screen 23). Skewness, clusters, gaps, and outliers are all shown in relationship to the other data sets.

Parallel Boxplots with Multiple Dotplots

Screen 24 gives two types of comparison on the same screen. Can you duplicate it?
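The window arithmetic used for the three stacked histograms can be sketched in a few lines. The sketch assumes buckets of width 25 that start at xmin = 350, and a vertical band of (maximum frequency + 1) units per histogram, with the first plot in the top band. The function names are illustrative, and the maximum frequency of 6 for the phily data is taken from the text because that list is not reproduced in this section.

```python
seattle = [500, 605, 609, 487, 466, 514, 454, 456, 543, 409,
           574, 943, 493, 730, 580, 743, 722, 448]
nyc = [792, 927, 1046, 1250, 741, 951, 850, 813, 808, 730, 750, 750,
       1368, 1362, 915, 716, 752, 739, 778, 814, 745, 757, 866, 861]


def bucket_counts(values, xmin=350, width=25):
    """Count the values falling in each bucket [start, start + width)."""
    counts = {}
    for x in values:
        start = xmin + ((x - xmin) // width) * width
        counts[start] = counts.get(start, 0) + 1
    return counts


def stacked_windows(max_freq, plots=3):
    """Return one (ymin, ymax) pair per plot, top band first.

    Every window is plots * (max_freq + 1) units tall, and each plot's
    baseline y = 0 sits at the bottom of its own band.
    """
    band = max_freq + 1
    return [(-(plots - 1 - i) * band, (i + 1) * band) for i in range(plots)]


print(max(bucket_counts(nyc).values()), max(bucket_counts(seattle).values()))
print(stacked_windows(6))
```

With a maximum frequency of 6 this yields the window pairs (-14, 7), (-7, 14), and (0, 21) used in steps 3, 8, and 13.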