What is Statistics?

13.1 What is Statistics?

Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.

Population: The collection of all outcomes, responses, measurements, or counts that are of interest.
Sample: A portion or subset of the population.

ch 13 Angel & Porter (6th ed)

Descriptive Statistics: Involves organizing, summarizing, and displaying data (e.g. tables, charts, averages).
Inferential Statistics: Involves using sample data to draw conclusions about a population.

Sampling Techniques
Simple Random Sample: Each member of the population has an equal chance of being selected.

Please read section 13.2, The Misuses of Statistics.

13.3 Frequency Distributions

Consider the following data collected from students in a class:

Number of traffic tickets received
0 4 4 5 4 1 5 5 6 7 0 4 3 1 3 3 3 4 3 4 2 1 6 4 5

It is usually helpful to summarize a large amount of data in a frequency distribution. A frequency distribution is a listing of the observed values and the corresponding frequency of occurrence of each value.

Example: Construct a frequency distribution for the number of traffic tickets received.

If you have a large data set in which few numbers are repeated, it may be helpful to create a grouped frequency distribution.

Example: The following data represent the monthly account balances (to the nearest dollar) for a sample of fifty credit card users.

138  78 175  46  79 118  90 163  88 107
126 154  85  60  42  54  62 128 114  73
129 130  81  67 119 116 145 105  96  71
100 145 117  60 125 130  94  88 136 112
118  84  74  62  81 110 108  71  85 165
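For the traffic-ticket example, the tallying can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_distribution` is made up here, and the slides expect the table to be built by hand.

```python
from collections import Counter

# Number of traffic tickets received, as listed on the slide (25 students).
tickets = [0, 4, 4, 5, 4, 1, 5, 5, 6, 7, 0, 4, 3, 1, 3,
           3, 3, 4, 3, 4, 2, 1, 6, 4, 5]

def frequency_distribution(values):
    """Return {observed value: frequency}, ordered by observed value."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}

ticket_freq = frequency_distribution(tickets)

print("Tickets  Frequency")
for value, freq in ticket_freq.items():
    print(f"{value:>7}  {freq:>9}")
```

The frequencies should add up to the number of students, which is a quick check on any hand-made table as well.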

Rules for Data Grouped by Classes
1. The classes should be of the same width.
2. The classes should not overlap.
3. Each piece of data should belong to only one class.

You should use between ___ and ___ classes (intervals).

Let's arbitrarily make the first interval go from 40 to 59. This means the second interval must start at 60. We say the class width is 20, since there are 20 numbers in the first interval (40, 41, 42, ..., 58, 59).

The modal class of a frequency distribution is the class with the highest frequency.

The midpoint of a class (called the class mark) is found by

$$\text{class mark} = \frac{\text{lower limit} + \text{upper limit}}{2}$$

Example #16: Construct a frequency distribution with a first class of 42-47.

57 57 49 52 50 51 51 56 46 61
61 64 56 47 56 60 61 57 54 50
46 55 55 62 52 57 68 48 54 54
51 43 69 58 51 65 49 42 54 55
64
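The grouping rules can be sketched in Python using the fifty credit card balances and the first class of 40-59 chosen above. The helper name `grouped_frequency` is hypothetical, and the sketch assumes whole-number data with no value below the first lower limit.

```python
# Monthly account balances (nearest dollar) for fifty credit card users.
balances = [
    138, 78, 175, 46, 79, 118, 90, 163, 88, 107,
    126, 154, 85, 60, 42, 54, 62, 128, 114, 73,
    129, 130, 81, 67, 119, 116, 145, 105, 96, 71,
    100, 145, 117, 60, 125, 130, 94, 88, 136, 112,
    118, 84, 74, 62, 81, 110, 108, 71, 85, 165,
]

def grouped_frequency(values, first_lower, width):
    """Count values in equal-width, non-overlapping classes.

    Returns a list of (lower limit, upper limit, class mark, frequency).
    Assumes integer data and min(values) >= first_lower.
    """
    n_classes = (max(values) - first_lower) // width + 1
    counts = [0] * n_classes
    for v in values:
        counts[(v - first_lower) // width] += 1  # each value lands in one class
    table = []
    for i, freq in enumerate(counts):
        lower = first_lower + i * width
        upper = lower + width - 1
        table.append((lower, upper, (lower + upper) / 2, freq))
    return table

table = grouped_frequency(balances, first_lower=40, width=20)
modal_class = max(table, key=lambda row: row[3])  # class with highest frequency

for lower, upper, mark, freq in table:
    print(f"{lower}-{upper}  class mark {mark}  frequency {freq}")
```

The same function could be called with `first_lower=42, width=6` to check a hand-built table for Example #16.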

13.4 Statistical Graphs

Often it is easier to understand information when it is summarized in a graph. We will look at 3 types of graphs.

Circle Graphs (Pie Graphs)
A circle graph shows the relationship of each category to the whole by visually comparing the sizes of the slices of the pie. Information displayed in a circle graph needs to be in categories (non-numeric).

[Figure: circle graph, "Letter grade distribution on an exam": A 48%, B 13%, C 17%, D 13%, F 9%]

Example #10: In 2000 there were 66.6 million households online worldwide. Of the total, 57% were in North America. Estimate the number of households online in each region shown on the graph.

[Figure: circle graph of households online by region: North America 57%, Europe 25%, Asia/Pacific rim 15%, Other 3%]

Histograms and frequency polygons are used to illustrate numeric data contained in a frequency distribution.

Histogram
Observed data is placed on the horizontal axis and frequencies on the vertical axis. A rectangle is placed above each value or class indicating the frequency for that value or class.

[Figure: histogram; horizontal axis: observed values (0 to 4); vertical axis: frequency (0 to 12)]
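Returning to Example #10, each estimate is the region's percentage of the 66.6 million total. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
total_households_millions = 66.6

# Percent of online households per region, read from the circle graph.
region_share_percent = {
    "North America": 57,
    "Europe": 25,
    "Asia/Pacific rim": 15,
    "Other": 3,
}

# Estimated households (in millions) = total * percent / 100.
estimates = {
    region: total_households_millions * pct / 100
    for region, pct in region_share_percent.items()
}

for region, millions in estimates.items():
    print(f"{region}: about {millions:.1f} million households")
```

Because the slices of a circle graph must total 100%, the regional estimates should add back up to the worldwide figure.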

Frequency Polygon
Observed data is placed on the horizontal axis and frequencies on the vertical axis. A dot is placed at the corresponding frequency above each observed value or class. The dots are connected with straight line segments.

[Figure: frequency polygon; horizontal axis: observed values (-1 to 5); vertical axis: frequency (0 to 12)]

Example #14: The frequency distribution shown indicates the ages of a group of 40 people attending a party.
(A) Construct a histogram of the frequency distribution.
(B) Construct a frequency polygon of the frequency distribution.

Age               20  21  22  23  24  25  26  27
Number of People   6   3   0   4   6   3   8  10

Example #16: The frequency distribution illustrates the annual salaries, in thousands of dollars, of the people in management positions at the Bradley Thomas Corporation.
(A) Construct a histogram of the frequency distribution.
(B) Construct a frequency polygon of the frequency distribution.

Salary (in $1000)  20-25  26-31  32-37  38-43  44-49  50-55  56-61
Number of People       4      6      8      9      8      5      3
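For grouped data such as the salaries in Example #16, the dots of a frequency polygon sit above the class marks. The sketch below lists those points and prints a rough text histogram. Closing the polygon with a zero-frequency point one class width beyond each end is a common convention assumed here, not something the slides state.

```python
# Salary classes (in $1000) and number of people, from Example #16.
classes = [(20, 25), (26, 31), (32, 37), (38, 43), (44, 49), (50, 55), (56, 61)]
frequencies = [4, 6, 8, 9, 8, 5, 3]

# Class mark = (lower limit + upper limit) / 2.
class_marks = [(lower + upper) / 2 for lower, upper in classes]

# Class width = distance between consecutive lower limits.
width = classes[1][0] - classes[0][0]

# Polygon points, padded with zero-frequency end points (assumed convention).
polygon_points = (
    [(class_marks[0] - width, 0)]
    + list(zip(class_marks, frequencies))
    + [(class_marks[-1] + width, 0)]
)

# Rough text histogram: one '#' per person in the class.
for (lower, upper), freq in zip(classes, frequencies):
    print(f"{lower}-{upper} | {'#' * freq} {freq}")
```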

Example: The following histogram represents the record high temperature for different states.

[Figure: histogram, "Record High Temperature"; horizontal axis: Temperature, labeled 102, 107, 112, 117, 122, 127, 132; vertical axis: Frequency (0 to 20)]

(A) How many states were surveyed?
(B) What are the lower and upper class limits of the first and second classes?
(C) How many states have a record high temperature in the class with a class mark of 122?
(D) What is the class mark of the modal class?

13.5 Measures of Central Tendency

I. Mean: The sum of all data values divided by the number of values.

For a sample:

$$\bar{x} = \frac{\sum x}{n}$$

Sigma notation: $\sum x$ means add all of the data values (x) in the data set; n is the number of values.

Example: An instructor recorded the number of absences for his students in one semester. For a random sample the data are:

2 4 2 0 40 2 4 3 6

Find the sample mean.

II. Median: The middle value of an ordered data set. Half of the measurements fall below the median and half are above.

Example: For the absence data above, find the median.

III. Mode: The value with the highest frequency. If no entry is repeated, there is no mode.

Example: For the absence data above, find the mode.

IV. Midrange: The value halfway between the lowest and highest values in the data set.

$$\text{Midrange} = \frac{\text{lowest value} + \text{highest value}}{2}$$

Example: For the instructor's random sample of absences (2 4 2 0 40 2 4 3 6), find the midrange.

Example (cont.): Suppose the student with 40 absences is dropped from the course. Calculate the mean, median, mode, and midrange of the remaining values. Compare the effect of the change to each type of average.

2 4 2 0 2 4 3 6

Comparing the Mean, Median, Mode, and Midrange

The mean is used most often because it uses all of the data values in its computation. Thus it is almost always a good representative value. The mean is the only measure of central tendency that can be affected by any change in the data set.

If the data set contains "extreme values" (called outliers), the median provides a more accurate measure of central tendency. Look at the effect of the 40 on the mean and median in the previous example.
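The effect of the extreme value can be checked numerically. The sketch below computes all four averages for the absence data with and without the 40. The helper names `modes` and `averages` are hypothetical, and `modes` returns an empty list when no entry is repeated, matching the "no mode" rule above.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Values with the highest frequency; empty list if nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

def averages(values):
    """Mean, median, mode(s), and midrange of a data set."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": modes(values),
        "midrange": (min(values) + max(values)) / 2,
    }

absences = [2, 4, 2, 0, 40, 2, 4, 3, 6]
remaining = [x for x in absences if x != 40]  # student with 40 absences dropped

with_40 = averages(absences)
without_40 = averages(remaining)

for name in ("mean", "median", "mode", "midrange"):
    print(f"{name:>8}: {with_40[name]} -> {without_40[name]}")
```

Dropping the 40 moves the mean and the midrange a long way, while the median shifts only slightly and the mode does not change.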

The mode is the easiest to "compute"; however, it may not be very useful if the data set is small. The mode is useful when discussing such ideas as shoe size. If a retailer is ordering shoes, it would be helpful to know the most common shoe size.

The midrange is seldom used. Because it only uses the lowest and highest values, it is too sensitive to extreme values.

13.6 Measures of Dispersion

Measures of dispersion tell how spread out the data are. Consider the heights of the five starting players on each of two men's college basketball teams.

Team A: 72 73 76 76 78    Mean = 75   Median = 76   Mode = 76
Team B: 67 72 76 76 84    Mean = 75   Median = 76   Mode = 76

These sets are different due to variation.

I. Range = highest value - lowest value

Example: Range A = ___   Range B = ___

Drawback: The range only uses 2 numbers from a data set.
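A short sketch of the range for the two teams, assuming the heights read as Team A: 72, 73, 76, 76, 78 and Team B: 67, 72, 76, 76, 84 (both lists are consistent with the stated mean of 75). The function name `value_range` is chosen here for illustration.

```python
# Heights of the five starters on each team (assumed reading of the slide).
team_a = [72, 73, 76, 76, 78]
team_b = [67, 72, 76, 76, 84]

def value_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)

range_a = value_range(team_a)
range_b = value_range(team_b)

print("Range A =", range_a)
print("Range B =", range_b)
```

The two teams share the same mean, median, and mode, so only a measure of dispersion such as the range tells them apart.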

The deviation for each value x is the difference between the value of x and the mean of the data set. In a sample, the deviation for each value x is:

$$x - \bar{x}$$

II. Sample Standard Deviation:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

[Figure: number line from 72 to 78 with $\bar{x} = 75$ marked, showing the deviations -3, -2, 1, 1, 3]

Procedure to find the standard deviation (p. 696):
1. Calculate the mean.
2. Make a chart with 3 columns: Data, Data - Mean, (Data - Mean)^2.
3. Fill in each column.
4. Add the values in the (Data - Mean)^2 column.
5. Divide the sum by n - 1.
6. Take the square root of the quotient.
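The six-step procedure translates almost line for line into code. The sketch below applies it to heights of 72, 73, 76, 76, 78, whose deviations from the mean of 75 are -3, -2, 1, 1, 3. The function name `sample_std_dev` is hypothetical.

```python
from math import sqrt

def sample_std_dev(data):
    """Follow the chart procedure; return (chart rows, standard deviation)."""
    n = len(data)
    mean = sum(data) / n                                    # step 1
    chart = [(x, x - mean, (x - mean) ** 2) for x in data]  # steps 2 and 3
    total = sum(squared for _, _, squared in chart)         # step 4
    quotient = total / (n - 1)                              # step 5
    return chart, sqrt(quotient)                            # step 6

heights = [72, 73, 76, 76, 78]
chart, s = sample_std_dev(heights)

print("Data  Data - Mean  (Data - Mean)^2")
for x, deviation, squared in chart:
    print(f"{x:>4}  {deviation:>11}  {squared:>15}")
print(f"s = {s:.2f}")
```

Dividing by n - 1 rather than n is what makes this the sample standard deviation; Python's `statistics.stdev` uses the same divisor and can serve as a cross-check.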

Example: Calculate the standard deviation for Team B.

Data   Data - Mean   (Data - Mean)^2
67
72
76
76
84

13.7 The Normal Curve

When we look at a histogram we can see the overall shape of the distribution of data. Some shapes occur more often than others.

[Figure: histogram, "Number of Children per Family"; horizontal axis: number of children (0 to 9); vertical axis: 0 to 20]

[Figure: three distribution shapes: Skewed Right, Skewed Left, Normal Distribution]

Data with a normal distribution has the following characteristics.

About 68% of the data lies within 1 standard deviation of the mean.
About 95% of the data lies within 2 standard deviations of the mean.
Almost all of the data lies within 3 standard deviations of the mean.

[Figure: normal curve; horizontal axis marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x}$, $\bar{x} + s$, $\bar{x} + 2s$, $\bar{x} + 3s$]

Example: An instruction manual claims that the assembly time for a product is normally distributed with a mean of 4.2 hours and standard deviation 0.3 hours. What percentage of products will have assembly times between 3.6 hours and 4.8 hours?

What if we wanted to know what percentage of products will have assembly times more than 4.7 hours? Answering this question requires using a z-score.

The z-score represents the number of standard deviations a value x falls from the mean.

$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \bar{x}}{s}$$

Example: For the same assembly times (mean 4.2 hours, standard deviation 0.3 hours), find the z-score for an assembly time of:
(a) 3.6 hrs
(b) 4.2 hrs
(c) 4.7 hrs

A z-score has 2 parts:
(1) sign: above or below the mean
(2) numerical value: number of standard deviations away from the mean
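The z-score formula can be sketched as a one-line function and applied to the three assembly times. The function name `z_score` is chosen here for illustration; floating-point results may differ from hand calculations in the last decimal places.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

mean_hours = 4.2
std_dev_hours = 0.3

z_values = {t: z_score(t, mean_hours, std_dev_hours) for t in (3.6, 4.2, 4.7)}

for hours, z in z_values.items():
    side = "below" if z < 0 else "above" if z > 0 else "at"
    print(f"{hours} hrs: z = {z:.2f} ({side} the mean)")
```

A negative sign places the value below the mean, a positive sign above it, and the size of z counts the standard deviations in between.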

Most questions we need to answer about a normal distribution involve values other than those within 1, 2, or 3 standard deviations away from the mean. To answer these questions, we use the z-score and a table of percentage values.

Table 13.7 (p. 706) gives the area (percentage) under the normal curve between the mean, z = 0, and a z-value to the right of the mean. The total area under the normal curve is 1.0 = 100%.

Example: Use Table 13.7 to find the specified area.
A) Above the mean
B) Between z = 0 and z = 1.00
C) Between z = -1.00 and z = 0

Since the curve is symmetric about the mean, the area between the mean and a positive z-score is the same as the area between the mean and the corresponding negative z-score.

Example (cont.)
D) Between z = -2.00 and z = 2.00
E) Between z = 1.23 and z = 2.35

See "Procedure to Find the Percent of Data Between any Two Values" on p. 707.

Example (cont.)
F) To the right of z = 1.73
G) To the left of z = 1.08
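Table 13.7 is not reproduced here, so as a cross-check the same areas can be computed with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). This swaps a cumulative distribution function in for the table lookup, so results may differ from table answers in the fourth decimal place. The helper names are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def area_mean_to(z):
    """Area between the mean (z = 0) and z, the kind of value the table lists."""
    return abs(standard_normal.cdf(z) - 0.5)

def area_between(z_low, z_high):
    """Area under the curve between two z-values."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

areas = {
    "A": 1 - standard_normal.cdf(0),     # above the mean
    "B": area_mean_to(1.00),
    "C": area_mean_to(-1.00),            # equals B by symmetry
    "D": area_between(-2.00, 2.00),
    "E": area_between(1.23, 2.35),
    "F": 1 - standard_normal.cdf(1.73),  # to the right of z
    "G": standard_normal.cdf(1.08),      # to the left of z
}

for label, area in areas.items():
    print(f"{label}) {area:.4f} = {area * 100:.2f}%")
```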

Example: An instruction manual claims that the assembly time for a product is normally distributed with a mean of 4.2 hours and standard deviation 0.3 hours.
A) What percentage of products will have assembly times more than 4.7 hours?
B) What percentage of products will have assembly times between 3.5 hours and 3.9 hours?

Remember to draw a picture for each problem!

Example: The life expectancy of nondefective GE light bulbs is normally distributed, with a mean life of 1500 hours and a standard deviation of 100 hours.
#73 Find the percent of bulbs that will last more than 1450 hours.
#74 Find the percent of bulbs that last between 1400 hours and 1550 hours.
#75 Find the percent of bulbs that last less than 1480 hours.
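The light bulb questions can be checked the same way by giving `statistics.NormalDist` the stated mean and standard deviation directly, which performs the z-score conversion internally. This is a sketch for verifying hand work, not the table-based method the slides use, and table answers may differ slightly because z-values are rounded to two decimals there.

```python
from statistics import NormalDist

# Bulb life: mean 1500 hours, standard deviation 100 hours.
bulb_life = NormalDist(mu=1500, sigma=100)

more_than_1450 = 1 - bulb_life.cdf(1450)                        # #73
between_1400_1550 = bulb_life.cdf(1550) - bulb_life.cdf(1400)   # #74
less_than_1480 = bulb_life.cdf(1480)                            # #75

print(f"#73 more than 1450 hours:  {more_than_1450:.2%}")
print(f"#74 between 1400 and 1550: {between_1400_1550:.2%}")
print(f"#75 less than 1480 hours:  {less_than_1480:.2%}")
```

Changing `mu` and `sigma` to 4.2 and 0.3 gives the same kind of check for the assembly-time questions.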