MATH 214 (NOTES) Math 214 Al Nosedal. Department of Mathematics Indiana University of Pennsylvania.


CHAPTER 1 DATA AND STATISTICS

Definitions
Statistics is defined as the science of collecting, analyzing, presenting, and interpreting data.
Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.
Elements are the entities on which data are collected.
A variable is a characteristic of interest for the elements.
Data can also be classified as either qualitative or quantitative. Qualitative data include labels or names used to identify an attribute of each element. Quantitative data require numeric values that indicate how much or how many.

Descriptive Statistics
Most of the statistical information in newspapers, magazines, company reports, and other publications consists of data that are summarized and presented in a form that is easy for the reader to understand. Such summaries of data, which may be tabular, graphical, or numerical, are referred to as descriptive statistics.

Statistical Inference
Many situations require information about a large group of elements. But, because of time, cost, and other considerations, data can be collected from only a small portion of the group. The larger group of elements in a particular study is called the population, and the smaller group is called the sample. As one of its major contributions, statistics uses data from a sample to make estimates and test hypotheses about the characteristics of a population through a process referred to as statistical inference.

CHAPTER 2 DESCRIPTIVE STATISTICS: TABULAR AND GRAPHICAL PRESENTATIONS

Summarizing Qualitative Data
Frequency distribution. A frequency distribution is a tabular summary of data showing the number (frequency) of items in each of several nonoverlapping classes.
\[ \text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{n} \]
where n represents the total number of observations.
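As a small illustration of the relative frequency formula, here is a minimal Python sketch. The list of color labels is hypothetical (it is not data from these notes); any qualitative variable could be used in its place.

```python
from collections import Counter

# Hypothetical qualitative observations (made up for illustration only).
labels = ["red", "blue", "red", "green", "blue",
          "red", "red", "green", "blue", "red"]

n = len(labels)                      # total number of observations
frequency = Counter(labels)          # frequency of each class
relative_frequency = {c: f / n for c, f in frequency.items()}
percent_frequency = {c: 100 * rf for c, rf in relative_frequency.items()}

# Print one row per class: label, frequency, relative and percent frequency.
for c in sorted(frequency):
    print(f"{c:<6} {frequency[c]:>3} "
          f"{relative_frequency[c]:>6.2f} {percent_frequency[c]:>6.1f}%")
```

The relative frequencies always add up to 1, which is a quick check on the table.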

Bar graphs and pie charts
A bar graph is a graphical device for depicting qualitative data summarized in a frequency, relative frequency, or percent frequency distribution. On one axis of the graph, we specify the labels that are used for the classes (categories). A frequency, relative frequency, or percent frequency scale can be used for the other axis of the graph. The pie chart provides another graphical device for presenting relative frequency and percent frequency distributions for qualitative data.

Summarizing Quantitative Data
A common graphical presentation of quantitative data is a histogram. This graphical summary can be prepared for data previously summarized in a frequency, relative frequency, or percent frequency distribution. A histogram is constructed by placing the variable of interest on the horizontal axis and the frequency, relative frequency, or percent frequency on the vertical axis.

Exercise (page 40)
11. Consider the following data:
14 21 23 21 16 19 22 25
16 16 24 24 25 19 16 19
18 19 21 12 16 17 18 23
25 20 23 16 20 19 24 26
15 22 24 20 22 24 22 20
a. Develop a frequency distribution using classes of 12-14, 15-17, 18-20, 21-23, and 24-26.
b. Develop a relative frequency distribution and a percent frequency distribution using the classes in part (a).
c. Make a histogram.

Solution
class   freq.   relative freq.    percent freq.
12-14   2       2/40 = 0.05       5%
15-17   8       8/40 = 0.20       20%
18-20   11      11/40 = 0.275     27.5%
21-23   10      10/40 = 0.25      25%
24-26   9       9/40 = 0.225      22.5%
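The table can be checked with a short Python sketch. It counts how many of the 40 observations from Exercise 11 fall in each class and prints a rough text histogram for part (c). The variable names are only illustrative.

```python
# Data from Exercise 11 (40 observations).
data = [14, 21, 23, 21, 16, 19, 22, 25,
        16, 16, 24, 24, 25, 19, 16, 19,
        18, 19, 21, 12, 16, 17, 18, 23,
        25, 20, 23, 16, 20, 19, 24, 26,
        15, 22, 24, 20, 22, 24, 22, 20]

# Classes given in part (a); both limits are included in the class.
classes = [(12, 14), (15, 17), (18, 20), (21, 23), (24, 26)]

n = len(data)
table = []
for low, high in classes:
    freq = sum(low <= x <= high for x in data)
    table.append((f"{low}-{high}", freq, freq / n, 100 * freq / n))

# Frequency, relative frequency, percent frequency, and a text histogram.
for label, freq, rel, pct in table:
    print(f"{label:<6} {freq:>3} {rel:>6.3f} {pct:>5.1f}%  {'*' * freq}")
```

Each row of stars has one star per observation, so the star counts play the role of the bar heights in the histogram.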

Describing distributions with numbers
How much do people with a bachelor's degree (but no higher degree) earn? Here are the incomes of 15 such people, chosen at random by the Census Bureau in March 2002 and asked how much they earned in 2001. Most people reported their incomes to the nearest thousand dollars, so we have rounded their responses to thousands of dollars.
110 25 50 50 55 30 35 30 4 32 50 30 32 74 60
How could we find the "typical" income for people with a bachelor's degree (but no higher degree)?


CHAPTER 3 DESCRIPTIVE STATISTICS: NUMERICAL MEASURES

Measuring center: the mean
The most common measure of center is the ordinary arithmetic average, or mean. To find the mean of a set of observations, add their values and divide by the number of observations. If the n observations are $x_1, x_2, \dots, x_n$, their mean is
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \tag{1} \]
or, in more compact notation,
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{2} \]
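As a quick check of the formula, the following sketch applies it to the 15 incomes (in thousands of dollars) listed earlier under "Describing distributions with numbers". It is only an illustration of equation (2), not part of the original slides.

```python
# Incomes of 15 people with a bachelor's degree, in thousands of dollars.
incomes = [110, 25, 50, 50, 55, 30, 35, 30, 4, 32, 50, 30, 32, 74, 60]

n = len(incomes)
total = sum(incomes)      # x_1 + x_2 + ... + x_n
mean = total / n          # equation (2)

print(f"n = {n}, sum = {total}, mean = {mean:.2f} thousand dollars")
```

The sum is 667, so the mean is 667/15, or about 44.5 thousand dollars.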

Measuring center: the median
The median M is the midpoint of a distribution, the number such that half the observations are smaller and the other half are larger. To find the median of the distribution:
Arrange all observations in order of size, from smallest to largest.
If the number of observations n is odd, the median M is the center observation in the ordered list. Find the location of the median by counting $(n+1)/2$ observations up from the bottom of the list.

Measuring center: the median (cont.)
If the number of observations n is even, the median M is the mean of the two center observations in the ordered list. The location of the median is again $(n+1)/2$ observations up from the bottom of the list; for even n this location falls halfway between the two center observations.
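The two cases (n odd and n even) can be written as one small function. This is a minimal sketch of the rule described above; the function name is only illustrative. It is applied to the 15 incomes listed earlier under "Describing distributions with numbers".

```python
def median(values):
    """Median found by the (n + 1) / 2 location rule."""
    ordered = sorted(values)           # smallest to largest
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the center observation sits at position (n + 1) / 2 (1-based).
        return ordered[(n + 1) // 2 - 1]
    # Even n: average the two center observations.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


incomes = [110, 25, 50, 50, 55, 30, 35, 30, 4, 32, 50, 30, 32, 74, 60]
print("median income:", median(incomes))
print("median of 1, 2, 3, 4:", median([1, 2, 3, 4]))
```

For the incomes, n = 15, so the median is the 8th value in the ordered list, which is 35 (thousand dollars). It is noticeably smaller than the mean because the single income of 110 pulls the mean upward.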

The quartiles $Q_1$ and $Q_3$
To calculate the quartiles:
Arrange the observations in increasing order and locate the median M in the ordered list of observations.
The first quartile $Q_1$ is the median of the observations whose position in the ordered list is to the left of the location of the overall median.
The third quartile $Q_3$ is the median of the observations whose position in the ordered list is to the right of the location of the overall median.

Side-by-side Boxplots
Example. Here are the numbers of home runs that Babe Ruth hit in his 15 years with the New York Yankees, 1920 to 1934:
54 59 35 41 46 25 47 60 54 46 49 46 41 34 22
Another home run hitter is Mark McGwire, who retired after the 2001 season. Here are McGwire's home run counts for 1987 to 2001:
49 32 33 39 22 42 9 9 39 52 58 70 65 32 29
Find the five-number summaries and make side-by-side boxplots to compare these two home run hitters. What do your plots show?
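The five-number summaries (minimum, $Q_1$, median, $Q_3$, maximum) can be computed with the quartile rule from these notes: each quartile is the median of the observations on one side of the overall median's location. The sketch below follows that rule; note that statistical software often uses slightly different quartile conventions, so its output may differ. Function names are illustrative.

```python
def median_of_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the rule in these notes."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                      # left of the median's location
    upper = ordered[half + 1:] if n % 2 else ordered[half:]  # right of it
    return (ordered[0], median_of_sorted(lower), median_of_sorted(ordered),
            median_of_sorted(upper), ordered[-1])


ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]
mcgwire = [49, 32, 33, 39, 22, 42, 9, 9, 39, 52, 58, 70, 65, 32, 29]

print("Ruth:   ", five_number_summary(ruth))
print("McGwire:", five_number_summary(mcgwire))
```

This gives 22, 35, 46, 54, 60 for Ruth and 9, 29, 39, 52, 70 for McGwire. These are the numbers needed to draw the two boxes side by side on a common scale.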

Measures of association between 2 variables
Covariance (sample covariance). You can compute the covariance, $S_{XY}$, using the following formula:
\[ S_{XY} = \frac{\sum_{i=1}^{n} x_i y_i}{n-1} - \frac{n \bar{x} \bar{y}}{n-1} \tag{3} \]
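Here is a minimal sketch of the covariance formula in Python. The paired data are hypothetical (made up for illustration, not taken from these notes). The result is compared with the definitional form, the sum of $(x_i - \bar{x})(y_i - \bar{y})$ divided by $n - 1$, which is algebraically the same quantity.

```python
def covariance_shortcut(x, y):
    """Sample covariance via equation (3)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    return sum_xy / (n - 1) - n * x_bar * y_bar / (n - 1)


def covariance_definition(x, y):
    """Sample covariance via deviations from the means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical paired observations (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print("shortcut formula:  ", covariance_shortcut(x, y))
print("definition formula:", covariance_definition(x, y))
```

For these numbers the sum of $x_i y_i$ is 66, $\bar{x} = 3$, and $\bar{y} = 4$, so both forms give $(66 - 60)/4 = 1.5$.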

Probability: Colors of M&M's
If you draw an M&M candy at random from a bag of the candies, the candy you draw will have one of the seven colors. The probability of drawing each color depends on the proportion of each color among all candies made. Here is the distribution for milk chocolate M&M's:
Color        Purple  Yellow  Red  Orange  Brown  Green  Blue
Probability  0.2     0.2     0.2  0.1     0.1    0.1    ?

Colors of M&M's (cont.)
a) What must be the probability of drawing a blue candy?
b) What is the probability that you do not draw a brown candy?
c) What is the probability that the candy you draw is either yellow, orange, or red?
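One way to work through these questions is sketched below, using the probabilities from the M&M's table. It relies on two basic rules: the probabilities of all outcomes add to 1, and probabilities of disjoint outcomes add. Exact fractions are used to avoid rounding noise; the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities given in the table (blue is the unknown one).
known = {
    "purple": Fraction(2, 10),
    "yellow": Fraction(2, 10),
    "red": Fraction(2, 10),
    "orange": Fraction(1, 10),
    "brown": Fraction(1, 10),
    "green": Fraction(1, 10),
}

# a) All seven probabilities must add to 1.
p_blue = 1 - sum(known.values())
probabilities = {**known, "blue": p_blue}

# b) Complement rule: P(not brown) = 1 - P(brown).
p_not_brown = 1 - probabilities["brown"]

# c) Addition rule for disjoint outcomes.
p_yellow_orange_red = sum(probabilities[c] for c in ("yellow", "orange", "red"))

print("P(blue) =", float(p_blue))
print("P(not brown) =", float(p_not_brown))
print("P(yellow, orange, or red) =", float(p_yellow_orange_red))
```

The answers come out to 0.1, 0.9, and 0.5.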

Conditional probability
Problem. Josh and Al are avid tennis players and they enjoy playing matches against each other. They do, however, have one difference of opinion on the court. Al likes to have a nice long warm-up session at the start where they hit the ball back and forth and back and forth. Josh's ideal warm-up is to bend at the waist to tie his sneakers and to adjust his shorts. Al thinks that when they rush through the warm-up, he doesn't play as well.

Conditional probability (cont.)
The following table shows the outcomes of their last 20 matches, along with the type of warm-up before they started keeping score. Does the type of warm-up have an influence on the outcome of a match?
Warm-up time        Al wins  Josh wins  Total
Less than 10 min.   4        9          13
10 min. or more     5        2          7
Total               9        11         20
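One way to approach the question is to compare the proportion of matches Al wins overall with the proportion he wins given each type of warm-up. The sketch below computes these from the table. It is a descriptive comparison only; with just 20 matches it does not settle whether the difference is more than chance.

```python
from fractions import Fraction

# Counts from the table: (Al wins, Josh wins) for each warm-up type.
matches = {
    "less than 10 min.": (4, 9),
    "10 min. or more": (5, 2),
}

total_matches = sum(al + josh for al, josh in matches.values())
al_total = sum(al for al, _ in matches.values())

# Unconditional probability that Al wins.
p_al = Fraction(al_total, total_matches)

# Conditional probabilities: P(Al wins | warm-up type).
p_al_given = {
    warmup: Fraction(al, al + josh) for warmup, (al, josh) in matches.items()
}

print(f"P(Al wins) = {p_al} = {float(p_al):.3f}")
for warmup, p in p_al_given.items():
    print(f"P(Al wins | {warmup}) = {p} = {float(p):.3f}")
```

Al wins 9/20 of all matches, but 4/13 of those with a short warm-up and 5/7 of those with a long one. If the warm-up had no influence, the two conditional probabilities would both be close to 9/20.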

CHAPTER 7 SAMPLING DISTRIBUTIONS

Example
A couple plans to have three children. There are 8 possible arrangements of girls and boys. For example, GGB means the first two children are girls and the third child is a boy. All 8 arrangements are (approximately) equally likely.
a) Write down all 8 arrangements of the sexes of three children. What is the probability of any one of these arrangements?

Example (cont.)
b) Let X be the number of girls the couple has. What is the probability that X = 2?
c) Starting from your work in a), find the distribution of X. That is, what values can X take, and what are the probabilities for each value?
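The example can be worked by listing the arrangements and counting. The sketch below does this in Python, assuming (as the example states) that all 8 arrangements are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# a) All arrangements of G (girl) and B (boy) for three children.
arrangements = ["".join(p) for p in product("GB", repeat=3)]
p_each = Fraction(1, len(arrangements))   # equally likely arrangements

# b), c) X = number of girls; count arrangements for each value of X.
counts = Counter(a.count("G") for a in arrangements)
distribution = {x: counts[x] * p_each for x in sorted(counts)}

print("arrangements:", " ".join(arrangements))
print("probability of each arrangement:", p_each)
for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

X takes the values 0, 1, 2, 3 with probabilities 1/8, 3/8, 3/8, 1/8, so P(X = 2) = 3/8.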

Problem
We are interested in estimating the average number of cars per household in a little town called Statstown. Let X represent the number of cars in a house picked at random. God knows that X has a Binomial distribution with n = 4 and p = 0.5. Suppose that we can only afford a sample of size 4 and that we are going to use this sample to estimate that population average.

Problem (cont.)
What we are going to do next is called a simulation. First, we will draw a lot of random samples coming from a Binomial distribution with n = 4 and p = 0.5. Then we will make a histogram for all the $\bar{x}$'s corresponding to our samples. We are going to do this to see what the histogram of $\bar{x}$ looks like. This will give us an idea of what to expect in a similar situation.
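A minimal sketch of such a simulation in Python follows. The number of repetitions (10,000) and the random seed are arbitrary choices made here, not values from the notes, and the histogram is printed as rows of "#" characters rather than drawn as a plot.

```python
import random
from collections import Counter

random.seed(214)          # arbitrary seed so the run can be repeated

N_TRIALS, P = 4, 0.5      # Binomial(n = 4, p = 0.5) population
SAMPLE_SIZE = 4           # households per sample
N_SAMPLES = 10_000        # number of simulated samples (an assumption)


def binomial_draw(n, p):
    """One Binomial(n, p) value: number of successes in n trials."""
    return sum(random.random() < p for _ in range(n))


# Draw many samples of size 4 and record the sample mean of each one.
sample_means = []
for _ in range(N_SAMPLES):
    sample = [binomial_draw(N_TRIALS, P) for _ in range(SAMPLE_SIZE)]
    sample_means.append(sum(sample) / SAMPLE_SIZE)

# Text histogram of the sample means (one '#' per 50 samples).
histogram = Counter(sample_means)
for value in sorted(histogram):
    print(f"{value:4.2f} {'#' * (histogram[value] // 50)}")

average_of_means = sum(sample_means) / N_SAMPLES
print("average of the sample means:", round(average_of_means, 3))
```

The population mean is np = 2 cars per household, so the histogram should pile up around 2, even though any single sample mean can miss it by a fair amount.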

Central Limit Theorem
Draw a random sample of size n from any population with mean $\mu$ and finite standard deviation $\sigma$. When n is large, the sampling distribution of the sample mean $\bar{x}$ is approximately Normal:
\[ \bar{x} \text{ is approximately } N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \tag{4} \]

Example
The number of accidents per week at a hazardous intersection varies with mean 2.2 and standard deviation 1.4. This distribution takes only whole-number values, so it is certainly not Normal.
a) Let $\bar{x}$ be the mean number of accidents per week at the intersection during a year (52 weeks). What is the approximate distribution of $\bar{x}$ according to the central limit theorem?

Example (cont.)
b) What is the approximate probability that $\bar{x}$ is less than 2?
c) What is the approximate probability that there are fewer than 100 accidents at the intersection in a year? (Hint: Restate this event in terms of $\bar{x}$.)
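The calculations for this example can be sketched with Python's standard library, which provides a Normal distribution class (`statistics.NormalDist`, available in Python 3.8 and later). The sketch applies the central limit theorem with $\mu = 2.2$, $\sigma = 1.4$, and n = 52. For part c) it restates "fewer than 100 accidents in a year" as $\bar{x} < 100/52$ and uses no continuity correction, which is a simplifying assumption.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52

# a) By the central limit theorem, x-bar is approximately N(mu, sigma / sqrt(n)).
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu, standard_error)

# b) P(x-bar < 2)
p_mean_below_2 = sampling_dist.cdf(2)

# c) Fewer than 100 accidents in 52 weeks means x-bar < 100 / 52.
p_fewer_than_100 = sampling_dist.cdf(100 / n)

print(f"standard deviation of x-bar: {standard_error:.4f}")
print(f"P(x-bar < 2) is about {p_mean_below_2:.4f}")
print(f"P(fewer than 100 accidents) is about {p_fewer_than_100:.4f}")
```

The standard deviation of $\bar{x}$ is $1.4/\sqrt{52}$, roughly 0.19, so a weekly average below 2 is only about one standard deviation under the mean of 2.2.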

CHAPTER 9 HYPOTHESIS TESTS

Do you want to become a millionaire?
Let's say that one of you is invited to this popular show. As you probably know, you have to answer a series of multiple choice questions and there are four possible answers to each question. Perhaps you also have seen that if you don't know the answer to a question you could either "jump the question" or "ask the audience". Suppose that you run into a question for which you don't know the answer with certainty and you decide to "ask the audience". Let's say that you initially believe that the right answer is A. Then you ask the audience and only 2% of the audience shares your opinion. What would you do? Keep your initial belief or reject it?

TO BE CONTINUED...