302360_File_B.qxd 7/7/03 7:18 AM Page 1

Distribution of Data and the Empirical Rule

Stem-and-Leaf Diagrams
Frequency Distributions and Histograms
Normal Distributions and the Empirical Rule
z-scores

Stem-and-Leaf Diagrams

Although the mean, the median, the mode, and the standard deviation provide some information about a set of data and the distribution of the data, it is often helpful to use graphical procedures that visually illustrate precisely how the values in a set of data are distributed. Many small sets of data can be graphically displayed by using a stem-and-leaf diagram. For instance, consider the following history test scores:

65, 72, 96, 86, 43, 61, 75, 86, 49, 68, 98, 74, 84, 78, 85, 75, 86, 73

In this form the data are called raw data because the data have not been organized. With raw data it is generally difficult to observe how the data are distributed. In the stem-and-leaf diagram shown below, we have organized the test scores by placing all the scores that are in the 40s in the top row, the scores that are in the 50s in the second row, the scores that are in the 60s in the third row, and so on. The tens digits of the scores have been placed to the left of the vertical line. In this diagram they are referred to as stems. The ones digits of the test scores have been placed in the proper row to the right of the vertical line. In this diagram they are the leaves.

A Stem-and-Leaf Diagram of a Set of History Test Scores

Stems | Leaves
  4   | 3 9
  5   |
  6   | 1 5 8
  7   | 2 3 4 5 5 8
  8   | 4 5 6 6 6
  9   | 6 8

Legend: 8 | 6 represents 86

It is now easy to make observations about the distribution of the scores. Only two of the scores are in the 90s, six of the scores are in the 70s, and none of the scores are in the 50s. The lowest score is 43 and the highest is 98.

Steps in the Construction of a Stem-and-Leaf Diagram
1. Determine the stems and list the stems in a column from smallest to largest.
2. List the remaining digits of each data value as a leaf to the right of its stem.
3. Include a legend that explains the meaning of the stems and the leaves. Include a title for the diagram.

The choice of how many leading digits to use as the stem will depend on the particular application and can be best explained with an example.

EXAMPLE 1 Construct a Stem-and-Leaf Diagram

A travel agent has recorded the amount spent by customers for a cruise. Construct a stem-and-leaf diagram for the data.

Amount Spent for a Cruise, Summer of 2003

$3600  $4700  $7200  $2100  $5700  $4400  $9400
$6200  $5900  $2100  $4100  $5200  $7300  $6200
$3800  $4900  $5400  $5400  $3100  $3100  $4500
$4500  $2900  $3700  $3700  $4800  $4800  $2400
Solution

One method of choosing the stems is to let each thousands digit be a stem and each hundreds digit be a leaf. If the stems and leaves are assigned in this manner, then the notation 2 | 1, which has a stem of 2 and a leaf of 1, represents a cost of $2100, and the notation 5 | 4 represents a cost of $5400. The diagram can now be constructed by writing all of the stems, from smallest to largest, in a column to the left of a vertical line and writing the corresponding leaves to the right of the vertical line.

Amount Spent for a Cruise

Stems | Leaves
  2   | 1 1 4 9
  3   | 1 1 6 7 7 8
  4   | 1 4 5 5 7 8 8 9
  5   | 2 4 4 7 9
  6   | 2 2
  7   | 2 3
  8   |
  9   | 4

Legend: 7 | 3 represents $7300

CHECK YOUR PROGRESS 1 The following table lists the ages of the customers who purchased a cruise. Construct a stem-and-leaf diagram for the data.

[Table: Ages of Customers Who Purchased a Cruise]

Solution See page S1.

Sometimes two sets of data can be compared by using a back-to-back stem-and-leaf diagram, which has common stems with leaves from one data set displayed to the right of the stems and leaves from the other data set displayed to the left of the stems. For instance, the back-to-back stem-and-leaf diagram titled Biology Test Scores shows the test scores for two biology classes that took the same test.
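The construction used in Example 1 can also be sketched in a few lines of Python. This is only an illustrative sketch: the function name `stem_and_leaf` and its `leaf_unit` parameter are assumptions introduced here, not part of the text. The data are the cruise amounts from Example 1, with the hundreds digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit):
    """Group values into stems and sorted leaves.

    leaf_unit is the place value of the leaf digit (hypothetical
    parameter): 1 for test scores, 100 for the cruise amounts.
    """
    rows = defaultdict(list)
    for v in values:
        scaled = v // leaf_unit          # drop digits below the leaf position
        rows[scaled // 10].append(scaled % 10)
    # List every stem from smallest to largest, including empty ones.
    return {s: sorted(rows[s]) for s in range(min(rows), max(rows) + 1)}


cruise = [3600, 4700, 7200, 2100, 5700, 4400, 9400,
          6200, 5900, 2100, 4100, 5200, 7300, 6200,
          3800, 4900, 5400, 5400, 3100, 3100, 4500,
          4500, 2900, 3700, 3700, 4800, 4800, 2400]

diagram = stem_and_leaf(cruise, leaf_unit=100)
print("Amount Spent for a Cruise")
for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Legend: 7 | 3 represents $7300")
```

Calling the same function with `leaf_unit=1` on the history test scores would use the tens digit as the stem and the ones digit as the leaf.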
[Figure: Biology Test Scores, a back-to-back stem-and-leaf diagram with the 8 A.M. class on the left and the 10 A.M. class on the right. Legend (8 A.M. class): 3 | 7 represents 73. Legend (10 A.M. class): 8 | 2 represents 82.]

QUESTION Which biology class did better on the test?

ANSWER The 8 A.M. class did better on the test because it had more scores in the 80s and 90s and fewer scores in the 40s, 50s, and 60s. The scores in the 70s were similar for both classes.

Frequency Distributions and Histograms

Large sets of data are often displayed using a frequency distribution or a histogram. For example, consider the following situation.

An Internet service provider (ISP) has installed new computers. To estimate the new download times its subscribers will experience, the ISP surveyed 1000 subscribers to determine the time each subscriber required to download a particular file from the Internet site music.net. The results of that survey are summarized in the following table.

[Table: A grouped frequency distribution. Columns: Download time (in seconds), Number of subscribers.]
[Figure: A histogram of the frequency distribution. Vertical axis: Number of subscribers. Horizontal axis: Download time, in seconds.]

The above table is called a grouped frequency distribution. It shows how often (frequently) certain events occurred. Each interval 0-10, 10-20, ... is called a class. This distribution has six classes. For the 10-20 class, 10 is the lower class boundary and 20 is the upper class boundary. Any data value that lies on a common boundary is assigned to the higher class. The graph of a frequency distribution is called a histogram. A histogram provides a pictorial view of how the data are distributed. In the above histogram, the height of each bar indicates how many subscribers experienced the download times indicated by the class represented below on the horizontal axis.

The center point of a class is called a class mark. In the above histogram, the class marks 5, 15, 25, 35, 45, 55 are shown by the red tick marks on the horizontal axis.

Instead of using classes with a width of 10 seconds, the ISP could have chosen a smaller class width. A smaller class width produces more classes. For instance, if each class width were 5 seconds, the frequency distribution and histogram for the music.net example would have the form shown below.

[Table: A frequency distribution with 12 classes. Columns: Download time (in seconds), Number of subscribers.]
[Figure: A histogram of the frequency distribution. Vertical axis: Number of subscribers. Horizontal axis: Download time, in seconds.]

Examine the following distribution. It shows the percent of subscribers who are in each class, as opposed to the frequency distribution above, which shows the number of subscribers in each class. The type of frequency distribution that lists the percent of data in each class is called a relative frequency distribution. The relative frequency histogram shown with it was drawn by using the data in the relative frequency distribution. It shows the percent of subscribers along its vertical axis.
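The grouping rules described above (class boundaries, the convention that a value on a common boundary goes to the higher class, and class marks) can be sketched in Python. The download times below are hypothetical values chosen for illustration, not the ISP survey data, and the function names are assumptions introduced here.

```python
def grouped_frequency(values, width, start=0):
    """Count values per class, keyed by lower class boundary.

    Each class covers lower <= value < lower + width, so a value that
    lies on a common boundary is assigned to the higher class.
    """
    counts = {}
    for v in values:
        lower = start + int((v - start) // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


def class_mark(lower, width):
    """Center point of the class that starts at `lower`."""
    return lower + width / 2


# Hypothetical download times, in seconds (not the survey data).
times = [3, 10, 12.5, 19.9, 20, 27, 35, 35, 41, 59]

freq = grouped_frequency(times, width=10)
for lower, count in freq.items():
    print(f"{lower}-{lower + 10}: {count} (class mark {class_mark(lower, 10)})")
```

Note that the values 10 and 20 fall in the 10-20 and 20-30 classes, respectively, which matches the boundary convention in the text. Passing `width=5` gives the finer grouping with more classes.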
[Table: A relative frequency distribution. Columns: Download time (in seconds), Percent of subscribers. The classes from 5 to 20 seconds (highlighted in blue) sum to 14.9%; the classes from 25 seconds up (highlighted in red) sum to 68.8%.]
[Figure: A relative frequency histogram. Vertical axis: Percent of subscribers. Horizontal axis: Download time, in seconds.]

One advantage of using a relative frequency distribution instead of a frequency distribution is that there is a direct correspondence between the percent of the data that lie in a particular portion of the relative frequency distribution and probability. For instance, in the relative frequency distribution above, the percent of the data that lie between 35 and 40 seconds is 14.9%. Thus, if a subscriber is chosen at random, the probability that the subscriber will require between 35 and 40 seconds to download the music file is 0.149.

EXAMPLE 2 Use a Relative Frequency Distribution

Use the music.net relative frequency distribution above to determine
a. the percent of subscribers who required at least 25 seconds to download the file.
b. the probability that a subscriber chosen at random will require from 5 to 20 seconds to download the file.

Solution
a. The percent of data in all classes with a lower bound of 25 seconds or more is the sum of the percents for all of the classes highlighted in red in the distribution. The percent of subscribers who required at least 25 seconds to download the file is 68.8%.
b. The percent of data in all classes with a lower bound of at least 5 seconds and an upper bound of 20 seconds or less is the sum of the percents for all of the classes highlighted in blue in the distribution. Thus the percent of subscribers who required from 5 to 20 seconds to download the file is 14.9%. The probability that a subscriber chosen at random will require from 5 to 20 seconds to download the file is 0.149.
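The step from counts to percents to probabilities can be sketched in Python. Because the music.net percents are not reproduced here, this sketch uses the state counts from the teacher-salary table in Check Your Progress 2; the count of 3 for the first class is inferred from the 50-state total. The function names are assumptions introduced for illustration.

```python
# (lower bound, upper bound, number of states); each class is lower <= s < upper.
salary_classes = [
    (27000, 30000, 3),   # count inferred from the 50-state total
    (30000, 33000, 7),
    (33000, 36000, 12),
    (36000, 39000, 9),
    (39000, 42000, 6),
    (42000, 45000, 3),
    (45000, 48000, 5),
    (48000, 51000, 3),
    (51000, 54000, 2),
]


def relative_frequencies(classes):
    """Convert class counts to percents of the total."""
    total = sum(count for _, _, count in classes)
    return [(lo, hi, 100 * count / total) for lo, hi, count in classes]


def percent_between(rel, lower, upper=float("inf")):
    """Sum the percents of all classes lying entirely in [lower, upper)."""
    return sum(p for lo, hi, p in rel if lo >= lower and hi <= upper)


rel = relative_frequencies(salary_classes)
pct = percent_between(rel, 48000)          # at least $48,000
print(f"{pct:.1f}% of states; probability {pct / 100:.2f}")
```

The same `percent_between` call, with different bounds, handles questions of the form "at least A but less than B", provided A and B are class boundaries.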
CHECK YOUR PROGRESS 2 Use the relative frequency distribution below to determine
a. the percent of the states that pay an average teacher salary of at least $45,000.
b. the probability that a state selected at random pays an average teacher salary of at least $30,000 but less than $39,000.

Average Salaries of Public School Teachers

Average Salary, s            Number of States   Relative Frequency
$27,000 ≤ s < $30,000               3                  6%
$30,000 ≤ s < $33,000               7                 14%
$33,000 ≤ s < $36,000              12                 24%
$36,000 ≤ s < $39,000               9                 18%
$39,000 ≤ s < $42,000               6                 12%
$42,000 ≤ s < $45,000               3                  6%
$45,000 ≤ s < $48,000               5                 10%
$48,000 ≤ s < $51,000               3                  6%
$51,000 ≤ s < $54,000               2                  4%

Source:

Solution See page S1.

There is a geometric analogy between the percents of data and probabilities we calculated in Example 2 and the relative frequency histogram for the data. For instance, the percent of data described in part a. of Example 2 corresponds to the area shown by the red bars in the first histogram below. The percent of data described in part b. corresponds to the area shown by the blue bars in the second histogram below.

[Figure: two relative frequency histograms. Vertical axis: Percent of subscribers. Horizontal axis: Download time, in seconds. First panel: 25 seconds or more. Second panel: At least 5 but less than 20 seconds.]

Normal Distributions and the Empirical Rule

A histogram for a set of data provides us with a tool that can indicate patterns or trends in the distribution of data. The terms uniform, skewed, symmetrical, and normal are used to describe the distributions of some sets of data.
A uniform distribution, shown in the figure below, is generated when all of the observed events occur with the same frequency. The graph of a uniform distribution remains at the same height over the range of the data. Some random processes produce distributions that are uniform or nearly uniform. For example, if the spinner below is used to generate numbers, then in the long run each of the numbers 1, 2, 3, ..., 8 will be generated with approximately the same frequency.

[Figure: Uniform distribution (frequency of x against x) and a spinner used as a random number generator.]

[Figure: Symmetrical distribution (frequency of x against x), with the center line at mean = median = mode.]

A symmetrical distribution, shown in the figure above, is symmetrical about a vertical center line. If you fold a symmetrical distribution along the center line, the right side of the distribution will match the left side. The following data sets are examples of distributions that are nearly symmetrical: the weights of all male students, the heights of all teenage females, the prices of a gallon of regular gasoline in a large city, the mileages for a particular type of automobile tire, and the amounts of soda dispensed by a vending machine. In a symmetrical distribution, the mean, the median, and the mode are all equal and they are located at the center of the distribution.

Skewed distributions, shown in the figures below, have a longer tail on one side of the distribution and a shorter tail on the other side. A distribution is skewed to the left if it has a longer tail on the left and is skewed to the right if it has a longer tail on the right. In a distribution that is skewed to the left, the mean is less than the median, which is less than the mode. In a distribution that is skewed to the right, the mode is less than the median, which is less than the mean.
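The ordering of the mode, the median, and the mean in a right-skewed distribution can be illustrated with a small sketch. The data set below is hypothetical, chosen only so that it has a long tail on the right; it does not come from the text.

```python
import statistics

# Hypothetical data with a long right tail (a few large values).
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mode = statistics.mode(data)
median = statistics.median(data)
mean = statistics.mean(data)

print(f"mode = {mode}, median = {median}, mean = {mean}")
# For this data set the ordering matches the skewed-right pattern.
print("mode < median < mean:", mode < median < mean)
```

The few large values pull the mean toward the tail, while the median and the mode stay near the bulk of the data. Reflecting the data (for example, replacing each value v by 15 - v) would give a left-skewed set with the order reversed.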
[Figure: Skewed distributions (frequency of x against x). Skewed left: mean, median, mode in increasing order along the x-axis. Skewed right: mode, median, mean in increasing order along the x-axis.]

Many examinations yield test scores that have skewed distributions. For instance, if a test designed for students in the sixth grade is given to students in a ninth grade class, most of the scores will be high, and the distribution of the test scores will be skewed to the left.

Discrete values are separated from each other by an increment, or space. For example, only whole numbers are used to record the number of points a basketball player scores in a game. The possible numbers of points s that the player can score are restricted to the discrete values 0, 1, 2, 3, 4, .... The variable s is a discrete variable. Different scores are separated from each other by at least 1 point. Any variable that is based on counting procedures is a discrete variable. Histograms are generally used to show the distribution of discrete variables.

Continuous values are values that can take on all real numbers in some interval. For example, the possible times that it takes to drive to the grocery store represent a continuous value. The time is not restricted to natural numbers such as 4 minutes or 5 minutes. In fact, the time may be any part of a minute, or of a second if we care to measure that precisely. A variable such as time that is based on measuring with smaller and smaller units is a continuous variable. Continuous curves, rather than histograms, are used to show the distributions of continuous variables.

[Figure: Distributions of continuous variables. a. Bimodal, f(t) against t. b. Skewed right, f(x) against x. c. Symmetrical, f(w) against w.]

In some cases a continuous curve is used to display the distribution of a set of discrete data. For instance, when we have a large set of data and the class intervals are very small, the shape of the top of the histogram approaches a smooth curve. See the two figures below. Thus, when graphing the distribution of very large sets of data with very small class intervals, it is common practice to replace the histogram with a smooth continuous curve.

[Figure: A histogram for discrete data, and a continuous distribution curve, each showing f(x) against x.]

TAKE NOTE If x is a continuous variable with mean \(\mu\) (the Greek letter mu) and standard deviation \(\sigma\), then its normal distribution is given by
\[ f(x) = \frac{e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}}{\sigma\sqrt{2\pi}} \]

One of the most important statistical distributions is known as a normal distribution. The precise mathematical definition of a normal distribution is given by the equation in the Take Note above; however, for many problems it is sufficient to know that all normal distributions have the following properties.
Properties of a Normal Distribution

A normal distribution has a bell shape that is symmetric about a vertical line through its center. The mean, the median, and the mode of a normal distribution are all equal and they are located at the center of the distribution.

[Figure: A normal distribution, f(x) against x, with the x-axis marked at µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ, µ + 3σ. Areas between consecutive marks, left to right: 2.15%, 13.6%, 34.1%, 34.1%, 13.6%, 2.15%. Within 1σ: 68.2% of the data. Within 2σ: 95.4% of the data. Within 3σ: 99.7% of the data.]

The Empirical Rule: In a normal distribution, about
68.2% of the data lies within 1 standard deviation of the mean.
95.4% of the data lies within 2 standard deviations of the mean.
99.7% of the data lies within 3 standard deviations of the mean.

The Empirical Rule can be used to solve many problems that involve a normal distribution.

EXAMPLE 3 Use the Empirical Rule

A survey of 1000 U.S. gas stations found that the price charged for a gallon of regular gas can be closely approximated by a normal distribution with a mean of $1.90 and a standard deviation of $0.20. How many of the stations charge
a. between $1.50 and $2.30 for a gallon of regular gas?
b. less than $2.10 for a gallon of regular gas?
c. more than $2.30 for a gallon of regular gas?

Solution

[Figure: Data within 2σ of µ. Areas from µ − 2σ to µ + 2σ: 13.6%, 34.1%, 34.1%, 13.6%, for a total of 95.4%.]

a. The $1.50 per gallon price is 2 standard deviations below the mean. The $2.30 price is 2 standard deviations above the mean. In a normal distribution, 95.4% of all data lies within 2 standard deviations of the mean. (See the normal distribution in the figure above.) Therefore, approximately 95.4% of the stations, or about (0.954)(1000) = 954 stations, charge between $1.50 and $2.30 for a gallon of regular gas.
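As a rough check on the Empirical Rule percentages used above, the exact share of a normal distribution within k standard deviations of the mean can be computed from the error function. This sketch goes beyond the text, which only states the rounded percentages; the function name `percent_within` is an assumption introduced here.

```python
import math


def percent_within(k):
    """Percent of a normal distribution within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {percent_within(k):.2f}%")

# Part a of Example 3: 1000 stations, prices within 2 standard deviations.
stations = 1000
print(round(stations * 0.954), "stations, using the Empirical Rule value of 95.4%")
```

The computed values (about 68.27%, 95.45%, and 99.73%) agree with the 68.2%, 95.4%, and 99.7% figures of the Empirical Rule to within rounding.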
[Figure: Data less than 1σ above µ. 50% of the data lies below µ and 34.1% lies between µ and µ + σ, for 84.1% of the data.]

b. The $2.10 price is 1 standard deviation above the mean. (See the normal distribution in the figure above.) In a normal distribution, 34.1% of all data lies between the mean and 1 standard deviation above the mean. Thus, approximately 34.1% of the stations, or about 341 stations, charge between $1.90 and $2.10 for a gallon of regular gasoline. Half of the stations (500) charge less than the mean. Therefore, about 341 + 500 = 841 of the stations charge less than $2.10 for a gallon of regular gas. This problem can also be solved by computing 34.1% + 50% = 84.1% of 1000.

[Figure: Data more than 2σ above µ. 95.4% of the data lies between µ − 2σ and µ + 2σ, with 2.3% in each tail.]

c. The $2.30 price is 2 standard deviations above the mean. In a normal distribution, 95.4% of all data is within 2 standard deviations of the mean. This means that the other 4.6% of the data will lie either more than 2 standard deviations above the mean or more than 2 standard deviations below the mean. We are only interested in the data that lie more than 2 standard deviations above the mean, which is one-half of 4.6%, or 2.3%, of the data. (See the distribution in the figure above.) Thus about 2.3% of the stations, or about 23 stations, charge more than $2.30 for a gallon of regular gas.

CHECK YOUR PROGRESS 3 A vegetable distributor knows that during the month of August, the weights of its tomatoes were normally distributed with a mean of 0.61 pound and a standard deviation of 0.15 pound.
a. What percent of the tomatoes weighed less than 0.76 pound?
b. In a shipment of 6000 tomatoes, how many tomatoes can be expected to weigh more than 0.31 pound?
c. In a shipment of 4500 tomatoes, how many tomatoes can be expected to weigh between 0.31 and 0.91 pound?

Solution See page S1.

z-scores

When you take a test, it is natural to wonder how you will do compared to the other students in the class. Will you finish in the top 10%, or will you be closer to the middle? One statistic that is used to measure the position of a data value with respect to other data values is known as the z-score.

z-score

The z-score for a given data value x is the number of standard deviations between x and the mean of the data. The following formulas are used to calculate the z-score for a data value x.

Population: \( z_x = \dfrac{x - \mu}{\sigma} \)    Sample: \( z_x = \dfrac{x - \bar{x}}{s} \)

In the next example, we use a student's z-scores for two tests to determine how well the student did on each test in comparison to the other students.
TAKE NOTE In any application, the quantity \(x - \mu\) and the standard deviation \(\sigma\) are both measured in the same units. Thus a z-score, which is the quotient of \(x - \mu\) and \(\sigma\), is a dimensionless measure.

EXAMPLE 4 Use z-scores

a. Ruben has taken two tests in his math class. He scored 72 on the first test, for which the mean was 65 and the standard deviation was 8. He received a 60 on the second test, for which the mean was 45 and the standard deviation was 12. In comparison to the other students, did Ruben do better on the first or the second test?
b. Stacy is in the same math class as Ruben. Stacy's z-score for the first test was −0.75. What was Stacy's score on the first test?

Solution
a. The z-score formula yields
\[ z_{72} = \frac{72 - 65}{8} = 0.875 \quad\text{and}\quad z_{60} = \frac{60 - 45}{12} = 1.25 \]
Thus Ruben scored 0.875 standard deviations above the mean on his first test and 1.25 standard deviations above the mean on the second test. In comparison to his classmates, Ruben scored better on the second test than on the first test.
b. Substitute into the z-score formula and solve for x.
\[ -0.75 = \frac{x - 65}{8} \]
\[ -6 = x - 65 \]
\[ x = 59 \]
Stacy's score on the first test was 59.

CHECK YOUR PROGRESS 4
a. Cheryl took two quizzes in her history class. She scored 15 on the first quiz, for which the mean was 12 and the standard deviation was 2.4. Her score on the second quiz, for which the mean was 11.5 and the standard deviation was 2.2, was 14. In comparison to her classmates, did Cheryl do better on the first or the second quiz?
b. Greg is in the same history class as Cheryl. Greg's z-score for the first quiz was 2.5. What was Greg's score on the first quiz?

Solution See page S1.

Topics for Discussion
1. Is it possible, in a normal distribution of data, for the mean to be much larger than the median? Explain.
2. Must all large data sets have a normal distribution? Explain.
3. A professor gave a final examination to 110 students. Eighteen students had examination scores that were more than one standard deviation above the mean. Does this indicate that 18 of the students had examination scores that were less than one standard deviation below the mean? Explain.
4. A set of data consists of the 525 monthly salaries, listed in dollars, of the employees of a large company. What units should be used for the z-scores associated with the salaries? Explain.
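The z-score calculations in Example 4 can be reproduced with a short Python sketch. The function names are assumptions introduced for illustration; the numbers are Ruben's and Stacy's values from the example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations between x and the mean."""
    return (x - mean) / std_dev


def value_from_z(z, mean, std_dev):
    """Solve the z-score formula for x."""
    return mean + z * std_dev


# Part a: Ruben's two tests.
z_first = z_score(72, mean=65, std_dev=8)
z_second = z_score(60, mean=45, std_dev=12)
print(z_first, z_second)
print("better on the second test:", z_second > z_first)

# Part b: Stacy's score on the first test, from her z-score of -0.75.
print(value_from_z(-0.75, mean=65, std_dev=8))
```

Because the numerator and the standard deviation carry the same units, the result is a pure number, which is why z-scores from tests with different means and spreads can be compared directly.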
EXERCISES

In Exercises 1 to 8, determine whether the given statement is true or false.
1. If a distribution is symmetric about a vertical line, then it is a normal distribution.
2. Every normal distribution has a bell-shaped graph.
3. In a normal distribution, the mean, the median, and the mode of the distribution all are located at the center of the distribution.
4. In a distribution that is skewed to the left, the median of the data is greater than the mean.
5. If a z-score for a data value x is negative, then x must also be negative.
6. In every data set, 68.2% of the data lies within 1 standard deviation of the mean.
7. Let x be the number of people who attend a baseball game. The variable x is a discrete variable.
8. The time of day d in the lobby of a bank is measured with a digital clock. The variable d is a continuous variable.

In Exercises 9 and 10, use the Empirical Rule to answer each question.
9. In a normal distribution, what percent of the data lies
a. within 2 standard deviations of the mean?
b. more than 1 standard deviation above the mean?
c. between 1 standard deviation below the mean and 2 standard deviations above the mean?
10. In a normal distribution, what percent of the data lies
a. within 3 standard deviations of the mean?
b. less than 2 standard deviations below the mean?
c. between 2 standard deviations below the mean and 3 standard deviations above the mean?

Business and Economics

11. State Sales Tax Rates Use the following frequency distribution to determine
a. the percent of states in the U.S. that had a 2001 sales tax rate of at least 5%.
b. the probability that a state selected at random had a 2001 sales tax rate of at least 3% but less than 5%.

State Sales Tax Rate

Tax rate, r       Number of states   Relative frequency
0% ≤ r < 1%              5                 10%
1% ≤ r < 2%              0                  0%
2% ≤ r < 3%              1                  2%
3% ≤ r < 4%              0                  0%
4% ≤ r < 5%             13                 26%
5% ≤ r < 6%             15                 30%
6% ≤ r < 7%             13                 26%
7% ≤ r < 8%              3                  6%

Source: Time Almanac

12. Waiting Time The amount of time customers spend waiting in line at a bank is normally distributed, with a mean of 3.5 minutes and a standard deviation of 0.75 minute. Find the probability that the time a customer will spend waiting is
a. at most 2.75 minutes.
b. less than 2 minutes.
13. Weights of Parcels During a particular week, an overnight delivery company found that the weights of its parcels were normally distributed, with a mean of 24 ounces and a standard deviation of 6 ounces.
a. What percent of the parcels weighed between 12 ounces and 30 ounces?
b. What percent of the parcels weighed more than 42 ounces?
14. Weights of Boxes of Corn Flakes The weights of the boxes of corn flakes filled by a machine are normally distributed, with an average weight of 14.5 ounces and a standard deviation of 0.5 ounce. What percent of the boxes
a. weigh less than 14.0 ounces?
b. weigh between 13.5 and 15.0 ounces?
15. Duration of Long Distance Telephone Calls A telephone company has found that the lengths of its long distance telephone calls are normally distributed, with a mean of 225 seconds and a standard deviation of 55 seconds. What percent of its long distance calls last
a. more than 335 seconds?
b. between 170 and 390 seconds?

Life and Health Sciences

16. Median Income for Physicians The 1995 median income for physicians was $160,000. (Source: AMA Center for Health Policy Research) The distribution of these incomes is skewed to the right. Is the mean of these incomes greater than or less than $160,000?
17. Heights of Women A survey of 1000 women aged 20 to 30 found that their heights are normally distributed, with a mean of 65 inches and a standard deviation of 2.5 inches.
a. How many of the women have a height that is within 1 standard deviation of the mean?
b. How many of the women have a height that is between 60 inches and 70 inches?
18. Distribution of Data Consider the set of the heights of all babies born in the United States during a particular year. Do you think this data set can be closely approximated by a normal distribution? Explain.

Social Sciences

19. Presidential Inauguration Ages and Ages at Death The table in Exercise 26 of Section 8.4 lists the U.S. presidents and their ages at inauguration. The table in Exercise 27 of Section 8.4 lists the deceased U.S. presidents as of December 2002, and their ages at death.
a. Construct a back-to-back stem-and-leaf diagram for the data in the tables.
b. What patterns, if any, are evident from the diagram?
20. Average Salaries of Teachers Use the following frequency distribution to determine
a. the percent of states in the U.S. that paid an average teacher salary of at least $39,000.
b. the probability that a state selected at random paid an average teacher salary of at least $36,000 but less than $45,000.

Average Salaries of Public School Teachers

Average salary, s            Number of states   Relative frequency
$27,000 ≤ s < $30,000               3                  6%
$30,000 ≤ s < $33,000               7                 14%
$33,000 ≤ s < $36,000              12                 24%
$36,000 ≤ s < $39,000               9                 18%
$39,000 ≤ s < $42,000               6                 12%
$42,000 ≤ s < $45,000               3                  6%
$45,000 ≤ s < $48,000               5                 10%
$48,000 ≤ s < $51,000               3                  6%
$51,000 ≤ s < $54,000               2                  4%

Source:
21. Test Scores The following relative frequency histogram shows the distribution of test scores for 50 students who took a history test.

[Figure: relative frequency histogram. Vertical axis: Relative frequency, 0% to 25% in steps of 5%. Horizontal axis: Test scores.]

a. What percent of the students scored at least 76 on the test?
b. How many of the students received a score of at least 60 but less than 84?
22. Examination Duration Times At a university, 500 law students took an examination. One student completed the exam in 24 minutes. The mode for the completion time is 50 minutes. The distribution of the times the students took to complete the exam is skewed to the left. Is the mean of these times greater than or less than 50 minutes?
23. Intelligence Quotients A psychologist finds that the intelligence quotients of a group of patients are normally distributed, with a mean of 104 and a standard deviation of 26. Find the percent of the patients with IQs
a. above 130.
b. between 130 and 182.
24. Distribution of Data The population of a resort city consists mostly of wealthy families and families with low incomes. Do you think the set of family incomes for this city can be closely approximated by a normal distribution? Explain.
25. Comparison of Quiz Scores Ryan took two quizzes in his art class. He scored 45 on the first quiz, for which the mean was 51.4 and the standard deviation was 9.5. His score on the second quiz, for which the mean was 53.6 and the standard deviation was 7.2, was 49. In comparison to his classmates, did Ryan do better on the first or the second quiz?
26. Comparison of Test Scores Tanya took two tests in her chemistry class. She scored 85 on the first test, for which the mean was 79.4 and the standard deviation was 6.4. Her score on the second test, for which the mean was 70.5 and the standard deviation was 5.3, was 78. In comparison to her classmates, did Tanya do better on the first or the second test?

Sports and Recreation

27. Super Bowl Scores The following table lists the winning and losing scores for all of the Super Bowl games up to the year

[Table: Super Bowl Results]

a. Construct a back-to-back stem-and-leaf diagram for the winning scores and the losing scores.
b. What patterns, if any, are evident from the back-to-back stem-and-leaf diagram?
28. Ironman Triathlon The following table lists the winning times for the men's and women's Ironman Triathlon World Championships, held in Kailua-Kona, Hawaii. (Source: hawaii2001/statistik/index.php)
Ironman Triathlon World Championships (Winning times rounded to the nearest minute)

Men, Women,
:47 8:29 8:20 12:55 9:35 9:17
11:16 8:34 8:21 11:21 9:01 9:07
9:25 8:31 8:04 12:01 9:01 9:32
9:38 8:09 8:33 10:54 9:14 9:24
9:08 8:28 8:24 10:44 9:08 9:13
9:06 8:19 8:17 10:25 8:55 9:26
8:54 8:09 8:21 10:25 8:58 8:51
8:08 9:49 9:20

a. Construct a back-to-back stem-and-leaf diagram for the data in the tables. Hint: Use the two-digit minutes as your leaves, and insert a comma between the leaves in each row so that they can be easily distinguished from each other.
b. What patterns, if any, are evident from the back-to-back stem-and-leaf diagram?
29. Home Run Leaders The following tables list the numbers of home runs hit by the home run leaders in the National and the American League from 1971 to

[Table: Home Run Leaders, with columns for the National League and the American League]

a. Construct a back-to-back stem-and-leaf diagram for the data in the tables.
b. What patterns, if any, are evident from the back-to-back stem-and-leaf diagram?
30. Race Times The following relative frequency histogram shows the distribution of times for the 1200 contestants who finished a race.

[Figure: relative frequency histogram. Vertical axis: Relative frequency, 0% to 24% in steps of 4%. Horizontal axis: Time, in seconds.]

a. What percent of the contestants finished the race in less than 80 seconds?
b. How many contestants had a time of at least 60 seconds but less than 80 seconds?
31. Baseball Attendance A baseball franchise finds that the attendance at its home games is normally distributed, with a mean of 16,000 and a standard deviation of
a. What percent of the home games have an attendance between 8000 and 16,000?
b. What percent of the home games have an attendance of less than 12,000?

Physical Sciences and Engineering

32. Breaking Points of Ropes The breaking points of a particular type of rope are normally distributed, with a mean of 350 pounds and a standard deviation of 24 pounds. What is the probability that a piece of this rope chosen at random will have a breaking point of
a. less than 326 pounds?
b. between 302 and 398 pounds?
33. Tire Mileage The mileages of WearEver tires are normally distributed, with a mean of 48,000 miles and a standard deviation of 6000 miles. What is the probability that the WearEver tire you purchase will provide a mileage of
a. more than 60,000 miles?
b. between 42,000 and 54,000 miles?
34. Highway Speed of Vehicles A study of 8000 vehicles that passed by a highway checkpoint found that their speeds were normally distributed, with a mean of 61 miles per hour and a standard deviation of 7 miles per hour.
a. How many of the vehicles had a speed of more than 68 miles per hour?
b. How many of the vehicles had a speed of less than 40 miles per hour?

Explorations

Chebyshev's Theorem The following well-known theorem is called Chebyshev's theorem. It is named after the Russian mathematician Pafnuty Lvovich Chebyshev. Chebyshev's theorem states that a mathematical relationship exists between the spread of data and the standard deviation of the data. A remarkable property of Chebyshev's theorem is that it is valid for any set of data. This is unlike the Empirical Rule, which applies only to sets of data that have normal distributions.

Chebyshev's Theorem
The proportion or percentage of any data set that lies within z standard deviations of the mean, where z is any positive number greater than 1, is at least
\[ 1 - \frac{1}{z^{2}} \]

Applying Chebyshev's theorem with \(z = 2\) yields
\[ 1 - \frac{1}{z^{2}} = 1 - \frac{1}{2^{2}} = 1 - \frac{1}{4} = \frac{3}{4} = 75\% \]
This result of 75% means that at least 75% of the data in any data set must lie within 2 standard deviations of the mean of the data set.

1. Use Chebyshev's theorem to determine the minimum percentage of data (to the nearest percent) in any data set that must lie within
a. 1.2 standard deviations of the mean.
b. 2.5 standard deviations of the mean.
c. 3.1 standard deviations of the mean.
2. A new automobile dealership found that during the month of March, the mean selling price of its cars was $29,200, with a standard deviation of $5100. Use Chebyshev's theorem to determine the minimum percentage (to the nearest percent) of the dealership's cars that have a selling price within
a. 1.5 standard deviations of the mean, that is, between $21,550 and $36,850.
b. 2.8 standard deviations of the mean, that is, between $14,920 and $43,480.
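The bound in Chebyshev's theorem is easy to evaluate with a short sketch. The function name `chebyshev_min_fraction` is an assumption introduced here; the example call reproduces the z = 2 case worked in the text.

```python
def chebyshev_min_fraction(z):
    """Minimum fraction of any data set within z standard deviations of the mean.

    The theorem is stated only for z greater than 1.
    """
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2


bound = chebyshev_min_fraction(2)
print(f"at least {bound:.0%} of the data lies within 2 standard deviations")
```

Compare this guaranteed 75% with the 95.4% given by the Empirical Rule for 2 standard deviations: the Chebyshev bound is weaker because it must hold for every data set, not only for normally distributed ones.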