
Honors Algebra 2 Chapter 11 Introduction: Measures of Central Tendency and Dispersion

Statistics: Numerical values used to summarize and compare sets of data. Two important types of statistics are measures of central tendency and measures of dispersion.

Measure of Central Tendency: A number used to represent the center or middle of a set of data. The mean, median, and mode are commonly used measures of central tendency.

Measure of Dispersion: A statistic that tells you how dispersed, or spread out, data values are. The range and standard deviation are commonly used measures of dispersion.

Measures of Central Tendency

Mean: (or _________________ ). The mean of n numbers is the sum of the n numbers divided by n.

The mean is denoted by ______, which is read “x bar”.

For the data set $x_1, x_2, x_3, \ldots, x_n$, the mean is:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$

Median: The median of n numbers is the middle number when the numbers are written in order.

(If n is even, the median is the mean of the two middle numbers.)

Mode: The mode of n numbers is the number or numbers that occur most frequently.

(There may be one mode, no mode, or more than one mode.)

Measures of Dispersion

Range: The difference between the greatest and least data values.

Standard Deviation: Describes the typical difference (or deviation) between a data value and the mean. The standard deviation is denoted by ______, which is read as “sigma”.

For the data set $x_1, x_2, x_3, \ldots, x_n$, the standard deviation is:

$$\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}$$
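The definitions of mean, median, mode, range, and standard deviation can be checked with a short Python sketch. The data set `sample` and all function names here are made up for illustration and are not part of the notes. The standard deviation divides by n, as in the formula above, and "no mode" is taken to mean that every value occurs exactly once.

```python
import math
from collections import Counter


def mean(values):
    # Sum of the n numbers divided by n.
    return sum(values) / len(values)


def median(values):
    # Middle number of the ordered data, or the mean of the two middle numbers.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    # All values that occur most frequently; empty list when nothing repeats.
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


def data_range(values):
    # Difference between the greatest and least data values.
    return max(values) - min(values)


def std_dev(values):
    # Square root of the average squared deviation from the mean (divide by n).
    x_bar = mean(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / len(values))


# Hypothetical data set, chosen only so the arithmetic is easy to follow.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print("mean:", mean(sample))
print("median:", median(sample))
print("mode(s):", modes(sample))
print("range:", data_range(sample))
print("standard deviation:", std_dev(sample))
```

Working the same data set by hand and comparing against the printed values is a quick way to catch arithmetic slips.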


Ex. 1: Find the mean, median, and mode of the data set: 8,10,11,5,8,6,8,4,7,6

Ex. 2: Find the range and standard deviation of the data set: 14,3,4,8,6

Ex. 3: The data below shows the scores on a recent quiz. 28,40,31,7,29,32,40,35

a) Identify the outlier in the data set. b) Find the mean, median, mode, range, and standard deviation when the outlier is included. c) Find the mean, median, mode, range, and standard deviation when the outlier is not included.
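To see how a single outlier shifts each statistic, the quiz scores from Ex. 3 can be summarized twice using Python's built-in `statistics` module. This is only a sketch. The helper name `summarize` is made up, the score 7 is treated as the outlier because it sits far below the other scores, and `pstdev` is used because it divides by n like the formula in these notes.

```python
import statistics


def summarize(values):
    # Collect the five statistics from these notes in one dictionary.
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
        "range": max(values) - min(values),
        "std_dev": statistics.pstdev(values),  # population form: divide by n
    }


scores = [28, 40, 31, 7, 29, 32, 40, 35]
outlier = 7  # assumption: the value far from the rest of the data
trimmed = [s for s in scores if s != outlier]

with_outlier = summarize(scores)
without_outlier = summarize(trimmed)

for name in with_outlier:
    print(f"{name}: with outlier = {with_outlier[name]}, "
          f"without outlier = {without_outlier[name]}")
```

Comparing the two columns shows which measures react strongly to the outlier and which barely move.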

Graphing Calculator:

Homework: Pg.593 #’s 1-7


Honors Algebra 2 Section 11.1 - Normal Distributions

Mean ($\bar{x}$):

The ___________ of the set of data values.

STANDARD DEVIATION:

The typical difference (or deviation) between a data value and the __________.

NORMAL DISTRIBUTION: A probability distribution modeled by a bell shaped curve called a _________________________

that is symmetric about the __________.

NORMAL CURVE: A normal distribution with mean $\bar{x}$ and standard deviation $\sigma$ has the following properties:

AREA UNDER A NORMAL CURVE:

- Total area under the related normal curve is _______ or ________%.

- About _______ % of the area lies within 1 standard deviation of the mean.

- About _______ % of the area lies within 2 standard deviations of the mean.

- About _______ % of the area lies within 3 standard deviations of the mean.
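The areas in the list above can be computed rather than memorized. The sketch below relies on the standard fact that the area within k standard deviations of the mean equals erf(k / √2) for any normal curve. The function name `area_within` is made up for illustration.

```python
import math


def area_within(k):
    # Area under a normal curve within k standard deviations of the mean.
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

Rounding the printed values to the nearest percent (or tenth of a percent for 3 standard deviations) gives the figures usually quoted for a normal curve.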

Ex. 1: A normal distribution has mean $\bar{x}$ and standard deviation $\sigma$. Find the indicated probability for a randomly selected x-value from the distribution.

a) xxP b) xxxP 2 c) 2 xxP

Ex. 2: A normal distribution has a mean of 27 and a standard deviation of 5. Find the probability that a randomly selected x-value from the distribution is in the given interval.

a) Between 17 and 32 b) At least 32 c) At most 37


Ex. 3: The blood cholesterol readings for a group of women are normally distributed with a mean of 172 mg/dl and a standard deviation of 14 mg/dl.

a) About what percent of the women have readings between 158 and 186?

b) Readings higher than 200 are considered undesirable. About what percent of the readings are undesirable?

STANDARD NORMAL DISTRIBUTION:

The standard normal distribution is the normal distribution with __________________ and_______________________________.

Z-VALUE: The z-value for a particular x-value is called the z-score for the x-value and is the number of standard deviations the x-value lies above or below the mean.

$$z = \frac{x - \bar{x}}{\sigma}$$

This formula can be used to transform x-values from a normal distribution with mean $\bar{x}$ and standard deviation $\sigma$ into z-values having a standard normal distribution.

Standard Normal Table: If z is a randomly selected value from a standard normal distribution, you can use the table to find the probability that z is less than or equal to some given value.

Note: In the table, the value .0000+ means slightly more than 0, and the value 1.000- means slightly less than 1.
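The z-score transformation and the table lookup can be imitated in Python. In this sketch `statistics.NormalDist` plays the role of the standard normal table. The mean of 100, standard deviation of 15, and x-value of 130 are hypothetical numbers chosen for illustration, and the function name `z_score` is made up.

```python
from statistics import NormalDist


def z_score(x, mean_value, std_dev):
    # Number of standard deviations that x lies above or below the mean.
    return (x - mean_value) / std_dev


# Standard normal distribution: this stands in for the printed table.
standard_normal = NormalDist(mu=0, sigma=1)

# Hypothetical normal distribution and x-value.
mean_value, std_dev = 100, 15
x = 130

z = z_score(x, mean_value, std_dev)
p = standard_normal.cdf(z)  # probability that a value is at most x

print(f"z = {z:.2f}")
print(f"P(value <= {x}) is about {p:.4f}")
```

A printed table usually lists z only to one decimal place, so a table answer can differ slightly from the value computed here.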

Ex. 4: Scientists conducted aerial surveys of a seal sanctuary and recorded the number x of seals they observed during each survey. The numbers of seals observed were normally distributed with a mean of 73 seals and a standard deviation of 14.1 seals.

a) Find the probability that at most 50 seals were observed during a survey.

Step 1:

Step 2:

b) Find the probability that at most 90 seals were observed during a survey.

Ex. 5: A survey of 20 colleges found that the average credit card debt for seniors was $3450. The debt was normally distributed with a standard deviation of $1175.

a) Find the probability that the credit card debt of the seniors was at most $3600.

b) Find the probability that the credit card debt of the seniors was at least $3600.

Homework: worksheet


11.2 Populations, Samples, and Hypotheses

Exploration Activity

a. What is the theoretical probability of rolling a sum of 7 when two six-sided dice are rolled?

b. Conduct an experiment to determine the probability of rolling a sum of 7 when rolling two six-sided dice.

Calculator: (Apps) (5: Prob Sim) (2: Roll Dice) (Zoom) Trial Set: 25 Dice: 2 Sides: 6 Graph: Freq Store Table: 50 Clear Table: Yes Update after: 1 (Table) (Y=) (Window) *(Hit Window, Zoom, or Trace) *(Use arrow keys to see frequency of each sum) *(Graph) to clear (y=) for yes

c. What happens as you increase the sample size?
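Parts a through c of the exploration can also be tried without a calculator. The sketch below counts the 36 equally likely outcomes to get the theoretical probability, then simulates rolls of two dice for several sample sizes. The function names and the seed value are arbitrary choices made for illustration.

```python
import random
from fractions import Fraction


def theoretical_probability(target=7, sides=6):
    # Count the outcomes of two dice whose sum equals the target.
    outcomes = [(a, b) for a in range(1, sides + 1) for b in range(1, sides + 1)]
    favorable = sum(1 for a, b in outcomes if a + b == target)
    return Fraction(favorable, len(outcomes))


def experimental_probability(rolls, rng, target=7, sides=6):
    # Roll two dice `rolls` times and return the fraction of sums equal to target.
    hits = 0
    for _ in range(rolls):
        if rng.randint(1, sides) + rng.randint(1, sides) == target:
            hits += 1
    return hits / rolls


rng = random.Random(2020)  # arbitrary seed so that a run can be repeated
exact = theoretical_probability()
print(f"theoretical probability of a sum of 7: {exact}")

for rolls in (50, 100, 500, 1000):
    estimate = experimental_probability(rolls, rng)
    print(f"{rolls:>5} rolls: experimental probability = {estimate:.3f}")
```

Changing the seed gives a different experiment each time, which makes it easy to see how much the small samples bounce around compared with the large ones.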

Population and Samples

Population: the collection of all data, such as responses, measurements, or counts, that you want information about.

Sample: a subset of that population.

Census: consists of data from the entire population.

Example #1: Identify the population and the sample. Describe the sample.

a. The owner of a dance studio asks 32 dancers what their favorite type of dance is, and 25 of them say hip-hop.

b. A counselor at Easton Middle School pulled 225 student class schedules, and found that 46 students have science first period.

Parameter: a numerical description of a population.

Statistic: a numerical description of a sample.

Example #2: Distinguish between parameters and statistics.

a. For all teenagers in a certain town working jobs last summer, the mean hourly wage was $6.95. Is the mean wage a parameter or statistic?

b. A survey of 912 men, ages 50-60 in Central America, found that the standard deviation of the length of their feet is about 4 cm. Is this a parameter or statistic?

Number of Rolls | Number of Times a Sum of 7 Appears | Experimental Probability
50 | ______ | ______
100 | ______ | ______
500 | ______ | ______
1000 | ______ | ______


Analyzing Hypotheses

To analyze a hypothesis, you need to distinguish between results that can easily occur by chance and results that are highly unlikely to occur by chance. One way to analyze a hypothesis is to perform a simulation. When results are highly unlikely to occur, the hypothesis is probably false.

Example #3

You roll a six-sided die 5 times and do not get an even number. The probability of this happening is $\left(\frac{1}{2}\right)^5 = 0.03125$, so you suspect this die favors odd numbers. The die maker claims the die does not favor odd numbers or even numbers.

a. What should you conclude when you roll the die 50 times and get 23 odd numbers?

b. What should you conclude when you roll the die 50 times and get 40 odd numbers?
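One way to judge whether a result "can easily occur by chance" is to compute how likely it is for a fair die. The sketch below uses the binomial formula with probability 1/2 for an odd number on each roll. It assumes 50 rolls in both parts, and the function name `prob_at_least` is made up for illustration. A simulation would be another valid approach.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    # Probability of k or more successes in n independent trials.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Check against the notes: 5 odd numbers in 5 rolls of a fair die.
print("P(5 odd in 5 rolls) =", prob_at_least(5, 5))

# Assumption: the die is rolled 50 times in both parts of the example.
for odd_count in (23, 40):
    chance = prob_at_least(odd_count, 50)
    print(f"P(at least {odd_count} odd in 50 rolls of a fair die) = {chance:.6f}")
```

A probability that is not small means the result is consistent with the die maker's claim. A probability close to 0 means the result would be very surprising for a fair die.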

Homework: pgs.607-608 #’s 7-10, 15-21, 25


11.3 Collecting Data

Exploration Activities

1.) Analyzing Sampling Techniques

Work with a partner. Determine whether each sample is representative of the population. Explain your reasoning.

a. To determine the number of hours people exercise during a week, researchers use random-digit dialing and call 1500 people.

b. To determine how many text messages high school students send in a week, researchers post a survey on a website and receive 750 responses.

c. To determine how much money college students spend on clothes each semester, a researcher surveys 450 college students as they leave the university library.

2.) Analyzing Survey Questions

Work with a partner. Determine whether each survey question is biased. Explain your reasoning. If the question is biased, suggest an unbiased rewording.

a. Does eating nutritious, whole-grain foods improve your health?

b. Do you ever attempt the dangerous activity of texting while driving?

c. How many hours do you sleep each night?

3.) Analyzing Survey Randomness and Truthfulness

Work with a partner. Discuss each potential problem in obtaining a random survey of a population. Include suggestions for overcoming the problem.

a. The people selected might not be a random sample of the population.

b. The people selected might not be willing to participate in the survey.

c. The people selected might not be truthful when answering the question.


Random Sample: each member in the population has an equal chance of being selected. (MOST PREFERRED)

Example #1

You want to determine whether people in your neighborhood like the new social media website that provides neighborhood updates. Identify the type of sample described.

a. You ask all the neighbors on the block where you reside.

b. You randomly select a neighbor from each block in the neighborhood.

c. You e-mail a questionnaire to each neighbor and use only the questionnaires that are returned.

d. Describe how you could do a systematic sample.

Bias: an error that results in a misrepresentation of a population.

Unbiased sample: a sample that is representative of the population that you want information about.

Biased sample: a sample that overrepresents or underrepresents part of the population.

Example #2: Identify the type of sample and explain why the sample is biased.

a. The principal asks students at one lunch table about the quality of food served in the school’s cafeteria.

b. A sports announcer wants to know how often people in the town attend community sporting events. She asks every tenth person in attendance at a local soccer game.


Example #3: Selecting an Unbiased Sample

You are in charge of the senior-class prom. You want to poll the members of the senior class to find out where the prom should be held. There are 415 students in the senior class. Describe a method for selecting a random sample of 70 seniors to poll.

*Random Integers on Calculator: (math), (> prob), (5:randInt), Lower: 1, Upper: 415, n: 1, Enter, Enter
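The same idea can be sketched in Python: number the seniors 1 through 415 and draw 70 different numbers at random. The function name `choose_sample` and the numbering scheme are assumptions made for illustration. Unlike pressing Enter repeatedly on a random-integer command, `random.sample` never returns the same number twice, so no repeats need to be thrown out.

```python
import random


def choose_sample(class_size=415, sample_size=70, seed=None):
    # Each student is identified by a number from 1 to class_size.
    rng = random.Random(seed)
    # sample() picks without replacement, so every number appears at most once.
    return sorted(rng.sample(range(1, class_size + 1), sample_size))


selected = choose_sample()
print(f"{len(selected)} seniors selected:")
print(selected)
```

The students whose assigned numbers appear in the printed list are the ones to poll.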

Methods of Collecting Data

An experiment imposes a treatment on individuals in order to collect data on their response to the treatment. The treatment may be a medical treatment, or it can be any action that might affect a variable in the experiment, such as adding methanol to gasoline and then measuring its effect on fuel efficiency.

An observational study observes individuals and measures variables without controlling the individuals or their environment. This type of study is used when it is difficult to control or isolate the variable being studied, or when it may be unethical to subject people to a certain treatment or to withhold it from them.

A survey is an investigation of one or more characteristics of a population. In a survey, every member of a sample is asked one or more questions.

A simulation uses a model to reproduce the conditions of a situation or process so that the simulated outcomes closely match the real-world outcomes. Simulations allow you to study situations that are impractical or dangerous to create in real life.

Example #4: Identify the method of data collection each situation describes.

a. A teacher records how many students enter the classroom and turn in their homework before sitting down at their desks.

b. A manager uses a computer program to predict how many defective products can be expected on a particular assembly line.

When designing surveys, it is important to word your questions so they do not lead to biased results. Questions that are flawed in a way that leads to inaccurate results are called biased questions.

Avoid Questions that:

Encourage a particular response

Do not provide enough information to give an accurate opinion

Are too sensitive to answer truthfully

Address more than one issue

Example #5

A dentist surveys his patients by asking, “Do you brush your teeth at least twice per day and floss every day?” Explain why the question may be biased or otherwise introduce bias into the survey. Then describe a way to correct the flaw.

Homework: pgs.614-616 #’s 5-12, 15-19, 21-28, 34, 40


11.4 Experimental Design

Controlled Experiment: two groups are studied under identical conditions with the exception of one variable.

Control Group: the group that is subject to no treatment.

Treatment Group: the group that is subjected to treatment.

Randomization: a process of randomly assigning subjects to different treatment groups.

Randomized Comparative Experiment: subjects are randomly assigned to the control group or the treatment group.

Placebo: a harmless, unmedicated treatment that resembles the actual treatment.

Example #1

Determine whether each study is a randomized comparative experiment. If it is, describe the treatment, the treatment group, and the control group. If it is not, explain why not and discuss whether conclusions drawn from the study are valid.

a) Supermarket Checkout:

b) Car Safety:

Comparative Studies and Causality

A rigorous randomized comparative experiment, by eliminating sources of variation other than the controlled variable, can make valid cause-and-effect conclusions possible. An observational study can identify correlation between variables, but not causality. Variables, other than what is being measured, may be affecting the results.


Example #2

Explain whether each research topic is best investigated through an experiment or an observational study. Then describe the design of the experiment or observational study.

a) A researcher wants to know whether eating carrots daily improves people's eyesight.

b) You want to know whether flowers sprayed twice per day with a mist of water stay fresh longer than flowers that are not sprayed.

An important part of experimental design is sample size, or the number of subjects in the experiment. To improve the validity of the experiment, replication is required, which is repetition of the experiment under the same or similar conditions.

Example #3

A pharmaceutical company wants to test the effectiveness of a new chewing gum designed to help people lose weight. Identify a potential problem, if any, with each experimental design. Then describe how you can improve it.

a. The company identifies 10 people who are overweight. Five subjects are given the new chewing gum and the other 5 are given a placebo. After 3 months, each subject is evaluated and it is determined that the 5 subjects who have been using the new chewing gum have lost weight.

b. The company identifies 10,000 people who are overweight. The subjects are divided into groups according to gender. Females receive the new chewing gum and males receive the placebo. After 3 months, a significantly large number of the female subjects have lost weight.

c. The company identifies 10,000 people who are overweight. The subjects are divided into groups according to age. Within each age group, subjects are randomly assigned to receive the new chewing gum or the placebo. After 3 months, a significantly large number of the subjects who received the new chewing gum have lost weight.

Homework: pgs. 623-624 #’s 3-8, 11, 16


11.5 Making Inferences from Sample Surveys

Descriptive statistics involves the organization, summarization, and display of data.

Inferential statistics involves using a sample to draw conclusions about a population.

Example #1: Estimate the population mean µ.

Example #2: Estimating Population Proportion. A cafeteria manager wants to predict the number of chicken sandwiches that will be sold when students can choose between a chicken sandwich and a turkey burger. Seven cafeteria workers conduct surveys of randomly selected students. The students are asked whether they will buy a chicken sandwich. The results are shown in the table.

a. Based on the results of the first two sample surveys, do you think there will be more chicken sandwiches sold than turkey burgers? Explain.

b. Based on the results in the table, do you think there will be more chicken sandwiches sold than turkey burgers? Explain.

Margin of Error Formula

When a random sample of size n is taken from a large population, the margin of error is approximated by

$$\text{Margin of error} = \pm\frac{1}{\sqrt{n}}$$

This means that if the percent of the sample responding a certain way is p (expressed as a decimal), then the percent of the population who would respond the same way is likely to be between $p - \frac{1}{\sqrt{n}}$ and $p + \frac{1}{\sqrt{n}}$.
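The margin of error formula is easy to turn into a small calculator. In the sketch below, the survey size of 1600 and the sample result of 40% are hypothetical numbers picked so the square root comes out evenly, and the function names are made up for illustration. The last function solves the same formula for n, which is useful when the margin of error is given and the sample size is unknown.

```python
import math


def margin_of_error(n):
    # Approximate margin of error, as a decimal, for a random sample of size n.
    return 1 / math.sqrt(n)


def likely_interval(p, n):
    # Interval likely to contain the population proportion.
    e = margin_of_error(n)
    return (p - e, p + e)


def sample_size_for(margin):
    # Solve margin = 1 / sqrt(n) for n and round to a whole number of people.
    return round(1 / margin**2)


# Hypothetical survey: 1600 people, 40% respond a certain way.
n, p = 1600, 0.40
low, high = likely_interval(p, n)

print(f"margin of error: +/- {margin_of_error(n):.1%}")
print(f"likely interval: {low:.1%} to {high:.1%}")
print(f"sample size for a 2.5% margin of error: {sample_size_for(0.025)}")
```

Note that the margin must be entered as a decimal (for example, 0.025 rather than 2.5) before solving for n.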

Example #3: Finding Margin of Error

1. In a survey of 2048 people in the U.S., 55% said that television is their main source of news.

(a) What is the margin of error for the survey?

(b) Give an interval that is likely to contain the exact percent of all people who use television as their main source of news.


2. In a survey of 2680 people in the U.S., 34% said that movies are their main source of entertainment.

(a) What is the margin of error for the survey?

(b) Give an interval that is likely to contain the exact percent of all people who use movies as their main source of entertainment.

3. A group of students survey the local community about their favorite flavor of ice cream. How many people did they survey if the margin of error is ± 7%?

4. In a poll about which movie channel its customers prefer to watch, 41% of the customers prefer HBO. If the margin of error was ± 3.6%, how many people did they survey?

5. A survey claims that the percent of all the people who use television as their main source of news is between 46.3% and 55.8%. How many people were surveyed?

Homework: pgs.630-632 #’s 3, 7, 12, 14, 17, 18, 23, 25


11.6 Making Inferences from Experiments

Example #1: A randomized comparative experiment tests whether the use of rainwater affects the total yield (in kilograms) of green bell peppers. The control group, which receives a mix of faucet water and rainwater, has 10 plants, and the treatment group, which receives only rainwater, has 10 plants. The table shows the results.

a) Find the mean yield of the control group

b) Find the mean yield of the treatment group

c) Find the experimental difference of the means

d) Display the data in a double dot plot.

e) What can you conclude?

Example #2 Resample the data in Extra Example 1 using a simulation. Use the mean yield of the new control and treatment groups to calculate the difference of the means.

Step 1: Combine the measurements from both groups and assign a number to each value. Let the numbers 1 through 10 represent the data in the original control group, and let the numbers 11 through 20 represent the data in the original treatment group.

Step 2: Use a random number generator. Randomly generate 20 numbers from 1 through 20 without repeating a number. Use the first 10 numbers to make the new control group, and the next 10 to make the new treatment group.
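Steps 1 and 2 can be sketched in Python as follows. The yield values below are hypothetical placeholders, because the data table from the notes is not reproduced here, and the function names are made up for illustration.

```python
import random

# Hypothetical yields in kilograms (NOT the values from the original table).
control = [1.0, 1.2, 1.1, 0.9, 1.3, 1.0, 1.1, 1.2, 0.8, 1.0]
treatment = [1.3, 1.1, 1.4, 1.2, 1.5, 1.2, 1.3, 1.1, 1.4, 1.3]


def mean(values):
    return sum(values) / len(values)


def resample_once(control, treatment, rng):
    combined = control + treatment
    # Step 1: number the values 1..20 (1-10 control, 11-20 treatment).
    labeled = {i + 1: value for i, value in enumerate(combined)}
    # Step 2: generate the numbers 1..20 in random order without repeats.
    order = rng.sample(range(1, len(combined) + 1), len(combined))
    # The first 10 numbers form the new control group, the rest the new treatment group.
    new_control = [labeled[k] for k in order[: len(control)]]
    new_treatment = [labeled[k] for k in order[len(control):]]
    return new_control, new_treatment


rng = random.Random()
new_control, new_treatment = resample_once(control, treatment, rng)
difference = mean(new_treatment) - mean(new_control)

print("new control group:  ", new_control)
print("new treatment group:", new_treatment)
print(f"difference of the means: {difference:.3f}")
```

Each run produces a different regrouping, so the difference of the means changes from run to run.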

Table: Total Yield (kg) for the Control Group and the Treatment Group (data values not reproduced in this transcript)


Example #3 To conclude that the treatment in Extra Example 1 is responsible for the difference in yield, you need to analyze this hypothesis: The type of water has no effect on the yield of the green bell pepper plants. Use the histogram, which shows the results from 200 resamplings of the data. Compare the experimental difference of 0.18 from Extra Example 1 with the resampling differences. What can you conclude about the hypothesis? Does rainwater have an effect on the yield?
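The comparison described in Example #3 can be imitated by repeating the resampling many times and checking how often a regrouped difference is at least as large as the experimental one. The sketch below uses hypothetical yield values, since the original table and histogram are not reproduced here. Only the experimental difference of 0.18 and the count of 200 resamplings come from the notes, and the function names are made up for illustration.

```python
import random

# Hypothetical yields in kilograms (NOT the values from the original table).
control = [1.0, 1.2, 1.1, 0.9, 1.3, 1.0, 1.1, 1.2, 0.8, 1.0]
treatment = [1.3, 1.1, 1.4, 1.2, 1.5, 1.2, 1.3, 1.1, 1.4, 1.3]


def mean_difference(control, treatment):
    # Mean of the treatment group minus mean of the control group.
    return sum(treatment) / len(treatment) - sum(control) / len(control)


def resampling_differences(control, treatment, trials, rng):
    # Regroup the combined data at random and record the difference each time.
    combined = control + treatment
    differences = []
    for _ in range(trials):
        shuffled = combined[:]
        rng.shuffle(shuffled)
        new_control = shuffled[: len(control)]
        new_treatment = shuffled[len(control):]
        differences.append(mean_difference(new_control, new_treatment))
    return differences


def fraction_at_least_as_extreme(differences, observed):
    # Share of resampled differences at least as far from 0 as the observed one.
    extreme = sum(1 for d in differences if abs(d) >= abs(observed))
    return extreme / len(differences)


observed = 0.18  # experimental difference stated in the notes
differences = resampling_differences(control, treatment, 200, random.Random())
share = fraction_at_least_as_extreme(differences, observed)

print(f"resampled differences as extreme as {observed}: {share:.1%}")
```

If only a small share of the regrouped differences are as extreme as the experimental difference, the hypothesis that the type of water has no effect becomes hard to believe. A large share means the experimental difference could easily be due to chance.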

Homework: pgs. 637-638 #’s 3, 7, 9, 12