Statistical Analysis – Chapter 4 Normal Distribution

Upload: melissa-anthony

Post on 23-Dec-2015



Page 2: Statistical Analysis – Chapter 4 Normal Distribution

What is the normal curve?

In Chapter 2 we talked about histograms and modes.

A normal distribution is a set of values for one variable that, when displayed in a histogram (or line graph), has one peak (mode) and looks like a bell. Here is an example using height:

Page 3: Statistical Analysis – Chapter 4 Normal Distribution

Characteristics of the Normal Curve

a. Bell shaped, fading at the tails. In other words, more values are in the middle, and odd or unusual values fall at the tails

b. All (100%) of the data fits under the curve, with 50% below the mean and 50% above

c. About 68% of the data falls within -1 and +1 standard deviations of the mean

d. About 95% of the data falls between -2 and +2 standard deviations of the mean

e. The percentage of data between any two points is equal to the probability of randomly selecting a value between the two points (remember classical probability from Ch. 3)
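The 68% and 95% figures can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); the helper name `share_within` is made up for illustration and is not from the textbook.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Proportion of the data within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(f"within 1 standard deviation:  {share_within(1):.4f}")
print(f"within 2 standard deviations: {share_within(2):.4f}")
print(f"below the mean:               {std_normal.cdf(0):.2f}")
```

The computed shares are slightly above 68% and 95%, which is why those two numbers are best read as rounded rules of thumb.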

Page 4: Statistical Analysis – Chapter 4 Normal Distribution

Standard Deviations and Z-Scores

Z-score = the number of standard deviations away from the mean.

z-score = (x - µ) / σ (x = the data value for which we want to know the z-score)

We use the characteristics of the normal curve, and the z-score, to find out the probability of a particular event or value occurring (remember classical probability from Chapter 3)
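As a small illustration of the z-score formula, here is a sketch in Python. The function name `z_score` and the height figures (mean 64 inches, standard deviation 2.5 inches) are hypothetical values chosen for the example, not data from the chapter.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical example: heights with mean 64 in. and standard deviation 2.5 in.
print(z_score(69, mu=64, sigma=2.5))   # a value above the mean gives a positive z
print(z_score(59, mu=64, sigma=2.5))   # a value below the mean gives a negative z
```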

Page 5: Statistical Analysis – Chapter 4 Normal Distribution

Solving Normal Curve Problems Using Z-Scores (steps listed at bottom of p. 111)

1. Draw a normal curve, showing values for -2 through +2 standard deviations

2. Shade the area in question

3. Calculate the z-scores and cutoffs (percentages asked for)

4. Use the z-scores and cutoffs to solve the normal curve problem

Page 6: Statistical Analysis – Chapter 4 Normal Distribution

Find Percentages on the Normal Curve Table

Let’s do these questions as a class…

a. What is the percentage of data from z = 0 to z = 0.1?

b. What is the percentage of data from z = 0 to z = 2.16?

c. What is the percentage of data from z = -1.11 to z = 1.11?

d. What is the percentage of data above z = 1.24?

e. What is the percentage of data below z = -0.6?

Answers

a. .0398…3.98%

b. .4846…48.46%

c. .3665 + .3665 = .733…73.3%

d. .50 - .3925 = .1075…10.75%

e. .50 - .2257 = .2743…27.43%
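The table look-ups above can be reproduced in code. This is a sketch, assuming Python 3.8 or later; `area_mean_to_z` is a made-up helper that plays the role of the normal curve table, which lists the area between the mean (z = 0) and a given z.

```python
from statistics import NormalDist

_std = NormalDist()  # standard normal curve


def area_mean_to_z(z: float) -> float:
    """Area under the curve between the mean (z = 0) and z, as the table lists it."""
    return abs(_std.cdf(z) - 0.5)


a = area_mean_to_z(0.1)                            # from z = 0 to z = 0.1
b = area_mean_to_z(2.16)                           # from z = 0 to z = 2.16
c = area_mean_to_z(-1.11) + area_mean_to_z(1.11)   # both sides of the mean
d = 0.5 - area_mean_to_z(1.24)                     # upper tail above z = 1.24
e = 0.5 - area_mean_to_z(-0.6)                     # lower tail below z = -0.6

for label, area in zip("abcde", (a, b, c, d, e)):
    print(f"{label}. {area:.4f} = {area:.2%}")
```

The pattern is the same as with the printed table: add areas that sit on opposite sides of the mean, and subtract from .50 to get a tail.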

Page 7: Statistical Analysis – Chapter 4 Normal Distribution

Working backwards from percentages…

When working backwards from percentages, we still use the normal table…but look for the percentage to give us the z-score…

a. What is the z-score associated with 10.2% of the data?

b. What is the z-score(s) for the middle 30% of the normal curve?

c. What is the z-score of data in the upper 25% of the normal curve?

Answers

a. z = 0.26

b. z = -.39 to z = .39

c. z = 0.67
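Working backwards can also be sketched in code, using the inverse of the normal curve instead of scanning the table. This assumes Python 3.8 or later; `z_for_area_from_mean` is a hypothetical helper name, and question (a) is read here as 10.2% of the data lying between the mean and z.

```python
from statistics import NormalDist

_std = NormalDist()  # standard normal curve


def z_for_area_from_mean(area: float) -> float:
    """Positive z such that `area` of the data lies between the mean and z."""
    if not 0 <= area < 0.5:
        raise ValueError("area between the mean and z must be in [0, 0.5)")
    return _std.inv_cdf(0.5 + area)


a = z_for_area_from_mean(0.102)        # 10.2% between the mean and z
b = z_for_area_from_mean(0.30 / 2)     # middle 30% = 15% on each side of the mean
c = z_for_area_from_mean(0.50 - 0.25)  # upper 25% leaves 25% between the mean and z

print(f"a. z = {a:.2f}")
print(f"b. z = {-b:.2f} to z = {b:.2f}")
print(f"c. z = {c:.2f}")
```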

Page 8: Statistical Analysis – Chapter 4 Normal Distribution

Let’s do Question 4.2

Use the normal curve table to determine the percentage of data in the normal curve

a. Between z = 0 and z = .82

b. Above z = 1.15

c. Between z = -1.09 and z = .47

d. Between z = 1.53 and z = 2.78

Work backward in the normal curve table to solve the following:

e. 32% of the data in the normal curve can be found between z = 0 and z = ?

f. Find the z score associated with the lower 5% of the data.

g. Find the z scores associated with the middle 98% of the data.

Page 9: Statistical Analysis – Chapter 4 Normal Distribution

Question 4.2 Answers

a. 29.39%

b. 12.51%

c. 54.29%

d. 6.03%

e. Between z = 0 and z = .92, or between z = 0 and z = -.92

Page 10: Statistical Analysis – Chapter 4 Normal Distribution

Question 4.7

Use the normal curve table to determine the percentage of data in the normal curve

a. Between z = 0 and z = .38

b. Above z = -1.45

c. Above z = 1.45

d. Between z = .77 and z = 1.92

e. Between z = -.25 and z = 2.27

f. Between z = -1.63 and z = -2.89

Work backward in the normal curve table to solve the following.

g. 15% of the data in the normal curve can be found between z = 0 and z = ?

h. Find the z score associated with the upper 73.57% of the data.

i. Find the z scores associated with the middle 95%

Page 11: Statistical Analysis – Chapter 4 Normal Distribution

Question 4.7 Answers

a. 14.80%

b. 92.65%

c. 7.35%

d. 19.32%

e. 58.71%

f. 4.97%

g. z = .39 or -.39

h. z = -.63

i. Between z = -1.96 and z = +1.96

Page 12: Statistical Analysis – Chapter 4 Normal Distribution

Binomial Distributions and Sampling

Binomial means two categories in a population…

Males and females

Sports game players vs. non-sports game players

Incomes over 40,000 vs. incomes under 40,000

Quick note: Remember…for binomial distributions, we would visualize this data through a pie chart…because we do not have enough categories for a histogram…

Page 13: Statistical Analysis – Chapter 4 Normal Distribution

Sampling from a Two-Category Population

With two-category populations, we can describe the population by p – the proportion of values in one category

This is the same p from the last chapter on probability (classical probability)…

P(event) ≈ s / n, where s = number of chances for success and n = total equally likely possibilities

We know (actually…statisticians know) that if we randomly sampled from a population, then

ps ≈ p (the sample proportion ps is approximately equal to the population proportion p)

Page 14: Statistical Analysis – Chapter 4 Normal Distribution

Sampling Distribution

In order to know the odds of getting certain values from this particular binomial sample, we have to know the sampling distribution from this population.

Under certain conditions, the sampling distribution for a binomial value is normal (i.e. the distribution follows the normal curve).

When the sampling distribution is normal, then we can make predictions using our table and our z-scores

Page 15: Statistical Analysis – Chapter 4 Normal Distribution

Sampling from a Binomial Distribution

Suppose we defined a population (full-time FIT students), and we have made our measure of interest into a binomial distribution – those who shop at Hot Topic and those who do not.

Suppose over the last 10 years, marketers have surveyed the FIT population hundreds of times and found that the proportion of Hot Topic shoppers is p = .13 (the proportion of non-Hot Topic shoppers is 1 – p = .87).

Page 16: Statistical Analysis – Chapter 4 Normal Distribution

Sampling from a Binomial Distribution

But suppose sometime later, your manager asks you to lead another study. But this time, you don’t have enough money to survey the whole population, and you have to get a sample.

We can assume, because so many studies have been done in the past, that the true value of Hot Topic shoppers is p = .13. Thus, because we know that ps ≈ p, your sample should have approximately the same value.

Page 17: Statistical Analysis – Chapter 4 Normal Distribution

Sampling from a Binomial Distribution

For each sample, we can use the number sampled (n) and the p value from the population to predict the total number of Hot Topic shoppers. This is called the expected value.

Expected value = np

Thus, if we collected a sample of 200 FIT students, how many students would we expect to be Hot Topic shoppers?

np = (200)(.13) = 26

This expected value serves as the mean (µ) when we apply the normal curve to the sample

Page 18: Statistical Analysis – Chapter 4 Normal Distribution

Binomial Distribution and the Normal Curve

Now, we need to decide if we can use the normal curve to solve problems…

If np > 5 and n(1 – p) > 5…then the sampling distribution will be approximately normal.

So, our sample was 200 students. Is np > 5? Is n(1 – p) > 5?

Yes…and yes.

np = (200)(.13) = 26

n(1 – p) = (200)(1 - .13) = (200)(.87) = 174
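The expected value and the two-part check can be written as a small sketch. The function name `normal_approx_ok` is made up for illustration; the threshold of 5 is the rule stated on this slide.

```python
def normal_approx_ok(n: int, p: float, threshold: float = 5.0) -> bool:
    """True when both np and n(1 - p) exceed the threshold (5 in this chapter)."""
    return n * p > threshold and n * (1 - p) > threshold


n, p = 200, 0.13
print("expected value np =", round(n * p, 2))
print("n(1 - p)          =", round(n * (1 - p), 2))
print("normal curve usable?", normal_approx_ok(n, p))

# A much smaller sample fails the check: np = (20)(.13) = 2.6, which is not above 5.
print("with n = 20:", normal_approx_ok(20, p))
```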

Page 19: Statistical Analysis – Chapter 4 Normal Distribution

Binomial Distribution and the Normal Curve

What do we mean when we say that a sampling distribution is normal?

Just like someone’s age is one value among many ages that we tally to make a histogram, we can tally many samples, get the p values of those samples, and construct a histogram from those sample values.

If we took, say, 1,000 samples and tallied the p values for Hot Topic shoppers, then those values, when turned into a histogram, should form a normal curve. Just like if we took the heights of 1,000 women and tallied those values to get a normal curve.
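The idea of tallying many samples can be simulated. The sketch below is an illustration only: it draws 1,000 simulated samples of 200 students each from a made-up population where p = .13, using Python's pseudo-random generator with an arbitrary seed, and summarizes the resulting sample proportions.

```python
import random
from statistics import mean, stdev


def sample_proportion(n: int, p: float, rng: random.Random) -> float:
    """Draw one random sample of size n and return its proportion of 'successes'."""
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n


rng = random.Random(4)  # arbitrary seed so the run is repeatable
proportions = [sample_proportion(200, 0.13, rng) for _ in range(1000)]

centre = mean(proportions)
spread = stdev(proportions)
within_one_sd = sum(abs(ps - centre) <= spread for ps in proportions) / len(proportions)

print(f"mean of the sample proportions: {centre:.3f}")   # should sit close to p = .13
print(f"standard deviation:             {spread:.3f}")
print(f"share within 1 sd of the mean:  {within_one_sd:.0%}")
```

If the sample proportions follow a roughly normal curve, they should cluster around .13 and something near two-thirds of them should fall within one standard deviation of the centre.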

Page 20: Statistical Analysis – Chapter 4 Normal Distribution

How to use the Binomial Distribution and the Normal Curve

1. Get the mean (µ)…the mean is the expected value (np)

2. Get the standard deviation (σ) = √(np(1 – p))

3. Draw a normal curve using the mean and standard deviation

4. Use the “continuity correction factor,” and add +/- half a unit to the value we want to solve for

5. Get the z-scores = (x - µ) / σ

6. Use the normal curve table to solve the problem

Page 21: Statistical Analysis – Chapter 4 Normal Distribution

Why the “continuity correction factor”?

This is only for discrete values (where values occupy only distinct points). For example, in our study, there is no such thing as a “half” or “3/4” Hot Topic shopper. Either you are a shopper or not. Looking at how histograms are presented, you can see why we have to use the correction factor.

1. For the probability of getting a value equal to or greater than (≥) x, subtract a half unit from x

2. For the probability of getting a value equal to or less than (≤) x, add a half unit to x

3. For the probability of getting the exact value x, get the z-scores for a half unit above and a half unit below x
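The three correction rules can be summarized as a sketch that turns a count into the interval used on the normal curve. The function names are hypothetical, and infinity stands in for "no limit on that side."

```python
import math
from typing import Tuple

Interval = Tuple[float, float]


def interval_at_least(x: int) -> Interval:
    """Rule 1: 'x or more' starts a half unit below x."""
    return (x - 0.5, math.inf)


def interval_at_most(x: int) -> Interval:
    """Rule 2: 'x or fewer' ends a half unit above x."""
    return (-math.inf, x + 0.5)


def interval_exactly(x: int) -> Interval:
    """Rule 3: 'exactly x' covers a half unit on each side of x."""
    return (x - 0.5, x + 0.5)


print(interval_exactly(13))    # exactly 13 shoppers
print(interval_at_least(30))   # 30 or more shoppers
print(interval_at_most(25))    # 25 or fewer shoppers
```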

Page 22: Statistical Analysis – Chapter 4 Normal Distribution

Now let’s answer a Hot Topic Question…

If you collected a sample of 200 FIT students…

a. What is the probability that exactly 13 will be Hot Topic shoppers?

b. What is the probability that you will have 30 or more Hot Topic shoppers?

c. What is the probability that you will have 25 or fewer Hot Topic shoppers?

Page 23: Statistical Analysis – Chapter 4 Normal Distribution

Question

a. What is the probability that exactly 13 will be Hot Topic shoppers?

b. What is the probability that you will have 30 or more Hot Topic shoppers?

c. What is the probability that you will have 25 or fewer Hot Topic shoppers?

Answer

1. Get the mean (µ) = expected value = np = (200)(.13) = 26

2. Get the standard deviation (σ) = √(np(1 – p)) = √((26)(1 - .13)) = √((26)(.87)) = √22.62 ≈ 4.76

3. Draw a normal curve using the mean and standard deviation

4. Use the continuity correction factor to correct x. (a) 12.5 and 13.5, (b) 29.5, (c) 25.5

5. Get the z-scores. (a) -2.84 and -2.63, (b) 0.74, (c) -0.11

6. Solve the problem… (a) .4977 - .4957 = .0020, or 0.2%, (b) .50 - .2704 ≈ .23, or 23%, (c) .50 - .0438 = .4562, or about 46%
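The six steps can be traced in code for the Hot Topic sample. This is a sketch assuming Python 3.8 or later; it computes exact areas under the normal curve with mean np and standard deviation √(np(1 – p)) rather than rounding z to two decimals and reading a table, so the results are not expected to match table-based figures digit for digit.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.13
mu = n * p                        # step 1: mean = expected value
sigma = sqrt(n * p * (1 - p))     # step 2: standard deviation
curve = NormalDist(mu, sigma)     # step 3: the normal curve for this sample size

# Steps 4 to 6: apply the half-unit correction, then take areas under the curve.
p_exactly_13 = curve.cdf(13.5) - curve.cdf(12.5)   # (a) between 12.5 and 13.5
p_30_or_more = 1 - curve.cdf(29.5)                 # (b) above 29.5
p_25_or_fewer = curve.cdf(25.5)                    # (c) below 25.5

print(f"mu = {mu:.0f}, sigma = {sigma:.2f}")
print(f"(a) exactly 13:  {p_exactly_13:.4f}")
print(f"(b) 30 or more:  {p_30_or_more:.4f}")
print(f"(c) 25 or fewer: {p_25_or_fewer:.4f}")
```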

Page 24: Statistical Analysis – Chapter 4 Normal Distribution

Now let’s do Question 4.16 as a class…

In a marketing population of phone calls, 3% produced a sale. If this population proportion (p = 3%) can be applied to future phone calls, then out of 500 randomly monitored phone calls,

a. How many would you expect to produce a sale?

b. What is the probability of getting 11 to 14 sales?

c. What is the probability of getting 12 or fewer sales?


Page 25: Statistical Analysis – Chapter 4 Normal Distribution

Question 4.16 Answers

a. Expected value = np = 500(.03) = 15

b. 32.93%

c. 25.46%
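Question 4.16 follows the same recipe, so it can be wrapped in one reusable sketch. The function `approx_binomial_prob` is a hypothetical helper (Python 3.8 or later assumed) that checks the np > 5 and n(1 – p) > 5 rule, applies the half-unit correction, and returns the area under the normal curve. It uses exact areas, so its results land close to, but not exactly on, the table-based answers above.

```python
from math import sqrt
from statistics import NormalDist
from typing import Optional


def approx_binomial_prob(n: int, p: float,
                         low: Optional[int] = None,
                         high: Optional[int] = None) -> float:
    """Normal-curve estimate of P(low <= count <= high) for a binomial sample.

    Leave `low` or `high` as None for an open-ended range.
    """
    if not (n * p > 5 and n * (1 - p) > 5):
        raise ValueError("np and n(1 - p) must both exceed 5")
    curve = NormalDist(n * p, sqrt(n * p * (1 - p)))
    # Continuity correction: widen the range by half a unit on each closed side.
    lower_area = 0.0 if low is None else curve.cdf(low - 0.5)
    upper_area = 1.0 if high is None else curve.cdf(high + 0.5)
    return upper_area - lower_area


n, p = 500, 0.03
expected_sales = n * p
p_11_to_14 = approx_binomial_prob(n, p, low=11, high=14)
p_12_or_fewer = approx_binomial_prob(n, p, high=12)

print(f"a. expected sales: {expected_sales:.0f}")
print(f"b. 11 to 14 sales: {p_11_to_14:.2%}")
print(f"c. 12 or fewer:    {p_12_or_fewer:.2%}")
```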