chapter 1 normal curve


Upload: mario-m-ramon

Post on 18-Nov-2014


Chapter 1 The Normal Distribution

NORMAL DISTRIBUTION

The most important continuous probability distribution in the entire field of statistics is the normal distribution. Its graph, called the normal curve, is the bell-shaped curve of Figure 1.1, which approximately describes many phenomena. These include biological measurements such as height, weight, and life span, as well as psychological measurements such as scores on intelligence tests (IQ tests).

Figure 1.1 Normal Curve

In a normal distribution, most values fall near the average, with only a small percentage of values falling far above or below the average. For example, in a random sample of adults, most will measure between 120 cm (4 ft) and 210 cm (7 ft) tall, with very few heights outside this range. Normal distributions generally arise when the sample size or number of observations is very large. In addition, errors in scientific measurements are extremely well approximated by a normal distribution.

In 1733, Abraham de Moivre developed the mathematical equation of the normal curve. It provided a basis on which much of the theory of inductive statistics is founded. The normal distribution is often referred to as the Gaussian distribution, in honor of Karl Friedrich Gauss (1777–1855), who also derived its equation from a study of errors in repeated measurements of the same quantity.

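The density function of the normal random variable X, with mean $\mu$ and variance $\sigma^2$, is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty,$$

where $\mu$ and $\sigma$ are the parameters known as the mean and the standard deviation, respectively.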
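The density formula can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function name `normal_pdf` is chosen here for illustration, and the result is compared with the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma (hypothetical helper)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# Height of the standard normal curve at its peak (x = mu = 0).
peak = normal_pdf(0.0, 0.0, 1.0)
print(round(peak, 4))  # 0.3989

# The hand-written formula agrees with the library implementation.
library_value = NormalDist(mu=2.0, sigma=3.0).pdf(1.5)
print(abs(normal_pdf(1.5, 2.0, 3.0) - library_value) < 1e-12)  # True
```

The peak height depends only on the standard deviation: a larger $\sigma$ gives a flatter, wider curve.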

Properties of the Normal Distribution

1. It is bell-shaped and has a single peak at the center of the distribution. The arithmetic mean, median, and mode are equal and located in the center of the distribution. Thus half the area under the normal curve is to the right of this center point and the other half to the left of it.

2. It is symmetrical about its mean $\mu$. If we cut the normal curve vertically at the center value, the two halves will be mirror images.

3. It falls off smoothly in either direction from the central value. That is, the distribution is asymptotic: the curve gets closer and closer to the x-axis but never actually touches it. To put it another way, the tails of the curve extend indefinitely in both directions.


4. The location of a normal distribution is determined by the mean. The dispersion (or spread) of the distribution is determined by the standard deviation.

5. The coefficient of skewness of a normal distribution is zero, while its kurtosis is equal to 3.

6. The total area under the curve and above the horizontal axis is equal to 1.

7. For any real number x, .

The Standard Normal Curve

Since each normally distributed variable has its own mean and standard deviation, the shape and location of these curves will vary. Therefore, different tables of values of areas under each curve will be needed for each variable. To simplify the situation, statisticians use the standard normal distribution. The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.

Any normal distribution can be converted into a standard normal. This can be done by means of the z-transform:

$$Z = \frac{X - \mu}{\sigma}$$

The values given by the above formula are also known as the z-statistic, z-values, the standard normal deviates, or just the normal deviates. Thus, if X is normally distributed with mean $\mu$ and standard deviation $\sigma$, then for any a and b,

$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$$
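As a quick illustration of the z-transform, the sketch below standardizes the endpoints of an interval and confirms that the probability is the same whether it is computed on the original scale or on the z scale. The function names and the values (mean 50, standard deviation 10, interval from 40 to 65) are made up for this illustration and do not come from the text.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """z-transform: distance of x from the mean in standard deviation units."""
    return (x - mu) / sigma


def prob_between(a, b, mu, sigma):
    """P(a < X < b) computed through the standard normal distribution."""
    z = NormalDist()  # mean 0, standard deviation 1
    return z.cdf(standardize(b, mu, sigma)) - z.cdf(standardize(a, mu, sigma))


# Illustrative (assumed) values, not taken from the chapter.
mu, sigma, a, b = 50.0, 10.0, 40.0, 65.0
print(standardize(a, mu, sigma), standardize(b, mu, sigma))  # -1.0 1.5

via_z = prob_between(a, b, mu, sigma)
direct = NormalDist(mu, sigma).cdf(b) - NormalDist(mu, sigma).cdf(a)
print(abs(via_z - direct) < 1e-12)  # True
```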

Areas Under the Normal Curve

Probabilities of normal distributions (or areas under a normal curve) can be obtained from a normal probability table. Most textbooks provide the cumulative probability of a standard normal distribution. That is, we can get $P(Z < z)$.

Figure 1.2 Area of P( Z < z )


Example 1 Determine the area under the standard normal curve between –1.15 and 0.94.

Solution For probabilities, a special notation is used. The probability that z lies between –1.15 and 0.94 is written as P( -1.15 < Z < 0.94 ). To find the area under the standard normal curve between z = -1.15 and z = 0.94, subtract the area to the left of z = -1.15 from the area to the left of z = 0.94.

To use the Standard Probability Table, note that all Z values must first be rounded to two decimal places. For example, to get P( Z < 0.94), scan down the Z column until you locate 0.9. Read across this row until you reach the column for the hundredths place of the Z value, in this case 0.04. The tabulated probability for z = 0.94 is therefore at the intersection of the row z = 0.9 with the column 0.04. This probability is 0.8264. Hence, P( Z < 0.94) = 0.8264.

P( -1.15 < Z < 0.94 ) = P( Z < 0.94 ) – P( Z < -1.15 )

= 0.8264 – 0.1251

= 0.7013

Therefore, the area under the standard normal curve between –1.15 and 0.94 is 0.7013.
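The table lookup in Example 1 can be reproduced without a printed table. This is a sketch using Python's standard library (`statistics.NormalDist`), offered as an alternative to the table and not as the method used in the text; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

left_of_upper = z.cdf(0.94)   # P(Z < 0.94)
left_of_lower = z.cdf(-1.15)  # P(Z < -1.15)
area = left_of_upper - left_of_lower

print(round(left_of_upper, 4))  # 0.8264
print(round(left_of_lower, 4))  # 0.1251
print(round(area, 4))           # 0.7013
```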


Producing Normal Probabilities Using Excel

Excel can be used to calculate normal probabilities and the z value corresponding to a specific probability. For example, suppose we want to get P( Z < 0.94). The following are the necessary Excel commands to produce the probability.

1. Select Insert and Function.
2. Select Statistical in the category dialog and then select NORMSDIST in the “Function name” dialog. Press OK.
3. In the NORMSDIST dialog box, enter 0.94.
4. The result will appear in the dialog box. If you click OK, the answer appears in the spreadsheet.

In this case, P( Z < 0.94) = 0.826391238.

Another way of finding P( Z < 0.94) is to use Megastat. Megastat is an Excel add-in that performs statistical analysis within an Excel workbook. After it is installed, it appears on the Excel menu and works like any Excel option.

1. Click Megastat.
2. Select Probability and choose Continuous Probability Distributions.
3. Select Normal Distribution. In this case, we select calculate probability given z.
4. In the dialog box, enter 0.94. The result will appear if you click preview at p(lower) or click OK.


Megastat in Microsoft Excel

In this case, P( Z < 0.94) = 0.8264.

Example 2 Determine the area under the standard normal curve above 1.3.

Solution: Using megastat, click continuous probability distributions. Under normal distribution, select calculate probability given z and in the dialog box, enter 1.3. The result will appear if you click Preview at p(upper) or Click OK.

In this case, P( Z > 1.3 ) = 0.0968.


Example 3 The ages of subscribers to a certain newspaper are normally distributed with a mean of 32.5 years and a standard deviation of 4.6 years. What is the probability that the age of a random subscriber is (a) more than 30.5 years; (b) between 25 and 40 years?

Solution: (a) Calculate the z-transform by selecting Insert and Function. Choose Statistical and Standardize, then press OK. In the dialog box, enter 30.5 for X, 32.5 for Mean, and 4.6 for Standard_dev. The result will appear in the dialog box. If you click OK, the answer appears in the spreadsheet.

Standardizing the raw score 30.5 gives z = -0.434782609, so P( X > 30.5 ) is the same as P( Z > -0.4348 ). By Megastat, P( Z > -0.4348 ) = 0.6681.

Therefore, the probability is 0.6681.
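Part (a) can also be reproduced in a few lines of Python. This is a sketch using the standard library and is not the Excel or Megastat procedure described in the text; the variable names are illustrative.

```python
from statistics import NormalDist

mean_age = 32.5  # mean from Example 3
sd_age = 4.6     # standard deviation from Example 3
x = 30.5

# Same calculation as Excel's STANDARDIZE(30.5, 32.5, 4.6).
z_value = (x - mean_age) / sd_age
print(round(z_value, 4))  # -0.4348

# Upper-tail probability P(Z > z) = 1 - P(Z < z).
p_more_than = 1.0 - NormalDist().cdf(z_value)
print(round(p_more_than, 4))  # 0.6681
```

The same pattern, with two calls to `cdf`, applies to an interval such as the one in part (b).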

Try to answer part (b) on your own.

Example 4 Find the z value for which the area under the standard normal curve to the left of that value is 0.2583.

Solution: Using Megastat, click continuous probability distributions. Under normal distribution, select calculate z given probability, and in the dialog box, enter 0.2583. The result will appear if you click Preview, or it will appear on the spreadsheet when you click OK. In this case, the result is –0.65. The value is negative because an area of less than 0.5 to the left places z below the mean of 0.

The z value is –0.65 when the probability is 0.2583.
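Finding z from a probability is the inverse of the table lookup. The sketch below uses `inv_cdf` from Python's standard library as a stand-in for Megastat's "calculate z given probability". Because the area to the left (0.2583) is less than 0.5, the z value comes out negative.

```python
from statistics import NormalDist

area_to_left = 0.2583

# inv_cdf returns the z value whose cumulative (left-tail) area is the given probability.
z_value = NormalDist().inv_cdf(area_to_left)
print(round(z_value, 2))  # -0.65

# Check: the area to the left of the recovered z matches the input.
print(abs(NormalDist().cdf(z_value) - area_to_left) < 1e-9)  # True
```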

The Central Limit Theorem

In practice, there are cases in which the normality assumption is not satisfied, so computing the probabilities of certain events may be cumbersome. But if we want to compute the probability that the sample mean $\bar{X}$ is between two fixed real numbers, then an appropriate model is the normal distribution. This is the content of the Central Limit Theorem, one of the most important theorems in applied probability and statistics. The theorem states that if $X_1, X_2, \ldots, X_n$ is a random sample from a population with mean $\mu$ and standard deviation $\sigma$, then for large sample size n, the sample mean $\bar{X}$ has an approximate normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. The above statement can also be expressed as follows: the sum $\sum_{i=1}^{n} X_i$ has an approximate normal distribution with mean $n\mu$ and standard deviation $\sigma\sqrt{n}$.

To illustrate this theorem, consider a random sample $X_1, X_2, \ldots, X_n$ from a distribution given by $P(X = 1) = p$ and $P(X = 0) = 1 - p$. Note that the mean of X is $p$ and the standard deviation is $\sqrt{p(1-p)}$. Now, the sum $\sum_{i=1}^{n} X_i$ counts the number of 1’s in the sample of n. By the Central Limit Theorem, it is approximately normally distributed with mean $np$ and standard deviation $\sqrt{np(1-p)}$. This result is illustrated by the figure given below with n = 15 and p = 0.5.
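The quality of this approximation for n = 15 and p = 0.5 can be examined numerically. The sketch below compares the exact probability that the count of 1's is at most k with the normal approximation. The function names are illustrative, and the continuity correction (evaluating the normal curve at k + 0.5) is an assumption added here; it is a common refinement that the text does not mention.

```python
import math
from statistics import NormalDist


def exact_cdf(k, n, p):
    """Exact P(sum <= k) for n independent 0/1 trials with success probability p."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation with mean n*p and standard deviation sqrt(n*p*(1-p)).

    Uses a continuity correction (k + 0.5), an assumption not stated in the text.
    """
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k + 0.5)


n, p = 15, 0.5
for k in (5, 7, 9, 11):
    print(k, round(exact_cdf(k, n, p), 4), round(normal_approx_cdf(k, n, p), 4))
```

Running the loop prints the exact and approximate probabilities side by side, which makes the closeness of the two columns easy to inspect.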


The central limit theorem can be used to answer questions about sample means in the same manner that the normal distribution can be used to answer questions about individual values. The formula for computing the z value is

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

where $\bar{X}$ is the sample mean, $\mu$ is the population mean, and $\sigma/\sqrt{n}$ is the standard deviation of the sample mean.

The normal approximation for $\bar{X}$ will generally be good if n ≥ 30. If n < 30, the approximation is good only if the population is not too different from a normal distribution. If the population is known to be normal, the sampling distribution of $\bar{X}$ will follow a normal distribution exactly, no matter how small the sample size.

Example 5 The average age of lawyers is 40 years, with a standard deviation of 3 years. If a law firm employs 36 lawyers, find the probability that the average age of the group is lower than 39 years.

Solution: The sampling distribution of $\bar{X}$ will be approximately normal, with $\mu_{\bar{X}} = 40$ and $\sigma_{\bar{X}} = 3/\sqrt{36} = 0.5$. The shaded region of the figure below gives the desired probability.

Calculate the z-transform by selecting Insert and Function. Choose Statistical and Standardize, then press OK. In the dialog box, enter 39 for X, 40 for Mean, and 0.5 for Standard_dev. The result is –2. Using Megastat, P( Z < –2 ) = 0.0228. Thus, the probability that the average age of the 36 lawyers is lower than 39 years is 2.28%.
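The calculation in Example 5 can be sketched in Python as follows. This uses the standard library in place of Excel and Megastat, and the variable names are illustrative. Note that the result is a probability about the mean age of all 36 lawyers, not about the age of a single lawyer.

```python
import math
from statistics import NormalDist

pop_mean = 40.0  # population mean age
pop_sd = 3.0     # population standard deviation
n = 36           # sample size
sample_mean = 39.0

# Standard deviation of the sample mean: sigma / sqrt(n).
std_error = pop_sd / math.sqrt(n)
print(std_error)  # 0.5

z_value = (sample_mean - pop_mean) / std_error
print(z_value)  # -2.0

p_below = NormalDist().cdf(z_value)
print(round(p_below, 4))  # 0.0228
```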

Other Statistical Distributions

Aside from the normal distribution, other important distributions are the Student’s t distribution, the chi-square distribution, and the F distribution. Just like the standard normal distribution, all probabilities and critical values can be obtained using a specialized table. These tables are available in most elementary statistics books. To generate the probabilities and critical values using Excel, click Megastat, then select Probability and choose continuous probability distribution.

Example 6 Find the critical chi-square value for n = 15 and $\alpha$ = 0.05 when the test is right-tailed.


Solution: Since n = 15, the degrees of freedom = n – 1 = 15 – 1 = 14. Using Megastat under continuous probability distribution, select chi-square distribution, then choose calculate chi-square given probability. In the dialog box, enter 0.05 as the probability and 14 for degrees of freedom. Click preview for the result. The solution is 23.68.

Therefore, the critical chi-square value for n = 15, $\alpha$ = 0.05, and a right-tailed test is 23.68.

Note: Megastat provides critical values for a one-tailed test. To find the critical values for a two-tailed test, divide the probability ( $\alpha$ ) by 2.
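The critical value in Example 6 can be cross-checked without a chi-square table. The sketch below is not how Megastat computes it. It relies on a closed-form expression for the upper-tail probability that holds only when the degrees of freedom are even (as they are here, with 14), and then searches for the critical value by bisection. The function names are illustrative.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square > x); valid only for even, positive degrees of freedom.

    For df = 2k the upper tail equals exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles even, positive degrees of freedom only")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def chi2_critical_even_df(alpha, df, lo=0.0, hi=200.0, tol=1e-10):
    """Right-tailed critical value found by bisection on the upper-tail probability."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


critical = chi2_critical_even_df(0.05, 14)
print(round(critical, 2))  # 23.68
```

For odd degrees of freedom, a statistics library or a printed table would be needed instead of this shortcut.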

Name:_______________________________________ Score:______________Section:______________________________________ Date: ______________


Exercise 1.1

I. Standardize the following:

No.   X     Mean     SD     Z
1.    15    17.6     4.2    ____
2.    100   98       5      ____
3.    40    37       2.9    ____
4.    721   689.23   60     ____
5.    500   820      270    ____

II. Find z given the probability and draw the figure.

6. P = 0.6289 to its right

7. P = 0.8096 to its left

8. P = 0.3780 to its left

9. P = 0.2009 to its right

10. P = 0.2777 to its left

III. Find the probability given z. Draw its figure.


11. z > 1.232

12. z < -2.39

13. z < 1.67

14. –1.87 < z < 2.85

15. –2.37 < z < 1.27


Exercise 1.2

Answer the following.

1. If the scores for the test have a mean of 90 and a standard deviation of 10, find the percentage of scores that will fall below 80.

2. A certain type of storage battery lasts, on average, 5 years, with a standard deviation of 0.2 years. Assuming that the battery lives are normally distributed, find the probability that a given battery will last less than 4 years.

3. The average grade for an exam is 70, and the standard deviation is 5. If 20% of the class is given A’s, and the grades are curved to follow a normal distribution, what are the lowest possible A and the highest possible B?

4. The loaves of raisin bread distributed to local stores by a certain bakery have an average length of 29 centimeters and a standard deviation of 3 centimeters. Assuming that the lengths are normally distributed, what percentage of the loaves are

a. longer than 31.2 centimeters?
b. shorter than 28.9 centimeters?
c. between 28.1 and 31.2 centimeters in length?

5. For a medical study, the researchers wish to select people in the middle 30% of the population based on blood pressure. If the mean systolic blood pressure is 110 and the standard deviation is 20, find the upper and lower readings that would qualify people to participate in the study.

6. A certain university desires to accept only the top 15% of all graduating seniors who took their entrance test. This test has a mean of 500 and a standard deviation of 100. Assuming that the scores are normally distributed, find the cutoff score for the test.

7. The IQs of 2000 applicants of a certain university are approximately normally distributed with mean 300 and standard deviation of 40. If the university requires an IQ of at least 95, how many of these students will be rejected on this basis regardless of their other qualifications?

8. A jeepney arrives every 8 minutes at a certain terminal, with a standard deviation of 0.9 minutes. It is assumed that the waiting time for a particular individual is a random variable with a uniform distribution.

a. What is the probability that the individual waits more than 5 minutes?

b. What is the probability that the individual waits between 6 and 10 minutes?


Exercise 1.3

Answer the following.

1. The mean weight of 25-year-old males is 70 kilograms and the standard deviation is 5.3 kilograms. If a sample of 625 males is selected, find the probability that the mean of the sample will be greater than 75 kilograms.

2. A survey found that a family generates an average of 3.3 kilograms of garbage daily, with a standard deviation of 0.7. Assuming that the amount is normally distributed, find the probability that the mean of 81 families will be between 2.5 and 3.8 kilograms.

3. At a large university, the mean age of graduate students who are majoring in Mathematics is 34.6 years and the standard deviation is 1.2 years. If a random sample of 64 individuals is selected, find the probability that the mean age is

a. below 30 years old
b. above 36 years old
c. between 32 and 35 years old.

4. The time it takes for a certain pain reliever to begin reducing pain is 2 hours, with a standard deviation of 7 minutes. If a random sample of 100 patients who took the pain reliever is selected, find the probability that the medication will take more than 100 minutes to take effect.

5. A certain company manufactures cell phones that have a length of life that is approximately normally distributed, with a mean of 100 months and a standard deviation of 7 months. Find the probability that a random sample of 25 cell phones will have an average life of less than 94 months.

6. The average height of basketball players in a professional league is 6 feet, and the standard deviation is 3 inches. What is the probability of selecting 50 players taller than 6 feet 7 inches? Assume that the variable is normally distributed.


Exercise 1.4

Answer the following.

1. Complete the table ( t – distribution )

Item   Degrees of freedom   Alpha ( $\alpha$ )   Tail/s   t - value
a      28                   0.05                 1        ____
b      17                   0.1                  2        ____
c      19                   0.25                 1        ____
d      27                   ____                 2        2.977
e      39                   ____                 1        2.086

2. Determine the f – value of the following:

a. f 0.05 ( 9,20 )

b. f 0.01 ( 27,10 )

c. f 0.025 ( 29, 17 )

3. Given the f – value and degrees of freedom, find the probability.

a. f – value = 1.9 ; ( 13, 19 )

b. f – value = 1.25 ; ( 185, 120 )

c. f – value = 4.56 ; ( 31, 26 )

4. Complete the table ( chi – square distribution )

Item   Degrees of freedom   Alpha ( $\alpha$ )   $\chi^2$
a      29                   0.05                 ____
b      21                   ____                 35.718
c      35                   0.01                 ____
d      20                   ____                 36.781
e      28                   ____                 37.697