Chapter 5 - The Normal Curve

PART II : DESCRIPTIVE STATISTICS

Dr. Joseph Brennan (Math 148, BU)

Post on 28-Nov-2021

Histogram and the Density Curve

Density Curves

A density curve describes the distribution of a quantitative continuous variable.

A density curve may be used to display the distribution of the data in addition to or instead of a histogram.

We can consider a density curve as a smooth approximation to the histogram computed from the data.

For continuous response variables, the histogram computed from the data (sample) approximates the (unknown) population density of the response variable.

Properties of Density Curves

Like histograms, density curves may be described by their symmetry or skewness.

Density curves also have measures of center and spread.

µ is the mean of a density curve.

µ̃ is the median of a density curve.

σ is the standard deviation of a density curve.

q1 and q3 are the first and third quartiles of a density curve.

NOTE 1: The mean and median are the same for a symmetric density curve. They both lie at the center of the curve.

NOTE 2: The mean of a skewed curve is pulled away from the median in the direction of the long tail.

NOTE 3: The standard deviation of a density curve is computed mathematically, and is difficult to estimate visually.

Population Parameters and Statistics

If the density curve describes the population distribution, then the mean µ and standard deviation σ of the density curve are the (unknown) population parameters.

The sample average x̄ and sample standard deviation s computed from a data set estimate µ and σ, respectively, but the estimates are usually not exact.

Population: µ = 0.25, σ = 0.144338

Sample: x̄ = 0.2556, s = 0.144446
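The gap between parameters and their sample estimates can be illustrated with a small simulation. This is only a sketch: the slide does not name the population, so the code assumes a uniform distribution on [0, 0.5], which happens to have µ = 0.25 and σ ≈ 0.144338. The function name `sample_estimates`, the sample size, and the seed are illustrative choices.

```python
import math
import random
import statistics

# Assumption (not stated on the slide): the population is uniform on [0, 0.5].
# Such a population has mu = 0.25 and sigma = 0.5 / sqrt(12), about 0.144338.
LOW, HIGH = 0.0, 0.5
mu = (LOW + HIGH) / 2
sigma = (HIGH - LOW) / math.sqrt(12)


def sample_estimates(n, seed=0):
    """Draw n observations from the assumed population and return (x_bar, s)."""
    rng = random.Random(seed)
    data = [rng.uniform(LOW, HIGH) for _ in range(n)]
    return statistics.mean(data), statistics.stdev(data)


x_bar, s = sample_estimates(10_000)
print(f"mu    = {mu:.6f}   x_bar = {x_bar:.6f}")
print(f"sigma = {sigma:.6f}   s     = {s:.6f}")
```

The estimates land close to the parameters without matching them exactly, and a different seed gives slightly different values of x̄ and s.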

The Normal Curve

Perhaps the most important density curve in statistics!

Figure 6. The (standard) normal density curve.

The curve is defined by the equation

$$ p(z) = \frac{1}{\sqrt{2\pi}} \, e^{-z^2/2}, \quad \text{where } e = 2.71828\ldots \tag{1} $$
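As a quick numerical check of equation (1), the sketch below evaluates the density and approximates the total area under the curve. The helper names `normal_density` and `area_under_curve` are illustrative, and the midpoint rule on [−8, 8] is just one simple way to approximate the area.

```python
import math


def normal_density(z):
    """Standard normal density p(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def area_under_curve(a, b, steps=100_000):
    """Approximate the area under p(z) on [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(normal_density(a + (i + 0.5) * width) for i in range(steps)) * width


peak = normal_density(0.0)           # height of the curve at its center
total = area_under_curve(-8.0, 8.0)  # the tails beyond |z| = 8 are negligible
print(f"p(0) = {peak:.4f}, total area = {total:.6f}")
```

The height at z = 0 is 1/√(2π), roughly 0.3989, and the total area comes out very close to 1.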

Properties of the Normal Curve

Properties of the (standard) normal curve:

Symmetric about zero,

Unimodal,

The mean, median, and mode are equal,

Bell-shaped,

The mean µ = 0 and the standard deviation σ = 1,

The area under the whole normal curve is 100% (or 1, if you use decimals).

The Normal Approximation of Data

Many histograms for data are similar in shape to the normal curve, provided they are drawn to an appropriate scale.

Normal Approximation: Transforming the horizontal scale of a histogram so that it aligns with the standard normal density curve.

z-units are the values the data points take after this transformation. (More information to come!)

If the histogram follows the normal curve, the area under the histogram will be about the same as the area under the curve.

The area under the histogram over an interval corresponds to the percentage of observations in that interval.

The goal of normal approximation is to use the normal density curve to approximate the percentage of observations in a given interval.

The Empirical Rule

The (standard) normal curve is plotted against z, the standard units. The following property of the normal curve explains the origins of the Empirical Rule.

THE 68-95-99.7 RULE for the NORMAL CURVE

Approximately 68% of observations fall within 1 standard unit of 0 (−1 < z < 1).

Approximately 95% of observations fall within 2 standard units of 0 (−2 < z < 2).

Approximately 99.7% of observations fall within 3 standard units of 0 (−3 < z < 3).

The Empirical Rule, which is applicable to bell-shaped, normal-like histograms, is a direct consequence of the above property of the normal curve.

The range −1 < z < 1 in standard units corresponds to x̄ − s < x < x̄ + s in the original, nonstandard units.
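The three percentages in the rule can be recovered from the normal curve itself. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `area_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def area_within(k):
    """Area under the standard normal curve for -k < z < k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within = {k: area_within(k) for k in (1, 2, 3)}
for k, area in within.items():
    print(f"within {k} standard unit(s) of 0: {100 * area:.2f}%")
```

The computed areas are slightly more precise than the rounded 68, 95, and 99.7 figures quoted in the rule.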

The 68-95-99.7 Rule

Figure: Normal curve and percentage of observations under it. The horizontal scale uses the standard units z.

z-Scores

z-Score: The transformation of data into standard units (normal approximation):

$$ z = \frac{\text{observation} - \text{mean}}{\text{standard deviation}} $$

Thus, any data point x may be recomputed in standard units as

$$ z_x = \frac{x - \bar{x}}{s}. \tag{2} $$

We call the z which corresponds to x the z-score $z_x$. Note that

$z_x < 0$ if $x < \bar{x}$;

$z_x = 0$ if $x = \bar{x}$;

$z_x > 0$ if $x > \bar{x}$.

We may reverse the transformation; if $z_x$ is known, x can be found by

$$ x = \bar{x} + s \cdot z_x. \tag{3} $$
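Equations (2) and (3) translate directly into a pair of functions. The sketch below uses a small hypothetical data set, made up for illustration. It also assumes s is the sample standard deviation as computed by `statistics.stdev` (divisor n − 1), which the course text may define differently.

```python
import statistics


def z_score(x, mean, sd):
    """Equation (2): convert an observation x into standard units."""
    return (x - mean) / sd


def from_z_score(z, mean, sd):
    """Equation (3): convert a z-score back into the original units."""
    return mean + sd * z


# Hypothetical data set, used only for illustration.
data = [61.0, 64.0, 67.0, 70.0, 73.0]
x_bar = statistics.mean(data)  # sample average
s = statistics.stdev(data)     # assumption: s uses the n - 1 divisor

z_values = [z_score(x, x_bar, s) for x in data]
recovered = [from_z_score(z, x_bar, s) for z in z_values]

for x, z in zip(data, z_values):
    print(f"x = {x:5.1f}   z = {z:+.3f}")
```

Observations below the average get negative z-scores, the average itself maps to 0, and applying equation (3) returns the original values.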

$$ z_x = \frac{x - \bar{x}}{s} $$

The z-score indicates how many standard deviations a data point falls above or below the average x̄.

If the histogram plotted against the z-scores follows the normal curve well, we say that the normal distribution provides a good approximation for the distribution of the data.

The normal curve is well studied, and many of its values have been stored in normal tables. Data found to have a good normal approximation can be analyzed using the normal curve and these tables.

Normal Table

A normal table, found in the text, provides the area between −z and z:

Figure 9. Fragment of a normal table.

Exercise 1, Set B, p.84

Using a normal table, let us find the area under the normal curve. The table values below are areas to the left of the given z.

(a) to the right of 1.25

Table Value of 1.25: 0.8944

1 − 0.8944 = 0.1056

(b) to the left of -0.4

Table Value of -0.4: 0.3446

(c) to the left of 0.8

Table Value of 0.8: 0.7881

(d) between 0.4 and 1.3

Table Value of 0.4: 0.6554

Table Value of 1.3: 0.9032

0.9032 − 0.6554 = 0.2478

(e) between -0.3 and 0.9

Table Value of -0.3: 0.3821

Table Value of 0.9: 0.8159

0.8159 − 0.3821 = 0.4338

(f) outside -1.5 to 1.5

Table Value of -1.5: 0.0668

Table Value of 1.5: 0.9332

(1 − 0.9332) + 0.0668 = 0.1336
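The table lookups in this exercise can be cross-checked in code. In the sketch below, `phi` stands for the area under the standard normal curve to the left of z, computed with `statistics.NormalDist`. The dictionary keys simply mirror the exercise parts.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # area under the standard normal curve to the left of z

areas = {
    "a": 1 - phi(1.25),               # to the right of 1.25
    "b": phi(-0.4),                   # to the left of -0.4
    "c": phi(0.8),                    # to the left of 0.8
    "d": phi(1.3) - phi(0.4),         # between 0.4 and 1.3
    "e": phi(0.9) - phi(-0.3),        # between -0.3 and 0.9
    "f": phi(-1.5) + (1 - phi(1.5)),  # outside -1.5 to 1.5
}

for part, area in areas.items():
    print(f"({part}) {area:.4f}")
```

Because a table stores rounded values, the computed areas may differ from the table-based answers in the last decimal place.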

Example 8, p.85

The heights of the men age 18 and over in HANES5 averaged 69 inches; the SD was 3 inches. Use the normal curve to estimate the percentage of these men with heights between 63 inches and 72 inches.

Solution: The exact percentage is equal to the area under the height histogram between 63 inches and 72 inches. We assume that the histogram can be well approximated by the normal curve.

We will estimate the percentage of men between 63 and 72 inches by finding the area of the corresponding region under the standard normal curve.

Step 1: Draw a number line and shade the interval of interest.

Step 2: Mark the mean on the line and convert to standard units. The z-score for the left endpoint is

$$ z_{63} = \frac{x - \bar{x}}{s} = \frac{63 - 69}{3} = -2. $$

The z-score for the right endpoint is

$$ z_{72} = \frac{x - \bar{x}}{s} = \frac{72 - 69}{3} = 1. $$

Step 3: Sketch the normal curve and find the area under the curve above the shaded interval by using normal tables.

Conclusion: From our table of z-scores, $z_{63} = -2$ is at the 2.28th percentile and $z_{72} = 1$ is at the 84.13th percentile. The area between them is 84.13% − 2.28% = 81.85%.

Therefore, about 82% of the heights were between 63 inches and 72 inches. This is only an approximation; in truth, 81% of the men were in that range.
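The steps of this example can be sketched in a few lines. The names `MEAN_HEIGHT`, `SD_HEIGHT`, and `to_standard_units` are illustrative, and `statistics.NormalDist` stands in for the printed normal table.

```python
from statistics import NormalDist

MEAN_HEIGHT, SD_HEIGHT = 69.0, 3.0  # inches, as given in the example


def to_standard_units(x):
    """Convert a height in inches into standard units."""
    return (x - MEAN_HEIGHT) / SD_HEIGHT


z_low = to_standard_units(63.0)   # left endpoint
z_high = to_standard_units(72.0)  # right endpoint

phi = NormalDist().cdf            # area to the left of z
estimate = phi(z_high) - phi(z_low)
print(f"z from {z_low} to {z_high}: about {100 * estimate:.1f}% of heights")
```

The result agrees with the table-based estimate of about 82%.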

Example (S.A.T.)

The SAT is a test of students' readiness for college. The average SAT score (on a 1600 point scale) is 1025 points and the standard deviation is 200 points. How well must Jessica do on the SAT in order to place in the top 10% of all students?

Solution: The problem does not say that the histogram of the SAT scores is bell-shaped, but it is reasonable to assume so. We will use the normal approximation to the distribution of the SAT scores to solve the problem.

First, find a z-score representing the 90th percentile.

Using the normal table provided in the textbook, Jessica is hoping for a score that translates to z ≈ 1.3.

We know x̄ = 1025 and s = 200.

$$ z = \frac{x - \bar{x}}{s} \;\Rightarrow\; x = \bar{x} + s \cdot z = 1025 + 200 \cdot 1.3 = 1285 $$

So Jessica should score about 1285 points to expect to be among the top 10% of students.

The average freshman SAT score at Binghamton was 1305 in 2011. In what percentile is the average freshman?

$$ z_{1305} = \frac{1305 - 1025}{200} = 1.4 $$

Using our z-table, we find a value of 0.9192. Therefore, the average freshman at BU is in the 92nd percentile.
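Both SAT questions can be sketched with `statistics.NormalDist`, assuming, as the solution does, that the scores are approximately normal. The code uses the unrounded 90th-percentile z-score instead of the table value z ≈ 1.3, so the cutoff comes out a few points below 1285. The variable names are illustrative.

```python
from statistics import NormalDist

SAT_MEAN, SAT_SD = 1025.0, 200.0
standard_normal = NormalDist()

# Top 10% means at or above the 90th percentile.
z_90 = standard_normal.inv_cdf(0.90)  # unrounded z for the 90th percentile
cutoff = SAT_MEAN + SAT_SD * z_90     # back to SAT points, as in equation (3)

# Percentile of a score of 1305.
z_1305 = (1305 - SAT_MEAN) / SAT_SD
percentile_1305 = standard_normal.cdf(z_1305)

print(f"z for the 90th percentile: {z_90:.4f}, cutoff: {cutoff:.0f} points")
print(f"a score of 1305 has z = {z_1305:.1f}, percentile {100 * percentile_1305:.0f}")
```

The small difference from 1285 comes only from rounding z to one decimal place when reading the table.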

IQ Score

An intelligence quotient, or IQ, is a score derived from one of several standardized tests designed to assess intelligence. The mean score is normalized as 100 and the standard deviation is roughly 15.

An IQ score of 70 is what percentile?

$$ z_{70} = \frac{70 - 100}{15} = -2 $$

Table Value of −2: 0.0228, or about 2.3%

An IQ of 150 is required for entrance into a gifted program. What percentage of students are considered eligible?

$$ z_{150} = \frac{150 - 100}{15} \approx 3.33 $$

Table Value of 3.33: 0.9996

Since 1 − 0.9996 = 0.0004, with a requirement of a score of 150, only 0.04% of students will be considered "gifted".
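The same two lookups can be sketched without converting to z-scores by hand, since `statistics.NormalDist` accepts the mean and standard deviation directly. The code assumes IQ scores follow a normal curve with mean 100 and SD 15. The variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: IQ scores are approximately normal with mean 100 and SD 15.
iq = NormalDist(mu=100, sigma=15)

share_below_70 = iq.cdf(70)            # fraction of scores below 70
share_at_least_150 = 1 - iq.cdf(150)   # fraction of scores at or above 150

print(f"below 70: {100 * share_below_70:.2f}%")
print(f"150 or above: {100 * share_at_least_150:.3f}%")
```

Both results agree with the table-based answers up to rounding of the z-score.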
