Descriptive Statistics: Numerical Measures Distribution


Upload: hammer

Post on 24-Feb-2016


DESCRIPTION

Descriptive Statistics: Numerical Measures Distribution. Chapter 3 BA 201. Measures of Distribution Shape, Relative Location, and Detecting Outliers. Distribution Shape. z-Scores. Chebyshev’s Theorem. Empirical Rule. Detecting Outliers. Distribution Shape: Skewness.

TRANSCRIPT

Page 1: Descriptive Statistics:  Numerical Measures Distribution

Descriptive Statistics: Numerical Measures

Distribution

Chapter 3

BA 201

Page 2: Descriptive Statistics:  Numerical Measures Distribution

DISTRIBUTION

Page 3: Descriptive Statistics:  Numerical Measures Distribution

Measures of Distribution Shape, Relative Location, and Detecting Outliers

• Distribution Shape
• z-Scores
• Chebyshev’s Theorem
• Empirical Rule
• Detecting Outliers

Page 4: Descriptive Statistics:  Numerical Measures Distribution

Distribution Shape: Skewness

An important measure of the shape of a distribution is called skewness. The formula for the skewness of sample data is

$$\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum \left( \frac{x_i - \bar{x}}{s} \right)^3$$
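The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `sample_skewness` and the three small data sets are assumptions chosen only to show the sign of the result.

```python
import math

def sample_skewness(data):
    """Skewness = n / ((n - 1)(n - 2)) * sum(((x_i - mean) / s) ** 3)."""
    n = len(data)
    if n < 3:
        raise ValueError("sample skewness needs at least 3 observations")
    mean = sum(data) / n
    # Sample standard deviation (n - 1 in the denominator)
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

# Hypothetical data sets, used only to illustrate the sign of the measure
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 10]
left_tailed = [-x for x in right_tailed]

print(sample_skewness(symmetric))     # approximately zero
print(sample_skewness(right_tailed))  # positive: long tail to the right
print(sample_skewness(left_tailed))   # negative: long tail to the left
```

A long right tail pulls the cubed deviations positive, and a long left tail pulls them negative, which matches the sign convention used on the following slides.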

Page 5: Descriptive Statistics:  Numerical Measures Distribution

Distribution Shape: Skewness

Symmetric (not skewed)

[Histogram: relative frequency on the vertical axis, 0 to .35]

Skewness = 0

• Skewness is zero.
• Mean and median are equal.

Page 6: Descriptive Statistics:  Numerical Measures Distribution

Distribution Shape: Skewness

Moderately Skewed Left

[Histogram: relative frequency on the vertical axis, 0 to .35]

Skewness = -.31

• Skewness is negative.
• Mean will usually be less than the median.

Page 7: Descriptive Statistics:  Numerical Measures Distribution

Distribution Shape: Skewness

Moderately Skewed Right

[Histogram: relative frequency on the vertical axis, 0 to .35]

Skewness = .31

• Skewness is positive.
• Mean will usually be more than the median.

Page 8: Descriptive Statistics:  Numerical Measures Distribution

Distribution Shape: Skewness

Highly Skewed Right

[Histogram: relative frequency on the vertical axis, 0 to .35]

Skewness = 1.25

• Skewness is positive (often above 1.0).
• Mean will usually be more than the median.

Page 9: Descriptive Statistics:  Numerical Measures Distribution

Distribution Shape: Skewness

Apartment Rents

425  430  430  435  435  435  435  435  440  440
440  440  440  445  445  445  445  445  450  450
450  450  450  450  450  460  460  460  465  465
465  470  470  472  475  475  475  480  480  480
480  485  490  490  490  500  500  500  500  510
510  515  525  525  525  535  549  550  570  570
575  575  580  590  600  600  600  600  615  615

Page 10: Descriptive Statistics:  Numerical Measures Distribution

Distribution Shape: Skewness

Apartment Rents

[Histogram: relative frequency on the vertical axis, 0 to .35]

Skewness = 0.92
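As a rough check on the apartment rent example, the 70 rents can be run through the sample skewness formula and the mean compared with the median. This is a sketch under stated assumptions: the variable names are illustrative, and the standard-library `statistics` module stands in for whatever tool the course uses.

```python
import statistics

# The 70 apartment rents from the data slide
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = statistics.mean(rents)
median = statistics.median(rents)
s = statistics.stdev(rents)  # sample standard deviation (n - 1)

# Sample skewness: n / ((n - 1)(n - 2)) * sum of cubed standardized deviations
skewness = n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in rents)

print(f"n = {n}, mean = {mean:.2f}, median = {median}, s = {s:.2f}")
print(f"skewness = {skewness:.2f}")
```

The mean (490.80) lies above the median (475), which is the pattern the earlier slides describe for right-skewed data.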

Page 11: Descriptive Statistics:  Numerical Measures Distribution

z-Scores

$$z_i = \frac{x_i - \bar{x}}{s}$$

The z-score is often called the standardized value. It denotes the number of standard deviations a data value x_i is from the mean.

Page 12: Descriptive Statistics:  Numerical Measures Distribution

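z-Scores

An observation’s z-score is a measure of the relative location of the observation in a data set.

[Figure: number line centered at x̄. Values below x̄ have z-score < 0, a value equal to x̄ has z-score = 0, and values above x̄ have z-score > 0.]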

Page 13: Descriptive Statistics:  Numerical Measures Distribution

z-Scores

Apartment Rents

• z-Score of Smallest Value (425)

$$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$$

Standardized Values for Apartment Rents

-1.20  -1.11  -1.11  -1.02  -1.02  -1.02  -1.02  -1.02  -0.93  -0.93
-0.93  -0.93  -0.93  -0.84  -0.84  -0.84  -0.84  -0.84  -0.75  -0.75
-0.75  -0.75  -0.75  -0.75  -0.75  -0.56  -0.56  -0.56  -0.47  -0.47
-0.47  -0.38  -0.38  -0.34  -0.29  -0.29  -0.29  -0.20  -0.20  -0.20
-0.20  -0.11  -0.01  -0.01  -0.01   0.17   0.17   0.17   0.17   0.35
 0.35   0.44   0.62   0.62   0.62   0.81   1.06   1.08   1.45   1.45
 1.54   1.54   1.63   1.81   1.99   1.99   1.99   1.99   2.27   2.27
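The standardized values can be reproduced from the raw rents. The sketch below is one way to do it with Python's standard library; the variable names are assumptions, not part of the slides.

```python
import statistics

# The 70 apartment rents, in ascending order
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = statistics.mean(rents)   # 490.80
s = statistics.stdev(rents)     # about 54.74 (sample standard deviation)

# z_i = (x_i - mean) / s for every observation
z_scores = [(x - mean) / s for x in rents]

print(f"smallest rent {rents[0]}: z = {z_scores[0]:.2f}")
print(f"largest rent {rents[-1]}: z = {z_scores[-1]:.2f}")
```

Because the z-scores are deviations from the mean divided by a constant, they always sum to zero (up to floating-point rounding), which is a quick sanity check on any standardized column.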

Page 14: Descriptive Statistics:  Numerical Measures Distribution

PRACTICE: Z-SCORES

Page 15: Descriptive Statistics:  Numerical Measures Distribution

Practice #6 – z-Scores

x̄ = 13, s = 7.4

$$z_i = \frac{x_i - \bar{x}}{s}$$

x_i    x_i - x̄    z-Score
  3      -10
  7       -6
 11       -2
 16        3
 18        5
 23       10

Page 16: Descriptive Statistics:  Numerical Measures Distribution

Chebyshev’s Theorem

At least (1 - 1/k²) of the items in any data set will be within k standard deviations of the mean, where k is any value greater than 1.

Within k standard deviations of mean    % of data values (at least)
k = 2                                   75%
k = 3                                   89%
k = 4                                   94%
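The table values follow directly from the bound (1 - 1/k²). A minimal sketch, with `chebyshev_lower_bound` as a hypothetical helper name:

```python
def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the theorem is stated for k greater than 1")
    return 1 - 1 / k**2

for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.0%} of the data values")
```

The bound holds for any data set regardless of shape, which is why it is much looser than the percentages a bell-shaped distribution gives.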

Page 17: Descriptive Statistics:  Numerical Measures Distribution

Chebyshev’s Theorem

Apartment Rents

Let k = 1.5 with x̄ = 490.80 and s = 54.74.

At least (1 - 1/(1.5)²) = 1 - 0.44 = 0.56 or 56% of the rent values must be between

x̄ - k(s) = 490.80 - 1.5(54.74) = 409

and

x̄ + k(s) = 490.80 + 1.5(54.74) = 573

(Actually, 86% of the rent values are between 409 and 573.)
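The comparison between the guaranteed 56% and the observed 86% can be checked by counting. The sketch below takes the mean and standard deviation as given on the slide and counts the rents inside the interval; the variable names are illustrative assumptions.

```python
# The 70 apartment rents
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean, s, k = 490.80, 54.74, 1.5   # summary values stated on the slide

lower = mean - k * s              # about 409
upper = mean + k * s              # about 573
bound = 1 - 1 / k**2              # Chebyshev's guaranteed minimum share

inside = sum(lower <= x <= upper for x in rents)
share = inside / len(rents)

print(f"interval: {lower:.0f} to {upper:.0f}")
print(f"guaranteed: at least {bound:.0%}; observed: {inside} of {len(rents)} ({share:.0%})")
```

The observed share exceeds the bound, as it must: Chebyshev's theorem gives a floor, not an estimate.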

Page 18: Descriptive Statistics:  Numerical Measures Distribution

Empirical Rule

When data approximate a bell-shaped distribution, the empirical rule can be used to determine the approximate percentage of data values that lie within a specified number of standard deviations of the mean.

Within … of the mean           % of data values
+/- 1 standard deviation       68.26%
+/- 2 standard deviations      95.44%
+/- 3 standard deviations      99.72%
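These percentages come from the normal distribution, and can be recomputed with the error function in Python's standard library. The helper name `share_within` is an assumption for illustration. Computed at full precision the values round to 68.27%, 95.45%, and 99.73%; the figures in the table differ in the last digit because they come from a normal table rounded to four decimal places.

```python
import math

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"+/- {k} standard deviation(s): {share_within(k):.2%}")
# prints 68.27%, 95.45%, 99.73%
```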

Page 19: Descriptive Statistics:  Numerical Measures Distribution

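Empirical Rule

[Figure: bell-shaped curve with the x-axis marked at μ - 3σ, μ - 2σ, μ - 1σ, μ, μ + 1σ, μ + 2σ, and μ + 3σ. 68.26% of values lie within μ ± 1σ, 95.44% within μ ± 2σ, and 99.72% within μ ± 3σ.]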

Page 20: Descriptive Statistics:  Numerical Measures Distribution

PRACTICE: CHEBYSHEV’S THEOREM AND EMPIRICAL RULE

Page 21: Descriptive Statistics:  Numerical Measures Distribution

Practice #7 – Chebyshev’s Theorem

x̄ = 1200, s = 110

• k = 1.25
• k = 3.5

At least what percentage of the items is within k standard deviations of the mean?

Page 22: Descriptive Statistics:  Numerical Measures Distribution

Practice #7 – Empirical Rule

x̄ = 1200, s = 110

What is the lower bound for 2 standard deviations? The upper bound? What percentage of the items is within this interval?

Page 23: Descriptive Statistics:  Numerical Measures Distribution

Detecting Outliers

An outlier is an unusually small or unusually large value in a data set. A data value with a z-score less than -3 or greater than +3 might be considered an outlier. It might be:

• an incorrectly recorded data value
• a data value that was incorrectly included in the data set
• a correctly recorded data value that belongs in the data set
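The z-score criterion translates into a short screening function. This is a sketch only: the function name `find_outliers` and the `readings` data are hypothetical, and a flagged value still needs the review described in the bullets above before it is corrected, dropped, or kept.

```python
import statistics

def find_outliers(data, threshold=3.0):
    """Return the values whose z-score is below -threshold or above +threshold."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1)
    return [x for x in data if abs((x - mean) / s) > threshold]

# Hypothetical data: nineteen readings of 10 and one reading of 100
readings = [10] * 19 + [100]
print(find_outliers(readings))                 # [100]

# The six values from Practice #6 stay well inside +/- 3
print(find_outliers([3, 7, 11, 16, 18, 23]))   # []
```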

Page 24: Descriptive Statistics:  Numerical Measures Distribution

Detecting Outliers

Apartment Rents

• The most extreme z-scores are -1.20 and 2.27.
• Using |z| > 3 as the criterion for an outlier, there are no outliers in this data set.

Standardized Values for Apartment Rents

-1.20  -1.11  -1.11  -1.02  -1.02  -1.02  -1.02  -1.02  -0.93  -0.93
-0.93  -0.93  -0.93  -0.84  -0.84  -0.84  -0.84  -0.84  -0.75  -0.75
-0.75  -0.75  -0.75  -0.75  -0.75  -0.56  -0.56  -0.56  -0.47  -0.47
-0.47  -0.38  -0.38  -0.34  -0.29  -0.29  -0.29  -0.20  -0.20  -0.20
-0.20  -0.11  -0.01  -0.01  -0.01   0.17   0.17   0.17   0.17   0.35
 0.35   0.44   0.62   0.62   0.62   0.81   1.06   1.08   1.45   1.45
 1.54   1.54   1.63   1.81   1.99   1.99   1.99   1.99   2.27   2.27
