statistics for business and economics - faculty and...

38
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-1 Statistics for Business and Economics Chapter 2 Describing Data: Numerical

Upload: phamnhu

Post on 17-Mar-2018

213 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-1

Statistics for Business and Economics

Chapter 2

Describing Data: Numerical

Page 2: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Measures of Central Tendency

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Central Tendency

Mean Median Mode

n

xx

n

1ii∑

==

Overview

Midpoint of ranked values

Most frequently observed value

Arithmetic average

Ch. 2-2

2.1

Page 3: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Arithmetic Mean

n  The arithmetic mean (mean) is the most common measure of central tendency

n  For a population of N values:

n  For a sample of size n:

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Sample size

nxxx

n

xx n21

n

1ii +++==

∑= ! Observed

values

Nxxx

N

xµ N21

N

1ii +++==

∑= !

Population size

Population values

Ch. 2-3

Page 4: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Arithmetic Mean

n  The most common measure of central tendency n  Mean = sum of values divided by the number of values n  Affected by extreme values (outliers)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

(continued)

0 1 2 3 4 5 6 7 8 9 10

Mean = 3

0 1 2 3 4 5 6 7 8 9 10

Mean = 4

3515

554321

==++++ 4

520

5104321

==++++

Ch. 2-4

Page 5: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Median

n  In an ordered list, the median is the “middle” number (50% above, 50% below)

n  Not affected by extreme values

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

0 1 2 3 4 5 6 7 8 9 10

Median = 3

0 1 2 3 4 5 6 7 8 9 10

Median = 3

Ch. 2-5

Page 6: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Finding the Median

n  The location of the median:

n  If the number of values is odd, the median is the middle number n  If the number of values is even, the median is the average of

the two middle numbers

n  Note that is not the value of the median, only the

position of the median in the ranked data

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

dataorderedtheinposition21npositionMedian +

=

21n +

Ch. 2-6

Page 7: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Mode

n  A measure of central tendency n  Value that occurs most often n  Not affected by extreme values n  Used for either numerical or categorical data n  There may may be no mode n  There may be several modes

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Mode = 9

0 1 2 3 4 5 6

No Mode Ch. 2-7

Page 8: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Review Example

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

n  Five houses on a hill by the beach $2,000 K

$500 K

$300 K

$100 K

$100 K

House Prices: $2,000,000 500,000 300,000 100,000 100,000

Ch. 2-8

Page 9: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Review Example: Summary Statistics

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

n  Mean: ($3,000,000/5) = $600,000

n  Median: middle value of ranked data = $300,000

n  Mode: most frequent value = $100,000

House Prices: $2,000,000 500,000 300,000 100,000 100,000

Sum 3,000,000

Ch. 2-9

Page 10: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Which measure of location is the “best”?

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

n  Mean is generally used, unless extreme values (outliers) exist . . .

n  Then median is often used, since the median is not sensitive to extreme values. n  Example: Median home prices may be reported for

a region – less sensitive to outliers

Ch. 2-10

Page 11: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Shape of a Distribution

n  Describes how data are distributed n  Measures of shape

n  Symmetric or skewed

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Mean = Median Mean < Median Median < Mean

Right-Skewed Left-Skewed Symmetric

Ch. 2-11

Page 12: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Geometric Mean

n  Geometric mean

n  Geometric mean rate of return

1/nn21

nn21g )xx(x)xx(xx ×××=×××= !!

1)x...x(xr 1/nn21g −×××=

Ch. 2-12

Page 13: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Example Initial Investment: $100 After 5 years: $125 Question: the average annual rate of returns? •  25 % divided by 5 years = 5%? This is Wrong!

•  ``Compound Interest’’ over 5 years $100 x (1+0.05)^5 = $127.63 > $125 after 5 years •  Answer: $100 x (1+r)^5 = $125 è

Ch. 2-13

Page 14: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Measures of Variability

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Same center, different variation

Variation

Variance Standard Deviation

Coefficient of Variation

Range Interquartile Range

n  Measures of variation give information on the spread or variability of the data values.

Ch. 2-14

2.2

Page 15: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Range

n  Simplest measure of variation n  Difference between the largest and the smallest

observations:

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Range = Xlargest – Xsmallest

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Range = 14 - 1 = 13

Example:

Ch. 2-15

Page 16: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Disadvantages of the Range

n  Ignores the way in which data are distributed

n  Sensitive to outliers

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

7 8 9 10 11 12 Range = 12 - 7 = 5

7 8 9 10 11 12 Range = 12 - 7 = 5

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120

Range = 5 - 1 = 4

Range = 120 - 1 = 119

Ch. 2-16

Page 17: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Interquartile Range

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Median (Q2)

X maximum X

minimum Q1 Q3

Example:

25% 25% 25% 25%

12 30 45 57 70

Interquartile range = 57 – 30 = 27

Ch. 2-17

Page 18: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

STATA Example

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-18

2040

6080

100

High HW Group Low HW Group

Fina

l Gra

deInterquartile Rage by Low vs. High HW Group

Page 19: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Population Variance

n  Average of squared deviations of values from the mean

n  Population variance:

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

N

µ)(xσ

N

1i

2i

2∑=

−=

Where = population mean

N = population size

xi = ith value of the variable x

µ

Ch. 2-19

Page 20: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Sample Variance

n  Average (approximately) of squared deviations of values from the mean

n  Sample variance:

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

1-n

)x(xs

n

1i

2i

2∑=

−=

Where = arithmetic mean

n = sample size

Xi = ith value of the variable X

X

Ch. 2-20

Page 21: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Population Standard Deviation

n  Most commonly used measure of variation n  Shows variation about the mean n  Has the same units as the original data

n  Population standard deviation:

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

N

µ)(xσ

N

1i

2i∑

=

−=

Ch. 2-21

Page 22: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Sample Standard Deviation

n  Most commonly used measure of variation n  Shows variation about the mean n  Has the same units as the original data

n  Sample standard deviation:

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

1-n

)x(xS

n

1i

2i∑

=

−=

Ch. 2-22

Page 23: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Calculation Example: Sample Standard Deviation

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Sample Data (xi) : 10 12 14 15 17 18 18 24

n = 8 Mean = x = 16

4.24267

126

18

16)(2416)(1416)(1216)(10

1n

)x(24)x(14)x(12)X(10s

2222

2222

==

−++−+−+−=

−++−+−+−=

!

!

A measure of the “average” scatter around the mean

Ch. 2-23

Page 24: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Measuring variation

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Small standard deviation

Large standard deviation

Ch. 2-24

Page 25: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Comparing Standard Deviations

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Mean = 15.5 s = 3.338 11 12 13 14 15 16 17 18 19 20 21

11 12 13 14 15 16 17 18 19 20 21

Data B

Data A

Mean = 15.5 s = 0.926

11 12 13 14 15 16 17 18 19 20 21

Mean = 15.5 s = 4.570

Data C

Ch. 2-25

Page 26: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Advantages of Variance and Standard Deviation

n  Each value in the data set is used in the calculation

n  Values far from the mean are given extra weight (because deviations from the mean are squared)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-26

Page 27: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Coefficient of Variation

n  Measures relative variation n  Always in percentage (%) n  Shows variation relative to mean n  Can be used to compare two or more sets of

data measured in different units

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

100%x

sCV ⋅⎟⎟

⎞⎜⎜⎝

⎛=

Ch. 2-27

Page 28: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Comparing Coefficient of Variation

n  Stock A: n  Average price last year = $50 n  Standard deviation = $5

n  Stock B: n  Average price last year = $100 n  Standard deviation = $5

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Both stocks have the same standard deviation, but stock B is less variable relative to its price

10%100%$50

$5100%

x

sCVA =⋅=⋅⎟⎟

⎞⎜⎜⎝

⎛=

5%100%$100

$5100%

x

sCVB =⋅=⋅⎟⎟

⎞⎜⎜⎝

⎛=

Ch. 2-28

Page 29: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

n  If the data distribution is approximated by normal distribution, then the interval:

n  contains about 68% of the values in the population or the sample

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

The Empirical Rule

1σµ ±

µ

68%

1σµ±Ch. 2-29

Page 30: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

n  contains about 95% of the values in the population or the sample

n  contains almost all (about 99.7%) of the values in the population or the sample

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

The Empirical Rule

2σµ ±

3σµ ±

3σµ ±

99.7% 95%

2σµ ±

Ch. 2-30

Page 31: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Weighted Mean

n  The weighted mean of a set of data is

n  Where wi is the weight of the ith observation and

n  Use when data is already grouped into n classes, with wi values in the ith class

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

x = wixii=1

n

∑ =w1x1 +w2x2 +!+wnxn

Ch. 2-31

wi∑ =1

2.3

Page 32: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Example

n  Consider a student with the scores of assignment (x1), midterm (x2), and final exam (x3) given by

x1 = 90, x2 = 90, and x3 = 46. n  The weights: n  w1 = 0.1, w2 =0.3, and w3 = 0.6. n  The final grade for this student is

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

wixii=1

3

∑ = 0.1×90+0.3×90+0.6× 46=63.6

Ch. 2-32

2.3

Page 33: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

The Sample Covariance n  The covariance measures the strength of the linear relationship

between two variables n  The population covariance:

n  The sample covariance:

n  Only concerned with the strength of the relationship n  No causal effect is implied n  Depends on the unit of measurement

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

N

))(y(xy),(xCov

N

1iyixi

xy

∑=

−−==

µµσ

1n

)y)(yx(xsy),(xCov

n

1iii

xy −

−−==∑=

Ch. 2-33

2.4

Page 34: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Interpreting Covariance

n  Covariance between two variables:

Cov(x,y) > 0 x and y tend to move in the same direction

Cov(x,y) < 0 x and y tend to move in opposite directions

Cov(x,y) = 0 x and y are independent

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-34

Page 35: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Coefficient of Correlation

n  Measures the relative strength of the linear relationship between two variables

n  Population correlation coefficient:

n  Sample correlation coefficient:

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

YXssy),(xCovr =

YXσσy),(xCovρ =

Ch. 2-35

Page 36: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Features of Correlation Coefficient, r

n  Unit free n  Ranges between –1 and 1

n  The closer to –1, the stronger the negative linear relationship

n  The closer to 1, the stronger the positive linear relationship

n  The closer to 0, the weaker any positive linear relationship

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-36

Page 37: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Scatter Plots of Data with Various Correlation Coefficients

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Y

X

Y

X

Y

X

Y

X

Y

X

r = -1 r = -.6 r = 0

r = +.3 r = +1

Y

X r = 0

Ch. 2-37

Page 38: Statistics for Business and Economics - Faculty and …faculty.arts.ubc.ca/hkasahara/Econ325/325_chap02.pdfStatistics for Business and Economics Chapter 2 Describing Data: Numerica

Example: HW and Final Grade

n  r = .0.499

n  There is a relatively strong positive linear relationship between HW scores and

Final Grades n  Students who scored high on HW assignment

tended to have high final grades

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-38

2040

6080

100

Fina

l Gra

de

0 2 4 6 8 10Homework

Grade Fitted values

Correlation = 0.499HW and Final Grade