Instructor: Prof. Dr. Samir Safi
Source: site.iugaza.edu.ps/ssafi/files/2016/09/ALL-Chapters.pdf


Sunday, February 10, 2019

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 1

The Islamic University of Gaza

Faculty of Economics & Administrative Sciences

Department of Economics

Course: Basic Statistics (ECOE 1302)

Semester : Spring 2019

Instructor: Prof. Dr. Samir Safi

Slide - 2

Business Statistics: A First Course

Seventh Edition

Chapter 3

Numerical Descriptive Measures


Slide - 3

Objectives

1. Describe the properties of central tendency,

variation, and shape in numerical data

2. Construct and interpret a boxplot

3. Compute descriptive summary measures for a

population

4. Calculate the covariance and the coefficient of

correlation

Slide - 4

Summary Definitions

DCOVA

• The central tendency is the extent to which the

values of a numerical variable group around a

typical or central value.

• The variation is the amount of dispersion or

scattering away from a central value that the

values of a numerical variable show.

• The shape is the pattern of the distribution of

values from the lowest value to the highest value.

Slide - 5

Measures of Central Tendency: The

Mean (1 of 2)

DCOVA

• The arithmetic mean (often just called the “mean”)

is the most common measure of central tendency

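– For a sample of size n:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

where $\bar{X}$ (pronounced "x-bar") is the sample mean, $n$ is the sample size, and $X_i$ is the ith observed value.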

Slide - 6

Measures of Central Tendency: The

Mean (2 of 2)

DCOVA

• The most common measure of central tendency

• Mean = sum of values divided by the number of

values

• Affected by extreme values (outliers)

$$\bar{X} = \frac{11 + 12 + 13 + 14 + 15}{5} = \frac{65}{5} = 13$$

$$\bar{X} = \frac{11 + 12 + 13 + 14 + 20}{5} = \frac{70}{5} = 14$$
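The effect of a single extreme value on the mean can be checked with a short sketch. It uses the two five-value samples from this slide; the variable names are illustrative and the standard-library `statistics.mean` is just one convenient way to compute the average.

```python
from statistics import mean

# The two samples from the slide; only the last observation differs.
without_extreme = [11, 12, 13, 14, 15]
with_extreme = [11, 12, 13, 14, 20]

mean_without = mean(without_extreme)  # 65 / 5
mean_with = mean(with_extreme)        # 70 / 5

# Changing one value from 15 to 20 moves the mean by a full unit.
print(mean_without, mean_with)
```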

Slide - 7

Measures of Central Tendency: The

Median

DCOVA

• In an ordered array, the median is the “middle”

number (50% above, 50% below)

• Less sensitive than the mean to extreme values

Slide - 8

Measures of Central Tendency:

Locating the Median

DCOVA

• The location of the median when the values are in numerical order (smallest to largest):

$$\text{Median position} = \frac{n+1}{2} \text{ position in the ordered data}$$

• If the number of values is odd, the median is the middle number

• If the number of values is even, the median is the average of the two middle numbers

Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
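A minimal sketch of the position rule follows. The function name `median_by_position` is hypothetical; it sorts the data, computes the 1-based position (n + 1)/2, and applies the odd/even cases listed on the slide.

```python
def median_by_position(values):
    """Locate the median with the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position in the ordered data
    if n % 2 == 1:
        # Odd n: the position is a whole number, so take that ranked value.
        return ordered[int(position) - 1]
    # Even n: the position ends in .5, so average the two middle values.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


odd_case = median_by_position([11, 12, 13, 14, 20])
even_case = median_by_position([4, 1, 3, 2])
print(odd_case, even_case)
```

Note that the position is only an index into the sorted data; the returned value is the median itself.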

Slide - 9

Measures of Central Tendency: The

Mode

DCOVA

• Value that occurs most often

• Not affected by extreme values

• Used for either numerical or categorical data

• There may be no mode

• There may be several modes

Slide - 10

Measures of Central Tendency:

Review Example

DCOVA

House Prices:

$2,000,000
$500,000
$300,000
$100,000
$100,000

Sum: $3,000,000

• Mean: $3,000,000 / 5 = $600,000

• Median: middle value of ranked data = $300,000

• Mode: most frequent value = $100,000
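The review example can be reproduced with Python's standard `statistics` module. This is only a sketch; the list name `house_prices` is an assumption, and the figures are the five prices from the slide.

```python
from statistics import mean, median, mode

# The five house prices from the review example, in dollars.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

price_mean = mean(house_prices)      # sum of values divided by 5
price_median = median(house_prices)  # middle value of the ranked data
price_mode = mode(house_prices)      # most frequent value

print(price_mean, price_median, price_mode)
```

The single $2,000,000 house pulls the mean well above the median, which is why the two measures tell different stories here.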

Slide - 11

Measures of Central Tendency:

Which Measure to Choose?

DCOVA

• The mean is generally used, unless extreme values (outliers) exist.

• The median is often used when outliers are present, because it is less sensitive to extreme values than the mean. For example, median home prices may be reported for a region.

• In many situations it makes sense to report both

the mean and the median.

Slide - 12

Measures of Central Tendency:

Summary

DCOVA

Slide - 13

Measures of Variation

DCOVA

• Measures of variation

give information on the

spread or variability or

dispersion of the data

values.

Slide - 14

Measures of Variation: The Range

DCOVA

• Simplest measure of variation

• Difference between the largest and the smallest

values:

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

Example:

Slide - 15

Measures of Variation: Why the Range

Can Be Misleading

DCOVA

• Does not account for how the data are distributed

• Sensitive to outliers

1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 5

Range = 5 − 1 = 4

1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 120

Range = 120 − 1 = 119
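The sketch below recomputes both ranges. The helper `value_range` and the list names are illustrative, and the data assume the slide's two samples share everything except the largest value (eleven 1s, eight 2s, four 3s, one 4, then either 5 or 120).

```python
def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# Values shared by both samples (assumed reading of the slide's data).
common = [1] * 11 + [2] * 8 + [3] * 4 + [4]

sample_a = common + [5]
sample_b = common + [120]

range_a = value_range(sample_a)
range_b = value_range(sample_b)
print(range_a, range_b)
```

Only one of the 25 values changed, yet the range grows from 4 to 119, which shows why the range alone says little about how the bulk of the data are spread.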

Slide - 16

Measures of Variation: The Sample

Variance

DCOVA

• Average (approximately) of squared deviations of

values from the mean

– Sample variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

Where
$\bar{X}$ = arithmetic mean
$n$ = sample size
$X_i$ = ith value of the variable X

Slide - 17

Measures of Variation: The Sample

Standard Deviation

DCOVA

• Most commonly used measure of variation

• Shows variation about the mean

• Is the square root of the variance

• Has the same units as the original data

– Sample standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

Slide - 18

Measures of Variation: The Standard

Deviation

DCOVA

Steps for Computing Standard Deviation

1. Compute the difference between each value and the

mean.

2. Square each difference.

3. Add the squared differences.

4. Divide this total by n−1 to get the sample variance.

5. Take the square root of the sample variance to get the

sample standard deviation.

Slide - 19

Measures of Variation: Sample Standard

Deviation: Calculation Example

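DCOVA

Sample Data ($X_i$): 10 12 14 15 17 18 18 24

n = 8, Mean = $\bar{X}$ = 16

$$S = \sqrt{\frac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}}$$

$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$$

$$= \sqrt{\frac{130}{7}} = 4.3095$$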

A measure of the “average”

scatter around the mean
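The five computing steps can be traced in code on the same eight values. This is a sketch with illustrative variable names, not the textbook's own procedure listing; each line is labelled with the step it corresponds to.

```python
import math

sample = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(sample)
x_bar = sum(sample) / n  # mean of the sample

deviations = [x - x_bar for x in sample]        # Step 1: value minus mean
squared = [d ** 2 for d in deviations]          # Step 2: square each difference
sum_of_squares = sum(squared)                   # Step 3: add the squares
sample_variance = sum_of_squares / (n - 1)      # Step 4: divide by n - 1
sample_std = math.sqrt(sample_variance)         # Step 5: take the square root

print(x_bar, sum_of_squares, round(sample_std, 4))
```

Dividing by n − 1 rather than n is what makes this the sample (not population) standard deviation.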

Slide - 20

Measures of Variation: Comparing

Standard Deviations (1 of 2)

DCOVA

Data A: Mean = 15.5, S = 3.338

Data B: Mean = 15.5, S = 0.926

Data C: Mean = 15.5, S = 4.567

Slide - 21

Measures of Variation: Comparing

Standard Deviations (2 of 2)

DCOVA

Slide - 22

Measures of Variation: Summary

Characteristics

DCOVA

• The more the data are spread out, the greater the

range, variance, and standard deviation.

• The more the data are concentrated, the smaller

the range, variance, and standard deviation.

• If the values are all the same (no variation), all

these measures will be zero.

• None of these measures are ever negative.

Slide - 23

Measures of Variation: The

Coefficient of Variation

DCOVA

• Measures relative variation

• Always in percentage (%)

• Shows variation relative to mean

• Can be used to compare the variability of two or

more sets of data measured in different units

$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$

Slide - 24

Measures of Variation: Comparing

Coefficients of Variation (1 of 2)

DCOVA

• Stock A:

– Mean price last year = $50

– Standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

• Stock B:

– Mean price last year = $100

– Standard deviation = $5

$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

Both stocks have the same standard deviation, but stock B is less variable relative to its mean price

Slide - 25

Measures of Variation: Comparing

Coefficients of Variation (2 of 2)

DCOVA

• Stock A:

– Mean price last year = $50

– Standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

• Stock C:

– Mean price last year = $8

– Standard deviation = $2

$$CV_C = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$2}{\$8} \cdot 100\% = 25\%$$

Stock C has a much smaller standard deviation but a much higher coefficient of variation
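The three stock comparisons can be reproduced with a small helper. The function name `coefficient_of_variation` is hypothetical; it simply applies CV = (S / mean) × 100% to the figures given on these two slides.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Return the coefficient of variation as a percentage."""
    return std_dev / mean_value * 100


# (standard deviation, mean price) in dollars, as given on the slides.
cv_a = coefficient_of_variation(5, 50)
cv_b = coefficient_of_variation(5, 100)
cv_c = coefficient_of_variation(2, 8)

print(cv_a, cv_b, cv_c)
```

Because the units cancel in S / mean, the three percentages are directly comparable even though the stocks trade at very different price levels.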

Slide - 26

Locating Extreme Outliers:

Z-Score (1 of 3)

DCOVA

• To compute the Z-score of a data value, subtract the mean and divide by the standard deviation.

• The Z-score is the number of standard deviations a data value is from the mean.

• A data value is considered an extreme outlier if its Z-score is less than −3.0 or greater than +3.0.

• The larger the absolute value of the Z-score, the farther the data value is from the mean.

Slide - 27

Locating Extreme Outliers:

Z-Score (2 of 3)

DCOVA

$$Z = \frac{X - \bar{X}}{S}$$

where $X$ represents the data value

$\bar{X}$ is the sample mean

$S$ is the sample standard deviation

Slide - 28

Locating Extreme Outliers:

Z-Score (3 of 3)

DCOVA

• Suppose the mean math SAT score is 490, with a

standard deviation of 100.

• Compute the Z-score for a test score of 620.

$$Z = \frac{X - \bar{X}}{S} = \frac{620 - 490}{100} = \frac{130}{100} = 1.3$$

A score of 620 is 1.3 standard deviations above

the mean and would not be considered an outlier.
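A short sketch of the Z-score computation and the ±3.0 cutoff follows. The function names `z_score` and `is_extreme_outlier` are illustrative, and the cutoff is passed as a parameter so the slide's threshold is explicit rather than hidden in the code.

```python
def z_score(value, mean_value, std_dev):
    """Number of standard deviations a value lies from the mean."""
    return (value - mean_value) / std_dev


def is_extreme_outlier(z, cutoff=3.0):
    """Apply the slide's rule: Z below -3.0 or above +3.0."""
    return z < -cutoff or z > cutoff


# Math SAT example: mean 490, standard deviation 100, test score 620.
z = z_score(620, 490, 100)
print(z, is_extreme_outlier(z))
```

For contrast, a score of 800 on the same scale would have Z = 3.1 and would be flagged by the rule.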

Slide - 29

Shape of a Distribution

DCOVA

• Describes how data are distributed

• Two useful shape related statistics are:

– Skewness

▪ Measures the extent to which data values are not

symmetrical

– Kurtosis

▪ Kurtosis measures the peakedness of the curve of

the distribution—that is, how sharply the curve rises

approaching the center of the distribution

Slide - 30

Shape of a Distribution (Skewness)

DCOVA

• Measures the extent to which data are not symmetrical

Slide - 31

Shape of a Distribution -- Kurtosis Measures How

Sharply the Curve Rises Approaching the Center of

the Distribution

DCOVA

Slide - 32

Quartile Measures

DCOVA

• Quartiles split the ranked data into 4 segments with an

equal number of values per segment

• The first quartile, $Q_1$, is the value for which 25% of the observations are smaller and 75% are larger

• $Q_2$ is the same as the median (50% of the observations are smaller and 50% are larger)

• Only 25% of the observations are greater than the third quartile, $Q_3$.

Slide - 33

Quartile Measures: Locating

Quartiles (1 of 2)

DCOVA

Find a quartile by determining the value in the

appropriate position in the ranked data, where

First quartile position: $Q_1 = \frac{n+1}{4}$ ranked value

Second quartile position: $Q_2 = \frac{n+1}{2}$ ranked value

Third quartile position: $Q_3 = \frac{3(n+1)}{4}$ ranked value

where n is the number of observed values

Slide - 34

Quartile Measures: Calculation Rules

DCOVA

• When calculating the ranked position use the

following rules

– If the result is a whole number then it is the ranked

position to use

– If the result is a fractional half (e.g. 2.5, 7.5, 8.5, etc.)

then average the two corresponding data values.

– If the result is not a whole number or a fractional half

then round the result to the nearest integer to find the

ranked position.

Slide - 35

Quartile Measures: Locating

Quartiles (2 of 2)

DCOVA

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)

$Q_1$ is in the $\frac{9+1}{4} = 2.5$ position of the ranked data, so use the value halfway between the 2nd and 3rd values, so $Q_1 = 12.5$

$Q_1$ and $Q_3$ are measures of non-central location

$Q_2$ = median, is a measure of central tendency

Slide - 36

Quartile Measures Calculating The

Quartiles: Example

DCOVA

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)

$Q_1$ is in the $\frac{9+1}{4} = 2.5$ position of the ranked data, so $Q_1 = \frac{12 + 13}{2} = 12.5$

$Q_2$ is in the $\frac{9+1}{2} = 5$th position of the ranked data, so $Q_2$ = median = 16

$Q_3$ is in the $\frac{3(9+1)}{4} = 7.5$ position of the ranked data, so $Q_3 = \frac{18 + 21}{2} = 19.5$

$Q_1$ and $Q_3$ are measures of non-central location

$Q_2$ = median, is a measure of central tendency
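The quartile positions and the three calculation rules can be sketched as one function. The name `quartile` is hypothetical, and the code assumes at least three observations so that every computed position falls inside the data. Note that library routines such as `numpy.percentile` use different interpolation conventions by default and may return different quartiles than these rules.

```python
def quartile(values, k):
    """Return quartile k (1, 2 or 3) using the position k(n + 1)/4.

    Rules applied to the position:
      whole number    -> use that ranked value
      fractional half -> average the two neighbouring ranked values
      anything else   -> round to the nearest whole position
    """
    ordered = sorted(values)
    n = len(ordered)
    # Keep the position as quotient and remainder to avoid float rounding.
    whole, remainder = divmod(k * (n + 1), 4)
    if remainder == 0:
        return ordered[whole - 1]  # positions are 1-based
    if remainder == 2:
        return (ordered[whole - 1] + ordered[whole]) / 2
    position = whole if remainder == 1 else whole + 1
    return ordered[position - 1]


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1, q2, q3 = (quartile(data, k) for k in (1, 2, 3))
print(q1, q2, q3)
```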

Slide - 37

Quartile Measures: The Interquartile

Range (IQR)

DCOVA

• The IQR is $Q_3 - Q_1$ and measures the spread in the middle 50% of the data

• The IQR is also called the midspread because it covers the middle 50% of the data

• The IQR is a measure of variability that is not influenced by outliers or extreme values

• Measures like $Q_1$, $Q_3$, and IQR that are not influenced by outliers are called resistant measures

Slide - 38

Calculating The Interquartile Range

DCOVA

Example:

Slide - 39

The Five Number Summary

DCOVA

The five numbers that help describe the center, spread and shape of data are:

• $X_{\text{smallest}}$

• First Quartile ($Q_1$)

• Median ($Q_2$)

• Third Quartile ($Q_3$)

• $X_{\text{largest}}$

Slide - 40

Relationships Among the Five-Number

Summary and Distribution Shape

DCOVA

Slide - 41

Five Number Summary and the

Boxplot

DCOVA

• The Boxplot: A Graphical display of the data

based on the five-number summary:

$X_{\text{smallest}}$ -- $Q_1$ -- Median -- $Q_3$ -- $X_{\text{largest}}$

Example:

Slide - 42

Five Number Summary: Shape of

Boxplots

DCOVA

• If data are symmetric around the median then the box and

central line are centered between the endpoints

• A Boxplot can be shown in either a vertical or horizontal

orientation

Slide - 43

Distribution Shape and The Boxplot

DCOVA

Left-Skewed Symmetric Right-Skewed

Slide - 44

Boxplot Example

DCOVA

• Below is a Boxplot for the following data:

• The data are right skewed, as the plot depicts

Slide - 45

Numerical Descriptive Measures for a

Population

DCOVA

• Descriptive statistics discussed previously described a

sample, not the population.

• Summary measures describing a population, called

parameters, are denoted with Greek letters.

• Important population parameters are the population mean,

variance, and standard deviation.

Slide - 46

Numerical Descriptive Measures for a

Population: The mean µ

DCOVA

• The population mean is the sum of the values in the population divided by the population size, N

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$

Where
$\mu$ = population mean
$N$ = population size
$X_i$ = ith value of the variable X

Slide - 47

Numerical Descriptive Measures for a

Population: The Variance Sigma Squared

DCOVA

• Average of squared deviations of values from the

mean.

– Population variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Where
$\mu$ = population mean
$N$ = population size
$X_i$ = ith value of the variable X

Slide - 48

Numerical Descriptive Measures for a

Population: The Standard Deviation Sigma

DCOVA

• Most commonly used measure of variation

• Shows variation about the mean

• Is the square root of the population variance

• Has the same units as the original data

– Population standard deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

Slide - 49

Sample Statistics Versus Population

Parameters

DCOVA

Slide - 50

The Empirical Rule (1 of 2)

DCOVA

• The empirical rule approximates the variation of

data in a bell-shaped distribution

• Approximately 68% of the data in a bell-shaped distribution is within 1 standard deviation of the mean, or $\mu \pm 1\sigma$

Slide - 51

The Empirical Rule (2 of 2)

DCOVA

• Approximately 95% of the data in a bell-shaped distribution lies within two standard deviations of the mean, or $\mu \pm 2\sigma$

• Approximately 99.7% of the data in a bell-shaped distribution lies within three standard deviations of the mean, or $\mu \pm 3\sigma$

Slide - 52

Using the Empirical Rule

DCOVA

• Suppose that the variable Math SAT scores is bell-shaped with a mean of 500 and a standard deviation of 90. Then,

– Approximately 68% of all test takers scored between 410 and 590, (500 ± 90).

– Approximately 95% of all test takers scored between 320 and 680, (500 ± 180).

– Approximately 99.7% of all test takers scored between 230 and 770, (500 ± 270).
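The three intervals follow mechanically from the mean and standard deviation, as the sketch below shows. The function name `empirical_rule_intervals` is illustrative, and the percentages are the approximate empirical-rule figures, which apply only to bell-shaped data.

```python
def empirical_rule_intervals(mu, sigma):
    """Map each approximate coverage (%) to the interval mu +/- k*sigma."""
    coverage_by_k = {1: 68.0, 2: 95.0, 3: 99.7}
    return {
        percent: (mu - k * sigma, mu + k * sigma)
        for k, percent in coverage_by_k.items()
    }


# Math SAT example: mean 500, standard deviation 90.
intervals = empirical_rule_intervals(500, 90)
for percent, (low, high) in intervals.items():
    print(f"about {percent}% of scores between {low} and {high}")
```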

Slide - 53

Chebyshev Rule

DCOVA

• Regardless of how the data are distributed, at least

$$\left(1 - \frac{1}{k^2}\right) \times 100\%$$

of the values will fall within k standard deviations of the mean (for k > 1).

– Examples:
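The bound can be evaluated for any k > 1 with a one-line function. This is a sketch; the name `chebyshev_lower_bound` and the chosen k values are illustrative and are not taken from the slide's own examples.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the Chebyshev rule is stated for k > 1")
    return (1 - 1 / k ** 2) * 100


bound_2 = chebyshev_lower_bound(2)
bound_3 = chebyshev_lower_bound(3)
print(bound_2, round(bound_3, 2))
```

These are guaranteed minimums for any distribution, so they are lower than the approximate 95% and 99.7% that the empirical rule gives for bell-shaped data.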

Slide - 54

We Discuss Two Measures of the Relationship

Between Two Numerical Variables

• Scatter plots allow you to visually examine the

relationship between two numerical variables and

now we will discuss two quantitative measures of

such relationships.

• The Covariance

• The Coefficient of Correlation

Slide - 55

The Covariance

DCOVA

• The covariance measures the strength of the linear relationship between two numerical variables (X & Y)

• The sample covariance:

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

• Only concerned with the strength of the relationship

• No causal effect is implied

Slide - 56

Interpreting Covariance

DCOVA

• Covariance between two variables:

cov(X, Y) > 0: X and Y tend to move in the same direction

cov(X, Y) < 0: X and Y tend to move in opposite directions

cov(X, Y) = 0: X and Y show no linear relationship (zero covariance alone does not establish independence)

• The covariance has a major flaw:

– It is not possible to determine the relative strength of

the relationship from the size of the covariance

Slide - 57

Coefficient of Correlation

DCOVA

• Measures the relative strength of the linear

relationship between two numerical variables

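• Sample coefficient of correlation:

$$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$$

where

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

$$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

$$S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$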

Slide - 58

Features of the Coefficient of

Correlation

DCOVA

• The population coefficient of correlation is referred to as ρ.

• The sample coefficient of correlation is referred to as r.

• Both ρ and r have the following features:

– Unit free

– Range between −1 and 1

– The closer to −1, the stronger the negative linear relationship

– The closer to 1, the stronger the positive linear relationship

– The closer to 0, the weaker the linear relationship

Slide - 59

Scatter Plots of Sample Data with

Various Coefficients of Correlation

DCOVA

Slide - 60

The Coefficient of Correlation Using

Microsoft Excel Function

DCOVA

Test #1 Score    Test #2 Score
78               82
92               88
86               91
83               90
95               92
85               85
91               89
76               81
88               96
79               77

Correlation Coefficient: 0.7332, computed with =CORREL(A2:A11,B2:B11)
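The Excel result can be cross-checked by applying the sample covariance and standard deviation formulas directly. The sketch below uses the ten score pairs from the table; the function and variable names are illustrative, and it is meant as a check on the formulas rather than a replacement for CORREL.

```python
import math

test1 = [78, 92, 86, 83, 95, 85, 91, 76, 88, 79]
test2 = [82, 88, 91, 90, 92, 85, 89, 81, 96, 77]


def sample_covariance(x, y):
    """Sum of cross-products of deviations, divided by n - 1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)


def sample_std(x):
    """Sample standard deviation: the covariance of x with itself, rooted."""
    return math.sqrt(sample_covariance(x, x))


def correlation(x, y):
    """r = cov(X, Y) / (S_X * S_Y)."""
    return sample_covariance(x, y) / (sample_std(x) * sample_std(y))


cov_xy = sample_covariance(test1, test2)
r = correlation(test1, test2)
print(round(cov_xy, 4), round(r, 4))
```

The covariance depends on the units of both scores, while r is unit free and always lies between −1 and 1, which is why r is the measure used for interpretation.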

Slide - 61

Interpreting the Coefficient of

Correlation Using Microsoft Excel

DCOVA

• r = 0.733.

• There is a relatively strong

positive linear relationship

between test score #1 and

test score #2.

• Students who scored high

on the first test tended to

score high on second test.

[Figure: Scatter Plot of Test Scores. Horizontal axis: Test #1 Score (70 to 100); vertical axis: Test #2 Score (70 to 100).]

Slide - 62

Pitfalls in Numerical Descriptive

Measures

DCOVA

• Data analysis is objective

– Should report the summary measures that best

describe and communicate the important aspects of

the data set

• Data interpretation is subjective

– Should be done in fair, neutral and clear manner

Slide - 63

Ethical Considerations

DCOVA

Numerical descriptive measures:

• Should document both good and bad results

• Should be presented in a fair, objective and neutral

manner

• Should not use inappropriate summary measures to

distort facts

Slide - 64

Chapter Summary

In this chapter we have discussed:

• Describing the properties of central tendency,

variation, and shape in numerical data

• Constructing and interpreting a boxplot

• Computing descriptive summary measures for a

population

• Calculating the covariance and the coefficient of

correlation


Slide - 65

Business Statistics: A First Course

Seventh Edition

Chapter 6

The Normal Distribution

Slide - 66

Objectives

1. To compute probabilities from the normal distribution

2. How to use the normal distribution to solve business

problems

3. To use the normal probability plot to determine whether a

set of data is approximately normally distributed

Slide - 67

Continuous Probability Distributions

• A continuous variable is a variable that can

assume any value on a continuum (can assume

an uncountable number of values)

– thickness of an item

– time required to complete a task

– temperature of a solution

– height, in inches

• These can potentially take on any value

depending only on the ability to precisely and

accurately measure

Slide - 68

The Normal Distribution

• Bell Shaped

• Symmetrical

• Mean, Median and Mode

are Equal

Location is determined by the mean, μ

Spread is determined by the standard deviation, σ

The random variable has an infinite theoretical range: −∞ to +∞

Slide - 69

The Normal Distribution Density Function

• The formula for the normal probability density function is

$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^{2}}$$

Where

e = the mathematical constant approximated by 2.71828

$\pi$ = the mathematical constant approximated by 3.14159

$\mu$ = the population mean

$\sigma$ = the population standard deviation

X = any value of the continuous variable
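
The density formula on this slide can be evaluated directly. The sketch below is illustrative only: the function name `normal_pdf` and the parameter values are assumptions, and the result is compared against Python's standard-library `statistics.NormalDist` as a sanity check.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density f(X), computed term by term from the slide's formula."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coeff * math.exp(exponent)


# Illustrative parameters (assumed for this sketch)
mu, sigma = 100.0, 50.0
for x in (0.0, 100.0, 200.0):
    by_formula = normal_pdf(x, mu, sigma)
    by_library = NormalDist(mu, sigma).pdf(x)
    print(f"f({x}) = {by_formula:.6f}  (library: {by_library:.6f})")
```

The density is highest at the mean and takes equal values at points the same distance above and below it, which is the symmetry described earlier.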

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 70

By Varying the Parameters Mu and Sigma,

We Obtain Different Normal Distributions

A and B have the same mean but different standard deviations.

B and C have different means and different standard deviations.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 71

The Normal Distribution Shape

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 72

The Standardized Normal

• Any normal distribution (with any mean and

standard deviation combination) can be

transformed into the standardized normal

distribution (Z)

• To compute normal probabilities, we need to transform X units into Z units

• The standardized normal distribution (Z) has a

mean of 0 and a standard deviation of 1

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 73

Translation to the Standardized

Normal Distribution

• Translate from X to the standardized normal (the

“Z” distribution) by subtracting the mean of X and

dividing by its standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$

The Z distribution always has mean = 0 and

standard deviation = 1

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 74

The Standardized Normal Probability

Density Function

• The formula for the standardized normal probability density function is

$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^{2}}$$

Where

e = the mathematical constant approximated by 2.71828

$\pi$ = the mathematical constant approximated by 3.14159

Z = any value of the standardized normal distribution

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 75

The Standardized Normal Distribution

• Also known as the “Z” distribution

• Mean is 0

• Standard Deviation is 1

Values above the mean have positive Z-values.

Values below the mean have negative Z-values.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 76

Example

• If X is distributed normally with mean of $100

and standard deviation of $50, the Z value for

X = $200 is

$$Z = \frac{X - \mu}{\sigma} = \frac{\$200 - \$100}{\$50} = 2.0$$

• This says that X = $200 is two standard deviations

(2 increments of $50 units) above the mean of

$100.
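
As a minimal sketch, the same standardization can be written as a small helper; the function name `z_score` is an assumption made for illustration.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Values from the example: mean $100, standard deviation $50, X = $200
z = z_score(200.0, mu=100.0, sigma=50.0)
print(z)  # 2.0
```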

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 77

Comparing X and Z Units

Note that the shape of the distribution is the same, only the

scale has changed. We can express the problem in the

original units (X in dollars) or in standardized units (Z)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 78

Finding Normal Probabilities

Probability is measured by the area under the

curve

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 79

Probability as Area Under the Curve

The total area under the curve is 1.0, and the curve is

symmetric, so half is above the mean, half is below

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 80

The Standardized Normal Table (1 of 2)

• The Cumulative Standardized Normal table in the

textbook (Appendix table E.2) gives the probability

less than a desired value of Z (i.e., from negative

infinity to Z)

Example:

$P(Z < 2.00) = 0.9772$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 81

The Standardized Normal Table (2 of 2)

$P(Z < 2.00) = 0.9772$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 82

General Procedure for Finding

Normal Probabilities

To find $P(a < X < b)$ when X is distributed normally:

• Draw the normal curve for the problem in terms of

X

• Translate X-values to Z-values

• Use the Standardized Normal Table

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 83

Finding Normal Probabilities (1 of 2)

• Let X represent the time it takes (in seconds) to

download an image file from the internet.

• Suppose X is normal with a mean of 18.0 seconds and a standard deviation of 5.0 seconds. Find $P(X < 18.6)$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 84

Finding Normal Probabilities (2 of 2)

• Let X represent the time it takes (in seconds) to download an image file from the internet.

• Suppose X is normal with a mean of 18.0 seconds and a standard deviation of 5.0 seconds. Find $P(X < 18.6)$

$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18.0}{5.0} = 0.12$$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 85

Solution: Finding $P(Z < 0.12)$

Standardized Normal Probability Table (Portion)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 86

Finding Normal Upper Tail

Probabilities (1 of 2)

• Suppose X is normal with mean 18.0 and

standard deviation 5.0.

• Now find $P(X > 18.6)$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 87

Finding Normal Upper Tail

Probabilities (2 of 2)

• Now find $P(X > 18.6)$

$$P(X > 18.6) = P(Z > 0.12) = 1.0 - P(Z \le 0.12) = 1.0 - 0.5478 = 0.4522$$
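
The lower-tail and upper-tail probabilities for the download-time example can be checked without a table. This is a sketch using Python's standard-library `statistics.NormalDist`; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

# Download time in seconds: mean 18.0, standard deviation 5.0 (from the example)
download_time = NormalDist(mu=18.0, sigma=5.0)

p_less = download_time.cdf(18.6)   # P(X < 18.6), cumulative area to the left
p_more = 1.0 - p_less              # P(X > 18.6), complement of the lower tail

print(f"P(X < 18.6) = {p_less:.4f}")
print(f"P(X > 18.6) = {p_more:.4f}")
```

Rounded to four decimals, the results agree with the table values 0.5478 and 0.4522.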

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 88

Finding a Normal Probability Between

Two Values

• Suppose X is normal with mean 18.0 and standard deviation 5.0. Find $P(18 < X < 18.6)$

Calculate Z-values:

$$Z = \frac{X - \mu}{\sigma} = \frac{18 - 18}{5} = 0$$

$$Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18}{5} = 0.12$$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 89

Solution: Finding $P(0 < Z < 0.12)$

Standardized Normal Probability Table (Portion)

$$P(18 < X < 18.6) = P(0 < Z < 0.12) = P(Z < 0.12) - P(Z \le 0) = 0.5478 - 0.5000 = 0.0478$$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 90

Probabilities in the Lower Tail (1 of 2)

• Suppose X is normal with mean 18.0 and

standard deviation 5.0.

• Now find $P(17.4 < X < 18)$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 91

Probabilities in the Lower Tail (2 of 2)

Now find $P(17.4 < X < 18)$

$$P(17.4 < X < 18) = P(-0.12 < Z < 0) = P(Z < 0) - P(Z \le -0.12) = 0.5000 - 0.4522 = 0.0478$$

The normal distribution is symmetric, so this probability is the same as $P(0 < Z < 0.12)$.
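
A probability between two values is a difference of two cumulative probabilities. The sketch below (the helper name `prob_between` is an assumption) reproduces both interval results and shows that they match by symmetry.

```python
from statistics import NormalDist

X = NormalDist(mu=18.0, sigma=5.0)  # download time from the example


def prob_between(dist: NormalDist, a: float, b: float) -> float:
    """P(a < X < b) as the cumulative area below b minus the area below a."""
    return dist.cdf(b) - dist.cdf(a)


upper_side = prob_between(X, 18.0, 18.6)   # P(18 < X < 18.6)
lower_side = prob_between(X, 17.4, 18.0)   # P(17.4 < X < 18)

print(f"P(18 < X < 18.6)  = {upper_side:.4f}")
print(f"P(17.4 < X < 18)  = {lower_side:.4f}")
```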

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 92

Empirical Rule

What can we say about the distribution of values

around the mean? For any normal distribution:

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 93

The Empirical Rule

• $\mu \pm 2\sigma$ covers about 95.44% of X’s

• $\mu \pm 3\sigma$ covers about 99.73% of X’s
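
The coverage percentages of the empirical rule can be recomputed from the standardized normal distribution. This is a sketch; the helper name `coverage` is an assumption.

```python
from statistics import NormalDist

Z = NormalDist()  # standardized normal: mean 0, standard deviation 1


def coverage(k: float) -> float:
    """Share of a normal population within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(f"mean +/- {k} standard deviation(s) covers about {coverage(k):.2%}")
```

Exact values can differ from table-based figures in the last digit, because the table entries are rounded to four decimals before they are subtracted (for two standard deviations, 95.45% against the table-based 95.44%).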

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 94

Given a Normal Probability Find the X

Value

• Steps to find the X value for a known probability:

1. Find the Z value for the known probability

2. Convert to X units using the formula:

$$X = \mu + Z\sigma$$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 95

Finding the X Value for a Known

Probability

Example:

• Let X represent the time it takes (in seconds) to download

an image file from the internet.

• Suppose X is normal with mean 18.0 and standard

deviation 5.0

• Find X such that 20% of download times are less than X.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 96

Find the Z Value for 20% in the Lower

Tail

1. Find the Z value for the known probability

Standardized Normal Probability

Table (Portion)

• 20% area in the

lower tail is

consistent with a Z

value of −0.84

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 97

Finding the X Value

2. Convert to X units using the formula:

$$X = \mu + Z\sigma = 18.0 + (-0.84)(5.0) = 13.8$$

So 20% of the values from a distribution with mean 18.0 and standard deviation 5.0 are less than 13.80
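
The two steps (find Z, then convert to X) can be sketched with the inverse cumulative function `inv_cdf` of Python's standard-library `statistics.NormalDist`, which stands in for the reverse table lookup.

```python
from statistics import NormalDist

mu, sigma = 18.0, 5.0  # download-time example

# Step 1: Z value with 20% of the area below it
z = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma
x = mu + z * sigma

print(f"Z = {z:.4f}, X = {x:.2f}")
```

The unrounded Z is slightly below the table value of −0.84, so X comes out marginally under the 13.8 obtained by hand.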

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 98

Both Minitab & Excel Can Be Used to

Find Normal Probabilities

Find $P(X < 9)$ where X is normal with a mean of 7 and a standard deviation of 2

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 99

Evaluating Normality (1 of 3)

• Not all continuous distributions are normal

• It is important to evaluate how well the data set is

approximated by a normal distribution.

• Normally distributed data should approximate the

theoretical normal distribution:

– The normal distribution is bell shaped (symmetrical)

where the mean is equal to the median.

– The empirical rule applies to the normal distribution.

– The interquartile range of a normal distribution is 1.33

standard deviations.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 100

Evaluating Normality (2 of 3)

Comparing data characteristics to theoretical

properties

• Construct charts or graphs

– For small- or moderate-sized data sets, construct a stem-and-

leaf display or a boxplot to check for symmetry

– For large data sets, does the histogram or polygon appear

bell-shaped?

• Compute descriptive summary measures

– Do the mean, median and mode have similar values?

– Is the interquartile range approximately $1.33\sigma$?

– Is the range approximately $6\sigma$?

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 101

Evaluating Normality (3 of 3)

Comparing data characteristics to theoretical properties

• Observe the distribution of the data set

– Do approximately 2/3 of the observations lie within mean $\pm$ 1 standard deviation?

– Do approximately 80% of the observations lie within mean $\pm$ 1.28 standard deviations?

– Do approximately 95% of the observations lie within mean $\pm$ 2 standard deviations?

• Evaluate normal probability plot

– Is the normal probability plot approximately linear (i.e. a straight

line) with positive slope?

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 102

Constructing a Normal Probability

Plot

• Normal probability plot

– Arrange data into ordered array

– Find corresponding standardized normal quantile

values (Z)

– Plot the pairs of points with observed data values (X)

on the vertical axis and the standardized normal

quantile values (Z) on the horizontal axis

– Evaluate the plot for evidence of linearity
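
The construction steps above can be sketched as follows. Two things here are assumptions, not taken from the slides: the plotting convention (the i-th ordered value is paired with the Z value whose cumulative area is i/(n + 1)) and the sample data, which are hypothetical.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Return (Z quantile, observed X) pairs for a normal probability plot."""
    ordered = sorted(data)                      # step 1: ordered array
    n = len(ordered)
    std_normal = NormalDist()
    # Step 2: Z value whose cumulative area is i / (n + 1)  (assumed convention)
    quantiles = [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    # Step 3: pair Z (horizontal axis) with X (vertical axis)
    return list(zip(quantiles, ordered))


sample = [21.7, 18.4, 25.1, 19.9, 23.3, 20.6, 22.4]  # hypothetical data
points = normal_plot_points(sample)
for z, x in points:
    print(f"Z = {z:+.3f}   X = {x}")
```

Plotting these pairs and judging whether they fall near a straight line is the final step; statistical packages may use a slightly different plotting position.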

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 103

The Normal Probability Plot

Interpretation

A normal probability plot for data from a normal

distribution will be approximately linear:

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 104

Normal Probability Plot Interpretation

Nonlinear plots

indicate a deviation

from normality

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 105

Evaluating Normality an Example:

Mutual Fund Returns (1 of 2)

The boxplot is skewed to the right.

(The normal distribution is symmetric.)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 106

Evaluating Normality an Example:

Mutual Fund Returns (2 of 2)

Descriptive Statistics: 3YrReturn%

Mean                  21.84
Median                21.65
Mode                  21.74
Minimum               3.39
Maximum               62.91
Range                 59.52
Variance              41.2968
Standard Deviation    6.4263
Coeff. of Variation   29.43%
Skewness              1.6976
Kurtosis              8.4670
Count                 318
Standard Error        0.3604

• The mean (21.84) is approximately the same as the median

(21.65). (In a normal distribution the mean and median are

equal.)

• The interquartile range of 6.98 is approximately 1.09

standard deviations. (In a normal distribution the interquartile

range is 1.33 standard deviations.)

• The range of 59.52 is equal to 9.26 standard deviations. (In

a normal distribution the range is 6 standard deviations.)

• 77.04% of the observations are within 1 standard deviation

of the mean. (In a normal distribution this percentage is

68.26%.)

• 86.79% of the observations are within 1.28 standard

deviations of the mean. (In a normal distribution this

percentage is 80%.)

• 96.86% of the observations are within 2 standard deviations

of the mean. (In a normal distribution this percentage is

95.44%.)

• The skewness statistic is 1.698 and the kurtosis statistic is

8.467. (In a normal distribution, each of these statistics

equals zero.)
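
The comparisons on this slide are simple ratios and normal-curve areas. The sketch below recomputes them from the summary values quoted above; the variable and function names are assumptions.

```python
from statistics import NormalDist

# Summary values quoted on the slide for 3YrReturn%
std_dev = 6.4263
iqr = 6.98
data_range = 59.52

iqr_in_sd = iqr / std_dev
range_in_sd = data_range / std_dev
print(f"IQR   = {iqr_in_sd:.2f} standard deviations (slide benchmark: 1.33)")
print(f"range = {range_in_sd:.2f} standard deviations (slide benchmark: 6)")


def normal_share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)


# Observed shares reported on the slide, keyed by number of standard deviations
observed = {1.0: 0.7704, 1.28: 0.8679, 2.0: 0.9686}
for k, share in observed.items():
    print(f"within {k} sd: observed {share:.2%}, normal {normal_share_within(k):.2%}")
```

Each observed share exceeds its normal benchmark, which is consistent with values being more concentrated around the mean than a normal distribution would predict.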

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 107

Evaluating Normality via Excel an

Example: Mutual Fund Returns

Excel (quantile-quantile) normal probability plot

Plot is not a straight line and shows the distribution is skewed to the right. (The normal distribution appears as a straight line.)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 108

Evaluating Normality via Minitab an

Example: Mutual Fund Returns

Normal Probability Plot From Minitab

[Figure: Probability Plot of 3YrReturn% (Normal); horizontal axis: 3YrReturn%, vertical axis: Percent]

Plot is not a straight line, rises quickly in the beginning, rises slowly at the end and shows the distribution is skewed to the right.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 109

Evaluating Normality an Example:

Mutual Fund Returns

• Conclusions

– The returns are right-skewed

– The returns have more values concentrated around the

mean than expected

– The range is larger than expected

– Normal probability plot is not a straight line

– Overall, this data set greatly differs from the theoretical

properties of the normal distribution

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 110

Chapter Summary

In this chapter we discussed:

• Computing probabilities from the normal

distribution

• Using the normal distribution to solve business

problems

• Using the normal probability plot to determine

whether a set of data is approximately normally

distributed

Slide - 111

Business Statistics: A First Course

Seventh Edition

Chapter 7

Sampling Distributions

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 112

Objectives

1. The concept of the sampling distribution

2. To compute probabilities related to the sample

mean and the sample proportion

3. The importance of the Central Limit Theorem

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 113

Sampling Distributions

DCOVA

• A sampling distribution is a distribution of all of the possible values of a sample statistic for a given sample size selected from a population.

• For example, suppose you sample 50 students from your college regarding their mean GPA. If you obtain many different samples of size 50, you will generally compute a different mean for each sample. We are interested in the distribution of all potential mean GPAs we might calculate for any sample of 50 students.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 114

Developing a Sampling

Distribution (1 of 5)

DCOVA

• Assume there is a population

• Population size N = 4

• Random variable, X, is age of

individuals

• Values of X: 18, 20, 22, 24

(years)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 115

Developing a Sampling

Distribution (2 of 5)

DCOVA

Summary Measures for the Population Distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 116

Developing a Sampling

Distribution (3 of 5)

DCOVA

Now consider all possible samples of size n = 2

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 117

Developing a Sampling

Distribution (4 of 5)

DCOVA

Sampling Distribution of All Sample Means

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 118

Developing a Sampling

Distribution (5 of 5)

DCOVA

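Summary Measures of this Sampling Distribution:

$$\mu_{\bar{X}} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$

Note: Here we divide by 16 because there are 16 different samples of size 2.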
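
The full sampling distribution for this small population can be enumerated. The sketch assumes the 16 samples are ordered pairs drawn with replacement (4 × 4 = 16), which is the reading consistent with the count given above.

```python
from itertools import product
from math import sqrt

population = [18, 20, 22, 24]  # ages from the example

# All ordered samples of size 2 drawn with replacement: 4 * 4 = 16 samples
samples = list(product(population, repeat=2))
sample_means = [sum(s) / len(s) for s in samples]

# Mean and standard deviation of the 16 sample means (divide by 16, not 15)
mu_xbar = sum(sample_means) / len(sample_means)
sigma_xbar = sqrt(sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means))

print(len(samples), mu_xbar, round(sigma_xbar, 2))  # 16 21.0 1.58
```

The mean of the sample means equals the population mean of 21, and the spread (1.58) is smaller than the population standard deviation (2.236); it equals 2.236 divided by the square root of the sample size, 2.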

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 119

Comparing the Population Distribution

to the Sample Means Distribution

DCOVA

Population: N = 4, $\mu = 21$, $\sigma = 2.236$

Sample Means Distribution: n = 2, $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 120

Sample Mean Sampling Distribution:

Standard Error of the Mean

DCOVA

• Different samples of the same size from the same

population will yield different sample means

• A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

(This assumes that sampling is with replacement or sampling is without replacement from an infinite population)

• Note that the standard error of the mean decreases as the sample size increases

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 121

Sample Mean Sampling Distribution:

If the Population Is Normal

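DCOVA

• If a population is normal with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of $\bar{X}$ is also normally distributed with:

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$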

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 122

Z-Value for Sampling Distribution of

the Mean

DCOVA

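• Z-value for the sampling distribution of $\bar{X}$:

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where:

$\bar{X}$ = sample mean

$\mu$ = population mean

$\sigma$ = population standard deviation

n = sample size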

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 123

Sampling Distribution Properties (1 of 2)

DCOVA

$\mu_{\bar{X}} = \mu$ (i.e. $\bar{X}$ is unbiased)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 124

Sampling Distribution Properties (2 of 2)

DCOVA

As n increases, $\sigma_{\bar{X}}$ decreases

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 125

Determining an Interval Including a Fixed

Proportion of the Sample Means (1 of 2)

DCOVA

Find a symmetrically distributed interval around $\mu$ that will include 95% of the sample means when $\mu = 368$, $\sigma = 15$, and $n = 25$.

– Since the interval contains 95% of the sample means, 5% of the sample means will be outside the interval

– Since the interval is symmetric, 2.5% will be above the upper limit and 2.5% will be below the lower limit.

– From the standardized normal table, the Z score with 2.5% (0.0250) below it is −1.96 and the Z score with 2.5% (0.0250) above it is 1.96.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 126

Determining an Interval Including a Fixed

Proportion of the Sample Means (2 of 2)

DCOVA

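• Calculating the lower limit of the interval

$$\bar{X}_L = \mu + Z\frac{\sigma}{\sqrt{n}} = 368 + (-1.96)\frac{15}{\sqrt{25}} = 362.12$$

• Calculating the upper limit of the interval

$$\bar{X}_U = \mu + Z\frac{\sigma}{\sqrt{n}} = 368 + (1.96)\frac{15}{\sqrt{25}} = 373.88$$

• 95% of all sample means of sample size 25 are between 362.12 and 373.88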
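
The interval limits can be recomputed in a few lines. This sketch obtains the Z value from `statistics.NormalDist.inv_cdf` instead of the table; the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368.0, 15.0, 25      # values from the example

se = sigma / sqrt(n)                 # standard error of the mean: 15 / 5 = 3.0
z = NormalDist().inv_cdf(0.975)      # Z with 2.5% above it, about 1.96

lower = mu - z * se
upper = mu + z * se
print(f"95% of sample means fall between {lower:.2f} and {upper:.2f}")
```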

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 127

Sample Mean Sampling Distribution:

If the Population Is Not Normal (1 of 2)

DCOVA

• We can apply the Central Limit Theorem:

– Even if the population is not normal, sample means from the population will be approximately normal as long as the sample size is large enough.

Properties of the sampling distribution:

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 128

Central Limit Theorem

DCOVA

As the sample size gets large enough, the sampling distribution of the sample mean becomes almost normal regardless of the shape of the population.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 129

Sample Mean Sampling Distribution:

If the Population Is Not Normal (2 of 2)

DCOVA

Sampling distribution properties:

Central Tendency: $\mu_{\bar{X}} = \mu$

Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 130

How Large Is Large Enough?

DCOVA

• For most distributions, n > 30 will give a sampling

distribution that is nearly normal

• For fairly symmetric distributions, n > 15

• For a normal population distribution, the sampling

distribution of the mean is always normally

distributed
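
A small simulation can illustrate the Central Limit Theorem for a clearly non-normal population. Everything in this sketch is an assumption made for illustration: the exponential population (mean 1, standard deviation 1), the sample size of 30, the 5,000 repetitions, and the fixed random seed.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(42)  # fixed seed so the run is repeatable


def sample_mean(n: int) -> float:
    """Mean of n draws from a right-skewed exponential population (mean 1, sd 1)."""
    return fmean(rng.expovariate(1.0) for _ in range(n))


n = 30
means = [sample_mean(n) for _ in range(5000)]

print(f"mean of sample means:   {fmean(means):.3f}  (population mean: 1.000)")
print(f"std. dev. of the means: {pstdev(means):.3f}  (sigma / sqrt(n): {1 / sqrt(n):.3f})")
```

The simulated sample means centre on the population mean and their spread is close to the standard error, even though individual observations are strongly right-skewed.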

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 131

Example 1 (1 of 3)

DCOVA

• Suppose a population has mean $\mu = 8$ and standard deviation $\sigma = 3$. Suppose a random sample of size n = 36 is selected.

• What is the probability that the sample mean is

between 7.8 and 8.2?

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 132

Example 1 (2 of 3)

DCOVA

Solution:

• Even if the population is not normally distributed,

the central limit theorem can be used (n > 30)

• … so the sampling distribution of $\bar{X}$ is approximately normal

• … with mean $\mu_{\bar{X}} = 8$

• … and standard deviation $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 133

Example 1 (3 of 3)

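DCOVA

Solution (continued):

$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.6554 - 0.3446 = 0.3108$$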
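
Example 1 can be checked by treating the sample mean as a normal variable with mean 8 and standard error 0.5. This is a sketch with assumed variable names, relying on the normal approximation from the Central Limit Theorem.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 3.0, 36          # values from Example 1

# Approximate sampling distribution of the sample mean
xbar_dist = NormalDist(mu, sigma / sqrt(n))   # standard error = 3 / 6 = 0.5

prob = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)
print(f"P(7.8 < sample mean < 8.2) = {prob:.4f}")
```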

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 134

Population Proportions

DCOVA

π = the proportion of the population having some

characteristic

• Sample proportion (p) provides an estimate of $\pi$:

$$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$

• $0 \le p \le 1$

• p is approximately distributed as a normal distribution when n is large

(assuming sampling with replacement from a finite population or without replacement from an infinite population)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 135

Sampling Distribution of p

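DCOVA

• Approximated by a normal distribution if:

$$n\pi \ge 5 \quad \text{and} \quad n(1 - \pi) \ge 5$$

where

$$\mu_p = \pi \quad \text{and} \quad \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

(where $\pi$ = population proportion)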

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 136

Z-Value for Proportions

DCOVA

Standardize p to a Z value with the formula:

$$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 137

Example 2 (1 of 3)

DCOVA

• If the true proportion of voters who support Proposition A is $\pi = 0.4$, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?

• i.e.: if $\pi = 0.4$ and n = 200, what is $P(0.40 \le p \le 0.45)$?

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 138

Example 2 (2 of 3)

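DCOVA

• if $\pi = 0.4$ and n = 200, what is $P(0.40 \le p \le 0.45)$?

Find $\sigma_p$:

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$$

Convert to standardized normal:

$$P(0.40 \le p \le 0.45) = P\left(\frac{0.40 - 0.40}{0.03464} \le Z \le \frac{0.45 - 0.40}{0.03464}\right) = P(0 \le Z \le 1.44)$$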

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 139

Example 2 (3 of 3)

DCOVA

• if $\pi = 0.4$ and n = 200, what is $P(0.40 \le p \le 0.45)$?

Utilize the cumulative normal table:

$$P(0 \le Z \le 1.44) = 0.9251 - 0.5000 = 0.4251$$
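
Example 2 can be reproduced with the normal approximation to the sampling distribution of the proportion. This is a sketch with assumed variable names; it first checks the two sample-size conditions and then computes the probability.

```python
from math import sqrt
from statistics import NormalDist

pop_prop, n = 0.4, 200               # population proportion and sample size

# Conditions for using the normal approximation
conditions_ok = n * pop_prop >= 5 and n * (1 - pop_prop) >= 5

sigma_p = sqrt(pop_prop * (1 - pop_prop) / n)   # standard error of the proportion
p_dist = NormalDist(pop_prop, sigma_p)

prob = p_dist.cdf(0.45) - p_dist.cdf(0.40)
print(f"conditions met: {conditions_ok}")
print(f"sigma_p = {sigma_p:.5f}")
print(f"P(0.40 <= p <= 0.45) = {prob:.4f}")
```

The result differs slightly from the table-based 0.4251 because the hand calculation rounds Z to 1.44 before the table lookup, while the code keeps the unrounded value.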

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 140

Chapter Summary

In this chapter we discussed:

• The concept of a sampling distribution

• Computing probabilities related to the sample

mean and the sample proportion

• The importance of the Central Limit Theorem

Slide - 141

Business Statistics: A First Course

Seventh Edition

Chapter 9

Fundamentals of Hypothesis Testing: One-Sample Tests

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 142

Objectives

1. The basic principles of hypothesis testing

2. How to use hypothesis testing to test a mean or

proportion

3. The assumptions of each hypothesis-testing procedure,

how to evaluate them, and the consequences if they are

seriously violated

4. Pitfalls & ethical issues involved in hypothesis testing

5. How to avoid the pitfalls involved in hypothesis testing

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 143

What Is a Hypothesis?

DCOVA

• A hypothesis is a claim (assertion)

about a population parameter:

– population mean

Example: The mean monthly cell phone bill

in this city is μ = $42

– population proportion

Example: The proportion of adults in this

city with cell phones is π = 0.68

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 144

The Null Hypothesis, H₀ (1 of 2)

DCOVA

• States the claim or assertion to be tested

Example: The mean diameter of a manufactured

bolt is 30 mm (H₀: μ = 30)

• Is always about a population parameter, not

about a sample statistic

– Correct: H₀: μ = 30; incorrect: H₀: X̄ = 30

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 145

The Null Hypothesis, H₀ (2 of 2)

DCOVA

• Begin with the assumption that the null

hypothesis is true

– Similar to the notion of innocent until

proven guilty

– Refers to the status quo or historical value

– Always contains the “=”, “≤”, or “≥” sign

– May or may not be rejected

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 146

The Alternative Hypothesis, H₁

DCOVA

• Is the opposite of the null hypothesis

– e.g., The mean diameter of a manufactured bolt

is not equal to 30 mm (H₁: μ ≠ 30)

• Challenges the status quo

• Never contains the “=”, “≤”, or “≥” sign

• May or may not be proven

• Is generally the hypothesis that the

researcher is trying to prove

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 147

The Hypothesis Testing Process (1 of 3)

DCOVA

• Claim: The population mean age is 50.

– H₀: μ = 50, H₁: μ ≠ 50

• Sample the population and find the sample mean.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 148

The Hypothesis Testing Process (2 of 3)

DCOVA

• Suppose the sample mean age was X̄ = 20.

• This is significantly lower than the claimed mean

population age of 50.

• If the null hypothesis were true, the probability of

getting such a different sample mean would be very

small, so you reject the null hypothesis.

• In other words, getting a sample mean of 20 is so

unlikely if the population mean were 50 that you conclude

that the population mean must not be 50.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 149

The Hypothesis Testing Process (3 of 3)

DCOVA

[Figure: sampling distribution of X̄]

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 150

The Test Statistic and Critical

Values (1 of 2)

DCOVA

• If the sample mean is close to the stated population

mean, the null hypothesis is not rejected.

• If the sample mean is far from the stated population

mean, the null hypothesis is rejected.

• How far is “far enough” to reject H₀?

• The critical value of a test statistic creates a “line

in the sand” for decision making -- it answers the

question of how far is far enough.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 151

The Test Statistic and Critical

Values (2 of 2)

DCOVA

Sampling Distribution of the test statistic

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 152

Risks in Decision Making Using

Hypothesis Testing

DCOVA

• Type I Error

– Reject a true null hypothesis

– A type I error is a “false alarm”

– The probability of a Type I Error is α

Called level of significance of the test

Set by researcher in advance

• Type II Error

– Failure to reject a false null hypothesis

– Type II error represents a “missed opportunity”

– The probability of a Type II Error is β
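The meaning of α can be illustrated by simulation. The sketch below repeatedly draws samples from a population in which the null hypothesis is true and counts how often a two-tail Z test rejects it anyway. The population values (mean 30, σ = 0.8), the sample size, the seed, and the function name are illustrative assumptions, not taken from the slide.

```python
import random
from math import sqrt


def type_i_error_rate(mu0=30.0, sigma=0.8, n=25, z_crit=1.96,
                      trials=5000, seed=1):
    """Fraction of samples drawn under a TRUE H0 that a two-tail Z test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z_stat = (x_bar - mu0) / (sigma / sqrt(n))
        if abs(z_stat) > z_crit:           # falls in either rejection region
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(rate)   # expected to land near alpha = 0.05
```

Each rejection counted here is a "false alarm", so the long-run rejection rate approximates the level of significance chosen in advance.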

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 153

Possible Errors in Hypothesis Test

Decision Making (1 of 2)

DCOVA

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 154

Possible Errors in Hypothesis Test

Decision Making (2 of 2)

DCOVA

• The confidence coefficient (1 − α) is the

probability of not rejecting H₀ when it is true.

• The confidence level of a hypothesis test is

(1 − α) × 100%.

• The power of a statistical test (1 − β) is the

probability of rejecting H₀ when it is false.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 155

Type I & II Error Relationship

DCOVA

• Type I and Type II errors cannot happen at the

same time.

– A Type I error can only occur if H₀ is true

– A Type II error can only occur if H₀ is false

• If Type I error probability (α) increases, then

Type II error probability (β) decreases

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 156

Factors Affecting Type II Error

DCOVA

• All else equal,

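– β increases when the difference between the hypothesized

parameter and its true value decreases

– β increases when α decreases

– β increases when σ increases

– β increases when n decreases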
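The "all else equal" behaviour of β can be explored numerically. The sketch below computes β for a two-tail Z test using the standard result that, when the true mean lies δ away from the hypothesized mean, the test statistic is normal with mean δ√n/σ and standard deviation 1. That formula, the function name, and all numeric inputs are assumptions made for illustration; they do not appear on the slide.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(delta, sigma, n, alpha):
    """Beta for a two-tail Z test when the true mean is `delta` from the hypothesized mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma        # true mean's distance in standard-error units
    # Probability that Z_STAT stays inside the nonrejection region
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


base = type_ii_error(delta=0.2, sigma=0.8, n=50, alpha=0.05)
cases = {
    "smaller difference": type_ii_error(delta=0.1, sigma=0.8, n=50, alpha=0.05),
    "smaller alpha": type_ii_error(delta=0.2, sigma=0.8, n=50, alpha=0.01),
    "larger sigma": type_ii_error(delta=0.2, sigma=1.2, n=50, alpha=0.05),
    "smaller n": type_ii_error(delta=0.2, sigma=0.8, n=25, alpha=0.05),
}

print(f"base: beta = {base:.3f}")
for label, beta in cases.items():
    print(f"{label}: beta = {beta:.3f}")
```

In each of the four variations only one input changes relative to the base case, and β comes out larger than in the base case.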

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 157

Level of Significance and the

Rejection Region

DCOVA

This is a two-tail test because there is a rejection region in both tails

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 158

Hypothesis Tests for the Mean

DCOVA

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 159

Z Test of Hypothesis for the Mean (σ

Known)

DCOVA

• Convert sample statistic (X̄) to a $Z_{STAT}$ test statistic: $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 160

Critical Value Approach to Testing

DCOVA

• For a two-tail test for the mean, σ known:

• Convert sample statistic (X̄) to test statistic ($Z_{STAT}$)

• Determine the critical Z values for a specified level

of significance α from a table or by using

computer software

• Decision Rule: If the test statistic falls in the rejection region, reject

H₀; otherwise do not reject H₀

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 161

Two-Tail Tests

DCOVA

• There are two

cutoff values

(critical values),

defining the

regions of

rejection

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 162

6 Steps in Hypothesis Testing (1 of 2)

DCOVA

1. State the null hypothesis, H₀, and the alternative hypothesis, H₁

2. Choose the level of significance, α, and the sample size, n.

The level of significance is based on the relative

importance of Type I and Type II errors

3. Determine the appropriate test statistic and sampling

distribution

4. Determine the critical values that divide the rejection and

nonrejection regions

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 163

6 Steps in Hypothesis Testing (2 of 2)

DCOVA

5. Collect data and compute the value of the test statistic

6. Make the statistical decision and state the managerial

conclusion. If the test statistic falls into the nonrejection

region, do not reject the null hypothesis H₀. If the test

statistic falls into the rejection region, reject the null

hypothesis. Express the managerial conclusion in the

context of the problem

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 164

Hypothesis Testing Example (1 of 4)

DCOVA

Test the claim that the true mean diameter of a

manufactured bolt is 30 mm. (Assume σ = 0.8)

1. State the appropriate null and alternative hypotheses

– H₀: μ = 30, H₁: μ ≠ 30 (This is a two-tail test)

2. Specify the desired level of significance and the sample

size

– Suppose that α = 0.05 and n = 100 are chosen for

this test

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 165

Hypothesis Testing Example (2 of 4)

DCOVA

3. Determine the appropriate technique

– σ is assumed known so this is a Z test.

4. Determine the critical values

– For α = 0.05 the critical Z values are ±1.96

5. Collect the data and compute the test statistic

– Suppose the sample results are

n = 100, X̄ = 29.84 (σ = 0.8 is assumed known)

So the test statistic is:

$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{29.84 - 30}{0.8/\sqrt{100}} = \frac{-0.16}{0.08} = -2.0$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 166

Hypothesis Testing Example (3 of 4)

DCOVA

6. Is the test statistic in the rejection region?

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 167

Hypothesis Testing Example (4 of 4)

DCOVA

6. (continued). Reach a decision and interpret the result

Since $Z_{STAT} = -2.0 < -1.96$, reject the null hypothesis

and conclude there is sufficient evidence that the mean

diameter of a manufactured bolt is not equal to 30 mm
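The critical value approach for the bolt example can be sketched in a few lines of Python. The function name `z_test_two_tail` is hypothetical; the critical value is obtained from the standard library's `NormalDist.inv_cdf` instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tail(x_bar, mu0, sigma, n, alpha):
    """Return (Z_STAT, critical value, reject?) for a two-tail Z test with sigma known."""
    z_stat = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # upper critical value; lower is -z_crit
    return z_stat, z_crit, abs(z_stat) > z_crit


z_stat, z_crit, reject = z_test_two_tail(x_bar=29.84, mu0=30, sigma=0.8, n=100, alpha=0.05)
print(round(z_stat, 2), round(z_crit, 2), reject)
```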

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 168

p-Value Approach to Testing

DCOVA

• p-value: Probability of obtaining a test statistic

equal to or more extreme than the observed

sample value given H₀ is true

– The p-value is also called the observed level of

significance

– It is the smallest value of α for which H₀ can be rejected

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 169

p-Value Approach to Testing:

Interpreting the p-Value

DCOVA

• Compare the p-value with α

– If p-value < α, reject H₀

– If p-value ≥ α, do not reject H₀

• Remember

– If the p-value is low then H₀ must go

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 170

The 5 Step p-Value Approach to

Hypothesis Testing

DCOVA

1. State the null hypothesis, H₀, and the alternative hypothesis, H₁

2. Choose the level of significance, α, and the sample size, n. The level

of significance is based on the relative importance of the risks of a type

I and a type II error.

3. Determine the appropriate test statistic and sampling distribution

4. Collect data and compute the value of the test statistic and the p-value

5. Make the statistical decision and state the managerial

conclusion. If the p-value is < α then reject H₀, otherwise do not

reject H₀. State the managerial conclusion in the context of the problem

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 171

p-Value Hypothesis Testing

Example (1 of 2)

DCOVA

Test the claim that the true mean diameter of a

manufactured bolt is 30 mm. (Assume σ = 0.8)

1. State the appropriate null and alternative hypotheses

– H₀: μ = 30, H₁: μ ≠ 30 (This is a two-tail test)

2. Specify the desired level of significance and the sample

size

– Suppose that α = 0.05 and n = 100 are chosen for

this test

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 172

p-Value Hypothesis Testing

Example (2 of 2)

DCOVA

3. Determine the appropriate technique

– σ is assumed known so this is a Z test.

4. Collect the data, compute the test statistic and the p-value

– Suppose the sample results are

n = 100, X̄ = 29.84 (σ = 0.8 is assumed known)

So the test statistic is:

$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{29.84 - 30}{0.8/\sqrt{100}} = \frac{-0.16}{0.08} = -2.0$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 173

p-Value Hypothesis Testing Example:

Calculating the p-Value

DCOVA

4. (continued) Calculate the p-value.

– How likely is it to get a $Z_{STAT}$ of −2 (or something

further from the mean (0), in either direction) if H₀ is

true?

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 174

p-Value Hypothesis Testing Example

DCOVA

5. Is the p-value < α?

– Since p-value = 0.0456 < α = 0.05, reject H₀

5. (continued) State the managerial conclusion in

the context of the situation.

– There is sufficient evidence to conclude the mean diameter of a

manufactured bolt is not equal to 30mm.
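The two-tail p-value in this example can be reproduced with the standard normal CDF. This is a minimal sketch; `two_tail_p_value` is an illustrative name. The unrounded result is about 0.0455, while the slide's 0.0456 comes from doubling the four-decimal table value 0.0228.

```python
from statistics import NormalDist


def two_tail_p_value(z_stat):
    """Probability of a Z at least as far from 0 as z_stat, in either direction."""
    return 2 * NormalDist().cdf(-abs(z_stat))


alpha = 0.05
p_value = two_tail_p_value(-2.0)
print(round(p_value, 4), p_value < alpha)
```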

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 175

Connection Between Two Tail Tests

and Confidence Intervals

DCOVA

• For X̄ = 29.84, σ = 0.8 and n = 100, the 95% confidence interval is:

$29.84 - (1.96)\frac{0.8}{\sqrt{100}}$ to $29.84 + (1.96)\frac{0.8}{\sqrt{100}}$

29.6832 ≤ μ ≤ 29.9968

• Since this interval does not contain the

hypothesized mean (30), we reject the null

hypothesis at α = 0.05
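The link between the interval and the test decision can be checked directly: the null hypothesis is rejected exactly when the hypothesized mean falls outside the interval. A small sketch follows, with `z_confidence_interval` as a hypothetical helper name and 1.96 taken from the slides.

```python
from math import sqrt


def z_confidence_interval(x_bar, sigma, n, z_crit=1.96):
    """Two-sided confidence interval for the mean when sigma is known."""
    margin = z_crit * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


low, high = z_confidence_interval(x_bar=29.84, sigma=0.8, n=100)
mu0 = 30
reject = not (low <= mu0 <= high)   # same decision as the two-tail test at alpha = 0.05
print(round(low, 4), round(high, 4), reject)
```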

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 176

Do You Ever Truly Know σ?

DCOVA

• Probably not!

• In virtually all real world business situations, σ is

not known.

• If there is a situation where σ is known then μ is

also known (since to calculate σ you need to know

μ.)

• If you truly know μ there would be no need to

gather a sample to estimate it.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 177

Hypothesis Testing: σ Unknown

DCOVA

• If the population standard deviation is unknown, you

instead use the sample standard deviation S.

• Because of this change, you use the t distribution instead

of the Z distribution to test the null hypothesis about the

mean.

• When using the t distribution you must assume the

population you are sampling from follows a normal

distribution.

• All other steps, concepts, and conclusions are the same.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 178

t Test of Hypothesis for the Mean (σ

Unknown)

DCOVA

• Convert sample statistic (X̄) to a $t_{STAT}$ test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 179

Example 1: Two-Tail Test (σ Unknown)

DCOVA

• The average cost of a hotel room

in New York is said to be $168 per

night. To determine if this is true,

a random sample of 25 hotels is

taken and resulted in an X̄ of

$172.50 and an S of $15.40. Test

the appropriate hypotheses at α =

0.05.

(Assume the population distribution is normal)

H₀: μ = 168

H₁: μ ≠ 168

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 180

Example 1 Solution: Two-Tail t Test

H₀: μ = 168

H₁: μ ≠ 168

• α = 0.05

• n = 25, df = 25 − 1 = 24

• σ is unknown, so

use a t statistic

$t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{172.50 - 168}{15.40/\sqrt{25}} = 1.46$

• Critical Value:

$\pm t_{24,\,0.025} = \pm 2.0639$

Do not reject H₀: insufficient evidence that

true mean cost is different from $168
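The hotel example can be reproduced with a short calculation. Python's standard library has no t distribution, so this sketch takes the critical value 2.0639 from the slide as a constant instead of computing it; the function and constant names are illustrative.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic, used when sigma is unknown."""
    return (x_bar - mu0) / (s / sqrt(n))


T_CRIT_24_0025 = 2.0639   # t critical value for df = 24, alpha/2 = 0.025 (from the slide)

t_stat = t_statistic(x_bar=172.50, mu0=168, s=15.40, n=25)
reject = abs(t_stat) > T_CRIT_24_0025
print(round(t_stat, 2), reject)
```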

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 181

To Use the t-test Must Assume the

Population Is Normal

DCOVA

• As long as the sample size is not very small and

the population is not very skewed, the t-test can

be used.

• To evaluate the normality assumption:

– Determine how closely sample statistics match the

normal distribution’s theoretical properties.

– Construct a histogram or stem-and-leaf display or

boxplot or a normal probability plot.

– Section 6.3 has more details on evaluating normality.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 182

Connection of Two Tail Tests to

Confidence Intervals

DCOVA

• For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval for µ is:

$172.5 - (2.0639)\frac{15.4}{\sqrt{25}}$ to $172.5 + (2.0639)\frac{15.4}{\sqrt{25}}$

166.14 ≤ μ ≤ 178.86

• Since this interval contains the hypothesized mean (168), we do not reject the null hypothesis at

α = 0.05

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 183

One-Tail Tests

DCOVA

• In many cases, the alternative hypothesis focuses on a

particular direction

H₀: μ ≥ 3

H₁: μ < 3

This is a lower-tail test since the

alternative hypothesis is focused on

the lower tail below the mean of 3

H₀: μ ≤ 3

H₁: μ > 3

This is an upper-tail test since the

alternative hypothesis is focused on

the upper tail above the mean of 3

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 184

Lower-Tail Tests

DCOVA

• There is only one

critical value, since

the rejection area is

in only one tail

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 185

Upper-Tail Tests

DCOVA

• There is only one

critical value,

since the

rejection area is

in only one tail

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 186

Example 2: Upper-Tail t Test for Mean (σ Unknown)

DCOVA

A phone industry manager thinks that customer monthly cell

phone bills have increased, and now average over $52 per

month. The company wishes to test this claim. (Assume a

normal population)

Form hypothesis test:

H₀: μ ≤ 52 (the mean is not over $52 per month)

H₁: μ > 52 (the mean is greater than $52 per month, i.e., sufficient evidence exists to support the manager’s claim)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 187

Example 3: Find Rejection Region

DCOVA

• Suppose that α = 0.10 is chosen for this test and n = 25.

Find the rejection region:

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 188

Example 4: Test Statistic

DCOVA

Obtain sample and compute the test statistic.

Suppose a sample is taken with the following results: n = 25,

X̄ = 53.1, and S = 10

– Then the test statistic is:

$t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{53.1 - 52}{10/\sqrt{25}} = 0.55$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 189

Example 5: Decision

DCOVA

Reach a decision and interpret the result:

Do not reject H₀ since $t_{STAT} = 0.55 < 1.318$;

there is not sufficient evidence that the

mean bill is over $52

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 190

Example 6: Utilizing The p-Value for

the Test

DCOVA

• Calculate the p-value and compare to α (p-value below

calculated using Excel spreadsheet on next page)

Do not reject H₀ since p-value = 0.2937 > α = 0.10
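The slides obtain this p-value from Excel. As a cross-check that needs no spreadsheet, the sketch below integrates the Student's t density numerically with Simpson's rule. The helper names, the choice of Simpson's rule, and the step count are assumptions made for illustration; a statistics package would normally supply the t CDF directly.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t_stat, df, steps=1000):
    """P(T > t_stat) for t_stat >= 0, via Simpson's rule on [0, t_stat]."""
    h = t_stat / steps                     # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3             # half the area lies above 0 by symmetry


n, x_bar, s, mu0, alpha = 25, 53.1, 10, 52, 0.10
t_stat = (x_bar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)
print(round(t_stat, 2), round(p_value, 4), p_value < alpha)
```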

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 191

Hypothesis Tests for Proportions (1 of 2)

DCOVA

• Involves categorical variables

• Two possible outcomes

– Possesses characteristic of interest

– Does not possess characteristic of interest

• Fraction or proportion of the population in the

category of interest is denoted by π

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 192

Proportions

DCOVA

• Sample proportion in the category of interest is denoted

by p

– $p = \frac{X}{n} = \frac{\text{number in category of interest in sample}}{\text{sample size}}$

• When both nπ and n(1 − π) are at least 5, p can be

approximated by a normal distribution with

mean and standard deviation:

– $\mu_p = \pi$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 193

Hypothesis Tests for Proportions (2 of 2)

DCOVA

• The sampling distribution of p is approximately normal, so the test statistic is a $Z_{STAT}$ value:

$Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1-\pi)}{n}}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 194

Z Test for Proportion in Terms of

Number in Category of Interest

DCOVA

• An equivalent form to the last slide, but in terms of the number in the category of interest, X:

$Z_{STAT} = \frac{X - n\pi}{\sqrt{n\pi(1-\pi)}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 195

Example 7: Z Test for Proportion

DCOVA

• A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = 0.05 significance level.

Check:

nπ = (500)(0.08) = 40

n(1 − π) = (500)(0.92) = 460

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 196

Z Test for Proportion: Solution

DCOVA

H₀: π = 0.08

H₁: π ≠ 0.08

n = 500, p = 0.05

Critical Values: ±1.96

Test Statistic:

$Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1-\pi)}{n}}} = \frac{0.05 - 0.08}{\sqrt{\frac{0.08(1-0.08)}{500}}} = -2.47$

Decision:

Reject H₀ at α = 0.05

Conclusion:

There is sufficient evidence to reject the company’s claim of 8% response rate.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 197

p-Value Solution

DCOVA

Calculate the p-value and compare to α (For a two-tail test the p-value is always two-tail.)

p-value = 0.0136:

P(Z ≤ −2.47) + P(Z ≥ 2.47) = 2(0.0068) = 0.0136

Reject H₀ since p-value = 0.0136 < α = 0.05
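The mailing example, including the sample-size check, can be sketched as a single function. `proportion_z_test` is a hypothetical name. Because Z is not rounded to −2.47 before the lookup, the p-value comes out near 0.0134 and not exactly the table-based 0.0136.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, pi0):
    """Two-tail Z test for a proportion; returns (Z_STAT, p-value)."""
    # The normal approximation requires n*pi and n*(1 - pi) to be at least 5
    if n * pi0 < 5 or n * (1 - pi0) < 5:
        raise ValueError("sample too small for the normal approximation")
    p = x / n
    z_stat = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    p_value = 2 * NormalDist().cdf(-abs(z_stat))
    return z_stat, p_value


z_stat, p_value = proportion_z_test(x=25, n=500, pi0=0.08)
print(round(z_stat, 2), round(p_value, 4), p_value < 0.05)
```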

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 198

Questions to Address in the Planning

Stage

• What is the goal of the survey, study, or experiment?

• How can you translate this goal into a null and an alternative

hypothesis?

• Is the hypothesis test one or two tailed?

• Can a random sample be selected?

• What types of data will be collected? Numerical? Categorical?

• What level of significance should be used?

• Is the intended sample size large enough to achieve the desired

power?

• What statistical test procedure should be used and why?

• What conclusions & interpretations can you reach from the results of

the planned hypothesis test?

Failing to consider these questions can lead to biased or incomplete results

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 199

Statistical Significance versus Practical

Significance

• Statistically significant results (rejecting the null

hypothesis) are not always of practical

significance

– This is more likely to happen when the sample size

gets very large

• Practically important results might be found to be

statistically insignificant (failing to reject the null

hypothesis)

– This is more likely to happen when the sample size is

relatively small

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 200

Reporting Findings & Ethical Issues

• Should document & report both good & bad results

• Should not just report statistically significant results

• Reports should distinguish between poor research

methodology and unethical behavior

• Ethical issues can arise in:

– The use of human subjects

– The data collection method

– The type of test being used

– The level of significance being used

– The cleansing and discarding of data

– The failure to report pertinent findings

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 201

Chapter Summary

• In this chapter we discussed:

• The basic principles of hypothesis testing

• How to use hypothesis testing to test a mean or proportion

• The assumptions of each hypothesis-testing procedure,

how to evaluate them, and the consequences if they are

seriously violated

• Pitfalls & ethical issues involved in hypothesis testing

• How to avoid the pitfalls involved in hypothesis testing

Slide - 202

Business Statistics: A First Course

Seventh Edition

Chapter 10

Two-Sample Tests and One-Way ANOVA

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 203

Objectives

1. How to use hypothesis testing for comparing the

difference between

– The means of two independent populations

– The means of two related populations

– The proportions of two independent populations

– The variances of two independent populations

– The means of more than two populations

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 204

Two-Sample Tests

DCOVA

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 205

Difference Between Two Means

DCOVA

Goal: Test hypothesis

or form a confidence

interval for the

difference between

two population means,

μ₁ − μ₂

The point estimate for the

difference is

X̄₁ − X̄₂

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 206

Difference Between Two Means:

Independent Samples

DCOVA

• Different data sources

– Unrelated

– Independent

Sample selected from one population has no effect

on the sample selected from the other population

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 207

Hypothesis Tests for Two Population

Means

DCOVA

Two Population Means, Independent Samples

Lower-tail test:

H₀: μ₁ ≥ μ₂

H₁: μ₁ < μ₂

i.e.

H₀: μ₁ − μ₂ ≥ 0

H₁: μ₁ − μ₂ < 0

Upper-tail test:

H₀: μ₁ ≤ μ₂

H₁: μ₁ > μ₂

i.e.

H₀: μ₁ − μ₂ ≤ 0

H₁: μ₁ − μ₂ > 0

Two-tail test:

H₀: μ₁ = μ₂

H₁: μ₁ ≠ μ₂

i.e.

H₀: μ₁ − μ₂ = 0

H₁: μ₁ − μ₂ ≠ 0

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 208

Hypothesis Tests for μ₁ − μ₂

DCOVA

Two Population Means, Independent Samples

Lower-tail test:

H₀: μ₁ − μ₂ ≥ 0

H₁: μ₁ − μ₂ < 0

Reject H₀ if $t_{STAT} < -t_{\alpha}$

Upper-tail test:

H₀: μ₁ − μ₂ ≤ 0

H₁: μ₁ − μ₂ > 0

Reject H₀ if $t_{STAT} > t_{\alpha}$

Two-tail test:

H₀: μ₁ − μ₂ = 0

H₁: μ₁ − μ₂ ≠ 0

Reject H₀ if $t_{STAT} < -t_{\alpha/2}$

or $t_{STAT} > t_{\alpha/2}$
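The three rejection rules on this slide can be written as one small function. The sketch below assumes the caller supplies the positive critical value (t_α for a one-tail test, t_{α/2} for a two-tail test); the function name, the `tail` labels, and the sample numbers are hypothetical.

```python
def reject_h0(t_stat, t_crit, tail):
    """Apply the rejection rule for a lower-, upper-, or two-tail t test.

    t_crit is the positive critical value for the chosen tail type.
    """
    if tail == "lower":
        return t_stat < -t_crit
    if tail == "upper":
        return t_stat > t_crit
    if tail == "two":
        return t_stat < -t_crit or t_stat > t_crit
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


decisions = {tail: reject_h0(-2.5, 2.0, tail) for tail in ("lower", "upper", "two")}
print(decisions)
```

With a test statistic of −2.5 and a critical value of 2.0, the lower-tail and two-tail rules reject while the upper-tail rule does not.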

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 209

Hypothesis Tests for μ₁ − μ₂ with σ₁ and σ₂ Unknown and Assumed Equal (1 of 2)

DCOVA

Assumptions:

• Samples are randomly and

independently drawn.

• Populations are normally

distributed or both sample

sizes are at least 30.

• Population variances are

unknown but assumed equal.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 210

Hypothesis Tests for $\mu_1 - \mu_2$ with $\sigma_1$ and $\sigma_2$ Unknown and Assumed Equal (2 of 2)

DCOVA

• The pooled variance is:

$S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)}$

• The test statistic is:

$t_{STAT} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$

• Where $t_{STAT}$ has d.f. $= n_1 + n_2 - 2$


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 211

Confidence Interval for $\mu_1 - \mu_2$ with $\sigma_1$ and $\sigma_2$ Unknown and Assumed Equal

DCOVA

The confidence interval for $\mu_1 - \mu_2$ is:

$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$

Where $t_{\alpha/2}$ has d.f. $= n_1 + n_2 - 2$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 212

Pooled-Variance t Test Example

DCOVA

You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

                        NYSE    NASDAQ
Number                    21        25
Sample mean             3.27      2.53
Sample std. deviation   1.30      1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 213

Pooled-Variance t Test Example:

Calculating the Test Statistic

DCOVA

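$H_0: \mu_1 - \mu_2 = 0$ i.e. $(\mu_1 = \mu_2)$

$H_1: \mu_1 - \mu_2 \ne 0$ i.e. $(\mu_1 \ne \mu_2)$

The test statistic is:

$t_{STAT} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \dfrac{(3.27 - 2.53) - 0}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040$

$S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \dfrac{(21 - 1)\,1.30^2 + (25 - 1)\,1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$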

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 214

Pooled-Variance t Test Example:

Hypothesis Test Solution

DCOVA

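$H_0: \mu_1 - \mu_2 = 0$ i.e. $(\mu_1 = \mu_2)$

$H_1: \mu_1 - \mu_2 \ne 0$ i.e. $(\mu_1 \ne \mu_2)$

$\alpha = 0.05$

df $= 21 + 25 - 2 = 44$

Critical Values: $t = \pm 2.0154$

Test Statistic:

$t_{STAT} = \dfrac{3.27 - 2.53}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040$

Decision:

Reject $H_0$ at $\alpha = 0.05$

Conclusion:

There is evidence of a difference in means.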
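The pooled-variance calculation above can be checked with a short Python sketch. The function name `pooled_t` is hypothetical (not from the slides), the inputs are the summary statistics that appear in the worked calculation, and the critical value 2.0154 is taken from the slide rather than computed.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference between the two sample means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / se
    return sp2, t_stat, df

# Summary statistics from the NYSE / NASDAQ example
sp2, t_stat, df = pooled_t(3.27, 1.30, 21, 2.53, 1.16, 25)
t_crit = 2.0154  # two-tail critical value, alpha = 0.05, 44 d.f. (from the slide)
reject_h0 = abs(t_stat) > t_crit
print(round(sp2, 4), round(t_stat, 3), df, reject_h0)
```

The result should agree with the slide values (pooled variance about 1.5021, test statistic about 2.040 with 44 d.f.), so the two-tail rule rejects the null hypothesis.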


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 215

Pooled-Variance t Test Example: Confidence Interval for $\mu_1 - \mu_2$

DCOVA

Since we rejected $H_0$, can we be 95% confident that $\mu_{NYSE} > \mu_{NASDAQ}$?

95% Confidence Interval for $\mu_{NYSE} - \mu_{NASDAQ}$:

$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009, 1.471)$

Since 0 is less than the entire interval, we can be 95% confident that $\mu_{NYSE} > \mu_{NASDAQ}$.
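As a rough check of the interval above, the following sketch recomputes it from the same summary statistics. The helper name `pooled_ci` is an illustrative assumption, and the critical value is passed in from the slide instead of being looked up in code.

```python
import math

def pooled_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 under the equal-variance assumption."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - margin, diff + margin

# t_crit = 2.0154 is the slide's value for 95% confidence with 44 d.f.
low, high = pooled_ci(3.27, 1.30, 21, 2.53, 1.16, 25, t_crit=2.0154)
print(round(low, 3), round(high, 3))
```

Both endpoints are positive, which matches the slide's reading that 0 lies below the whole interval.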

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 216

Hypothesis Tests for $\mu_1 - \mu_2$ with $\sigma_1$ and $\sigma_2$ Unknown, Not Assumed Equal

DCOVA

Assumptions:

• Samples are randomly and independently drawn

• Populations are normally distributed or both sample sizes are at least 30

• Population variances are unknown and cannot be assumed to be equal


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 217

Hypothesis Tests for $\mu_1 - \mu_2$ with $\sigma_1$ and $\sigma_2$ Unknown and Not Assumed Equal

DCOVA

The formulae for this test are not covered in this book.

See reference 8 from this chapter for more details.

This test utilizes two separate sample variances to estimate the degrees of freedom for the t test.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 218

Separate-Variance t Test Example

DCOVA

You are a financial analyst for a brokerage firm. Is there a

difference in dividend yield between stocks listed on the NYSE

& NASDAQ? You collect the following data:

Assuming both populations are approximately normal with unequal

variances, is there a difference in mean yield (α = 0.05)?


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 219

Separate-Variance t Test Example:

Calculating the Test Statistic

DCOVA

$H_0: \mu_1 - \mu_2 = 0$ i.e. $(\mu_1 = \mu_2)$

$H_1: \mu_1 - \mu_2 \ne 0$ i.e. $(\mu_1 \ne \mu_2)$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 220

Separate-Variance t Test Example:

Hypothesis Test Solution

DCOVA

$H_0: \mu_1 - \mu_2 = 0$ i.e. $(\mu_1 = \mu_2)$

$H_1: \mu_1 - \mu_2 \ne 0$ i.e. $(\mu_1 \ne \mu_2)$

$\alpha = 0.05$

df $= 40$

Critical Values: $t = \pm 2.021$

Test Statistic:

Decision:

Fail To Reject $H_0$ at $\alpha = 0.05$

Conclusion:

There is insufficient evidence of a difference in means.
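The slides do not give the separate-variance formulas, so the sketch below uses the standard Welch statistic with the Welch-Satterthwaite degrees of freedom. That choice is an assumption, not something taken from the deck. It also assumes the example reuses the summary statistics of the pooled-variance example (means 3.27 and 2.53, standard deviations 1.30 and 1.16, sample sizes 21 and 25), because the data table did not survive on this slide. The function name is hypothetical.

```python
import math

def separate_variance_t(mean1, s1, n1, mean2, s2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # per-sample variance of the mean
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, df

# Assumed inputs: same summary statistics as the pooled-variance example
t_stat, df = separate_variance_t(3.27, 1.30, 21, 2.53, 1.16, 25)
t_crit = 2.021  # critical value shown on the slide for 40 d.f.
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 3), round(df, 2), reject_h0)
```

Under these assumptions the degrees of freedom come out between 40 and 41 and the statistic falls just short of 2.021, which is consistent with the slide's decision not to reject the null hypothesis.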


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 221

Related Populations: the Paired Difference Test (1 of 2)

DCOVA

Related samples

Tests Means of 2 Related Populations

• Paired or matched samples.

• Repeated measures (before/after)

• Use difference between paired values:

$D_i = X_{1i} - X_{2i}$

• Eliminates Variation Among Subjects.

• Assumptions:

– Differences are normally distributed

– Or, if not Normal, use large samples

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 222

Related Populations: the Paired Difference Test (2 of 2)

DCOVA

Related samples

The ith paired difference is $D_i$, where

$D_i = X_{1i} - X_{2i}$

The point estimate for the paired difference population mean $\mu_D$ is $\bar{D}$:

$\bar{D} = \dfrac{\sum_{i=1}^{n} D_i}{n}$

The sample standard deviation is $S_D$:

$S_D = \sqrt{\dfrac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$

n is the number of pairs in the paired sample


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 223

The Paired Difference Test: Finding $t_{STAT}$

DCOVA

Paired samples

• The test statistic for $\mu_D$ is

$t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$

• Where $t_{STAT}$ has n − 1 d.f.

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 224

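The Paired Difference Test: Possible Hypotheses

DCOVA

Paired Samples

Lower-tail test:

$H_0: \mu_D \ge 0$

$H_1: \mu_D < 0$

Reject $H_0$ if $t_{STAT} < -t_{\alpha}$

Upper-tail test:

$H_0: \mu_D \le 0$

$H_1: \mu_D > 0$

Reject $H_0$ if $t_{STAT} > t_{\alpha}$

Two-tail test:

$H_0: \mu_D = 0$

$H_1: \mu_D \ne 0$

Reject $H_0$ if $t_{STAT} < -t_{\alpha/2}$ or $t_{STAT} > t_{\alpha/2}$

Where $t_{STAT}$ has n − 1 d.f.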


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 225

The Paired Difference Confidence Interval

DCOVA

Paired samples

The confidence interval for $\mu_D$ is

$\bar{D} \pm t_{\alpha/2} \dfrac{S_D}{\sqrt{n}}$

where

$S_D = \sqrt{\dfrac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 226

Paired Difference Test: Example

DCOVA

• Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

$\bar{D} = \dfrac{\sum D_i}{n} = -4.2$

$S_D = \sqrt{\dfrac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 227

Paired Difference Test: Solution

DCOVA

• Has the training made a difference in the number of complaints (at the 0.01 level)?

$H_0: \mu_D = 0$

$H_1: \mu_D \ne 0$

$\alpha = .01$, $\bar{D} = -4.2$

$t_{0.005} = \pm 4.604$

d.f. $= n - 1 = 4$

Test Statistic:

$t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \dfrac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$

Decision: Do not reject $H_0$ ($t_{STAT}$ is not in the rejection region).

Conclusion: There is insufficient evidence of a change in the number of complaints.
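The paired test statistic can be reproduced from the summary values alone. The sketch below takes the mean difference as −4.2 (a negative sign, meaning fewer complaints after training, is an assumption about the sign convention that is consistent with the reported interval), and uses the critical value 4.604 from the slide. The function name `paired_t` is hypothetical.

```python
import math

def paired_t(d_bar, s_d, n, mu_d=0.0):
    """Paired-difference t statistic and its degrees of freedom."""
    t_stat = (d_bar - mu_d) / (s_d / math.sqrt(n))
    return t_stat, n - 1

# Summary values from the complaints example (sign of d_bar assumed negative)
t_stat, df = paired_t(d_bar=-4.2, s_d=5.67, n=5)
t_crit = 4.604  # t for alpha/2 = 0.005 with 4 d.f. (from the slide)
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 2), df, reject_h0)
```

The statistic is well inside the non-rejection region, in line with the slide's decision.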

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 228

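The Paired Difference Confidence Interval -- Example

DCOVA

The confidence interval for $\mu_D$ is

$\bar{D} \pm t_{\alpha/2} \dfrac{S_D}{\sqrt{n}}$

$\bar{D} = -4.2$, $S_D = 5.67$

99% CI for $\mu_D$: $-4.2 \pm 4.604 \dfrac{5.67}{\sqrt{5}} = (-15.87, 7.47)$

Since this interval contains 0, you cannot conclude at the 99% confidence level that $\mu_D$ differs from 0.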
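A minimal sketch of the interval calculation follows, again assuming a mean difference of −4.2 and the slide's critical value of 4.604. The helper name `paired_ci` is illustrative only.

```python
import math

def paired_ci(d_bar, s_d, n, t_crit):
    """Confidence interval for the mean paired difference."""
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

low, high = paired_ci(d_bar=-4.2, s_d=5.67, n=5, t_crit=4.604)
contains_zero = low <= 0 <= high
print(round(low, 2), round(high, 2), contains_zero)
```

Because the interval spans 0, it leads to the same decision as the two-tail test at the 0.01 level.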


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 229

Two Population Proportions (1 of 3)

DCOVA

Population proportions

Goal: test a hypothesis or form a confidence interval for the difference between two population proportions, $\pi_1 - \pi_2$

Assumptions:

$n_1 \pi_1 \ge 5$, $n_1 (1 - \pi_1) \ge 5$

$n_2 \pi_2 \ge 5$, $n_2 (1 - \pi_2) \ge 5$

The point estimate for the difference is $p_1 - p_2$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 230

Two Population Proportions (2 of 3)

DCOVA

Population proportions

Under the null hypothesis we assume $\pi_1 = \pi_2$, so we pool the two sample estimates.

The pooled estimate for the overall proportion is:

$\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$

where $X_1$ and $X_2$ are the number of items of interest in samples 1 and 2


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 231

Two Population Proportions (3 of 3)

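DCOVA

Population proportions

The test statistic for $\pi_1 - \pi_2$ is a Z statistic:

$Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$

where $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$, $p_1 = \dfrac{X_1}{n_1}$, $p_2 = \dfrac{X_2}{n_2}$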

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 232

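Hypothesis Tests for Two Population Proportions (1 of 2)

DCOVA

Population proportions

Lower-tail test:

$H_0: \pi_1 \ge \pi_2$

$H_1: \pi_1 < \pi_2$

i.e.

$H_0: \pi_1 - \pi_2 \ge 0$

$H_1: \pi_1 - \pi_2 < 0$

Upper-tail test:

$H_0: \pi_1 \le \pi_2$

$H_1: \pi_1 > \pi_2$

i.e.

$H_0: \pi_1 - \pi_2 \le 0$

$H_1: \pi_1 - \pi_2 > 0$

Two-tail test:

$H_0: \pi_1 = \pi_2$

$H_1: \pi_1 \ne \pi_2$

i.e.

$H_0: \pi_1 - \pi_2 = 0$

$H_1: \pi_1 - \pi_2 \ne 0$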


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 233

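Hypothesis Tests for Two Population Proportions (2 of 2)

DCOVA

Population proportions

Lower-tail test:

$H_0: \pi_1 - \pi_2 \ge 0$

$H_1: \pi_1 - \pi_2 < 0$

Reject $H_0$ if $Z_{STAT} < -Z_{\alpha}$

Upper-tail test:

$H_0: \pi_1 - \pi_2 \le 0$

$H_1: \pi_1 - \pi_2 > 0$

Reject $H_0$ if $Z_{STAT} > Z_{\alpha}$

Two-tail test:

$H_0: \pi_1 - \pi_2 = 0$

$H_1: \pi_1 - \pi_2 \ne 0$

Reject $H_0$ if $Z_{STAT} < -Z_{\alpha/2}$ or $Z_{STAT} > Z_{\alpha/2}$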

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 234

Hypothesis Test Example: Two Population Proportions (1 of 3)

DCOVA

Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?

• In a random sample, 36 of 72 men and 35 of 50 women indicated they would vote Yes

• Test at the .05 level of significance


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 235

Hypothesis Test Example: Two Population Proportions (2 of 3)

DCOVA

• The hypothesis test is:

$H_0: \pi_1 - \pi_2 = 0$ (the two proportions are equal)

$H_1: \pi_1 - \pi_2 \ne 0$ (there is a significant difference between proportions)

• The sample proportions are:

– Men: $p_1 = \dfrac{36}{72} = 0.50$

– Women: $p_2 = \dfrac{35}{50} = 0.70$

• The pooled estimate for the overall proportion is:

$\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{36 + 35}{72 + 50} = \dfrac{71}{122} = .582$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 236

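Hypothesis Test Example: Two Population Proportions (3 of 3)

DCOVA

The test statistic for $\pi_1 - \pi_2$ is

$Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \dfrac{(.50 - .70) - 0}{\sqrt{.582 (1 - .582) \left( \dfrac{1}{72} + \dfrac{1}{50} \right)}} = -2.20$

Critical Values $= \pm 1.96$ for $\alpha = .05$

Decision: Reject $H_0$

Conclusion: There is evidence of a significant difference in the proportion of men and women who will vote yes.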
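The two-proportion Z test in this example can be sketched in a few lines of Python. The function name `two_proportion_z` is an assumption for illustration; the counts are those given in the example and the critical value 1.96 comes from the slide.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: pi1 - pi2 = 0, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, p_bar

# 36 of 72 men and 35 of 50 women would vote Yes
z_stat, p_bar = two_proportion_z(36, 72, 35, 50)
z_crit = 1.96  # two-tail critical value for alpha = 0.05
reject_h0 = abs(z_stat) > z_crit
print(round(p_bar, 3), round(z_stat, 2), reject_h0)
```

The statistic is negative because the men's sample proportion is the smaller of the two; its magnitude exceeds 1.96, so the null hypothesis is rejected.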


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 237

Confidence Interval for Two Population Proportions

DCOVA

Population proportions

The confidence interval for $\pi_1 - \pi_2$ is

$(p_1 - p_2) \pm Z_{\alpha/2} \sqrt{\dfrac{p_1 (1 - p_1)}{n_1} + \dfrac{p_2 (1 - p_2)}{n_2}}$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 238

Confidence Interval for Two Population Proportions -- Example

DCOVA

The 95% confidence interval for $\pi_1 - \pi_2$ is

$(0.50 - 0.70) \pm 1.96 \sqrt{\dfrac{0.50 (0.50)}{72} + \dfrac{0.70 (0.30)}{50}} = (-0.37, -0.03)$

Since this interval does not contain 0, you can be 95% confident the two proportions are different.
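The interval above uses the unpooled standard error, unlike the test statistic. A small sketch (hypothetical function name, Z value 1.96 supplied by hand) makes that difference explicit:

```python
import math

def two_proportion_ci(x1, n1, x2, n2, z_crit):
    """Confidence interval for pi1 - pi2 with separate (unpooled) variances."""
    p1, p2 = x1 / n1, x2 / n2
    margin = z_crit * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin

low, high = two_proportion_ci(36, 72, 35, 50, z_crit=1.96)
print(round(low, 2), round(high, 2))
```

Both endpoints are negative, so 0 is outside the interval.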


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 239

Chapter Summary

In this chapter we discussed:

• How to use hypothesis testing for comparing the difference

between

– The means of two independent populations

– The means of two related populations

– The proportions of two independent populations

– The variances of two independent populations

– The means of more than two populations

Slide - 240

Business Statistics: A First Course

Seventh Edition

Chapter 11

Chi-Square Tests

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 241

Objective

1. How and when to use the chi-square test for

contingency tables

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 242

Contingency Tables

DCOVA

Contingency Tables

• Useful in situations comparing multiple population

proportions

• Used to classify sample observations according to

two or more characteristics

• Also called a cross-classification table.


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 243

Contingency Table Example (1 of 2)

DCOVA

Left-Handed vs. Gender

Dominant Hand: Left vs. Right

Gender: Male vs. Female

• 2 categories for each variable, so this is called a 2 × 2 table

• Suppose we examine a sample of 300 children

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 244

Contingency Table Example (2 of 2)

DCOVA

Sample results organized in a contingency table:

Gender    Left    Right    Total
Female      12      108      120
Male        24      156      180
Total       36      264      300

sample size = n = 300:

120 Females, 12 were left handed

180 Males, 24 were left handed


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 245

Chi Squared Test for the Difference

Between Two Proportions

DCOVA

$H_0: \pi_1 = \pi_2$ (Proportion of females who are left handed is equal to the proportion of males who are left handed)

$H_1: \pi_1 \ne \pi_2$ (The two proportions are not the same)

• If $H_0$ is true, then the proportion of left-handed females should be the same as the proportion of left-handed males

• The two proportions above should be the same as the proportion of left-handed people overall

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 246

The Chi-Square Test Statistic (1 of 3)

DCOVA

The Chi-square test statistic is:

$\chi^2_{STAT} = \sum_{\text{all cells}} \dfrac{(f_o - f_e)^2}{f_e}$

• where:

$f_o$ = observed frequency in a particular cell

$f_e$ = expected frequency in a particular cell if $H_0$ is true

$\chi^2_{STAT}$ for the 2×2 case has 1 degree of freedom

(Assumed: each cell in the contingency table has expected frequency of at least 5)


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 247

Decision Rule (1 of 3)

DCOVA

The $\chi^2_{STAT}$ test statistic approximately follows a chi-squared distribution with one degree of freedom

Decision Rule:

If $\chi^2_{STAT} > \chi^2_{\alpha}$, reject $H_0$; otherwise, do not reject $H_0$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 248

Computing the Overall Proportion (1 of 2)

DCOVA

The overall proportion is:

$\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{X}{n}$

Here: 120 Females, 12 were left handed

180 Males, 24 were left handed

$\bar{p} = \dfrac{12 + 24}{120 + 180} = \dfrac{36}{300} = 0.12$

i.e., based on all 300 children the proportion of left handers is 0.12, that is, 12%


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 249

Finding Expected Frequencies

DCOVA

• To obtain the expected frequency for left handed females, multiply the average proportion left handed ($\bar{p}$) by the total number of females

• To obtain the expected frequency for left handed males, multiply the average proportion left handed ($\bar{p}$) by the total number of males

If the two proportions are equal, then

P(Left Handed | Female) = P(Left Handed | Male) = .12

i.e., we would expect (.12)(120) = 14.4 females to be left handed

(.12)(180) = 21.6 males to be left handed

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 250

Observed vs. Expected Frequencies

DCOVA


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 251

The Chi-Square Test Statistic (2 of 3)

DCOVA

The test statistic is:

$\chi^2_{STAT} = \sum_{\text{all cells}} \dfrac{(f_o - f_e)^2}{f_e} = \dfrac{(12 - 14.4)^2}{14.4} + \dfrac{(108 - 105.6)^2}{105.6} + \dfrac{(24 - 21.6)^2}{21.6} + \dfrac{(156 - 158.4)^2}{158.4} = 0.7576$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 252

Decision Rule (2 of 3)

DCOVA

The test statistic is $\chi^2_{STAT} = 0.7576$; $\chi^2_{0.05}$ with 1 d.f. = 3.841

Decision Rule:

If $\chi^2_{STAT} > 3.841$, reject $H_0$; otherwise, do not reject $H_0$

Here, $\chi^2_{STAT} = 0.7576 < \chi^2_{0.05} = 3.841$, so we do not reject $H_0$ and conclude that there is not sufficient evidence that the two proportions are different at α = 0.05
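The chi-square statistic for the handedness example is a direct sum over the four cells, which the following sketch reproduces. The function name is hypothetical, the observed and expected counts are the ones used in the worked calculation, and the critical value 3.841 is taken from the slide.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all cells (flat lists, same cell order)."""
    return sum((f_o - f_e) ** 2 / f_e for f_o, f_e in zip(observed, expected))

# Cell order: female-left, female-right, male-left, male-right
observed = [12, 108, 24, 156]
expected = [14.4, 105.6, 21.6, 158.4]

chi2_stat = chi_square_statistic(observed, expected)
chi2_crit = 3.841  # chi-square critical value, alpha = 0.05, 1 d.f.
reject_h0 = chi2_stat > chi2_crit
print(round(chi2_stat, 4), reject_h0)
```

Every cell deviates from its expected count by 2.4 in absolute value, so the cells with the smallest expected counts contribute most to the total.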


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 253

Chi Squared Test for Differences

Among More Than Two Proportions

DCOVA

• Extend the $\chi^2$ test to the case with more than two independent populations:

$H_0: \pi_1 = \pi_2 = \cdots = \pi_c$

$H_1$: Not all of the $\pi_j$ are equal ($j = 1, 2, \ldots, c$)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 254

The Chi-Square Test Statistic (3 of 3)

DCOVA

The Chi-square test statistic is:

$\chi^2_{STAT} = \sum_{\text{all cells}} \dfrac{(f_o - f_e)^2}{f_e}$

• Where

$f_o$ = observed frequency in a particular cell of the 2 × c table

$f_e$ = expected frequency in a particular cell if $H_0$ is true

$\chi^2_{STAT}$ for the 2×c case has $(2 - 1)(c - 1) = c - 1$ degrees of freedom

(Assumed: each cell in the contingency table has expected frequency of at least 1)


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 255

Computing the Overall Proportion (2 of 2)

DCOVA

The overall proportion is:

$\bar{p} = \dfrac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \dfrac{X}{n}$

• Expected cell frequencies for the c categories are calculated as in the 2 × 2 case, and the decision rule is the same:

Decision Rule:

If $\chi^2_{STAT} > \chi^2_{\alpha}$, reject $H_0$; otherwise, do not reject $H_0$

Where $\chi^2_{\alpha}$ is from the chi-squared distribution with c − 1 degrees of freedom

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 256

Chi Squared Test of Independence (1 of 2)

DCOVA

• Similar to the $\chi^2$ test for equality of more than two proportions, but extends the concept to contingency tables with r rows and c columns

$H_0$: The two categorical variables are independent (i.e., there is no relationship between them)

$H_1$: The two categorical variables are dependent (i.e., there is a relationship between them)


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 257

Chi Squared Test of Independence (2 of 2)

DCOVA

The Chi-square test statistic is:

$\chi^2_{STAT} = \sum_{\text{all cells}} \dfrac{(f_o - f_e)^2}{f_e}$

• where:

$f_o$ = observed frequency in a particular cell of the r × c table

$f_e$ = expected frequency in a particular cell if $H_0$ is true

$\chi^2_{STAT}$ for the r × c case has $(r - 1)(c - 1)$ degrees of freedom

(Assumed: each cell in the contingency table has expected frequency of at least 1)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 258

Expected Cell Frequencies

DCOVA

• Expected cell frequencies:

$f_e = \dfrac{\text{row total} \times \text{column total}}{n}$

Where:

row total = sum of all frequencies in the row

column total = sum of all frequencies in the column

n = overall sample size
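The expected-frequency rule applies to a table of any size. The sketch below (function name hypothetical) builds the full table of expected counts from observed counts; it is illustrated with the 2 × 2 handedness counts from earlier in the chapter, since those are the only complete table recoverable from these slides, plus the single-cell arithmetic 30 × 70 / 200.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: female, male; columns: left handed, right handed
handedness = [[12, 108], [24, 156]]
expected = expected_frequencies(handedness)
print(expected)

# One cell with row total 30, column total 70, n = 200
single_cell = 30 * 70 / 200
print(single_cell)
```

For the handedness table this reproduces the expected counts 14.4, 105.6, 21.6 and 158.4 used in the 2 × 2 test.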


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 259

Decision Rule (3 of 3)

DCOVA

• The decision rule is

If $\chi^2_{STAT} > \chi^2_{\alpha}$, reject $H_0$; otherwise, do not reject $H_0$

Where $\chi^2_{\alpha}$ is from the chi-square distribution with $(r - 1)(c - 1)$ degrees of freedom

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 260

Example 1 (1 of 2)

DCOVA

• The meal plan selected by 200 students is shown below:


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 261

Example 1 (2 of 2)

DCOVA

• The hypothesis to be tested is:

$H_0$: Meal plan and class standing are independent (i.e., there is no relationship between them)

$H_1$: Meal plan and class standing are dependent (i.e., there is a relationship between them)

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 262

Example 2: Expected Cell Frequencies

DCOVA

Observed:

Example for one cell:

$f_e = \dfrac{\text{row total} \times \text{column total}}{n} = \dfrac{30 \times 70}{200} = 10.5$

Expected cell frequencies if $H_0$ is true:


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 263

Example 3: The Test Statistic

DCOVA

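• The test statistic value is:

$\chi^2_{STAT} = \sum_{\text{all cells}} \dfrac{(f_o - f_e)^2}{f_e} = \dfrac{(24 - 24.5)^2}{24.5} + \dfrac{(32 - 30.8)^2}{30.8} + \cdots + \dfrac{(10 - 8.4)^2}{8.4} = 0.709$

$\chi^2_{0.05} = 12.592$ from the chi-square distribution with $(4 - 1)(3 - 1) = 6$ degrees of freedom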

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 264

Example 4: Decision and

Interpretation

DCOVA

The test statistic is $\chi^2_{STAT} = 0.709$; $\chi^2_{0.05}$ with 6 d.f. = 12.592

Decision Rule:

If $\chi^2_{STAT} > 12.592$, reject $H_0$; otherwise, do not reject $H_0$

Here, $\chi^2_{STAT} = 0.709 < \chi^2_{0.05} = 12.592$, so do not reject $H_0$

Conclusion: there is not sufficient evidence that meal plan and class standing are related at α = 0.05


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 265

Chapter Summary

In this chapter we discussed:

• How and when to use the chi-square test for

contingency tables

Slide - 266

Business Statistics: A First Course

Seventh Edition

Chapter 12

Simple Linear

Regression

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 267

Objectives

1. How to use regression analysis to predict the value of a

dependent variable based on a value of an independent

variable

2. To understand the meaning of the regression coefficients $b_0$ and $b_1$

3. To evaluate the assumptions of regression analysis and

know what to do if the assumptions are violated

4. To make inferences about the slope and correlation

coefficient

5. To estimate mean values and predict individual values

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 268

Correlation vs. Regression

DCOVA

• A scatter plot can be used to show the relationship between two variables

• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables

– Correlation is only concerned with strength of the relationship

– No causal effect is implied with correlation

– Scatter plots were first presented in Ch. 2

– Correlation was first presented in Ch. 3


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 269

Types of Relationships (1 of 3)

DCOVA

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 270

Types of Relationships (2 of 3)

DCOVA


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 271

Types of Relationships (3 of 3)

DCOVA

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 272

Introduction to Regression Analysis

DCOVA

• Regression analysis is used to:

– Predict the value of a dependent variable based on the value of at least one independent variable

– Explain the impact of changes in an independent variable on the dependent variable

• Dependent variable: the variable we wish to

predict or explain

• Independent variable: the variable used to predict

or explain the dependent variable


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 273

Simple Linear Regression Model (1 of 3)

DCOVA

• Only one independent variable, X

• Relationship between X and Y is described

by a linear function

• Changes in Y are assumed to be related to

changes in X

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 274

Simple Linear Regression Model (2 of 3)

DCOVA

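$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

where:

$Y_i$ = Dependent Variable

$\beta_0$ = Population Y intercept

$\beta_1$ = Population Slope Coefficient

$X_i$ = Independent Variable

$\varepsilon_i$ = Random Error term

Linear component: $\beta_0 + \beta_1 X_i$

Random Error component: $\varepsilon_i$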


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 275

Simple Linear Regression Model (3 of 3)

DCOVA

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 276

Simple Linear Regression Equation

(Prediction Line)

DCOVA

The simple linear regression equation provides an estimate of the population regression line

$\hat{Y}_i = b_0 + b_1 X_i$

where:

$\hat{Y}_i$ = Estimated (or predicted) Y value for observation i

$b_0$ = Estimate of the regression intercept

$b_1$ = Estimate of the regression slope

$X_i$ = Value of X for observation i


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 277

The Least Squares Method

DCOVA

$b_0$ and $b_1$ are obtained by finding the values that minimize the sum of the squared differences between $Y$ and $\hat{Y}$:

$\min \sum (Y_i - \hat{Y}_i)^2 = \min \sum \left( Y_i - (b_0 + b_1 X_i) \right)^2$

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 278

Finding the Least Squares Equation

DCOVA

• The coefficients $b_0$ and $b_1$, and other regression results in this chapter, will be found using Excel or Minitab

Formulas are shown in the text for those who are interested


Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 279

Interpretation of the Slope and the

Intercept

DCOVA

• $b_0$ is the estimated mean value of Y when the value of X is zero

• $b_1$ is the estimated change in the mean value of Y as a result of a one-unit increase in X

Copyright © 2016, 2013, 2010 Pearson Education, Inc. All Rights Reserved Slide - 280

Simple Linear Regression Example

DCOVA

• A real estate agent wishes to examine the

relationship between the selling price of a home and

its size (measured in square feet)

• A random sample of 10 houses is selected

– Dependent variable (Y) = house price in $1000s

– Independent variable (X) = square feet


Simple Linear Regression Example:

Data

DCOVA

House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
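The slides obtain the coefficients from Excel. As a cross-check, the same least-squares fit can be sketched in plain Python from the ten observations above. The variable names are assumptions; the resulting values should agree with the prediction line reported later in the slides (intercept 98.24833, slope 0.10977).

```python
# House data from the slide: price in $1000s (Y) and size in square feet (X)
Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

ssx = sum((x - x_bar) ** 2 for x in X)                       # SSX
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))  # cross-products

b1 = ssxy / ssx          # estimated slope
b0 = y_bar - b1 * x_bar  # estimated intercept

print(f"b0 = {b0:.5f}, b1 = {b1:.5f}")
```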


Simple Linear Regression Example:

Scatter Plot

DCOVA

[Figure: House price model scatter plot. Vertical axis: House Price ($1000s), 0 to 450. Horizontal axis: Square Feet, 0 to 3000.]


Simple Linear Regression Example:

Excel Output

DCOVA


Simple Linear Regression Example:

Graphical Representation

DCOVA

House price model: Scatter Plot and Prediction Line

\[ \text{house price} = 98.24833 + 0.10977 \,(\text{square feet}) \]


Simple Linear Regression Example:

Interpretation of \(b_0\)

DCOVA

• \(b_0\) is the estimated mean value of Y when the

value of X is zero (if X = 0 is in the range of

observed X values)

• Because a house cannot have a square footage of

0, \(b_0\) has no practical application


Simple Linear Regression Example:

Interpreting \(b_1\)

DCOVA

• \(b_1\) estimates the change in the mean value

of Y as a result of a one-unit increase in X

• Here, \(b_1 = 0.10977\) tells us that the mean value of a

house increases by 0.10977 × $1,000 = $109.77,

on average, for each additional one square foot

of size


Simple Linear Regression Example:

Making Predictions (1 of 2)

DCOVA

Predict the price for a house with 2000 square feet:

\[ \text{house price} = 98.25 + 0.1098 \,(\text{sq. ft.}) = 98.25 + 0.1098 \,(2000) = 317.85 \]

The predicted price for a house with 2000 square

feet is 317.85 ($1,000s) = $317,850


Simple Linear Regression Example:

Making Predictions (2 of 2)

DCOVA

• When using a regression model for prediction, only

predict within the relevant range of data
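The prediction and the relevant-range rule can be combined in a short Python sketch. It assumes the rounded coefficients from the slide (98.25 and 0.1098) and takes the relevant range to be the smallest and largest square footage in the sample (1100 and 2450); the function name `predict_price` is hypothetical.

```python
B0, B1 = 98.25, 0.1098     # rounded coefficients from the slide
X_MIN, X_MAX = 1100, 2450  # observed range of square feet in the sample


def predict_price(sq_ft):
    """Predicted house price in $1000s; refuses to extrapolate."""
    if not X_MIN <= sq_ft <= X_MAX:
        raise ValueError(
            f"{sq_ft} sq. ft. is outside the relevant range {X_MIN}-{X_MAX}"
        )
    return B0 + B1 * sq_ft


print(round(predict_price(2000), 2))  # 317.85, i.e. $317,850
```

Raising an error is one possible design choice; a warning would also fit the slide's advice, which is only that predictions outside the observed X values should not be trusted.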


Measures of Variation (1 of 3)

DCOVA

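• Total variation is made up of two parts:

\[ SST = SSR + SSE \]

\[ SST = \sum (Y_i - \bar{Y})^2 \qquad SSR = \sum (\hat{Y}_i - \bar{Y})^2 \qquad SSE = \sum (Y_i - \hat{Y}_i)^2 \]

where: \(\bar{Y}\) = Mean value of the dependent variable

\(Y_i\) = Observed value of the dependent variable

\(\hat{Y}_i\) = Predicted value of Y for the given \(X_i\) value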


Measures of Variation (2 of 3)

DCOVA

• SST = total sum of squares (Total Variation)

– Measures the variation of the \(Y_i\) values around their

mean \(\bar{Y}\)

• SSR = regression sum of squares (Explained Variation)

– Variation attributable to the relationship between X

and Y

• SSE = error sum of squares (Unexplained Variation)

– Variation in Y attributable to factors other than X
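The three sums of squares can be computed directly from their definitions. The sketch below refits the house-price example in plain Python (variable names are assumptions) and checks numerically that the explained and unexplained parts add up to the total.

```python
# House data from the example: price in $1000s (Y), square feet (X)
Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in X]  # predicted values

sst = sum((y - y_bar) ** 2 for y in Y)                # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(Y, y_hat))   # unexplained variation

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```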


Measures of Variation (3 of 3)

DCOVA


Coefficient of Determination, r Squared

DCOVA

• The coefficient of determination is the portion of the

total variation in the dependent variable that is

explained by variation in the independent variable

• The coefficient of determination is also called

r-square and is denoted as \(r^2\)

\[ r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} \]

note: \(0 \le r^2 \le 1\)
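As a rough illustration, the ratio SSR / SST can be evaluated for the house-price example. The sketch refits the line in plain Python so that it stands alone; variable names are assumptions.

```python
Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # price, $1000s
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # square feet

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

sst = sum((y - y_bar) ** 2 for y in Y)
ssr = sum((b0 + b1 * x - y_bar) ** 2 for x in X)

r2 = ssr / sst  # share of the variation in Y explained by X
print(f"r^2 = {r2:.4f}")
```

A value near 0.58 would mean that roughly 58% of the variation in house prices is explained by variation in square feet.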


Examples of Approximate r Squared

Values (1 of 3)

DCOVA

\(r^2 = 1\)

Perfect linear relationship

between X and Y:

100% of the variation in Y is

explained by variation in X


Examples of Approximate r Squared

Values (2 of 3)

DCOVA

\(0 < r^2 < 1\)

Weaker linear relationships

between X and Y:

Some but not all of the

variation in Y is explained by

variation in X


Examples of Approximate r Squared

Values (3 of 3)

DCOVA

\(r^2 = 0\)

No linear relationship between X

and Y:

The value of Y does not depend

on X. (None of the variation in Y

is explained by variation in X)


Simple Linear Regression Example: Coefficient

of Determination, r Squared in Excel

DCOVA


Standard Error of Estimate

DCOVA

• The standard deviation of the variation of

observations around the regression line is

estimated by

\[ S_{YX} = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 2}} \]

Where

SSE = error sum of squares

n = sample size
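The formula can be checked against the house-price example with a short Python sketch. It recomputes SSE from the fitted line (variable names are assumptions) and should land close to the $41.33K figure quoted for this example.

```python
import math

Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # price, $1000s
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # square feet

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

# Error sum of squares: squared gaps between observed and predicted Y
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))

# Two degrees of freedom are used up by estimating b0 and b1
s_yx = math.sqrt(sse / (n - 2))
print(f"S_YX = {s_yx:.2f} (in $1000s)")
```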


Simple Linear Regression Example:

Standard Error of Estimate in Excel

DCOVA


Comparing Standard Errors

DCOVA

\(S_{YX}\) is a measure of the variation of observed Y values from

the regression line.

The magnitude of \(S_{YX}\) should always be judged relative to the size of the

Y values in the sample data. For example, \(S_{YX}\) = $41.33K is moderately small relative to house prices in the

$200K − $400K range


Assumptions of Regression L.I.N.E

DCOVA

• Linearity:

– The relationship between X and Y is linear

• Independence of Errors

– Error values are statistically independent.

– Particularly important when data are collected over a period of

time.

• Normality of Error

– Error values are normally distributed for any given value of X

• Equal Variance (also called homoscedasticity)

– The probability distribution of the errors has constant variance


Residual Analysis

DCOVA

\[ e_i = Y_i - \hat{Y}_i \]

• The residual for observation i, \(e_i\), is the difference between its

observed and predicted value

• Check the assumptions of regression by examining the

residuals

– Examine for linearity assumption

– Evaluate independence assumption

– Evaluate normal distribution assumption

– Examine for constant variance for all levels of X (homoscedasticity)

• Graphical Analysis of Residuals

– Can plot residuals vs. X
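A minimal Python sketch of the residual calculation for the house-price example follows. It only produces the (X, residual) pairs that a residual plot would display; the plotting itself is left out, and the variable names are assumptions.

```python
Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # price, $1000s
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # square feet

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

# Residual e_i = observed Y_i minus predicted Y_i
residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]

# Pairs to plot: X on the horizontal axis, residual on the vertical axis
for x, e in sorted(zip(X, residuals)):
    print(f"X = {x:5d}   e = {e:8.2f}")

# Least-squares residuals sum to zero, up to floating-point error
print(f"sum of residuals = {sum(residuals):.6f}")
```

When the assumptions hold, the printed residuals should scatter around zero with no visible curve or funnel shape as X increases.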


Residual Analysis for Linearity

DCOVA


Residual Analysis for Independence

DCOVA


Checking for Normality

DCOVA

• Examine the Stem-and-Leaf Display of the

Residuals.

• Examine the Boxplot of the Residuals.

• Examine the Histogram of the Residuals.

• Construct a Normal Probability Plot of the

Residuals.


Residual Analysis for Normality

DCOVA

When using a normal probability plot, normally distributed errors will

plot approximately along a straight line.


Residual Analysis for Equal Variance

DCOVA


Inferences About the Slope

DCOVA

• The standard error of the regression slope

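coefficient \(b_1\) is estimated by

\[ S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}} \]

where:

\(S_{b_1}\) = Estimate of the standard error of the slope.

\(S_{YX} = \sqrt{\dfrac{SSE}{n - 2}}\) = Standard error of the estimate.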


Inferences About the Slope: t Test

DCOVA

• t test for a population slope:

– Is there a linear relationship between X and Y?

• Null and alternative hypotheses:

– \(H_0: \beta_1 = 0\) (no linear relationship)

– \(H_1: \beta_1 \ne 0\) (linear relationship does exist)

• Test statistic:

\[ t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad \text{d.f.} = n - 2 \]

where:

\(b_1\) = regression slope coefficient

\(\beta_1\) = hypothesized slope

\(S_{b_1}\) = standard error of the slope
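The standard error of the slope and the t statistic can be traced step by step for the house-price example. The Python sketch below is an illustration under the null hypothesis that the population slope is zero; the variable names are assumptions.

```python
import math

Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # price, $1000s
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # square feet

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
ssx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ssx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s_yx = math.sqrt(sse / (n - 2))  # standard error of the estimate
s_b1 = s_yx / math.sqrt(ssx)     # standard error of the slope

beta1_null = 0.0                 # hypothesized slope under H0
t_stat = (b1 - beta1_null) / s_b1
df = n - 2

print(f"S_b1 = {s_b1:.5f}, t_STAT = {t_stat:.3f}, d.f. = {df}")
```

The statistic should come out near the value of 3.329 used in the worked example.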


Inferences About the Slope: t Test

Example (1 of 4)

DCOVA

House Price in $1000s (y)    Square Feet (x)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700

Estimated Regression Equation:

\[ \text{house price} = 98.25 + 0.1098 \,(\text{sq. ft.}) \]

The slope of this model is 0.1098

Is there a relationship between the

square footage of the house and

its sales price?


Inferences About the Slope: t Test

Example (2 of 4)

DCOVA

\(H_0: \beta_1 = 0\)

\(H_1: \beta_1 \ne 0\)


Inferences About the Slope: t Test

Example (3 of 4)

DCOVA

\(H_0: \beta_1 = 0\)

\(H_1: \beta_1 \ne 0\)

Test Statistic: \(t_{STAT} = 3.329\)

Decision: Reject \(H_0\)

There is sufficient

evidence that square

footage affects house

price.


Inferences About the Slope: t Test

Example (4 of 4)

DCOVA

\(H_0: \beta_1 = 0\)

\(H_1: \beta_1 \ne 0\)

Decision: Reject \(H_0\), since p-value < α

There is sufficient evidence that square footage

affects house price.


F Test for the Slope

DCOVA

• F Test statistic:

\[ F_{STAT} = \frac{MSR}{MSE} \]

where

\[ MSR = \frac{SSR}{k} \qquad MSE = \frac{SSE}{n - k - 1} \]

where \(F_{STAT}\) follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom

(k = the number of independent variables in the regression model)
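For the house-price example, with one independent variable (k = 1), the F statistic can be sketched in Python as below. Variable names are assumptions. In simple linear regression the F statistic equals the square of the slope's t statistic, so the result should be close to 3.329 squared.

```python
Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # price, $1000s
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # square feet

n = len(X)
k = 1  # number of independent variables
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

ssr = sum((b0 + b1 * x - y_bar) ** 2 for x in X)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))

msr = ssr / k            # mean square regression
mse = sse / (n - k - 1)  # mean square error
f_stat = msr / mse

print(f"F_STAT = {f_stat:.4f} with {k} and {n - k - 1} degrees of freedom")
```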


F-Test for the Slope Excel Output

DCOVA


F Test for Significance

DCOVA

Decision:

Reject \(H_0\) at \(\alpha = 0.05\)

Conclusion:

There is sufficient evidence that

house size affects selling price


Confidence Interval Estimate for the

Slope (1 of 2)

DCOVA

Confidence Interval Estimate of the Slope:

\[ b_1 \pm t_{\alpha/2} \, S_{b_1}, \qquad \text{d.f.} = n - 2 \]

Excel Printout for House Prices:

At 95% level of confidence, the confidence interval for the

slope is (0.0337, 0.1858)
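The interval can be reproduced with a short Python sketch. The critical value 2.3060 (t distribution, 8 degrees of freedom, 0.025 in each tail) is taken from a standard t table and hard-coded here as an assumption, since the standard library has no t quantile function; the other names are also illustrative.

```python
import math

Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # price, $1000s
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # square feet

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
ssx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ssx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(ssx)  # standard error of the slope

T_CRIT = 2.3060  # assumed table value: t with 8 d.f., upper-tail area 0.025

lower = b1 - T_CRIT * s_b1
upper = b1 + T_CRIT * s_b1
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```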


Confidence Interval Estimate for the

Slope (2 of 2)

DCOVA

Since the house price variable is in units of $1000s, we are

95% confident that the average impact on sales price is

between $33.74 and $185.80 per square foot of house size

This 95% confidence interval does not include 0.

Conclusion: There is a significant relationship between house price and

square feet at the .05 level of significance


t-Test for a Correlation Coefficient (1 of 3)

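DCOVA

• Hypotheses

\(H_0: \rho = 0\) (no correlation between X and Y)

\(H_1: \rho \ne 0\) (correlation exists)

• Test statistic

\[ t_{STAT} = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \]

(with n − 2 degrees of freedom)

where

\(r = +\sqrt{r^2}\) if \(b_1 > 0\)

\(r = -\sqrt{r^2}\) if \(b_1 < 0\)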


t-Test for a Correlation Coefficient (2 of 3)

DCOVA

Is there evidence of a linear relationship between square

feet and house price at the .05 level of significance?

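\(H_0: \rho = 0\) (No correlation)

\(H_1: \rho \ne 0\) (correlation exists)

\(\alpha = .05\), df = 10 − 2 = 8

\[ t_{STAT} = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.762 - 0}{\sqrt{\dfrac{1 - .762^2}{10 - 2}}} = 3.329 \]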
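The same calculation can be sketched in Python starting from the raw house-price data rather than the rounded r = .762. The sign of r is taken from the sign of the slope, as the previous slide describes; variable names are assumptions.

```python
import math

Y = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # price, $1000s
X = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]  # square feet

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
ssx = sum((x - x_bar) ** 2 for x in X)
sst = sum((y - y_bar) ** 2 for y in Y)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ssx
b0 = y_bar - b1 * x_bar

ssr = sum((b0 + b1 * x - y_bar) ** 2 for x in X)
r2 = ssr / sst

# r carries the sign of the slope b1
r = math.copysign(math.sqrt(r2), b1)

rho_null = 0.0  # hypothesized population correlation under H0
t_stat = (r - rho_null) / math.sqrt((1 - r2) / (n - 2))

print(f"r = {r:.3f}, t_STAT = {t_stat:.3f}, d.f. = {n - 2}")
```

The statistic matches the t statistic for the slope, which is expected in simple linear regression because both tests address the same hypothesis of no linear relationship.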


t-Test for a Correlation Coefficient (3 of 3)

DCOVA

Decision:

Reject \(H_0\)

Conclusion:

There is evidence of a

linear association at the

5% level of significance


Pitfalls of Regression Analysis

• Lacking an awareness of the assumptions of least-squares

regression

• Not knowing how to evaluate the assumptions of least-

squares regression

• Not knowing the alternatives to least-squares regression if

a particular assumption is violated

• Using a regression model without knowledge of the subject

matter

• Extrapolating outside the relevant range

• Concluding that a significant relationship identified always

reflects a cause-and-effect relationship


Strategies for Avoiding the Pitfalls of

Regression (1 of 2)

• Start with a scatter plot of X vs. Y to observe

possible relationship

• Perform residual analysis to check the

assumptions

– Plot the residuals vs. X to check for violations of

assumptions such as homoscedasticity

– Use a histogram, stem-and-leaf display, boxplot, or

normal probability plot of the residuals to uncover

possible non-normality


Strategies for Avoiding the Pitfalls of

Regression (2 of 2)

• If there is violation of any assumption, use alternative

methods or models

• If there is no evidence of assumption violation, then

test for the significance of the regression coefficients

and construct confidence intervals and prediction

intervals

• Refrain from making predictions or forecasts outside

the relevant range

• Remember that the relationships identified in

observational studies may or may not be due to

cause-and-effect relationships.


Chapter Summary

In this chapter we discussed:

• How to use regression analysis to predict the value of a

dependent variable based on a value of an independent

variable

• How to understand the meaning of the regression coefficients

\(b_0\) and \(b_1\)

• How to evaluate the assumptions of regression analysis and

what to do if the assumptions are violated

• How to make inferences about the slope and correlation

coefficient

• How to estimate mean values and predict individual values