
Post on 18-Apr-2020


Week 5: Hypothesis Testing (two-mean groups & categorical variables)

3. Two Samples: Tests on Two Means (unpaired samples):

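If $\sigma_1^2$ and $\sigma_2^2$ are known (and $n_1, n_2 > 30$), then we have:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)$$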

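If $\sigma_1^2$ and $\sigma_2^2$ are unknown but $\sigma_1^2 = \sigma_2^2 = \sigma^2$, then we have:

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(n_1 + n_2 - 2)$$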

where the pooled estimate of $\sigma^2$ is

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$

The degrees of freedom of $S_p^2$ is $\nu = n_1 + n_2 - 2$.

Now, suppose we need to test the null hypothesis

$H_0: \mu_1 = \mu_2 \iff H_0: \mu_1 - \mu_2 = 0$

Generally, suppose we need to test

$H_0: \mu_1 - \mu_2 = d$ (for some specific value $d$)

against one of the following alternative hypotheses:

$H_1: \mu_1 - \mu_2 \neq d$

$H_1: \mu_1 - \mu_2 > d$

$H_1: \mu_1 - \mu_2 < d$

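Hypotheses:

(a) $H_0: \mu_1 - \mu_2 = d$ vs. $H_1: \mu_1 - \mu_2 \neq d$ (Two-Sided Test)

(b) $H_0: \mu_1 - \mu_2 = d$ vs. $H_1: \mu_1 - \mu_2 > d$ (One-Sided Test)

(c) $H_0: \mu_1 - \mu_2 = d$ vs. $H_1: \mu_1 - \mu_2 < d$ (One-Sided Test)

Test Statistic (T.S.):

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - d}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1) \quad \{\text{if } \sigma_1^2 \text{ and } \sigma_2^2 \text{ are known}\}$$

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - d}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t(n_1 + n_2 - 2) \quad \{\text{if } \sigma_1^2 = \sigma_2^2 = \sigma^2 \text{ is unknown}\}$$

R.R. and A.R. of Ho (rejection region and acceptance region):

Decision: Reject Ho (and accept H1) at the significance level $\alpha$ if T.S. $\in$ R.R., that is:

(a) Two-Sided Test: $Z > z_{\alpha/2}$ or $Z < -z_{\alpha/2}$ or $T > t_{\alpha/2}$ or $T < -t_{\alpha/2}$

(b) One-Sided Test: $Z > z_{\alpha}$ or $T > t_{\alpha}$

(c) One-Sided Test: $Z < -z_{\alpha}$ or $T < -t_{\alpha}$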
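The decision rules above can be sketched in R. This is only an illustration and not part of the original slides: the function name `reject_h0` and its arguments are hypothetical, while `qnorm` and `qt` are the base R quantile functions for the normal and t distributions.

```r
# Hypothetical helper (not from the slides): is the T.S. in the rejection region?
# ts          : observed test statistic (Z or T)
# alpha       : significance level
# alternative : "two.sided", "greater" or "less"
# df          : degrees of freedom for the t case; NULL means use N(0,1)
reject_h0 <- function(ts, alpha = 0.05,
                      alternative = c("two.sided", "greater", "less"),
                      df = NULL) {
  alternative <- match.arg(alternative)
  # Upper-tail critical value of the reference distribution
  crit <- function(p) if (is.null(df)) qnorm(1 - p) else qt(1 - p, df)
  switch(alternative,
         two.sided = abs(ts) > crit(alpha / 2),
         greater   = ts > crit(alpha),
         less      = ts < -crit(alpha))
}

reject_h0(2.5)                                     # two-sided Z test
reject_h0(1.04, alternative = "greater", df = 20)  # one-sided t test
```

Passing `df` switches the reference distribution from N(0,1) to the t distribution, which mirrors the two rows of the test-statistic table.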

Example

An experiment was performed to compare the wear of two different materials. Twelve

pieces of material 1 were tested by exposing each piece to a machine measuring

wear. Ten pieces of material 2 were similarly tested. In each case, the depth of wear

was observed. The samples of material 1 gave an average wear of 85 units with a

sample standard deviation of 4, while the samples of materials 2 gave an average

wear of 81 and a sample standard deviation of 5. Can we conclude at the 0.05 level of

significance that the mean wear of material 1 exceeds that of material 2 by more than

2 units? Assume populations to be approximately normal with equal variances.

Solution:

Material 1          Material 2

$n_1 = 12$          $n_2 = 10$

$\bar{X}_1 = 85$    $\bar{X}_2 = 81$

$S_1 = 4$           $S_2 = 5$

Hypotheses:

$H_0: \mu_1 = \mu_2 + 2$ ($d = 2$)

$H_1: \mu_1 > \mu_2 + 2$

Or equivalently,

$H_0: \mu_1 - \mu_2 = 2$ ($d = 2$)

$H_1: \mu_1 - \mu_2 > 2$

Calculation:

$\alpha = 0.05$

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{(12 - 1)(4)^2 + (10 - 1)(5)^2}{12 + 10 - 2} = 20.05$$

$S_p = 4.478$

$\nu = n_1 + n_2 - 2 = 12 + 10 - 2 = 20$

$t_{0.05} = 1.725$

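T.S.:

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - d}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(85 - 81) - 2}{(4.478) \sqrt{\dfrac{1}{12} + \dfrac{1}{10}}} = 1.04$$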

Decision:

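Since $T = 1.04 \in$ A.R. ($T = 1.04 < t_{0.05} = 1.725$), we do not reject $H_0$. At $\alpha = 0.05$ there is not sufficient evidence for $H_1: \mu_1 - \mu_2 > 2$, that is, we cannot conclude that the mean wear of material 1 exceeds that of material 2 by more than 2 units.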
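The hand calculation in this example can be checked in R from the summary statistics alone. The sketch below is an illustration, not part of the original slides: the variable names are arbitrary, and only the base R functions `qt` and `pt` are used.

```r
# Summary statistics taken from the wear example
n1 <- 12; xbar1 <- 85; s1 <- 4   # material 1
n2 <- 10; xbar2 <- 81; s2 <- 5   # material 2
d     <- 2                       # hypothesised difference under H0
alpha <- 0.05

# Pooled variance and its degrees of freedom
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
sp  <- sqrt(sp2)
df  <- n1 + n2 - 2

# Test statistic, one-sided critical value, and upper-tail p-value
t_stat  <- ((xbar1 - xbar2) - d) / (sp * sqrt(1 / n1 + 1 / n2))
t_crit  <- qt(1 - alpha, df)
p_value <- pt(t_stat, df, lower.tail = FALSE)

round(c(sp2 = sp2, sp = sp, T = t_stat, t_crit = t_crit), 3)
t_stat > t_crit   # TRUE would mean "reject H0"
```

The p-value is an extra that the slides do not report; comparing it with `alpha` leads to the same decision as comparing `t_stat` with `t_crit`.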

R Examples

One Sample Tests

# Sample is a numeric vector holding the observations

t.test(Sample, mu=40)

t.test(Sample, alternative="less", mu=40)

t.test(Sample, alternative="greater", mu=36)

t.test(Sample, mu=36, conf.level=0.99)

Two Samples Tests

Always H0: mu1-mu2=0

H1: mu1 not equal to mu2, or mu1-mu2<0, or mu1-mu2>0

Default: unequal variances (Welch's t-test)

For equal variances, use var.equal=TRUE

Control = c(91, 87, 99, 77, 88, 91)

Treat = c(101, 110, 103, 93, 99, 104)

t.test(Control, Treat, alternative="less", var.equal=TRUE)

Paired Tests (same sample, but before and after a treatment or intervention)

Before = c(16, 20, 21, 22, 23, 22, 27, 25, 27, 28)

After = c(19, 22, 24, 24, 25, 25, 26, 26, 28, 32)

t.test(Before, After, alternative="greater", paired=TRUE)
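A paired test is a one-sample test on the within-subject differences. The sketch below illustrates this with the Before/After data from the slides; it is an added illustration, and the names `paired`, `diffs` and `one_sample` are arbitrary.

```r
Before <- c(16, 20, 21, 22, 23, 22, 27, 25, 27, 28)
After  <- c(19, 22, 24, 24, 25, 25, 26, 26, 28, 32)

# Paired test, with the same alternative as in the slides
paired <- t.test(Before, After, alternative = "greater", paired = TRUE)

# The same test written as a one-sample test on the differences
diffs      <- Before - After
one_sample <- t.test(diffs, alternative = "greater", mu = 0)

mean(diffs)                                   # mean of Before - After
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)
```

Because `t.test(x, y, paired = TRUE)` works with the differences `x - y`, the direction given in `alternative` refers to Before minus After.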

Hypothesis Testing on Categorical Variables (Chi-square)

Objectives

By the end of this topic, you should be able to:

• Apply hypothesis testing in making inferences about the population’s parameters when we have categorical variables.

• Apply Chi-square test as a goodness-of-fit test.

Introduction

• The primary difference between a chi-square test and the tests we have worked with before is that chi-square tests are used for categorical data.

• The chi-square test can be used to:

  • estimate how closely the distribution of a categorical variable matches an expected distribution (the goodness-of-fit test),

  • estimate whether two categorical variables are independent of one another (the test of independence).

• When you collect survey data, for example, if you find that the ratio of males to females who are in favor of or against a certain design is 30:70, how would you test whether the true population ratio is also 3:7?

• The observed frequencies (30, 70) will almost always differ from the expected frequencies due to sampling error.

• Question: Are these differences significant, or are they due to chance?

• The chi-square goodness-of-fit test will enable one to answer this question.

• The null and alternative hypotheses reflect this focus:

  • H0: The population distribution of the variable is the same as the proposed distribution.

  • H1: The distributions are different.

When to use test for goodness-of-fit?

Test for Goodness of Fit - Example

• Suppose you conducted a survey to see whether consumers have any preference among five different designs of a new product. A sample of 100 people provided the following data. Test whether these data indicate that consumers prefer some designs over others, rather than the differences arising by chance through sampling error.

Design 1   Design 2   Design 3   Design 4   Design 5

32         28         16         14         10

• If there were no preferences, one would expect that each design would be selected with equal frequency.

• In this case, the equal frequency is 100/5 = 20.

• That is, approximately 20 people would select each design.


• The frequencies obtained from the sample are called observed frequencies.

• The frequencies obtained from calculations are called expected frequencies.

• Table for the test is shown next.


Freq.      Design 1   Design 2   Design 3   Design 4   Design 5

Observed   32         28         16         14         10

Expected   20         20         20         20         20

• The appropriate hypotheses for this example are:

• H0: Consumers show no preference for the design of the product.

• H1: Consumers show a preference.

• The degrees of freedom (df) for this test is equal to the number of categories minus 1.


$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

d.f. = number of categories − 1

O = observed frequency

E = expected frequency


Table of Chi-square

• Is there enough evidence to reject the claim that there is no preference in the selection of different designs? Let $\alpha = 0.05$.

• Step 1: State the hypotheses and identify the claim.

• H0: Consumers show no preference (claim).

• H1: Consumers show a preference.

• Step 2: Identify an appropriate test and significance level.

• A Chi-Square goodness of fit test is appropriate for answering this question. The significance level is $\alpha = 0.05$, as stated in the problem.


• Step 3: Analyze the sample data

• Find the critical value. The d.f. are 5 – 1 = 4 and $\alpha = 0.05$. Hence, the critical value is $\chi^2_{\alpha,\,df=4} = 9.488$.

• Compute the test value: $\chi^2 = (32 - 20)^2/20 + (28 - 20)^2/20 + \dots + (10 - 20)^2/20 = 18.0$.

• Step 4: Make the decision. The decision is to reject the null hypothesis, since 18.0 > 9.488.

• Step 5: Write conclusions. There is enough evidence to reject the claim that consumers show no preference for the designs.
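The five steps for the design-preference example can be reproduced in R. The sketch below is an added illustration: the variable names are arbitrary, `qchisq` and `pchisq` are base R functions, and `chisq.test` applied to a vector of counts tests against equal probabilities by default.

```r
# Observed counts from the survey of 100 people
observed <- c(Design1 = 32, Design2 = 28, Design3 = 16, Design4 = 14, Design5 = 10)

# Under H0 (no preference) every design is expected equally often
expected <- rep(sum(observed) / length(observed), length(observed))

# Test statistic, degrees of freedom, critical value and p-value
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- length(observed) - 1
crit    <- qchisq(0.95, df)
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

c(chi_sq = chi_sq, crit = round(crit, 3))
chi_sq > crit            # TRUE means "reject H0"

# The built-in test gives the same statistic
fit <- chisq.test(observed)
fit$statistic
```

The p-value is not reported in the slides; it leads to the same decision as the comparison with the critical value.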

Test for Goodness of Fit – Example 2

• An insurance company needs to investigate the claim of one of its employees that female drivers get in fewer accidents than male drivers. Specifically, he says that male drivers are held responsible in 65% of accidents involving drivers under 23. Another survey is done to investigate this claim. In the results, 46 out of the 85 accidents considered involve male drivers. Do these data support or refute the initial hypothesis?

• Step 1: Clearly state the null and alternative hypotheses.

  • H0: In the population of all drivers, male drivers are responsible for 65% of accidents and female drivers are responsible for 35%.

  • H1: The data do not match the proposed model.

• Step 2: Identify an appropriate test and significance level.

  • A Chi-Square goodness of fit test is appropriate for answering this question. In the absence of a stated significance level in the problem, we assume the default 0.05.

• Step 3: Analyze sample data.

• Create a table to organize the data, compare the observed data with the expected data, and find the T.S. value:

            Male Drivers         Female Drivers       Total

Observed    46                   39                   85

Expected    0.65*85 = 55.25      0.35*85 = 29.75      85


• Using the table, with df=1, we find the critical value to be 3.84.

• The critical value indicates that, if the null hypothesis is true, only 0.05, or 5%, of 𝜒2 values would be as high as 3.84 or higher. If the 𝜒2 of our data is greater than 3.84, then fewer than 5 times out of 100 would we expect to get that result if the null hypothesis is true.

• Step 4: Make the decision. The decision is to reject the null hypothesis, since the test statistic, 𝜒2 = (46 – 55.25)2/55.25 + (39 – 29.75)2/29.75 ≈ 4.42, is greater than the critical value 3.84.

• Step 5: Write conclusion. There is enough evidence to reject the claim that male drivers are responsible for 65% of the accidents.
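The accident example can be checked in R from the observed/expected table. The slides do not show this calculation, so the sketch below is an added illustration: the variable names are arbitrary, and the value of the statistic is computed from the table rather than quoted from the source.

```r
# Observed accident counts and the proportions claimed under H0
observed <- c(Male = 46, Female = 39)
p_null   <- c(0.65, 0.35)

# Expected counts under H0
expected <- sum(observed) * p_null

# Test statistic and the critical value for df = 1, alpha = 0.05
chi_sq <- sum((observed - expected)^2 / expected)
crit   <- qchisq(0.95, df = 1)

round(c(chi_sq = chi_sq, crit = crit), 2)
chi_sq > crit            # TRUE means "reject H0"

# Same test with the built-in function
fit <- chisq.test(x = observed, p = p_null)
fit$p.value
```

For a vector of counts with `p` supplied, `chisq.test` performs the goodness-of-fit test without a continuity correction, so its statistic matches the hand formula.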


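R Functions

# Frequency table of one categorical variable
Table2 = table(BikeData$gender)
Table2
 F  M
31 90

# Proportions are computed from the table, not from the raw column
prop.table(Table2)
        F         M
0.2561983 0.7438017

# Two-way table: cycling frequency by gender
Table1 = table(BikeData$cyc_freq, BikeData$gender)
Table1
                           F  M
  Daily                    9 38
  Less than once a month   2  0
  Several times per month  5  9
  Several times per week  15 43

barplot(Table1)

barplot(Table1, beside=TRUE)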


Note: The expected value for every category must be greater than 5 for the Chi-square test results to be valid.

Conclusion: We do not reject the hypothesis that the true proportion of female to male cyclists in the population is 1:2.
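The call that produced this conclusion is not shown in the transcript. A plausible version, assuming the gender counts 31 (F) and 90 (M) from the frequency table and a hypothesised 1:2 ratio, might look like the sketch below; the exact call used in the slides is an assumption.

```r
# Gender counts as reported for BikeData (F = 31, M = 90)
gender_counts <- c(F = 31, M = 90)

# H0: female to male ratio is 1:2, i.e. proportions 1/3 and 2/3
fit <- chisq.test(x = gender_counts, p = c(1/3, 2/3))

fit$expected     # expected counts under H0; check that each exceeds 5
fit$statistic    # chi-square statistic, df = 1
fit$p.value      # compare with 0.05
```

With a data frame at hand, `table(BikeData$gender)` could be passed as `x` instead of the hard-coded counts.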


• If you don’t have the data frame, just numbers:

observed = c(772, 1611, 737)

expected = c(0.25, 0.50, 0.25)

chisq.test(x = observed, p = expected)

X-squared = 4.1199, df = 2, p-value = 0.1275

Decision: Since the p-value (0.1275) is greater than 0.05, there is not enough evidence that the population distribution differs from the expected one. Hence we do not reject the null hypothesis that the population distribution matches the expected proportions.
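The reported output can be reproduced by hand, which shows where `X-squared`, `df` and the p-value come from. This is an added sketch with arbitrary variable names; `p_null` holds the same proportions that the slides store in `expected`.

```r
observed <- c(772, 1611, 737)
p_null   <- c(0.25, 0.50, 0.25)

# Expected counts under H0: total sample size times each proportion
expected_counts <- sum(observed) * p_null

# Chi-square statistic, degrees of freedom and upper-tail p-value
chi_sq  <- sum((observed - expected_counts)^2 / expected_counts)
df      <- length(observed) - 1
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

expected_counts
round(c(X_squared = chi_sq, df = df, p_value = p_value), 4)
```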
