Stat 31, Section 1, Last Time

• Paired Diff’s vs. Unmatched Samples

– Compare with example

– Showed graphic about Paired often better

• Review of Gray Level Hypo Testing

• Inference for Proportions

– Confidence Intervals

– Sample Size Calculation

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 536-549, 555-566, 582-611

Approximate Reading for Next Class:

Pages 582-611, 634-667

Midterm II

Coming on Tuesday, April 10

Think about:

• Sheet of Formulas

– Again single 8 ½ x 11 sheet

– New, since now more formulas

• Redoing HW…

• Asking about those not understood

• Midterm not cumulative

• Covered Material: HW 7 - 11

Midterm II

Extra Office Hours:

Monday, 4/9, 10:00 – 12:00

12:30 – 3:00

Tuesday, 4/10, 8:30 – 10:00

11:00 – 12:00

Hypo. Tests for Proportions

Case 3: Hypothesis Testing

General Setup:

$H_0: p = \text{given value}$

$H_A: p \ne \text{given value}$ (or a 1-sided alternative)

Hypo. Tests for Proportions

Assess strength of evidence by:

P-value = P{what saw or m.c. | B’dry} =

= P{observed or m.c. | p = given value}

Problem: sd of $\hat{p}$ is $\sqrt{\dfrac{p(1-p)}{n}}$

Hypo. Tests for Proportions

Problem: sd of $\hat{p}$ is $\sqrt{\dfrac{p(1-p)}{n}}$

Solution: (different from above “best guess” and “conservative”)

calculation is done based on: $p$ = the given value from $H_0$

Hypo. Tests for Proportions

e.g. Old Text Problem 8.16

Of 500 respondents in a Christmas tree marketing survey, 44% had no children at home and 56% had at least one child at home. The corresponding figures from the most recent census are 48% with no children, and 52% with at least one. Test the null hypothesis that the telephone survey has a probability of selecting a household with no children that is equal to the value of the last census. Give a Z-statistic and P-value.

Hypo. Tests for Proportions

e.g. Old Text Problem 8.16

Let p = % with no child

(worth writing down)

$H_0: p = 0.48$

$H_A: p \ne 0.48$

Hypo. Tests for Proportions

Observed $\hat{p} = 0.44$, from $n = 500$

P-value = $P\{\hat{p} = 0.44 \text{ or m.c.} \mid p = 0.48\}$

$= P\{|\hat{p} - p| \ge 0.04 \mid p = 0.48\}$

$= 2\,P\{\hat{p} \le 0.44\}$

Hypo. Tests for Proportions

P-value $= 2\,P\{\hat{p} \le 0.44\}$

= 2 * NORMDIST(0.44,0.48,sqrt(0.48*(1-0.48)/500),true)

See Class Example 30, Part 3

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg30.xls

= 0.0734

Yes-No: no strong evidence

Gray-level: somewhat strong evidence
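The NORMDIST calculation above can be mirrored outside Excel. This is a minimal Python sketch, assuming the standard library's `statistics.NormalDist` stands in for NORMDIST(x, mean, sd, TRUE). The variable names are illustrative and are not taken from the class spreadsheet.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.48      # H0 value of p (census figure)
p_hat = 0.44   # observed sample proportion
n = 500        # number of respondents

# sd of p-hat, computed from the H0 value p0 (not from p-hat)
sd = sqrt(p0 * (1 - p0) / n)

# 2-sided P-value: twice the lower-tail area at or below p-hat
p_value = 2 * NormalDist(mu=p0, sigma=sd).cdf(p_hat)
print(round(p_value, 4))
```

The result should agree with the 0.0734 reported on the slide, up to rounding.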

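Hypo. Tests for Proportions

Z-score version:

P-value = $P\{|\hat{p} - p| \ge 0.04\}$

$= P\left\{ \dfrac{|\hat{p} - p|}{\sqrt{p(1-p)/n}} \ge \dfrac{0.04}{\sqrt{0.48(1-0.48)/500}} \right\}$

So Z-score is: $\dfrac{0.04}{\sqrt{0.48(1-0.48)/500}} = 1.79$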

Hypo. Tests for Proportions

Note also 1-sided version:

Yes-no: is strong evidence

Gray Level: stronger evidence
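A short Python sketch of the 1-sided calculation, assuming the alternative is $H_A: p < 0.48$ (the direction is an assumption here, chosen because the observed 0.44 lies below 0.48). Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, p_hat, n = 0.48, 0.44, 500

# Z-score: distance of p-hat from the H0 value, in sd units
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# 1-sided P-value: lower-tail area only, i.e. half of the 2-sided value
one_sided = NormalDist().cdf(z)
print(round(z, 2), round(one_sided, 4))
```

The 1-sided P-value falls below 0.05 while the 2-sided value (0.0734) does not, which is why the yes-no conclusion changes.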

HW: 8.22a (0.0057), 8.23, interpret from both yes-no and gray-level viewpoints

2 Sample Proportions

In text Section 8.2

• Skip this

• Ideas are only slight variation of above

• Basically mix & match of 2 sample ideas, and proportion methods

• If you need it (later), pull out text

• Covered on exams to extent it is in HW

Chapter 9: Two-Way Tables

Main idea:

Divide up populations in two ways

– E.g. 1: Age & Sex

– E.g. 2: Education & Income

• Typical Major Question:

How do divisions relate?

• Are the divisions independent?

– Similar idea to independence in prob. theory

– Statistical Inference?

Two-Way Tables

Class Example 31, Textbook Example 9.18

Market researchers know that background music can influence mood and purchasing behavior. A supermarket compared three treatments: No music, French accordion music and Italian string music. Under each condition, the researchers recorded the numbers of bottles of French, Italian and other wine purchased.

Two-Way Tables

Class Example 31, Textbook Example 9.18

Here is the two-way table that summarizes the data:

                   Music
Wine        None   French   Italian
French        30       39        30
Italian       11        1        19
Other         43       35        35

Are the type of wine purchased and the background music related?

Two-Way Tables

Class Example 31: Visualization

Shows how counts are broken down by:

music type and wine type

[Figure: bar chart “Class Example 31 - Counts”: # Bottles purchased (vertical axis, 0 to 45) by Music (None, French, Italian) and wine type (French Wine, Italian Wine, Other Wine)]

Two-Way Tables

Big Question: Is there a relationship?

Note: tallest bars

• French Wine with French Music

• Italian Wine with Italian Music

• Other Wine with No Music

Suggests there is a relationship

[Figure: same bar chart, “Class Example 31 - Counts”]

Two-Way Tables

General Directions:

• Can we make this precise?

• Could it happen just by chance?

– Really: how likely to be a chance effect?

• Or is it statistically significant?

– I.e. music and wine purchase are related?

Two-Way Tables

Class Example 31, a look under the hood…

Excel Analysis, Part 1:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Notes:

• Read data from file

• Only appeared as column

• Had to re-arrange

• Better way to do this???

• Made graphic with chart wizard

Two-Way Tables

HW: Make 2-way bar graphs, and discuss relationships between the divisions, for the data in:

9.1 (younger people tend to be better educated)

9.9 (you try these…)

9.11

Two-Way Tables

An alternate view:

Replace counts by proportions (or %-ages)

Class Example 31 (Wine & Music), Part 2

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Advantage:

May be more interpretable

Drawback:

No real difference (just rescaled)
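A minimal Python sketch of the rescaling, assuming the proportions are taken relative to the grand total (the spreadsheet may instead use row or column percentages). The dictionary layout and names are illustrative.

```python
# Wine (rows) by music (columns) counts from Class Example 31
counts = {
    "French":  {"None": 30, "French": 39, "Italian": 30},
    "Italian": {"None": 11, "French": 1,  "Italian": 19},
    "Other":   {"None": 43, "French": 35, "Italian": 35},
}

total = sum(sum(row.values()) for row in counts.values())

# Assumption: each count is divided by the grand total
props = {
    wine: {music: c / total for music, c in row.items()}
    for wine, row in counts.items()
}

for wine, row in props.items():
    print(wine, {music: round(p, 3) for music, p in row.items()})
```

Every entry is divided by the same number, so the pattern of tall and short bars is unchanged.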

Two-Way Tables

Testing for independence:

What is it?

From probability theory:

P{A | B} = P{A}

i.e. Chances of A, when B is known, are same as when B is unknown

Table version of this idea?

Independence in 2-Way Tables

Recall:

P{A | B} = P{A}

Counts - proportions analog of these?

• Analog of P{A}?

– Proportions of factor A, “not knowing B”

– Called “marginal proportions”

• Analog of P{A|B}???

Independence in 2-Way Tables

Marginal proportions (or counts):

• Sums along rows

• Sums along columns

• Useful to write at margins of table

• Hence name marginal

• Numbers are of independent interest

• Also nice to put total at bottom

Independence in 2-Way Tables

Marginal Counts:

Class Example 31 (Wine & Music), Part 3

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Marginals are of independent interest:

• Other wines sold best (French second)

• Italian music sold most wine (84 bottles, tied with no music)…

• But don’t tell whole story

– E.g. Can’t see same music & wine is best…

– Full table tells more than marginals
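The marginal counts can be computed directly from the two-way table. A minimal Python sketch, with the list layout and names chosen for illustration only:

```python
wines = ["French", "Italian", "Other"]
musics = ["None", "French", "Italian"]
counts = [
    [30, 39, 30],   # French wine
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]

row_totals = [sum(row) for row in counts]          # one per wine type
col_totals = [sum(col) for col in zip(*counts)]    # one per music type
grand_total = sum(row_totals)

print(dict(zip(wines, row_totals)))
print(dict(zip(musics, col_totals)))
print(grand_total)
```

The row and column sums are what would be written at the margins of the table, with the grand total in the corner.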

Independence in 2-Way Tables

Recall definition of independence:

P{A | B} = P{A}

Counts analog of P{A|B}???

Recall:

P{A | B} = P{A & B} / P{B}

So equivalent condition is:

P{A} P{B} = P{A & B}

Independence in 2-Way Tables

Counts analog of P{A|B}???

Equivalent condition for independence is:

P{A & B} = P{A} P{B}

So for counts, look for:

Table Prop’n = Row Marg’l Prop’n x Col’n Marg’l Prop’n

i.e. Entry = Product of Marginals
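The “Entry = Product of Marginals” rule can be sketched in Python for the wine and music counts. Names such as `indep_props` and `expected` are illustrative, not from the class spreadsheet.

```python
counts = [
    [30, 39, 30],   # French wine: None, French, Italian music
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]

row_totals = [sum(r) for r in counts]
col_totals = [sum(c) for c in zip(*counts)]
n = sum(row_totals)

# Table proportion under independence = row marginal prop'n x column marginal prop'n
indep_props = [[(r / n) * (c / n) for c in col_totals] for r in row_totals]

# The same model on the count scale: expected count = row total x column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```

The expected counts keep the same marginal totals as the data, but carry no link between music type and wine type.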

Independence in 2-Way Tables

Visualize Product of Marginals for:

Class Example 31 (Wine & Music), Part 4

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Shows same structure as marginals

But no match between music & wine

Good null hypothesis

[Figure: bar chart “Class Example 31 - Independent Model”: product-of-marginals proportions (vertical axis 0 to 0.18, labeled “# Bottles purchased”) by Music (None, French, Italian) and wine type (French Wine, Italian Wine, Other Wine)]

Independence in 2-Way Tables

• Independent model appears different

• But is it really different?

• Or could difference be simply explained

by natural sampling variation?

• Check for statistical significance…

Independence in 2-Way Tables

Approach:

• Measure “distance between tables”

– Use Chi Square Statistic

– Has known probability distribution when table is independent

• Assess significance using P-value

– Set up as: H0: Indep. HA: Dependent

– P-value = P{what saw or m.c. | Indep.}

Independence in 2-Way Tables

Chi-square statistic:

$X^2 = \sum_{\text{cells } i} \dfrac{(Obs_i - Exp_i)^2}{Exp_i}$

Based on:

• Observed Counts (raw data), $Obs_i$

• Expected Counts (under indep.), $Exp_i$

Notes:

– Small for only random variation

– Large for significant departure from indep.

Independence in 2-Way Tables

Chi-square statistic calculation:

$X^2 = \sum_{\text{cells } i} \dfrac{(Obs_i - Exp_i)^2}{Exp_i}$

Class example 31, Part 5:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

– Calculate term by term

– Then sum

– Is $X^2 = 18.3$ “big” or “small”?
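The term-by-term calculation can be sketched in Python as follows. The function name `chi_square_stat` is hypothetical, and the expected counts are assumed to come from the product-of-marginals rule (row total x column total / grand total).

```python
def chi_square_stat(observed):
    """Sum of (Obs - Exp)^2 / Exp over all cells of a two-way table."""
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    n = sum(row_totals)

    x2 = 0.0
    for row, r in zip(observed, row_totals):
        for obs, c in zip(row, col_totals):
            exp = r * c / n                 # expected count under independence
            x2 += (obs - exp) ** 2 / exp    # one term per cell
    return x2


wine_by_music = [
    [30, 39, 30],   # French wine
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]
x2 = chi_square_stat(wine_by_music)
print(round(x2, 1))
```

Most of the total comes from the Italian wine row, where the observed counts (1 and 19) sit far from what independence predicts.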

Independence in 2-Way Tables

H0 distribution of the $X^2$ statistic:

“Chi Squared” (another Greek letter, $\chi^2$)

Parameter: “degrees of freedom”

(similar to T distribution)

Excel Computation:

– CHIDIST (given cutoff, find area = prob.)

– CHIINV (given prob = area, find cutoff)

Independence in 2-Way Tables

Explore the $\chi^2$ distribution:

Applet from Webster West (U. So. Carolina)

http://www.stat.sc.edu/~west/applets/chisqdemo.html

• Right Skewed Distribution

• Nearly Gaussian for more d.f.
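One way to see both bullets numerically, without the applet: the chi-square distribution with k degrees of freedom has skewness $\sqrt{8/k}$ (a standard fact, not taken from the slides). The sketch below prints it for a few values of d.f.; the function name is illustrative.

```python
from math import sqrt

def chi2_skewness(df):
    """Skewness of the chi-square distribution with df degrees of freedom."""
    return sqrt(8 / df)

for df in (1, 4, 20, 100):
    print(df, round(chi2_skewness(df), 3))
```

The skewness is always positive (right skewed) and shrinks toward 0, the Gaussian value, as the d.f. grow.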

Independence in 2-Way Tables

For test of independence, use:

degrees of freedom =

= (#rows – 1) x (#cols – 1)

E.g. Wine and Music:

d.f. = (3 – 1) x (3 – 1) = 4

Independence in 2-Way Tables

E.g. Wine and Music:

P-value = P{Observed $X^2$ or m.c. | Indep.} =

= P{$X^2$ = 18.3 or m.c. | Indep.} =

= P{$X^2$ >= 18.3 | d.f. = 4} =

= 0.0011

Also see Class Example 31, Part 5

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
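A Python sketch of this upper-tail area, playing the role of CHIDIST(18.3, 4). It assumes an even number of degrees of freedom, for which the tail area has the closed form $e^{-x/2} \sum_{j=0}^{df/2-1} (x/2)^j / j!$. The function name is hypothetical.

```python
from math import exp, factorial

def chi2_upper_tail_even_df(x, df):
    """P{X2 >= x} for a chi-square distribution with an even number of d.f."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))

df = (3 - 1) * (3 - 1)          # (#rows - 1) x (#cols - 1) = 4
p_value = chi2_upper_tail_even_df(18.3, df)
print(round(p_value, 4))
```

For d.f. = 4 the sum has only two terms, so the P-value is $e^{-9.15}(1 + 9.15)$.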

Independence in 2-Way Tables

E.g. Wine and Music:

P-value = 0.001

Yes-No: Very strong evidence against independence, conclude music has a statistically significant effect

Gray-Level: Also very strong evidence

Independence in 2-Way Tables

Excel shortcut:

CHITEST

• Avoids the (obs-exp)^2 / exp calculat’n

• Automatically computes d.f.

• Returns P-value
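The three bullets above can be imitated in one Python function. This is a sketch under stated assumptions, not Excel's implementation: the name `chi_test` is hypothetical, it builds the expected counts itself (Excel's CHITEST is given both the observed and the expected ranges), and the tail area uses a standard series for the regularized incomplete gamma function.

```python
from math import exp, lgamma, log

def chi2_upper_tail(x, df):
    """P{X2 >= x} for chi-square with df degrees of freedom (series method)."""
    if x <= 0:
        return 1.0
    a, half = df / 2, x / 2
    term, total, n = 1.0, 1.0, 0
    while term > 1e-15 * total:       # add series terms until they are negligible
        n += 1
        term *= half / (a + n)
        total += term
    lower = exp(-half + a * log(half) - lgamma(a + 1)) * total
    return 1.0 - lower

def chi_test(observed):
    """P-value for the test of independence in a two-way table of counts."""
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    n = sum(row_totals)
    x2 = sum(
        (obs - r * c / n) ** 2 / (r * c / n)      # (obs - exp)^2 / exp
        for row, r in zip(observed, row_totals)
        for obs, c in zip(row, col_totals)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2_upper_tail(x2, df)

wine_by_music = [[30, 39, 30], [11, 1, 19], [43, 35, 35]]
p_value = chi_test(wine_by_music)
print(round(p_value, 4))
```

The printed value should match the 0.0011 found with the step-by-step calculation, up to rounding.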

