
Upload: richard-james

Post on 16-Dec-2015


Stat 31, Section 1, Last Time

• Paired Diff’s vs. Unmatched Samples

– Compare with example

– Showed graphic about Paired often better

• Review of Gray Level Hypo Testing

• Inference for Proportions

– Confidence Intervals

– Sample Size Calculation

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 536-549, 555-566, 582-611

Approximate Reading for Next Class:

Pages 582-611, 634-667

Midterm II

Coming on Tuesday, April 10

Think about:

• Sheet of Formulas

– Again single 8 ½ x 11 sheet

– New, since now more formulas

• Redoing HW…

• Asking about those not understood

• Midterm not cumulative

• Covered Material: HW 7 - 11

Midterm II

Extra Office Hours:

Monday, 4/9, 10:00 – 12:00

12:30 – 3:00

Tuesday, 4/10, 8:30 – 10:00

11:00 – 12:00

Hypo. Tests for Proportions

Case 3: Hypothesis Testing

General Setup: Given Value

$H_0:\ p\ \ldots$

$H_A:\ p\ \ldots$

Hypo. Tests for Proportions

Assess strength of evidence by:

P-value = P{what saw or m.c. | B’dry} =

= P{observed or m.c. | p = given value}

Problem: sd of $\hat{p}$ is $\sqrt{\frac{p(1-p)}{n}}$, which depends on the unknown $p$

Hypo. Tests for Proportions

Problem: sd of $\hat{p}$ is $\sqrt{\frac{p(1-p)}{n}}$

Solution: (different from above “best guess” and “conservative”)

calculation is done based on the $H_0$ boundary value of $p$
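To see how much the choice of p matters, here is a minimal Python sketch (not from the original slides) that evaluates the sd formula under the three choices. The helper name sd_phat and the numbers n = 500, 0.44 and 0.48 are assumptions used only for illustration.

```python
import math

def sd_phat(p, n):
    """Standard deviation of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1.0 - p) / n)

n = 500          # illustrative sample size (assumption)
p_hat = 0.44     # illustrative observed proportion (assumption)
p_null = 0.48    # illustrative H0 boundary value (assumption)

sd_best_guess = sd_phat(p_hat, n)    # "best guess": plug in the observed proportion
sd_conservative = sd_phat(0.5, n)    # "conservative": sd is largest at p = 1/2
sd_testing = sd_phat(p_null, n)      # hypothesis testing: plug in the H0 value

print(f"best guess:   {sd_best_guess:.4f}")
print(f"conservative: {sd_conservative:.4f}")
print(f"testing (H0): {sd_testing:.4f}")
```

The three values are close here, but only the H0 value matches the condition in the P-value, which is computed assuming p sits at the boundary.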

Hypo. Tests for Proportions

e.g. Old Text Problem 8.16

Of 500 respondents in a Christmas tree marketing survey, 44% had no children at home and 56% had at least one child at home. The corresponding figures from the most recent census are 48% with no children, and 52% with at least one. Test the null hypothesis that the telephone survey has a probability of selecting a household with no children that is equal to the value of the last census. Give a Z-statistic and P-value.

Hypo. Tests for Proportions

e.g. Old Text Problem 8.16

Let p = % with no child (worth writing down)

$H_0: p = 0.48$

$H_A: p \ne 0.48$

Hypo. Tests for Proportions

Observed $\hat{p} = 0.44$, from $n = 500$

P-value = $P\{\hat{p} = 0.44 \text{ or m.c.} \mid p = 0.48\}$

= $P\{|\hat{p} - p| \ge 0.04 \mid p = 0.48\}$

= $2 P\{\hat{p} \le 0.44\}$

Hypo. Tests for Proportions

P-value = $2 P\{\hat{p} \le 0.44\}$

= 2 * NORMDIST(0.44,0.48,sqrt(0.48*(1-0.48)/500),true)

See Class Example 30, Part 3
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg30.xls

= 0.0734

Yes-No: no strong evidence

Gray-level: somewhat strong evidence
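For readers without Excel, the same calculation can be sketched in Python. This is an illustrative equivalent, not part of the class example: it assumes Python 3.8+ for the standard library's statistics.NormalDist, whose cdf plays the role of NORMDIST(..., true).

```python
import math
from statistics import NormalDist

n = 500
p_null = 0.48   # H0 value, used for both the mean and the sd
p_hat = 0.44    # observed proportion

sd = math.sqrt(p_null * (1 - p_null) / n)

# Two-sided P-value: twice the lower-tail area below the observed proportion
p_value = 2 * NormalDist(mu=p_null, sigma=sd).cdf(p_hat)
print(round(p_value, 4))
```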

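Hypo. Tests for Proportions

Z-score version:

P-value = $P\{|\hat{p} - p| \ge 0.04\}$

= $P\left\{ \frac{|\hat{p} - p|}{\sqrt{p(1-p)/n}} \ge \frac{0.04}{\sqrt{0.48(1-0.48)/500}} \right\}$

So Z-score is: $\frac{0.04}{\sqrt{0.48(1-0.48)/500}} = 1.79$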
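The Z-score route gives the same P-value from the standard normal curve. A minimal sketch, again assuming Python 3.8+ for statistics.NormalDist; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

n, p_null, p_hat = 500, 0.48, 0.44

# Standardize the observed difference using the H0 value in the sd
z = (p_hat - p_null) / math.sqrt(p_null * (1 - p_null) / n)

# Two-sided P-value from the standard normal: area beyond |z| in both tails
p_value = 2 * NormalDist().cdf(-abs(z))
print(round(abs(z), 2), round(p_value, 4))
```

The sign of z is negative because the observed proportion falls below the H0 value; the slide reports its size, 1.79.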

Hypo. Tests for Proportions

Note also 1-sided version:

Yes-no: is strong evidence

Gray Level: stronger evidence
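A sketch of the 1-sided calculation for the same data. The slide does not state the direction, so this assumes the alternative p < 0.48 (the side the observed 0.44 falls on); it uses statistics.NormalDist from Python 3.8+.

```python
import math
from statistics import NormalDist

n, p_null, p_hat = 500, 0.48, 0.44
sd = math.sqrt(p_null * (1 - p_null) / n)

# 1-sided P-value (assumed alternative: p < 0.48): lower-tail area only,
# i.e. half of the two-sided value
p_one_sided = NormalDist(mu=p_null, sigma=sd).cdf(p_hat)
print(round(p_one_sided, 4))
```

Halving the P-value moves it below 0.05, which is why the yes-no conclusion changes.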

HW: 8.22a (0.0057), 8.23, interpret

from both yes-no and gray-level

viewpoints

2 Sample Proportions

In text Section 8.2

• Skip this

• Ideas are only slight variation of above

• Basically mix & match of 2 sample ideas, and proportion methods

• If you need it (later), pull out text

• Covered on exams to extent it is in HW

Chapter 9: Two-Way Tables

Main idea:

Divide up populations in two ways

– E.g. 1: Age & Sex

– E.g. 2: Education & Income

• Typical Major Question:

How do divisions relate?

• Are the divisions independent?

– Similar idea to independence in prob. theory

– Statistical Inference?

Two-Way Tables

Class Example 31, Textbook Example 9.18

Market researchers know that background music can influence mood and purchasing behavior. A supermarket compared three treatments: no music, French accordion music and Italian string music. Under each condition, the researchers recorded the numbers of bottles of French, Italian and other wine purchased.

Two-Way Tables

Class Example 31, Textbook Example 9.18

Here is the two way table that summarizes the data:

                  Music
Wine:       None   French   Italian
French        30       39        30
Italian       11        1        19
Other         43       35        35

Are the type of wine purchased and the background music related?

Two-Way Tables

Class Example 31: Visualization

Shows how counts are broken down by music type and wine type

[Bar chart: Class Example 31 - Counts. # Bottles purchased (0 to 45) by Music (None, French, Italian), with bars for French Wine, Italian Wine, Other Wine]

Two-Way Tables

Big Question: Is there a relationship?

Note: tallest bars

– French Wine: French Music

– Italian Wine: Italian Music

– Other Wine: No Music

Suggests there is a relationship

[Bar chart: Class Example 31 - Counts. # Bottles purchased (0 to 45) by Music (None, French, Italian), with bars for French Wine, Italian Wine, Other Wine]

Two-Way Tables

General Directions:

• Can we make this precise?

• Could it happen just by chance?

– Really: how likely to be a chance effect?

• Or is it statistically significant?

– I.e. music and wine purchase are related?

Two-Way Tables

Class Example 31, a look under the hood…

Excel Analysis, Part 1:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Notes:

• Read data from file

• Only appeared as column

• Had to re-arrange

• Better way to do this???

• Made graphic with chart wizard

Two-Way Tables

HW: Make 2-way bar graphs, and discuss

relationships between the divisions, for

the data in:

9.1 (younger people tend to be better

educated)

9.9 (you try these…)

9.11

Two-Way Tables

An alternate view:

Replace counts by proportions (or %-ages)

Class Example 31 (Wine & Music), Part 2
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Advantage:

May be more interpretable

Drawback:

No real difference (just rescaled)
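The rescaling can be sketched in a few lines of Python. This is an illustration rather than the Excel workbook's method, and it assumes the proportions are taken out of the grand total of all bottles; the variable names are made up for the example.

```python
# Wine & music counts from Class Example 31 (rows: wine, columns: music)
wines = ["French", "Italian", "Other"]
music = ["None", "French", "Italian"]
counts = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

total = sum(sum(row) for row in counts)                  # all bottles sold
props = [[c / total for c in row] for row in counts]     # each cell as a share of the total

for wine, row in zip(wines, props):
    print(f"{wine:8s}", "  ".join(f"{p:.3f}" for p in row))
```

Every cell is divided by the same number, so the bar heights keep their relative sizes; only the axis scale changes.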

Two-Way Tables

Testing for independence:

What is it?

From probability theory:

P{A | B} = P{A}

i.e. Chances of A, when B is known, are same as when B is unknown

Table version of this idea?

Independence in 2-Way Tables

Recall:

P{A | B} = P{A}

Counts - proportions analog of these?

• Analog of P{A}?

– Proportions of factor A, “not knowing B”

– Called “marginal proportions”

• Analog of P{A|B}???

Independence in 2-Way Tables

Marginal proportions (or counts):

• Sums along rows

• Sums along columns

• Useful to write at margins of table

• Hence name marginal

• Number of independent interest

• Also nice to put total at bottom

Independence in 2-Way Tables

Marginal Counts:

Class Example 31 (Wine & Music), Part 3
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Marginals are of independent interest:

• Other wines sold best (French second)

• Italian music sold most wine (tied with no music)…

• But don’t tell whole story

– E.g. Can’t see same music & wine is best…

– Full table tells more than marginals
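The marginal counts are just row and column sums of the table. A minimal Python sketch (illustrative, not the Excel workbook itself; variable names are assumptions):

```python
# Wine & music counts from Class Example 31 (rows: wine, columns: music)
wines = ["French", "Italian", "Other"]
music = ["None", "French", "Italian"]
counts = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

row_totals = [sum(row) for row in counts]          # bottles of each wine type
col_totals = [sum(col) for col in zip(*counts)]    # bottles sold under each music type
total = sum(row_totals)                            # grand total, bottom-right corner

print(dict(zip(wines, row_totals)))
print(dict(zip(music, col_totals)))
print(total)
```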

Independence in 2-Way Tables

Recall definition of independence:

P{A | B} = P{A}

Counts analog of P{A|B}???

Recall:

P{A} = P{A | B} = P{A & B} / P{B}

So equivalent condition is:

P{A} P{B} = P{A & B}

Independence in 2-Way Tables

Counts analog of P{A|B}???

Equivalent condition for independence is:

P{A & B} = P{A} P{B}

So for counts, look for:

Table Prop’n = Row Marg’l Prop’n x Col’n Marg’l Prop’n

i.e. Entry = Product of Marginals
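The "Entry = Product of Marginals" check can be sketched for the wine and music table as follows. This is an illustrative Python version, not the class spreadsheet; the names indep and observed are assumptions.

```python
# Wine & music counts from Class Example 31 (rows: wine, columns: music)
counts = [
    [30, 39, 30],   # French wine
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]

total = sum(sum(row) for row in counts)
row_props = [sum(row) / total for row in counts]          # marginal prop'ns of wine type
col_props = [sum(col) / total for col in zip(*counts)]    # marginal prop'ns of music type

# Table of products of marginals: what the proportions would be under independence
indep = [[rp * cp for cp in col_props] for rp in row_props]
observed = [[c / total for c in row] for row in counts]

# Compare one cell: Italian wine under French music
print(f"observed {observed[1][1]:.4f}  vs  product of marginals {indep[1][1]:.4f}")
```

For that cell the observed proportion is far below the product of marginals, which is the kind of gap the test of independence measures.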

Independence in 2-Way Tables

Visualize Product of Marginals for:

Class Example 31 (Wine & Music), Part 4
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

Shows same structure as marginals

But no match between music & wine

Good null hypothesis

[Bar chart: Class Example 31 - Independent Model. Vertical axis labeled # Bottles purchased (scale 0 to 0.18) by Music (None, French, Italian), with bars for French Wine, Italian Wine, Other Wine]

Independence in 2-Way Tables

• Independent model appears different

• But is it really different?

• Or could difference be simply explained

by natural sampling variation?

• Check for statistical significance…

Independence in 2-Way Tables

Approach:

• Measure “distance between tables”

– Use Chi Square Statistic

– Has known probability distribution when table is independent

• Assess significance using P-value

– Set up as: H0: Indep. HA: Dependent

– P-value = P{what saw or m.c. | Indep.}

Independence in 2-Way Tables

Chi-square statistic: Based on:

• Observed Counts (raw data), $Obs_i$

• Expected Counts (under indep.), $Exp_i$

$$X^2 = \sum_{\text{cells } i} \frac{(Obs_i - Exp_i)^2}{Exp_i}$$

Notes:

– Small for only random variation

– Large for significant departure from indep.

Independence in 2-Way Tables

Chi-square statistic calculation:

$$X^2 = \sum_{\text{cells } i} \frac{(Obs_i - Exp_i)^2}{Exp_i}$$

Class example 31, Part 5:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls

– Calculate term by term

– Then sum

– Is $X^2 = 18.3$ “big” or “small”?
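The term-by-term calculation can be sketched in Python as below. This is an illustration, not the spreadsheet itself; it assumes the expected count for a cell is (row total x column total) / grand total, which is the count version of "Entry = Product of Marginals".

```python
# Wine & music counts from Class Example 31 (rows: wine, columns: music)
counts = [
    [30, 39, 30],   # French wine
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
total = sum(row_totals)

terms = []
for i, row in enumerate(counts):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / total   # expected count under independence
        terms.append((obs - exp) ** 2 / exp)          # one (Obs - Exp)^2 / Exp term

x2 = sum(terms)                                       # chi-square statistic
print(round(x2, 1))
```

The largest single term comes from the Italian wine, French music cell, where only 1 bottle was sold against roughly 9.6 expected.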

Independence in 2-Way Tables

H0 distribution of the $X^2$ statistic:

“Chi Squared” (another Greek letter, $\chi^2$)

Parameter: “degrees of freedom”

(similar to T distribution)

Excel Computation:

– CHIDIST (given cutoff, find area = prob.)

– CHIINV (given prob = area, find cutoff)

Independence in 2-Way Tables

Explore the $\chi^2$ distribution:

Applet from Webster West (U. So. Carolina)
http://www.stat.sc.edu/~west/applets/chisqdemo.html

• Right Skewed Distribution

• Nearly Gaussian for more d.f.

Independence in 2-Way Tables

For test of independence, use:

degrees of freedom =

= (#rows – 1) x (#cols – 1)

E.g. Wine and Music:

d.f. = (3 – 1) x (3 – 1) = 4

Independence in 2-Way Tables

E.g. Wine and Music:

P-value = P{Observed $X^2$ or m.c. | Indep.} =

= P{$X^2$ = 18.3 or m.c. | Indep.} =

= P{$X^2$ >= 18.3 | d.f. = 4} =

= 0.0011

Also see Class Example 31, Part 5
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
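This P-value can be checked without Excel. The sketch below is an illustration that relies on a known special case: for 4 degrees of freedom the chi-square upper-tail area has the closed form exp(-x/2) (1 + x/2). It is not a general replacement for CHIDIST.

```python
import math

n_rows, n_cols = 3, 3
df = (n_rows - 1) * (n_cols - 1)      # (3 - 1) x (3 - 1) = 4

x2 = 18.3                             # chi-square statistic from the slide

# Upper-tail area P{X2 >= 18.3 | d.f. = 4}, valid for d.f. = 4 only
p_value = math.exp(-x2 / 2) * (1 + x2 / 2)
print(df, round(p_value, 4))
```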

Independence in 2-Way Tables

E.g. Wine and Music:

P-value = 0.001

Yes-No: Very strong evidence against

independence, conclude music has a

statistically significant effect

Gray-Level: Also very strong

evidence

Independence in 2-Way Tables

Excel shortcut:

CHITEST

• Avoids the (obs-exp)^2 / exp calculation

• Automatically computes d.f.

• Returns P-value
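For comparison, here is a one-call version in Python using only the standard library. It is a sketch under stated assumptions, not Excel's implementation: the function names chi2_sf and chitest are hypothetical, and unlike CHITEST this version builds the expected counts itself from the observed table.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square area P{X2 >= x} for integer df (the role of CHIDIST)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                  # exact tail area for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))       # exact tail area for df = 1
    while a < df / 2.0:                           # each pass adds 2 degrees of freedom
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chitest(observed):
    """Return (X2, df, P-value) for a test of independence on a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            x2 += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, chi2_sf(x2, df)

# Wine & music counts from Class Example 31 (rows: wine, columns: music)
wine_music = [[30, 39, 30], [11, 1, 19], [43, 35, 35]]
x2, df, p_value = chitest(wine_music)
print(round(x2, 1), df, round(p_value, 4))
```

If SciPy is available, scipy.stats.chi2_contingency performs a similar calculation and is the more usual choice in practice.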