Stat 31, Section 1, Last Time
• Paired Diff’s vs. Unmatched Samples
– Compare with example
– Showed graphic about Paired often better
• Review of Gray Level Hypo Testing
• Inference for Proportions
– Confidence Intervals
– Sample Size Calculation
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 536-549, 555-566, 582-611
Approximate Reading for Next Class:
Pages 582-611, 634-667
Midterm II
Coming on Tuesday, April 10
Think about:
• Sheet of Formulas
– Again single 8 ½ x 11 sheet
– New, since now more formulas
• Redoing HW…
• Asking about those not understood
• Midterm not cumulative
• Covered Material: HW 7 - 11
Midterm II
Extra Office Hours:
Monday, 4/9, 10:00 – 12:00
12:30 – 3:00
Tuesday, 4/10, 8:30 – 10:00
11:00 – 12:00
Hypo. Tests for Proportions
Assess strength of evidence by:
P-value = P{what saw or m.c. | B’dry}
= P{observed $\hat{p}$ or m.c. | $p$ = B’dry value}
Problem: sd of $\hat{p}$ is $\sqrt{p(1-p)/n}$, which depends on the unknown $p$
Hypo. Tests for Proportions
Problem: sd of $\hat{p}$ is $\sqrt{p(1-p)/n}$
Solution: (different from above “best guess” and “conservative”)
calculation is done based on: $p$ = B’dry value from $H_0$
Hypo. Tests for Proportions
e.g. Old Text Problem 8.16
Of 500 respondents in a Christmas tree marketing survey, 44% had no children at home and 56% had at least one child at home. The corresponding figures from the most recent census are 48% with no children, and 52% with at least one. Test the null hypothesis that the telephone survey has a probability of selecting a household with no children that is equal to the value of the last census. Give a Z-statistic and P-value.
Hypo. Tests for Proportions
e.g. Old Text Problem 8.16
Let p = % with no child (worth writing down)
$H_0: p = 0.48$
$H_A: p \neq 0.48$
Hypo. Tests for Proportions
Observed $\hat{p} = 0.44$, from $n = 500$
P-value = $P\{\hat{p} = 0.44 \text{ or m.c.} \mid p = 0.48\}$
= $P\{|\hat{p} - p| \ge 0.04 \mid p = 0.48\}$
= $2 P\{\hat{p} \le 0.44\}$
Hypo. Tests for Proportions
P-value = $2 P\{\hat{p} \le 0.44\}$
= 2 * NORMDIST(0.44,0.48,sqrt(0.48*(1-0.48)/500),true)
= 0.0734
See Class Example 30, Part 3
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg30.xls
Yes-No: no strong evidence
Gray-level: somewhat strong evidence
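A minimal Python sketch of the same P-value calculation, assuming the normal approximation that NORMDIST uses above. The variable names are illustrative, and `statistics.NormalDist` stands in for the Excel function.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.48      # boundary value of p under H0 (census figure)
p_hat = 0.44   # observed sample proportion
n = 500        # number of respondents

# sd of p-hat, computed from the H0 value of p (not from p-hat)
sd = sqrt(p0 * (1 - p0) / n)

# Two-sided P-value: twice the lower-tail area below the observed p-hat
p_value = 2 * NormalDist(mu=p0, sigma=sd).cdf(p_hat)
print(round(p_value, 4))
```

The cumulative normal call plays the role of NORMDIST with its last argument set to true.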
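Hypo. Tests for Proportions
Z-score version:
P-value = $P\{|\hat{p} - p| \ge 0.04\} = P\left\{ \frac{|\hat{p} - p|}{\sqrt{p(1-p)/n}} \ge \frac{0.04}{\sqrt{0.48(1-0.48)/500}} \right\}$
So Z-score is: $\frac{0.04}{\sqrt{0.48(1-0.48)/500}} = 1.79$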
Hypo. Tests for Proportions
Note also 1-sided version:
Yes-no: is strong evidence
Gray Level: stronger evidence
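The Z-score and 1-sided versions can be sketched in Python as follows. This is an illustration, not the class spreadsheet: the direction of the 1-sided alternative (p < 0.48) is an assumption, and the signed Z comes out negative where the slide reports its magnitude.

```python
from math import sqrt
from statistics import NormalDist

p0, p_hat, n = 0.48, 0.44, 500

# Standardize using the sd computed from the H0 value of p
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

std_normal = NormalDist()                 # mean 0, sd 1
two_sided = 2 * std_normal.cdf(-abs(z))   # area in both tails
one_sided = std_normal.cdf(z)             # assumed H_A: p < 0.48

print(round(z, 2), round(two_sided, 4), round(one_sided, 4))
```

The 1-sided P-value is half the 2-sided one, which is why the same data give stronger evidence under the 1-sided alternative.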
HW: 8.22a (0.0057), 8.23, interpret
from both yes-no and gray-level
viewpoints
2 Sample Proportions
In text Section 8.2
• Skip this
• Ideas are only slight variation of above
• Basically mix & match of 2 sample ideas and proportion methods
• If you need it (later), pull out text
• Covered on exams to extent it is in HW
Chapter 9: Two-Way Tables
Main idea:
Divide up populations in two ways
– E.g. 1: Age & Sex
– E.g. 2: Education & Income
• Typical Major Question:
How do divisions relate?
• Are the divisions independent?
– Similar idea to independence in prob. theory
– Statistical Inference?
Two-Way Tables
Class Example 31, Textbook Example 9.18
Market Researchers know that background music can influence mood and purchasing behavior. A supermarket compared three treatments: No music, French accordion music and Italian string music. Under each condition, the researchers recorded the numbers of bottles of French, Italian and other wine purchased.
Two-Way Tables
Class Example 31, Textbook Example 9.18
Here is the two way table that summarizes the data:

                  Music
Wine        None   French   Italian
French       30      39       30
Italian      11       1       19
Other        43      35       35

Are the type of wine purchased and the background music related?
Two-Way Tables
Class Example 31: Visualization
Shows how counts are broken down by: music type, wine type
[Bar chart: Class Example 31 - Counts. # Bottles purchased (0 to 45) by Music (None, French, Italian) and wine type (French Wine, Italian Wine, Other Wine)]
Two-Way Tables
Big Question: Is there a relationship?
Note: tallest bars
French Wine with French Music
Italian Wine with Italian Music
Other Wine with No Music
Suggests there is a relationship
Two-Way TablesGeneral Directions:
• Can we make this precise?
• Could it happen just by chance?
– Really: how likely to be a chance effect?
• Or is it statistically significant?
– I.e. music and wine purchase are related?
Two-Way Tables
Class Example 31, a look under the hood…
Excel Analysis, Part 1:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
Notes:
• Read data from file
• Only appeared as column
• Had to re-arrange
• Better way to do this???
• Made graphic with chart wizard
Two-Way Tables
HW: Make 2-way bar graphs, and discuss
relationships between the divisions, for
the data in:
9.1 (younger people tend to be better
educated)
9.9 (you try these…)
9.11
Two-Way Tables
An alternate view:
Replace counts by proportions (or %-ages)
Class Example 31 (Wine & Music), Part 2
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
Advantage:
May be more interpretable
Drawback:
No real difference (just rescaled)
Two-Way Tables
Testing for independence:
What is it?
From probability theory:
P{A | B} = P{A}
i.e. Chances of A, when B is known, are same as when B is unknown
Table version of this idea?
Independence in 2-Way Tables
Recall:
P{A | B} = P{A}
Counts - proportions analog of these?
• Analog of P{A}?
– Proportions of factor A, “not knowing B”
– Called “marginal proportions”
• Analog of P{A|B}???
Independence in 2-Way Tables
Marginal proportions (or counts):
• Sums along rows
• Sums along columns
• Useful to write at margins of table
• Hence name marginal
• Numbers of independent interest
• Also nice to put total at bottom
Independence in 2-Way Tables
Marginal Counts:
Class Example 31 (Wine & Music), Part 3
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
Marginals are of independent interest:
• Other wines sold best (French second)
• Italian music (tied with no music, 84 bottles each) sold most wine…
• But don’t tell whole story
– E.g. Can’t see same music & wine is best…
– Full table tells more than marginals
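As a rough illustration of the marginal counts, here is a short Python sketch that sums the Class Example 31 table along rows and columns. The variable names are illustrative and are not taken from the class spreadsheet.

```python
# Counts from Class Example 31: rows = wine type, columns = music type
music = ["None", "French", "Italian"]
counts = {
    "French":  [30, 39, 30],
    "Italian": [11, 1, 19],
    "Other":   [43, 35, 35],
}

# Row marginals: total bottles of each wine, ignoring music
row_totals = {wine: sum(row) for wine, row in counts.items()}

# Column marginals: total bottles sold under each music type, ignoring wine
col_totals = {m: sum(row[j] for row in counts.values())
              for j, m in enumerate(music)}

# Grand total, the number to put at the bottom corner of the table
grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(grand_total)
```

The marginals alone give no hint of the French-with-French and Italian-with-Italian pattern, which only shows up in the full table.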
Independence in 2-Way Tables
Recall definition of independence:
P{A | B} = P{A}
Counts analog of P{A|B}???
Recall: $P\{A \mid B\} = \frac{P\{A \,\&\, B\}}{P\{B\}}$
So equivalent condition is: $P\{A\} P\{B\} = P\{A \,\&\, B\}$
Independence in 2-Way Tables
Counts analog of P{A|B}???
Equivalent condition for independence is: $P\{A \,\&\, B\} = P\{A\} P\{B\}$
So for counts, look for:
Table Prop’n = Row Marg’l Prop’n x Col’n Marg’l Prop’n
i.e. Entry = Product of Marginals
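The "Entry = Product of Marginals" idea can be sketched in Python for the wine and music counts. This is only an illustration with made-up variable names; it builds the table of proportions that exact independence would give and prints it next to the observed proportions.

```python
counts = [
    [30, 39, 30],   # French wine  (music: None, French, Italian)
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]
total = sum(sum(row) for row in counts)

# Marginal proportions: analogs of P{A} (wine) and P{B} (music)
row_prop = [sum(row) / total for row in counts]
col_prop = [sum(col) / total for col in zip(*counts)]

# Independent model: each entry is the product of its two marginals
indep = [[r * c for c in col_prop] for r in row_prop]

# Observed table proportions, the analog of P{A & B}
observed = [[x / total for x in row] for row in counts]

for obs_row, ind_row in zip(observed, indep):
    print([round(x, 3) for x in obs_row], [round(x, 3) for x in ind_row])
```

The largest gap is in the Italian wine row, where the observed proportion under French music is far below the product of its marginals.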
Independence in 2-Way Tables
Visualize Product of Marginals for:
Class Example 31 (Wine & Music), Part 4
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
Shows same structure as marginals
But not match between music & wine
Good null hypothesis
[Bar chart: Class Example 31 - Independent Model. Vertical axis labeled "# Bottles purchased" (scale 0 to 0.18) by Music (None, French, Italian) and wine type (French Wine, Italian Wine, Other Wine)]
Independence in 2-Way Tables
• Independent model appears different
• But is it really different?
• Or could difference be simply explained
by natural sampling variation?
• Check for statistical significance…
Independence in 2-Way Tables
Approach:
• Measure “distance between tables”
– Use Chi Square Statistic
– Has known probability distribution when table is independent
• Assess significance using P-value
– Set up as: H0: Indep. HA: Dependent
– P-value = P{what saw or m.c. | Indep.}
Independence in 2-Way Tables
Chi-square statistic: Based on:
• Observed Counts (raw data), $Obs_i$
• Expected Counts (under indep.), $Exp_i$
$X^2 = \sum_{\text{cells } i} \frac{(Obs_i - Exp_i)^2}{Exp_i}$
Notes:
– Small for only random variation
– Large for significant departure from indep.
Independence in 2-Way Tables
Chi-square statistic calculation:
$X^2 = \sum_{\text{cells } i} \frac{(Obs_i - Exp_i)^2}{Exp_i}$
Class example 31, Part 5:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
– Calculate term by term
– Then sum
– Is $X^2 = 18.3$ “big” or “small”?
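The term-by-term calculation can be sketched in Python as below. This is a minimal illustration on the Class Example 31 counts, not a copy of the spreadsheet, and it assumes the expected counts are row total x column total / grand total (the counts form of the product of marginals).

```python
counts = [
    [30, 39, 30],   # French wine  (music: None, French, Italian)
    [11, 1, 19],    # Italian wine
    [43, 35, 35],   # Other wine
]
total = sum(sum(row) for row in counts)
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]

x2 = 0.0
for i, row in enumerate(counts):
    for j, obs in enumerate(row):
        # Expected count under independence: product of marginals, rescaled
        exp = row_totals[i] * col_totals[j] / total
        # One term of the sum: squared difference, relative to expected
        x2 += (obs - exp) ** 2 / exp

print(round(x2, 1))
```

Most of the total comes from the Italian wine row, where the French music and Italian music cells sit far from their expected counts.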
Independence in 2-Way Tables
H0 distribution of the $X^2$ statistic:
“Chi Squared” (another Greek letter, $\chi^2$)
Parameter: “degrees of freedom”
(similar to T distribution)
Excel Computation:
– CHIDIST (given cutoff, find area = prob.)
– CHIINV (given prob = area, find cutoff)
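For readers working outside Excel, here is a hedged Python sketch of the two directions (cutoff to area, and area to cutoff). The function names `chi_sq_tail` and `chi_sq_cutoff` are hypothetical. The closed-form tail used here holds only for even degrees of freedom, so this is not a general replacement for CHIDIST and CHIINV. The 0.05 and 4 d.f. in the usage lines are illustrative choices.

```python
from math import exp

def chi_sq_tail(x, df):
    """Upper-tail area P{chi-square >= x}; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    if x <= 0:
        return 1.0
    half = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # builds (x/2)^j / j!
        total += term
    return exp(-half) * total

def chi_sq_cutoff(prob, df, hi=1000.0):
    """Cutoff whose upper-tail area is prob, found by bisection on chi_sq_tail."""
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi_sq_tail(mid, df) > prob:
            lo = mid              # tail still too big: move cutoff right
        else:
            hi = mid
    return (lo + hi) / 2

cutoff = chi_sq_cutoff(0.05, 4)   # area -> cutoff (CHIINV direction)
print(round(cutoff, 2))
print(round(chi_sq_tail(cutoff, 4), 4))   # cutoff -> area (CHIDIST direction)
```

Bisection works because the tail area decreases steadily as the cutoff grows.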
Independence in 2-Way Tables
Explore the $\chi^2$ distribution:
Applet from Webster West (U. So. Carolina)
http://www.stat.sc.edu/~west/applets/chisqdemo.html
• Right Skewed Distribution
• Nearly Gaussian for more d.f.
Independence in 2-Way Tables
For test of independence, use:
degrees of freedom =
= (#rows – 1) x (#cols – 1)
E.g. Wine and Music:
d.f. = (3 – 1) x (3 – 1) = 4
Independence in 2-Way Tables
E.g. Wine and Music:
P-value = P{Observed $X^2$ or m.c. | Indep.}
= P{$X^2$ = 18.3 or m.c. | Indep.}
= P{$X^2$ >= 18.3 | d.f. = 4}
= 0.0011
Also see Class Example 31, Part 5
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg31.xls
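As a hand check on the 0.0011, the upper tail of the chi-square distribution has a simple closed form when d.f. = 4. The worked step below uses that standard fact, which is not part of the class spreadsheet; the subscript 4 on $\chi^2$ denotes the degrees of freedom.

```latex
% Closed form of the chi-square upper tail, valid for d.f. = 4
P\{\chi^2_4 \ge x\} = e^{-x/2}\left(1 + \frac{x}{2}\right)
\quad\Rightarrow\quad
P\{\chi^2_4 \ge 18.3\} = e^{-9.15}\,(1 + 9.15) \approx 0.0011
```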
Independence in 2-Way Tables
E.g. Wine and Music:
P-value = 0.001
Yes-No: Very strong evidence against
independence, conclude music has a
statistically significant effect
Gray-Level: Also very strong
evidence