Post on 18-Jan-2018
Stat 31, Section 1, Last Time
• Distribution of Sample Means
– Expected Value same
– Variance less, Law of Averages, I
– Dist’n Normal, Law of Averages, II
• Statistical Inference
– Confidence Intervals
Choice of Sample Size
Additional use of margin of error idea
Background: distributions of $\bar{X}$, with S.D. $\sigma/\sqrt{n}$
[Figure: two distributions, Small n and Large n]
Choice of Sample Size
Could choose n to make $\sigma/\sqrt{n}$ = desired value
But S. D. is not very interpretable, so make “margin of error”, m = desired value
Then get: “$\bar{X}$ is within m units of $\mu$, 95% of the time”
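To see how the margin of error shrinks as n grows, here is a minimal Python sketch. It assumes a known population S.D. and uses the normal quantile that Excel's NORMINV(0.975, 0, 1) returns; the function name `margin_of_error` and the value sigma = 15 are illustrative assumptions, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """Margin of error m such that X-bar is within m of mu with the given probability.

    Assumes X-bar is (approximately) normal with S.D. sigma / sqrt(n).
    """
    # For 95% confidence this is the 0.975 quantile of N(0, 1), about 1.96.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / sqrt(n)


# Hypothetical population S.D. of 15: m shrinks like 1 / sqrt(n).
for n in (10, 100, 1000):
    print(n, round(margin_of_error(15.0, n), 2))
```

Multiplying n by 100 only cuts m by a factor of 10, which is why small margins of error need large samples.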
Choice of Sample Size
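Given m, how do we find n?
Solve for n (the equation):
$$0.95 = P\{|\bar{X} - \mu| \le m\} = P\left\{\frac{|\bar{X} - \mu|}{\sigma/\sqrt{n}} \le \frac{m}{\sigma/\sqrt{n}}\right\} = P\left\{|Z| \le \frac{\sqrt{n}\, m}{\sigma}\right\}$$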
Choice of Sample Size
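Graphically, find m so that:
[Figure: standard normal density; Area = 0.95 between $-\sqrt{n}\,m/\sigma$ and $\sqrt{n}\,m/\sigma$, equivalently Area = 0.975 to the left of $\sqrt{n}\,m/\sigma$]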
Choice of Sample Size
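Thus solve:
$$\frac{\sqrt{n}\, m}{\sigma} = \mathrm{NORMINV}(0.975, 0, 1)$$
$$\sqrt{n} = \frac{\mathrm{NORMINV}(0.975, 0, 1)\,\sigma}{m}$$
$$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)\,\sigma}{m}\right)^2$$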
Choice of Sample Size
EXCEL Implementation:
Class Example 20, Part 3:
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg20.xls
HW: 6.19, 6.21
$$n = \left(\frac{\mathrm{NORMINV}(0.975, 0, 1)\,\sigma}{m}\right)^2$$
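For readers working outside Excel, the same sample size formula can be sketched in Python. `NormalDist().inv_cdf(0.975)` plays the role of NORMINV(0.975, 0, 1). The function name `sample_size`, the example values sigma = 15 and m = 3, and the rounding up to a whole number are assumptions added here, not part of the class example.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma, m, confidence=0.95):
    """Smallest whole n with margin of error at most m, for known S.D. sigma."""
    # Same quantile as NORMINV(0.975, 0, 1) when confidence = 0.95.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # n = (z * sigma / m)^2, rounded up because n must be a whole number
    # (the rounding is an assumption of this sketch).
    return ceil((z * sigma / m) ** 2)


# Hypothetical inputs: sigma = 15, desired margin of error m = 3.
print(sample_size(sigma=15.0, m=3.0))
```

Halving the desired margin of error roughly quadruples the required n, since n depends on $1/m^2$.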
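Interpretation of Conf. Intervals
2 Equivalent Views:
[pic 1: distribution of $\bar{X}$, 95% of area between $\mu - m$ and $\mu + m$]
[pic 2: distribution of $\bar{X} - \mu$, 95% of area between $-m$ and $m$]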
Interpretation of Conf. Intervals
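Mathematically:
pic 1: $0.95 = P\{\mu - m \le \bar{X} \le \mu + m\}$
pic 2: $= P\{-m \le \bar{X} - \mu \le m\}$
no pic: $= P\{\bar{X} - m \le \mu \le \bar{X} + m\} = P\{\text{the C.I. } [\bar{X} - m,\ \bar{X} + m] \text{ “brackets” } \mu\}$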
Interpretation of Conf. Intervals
Frequentist View: If repeat the experiment many times,
About 95% of the time, CI will contain $\mu$
(and 5% of the time it won’t)
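The frequentist view can be checked by simulation. The sketch below repeats the "experiment" many times with normal data and a known S.D., and counts how often the interval $[\bar{X} - m, \bar{X} + m]$ contains $\mu$. The function name `coverage`, the parameter values, and the fixed random seed are all illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, reps, seed=0):
    """Fraction of simulated 95% confidence intervals that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    m = NormalDist().inv_cdf(0.975) * sigma / sqrt(n)  # margin of error
    hits = 0
    for _ in range(reps):
        # One "experiment": draw n observations and compute X-bar.
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - m <= mu <= xbar + m:
            hits += 1
    return hits / reps


# Arbitrary illustrative values: standard normal data, samples of size 10.
frac = coverage(mu=0.0, sigma=1.0, n=10, reps=2000)
print(f"fraction of intervals covering mu: {frac:.3f}")
```

The printed fraction should land near 0.95 but not exactly on it, since the coverage count is itself random.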
Interpretation of Conf. Intervals
A nice Applet, from Ogden and West:
http://www.amstat.org/publications/jse/v6n3/applets/ConfidenceInterval.html
• Try a few at
• “more interval” allows regeneration
• “on average” about 2.5/50 don’t cover
• This is idea of “% coverage”
Interpretation of Conf. Intervals
Revisit Class Example 16
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg16.xls
Recall Class HW: Estimate % of Male Students at UNC
CI View: Class Example 21
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg21.xls
Illustrates idea: CI should cover 95% of time
Interpretation of Conf. Intervals
Class Example 21:
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg21.xls
Q1: SD too small → Too many cover
Q2: SD too big → Too few cover
Q3: Big Bias → Too few cover
Q4: Good sampling → About right
Q5: Simulated Bi → Shows “natural var’n”
Interpretation of Conf. Intervals
HW: 6.23, 6.26 (0.857, 0.135, 0.993)
Sec. 6.2 Tests of Significance
= Hypothesis Tests
Big Picture View:
Another way of handling random error
I.e. a different view point
Idea: Answer yes or no questions, under uncertainty
(e.g. from sampling or measurement error)
Hypothesis Tests
Some Examples:
• Will Candidate A win the election?
• Does smoking cause cancer?
• Is Brand X better than Brand Y?
• Is a drug effective?
• Is a proposed new business strategy effective?
(marketing research focuses on this)
Hypothesis Tests
E.g. A fast food chain currently brings in profits of $20,000 per store, per day. A new menu is proposed. Would it be more profitable?
Test: Have 10 stores (randomly selected!) try the new menu, let $\bar{X}$ = average of their daily profits.
Fast Food Business Example
Simplest View: for $\bar{X} > \$20{,}000$:
new menu looks better.
Otherwise looks worse.
Problem: New menu might be no better (or even worse), but could have $\bar{X} > \$20{,}000$ by bad luck of sampling
(only sample of size 10)
Fast Food Business Example
Problem: How to handle & quantify gray area in these decisions.
Note: Can never make a definite conclusion, e.g. as in Mathematics; Statistics is more about real life…
(E.g. even if $\bar{X} = \$0$ or $\bar{X} = \$1{,}000{,}000$, that might be bad luck of sampling, although very unlikely)
Hypothesis Testing
Note: Can never make a definite conclusion,
Instead measure strength of evidence.
Approach I: (note: different from text)
Choose among 3 Hypotheses:
H+: Strong evidence new menu is better
H0: Evidence is inconclusive
H-: Strong evidence new menu is worse
Hypothesis Testing
Terminology:
H0 is called null hypothesis
Setup: H+, H0, H- are in terms of
parameters, i.e. population quantities
(recall population vs. sample)
Fast Food Business Example
E.g. Let $\mu$ = true (over all stores) daily profit from new menu.
H+: $\mu > \$20{,}000$ (new is better)
H0: $\mu = \$20{,}000$ (about the same)
H-: $\mu < \$20{,}000$ (new is worse)
Fast Food Business Example
Base decision on best guess: $\bar{X}$
Will quantify strength of the evidence using probability distribution of $\bar{X}$
E.g. $\bar{X} > \$20{,}000$: Choose H+
$\bar{X} \approx \$20{,}000$: Choose H0
$\bar{X} < \$20{,}000$: Choose H-
Fast Food Business Example
How to draw line?
(There are many ways,
here is traditional approach)
Insist that H+ (or H-) show strong evidence
I.e. They get burden of proof
(Note: one way of solving
gray area problem)
Fast Food Business Example
Assess strength of evidence by asking:
“How strange is observed value of $\bar{X}$,
assuming H0 is true?”
In particular, use tails of H0 distribution as
measure of strength of evidence
Fast Food Business Example
Use tails of H0 distribution as measure of strength of evidence:
[Figure: distribution of $\bar{X}$ under H0, centered at \$20k, with the observed value of $\bar{X}$ marked and the tail area beyond it shaded]
Use this probability to measure strength of evidence
Hypothesis Testing
Define the p-value, for either H+ or H-, as:
P{what was seen, or more conclusive | H0}
Note 1: small p-value ⇒ strong evidence against H0, i.e. for H+ (or H-)
Note 2: p-value is also called observed significance level.
Fast Food Business Example
Suppose observe: $\bar{X} = \$21{,}000$, $s = \$2{,}400$, based on $n = 10$
Note $\bar{X} > \$20{,}000$, but is this conclusive?
or could this be due to natural sampling variation?
(i.e. do we risk losing money from new menu?)
Fast Food Business Example
Assess evidence for H+ by:
H+ p-value = Area to the right of \$21,000 under the $\bar{X}$ dist’n $N(\$20{,}000,\ \$2{,}400/\sqrt{10})$
Fast Food Business Example
Computation in EXCEL:
Class Example 22, Part 1:
https://www.unc.edu/~marron/UNCstat31-2005/Stat31Eg22.xls
P-value = 0.094
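The same area can be computed without Excel. The Python sketch below uses the numbers from this example and, as on the slide, treats $\bar{X}$ under H0 as normal with mean \$20,000 and S.D. $s/\sqrt{n}$ (s is used in place of the unknown $\sigma$). The variable names are illustrative, and this is not the contents of the class spreadsheet.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 20_000.0   # mean daily profit under H0
xbar = 21_000.0  # observed average over the sampled stores
s = 2_400.0      # observed S.D.
n = 10           # number of stores sampled

# Distribution of X-bar assuming H0 is true: N(mu0, s / sqrt(n)).
null_dist = NormalDist(mu=mu0, sigma=s / sqrt(n))

# H+ p-value: upper tail area at or beyond the observed X-bar.
p_value = 1.0 - null_dist.cdf(xbar)
print(f"H+ p-value = {p_value:.3f}")
```

Rounded to three decimals this agrees with the 0.094 reported above.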