Stat 231 List of Concepts and Formulas: "Course Review" Sheets
Prof. Stephen Vardeman
Iowa State University, Statistics and IMSE
September 6, 2011
Vardeman (ISU Stat and IMSE) Stat 231 Summary September 6, 2011 1 / 58
Day 1-Introduction
Probability vs Statistics
Simple Descriptive Statistics
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
Properties of $\bar{x}$ and $s$
for $y = ax + b$, $\bar{y} = a\bar{x} + b$ and $s_y = |a| s_x$
JMP
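The Day 1 formulas can be checked numerically. The sketch below uses Python's standard `statistics` module as a stand-in for JMP, with made-up data and made-up constants `a` and `b`; it confirms that the mean and standard deviation of y = ax + b behave as stated.

```python
import statistics

def summarize(values):
    """Return (sample mean, sample standard deviation with the n - 1 divisor)."""
    return statistics.mean(values), statistics.stdev(values)

x = [2.0, 4.0, 4.0, 5.0, 7.0]     # hypothetical data
a, b = -3.0, 10.0                 # hypothetical constants in y = a*x + b
y = [a * xi + b for xi in x]

x_bar, s_x = summarize(x)
y_bar, s_y = summarize(y)

print("x_bar, s_x:", x_bar, s_x)
print("y_bar vs a*x_bar + b:", y_bar, a * x_bar + b)
print("s_y vs |a|*s_x:", s_y, abs(a) * s_x)
```

Because `a` is negative here, the example also shows why the absolute value is needed in the rule for the standard deviation.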
Day 2-Notions of "Chance" and Mathematical Theory
Sample Space (Universal Set) $S$
Events (Sets) $A, B$
Empty Event $\emptyset$
Day 3-Set Operations on Events
Words to Symbols and Symbols to Words
Set Operations on Events
$A \text{ and } B = A \cap B$
$A \text{ or } B = A \cup B$
$\text{not } A = A^c$ (also written $\bar{A}$)
Mutually Exclusive (Disjoint) Events $A, B$
$A \text{ and } B = \emptyset$
Day 4-Axioms of Probability and "the Addition Rule"
Basic Rules of Operation
1. $0 \le P(A) \le 1$
2. $P(S) = 1$ (and $P(\emptyset) = 0$)
3. If $A_1, A_2, \ldots$ are disjoint events
$P(A_1 \text{ or } A_2 \text{ or } \ldots) = P(A_1) + P(A_2) + \cdots$
A Small "Theorem"
$P(\text{not } A) = 1 - P(A)$
The "Addition Rule" (Another Theorem)
$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
Day 5-Conditional Probability and Independence of Events
Conditional Probability of A Given B
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
The "Multiplication Rule"
$P(A \text{ and } B) = P(A \mid B) \cdot P(B)$
Events $A, B$ are Independent Exactly When
$P(A \mid B) = P(A)$, i.e. when $P(A \text{ and } B) = P(A) \cdot P(B)$
(Multiple Events are Independent When Every Intersection of Any Collection of Them (or Their Complements) Has Probability Obtainable as a Product of Individual Probabilities)
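A small enumeration can make the addition rule, conditional probability, and independence concrete. The sketch below assumes two fair dice (36 equally likely outcomes); the events `A`, `B`, and `C` are illustrative choices, not part of the course sheets.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # sample space: ordered rolls of two dice

def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))

A = {o for o in S if o[0] % 2 == 0}       # first die is even
B = {o for o in S if o[0] + o[1] == 7}    # total is 7
C = {o for o in S if o[0] + o[1] >= 10}   # total is at least 10

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_A_or_B = prob(A) + prob(B) - prob(A & B)

# Conditional probabilities: P(A | B) = P(A and B) / P(B)
p_A_given_B = prob(A & B) / prob(B)
p_A_given_C = prob(A & C) / prob(C)

print(p_A_or_B, prob(A | B))              # both computations agree
print(p_A_given_B, prob(A))               # equal, so A and B are independent
print(p_A_given_C, prob(A))               # unequal, so A and C are not independent
```

Using `Fraction` keeps every probability exact, so the independence check is an equality test and not a floating-point comparison.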
Day 6-Counting
When Outcomes are Equally Likely
$P(A) = \frac{\#(A)}{\#(S)}$
A Basic Principle: When a complex action can be broken into a series of $k$ component actions, the first of which can be done $n_1$ ways, the second of which can subsequently be done $n_2$ ways, the third of which can subsequently be done $n_3$ ways, etc., the whole can be accomplished in
$n_1 \cdot n_2 \cdots n_k$
different ways
Count of Possible Permutations
$P_{r,n} = \frac{n!}{(n-r)!}$
Count of Possible Combinations
$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$
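The counting formulas map directly onto `math.perm` and `math.comb` in Python 3.8 and later. The sketch below uses illustrative values n = 5 and r = 3 and compares the library results with the factorial expressions above.

```python
import math

n, r = 5, 3   # hypothetical values: choose 3 items out of 5

# Ordered selections: P_{r,n} = n! / (n - r)!
permutations = math.perm(n, r)
# Unordered selections: n! / (r! (n - r)!)
combinations = math.comb(n, r)

by_formula_perm = math.factorial(n) // math.factorial(n - r)
by_formula_comb = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(permutations, by_formula_perm)
print(combinations, by_formula_comb)

# Basic principle: each unordered selection can be ordered in r! ways
print(combinations * math.factorial(r) == permutations)
```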
Day 7-Discrete Random Variables and Specifying Their Distributions
Probability Mass Function
$f(x) = P[X = x]$
Cumulative Distribution Function
$F(x) = P[X \le x]$ (general)
$F(x) = \sum_{z \le x} f(z)$ (discrete)
Day 8-Expectation for Discrete Variables
Expected (or Mean) Value of $h(X)$ for a Discrete $X$
$E\,h(X) = \sum_{x} h(x) f(x)$
Mean of $X$ (Mean/Center of the Distribution of $X$)
$EX = \sum_{x} x f(x)$ ( $= \mu_X$ )
Variance of $X$ (a Measure of Spread for the Distribution of $X$)
$\mathrm{Var}\,X = \sum_{x} (x - EX)^2 f(x)$ ( $= \sigma_X^2$ )
$= \sum_{x} x^2 f(x) - (EX)^2$
$= EX^2 - (EX)^2$
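As a worked instance of the Day 8 sums, the sketch below computes the mean and variance of a fair six-sided die (an assumed example distribution) and checks that the definitional and shortcut variance formulas agree. The helper name `expect` is hypothetical.

```python
from fractions import Fraction

# Probability mass function f(x) of a fair die, stored as {value: probability}
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(h, pmf):
    """E h(X) = sum over x of h(x) f(x) for a discrete X."""
    return sum(h(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)
var_by_definition = expect(lambda x: (x - mean) ** 2, pmf)
var_by_shortcut = expect(lambda x: x * x, pmf) - mean ** 2

print(mean)                 # 7/2
print(var_by_definition)    # 35/12
print(var_by_shortcut)      # 35/12
```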
Day 9-More Mean and Variance/Independent Identical Success-Failure Trials
Chebyshev's Inequality (general)
$P[\mu_X - k\sigma_X < X < \mu_X + k\sigma_X] \ge 1 - \frac{1}{k^2}$
Other Useful Facts (general)
$E(aX + b) = a\,EX + b$
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}\,X$
$\sigma_{aX+b} = \sqrt{\mathrm{Var}(aX + b)} = |a|\,\sigma_X$
A Convenient (and Sometimes Appropriate) Model is the "Bernoulli Trials" Model:
1. $P[\text{success on trial } i] = p$ (fixed, the same for all $i$)
2. The events $A_i$ = "success on trial $i$" are all independent
Day 10-Binomial and Geometric Distributions
Under the Bernoulli Trials Model:
$X$ = the number of successes in $n$ trials Has the Binomial($n, p$) Distribution
$f(x) = \begin{cases} \binom{n}{x} p^x (1-p)^{n-x} & \text{for } x = 0, 1, \ldots, n \\ 0 & \text{otherwise} \end{cases}$
With $EX = np$ and $\mathrm{Var}\,X = np(1-p)$
$X$ = the trial on which the first success occurs Has the Geometric($p$) Distribution
$f(x) = \begin{cases} p(1-p)^{x-1} & \text{for } x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$
With $1 - F(x) = (1-p)^x$, $EX = \frac{1}{p}$ and $\mathrm{Var}\,X = \frac{1-p}{p^2}$
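The binomial and geometric formulas above can be verified by direct summation. The sketch below assumes n = 10 and p = 0.3 purely for illustration; the function names are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """f(x) for the Binomial(n, p) distribution (x an integer)."""
    if x < 0 or x > n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def geometric_pmf(x, p):
    """f(x) for the Geometric(p) distribution (x an integer)."""
    return p * (1 - p) ** (x - 1) if x >= 1 else 0.0

n, p = 10, 0.3   # hypothetical parameter values
support = range(n + 1)

total = sum(binomial_pmf(x, n, p) for x in support)
mean = sum(x * binomial_pmf(x, n, p) for x in support)
variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in support)

# Geometric tail: 1 - F(3) should equal (1 - p)^3
geometric_tail_3 = 1 - sum(geometric_pmf(x, p) for x in (1, 2, 3))

print(total, mean, variance)           # about 1, n*p, n*p*(1-p)
print(geometric_tail_3, (1 - p) ** 3)
```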
Day 11-Geometric and Poisson Distributions
The Poisson(λ) Distribution is a Commonly Used Model for
$X$ = the number of occurrences of a relatively rare phenomenon across a fixed interval of time or space
This Has
$f(x) = \begin{cases} \frac{e^{-\lambda} \lambda^x}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases}$
With $EX = \lambda$ and $\mathrm{Var}\,X = \lambda$
Day 12-Continuous Random Variables, pdf’s and cdf’s
Probability Density Function, $f(x) \ge 0$ with
$P[a \le X \le b] = \int_{a}^{b} f(x)\,dx$
(Continuous) Cumulative Distribution Function
$F(x) = P[X \le x] = \int_{-\infty}^{x} f(t)\,dt$
cdf to pdf
$\frac{d}{dx} F(x) = f(x)$
Day 13-Expectation for Continuous Variables/Normal Distributions
Expected (or Mean) Value of $h(X)$ for a Continuous $X$
$E\,h(X) = \int_{-\infty}^{\infty} h(x) f(x)\,dx$
Mean of $X$ (Mean/Center of the Distribution of $X$)
$EX = \int_{-\infty}^{\infty} x f(x)\,dx$ ( $= \mu_X$ )
Variance of $X$ (a Measure of Spread for the Distribution of $X$)
$\mathrm{Var}\,X = \int_{-\infty}^{\infty} (x - EX)^2 f(x)\,dx$ ( $= \sigma_X^2$ )
$= \int_{-\infty}^{\infty} x^2 f(x)\,dx - (EX)^2$
$= EX^2 - (EX)^2$
All the Day 9 Facts Hold for Continuous Variables
Day 14-Normal Distributions
Normal($\mu, \sigma^2$)
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/(2\sigma^2)}$ for all $x$
Standard Normal ($\mu = 0$, $\sigma = 1$) Version
$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$ for all $z$
Standard Normal cdf (tabled)
$\Phi(z) = F(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt$
Conversion to Standard Units
$z = \frac{x - \mu}{\sigma}$
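In place of a printed table, the standard normal cdf can be evaluated through the error function, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right)$. The sketch below assumes illustrative values for µ, σ, and x, and cross-checks the result against `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def phi(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 100.0, 15.0   # hypothetical mean and standard deviation
x = 130.0                 # hypothetical value of interest

z = (x - mu) / sigma      # conversion to standard units
p_below = phi(z)          # P[X <= x] for X ~ Normal(mu, sigma^2)

print(z, p_below)
print(NormalDist(mu, sigma).cdf(x))   # same probability without standardizing
```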
Day 15-Normal Approximation to Binomial/Exponential and Weibull Distributions
For Large $n$ and Moderate $p$, a Binomial($n, p$) Distribution is Approximately Normal (With $\mu = np$ and $\sigma^2 = np(1-p)$)
The Exponential($\lambda$) Distribution Has
$f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}$
With $EX = \frac{1}{\lambda}$, $\mathrm{Var}\,X = \frac{1}{\lambda^2}$ and $F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 - e^{-\lambda x} & \text{if } x > 0 \end{cases}$
The Weibull($\alpha, \beta$) Distribution Has
$F(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 - e^{-(x/\beta)^{\alpha}} & \text{if } x \ge 0 \end{cases}$
With Median $F^{-1}(.5) = \beta e^{-(.3665/\alpha)}$ and Scale Parameter $\beta$
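The constant .3665 in the Weibull median is $-\ln(\ln 2)$ rounded to four places, since solving $1 - e^{-(x/\beta)^{\alpha}} = .5$ gives $x = \beta (\ln 2)^{1/\alpha}$. The sketch below, with assumed parameter values, checks that the cdf evaluated at the stated median is very close to one half.

```python
import math

def weibull_cdf(x, alpha, beta):
    """F(x) for the Weibull(alpha, beta) distribution."""
    return 0.0 if x < 0 else 1.0 - math.exp(-((x / beta) ** alpha))

def weibull_median(alpha, beta):
    """Median from the formula sheet, using the rounded constant .3665."""
    return beta * math.exp(-0.3665 / alpha)

alpha, beta = 2.0, 10.0   # hypothetical shape and scale parameters

median = weibull_median(alpha, beta)
exact_median = beta * math.log(2.0) ** (1.0 / alpha)

print(median, exact_median)
print(weibull_cdf(median, alpha, beta))   # close to 0.5
```

With α = 1 and β = 1/λ the same cdf reduces to the Exponential(λ) cdf given on this slide.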
Day 16-Jointly Discrete Random Variables
Joint Probability Mass Function
$f(x, y) = P[X = x \text{ and } Y = y]$
Marginal Probability Mass Functions
$g(x) = \sum_{y} f(x, y)$ and $h(y) = \sum_{x} f(x, y)$
Conditional Probability Mass Functions
$g(x \mid y) = \frac{f(x, y)}{h(y)}$ and $h(y \mid x) = \frac{f(x, y)}{g(x)}$
Day 17-Jointly Discrete and Continuous Variables
Independence of Discrete Random Variables
$f(x, y) = g(x)\,h(y)$ for all $x, y$
Joint Probability Density Function $f(x, y) \ge 0$
$P[(X, Y) \in R] = \iint_{R} f(x, y)\,dx\,dy$
Marginal Probability Density Functions
$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$ and $h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$
Day 18-Continuous Variables, Conditionals, and Independence
Conditional Probability Densities
$g(x \mid y) = \frac{f(x, y)}{h(y)}$ and $h(y \mid x) = \frac{f(x, y)}{g(x)}$
Independence of Continuous Random Variables
$f(x, y) = g(x)\,h(y)$ for all $x, y$
Day 19-Functions of Several Random Variables and Expectation
For Jointly Distributed Variables $X, Y, \ldots, Z$ the Distribution of $U = g(X, Y, \ldots, Z)$ Can Sometimes Be Derived
JMP Simulation to Approximate the Distribution of $U = g(X, Y, \ldots, Z)$ (and $EU$) for Independent $X, Y, \ldots, Z$
Expectation of $U = g(X, Y)$
$E\,g(X, Y) = \sum_{x, y} g(x, y) f(x, y)$ (discrete)
$E\,g(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f(x, y)\,dx\,dy$ (continuous)
Day 20-Covariance, Correlation, and Laws of Expectation
$\mathrm{Cov}(X, Y) = E(X - EX)(Y - EY)$ ( $= E(X - \mu_X)(Y - \mu_Y)$ )
$= EXY - EX\,EY$ ( $= EXY - \mu_X \mu_Y$ )
$\rho = \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}\,X \cdot \mathrm{Var}\,Y}} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
$-1 \le \rho \le 1$ With $\rho = \pm 1$ Exactly When $X$ and $Y$ are Perfectly Linearly Related
$X, Y$ Independent Implies $\rho = 0$
$E(aX + b) = a\,EX + b$ (from Day 9)
$X, Y$ Independent Implies $E\,s(X)\,t(Y) = E\,s(X) \cdot E\,t(Y)$
Day 21-Laws of Expectation and Variance
$\mathrm{Var}\,X = EX^2 - (EX)^2$ (From Day 8)
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}\,X$ (From Day 9)
$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}\,X + b^2\,\mathrm{Var}\,Y + 2ab\,\mathrm{Cov}(X, Y)$
$X, Y$ Independent Implies $\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}\,X + b^2\,\mathrm{Var}\,Y$
For Independent $X, Y, \ldots, Z$, $U = a_0 + a_1 X + a_2 Y + \cdots + a_n Z$ Has
$EU = a_0 + a_1\,EX + a_2\,EY + \cdots + a_n\,EZ$
$\mathrm{Var}\,U = a_1^2\,\mathrm{Var}\,X + a_2^2\,\mathrm{Var}\,Y + \cdots + a_n^2\,\mathrm{Var}\,Z$
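The laws for linear combinations can be confirmed exactly on a small example. The sketch below assumes X and Y are independent fair dice and uses arbitrary constants `a0`, `a1`, `a2`; it builds the distribution of U from the joint pmf f(x, y) = g(x)h(y) and compares its mean and variance with the formulas above.

```python
from fractions import Fraction
from itertools import product

die = {x: Fraction(1, 6) for x in range(1, 7)}   # marginal pmf of each variable

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

a0, a1, a2 = 1, 2, -3   # hypothetical constants in U = a0 + a1*X + a2*Y

# Distribution of U under independence: joint pmf is the product of marginals
u_pmf = {}
for (x, px), (y, py) in product(die.items(), die.items()):
    u = a0 + a1 * x + a2 * y
    u_pmf[u] = u_pmf.get(u, Fraction(0)) + px * py

print(mean(u_pmf), a0 + a1 * mean(die) + a2 * mean(die))
print(variance(u_pmf), a1 ** 2 * variance(die) + a2 ** 2 * variance(die))
```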
Day 22-Propagation of Error/Transition to Statistics
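For Independent $X, Y, \ldots, Z$ Approximations for the Mean and Variance for $U = g(X, Y, \ldots, Z)$ Are
$E\,g(X, Y, \ldots, Z) \approx g(\mu_X, \mu_Y, \ldots, \mu_Z)$
$\mathrm{Var}\,g(X, Y, \ldots, Z) \approx \left(\frac{\partial g}{\partial x}\right)^2 \mathrm{Var}\,X + \cdots + \left(\frac{\partial g}{\partial z}\right)^2 \mathrm{Var}\,Z$
(Where the Partials Are Evaluated at $\mu_X, \mu_Y, \ldots, \mu_Z$)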
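The propagation of error approximation is easy to apply numerically. The sketch below is one possible implementation, not the course's method: it estimates each partial derivative with a central difference, and the function `g` (the area of a rectangle) and the input means and variances are made-up values.

```python
def propagate_error(g, means, variances, step=1e-6):
    """Approximate E g and Var g for independent inputs.

    The partial derivatives are estimated by central differences
    evaluated at the input means (a numerical stand-in for calculus).
    """
    approx_mean = g(*means)
    approx_var = 0.0
    for i, var in enumerate(variances):
        up, down = list(means), list(means)
        up[i] += step
        down[i] -= step
        partial = (g(*up) - g(*down)) / (2 * step)
        approx_var += partial ** 2 * var
    return approx_mean, approx_var

def area(length, width):
    return length * width

# Hypothetical inputs: length has mean 10, std dev 0.1; width has mean 5, std dev 0.2
approx_mean, approx_var = propagate_error(area, [10.0, 5.0], [0.1 ** 2, 0.2 ** 2])

print(approx_mean)   # g evaluated at the means
print(approx_var)    # about 5^2 * 0.01 + 10^2 * 0.04 = 4.25
```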
Random Sampling From a Large Population or a Physically Stable Process is (at Least Approximately) Described by a Model That Says Data $X_1, X_2, \ldots, X_n$ Are Independent Identically Distributed Random Variables (With Marginal Probability Distribution the Population Relative Frequency Distribution)
Day 23-Distributional Properties of the Sample Mean
If $X_1, X_2, \ldots, X_n$ Are Independent Identically Distributed (Each With Mean $\mu$ and Variance $\sigma^2$) the Random Variable
$\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$
Has
$E\bar{X} = \mu_{\bar{X}} = \mu$
$\mathrm{Var}\,\bar{X} = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$
Further, $\bar{X}$ is Approximately Normal if
1. The Population Distribution is Itself Normal
2. The Sample Size, $n$, is Large (The Central Limit Theorem)
Day 24-Introduction to Confidence Intervals
Following From
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is (at least approximately) Standard Normal
The Interval Formula
$\left(\bar{X} - z\frac{\sigma}{\sqrt{n}},\ \bar{X} + z\frac{\sigma}{\sqrt{n}}\right)$
Will Cover $\mu$ In a Fraction $P(-z < Z < z)$ of All Applications
The End Points
$\bar{X} \pm z\frac{\sigma}{\sqrt{n}}$
Are Thus (Typically Practically Unusable) Confidence Limits for $\mu$
Warning About Convention
Henceforth Drop the Convention That Random Variables Are Represented By Capital Letters and Their Possible Values by Lower Case Letters. Typically (But Not Always) Lower Case Will Be Used For Both, and Context Will Have to Be Used to Distinguish.
Day 25-Large Sample Confidence Intervals for Means and Proportions
Large $n$ Confidence Limits for $\mu$
$\bar{x} \pm z\frac{s}{\sqrt{n}}$
Follow From
$Z = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim$ (at least approximately) Standard Normal
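The large-sample limits for µ can be computed in a few lines. The sketch below is illustrative only: the function name, the 95% default, and the data are assumptions, and `NormalDist().inv_cdf` supplies the standard normal percentage point z that would otherwise come from a table.

```python
import math
import statistics
from statistics import NormalDist

def large_n_mean_limits(data, confidence=0.95):
    """Two-sided large-sample confidence limits x_bar +/- z * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)                       # n - 1 divisor
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. about 1.96 for 95%
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical sample of n = 49 measurements
data = [50.0 + (i % 7) for i in range(49)]

lower, upper = large_n_mean_limits(data)
print(lower, upper)
```

Raising the confidence level increases z and therefore widens the interval.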
For Large $n$, Confidence Limits for $p$
$\hat{p} \pm z\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n}}$
(Where $\tilde{p} = (n\hat{p} + 2)/(n + 4)$) Follow From
$Z = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} \sim$ (at least approximately) Standard Normal
Day 26-Small Sample Confidence Intervals for a (Normal) Mean
Small $n$ Confidence Limits for $\mu$ (When Sampling From a Normal Distribution)
$\bar{x} \pm t\frac{s}{\sqrt{n}}$
(For $t$ a Percentage Point of the $t_{n-1}$ Distribution) Follow From
$T = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t_{n-1}$
Day 27-Small Sample Confidence Intervals for a (Normal) Standard Deviation and Normal Prediction Limits
Confidence Limits for $\sigma$ (For a Normal Distribution)
$s\sqrt{\frac{n-1}{\chi^2_{\text{upper}}}}$ and $s\sqrt{\frac{n-1}{\chi^2_{\text{lower}}}}$
Follow From
$X^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$
Prediction Limits for $x_{\text{new}}$ (From a Normal Distribution)
$\bar{x} \pm t\,s\sqrt{1 + \frac{1}{n}}$
(For $t$ a Percentage Point of the $t_{n-1}$ Distribution) Follow From
$T = \frac{\bar{x} - x_{\text{new}}}{s\sqrt{1 + \frac{1}{n}}} \sim t_{n-1}$
Day 28-Normal Prediction and Tolerance Limits/Normal Plotting
"Tolerance" Limits for a Large (User Chosen) Part of a Normal Distribution
$\bar{x} \pm \tau_2 s$ (two-sided)
$\bar{x} - \tau_1 s$ or $\bar{x} + \tau_1 s$ (one-sided)
Where $\tau_2$ or $\tau_1$ is Chosen For Given "Part of the Distribution" and Confidence Level
Normal Plots For an Ordered Data Set $x_1 \le x_2 \le \cdots \le x_n$ Are Made by Plotting $n$ Points
$\left(\tfrac{i - .5}{n} \text{ data quantile},\ \tfrac{i - .5}{n} \text{ standard normal quantile}\right) = \left(x_i,\ \Phi^{-1}\!\left(\tfrac{i - .5}{n}\right)\right) = (x_i, z_i)$
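The coordinates of a normal plot follow directly from the formula above. The sketch below uses made-up data and the standard library's `NormalDist().inv_cdf` for $\Phi^{-1}$; it only computes the points and leaves the actual plotting to whatever tool is at hand (JMP in the course).

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return the points (x_i, z_i) with z_i = Phi^{-1}((i - .5)/n)."""
    ordered = sorted(data)
    n = len(ordered)
    standard_normal = NormalDist()
    return [
        (x, standard_normal.inv_cdf((i - 0.5) / n))
        for i, x in enumerate(ordered, start=1)
    ]

data = [3.1, 2.4, 5.0, 4.2, 3.7]   # hypothetical sample

points = normal_plot_points(data)
for x, z in points:
    print(f"{x:5.2f}  {z:7.3f}")
```

A roughly linear pattern in these points is what one looks for when judging whether a normal model is plausible.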
Day 29-Hypothesis Testing Introduction 1
Devore 7-Step Format
Null and Alternative Hypotheses
Test Statistic
Type 1 and Type 2 Errors
Day 30-Hypothesis Testing Introduction 2
Test Criteria/Rejection Criteria and Corresponding Error Probabilities
α, β and Their Competing Demands
Hypothesis Testing/Criminal Trial Analogy
Day 31-One-Sample Testing for a Mean
Large $n$ Testing of $H_0\colon \mu = \#$ Uses
$Z = \frac{\bar{x} - \#}{s/\sqrt{n}}$
and a Standard Normal Reference Distribution
(Normal Distribution) Small $n$ Testing of $H_0\colon \mu = \#$ Uses
$T = \frac{\bar{x} - \#}{s/\sqrt{n}}$
and a $t_{n-1}$ Reference Distribution
Day 32-One Sample Testing for a Proportion/"p-values"
Large $n$ Testing of $H_0\colon p = \#$ Uses
$Z = \frac{\hat{p} - \#}{\sqrt{\frac{\#(1 - \#)}{n}}}$
and a Standard Normal Reference Distribution
In ANY Hypothesis Testing Context
p-value = "observed level of significance"
= the probability (computed under $H_0$) of seeing a value of the test statistic "more extreme" than the one observed
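The one-sample proportion test and its p-value can be sketched as follows. The counts and the hypothesized value are invented, and "more extreme" is taken here to mean a two-sided alternative, which is an assumption; a one-sided alternative would use only one tail.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Large-n test of H0: p = p0; returns (z, two-sided p-value)."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    # Two-sided: probability under H0 of |Z| at least as large as observed
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5
z, p_value = proportion_z_test(60, 100, 0.5)
print(z, p_value)
```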
Day 33-One Sample Testing for a (Normal) Standard Deviation/(Large) Two-Sample Inference for Means
(Normal Distribution) Testing $H_0\colon \sigma^2 = \#$ Uses
$X^2 = \frac{(n-1)s^2}{\#}$
and a $\chi^2_{n-1}$ Reference Distribution
Large $n_1$ and $n_2$ (Independent Samples) Confidence Limits for $\mu_1 - \mu_2$ are
$\bar{x}_1 - \bar{x}_2 \pm z\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
and a Test Statistic (for $H_0\colon \mu_1 - \mu_2 = \#$) is
$Z = \frac{\bar{x}_1 - \bar{x}_2 - \#}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
With Standard Normal Reference Distribution
Day 34-(Small) Two-Sample Inference for (Normal) Means
(Somewhat Approximate) Small $n_1$ or $n_2$ (Normal Distribution) (Independent Samples) Confidence Limits for $\mu_1 - \mu_2$ are
$\bar{x}_1 - \bar{x}_2 \pm t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
(for $t$ a Percentage Point of the $t$ Distribution With $d.f. = \min(n_1 - 1, n_2 - 1)$) and a Test Statistic (for $H_0\colon \mu_1 - \mu_2 = \#$) is
$T = \frac{\bar{x}_1 - \bar{x}_2 - \#}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
With $t$ Reference Distribution With $d.f. = \min(n_1 - 1, n_2 - 1)$
Day 35-Inference for a Mean Difference/a Difference in Proportions
When Paired Values $(x, y)$ Can Sensibly be Reduced to Differences
$d = x - y$
and $n$ of These to $\bar{d}$ and $s_d$, One-Sample Inference Formulas Apply to Inference for $\mu_d$.
Large $n_1, n_2$ (Independent Samples) Confidence Limits for $p_1 - p_2$ are
$\hat{p}_1 - \hat{p}_2 \pm z\sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_1} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_2}}$
(Where $\tilde{p}_1 = (n_1\hat{p}_1 + 2)/(n_1 + 4)$ and $\tilde{p}_2 = (n_2\hat{p}_2 + 2)/(n_2 + 4)$.)
A Test Statistic for $H_0\colon p_1 - p_2 = 0$ is
$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
(Where $\hat{p}$ is the Pooled Sample Proportion) With a Standard Normal Reference Distribution.
Day 36-Inference for a Ratio of (Normal) Variances
Where $s_1$ and $s_2$ are Based on Independent Samples from Normal Distributions With Respective Standard Deviations $\sigma_1$ and $\sigma_2$,
$F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$
Has the (Snedecor) $F$ Distribution With $n_1 - 1$ and $n_2 - 1$ Degrees of Freedom.
Hence, Confidence Limits for $\sigma_1/\sigma_2$ are
$\frac{s_1}{s_2\sqrt{F_{\text{upper}}}}$ and $\frac{s_1}{s_2\sqrt{F_{\text{lower}}}}$
and $H_0\colon \sigma_1 = \sigma_2$ is Tested Using
$F = \frac{s_1^2}{s_2^2}$
and an $F$ Reference Distribution With $n_1 - 1$ and $n_2 - 1$ Degrees of Freedom.
Day 37-Least Squares Fitting of a Line
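Based on $n$ Data Pairs $(x_1, y_1), \ldots, (x_n, y_n)$
The "Least Squares Line" Through the Scatterplot Has
slope $b_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}$
intercept $b_0 = \bar{y} - b_1\bar{x}$
The Sample Correlation Between $x$ and $y$ is
$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}} = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sqrt{\left(\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right)\left(\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right)}}$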
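The two equivalent slope formulas on the Day 37 slide can be compared numerically. The sketch below uses invented data pairs and hypothetical function names; it computes the slope, intercept, and sample correlation from the deviation form, then recomputes the slope from the "computational" form.

```python
import math

def least_squares(xs, ys):
    """Return (b0, b1, r) from the deviation-from-mean formulas."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return b0, b1, r

def slope_computational(xs, ys):
    """Slope from the raw-sums version of the same formula."""
    n = len(xs)
    numerator = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    denominator = sum(x * x for x in xs) - sum(xs) ** 2 / n
    return numerator / denominator

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical predictor values
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical responses

b0, b1, r = least_squares(xs, ys)
print(b0, b1, r)
print(slope_computational(xs, ys))
```

The deviation form is usually the safer one in floating point, since the raw-sums form subtracts two large, nearly equal numbers when the data are far from zero.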
Day 38-Coefficient of Determination and the SLR Model
Based on $n$ Data Pairs $(x_1, y_1), \ldots, (x_n, y_n)$ and Least Squares Fitted Values $\hat{y}_i = b_0 + b_1 x_i$
$SSTot = (n-1)s_y^2 = \sum_{i=1}^{n}(y_i - \bar{y})^2$, $SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$
and $SSR = SSTot - SSE$
Then
$R^2 = \frac{SSR}{SSTot} = (\text{sample correlation of } y \text{ and } \hat{y})^2$ ( $= r^2$ in SLR only)
The (Normal) Simple Linear Regression Model is
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for independent $N(0, \sigma^2)$ random "errors" $\varepsilon_i$
(Responses are Independent Normal Variables with Means $\mu_{y|x} = \beta_0 + \beta_1 x$ and Constant Standard Deviation, $\sigma$)
Day 39-Inference Under the SLR Model 1
Single-Number Estimates of SLR Model Parameters are
$\hat{\sigma} = s \equiv \sqrt{\frac{SSE}{n-2}}$, $\hat{\beta}_1 = b_1$, and $\hat{\beta}_0 = b_0$
Using $(n-2)s^2/\sigma^2 \sim \chi^2_{n-2}$, Confidence Limits for $\sigma$ are
$s\sqrt{\frac{n-2}{\chi^2_{\text{Upper}}}}$ and $s\sqrt{\frac{n-2}{\chi^2_{\text{Lower}}}}$
Since $b_1$ is Normal With Mean $\beta_1$ and StdDev $\sigma/\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}$, Write $SE_{b_1} = s/\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ and Have Confidence Limits for $\beta_1$
$b_1 \pm t \cdot SE_{b_1}$
And Test $H_0\colon \beta_1 = \#$ Using a $t_{n-2}$ Reference Distribution for
$T = \frac{b_1 - \#}{SE_{b_1}}$
Day 40-Inference Under the SLR Model 2
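Since $\hat{y} = b_0 + b_1 x_{\text{new}}$ Has Mean $\mu_{y|x_{\text{new}}} = \beta_0 + \beta_1 x_{\text{new}}$ and StdDev
$\sigma\sqrt{\frac{1}{n} + \frac{(x_{\text{new}} - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$, Write $SE_{\hat{y}} = s\sqrt{\frac{1}{n} + \frac{(x_{\text{new}} - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$
Confidence Limits for $\mu_{y|x_{\text{new}}} = \beta_0 + \beta_1 x_{\text{new}}$ are Then
$\hat{y} \pm t \cdot SE_{\hat{y}}$
And $H_0\colon \mu_{y|x_{\text{new}}} = \#$ May Be Tested Using a $t_{n-2}$ Reference Distribution for
$T = \frac{\hat{y} - \#}{SE_{\hat{y}}}$
Prediction Limits for $y_{\text{new}}$ at $x = x_{\text{new}}$ Are
$\hat{y} \pm t \cdot s\sqrt{1 + \frac{1}{n} + \frac{(x_{\text{new}} - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$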
Day 41-SLR and "ANOVA"
Breaking Down $SSTot$ Into $SSR$ and $SSE$ Is a Kind of "ANalysis Of VAriance" (of $y$). Further, an $F$ Test of $H_0\colon \beta_1 = 0$ Equivalent to a Two-Sided $t$ Test Can be Based on $F = MSR/MSE$ and an $F_{1,n-2}$ Reference Distribution. Calculations Are Summarized in a Special "ANOVA" Table.
ANOVA Table (for SLR)
Source      SS     df     MS                 F
Regression  SSR    1      MSR = SSR/1        F = MSR/MSE
Error       SSE    n - 2  MSE = SSE/(n - 2)
Total       SSTot  n - 1
The Facts that $E\,MSE = \sigma^2$ and $E\,MSR = \sigma^2 + \beta_1^2 \sum_{i=1}^{n}(x_i - \bar{x})^2$ Provide Motivation for Rejecting $H_0$ For Large Observed $F$.
Day 42-Practical Considerations 1
The Possibility that Neither Interpolation Nor Extrapolation is Completely Safe Must Be Considered When Using a Fitted Equation.
Rational Practice Requires That One Investigate the Plausibility of a Regression Model Before Basing Inferences on It.
In "Single x" Contexts One Should Plot $y$ versus $x$ Looking for a Trend Consistent With the Fitted Model and for Constant Spread Around That Trend.
In General, "Residuals"
$e_i = y_i - \hat{y}_i$
Should Be "Patternless" and "Normal-looking."
Common Practice is to Normal-Plot and Plot Against All Predictors (and $\hat{y}_i$ and Other Potential Predictors) "Standardized" Residuals
$e_i^* = \frac{e_i}{SE_{e_i}} = \frac{e_i}{s\sqrt{1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}}$ in SLR
Day 43-Practical Considerations 2
For $c$ Different (Sets of) "x Conditions" in the Data Let
$SS_{\text{cond } j} = (n_{\text{cond } j} - 1)\,s^2_{\text{cond } j}$
And Define (A "Pure Error" Sum of Squares)
$SSPE = \sum_{j=1}^{c} SS_{\text{cond } j}$
With Degrees of Freedom $n - c = \sum_{j=1}^{c}(n_{\text{cond } j} - 1)$. Then (a "Lack of Fit" Sum of Squares) Is
$SSLoF = SSE - SSPE$
With Degrees of Freedom
$d.f._{LoF} = \text{error } d.f. - (n - c)$
Then $H_0$: the fitted model is appropriate Can Be Tested Using
$F = \frac{SSLoF/d.f._{LoF}}{SSPE/(n - c)}$
Day 44-SLR Practical Considerations 3/MLR Model
Sometimes, Replacing $y$ With Some Function of $y$ (Like, e.g., $y' = \ln y$) and/or $x$'s With Some Function(s) Thereof Can Make The Simple Technology of Regression Analysis Applicable to the Analysis of a Data Set.
The Multiple Linear Regression Model is
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \varepsilon_i$
Where the $\varepsilon_i$ Are Independent Normal With Mean 0 and Standard Deviation $\sigma$.
Least Squares (e.g. Implemented in JMP) Can Be Used to Fit
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$
(That is, Estimate the $\beta$'s). The Corresponding Estimate of $\sigma$ Is
$s = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - (k + 1)}}$
Day 45-MLR R-Squared/Overall (Model Utility) F Test
In MLR as in SLR,
$SSTot = (n-1)s_y^2 = \sum_{i=1}^{n}(y_i - \bar{y})^2$, $SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$
and $SSR = SSTot - SSE$ and $R^2 = \frac{SSR}{SSTot}$
The Basic ANOVA Table for MLR Is
ANOVA Table (for MLR)
Source      SS     df         MS                     F
Regression  SSR    k          MSR = SSR/k            F = MSR/MSE
Error       SSE    n - k - 1  MSE = SSE/(n - k - 1)
Total       SSTot  n - 1
Which Organizes an $F$ Test of $H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k = 0$
Day 46-MLR Partial F Tests/Fitted Coefficients
In the (Full) MLR Model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
For $l < k$, The Hypothesis $H_0\colon \beta_{l+1} = \cdots = \beta_k = 0$, Is That the Full Model is Not Clearly Better Than the Reduced Model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_l x_l + \varepsilon$
This Can Be Tested Using
$F = \frac{\left(SSR(\text{full}) - SSR(\text{reduced})\right)/(k - l)}{SSE(\text{full})/(n - k - 1)}$
With an $F_{k-l,\,n-k-1}$ Reference Distribution
The Fitted Coefficient $b_l$ Is Normal With Mean $\beta_l$ And Standard Deviation
$\sigma \cdot$ (a complicated function of the values of the predictors $x_{li}$)
Day 47-Confidence and Prediction Limits in MLR
The Fitted Value $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$ Is Normal With Mean $\mu_{y|x_1,\ldots,x_k} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$ And Standard Deviation
$\sigma \cdot$ (a complicated function of the values of the predictors $x_{li}$)
Replacing $\sigma$ with $s$ In the Previous Two Standard Deviations Produces Standard Errors $SE_{b_l}$ and $SE_{\hat{y}}$ That Are Obtained From JMP (NOT "By Hand")
Confidence and Prediction Limits Are Then
$s\sqrt{\frac{n-k-1}{\chi^2_{\text{upper}}}}$ and $s\sqrt{\frac{n-k-1}{\chi^2_{\text{lower}}}}$ for $\sigma$
$b_l \pm t \cdot SE_{b_l}$ for $\beta_l$
$\hat{y} \pm t \cdot SE_{\hat{y}}$ for $\mu_{y|x_1,\ldots,x_k}$
$\hat{y} \pm t \cdot \sqrt{s^2 + (SE_{\hat{y}})^2}$ for $y_{\text{new}}$ at $(x_1, \ldots, x_k)$
Day 48-Practical Considerations
Model Checking Involves Residual Plots (Residuals are $e_i = y_i - \hat{y}_i$ and Standardized Residuals are $e_i^* = e_i/SE_{e_i}$ for $SE_{e_i} = \sigma \cdot$ (a complicated function of the values of the predictors $x_{li}$)), Lack of Fit Tests, and Examination of the "PRESS" Statistic. For $\hat{y}_{(i)}$ A Fitted Value For the $i$th Case Obtained Not Using the Case in the Fitting
$PRESS = \sum_{i=1}^{n}(y_i - \hat{y}_{(i)})^2 \ge \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = SSE$
(Ideally, $PRESS$ Is Not Much Larger Than $SSE$)
Extrapolation is a Potentially Big Issue in MLR
Transformations Extend the Potential Applications of MLR
Variable/Model Selection in MLR Involves Balancing "Good Fit" versus a Small Number of Predictor Variables
Formal Tools are Partial $F$ Tests and Tests for Lack of Fit
Examination of $R^2$, $MSE$, and $C_p$ For "All Possible Regressions" Is A More Flexible Informal Approach
Day 49-Practical Considerations-Model Selection
Considering Submodels (Reduced Models)
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$
of a (Full) Model
$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \beta_{p+1} x_{p+1} + \cdots + \beta_k x_k + \varepsilon$
Assumed to Produce Correct Values of The Means $\mu_{y_i}$,
$C_p = (n - k - 1)\left(\frac{SSE_p}{SSE_k}\right) + 2(p + 1) - n$
(Under the Full Model) Estimates a Quantity that is
$p + 1 +$ (a positive measure of how badly the reduced model does at fitting the $\mu_{y_i}$)
So, Simple (Small $p$) Models With Big $R^2$, Small $MSE$, and $C_p \approx p + 1$ Are Desired
Day 50-Practical Considerations-Model Selection/Multicollinearity
JMP Fit Model "Stepwise," Lack of Fit, and PRESS
The Complication of Multicollinearity Arises in MLR When One or More of the Predictors is Nearly a Linear Combination of Others of the Predictors (and Is Therefore Essentially Redundant in Practical Terms). When This Occurs (Besides There Being Technical Problems Associated With Solution of the Least Squares Fitting Problem):
While Good Prediction For Cases Like Those in the Data Set May Be Possible, Extrapolation Is Extremely Dangerous, and
Assessment of Individual Importance of Particular Predictors Is Often Impossible. (This Produces Big Standard Errors for Individual Coefficients, and Often Individual $b_l$'s That Make No Sense in the Subject-Matter Application.)
Multicollinearity Can Be Prevented If One Gets to Choose $(x_{1i}, x_{2i}, \ldots, x_{ki})$ Combinations (By Making Predictors Uncorrelated).
Day 51-Multicollinearity
With
$e_{(j)}(y_i)$ = the $i$th $y$ residual regressing on all predictor variables except $x_j$ and
$e_{(j)}(x_{ji})$ = the $i$th $x_j$ residual regressing on all predictor variables except $x_j$
JMP Plots Accompanying "Effect Tests" in Fit Model Are Plots Of
$e_{(j)}(y_i) + \bar{y}$ versus $e_{(j)}(x_{ji}) + \bar{x}_j$
So Small Spread In Horizontal Coordinates of Plotted Points Indicates Multicollinearity.
When Predictors Are Uncorrelated, Regression Sums of Squares "Add" and Fitted Coefficients $b_l$ Are The Same For All Models Including $x_l$.
Day 52-Multicollinearity/The One Way Normal Model
Multicollinearity Means That One Only Has $(x_{1i}, x_{2i}, \ldots, x_{ki})$ Essentially In Some Lower-Dimensional (Than $k$) Subspace of $k$-Dimensional Space and Thus Can Hope To Reliably Predict Only There.
One-Way Analyses Are "r-Sample" Analyses (Not Unlike the 2-Sample Analyses of Devore Ch 9). They Are Based On A Model For
$y_{ij}$ = $j$th observation in the $i$th sample
Of The Form
$y_{ij} = \mu_i + \varepsilon_{ij}$
For The $\varepsilon_{ij}$ Independent Normal Random Variables With Mean 0 and Standard Deviation $\sigma$. This Is "Samples From $r$ Normal Populations With Possibly Different Means But A Common Standard Deviation."
Day 53-Inference in the One Way Normal Model
A Single Number Estimate of $\sigma$ Is
$s_{\text{pooled}} = s_P = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_r - 1)s_r^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_r - 1)}}$
And Confidence Limits For $\sigma$ Are
$s_P\sqrt{\frac{n - r}{\chi^2_{\text{upper}}}}$ and $s_P\sqrt{\frac{n - r}{\chi^2_{\text{lower}}}}$
In The Context of Lack of Fit In Regression, $(SSPE/d.f._{PE}) = s_P^2$.
For Population and Corresponding Sample Linear Combinations of $r$ Means
$L = c_1\mu_1 + \cdots + c_r\mu_r$ and $\hat{L} = c_1\bar{y}_1 + \cdots + c_r\bar{y}_r$
Confidence Limits For $L$ Are
$\hat{L} \pm t\,s_P\sqrt{\frac{c_1^2}{n_1} + \cdots + \frac{c_r^2}{n_r}}$
Day 54-One Way ANOVA and Model Checking
In The One Way Context
$SSE = \sum_{i,j}(y_{ij} - \bar{y}_i)^2 = (n_1 - 1)s_1^2 + \cdots + (n_r - 1)s_r^2$ and
$SSTr = SSTot - SSE = \sum_{i=1}^{r} n_i(\bar{y}_i - \bar{y})^2$
And The Hypothesis $H_0\colon \mu_1 = \mu_2 = \cdots = \mu_r$ May Be Tested Using
$F = \frac{MSTr}{MSE} = \frac{SSTr/(r - 1)}{SSE/(n - r)}$
And An $F_{r-1,\,n-r}$ Reference Distribution.
One Way Residuals and Standardized Residuals Are (Respectively)
$e_{ij} = y_{ij} - \bar{y}_i$ and $e_{ij}^* = \frac{e_{ij}}{s_P\sqrt{\frac{n_i - 1}{n_i}}}$
And Are Used In Model Checking Exactly As In Regression.
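The one-way sums of squares and the F statistic can be computed directly from the sample means and variances. The sketch below uses three small made-up samples and a hypothetical function name; looking up the F percentage point (done in JMP or a table) is left out.

```python
import statistics

def one_way_anova(samples):
    """Return (SSTr, SSE, F) for a list of r samples."""
    r = len(samples)
    n = sum(len(sample) for sample in samples)
    grand_mean = sum(sum(sample) for sample in samples) / n
    # SSE pools the within-sample variation: sum of (n_i - 1) * s_i^2
    sse = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    # SSTr measures variation of the sample means around the grand mean
    sstr = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples)
    f_statistic = (sstr / (r - 1)) / (sse / (n - r))
    return sstr, sse, f_statistic

# Hypothetical data: r = 3 samples of 3 observations each
samples = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]

sstr, sse, f_statistic = one_way_anova(samples)
sstot = sum((y - 8.0) ** 2 for s in samples for y in s)   # grand mean is 8 here

print(sstr, sse, f_statistic)
print(sstot, sstr + sse)   # SSTot = SSTr + SSE
```

Here MSE = SSE/(n - r) is also the square of the pooled standard deviation from Day 53.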
Day 55-Statistics and Measurement 1
Measurand x
Measurement y
Measurement Error ε = y − x (So y = x + ε)
Standard Modeling Is That ε Is Normal With Mean β (MeasurementBias) And Standard Deviation σ
Then For Independent Measurements $y_1, y_2, \ldots, y_n$ of a Fixed $x$
$\bar{y} \pm t\frac{s_y}{\sqrt{n}}$ Estimates $x + \beta$ (Measurand Plus Bias)
Limits $s_y\sqrt{\frac{n-1}{\chi^2_{\text{Upper}}}}$ and $s_y\sqrt{\frac{n-1}{\chi^2_{\text{Lower}}}}$ Are For $\sigma$ (If Gauge And Operator Are Fixed, This Is A Repeatability Std Dev ... If Each $y_i$ Is From A Different Operator This Is an R&R Std Dev)
If $x$ Varies Independent of $\varepsilon$, The Situation is More Complex and Modeling of Multiple Measurement Depends on Exactly How Data Are Taken ... For a Single $y$, $Ey = \mu_x + \beta$ And $\sigma_y = \sqrt{\sigma_x^2 + \sigma^2}$
Day 56-Statistics and Measurement 2
So if Each of $y_1, y_2, \ldots, y_n$ Has a Different Measurand
$\bar{y} \pm t\frac{s_y}{\sqrt{n}}$ Estimates $\mu_x + \beta$ (Average Measurand Plus Bias)
Limits $s_y\sqrt{\frac{n-1}{\chi^2_{\text{Upper}}}}$ and $s_y\sqrt{\frac{n-1}{\chi^2_{\text{Lower}}}}$ Are For $\sqrt{\sigma_x^2 + \sigma^2}$ (A Combination of Measurand Variability and Measurement Variability)
Two Important Applications of This Are Where
Different $x$'s Represent The Truth About Different Items, So $\sigma_x$ Measures Process Variability
Different $x$'s Represent Different Operator-Specific Biases, So $\sigma_x$ Measures Reproducibility Variability
Where a Data Set Has $r$ Measurands and $m$ Measurements Per Measurand, One Way ANOVA Can Help Separate $\sigma_x^2$ And $\sigma^2$
$s_P$ (And Associated Confidence Limits) Estimate $\sigma$
$\sqrt{\max\left(0, \frac{1}{m}(MSTr - MSE)\right)}$ (And Limits Provided By JMP If You Know How To Ask) Estimate $\sigma_x$