Fixed vs. Random Effects
Fixed effect
– we are interested in the effects of the treatments (or blocks) per se
– if the experiment were repeated, the levels would be the same
– conclusions apply to the treatment (or block) levels that were tested
– treatment (or block) effects sum to zero: $\sum_i \tau_i = 0$
Random effect
– represents a sample from a larger reference population
– the specific levels used are not of particular interest
– conclusions apply to the reference population
• inference space may be broad (all possible random effects) or narrow (just the random effects in the experiment)
– goal is generally to estimate the variance among treatments (or other groups): $\sigma^2_T$
Need to know which effects are fixed or random to determine appropriate F tests in ANOVA
Fixed or Random?
– lambs born from common parents (same ram and ewe) are given different formulations of a vitamin supplement
– comparison of new herbicides for potential licensing
– comparison of herbicides used in different decades (1980s, 1990s, 2000s)
– nitrogen fertilizer treatments at rates of 0, 50, 100, and 150 kg N/ha
– years of evaluation of new canola varieties (2008, 2009, 2010)
– location of a crop rotation experiment that is conducted on three farmers' fields in the Willamette Valley (Junction City, Albany, Woodburn)
– species of trees in an old growth forest
Fixed and random models for the CRD
$$Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$$
Fixed Model (Model I)
Source       df      Expected Mean Square
Treatment    t-1     $\sigma^2_e + r\theta^2_T$
Error        tr-t    $\sigma^2_e$
where $\theta^2_T = \sum_i \tau_i^2 / (t-1)$ is the variance among fixed treatment effects
Random Model (Model II)
Source       df      Expected Mean Square
Treatment    t-1     $\sigma^2_e + r\sigma^2_T$
Error        tr-t    $\sigma^2_e$
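The expected mean squares for the CRD suggest a method-of-moments estimate of the treatment variance component: since MST estimates $\sigma^2_e + r\sigma^2_T$ and MSE estimates $\sigma^2_e$, the quantity (MST - MSE)/r estimates $\sigma^2_T$ under the random model (or $\theta^2_T$ under the fixed model). The following is a minimal sketch for balanced data only; the function name `crd_anova` and the data values are hypothetical and not taken from the slides.

```python
from statistics import mean

def crd_anova(groups):
    """One-way ANOVA for a balanced CRD.

    groups: list of t lists, each holding the r observations of one treatment.
    """
    t = len(groups)
    r = len(groups[0])
    if any(len(g) != r for g in groups):
        raise ValueError("this sketch assumes equal replication")
    grand = mean(y for g in groups for y in g)
    # Treatment SS: r times the squared deviations of treatment means
    sst = r * sum((mean(g) - grand) ** 2 for g in groups)
    # Error SS: deviations of observations from their own treatment mean
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    mst = sst / (t - 1)
    mse = sse / (t * r - t)
    return {
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,                  # same F test under Model I and Model II
        "var_T": (mst - mse) / r,        # moment estimate; can be negative
    }

# Hypothetical data: 3 treatments, 3 replications each
data = [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0], [9.0, 8.0, 10.0]]
result = crd_anova(data)
print(result)
```

In the CRD the F ratio is MST/MSE under both models; the fixed versus random distinction changes how the result is interpreted, not how it is computed.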
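Models for the RBD
$$Y_{ij} = \mu + \beta_i + \tau_j + \varepsilon_{ij}$$
$$\theta^2_T = \frac{\sum_j \tau_j^2}{t-1}, \qquad \theta^2_B = \frac{\sum_i \beta_i^2}{r-1}$$
Fixed Model
Source       df            Expected Mean Square
Block        r-1           $\sigma^2_e + t\theta^2_B$
Treatment    t-1           $\sigma^2_e + r\theta^2_T$
Error        (r-1)(t-1)    $\sigma^2_e$
Random Model
Source       df            Expected Mean Square
Block        r-1           $\sigma^2_e + t\sigma^2_B$
Treatment    t-1           $\sigma^2_e + r\sigma^2_T$
Error        (r-1)(t-1)    $\sigma^2_e$
Mixed Model (shown here with treatments fixed and blocks random)
Source       df            Expected Mean Square
Block        r-1           $\sigma^2_e + t\sigma^2_B$
Treatment    t-1           $\sigma^2_e + r\theta^2_T$
Error        (r-1)(t-1)    $\sigma^2_e$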
Nested (Hierarchical) Designs
Levels of one factor (B) occur within the levels of another factor (A)
Levels of B are unique to each level of A
Factor B is nested within A
Factor A = the pigs (sows)
Factor B = the piglets
Nested factors are usually random effects
Nested vs. Cross-Classified Factors
Cross-classified: all possible combinations of A and B
        B1   B2
A1      X    X
A2      X    X
A3      X    X
Nested: each unit of B is unique to each unit of A
A1        A2        A3
B1  B2    B3  B4    B5  B6
General form for degrees of freedom
– B nested in A: a(b-1)
– A*B: (a-1)(b-1)
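The two degrees-of-freedom formulas can be checked with a few lines of code. This is only an illustrative sketch; the function names are hypothetical. It also shows a standard identity: the df for B nested in A equal the df for the B main effect plus the df for the A*B interaction in the corresponding cross-classification.

```python
def df_nested(a, b):
    """df for B nested in A: a levels of A, b levels of B within each level of A."""
    return a * (b - 1)

def df_interaction(a, b):
    """df for the A*B interaction when A (a levels) and B (b levels) are crossed."""
    return (a - 1) * (b - 1)

# Layout from the slide: 3 levels of A, 2 levels of B within each
a, b = 3, 2
nested = df_nested(a, b)
crossed = df_interaction(a, b)
print(nested, crossed)

# Nested df = B main-effect df + A*B interaction df
assert nested == (b - 1) + crossed
```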
Sub-Sampling
It may be necessary or convenient to measure a treatment response on subsamples of a plot
– several soil cores within a plot
– duplicate laboratory analyses to estimate grain protein
Introduces a complication into the analysis that can be handled in one of two ways:
– compute the average for each plot and analyze normally
– subject the subsamples themselves to an analysis
The second choice gives an additional source of variation in the ANOVA, often called the sampling error
Use Sampling to Gain Precision
When making lab measurements, you will have better results if you analyze several samples to get a truer estimate of the mean.
It is often useful to determine the number of samples that would be required for your chosen level of precision.
Sampling will reduce the variability within a treatment across replications.
Stein's Sample Estimate
$$n = \frac{t_1^2 s^2}{d^2}$$
Where
– $t_1$ is the tabular t value for the desired confidence level and the degrees of freedom of the initial sample
– $d$ is the half-width of the desired confidence interval
– $s$ is the standard deviation of the initial sample
For Example
Suppose we were measuring grain protein content and we wanted to increase the precision with which we were measuring each replicate of a treatment.
Subsample
6.2        mean              6.50
7.4        variance          0.45
5.8        t (0.05, 4 df)    2.78
7.0        d                 0.50
6.1        n                 13.88
If we collected and ran five samples from the same block and same treatment, we might obtain data like that above. We decide that an alpha level of 5% is acceptable and we would like to be able to get within 0.5 units of the true mean.
$$n = \frac{t_1^2 s^2}{d^2} = \frac{(2.78)^2 (0.45)}{(0.5)^2} \approx 13.88$$
(The value 13.88 uses the unrounded t value, 2.776; with t rounded to 2.78 the result is 13.91.)
The formula indicates that to gain that type of precision, we would need to run 14 samples per block per treatment.
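The worked example can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name `stein_sample_size` is hypothetical, and the tabular t value is passed in by hand (2.7764 for a two-sided 5% level with 4 df) rather than looked up in code.

```python
import math
from statistics import variance

def stein_sample_size(samples, t1, d):
    """Stein's estimate n = t1^2 * s^2 / d^2.

    samples: initial subsample measurements
    t1:      tabular t for the chosen confidence level and len(samples) - 1 df
    d:       desired half-width of the confidence interval
    """
    s2 = variance(samples)          # sample variance of the initial sample
    return t1 ** 2 * s2 / d ** 2

subsamples = [6.2, 7.4, 5.8, 7.0, 6.1]   # the five values from the slide
n_exact = stein_sample_size(subsamples, t1=2.7764, d=0.5)
n_needed = math.ceil(n_exact)            # round up to a whole number of samples
print(round(n_exact, 2), n_needed)
```

Rounding up rather than to the nearest integer keeps the achieved half-width at or below the target.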
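Linear model with sub-sampling
For a CRD
$$Y_{ijk} = \mu + \tau_i + \varepsilon_{ij} + \delta_{ijk}$$
– $\mu$ = mean effect
– $\tau_i$ = ith treatment effect
– $\varepsilon_{ij}$ = random error
– $\delta_{ijk}$ = sampling error
For an RBD
$$Y_{ijk} = \mu + \beta_i + \tau_j + \varepsilon_{ij} + \delta_{ijk}$$
– $\mu$ = mean effect
– $\beta_i$ = ith block effect
– $\tau_j$ = jth treatment effect
– $\varepsilon_{ij}$ = treatment x block interaction, treated as error
– $\delta_{ijk}$ = sampling error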
Expected Mean Squares – RBD with subsampling
Source            df            Expected Mean Square
Block             r-1           $\sigma^2_s + n\sigma^2_e + tn\sigma^2_b$
Treatment         t-1           $\sigma^2_s + n\sigma^2_e + rn\theta^2_t$
Error             (r-1)(t-1)    $\sigma^2_s + n\sigma^2_e$
Sampling Error    rt(n-1)       $\sigma^2_s$
In this example, treatments are fixed and blocks are random effects, so the treatment term is $\theta^2_t = \sum_j \tau_j^2 / (t-1)$
This is a mixed model because it includes both fixed and random effects
Appropriate F tests can be determined from the Expected Mean Squares
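The RBD ANOVA with Subsampling
Source            df            SS       MS                         F
Total             rtn-1         SSTot
Block             r-1           SSB      MSB = SSB/(r-1)
Trtmt             t-1           SST      MST = SST/(t-1)            FT = MST/MSE
Error             (r-1)(t-1)    SSE      MSE = SSE/[(r-1)(t-1)]     FE = MSE/MSS
Sampling Error    rt(n-1)       SSS      MSS = SSS/[rt(n-1)]
Sums of squares (i = block, j = treatment, k = subsample):
$$SSTot = \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2$$
$$SSB = tn \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$$
$$SST = rn \sum_j (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$$
$$SSE = n \sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 - SSB - SST$$
$$SSS = SSTot - SSB - SST - SSE$$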
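The sums of squares for the RBD with subsampling can be computed directly from plot-level lists of subsample values. The sketch below assumes balanced data (r blocks, t treatments, n subsamples per plot) stored as `data[block][treatment] = [subsample values]`; the function name and the small data set are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
from statistics import mean

def rbd_subsampling_anova(data):
    """ANOVA for a balanced RBD with subsampling.

    data[i][j] is the list of n subsample values for block i, treatment j.
    """
    r, t, n = len(data), len(data[0]), len(data[0][0])
    grand = mean(y for block in data for plot in block for y in plot)
    block_means = [mean(y for plot in block for y in plot) for block in data]
    trt_means = [mean(y for block in data for y in block[j]) for j in range(t)]
    plot_means = [mean(plot) for block in data for plot in block]

    ss_tot = sum((y - grand) ** 2 for block in data for plot in block for y in plot)
    ssb = t * n * sum((m - grand) ** 2 for m in block_means)
    sst = r * n * sum((m - grand) ** 2 for m in trt_means)
    # Among-plot SS minus blocks and treatments leaves experimental error
    sse = n * sum((m - grand) ** 2 for m in plot_means) - ssb - sst
    sss = ss_tot - ssb - sst - sse          # sampling error by subtraction

    mst = sst / (t - 1)
    mse = sse / ((r - 1) * (t - 1))
    mss = sss / (r * t * (n - 1))
    return {"SSTot": ss_tot, "SSB": ssb, "SST": sst, "SSE": sse, "SSS": sss,
            "MST": mst, "MSE": mse, "MSS": mss,
            "FT": mst / mse, "FE": mse / mss}

# Hypothetical data: 2 blocks x 2 treatments x 2 subsamples
data = [
    [[10.0, 12.0], [15.0, 17.0]],   # block 1: treatment 1, treatment 2
    [[11.0, 13.0], [18.0, 20.0]],   # block 2: treatment 1, treatment 2
]
anova = rbd_subsampling_anova(data)
print(anova)
```

Note that treatments are tested against MSE (plot-to-plot variation), not against the sampling error.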
Significance Tests
MSS estimates
– the variation among samples
MSE estimates
– the variation among samples plus
– the variation among plots treated alike
MST estimates
– the variation among samples plus
– the variation among plots treated alike plus
– the variation among treatment means
Therefore:
FE
– tests the significance of the variation among plots treated alike
FT
– tests the significance of the differences among the treatment means
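Because MSS estimates the sampling variance alone and MSE estimates the sampling variance plus n times the plot-to-plot variance, the two components can be separated by subtraction. The sketch below is one way to do this; the function name and the input values are hypothetical, and truncating a negative estimate at zero is a common convention that the slides do not state.

```python
def variance_components(mse, mss, n):
    """Moment estimates of the sampling and experimental-error variances.

    mse: error mean square (estimates var_s + n * var_e)
    mss: sampling-error mean square (estimates var_s)
    n:   number of subsamples per plot
    """
    var_s = mss
    var_e = max(0.0, (mse - mss) / n)   # assumption: negative estimates set to 0
    return var_s, var_e

# Hypothetical mean squares with n = 2 subsamples per plot
var_s, var_e = variance_components(mse=6.0, mss=2.0, n=2)
f_e = 6.0 / 2.0                          # FE = MSE / MSS
print(var_s, var_e, f_e)
```

These two estimates are the inputs needed when deciding how to divide effort between replications and subsamples.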
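Means and Standard Errors
Standard Error of a treatment mean
$$s_{\bar{Y}} = \sqrt{MSE/(rn)}$$
Confidence interval estimate
$$L = \bar{Y}_i \pm t\sqrt{MSE/(rn)}$$
Standard Error of a difference
$$s_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{2MSE/(rn)}$$
Confidence interval estimate
$$L = (\bar{Y}_1 - \bar{Y}_2) \pm t\sqrt{2MSE/(rn)}$$
t to test difference between two means
$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{2MSE/(rn)}}$$
Variance of a treatment mean in terms of the variance components
$$s^2_{\bar{Y}} = \frac{MSE}{rn} = \frac{\sigma^2_s + n\sigma^2_e}{rn} = \frac{\sigma^2_s}{rn} + \frac{\sigma^2_e}{r}$$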
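These standard errors and intervals are simple functions of MSE, r and n. The following sketch shows one way to code them; all function names and numeric inputs are hypothetical, and the tabular t value is supplied by the user for the error degrees of freedom, (r-1)(t-1), rather than computed.

```python
import math

def se_mean(mse, r, n):
    """Standard error of a treatment mean: sqrt(MSE / (r n))."""
    return math.sqrt(mse / (r * n))

def se_diff(mse, r, n):
    """Standard error of a difference between two treatment means."""
    return math.sqrt(2 * mse / (r * n))

def ci_mean(ybar, mse, r, n, t_crit):
    """Confidence limits for a treatment mean, given a tabular t value."""
    half = t_crit * se_mean(mse, r, n)
    return ybar - half, ybar + half

def t_difference(ybar1, ybar2, mse, r, n):
    """t statistic for the difference between two treatment means."""
    return (ybar1 - ybar2) / se_diff(mse, r, n)

# Hypothetical values: MSE = 8, r = 4 blocks, n = 2 subsamples, 3 error df
low, high = ci_mean(20.0, mse=8.0, r=4, n=2, t_crit=3.182)
t_obs = t_difference(24.0, 20.0, mse=8.0, r=4, n=2)
print(se_mean(8.0, 4, 2), (low, high), t_obs)
```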
Allocating resources – reps vs samples
Cost function
C = c1r + c2rn
– c1 = cost of an experimental unit
– c2 = cost of a sampling unit
Estimate of the number of samples per experimental unit
$$n = \sqrt{\frac{c_1 \sigma^2_s}{c_2 \sigma^2_e}}$$
If your goal is to minimize variance for a fixed cost, use the estimate of n to solve for r in the cost function
If your goal is to minimize cost for a fixed variance, use the estimate of n to solve for r using the formula for a variance of a treatment mean
$$\sigma^2_{\bar{y}} = \frac{\sigma^2_s}{rn} + \frac{\sigma^2_e}{r}$$
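The allocation rules can be sketched as three small functions: one for the number of subsamples per plot, and one each for solving for r under a fixed budget or a fixed target variance. The function names, costs and variance components below are hypothetical illustration values, not figures from the slides or from Kuehl.

```python
import math

def optimal_subsamples(c1, c2, var_s, var_e):
    """n = sqrt(c1 * var_s / (c2 * var_e))."""
    return math.sqrt((c1 * var_s) / (c2 * var_e))

def reps_for_fixed_cost(total_cost, c1, c2, n):
    """Solve C = c1*r + c2*r*n for r."""
    return total_cost / (c1 + c2 * n)

def reps_for_fixed_variance(target_var, var_s, var_e, n):
    """Solve var_mean = var_s/(r n) + var_e/r for r."""
    return (var_s / n + var_e) / target_var

# Hypothetical inputs: plot cost 8, sample cost 2, var_s = 4, var_e = 1
n_opt = optimal_subsamples(c1=8.0, c2=2.0, var_s=4.0, var_e=1.0)
r_cost = reps_for_fixed_cost(total_cost=160.0, c1=8.0, c2=2.0, n=n_opt)
r_var = reps_for_fixed_variance(target_var=0.25, var_s=4.0, var_e=1.0, n=n_opt)
print(n_opt, r_cost, r_var)
```

In practice n and r would be rounded to whole numbers, and the resulting cost or variance rechecked after rounding.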
See Kuehl pg 163 for an example