Simulation for Examining Margin of Error and Sample Size: Binomial Proportions
Acknowledgements to Mandy Kauffman (WEST, Inc.) for photos and ‘background’ slides…simulation exercise adapts pedagogy of Trumbo, Suess, and Okumura (2005)
Background – Bovine brucellosis
• Bacterial disease
– History in US
– Elk, bison, cattle (humans)
– Cattle → wildlife
– Causes abortions
– Environmental contamination
– Potential transmission to cattle
• $$$$
• Management implications
Background – Elk Feedgrounds
• Harsh winters + development → elk starving, commingling with cattle
• 23 supplemental winter elk feedgrounds created
– 22 WGFD
– 1 USFWS
• Up to 84% of elk use feedgrounds
• Low winter mortality
• Costly
• 22% seroprevalence on feedgrounds, 3.7% elsewhere
Preble 1911
Background – Management
• Management strategies
1. Maintain cattle/elk separation
– hazing elk
– fencing haystacks
– elk feedgrounds
2. ↓ likelihood of exposed cattle experiencing abortions (RB51)
3. ↓ seroprevalence in elk
– T&S
– low density feeding
– elk vaccination
Background: Management
• Despite ongoing management:
– Recent cases in cattle/bison traced back to elk
– Affected area expanding
• Limited $$ available for management
– No clear scientifically sound method
– Need for economic evaluation of available management strategies
• Groups 1 & 2 already evaluated/underway
• Evaluation of Group 3 strategies still needed
– How to assess seroprevalence of brucellosis in elk on feedgrounds…how many elk to sample?
A Beginning
Let’s start by simulating brucellosis diagnoses from 25 randomly sampled elk…assume prevalence is 0.22…the goal is to estimate prevalence to within 5 percentage points (±0.05)...how many elk are needed?
Issue with the assumption of random here?
R Code:
# 0 = negative, 1 = positive; each elk is positive with probability 0.22
samp <- sample(0:1,25,replace=TRUE,prob=c(0.78,0.22))
samp
> samp
[1] 0 1 1 0 0 0 0 0 0 1 0 1 0 1 1 1 1 0 0 0 0 0 0 0 0
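As a rough illustration of what a single sample of 25 tells us, the sketch below summarizes the sample printed above. The names `p.hat` and `moe` are illustrative choices, and the interval uses the usual normal approximation with z = 1.96, which is only a crude guide at n = 25.

```r
# The 25 simulated test results printed above (1 = positive)
samp <- c(0,1,1,0,0,0,0,0,0,1,0,1,0,1,1,1,1,0,0,0,0,0,0,0,0)

p.hat <- mean(samp)                                       # sample proportion: 8/25 = 0.32
moe <- 1.96 * sqrt(p.hat * (1 - p.hat) / length(samp))    # normal-approximation margin of error

# Point estimate with approximate 95% limits
round(c(estimate = p.hat, lower = p.hat - moe, upper = p.hat + moe), 3)
```

With a margin of roughly 0.18, this one sample is far from the ±0.05 target.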
Generate a Profile
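Let’s observe the proportion of positive elk for a variety of sample sizes...keep in mind that this profile doesn’t display independent samples, since each running proportion reuses all of the elk sampled before it
R Code:
n <- 25
NumElk <- 1:n                                  # running sample size
p.bruc <- c(0.78,0.22)                         # P(negative), P(positive)
x <- sample(0:1,n,replace=TRUE,prob=p.bruc)    # simulated test results
run.tot.pos <- cumsum(x)                       # running count of positives
Proportion <- run.tot.pos/NumElk               # running proportion positive
tabresults <- round(cbind(NumElk,x,run.tot.pos,Proportion),3)
tabresults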
Generating a Profile
R Code:
plot(NumElk,Proportion,type="l", ylim= c(0,1))
abline(h=0.22,col="green",lwd=2)          # assumed true prevalence
abline(h=0.17,col="blue",lwd=2,lty=3)     # lower limit, 0.22 - 0.05
abline(h=0.27,col="blue",lwd=2,lty=3)     # upper limit, 0.22 + 0.05
Running the simulation and corresponding plots several times will produce differing versions of the profile…the variation among profiles displays the instability in our statistic
Generating Multiple Profiles
R Code:
set.seed(11); n <- 25; numsamps <- 20
plot(0,pch=" ", xlim=c(0,n), ylim = c(0.1,1.0), xlab = "NumElk", ylab="Proportion")
# The loop below produces a different profile for each of the specified
# numsamps (p.bruc is as defined earlier)
for(i in 1:numsamps){
  x <- cumsum(sample(0:1,n,replace=TRUE,prob=p.bruc)) / (1:n)
  lines(1:n,x)
}
abline(h=0.22,col="green",lwd=2)
abline(h=0.17,col="blue",lwd=2,lty=3)
abline(h=0.27,col="blue",lwd=2,lty=3)
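To put a number on the spread seen in the profiles, one option is to simulate many independent surveys and record how often the sample proportion lands inside the blue ±0.05 band. The sketch below is not from the original slides; the seed, `nsim`, and the name `within.band` are arbitrary choices, and the prevalence of 0.22 is the same assumption used throughout.

```r
set.seed(2005)     # arbitrary seed, for reproducibility only
p.true <- 0.22     # assumed prevalence, as in the slides
nsim <- 10000      # simulated surveys per sample size (arbitrary)

# For each sample size, the fraction of surveys whose sample
# proportion falls within 0.05 of the assumed prevalence
within.band <- sapply(c(25, 263), function(n) {
  p.hat <- rbinom(nsim, size = n, prob = p.true) / n
  mean(abs(p.hat - p.true) <= 0.05)
})
names(within.band) <- c("n=25", "n=263")
within.band
```

Under these assumptions, well under half of the 25-elk surveys should land in the band, while the large majority of 263-elk surveys should.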
Calculated margin of error:
$$1.96\sqrt{\frac{0.22 \times 0.78}{25}} \approx 0.1624$$
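The same calculation can be checked in R. The helper name `moe.prop` is hypothetical (it is not part of the slides or of any package); it simply wraps the normal-approximation formula above.

```r
# Normal-approximation margin of error for a binomial proportion.
# moe.prop is a hypothetical helper name used only for this sketch.
moe.prop <- function(p, n, z = 1.96) {
  z * sqrt(p * (1 - p) / n)
}

moe.25 <- moe.prop(p = 0.22, n = 25)
round(moe.25, 4)   # 0.1624, matching the value above
```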
How about 263 elk?
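One way to see where a figure near 263 comes from is to solve the margin-of-error formula for n, assuming prevalence 0.22, a target margin of 0.05, and z = 1.96. This is a sketch under those assumptions; the variable names are illustrative.

```r
p <- 0.22        # assumed prevalence
target <- 0.05   # desired margin of error
z <- 1.96        # normal quantile for approximately 95% confidence

n.exact <- (z / target)^2 * p * (1 - p)    # about 263.7
n.needed <- ceiling(n.exact)               # 264 when rounded up
moe.263 <- z * sqrt(p * (1 - p) / 263)     # about 0.0501 with 263 elk

c(n.exact = n.exact, n.needed = n.needed, moe.263 = moe.263)
```

With 263 elk the margin is essentially 0.05; rounding up to 264 brings it just under the target.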
Simulation for Determining Power
• Power is a concept that is often difficult to teach, regardless of the level of the course…probably because hypothesis testing is such a strange beast!
• Many practical applications involve evaluating sampling protocols in terms of the ability to detect change over a period of time
• Simulation is often quite effective for determining the power associated with a particular design
Monitoring Woody Vegetation in Cobble Bars
• Cobble bars are an important feature of many streams…they are home to native and non-native plants, and many species of birds nest in these regions
What is the proportion of ‘woody vegetation’ cover on cobble bars in the Great Smoky Mountains?
- 22 cobble bars exist in the BISO area…can afford to sample 9 of them
- GRTS selection of sampling units
- rotating panel monitoring schedule
[Figure: point intercept counts on a transect]
While there are built-in functions in the ‘pwr’ library for computing power and
websites such as Russ Lenth’s Power And Sample Size
(http://homepage.stat.uiowa.edu/~rlenth/Power/), it is often the case that
your study design will be more complicated than what ‘canned’ routines allow for
when computing power.
Ex 1. Suppose that we are investigating a single cobble bar and that this year we
observe 34% coverage in woody vegetation. We will go out once a year to measure the % coverage. If there is a linear trend, then the % coverage can be modeled as a function of year using:
$$\text{cover\%}_i = \beta_0 + \beta_1\,\text{year}_i + \varepsilon_i$$
Let’s scale ‘year’ to be 0, 1, 2, 3, and 4 (so a five-year study), with $\beta_0$ denoting
the % coverage in year 0, the slope, $\beta_1$, denoting the per-annum change in %
woody coverage, and the model error term, $\varepsilon_i$, denoting the uncertainty in the
measured woody percentage value within a given year (due to measurement
error, weather events, etc.).
$$H_0: \beta_1 = 0 \quad \text{vs.} \quad H_1: \beta_1 \neq 0$$
We will observe this cobble bar over time, run a linear regression, compute the
estimated slope along with its estimated standard error, and see if the p-value
is less than our specified $\alpha$.
We can estimate power by simulating many data sets with a given slope,
intercept, standard deviation, and then keeping track of how many times we
reject the null that the slope is zero. Let’s suppose we want to detect a per
annum change of 3.4% in the %woody coverage.
R Code for p-value extraction:
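bzero <- 34                                 # intercept: % cover in year 0
b1 <- 3.4                                   # slope: per-annum change to detect
yr <- 0:4                                   # five annual visits
N <- 5                                      # number of observations
sd <- 7.4                                   # standard deviation of the error term
pct_mean <- bzero + b1 * yr                 # true mean % cover in each year
pct <- rnorm(N, mean = pct_mean, sd = sd)   # simulated observations
m <- lm(pct ~ yr)                           # fit the linear trend
coef(summary(m))["yr", "Pr(>|t|)"]          # p-value for the slope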
Below, we have the true trend (i.e. 3.4% per year) plotted vs. the estimated
trend from one set of simulated data:
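A plot of this kind can be sketched as follows. This is not the original figure: the seed, colors, axis labels, and the name `sd.err` are arbitrary choices, so the simulated points and fitted line will differ from those on the slide.

```r
set.seed(42)                      # arbitrary seed, for reproducibility only
bzero <- 34; b1 <- 3.4            # true intercept and slope, as in the slides
yr <- 0:4
sd.err <- 7.4                     # error standard deviation, as in the slides

pct <- rnorm(length(yr), mean = bzero + b1 * yr, sd = sd.err)  # one simulated data set
m <- lm(pct ~ yr)                                              # estimated trend

plot(yr, pct, pch = 16, xlab = "Year", ylab = "% woody cover")
abline(a = bzero, b = b1, col = "green", lwd = 2)   # true trend
abline(m, col = "blue", lwd = 2, lty = 3)           # estimated trend
legend("topleft", legend = c("true trend", "estimated trend"),
       col = c("green", "blue"), lwd = 2, lty = c(1, 3))
```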
For this particular simulation, the p-value extracted was:
> coef(summary(m))["yr", "Pr(>|t|)"]
[1] 0.1726658
Thus, although we know there is a 3.4% per annum change, our data did not
result in a small enough p-value for us to detect an actual trend.
To compute the power, we need to do what we just did many times and then
record what percentage of times the null hypothesis would have been rejected.
We can use the code below to do this:
R Code:
numsim <- 500                      # number of simulated data sets
pvals <- numeric(numsim)           # storage for the slope p-values
for(i in 1:numsim){
  pct_mean <- bzero + b1 * yr
  pct <- rnorm(N, mean = pct_mean, sd = sd)
  m <- lm(pct ~ yr)
  pvals[i] <- coef(summary(m))["yr", "Pr(>|t|)"]
}
sum(pvals < 0.05)/numsim           # estimated power at the 0.05 level
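Because the power estimate is itself a proportion computed from 500 simulated data sets, it carries simulation error of its own. The sketch below, which is an addition rather than part of the original slides, repeats the simulation with `replicate` and attaches a binomial standard error to the estimate. The seed and the names `sd.err`, `power.hat`, and `mc.se` are arbitrary choices.

```r
set.seed(123)                     # arbitrary seed, for reproducibility only
bzero <- 34; b1 <- 3.4; yr <- 0:4
sd.err <- 7.4                     # error standard deviation, as in the slides
numsim <- 500

# One slope p-value per simulated five-year data set
pvals <- replicate(numsim, {
  pct <- rnorm(length(yr), mean = bzero + b1 * yr, sd = sd.err)
  coef(summary(lm(pct ~ yr)))["yr", "Pr(>|t|)"]
})

power.hat <- mean(pvals < 0.05)                        # estimated power
mc.se <- sqrt(power.hat * (1 - power.hat) / numsim)    # Monte Carlo standard error
c(power = power.hat, mc.se = mc.se)
```

With 500 simulations the standard error cannot exceed about 0.022, so differences in estimated power of a few percentage points between runs are to be expected.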
As a Single Function
lin_reg_sim <- function(bzero,b1,numyear,sd,numsim){
  yr <- 0:(numyear-1)
  pvals <- numeric(numsim)
  for (i in 1:numsim) {
    pct_mean <- bzero + b1 * yr
    pct <- rnorm(numyear, mean = pct_mean, sd = sd)
    m <- lm(pct ~ yr)
    pvals[i] <- coef(summary(m))["yr", "Pr(>|t|)"]
  } # end of i loop
  power <- (sum(pvals < 0.05)/numsim)
  return (power)
} # end of function lin_reg_sim
lin_reg_sim(bzero=34,b1=3.4,numyear=5,sd=7.4,numsim=500)
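Wrapping the simulation in a function makes it easy to vary one design choice at a time. As a sketch that goes slightly beyond the slides, the code below restates the function so the block runs on its own and then calls it for studies of 5, 10, and 15 years. The seed, the choice of study lengths, and the name `power.by.years` are arbitrary.

```r
# Restated from the slide so that this block is self-contained
lin_reg_sim <- function(bzero, b1, numyear, sd, numsim) {
  yr <- 0:(numyear - 1)
  pvals <- numeric(numsim)
  for (i in 1:numsim) {
    pct <- rnorm(numyear, mean = bzero + b1 * yr, sd = sd)
    m <- lm(pct ~ yr)
    pvals[i] <- coef(summary(m))["yr", "Pr(>|t|)"]
  }
  sum(pvals < 0.05) / numsim
}

set.seed(7)                       # arbitrary seed, for reproducibility only
years <- c(5, 10, 15)             # study lengths to compare (arbitrary)
power.by.years <- sapply(years, function(k)
  lin_reg_sim(bzero = 34, b1 = 3.4, numyear = k, sd = 7.4, numsim = 500))
names(power.by.years) <- paste0(years, "yr")
power.by.years
```

Under these assumptions, the estimated power should rise sharply as more years are added, since longer studies pin down the slope much more precisely.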
The code below computes power as we change the effect size (i.e. the per annum change)
R Code:
bzero <- 34; yr <- 0:4; N <- 5; sd <- 7.4
bvec <- seq(0.85,8.5,by=0.85)             # candidate slopes (effect sizes)
power.b <- numeric(length(bvec))          # estimated power for each slope
pvals <- numeric(500)                     # storage for the slope p-values
for (j in 1:length(bvec)){
  b1 <- bvec[j]
  for (i in 1:500) {
    pct_mean <- bzero + b1 * yr
    pct <- rnorm(N, mean = pct_mean, sd = sd)
    m <- lm(pct ~ yr)
    pvals[i] <- coef(summary(m))["yr", "Pr(>|t|)"]
  } # end of i loop
  power.b[j] <- (sum(pvals < 0.05)/500)
} # end of j loop
plot(bvec,power.b,xlab="slopes",ylab="power")
Just as we would expect, the power increases with larger trends.