Chapter 4 Simple Random Sampling (SRS)

Upload: polly-cook

Post on 04-Jan-2016


Page 1: Chapter 4 Simple Random Sampling (SRS). SRS SRS – Every sample of size n drawn from a population of size N has the same chance of being selected. Use

Chapter 4

Simple Random Sampling (SRS)


SRS

• SRS – Every sample of size n drawn from a population of size N has the same chance of being selected.

• Use table of random numbers (A.2) or computer software.

• Using the table:
– Assign every sampling unit a numeric label
– Use the table of random numbers to select the sample
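The "computer software" route can be sketched in Python as below. This is a minimal illustration, not a prescribed method: the function name `draw_srs`, the sizes N = 100 and n = 5, and the seed are all assumptions chosen for the example.

```python
import random

def draw_srs(population_size, sample_size, seed=None):
    """Return a sorted simple random sample of unit labels 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every subset of the
    # requested size has the same chance of being selected.
    labels = range(1, population_size + 1)
    return sorted(rng.sample(labels, sample_size))

# Hypothetical sizes, chosen only for illustration.
sample = draw_srs(population_size=100, sample_size=5, seed=1)
print(sample)
```

Fixing the seed makes the draw reproducible; leave it as `None` for a fresh sample each run.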


Example

• In a population of N = 450, select a sample of size 10 using the table of random digits.
– Starting digit value _______
– Ending digit value _______
– Line number started at _______
– Sample digits selected for sample:
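The table procedure for this example can be imitated in code. The sketch below assumes units are labeled 001 to 450 and that the table is read in consecutive three-digit groups, skipping 000, values above 450, and repeats. Table A.2 is not reproduced here, so a pseudo-random digit string stands in for it; the function name and seed are hypothetical.

```python
import random

def select_from_digit_stream(digits, population_size, sample_size):
    """Read consecutive fixed-width groups and keep valid, unused labels."""
    width = len(str(population_size))  # 3 digits when N = 450
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Skip 000, labels above N, and labels already selected.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

# Stand-in for a row-by-row reading of a random digit table (assumption).
rng = random.Random(2016)
digit_stream = "".join(str(rng.randrange(10)) for _ in range(600))
sample = select_from_digit_stream(digit_stream, population_size=450, sample_size=10)
print(sample)
```

Discarding out-of-range groups wastes digits but keeps every label equally likely, which is what SRS requires.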


Estimating population average from SRS

• We use $\bar{y} = \sum y_i / n$ to estimate $\mu$ ($\bar{y}$ is an unbiased estimator of $\mu$)

• We use $s^2$ to estimate $\sigma^2$ (unbiased estimator)

• From previous, we know that $V(\bar{y}) = \sigma^2 / n$ (infinite or extremely large population)

• If finite population, then $V(\bar{y}) = \left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}$

• When we replace $\sigma^2$ by $s^2$, this becomes the estimated variance of $\bar{y}$: $\hat{V}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{s^2}{n}$


Bound on the error of estimation

• Using 2 standard errors as our bound (think of the margin of error, MOE), we have $B = 2\sqrt{\left(1 - \frac{n}{N}\right)\frac{s^2}{n}}$

• When can the finite population correction (fpc) be dropped? A good rule of thumb is when $(1 - n/N) > 0.95$, that is, when the sample is less than 5% of the population

• Want data to be approximately normal (sometimes transformations can be used; the log transformation is one of the most popular)

• Box people example

• Problem 4.16 (and put a bound on it)
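As a rough sketch of the calculation on this slide, the code below computes the sample mean and the 2-standard-error bound with the fpc. The data values and N = 200 are made up for illustration, and the function name `estimate_mean` is hypothetical.

```python
import math
import statistics

def estimate_mean(sample, population_size):
    """Return (y_bar, bound) for an SRS, using the finite population correction."""
    n = len(sample)
    y_bar = statistics.mean(sample)
    s2 = statistics.variance(sample)   # sample variance, divisor n - 1
    fpc = 1 - n / population_size      # finite population correction
    estimated_variance = fpc * s2 / n
    bound = 2 * math.sqrt(estimated_variance)
    return y_bar, bound

# Hypothetical SRS of n = 8 from a population of N = 200.
sample = [12, 15, 11, 18, 14, 16, 13, 17]
y_bar, bound = estimate_mean(sample, population_size=200)
print(f"estimate = {y_bar:.2f}, bound = {bound:.3f}")
```

Here 1 - n/N = 0.96, so by the rule of thumb the fpc could have been dropped with little effect.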


Estimating population total using SRS

• Since an SRS gives every unit the same chance of being selected, we set $\delta_i = n/N$

• We use $\hat{\tau}$ to estimate $\tau$ ($\hat{\tau} = \sum y_i / \delta_i = N\bar{y}$ is an unbiased estimator of $\tau$)

• Therefore, for a finite population, $V(\hat{\tau}) = N^2\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}$

• When we replace $\sigma^2$ by $s^2$, this becomes the estimated variance of $\hat{\tau}$: $\hat{V}(\hat{\tau}) = N^2\left(1 - \frac{n}{N}\right)\frac{s^2}{n}$


Bound on the error of estimation

• Using 2 standard errors as our bound (think of MOE), we have $B = 2\sqrt{N^2\left(1 - \frac{n}{N}\right)\frac{s^2}{n}}$

• Normality is still important here! (Transform if necessary, e.g. with a small sample size and skewed data)

• Problem 4.17
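A minimal sketch of the total estimate and its bound, assuming the estimator N times y-bar with the fpc as on the preceding slide. The data and N = 200 are invented for illustration, and `estimate_total` is a hypothetical name.

```python
import math
import statistics

def estimate_total(sample, population_size):
    """Return (tau_hat, bound) for the population total from an SRS."""
    n = len(sample)
    y_bar = statistics.mean(sample)
    s2 = statistics.variance(sample)   # sample variance, divisor n - 1
    tau_hat = population_size * y_bar  # N * y-bar
    estimated_variance = population_size**2 * (1 - n / population_size) * s2 / n
    bound = 2 * math.sqrt(estimated_variance)
    return tau_hat, bound

# Hypothetical SRS of n = 8 from a population of N = 200.
sample = [12, 15, 11, 18, 14, 16, 13, 17]
tau_hat, bound = estimate_total(sample, population_size=200)
print(f"estimated total = {tau_hat:.1f}, bound = {bound:.1f}")
```

The bound for the total is simply N times the bound for the mean, since the variance scales by N squared.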


Selecting Sample Size for $\mu$

• Use the variance of $\bar{y}$, which is $V(\bar{y}) = \left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}$

• Set $B = 2\sqrt{V(\bar{y})}$, which is $B = 2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}}$, and solve for $n$, which yields $n = \frac{N\sigma^2}{(N-1)D + \sigma^2}$ where $D = B^2/4$

• Since $\sigma^2$ is usually not known, estimate it with $s^2$ (or use $\sigma \approx$ range/4)
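The algebra behind the sample-size formula can be spelled out as follows. This is a sketch of the intermediate steps, writing $\sigma^2$ for the population variance and assuming the bound $B$ is set to two standard errors of $\bar{y}$.

```latex
\begin{align*}
B &= 2\sqrt{\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}} \\
D = \frac{B^2}{4} &= \frac{(N-n)\,\sigma^2}{(N-1)\,n} \\
(N-1)\,D\,n &= N\sigma^2 - n\sigma^2 \\
n\left[(N-1)D + \sigma^2\right] &= N\sigma^2 \\
n &= \frac{N\sigma^2}{(N-1)D + \sigma^2}
\end{align*}
```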


Selecting Sample Size for $\tau$

• Set $B = 2\sqrt{N^2 V(\bar{y})}$, which is $B = 2\sqrt{N^2\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}}$, and solve for $n$, which yields $n = \frac{N\sigma^2}{(N-1)D + \sigma^2}$ where $D = \frac{B^2}{4N^2}$

• Since $\sigma^2$ is usually not known, estimate it with $s^2$ (or use $\sigma \approx$ range/4)
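The two sample-size formulas differ only in how D is defined, so one function can cover both. The sketch below rounds up to the next whole unit, which is a common convention and an assumption here. The function name, N = 1000, and the variance guess of 625 (from a hypothetical range of 100, so range/4 = 25) are illustrative only.

```python
import math

def sample_size(population_size, sigma2, bound, for_total=False):
    """Sample size needed to estimate a mean (or total) within the given bound."""
    if for_total:
        d = bound**2 / (4 * population_size**2)  # D = B^2 / (4 N^2)
    else:
        d = bound**2 / 4                         # D = B^2 / 4
    n = population_size * sigma2 / ((population_size - 1) * d + sigma2)
    return math.ceil(n)  # round up so the bound is not exceeded

# Hypothetical planning values: N = 1000, variance guess 625.
n_mean = sample_size(1000, sigma2=625, bound=3)
n_total = sample_size(1000, sigma2=625, bound=3000, for_total=True)
print(n_mean, n_total)
```

A bound of 3 on the mean and a bound of 3000 on the total give the same D when N = 1000, so both calls return the same sample size.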


Examples

• 4.13, 4.23, 4.24, 4.27, 4.28


4.5 Estimation of a Population Proportion

• Define $y_i = 0$ if unit $i$ does not have the characteristic of interest and $y_i = 1$ if it does

• Then $\hat{p} = \sum y_i / n$

• $\hat{p}$ is an unbiased estimator of $p$

• Estimated variance of $\hat{p}$ (for infinite population sizes) is $\hat{p}\hat{q}/n$

• Estimated variance of $\hat{p}$ (for finite population sizes) is $\left(1 - \frac{n}{N}\right)\frac{\hat{p}\hat{q}}{n-1}$, where $\hat{q} = 1 - \hat{p}$

• Bound: $B = 2\sqrt{\hat{V}(\hat{p})}$

• Problem 4.14
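A short sketch of the proportion estimate and its bound, using the finite-population form of the estimated variance. The counts (30 of 100 sampled units having the characteristic, N = 2000) are invented, and `estimate_proportion` is a hypothetical name.

```python
import math

def estimate_proportion(successes, n, population_size):
    """Return (p_hat, bound) for a proportion estimated from an SRS."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Finite-population estimated variance: (1 - n/N) * p_hat * q_hat / (n - 1)
    estimated_variance = (1 - n / population_size) * p_hat * q_hat / (n - 1)
    return p_hat, 2 * math.sqrt(estimated_variance)

# Hypothetical survey: 30 of n = 100 sampled units, N = 2000.
p_hat, bound = estimate_proportion(successes=30, n=100, population_size=2000)
print(f"p_hat = {p_hat:.3f}, bound = {bound:.3f}")
```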


To estimate sample size

• $n = \frac{Npq}{(N-1)D + pq}$ where $D = B^2/4$

• If $p$ is unknown, then we use $p = 0.5$

• Normality is important here!

• Problem 4.15

• Question: All the bounds that we have looked at so far assume what level of confidence?
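The sample-size formula for a proportion can be sketched as below, with p = 0.5 as the default when nothing is known about p. Rounding up is an assumed convention, and N = 2000 with B = 0.05 are hypothetical planning values.

```python
import math

def sample_size_for_proportion(population_size, bound, p=0.5):
    """Sample size needed to estimate a proportion within the given bound."""
    q = 1 - p
    d = bound**2 / 4  # D = B^2 / 4
    n = population_size * p * q / ((population_size - 1) * d + p * q)
    return math.ceil(n)  # round up so the bound is not exceeded

# Hypothetical planning values: N = 2000, B = 0.05, p unknown.
n_needed = sample_size_for_proportion(population_size=2000, bound=0.05)
print(n_needed)
```

Using p = 0.5 maximizes pq, so it gives the largest (most conservative) sample size.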


4.6 Comparing Estimates

• Comparing two means, two totals, or two proportions:

• Quantity of interest is $\hat{\theta}_1 - \hat{\theta}_2$

• Variance of quantity of interest is $V(\hat{\theta}_1) + V(\hat{\theta}_2) - 2\,\mathrm{cov}(\hat{\theta}_1, \hat{\theta}_2)$

NOTE: We will NOT be using the finite population correction factor in this section!

• If statistics come from two independent samples, then $\mathrm{cov}(\hat{\theta}_1, \hat{\theta}_2) = 0$

• Problem 4.18


Examples

• A question asked of high school students was whether they had lied to a teacher at least once during the past year. The information is presented below.

Lied at least once    Male    Female
Yes                   3228    10295
No                    9659     4620

Find the estimated difference in proportion for those who lied at least once to the teacher during the past year by gender. Place a bound on this estimated difference.*

*Source: Moore, McCabe and Craig
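One way to set up this calculation is sketched below. It assumes the male and female counts can be treated as two independent samples (so the covariance term is zero), drops the fpc as noted for this section, and uses the large-population variance form p-hat times q-hat over n for each group. The function name is hypothetical.

```python
import math

def difference_with_bound(yes_1, n_1, yes_2, n_2):
    """Difference of two independent sample proportions and its 2-SE bound."""
    p1 = yes_1 / n_1
    p2 = yes_2 / n_2
    # Independent samples: variances add and the covariance term is zero.
    variance = p1 * (1 - p1) / n_1 + p2 * (1 - p2) / n_2
    return p1 - p2, 2 * math.sqrt(variance)

# Counts taken from the table above.
males_yes, males_no = 3228, 9659
females_yes, females_no = 10295, 4620
diff, bound = difference_with_bound(
    males_yes, males_yes + males_no,
    females_yes, females_yes + females_no,
)
print(f"male - female = {diff:.4f}, bound = {bound:.4f}")
```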


Multinomial example

• If statistics are from a multinomial distribution, then $\mathrm{cov}(\hat{\theta}_1, \hat{\theta}_2) = -p_1 p_2 / n$

• In a class with 30 students, the table below illustrates the breakdown of the class:

Class        Count
Freshmen     10
Sophomore    5
Junior       7
Senior       8

Estimate the difference in percent Freshmen and percent Junior and place a bound on this difference.
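A sketch of this calculation follows. It treats the 30 students as the sample (n = 30), omits the fpc as noted for this section, estimates each variance by p-hat times q-hat over n, and plugs the sample proportions into the multinomial covariance. These choices and the function name are assumptions for illustration.

```python
import math

def multinomial_difference(count_1, count_2, n):
    """Difference of two proportions from one multinomial sample, with 2-SE bound."""
    p1 = count_1 / n
    p2 = count_2 / n
    # V(p1) + V(p2) - 2*cov, with cov = -p1*p2/n, so the last term is added.
    variance = p1 * (1 - p1) / n + p2 * (1 - p2) / n + 2 * p1 * p2 / n
    return p1 - p2, 2 * math.sqrt(variance)

# Freshmen = 10 and Junior = 7 out of 30, from the table above.
diff, bound = multinomial_difference(count_1=10, count_2=7, n=30)
print(f"difference = {diff:.3f}, bound = {bound:.3f}")
```

Because the two categories compete for the same students, the negative covariance makes the variance of the difference larger than it would be for independent samples.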