chapter 8 - iit kanpurhome.iitk.ac.in/~shalab/sampling/wordfiles-sampling/... · web viewdouble...
TRANSCRIPT
Page 1
Chapter 8
Double Sampling (Two Phase Sampling)
The ratio and regression methods of estimation require the knowledge of population mean of an
auxiliary variable to estimate the population mean of study variable If the information on the
auxiliary variable is not available, then there are two options – one option is to collect a sample only on
study variable, and use sample mean as an estimator of the population mean.
An alternative solution is to use a part of the budget for collecting information on an auxiliary variable
to collect a large preliminary sample in which alone is measured. The purpose of this sampling is to
furnish a good estimate of . This method is appropriate when the information about is on file
cards that have not been tabulated. After collecting a large preliminary sample of size units from the
population, select a smaller sample of size from it and collect the information on . These two
estimates are then used to obtain an estimator of the population mean This procedure of selecting a
large sample for collecting information on auxiliary variable and then selecting a sub-sample from it
for collecting the information on the study variable y is called double sampling or two-phase sampling.
It is useful when it is considerably cheaper and quicker to collect data on than and there is a high
correlation between and
In this sampling, the randomization is done twice. First, a random sample of size is drawn from a
population of size and then again a random sample of size is drawn from the first sample of size
.
So the sample mean in this sampling is a function of the two phases of sampling. If SRSWOR is
utilized to draw the samples at both the phases, then the
- number of possible samples at the first phase when a sample of size is drawn from a
population of size is , say.
- number of possible samples at the second phase where a sample of size is drawn from the
first phase sample of size is , say.Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 2
Population of X (N units)
Sample (Large) units
Subsample(small)n units
0 samples M
1 samples M
Then the sample mean is a function of two variables. If is the statistic calculated at the second phase
such that being the probability that ith sample is chosen at the
first phase and jth sample is chosen at the second phase, then
where denotes the expectation over the second phase and denotes the expectation over the
first phase. Thus
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 3
Variance of
Note: The two-phase sampling can be extended to more than two phases depending upon the need and
objective of the experiment. Various expectations can also be extended on similar lines.
Double sampling in ratio method of estimation
If the population mean is not known, then the double sampling technique is applied. Take a large
initial sample of size by SRSWOR to estimate the population mean as
.
Then a second sample is a subsample of size selected from the initial sample by SRSWOR. Let
be the means of and based on the subsample. Then
The ratio estimator under double sampling now becomes
.
The exact expressions for the bias and mean squared error of are difficult to derive. So we find
their approximate expressions using the same approach mentioned while describing the ratio method of
estimation.
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 4
Let
where is the sample mean of based on the sample size
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 5
where is the mean sum of squares of based on an initial sample of size
where is the variance of mean of based on an initial sample of size .
Estimation error of
Write
up to the terms of order two. Other terms of degree higher than two are assumed to be negligible.
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 6
Bias of
The bias is negligible if is large and relative bias vanishes if i.e., the regression line passes through the origin.
MSE of
The second term is the contribution of the second phase of sampling. This method is preferred over the
ratio method if
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 7
Choice of and Write
where contain all the terms containing and respectively.
The cost function is where and are the costs per unit for selecting the samples
and respectively.
Now we find the optimum sample sizes and for fixed cost The Lagrangian function is
Thus
and so
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 8
Comparison with SRS
If is ignored and all resources are used to estimate , then required sample size =
Double sampling in regression method of estimation
When the population mean of the auxiliary variable is not known, then double sampling is used as
follows:
- A large sample of size is taken from of the population by SRSWOR from which the
population mean is estimated as , i.e.
- Then a subsample of size is chosen from the larger sample and both the variables and
are measured from it by taking in place of and treat it as if it is known.
Then . The regression estimate of in this case is given by
where based on the sample of size .
It is difficult to find the exact properties like bias and mean squared error of , so we derive the
approximate expressions.
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 9
Let
Define
Estimation error:Then
Retaining the powers of up to order two assuming (using the same concept as detailed in
the case of ratio method of estimation)
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 10
Bias:
Mean squared error:
Retaining the powers of up to order two, the mean squared error up to the second order of
approximation is
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 11
Clearly, is more efficient than the sample mean SRS, i.e. when no auxiliary variable is used.
Now we address the issue of whether the reduction in variability is worth the extra expenditure
required to observe the auxiliary variable.
Let the total cost of the survey is
where are the costs per unit observing the study variable and auxiliary variable ,
respectively.
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 12
Now minimize the for fixed cost using the Lagrangian function with Lagrangian
multiplier as
Thus
and .
Substituting these values in the cost function, we have
Thus the optimum values of are
The optimum mean squared error of is obtained by substituting as
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 13
The optimum variance of under SRS for SRS where no auxiliary information is used is
which is obtained by substituting in . The relative efficiency is
Thus the double sampling in regression estimator will lead to gain in precision if
.
Double sampling for probability proportional to size estimation:
Suppose it is desired to select the sample with probability proportional to an auxiliary variable but
information on is not available. Then, in this situation, the double sampling can be used. An initial
sample of size is selected with SRSWOR from a population of size and information on is
collected for this sample. Then a second sample of size is selected with replacement and with
probability proportional to from the initial sample of size . Let denote the mean of for the
initial sample of size , Let and denote means respectively of and for the second sample
of size . Then we have the following theorem.
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 14
Theorem:
(1) An unbiased estimator of the population mean is given as
where denotes the total for in the first sample.
(2) and denote the totals of
and respectively in the population.
(3) An unbiased estimator of the variance of is given by
Proof. Before deriving the results, we first mention the following result proved in varying probability
scheme sampling.
Result: In sampling with varying probability scheme for drawing a sample of size from a population
of size and with replacement.
(i) is an unbiased estimator of the population mean being the
probability of selection of ith unit. Note that and can take any one of the N values
with initial probabilities respectively.
(ii) .
(iii) An unbiased estimator of the variance of is
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 15
.
Let denote the expectation of , when the first sample is fixed. The second is selected with
probability proportional to , hence using the result (i) with we find that
where is the mean of for the first sample. Hence
which proves the part (1) of the theorem. Further,
Now, using the result (ii), we get
and hence
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 16
using the probability of a specified pair of units being selected in the sample is So we can
express
Substituting this in , we get
This proves the second part (2) of the theorem.
We now consider the estimation of Given the first sample, we obtain
where Also, given the first sample,
Hence
Substituting
Using
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 17
we get
which further simplifies to
where is the mean sum of squares of for the first sample. Thus, we obtain
which gives an unbiased estimator of Next, since we have
and from this result we obtain
Thus
when gives an unbiased estimator of
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur
Page 18
Using (1) and (2) an unbiased estimator of the variance of is obtained as
Thus, the theorem is proved.
Sampling Theory| Chapter 8 | Double Sampling (Two Phase Sampling) | Shalabh, IIT Kanpur