Sampling methods
Dr. J. ANAIAPPAN M.D., D.C.H
Senior Assistant Professor
Department of Community Medicine, Kilpauk Medical College
- Introduction
- Types of sampling methods
- Probability sampling methods
- Non-probability sampling methods
- Choice of sampling methods
- Definition of sampling
- What is the need for sampling?
- What defines a proper sample?
Definition : Sampling is a process by which some persons, objects, elements or events are selected from a predetermined population for carrying out studies and drawing inferences about the population as a whole.
Sampling is a process of selecting a required number of individuals from the study population so as to make observations on the sample instead of the whole population.
Principle of sampling : To get maximum information about the population with minimum effort and with limited resources.
Objectives of sampling :
- Estimation of population parameters (proportion or mean) from the sample statistics
- To test the hypothesis about the population from which the samples are drawn
Need for sampling :
- Studying the entire population is difficult
- It will be costly, time consuming and not feasible
- Studying the whole population is impossible and unnecessary
If sampling is done properly :
- Accurate and reliable estimates can be made
- More characteristics or details can be collected
- Project management is easy
- Can get best possible results in least possible time

Sampling is inevitable when :
- Population is infinite
- Results are required in a short time
- Area is wide
- Resources are limited

What determines a proper sample?
- Representativeness
- Unbiased selection
- Adequacy of the sample
Representativeness :
- Sample has all the important characteristics and a similar distribution
- Requires knowledge of variables and their distribution in the population
- Statistical sampling methods give a reasonable guarantee of representativeness
Bias occurs when :
- There is a wide difference between the sample estimate and the true population value
- Some members of the population are underrepresented or overrepresented relative to others
- Own bias or prejudice
- Laziness and sloppiness

Reasons for a biased sample :
- Faulty selection of sample
- Substitution
- Faulty demarcation of sampling units
- Non-response
Good sampling results in :
- Reduction of cost
- Saving of time
- Reduction in manpower requirement
- More accurate results than attempts to study the entire population
Population (universe) :
- The group of individuals or units possessing certain predetermined characteristics intended for the study
- Population is an aggregate of elements, i.e. persons, objects, households or specified events
Representative sample : It has all the characteristics, with a similar distribution, as the population from which it is drawn.
Sampling frame : It is the list of all elements (persons, households, objects, specified events or units) in the population, e.g. a voters' list.
Sampling unit :
- It is the constituent element of a population which is to be sampled from the population and cannot be further subdivided for the purpose of sampling at a time
- It is the unit of selection in the sampling process, e.g. a person, a patient, a household, a village, a town, a hospital or a district
Sampling fraction : The proportion of the population that is included in the sample, e.g. 20%.
Sample : A finite subset of a population; a portion chosen from a defined population.
Sample size : The number of units in a sample.
Sampling error is any type of bias that is attributable to mistakes in either drawing a sample or determining the sample size
Basics of Sampling Theory
Population
Element
Defined target population
Sampling unit
Sampling frame
Types of sampling :
Probability sampling or Random sampling
Non-Probability sampling or Non-Random sampling
Probability sampling :
- It uses some form of random selection
- All units in the study population have a known, and in simple random sampling an equal, chance of being chosen for the study
- Best among all the methods
- Most powerful statistical analysis on the results can be done subsequently
Random sampling methods are :
- Simple random sampling (unrestricted)
- Systematic random sampling (quasi-random)
- Stratified random sampling
- Cluster sampling (area sampling)
- Multistage sampling
- Multiphase sampling
Non-probability sampling :
- It differs from random sampling in that the selection of sample units does not ensure a known chance of selection for each unit
- May lead to unrepresentative samples
- It lacks accuracy in view of selection bias
- Does not involve random selection
- Subject to prejudice and bias of researcher
- May not represent the population well
- Used when there is no sampling frame for the population
- Mostly used in qualitative research like exploratory research, opinion surveys and marketing studies
Methods :
- Purposive sampling (judgemental sampling)
- Convenience sampling (opportunity sampling)
- Quota sampling
- Expert opinion sampling
- Snowball sampling (chain sampling, chain referral sampling or referral sampling)
Important and frequently used methods :
- Simple random sampling
- Systematic random sampling
- Stratified random sampling
- Cluster sampling
- Multi-stage sampling
Simple random sampling :
- Define the study population (N)
- Prepare a proper sampling frame
- Determine the sample size (n)
- Select the required number of samples

Selection of required number of samples by :
- Lottery method, for a small population
- Random number method, using standard tables (Tippett's table, Fisher and Yates' table, and Kendall and Smith's table)
- Computer generated random numbers
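
The computer generated option can be sketched in Python. This is a minimal illustration and not part of the original slides: the frame of 500 numbered units, the sample size of 20 and the seed are assumed values, and the standard library's `random.sample` stands in for the random number tables.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement.

    Every unit in the frame has the same chance of being selected.
    """
    rng = random.Random(seed)  # seeded only to make the draw repeatable
    return rng.sample(frame, n)

# Assumed sampling frame: units numbered 1..500 (N = 500).
frame = list(range(1, 501))
sample = simple_random_sample(frame, n=20, seed=42)
```

Fixing the seed is a convenience for checking the draw again later; in field use the seed would normally be left unset or recorded in the study protocol.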
Advantages :
- Personal bias is eliminated
- Representative of a homogeneous population
- No need for thorough knowledge of the units of population
- Accuracy of the sample can be tested
- Used in other methods of sampling

Disadvantages :
- Impractical for a large population
- Not suitable when there is a large difference between units
- Units of the sample may lie far apart geographically
- Cost and time of collection of data are more
- Logistically more difficult in field conditions
Systematic random sampling :
- Simple and convenient way of selecting a sample
- Requires less time and cost
- Sample is spread evenly over the entire reference population
- Can be used in infinite population
- This method requires a sampling frame
- Units are selected at a uniform interval
- Useful when information is collected from units which are in serial order, i.e. entries in a register, houses in blocks etc.
Method :
- Identify the sample size (n)
- Put the population in sequential order and number the units serially; this forms the sampling frame
- Identify the total number of units in the population (N)
- Divide N by n to get the sampling interval (k = N/n)
- Identify a random number which is less than or equal to k
- Select every k'th item, starting with the randomly chosen one
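
The method above can be sketched in Python. The frame of 1000 serially numbered units, the sample size of 50 and the seed are assumed values for illustration, and the interval is rounded down to a whole number when N is not an exact multiple of n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units at a uniform interval k from a serially ordered frame."""
    N = len(frame)
    k = N // n                     # sampling interval, rounded down
    rng = random.Random(seed)
    start = rng.randint(1, k)      # random start with 1 <= start <= k
    # Serial numbers of the selected units: start, start + k, start + 2k, ...
    positions = [start + i * k for i in range(n)]
    return k, start, [frame[p - 1] for p in positions]

# Assumed frame: register entries numbered 1..1000.
frame = list(range(1, 1001))
k, start, sample = systematic_sample(frame, n=50, seed=7)
```

Because the start never exceeds k, the last position is at most n × k, which cannot run past the end of the frame.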
Stratified random sampling :
- The population is divided into subgroups or strata (stratification)
- Units within a stratum are homogeneous, and units in different strata are heterogeneous
- From each stratum a simple random sample is selected, and these are combined to form the required sample from the population
Two types :
- Unequal size : proportional stratified random sampling
- Equal size : disproportionate stratified random sampling

Sample size in each stratum is :
- Unequal size : proportionate to the number of units in each stratum
- Equal size : disproportionate to the number of units in each stratum
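
The two allocation rules can be compared with a small Python sketch. The three strata, their sizes and the total sample size of 90 are hypothetical values chosen so that the arithmetic comes out in whole numbers; with other figures the rounded proportional allocation may not add up exactly to n.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Sample size per stratum in proportion to the stratum size."""
    total = sum(stratum_sizes.values())
    return {s: round(n * size / total) for s, size in stratum_sizes.items()}

def equal_allocation(stratum_sizes, n):
    """Same sample size in every stratum, whatever the stratum size."""
    per_stratum = n // len(stratum_sizes)
    return {s: per_stratum for s in stratum_sizes}

def stratified_sample(strata, allocation, seed=None):
    """Simple random sample within each stratum, kept grouped by stratum."""
    rng = random.Random(seed)
    return {s: rng.sample(units, allocation[s]) for s, units in strata.items()}

# Hypothetical strata; the unit labels are made up for illustration.
strata = {
    "urban": [f"urban-{i}" for i in range(600)],
    "rural": [f"rural-{i}" for i in range(300)],
    "tribal": [f"tribal-{i}" for i in range(100)],
}
sizes = {s: len(units) for s, units in strata.items()}

proportional = proportional_allocation(sizes, n=90)  # unequal sample sizes
equal = equal_allocation(sizes, n=90)                # equal sample sizes
sample = stratified_sample(strata, proportional, seed=1)
```

Here the proportional rule gives 54, 27 and 9 units, while the equal rule gives 30 units per stratum, so the smallest stratum is sampled far more heavily under equal allocation.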
Advantages :
- Every unit in the stratum has the same chance of being selected
- More representative
- Ensures proportionate representation
- Greater accuracy
- Greater geographical concentration

Limitations :
- Division of population into strata needs more money, time and statistical experience
- Improper stratification leads to bias if there is overlapping of strata
Cluster sampling :
- The whole population is divided into groups called clusters
- Each cluster is representative of the population
- Clusters are selected randomly
- A random sample is then taken from within each cluster
- A large number of clusters are sampled so that the results can be generalized to the whole population
- Clusters should be as small as possible, consistent with the time and cost limitations
- Number of units in each cluster must be more or less equal
- It is a simple random sample of clusters of elements
Examples :
- WHO 30 clusters for coverage evaluation survey
- Pulse polio immunization coverage evaluation survey
E.g. : In a PHC, estimate the proportion of infants aged 6 months to 1 year who are fully immunized.
1) Identification of total population and the geographical area
2) Identification of age group to be included
3) Listing of all villages
4) Tabulation
No.  Village     Population  Cumulative population  Clusters
1    Adgaon             947                    947  1
2    Asgaon            1208                   2155  2
3    Borphal            712                   2867  3
4    Bilaspur          3012                   5879  4,5,6
5    Chitegaon          631                   6510  7
6    Dhoregaon         1709                   8219  8
7    Esapur             413                   8632  9
8    Girnar            1203                   9835  10
9    Goregaon          5153                  14988  11,12,13,14,15
10   Himmatpur         3128                  18116  16,17,18
11   Lalwadi           3689                  21805  19,20,21
12   Puri              1529                  23334  22
13   Solegaon          2604                  25938  23,24,25
14   Tisgaon           3210                  29148  26,27,28
15   Yeoti             2057                  31205  29,30
TOTAL                 31205
5) Sampling interval (S.I.) :
   S.I. = Total cumulative population / Number of clusters = 31205 / 30 = 1040
6) Selection of a starting point (a random number not greater than the S.I.; here 0196)
7) Selecting subsequent clusters :
   C2 = random number + S.I. = 0196 + 1040 = 1236
   C30 = C29 + S.I.
8) Selecting first household in a cluster
9) Collection of information
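
Steps 5 to 7 can be reproduced with a short Python sketch. It assumes the village populations tabulated above, the random start of 196 used in the worked example, and a sampling interval rounded down to a whole number; the function name `select_clusters` is hypothetical. Each selection point is assigned to the first village whose cumulative population reaches it.

```python
from bisect import bisect_left
from collections import Counter
from itertools import accumulate

# Village populations as listed in the tabulation step.
VILLAGES = [
    ("Adgaon", 947), ("Asgaon", 1208), ("Borphal", 712), ("Bilaspur", 3012),
    ("Chitegaon", 631), ("Dhoregaon", 1709), ("Esapur", 413), ("Girnar", 1203),
    ("Goregaon", 5153), ("Himmatpur", 3128), ("Lalwadi", 3689), ("Puri", 1529),
    ("Solegaon", 2604), ("Tisgaon", 3210), ("Yeoti", 2057),
]

def select_clusters(villages, n_clusters, random_start):
    """Systematic selection of clusters with probability proportional to size."""
    names = [name for name, _ in villages]
    cumulative = list(accumulate(pop for _, pop in villages))
    interval = cumulative[-1] // n_clusters      # 31205 // 30 = 1040
    if not 1 <= random_start <= interval:
        raise ValueError("random start must lie between 1 and the interval")
    # Selection points: start, start + S.I., start + 2 * S.I., ...
    points = [random_start + i * interval for i in range(n_clusters)]
    # bisect_left returns the first village whose cumulative total >= point.
    chosen = [names[bisect_left(cumulative, p)] for p in points]
    return interval, points, chosen

interval, points, chosen = select_clusters(VILLAGES, n_clusters=30, random_start=196)
clusters_per_village = Counter(chosen)
```

With these inputs the second selection point is 1236, matching the worked example. A strict application of the rule places the 23rd point (23076) within Puri, whose cumulative range is 21806 to 23334, so this sketch gives Puri two clusters and Solegaon two, which differs slightly from the allocation tabulated above.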
Advantages :
- Cuts down the cost of preparing the sampling frame and the cost of travelling between selected units
- Eliminates the problem of "packing"

Disadvantages :
- Sampling error is usually higher than for a simple random sample of the same size
Multistage sampling :
- Used for large and diverse populations, e.g. a nation, region or state
- Usually carried out in phases
- Involves more than one sampling method

Example : Estimating the problem of iodine deficiency disorders in India
- First stage : a few states are randomly selected
- Second stage : a few districts from the above states
- Third stage : a few blocks from the above districts
- Fourth stage : a few villages from the above blocks
- Fifth stage : a few households from each village
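
The staged selection can be sketched in Python. To keep it short, this illustration uses only three stages (state, district, household) instead of the five in the example, and the hierarchy, labels, numbers selected per stage and seed are all hypothetical.

```python
import random

def multistage_sample(tree, units_per_stage, rng):
    """Randomly select units stage by stage from a nested hierarchy.

    tree: nested dicts whose innermost values are lists of final units.
    units_per_stage: how many units to pick at each stage, outermost first.
    Returns one tuple per final unit, giving the path taken to reach it.
    """
    take = units_per_stage[0]
    if isinstance(tree, dict):
        chosen = rng.sample(sorted(tree), take)   # pick units at this stage
        paths = []
        for key in chosen:
            for path in multistage_sample(tree[key], units_per_stage[1:], rng):
                paths.append((key,) + path)
        return paths
    # Last stage: tree is a plain list of final sampling units.
    return [(unit,) for unit in rng.sample(tree, take)]

# Hypothetical hierarchy: 4 states x 3 districts x 5 households.
hierarchy = {
    f"State-{s}": {
        f"District-{s}.{d}": [f"Household-{s}.{d}.{h}" for h in range(1, 6)]
        for d in range(1, 4)
    }
    for s in range(1, 5)
}

selected = multistage_sample(hierarchy, [2, 2, 3], random.Random(3))
```

Only the units chosen at one stage need to be listed for the next stage, which is why a sampling frame of all individual units is not required.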
Advantages :
- Sampling frame for individual units is not required
- Cuts down the cost of preparing the sampling frame

Disadvantages :
- Final sample may not be representative of the total population
- Sampling error is increased when compared with simple random sampling
Non-probability Sampling
- Does not involve random selection
- Subject to prejudice and bias of investigator
- May or may not represent the population well
- Used when there is no sampling frame
- Used in qualitative research
- If the investigator is experienced, it may yield valuable results

Types :
- Convenience sampling
- Judgemental / purposive sampling
- Quota sampling
- Snowball sampling
Convenience sampling :
- Also called accidental, opportunity, accessibility or haphazard sampling
- Uses readily available persons for the study (a sample of convenience)
- Examples : stopping people at a street corner, or people selecting themselves in response to public notices; the risk of bias is greater
- Lack of representativeness
- Used for pilot studies
Judgmental sampling :
- Researcher's knowledge about the population can be used to hand-pick sample members who are knowledgeable about the study
- Used when newly developed instruments are to be pretested and evaluated
Quota sampling :
- Researcher utilizes knowledge about the population to build representativeness into the sampling plan
- Population is divided into quotas by age, socioeconomic status, religion etc.
- Number of units within each quota is decided by the personal judgment of the investigator
- Used by quantitative researchers
- Used in public opinion studies
Snowball sampling :
- Also called network, chain or referral sampling
- Used when the research population has specific traits and is difficult to identify
- Early sample members are asked to refer other people who meet the eligibility criteria
- Used for sampling hidden populations such as homeless people or IV drug users
- Respondent driven sampling (RDS) is a variant of snowball sampling
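
The referral chain can be sketched in Python as a walk through a referral network. The network, the participant labels and the target sample size are hypothetical; in practice referrals are only known as each participant is interviewed, and no random selection is involved.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Recruit seeds first, then the people they refer, wave by wave."""
    sample = []
    queue = deque(seeds)          # people waiting to be recruited
    seen = set(seeds)             # avoids recruiting anyone twice
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network: who each participant refers.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
}
sample = snowball_sample(referrals, seeds=["A"], max_size=5)
```

The composition of the sample depends entirely on the initial seeds and on whom they choose to refer, which is the main source of bias in this method.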
METHOD                       BEST WHEN
Simple random sampling       whole population is available
Stratified random sampling   specific subgroups are to be investigated
Systematic random sampling   a stream of representative people is available
Cluster sampling             population groups are separated and access to all is difficult
THANK YOU