Introduction
1 Introduction
2 Example
3 Probability sampling
4 Survey Procedures
5 Survey Errors
Kim Ch. 1: Probability Sampling Fall, 2014 2 / 27
Introduction
What is sampling?
Population
  Finite Population
  Infinite Population
Sample: Subset of a population
Sampling: Making statistical inference about the (finite) population without measuring the whole population
Sampling error: The error that results from taking a sample instead of examining the whole population
Why sampling?
1 To reduce cost
2 To save time
3 Sometimes, to get more accurate information about the population.
4 Sometimes, it is the only way of getting information about the target population.
How to do the sampling?
Two types of sampling
  Probability sampling
  Non-probability sampling
Roughly speaking, in probability sampling the sample is selected according to an assigned probability rule.
Example
Let’s look at an artificial finite population (of size N = 4).
ID   Size of farm (acres)   Yield (y)
1     4                      1
2     6                      3
3     6                      5
4    20                     15
Parameter of interest: Mean yield of the farms in the population
θ = (y1 + y2 + y3 + y4)/4
The value of y is obtained only from the sampled units.
Instead of observing N = 4 farms, we want to select a sample of size n = 2.
6 possible samples
case   sample ID   sample mean   sampling error
1      1, 2         2
2      1, 3         3
3      1, 4         8
4      2, 3         4
5      2, 4         9
6      3, 4        10
Each sample has a sampling error.
Two ways of selecting one of the six possible samples.
  Nonprobability sampling: select a sample subjectively (e.g., using the size of the farms).
  Probability sampling: select a sample by a probability rule.
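The enumeration of the six possible samples can be reproduced with a short script. This is a minimal sketch, assuming the yields y = (1, 3, 5, 15) from the farm table; the names `y`, `theta`, and `rows` are illustrative and not part of the lecture notes.

```python
from fractions import Fraction
from itertools import combinations

# Yields y_i for the four farms in the example population (IDs 1..4).
y = {1: 1, 2: 3, 3: 5, 4: 15}

# Parameter of interest: the population mean yield.
theta = Fraction(sum(y.values()), len(y))

# Every possible sample of size n = 2, with its sample mean and
# sampling error (sample mean minus the population mean).
rows = []
for sample in combinations(sorted(y), 2):
    mean = Fraction(sum(y[i] for i in sample), len(sample))
    rows.append((sample, mean, mean - theta))

for sample, mean, error in rows:
    print(sample, mean, error)
```

Exact fractions are used so that the means and errors are not affected by floating-point rounding.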
Probability sampling
Simple Random Sampling: Assign the same selection probability to all possible samples
case   sample ID   sample mean (ȳ)   sampling error   selection probability
1      1, 2         2                -4               1/6
2      1, 3         3                -3               1/6
3      1, 4         8                 2               1/6
4      2, 3         4                -2               1/6
5      2, 4         9                 3               1/6
6      3, 4        10                 4               1/6
In this case, the sample mean (ȳ) has a discrete probability distribution.
Probability mass function of ȳ:
\[
P_{\bar{y}}(y) =
\begin{cases}
1/6 & \text{if } y \in \{2, 3, 4, 8, 9, 10\} \\
0 & \text{otherwise.}
\end{cases}
\]
Unbiased
Variance
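The two properties named above can be checked numerically. The sketch below assumes the farm yields y = (1, 3, 5, 15) and simple random sampling of n = 2 from N = 4; it builds the probability mass function of the sample mean and computes its expectation, bias, and variance. All variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

y = {1: 1, 2: 3, 3: 5, 4: 15}              # farm yields from the example
theta = Fraction(sum(y.values()), len(y))  # population mean

samples = list(combinations(sorted(y), 2))
p = Fraction(1, len(samples))              # SRS: every sample has probability 1/6

# Probability mass function of the sample mean.
pmf = Counter()
for s in samples:
    ybar = Fraction(sum(y[i] for i in s), len(s))
    pmf[ybar] += p

expectation = sum(value * prob for value, prob in pmf.items())
bias = expectation - theta
variance = sum(prob * (value - expectation) ** 2 for value, prob in pmf.items())

print("E(ybar) =", expectation)   # equals theta, so the sample mean is unbiased
print("bias    =", bias)
print("Var     =", variance)
```

Under these assumptions the expectation equals the population mean 6, so the bias is 0, and the variance is 29/3.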
Remark
No model assumption about yi in the example: a totally different framework!
Design-based approach: the reference distribution is the sampling distribution generated by the repeated application of the given sampling mechanism.
Probability sampling
Definition & Notation
U = {1, 2, ..., N}: index set of the finite population
A: subset of U, index set of the sample.
𝒜: set of samples under consideration, sample support.
θ̂ = θ̂(yi; i ∈ A): statistic
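1 Probability distribution of samples, or sample distribution: probability mass function P(·) defined on the sample support 𝒜. That is, P(·) satisfies
  1 P(A) ∈ [0, 1], ∀A ∈ 𝒜
  2 \[ \sum_{A \in \mathcal{A}} P(A) = 1. \]
2 (Induced) probability distribution of a statistic θ̂
  1. Expectation:
     \[ E(\hat{\theta}) = \sum_{A \in \mathcal{A}} P(A)\, \hat{\theta}(A) \]
  2. Variance:
     \[ \mathrm{Var}(\hat{\theta}) = \sum_{A \in \mathcal{A}} P(A) \left[ \hat{\theta}(A) - E(\hat{\theta}) \right]^2 \]
  3. Mean squared error:
     \[ \mathrm{MSE}(\hat{\theta}) = \sum_{A \in \mathcal{A}} P(A) \left[ \hat{\theta}(A) - \theta \right]^2 = \mathrm{Var}(\hat{\theta}) + \left[ E(\hat{\theta}) - \theta \right]^2 \]
     MSE = Variance + (Bias)^2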
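The three design-based moments translate directly into sums over the sample support. The sketch below is illustrative only: the helper names (`expectation`, `variance`, `mse`) and the three-sample design with unequal probabilities are assumptions made up for this example, chosen so that the sample mean is biased and the identity MSE = Variance + (Bias)^2 is not trivial.

```python
from fractions import Fraction

def expectation(design, stat):
    """E(stat) = sum over samples A of P(A) * stat(A)."""
    return sum(p * stat(A) for A, p in design.items())

def variance(design, stat):
    """Var(stat) = sum over samples A of P(A) * (stat(A) - E(stat))^2."""
    mu = expectation(design, stat)
    return sum(p * (stat(A) - mu) ** 2 for A, p in design.items())

def mse(design, stat, theta):
    """MSE(stat) = sum over samples A of P(A) * (stat(A) - theta)^2."""
    return sum(p * (stat(A) - theta) ** 2 for A, p in design.items())

# Farm yields from the running example; theta is the population mean.
y = {1: 1, 2: 3, 3: 5, 4: 15}
theta = Fraction(sum(y.values()), len(y))

# Hypothetical design: a probability mass function on three samples.
design = {
    frozenset({1, 2}): Fraction(1, 2),
    frozenset({2, 3}): Fraction(1, 4),
    frozenset({3, 4}): Fraction(1, 4),
}

def sample_mean(A):
    return Fraction(sum(y[i] for i in A), len(A))

bias = expectation(design, sample_mean) - theta
var_value = variance(design, sample_mean)
mse_value = mse(design, sample_mean, theta)
print(bias, var_value, mse_value)
```

For this hypothetical design the bias is -3/2, the variance is 43/4, and the MSE is 13, which agrees with 43/4 + 9/4.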
Probability Sampling
Definition: For each element in the population, the probability that the element is included in the sample is known and greater than 0.
Advantages
  1 Exclude subjectivity in selecting samples.
  2 Remove sampling bias (or selection bias).
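What is sampling bias? (θ: true value, θ̂: estimated value of θ)
\[
\text{(sampling) error of } \hat{\theta} = \hat{\theta} - \theta
= \{ \hat{\theta} - E(\hat{\theta}) \} + \{ E(\hat{\theta}) - \theta \}
= \text{variation} + \text{bias}
\]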
Roughly, in nonprobability sampling the variation is 0 but there is a bias; in probability sampling there is variation, but the bias can be made 0 by using a design-unbiased estimator.
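The contrast between the two kinds of sampling can be made concrete with the farm population. In this sketch, the purposive choice of farms 2 and 3 is a hypothetical subjective selection (it is not taken from the lecture); it is modelled as a design that puts probability 1 on a single sample, and compared with simple random sampling of size 2.

```python
from fractions import Fraction
from itertools import combinations

y = {1: 1, 2: 3, 3: 5, 4: 15}              # farm yields from the example
theta = Fraction(sum(y.values()), len(y))  # population mean

def mean(A):
    return Fraction(sum(y[i] for i in A), len(A))

def bias_and_variance(design):
    """Return (bias, variance) of the sample mean under the given design."""
    e = sum(p * mean(A) for A, p in design.items())
    var = sum(p * (mean(A) - e) ** 2 for A, p in design.items())
    return e - theta, var

# Hypothetical subjective choice: always pick farms 2 and 3.
purposive = {(2, 3): Fraction(1)}
# Simple random sampling: all six samples equally likely.
srs = {A: Fraction(1, 6) for A in combinations(sorted(y), 2)}

purposive_bias, purposive_var = bias_and_variance(purposive)
srs_bias, srs_var = bias_and_variance(srs)
print("purposive:", purposive_bias, purposive_var)
print("SRS:      ", srs_bias, srs_var)
```

The fixed sample has no variation but a bias of -2, while the sample mean under simple random sampling has bias 0 and variance 29/3.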
Main theory
  1 Law of Large Numbers: θ̂ converges to E(θ̂) as the sample size grows.
  2 Central Limit Theorem: θ̂ approximately follows a normal distribution for a sufficiently large sample size.
Additional advantages of probability sampling with a large sample:
  1 Improves the precision of an estimator.
  2 Makes it possible to compute confidence intervals and test statistical hypotheses.
With the same sample size, we may have different precision.
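How precision improves with sample size can be seen exactly, without simulation, by enumerating every simple random sample from a small population. The eight values below are hypothetical, made up for illustration; the closed-form expression (1 - n/N) S^2 / n used for comparison is the standard variance of the sample mean under simple random sampling without replacement, which the slides have not introduced at this point.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite population of N = 8 values (not from the lecture).
population = [1, 3, 5, 15, 2, 8, 4, 10]
N = len(population)
theta = Fraction(sum(population), N)

def srs_variance(n):
    """Exact design variance of the sample mean over all SRS samples of size n."""
    means = [
        Fraction(sum(population[i] for i in s), n)
        for s in combinations(range(N), n)
    ]
    e = sum(means) / len(means)
    return sum((m - e) ** 2 for m in means) / len(means)

variances = {n: srs_variance(n) for n in (2, 4, 6, 8)}

# Population variance S^2 with divisor N - 1, for the closed-form check.
S2 = sum((v - theta) ** 2 for v in population) / (N - 1)
formula = {n: (1 - Fraction(n, N)) * S2 / n for n in variances}

for n in variances:
    print(n, variances[n], formula[n])
```

The variance shrinks as n grows and reaches 0 when n = N, where the sample is the whole population.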
Example - Continued : Probability sampling 2
Unequal probability sampling
Sample ID   y value   Mean estimator   Selection probability
1, 4        1, 15     4.5              1/3
2, 4        3, 15     6                1/3
3, 4        5, 15     7.5              1/3
What is the probability distribution of the mean estimator?
What is the expected value of the sampling error?
\[
E(\bar{y}) = \frac{1}{3}(4.5 + 6 + 7.5) = 6.0
\]
Compute the variance. Compare it with that of SRS.
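The variance comparison can be worked out as follows. The slide does not say how the mean estimator is computed; the sketch assumes an inverse-inclusion-probability weighting, (1/N) Σ y_i / π_i over the sampled units (a Horvitz-Thompson-type estimator), because that formula reproduces the tabulated values 4.5, 6, and 7.5. Treat that identification as an assumption rather than as the author's stated method.

```python
from fractions import Fraction
from itertools import combinations

y = {1: 1, 2: 3, 3: 5, 4: 15}              # farm yields from the example
N = len(y)
theta = Fraction(sum(y.values()), N)       # population mean

# Unequal probability design from the table: farm 4 is always selected.
design = {(1, 4): Fraction(1, 3), (2, 4): Fraction(1, 3), (3, 4): Fraction(1, 3)}

# First-order inclusion probability of each unit: sum of P(A) over samples containing it.
pi = {i: sum(p for A, p in design.items() if i in A) for i in y}

def weighted_mean(A):
    """Assumed estimator: (1/N) * sum of y_i / pi_i over the sampled units."""
    return sum(Fraction(y[i]) / pi[i] for i in A) / N

estimates = {A: weighted_mean(A) for A in design}
expectation = sum(p * estimates[A] for A, p in design.items())
variance = sum(p * (estimates[A] - expectation) ** 2 for A, p in design.items())

# Variance of the plain sample mean under SRS of size 2, for comparison.
srs_means = [Fraction(sum(y[i] for i in A), 2) for A in combinations(sorted(y), 2)]
srs_variance = sum((m - theta) ** 2 for m in srs_means) / len(srs_means)

print(estimates)
print("E =", expectation, " Var =", variance, " SRS Var =", srs_variance)
```

Under these assumptions the estimator is unbiased with variance 3/2, compared with 29/3 for the sample mean under simple random sampling, which illustrates that designs with the same sample size can differ in precision.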
Survey Procedures
Outline of a survey
1 Define the target population: the population to which the conclusions apply.
2 Determine the population characteristics of interest (e.g., acres of corn, daily intake of calcium).
3 Find a sampling frame: a device that associates population elements with sampling units (e.g., a phone book).
4 Obtain a sample by a probability sampling design.
5 Measure the study variables.
6 Use the measured values to compute point estimates and standard errors.
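Step 6 might look like the following for the simplest design. This is a sketch under the assumption of simple random sampling without replacement from a population of known size N; the function name `srs_estimate` is hypothetical, and the standard error uses the usual finite population correction (1 - n/N), which later chapters would normally derive.

```python
import math
import statistics

def srs_estimate(sample_values, N):
    """Point estimate of the population mean and its estimated standard error,
    assuming simple random sampling without replacement from N units."""
    n = len(sample_values)
    ybar = statistics.mean(sample_values)    # point estimate
    s2 = statistics.variance(sample_values)  # sample variance, divisor n - 1
    se = math.sqrt((1 - n / N) * s2 / n)     # finite population correction applied
    return ybar, se

# Usage: farms 1 and 2 from the running example (y = 1 and 3), with N = 4.
ybar, se = srs_estimate([1, 3], N=4)
print(ybar, se)
```

With only two observations the standard error is itself very imprecise; the example is meant to show the arithmetic, not a realistic sample size.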
Basic procedures for survey sampling
1 Planning
  1 Statement of objectives
  2 Selection of a sampling frame
2 Design and development
  1 Sample design
  2 Questionnaire design
3 Implementation
  1 Data collection
  2 Data capture and coding
  3 Editing and imputation
  4 Estimation
  5 Data analysis
  6 Data dissemination
4 Evaluation - Documentation
Two aspects of Survey Design
1 Sampling design: how to collect a sample?
  What is your target population?
  What is your sampling frame?
  What information is available from the sampling frame?
  What is the cost of observing each unit in the sample?
2 Questionnaire design: how to obtain measurements from the selected sample?
  What are your research questions?
  What are the variables that you want to measure from the sample?
  How should the questions be asked properly?
  Are there any other auxiliary variables that you want to measure for use in the weighting or editing stage?
Survey Errors
Source of Errors
1 Errors of nonobservation
  Coverage error
    (Target) population ≠ Frame
    Some elements are not listed.
  Sampling error
    Frame ≠ sample
    Some listed elements are not sampled.
  Non-response error
    Sample ≠ respondents
    Some sampled elements don’t respond.
2 Errors of observation
  Measurement error: interviewer, respondent, instrument, mode
  Processing error
Sampling error:
  If n = N, there is no sampling error.
  In general, as n increases, the sampling error decreases.
Nonsampling error: Everything else
  Even if n = N, there is nonsampling error.
  In practice, we can decrease nonsampling error by decreasing n.
Because of nonsampling error, a sample is often more accurate than a census.
Survey Methodology
  Psychology (cognitive science), social science
  More interested in non-sampling errors
  By properly asking questions, we can reduce survey errors.
  Questionnaire design, survey management
Survey Statistics
  Statistics
  More interested in sampling errors
  Want to measure the uncertainty of survey errors and incorporate it into estimation.
  Sampling design, estimation, editing, etc.