hdfs 361—research methods week 2: levels of measurement and sampling
Post on 21-Dec-2015
HDFS 361—Research Methods
Week 2:
Levels of Measurement and Sampling
Types of Studies
Descriptive studies
• These studies describe the results for the participants in the study.
Inferential studies
• These studies seek to generalize beyond the participants to a specified, larger population.
Sample and Population
Population
• A population includes the universe of people or groups about whom we are interested.
Sample
• A sample is a subset of a population.
• If a sample is representative of the population from which it was drawn, we can make an inference from the sample to the population.
Criteria for Levels of Measurement
• Mutually exclusive—each observation is assigned a single value or label.
• Exhaustive—every observation is classified (measured), even if assigned to a category called “other.”
• Ordered—observations are ranked or ordered on how much of the characteristic they have.
• Equal appearing intervals—an equal difference between values corresponds to an equal difference on the characteristic being measured.
• Meaningful zero point—a value of 0 corresponds to the absolute absence of the characteristic being measured.
Level of Measurement and Measurement Criteria: The Traditional Approach
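Level of Measurement | Mutually Exclusive | Exhaustive | Ordered | Equal Intervals | Meaningful Zero | Examples
---------------------+--------------------+------------+---------+-----------------+-----------------+---------------
Ratio                | Yes                | Yes        | Yes     | Yes             | Yes             | Age
Interval             | Yes                | Yes        | Yes     | Yes             | No              | Scales
Ordinal              | Yes                | Yes        | Yes     | No              | No              | Religiosity
Nominal              | Yes                | Yes        | No      | No              | No              | Marital Status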
Nominal Variables—Frequency Distributions
Marital status |      Freq.     Percent        Cum.
---------------+-----------------------------------
       married |      1,269       45.90       45.90
       widowed |        247        8.93       54.83
      divorced |        445       16.09       70.92
     separated |         96        3.47       74.39
 never married |        708       25.61      100.00
---------------+-----------------------------------
         Total |      2,765      100.00
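A frequency distribution like the one above can be built from raw category labels. The following is a minimal Python sketch, not the Stata procedure that produced the slide; the function name `frequency_table` is hypothetical, and the raw data are rebuilt from the counts shown in the table.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (category, freq, percent, cumulative percent)."""
    counts = Counter(values)          # keeps categories in order of first appearance
    total = sum(counts.values())
    rows = []
    running = 0
    for category, freq in counts.items():
        running += freq               # cumulative count avoids compounding rounding error
        rows.append((category, freq, 100 * freq / total, 100 * running / total))
    return rows

# Rebuild one label per respondent from the counts in the marital status table.
counts = {"married": 1269, "widowed": 247, "divorced": 445,
          "separated": 96, "never married": 708}
data = [status for status, n in counts.items() for _ in range(n)]

rows = frequency_table(data)
for category, freq, pct, cum in rows:
    print(f"{category:>14} | {freq:>6,} {pct:>8.2f} {cum:>8.2f}")
```

Because marital status is nominal, the order of the rows carries no meaning; the cumulative column only becomes interpretable when the categories are ordered.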
Ordinal Ranks--Median
• The median is the value of the case in the middle.
• Rank the observations. If two children were tied at the 3rd rank, we would give both of them a rank of 3.5, because the pair occupies both the 3rd and the 4th ranks and the average of 3 and 4 is 3.5. The next child higher on the scale would have a rank of 5, resulting in rankings of 1, 2, 3.5, 3.5, 5, 6, 7, and 8.
• If 3 children were tied for most aggressive (in addition to the 2 tied for third), the rankings would be 1, 2, 3.5, 3.5, 5, 7, 7, 7.
  • The three highest-ranking children occupy the 6th, 7th, and 8th ranks, and the average of those ranks is 7.
• In sporting events, officials try to be nice and give tied contestants the highest rank they can.
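The tie rule described above (each tied case receives the average of the ranks the tied group occupies) can be sketched in Python. The function name `average_ranks` and the aggression scores are hypothetical and chosen only to reproduce the two tie patterns on this slide.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) to highest.

    Tied scores all receive the mean of the rank positions they occupy.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` to cover every case tied with the case at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2   # positions are 1-based ranks
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

# Hypothetical aggression scores for eight children (illustrative values only).
two_tied = [10, 12, 15, 15, 18, 19, 20, 21]
two_and_three_tied = [10, 12, 15, 15, 18, 20, 20, 20]

print(average_ranks(two_tied))            # [1.0, 2.0, 3.5, 3.5, 5.0, 6.0, 7.0, 8.0]
print(average_ranks(two_and_three_tied))  # [1.0, 2.0, 3.5, 3.5, 5.0, 7.0, 7.0, 7.0]
```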
Ordinal categories—Frequency Distribution
     Health |      Freq.     Percent        Cum.
------------+-----------------------------------
  excellent |        568       30.75       30.75
       good |        854       46.24       76.99
       fair |        322       17.43       94.42
       poor |        103        5.58      100.00
------------+-----------------------------------
      Total |      1,847      100.00
Ordinal Categories—Bar Charts
[Figure: bar chart titled "Distribution of Health Status for U.S. Adults," General Social Survey 2002. Vertical axis: Percent (0 to 50). Horizontal axis: condition of health (Excellent, Good, Fair, Poor). Bar labels: 30.75, 46.24, 17.43, 5.577.]
Nominal Level: Bar Charts
[Figure: bar chart titled "Distribution of U.S. Adults on Marital Status," General Social Survey 2002. Vertical axis: Percentage (0 to 50). Horizontal axis: Marital Status (married, widowed, divorced, separated, never married).]
Interval/Ratio Level—Underlying Continuum
[Figure: four people (Jim, Sue, Joe, Chandra) placed along an underlying continuum.]
Interval/Ratio Level
• We can use most statistics and graphs
• Means, standard deviations
• Histograms and other charts
• We will cover these later in the course
Data Collection—Random Sample
• A simple random sample means everybody in the population has the same chance of selection.
• Assumes sampling with replacement, but this is rarely used in practice.
• Need a list of the entire population to do a random sample and this is often hard to obtain.
Using Stata to Select a Random Sample of 1,000 People from a Population of 15,000

Commands:

    set obs 15000
    gen id = _n
    sample 1000, count
    list id in 1/10

Output:

         +-------+
         |    id |
         |-------|
      1. |  5546 |
      2. |  4530 |
      3. |  6419 |
      4. |  5622 |
      5. |  8877 |
         |-------|
      6. |  3867 |
      7. | 10748 |
      8. |  6179 |
      9. | 11602 |
     10. |   361 |
         |-------|
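For readers without Stata, roughly the same selection can be sketched in Python with the standard library. This is an illustration rather than a translation of Stata's algorithm: the seed value is arbitrary, and the ids drawn will differ from the Stata listing.

```python
import random

# Population list: ids 1 through 15,000 (the role played by `gen id = _n`).
population_ids = list(range(1, 15001))

# Arbitrary seed, used only so the sketch is reproducible.
rng = random.Random(361)

# Draw 1,000 ids without replacement, so nobody can be selected twice.
sample_ids = rng.sample(population_ids, 1000)

print(sample_ids[:10])   # first ten sampled ids
```

Note that `random.sample` draws without replacement, which is how samples are usually drawn in practice.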
Sample size and Sampling Error
Sample N Sampling Error
20 21.91%
50 13.86%
100 9.80%
200 6.93%
500 4.38%
1,000 3.10%
1,600 2.45%
10,000 0.98%
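The slide does not say how these sampling errors were computed. The figures appear to match the 95 percent margin of error for a proportion of .5, that is, 1.96 times the square root of (.5)(.5)/n. The Python sketch below works under that assumption; the function name `sampling_error` is hypothetical.

```python
import math

def sampling_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion p with sample size n.

    Assumes a simple random sample and a 95 percent confidence level (z = 1.96).
    p = 0.5 is the most conservative choice because p * (1 - p) is largest there.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (20, 50, 100, 200, 500, 1000, 1600, 10000):
    print(f"{n:>6,}  {100 * sampling_error(n):5.2f}%")
```

The pattern in the table follows from the square root: cutting the sampling error in half requires a sample four times as large.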
Graphic of 15 Confidence Intervals, n = 500, True Proportion in Population = .48
[Figure: 15 confidence intervals plotted along an axis running from .33 to .59. The true proportion is .48. One interval is labeled "This is the only one that misses."]
Estimating Confidence Interval for Proportion
\[
p \pm 1.96\sqrt{\frac{pq}{n}}
\]
or
\[
p \pm 1.96\sqrt{\frac{pq}{n}\left(\frac{N-n}{N}\right)}
\]
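As a worked illustration, the Python sketch below computes a 95 percent confidence interval for a proportion, with and without a correction for sampling from a finite population. It assumes q = 1 - p, that n is the sample size and N the population size, and that the correction factor (N - n)/N sits under the square root; the transcript is too garbled to confirm that placement. The function name `proportion_ci` is hypothetical.

```python
import math

def proportion_ci(p, n, N=None, z=1.96):
    """Return (lower, upper) bounds of a confidence interval for a proportion.

    p: sample proportion; n: sample size; N: population size (optional).
    When N is given, the variance is multiplied by (N - n) / N. Placing this
    factor under the square root is an assumption of this sketch.
    """
    q = 1 - p
    variance = p * q / n
    if N is not None:
        variance *= (N - n) / N
    half_width = z * math.sqrt(variance)
    return p - half_width, p + half_width

# Proportion of .48 with n = 500, as in the confidence interval graphic.
low, high = proportion_ci(0.48, 500)
print(f"no correction:   {low:.3f} to {high:.3f}")

# Same sample, drawn from a population of 15,000.
low_fpc, high_fpc = proportion_ci(0.48, 500, N=15000)
print(f"with correction: {low_fpc:.3f} to {high_fpc:.3f}")
```

The correction matters only when the sample is a sizable share of the population; with 500 cases out of 15,000 the interval barely narrows.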
Stratified Sample
• By dividing the population into two or more strata, each of which is homogeneous, we can conduct a random sample of each stratum and then pool the results.
• This is more powerful than a simple random sample to the extent the strata are homogeneous.
• Rather than taking a random sample of the entire population, a stratified sample could be used to take a random sample of each stratum.
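A stratified sample can be sketched in Python as follows. The sketch assumes proportional allocation (the same sampling fraction in every stratum), which the slides do not specify; the function name `stratified_sample` and the population of 600 women and 400 men are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Take a simple random sample of `fraction` of each stratum, then pool.

    stratum_of: function that returns the stratum label for one member.
    """
    rng = random.Random(seed)          # arbitrary seed for reproducibility
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)

    pooled = []
    for members in strata.values():
        k = round(len(members) * fraction)      # sample size for this stratum
        pooled.extend(rng.sample(members, k))   # without replacement
    return pooled

# Hypothetical population: 600 women ("W") and 400 men ("M").
population = [("W", i) for i in range(600)] + [("M", i) for i in range(400)]

sample = stratified_sample(population, stratum_of=lambda m: m[0], fraction=0.10)
women = sum(1 for m in sample if m[0] == "W")
men = sum(1 for m in sample if m[0] == "M")
print(len(sample), women, men)   # 100 60 40
```

Unlike a simple random sample of 100, which could by chance contain 52 or 67 women, this design fixes the share of each stratum in advance.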
Stratified Samples
[Figure: two normal distributions, one for men and one for women, plotted on a common x-axis.]
Cluster Sample
• Cluster sampling is sometimes confused with stratified sampling, but it has a different purpose. If our population is geographically dispersed, we can often save a great deal of time and money by dividing the population into geographical clusters and randomly sampling the clusters.
• Census data can be used on any city in the U.S. to list every city block (usually commercial blocks are excluded). We could then take a sample of blocks (sampling units) and interview all or some of the households in each block we included in our sample of blocks.
Cluster Sample
• A person interested in morale of elementary school teachers in a large school district could obtain a list of elementary schools (sampling units) and sample 10 percent of the schools.
• If your clusters are blocks, you can send an interviewer to a selected block. Once there the interviewer can go to the first house. If nobody is home, the interviewer can go to the next selected house, and so on.
• We could sample HDFS students by randomly selecting 20 sections from the class schedule and then giving the instrument to everybody in the selected sections.
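The class-section example can be sketched in Python. The sampling units are sections, and everybody in a selected section receives the instrument. The schedule of 80 sections with 25 students each is hypothetical, as is the function name `cluster_sample`.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly select whole clusters and keep every member of each one.

    clusters: dict mapping a cluster name to the list of its members.
    """
    rng = random.Random(seed)                          # arbitrary seed
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names only
    return {name: clusters[name] for name in chosen}

# Hypothetical class schedule: 80 sections, 25 students per section.
sections = {
    f"section_{s:02d}": [f"student_{s:02d}_{i:02d}" for i in range(25)]
    for s in range(80)
}

selected = cluster_sample(sections, n_clusters=20)
respondents = [student for roster in selected.values() for student in roster]
print(len(selected), len(respondents))   # 20 500
```

Only a list of sections is needed to draw this sample, not a list of every student, which is the practical advantage of sampling clusters.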
Nonprobability Samples—Quota Sample
• Quota sampling tries to be representative by sampling a reasonable number of people from certain groups.
• We might sample 100 women and 100 men for a 200-person sample. This would make the sample representative on gender.
• This approach is better than nothing, but should not be confused with a probability sample. We may represent the gender and racial distribution of our population, but without probability sampling, we should be hesitant to generalize to the population.
Nonprobability Samples—Snowball
• Snowball Sampling is an approach used for rare populations.
• What if you wanted to interview lesbian couples? It is practically impossible to get a sampling list of lesbian couples.
• You could go to a gay and lesbian group and interview people, but you would then be limiting yourself to lesbians who are activists.
Nonprobability Samples—Snowball
• When you interview a lesbian who is in the group, you ask her to share the names of other lesbians who are not in the group. When you interview them, you ask them for the names of still other lesbians.
• Several “points of entry” are important.
  • PFLAG would give you gays/lesbians whose parents were supportive.
  • Gay and lesbian groups would give you gays/lesbians whether their parents were supportive or not.
  • Snowballing would give you gays/lesbians who were “out.” IRB issues might be a problem.
Nonprobability Samples for Qualitative Studies
• Purposive or elite sampling has decided advantages over probability sampling.
• The researcher wants to tap the full range of people, and because the interviews are so labor intensive, the sample must be small, at least in most qualitative studies.
• If you are limited to interviewing 20 participants in your study, you want to select them purposively.
Nonprobability Samples for Qualitative Studies
• Suppose you were studying the effects of a change in the welfare system on parents.
  • You will want the perspective of both mothers and fathers, unemployed and underemployed parents, single parents, cohabiting partners, married parents, and parents with different racial or ethnic backgrounds.
• You may also need the perspective of social service providers in the welfare system.
• If you randomly sampled 20 participants, you would not get this diversity. You need to purposively select each participant based on the information value they have.