1 kuo-hsien su, national taiwan university nan lin, academia sinica and duke university measurement...

1

Kuo-hsien Su, National Taiwan University

Nan Lin, Academia Sinica and Duke University

Measurement of Social Capital: Recall Errors and Bias Estimations

[Figure: Change in number of positions accessed from wave I to wave II (N=2,707 respondents). Histogram; x-axis: difference in number of positions accessed (wave II - wave I), from -20 to 20; y-axis: percent.]

No change: 12%

Decrease: 52.9%

Increase: 35%

3

Differences between the sets of accessed positions during two interviews may reflect…

4 Motivations

Measurement instability poses a serious challenge to the study of network changes.

We need a clearer measurement, or a better understanding of the possible sources of error.

The two-wave panel survey provides an opportunity (1) to model factors associated with changes in accessed positions and (2) to detect whether the respondent forgot a subsequently or previously named contact.

5 Prior research

Forgetting is a pervasive phenomenon in the elicitation of network contacts.

Research on forgetfulness has been based disproportionately on the name generator instrument.

There is little research on the reliability of the position generator.

6 Tasks

7 Data

Social Capital Project: the Taiwan Survey, conducted in late 2004 and 2006.

The sample consists of 1,695 men and 1,585 women aged 20-65.

8 Problem of Non-response

Wave I (2004): N = 3,280

Wave II (2006): N = 2,710

Re-interview = 82.6%

Non-response = 17.4%
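The two rates follow directly from the wave sizes. A minimal check, using only the counts reported on this slide (the variable names are illustrative):

```python
# Wave sizes as reported on the slide.
wave1_n = 3280  # respondents interviewed in wave I (2004)
wave2_n = 2710  # respondents re-interviewed in wave II (2006)

# Respondents lost between the two waves.
non_response_n = wave1_n - wave2_n

# Both rates use the wave I sample as the denominator.
reinterview_rate = wave2_n / wave1_n
non_response_rate = non_response_n / wave1_n

print(f"Non-respondents: {non_response_n}")
print(f"Re-interview rate: {reinterview_rate:.1%}")
print(f"Non-response rate: {non_response_rate:.1%}")
```

The 570 non-respondents obtained here match the size of the non-response sub-sample reported in Table 1.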

9

Table 1. Characteristics of the follow-up and non-response sub-sample

                                   Full sample   Follow-up   Non-response sample
                                   (N=3280)      (N=2710)    (N=570)
Gender (%)
  Male                             51.7%         51.5%       52.5%
  Female                           48.3%         48.5%       47.5%
Age (mean)                         41.3          41.7        39.5
Years of schooling (mean)          11.7          11.7        11.8
Marital status (%)
  Single                           23.9%         22.8%       29.1%
  Married/cohab                    70.2%         71.4%       64.4%
  Widow/divorced                   6.0%          5.8%        6.5%
Network resource indices (mean)
  Extensity                        8.5           8.5         8.2
  Upper reachability               62.4          62.8        60.4
  Range of prestige                36.7          37.0        35.1

10

Three types of research designs (Brewer, 2000).

11 Limitations of our data

Our survey was not designed to examine forgetting specifically.

No recognition data or objective records are available for comparison.

The two-year interval is too long: a test-retest design usually uses a very short time interval.

12

Revised method C: Comparison of accessed positions elicited in two separate interviews

[Diagram: timeline from Wave I (2004) through 2005 to Wave II (2006). At wave II the respondent is asked: "How many years have you known this person?"]

Forgetting = (contact mentioned in wave II but not mentioned in wave I) AND (duration >= 3 years)

Assumption: durations reported in wave II are more or less accurate.

Did the respondent forget a subsequently named contact?

13

Coding scheme for tie changes

                     Wave II (2006): NO                         Wave II (2006): YES
Wave I (2004): NO    (1) Consistent “NO”                        (2) New contacts (less than 3 years)
                                                                (3) Forgetting at wave I (more than 3 years)
Wave I (2004): YES   (4) Lost contact / Forgetting at wave II   (5) Consistent “YES”
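The coding scheme can be sketched as a small classification function. This is an illustrative sketch rather than the authors' code: the function name, argument names, and category labels are hypothetical, and a tie known for exactly 3 years is treated as forgotten, following the "duration >= 3 years" rule stated for revised method C.

```python
from typing import Optional

FORGETTING_THRESHOLD_YEARS = 3  # cutoff taken from the slides


def classify_tie(named_wave1: bool, named_wave2: bool,
                 years_known: Optional[float] = None) -> str:
    """Assign one position to a tie-change category (1)-(5)."""
    if named_wave1 and named_wave2:
        return "consistent yes"                          # category (5)
    if named_wave1 and not named_wave2:
        return "lost contact / forgotten at wave II"     # category (4)
    if not named_wave1 and not named_wave2:
        return "consistent no"                           # category (1)
    # Named only at wave II: the reported duration separates
    # genuinely new contacts from contacts forgotten at wave I.
    if years_known is None:
        raise ValueError("years_known is required for ties named only at wave II")
    if years_known >= FORGETTING_THRESHOLD_YEARS:
        return "forgotten at wave I"                     # category (3)
    return "new contact"                                 # category (2)


print(classify_tie(False, True, years_known=13))
print(classify_tie(False, True, years_known=1))
```

Category (4) cannot be split into lost contacts and forgetting at wave II with this rule, because no duration is reported for a contact that is not named at wave II.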

[Figure: The distribution of length of relationship of forgotten ties (N=4,332 dyads, 7.3%). x-axis: length of relationship (in years), 0 to 60; y-axis: density.]

The average duration of ties forgotten is 13 years

15

How much does the respondent forget?

Wave I   Wave II   Known more than 3 years?   Category                             N        Percent
YES      YES                                  Consistent "YES"                     14,330   49.9%
NO       YES       NO                         New contact                          1,240    4.3%
NO       YES       YES                        Forgotten at wave I                  4,332    15.1%
YES      NO                                   Contact lost/Forgotten at wave II    8,794    30.7%
Total                                                                              28,696   100%

Forgetting at wave I: approximately 15%

Unique = 51.1%
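The percentage column can be recomputed from the dyad counts. A minimal sketch using the counts from the table (the dictionary keys are illustrative labels); recomputed shares may differ from the slide in the last digit because of rounding:

```python
# Dyad counts copied from the table.
counts = {
    "Consistent YES": 14330,
    "New contact": 1240,
    "Forgotten at wave I": 4332,
    "Contact lost / forgotten at wave II": 8794,
}

total = sum(counts.values())

# Share of each category among all dyads named in at least one wave.
shares = {category: n / total for category, n in counts.items()}

for category, share in shares.items():
    print(f"{category}: {share:.1%}")
print(f"Total dyads: {total}")
```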

16

[Figure: Distribution of respondents by number of ties forgotten (N=2707 respondents). x-axis: number of forgotten ties in wave 1, from 0 to 20; y-axis: percent.]

35.6% of the respondents did not forget any ties.

64.4% of the respondents failed to mention at least one contact, with an average of 1.6 forgotten ties per respondent.

These numbers suggest that forgetting a contact was not a rare occurrence.
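Per-respondent figures of this kind come from aggregating dyad-level records up to the respondent. The sketch below uses hypothetical toy data and a hypothetical helper name; only the final check uses numbers from the slides (4,332 forgotten dyads and 2,707 respondents).

```python
from collections import Counter


def forgetting_summary(forgotten_dyads, respondent_ids):
    """Summarise forgotten ties per respondent.

    forgotten_dyads: respondent ids, one entry per forgotten tie.
    respondent_ids: all respondents, including those who forgot nothing.
    """
    per_respondent = Counter(forgotten_dyads)  # missing ids count as 0
    n = len(respondent_ids)
    none_forgotten = sum(1 for r in respondent_ids if per_respondent[r] == 0)
    return {
        "share_none": none_forgotten / n,
        "share_any": 1 - none_forgotten / n,
        "mean_forgotten": sum(per_respondent[r] for r in respondent_ids) / n,
    }


# Hypothetical toy data: four respondents, three forgotten ties.
respondents = ["r1", "r2", "r3", "r4"]
forgotten = ["r1", "r1", "r3"]
summary = forgetting_summary(forgotten, respondents)
print(summary)

# Consistency check on the reported average, using figures from the slides.
mean_reported = 4332 / 2707
print(round(mean_reported, 1))
```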

17 Analytical Strategies

What factors are associated with forgetting?

Unit of analysis: person-contact dyads

Model: multilevel logit (model predicting “forgetting”)

Does “forgetting” affect estimates of network resources?

Unit of analysis: person

Analysis of the effect of forgetting on estimates of accessibility

18 Sample

A multi-level logit approach

The models estimate the odds of “forgetting” versus “not forgetting”; the reference population consists of all contacts mentioned in the first interview (2004).
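One common way to write a two-level logit of this kind is with a respondent-level random intercept. The slides do not give the exact specification, so the form below is an assumption, as are the symbols: $y_{ij}$ indicates whether contact $i$ of respondent $j$ is coded as forgotten, $\mathbf{x}_{ij}$ collects tie-level covariates, $\mathbf{z}_j$ collects respondent-level covariates, and $u_j$ is the random intercept.

```latex
\[
\operatorname{logit}\Pr(y_{ij}=1)
  = \log\frac{\Pr(y_{ij}=1)}{1-\Pr(y_{ij}=1)}
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij}
    + \boldsymbol{\gamma}^{\top}\mathbf{z}_{j} + u_j,
\qquad u_j \sim \mathcal{N}(0,\sigma_u^2)
\]
```

Under this form, each coefficient is a change in the log-odds of forgetting, holding the other covariates and the respondent effect fixed.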

Data structure

Positions (level 1) nested within individuals (level 2)

The final sample consists of 2,682 respondents and 28,343 person-contact dyads.

The multi-level approach requires us to transform the individual-based data into person-contact observations.
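The transformation from individual-based records to person-contact observations can be sketched as follows. The record layout, field names, and toy values are hypothetical and are not taken from the survey files:

```python
def to_person_contact_rows(respondents):
    """Expand one-record-per-respondent data into one row per contact."""
    rows = []
    for person in respondents:
        for contact in person["contacts"]:
            rows.append({
                # Level-2 fields repeat on every row of the same respondent.
                "respondent_id": person["id"],
                "age": person["age"],
                # Level-1 fields vary across rows within a respondent.
                "position": contact["position"],
                "years_known": contact["years_known"],
                "forgotten": contact["forgotten"],
            })
    return rows


# Hypothetical toy input: two respondents with two and one contacts.
respondents = [
    {"id": 1, "age": 35, "contacts": [
        {"position": "nurse", "years_known": 10, "forgotten": False},
        {"position": "lawyer", "years_known": 5, "forgotten": True},
    ]},
    {"id": 2, "age": 52, "contacts": [
        {"position": "teacher", "years_known": 20, "forgotten": False},
    ]},
]

rows = to_person_contact_rows(respondents)
for row in rows:
    print(row)
```

Each output row is one level-1 unit, and the repeated `respondent_id` is what identifies the level-2 grouping.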

20 Variables

Level 2 (respondent level):

Age

Years of schooling

Marital status (married)

Employment status (employee)

Occupational prestige score

Size of daily contact

21 Variables

Level 1 (tie level):

Type of relationship, grouped into six categories: kin, neighbor, school tie, work-related ties, friends, indirect tie

Length of relationship (in years)

Closeness

Gender homophily

Status difference

Status distance = absolute difference between respondent’s prestige score and contact’s prestige score

Status disparity = respondent’s prestige score - contact’s prestige score
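The two status-difference measures differ only in whether the sign is kept. A short sketch with hypothetical prestige scores (the function names are illustrative):

```python
def status_disparity(respondent_prestige: float, contact_prestige: float) -> float:
    """Signed difference: respondent's score minus contact's score."""
    return respondent_prestige - contact_prestige


def status_distance(respondent_prestige: float, contact_prestige: float) -> float:
    """Absolute difference between the two prestige scores."""
    return abs(respondent_prestige - contact_prestige)


# Hypothetical scores: the contact holds the more prestigious position.
respondent, contact = 40.0, 62.0
print(status_disparity(respondent, contact))
print(status_distance(respondent, contact))
```

A negative disparity means the contact's position is more prestigious than the respondent's, while distance ignores the direction of the gap.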

Descriptive statistics (individual level)

Level-2                       Total      Male       Female
                              (N=2676)   (N=1383)   (N=1293)
Age (in years)                41.62      41.46      41.79
                              (11.66)    (11.62)    (11.70)
Years of education            11.77      12.30      11.19
                              (4.23)     (3.75)     (4.63)
Marital status
  single                      0.23       0.25       0.21
  divorced/widowed            0.06       0.03       0.09
  married                     0.71       0.72       0.71
Employment status
  employee                    0.72       0.68       0.76
  self-employed/employer      0.18       0.23       0.12
  part-timer                  0.03       0.03       0.03
  family worker               0.08       0.05       0.10
Occupation prestige score     39.88      41.26      38.39
                              (12.91)    (13.13)    (12.50)
Size of daily contacts        3.42       3.52       3.31
                              (1.36)     (1.31)     (1.41)

Descriptive statistics (dyad level)

Level-1                    Total        Forgetting   Not forgetting
                           (N=27,103)   (N=4,315)    (N=22,788)
Type of relationship
  kin                      0.21         0.24         0.21
  neighbor                 0.07         0.09         0.07
  school tie               0.07         0.07         0.08
  work-related ties        0.35         0.42         0.33
  friends                  0.24         0.12         0.26
  indirect tie             0.05         0.07         0.05
Same sex                   0.61         0.60         0.61
Length of relationship     12.89        12.95        12.88
                           (11.92)      (11.98)      (11.91)
Closeness                  3.46         3.34         3.49
                           (0.99)       (0.99)       (0.99)
Status Distance            15.79        16.74        15.61
                           (11.66)      (12.21)      (11.54)
Status Disparity           -3.56        -5.89        -3.12
                           (19.30)      (19.87)      (19.16)

24

Multi-level model predicting “forgetting” (level-2 model)

Level-2 Model                    MODEL (1)
Intercept                        -1.197***
Female (male)                    -.146***
Age (in years)                   .000
Years of schooling               -.054***
Marital status (married)
  Single                         .105+
  Divorced/widowed               -.168+
Employment status (employee)
  Self-employed/employer         -.080
  Part-timer                     -.076
  Family worker                  -.048
Occupation prestige scores       -.007***
Size of daily contacts           -.125***

25

Multi-level model predicting “forgetting” (Level-1 model)

Level-1 Model                                MODEL (1)   MODEL (2)
Type of relationship (work-related ties)
  Kin                                        .100*       .100*
  Neighbor                                   .015        .011
  School ties                                -.196***    -.197***
  Friends                                    -.807***    -.804***
  Indirect ties                              .096        .104+
Same sex                                     -.039+      -.106***
(same-sex)×female                                        .122**
Length of relationship                       -.008***    -.007***
Closeness                                    -.173***    -.174***
Status Distance                              .007***     .011***
(status distance)×female                                 -.007***
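Coefficients from a logit model are easier to read as odds ratios. The sketch below exponentiates a few Model (2) values copied from the table, assuming they are on the log-odds scale, that the category in parentheses (work-related ties) is the reference, and that the female indicator is coded 1 for women. The dictionary keys are illustrative names.

```python
import math

# Model (2) level-1 coefficients copied from the table (log-odds assumed).
coef = {
    "friends_vs_work": -0.804,
    "closeness": -0.174,
    "status_distance": 0.011,
    "status_distance_x_female": -0.007,
}


def odds_ratio(log_odds_coefficient: float) -> float:
    """Convert a logit coefficient into an odds ratio."""
    return math.exp(log_odds_coefficient)


# Odds of forgetting a friend relative to a work-related tie.
friends_or = odds_ratio(coef["friends_vs_work"])

# With an interaction term, the slope for women is the main effect
# plus the interaction; the main effect alone applies to men.
male_slope = coef["status_distance"]
female_slope = coef["status_distance"] + coef["status_distance_x_female"]

print(f"Odds ratio, friends vs. work-related ties: {friends_or:.3f}")
print(f"Status-distance slope, men:   {male_slope:.3f}")
print(f"Status-distance slope, women: {female_slope:.3f}")
```

An odds ratio below 1 means lower odds of being coded as forgotten than the reference category, so under these assumptions friends are less likely to be forgotten than work-related ties.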

26

Multi-level model predicting “forgetting” (Level-1 model)

Level-1 Model                                MODEL (3)   MODEL (4)
Type of relationship (work-related ties)
  Kin                                        .108*       .107*
  Neighbor                                   .022        .020
  School ties                                -.197***    -.200***
  Friends                                    -.814***    -.812***
  Indirect ties                              .098+       .104+
Same sex                                     -.040+      -.106***
(same-sex)×female                                        .119*
Length of relationship                       -.008***    -.008***
Closeness                                    -.179***    -.181***
Status disparity                             -.002**     -.004***
(status disparity)×female                                .004**

27 Findings

Recall error may not be random.

Forgetting is more likely among weak ties.

How does recall error affect the estimation of network-driven indices?

28

Table 4. Discrepancy between “true” (corrected) and “observed” (raw) network resources indices

                             Corrected score   Raw score   Difference   t-test
Extensity            Mean    9.9               8.5         1.38         39.2
                     SD      5.5               5.5
Range                Mean    40.6              36.7        3.92         25.8
                     SD      16.8              18.6
Upper reachability   Mean    65.2              62.4        2.83         19.3
                     SD      15.2              17.6

Because forgetting is more likely among weak ties, the position generator underestimates embedded network resources.
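The direction of the bias can be illustrated with the usual position-generator definitions: extensity as the number of positions accessed, upper reachability as the highest prestige score among them, and range as the gap between the highest and lowest scores. The slides do not spell out these definitions or the correction procedure, so both are assumptions in this sketch: the corrected set is taken to be the raw wave I positions plus the positions coded as forgotten, and the positions and prestige scores are hypothetical.

```python
def network_indices(prestige_scores):
    """Position-generator indices from the prestige scores of accessed positions."""
    scores = list(prestige_scores)
    if not scores:
        return {"extensity": 0, "upper_reachability": None, "range": None}
    return {
        "extensity": len(scores),
        "upper_reachability": max(scores),
        "range": max(scores) - min(scores),
    }


# Hypothetical respondent: positions named at wave I, plus one position
# coded as forgotten (named only at wave II, known for 3 years or more).
raw_positions = {"nurse": 55, "teacher": 60, "guard": 30}
forgotten_positions = {"lawyer": 78}

# Assumed correction: add forgotten positions back to the wave I set.
corrected_positions = {**raw_positions, **forgotten_positions}

raw = network_indices(raw_positions.values())
corrected = network_indices(corrected_positions.values())
print("raw:      ", raw)
print("corrected:", corrected)
```

Adding positions can only raise or leave unchanged each of the three indices, which is consistent with the corrected means in Table 4 being at least as large as the raw means.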

29

Table 5. Correlations between “true” (corrected) and observed (raw) network resources indices at wave I (N=3,272)

                          Corrected indices                    Raw indices at wave I
                          Extensity   Range   Reachability     Extensity   Range
Corrected indices
  Extensity               --
  Range                   .817
  Reachability            .692        .886
Raw indices at wave I
  Extensity               .934        .745    .632
  Range                   .792        .884    .776             .832
  Reachability            .674        .804    .880             .694        .865

30 Conclusions

Forgetting a contact was not a rare occurrence; recall error is largely nonrandom.

Status difference appears to govern the recall process.

The position generator systematically underestimates network-driven resource indices.
