1
Internet-based research
Steve Janssen, University of Amsterdam
2
Internet-based research
• Advantages: DNMT and GC
• Limitations
• Trustworthiness
• Reliability
• Validity: Self-selection bias and drop-out confounding
• Conclusions
3
Advantages (Reips, 2000)
• Diversity of population
• Potentially high number of participants
• Fewer subject-experimenter effects
• Up-to-date stimuli
• Low costs
• Longevity of the experiment
4
Characteristics Internet users (Gosling et al., 2004)
• Internet samples are more diverse than traditional samples (i.e., first-year psychology students), although they are not completely representative of the entire population.
• Internet users do not differ from non-users on markers of adjustment and depression.
• Online participants are usually highly motivated.
5
High number of participants
• People can participate at any time from any place.
6
When do people take tests?
7
When do people take tests?
8
Advantages
• Daily News Memory Test (Meeter et al., in press)
• Galton-Crovitz Test (Janssen et al., in press-a; in press-b)
9
Daily News Memory Test (DNMT)
• http://memory.uva.nl/testpanel/
• 10 open-ended and 20 multiple-choice questions
• Q: “Which country was the first to vote against the European constitution on 29 May 2005?”
• 400 questions available at all times
• Stratified selection of questions
10
11
12
13
14
Daily News Memory Test
• November 2000 – June 2005
• N = 20432
• Male 47.47%; Female 52.53%
• Education: primary school 5.2%; LBO 4.8%; VMBO 9.4%; HAVO 9.8%; VWO 12.1%; MBO 11.8%; HBO 24.7%; WO 22.2%
• Mean age = 37.73 years
15
16
Galton-Crovitz Test
• http://memory.uva.nl/testpanel/gc/
• Participants are presented with 10 cue words
• They have to describe the first specific personal event associated with the cue word that comes to mind
• Then, they have to date these 10 personal events and 10 public events.
• Example: “When did the Dutch population vote against the European constitution?”
17
18
Galton-Crovitz Test
• N = 8291
• June 2002 – June 2005
• Male 39.78%; Female 60.21%
• N per country: AU = 92; BE = 305; CA = 119; UK = 227; NL = 6596; US = 952
• Mean age = 40.54 years
19
Reminiscence Bump
• Period around early adulthood with relatively more memories than the period before or after.
20
Rubin, Wetzler & Nebes (1986)
21
Removing recency effect
[Figure: Observed memory distribution NL 15-25; proportion by age at event]
[Figure: Observed memory distribution NL 55-65; proportion by age at event]
[Figure: Retention function NL; proportion by age of event in years; fitted curve y = 0.3684 x^{-1.9742}, R^2 = 0.9221]
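The retention function on this slide is reported as a fitted power function with an R^2 value. As a minimal sketch of how such a fit can be obtained, the code below fits y = a * x^b by least squares on log-transformed values. The function name `fit_power_function` and the example data are hypothetical: the data points are generated from the curve printed on the slide, they are not the original observations, and the slide does not say which fitting procedure was actually used.

```python
import math


def fit_power_function(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y).

    Returns (a, b, r_squared), where r_squared is computed in log-log space.
    All xs and ys must be positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    syy = sum((ly - mean_y) ** 2 for ly in log_y)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx                          # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)      # intercept in log-log space = log(a)
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared


# Hypothetical data: points generated from the curve shown on the slide.
ages_of_event = [1, 2, 3, 4, 5, 6]
proportions = [0.3684 * age ** -1.9742 for age in ages_of_event]

a, b, r2 = fit_power_function(ages_of_event, proportions)
print(f"y = {a:.4f} * x^{b:.4f}, R^2 = {r2:.4f}")
```

With real, noisy proportions the recovered R^2 would be below 1, as it is on the slide.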
22
Removing recency effect
[Figure: Encoding function NL 15-25; encoding by age at event]
[Figure: Encoding function NL 55-65; encoding by age at event]
[Figure: Encoding function NL; encoding by age at event]
23
Encoding functions AU-CA-UK-US and BE-NL
[Figure: encoding by age at event, 0 to 70 years]
Absence of a culture effect?
24
Current focus Galton-Crovitz Test
• Different cultures: Portuguese, Italian, and Japanese versions
• First times
• Emotionality and valence
• Semantic knowledge
• Preference in movies, records, and books
25
Limitations
• No psychophysical measurements
• Response-time experiments need plug-ins
• Refresh rate of computer screens
• Tests should not take more than 20 to 30 minutes
• Participants ‘browse’ through instructions
26
Drop-out
[Figure: completion rate (0% to 100%) by completion time in seconds (0 to 2000); fitted curve y = 5.7115 x^{-0.3507}, R^2 = 0.8594]
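To make the fitted drop-out curve concrete, the sketch below evaluates it for a few test lengths. It assumes the curve is the power function y = 5.7115 * x^(-0.3507), with x the completion time in seconds and y the completion rate as a fraction. The function name `predicted_completion_rate` and the cap at 100% are illustrative choices that do not come from the slides.

```python
def predicted_completion_rate(seconds, a=5.7115, b=-0.3507):
    """Evaluate the fitted curve y = a * x**b, capped at 1.0 (100%).

    The cap is an assumption: the raw power function exceeds 1 for very
    short tests, which is not a meaningful completion rate.
    """
    if seconds <= 0:
        raise ValueError("completion time must be positive")
    return min(1.0, a * seconds ** b)


for minutes in (10, 20, 30):
    rate = predicted_completion_rate(minutes * 60)
    print(f"{minutes} min: about {rate:.0%} predicted to complete")
```

Under these assumptions, roughly three in five participants finish a 10-minute test and about two in five finish a 30-minute test.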
27
Drop-out
• Giving a financial reward
• Indicating in advance how long the test takes
• Indicating participants’ progress
28
Usability test / Pilot
• Do the links between pages work?
• Does the test work under different browsers and computers?
• Do error messages appear?
• Are the instructions clear?
• Are there any grammatical errors?
• Is the time to complete the test not too long?
29
Seriousness of the experiment
• Short URL
• Sober lay-out, no ‘flash’
• Name and logo of university or institute
• Emphasize goal and importance of experiment
• Emphasize that the site is not commercial
• Mention approval by the Institutional Review Board
• Debriefing
• Provide opportunity to give feedback or ask questions
30
Build it and they will come?
• New media: Search engines, mailing lists, and websites
• Traditional media: Newspapers, magazines, and radio programmes
• Word of mouth
31
Algemeen Dagblad
19-07-02
32
Het Financieel Dagblad
26-02-03
33
Metro 29-07-03
34
De Volkskrant 29-07-03
35
Chessa & Murre (2004)
[Figure: number of hits (1000) by day relative to the interview (-20 to 30); base rate before interview versus days after interview]
36
Problem with publications
• Participants know what the experiment is measuring
37
Action letter Memory & Cognition
“…you need to show that you have better awareness of who these people were. That is, how can you be sure that you were not testing some bored 14 year olds?”
38
Trustworthiness
• Trustworthiness = subject fraud
• Participant is not who he says he is
• Cheating (e.g., using other websites to look up the answer)
• Participant takes the test multiple times, each time under a different name
• Multiple participants take the test under the same name
39
Trustworthiness
• Password technique: send only one password to the provided e-mail address
• Dynamic test: the test contains different questions each time
• Long registration – Short log-in procedure: only analyze the first test
• Record IP address
• Measure time needed to complete the test: omit tests that took too little or too much time to complete
• Filter questions
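Several of these checks can be applied when the data are cleaned. The sketch below keeps only the first test per participant, skips repeated IP addresses, and omits tests with implausible completion times. The `Submission` record, its field names, and the duration thresholds are hypothetical. Treating a repeated IP address as a repeat participant is also an assumption, since the slides only say that the IP address is recorded.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    participant_id: str
    ip_address: str
    started_at: float   # seconds since some reference time
    duration_s: float   # time needed to complete the test
    score: float


def filter_trustworthy(records, min_duration_s=120, max_duration_s=3600):
    """Keep first tests only, from unseen IPs, with a plausible duration."""
    seen_participants = set()
    seen_ips = set()
    kept = []
    for record in sorted(records, key=lambda r: r.started_at):
        repeat = (record.participant_id in seen_participants
                  or record.ip_address in seen_ips)
        # Register the attempt even if it is excluded below, so that a
        # later retake cannot replace an omitted first test.
        seen_participants.add(record.participant_id)
        seen_ips.add(record.ip_address)
        if repeat:
            continue
        if not (min_duration_s <= record.duration_s <= max_duration_s):
            continue
        kept.append(record)
    return kept


submissions = [
    Submission("a", "1.1.1.1", 0, 600, 10),
    Submission("a", "1.1.1.1", 100, 700, 12),   # second test by "a"
    Submission("b", "2.2.2.2", 50, 30, 29),     # finished implausibly fast
    Submission("c", "3.3.3.3", 60, 900, 15),
    Submission("d", "3.3.3.3", 70, 800, 14),    # same IP address as "c"
]
print([s.participant_id for s in filter_trustworthy(submissions)])
```

Filtering on IP address is strict: it also excludes different people who share a connection, so it may be preferable to flag such cases for inspection rather than drop them.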
40
Reliability
• The extent to which a test is free from random error components or non-systematic errors
• The circumstances in which the test is taken
41
Reliability DNMT
• Test-retest correlation; N = 1750, r = .617, p < .001
• Split-half reliability; N = 4797, r = .684
• Cronbach’s alpha; N = 4797, α = .681
• KR21; N = 7192, KR21 = .635
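For reference, the sketch below shows the standard textbook formulas behind three of these coefficients, computed on a tiny made-up score matrix (rows are participants, columns are items scored 0 or 1). The function names and data are hypothetical, and the slides do not state whether the reported split-half value was Spearman-Brown corrected, so both values are returned.

```python
from statistics import mean, pvariance


def cronbach_alpha(item_scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(item_scores[0])
    item_variances = [pvariance([row[i] for row in item_scores])
                      for i in range(k)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def kr21(total_scores, k):
    """KR-21 = k/(k-1) * (1 - M(k - M) / (k * variance of totals))."""
    m = mean(total_scores)
    variance = pvariance(total_scores)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * variance))


def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def split_half(item_scores):
    """Correlate odd-item and even-item half scores.

    Returns (r, spearman_brown_corrected_r).
    """
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r = pearson_r(odd, even)
    return r, 2 * r / (1 + r)


# Made-up data: 5 participants, 4 dichotomous items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in scores]
print(cronbach_alpha(scores), kr21(totals, k=4), split_half(scores))
```

KR-21 only needs total scores, which may be why it could be computed on a larger N than the item-level coefficients. It assumes items of similar difficulty and so tends to come out lower than alpha.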
42
Validity
• The extent to which a test reflects only the desired construct without contamination from other systematically varying constructs
• Does the test measure computer skills as well?
• Mirroring, self-selection bias, and drop-out confounding
43
Mirroring
• Half the participants take the test in the laboratory, whereas the other half is sent away and asked to take the test ‘somewhere else’.
• The results of those two groups are compared to each other and to the results of a third group of participants with similar characteristics.
44
Mirroring
• Buchanan & Smith, 1999
• Gosling et al., 2004
• McGraw et al., 2000
• Smith & Leigh, 1997
45
Mirroring
• Compare results to results of experiments by other people
• Compare results before publication in media with results recorded directly after publication
46
Self-selection bias
• The effect that voluntary participants perform better on a test or that they are more biased in a questionnaire, because they have a greater interest in the specific topic of the test (Smith & Leigh, 1997)
• DNMT: Record people’s interest in the news (e.g., how frequently does the participant read a newspaper)
47
Self-selection bias
• Participant pool technique: Subjects are selected from a large database. Therefore, one knows who took the test and who did not take the test.
• Multiple site entry technique: The test has more than one home page or it has links from different websites.
48
Drop-out confounding (Reips, 2002)
• Participants who perform badly on a test are less likely to complete it than participants who perform well.
• Therefore, one should record who does not complete the test as well as who completes the test.
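One way to use such records is to compare how completers and drop-outs performed on the items that both groups answered. The sketch below is a minimal illustration under assumed names: the session layout (`completed`, `answers`), the function `compare_early_accuracy`, and the choice of the first five items are hypothetical and not taken from the slides.

```python
from statistics import mean


def compare_early_accuracy(sessions, first_n=5):
    """Compare accuracy on the first `first_n` items by completion status.

    Each session is a dict with 'completed' (bool) and 'answers'
    (list of 0/1 for the items answered before finishing or quitting).
    Sessions with fewer than `first_n` answers are left out of the
    accuracy comparison but still count towards the completion rate.
    """
    completers, dropouts = [], []
    for session in sessions:
        if len(session["answers"]) < first_n:
            continue
        early = mean(session["answers"][:first_n])
        (completers if session["completed"] else dropouts).append(early)
    return {
        "completion_rate": sum(s["completed"] for s in sessions) / len(sessions),
        "early_accuracy_completers": mean(completers) if completers else None,
        "early_accuracy_dropouts": mean(dropouts) if dropouts else None,
    }


# Made-up sessions for illustration.
sessions = [
    {"completed": True, "answers": [1, 1, 1, 0, 1, 1, 1, 1]},
    {"completed": True, "answers": [1, 0, 1, 1, 1, 0, 1, 1]},
    {"completed": False, "answers": [0, 0, 1, 0, 0]},
    {"completed": False, "answers": [0, 1]},
]
print(compare_early_accuracy(sessions))
```

A clearly lower early accuracy among drop-outs would suggest that the scores of completers overestimate performance in the full sample.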
49
Conclusions
• High number of participants with relatively diverse backgrounds
• However, no psychophysical measurements
• Problems: Trustworthiness, self-selection bias, and drop-out confounding